Range estimate not to go below equivalent point get estimate #53907

Closed
terry1purcell opened this issue Jun 10, 2024 · 0 comments · Fixed by #53860
Labels
type/enhancement: The issue or PR belongs to an enhancement.

Comments

terry1purcell commented Jun 10, 2024

Enhancement

The row estimate for a range predicate can be much lower than the estimate for an equivalent point get when the histogram's low/high bounds cover a broad range of values. This is because range estimation uses an interpolation formula, whereas point get estimates default to 1/NDV. The two estimates should be relatively consistent: a range estimate should not go below the equivalent point get estimate.
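To illustrate why the two estimates can diverge, here is a minimal Python sketch. It is not TiDB's actual estimation code: the function names, the single-bucket model, and all numbers are hypothetical, and it assumes plain linear interpolation over the histogram bounds for ranges and rows/NDV for point lookups.

```python
def range_estimate(low, high, rows, query_lo, query_hi):
    """Simplified linear interpolation (an assumption, not TiDB's formula):
    scale the row count by the fraction of [low, high] that the query covers."""
    overlap = max(0, min(query_hi, high) - max(query_lo, low))
    return rows * overlap / (high - low)


def point_estimate(total_rows, ndv):
    """Default point estimate: each distinct value gets an equal share (1/NDV)."""
    return total_rows / ndv


# Hypothetical statistics: 25,000 rows and 25,000 distinct values,
# with histogram bounds spanning 0 to 1,000,000.
range_rows = range_estimate(0, 1_000_000, 25_000, 228313, 228314)
point_rows = 2 * point_estimate(25_000, 25_000)  # two values in the IN list

print(f"range estimate: {range_rows:.3f}")
print(f"point estimate: {point_rows:.3f}")
```

With these made-up numbers the interpolated range estimate is far below the point estimate, because the wide bounds dilute the one-unit range while the point estimate depends only on NDV.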

The example below shows the same query written with a BETWEEN predicate and with an IN predicate. It comes from mysql_test/t/select_all.test, but any predicate values can be used where the histograms show a large range.

tidb> explain select fld1 from t2 where fld1 between 228313 and 228314;
+------------------------+---------+-----------+----------------------------+-----------------------------------------+
| id                     | estRows | task      | access object              | operator info                           |
+------------------------+---------+-----------+----------------------------+-----------------------------------------+
| IndexReader_6          | 0.05    | root      |                            | index:IndexRangeScan_5                  |
| └─IndexRangeScan_5    | 0.05    | cop[tikv] | table:t2, index:fld1(fld1) | range:[228313,228314], keep order:false |
+------------------------+---------+-----------+----------------------------+-----------------------------------------+

tidb> explain select fld1 from t2 where fld1 in (228313, 228314);
+-------------------+---------+------+----------------------------+------------------------------+
| id                | estRows | task | access object              | operator info                |
+-------------------+---------+------+----------------------------+------------------------------+
| Batch_Point_Get_1 | 2.00    | root | table:t2, index:fld1(fld1) | keep order:false, desc:false |
+-------------------+---------+------+----------------------------+------------------------------+
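One way to read the requested behavior is as a lower bound on the range estimate. The Python sketch below shows that idea only: `clamp_range_estimate` is a hypothetical helper, and the actual change in the linked fix may work differently. The two input values are the estRows figures from the EXPLAIN outputs above.

```python
def clamp_range_estimate(range_rows: float, point_rows: float) -> float:
    """Hypothetical helper: never let a range estimate fall below the
    estimate for the equivalent point get."""
    return max(range_rows, point_rows)


# estRows copied from the two EXPLAIN outputs above.
between_rows = 0.05  # BETWEEN 228313 AND 228314
in_list_rows = 2.00  # IN (228313, 228314)

print(clamp_range_estimate(between_rows, in_list_rows))
```

Under this reading, the BETWEEN query would report the same estimate as the IN query instead of 0.05.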

terry1purcell added the type/enhancement label Jun 10, 2024
ti-chi-bot pushed a commit that referenced this issue Jun 16, 2024