planner, executor: fix index merge partial table scan schema #23936

Merged on Apr 19, 2021 (19 commits)

@tangenta tangenta commented Apr 9, 2021

What problem does this PR solve?

Issue Number: close #23919

Problem Summary:

create table t (a int, b int, index(a), index(b)) partition by hash (a) partitions 2;
insert into t values (1, 5);
select /*+ use_index_merge( t ) */ * from t where a in (3) or b in (5) order by a;
ERROR 1105 (HY000): runtime error: index out of range [2] with length 2

The explain result:

+--------------------------------------+---------+-----------+-----------------------------------+---------------------------------------------+
| id                                   | estRows | task      | access object                     | operator info                               |
+--------------------------------------+---------+-----------+-----------------------------------+---------------------------------------------+
| Sort_9                               | 39.98   | root      |                                   | test.t.a                                    |
| └─PartitionUnion_12                  | 39.98   | root      |                                   |                                             |
|   ├─IndexMerge_16                    | 19.99   | root      |                                   |                                             |
|   │ ├─IndexRangeScan_13(Build)       | 10.00   | cop[tikv] | table:t, partition:p0, index:a(a) | range:[3,3], keep order:false, stats:pseudo |
|   │ ├─IndexRangeScan_14(Build)       | 10.00   | cop[tikv] | table:t, partition:p0, index:b(b) | range:[5,5], keep order:false, stats:pseudo |
|   │ └─TableRowIDScan_15(Probe)       | 19.99   | cop[tikv] | table:t, partition:p0             | keep order:false, stats:pseudo              |
|   └─IndexMerge_20                    | 19.99   | root      |                                   |                                             |
|     ├─IndexRangeScan_17(Build)       | 10.00   | cop[tikv] | table:t, partition:p1, index:a(a) | range:[3,3], keep order:false, stats:pseudo |
|     ├─IndexRangeScan_18(Build)       | 10.00   | cop[tikv] | table:t, partition:p1, index:b(b) | range:[5,5], keep order:false, stats:pseudo |
|     └─TableRowIDScan_19(Probe)       | 19.99   | cop[tikv] | table:t, partition:p1             | keep order:false, stats:pseudo              |
+--------------------------------------+---------+-----------+-----------------------------------+---------------------------------------------+

SortExec.Next() only requires 2 columns (a and b), but the row fetched from the underlying UnionExec has 3 (a, b and _tidb_rowid). This causes Chunk.AppendPartialRow to panic with a slice index out-of-range error.

The request chunk for UnionExec.Next should have only 2 columns, but req.SwapColumns(result.chk) turns it into a 3-column chunk. This chunk comes from IndexMergeReaderExecutor, which has a 3-column schema.
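
To make the failure mode concrete, here is a minimal sketch of the column-count mismatch. The `chunk` type and its methods are hypothetical simplifications written for this illustration; they are not TiDB's actual `chunk.Chunk` implementation.

```go
package main

import "fmt"

// chunk is a hypothetical, heavily simplified stand-in for a columnar
// chunk: each column is just a slice of int64 values.
type chunk struct {
	columns [][]int64
}

func newChunk(numCols int) *chunk {
	return &chunk{columns: make([][]int64, numCols)}
}

// appendPartialRow copies row into the chunk starting at column colOff.
// This sketch does no bounds check, so a row that is wider than the
// chunk indexes past the end of c.columns.
func (c *chunk) appendPartialRow(colOff int, row []int64) {
	for i, v := range row {
		c.columns[colOff+i] = append(c.columns[colOff+i], v)
	}
}

// tryAppend returns the panic message, if any, instead of crashing.
func tryAppend(c *chunk, row []int64) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	c.appendPartialRow(0, row)
	return ""
}

func main() {
	// The sort operator allocates a chunk for (a, b) only.
	sortChunk := newChunk(2)

	// A 2-column row fits, so no panic message is returned.
	fmt.Printf("%q\n", tryAppend(sortChunk, []int64{1, 5}))

	// A 3-column row (a, b, _tidb_rowid) overruns the 2-column chunk,
	// which triggers an index-out-of-range runtime panic.
	fmt.Printf("%q\n", tryAppend(sortChunk, []int64{1, 5, 1}))
}
```

The third column has nowhere to go in a chunk allocated for two, which is the same shape of error as the one reported by the query above.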

IndexMergeReaderExecutor inherits the schema from PhysicalIndexMergeReader, which inherits from the PhysicalTableScan built by buildIndexMergeTableScan:

    if ts.HandleCols == nil {
        handleCol := ds.getPKIsHandleCol()
        if handleCol == nil {
            handleCol, _ = ts.appendExtraHandleCol(ds)
        }
        ts.HandleCols = NewIntHandleCols(handleCol)
    }
    var err error
    ts.HandleCols, err = ts.HandleCols.ResolveIndices(ts.schema)
    if err != nil {
        return nil, 0, err
    }

This code appends _tidb_rowid to the schema of DataSource so that HandleCols can resolve its indices.

HandleCols is used to fetch handles in both TableRangeScan and IndexRangeScan.

  • For the TableRangeScan, the current schema is the DataSource schema plus additional common handle columns. This is unreasonable because only the handle columns are needed.
  • For the IndexRangeScan, the current schema is always the indexed columns plus the full common handle columns. ResolveIndices is useless for IndexRangeScan because expression.Column.Index is never used.
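
As an illustration of what resolving a handle column against a schema means, the sketch below computes a column's offset in two different schemas. The `column` type and `resolveIndex` function are hypothetical simplifications for this example and do not reproduce TiDB's `expression.Column` or `HandleCols.ResolveIndices`.

```go
package main

import (
	"errors"
	"fmt"
)

// column is a hypothetical, simplified stand-in for a planner column.
type column struct {
	UniqueID int64
	Name     string
	Index    int // offset of this column in the rows produced by the child
}

// resolveIndex returns a copy of col whose Index is its offset in schema.
func resolveIndex(col column, schema []column) (column, error) {
	for i, c := range schema {
		if c.UniqueID == col.UniqueID {
			col.Index = i
			return col, nil
		}
	}
	return col, errors.New("column " + col.Name + " not found in schema")
}

func main() {
	a := column{UniqueID: 1, Name: "a"}
	b := column{UniqueID: 2, Name: "b"}
	rowID := column{UniqueID: 3, Name: "_tidb_rowid"}

	// Handle resolved against the table columns plus the extra handle.
	wide, _ := resolveIndex(rowID, []column{a, b, rowID})

	// Handle resolved against a schema that holds only the handle.
	narrow, _ := resolveIndex(rowID, []column{rowID})

	fmt.Println(wide.Index, narrow.Index) // prints "2 0"
}
```

With a handle-only schema the handle sits at offset 0, and no extra column has to be added to any schema shared with the parent operators.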

What is changed and how it works?

How it Works:

This PR makes the schema of IndexMerge consistent with the upper operator by avoiding direct changes to the DataSource schema. It also fixes the issue in #23933 (review).
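
The idea can be sketched as follows, under the assumption that the scan and the DataSource share one schema object. All types and function names here (`schema`, `dataSource`, `scanSchemaShared`, `handleOnlySchema`) are hypothetical and exist only for this illustration; the actual planner code differs.

```go
package main

import "fmt"

// schema is a hypothetical, simplified stand-in for a planner schema.
type schema struct {
	Columns []string
}

// dataSource is a hypothetical stand-in for the logical data source
// whose schema is also what the upper operators expect to receive.
type dataSource struct {
	schema *schema
}

// scanSchemaShared models the problematic approach: the extra handle is
// appended to the schema object that the data source also uses, so every
// operator that relies on that schema now sees three columns.
func scanSchemaShared(ds *dataSource) *schema {
	ds.schema.Columns = append(ds.schema.Columns, "_tidb_rowid")
	return ds.schema
}

// handleOnlySchema models the approach described above: the partial scan
// gets its own schema that contains only the handle column, and the data
// source schema is left untouched.
func handleOnlySchema() *schema {
	return &schema{Columns: []string{"_tidb_rowid"}}
}

func main() {
	shared := &dataSource{schema: &schema{Columns: []string{"a", "b"}}}
	scanSchemaShared(shared)
	fmt.Println(len(shared.schema.Columns)) // prints "3"

	separate := &dataSource{schema: &schema{Columns: []string{"a", "b"}}}
	scan := handleOnlySchema()
	fmt.Println(len(separate.schema.Columns), len(scan.Columns)) // prints "2 1"
}
```

In the second case the parent still sees exactly the two columns it asked for, which is the consistency property this PR aims for.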

What is changed:

  • Change the schema of TableRangeScan (as a child of the IndexMerge operator) to one that contains only the handle columns.
  • Resolve the HandleCols in IndexMergeReaderExecutor against the schema that contains only the handle columns.
  • Remove an unnecessary projection.
  • Add tests for #23919 (sorting index merge result in partition union reports 'index out of range') and for index merge with clustered index/_tidb_rowid/PkIsHandle.

Related changes

NA

Check List

Tests

  • Unit test
  • Integration test

Side effects

NA

Release note

  • Fix an issue that sorting on index-merge results in partition union reports 'index out of range'.

@tangenta tangenta requested review from a team as code owners April 9, 2021 13:30
@tangenta tangenta requested review from lzmhhh123 and removed request for a team April 9, 2021 13:30
@ti-chi-bot ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 9, 2021
@github-actions github-actions bot added the sig/execution SIG execution label Apr 9, 2021
@ichn-hu ichn-hu mentioned this pull request Apr 9, 2021
@tangenta

/hold

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 12, 2021
@tangenta tangenta changed the title planner: prevent appending _tidb_rowid to the schema of DataSource planner: prevent appending _tidb_rowid to Datasource schema in index-merge Apr 12, 2021
@tangenta tangenta changed the title planner: prevent appending _tidb_rowid to Datasource schema in index-merge planner: prevent appending _tidb_rowid to DataSource schema in index merge Apr 12, 2021
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 14, 2021
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 14, 2021
@tangenta tangenta changed the title planner: prevent appending _tidb_rowid to DataSource schema in index merge planner, executor: fix index merge partial table scan schema Apr 14, 2021
@tangenta

/unhold

@tangenta tangenta added the type/bugfix This PR fixes a bug. label Apr 14, 2021
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 17, 2021
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2021
@tangenta tangenta requested a review from eurekaka April 19, 2021 05:12
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 19, 2021

@eurekaka eurekaka left a comment

/lgtm

planner/core/find_best_task.go (resolved review thread)
@ti-chi-bot

@eurekaka: /lgtm is only allowed for the reviewers in list.

In response to this:

/lgtm

@tangenta tangenta added the sig/planner SIG: Planner label Apr 19, 2021

@eurekaka eurekaka left a comment

/lgtm

@eurekaka

/lgtm

@ti-chi-bot

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • eurekaka
  • lzmhhh123

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Apr 19, 2021
@eurekaka

/merge

@ti-chi-bot

This pull request has been accepted and is ready to merge.

Commit hash: 42e3493

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 19, 2021
@ti-chi-bot ti-chi-bot merged commit 421571f into pingcap:master Apr 19, 2021
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Apr 20, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot

cherry pick to release-5.0 in PR #24155

Labels
needs-cherry-pick-release-5.0 sig/execution SIG execution sig/planner SIG: Planner size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.
Development

Successfully merging this pull request may close these issues.

sorting index merge result in partition union reports 'index out of range'
5 participants