Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.2: opt: prevent cycle in memo created by SplitDisjunction #58437

Merged
merged 1 commit into from
Jan 5, 2021

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Jan 4, 2021

Backport 1/1 commits from #58434.

/cc @cockroachdb/release


opt: prevent cycle in memo created by SplitDisjunction

Previously, the SplitDisjunction rule would reuse table and column IDs
from the rule's input expressions in the left side of the generated
UnionAll expression. This was incorrectly believed to be safe. The
addition of partial indexes has provided an example that highlights the
flaw: this can create cycles in the memo.

Consider the table and query below:

CREATE TABLE t (
  k INT PRIMARY KEY,
  a INT,
  b INT,
  c INT,
  INDEX tab_c_idx (c) WHERE a > 1 OR b > 1
)

SELECT * FROM t WHERE a > 1 OR b > 1

This leads to a cycle in the memo because tab_c_idx can be scanned to
retrieve rows where a > 1 and where a > 1 OR b > 1. Notice the cycle
in the memo below: G2 -> G6 -> G10 -> G2. G2 includes both an
unconstrained partial index scan a DistinctOn created by
SplitDisjunction. The child of the DistinctOn is a UnionAll (G7)
with a left child of G10. G10 has two expressions, one of which is a
Select with an input of G2.

memo
 ├── G1: (scalar-group-by G2 G3)
 ├── G2: (select G4 G5) (scan t@tab_c_idx) (distinct-on G6 G7)
 ├── G3: (aggregations G8)
 ├── G4: (scan t)
 ├── G5: (filters G9)
 ├── G6: (union-all G10 G11)
 ├── G7: (aggregations G12)
 ├── G8: (count-rows)
 ├── G9: (or G13 G14)
 ├── G10: (select G4 G15) (select G2 G15)
 ├── G11: (select G16 G17)
 ├── G12: (const-agg G18)
 ├── G13: (eq G19 G20)
 ├── G14: (eq G18 G20)
 ├── G15: (filters G13)
 ├── G16: (scan t)
 ├── G17: (filters G21)
 ├── G18: (variable b)
 ├── G19: (variable a)
 ├── G20: (const 1)
 ├── G21: (eq G22 G20)
 └── G22: (variable b)

In order to prevent this cycle, SplitDisjunction now creates new table
and column IDs for both its left and right inputs. I've been unable to
find an example where table and column ID reuse in
SplitDisjuctionAddKey causes problems, but I've updated that rule to
err on the side of caution.

Fixes #58390

Release justification: This is a critical fix for a bug that causes
errors querying tables with disjunctive filters that are the same or
similar to the predicate of one of the table's partial indexes.

Release note (bug fix): A bug has been fixed which caused errors when
querying a table with a disjunctive filter (an OR expression) that is
the same or similar to the predicate of one of the table's partial
indexes.

@mgartner mgartner requested a review from rytaft January 4, 2021 23:16
@mgartner mgartner requested a review from a team as a code owner January 4, 2021 23:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Previously, the `SplitDisjunction` rule would reuse table and column IDs
from the rule's input expressions in the left side of the generated
`UnionAll` expression. This was incorrectly believed to be safe. The
addition of partial indexes has provided an example that highlights the
flaw: this can create cycles in the memo.

Consider the table and query below:

    CREATE TABLE t (
      k INT PRIMARY KEY,
      a INT,
      b INT,
      c INT,
      INDEX tab_c_idx (c) WHERE a > 1 OR b > 1
    )

    SELECT * FROM t WHERE a > 1 OR b > 1

This leads to a cycle in the memo because `tab_c_idx` can be scanned to
retrieve rows where `a > 1` and where `a > 1 OR b > 1`. Notice the cycle
in the memo below: `G2 -> G6 -> G10 -> G2`. `G2` includes both an
unconstrained partial index scan a `DistinctOn` created by
`SplitDisjunction`. The child of the `DistinctOn` is a `UnionAll` (`G7`)
with a left child of `G10`. `G10` has two expressions, one of which is a
`Select` with an input of `G2`.

    memo
     ├── G1: (scalar-group-by G2 G3)
     ├── G2: (select G4 G5) (scan t@tab_c_idx) (distinct-on G6 G7)
     ├── G3: (aggregations G8)
     ├── G4: (scan t)
     ├── G5: (filters G9)
     ├── G6: (union-all G10 G11)
     ├── G7: (aggregations G12)
     ├── G8: (count-rows)
     ├── G9: (or G13 G14)
     ├── G10: (select G4 G15) (select G2 G15)
     ├── G11: (select G16 G17)
     ├── G12: (const-agg G18)
     ├── G13: (eq G19 G20)
     ├── G14: (eq G18 G20)
     ├── G15: (filters G13)
     ├── G16: (scan t)
     ├── G17: (filters G21)
     ├── G18: (variable b)
     ├── G19: (variable a)
     ├── G20: (const 1)
     ├── G21: (eq G22 G20)
     └── G22: (variable b)

In order to prevent this cycle, `SplitDisjunction` now creates new table
and column IDs for both its left and right inputs. I've been unable to
find an example where table and column ID reuse in
`SplitDisjuctionAddKey` causes problems, but I've updated that rule to
err on the side of caution.

Release justification: This is a critical fix for a bug that causes
errors querying tables with disjunctive filters that are the same or
similar to the predicate of one of the table's partial indexes.

Release note (bug fix): A bug has been fixed which caused errors when
querying a table with a disjunctive filter (an `OR` expression) that is
the same or similar to the predicate of one of the table's partial
indexes.
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 5 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)

@mgartner mgartner merged commit 7274fe0 into cockroachdb:release-20.2 Jan 5, 2021
@mgartner mgartner deleted the backport20.2-58434 branch January 5, 2021 03:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants