-
-
Notifications
You must be signed in to change notification settings - Fork 210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/analyzer: Make indexed joins more aware of NULL safe comparisons. #1080
sql/analyzer: Make indexed joins more aware of NULL safe comparisons. #1080
Conversation
…they include NULL.
…on = do not scan NULL entries.
…-null' into aaron/sql-ranges-add-above-below-null
…anges-add-above-below-null-2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
"", | ||
}, | ||
{ | ||
Query: `SELECT * FROM niltable WHERE i2 <=> NULL`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this look like on indexes of multiple columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IndexBuilder always builds RangeCollections where each Range has a RangeColumnExpr for every expression in the index. By default, it's going to return a single range that matches everything for every column (so [NULL, ∞)
in the new syntax).
For the same clause on a two-column index, it's either going to return ([NULL, NULL], [NULL, ∞))
or ([NULL, ∞), [NULL, NULL])
, depending on whether the targeted column comes first or second in the index.
For an AND on both columns <=> NULL
it will return ([NULL, NULL], [NULL, NULL])
. There's a bunch of tests in index_query_plans for how this shakes out in different cases, including when the analyzer is managing conjunctions and disjunctions, etc.
@@ -783,6 +791,9 @@ func getJoinIndexes( | |||
comparandExprs := make([]sql.Expression, 0, len(left.comparandExprs)+len(right.comparandExprs)) | |||
comparandExprs = append(comparandExprs, left.comparandExprs...) | |||
comparandExprs = append(comparandExprs, right.comparandExprs...) | |||
nullmask := make([]bool, 0, len(left.nullmask)+len(right.nullmask)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nullMask := append(left.nullmask, right.nullmaks...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could do the same for comparandExprs
it would appear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a little sketchy / hasn't it bitten us in the past?
Like, if the new slice shares storage with the old slice, and someone goes and mutates it? (Not saying that's necessarily a concern here...)
sql/plan/indexed_table_access.go
Outdated
if key[i] == nil && lb.matchesNullMask[i] { | ||
ib = ib.IsNull(ctx, iexprs[i]) | ||
} else { | ||
ib = ib.Equals(ctx, iexprs[i], key[i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want to push this logic into the analyzer (it's a bit jank how we assume an equality relationship here when that's not part of the contract of a lookup).
Add a note for now, and make sure there's a test that we don't try to do this for e.g. join t2 on a > b
or something. We might already have that test, but I'm not sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm certain the analyzer doesn't build these except for equality conditions. But yes, letting the analyzer fully own and have reasonable ways to express the row -> index builder relationship here is the right direction.
Fix joins and indexed joins using <=>. Ensure that joins using = do not scan NULL values in the index.