Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge join inclusion correctness #1553

Merged
merged 6 commits into from
Jan 26, 2023
Merged

Conversation

max-hoffman
Copy link
Contributor

@max-hoffman max-hoffman commented Jan 24, 2023

Merge join was not principled selecting 1) monotonic filters and 2) sorting indexes that aligned with the filter of choice. We also failed to push filters referencing a single table out of join conditions.

@max-hoffman max-hoffman changed the title Merge join filter cannot reference one table Merge join inclusion correctness Jan 25, 2023
Copy link
Contributor

@fulghum fulghum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code generally looks good. I just had a couple suggestions and a couple questions to make sure I understood how the code is supposed to work.

enginetest/queries/query_plans.go Outdated Show resolved Hide resolved
joinColExprs []*joinColExpr,
tableAliases TableAliases,
) ([]sql.Expression, []bool) {
func indexMatchesKeyExprs(i sql.Index, joinColExprs []*joinColExpr, tableAliases TableAliases) ([]sql.Expression, []bool, sql.Expression) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be super helpful to future maintainers to update the function docs to explain what the new/third return value is.

Comment on lines 757 to 758
case expression.Comparer, *expression.Literal, *expression.GetField, *expression.And,
*expression.Tuple, *expression.IsNull, *expression.BindVar:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) Are there other edge cases here that are worth handling or noting? Would something like "GetField(x) + Literal(-1)" cause a problem?

Copy link
Contributor Author

@max-hoffman max-hoffman Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For that example, as long as the get fields are not negated, the value of the expression will be strictly increasing (monotonic). So a big negative literal in the expression value will still increase as we move along the index, as long as we are always adding x. On closer inspection, I should maybe take out the InTuple,Regex, and maybe LessThan Comparer types, which might decrease as x increases.

}
valid := true
for i, c := range cols {
if strings.ToLower(idxExprs[i]) != source+"."+c {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the index expressions have to match cols without any gaps? For example... if we need columns a,b,c, could we still use an index on a,b,b2,c?

@@ -432,24 +434,56 @@ func collectJoinConds(attributeSource string, filters ...sql.Expression) []*join
return conds
}

func collectMergeJoinConds(la, ra string, filters ...sql.Expression) ([]*joinColExpr, []*joinColExpr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't seem to be called from anywhere. Is this just dead code?

@max-hoffman max-hoffman merged commit 7acc8e1 into main Jan 26, 2023
@max-hoffman max-hoffman deleted the max/merge-join-one-sided-filter branch January 26, 2023 01:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants