Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply IndexJoins with CrossJoin children #913

Merged
merged 18 commits into from
Apr 1, 2022

Conversation

max-hoffman
Copy link
Contributor

No description provided.

@max-hoffman
Copy link
Contributor Author

Concerns about right join commutativity in this ticket -- #916. When i fix the plans, query tests break.

@max-hoffman max-hoffman marked this pull request as ready for review March 26, 2022 18:46
@max-hoffman max-hoffman requested a review from zachmu March 26, 2022 18:46
Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes seem pretty inoffensive, but the test plans here seem to indicate that you are reversing the semantics of what LEFT and RIGHT mean in these joins. Or I don't understand it myself.

@reltuk might want to take a peek at this one too.

enginetest/enginetests.go Outdated Show resolved Hide resolved
{
Query: `SELECT a.* FROM one_pk a CROSS JOIN one_pk c RIGHT JOIN one_pk b ON b.pk = c.pk and b.pk = a.pk`,
ExpectedPlan: "Project(a.pk, a.c1, a.c2, a.c3, a.c4, a.c5)\n" +
" └─ LeftIndexedJoin((b.pk = c.pk) AND (b.pk = a.pk))\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would appear to reverse the meaning of the RIGHT JOIN. I don't mean the fact that it's a LeftIndexedJoin, I mean the fact that b is the secondary table here instead of the primary. b has to be the primary table, you can't do an index lookup on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct that our execution machinery breaks down with this plan. My understanding is the primary table should be the cross join, for a secondary lookup into an indexed table. I added a dozen more tests to verify our query join correctness, and disabled the optimization specifically for RIGHT_JOIN in the meantime. I am lacking some understanding of both index generation and the execution machinery for applying those to right joins, but otherwise this PR should work for LEFT_JOIN with CROSS_JOINs.

{
Query: `SELECT a.* FROM one_pk a CROSS JOIN one_pk b INNER JOIN one_pk c ON b.pk = c.pk LEFT JOIN one_pk d ON c.pk = d.pk`,
ExpectedPlan: "Project(a.pk, a.c1, a.c2, a.c3, a.c4, a.c5)\n" +
" └─ LeftIndexedJoin(c.pk = d.pk)\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I would expect d to be the primary table in this join node. I could be confused about how this works, I guess. A good way to find out would be to write a test in MySQL where the join condition fails for a bunch of rows in the secondary tables and see what gets included.

Query: `select a.pk, c.v2 from one_pk_three_idx a cross join one_pk_three_idx b right join one_pk_three_idx c on b.pk = c.v1 where b.pk = 0 and c.v2 = 0;`,
ExpectedPlan: "Project(a.pk, c.v2)\n" +
" └─ Filter(c.v2 = 0)\n" +
" └─ LeftIndexedJoin(b.pk = c.v1)\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here

sql/analyzer/indexed_joins.go Show resolved Hide resolved
sql/analyzer/join_search.go Outdated Show resolved Hide resolved
@max-hoffman max-hoffman requested a review from zachmu March 30, 2022 19:37
Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works. Am I missing something?

enginetest/memory_engine_test.go Outdated Show resolved Hide resolved
enginetest/queries.go Outdated Show resolved Hide resolved
" └─ Filter(NOT((convert(othertable.s2, signed) = 0)))\n" +
" └─ IndexedTableAccess(othertable on [othertable.i2])\n" +
" └─ Filter(ot.i2 > 0)\n" +
" └─ LeftIndexedJoin(sub.i = ot.i2)\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks incorrect to me. ot needs to be the primary table in this join or the semantics are incorrect. Right now this includes every row from the subquery and returns matching rows from ot, and it should be the opposite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the correctness tests not thorough enough? I would need to spend a few more days taking notes on all of our join code to give a good answer.

" └─ SubqueryAlias(othertable)\n" +
" └─ Projected table access on [s2 i2]\n" +
" └─ Table(othertable)\n" +
" └─ LeftIndexedJoin(othertable.i2 = mytable.i)\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem here

@max-hoffman max-hoffman merged commit b5db32c into main Apr 1, 2022
@max-hoffman max-hoffman deleted the max/index-nested-cross-joins branch April 1, 2022 14:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants