-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Dialect requires derived table alias #12994
fix: Dialect requires derived table alias #12994
Conversation
* fix: Add Dialect option for requiring table aliases * feat: Add CustomDialectBuilder for requires_table_alias * docs: Spelling * refactor: rename requires_derived_table_alias * refactor: rename requires_derived_table_alias
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @peasee for working on this. I left some suggestions here.
expected: | ||
// top projection sort gets derived into a subquery | ||
// for MySQL, this subquery needs an alias | ||
"SELECT `j1_min` FROM (SELECT min(`ta`.`j1_id`) AS `j1_min`, min(`ta`.`j1_id`) FROM `j1` AS `ta` ORDER BY min(`ta`.`j1_id`) ASC) AS `derived_sort` LIMIT 10", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the test only covers the derived_sort
case but some cases aren't covered, such as derived_projection
, derived_limit
,...
Could you add more tests for them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added another test case for a more complex SQL that results in 2 nested derives for derived_sort
and derived_distinct
in the same query.
I'm not too sure what SQL to use to trigger derived_limit
or derived_projection
though... any ideas that I could include?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For derived_projection
, I found the SQL can trigger:
select j1_id from (select 1 as j1_id)
-> SELECT `j1_id` FROM (SELECT 1 AS `j1_id`) AS `derived_projection`
However, I guess it's not a valid SQL for MySQL, right? I tried it in DataFusion and DuckDB, they accept it but Postgres doesn't allow it. Maybe, it could be a DataFusion
dialect to MySQL
dialect case.
For derived_limit
, a similar case can trigger it:
select * from (select * from j1 limit 10)
-> SELECT * FROM (SELECT * FROM `j1` LIMIT 10) AS `derived_limit`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is valid for MySQL, I just tested it out in my CLI. I've added both as new cases, thanks!
By the way, I tried to pull your branch and run the tests on my laptop but the tests always fail. After merging with the latest main branch, the tests are passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good improvement for the derived table. 👍 It makes sense to me.
I noticed some databases don't support this kind of unnamed subquery, see #12896 (comment)
This PR would be helpful.
Maybe @phillipleblanc and @sgrebnov want to take a look at this PR. |
Looks good to me, thanks @goldmedal and @peasee |
@goldmedal perhaps you would like to merge this PR as a way to verify your permissions are configured correctly? |
Sure, I plan to merge this PR if no more comments tomorrow. |
Let's hold off on merging this for now. I think I've introduced a regression that I'd like to test more first before merging. |
Okay, looks like my PR didn't introduce the regression but there's a somewhat related bug currently on
This gets rewritten in the
Which is invalid, because the subquery is un-aliased. I tested it in DuckDB to confirm:
Let's merge my PR, and I can raise a new issue for this other bug @goldmedal ? |
It's valid for DataFusion
Generally,
How about
or
If it's the first one, I think it's an issue because it isn't valid. We can file an issue for it. |
My bad, I should've clarified better. It is the first one with
|
I see. I think it's an issue but we can fix it in the follow-up PR. Let's merge this PR first. |
|
Thanks @peasee and @phillipleblanc @alamb for reviewing. 👍 |
Which issue does this PR close?
Closes #12993.
Rationale for this change
Fixes a bug preventing derived logical plans from running in MySQL due to a requirement that every derived table must have an alias:
What changes are included in this PR?
Includes a dialect update to specify whether derived tables require aliases. For dialects that do, set a static alias. I've simply selected the name of the operation prefixed with
derived_
, likederived_sort
, etc.Are these changes tested?
Yes
Are there any user-facing changes?
No