-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidate Filter::remove_aliases
into Expr::unalias_nested
#11001
Conversation
Expr::Alias(alias) => alias.expr.unalias_nested(), | ||
_ => self, | ||
} | ||
pub fn unalias_nested(self) -> Transformed<Expr> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an API change (it returns Transformed rather than just Expr) but I think that makes more sense
datafusion/expr/src/expr.rs
Outdated
Expr::Alias(Alias { expr, .. }) => { | ||
// directly recurse into the alias expression | ||
// as we just modified the tree structure | ||
Ok(Transformed::yes(expr.unalias_nested().data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to directly recurse into the aliased expression? Why don't just retrun Transformed::yes(*expr)
(that implies TreeNodeRecursion::Continue
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what my first version did -- and it turns out that when Transformed::yes(*expr)
is returned the recursion doesn't traverse into the *expr (so nested aliases like (x as foo) as bar
are only rewritten to (x as bar)
I tried to convince myself this was a bug in tree node, but think the current behavior makes sense as the newly transformed node wasn't (strictly speaking) part of the original tree and thus automatically traversing down into it may not make senes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this is because of how transform_down()
works when you replace the current alias to its child that is also an alias.
In the case of (x as foo) as bar
when we return Transformed::yes(*expr)
then ... as bar
gets replaced to x as foo
(so x as foo
becomes the current node) and then recursion continues into x
(the current node's children).
So currently with the expr.unalias_nested()
call and the TreeNodeRecursion::Continue
there is actually a double recursion.
I think transform_up()
should be ok, but then you can't skip on subqueries.
So I think we have 2 options here:
- Use
transform_down()
but then you should returnOk(Transformed::new(expr.unalias_nested(), true, TreeNodeRecursion::Jump))
to avoid double recursion. - Or use
transform_down_up()
and inf_down
handle the subqueries and inf_up
handle alias with returningOk(Transformed::yes(*expr))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated my comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the code to use transform_down_up
-- I think it looks nice now
@comphead I wonder if you have time to re-review this PR? |
// f_down: skip subqueries. Check in f_down to avoid recursing into them | ||
let recursion = if matches!( | ||
expr, | ||
Expr::Exists { .. } | Expr::ScalarSubquery(_) | Expr::InSubquery(_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @alamb
I'm not sure if I read this line correctly, I just checked some queries with duckdb and it doesn't allow setting the alias for Exists or In Subqueries
D with t as (select 1 a) select 1 where exists (select * from t) as x;
Error: Parser Error: syntax error at or near "as"
LINE 1: ...elect 1 where exists (select * from t) as x;
^
D select 1 where 1 in (1, 2) as x;
Error: Parser Error: syntax error at or near "as"
LINE 1: select 1 where 1 in (1, 2) as x;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think aliases can be set for such plans via SQL -- I think they are (or were) sometimes set programatically (e.g. via some other optimizer pass or rewrite)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks @alamb there is some comment, I'm not sure how is it relevant though
Thanks again @peter-toth and @comphead |
…che#11001) * Consolidate Expr::unalias_nested * fix doc test * Clarify unreachable code * Use transform_down_up to handle recursion
Which issue does this PR close?
Follow on to #10835
Rationale for this change
As part of #10835, @peter-toth noted that the nested unaliasing code in
FilterExec
probably belonged inExrp
to make it more discovrable: https://github.com/apache/datafusion/pull/10835/files#r1644664005It turns out when I looked into doing so, there was already an existing implementation that could be used
What changes are included in this PR?
Filter::remove_aliases
and fold its implementation intoExpr::unalias_nested
Are these changes tested?
They are covered by existing tests / CI
Are there any user-facing changes?