Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate Filter::remove_aliases into Expr::unalias_nested #11001

Merged
merged 10 commits into from
Jul 2, 2024
47 changes: 34 additions & 13 deletions datafusion/expr/src/expr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1198,32 +1198,53 @@ impl Expr {
}
}

/// Recursively potentially multiple aliases from an expression.
/// Recursively removed potentially multiple aliases from an expression.
///
/// If the expression is not an alias, the expression is returned unchanged.
/// This method removes directly nested aliases, but not other nested
/// aliases.
/// This method removes nested aliases and returns [`Transformed`]
/// to signal if the expression was changed.
///
/// # Example
/// ```
/// # use datafusion_expr::col;
/// // `foo as "bar"` is unaliased to `foo`
/// let expr = col("foo").alias("bar");
/// assert_eq!(expr.unalias_nested(), col("foo"));
/// assert_eq!(expr.unalias_nested().data, col("foo"));
///
/// // `foo as "bar" + baz` is not unaliased
/// // `foo as "bar" + baz` is unaliased
/// let expr = col("foo").alias("bar") + col("baz");
/// assert_eq!(expr.clone().unalias_nested(), expr);
/// assert_eq!(expr.clone().unalias_nested().data, col("foo") + col("baz"));
///
/// // `foo as "bar" as "baz" is unalaised to foo
/// let expr = col("foo").alias("bar").alias("baz");
/// assert_eq!(expr.unalias_nested(), col("foo"));
/// assert_eq!(expr.unalias_nested().data, col("foo"));
/// ```
pub fn unalias_nested(self) -> Expr {
match self {
Expr::Alias(alias) => alias.expr.unalias_nested(),
_ => self,
}
pub fn unalias_nested(self) -> Transformed<Expr> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an API change (it returns Transformed rather than just Expr) but I think that makes more sense

self.transform_down_up(
|expr| {
// f_down: skip subqueries. Check in f_down to avoid recursing into them
let recursion = if matches!(
expr,
Expr::Exists { .. } | Expr::ScalarSubquery(_) | Expr::InSubquery(_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @alamb
I'm not sure if I read this line correctly, I just checked some queries with duckdb and it doesn't allow setting the alias for Exists or In Subqueries

D with t as (select 1 a) select 1 where exists (select * from t) as x;
Error: Parser Error: syntax error at or near "as"
LINE 1: ...elect 1 where exists (select * from t) as x;
                                                  ^
D select 1 where 1 in (1, 2) as x;
Error: Parser Error: syntax error at or near "as"
LINE 1: select 1 where 1 in (1, 2) as x;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think aliases can be set for such plans via SQL -- I think they are (or were) sometimes set programatically (e.g. via some other optimizer pass or rewrite)

) {
// subqueries could contain aliases so don't recurse into those
TreeNodeRecursion::Jump
} else {
TreeNodeRecursion::Continue
};
Ok(Transformed::new(expr, false, recursion))
},
|expr| {
// f_up: unalias on up so we can remove nested aliases like
// `(x as foo) as bar`
if let Expr::Alias(Alias { expr, .. }) = expr {
Ok(Transformed::yes(*expr))
} else {
Ok(Transformed::no(expr))
}
},
)
// unreachable code: internal closure doesn't return err
.unwrap()
}

/// Return `self IN <list>` if `negated` is false, otherwise
Expand Down
35 changes: 1 addition & 34 deletions datafusion/expr/src/logical_plan/plan.rs
Original file line number Diff line number Diff line change
Expand Up @@ -878,8 +878,7 @@ impl LogicalPlan {
}
LogicalPlan::Filter { .. } => {
assert_eq!(1, expr.len());
let predicate = expr.pop().unwrap();
let predicate = Filter::remove_aliases(predicate)?.data;
let predicate = expr.pop().unwrap().unalias_nested().data;

Filter::try_new(predicate, Arc::new(inputs.swap_remove(0)))
.map(LogicalPlan::Filter)
Expand Down Expand Up @@ -2209,38 +2208,6 @@ impl Filter {
}
false
}

/// Remove aliases from a predicate for use in a `Filter`
///
/// filter predicates should not contain aliased expressions so we remove
/// any aliases.
///
/// before this logic was added we would have aliases within filters such as
/// for benchmark q6:
///
/// ```sql
/// lineitem.l_shipdate >= Date32(\"8766\")
/// AND lineitem.l_shipdate < Date32(\"9131\")
/// AND CAST(lineitem.l_discount AS Decimal128(30, 15)) AS lineitem.l_discount >=
/// Decimal128(Some(49999999999999),30,15)
/// AND CAST(lineitem.l_discount AS Decimal128(30, 15)) AS lineitem.l_discount <=
/// Decimal128(Some(69999999999999),30,15)
/// AND lineitem.l_quantity < Decimal128(Some(2400),15,2)
/// ```
pub fn remove_aliases(predicate: Expr) -> Result<Transformed<Expr>> {
predicate.transform_down(|expr| {
match expr {
Expr::Exists { .. } | Expr::ScalarSubquery(_) | Expr::InSubquery(_) => {
// subqueries could contain aliases so we don't recurse into those
Ok(Transformed::new(expr, false, TreeNodeRecursion::Jump))
}
Expr::Alias(Alias { expr, .. }) => {
Ok(Transformed::new(*expr, true, TreeNodeRecursion::Jump))
}
_ => Ok(Transformed::no(expr)),
}
})
}
}

/// Window its input based on a set of window spec and window function (e.g. SUM or RANK)
Expand Down
9 changes: 6 additions & 3 deletions datafusion/optimizer/src/common_subexpr_eliminate.rs
Original file line number Diff line number Diff line change
Expand Up @@ -345,9 +345,12 @@ impl CommonSubexprEliminate {
self.try_unary_plan(expr, input, config)?
.transform_data(|(mut new_expr, new_input)| {
assert_eq!(new_expr.len(), 1); // passed in vec![predicate]
let new_predicate = new_expr.pop().unwrap();
Ok(Filter::remove_aliases(new_predicate)?
.update_data(|new_predicate| (new_predicate, new_input)))
let new_predicate = new_expr
.pop()
.unwrap()
.unalias_nested()
.update_data(|new_predicate| (new_predicate, new_input));
Ok(new_predicate)
})?
.map_data(|(new_predicate, new_input)| {
Filter::try_new(new_predicate, Arc::new(new_input))
Expand Down
2 changes: 1 addition & 1 deletion datafusion/optimizer/src/optimize_projections/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -600,7 +600,7 @@ fn rewrite_expr(expr: Expr, input: &Projection) -> Result<Transformed<Expr>> {
// * the current column is an expression "f"
//
// return the expression `d + e` (not `d + e` as f)
let input_expr = input.expr[idx].clone().unalias_nested();
let input_expr = input.expr[idx].clone().unalias_nested().data;
Ok(Transformed::yes(input_expr))
}
// Unsupported type for consecutive projection merge analysis.
Expand Down