Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupBy normalization rule maintains parent Projection dependencies #749

Merged
merged 2 commits into from
Jan 26, 2022

Conversation

max-hoffman
Copy link
Contributor

We flatten GroupBy aggregations to isolation agg functions, but in the process lose nested column dependencies. The downstream qualifyColumns rule was erroring, but without passthrough projections binary expressions like Arithemtic(Sum(x.i), y.i) will fail at execution time without a full set of input dependencies.

For this query:`

select sum(x.i) + y.i from mytable as x, mytable as y where x.i = y.i GROUP BY x.i

We correctly identify that the GroupBy node has one primary aggregation, Sum, and project the Arithmetic separately.

GroupBy -> Sum(x.i) + y.i -> ... TableScan (x,y)
=> 
Project (Arithmetic(sum(x.i) + (y.i)))-> GroupBy(Sum(x.i)) -> ...

The Project node fails downstream trying to lookup the y.i dependency we discarded in the transform. This PR adds dependencies back to cover the new parent Project for GroupBy flattening.

GroupBy -> Sum(x.i) + y.i -> ... TableScan (x,y)
=> 
Project (Arithmetic(sum(x.i) + (y.i)))-> GroupBy(Sum(x.i), y.i) -> ...

Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -75,13 +75,18 @@ func flattenedGroupBy(ctx *sql.Context, projection, grouping []sql.Expression, c
func replaceAggregatesWithGetFieldProjections(ctx *sql.Context, projection []sql.Expression) (projections, aggregations []sql.Expression, err error) {
var newProjection = make([]sql.Expression, len(projection))
var newAggregates []sql.Expression

allGetFields := make(map[int]sql.Expression, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to declare size zero in map initializers like this.

Generally speaking the only time you want to explicitly size a map or slice is when you know ahead of time how big it will be.

@max-hoffman max-hoffman merged commit d2e4b47 into main Jan 26, 2022
@max-hoffman max-hoffman deleted the max/groupby-flatten-bug branch January 26, 2022 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants