-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Projections during Logical Plan #8340
Optimize Projections during Logical Plan #8340
Conversation
----------Projection: aggregate_test_100.c1 | ||
------------Filter: aggregate_test_100.c13 != Utf8("C2GT5KVyOPZpgKVl110TyZO0NcJ434") | ||
--------------TableScan: aggregate_test_100 projection=[c1, c13], partial_filters=[aggregate_test_100.c13 != Utf8("C2GT5KVyOPZpgKVl110TyZO0NcJ434")] | ||
------Projection: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plans seem to be all slightly better :)
}) | ||
.collect::<HashMap<_, _>>() | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we move these tests into optimize_projection.rs
🤔, so that we can delete the merge_projection.rs
file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan to do so in next PR. I didn't want to increase diff because of test movement as this is PR already big.
@@ -198,8 +199,9 @@ fn between_date64_plus_interval() -> Result<()> { | |||
let plan = test_sql(sql)?; | |||
let expected = | |||
"Aggregate: groupBy=[[]], aggr=[[COUNT(Int64(1))]]\ | |||
\n Filter: test.col_date64 >= Date64(\"890179200000\") AND test.col_date64 <= Date64(\"897955200000\")\ | |||
\n TableScan: test projection=[col_date64]"; | |||
\n Projection: \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems there is an additional empty projection here compared to before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this (empty) projection is fine, as the aggregate doesn't need any columns.
…ze_projections.rs` Fixes: apache#9978 The PR apache#8340 removed `push_down_projections.rs` in favour of `optimize_projections.rs`. This PR moves the tests as well. Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
…ze_projections.rs` Fixes: apache#9978 The PR apache#8340 removed `push_down_projections.rs` in favour of `optimize_projections.rs`. This PR moves the tests as well. Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
…ze_projections.rs` (#10071) * cleanup(tests): Move tests from `push_down_projections.rs` to `optimize_projections.rs` Fixes: #9978 The PR #8340 removed `push_down_projections.rs` in favour of `optimize_projections.rs`. This PR moves the tests as well. Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * remove the file `push_down_projection.rs` Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Fix method signatures that are broken by other PRs Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * remove `push_down_projections.rs` from `lib.rs` Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Which issue does this PR close?
Closes 8296.
Rationale for this change
What changes are included in this PR?
This PR removes unnecessary
Columns(Fields)
from theLogicalPlan
, when it is beneficial, by adding a projection that removes unnecessary columns from the plan.This feature is accomplished by
optimize_projections
LogicalPlan
rule.New
OptimizeProjections
rule covers the functionality ofMergeProjections
,PushDownProjection
andEliminateProjection
. Hence as part of this PR, these rules are removed also.With this PR
LogicalPlan
s that contain only absolutely necessary fields by subsequent operators (when doing so is desirable, or beneficial e.g most of the case)Consider
will not be merged to
The reason is that, in the first plan, computation
t.a/2
is calculated once, and used by subsequent projection twice.However, in the second plan computation
t.a/2
calculated twice. (This feature solves the problem in the issue.)Are these changes tested?
Yes
Are there any user-facing changes?