Filter push down for Union #559

Merged
merged 3 commits into from
Jun 14, 2021


Dandandan
Contributor

Which issue does this PR close?

Closes #557

Rationale for this change

What changes are included in this PR?

Filters are now pushed down through Union (UNION ALL) nodes, so they can be pushed further towards other operations such as table scans.
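
To illustrate the idea, here is a minimal sketch of pushing a filter through a union. It uses a toy `Plan` enum with string predicates, not DataFusion's actual `LogicalPlan` or the code in `filter_push_down.rs`; the names `Plan`, `scan`, and `push_down` are hypothetical, and the sketch assumes every branch exposes the same column names and that every scan can absorb a predicate.

```rust
// Toy logical plan for illustration only; this is NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String, filters: Vec<String> },
    Filter { predicate: String, input: Box<Plan> },
    Union { inputs: Vec<Plan> },
}

// Hypothetical helper: a scan with no filters attached yet.
fn scan(table: &str) -> Plan {
    Plan::Scan { table: table.to_string(), filters: Vec::new() }
}

// Push every `Filter` as far down as this toy model allows.
fn push_down(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { predicate, input } => match *input {
            // Filter over Union: copy the predicate into each branch and
            // keep pushing inside that branch.
            Plan::Union { inputs } => Plan::Union {
                inputs: inputs
                    .into_iter()
                    .map(|branch| {
                        push_down(Plan::Filter {
                            predicate: predicate.clone(),
                            input: Box::new(branch),
                        })
                    })
                    .collect(),
            },
            // Filter over Scan: the scan absorbs the predicate (assumption:
            // every scan in this toy model accepts filters).
            Plan::Scan { table, mut filters } => {
                filters.push(predicate);
                Plan::Scan { table, filters }
            }
            // Filter over Filter: push the inner filter first, then retry.
            nested @ Plan::Filter { .. } => push_down(Plan::Filter {
                predicate,
                input: Box::new(push_down(nested)),
            }),
        },
        // No filter at this level: just optimize the children.
        Plan::Union { inputs } => Plan::Union {
            inputs: inputs.into_iter().map(push_down).collect(),
        },
        leaf @ Plan::Scan { .. } => leaf,
    }
}
```

Calling `push_down` on a `Filter` placed above a `Union` of two scans returns a `Union` whose scans each carry the predicate. The rewrite is safe for UNION ALL because the predicate is evaluated row by row, so filtering before or after concatenating the branches keeps the same rows.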

Are there any user-facing changes?

FYI @nevi-me

@Dandandan changed the title from "Push down filter through UNION" to "Filter push down for Union" on Jun 14, 2021
@Dandandan requested a review from nevi-me on June 14, 2021 06:54
@codecov-commenter

Codecov Report

Merging #559 (1385342) into master (d382854) will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #559   +/-   ##
=======================================
  Coverage   76.09%   76.10%           
=======================================
  Files         156      156           
  Lines       27047    27056    +9     
=======================================
+ Hits        20581    20590    +9     
  Misses       6466     6466           
| Impacted Files | Coverage Δ | |
|---|---|---|
| datafusion/src/optimizer/filter_push_down.rs | 97.84% <100.00%> (+0.04%) | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@nevi-me left a comment


I'm seeing:

Projection: #COUNT(UInt8(1)) AS total_records, #COUNT(DISTINCT payment_type) AS total_payment_types, #SUM(CAST(trip_distance AS Float64)) AS total_distance
  Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1)), COUNT(DISTINCT #payment_type), SUM(CAST(#trip_distance AS Float64))]]
    Union
      Projection: #passenger_count, #trip_distance, #payment_type, #total_amount
        TableScan: mongo_nyc projection=Some([3, 4, 9, 16]), filters=[#passenger_count Gt Int64(3), #total_amount Lt Float64(20)]
      Projection: #passenger_count, #trip_distance, #payment_type, #total_amount
        Filter: #passenger_count Gt Int64(3) And #total_amount Lt Float64(20)
          TableScan: csv_nyc projection=Some([3, 4, 9, 16])

Thanks @Dandandan
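
In the plan from the review comment above, both union branches receive the same two predicates, but one branch shows them inside the `TableScan` while the other keeps a separate `Filter` node. One plausible reading, which the comment does not state explicitly, is that only one of the two table sources accepts pushed-down filters. The sketch below models that assumption with a toy `Node` type; `Node`, `place_filters`, `render`, and the `accepts_filters` flag are hypothetical and are not DataFusion APIs.

```rust
// Toy plan nodes for illustration only; these are NOT DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Scan { table: String, filters: Vec<String> },
    Filter { predicates: Vec<String>, input: Box<Node> },
}

// Decide where the predicates end up for a single union branch.
// `accepts_filters` is an assumed per-source capability flag.
fn place_filters(table: &str, accepts_filters: bool, predicates: Vec<String>) -> Node {
    if accepts_filters {
        // The source can evaluate the predicates itself: attach them to the scan.
        Node::Scan { table: table.to_string(), filters: predicates }
    } else {
        // The source cannot: keep a Filter node directly above a plain scan.
        Node::Filter {
            predicates,
            input: Box::new(Node::Scan { table: table.to_string(), filters: Vec::new() }),
        }
    }
}

// Render a node as indented text, loosely following the plan format above.
fn render(node: &Node, depth: usize) -> String {
    let pad = "  ".repeat(depth);
    match node {
        Node::Scan { table, filters } if filters.is_empty() => {
            format!("{}TableScan: {}", pad, table)
        }
        Node::Scan { table, filters } => {
            format!("{}TableScan: {}, filters=[{}]", pad, table, filters.join(", "))
        }
        Node::Filter { predicates, input } => format!(
            "{}Filter: {}\n{}",
            pad,
            predicates.join(" And "),
            render(input, depth + 1)
        ),
    }
}
```

With `accepts_filters` set to `true` for `mongo_nyc` and `false` for `csv_nyc`, `render` yields a scan line carrying the filters for the first and a `Filter` line above a bare scan for the second, which matches the shape of the two branches under `Union`.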

@Dandandan merged commit 396a50b into apache:master on Jun 14, 2021
@houqp added the labels datafusion (Changes in the datafusion crate) and enhancement (New feature or request) on Jul 30, 2021
Successfully merging this pull request may close these issues.

Filters aren't passed down to table scans in a union