Add rule to eliminate LIMIT 0 and replace it with an EmptyRelation #213

Merged 3 commits on Apr 29, 2021

Dandandan
Contributor

Which issue does this PR close?

Closes #206

Rationale for this change

LIMIT 0 can be used to test a query or to get the schema of a query, and a limit of 0 can also result from constant folding. This rule replaces the LIMIT 0 with an empty plan. That saves time in the physical planner (no need to read metadata, statistics, partitions, etc.), and there is no need to produce even a single batch.

What changes are included in this PR?

A new optimization pass to replace LIMIT 0 in the LogicalPlan with an EmptyRelation, with the same schema.
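
As a rough illustration of the rewrite, here is a minimal sketch on a toy plan type. It is not the code in this PR: `Plan`, its `schema` method, and this `eliminate_limit` function are hypothetical stand-ins for DataFusion's `LogicalPlan` and the new rule, and the schema is simplified to a list of column names.

```rust
/// Toy logical plan (an assumption for illustration; NOT DataFusion's `LogicalPlan`).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Scan of a named table exposing the given column names.
    Scan { table: String, schema: Vec<String> },
    /// Keep at most `n` rows of the input.
    Limit { n: usize, input: Box<Plan> },
    /// Keep only the named columns of the input.
    Projection { columns: Vec<String>, input: Box<Plan> },
    /// Produces no rows but still carries a schema.
    Empty { schema: Vec<String> },
}

impl Plan {
    /// Output column names of this plan node.
    fn schema(&self) -> Vec<String> {
        match self {
            Plan::Scan { schema, .. } | Plan::Empty { schema } => schema.clone(),
            Plan::Projection { columns, .. } => columns.clone(),
            // A limit does not change the shape of its input.
            Plan::Limit { input, .. } => input.schema(),
        }
    }
}

/// Replace every `Limit { n: 0, .. }` with an `Empty` node of the same schema.
fn eliminate_limit(plan: Plan) -> Plan {
    match plan {
        // LIMIT 0: the whole subtree can be dropped; only its schema is kept.
        Plan::Limit { n: 0, input } => Plan::Empty {
            schema: input.schema(),
        },
        // Any other limit stays, but its input is still optimized.
        Plan::Limit { n, input } => Plan::Limit {
            n,
            input: Box::new(eliminate_limit(*input)),
        },
        Plan::Projection { columns, input } => Plan::Projection {
            columns,
            input: Box::new(eliminate_limit(*input)),
        },
        // Leaves (Scan, Empty) have nothing to rewrite.
        leaf => leaf,
    }
}
```

Because the `Empty` node keeps the schema of the subtree it replaces, nodes above it (the projection in this toy model) still see the same columns, while the scan underneath is never planned.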

Are there any user-facing changes?

No breaking changes; some queries should just get faster.

@Dandandan changed the title from "Add rule to eliminate limit 0 and replace it with an empty relation." to "Add rule to eliminate LIMIT 0 and replace it with an EmptyRelation" on Apr 27, 2021
Contributor

@returnString left a comment

This looks great! I know lots of SQL client libs issue queries of the form select * from table limit 0 to ascertain result set descriptions, so this is definitely a worthwhile optimisation to avoid data source fetches, even just in that instance 👍

datafusion/src/optimizer/eliminate_limit.rs (outdated review thread, resolved)
@Dandandan
Contributor Author

> This looks great! I know lots of SQL client libs issue queries of the form select * from table limit 0 to ascertain result set descriptions, so this is definitely a worthwhile optimisation to avoid data source fetches, even just in that instance 👍

@returnString Cool, that was one of the use cases I was thinking about! Let me know if you come across other queries that could benefit from a similar rule.

@codecov-commenter

Codecov Report

Merging #213 (42ae033) into master (e86ad26) will increase coverage by 0.01%.
The diff coverage is 88.88%.

❗ Current head 42ae033 differs from pull request most recent head c165e42. Consider uploading reports for the commit c165e42 to get more accurate results

@@            Coverage Diff             @@
##           master     #213      +/-   ##
==========================================
+ Coverage   76.30%   76.32%   +0.01%     
==========================================
  Files         134      135       +1     
  Lines       23170    23206      +36     
==========================================
+ Hits        17681    17713      +32     
- Misses       5489     5493       +4     
| Impacted Files | Coverage Δ |
| --- | --- |
| datafusion/src/execution/context.rs | 92.94% <ø> (ø) |
| datafusion/src/optimizer/eliminate_limit.rs | 88.88% <88.88%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e86ad26...c165e42.

@andygrove merged commit aa033db into apache:master on Apr 29, 2021
@houqp added the datafusion (Changes in the datafusion crate) and enhancement (New feature or request) labels on Jul 29, 2021
Successfully merging this pull request may close these issues.

Replace LIMIT 0 with EmptyRelation`