WIP: Add tokomak optimizer #441

Closed
Dandandan wants to merge 14 commits

Dandandan
Contributor

@Dandandan Dandandan commented May 29, 2021

Which issue does this PR close?

Closes #440

TODO:

  • Decide whether to make this (and the egg dependency) an optional (off by default) feature.
  • Add some more tests
  • Add support for more expressions to support more queries:
    • dates
    • in
    • between
    • aggregations
    • cast
  • Some cleanup

Rationale for this change

See #440 for a longer explanation.
I believe using an optimization library is a great way to accelerate development and increase the number of optimizations possible.

Also, the more optimizations we add, the more we end up building our own (probably worse) version of an optimization framework.

What changes are included in this PR?

Adds a new optimizer, Tokomak, that performs expression simplification based on rewrite rules.
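To give a rough idea of what "expression simplification based on rewrite rules" means, here is a minimal, standard-library-only sketch. It is not the code from this PR: the PR builds on the egg library, whereas this sketch applies a handful of rules destructively, bottom-up, until none matches. The `Expr` type, the rule set, and the `col`, `apply_rule`, and `simplify` names are all hypothetical and chosen for illustration only.

```rust
// Illustrative sketch only: a hypothetical mini expression type,
// not DataFusion's `Expr` and not the egg-based code in this PR.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(String),
    Int(i64),
    Bool(bool),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Hypothetical helper to build a column reference.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// Try every rewrite rule against the root of `e`.
/// Returns the rewritten expression if a rule matches, `None` otherwise.
fn apply_rule(e: &Expr) -> Option<Expr> {
    use Expr::*;
    match e {
        // x + 0 => x and 0 + x => x
        Add(x, y) if **y == Int(0) => Some((**x).clone()),
        Add(x, y) if **x == Int(0) => Some((**y).clone()),
        // x * 1 => x and 1 * x => x
        Mul(x, y) if **y == Int(1) => Some((**x).clone()),
        Mul(x, y) if **x == Int(1) => Some((**y).clone()),
        // x AND true => x and true AND x => x
        And(x, y) if **y == Bool(true) => Some((**x).clone()),
        And(x, y) if **x == Bool(true) => Some((**y).clone()),
        // x AND false => false and false AND x => false
        And(_, y) if **y == Bool(false) => Some(Bool(false)),
        And(x, _) if **x == Bool(false) => Some(Bool(false)),
        // Constant folding for integer literals (skipped on overflow).
        Add(x, y) => match (&**x, &**y) {
            (Int(a), Int(b)) => a.checked_add(*b).map(Int),
            _ => None,
        },
        Mul(x, y) => match (&**x, &**y) {
            (Int(a), Int(b)) => a.checked_mul(*b).map(Int),
            _ => None,
        },
        _ => None,
    }
}

/// Simplify the children first, then keep rewriting the root until no
/// rule applies. Every rule above shrinks the expression, so this terminates.
fn simplify(e: &Expr) -> Expr {
    use Expr::*;
    let rebuilt = match e {
        Add(x, y) => Add(Box::new(simplify(x)), Box::new(simplify(y))),
        Mul(x, y) => Mul(Box::new(simplify(x)), Box::new(simplify(y))),
        And(x, y) => And(Box::new(simplify(x)), Box::new(simplify(y))),
        leaf => leaf.clone(),
    };
    match apply_rule(&rebuilt) {
        Some(next) => simplify(&next),
        None => rebuilt,
    }
}
```

With these definitions, calling `simplify` on the expression `(a + 0) * 1` yields the bare column `a`. A rule library takes over the matching and rule-application machinery shown here, so that adding an optimization amounts to declaring a new rule rather than writing another traversal.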

Are there any user-facing changes?

@Dandandan Dandandan marked this pull request as draft May 29, 2021 20:51
@codecov-commenter

codecov-commenter commented May 29, 2021

Codecov Report

Merging #441 (cad2aae) into master (c9ed34c) will decrease coverage by 0.02%.
The diff coverage is 70.78%.

@@            Coverage Diff             @@
##           master     #441      +/-   ##
==========================================
- Coverage   75.28%   75.26%   -0.03%     
==========================================
  Files         147      153       +6     
  Lines       24868    25542     +674     
==========================================
+ Hits        18723    19223     +500     
- Misses       6145     6319     +174     
| Impacted Files | Coverage Δ |
|---|---|
| datafusion/src/execution/context.rs | 92.08% <ø> (ø) |
| datafusion/src/optimizer/simplification.rs | 70.78% <70.78%> (ø) |
| datafusion/tests/sql.rs | 99.89% <0.00%> (ø) |
| datafusion-cli/src/main.rs | 0.00% <0.00%> (ø) |
| ballista/rust/core/src/utils.rs | 0.00% <0.00%> (ø) |
| datafusion/src/datasource/csv.rs | 72.81% <0.00%> (ø) |
| ballista/rust/executor/src/main.rs | 0.00% <0.00%> (ø) |
| datafusion/src/physical_plan/mod.rs | 78.70% <0.00%> (ø) |
| ballista/rust/executor/src/execution_loop.rs | 0.00% <0.00%> (ø) |
| ballista/rust/executor/src/flight_service.rs | 0.00% <0.00%> (ø) |
| ... and 15 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c9ed34c...cad2aae.

@alamb alamb added the datafusion Changes in the datafusion crate label Jun 4, 2021
@alamb alamb added the stale-pr label Jul 13, 2021
@alamb
Contributor

alamb commented Jul 13, 2021

@Dandandan do you still plan to work on this PR?

@houqp
Member

houqp commented Jul 14, 2021

@Dandandan let me know if you need help with some of the todo list items, happy to help push this PR forward.

@alamb
Contributor

alamb commented Aug 20, 2021

Closing stale PRs to keep the PR review list manageable. Please reopen if that is a mistake.

Successfully merging this pull request may close these issues.

Future of experimental optimizer datafusion-tokomak
4 participants