Can particularly help in the Parquet scan/filter case to avoid string copies. See Andrew Lamb's talk from the June 24 Bay Area DataFusion Meetup for more info.
Should we implement native versions of RowToColumnar / ColumnarToRow?
Check that we are removing FilterExec when all filter conditions are successfully pushed down to Parquet, to avoid evaluating the filter predicates twice.
Implement tooling for saving the output of a query stage to disk so that we can benchmark individual query stages in Rust, outside of Spark.
Spark will remove dictionary encoding after a cast (and maybe after other expressions?), but it could be advantageous to retain the dictionary encoding so that upstream native operators can take advantage of it.
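To illustrate why retaining dictionary encoding across a cast could pay off, here is a minimal std-only sketch. It does not use the Arrow or Comet APIs: `DictColumn` and `cast_values` are hypothetical names, nulls are ignored, and the cast is assumed to be a deterministic per-value function. The point is only that the cast can run once per distinct dictionary value while the keys are carried through unchanged.

```rust
/// Hypothetical stand-in for a dictionary-encoded column: each row stores an
/// index (key) into `values` instead of holding its own copy of the value.
#[derive(Debug, Clone, PartialEq)]
struct DictColumn<T> {
    keys: Vec<usize>,
    values: Vec<T>,
}

impl<T: Clone> DictColumn<T> {
    /// Expand to one value per row, which is what dropping the dictionary
    /// encoding amounts to.
    fn decode(&self) -> Vec<T> {
        self.keys.iter().map(|&k| self.values[k].clone()).collect()
    }

    /// Apply a cast to the dictionary values only. The keys are reused as-is,
    /// so `cast` runs once per distinct value rather than once per row.
    fn cast_values<U>(&self, cast: impl FnMut(&T) -> U) -> DictColumn<U> {
        DictColumn {
            keys: self.keys.clone(),
            values: self.values.iter().map(cast).collect(),
        }
    }
}
```

For a column with five rows and two distinct values, `cast_values` invokes the cast twice and the result is still dictionary-encoded, so later operators can keep working on keys.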
Ideas no longer being pursued
Use JIT for evaluating nested expressions to avoid intermediate arrays (there was a datafusion-jit module, but it was abandoned, so we need to see why)
Update: previous experiments in Arrow/DataFusion did not show a speedup from this approach and it added a lot of extra code and complexity
Use mutable vectors during expression evaluation to avoid intermediate arrays (in-place updates are available in arrow-rs)
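The in-place idea can be sketched with the standard library alone. This is a model of the concept, not arrow-rs code: `add_scalar` is a hypothetical helper, and an `Arc<Vec<i64>>` stands in for a reference-counted Arrow buffer. The buffer is mutated in place only when it is uniquely owned; otherwise a new buffer is allocated. arrow-rs offers kernels in this spirit (for example `PrimitiveArray::unary_mut`), but check the crate documentation for the exact signatures.

```rust
use std::sync::Arc;

/// Add `delta` to every element of a shared buffer.
/// If this is the only reference, the existing allocation is updated in
/// place; if the buffer is shared, the values are copied into a new one.
fn add_scalar(buf: Arc<Vec<i64>>, delta: i64) -> Arc<Vec<i64>> {
    match Arc::try_unwrap(buf) {
        // Uniquely owned: mutate the existing data, no intermediate array.
        Ok(mut owned) => {
            for v in owned.iter_mut() {
                *v += delta;
            }
            Arc::new(owned)
        }
        // Still shared elsewhere: fall back to allocating a fresh buffer.
        Err(shared) => Arc::new(shared.iter().map(|v| v + delta).collect()),
    }
}
```

Chaining such calls on a uniquely owned buffer reuses one data allocation across a nested expression, while any buffer that is still referenced elsewhere is left untouched.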
What is the problem the feature request solves?
This epic is a place to track various ideas around improving query performance.
Some of these ideas apply to upstream DataFusion rather than being Comet-specific.
Planned Work
Ideas to Research
These are longer-term ideas to explore.
Related DataFusion issues
Accumulator trait to accept RecordBatch/num_rows to allow faster Count (datafusion#8067)
Related Comet issues:
Describe the potential solution
No response
Additional context
No response