Consider inequality joins #576

richox · 2024-09-18T09:32:48Z

Is your feature request related to a problem? Please describe.
Currently we do not support inequality joins, in either hash join or sort-merge join. This feature is hard to implement because DataFusion has no direct support for row-based evaluation.

Describe the solution you'd like
Describe alternatives you've considered

Simulate row-based evaluation with one-row columnar evaluation, which performs very poorly in practice. This method may work when the equality predicate has already filtered away most records, but if the equality predicate has no effect (as in TPC-DS q72), the query will hang.
Support a limited form of row-based filtering in DataFusion. DataFusion already has some support, such as make_comparator, for building a row-based evaluator. We can extend it to cover more row-based evaluations, such as make_binary_op.
Fall back to Spark for the post-filter evaluation and use codegen to speed it up. We would also have to account for the fallback overhead.
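
To make the second alternative more concrete, here is a minimal sketch in plain Rust of a hash join whose inequality post-filter is evaluated per candidate row pair through a comparator built once per column pair. It uses only the standard library and is not DataFusion or Arrow code: `make_row_comparator` and `hash_join_with_lt_filter` are hypothetical names, the columns are assumed to be plain non-null slices, and the post-filter is fixed to `probe_val < build_val` for illustration.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Hypothetical stand-in for a comparator built once over two columns.
/// Given row indices (i, j), it compares left[i] with right[j] directly,
/// without materializing an intermediate one-row batch.
fn make_row_comparator<'a>(
    left: &'a [i64],
    right: &'a [i64],
) -> impl Fn(usize, usize) -> Ordering + 'a {
    move |i, j| left[i].cmp(&right[j])
}

/// Hash join on the equality keys, with the inequality post-filter
/// `probe_vals[i] < build_vals[j]` checked per candidate pair.
/// Returns the matching (probe_row, build_row) index pairs.
fn hash_join_with_lt_filter(
    probe_keys: &[i32],
    probe_vals: &[i64],
    build_keys: &[i32],
    build_vals: &[i64],
) -> Vec<(usize, usize)> {
    // Build phase: map each equality key to the build-side rows holding it.
    let mut table: HashMap<i32, Vec<usize>> = HashMap::new();
    for (j, key) in build_keys.iter().enumerate() {
        table.entry(*key).or_default().push(j);
    }

    // The comparator is constructed once, then invoked for every candidate pair.
    let cmp = make_row_comparator(probe_vals, build_vals);

    // Probe phase: look up equality matches, then apply the row-based filter.
    let mut out = Vec::new();
    for (i, key) in probe_keys.iter().enumerate() {
        if let Some(candidates) = table.get(key) {
            for &j in candidates {
                if cmp(i, j) == Ordering::Less {
                    out.push((i, j));
                }
            }
        }
    }
    out
}
```

The per-pair cost here is one closure call, whereas the first alternative would build and evaluate a one-row batch for each pair. When the equality key is not selective, the number of candidate pairs still grows with the product of the matching rows, so this sketch only reduces the cost per pair, not the number of pairs.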

Additional context

richox pinned this issue Sep 18, 2024

richox added the feature required Functionalities must have label Sep 19, 2024
