Add support for non-correlated subqueries #3266

Closed
andygrove opened this issue Aug 25, 2022 · 5 comments · Fixed by #3287
Labels
enhancement New feature or request sql SQL Planner

Comments

@andygrove
Member

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I would like to be able to run this query:

CREATE TABLE paintings AS SELECT 'Mona Lisa' as name, 1000000 as listed_price;

SELECT name, listed_price
FROM paintings
WHERE listed_price > (
    SELECT AVG(listed_price)
    FROM paintings
);

It currently fails with:

Skipping optimizer rule decorrelate_scalar_subquery due to unexpected error: scalar subqueries must have a filter to be correlated at /home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/decorrelate_scalar_subquery.rs:177
caused by
Error during planning: Could not coerce into Filter! at /home/andy/git/apache/arrow-datafusion/datafusion/expr/src/logical_plan/plan.rs:1127
NotImplemented("Physical plan does not support logical expression (<subquery>)")

This is because the existing subquery optimizer rules fail to rewrite the query.

It should be possible to rewrite the query as a join:

SELECT name, listed_price
FROM paintings
CROSS JOIN (SELECT AVG(listed_price) AS avg_price FROM paintings) temp
WHERE listed_price > temp.avg_price;
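
As a rough sanity check of that equivalence, here is a minimal sketch that runs both forms against SQLite through Python's standard `sqlite3` module. It says nothing about DataFusion's planner; SQLite is only a stand-in engine. The two extra rows and their prices are made up so the filter has something to exclude, and the join alias is renamed from `temp` to `t` because `TEMP` is a keyword in SQLite.

```python
import sqlite3

# In-memory database standing in for the query engine (assumption: SQLite, not DataFusion).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE paintings (name TEXT, listed_price INTEGER)")
conn.executemany(
    "INSERT INTO paintings VALUES (?, ?)",
    # 'Sketch A' and 'Sketch B' are hypothetical rows added for illustration.
    [("Mona Lisa", 1000000), ("Sketch A", 200), ("Sketch B", 300)],
)

# Original form: uncorrelated scalar subquery in the WHERE clause.
subquery_form = """
SELECT name, listed_price
FROM paintings
WHERE listed_price > (
    SELECT AVG(listed_price)
    FROM paintings
)
"""

# Proposed rewrite: cross join against the single-row aggregate.
join_form = """
SELECT name, listed_price
FROM paintings
CROSS JOIN (SELECT AVG(listed_price) AS avg_price FROM paintings) AS t
WHERE listed_price > t.avg_price
"""

subquery_rows = conn.execute(subquery_form).fetchall()
join_rows = conn.execute(join_form).fetchall()
print(subquery_rows == join_rows)
```

The rewrite is safe here because the aggregate subquery has no GROUP BY and so always yields exactly one row, which means the cross join neither drops nor duplicates rows from `paintings`.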

Describe the solution you'd like
☝️

Describe alternatives you've considered
None

Additional context
None

@andygrove andygrove added enhancement New feature or request sql SQL Planner labels Aug 25, 2022
@avantgardnerio
Contributor

FYI this should not be super-hard. I actually started adding uncorrelated support, and there is even an example of it here. CC @DaltonModlin

@kmitchener
Contributor

When this works, q15 in the TPCH benchmark should work as well.
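
For context, q15 filters on an uncorrelated scalar subquery over a view (`total_revenue = (select max(total_revenue) from revenue0)`), which is the same shape as the query in this issue. Below is a minimal sketch of that shape, run against SQLite via Python's `sqlite3` rather than DataFusion. The rows are made up, the tables are trimmed to a few columns, and `revenue0` is simplified (the real view also filters `lineitem` by ship date).

```python
import sqlite3

# Assumption: SQLite as a stand-in engine; schema trimmed to the columns used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (s_suppkey INTEGER, s_name TEXT);
CREATE TABLE lineitem (l_suppkey INTEGER, l_extendedprice REAL, l_discount REAL);
INSERT INTO supplier VALUES (1, 'Supplier#1'), (2, 'Supplier#2');
INSERT INTO lineitem VALUES (1, 100.0, 0.0), (1, 50.0, 0.0), (2, 120.0, 0.0);
CREATE VIEW revenue0 AS
    SELECT l_suppkey AS supplier_no,
           SUM(l_extendedprice * (1 - l_discount)) AS total_revenue
    FROM lineitem
    GROUP BY l_suppkey;
""")

# The q15 shape: join to the view, then keep only the rows that match
# the uncorrelated MAX() computed over the same view.
q15_shape = """
SELECT s_suppkey, s_name, total_revenue
FROM supplier, revenue0
WHERE s_suppkey = supplier_no
  AND total_revenue = (SELECT MAX(total_revenue) FROM revenue0)
ORDER BY s_suppkey
"""
top_suppliers = conn.execute(q15_shape).fetchall()

# q15 also drops the view once the query has run.
conn.execute("DROP VIEW revenue0")
print(top_suppliers)
```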

@DaltonModlin
Contributor

I'll take a look into this issue as I'm currently trying to fix q15 for TPCH benchmarks.

@avantgardnerio
Contributor

I think there are three things preventing q15 from working:

  1. projections in the view definition (now fixed by @DaltonModlin )
  2. a different view-related error @DaltonModlin is working on presently
  3. The uncorrelated issue we're chatting on right now

Items 2 and 3 could be worked on independently, I believe.

@kmitchener
Contributor

4 things! We need the PR for drop view #3267 to be merged in too for q15. :)

I can take a stab at this issue, but I wouldn't mind if someone came up with a competing patch. Plan optimization is totally new to me, so I'm not sure how successful I'll be or over what timeframe.
