feat: limit push down #8844
Comments
IIRC we don't have any "limit pushdown", only a "two-phase limit".
Oh, now I remember: we count on the early termination of limit to terminate the scan early. The scan probably did a read to fetch 1024 rows anyway in this case, and the UDF processed them all.
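To make the point above concrete, here is a minimal sketch of a chunk-at-a-time Scan -> Project -> Limit pipeline. It is not RisingWave code: `run_pipeline`, `toy_udf`, and the in-memory table are hypothetical names used only for illustration. It counts how often the UDF runs when the limit sits above the projection and can only stop the scan between chunks.

```rust
/// Hypothetical stand-in for a user-defined function: one output per input row.
fn toy_udf(x: i64) -> i64 {
    x + 1
}

/// Runs a toy `Scan -> Project(toy_udf) -> Limit` pipeline over `table`.
/// Returns the output rows and the number of UDF invocations.
/// `chunk_size` must be non-zero (`slice::chunks` panics on 0).
fn run_pipeline(table: &[i64], chunk_size: usize, limit: usize) -> (Vec<i64>, usize) {
    let mut udf_calls = 0usize;
    let mut output = Vec::new();

    // Scan: hands rows to the parent one chunk at a time.
    for chunk in table.chunks(chunk_size) {
        // Project: the UDF is evaluated on every row of the chunk,
        // because the projection does not know about the limit above it.
        let projected: Vec<i64> = chunk
            .iter()
            .map(|&x| {
                udf_calls += 1;
                toy_udf(x)
            })
            .collect();

        // Limit: keep rows until the limit is reached.
        for row in projected {
            if output.len() == limit {
                break;
            }
            output.push(row);
        }

        // Early termination: stop pulling further chunks from the scan.
        if output.len() >= limit {
            break;
        }
    }
    (output, udf_calls)
}
```

Under these assumptions, a limit of 1 over 1024-row chunks still evaluates the UDF on the whole first chunk, while a chunk size of 1 (or a limit applied below the projection) evaluates it once.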
So are you making a feature request to push the limit down to the lowest level? 🤔
Yes, not a bug but a feature request.
I think I've done similar things when optimizing risingwave/src/frontend/src/optimizer/rule/top_n_on_index_rule.rs (lines 64 to 66 in 83f1e9d).
Do you want more cases that set chunk_size for scan?
Wow, I didn't know we have that 👀. But I don't think limit pushdown is like that. I think we just need to blindly push the limit down until a leaf node 🤔
Oh I see, then IIUC it's pushing down until a leaf or a filter?
Oh, yes and no... By mentioning filter you also remind me of other cases, like join or agg.
I find the problem is trickier than it might seem... A solvable problem is that the project set generates multiple rows for each input row, e.g.,
A trickier case is that, say, one UDF generates 0 records for the first row in TABLE1 and 1 record for the second row in TABLE1; we cannot push the limit down to the scan, as then no records will be in the output. Generally speaking, we have to ensure the current op will generate at least 1 row per input row to push down a limit, which there is no existing method to check, and I think we may introduce some checks in
Therefore I find the use cases where we can push the limit down could be very limited. Could you please reconfirm this feature request?
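The two cases described above can be sketched with plain iterators. This is an illustrative model, not planner code: `limit_on_top`, `limit_pushed_down`, `sometimes_empty`, and `at_least_one` are hypothetical names, and the table is just a slice of integers.

```rust
/// Hypothetical set-returning function: 0 rows for input 1, 1 row otherwise.
fn sometimes_empty(r: i64) -> Vec<i64> {
    if r == 1 { vec![] } else { vec![r * 10] }
}

/// Hypothetical set-returning function: always 2 rows per input row.
fn at_least_one(r: i64) -> Vec<i64> {
    vec![r, r]
}

/// Reference plan: expand every row, then apply the limit on top.
fn limit_on_top<F: Fn(i64) -> Vec<i64>>(table: &[i64], f: F, limit: usize) -> Vec<i64> {
    table.iter().flat_map(|&r| f(r)).take(limit).collect()
}

/// Rewritten plan: the limit is also applied to the scan, and the original
/// limit is kept on top so that multi-row expansion cannot exceed it.
fn limit_pushed_down<F: Fn(i64) -> Vec<i64>>(table: &[i64], f: F, limit: usize) -> Vec<i64> {
    table
        .iter()
        .take(limit) // limit pushed to the scan
        .flat_map(|&r| f(r))
        .take(limit) // original limit retained
        .collect()
}
```

With the table `[1, 2, 3]` and a limit of 1, `sometimes_empty` yields `[20]` under the reference plan but nothing under the rewritten plan, which is the incorrect result the comment warns about. With `at_least_one`, both plans agree, since every scanned row contributes at least one output row.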
I think a UDF always returns a single row.
Cool, then we can do something. Could you please provide a runnable use case of a UDF? I cannot find any in our e2e or planner tests 😄
Hello guys. I find my PR a bit stuck, and thus I want to follow up. Since pushing down
I don't have any other concerns, but I can also only imagine the Limit -> Project case. 😅 So it seems too limited and ad hoc. I'd like @lmatz to elaborate on the use cases.
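For the Limit -> Project case mentioned here, a rewrite rule could look roughly like the sketch below. It uses a toy `Plan` enum and a hypothetical `push_limit_through_project` function rather than RisingWave's actual plan nodes or the rule from the linked PR, and it assumes the projection emits exactly one row per input row, as the thread says of UDFs.

```rust
/// Toy logical plan; the node names and fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    Project { exprs: Vec<String>, input: Box<Plan> },
    Limit { limit: u64, offset: u64, input: Box<Plan> },
}

/// Rewrites `Limit(Project(x))` into `Project(Limit(x))`.
/// Assumes the projection is one-row-in, one-row-out, so applying the
/// limit first selects the same rows. Any other plan is returned unchanged.
fn push_limit_through_project(plan: Plan) -> Plan {
    match plan {
        Plan::Limit { limit, offset, input } => match *input {
            Plan::Project { exprs, input: child } => Plan::Project {
                exprs,
                input: Box::new(Plan::Limit { limit, offset, input: child }),
            },
            // Not a projection below the limit: rebuild the original node.
            other => Plan::Limit { limit, offset, input: Box::new(other) },
        },
        other => other,
    }
}
```

After the rewrite, the projection (and any UDF inside it) only sees the rows that survive the limit, which is the effect the feature request asks for.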
You can find them here. I think pushing
It is not a very big optimization, because the LimitExecutor can stop the stream early when the input reaches the limit row count. But generally LGTM.
My input comes from one user who complains about why a query with
Completed by #8971.
Describe the bug
query:
plan:
Is it a bug, or is limit 1 doing what it is supposed to do?
To Reproduce
No response
Expected behavior
No response
Additional context
Reported by a design partner