Treat unbounded windows as truly non-finite. #8802

Merged
merged 1 commit into from
Jul 27, 2023

mythrocks
Collaborator

Depends on rapidsai/cudf#13727.

This change addresses the slowness of window aggregations over windows defined as `[UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING]`. Before this change, unbounded row window bounds were interpreted as finite values, e.g. `[MAX_INT, MAX_INT]`. Although such a window produces the same results as a fully unbounded one, it prevents the optimization in rapidsai/cudf#13727 from being triggered, because the window bounds are still finite.

The change in this PR allows the plugin to detect unbounded windows and mark them as such for `libcudf`. The `libcudf` window function primitives can then detect fully unbounded windows and use a faster, optimized execution path.
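To illustrate the distinction described above, here is a minimal Python sketch of why a sentinel bound defeats an "is this window unbounded?" check while an explicit marker does not. It is not the plugin's or `libcudf`'s actual code: `RowBound`, `is_fully_unbounded`, and `INT_MAX` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for the MAX_INT sentinel mentioned in the description (assumed 32-bit).
INT_MAX = 2**31 - 1


@dataclass(frozen=True)
class RowBound:
    """Hypothetical row-window bound: a finite row count, or unbounded if rows is None."""
    rows: Optional[int]

    @property
    def is_unbounded(self) -> bool:
        return self.rows is None


def is_fully_unbounded(preceding: RowBound, following: RowBound) -> bool:
    """True only when both ends are explicitly marked unbounded."""
    return preceding.is_unbounded and following.is_unbounded


# Old behaviour: "unbounded" encoded as a very large but finite row count.
sentinel_window = (RowBound(INT_MAX), RowBound(INT_MAX))

# New behaviour: unboundedness is carried explicitly.
explicit_window = (RowBound(None), RowBound(None))

print(is_fully_unbounded(*sentinel_window))  # the sentinel still looks finite
print(is_fully_unbounded(*explicit_window))  # the explicit marker is detectable
```

With the sentinel encoding, the callee can only see a finite window that happens to be very wide, so a check like this never selects a fast path.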

Preliminary test results indicate that [UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING] window function computations over 1B rows and thousands of groups are sped up by a factor of 10-14x over the previous/naive GPU implementation.
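For intuition on why a fully unbounded window admits a faster path, note that every row in a group sees the whole group, so the window result equals the group's aggregate repeated for each row. The Python sketch below compares a naive per-row evaluation against a single pass per group. It is an assumption-laden illustration only: the PR description does not say how `libcudf` implements its fast path, and both function names are hypothetical.

```python
from collections import defaultdict


def naive_unbounded_sum(keys, values):
    """Per-row evaluation: every row re-scans all rows of its group."""
    return [
        sum(v for k2, v in zip(keys, values) if k2 == k)
        for k in keys
    ]


def grouped_unbounded_sum(keys, values):
    """One pass to aggregate each group, then broadcast the total to its rows."""
    totals = defaultdict(int)
    for k, v in zip(keys, values):
        totals[k] += v
    return [totals[k] for k in keys]


keys = ["a", "a", "b", "a", "b"]
values = [1, 2, 10, 3, 20]

# Both approaches agree; only the amount of work differs.
print(naive_unbounded_sum(keys, values))
print(grouped_unbounded_sum(keys, values))
```

The naive version does work proportional to the group size for every row, whereas the grouped version touches each row a constant number of times.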

The existing tests already cover the UNBOUNDED scenario. One additional test was added.

Follow-up to rapidsai/cudf#13727.

Signed-off-by: MithunR <mythrocks@gmail.com>
@mythrocks mythrocks self-assigned this Jul 25, 2023
@mythrocks mythrocks added the bug Something isn't working label Jul 25, 2023
@mythrocks
Collaborator Author

Build

@mythrocks
Collaborator Author

rapidsai/cudf#13727 has been merged. We should be able to re-run tests after the dependency makes its way to spark-rapids-jni.

@mythrocks
Collaborator Author

Build

@mythrocks
Collaborator Author

Hmm. This error is concerning. Investigating now.

@mythrocks
Collaborator Author

It looks like spark-rapids-jni just updated to include rapidsai/cudf#13727. It would be worth checking the build again.

Note: This test runs locally, with rapidsai/cudf#13727 installed.

@mythrocks
Collaborator Author

Build

@mythrocks mythrocks merged commit a215eab into NVIDIA:branch-23.08 Jul 27, 2023
28 checks passed
@mythrocks
Collaborator Author

This is merged. Thank you for reviewing, @revans2.

@mythrocks mythrocks deleted the unbounded-row-window branch July 27, 2023 21:36