[FEA] Use CUDF API for getting join output size #2440

Open
revans2 opened this issue May 18, 2021 · 2 comments · Fixed by #3288
Assignees
Labels
cudf_dependency An issue or PR with this label depends on a new feature in cudf feature request New feature or request P2 Not required for release performance A performance related task/issue

Comments

@revans2
Collaborator

revans2 commented May 18, 2021

Is your feature request related to a problem? Please describe.
This is a follow-on issue to #2433 and depends on rapidsai/cudf#8237.

#2433 is enough to get NDS query 72 to succeed, but it is not nearly as fast as it should be. This issue is to use the CUDF API for getting the join output size, once it is available, instead of running into an OOM error. We should also evaluate whether to remove the pre-check/split that currently happens for joins and rely only on the CUDF API.
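
To illustrate the idea, here is a minimal sketch of how a known output size could drive the split decision up front. It does not use the real CUDF API: `plan_join_splits`, its parameters, and the byte figures are all hypothetical, and the output row count is assumed to come from whatever size query CUDF ends up providing.

```python
def plan_join_splits(output_rows, bytes_per_row, target_batch_bytes):
    """Return how many pieces to split a join into so that each piece's
    output is expected to fit in target_batch_bytes.

    All inputs are assumptions for illustration: output_rows stands in for
    the value a join-output-size query would return.
    """
    if target_batch_bytes <= 0:
        raise ValueError("target_batch_bytes must be positive")
    estimated_bytes = output_rows * bytes_per_row
    # Ceiling division without floating point; always at least one piece.
    return max(1, -(-estimated_bytes // target_batch_bytes))


# Example: about 100,100 bytes of output against a 50,000 byte budget.
splits = plan_join_splits(output_rows=1001, bytes_per_row=100,
                          target_batch_bytes=50_000)
```

With the size known before the join runs, the split count can be chosen once rather than discovered by failing and retrying.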

revans2 added the feature request, ? - Needs Triage, performance, and cudf_dependency labels May 18, 2021
sameerz removed the ? - Needs Triage label May 25, 2021
jlowe self-assigned this Aug 20, 2021
jlowe reopened this Sep 24, 2021
@jlowe
Member

jlowe commented Sep 24, 2021

Reopened since this is being reverted in #3657.

@jlowe
Member

jlowe commented Sep 24, 2021

To do this properly, we need at minimum a way to use the smaller table as the build-side table when performing inner joins, to avoid the issue reported in #3288. There may be a way to hack this in, but ideally we'd also avoid always fetching the entire, arbitrarily chosen build-side data for inner joins, which should be flexible in their choice of build side. #2354 is related to a more general solution.
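
As a rough illustration of why inner joins are flexible here, the following pure-Python sketch builds the hash table on whichever input is smaller and still emits rows in (left, right) order. It is not the plugin's or CUDF's implementation; `inner_hash_join` and its parameters are hypothetical names used only for this example.

```python
from collections import defaultdict


def inner_hash_join(left, right, left_key, right_key):
    """Inner-join two lists of tuples on the given column indexes,
    building the hash table on the smaller input."""
    build_is_left = len(left) <= len(right)
    if build_is_left:
        build, probe = left, right
        build_key, probe_key = left_key, right_key
    else:
        build, probe = right, left
        build_key, probe_key = right_key, left_key

    # Build phase: index the smaller table by its join key.
    table = defaultdict(list)
    for row in build:
        table[row[build_key]].append(row)

    # Probe phase: stream the larger table and emit (left_row, right_row).
    out = []
    for row in probe:
        for match in table.get(row[probe_key], ()):
            out.append((match, row) if build_is_left else (row, match))
    return out


small = [(1, "a"), (2, "b")]
large = [(1, "x"), (1, "y"), (3, "z")]
pairs = inner_hash_join(small, large, 0, 0)
```

Swapping the argument order changes only which side each row lands on in the output pairs; the build side is still the two-row table in both cases.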
