[BUG] Why is the hash aggregate not handling empty result expressions #4017

Closed
abellina opened this issue Nov 3, 2021 · 3 comments · Fixed by #4035
Labels: bug (Something isn't working)

Comments

@abellina
Collaborator

abellina commented Nov 3, 2021

The hash aggregate is currently disabled for cases where there are no resultExpressions. This special case was overlooked and has been in the hash aggregate code for a long time.

https://github.com/NVIDIA/spark-rapids/blob/branch-21.12/sql-plugin/src/main/scala/com/nvidia/spark/rapids/aggregate.scala#L872

Prior issues reference the mortgage test as a potential repro case. I can imagine that without much effort this can be fixed and we can remove this special case.

abellina added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels Nov 3, 2021
abellina self-assigned this Nov 3, 2021
@revans2
Collaborator

revans2 commented Nov 3, 2021

spark.sql("SELECT c_birth_month, SUM(c_birth_year) as sum_year from customer where c_salutation rlike 'Mr.' group by c_birth_month").count()

This shows it on the TPCDS dataset, although it is really just a bogus query. Essentially, it looks like in this situation the output of the query is a count, so Spark doesn't want to materialize the aggregate's output; it just needs the number of rows of output there would be. So the aggregate should output a batch with no columns, just rows.
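To make the "no columns, just rows" idea concrete, here is a minimal Python sketch. It is not the plugin's code: `Batch`, `hash_aggregate`, and `count` are hypothetical names invented for illustration, and the three sample rows are made up. It only shows that a count over an aggregate needs the number of groups, so a batch that carries a row count and zero columns is enough.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Batch:
    """Hypothetical stand-in for a columnar batch: a row count plus named columns."""
    num_rows: int
    columns: Dict[str, List[object]] = field(default_factory=dict)


def hash_aggregate(rows, key, value, result_columns):
    """Group rows by `key`, sum `value`, and emit only the requested result columns."""
    sums = {}
    for row in rows:
        sums[row[key]] = sums.get(row[key], 0) + row[value]
    full = {key: list(sums.keys()), "sum": list(sums.values())}
    # With an empty result_columns list the batch carries no data, only a row count.
    return Batch(num_rows=len(sums), columns={c: full[c] for c in result_columns})


def count(batches):
    """A count only needs each batch's row count, never its column data."""
    return sum(b.num_rows for b in batches)


# Made-up sample rows, standing in for the customer table.
rows = [
    {"c_birth_month": 1, "c_birth_year": 1980},
    {"c_birth_month": 1, "c_birth_year": 1990},
    {"c_birth_month": 2, "c_birth_year": 1975},
]
batch = hash_aggregate(rows, "c_birth_month", "c_birth_year", result_columns=[])
total = count([batch])
```

Here `batch.columns` is empty while `batch.num_rows` still reports the two groups, which is all the count needs.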

@abellina
Collaborator Author

abellina commented Nov 3, 2021

@revans2 thanks, the example reproduces locally

sameerz added this to the Nov 1 - Nov 12 milestone Nov 3, 2021
@viadea
Collaborator

viadea commented Nov 3, 2021

Thanks Bobby for the mini repro. Let me put the relevant Spark driver log message here so it can be found by future searches:

!Exec <HashAggregateExec> cannot run on GPU because result expressions is empty
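For anyone scanning a driver log for this kind of fallback, a small Python sketch follows. The regular expression is modeled only on the single message quoted above, so treat the pattern as an assumption; other messages may be shaped differently. `find_gpu_fallbacks` is a hypothetical helper, and the second sample line is a placeholder rather than real log output.

```python
import re

# Pattern modeled on the one message quoted in this thread (an assumption).
NOT_ON_GPU = re.compile(
    r"!Exec <(?P<exec>\w+)> cannot run on GPU because (?P<reason>.+)"
)


def find_gpu_fallbacks(log_lines):
    """Return (exec name, reason) pairs for every line matching the pattern."""
    found = []
    for line in log_lines:
        match = NOT_ON_GPU.search(line)
        if match:
            found.append((match.group("exec"), match.group("reason").strip()))
    return found


sample = [
    "!Exec <HashAggregateExec> cannot run on GPU because result expressions is empty",
    "some unrelated driver output",  # placeholder line, not real log output
]
fallbacks = find_gpu_fallbacks(sample)
```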
