It looks like the changes in PR #3288 are causing a performance regression for q82, pushing its runtime to several minutes. I noticed a lot of time spent in the following code, and @jlowe suggested reverting the change. After doing so, the query runs in roughly 20 seconds again.

I think I know what triggered this. In the new code, we always use Spark's build-side table to build the hash table used in the join. In the old code, however, the hash table was built implicitly as part of the libcudf join call, and libcudf's inner join logic would choose whichever table had fewer rows as the build side. For query 82, there is an inner join where the right-side table is enormous compared to the left-side table. That means the old code would build a tiny hash table from the left, whereas the new code builds an enormous one from the right. Building the hash table is the most expensive part of the join, so we end up losing badly to the old code, which picked the smaller table as the build table for inner joins.
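For illustration, here is a minimal sketch of the build-side heuristic described above. It is not the plugin's or libcudf's actual API; the names `BuildSide`, `InnerJoinBuildSide`, and `choose` are hypothetical, and the idea is simply that for an inner join the smaller side should be the build side rather than whatever side Spark designated.

```scala
// Hypothetical sketch of the heuristic discussed above, not the plugin's real API:
// for an inner join, build the hash table from whichever side has fewer rows,
// falling back to Spark's planned build side only when the counts are equal.

sealed trait BuildSide
case object BuildLeft extends BuildSide
case object BuildRight extends BuildSide

object InnerJoinBuildSide {
  /**
   * Pick the build side for an inner join based on row counts.
   * `sparkBuildSide` is the side Spark planned as the build side; it is only
   * used as a tie-breaker.
   */
  def choose(leftRows: Long, rightRows: Long, sparkBuildSide: BuildSide): BuildSide = {
    if (leftRows < rightRows) BuildLeft
    else if (rightRows < leftRows) BuildRight
    else sparkBuildSide
  }
}
```

In the q82 scenario, something like `InnerJoinBuildSide.choose(smallLeftCount, hugeRightCount, BuildRight)` would return `BuildLeft`, so the hash table is built from the tiny left table instead of the enormous right one.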