Prevent approx_percentile aggregate from being split between CPU and GPU #3862
Conversation
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Could there be a comment in the code or the description of the PR on the gpuSupportedTag change?
def test_hash_groupby_approx_percentile_partial_fallback_to_cpu(aqe_enabled):
    conf = copy_and_update(_approx_percentile_conf, {
        'spark.sql.adaptive.enabled': aqe_enabled,
        'spark.rapids.sql.explain': 'ALL'
debug?
Could this test work in a similar way to assert_cpu_and_gpu_are_equal_collect_with_capture, where a list of "exist" and "non_exist" classes is used to assert that the query has indeed fallen back? I'm thinking of the case where the cast gets "pushed up" to a projection after the hash agg in the future.
build
build
build
build
build
TypeSig.MAP).nested(), TypeSig.all),
TypeSig.MAP + TypeSig.BINARY).nested()
  .withPsNote(TypeEnum.BINARY,
    "Binary columns are supported but not for the sort expressions")
My main concern is that we have not added notes for the other types that can only go along for a ride, like arrays. We should be consistent. Also we should not put in a note for maps because Spark only allows maps to go along for the ride. You cannot sort on them.
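For illustration, here is a rough sketch of what a consistent set of notes could look like, built from the `TypeSig`/`TypeEnum` calls visible in this diff. `TypeSig.commonCudfTypes`, `TypeSig.ARRAY`, and `TypeEnum.ARRAY` are assumed to be available, and the note wording is invented; this is not the actual change.

```scala
import com.nvidia.spark.rapids.{TypeEnum, TypeSig}

object SortTypeSigSketch {
  // Sketch only: every "ride-along" type that can appear in non-sort columns
  // gets its own note, while MAP gets none because Spark itself never allows
  // sorting on maps.
  val inputTypes: TypeSig =
    (TypeSig.commonCudfTypes + TypeSig.ARRAY + TypeSig.MAP + TypeSig.BINARY).nested()
      .withPsNote(TypeEnum.BINARY,
        "Binary columns are supported but not for the sort expressions")
      .withPsNote(TypeEnum.ARRAY,
        "Array columns are supported but not for the sort expressions")
}
```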
sql-plugin/src/main/scala/com/nvidia/spark/rapids/aggregate.scala
build
build failed with:
build
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Closes #3834
We have a mechanism for tagging the underlying SparkPlan when operators are not supported on the GPU, because we rely on this information when planning query stages with AQE enabled. The reason is that we only plan a subset of the plan at a time, without any context about the parent of the query stage being planned. However, we were only respecting these tags for exchange operators and were ignoring them for aggregates. The main change in this PR is to check these tags on all operators.
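To make the mechanism concrete, here is a minimal sketch, not the plugin's actual code, of how a reason can be stored on the underlying SparkPlan with Spark's `TreeNodeTag` API and consulted later when a query stage is planned in isolation. The tag key, payload type, and helper names here are assumptions.

```scala
import org.apache.spark.sql.catalyst.trees.TreeNodeTag
import org.apache.spark.sql.execution.SparkPlan

object GpuTaggingSketch {
  // Tag holding the reasons a node cannot be replaced with a GPU version.
  // The key string and Set[String] payload are assumptions for this sketch.
  val gpuSupportedTag: TreeNodeTag[Set[String]] =
    TreeNodeTag[Set[String]]("rapids.gpu.supported")

  // Record a reason directly on the SparkPlan so it survives into AQE query
  // stage planning, where the parent context is no longer visible.
  def tagNotSupported(plan: SparkPlan, reason: String): Unit = {
    val reasons = plan.getTagValue(gpuSupportedTag).getOrElse(Set.empty[String])
    plan.setTagValue(gpuSupportedTag, reasons + reason)
  }

  // Any operator (not just exchanges) can consult the tag when deciding
  // whether it should be replaced.
  def taggedAsUnsupported(plan: SparkPlan): Boolean =
    plan.getTagValue(gpuSupportedTag).exists(_.nonEmpty)
}
```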
This change caused some regressions that also had to be addressed in this PR, such as:

- The `CoalesceShufflePartitions` optimization performs a `transformUp` and may replace `ShuffleQueryStageExec` with `CustomShuffleReaderExec`, causing Spark to copy tags from `ShuffleQueryStageExec` to `CustomShuffleReaderExec`, including the "no need to replace ShuffleQueryStageExec" tag, so this tag needed to be ignored.
- `ObjectHashAggregateExec` and `SortAggregateExec` both had type signatures that declared that `BinaryType` was not supported, which is not always the case. We do support `BinaryType` for aggregate buffers, and there is special handling in the aggregate code both for the case where we are able to convert these buffers between CPU and GPU and for the case where we are not. This PR adds `BinaryType` to the type signatures to prevent these operators from being tagged early on as unsupported on GPU, which was causing regressions in the AQE case.
- `SortExec` was incorrectly declaring that `BinaryType` is not supported, so the type checks are updated and new tests are added to demonstrate that we fall back to CPU if a sort expression is binary, but we allow non-sort columns to be binary.

There are some other smaller changes in the PR:
- `gpuSupportedTag` was moved to a new `object RapidsMeta` so that it could be referenced from classes outside of the `RapidsMeta` hierarchy.
- `RapidsMeta.willNotWorkOnGpu` now only tags the plan for reasons that have not already been added to the `cannotBeReplacedReasons` set.
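A rough sketch of the dedup behaviour described in the last bullet, with assumed names and a much-simplified shape compared to the real `RapidsMeta` class:

```scala
import scala.collection.mutable

import org.apache.spark.sql.catalyst.trees.TreeNodeTag
import org.apache.spark.sql.execution.SparkPlan

// Assumed shape only: each meta wraps a SparkPlan and accumulates the reasons
// it cannot be replaced, mirroring each reason onto the plan's tag exactly once.
class RapidsMetaSketch(wrapped: SparkPlan) {
  private val cannotBeReplacedReasons = mutable.Set[String]()
  private val gpuSupportedTag = TreeNodeTag[Set[String]]("rapids.gpu.supported")

  def willNotWorkOnGpu(because: String): Unit = {
    // Set.add returns true only the first time a reason is seen, so the tag on
    // the underlying plan is only updated for reasons not already recorded.
    if (cannotBeReplacedReasons.add(because)) {
      val existing = wrapped.getTagValue(gpuSupportedTag).getOrElse(Set.empty[String])
      wrapped.setTagValue(gpuSupportedTag, existing + because)
    }
  }

  def canThisBeReplaced: Boolean = cannotBeReplacedReasons.isEmpty
}
```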