Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ClassCastException for unsupported TypedImperativeAggregate functions #3283

Merged

Conversation

sperlingxx
Copy link
Collaborator

Signed-off-by: sperlingxx lovedreamf@gmail.com

Fixes #3253

The ClassCastException occurred during overriding the buffer types of TypedImperativeAggregate functions. The code for overriding assumed that TypedImperativeAggreagte must be wrapped by TypedImperativeAggExprMeta. But there are exceptions: for TypedImperativeAggreagte without GPU implementation , they are wrapped by RuleNotFoundExprMeta.

…ypedImperativeAgg

Signed-off-by: sperlingxx <lovedreamf@gmail.com>
@sperlingxx sperlingxx requested a review from andygrove August 24, 2021 09:18
@sperlingxx
Copy link
Collaborator Author

build

agg.aggregateExpressions.zipWithIndex.foreach {
case (expr, i) if expr.aggregateFunction.isInstanceOf[TypedImperativeAggregate[_]] =>
aggregateExpressions.map(_.childExprs.head).foreach {
case aggMeta: TypedImperativeAggExprMeta[_] =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, this removes the need for the asInstanceOf for wrapped:

case aggMeta: TypedImperativeAggExprMeta[TypedImperativeAggregate[_]] =>

@@ -547,6 +547,20 @@ def test_hash_groupby_collect_partial_replace_fallback_with_other_agg(conf, aqe_
group by k1""",
conf=local_conf)

@ignore_order(local=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe approx_percentile is coming, and when it does show up we will fail this test and then whoever implements the function needs to figure out how to address, it would be better if we force the aggregate function to not be on the GPU. Could you explicitly turn off an already working typed imperative aggregate?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i.e. spark.rapids.sql.expression.CollectList=false, or you should be able to get away with spark.rapids.sql.expression.ApproximatePercentile=false)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newbie question: Why do we allow_non_gpu for these aggs?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plugin can be placed in a mode that throws if some parts of the plan can't be replaced (spark.rapids.sql.test.enabled). The allow list (spark.rapids.sql.test.allowedNonGpu) specifies expressions that are OK even in this mode, as they are pertinent to the test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spark.rapids.sql.expression.ApproximatePercentile=false is not going to work for actually testing this. We have to have a TypedImperativeAggregate that does not exist in the GpuOverrides mapping. Because even with that config it would still create a TypedImperativeAggExprMeta to wrap it. It would just be turned off.

I would propose we go with an API that is less likely we are going to support any time soon. Specifically a udaf, but that is really hard to do inside the python tests. As it is I am fine with this for now, but looking at the list of TypedImperativeAggregate functions there are not a lot to choose from.

@@ -547,6 +547,20 @@ def test_hash_groupby_collect_partial_replace_fallback_with_other_agg(conf, aqe_
group by k1""",
conf=local_conf)

@ignore_order(local=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spark.rapids.sql.expression.ApproximatePercentile=false is not going to work for actually testing this. We have to have a TypedImperativeAggregate that does not exist in the GpuOverrides mapping. Because even with that config it would still create a TypedImperativeAggExprMeta to wrap it. It would just be turned off.

I would propose we go with an API that is less likely we are going to support any time soon. Specifically a udaf, but that is really hard to do inside the python tests. As it is I am fine with this for now, but looking at the list of TypedImperativeAggregate functions there are not a lot to choose from.

@sperlingxx sperlingxx merged commit f8b001d into NVIDIA:branch-21.10 Aug 25, 2021
@sperlingxx sperlingxx deleted the fix_classCastException_typImpAgg branch August 25, 2021 02:33
@sameerz sameerz added the bug Something isn't working label Aug 25, 2021
@sameerz sameerz added this to the Aug 16 - Aug 27 milestone Aug 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] ClassCastException for unsupported TypedImperativeAggregate functions
5 participants