ANSI check for aggregates #3597

Merged
merged 6 commits into from
Sep 22, 2021

abellina
Collaborator

Adds fallback for aggregates if ANSI mode is enabled. Specifically, Sum, Count, and Average all fall back in various ways. Sum and Count check the type they produce (resultType), whereas Average falls back unconditionally because it uses Long internally. The other aggregates do not fall back.

Adding DecimalGen to the tests is probably not possible: Spark either optimizes the decimal out (UnscaledValue) or we don't support the precision. I could have missed something, though.

This does not add the CentralMomentAgg variety to the checks, since they are not merged yet, but those definitely need to fall back.
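The fallback rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the plugin's code: the function name `should_fall_back`, the string type names, and in particular the set of "overflow-prone" result types are assumptions, since the description does not list which result types trigger the fallback.

```python
from typing import Optional

# Assumption: result types treated as able to overflow. The PR description
# does not enumerate them; this set is illustrative only.
OVERFLOW_PRONE_TYPES = {"byte", "short", "int", "long", "decimal"}


def should_fall_back(agg_name: str, result_type: Optional[str],
                     ansi_enabled: bool) -> bool:
    """Hypothetical model of the ANSI check: True means run on the CPU."""
    if not ansi_enabled:
        # Outside ANSI mode this check never forces a fallback.
        return False
    if agg_name == "average":
        # Average falls back unconditionally: it uses Long internally,
        # regardless of the type it produces.
        return True
    if agg_name in ("sum", "count"):
        # Sum and Count are decided by the type they produce.
        return result_type in OVERFLOW_PRONE_TYPES
    # Other aggregates are not affected by this check.
    return False
```

Under these assumptions, `should_fall_back("sum", "double", True)` is `False` while `should_fall_back("sum", "long", True)` is `True`, and Average falls back for any result type once ANSI mode is on.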

@abellina
Collaborator Author

build

@abellina
Collaborator Author

Note that I didn't include window aggregate tests, but I can certainly add some. I did verify that a simple window op using a sum would fall back given the ANSI flag.

@abellina abellina changed the title Ansi/sum avg decimal ansi ANSI check for aggregates Sep 22, 2021
@pytest.mark.parametrize('ansi_enabled', ['true', 'false'])
def test_hash_grpby_avg_nulls_ansi(data_gen, conf, ansi_enabled):
    local_conf = copy_and_update(conf, {'spark.sql.ansi.enabled': ansi_enabled})
    assert_gpu_and_cpu_are_equal_collect(
Collaborator

I'm confused about why we want to combine this into a single test. Doing it this way means we cannot verify that we did fall back for Average, and it also means that we explicitly allow things to not be on the GPU when we know that they should be in non-ANSI mode.

Collaborator Author

Sorry, I moved that test down together with the other ANSI stuff; I didn't change it. Will split it, totally agree.

Collaborator Author

Done here: 4ddbb6b
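The reviewer's point about splitting can be illustrated with a small self-contained sketch. It does not reproduce the contents of the commit above: `average_runs_on` is a hypothetical stand-in for the plugin's placement decision, and the test names are made up. The idea is only that two separate tests let each mode assert exactly one expected placement, which a single parametrized test that tolerates both cannot do.

```python
def average_runs_on(ansi_enabled: bool) -> str:
    """Hypothetical stand-in: where an Average aggregate would execute."""
    # Per the PR description, Average falls back to the CPU under ANSI mode.
    return "cpu" if ansi_enabled else "gpu"


def test_avg_non_ansi_stays_on_gpu():
    # Non-ANSI mode: the aggregate is expected on the GPU, so no CPU
    # fallback is tolerated here.
    assert average_runs_on(ansi_enabled=False) == "gpu"


def test_avg_ansi_falls_back():
    # ANSI mode: the test asserts that the fallback actually happened,
    # rather than merely comparing results.
    assert average_runs_on(ansi_enabled=True) == "cpu"
```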

@revans2 revans2 linked an issue Sep 22, 2021 that may be closed by this pull request
@revans2
Collaborator

revans2 commented Sep 22, 2021

build

@abellina abellina merged commit fd4c9bc into NVIDIA:branch-21.10 Sep 22, 2021
@abellina abellina deleted the ansi/sum_avg_decimal_ansi branch September 22, 2021 15:57
@pxLi pxLi added the bug Something isn't working label Sep 23, 2021
Development

Successfully merging this pull request may close these issues.

[BUG] Aggregations in ANSI mode do not detect overflows
3 participants