ANSI check for aggregates #3597
Conversation
Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
build
Note: I didn't include window agg tests, but I can certainly add some. I did verify that a simple window op using a sum would fall back given the ANSI flag.
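A window check of that kind could look roughly like this (a sketch, not code from this PR; `assert_gpu_fallback_collect`, `binary_op_df`, `long_gen`, and `allow_non_gpu` are assumed to be the suite's existing helpers, and the exact `allow_non_gpu` list is a guess that may need adjusting):

```python
from pyspark.sql import functions as f
from pyspark.sql.window import Window

# Sketch only: the helpers below are assumed to come from the suite's
# asserts/data_gen/marks modules, as in the other integration tests.
@allow_non_gpu('WindowExec', 'WindowExpression', 'AggregateExpression',
               'Sum', 'Alias', 'WindowSpecDefinition', 'SpecifiedWindowFrame')
def test_window_sum_ansi_fallback():
    w = Window.partitionBy('a').orderBy('b')
    assert_gpu_fallback_collect(
        # binary_op_df generates two columns, 'a' and 'b', from long_gen
        lambda spark: binary_op_df(spark, long_gen).select(f.sum('b').over(w)),
        'WindowExec',
        conf={'spark.sql.ansi.enabled': 'true'})
```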
```python
@pytest.mark.parametrize('ansi_enabled', ['true', 'false'])
def test_hash_grpby_avg_nulls_ansi(data_gen, conf, ansi_enabled):
    local_conf = copy_and_update(conf, {'spark.sql.ansi.enabled': ansi_enabled})
    assert_gpu_and_cpu_are_equal_collect(
```
I'm confused about why we want to combine this into a single test. Doing it this way means we cannot verify that we did fall back for `Average`, and it also means that we explicitly allow things to not be on the GPU when we know that they should be in non-ANSI mode.
Sorry, I moved that test down together with the other ANSI stuff; I didn't change it. I'll split it, totally agree.
Done here: 4ddbb6b
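For anyone following along, the split roughly amounts to keeping an unconditional GPU test for the non-ANSI case and adding a dedicated ANSI fallback test. A sketch, not the exact commit contents; it assumes the suite's `assert_gpu_and_cpu_are_equal_collect`, `assert_gpu_fallback_collect`, `copy_and_update`, and `gen_df` helpers, the parametrize decorators for `data_gen`/`conf` are elided, and the column names depend on `data_gen`:

```python
def test_hash_grpby_avg_nulls(data_gen, conf):
    # Non-ANSI: Average must run on the GPU and match the CPU results.
    assert_gpu_and_cpu_are_equal_collect(
        lambda spark: gen_df(spark, data_gen, length=100)
            .groupby('a').agg(f.avg('b')),
        conf=conf)

@allow_non_gpu('HashAggregateExec', 'AggregateExpression', 'Alias',
               'Average', 'Cast')
def test_hash_grpby_avg_nulls_ansi(data_gen, conf):
    # ANSI: Average should fall back to the CPU; the assert verifies it did.
    local_conf = copy_and_update(conf, {'spark.sql.ansi.enabled': 'true'})
    assert_gpu_fallback_collect(
        lambda spark: gen_df(spark, data_gen, length=100)
            .groupby('a').agg(f.avg('b')),
        'Average',
        conf=local_conf)
```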
build
Adds fallback for aggregates if ANSI mode is enabled. Specifically: `Sum`, `Count`, and `Average` all fall back in various ways. `Sum` and `Count` check the types that they are producing (`resultType`), whereas `Average` unconditionally falls back because it uses `Long` internally. The other aggregates do not fall back.

I am finding that adding `DecimalGen` to the mix for the testing is probably not possible. Spark is either optimizing it out (`UnscaledValue`), or we don't support the precision. But I could have missed something.

This does not add the `CentralMomentAgg` variety to the checks, since those are not merged yet, but they definitely need to fall back.
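Since the `Sum` fallback is keyed off `resultType`, it could be exercised along the same lines (again only a sketch with a made-up test name; that integral sums are the ANSI-sensitive case is my reading of the description above, and the `allow_non_gpu` list is approximate):

```python
@allow_non_gpu('HashAggregateExec', 'AggregateExpression', 'Alias', 'Sum')
def test_hash_grpby_sum_ansi_fallback():
    # Integral sums produce a Long result that can overflow under ANSI mode,
    # so the whole aggregate is expected to fall back to the CPU.
    assert_gpu_fallback_collect(
        lambda spark: binary_op_df(spark, long_gen).groupby('a').agg(f.sum('b')),
        'Sum',
        conf={'spark.sql.ansi.enabled': 'true'})
```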