
Remove the hasNans config and no_nan data gen in test code. #6502

Closed

Conversation

@HaoYang670 (Collaborator) commented Sep 5, 2022:

Signed-off-by: remzi <13716567376yh@gmail.com>
Closes #6487.

@HaoYang670 HaoYang670 self-assigned this Sep 5, 2022
@HaoYang670 HaoYang670 added the "test (Only impacts tests)" label Sep 5, 2022
@HaoYang670 (Collaborator, Author) commented:
build

Review comment on integration_tests/src/main/python/hash_aggregate_test.py (outdated, resolved):
```diff
@@ -954,7 +891,7 @@ def test_hash_multiple_mode_query_avg_distincts(data_gen, conf):
 @approximate_float
 @ignore_order
 @incompat
-@pytest.mark.parametrize('data_gen', _init_list_no_nans, ids=idfn)
+@pytest.mark.parametrize('data_gen', _init_list, ids=idfn)
```
@jlowe (Member) commented:

Note that just because we're removing the hasNans config does not mean we want to avoid testing values without NaNs. One concern with allowing NaNs into the mix is that it can cause some operations to effectively always return NaN because the chance of generating at least one NaN in the input could be high, and summing/averaging with a NaN produces a NaN. If too many NaNs are generated then we're not really testing whether we're getting the non-NaN case right because the NaNs "eclipse" the results.
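The eclipsing behavior described here follows directly from IEEE 754 NaN propagation; a minimal standalone Python sketch (not code from this PR) shows how a single NaN swamps a sum or average:

```python
import math

# A single NaN in the input forces both the sum and the mean to NaN,
# so the non-NaN aggregation path is never actually exercised.
values = [1.0, 2.0, float("nan"), 4.0]

total = sum(values)         # NaN propagates through addition
mean = total / len(values)  # ...and therefore through the average

print(math.isnan(total))  # True
print(math.isnan(mean))   # True
```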

If we're convinced that NaN-eclipsing isn't going to be a problem then we're fine, but for a full reduction on a column it seems likely at least one NaN would be generated. Ideally we want to test both cases, with and without NaNs, to make sure we're doing both correctly.

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
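One way to cover both cases, as suggested above, is to parametrize the test over a NaN-free generator and a NaN-permitting generator. The following is a hypothetical sketch — `make_float_gen` and `test_sum_matches_reference` are placeholder names, not the plugin's real helpers:

```python
import math
import random
import pytest

def make_float_gen(allow_nans):
    """Return a simple data generator; seeds the RNG for reproducibility."""
    rng = random.Random(0)
    def gen(n):
        out = [rng.uniform(-100.0, 100.0) for _ in range(n)]
        if allow_nans:
            # Guarantee at least one NaN in the NaN-permitting case.
            out[rng.randrange(n)] = float("nan")
        return out
    return gen

@pytest.mark.parametrize(
    "data_gen",
    [make_float_gen(False), make_float_gen(True)],
    ids=["no_nans", "with_nans"],
)
def test_sum_matches_reference(data_gen):
    data = data_gen(100)
    total = sum(data)
    # With NaNs present the expected result is NaN; without, it is finite.
    if any(math.isnan(x) for x in data):
        assert math.isnan(total)
    else:
        assert math.isfinite(total)
```

Running both `ids` variants exercises the NaN-propagation path and the finite-arithmetic path separately, which is the coverage the comment above asks for.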
@HaoYang670 (Collaborator, Author) commented Sep 14, 2022:

Hi @jlowe, thank you for the explanation.
I will close this PR and just clean up the hasNans config in the test code (I will do this in #6512).

@HaoYang670 HaoYang670 closed this Sep 14, 2022
@HaoYang670 HaoYang670 deleted the 6487_clean_nan_tests branch September 15, 2022 01:17
Labels
test Only impacts tests
Successfully merging this pull request may close these issues.

[FEA] Remove the hasNans config from the test code.
2 participants