Extend TagForReplaceMode to adapt Databricks runtime #3368
Conversation
Signed-off-by: sperlingxx <lovedreamf@gmail.com>
@revans2 could you review if you have time?
Just a nit
```diff
@@ -615,10 +661,9 @@ def test_hash_multiple_mode_query_avg_distincts(data_gen, conf):
 @approximate_float
 @ignore_order
 @incompat
-@pytest.mark.parametrize('data_gen', _init_list_no_nans, ids=idfn)
+@pytest.mark.parametrize('data_gen', _init_list_no_nans[1:2], ids=idfn)
```
Using a slice for the data gen is really confusing here. At a minimum we need a comment explaining why we are slicing it. Preferably it is a separate value along with the explanation.
Sorry, it's my mistake. I added the slice to facilitate debugging and forgot to remove it before submitting.
Looks good to me, but I would like to have a few more eyes look at it.
```diff
@@ -617,8 +663,7 @@ def test_hash_multiple_mode_query_avg_distincts(data_gen, conf):
 @incompat
 @pytest.mark.parametrize('data_gen', _init_list_no_nans, ids=idfn)
 @pytest.mark.parametrize('conf', get_params(_confs, params_markers_for_confs), ids=idfn)
-@pytest.mark.parametrize('parameterless', ['true', pytest.param('false', marks=pytest.mark.xfail(
-    condition=not is_before_spark_311(), reason="parameterless count not supported by default in Spark 3.1+"))])
```
why was this removed?
I am sorry, this option shouldn't have been removed. I added it back.
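For reference, here is a minimal self-contained sketch of the restored pattern: a parametrize list in which one value is conditionally marked `xfail`. The version check below is a stand-in for the plugin's real helper, not the actual implementation:

```python
import pytest

def is_before_spark_311():
    # Stand-in for the plugin's real Spark-version check; assume Spark 3.1+.
    return False

# 'true' always runs; 'false' is expected to fail on Spark 3.1+ because
# parameterless count is not supported there by default.
PARAMETERLESS_PARAMS = [
    'true',
    pytest.param('false', marks=pytest.mark.xfail(
        condition=not is_before_spark_311(),
        reason="parameterless count not supported by default in Spark 3.1+")),
]
```

A test decorated with `@pytest.mark.parametrize('parameterless', PARAMETERLESS_PARAMS)` would then run the `'false'` case but report it as an expected failure on newer Spark versions.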
It seems like a good approach to be able to express the varying nature of these aggregate plans and the prior method was limiting and wrong in some cases.
I'd like to see these patterns tied to a spark version/flavor (somehow), but that can come in a follow up. @sperlingxx what do you think? Main reason I bring this up is so that it's clearer from the pattern what Spark we are targeting.
```diff
 # test with single Distinct
 assert_cpu_and_gpu_are_equal_collect_with_capture(
     lambda spark: gen_df(spark, data_gen, length=100)
         .groupby('a')
         .agg(f.sort_array(f.collect_list('b')),
              f.sort_array(f.collect_set('b')),
-             f.countDistinct('c'),
-             f.count('c')),
+             f.countDistinct('c')),
```
why did `f.count` get removed here?
@sperlingxx I noticed this too, and I am not sure why this test content had to change. That said, I know we are trying to get some tests to pass; this is fine as long as we address it either in this PR or in a follow-up linked to this PR.
@abellina `f.count` got removed because it looked redundant: we already had the non-distinct aggregations `collect_list` and `collect_set`. The test case runs successfully with `f.count` removed.
Ok, that's not consistent. There are other tests with that combination and with collect*. I would personally rather have more aggregates computed than fewer, since it gives more chances to catch issues with the aggregation buffer/ordinal logic.
This test also has two assertions, and I distinctly recall having to remove one of them to debug the other. This is separate from my previous comment, but it should really be two tests.
This is blocking other things, I am ok doing this as a follow up.
Hi @abellina, I fully agree with the idea of listing common aggregation patterns of Spark. Perhaps we can label them as constants? I filed follow-up issue #3437 for your idea.
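As a rough sketch of what such named patterns might look like — purely an assumption about the follow-up, with constant names and mode sequences that are illustrative rather than taken from #3437:

```python
# Hypothetical constants for aggregate-mode names (illustrative only).
PARTIAL = 'partial'
PARTIAL_MERGE = 'partialMerge'
FINAL = 'final'

# A replace-mode "pattern" could then be a named sequence of modes instead of
# an ad-hoc string, making it clearer from the pattern which Spark
# flavor/version a test targets. The exact shapes below are assumptions.
SPARK_NON_DISTINCT_AGG = (PARTIAL, FINAL)
SPARK_DISTINCT_AGG = (PARTIAL, PARTIAL_MERGE, PARTIAL, FINAL)
```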
Signed-off-by: sperlingxx lovedreamf@gmail.com
Fixes #3339
This PR fixes the failure of test_hash_groupby_collect_partial_replace_fallback on the Databricks runtime. The root cause of the failure is a difference in planning strategies for aggregation between Spark and the Databricks runtime. To support both Spark and the Databricks runtime, we need a more expressive configuration for `hashAggReplaceMode`. In addition, with the extended `TagForReplaceMode` method, we are able to build tests on partial GPU replacement for Aggregate in a more precise way.

P.S. It has been manually tested on DB_301.