Advertise CPU sort order and partitioning expressions to Catalyst [databricks] #3719

jlowe · 2021-09-30T20:18:58Z

Changes the GPU nodes in the Catalyst plan to advertise CPU SortOrder and Partitioning classes and child expressions. This helps the Spark code better recognize that portions of the plan are still compatible from a sorting and partitioning perspective even after the GPU has updated the plan. Spark has a number of places in the code where it matches against specific Spark CPU partitioning classes or checks whether CPU sort/partition expressions match. Returning the CPU expressions from the GPU nodes' outputPartitioning, outputOrdering, requiredChildDistribution and requiredChildOrdering methods helps us pass these checks in Apache Spark code.

This is accomplished by passing both the GPU and CPU expressions to exec nodes that need to override their output ordering or partitioning. Each node uses the GPU expressions during computation but "advertises" the CPU expressions from the Catalyst methods that are used to query the node's intentions.
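As a rough illustration of the idea described above, the following sketch uses hypothetical stand-in types (`Expr`, `CpuAttr`, `GpuAttr`, `SortOrder`, `GpuSortNodeSketch`) rather than the real Spark or plugin classes. It only aims to show why a check that pattern matches on CPU expression classes passes when the node reports its CPU ordering and fails when it reports the GPU form.

```scala
object AdvertiseCpuOrderSketch {
  // Hypothetical stand-ins for Catalyst expression types; NOT the real Spark classes.
  sealed trait Expr { def name: String }
  final case class CpuAttr(name: String) extends Expr
  final case class GpuAttr(name: String) extends Expr
  final case class SortOrder(child: Expr, ascending: Boolean)

  // Hypothetical exec node: it would sort with gpuOrder but reports cpuOrder to the planner.
  final case class GpuSortNodeSketch(gpuOrder: Seq[SortOrder], cpuOrder: Seq[SortOrder]) {
    // What planner-facing methods such as outputOrdering would return.
    def outputOrdering: Seq[SortOrder] = cpuOrder
    // What the node would actually evaluate on the GPU.
    def orderingUsedForCompute: Seq[SortOrder] = gpuOrder
  }

  // Mimics a planner-side check that only recognizes CPU expression classes.
  def satisfies(required: Seq[SortOrder], actual: Seq[SortOrder]): Boolean =
    required.length <= actual.length && required.zip(actual).forall {
      case (SortOrder(CpuAttr(r), ra), SortOrder(CpuAttr(a), aa)) => r == a && ra == aa
      case _ => false // anything that is not a CPU expression is not recognized
    }

  val required: Seq[SortOrder] = Seq(SortOrder(CpuAttr("a"), ascending = true))
  val node: GpuSortNodeSketch = GpuSortNodeSketch(
    gpuOrder = Seq(SortOrder(GpuAttr("a"), ascending = true)),
    cpuOrder = Seq(SortOrder(CpuAttr("a"), ascending = true)))

  def main(args: Array[String]): Unit = {
    println(satisfies(required, node.outputOrdering))         // CPU form is recognized
    println(satisfies(required, node.orderingUsedForCompute)) // GPU form is not
  }
}
```

In this sketch the check succeeds only for the advertised CPU ordering, which is the behavior the description is after.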

Signed-off-by: Jason Lowe <jlowe@nvidia.com>


jlowe · 2021-09-30T20:37:44Z

This depends on #3691. Marking as a draft until the dependency is merged.

jlowe · 2021-09-30T20:37:49Z

build

jlowe · 2021-09-30T22:02:47Z

build

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuTransitionOverrides.scala

abellina

Overall this makes sense. I had one comment on whether we want to fix this or add more explicit asserts for the case where the GPU order is complex (requires an eval). I am not sure what the outcome of a failure would be here: a failed query or just a less performant one.

tgravescs · 2021-10-01T12:51:10Z

build

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCoalesceBatches.scala

shims/spark320/src/main/scala/com/nvidia/spark/rapids/shims/spark320/Spark320Shims.scala

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuSortExec.scala

jlowe · 2021-10-01T16:54:21Z

build

jlowe · 2021-10-01T17:34:06Z

build

jlowe · 2021-10-01T21:15:25Z

Ran into a couple of issues with passing the CPU expressions as part of the regular arguments to the case classes:

- These arguments would appear in the plan, being mostly redundant with the GPU form of the expressions.
- The assertOnGpu code would "find" these expressions and assert that they're not on the GPU.

By passing them as a separate parameter list to the case classes, they no longer appear in the expressions list for these classes. That avoids the CPU expressions appearing in the plan explain output and being seen by generic tree traversal code.
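The effect of a second parameter list can be seen in plain Scala without any Spark dependency. The sketch below uses a hypothetical `NodeSketch` case class with strings standing in for expressions. Its `otherCopyArgs` method is only modeled on the Spark `TreeNode` method of the same name and is not wired into any copy machinery here.

```scala
object SeparateParamListSketch {
  // Hypothetical node; strings stand in for GPU and CPU expressions.
  // Only the FIRST parameter list takes part in the case class product,
  // toString, equals and hashCode. The second list needs `val` to be readable.
  final case class NodeSketch(gpuKeys: Seq[String])(val cpuKeys: Seq[String]) {
    // Modeled on the idea of TreeNode.otherCopyArgs: extra constructor
    // arguments that must be carried along when the node is copied.
    def otherCopyArgs: Seq[AnyRef] = cpuKeys :: Nil
  }

  val node: NodeSketch = NodeSketch(Seq("gpu_a"))(Seq("cpu_a"))
  val sameGpuDifferentCpu: NodeSketch = NodeSketch(Seq("gpu_a"))(Seq("cpu_b"))

  def main(args: Array[String]): Unit = {
    // Generic traversal over product elements sees only the GPU keys.
    println(node.productIterator.toList)
    // The printed form of the node likewise omits the CPU keys.
    println(node)
    // The CPU keys remain available to code that asks for them explicitly.
    println(node.cpuKeys)
    // copy takes the second list as a separate argument list.
    println(node.copy(gpuKeys = Seq("gpu_z"))(node.cpuKeys).cpuKeys)
  }
}
```

One consequence worth noting in this sketch is that equality also ignores the second parameter list, so two nodes that differ only in their CPU keys compare equal.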

jlowe · 2021-10-01T21:22:47Z

build

tgravescs · 2021-10-04T13:02:57Z

db 8.2 failures:
`pyspark.sql.utils.IllegalArgumentException: The expression knownfloatingpointnormalized(normalizenanandzero(a#133227)) AS a#133227 is not columnar class org.apache.spark.sql.catalyst.expressions.Alias`

18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[Left-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[Left-Double][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftSemi-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftSemi-Double][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftAnti-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftAnti-Double][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_left_broadcast_nested_loop_join_condition_missing[Right-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_left_broadcast_nested_loop_join_condition_missing[Right-Double][IGNORE_ORDER({'local': True}), INCOMPAT]

jlowe · 2021-10-04T14:56:43Z

build

jlowe · 2021-10-04T17:13:15Z

build

gerashegalov · 2021-10-04T20:17:27Z

.../main/301until310-nondb/scala/com/nvidia/spark/rapids/shims/v2/GpuShuffledHashJoinExec.scala

+    cpuLeftKeys,
+    cpuRightKeys) {
+
+  override def otherCopyArgs: Seq[AnyRef] = cpuLeftKeys :: cpuRightKeys :: Nil


nits:

- Prefer `Seq.empty` to `Nil`.
- Could this also just be `cpuLeftKeys ++ cpuRightKeys`?
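For reference, the two forms mentioned in the nit produce differently shaped results. The sketch below uses strings as stand-ins for expressions and assumes that `otherCopyArgs` is expected to supply one entry per parameter of the second parameter list, which matches how the Spark `TreeNode` documentation describes these values being appended to the constructor arguments during a copy.

```scala
object CopyArgsShapeSketch {
  // Strings stand in for the CPU key expressions; the values are illustrative.
  val cpuLeftKeys: Seq[String] = Seq("l1", "l2")
  val cpuRightKeys: Seq[String] = Seq("r1")

  // One entry per constructor parameter: a sequence holding two sequences.
  val onePerParameter: Seq[AnyRef] = cpuLeftKeys :: cpuRightKeys :: Nil

  // Concatenation flattens both key lists into a single sequence of keys.
  val flattened: Seq[AnyRef] = cpuLeftKeys ++ cpuRightKeys

  def main(args: Array[String]): Unit = {
    println(onePerParameter.length) // number of constructor arguments supplied
    println(flattened.length)       // number of individual keys
  }
}
```

Under that assumption the `::` form keeps the left and right keys as two separate constructor arguments, while `++` would merge them into one list of keys.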

gerashegalov · 2021-10-04T20:25:56Z

sql-plugin/src/main/311db/scala/com/nvidia/spark/rapids/shims/v2/GpuWindowInPandasExec.scala

+    child: SparkPlan)(
+    override val cpuPartitionSpec: Seq[Expression]) extends GpuWindowInPandasExecBase {
+
+  override def otherCopyArgs: Seq[AnyRef] = cpuPartitionSpec :: Nil


nit: why `:: Nil`?

gerashegalov · 2021-10-04T20:35:45Z

sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuFileSourceScanExec.scala

@@ -78,6 +78,9 @@ case class GpuFileSourceScanExec(

  override def otherCopyArgs: Seq[AnyRef] = Seq(rapidsConf)

+  // All expressions are filter expressions used on the CPU.
+  override def gpuExpressions: Seq[Expression] = Nil


nit: Seq.empty

jlowe added 7 commits September 28, 2021 14:59

Fix issues with AQE and DPP enabled on Spark 3.2

d02bac3

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

Add canonicalized parameter for 301db shim

5ff10a6

Fix double-close when batch contains multiple columns

0f0883a

Fix HostColumnVector deserialization

fe1acd5

Advertise CPU sort and partitioning expressions to avoid AQE replanni…

f7acb80

…ng issues Signed-off-by: Jason Lowe <jlowe@nvidia.com>

Fix IndexOutOfBoundsException errors from WindowFunctionSuite

81abdbb

Update other shims

95618f8

jlowe self-assigned this Sep 30, 2021

jlowe added 2 commits September 30, 2021 16:59

Merge branch 'branch-21.10' into fix-aqe-shuffle-coalesce

c1fd988

Fix 311db build

b8118df

jlowe marked this pull request as ready for review September 30, 2021 22:02

abellina reviewed Sep 30, 2021

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuTransitionOverrides.scala (outdated, resolved)

sameerz added the bug Something isn't working label Oct 1, 2021

sameerz added this to the Sep 27 - Oct 1 milestone Oct 1, 2021

abellina reviewed Oct 1, 2021

tgravescs reviewed Oct 1, 2021

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCoalesceBatches.scala (outdated, resolved)

revans2 previously approved these changes Oct 1, 2021

shims/spark320/src/main/scala/com/nvidia/spark/rapids/shims/spark320/Spark320Shims.scala (outdated, resolved)

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuSortExec.scala (resolved)

Fix 311db GpuWindowsInPandasExec and update optimized sort comment

4277e66

This was referenced Oct 1, 2021

Revisit GpuSorter now that CPU sort order is easily accessed #3731

Closed

Revisit GpuOverrides.orderingSatisifies #3732

Closed

Specify GpuPartitioning type when expecting a GPU partitioning argument

e3c2c1f

jlowe mentioned this pull request Oct 1, 2021

Further separate GpuPartitioning from Partitioning #3733

Open

Add ability to override withNewChildren and fix BatchedByKey

9dc3acb

jlowe dismissed revans2’s stale review via 9dc3acb October 1, 2021 16:32

Fix 301db GpuWindowInPandasExec

9e19a47

jlowe added 2 commits October 1, 2021 14:10

Merge branch 'branch-21.10' into fix-aqe-shuffle-coalesce

dd11f09

Use separate parameter list to pass CPU expressions

c733065

Merge branch 'branch-21.10' into fix-aqe-shuffle-coalesce

2216f23

sameerz modified the milestones: Sep 27 - Oct 1, Oct 4 - Oct 15 Oct 4, 2021

Exclude CPU expressions from assertOnGpu checks

ef48a67

revans2 previously approved these changes Oct 4, 2021

Fix scalastyle on imports

4e1c4fe

jlowe dismissed revans2’s stale review via 4e1c4fe October 4, 2021 17:13

revans2 approved these changes Oct 4, 2021

jlowe merged commit e811543 into NVIDIA:branch-21.10 Oct 4, 2021

jlowe deleted the fix-aqe-shuffle-coalesce branch October 4, 2021 20:53

gerashegalov reviewed Oct 4, 2021
