Add in running window optimization using scan #2895

Merged: 4 commits into NVIDIA:branch-21.08 on Jul 13, 2021

Conversation

@revans2 (Collaborator) commented Jul 9, 2021

Spark optimizes running windows to use a linear-time algorithm. When this was discussed with the cudf team (rapidsai/cudf#8440) it was decided to use scan and segmented_scan (group-by scan). This PR puts in a framework for that and adds a few initial implementations.
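
As a rough illustration (plain Scala, not the plugin's actual code), a running window over ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is just a prefix scan, and the partitioned version is a scan that restarts at each group boundary. The sequences and names below are made up for the example:

// Running max over an unbounded-preceding frame is a prefix scan, so each
// output row only needs the previous result: O(n) instead of O(n^2).
val ids = Seq(3L, 1L, 4L, 1L, 5L)
val runningMax = ids.scanLeft(Long.MinValue)(math.max).tail
// runningMax == Seq(3, 3, 4, 4, 5)

// A segmented (group-by) scan restarts the accumulation whenever the
// partition key changes, which is what a partitioned running window needs.
val parts = Seq("a", "a", "b", "b", "b")
val segmentedMax = parts.zip(ids)
  .foldLeft((Option.empty[String], List.empty[Long])) { case ((prevKey, acc), (key, value)) =>
    val next = if (prevKey.contains(key)) math.max(acc.head, value) else value
    (Some(key), next :: acc)
  }._2.reverse
// segmentedMax == Seq(3, 3, 4, 4, 5)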

In performance tests on my local box, row_number and count are only slightly faster now than they were under the old window code, but min, max, and sum all show significant performance gains, similar to those I showed were possible in rapidsai/cudf#8440.

For a large max running window with no partition-by I have seen performance improvements of 171x (cold) and 542x (hot) compared to the CPU:

scala> spark.time(spark.range(0L, Int.MaxValue, 1L, numPartitions=1).select(max("id").over(Window.orderBy("id").rowsBetween(Window.unboundedPreceding, 0)).as("RN"), col("id")).orderBy(desc("RN")).show)
...
Time taken: 3633 ms
scala> spark.time(spark.range(0L, Int.MaxValue, 1L, numPartitions=1).select(max("id").over(Window.orderBy("id").rowsBetween(Window.unboundedPreceding, 0)).as("RN"), col("id")).orderBy(desc("RN")).show)
...
Time taken: 1153 ms
scala> spark.conf.set("spark.rapids.sql.enabled", "false")
scala> spark.time(spark.range(0L, Int.MaxValue, 1L, numPartitions=1).select(max("id").over(Window.orderBy("id").rowsBetween(Window.unboundedPreceding, 0)).as("RN"), col("id")).orderBy(desc("RN")).show)
...
Time taken: 622094 ms
scala> spark.conf.set("spark.rapids.sql.enabled", "false")
scala> spark.time(spark.range(0L, Int.MaxValue, 1L, numPartitions=1).select(max("id").over(Window.orderBy("id").rowsBetween(Window.unboundedPreceding, 0)).as("RN"), col("id")).orderBy(desc("RN")).show)
...
Time taken: 625660 ms

This is a special case: when no partition-by is given, all of the data goes to a single task, so the CPU only gets a single core to process it. But as you can see, it would still take hundreds of CPU cores in the partitioned case to offset the performance gains.
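
For reference, a partitioned variant of the query above might look like the following sketch (the 16 input partitions and 100-way part key are made up for illustration). With a partition-by, Spark can spread the partitions across tasks and CPU cores, while each partition remains a linear running scan:

import org.apache.spark.sql.functions._
import org.apache.spark.sql.expressions.Window

// Hypothetical partitioned running-window query; the sizes are illustrative.
val df = spark.range(0L, Int.MaxValue, 1L, numPartitions = 16)
  .withColumn("part", col("id") % 100)
val w = Window.partitionBy("part").orderBy("id")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)
spark.time(df.select(max("id").over(w).as("running_max"), col("id"), col("part")).show)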

With the previous GPU code I could not complete this run at all; I had to kill it before my GPU overheated.

This is a stepping stone toward supporting rank and dense_rank.

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
@revans2 revans2 added the performance A performance related task/issue label Jul 9, 2021
@revans2 revans2 added this to the July 5 - July 16 milestone Jul 9, 2021
@revans2 revans2 self-assigned this Jul 9, 2021
@revans2 (Collaborator, Author) commented Jul 9, 2021

I tested this on Databricks and it works there too.

@revans2 (Collaborator, Author) commented Jul 9, 2021

build

@revans2 (Collaborator, Author) commented Jul 12, 2021

During the review work I accidentally checked in a change that makes this require the fix from rapidsai/cudf#8705. I am inclined to wait for that to get merged, but if others want to merge this sooner I can revert the small change and do a follow-on PR when the cudf change does get merged in.

@revans2 (Collaborator, Author) commented Jul 13, 2021

build

@revans2 (Collaborator, Author) commented Jul 13, 2021

build

@revans2 revans2 merged commit 31873a0 into NVIDIA:branch-21.08 Jul 13, 2021
@revans2 revans2 deleted the window_scan branch July 13, 2021 18:16