Describe the bug
Running `test_cartesian_join_special_case_count` fails with `Caused by: java.lang.NoClassDefFoundError: com/nvidia/spark/rapids/Arm`:
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.$anonfun$next$3(GpuShuffleCoalesceExec.scala:214)
E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.withResource(GpuShuffleCoalesceExec.scala:191)
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.$anonfun$next$2(GpuShuffleCoalesceExec.scala:213)
E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.withResource(GpuShuffleCoalesceExec.scala:191)
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.$anonfun$next$1(GpuShuffleCoalesceExec.scala:207)
E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.withResource(GpuShuffleCoalesceExec.scala:191)
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.next(GpuShuffleCoalesceExec.scala:206)
E at com.nvidia.spark.rapids.GpuShuffleCoalesceIterator.next(GpuShuffleCoalesceExec.scala:191)
E at com.nvidia.spark.rapids.GpuHashAggregateIterator.aggregateInputBatches(aggregate.scala:283)
E at com.nvidia.spark.rapids.GpuHashAggregateIterator.$anonfun$next$2(aggregate.scala:238)
E at scala.Option.getOrElse(Option.scala:189)
E at com.nvidia.spark.rapids.GpuHashAggregateIterator.next(aggregate.scala:235)
E at com.nvidia.spark.rapids.GpuHashAggregateIterator.next(aggregate.scala:181)
E at com.nvidia.spark.rapids.ColumnarToRowIterator.$anonfun$fetchNextBatch$2(GpuColumnarToRowExec.scala:241)
E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
E at com.nvidia.spark.rapids.ColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:187)
E at com.nvidia.spark.rapids.ColumnarToRowIterator.fetchNextBatch(GpuColumnarToRowExec.scala:238)
E at com.nvidia.spark.rapids.ColumnarToRowIterator.loadNextBatch(GpuColumnarToRowExec.scala:215)
E at com.nvidia.spark.rapids.ColumnarToRowIterator.hasNext(GpuColumnarToRowExec.scala:255)
E at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
E at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:349)
E at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
E at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
E at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
E at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
E at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
E at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
E at org.apache.spark.scheduler.Task.run(Task.scala:131)
E at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
E at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
E at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
E at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
E at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
E ... 1 more
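Every frame named `Arm.withResource` in the trace above is the plugin's resource-management idiom: run a body against an `AutoCloseable` and guarantee `close()` even when the body throws. A minimal sketch of that idiom (an illustration only, not the actual spark-rapids implementation; the class and method names here are just mirrors of the trace):

```java
import java.util.function.Function;

public class ArmSketch {
    // Run `body` against `resource`, closing the resource on every exit path.
    static <R extends AutoCloseable, T> T withResource(R resource, Function<R, T> body) {
        try (R r = resource) {
            return body.apply(r);
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] closed = {false};
        AutoCloseable res = () -> closed[0] = true;
        int out = withResource(res, r -> 42);
        System.out.println(out);        // prints 42
        System.out.println(closed[0]);  // prints true
    }
}
```

The `NoClassDefFoundError` fires the first time a frame like this tries to resolve the `Arm` mixin, because the wrong classloader is asked to do the resolution.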
This issue is caused by #4588, which in 22.04 added a Scala class `ai.rapids.cudf.HostConcatResultUtil`. Our build assumes that all classes under `ai.rapids` are Java classes from the rapidsai/cudf repo with no compatibility issues, and keeps them in the conventional jar location. `HostConcatResultUtil`, however, is a Scala class with direct references to `Arm`, which is not visible to the conventional classloader. So whenever we are not forcefully modifying the "conventional" classloader, we hit this issue.
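The visibility problem is standard parent-first JVM classloader delegation: a child loader resolves classes by delegating up to its parent, but a parent can never see classes that exist only in a child. A small hedged demo (it assumes `com.nvidia.spark.rapids.Arm` is not on the plain classpath of the demo JVM, which is exactly the situation described above):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationDemo {
    public static void main(String[] args) throws Exception {
        ClassLoader app = DelegationDemo.class.getClassLoader();
        try (URLClassLoader child = new URLClassLoader(new URL[0], app)) {
            // Upward delegation works: the child finds JDK classes via its parents.
            Class<?> list = Class.forName("java.util.ArrayList", false, child);
            System.out.println("child sees JDK class: " + (list != null));
        }
        // Downward visibility does not exist: the conventional (parent) loader
        // cannot resolve a class that only a child/shim loader could provide.
        try {
            Class.forName("com.nvidia.spark.rapids.Arm", false, app);
            System.out.println("unexpectedly found");
        } catch (ClassNotFoundException e) {
            System.out.println("parent cannot see shim-only class");
        }
    }
}
```

This is why a class placed in the "conventional" `ai.rapids` location must not reference classes that live only behind the shim classloader.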
Don't use the `ai.rapids.cudf` package for spark-rapids Scala classes. Otherwise they will be loaded by the conventional classloader and fail to load classes referenced from the shimmed areas.
- Move the class
- Add a smoke test to prevent this sort of regression in premerge
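One hedged sketch of what such a premerge smoke test might assert: no spark-rapids Scala classes may hide under the `ai/rapids/` prefix of the shaded jar. The `$.class` suffix (Scala companion-object marker) is a cheap heuristic chosen purely for illustration; a real check could instead try loading each such class with the conventional classloader.

```java
public class PackagePlacementCheck {
    /** True when a jar entry name looks like a Scala class under ai/rapids/. */
    static boolean isSuspect(String jarEntryName) {
        return jarEntryName.startsWith("ai/rapids/")
            && jarEntryName.endsWith("$.class");
    }

    public static void main(String[] args) {
        System.out.println(isSuspect("ai/rapids/cudf/HostConcatResultUtil$.class")); // prints true
        System.out.println(isSuspect("ai/rapids/cudf/ColumnVector.class"));          // prints false
        System.out.println(isSuspect("com/nvidia/spark/rapids/Arm$.class"));         // prints false
    }
}
```

Iterating the actual jar entries (e.g. via `java.util.jar.JarFile`) and failing the build on any suspect entry would turn this predicate into a CI gate.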
Closes #5513.
Depends on rapidsai/cudf#10949
Signed-off-by: Gera Shegalov <gera@apache.org>
Steps/Code to reproduce bug
invoke:
Expected behavior
The test should pass.
Environment details (please complete the following information)
Environment location: Standalone
Additional context
Originally reported by @pxLi (h/t).