-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transition to v2 shims [Databricks] #4857
Conversation
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
… ShimLoader Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
so one thing to be aware of here is that any of our partners that have shim layers are going to have to transition as well. which means more work on their side so lets try to get this done how we want in one release. |
...src/main/301db/scala/com/nvidia/spark/rapids/shims/spark301db/SparkShimServiceProvider.scala
Show resolved
Hide resolved
sql-plugin/src/main/301db/scala/com/nvidia/spark/rapids/v2/RapidsShuffleManager.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/31xdb/scala/com/nvidia/spark/rapids/shims/v2/Spark31XdbShimsBase.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/320/scala/com/nvidia/spark/rapids/shims/v2/RapidsShuffleManager.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/322/scala/com/nvidia/spark/rapids/shims/v2/RapidsShuffleManager.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/330/scala/com/nvidia/spark/rapids/v2/RapidsShuffleManager.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/ShimLoader.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
| Spark Shim | spark.shuffle.manager value | | ||
| --------------| -------------------------------------------------------- | | ||
| 3.0.1 | com.nvidia.spark.rapids.spark301.RapidsShuffleManager | | ||
| 3.0.2 | com.nvidia.spark.rapids.spark302.RapidsShuffleManager | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did we end up solving incompatible ShuffleManager base across Apache Spark versions. I thought having a single facade class across shims required dynamic bytecode generation at Runtime depending on the shim because compile-time method signatures cannot be consistent for each Spark version?
I believe one of the issues in 30x vs 32x
where there is a reference to ShuffleBlockResolver https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/shuffle/ShuffleManager.scala#L92
where in 31x we have a transitive reference to MergedBlockMeta
https://github.com/apache/spark/blob/branch-3.2/core/src/main/scala/org/apache/spark/shuffle/ShuffleBlockResolver.scala#L56
However there is no MergedBlockMeta in branch-3.0. And we should get a class verification exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested everything after 3.1.1+ and it worked. I have reverted the change as it wasn't my original intention to remove the duplicate ShuffleManagers. The original intention was to get rid of the dynamic lookup of the shims at runtime.
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Will be fixed as a part of a separate patch Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
sql-plugin/src/main/301/scala/com/nvidia/spark/rapids/spark301/RapidsShuffleManager.scala
Outdated
Show resolved
Hide resolved
...main/301db/scala/com/nvidia/spark/rapids/shims/spark301db/RapidsShuffleInternalManager.scala
Outdated
Show resolved
Hide resolved
...c/main/312-nondb/scala/com/nvidia/spark/rapids/shims/spark312/SparkShimServiceProvider.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/320/scala/com/nvidia/spark/rapids/shims/SparkShims.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/ShimLoader.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/ShimLoader.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
@razajafri there are more references to non-existent shims dir that will need updating
|
we do exactly match for post-fix trigger so this one was actually passed CI w/o db tests, created ticket here #4918 FYI also I decide to get the check case-insensitive to avoid cases like this #4919 |
This PR in a nutshell
com.nvidia.spark.rapids.shims.v2
so we don't have to dynamically look them up at runtimeShimLoader
when accessing the Shimsfixes #4474