[QST] Exposing the GPU operator exec classes to user #3873
What is your question?
I'm trying to use the following classes to check the RAPIDS-overridden physical plan (for internal validation that our Spark jobs are using RAPIDS acceleration as expected):
I can see these classes as part of the v21.10 jar:
However, on importing these classes I see the following error:
Can we have these classes loadable outside of Catalyst by the user?
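A minimal sketch of the kind of instanceOf-based check being attempted, assuming GpuProjectExec (one exec class named later in the thread) and a stand-in DataFrame `df`; per the discussion below, the import fails at runtime because the class is shimmed:

```scala
// Assumed: the spark-rapids jar is on the classpath and GpuProjectExec is
// one of the exec classes in question. This direct, class-based check is
// what does NOT work, since the shimmed class isn't loadable from user code.
import com.nvidia.spark.rapids.GpuProjectExec

val gpuProjects = df.queryExecution.executedPlan.collect {
  case p: GpuProjectExec => p
}
assert(gpuProjects.nonEmpty, "expected the projection to run on the GPU")
```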
Comments
These are not public-facing APIs, so the jar doesn't expose them without first doing some classloading work to load the files for the correct version of Spark. This is part of how we support multiple Spark versions in a single jar. Can you be more specific about what you are trying to do here, i.e. what setup and environment are you doing this from? And what is your test trying to compare - i.e. do you need isInstanceOf, or can you just look at the class name? What are you looking at: explain() output, sparkPlan, etc.?
I am doing this as part of our regular system runs (in our production environment) to verify that, for GPU scenarios, the expected operators are being overridden properly. I do this as part of batch jobs, by loading the physical plan (…).
Do you have to use the actual SparkPlan and Execs with instanceOf, or can you just compare the strings? You can walk the plan, get the class name or nodeName as a string, and just see if it contains GpuProjectExec, for instance.
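A minimal sketch of the string-based comparison suggested above, with `df` as a stand-in DataFrame from the job under test:

```scala
// No GPU exec classes are needed on the caller's side: nodeName (or
// getClass.getName) is compared as a plain string.
val plan = df.queryExecution.executedPlan
val gpuNodes = plan.collect {
  case node if node.nodeName.contains("GpuProject") => node.nodeName
}
assert(gpuNodes.nonEmpty, s"expected GpuProjectExec in:\n${plan.treeString}")
```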
@tgravescs I can explore doing that, but for now could you walk me through the change needed to make these APIs public? We can make the change in our internal repo of Spark RAPIDS, so it doesn't need to go into the OSS repo.
Sure, you can find some related docs here: https://github.com/NVIDIA/spark-rapids/tree/branch-21.12/dist Essentially you need to add the classes to https://github.com/NVIDIA/spark-rapids/blob/branch-21.12/dist/unshimmed-common-from-spark301.txt.
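For illustration, a hypothetical entry for one exec class, assuming the file takes one class-file pattern per line (check the existing entries in the file for the exact convention):

```
com/nvidia/spark/rapids/GpuProjectExec*
```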
I agree with @tgravescs and would also recommend just using nodeName / toString. Going the route of modifying unshimmed-common can quickly turn into classloader-related problems for production jobs, even in a single-shim jar. If this functionality is really needed, I think you need to create a custom single-shim build (we can make that easy with #3682), or we can look into providing / generating the required checks. We could add some …
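Purely as a hypothetical illustration of that last point (this helper does not exist in spark-rapids), a provided check could validate plans by node name alone, so user code never loads anything shimmed:

```scala
import org.apache.spark.sql.execution.SparkPlan

// Hypothetical helper, not an existing spark-rapids API: matches on node
// names only, so no shimmed GPU exec classes are loaded by the caller.
object GpuPlanValidation {
  def containsGpuOperator(plan: SparkPlan, opName: String): Boolean =
    plan.find(_.nodeName.contains(opName)).isDefined
}
```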
@gerashegalov @tgravescs Can we look into the below, as suggested by Gera:
While trying to add the operator exec classes, I'm seeing they depend on a lot of classes like …
If there are a lot of dependencies, then you might just end up with most of the classes in that file, which seems like a lot of work.
Got it, thanks @tgravescs. We will proceed with validating using node name strings. Closing this issue.