[FEA] Explore javaagent API for loading shims and config simplification #3803

Closed
gerashegalov opened this issue Oct 12, 2021 · 1 comment
Labels
feature request New feature or request

Comments

@gerashegalov
Collaborator

Is your feature request related to a problem? Please describe.

Loading shims from the Parallel Worlds jar in 21.10 took several iterations to get right because we manipulate classloaders relatively late in the Spark JVM lifecycle. The approach emerging as the most robust is to manipulate the caller's classloader (mostly AppClassLoader) by calling a protected method via reflection.

We also try to accommodate loading of classes such as ShuffleManager that happens before spark.plugins are loaded by using a lazy proxy approach.

Describe the solution you'd like
The JVM provides the Java agent API, which gives the agent access to an instance implementing Instrumentation:

public static void premain(String agentArgs, Instrumentation inst);

This should allow the agent portion of the plugin to determine the Spark version and then add the right Parallel Worlds jar URL to the system classpath using a public API, without reflection.
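A minimal sketch of what such an agent could look like, using only the public `Instrumentation.appendToSystemClassLoaderSearch(JarFile)` call. The class name `ShimAgent`, the `sparkXYZ` naming scheme, the one-jar-per-shim layout, and passing the Spark version through the agent arguments are all assumptions made for illustration; they are not how the plugin is known to work.

```java
import java.io.File;
import java.io.IOException;
import java.lang.instrument.Instrumentation;
import java.util.jar.JarFile;

public final class ShimAgent {
    private ShimAgent() {}

    // Hypothetical naming scheme: "3.1.2" -> "spark312".
    public static String shimIdFor(String sparkVersion) {
        String[] parts = sparkVersion.trim().split("\\.");
        if (parts.length < 3) {
            throw new IllegalArgumentException(
                "Expected major.minor.patch, got: " + sparkVersion);
        }
        // Drop any suffix such as "-SNAPSHOT" from the patch component.
        String patch = parts[2].replaceAll("[^0-9].*$", "");
        return "spark" + parts[0] + parts[1] + patch;
    }

    // Called by the JVM before the application's main method.
    // Assumption: agentArgs has the form "<sparkVersion>,<shimJarDirectory>".
    public static void premain(String agentArgs, Instrumentation inst)
            throws IOException {
        if (agentArgs == null || !agentArgs.contains(",")) {
            throw new IllegalArgumentException(
                "Expected <sparkVersion>,<shimJarDirectory>");
        }
        String[] args = agentArgs.split(",", 2);
        File shimJar = new File(args[1], shimIdFor(args[0]) + ".jar");
        // Public API: appends the jar to the system class loader's search path.
        inst.appendToSystemClassLoaderSearch(new JarFile(shimJar));
    }
}
```

Under these assumptions the agent jar would declare `Premain-Class: ShimAgent` in its manifest and be enabled with something like `-javaagent:agent.jar=3.1.2,/path/to/shims` in the driver and executor JVM options.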

This will also solve the chicken-and-egg problem with the shuffle manager:
a) if we want, we will be able to use a single class name for the Rapids shuffle manager
b) we can add a boolean config for using RapidsShuffleManager; the shuffle manager class name then does not really matter because it will no longer be exposed to the user.

We can also remove some of the boilerplate code that just delegates calls to wrapped objects by [generating it at load time](https://www.baeldung.com/java-instrumentation).
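To illustrate the kind of delegation boilerplate in question, here is a sketch that generates a forwarding wrapper at run time with the JDK's `java.lang.reflect.Proxy`. This is a simpler, related technique rather than the bytecode instrumentation described in the linked article, and it only works for interfaces. The `Delegation` class and the `Greeter` interface are hypothetical stand-ins.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public final class Delegation {
    private Delegation() {}

    // Hypothetical stand-in for an interface whose calls are merely forwarded.
    public interface Greeter {
        String greet(String name);
    }

    // Returns an object implementing iface that forwards every call to target,
    // so no hand-written delegating method is needed per interface method.
    public static <T> T delegating(Class<T> iface, T target) {
        Object proxy = Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] {iface},
            (self, method, args) -> {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    // Rethrow the wrapped object's own exception.
                    throw e.getCause();
                }
            });
        return iface.cast(proxy);
    }
}
```

For example, `Delegation.delegating(Delegation.Greeter.class, realGreeter)` yields a `Greeter` whose `greet` call is passed straight through to `realGreeter`.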

Describe alternatives you've considered
Keeping the approach used in 21.10.

Additional context

@gerashegalov gerashegalov added feature request New feature or request ? - Needs Triage Need team to review and classify labels Oct 12, 2021
@Salonijain27 Salonijain27 removed the ? - Needs Triage Need team to review and classify label Oct 26, 2021
@gerashegalov
Collaborator Author

This will be solved more simply via apache/spark#43627
