Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20059][YARN] Use the correct classloader for HBaseCredentialProvider #17388

Closed
wants to merge 3 commits into from

Conversation

jerryshao
Copy link
Contributor

What changes were proposed in this pull request?

Currently we use system classloader to find HBase jars, if it is specified by --jars, then it will be failed with ClassNotFound issue. So here changing to use child classloader.

Also putting added jars and main jar into classpath of submitted application in yarn cluster mode, otherwise HBase jars specified with --jars will never be honored in cluster mode, and fetching tokens in client side will always be failed.

How was this patch tested?

Unit test and local verification.

Change-Id: I04d6ffcdca58277e60d3dd0c456f4c6a2936b320
@SparkQA
Copy link

SparkQA commented Mar 22, 2017

Test build #75052 has finished for PR 17388 at commit 9e15391.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Copy link
Contributor Author

jerryshao commented Mar 23, 2017

@vanzin @tgravescs @mridulm do you think it necessary to add additional jars and main jar into classloader for yarn cluster mode?

In my case I run Spark with HBase in secure cluster, so I need to specify hbase jars with --jars to make HBaseCredentailProvider work. But fortunately in yarn cluster mode, this jars are not added into classloader, so it will fail to get HBase token with class not found issue.

This also applies to the customized credential provider, if we write a customized one and package into main jar, then it will be failed to load by ServiceLoader because this main jar is not presented in client's classloader.

Though this could be fixed by expanding launch classpath (like SPARK_CLASSPATH) as a workaround, I think a good solution is to add to child's classpath.

What do you think, is there any concern to put these jars into child's classpath in yarn cluster mode? Thanks a lot.

Change-Id: Ib5e831e46aee395833af5ac4829f7da0a5e6282c
@SparkQA
Copy link

SparkQA commented Mar 23, 2017

Test build #75078 has finished for PR 17388 at commit ec48ccf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Mar 25, 2017

I think that's ok; the jars are still added to a separate class loader, so they won't affect the Spark classes. The only caveat is that --jars also accepts remote paths, which won't work for the purpose of getting HBase tokens. But then those jars are also ignored for anything run in client mode.

@jerryshao
Copy link
Contributor Author

Thanks @vanzin for your comments. Yes, remote jars will also not be added to client's classpath currently.

Any further comments?

@@ -485,12 +485,17 @@ object SparkSubmit extends CommandLineUtils {

// In client mode, launch the application main class directly
// In addition, add the main application jar and any added jars (if any) to the classpath
if (deployMode == CLIENT) {
// Also add the main application jar and any added jars to classpath in case yarn#client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/yarn#client/the YARN client/

Otherwise it seems you're talking about yarn-client mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vanzin , just updated the comment.

@vanzin
Copy link
Contributor

vanzin commented Mar 28, 2017

Minor comment on a comment, otherwise LGTM.

Change-Id: I9c694b1789a3216ac49f55b851090d0170fd5aec
@SparkQA
Copy link

SparkQA commented Mar 29, 2017

Test build #75334 has finished for PR 17388 at commit 92d9587.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Mar 29, 2017

Merging to master / 2.1.

asfgit pushed a commit that referenced this pull request Mar 29, 2017
…ovider

## What changes were proposed in this pull request?

Currently we use system classloader to find HBase jars, if it is specified by `--jars`, then it will be failed with ClassNotFound issue. So here changing to use child classloader.

Also putting added jars and main jar into classpath of submitted application in yarn cluster mode, otherwise HBase jars specified with `--jars` will never be honored in cluster mode, and fetching tokens in client side will always be failed.

## How was this patch tested?

Unit test and local verification.

Author: jerryshao <sshao@hortonworks.com>

Closes #17388 from jerryshao/SPARK-20059.

(cherry picked from commit c622a87)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
@asfgit asfgit closed this in c622a87 Mar 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants