[jvm-packages] xgboost-spark warning when Spark encryption is turned on #3667
s"Spark Conf spark.ssl.enabled=true was overridden with xgboost.spark.ignoreSsl=true.")
} else {
throw new Exception("xgboost-spark found spark.ssl.enabled=true to encrypt data " +
"in transit, but xgboost-spark uses MPI to send non-encrypted data over the wire. " +
It doesn't use MPI to send data.
Right :P Will fix!
LGTM except for one minor message. cc: @CodingCat for making a pass
On parental leave... will look at it later, sorry for the delay.
The last failure was spurious (from a timeout). Pushing a NOP update to retest.
It seems like other Travis builds are having timeout issues too. Will return to this tomorrow.
@CodingCat OK, definitely understandable that you're busy! @hcho3 Any chance you'd be able to check this out? I appreciate it!
@jkbradley Wish I knew anything about the Java world hehe :)
LGTM, left one minor comment
"To override this protection and still use xgboost-spark at your own risk, " +
"you can set the SparkSession conf to use xgboost.spark.ignoreSsl=true.")
}
}
Would you please move this part, along with the lines up to L210, into a separate function (validation something)? This function has become too long.
Done! Thanks for checking this out :)
LGTM. Merged to master, thanks!
Thank you! Btw, do you know what the ETA for a new release of xgboost-spark is?
…on (dmlc#3667)
* added test, commented out right now
* reinstated test
* added fix for checking encryption settings
* fix by using RDD conf
* fix compilation
* renamed conf
* use SparkSession if available
* fix message
* nop
* code review fixes
Tracking GitHub issue: #3647
Issue: Apache Spark users running XGBoost on Spark may expect over-the-wire encryption for XGBoost based on Spark confs, but they will not get it and may never know about the security issue.
This PR changes XGBoost.trainDistributed to check the Spark conf spark.ssl.enabled and throw an exception if the user expects over-the-wire encryption. The error message tells the user how to bypass this security check at their own risk.
The user can tell xgboost-spark to ignore spark.ssl.enabled by setting an xgboost-specific conf in the SparkConf: xgboost.spark.ignoreSsl.
This adds 1 unit test with XGBoostClassifier to check this behavior.
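For readers skimming the thread, here is a rough sketch of the decision logic described above. It is not the merged code: the object and method names (`SslConfCheck`, `validateSslConf`) and the `Outcome` type are hypothetical, a plain `Map[String, String]` stands in for the Spark conf so the snippet runs without a Spark dependency, and the exception text is paraphrased from the diff fragments in this PR with the MPI wording left out, per the review comment.

```scala
// Illustrative sketch only; names here are hypothetical, not the PR's code.
object SslConfCheck {
  // Result of the check when it does not throw.
  sealed trait Outcome
  case object NoSslRequested extends Outcome
  final case class Overridden(warning: String) extends Outcome

  // `conf` is a stand-in for the Spark conf (assumption: string key/value pairs).
  def validateSslConf(conf: Map[String, String]): Outcome = {
    // Both keys default to false when unset.
    val sslEnabled = conf.getOrElse("spark.ssl.enabled", "false").toBoolean
    val ignoreSsl = conf.getOrElse("xgboost.spark.ignoreSsl", "false").toBoolean

    if (!sslEnabled) {
      // The user did not ask for encryption, so there is nothing to flag.
      NoSslRequested
    } else if (ignoreSsl) {
      // The user explicitly opted out of the protection: warn, do not fail.
      Overridden(
        "Spark Conf spark.ssl.enabled=true was overridden with " +
          "xgboost.spark.ignoreSsl=true.")
    } else {
      // The user expects encryption that xgboost-spark does not provide.
      throw new IllegalStateException(
        "xgboost-spark found spark.ssl.enabled=true to encrypt data in transit, " +
          "but xgboost-spark sends non-encrypted data over the wire. " +
          "To override this protection and still use xgboost-spark at your own risk, " +
          "you can set the SparkSession conf to use xgboost.spark.ignoreSsl=true.")
    }
  }
}
```

In an actual job the override would be set on the Spark side rather than in a map, for example with `spark.conf.set("xgboost.spark.ignoreSsl", "true")` on an existing SparkSession; the sketch only mirrors the three cases (no SSL requested, SSL overridden, SSL requested without override).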