[jvm-packages] cross-version spark support #4350

Closed
CodingCat opened this issue Apr 9, 2019 · 5 comments · Fixed by #4377

@CodingCat
Member

@hcho3 I am going to work on supporting Spark 2.4.1 and on compatibility tests against Spark 2.3.

My plan is to trigger two builds, for Spark 2.4 and 2.3 respectively, and also add version-specific tests to ensure compatibility.

Shall I wait for the Java workers to be ready in Jenkins, or should I work on Travis?

@CodingCat CodingCat self-assigned this Apr 9, 2019
@hcho3
Collaborator

hcho3 commented Apr 9, 2019

Let me add Java workers to Jenkins. Can you provide commands to compile JARs?

@CodingCat
Member Author

Yes, just mvn package.

@srowen
Contributor

srowen commented Apr 11, 2019

That's great @CodingCat -- would be great to get a 2.4.x build going as 2.3.x is EOL in a few months. I suspect you have this well in hand, but if you're hitting weird problems updating to 2.4 (shouldn't be much) I'd be happy to try to debug.

@CodingCat
Member Author

I would limit the definition of cross-version support in XGBoost to "support loading models trained in a previous version".

I have done several experiments running an XGBoost built against Spark 2.4 on Spark 2.3, and vice versa. The most significant problem comes from the libraries Spark depends on, which bring breaking changes of their own. Because of that, we cannot guarantee that a version built against Spark 2.4 will run on a Spark 2.3 runtime.

Even to "support loading models trained in a previous version", we need some code to handle (1) breaking changes in XGBoost parameters, e.g. reg:linear doesn't exist anymore; (2) breaking changes in Spark, e.g. VectorAssembler will fail on Float.NaN by default...
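
For point (1), a minimal sketch of what such handling code could look like, in plain Java. `LegacyParams` and its rename table are hypothetical and not part of XGBoost4J; the only entry is the reg:linear case mentioned above, and mapping it to reg:squarederror is an assumption about the intended replacement.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of XGBoost4J: rewrites objective names that
// an older saved model may carry but a newer XGBoost no longer accepts.
final class LegacyParams {
    // Assumed rename table; only reg:linear is taken from this thread, and
    // reg:squarederror as its replacement is an assumption.
    private static final Map<String, String> RENAMED_OBJECTIVES = new HashMap<>();
    static {
        RENAMED_OBJECTIVES.put("reg:linear", "reg:squarederror");
    }

    private LegacyParams() {}

    /** Returns a copy of params with a legacy "objective" value replaced. */
    static Map<String, Object> migrate(Map<String, Object> params) {
        Map<String, Object> out = new HashMap<>(params);
        Object objective = out.get("objective");
        if (objective instanceof String) {
            String renamed = RENAMED_OBJECTIVES.get(objective);
            if (renamed != null) {
                out.put("objective", renamed);
            }
        }
        return out;
    }
}
```

The idea would be to call `LegacyParams.migrate(...)` on the parameter map read from an old model before handing it to the current estimator. Unknown objectives and all other parameters pass through unchanged.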

@srowen
Contributor

srowen commented Apr 16, 2019

Personally, I'd relax that condition if needed -- require the same version of xgboost / Spark to read/write things.
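
A rough sketch of how such a relaxed rule could be enforced at load time, in plain Java. `VersionGuard` is a hypothetical name, and comparing only the major.minor part of the version string is an assumption; a stricter check could compare the full strings instead.

```java
// Hypothetical guard, not part of XGBoost4J or Spark: refuses to load a model
// when the version it was saved with is from a different release line.
final class VersionGuard {
    private VersionGuard() {}

    /** Extracts "major.minor" from a version string such as "2.4.1". */
    static String majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    /** True when both versions share the same major.minor line. */
    static boolean sameLine(String savedWith, String runtime) {
        return majorMinor(savedWith).equals(majorMinor(runtime));
    }

    /** Throws if the saved and runtime versions are on different lines. */
    static void requireSameLine(String savedWith, String runtime) {
        if (!sameLine(savedWith, runtime)) {
            throw new IllegalStateException(
                "Model saved with " + savedWith + " cannot be read on " + runtime);
        }
    }
}
```

The same check could be applied twice when reading a model, once for the Spark version and once for the XGBoost version, assuming both were recorded when the model was written.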

@CodingCat CodingCat mentioned this issue Apr 22, 2019
18 tasks
@lock lock bot locked as resolved and limited conversation to collaborators Jul 16, 2019
3 participants