[jvm-packages]fix XGBoost-on-Spark SparseVector missing value problem #4455

Closed
wants to merge 2 commits into from



@adobay adobay commented May 10, 2019

Treat SparseVector as DenseVector.
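
For readers unfamiliar with the two representations, here is a minimal sketch of what "treating a SparseVector as a DenseVector" amounts to. It is plain Python rather than the Spark or XGBoost4J API; the helper name `to_dense` and the `(size, indices, values)` triple are assumptions made for illustration, and unstored slots are assumed to mean 0.0.

```python
from typing import List, Sequence


def to_dense(size: int, indices: Sequence[int], values: Sequence[float]) -> List[float]:
    """Expand a sparse (size, indices, values) triple into a dense list.

    Hypothetical helper for illustration only; not part of Spark or XGBoost.
    """
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    dense = [0.0] * size  # every unstored slot becomes an explicit 0.0
    for i, v in zip(indices, values):
        if not 0 <= i < size:
            raise IndexError(f"index {i} out of range for size {size}")
        dense[i] = v
    return dense


# Two stored entries out of six slots.
example = to_dense(6, [1, 4], [3.5, -2.0])
print(example)
```

Once densified, every slot carries an explicit value, so nothing downstream has to decide whether an absent entry means zero or missing. The cost is that all `size` entries are stored.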

@adobay adobay changed the title fix XGBoost-on-Spark missing value problem [jvm-packages]fix XGBoost-on-Spark missing value problem May 10, 2019
@adobay
Author

adobay commented May 10, 2019

the problem
#4315
#3742
#3634
#4227

@adobay adobay changed the title [jvm-packages]fix XGBoost-on-Spark missing value problem [jvm-packages]fix XGBoost-on-Spark SparseVector missing value problem May 10, 2019
@CodingCat
Member

Treating everything as a dense vector wastes a significant amount of memory, and the problem has been fixed in #4349. Are you still observing something out of order?

@hcho3
Collaborator

hcho3 commented May 10, 2019

@CodingCat I'll defer to you on whether this pull request should make it into the 0.90 release. However, I'm concerned that this PR significantly increases memory consumption and has caused some tests to fail.

@shishaochen
Contributor

Memory usage is extremely important in distributed training on large datasets.
In my case, thousands of slots are defined while only a small set of them is used per training job. To keep the feature indices unchanged, sparse vectors are a must.
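
To put rough numbers on this, here is a back-of-the-envelope sketch in plain Python. The slot counts are hypothetical, and the byte sizes assume 8-byte doubles for values and 4-byte integers for indices, ignoring object and array overhead.

```python
def dense_bytes(num_slots: int) -> int:
    """Approximate payload of a dense vector: one 8-byte double per slot."""
    return 8 * num_slots


def sparse_bytes(num_active: int) -> int:
    """Approximate payload of a sparse vector: a 4-byte index plus an
    8-byte double for each stored (active) entry."""
    return (4 + 8) * num_active


# Hypothetical job: 5000 slots defined, 50 of them used per row.
num_slots = 5000
num_active = 50

print("dense :", dense_bytes(num_slots), "bytes per row")
print("sparse:", sparse_bytes(num_active), "bytes per row")
print("ratio :", round(dense_bytes(num_slots) / sparse_bytes(num_active), 1))
```

Under these assumptions the dense row is dozens of times larger. The sparse row also keeps each stored value at its original slot index, so feature indices do not need to be remapped.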

@CodingCat
Member

I will close this PR soon, as the problem has been fixed by other PRs.

@CodingCat CodingCat closed this May 12, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Aug 10, 2019
4 participants