
update docs for gpu external memory #5332

Merged: 3 commits merged into dmlc:master from the external-memory-guide branch on Feb 22, 2020

Conversation

@rongou (Contributor) commented Feb 20, 2020

@mli (Member) commented Feb 20, 2020

Codecov Report

Merging #5332 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master    #5332   +/-   ##
=======================================
  Coverage   83.76%   83.76%           
=======================================
  Files          11       11           
  Lines        2409     2409           
=======================================
  Hits         2018     2018           
  Misses        391      391           

Powered by Codecov. Last update 8aa8ef1...d98d145.

- ``gradient_based``: the selection probability for each training instance is proportional to the
*regularized absolute value* of gradients (more specifically, :math:`\sqrt{g^2+\lambda h^2}`).
``subsample`` may be set to as low as 0.1 without loss of model accuracy. Note that this
sampling method is only supported when ``tree_method`` is set to ``gpu_hist``.
@trivialfis (Member) commented Feb 21, 2020

Maybe leave a note that other tree methods support only uniform sampling?

@rongou (Contributor, Author) replied

Done.
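
For readers following along, here is a minimal sketch of the setting described in the quoted docs above, using the standard xgboost Python API. The synthetic data and the exact parameter values are illustrative only and are not part of this PR.

```python
# Minimal sketch of the documented setting: gradient-based sampling is only
# supported with tree_method='gpu_hist'; other tree methods use uniform sampling.
# The synthetic data below is purely illustrative.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = (X[:, 0] + rng.normal(scale=0.1, size=10_000) > 0).astype(int)

dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "binary:logistic",
    "tree_method": "gpu_hist",            # required for gradient_based sampling
    "sampling_method": "gradient_based",  # probability proportional to sqrt(g^2 + lambda*h^2)
    "subsample": 0.1,                     # docs say this can go as low as 0.1
}
booster = xgb.train(params, dtrain, num_boost_round=50)
```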

@@ -1,6 +1,6 @@
############################################
Using XGBoost External Memory Version (beta)
@trivialfis (Member) commented

Hmm ... external memory support is not available for CPU hist. If we were to declare it stable, we should at least document in the first section that hist is not supported.

@rongou (Contributor, Author) replied

Are you sure? I think I've tried hist for comparison in external memory mode. The limitation right now is that the CPU code keeps the whole histogram in memory, which may still cause out-of-memory errors if the dataset is too large, but the mechanism of using `#cacheprefix` and dealing with multiple pages is supported.

@trivialfis (Member) replied

I'm not. But there's #4093

@rongou (Contributor, Author) replied

Ok, added a limitation noting that hist is not well tested in external memory mode. If you want, I can add the beta back. :)
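
For reference, a minimal sketch of the `#cacheprefix` mechanism discussed in this thread. The file name is hypothetical and the parameters are illustrative, not taken from the PR.

```python
# Minimal sketch of the '#cacheprefix' mechanism discussed above: appending
# '#<prefix>' to a text-file path tells XGBoost to run in external memory mode,
# spilling cache pages to files that start with that prefix.
# 'train.libsvm' is a hypothetical LIBSVM-format training file.
import xgboost as xgb

dtrain = xgb.DMatrix("train.libsvm#dtrain.cache")

params = {
    "objective": "binary:logistic",
    # gpu_hist per this PR; hist also runs but is noted above as not well tested
    "tree_method": "gpu_hist",
}
booster = xgb.train(params, dtrain, num_boost_round=50)
```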

@trivialfis (Member) commented

Otherwise looks good to me.

@trivialfis (Member) left a review comment

LGTM. Thanks for the great work on external memory and sampling method.

@trivialfis trivialfis merged commit d6b31df into dmlc:master Feb 22, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020
@rongou rongou deleted the external-memory-guide branch November 18, 2022 19:01