Add basic unittests for gpu-hist method. #3785

Merged: 1 commit, Oct 15, 2018

Member

trivialfis commented Oct 11, 2018

  • Split histogram building into a separate class.
  • Extract InitCompressedRow definition.
  • Basic tests for gpu-hist.
  • Document the code more verbosely.
  • Removed HistCutUnit.
  • Removed some duplicated copies in GPUHistMaker.

This PR adds some basic unit tests for the gpu-hist method, in preparation for later refactoring (using span, reducing memory requirements, etc.).
Writing the unit tests showed that each step in the algorithm depends on many different, large pieces of data. I hope we can come up with a better way to handle these data (and parameter) dependencies before adding more extensive tests.

Member

RAMitchell left a comment

Good job isolating some of the functionality for testing; this is not easy.

I am a little disturbed by how much manual initialisation of public member variables is necessary to set up these tests. I think it indicates design problems in our code. In my view these members should probably have been private with constructors controlling how they are allowed to be initialised or member functions for modification after initialisation.

It is a problem throughout xgboost that it is very easy to initialize classes incorrectly, resulting in segfaults and other problems. If a class is correctly designed, its interface should prevent misuse by other developers.

This is not a problem to be addressed by this PR, just some comments in general.
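
To illustrate the general point about private members and constructors, here is a minimal sketch. `RowBatch` and all of its members are hypothetical names invented for this example and are not taken from xgboost; the only aim is to show a constructor that refuses inconsistent input, so that a test cannot build a half-initialised object by poking at public fields.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical example class, not part of xgboost. The buffer and its
// dimensions are private, so they cannot drift out of sync after construction.
class RowBatch {
 public:
  // The constructor is the only way to set the state. It rejects
  // inconsistent input instead of leaving a half-initialised object behind.
  RowBatch(std::size_t n_rows, std::size_t n_cols, std::vector<float> values)
      : n_rows_(n_rows), n_cols_(n_cols), values_(std::move(values)) {
    if (values_.size() != n_rows_ * n_cols_) {
      throw std::invalid_argument("values.size() must equal n_rows * n_cols");
    }
  }

  std::size_t Rows() const { return n_rows_; }
  std::size_t Cols() const { return n_cols_; }

  // Bounds-checked read access in row-major order.
  float At(std::size_t row, std::size_t col) const {
    if (row >= n_rows_ || col >= n_cols_) {
      throw std::out_of_range("RowBatch::At index out of range");
    }
    return values_[row * n_cols_ + col];
  }

 private:
  std::size_t n_rows_;
  std::size_t n_cols_;
  std::vector<float> values_;
};
```

With an interface like this, a test only needs `RowBatch batch(2, 2, {1.0f, 2.0f, 3.0f, 4.0f});`, and a mismatched size fails immediately with an exception rather than with a segfault later on.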

I can see a couple of files that only have minor formatting changes. Please remove these from the PR; that makes it easier to manage.

{0.204452, 0.443453}, {0.878117, 0.229577},
{0.0273876, 0.534414}, {0.670467, 0.913962},
};
thrust::device_vector<GradientPair> gpair (n_rows);

You can copy a std::vector to a device_vector directly by assignment.

TEST(GpuHist, ApplySplit) {
GPUHistMaker hist_maker = GPUHistMaker();
int constexpr nid = 0;
int constexpr n_rows = 16;

My preferred syntax is "constexpr int"; "int constexpr" is strange to read.
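
For reference, the two spellings are equivalent to the compiler, so this is purely a readability preference. The short sketch below checks that at compile time; the names `kRowsTrailing` and `kRowsLeading` are made up for this example.

```cpp
#include <type_traits>

// Both declarations produce a compile-time constant of type `const int`.
int constexpr kRowsTrailing = 16;  // specifier written after the type
constexpr int kRowsLeading = 16;   // specifier written before the type

// The order of the specifiers does not change the declared type.
static_assert(std::is_same<decltype(kRowsTrailing), decltype(kRowsLeading)>::value,
              "specifier order does not change the type");
// constexpr on an object declaration implies const.
static_assert(std::is_same<decltype(kRowsLeading), const int>::value,
              "constexpr objects are const");
// Both names are usable in constant expressions.
static_assert(kRowsTrailing == kRowsLeading, "same value either way");
```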

Collaborator

hcho3 commented Oct 12, 2018

@trivialfis Nice job creating the LCG random generator. Maybe in the future we can reuse it for other purposes.

Member Author

trivialfis commented Oct 12, 2018

@hcho3 Thanks. Maybe we can put it in dmlc-core. But it's not a thoroughly designed random generator; I imagine recovering the equation from its output is not too hard. So I want to keep it for testing purposes only, so that others don't misuse it in a real algorithm.
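
To make the predictability concern concrete, here is a minimal sketch of a linear congruential generator. It is not the generator added in this PR: `SimpleLCG` and `PredictNext` are hypothetical names, and the multiplier and increment are the widely published "Numerical Recipes" constants, chosen only for illustration. Because each output is the full internal state, a single observed value is enough to compute the next one.

```cpp
#include <cstdint>

// Hypothetical sketch, not the generator from this PR.
// Recurrence: x_{n+1} = (a * x_n + c) mod 2^32, where the modulus comes
// from the natural wrap-around of 32-bit unsigned arithmetic.
class SimpleLCG {
 public:
  explicit SimpleLCG(std::uint32_t seed) : state_(seed) {}

  // Advance the state and return it as the next output.
  std::uint32_t Next() {
    state_ = kA * state_ + kC;
    return state_;
  }

  // Float in [0, 1) built from the upper 24 bits of the next output.
  float NextFloat() { return (Next() >> 8) * (1.0f / 16777216.0f); }

 private:
  // "Numerical Recipes" constants, assumed here for illustration only.
  static constexpr std::uint32_t kA = 1664525u;
  static constexpr std::uint32_t kC = 1013904223u;
  std::uint32_t state_;
};

// Anyone who knows the constants can predict the next output from the
// previous one, which is why this kind of generator suits tests only.
inline std::uint32_t PredictNext(std::uint32_t observed) {
  return 1664525u * observed + 1013904223u;
}
```

The same seed always yields the same sequence, which is what makes such a generator convenient for reproducible test data.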

* Split building histogram into separated class.
* Extract `InitCompressedRow` definition.
* Basic tests for gpu-hist.
* Document the code more verbosely.
* Removed `HistCutUnit`.
* Removed some duplicated copies in `GPUHistMaker`.
* Implement LCG and use it in tests.
RAMitchell merged commit 516457f into dmlc:master Oct 15, 2018
trivialfis deleted the hist-tests branch October 15, 2018 04:00
CodingCat pushed a commit to CodingCat/xgboost that referenced this pull request Oct 25, 2018
alois-bissuel pushed a commit to criteo-forks/xgboost that referenced this pull request Dec 4, 2018
lock bot locked as resolved and limited conversation to collaborators Jan 13, 2019