Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract evaluate splits from CPU hist. #7079

Merged
merged 21 commits into from
Jul 7, 2021

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented Jul 5, 2021

Other than modularizing the split evaluation function, this PR also removes some more functions including InitNewNodes and BuildNodeStats among some other unused variables. Also, scattered code like setting leaf weights is grouped into the split evaluator and NodeEntry is simplified and made private. Another subtle difference with the original implementation is that the modified code doesn't call tree[nidx].Parent() to traversal upward.

perf

I think it's faster after the PR. URL is only run for 100 rounds, others are 500 rounds.

HIGGS covtype URL
master 180.24662968399934 76.94775568600744 196.1527223587036
PR 177.9364239120041 68.41261254699202 180.37877798080444

@trivialfis trivialfis requested a review from SmirnovEgorRu July 5, 2021 10:40
@trivialfis
Copy link
Member Author

trivialfis commented Jul 5, 2021

cc @ShvetsKS @SmirnovEgorRu

The unused_rows_ is not used in the builder, but used in the test. I think it's legacy?

@ShvetsKS
Copy link
Contributor

ShvetsKS commented Jul 6, 2021

cc @ShvetsKS @SmirnovEgorRu

The unused_rows_ is not used in the builder, but used in the test. I think it's legacy?

yes. It's not used now, gradients are set to zero instead.

@trivialfis
Copy link
Member Author

@ShvetsKS Thanks for the reply. Could you help review this PR? I will remove the unused_rows_ in upcoming refactors as I'm migrating the approx implementation to the fast hist code base.

@trivialfis trivialfis mentioned this pull request Jul 6, 2021
67 tasks
Copy link
Contributor

@SmirnovEgorRu SmirnovEgorRu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@hcho3 hcho3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Two comments:

  • It's nice that the CPU Hist is becoming more like GPU Hist. Consistency makes the whole codebase more legible.
  • "Evaluator" is a nice abstraction. I like how we no longer need a mock for testing split evaluation.

src/tree/updater_quantile_hist.cc Show resolved Hide resolved
@ShvetsKS
Copy link
Contributor

ShvetsKS commented Jul 7, 2021

LGTM

@trivialfis trivialfis merged commit 615ab2b into dmlc:master Jul 7, 2021
@trivialfis trivialfis deleted the extract-evaluate-splits branch July 7, 2021 07:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants