Add Cost Effective Gradient Boosting #2014

remcob-gr · 2019-02-14T14:35:31Z

Fixes #1119.
The implementation is in the form of a tweak to the serial tree learner and so should work with every derived tree learner, though I've only tested in serial.

The tree learner did not update the gains of previously computed leaf splits when splitting a leaf elsewhere in the tree. This caused it to prefer new features due to incorrectly penalising splitting on previously used features.
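The stale-gain problem described above can be illustrated with a minimal sketch. It assumes a one-off ("coupled") penalty per feature that is charged only the first time a feature is used. `CachedSplit`, `PenalisedGain`, and `MarkFeatureUsed` are hypothetical names for illustration, not the identifiers used in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a cached best-split candidate; not a LightGBM type.
struct CachedSplit {
  int feature;  // feature the candidate splits on
  double gain;  // gain after any penalty has been subtracted
};

// Subtract the coupled penalty only if the feature has not been used yet.
double PenalisedGain(double raw_gain, int feature,
                     const std::vector<bool>& feature_used,
                     const std::vector<double>& coupled_penalty,
                     double tradeoff) {
  assert(feature >= 0 &&
         static_cast<std::size_t>(feature) < feature_used.size());
  return feature_used[feature]
             ? raw_gain
             : raw_gain - tradeoff * coupled_penalty[feature];
}

// Mark `feature` as used and refund the penalty to every cached candidate
// that was scored while the feature still counted as new.
void MarkFeatureUsed(int feature, std::vector<bool>* feature_used,
                     std::vector<CachedSplit>* cached,
                     const std::vector<double>& coupled_penalty,
                     double tradeoff) {
  if ((*feature_used)[feature]) return;  // already paid for; nothing to refund
  (*feature_used)[feature] = true;
  for (CachedSplit& split : *cached) {
    if (split.feature == feature) {
      split.gain += tradeoff * coupled_penalty[feature];
    }
  }
}
```

Without the refund step, a candidate cached for another leaf keeps its penalised gain even though the feature is no longer new, which would bias the learner towards unused features.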

guolinke · 2019-04-02T01:55:49Z

Is this ready to merge?

remcob-gr · 2019-04-02T19:33:24Z

@StrikerRUS: It's ready from my perspective.
@guolinke : Could you review?

StrikerRUS

Some minor notes.

src/treelearner/serial_tree_learner.cpp

include/LightGBM/config.h

StrikerRUS · 2019-04-02T23:48:42Z

I think we should cite CEGB somehow...
Maybe in params description, like `cost-effective gradient-boosting <https://papers.nips.cc/paper/6753-cost-efficient-gradient-boosting.pdf>`__ penalty for ... ?

guolinke · 2019-04-03T01:51:40Z

src/treelearner/serial_tree_learner.cpp

@@ -496,6 +530,14 @@ void SerialTreeLearner::FindBestSplitsFromHistograms(const std::vector<int8_t>&
      smaller_leaf_splits_->max_constraint(),
      &smaller_split);
    smaller_split.feature = real_fidx;
+    smaller_split.gain -= config_->cegb_tradeoff * config_->cegb_penalty_split * smaller_leaf_splits_->num_data_in_leaf();


config_->cegb_tradeoff * config_->cegb_penalty_split is zero by default, right?

Yes. By default, this line doesn't change the gain, and so doesn't change the behaviour.
It only makes a difference if the user specifies a cegb_penalty_split.
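A minimal sketch of that behaviour, assuming defaults whose product is zero (the individual default values below are an assumption, as are the names `CegbParams` and `ApplySplitPenalty`; only the two parameter names come from the diff):

```cpp
#include <cassert>

// Hypothetical holder for the two parameters that appear in the diff.
struct CegbParams {
  double cegb_tradeoff = 1.0;       // assumed default
  double cegb_penalty_split = 0.0;  // assumed default; makes the product zero
};

// Mirrors the added line: the penalty scales with the number of rows in the leaf.
double ApplySplitPenalty(double gain, const CegbParams& params,
                         int num_data_in_leaf) {
  assert(num_data_in_leaf >= 0);
  return gain - params.cegb_tradeoff * params.cegb_penalty_split *
                    num_data_in_leaf;
}
```

With the assumed defaults the gain passes through unchanged; setting `cegb_penalty_split` to a positive value lowers the gain in proportion to the leaf size.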

guolinke · 2019-04-03T01:54:21Z

I think it would be better to add a section to Advanced-Topics (https://github.com/Microsoft/LightGBM/blob/master/docs/Advanced-Topics.rst) about how to use CEGB.

remcob-gr · 2019-04-03T08:42:24Z

I've added a section to the docs on using CEGB, including a link to the paper.
@StrikerRUS, @guolinke: Could you take a look?

guolinke · 2019-04-03T08:45:08Z

Thanks @remcob-gr, it looks good to me.

StrikerRUS

LGTM! Thanks a lot @remcob-gr !

Also, it'll be great if you can add some tests.

remcob-gr · 2019-04-03T12:15:33Z

@StrikerRUS : I've added some tests. Could you take a look and see if there are other tests you'd like?

StrikerRUS · 2019-04-03T16:09:42Z

@remcob-gr Perfect! Many thanks!

@guolinke Can we merge?

guolinke · 2019-04-04T02:35:03Z

@StrikerRUS sure

Remco Bras added 16 commits February 14, 2019 13:48

Add configuration parameters for CEGB.

93da906

Add skeleton CEGB tree learner

44dc8c7

Like the original CEGB version, this inherits from SerialTreeLearner. Currently, it changes nothing from the original.

Track features used in CEGB tree learner.

35d1315

Pull CEGB tradeoff and coupled feature penalty from config.

53902e9

Implement finding best splits for CEGB

928af45

This is heavily based on the serial version, but just adds using the coupled penalties.

Set proper defaults for cegb parameters.

f79417f

Ensure sanity checks don't switch off CEGB.

95d4b31

Implement per-data-point feature penalties in CEGB.

f273cbb

Implement split penalty and remove unused parameters.

312814b

Merge changes from CEGB tree learner into serial tree learner

6fe0844

Represent features_used_in_data by a bitset, to reduce the memory overhead of CEGB, and add sanity checks for the lengths of the penalty vectors.

6b38924

Document CEGB parameters and add them to the appropriate section.

716c508

Remove leftover reference to cegb tree learner.

780a645

Remove outdated diff.

05f2a7c

Fix warnings

f9de3f9

StrikerRUS requested review from guolinke and chivee April 1, 2019 12:06

StrikerRUS reviewed Apr 2, 2019

src/treelearner/serial_tree_learner.cpp (outdated)

include/LightGBM/config.h (outdated)

include/LightGBM/config.h (outdated)

guolinke reviewed Apr 3, 2019

Remco Bras added 3 commits April 3, 2019 09:18

Fix minor issues identified by @StrikerRUS.

fa98ec1

Add docs section on CEGB, including citation.

6b0881b

Fix link.

e0b0f4d

guolinke approved these changes Apr 3, 2019

Fix CI failure.

0d5190e

StrikerRUS approved these changes Apr 3, 2019

Remco Bras added 2 commits April 3, 2019 12:51

Merge remote-tracking branch 'ms/master' into CEGB-pr

30be873

Add some unit tests

3b5a60b

Remco Bras added 2 commits April 3, 2019 13:26

Fix pylint issues.

4035c02

Fix remaining pylint issue

05e7648

guolinke merged commit 7610228 into microsoft:master Apr 4, 2019

StrikerRUS mentioned this pull request Jun 17, 2019

BUG in GPU histogram #1003

Open

lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020
