
[RFC] Inconsistent behavior when a single-leaf tree is encountered #5051

Open
shiyu1994 opened this issue Mar 3, 2022 · 4 comments

@shiyu1994
Collaborator

shiyu1994 commented Mar 3, 2022

Description

When a single-leaf tree is encountered, the CLI version stops training at once, but the Python API continues to train.

Reproducible example

Example by @arnocandel in #4708.

Additional Comments

We have two choices:

  1. Stop training at once when a single-leaf tree is encountered, in all APIs.
  2. Continue training, adding the same prediction value to all training samples. The prediction value should be `-sum_of_gradients/sum_of_hessians` in the root node, which is currently not calculated.
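For clarity, option 2's constant output is just a Newton step over the root's gradient statistics. A minimal pure-Python sketch (function name and `lambda_l2` handling are illustrative, not LightGBM's internal code):

```python
# Hypothetical sketch of option 2: when no informative split exists, the
# tree degenerates to a single leaf whose constant output is the Newton
# step over all samples (standard GBDT leaf-value convention; optional
# L2 regularization assumed).
def single_leaf_output(gradients, hessians, lambda_l2=0.0):
    G = sum(gradients)
    H = sum(hessians)
    return -G / (H + lambda_l2)

# With a squared-error objective, grad = pred - label and hess = 1.0,
# so the single-leaf output is the mean residual.
labels = [1.0, 2.0, 3.0, 4.0]
preds = [0.0, 0.0, 0.0, 0.0]
grads = [p - y for p, y in zip(preds, labels)]
hess = [1.0] * len(labels)
print(single_leaf_output(grads, hess))  # 2.5, the mean residual
```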

Gently ping @guolinke @jameslamb @StrikerRUS @hzy46 @btrotta for your opinion.

@guolinke
Collaborator

guolinke commented Mar 7, 2022

I think option 2 is better.

@shiyu1994
Collaborator Author

> I think option 2 is better.

Strongly agree.

@jameslamb
Collaborator

jameslamb commented Mar 9, 2022

Thanks for writing this up and for the description in #4708 (comment)!

I think option 2 is preferable.

Encountering a single-leaf tree doesn't necessarily mean that training should stop. I'd expect future boosting rounds could still find informative splits in some situations, like:

  • using bagging_fraction to re-sample rows
  • using feature_fraction to randomly choose features
  • using a custom objective that has some sort of randomness in it or some behavior dependent on the number of iterations
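The first two points correspond to real LightGBM parameters; a config fragment showing them together (values illustrative only):

```python
# Illustrative LightGBM parameter dict: with these settings each boosting
# round draws a different row/feature sample, so one round degenerating to
# a single leaf does not imply that later rounds will too.
params = {
    "objective": "regression",
    "bagging_fraction": 0.8,  # re-sample 80% of rows ...
    "bagging_freq": 1,        # ... every iteration
    "feature_fraction": 0.8,  # each tree sees a random 80% of features
}
```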

@shiyu1994
Collaborator Author

shiyu1994 commented Mar 24, 2022

@jameslamb Thanks. Then let's go to option 2.

Samsagax added a commit to Samsagax/LightGBM that referenced this issue Feb 4, 2023
As per discussion on GH-microsoft#5051 and GH-microsoft#5193, the Python package does not stop
training if a single-leaf tree (stump) is found and relies on early
stopping methods to stop training. This commit removes the finish
condition on training based on the result of `TrainOneIter()` and sets
the `is_finished` flag on early stopping alone.
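The loop change the commit describes can be sketched as follows. This is a simplified pure-Python illustration; all names are hypothetical, not LightGBM's actual C++ symbols:

```python
# Sketch of the change: the return value of one boosting iteration (which
# may signal a stump) no longer ends training; only the early-stopping
# check does. Names are illustrative.

class EarlyStopper:
    """Stop when the validation metric fails to improve for `patience` rounds."""
    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.rounds_without_improvement = 0

    def should_stop(self, metric):
        if metric < self.best:
            self.best = metric
            self.rounds_without_improvement = 0
        else:
            self.rounds_without_improvement += 1
        return self.rounds_without_improvement >= self.patience


def train(num_iterations, train_one_iter, eval_metric, patience=2):
    stopper = EarlyStopper(patience)
    for it in range(num_iterations):
        train_one_iter()  # result (stump or not) is intentionally ignored
        # previously: `if not train_one_iter(): break` ended training here
        if stopper.should_stop(eval_metric(it)):
            return it + 1  # iterations actually run before stopping
    return num_iterations


# Every round yields a stump (train_one_iter -> False), yet training only
# ends when the metric plateaus for `patience` rounds.
metrics = [1.0, 0.9, 0.9, 0.9] + [0.9] * 6
ran = train(10, train_one_iter=lambda: False,
            eval_metric=lambda i: metrics[i], patience=2)
print(ran)  # 4: stopped by early stopping, not by the stump rounds
```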