Revert ntree limit fix #6616
Conversation
```diff
-best_ntree_limit=str(
-    (bst.best_iteration + 1) * num_parallel_tree * num_groups
-)
+best_ntree_limit=str((bst.best_iteration + 1) * num_parallel_tree)
```
It's not an exact revert, since we also keep the fix for an old `gblinear` bug with `ntree_limit`, which is valid.
So the `best_ntree_limit` attribute is really a misnomer; it behaves more like `best_iteration` in the C++ layer. Is my understanding correct?
The C++ layer considers `num_group`, as it's a model parameter, and the predictor's `PredictBatch` takes `GBTreeModel` as an argument. But it doesn't consider `num_parallel_tree`, which is a training parameter rather than a model parameter. So the C++ handling of `ntree_limit` is only half an implementation of `best_iteration`.
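A toy sketch of the mismatch described above (plain Python, hypothetical helper names; it assumes, per this comment, that the C++ layer scales the limit by `num_group` but not by `num_parallel_tree`):

```python
def cpp_effective_trees(ntree_limit, num_group):
    # Sketch of the described C++ behavior: the limit is scaled by
    # num_group (a model parameter the predictor can see) only.
    return ntree_limit * num_group


def trees_trained(num_rounds, num_parallel_tree, num_group):
    # Each boosting round actually produces
    # num_parallel_tree * num_group trees.
    return num_rounds * num_parallel_tree * num_group


# With num_parallel_tree > 1, passing a round count as ntree_limit
# undercounts the trees those rounds really produced.
```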
So it's `best_iteration * num_parallel_tree` then? We should clearly document the meaning of `best_ntree_limit`, e.g.:

> Despite its name, the `best_ntree_limit` attribute is actually the product `best_iteration * num_parallel_tree`. If you set `num_parallel_tree > 1`, then `best_ntree_limit` won't be equal to the best boosting round. Going forward, please use model slicing in all new code.
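The identity being proposed for the docs can be sketched as follows (hypothetical helper, plain Python; the `+ 1` reflects that `best_iteration` is zero-based, matching the reverted line in the diff above):

```python
def best_ntree_limit(best_iteration: int, num_parallel_tree: int) -> int:
    # Despite its name, best_ntree_limit is not a boosting-round count:
    # it is the number of trees per class up to and including the best
    # iteration, i.e. (best_iteration + 1) * num_parallel_tree.
    return (best_iteration + 1) * num_parallel_tree
```

With `num_parallel_tree == 1` the value happens to line up with the number of boosting rounds, which is why the misnomer goes unnoticed in the common case.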
Just to clarify and reinforce: `best_ntree_limit` equals `num_parallel_tree * best_iteration`; `num_class` is multiplied in inside the predictor. When using a classifier, `best_ntree_limit` equals the number of trees produced up to the best iteration divided by `num_class`.
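The classifier arithmetic in the comment above can be written out as a small sketch (plain Python, hypothetical helper names):

```python
def trees_up_to_best(best_iteration, num_parallel_tree, num_class):
    # Every boosting round emits num_parallel_tree trees per class.
    return (best_iteration + 1) * num_parallel_tree * num_class


def best_ntree_limit(best_iteration, num_parallel_tree, num_class):
    # As described above: the total tree count divided by num_class,
    # since the predictor multiplies num_class back in internally.
    return trees_up_to_best(best_iteration, num_parallel_tree,
                            num_class) // num_class
```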
The inplace prediction doesn't have this problem, as it can use `best_iteration` directly.
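For contrast, a minimal sketch of why iteration-based prediction composes cleanly: an iteration range maps to a contiguous block of trees regardless of how many trees each round emits (helper name is hypothetical):

```python
def iteration_to_tree_range(begin_it, end_it, num_parallel_tree, num_class):
    # Trees are laid out round by round; each round contributes
    # num_parallel_tree * num_class trees, so an iteration range is
    # simply scaled by that per-round factor.
    per_round = num_parallel_tree * num_class
    return begin_it * per_round, end_it * per_round
```

Because the scaling happens in one place, callers never need to know `num_parallel_tree` or `num_class` themselves.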
@hcho3 I will try to deprecate the attribute after turning it into a Python `@property`, where we can raise proper warnings.
Also, the sklearn model uses this attribute by default when running prediction.
Added a short note to the `train` function.
The old (pre-fix) `best_ntree_limit` ignored the `num_class` parameter, which is incorrect. Previously we worked around this in the C++ layer to avoid possible breaking changes in other language bindings, but the Python interpretation stayed incorrect. The PR fixed Python to account for `num_class` but didn't remove the old workaround, so the tree calculation in the predictor is incorrect; see `PredictBatch` in `CPUPredictor`.
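The double counting described above can be shown with a toy sketch (plain Python, hypothetical helper names; it assumes, per this comment, that the old C++ workaround also scales the limit by `num_class`):

```python
def python_side_limit(best_iteration, num_parallel_tree, num_class):
    # After the fix, the Python side folded num_class into the attribute.
    return (best_iteration + 1) * num_parallel_tree * num_class


def predictor_trees_used(ntree_limit, num_class):
    # Toy model of the old C++ workaround, which scales by num_class too.
    return ntree_limit * num_class


# num_class is applied twice, so the predictor walks too many trees:
# intended 30 trees for 10 rounds and 3 classes, but 90 are used.
```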
Closes #6615.