Revert ntree limit fix #6616

Merged
merged 5 commits on Jan 19, 2021
Changes from 4 commits
9 changes: 3 additions & 6 deletions python-package/xgboost/training.py
@@ -109,22 +109,19 @@ def _train_internal(params, dtrain,
     else:
         raise ValueError(f'Unknown booster: {booster}')

-    num_groups = int(config['learner']['learner_model_param']['num_class'])
-    num_groups = 1 if num_groups == 0 else num_groups
     if bst.attr('best_score') is not None:
         bst.best_score = float(bst.attr('best_score'))
         bst.best_iteration = int(bst.attr('best_iteration'))
+        # num_class is handled internally
         bst.set_attr(
-            best_ntree_limit=str(
-                (bst.best_iteration + 1) * num_parallel_tree * num_groups
-            )
+            best_ntree_limit=str((bst.best_iteration + 1) * num_parallel_tree)
@trivialfis (Member, Author)

It's not an exact revert, since we also keep a fix for an old bug with gblinear and ntree_limit, which is still valid.

@hcho3 (Collaborator), Jan 19, 2021

So the best_ntree_limit attribute is really a misnomer; it behaves more like best_iteration in the C++ layer. Is my understanding correct?

@trivialfis (Member, Author)

The C++ layer accounts for num_group, since it's a model parameter and the predictor's PredictBatch takes the GBTreeModel as an argument. But it doesn't account for num_parallel_tree, which is a training parameter rather than a model parameter. So the C++ handling of ntree_limit is only half of an implementation of best_iteration.
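To make the layout concrete, a minimal sketch (the data and parameter values are illustrative): each boosting round produces num_class * num_parallel_tree trees, so a round count alone does not determine a tree count.

```python
import numpy as np
import xgboost as xgb

rng = np.random.RandomState(0)
X = rng.rand(96, 6)
y = rng.randint(0, 3, size=96)
dtrain = xgb.DMatrix(X, y)

params = {"num_class": 3, "num_parallel_tree": 2, "objective": "multi:softprob"}
booster = xgb.train(params, dtrain, num_boost_round=4)

# num_boosted_rounds() counts iterations, not individual trees ...
assert booster.num_boosted_rounds() == 4
# ... while the model itself holds num_class * num_parallel_tree trees per round.
assert len(booster.get_dump()) == 4 * 3 * 2
```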

@hcho3 (Collaborator), Jan 19, 2021

So it's (best_iteration + 1) * num_parallel_tree then? We should clearly document the meaning of best_ntree_limit, e.g.:

Despite its name, the best_ntree_limit attribute is actually the product (best_iteration + 1) * num_parallel_tree.
If you set num_parallel_tree > 1, then best_ntree_limit won't be equal to the best boosting round.

Going forward, please use model slicing in all new code.
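As a hedged sketch of the two styles side by side (made-up data and parameters; the final check assumes both calls select the same two boosting rounds, with model slicing available in xgboost 1.3+):

```python
import numpy as np
import xgboost as xgb

rng = np.random.RandomState(1994)
X = rng.rand(128, 8)
y = rng.randint(0, 3, size=128)
dtrain = xgb.DMatrix(X, y)

params = {"num_class": 3, "num_parallel_tree": 4, "objective": "multi:softprob"}
booster = xgb.train(params, dtrain, num_boost_round=5)

# Old style: ntree_limit is counted as iterations * num_parallel_tree;
# num_class is multiplied back in by the predictor.
pred_limit = booster.predict(dtrain, ntree_limit=2 * 4)

# New style: slice the model by boosting rounds instead.
pred_slice = booster[:2].predict(dtrain)

np.testing.assert_allclose(pred_limit, pred_slice)
```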

@trivialfis (Member, Author)

Just to clarify and reinforce: best_ntree_limit equals num_parallel_tree * (best_iteration + 1); num_class is multiplied in inside the predictor.

So when using a classifier, best_ntree_limit equals the number of trees produced up to the best iteration divided by num_class.
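Spelling the arithmetic out with illustrative numbers:

```python
num_class = 3
num_parallel_tree = 4
best_iteration = 1  # zero-based index of the best boosting round

# Trees actually grown up to and including the best iteration:
total_trees = (best_iteration + 1) * num_parallel_tree * num_class  # 24

# best_ntree_limit leaves num_class out, because the predictor
# multiplies it back in internally:
best_ntree_limit = (best_iteration + 1) * num_parallel_tree  # 8
assert best_ntree_limit == total_trees // num_class
```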

@trivialfis (Member, Author)

In-place prediction doesn't have this problem, as it can use best_iteration directly.
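A hedged sketch of that (assuming a release where inplace_predict accepts iteration_range; the data and the stand-in best_iteration value are illustrative):

```python
import numpy as np
import xgboost as xgb

rng = np.random.RandomState(0)
X = rng.rand(64, 8)
y = rng.rand(64)
booster = xgb.train(
    {"num_parallel_tree": 4}, xgb.DMatrix(X, y), num_boost_round=10
)

best_iteration = 3  # stands in for bst.best_iteration from early stopping
# iteration_range is given in boosting rounds (end exclusive), so neither
# num_parallel_tree nor num_class enters the call.
pred = booster.inplace_predict(X, iteration_range=(0, best_iteration + 1))
```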

@trivialfis (Member, Author)

@hcho3 I will try to deprecate the attribute after turning it into a Python @property, where we can emit proper warnings.
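A hypothetical sketch of that plan, not the shipped implementation; the class below is a minimal stand-in for xgboost.Booster:

```python
import warnings


class BoosterSketch:
    """Minimal stand-in to illustrate deprecating best_ntree_limit."""

    def __init__(self, best_iteration: int, num_parallel_tree: int) -> None:
        self.best_iteration = best_iteration
        self._num_parallel_tree = num_parallel_tree

    @property
    def best_ntree_limit(self) -> int:
        warnings.warn(
            "best_ntree_limit is deprecated; use best_iteration with "
            "iteration_range or model slicing instead.",
            DeprecationWarning,
        )
        return (self.best_iteration + 1) * self._num_parallel_tree
```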

@trivialfis (Member, Author)

Also, the sklearn wrapper uses this attribute by default when running prediction.
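For context, a hedged sketch of that default (illustrative data; the explicit ntree_limit call mirrors what the wrapper is described as doing internally):

```python
import numpy as np
import xgboost as xgb

rng = np.random.RandomState(7)
X = rng.rand(256, 10)
y = rng.randint(0, 2, size=256)

clf = xgb.XGBClassifier(n_estimators=50)
clf.fit(X, y, early_stopping_rounds=5, eval_set=[(X, y)])

# With early stopping, predict() falls back to best_ntree_limit,
# so these two calls should agree.
implicit = clf.predict(X)
explicit = clf.predict(X, ntree_limit=clf.best_ntree_limit)
np.testing.assert_array_equal(implicit, explicit)
```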

@trivialfis (Member, Author)

Added a short note about this to the train function's documentation.

         )
         bst.best_ntree_limit = int(bst.attr("best_ntree_limit"))
     else:
         # Due to compatibility with version older than 1.4, these attributes are added
         # to Python object even if early stopping is not used.
         bst.best_iteration = bst.num_boosted_rounds() - 1
-        bst.best_ntree_limit = (bst.best_iteration + 1) * num_parallel_tree * num_groups
+        bst.best_ntree_limit = (bst.best_iteration + 1) * num_parallel_tree

     # Copy to serialise and unserialise booster to reset state and free
     # training memory
4 changes: 2 additions & 2 deletions tests/python/test_basic_models.py
@@ -347,7 +347,7 @@ def test_attributes(self):
         X, y = load_iris(return_X_y=True)
         cls = xgb.XGBClassifier(n_estimators=2)
         cls.fit(X, y, early_stopping_rounds=1, eval_set=[(X, y)])
-        assert cls.get_booster().best_ntree_limit == 2 * cls.n_classes_
+        assert cls.get_booster().best_ntree_limit == 2
         assert cls.best_ntree_limit == cls.get_booster().best_ntree_limit

         with tempfile.TemporaryDirectory() as tmpdir:
@@ -356,7 +356,7 @@

         cls = xgb.XGBClassifier(n_estimators=2)
         cls.load_model(path)
-        assert cls.get_booster().best_ntree_limit == 2 * cls.n_classes_
+        assert cls.get_booster().best_ntree_limit == 2
         assert cls.best_ntree_limit == cls.get_booster().best_ntree_limit

     @pytest.mark.skipif(**tm.no_sklearn())
23 changes: 18 additions & 5 deletions tests/python/test_predict.py
@@ -33,9 +33,15 @@ def run_predict_leaf(predictor):
     y = rng.randint(low=0, high=classes, size=rows)
     m = xgb.DMatrix(X, y)
     booster = xgb.train(
-        {'num_parallel_tree': num_parallel_tree, 'num_class': classes,
-         'predictor': predictor, 'tree_method': 'hist'}, m,
-        num_boost_round=num_boost_round)
+        {
+            "num_parallel_tree": num_parallel_tree,
+            "num_class": classes,
+            "predictor": predictor,
+            "tree_method": "hist",
+        },
+        m,
+        num_boost_round=num_boost_round,
+    )

     empty = xgb.DMatrix(np.ones(shape=(0, cols)))
     empty_leaf = booster.predict(empty, pred_leaf=True)
@@ -52,12 +58,19 @@
             end = classes * num_parallel_tree * (j + 1)
             layer = row[start: end]
             for c in range(classes):
-                tree_group = layer[c * num_parallel_tree:
-                                   (c+1) * num_parallel_tree]
+                tree_group = layer[c * num_parallel_tree: (c + 1) * num_parallel_tree]
                 assert tree_group.shape[0] == num_parallel_tree
                 # no subsampling so tree in same forest should output same
                 # leaf.
                 assert np.all(tree_group == tree_group[0])

+    ntree_limit = 2
+    sliced = booster.predict(
+        m, pred_leaf=True, ntree_limit=num_parallel_tree * ntree_limit
+    )
+    first = sliced[0, ...]
+
+    assert first.shape[0] == classes * num_parallel_tree * ntree_limit
     return leaf


4 changes: 2 additions & 2 deletions tests/python/test_training_continuation.py
@@ -119,13 +119,13 @@ def run_training_continuation(self, xgb_params_01, xgb_params_02,
         gbdt_05 = xgb.train(xgb_params_03, dtrain_5class,
                             num_boost_round=7)
         assert gbdt_05.best_ntree_limit == (
-            gbdt_05.best_iteration + 1) * self.num_parallel_tree * 5
+            gbdt_05.best_iteration + 1) * self.num_parallel_tree
         gbdt_05 = xgb.train(xgb_params_03,
                             dtrain_5class,
                             num_boost_round=3,
                             xgb_model=gbdt_05)
         assert gbdt_05.best_ntree_limit == (
-            gbdt_05.best_iteration + 1) * self.num_parallel_tree * 5
+            gbdt_05.best_iteration + 1) * self.num_parallel_tree

         res1 = gbdt_05.predict(dtrain_5class)
         res2 = gbdt_05.predict(dtrain_5class,
6 changes: 3 additions & 3 deletions tests/python/test_with_dask.py
@@ -933,9 +933,9 @@ def test_n_workers(self) -> None:
     def test_feature_weights(self, client: "Client") -> None:
         kRows = 1024
         kCols = 64
-
-        X = da.random.random((kRows, kCols), chunks=(32, -1))
-        y = da.random.random(kRows, chunks=32)
+        rng = da.random.RandomState(1994)
+        X = rng.random_sample((kRows, kCols), chunks=(32, -1))
+        y = rng.random_sample(kRows, chunks=32)

         fw = np.ones(shape=(kCols,))
         for i in range(kCols):
2 changes: 1 addition & 1 deletion tests/python/test_with_sklearn.py
@@ -106,7 +106,7 @@ def train(booster, forest):
         )

         if forest:
-            assert cls.best_ntree_limit == rounds * forest * cls.n_classes_
+            assert cls.best_ntree_limit == rounds * forest
         else:
             assert cls.best_ntree_limit == 0