incremental learning with xgb_model - wrong predictions #5192
Comments
@jfrery Could you please provide a self-contained script I can run? Maybe start with small random data generated by numpy:
import numpy as np
import xgboost
np.random.seed(1994)
kRows = 1000
kCols = 100
X = np.random.randn(kRows, kCols)
y = np.random.randn(kRows)
dtrain = xgboost.DMatrix(X, y)
...
@trivialfis Sorry for the delay. The problem is simpler than I thought. I didn't mention it, but I used the early_stopping_rounds parameter in the fit function of the second model. When training the second model, the iteration count starts at 0 and does not take into account the previous model passed in the xgb_model parameter. As a result, when calling the predict method, the model stops at the wrong iteration, since model.best_iteration does not account for the rounds of the first model.
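To make the offset concrete, here is a minimal sketch in plain Python (no xgboost involved). It treats a boosted model as a list of per-round contributions for a single sample; the numbers, the helper truncated_prediction and the variable names are all hypothetical and only illustrate the arithmetic described above, not xgboost internals.

```python
def truncated_prediction(stage_outputs, n_stages):
    """Sum the contributions of the first n_stages boosting rounds."""
    return sum(stage_outputs[:n_stages])


# Hypothetical per-round contributions for one sample.
first_model = [1.0] * 10    # 10 rounds from the initial fit
continuation = [0.5] * 20   # 20 further rounds trained on top of it
all_stages = first_model + continuation

# Early stopping during the continuation reports an index counted from 0,
# i.e. relative to the start of the second training run.
best_iteration_local = 4

# Using the local index as a global limit cuts into the first model.
wrong_limit = best_iteration_local + 1
# Adding the rounds of the first model gives the intended limit.
right_limit = len(first_model) + best_iteration_local + 1

wrong = truncated_prediction(all_stages, wrong_limit)
right = truncated_prediction(all_stages, right_limit)
print(wrong, right)
```

With these made-up values, the truncated prediction uses only 5 of the 10 original rounds and none of the new ones, which matches the symptom of predictions that differ from the evaluation seen during training.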
Okay, I need to spend some time on that. Thanks for the explanation.
I will implement a getter for the number of boosted rounds in xgboost core later.
Hello,
I am experimenting with the xgb_model parameter in order to update my model with new data in an incremental fashion.
Then I use this model to train a new one over new data.
I can see that the training of reg1 starts with the same performance on the test set as reg0 had at the end of its own training, which is great.
Once the training is over, I call reg1.predict(X_test) and compute my_metric(). Here I see a big difference from the metric I got at the end of the reg1 training.
In fact, the predictions from reg1.predict() are very different from the predictions that were passed to my_metric() during the last iterations of training. I assume that the predict function does not take reg0 into account in the prediction.
I have experimented with the Learning API and this problem does not seem to occur.