Support specifying number of iterations in dataset evaluation #4210
Comments
@wangmn93 Thanks for using LightGBM. If you want to evaluate a constructed dataset, you have to supply it as either training data or validation data and use it in the training process.
@shiyu1994 I have a 1000000 x 1000 matrix; evaluation takes 8.7 ms using the constructed dataset and 6.85 s using booster.predict. Besides the prediction time, the constructed dataset is preferable because the raw data is big and loading it takes a lot of time.
I think
@shiyu1994 Do you think we can add a feature request for including
@StrikerRUS Sure. That will be valuable.
Closed in favor of being in #2302. We decided to keep all feature requests in one place. Welcome to contribute to this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature.
I want to evaluate a constructed dataset using only one tree. booster.predict uses the raw data and is slow. Is there any way to do this?
import numpy as np
from lightgbm import Booster, Dataset, train
data = np.random.uniform(0, 1, (100, 10))
label = np.random.uniform(0, 1, 100)
dataset = Dataset(data=data, label=label)
booster = train({}, dataset, num_boost_round=10)
# Rebuild a booster that contains only the first tree
tree0 = Booster(model_str=booster.model_to_string(start_iteration=0, num_iteration=1))
tree0.predict(data)  # this is slow
# Attempt to attach the constructed dataset to the one-tree booster
tree0 = Booster(model_str=booster.model_to_string(start_iteration=0, num_iteration=1), train_set=dataset)
tree0._Booster__inner_predict(0)  # it always outputs zeros, since the init score is zero