[R-package] Add print() and summary() methods for Booster #4686
Conversation
@jameslamb The linter here says this: […]
I'm not able to find any such notice about […]
That message from this project's linter exactly mirrors direct feedback we received from CRAN. In one submission of […]
Since this PR is adding […] According to https://github.com/jimhester/lintr#project-configuration, it's possible to tell […]
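For context, lintr's project-level configuration lives in a `.lintr` file at the package root; a rough sketch of what such a file can contain (the linter and exclusion below are illustrative examples, not taken from this repository):

```
linters: with_defaults(
    line_length_linter(120L)
  )
exclusions: list(
    "R/some_generated_file.R"
  )
```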
This is awesome, thanks for working on it! I'll test this out soon to check what the outputs look like.
For now, I left a few initial comments. Can you also please add some tests, so we'll be confident that future refactorings don't break this functionality?
- please add new tests to the R package that cover this code. Ideally these should try to cover all the `if` branches in `print.lgb.Booster`, but at a minimum please add one test `print()`-ing and `summary()`-ing a fitted Booster and one using a Booster with a null handle (a rough sketch of such a test follows this list)
- please add a test using `.Call(LGBM_BoosterGetNumFeatures_R)` to confirm that that function is returning the expected value
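Not the exact tests added in this PR, but a minimal sketch of the first kind of test being requested, assuming {testthat} and the compact output strings shown later in this thread (the training parameters here are illustrative):

```r
library(lightgbm)
library(testthat)

test_that("print() and summary() work on a fitted Booster", {
    # small regression model on mtcars, mirroring the setup described in this PR
    X <- as.matrix(mtcars[, -1L])
    dtrain <- lgb.Dataset(data = X, label = mtcars$mpg)
    model <- lgb.train(
        params = list(objective = "regression")
        , data = dtrain
        , nrounds = 1L
        , verbose = -1L
    )
    # expected strings taken from the sample output shown later in this conversation
    expect_output(print(model), regexp = "LightGBM Model")
    expect_output(summary(model), regexp = "Objective: regression")
})
```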
Added the tests.
Changed the title: print and summary → print() and summary() methods for Booster
Looks great, thanks for taking this on!
Please just see the one comment I left on the unit tests.
I installed {lightgbm} from this branch tonight so I could generate a comparison of the old and new printed information. This is important for reviewing, since the stated purpose of this PR is to provide "shorter outputs and more relevant information".
Using the mtcars code from the unit tests added in this PR, I see the following for print(model); the calls involved are sketched right after the comparison below.
before (master @ df12c1b)
<lgb.Booster>
Public:
add_valid: function (data, name)
best_iter: -1
best_score: NA
current_iter: function ()
dump_model: function (num_iteration = NULL, feature_importance_type = 0L)
eval: function (data, name, feval = NULL)
eval_train: function (feval = NULL)
eval_valid: function (feval = NULL)
finalize: function ()
initialize: function (params = list(), train_set = NULL, modelfile = NULL,
lower_bound: function ()
params: list
predict: function (data, start_iteration = NULL, num_iteration = NULL,
raw: NA
record_evals: list
reset_parameter: function (params, ...)
rollback_one_iter: function ()
save: function ()
save_model: function (filename, num_iteration = NULL, feature_importance_type = 0L)
save_model_to_string: function (num_iteration = NULL, feature_importance_type = 0L)
set_train_data_name: function (name)
to_predictor: function ()
update: function (train_set = NULL, fobj = NULL)
upper_bound: function ()
Private:
eval_names: NULL
get_eval_info: function ()
handle: lgb.Booster.handle
higher_better_inner_eval: NULL
init_predictor: NULL
inner_eval: function (data_name, data_idx, feval = NULL)
inner_predict: function (idx)
is_predicted_cur_iter: list
name_train_set: training
name_valid_sets: list
num_class: 1
num_dataset: 1
predict_buffer: list
set_objective_to_none: FALSE
train_set: lgb.Dataset, R6
train_set_version: 1
valid_sets: list
after (this PR)
LightGBM Model (1 tree)
Objective: regression
Fitted to dataset with 10 columns
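Assuming a Booster fitted with the mtcars code referenced above (the object name `model` is carried over from that snippet), the before/after comparison corresponds to calls like the following; this is an illustrative sketch, not the exact code from the PR's tests:

```r
print(model)    # the compact "after" output added by this PR
summary(model)  # the new summary() method
str(model)      # per the PR description, the old verbose listing is still reachable via str()
```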
The failing check is in Python and is unrelated to this PR: https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=11348&view=logs&j=6ee41f30-52b9-5fb2-cd2b-d0121164dc51&t=feed3427-4866-55c6-1937-183bc0e12687
Network issue:
Rerun.
/gha run r-valgrind
Workflow R valgrind tests has been triggered! 🚀 Status: success ✔️.
/gha run r-solaris
Workflow Solaris CRAN check has been triggered! 🚀 solaris-x86-patched: https://builder.r-hub.io/status/lightgbm_3.3.1.99.tar.gz-eba9024beb344aa9b04d2a339ab05b14
This looks great, thanks so much for the idea and for the effort you put into the implementation!
Notes for other reviewers
I just updated this to latest master, to integrate other recent changes, especially the new CI jobs mimicking CRAN's tests with sanitizers (#4678).
If those pass, I'm comfortable merging this now that v3.3.1 has been released (#4715). Even if CRAN rejects v3.3.1 for some reason and we end up needing a v3.3.2 release, I'd be comfortable with these changes being part of such a release, since I think they're low risk and do not contain any breaking changes.
@jameslamb Can we merge this now?
Yep! I'll merge it. Thanks again for the help @david-cortes!
This pull request has been automatically locked since there has not been any recent activity after it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this one.
Currently, printing a lightgbm Booster shows information that is typically not relevant to the user, while omitting important pieces one would want to see, such as the objective function or the model dimensions. This PR adds a `print` and a `summary` method with shorter outputs and more relevant information. The old `print` can still be accessed through `str`.
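To make the shape of the change concrete, below is a minimal sketch of how such S3 methods could be structured. It is an illustration under assumptions (the fields used are guesses based on the Booster listing above), not the PR's actual implementation:

```r
#' @export
print.lgb.Booster <- function(x, ...) {
    # the number of boosting rounds is available from the Booster itself
    num_trees <- x$current_iter()
    # "objective" is assumed to be stored in the params list captured at training time
    objective <- x$params[["objective"]]
    cat(sprintf(
        "LightGBM Model (%d tree%s)\nObjective: %s\n"
        , num_trees
        , if (num_trees == 1L) "" else "s"
        , objective
    ))
    invisible(x)
}

#' @export
summary.lgb.Booster <- function(object, ...) {
    # summary() can simply reuse the print method
    print(object)
}
```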