Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] Add print() and summary() methods for Booster #4686

Merged
merged 18 commits into from
Nov 13, 2021

Conversation

david-cortes
Copy link
Contributor

Currently, when examining a lightgbm object, it shows information which is typically not relevant to the user, while missing important pieces of information that one would want to see, such as the objective function or the model dimensions. This PR adds a print and a summary method with shorter outputs and more relevant information. The old print can still be accessed through str.

@david-cortes
Copy link
Contributor Author

@jameslamb The linter here says this:

/home/runner/work/LightGBM/LightGBM/R-package/R/lgb.Booster.R:843:5: warning: Function "cat" is undesirable. As an alternative, CRAN forbids the use of cat() in packages except in special cases. Use message() or warning()..

I'm not able to find any such notice about cat in the R extensions manual or in the repository policy. What's more:

  • The examples in the R extensions manual use cat.
  • The suggestion of using message is misguided, because it sends output to stderr.

@jameslamb
Copy link
Collaborator

That message from this project's linter exactly mirrors direct feedback we received from CRAN.

In one submission of {lightgbm} 3.0.0 to CRAN in September 2020, CRAN's response included the following (documented in cran-comments.md)

Please replace cat() by message() or warning() in your functions (except for print() and summary() functions). Messages and warnings can be suppressed if needed.

Since this PR is adding print() and summary() methods, I think uses of cat() in those specific cases will be ok with CRAN.

According to https://github.com/jimhester/lintr#project-configuration, it's possible to tell {lintr} to exclude specific lines by adding the comment # nolint. Please try adding such comments to the lines in this PR that use cat().

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, thanks for working on it! I'll test this out soon to check what the outputs look like.

For now, I left a few initial comments. Can you also please add some tests, so we'll be confident that future refactorings don't break this functionality?

  • please add new tests to the R package that cover this code. Ideally these should try to cover all the if branches in print.lgb.Booster, but at a minimum please add one test print()-ing and summary()-ing a fitted Booster and one using a Booster with a null handle
  • please add a test using .Call(LGBM_BoosterGetNumFeatures_R) to confirm that that function is returning the expected value

R-package/R/lgb.Booster.R Outdated Show resolved Hide resolved
include/LightGBM/c_api.h Outdated Show resolved Hide resolved
src/c_api.cpp Outdated Show resolved Hide resolved
R-package/src/lightgbm_R.h Outdated Show resolved Hide resolved
david-cortes and others added 4 commits October 15, 2021 06:43
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
@david-cortes
Copy link
Contributor Author

Added the tests.

@jameslamb jameslamb self-requested a review October 26, 2021 04:11
@jameslamb jameslamb changed the title [R-package] Add S3 methods for print and summary [R-package] Add print() and summary() methods for Booster Oct 26, 2021
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for taking this on!

Please just see the one comment I left on the unit tests.

I installed {lightgbm} from this branch tonight so I could generate a comparison of the old and new printed information. This is important for reviewing, since the stated purpose of this PR is to provide "shorter outputs and more relevant information".

Using the mtcars code from the unit tests added in this PR, I see the following for print(model).

before (master @ df12c1b)

<lgb.Booster>
  Public:
    add_valid: function (data, name) 
    best_iter: -1
    best_score: NA
    current_iter: function () 
    dump_model: function (num_iteration = NULL, feature_importance_type = 0L) 
    eval: function (data, name, feval = NULL) 
    eval_train: function (feval = NULL) 
    eval_valid: function (feval = NULL) 
    finalize: function () 
    initialize: function (params = list(), train_set = NULL, modelfile = NULL, 
    lower_bound: function () 
    params: list
    predict: function (data, start_iteration = NULL, num_iteration = NULL, 
    raw: NA
    record_evals: list
    reset_parameter: function (params, ...) 
    rollback_one_iter: function () 
    save: function () 
    save_model: function (filename, num_iteration = NULL, feature_importance_type = 0L) 
    save_model_to_string: function (num_iteration = NULL, feature_importance_type = 0L) 
    set_train_data_name: function (name) 
    to_predictor: function () 
    update: function (train_set = NULL, fobj = NULL) 
    upper_bound: function () 
  Private:
    eval_names: NULL
    get_eval_info: function () 
    handle: lgb.Booster.handle
    higher_better_inner_eval: NULL
    init_predictor: NULL
    inner_eval: function (data_name, data_idx, feval = NULL) 
    inner_predict: function (idx) 
    is_predicted_cur_iter: list
    name_train_set: training
    name_valid_sets: list
    num_class: 1
    num_dataset: 1
    predict_buffer: list
    set_objective_to_none: FALSE
    train_set: lgb.Dataset, R6
    train_set_version: 1
    valid_sets: list

after (this PR)

LightGBM Model (1 tree)
Objective: regression
Fitted to dataset with 10 columns

R-package/tests/testthat/test_lgb.Booster.R Outdated Show resolved Hide resolved
@david-cortes
Copy link
Contributor Author

@StrikerRUS
Copy link
Collaborator

The failing check is in python and unrelated to this PR:

Network issue:

Err:5 http://ppa.launchpad.net/mhier/libboost-latest/ubuntu focal InRelease
  Could not connect to ppa.launchpad.net:80 (91.189.95.85), connection timed out
Reading package lists...
W: Failed to fetch http://ppa.launchpad.net/mhier/libboost-latest/ubuntu/dists/focal/InRelease  Could not connect to ppa.launchpad.net:80 (91.189.95.85), connection timed out
W: Some index files failed to download. They have been ignored, or old ones used instead.

Rerun.

@jameslamb
Copy link
Collaborator

jameslamb commented Oct 28, 2021

/gha run r-valgrind

Workflow R valgrind tests has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/1395102736

Status: success ✔️.

@jameslamb
Copy link
Collaborator

jameslamb commented Oct 28, 2021

/gha run r-solaris

Workflow Solaris CRAN check has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/1395103554

solaris-x86-patched: https://builder.r-hub.io/status/lightgbm_3.3.1.99.tar.gz-eba9024beb344aa9b04d2a339ab05b14
solaris-x86-patched-ods: https://builder.r-hub.io/status/lightgbm_3.3.1.99.tar.gz-9b7953e677704cb19843dedb2b8b3aeb
Reports also have been sent to LightGBM public e-mail: https://yopmail.com?lightgbm_rhub_checks
Status: success ✔️.

@jameslamb jameslamb self-requested a review November 1, 2021 02:57
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks so much for the idea and for the effort you put into the implementation!

Notes for other reviewers

I just updated this to latest master, to integrate other recent changes, especially the new CI jobs mimicking CRAN's tests with sanitizers (#4678).

If those pass, I'm comfortable merging this now that v3.3.1 has been released (#4715). Even if CRAN rejects v3.3.1 for some reason and we end up needing a v3.3.2 release, I'd be comfortable with these changes being part of such a release, since I think they're low risk and do not contain any breaking changes.

@StrikerRUS
Copy link
Collaborator

@jameslamb Can we merge this now?

@jameslamb
Copy link
Collaborator

yep! I'll merge it. Thanks again for the help @david-cortes !

@jameslamb jameslamb merged commit 2f59773 into microsoft:master Nov 13, 2021
@StrikerRUS StrikerRUS mentioned this pull request Jan 6, 2022
13 tasks
@jameslamb jameslamb mentioned this pull request Oct 7, 2022
40 tasks
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants