[Python] Refactors scikit-learn API to allow a list of evaluation metrics #3254

Merged

Conversation

giresg
Contributor

@giresg giresg commented Jul 27, 2020

This PR refactors the code in LGBM's scikit-learn API to allow the user to pass a list of evaluation metrics to the parameter eval_metric in the fit method of LGBMModel and its subclasses.

The snippet below shows an example of how this functionality can be used:

import ...

def custom_recall_sklearn(y_true, y_pred):
    return 'recall', recall_score(y_true, y_pred > 0.5), True


def custom_precision_sklearn(y_true, y_pred):
    return 'precision', precision_score(y_true, y_pred > 0.5), True

parameters = {
    "objective": "binary",
    "metric": "binary_logloss",
}

model = lightgbm.LGBMClassifier(**parameters)

model.fit(
    X=train,
    y=train_label,
    eval_set=(validation, validation_label),
    eval_metric=[custom_recall_sklearn, custom_precision_sklearn, "auc"])

Where eval_metric receives a list that can combine custom and built-in evaluation functions. The output looks like this:

[1]	valid_0's auc: 0.951963	valid_0's binary_logloss: 0.620274	valid_0's recall: 0.963916	valid_0's precision: 0.932162
[2]	valid_0's auc: 0.972375	valid_0's binary_logloss: 0.558293	valid_0's recall: 0.971234	valid_0's precision: 0.93151
[3]	valid_0's auc: 0.975112	valid_0's binary_logloss: 0.507648	valid_0's recall: 0.967701	valid_0's precision: 0.935594
[4]	valid_0's auc: 0.975005	valid_0's binary_logloss: 0.463648	valid_0's recall: 0.96644	valid_0's precision: 0.938495
[5]	valid_0's auc: 0.975215	valid_0's binary_logloss: 0.426748	valid_0's recall: 0.966944	valid_0's precision: 0.939676

The user can monitor as many metrics as needed during training.
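For reference, each custom callable in the list follows the same contract as the two functions in the description: it takes `y_true` and `y_pred` and returns a `(name, value, is_higher_better)` tuple. The sketch below re-implements recall and precision in plain Python so that it runs without scikit-learn or LightGBM. The helper `confusion_counts`, the function names, and the sample data are illustrative assumptions and are not part of this PR.

```python
def confusion_counts(y_true, y_pred, threshold=0.5):
    """Count true positives, false positives and false negatives.

    Predictions are treated as probabilities; values above `threshold`
    count as the positive class (same rule as `y_pred > 0.5` above).
    """
    tp = fp = fn = 0
    for truth, prob in zip(y_true, y_pred):
        predicted_positive = prob > threshold
        if predicted_positive and truth == 1:
            tp += 1
        elif predicted_positive and truth == 0:
            fp += 1
        elif not predicted_positive and truth == 1:
            fn += 1
    return tp, fp, fn


def custom_recall(y_true, y_pred):
    # Returns (metric name, metric value, is_higher_better).
    tp, _, fn = confusion_counts(y_true, y_pred)
    value = tp / (tp + fn) if (tp + fn) else 0.0
    return 'recall', value, True


def custom_precision(y_true, y_pred):
    # Returns (metric name, metric value, is_higher_better).
    tp, fp, _ = confusion_counts(y_true, y_pred)
    value = tp / (tp + fp) if (tp + fp) else 0.0
    return 'precision', value, True


# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0]
y_pred = [0.9, 0.6, 0.4, 0.8, 0.1]

# Evaluate every callable in the list, one tuple per metric.
results = [metric(y_true, y_pred) for metric in (custom_recall, custom_precision)]
```

Each entry of `results` is one `(name, value, is_higher_better)` tuple, which corresponds to one column in the per-iteration training log.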

@StrikerRUS rightly mentioned in PR 3165 and in issue 2182 that it is possible to monitor multiple metrics already. This PR leverages that functionality to make it a bit more intuitive.
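As context for the existing behaviour: to my understanding, a single custom evaluation function may return a list of `(name, value, is_higher_better)` tuples instead of one tuple, which is one way several custom metrics could already be reported from one callable. The sketch below shows only the shape of such a function in plain Python; the function name, metric names, and sample data are hypothetical.

```python
def accuracy_and_error_rate(y_true, y_pred):
    """Compute two metrics in one callable and return them as a list of tuples."""
    # Threshold the predicted probabilities at 0.5 to get class labels.
    labels = [1 if prob > 0.5 else 0 for prob in y_pred]
    correct = sum(1 for truth, label in zip(y_true, labels) if truth == label)
    accuracy = correct / len(y_true)
    return [
        ('accuracy', accuracy, True),           # higher is better
        ('error_rate', 1.0 - accuracy, False),  # lower is better
    ]


# Made-up labels and predicted probabilities, for illustration only.
results = accuracy_and_error_rate([1, 0, 1, 1, 0], [0.9, 0.6, 0.4, 0.8, 0.1])
```

Compared with this pattern, passing a list of separate callables keeps each metric in its own small function.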

This PR is backwards compatible.
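One way to picture the backwards compatibility is that a single string or a single callable can be treated as a one-element list, after which built-in metric names and custom callables are handled separately. The helper below, `split_eval_metrics`, is a hypothetical sketch of that idea and is not the code added in this PR.

```python
def split_eval_metrics(eval_metric):
    """Normalize `eval_metric` to a list and split it by kind.

    Accepts None, a single string, a single callable, or a list/tuple
    mixing both. Returns (builtin_names, custom_callables).
    """
    if eval_metric is None:
        metrics = []
    elif isinstance(eval_metric, (list, tuple)):
        metrics = list(eval_metric)
    else:
        # Old-style call: one string or one callable.
        metrics = [eval_metric]

    builtin_names = [m for m in metrics if isinstance(m, str)]
    custom_callables = [m for m in metrics if callable(m)]
    return builtin_names, custom_callables


def dummy_metric(y_true, y_pred):
    # Placeholder custom metric used only to exercise the helper.
    return 'dummy', 0.0, True


single = split_eval_metrics("auc")
mixed = split_eval_metrics([dummy_metric, "auc", "binary_logloss"])
```

With this normalization, a call that passes a single metric and a call that passes a list go through the same code path.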

@guolinke
Collaborator

guolinke commented Aug 5, 2020

@StrikerRUS can you help to review this PR?

Collaborator

@StrikerRUS StrikerRUS left a comment

@gramirezespinoza Thank you very much for your PR which proposes great API simplification! Please consider addressing some initial review comments below.

tests/python_package_test/test_engine.py (outdated, resolved)
tests/python_package_test/test_engine.py (outdated, resolved)
python-package/lightgbm/sklearn.py (outdated, resolved)
tests/python_package_test/test_engine.py (outdated, resolved)
tests/python_package_test/test_sklearn.py (outdated, resolved)
tests/python_package_test/test_sklearn.py (resolved)
@giresg
Contributor Author

giresg commented Aug 23, 2020

@StrikerRUS thanks for your comments! I've incorporated all your recommendations into the code. Also, please see my comment above about the unit test I had to modify.

I also fixed a couple of linting errors. Nevertheless, I don't know why this action is failing:

GitHub Actions / r-package (macOS-latest, clang, R 3.6, cmake) (pull_request) 

Any idea?

@guolinke
Collaborator

guolinke commented Sep 2, 2020

@gramirezespinoza you can try rebasing onto the master branch. Also cc @jameslamb.

@jameslamb
Collaborator

oh sorry! didn't see this comment since I wasn't a reviewer.

@gramirezespinoza there's no way that failure is related to your code. If you can please merge in the changes from master, I expect CI will build successfully. Sorry for the inconvenience.

@giresg giresg force-pushed the feature/sklearn-api-multiple-eval-metrics branch from d8acfb4 to 4fe33f6 Compare September 2, 2020 11:27
@giresg
Contributor Author

giresg commented Sep 2, 2020

Rebased to latest master, thanks @guolinke and @jameslamb

@giresg giresg requested a review from StrikerRUS September 2, 2020 11:30
@giresg
Contributor Author

giresg commented Sep 5, 2020

The CI build failed. I had a look and the logs show this error:

URL 'http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html'
Name 'link2'
Parent URL file:///home/travis/build/microsoft/LightGBM/docs/_build/html/GPU-Performance.html, line 253, col 8
Real URL http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html
Check time 6.020 seconds
Result Error: ConnectionError: ('Connection aborted.', BadStatusLine('No status line received - the server has closed the connection',))

Any idea why this is failing?

@StrikerRUS
Collaborator

StrikerRUS commented Sep 5, 2020

@gramirezespinoza

Any idea why this is failing?

Sorry for the inconvenience! This test often fails due to network issues. I have re-run it.

@giresg
Contributor Author

giresg commented Sep 5, 2020

@StrikerRUS no worries, thanks for re-running it!

Collaborator

@StrikerRUS StrikerRUS left a comment

@gramirezespinoza Thank you very much for addressing my previous comments! Looks good overall. Please consider addressing some new issues from one more review round below.

python-package/lightgbm/basic.py (outdated, resolved)
python-package/lightgbm/engine.py (outdated, resolved)
python-package/lightgbm/engine.py (outdated, resolved)
python-package/lightgbm/sklearn.py (outdated, resolved)
python-package/lightgbm/sklearn.py (outdated, resolved)
tests/python_package_test/test_engine.py (outdated, resolved)
tests/python_package_test/test_engine.py (outdated, resolved)
tests/python_package_test/test_engine.py (outdated, resolved)
tests/python_package_test/test_sklearn.py (outdated, resolved)
tests/python_package_test/test_sklearn.py (outdated, resolved)
German I Ramirez-Espinoza and others added 10 commits September 6, 2020 19:57
…rameter eval_metric of the class (and subclasses of) LGBMModel. Also adds unit tests for this functionality
…the "train" and "cv" functions can also receive a list of callables
Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
…ions of scikit-learn


Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
@giresg giresg force-pushed the feature/sklearn-api-multiple-eval-metrics branch from 24f9b83 to 73ec26e Compare September 6, 2020 11:58
@giresg giresg requested a review from StrikerRUS September 6, 2020 11:59
@giresg
Contributor Author

giresg commented Sep 6, 2020

@StrikerRUS thanks again for the comments. I've implemented the requested changes and rebased onto the latest master.

Collaborator

@StrikerRUS StrikerRUS left a comment

@gramirezespinoza Many thanks for this PR! I believe it will simplify user experience significantly.

@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
4 participants