[Python] Refactors scikit-learn API to allow a list of evaluation metrics #3254

giresg · 2020-07-27T14:11:47Z

This PR refactors the code in LGBM's scikit-learn API to allow the user to pass a list of evaluation metrics to the parameter eval_metric in the fit method of LGBMModel and its subclasses.

The snippet below shows an example of how this functionality can be used:

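import lightgbm
from sklearn.metrics import precision_score, recall_score

# train, train_label, validation and validation_label are assumed to be loaded already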

def custom_recall_sklearn(y_true, y_pred):
    return 'recall', recall_score(y_true, y_pred > 0.5), True


def custom_precision_sklearn(y_true, y_pred):
    return 'precision', precision_score(y_true, y_pred > 0.5), True

parameters = {
    "objective": "binary",
    "metric": "binary_logloss",
}

model = lightgbm.LGBMClassifier(**parameters)

model.fit(
    X=train,
    y=train_label,
    eval_set=(validation, validation_label),
    eval_metric=[custom_recall_sklearn, custom_precision_sklearn, "auc"])

Here, eval_metric receives a list that can combine custom and built-in evaluation metrics. The output looks like this:

[1]	valid_0's auc: 0.951963	valid_0's binary_logloss: 0.620274	valid_0's recall: 0.963916	valid_0's precision: 0.932162
[2]	valid_0's auc: 0.972375	valid_0's binary_logloss: 0.558293	valid_0's recall: 0.971234	valid_0's precision: 0.93151
[3]	valid_0's auc: 0.975112	valid_0's binary_logloss: 0.507648	valid_0's recall: 0.967701	valid_0's precision: 0.935594
[4]	valid_0's auc: 0.975005	valid_0's binary_logloss: 0.463648	valid_0's recall: 0.96644	valid_0's precision: 0.938495
[5]	valid_0's auc: 0.975215	valid_0's binary_logloss: 0.426748	valid_0's recall: 0.966944	valid_0's precision: 0.939676

The user can monitor as many metrics as needed during training.
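The custom metrics in the example above each return a three-element tuple: the metric name, its value, and a flag saying whether higher is better. The following is a minimal, dependency-free sketch of that contract. The function name `recall_at_threshold` and its `threshold` parameter are hypothetical and only illustrate the return shape; they are not part of this PR.

```python
from typing import Sequence, Tuple


def recall_at_threshold(
    y_true: Sequence[int],
    y_pred: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[str, float, bool]:
    """Hypothetical custom metric: returns (name, value, is_higher_better)."""
    # Turn predicted probabilities into hard 0/1 decisions.
    predicted = [p > threshold for p in y_pred]
    # Count positives that were found and positives that were missed.
    true_pos = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p)
    false_neg = sum(1 for t, p in zip(y_true, predicted) if t == 1 and not p)
    positives = true_pos + false_neg
    # Guard against a validation set without positive labels.
    value = true_pos / positives if positives else 0.0
    return "recall", value, True


name, value, higher_is_better = recall_at_threshold(
    [1, 0, 1, 1], [0.9, 0.2, 0.4, 0.7]
)
```

A function with this shape could be placed in the eval_metric list next to built-in names such as "auc", assuming it accepts the same (y_true, y_pred) arguments as the functions in the example.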

@StrikerRUS rightly mentioned in PR 3165 and in issue 2182 that it is possible to monitor multiple metrics already. This PR leverages that functionality to make it a bit more intuitive.

This PR is backwards compatible.
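One way to picture how a single metric and a list of metrics can both be accepted is a small normalization step that separates built-in metric names from callables. This is only a sketch under assumptions: `split_eval_metrics` is a hypothetical helper written for illustration and is not the code merged in this PR.

```python
from typing import Callable, List, Tuple


def split_eval_metrics(eval_metric) -> Tuple[List[str], List[Callable]]:
    """Hypothetical helper: split eval_metric into built-in names and callables."""
    if eval_metric is None:
        return [], []
    # A single string or a single callable is wrapped in a list, so older
    # call sites that pass one metric behave the same as before.
    if isinstance(eval_metric, str) or callable(eval_metric):
        eval_metric = [eval_metric]
    builtin_names = [m for m in eval_metric if isinstance(m, str)]
    custom_funcs = [m for m in eval_metric if callable(m)]
    return builtin_names, custom_funcs


def dummy_metric(y_true, y_pred):
    # Placeholder custom metric used only to exercise the helper.
    return "dummy", 0.0, True


names, funcs = split_eval_metrics([dummy_metric, "auc", "binary_logloss"])
single_names, single_funcs = split_eval_metrics("auc")
```

With this split, the string names could be forwarded as built-in metrics and the callables as custom evaluation functions, which matches the mixed list shown in the example above.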

guolinke · 2020-08-05T22:42:00Z

@StrikerRUS can you help to review this PR?

StrikerRUS

@gramirezespinoza Thank you very much for your PR which proposes great API simplification! Please consider addressing some initial review comments below.

tests/python_package_test/test_engine.py

python-package/lightgbm/sklearn.py

tests/python_package_test/test_engine.py

tests/python_package_test/test_sklearn.py

giresg · 2020-08-23T13:52:27Z

@StrikerRUS thanks for your comments! I've incorporated all your recommendations into the code. Also, please see my comment above about the unit test I had to modify.

I also fixed a couple of linting errors. Nevertheless, I don't know why this action is failing:

GitHub Actions / r-package (macOS-latest, clang, R 3.6, cmake) (pull_request)

Any idea?

guolinke · 2020-09-02T02:09:54Z

@gramirezespinoza you can try rebasing onto the master branch. Also cc @jameslamb.

jameslamb · 2020-09-02T02:44:44Z

oh sorry! didn't see this comment since I wasn't a reviewer.

@gramirezespinoza there's no way that failure is related to your code. If you can please merge in the changes from master, I expect CI will build successfully. Sorry for the inconvenience.

giresg · 2020-09-02T11:28:47Z

Rebased to latest master, thanks @guolinke and @jameslamb

giresg · 2020-09-05T09:38:00Z

The CI build failed. I had a look and the logs show this error:

URL 'http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html'
Name 'link2'
Parent URL file:///home/travis/build/microsoft/LightGBM/docs/_build/html/GPU-Performance.html, line 253, col 8
Real URL http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html
Check time 6.020 seconds
Result Error: ConnectionError: ('Connection aborted.', BadStatusLine('No status line received - the server has closed the connection',))

Any idea why this is failing?

StrikerRUS · 2020-09-05T13:13:20Z

@gramirezespinoza

Any idea why this is failing?

Sorry for the inconvenience! This test often fails due to network issues. I've re-run it.

giresg · 2020-09-05T13:51:07Z

@StrikerRUS no worries, thanks for re-running it!

StrikerRUS

@gramirezespinoza Thank you very much for addressing my previous comments! Looks good overall. Please consider addressing some new issues from one more review round below.

python-package/lightgbm/basic.py

python-package/lightgbm/engine.py

python-package/lightgbm/sklearn.py

tests/python_package_test/test_engine.py

tests/python_package_test/test_sklearn.py

giresg · 2020-09-06T12:00:04Z

@StrikerRUS thanks again for the comments. I've implemented the changes requested and rebased to newest master.

StrikerRUS

@gramirezespinoza Many thanks for this PR! I believe it will simplify user experience significantly.

github-actions · 2023-08-24T12:13:38Z

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

giresg requested review from chivee, henry0312, StrikerRUS and wxchan as code owners July 27, 2020 14:11

giresg mentioned this pull request Jul 27, 2020

[python][scikit-learn] Support for multiple evaluation metrics #3165

Closed

StrikerRUS requested changes Aug 6, 2020

StrikerRUS added the feature label Aug 6, 2020

giresg force-pushed the feature/sklearn-api-multiple-eval-metrics branch from d8acfb4 to 4fe33f6 Compare September 2, 2020 11:27

giresg requested a review from StrikerRUS September 2, 2020 11:30

StrikerRUS requested changes Sep 5, 2020

German I Ramirez-Espinoza and others added 10 commits September 6, 2020 19:57

Refactors sklearn API to allow a list of evaluation metrics in the parameter eval_metric of the class (and subclasses of) LGBMModel. Also adds unit tests for this functionality

97c5882

Simplify expression to check whether the user passed one or multiple metrics to eval_metric parameter

416fe93

Simplify new tests by using custom metrics already defined in the test file

47160ba

Update docstring to reflect the fact that the parameter "feval" from the "train" and "cv" functions can also receive a list of callables

d15725a

Remove oxford comma from docstrings

4c4c92e

Apply suggestions from code review Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Use named-parameters to make sure code is compatible with future versions of scikit-learn

fbbaaa8

Apply suggestions from code review Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Remove throwaway return value to make code more succinct

6c5056c

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Move statement to group together the code related to feval

3ca1cd0

Avoid modifying original args as it causes errors in scikit-learn tools

48a760f

For details see: microsoft#2619

Consolidate multiple eval-metrics unit-tests into one test

73ec26e

giresg force-pushed the feature/sklearn-api-multiple-eval-metrics branch from 24f9b83 to 73ec26e Compare September 6, 2020 11:58

giresg requested a review from StrikerRUS September 6, 2020 11:59

Merge branch 'master' into feature/sklearn-api-multiple-eval-metrics

fe1f39f

StrikerRUS approved these changes Sep 6, 2020

StrikerRUS merged commit afc76d2 into microsoft:master Sep 6, 2020

This was referenced Sep 7, 2020

[python][examples] updated examples with multiple custom metrics #3367

Merged

[python] fix dangerous default for eval_at in LGBMRanker #3377

Merged

github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
