Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python][sklearn] respect objective aliases #4758

Merged
merged 4 commits into from
Nov 10, 2021
Merged

Conversation

StrikerRUS
Copy link
Collaborator

I believe this is quite odd to add objective alias used in sklearn (loss #4637) and simply ignore any objective aliases at the same time.

Comment on lines -913 to -916
# invalid objective is replaced with default multiclass one
# and invalid binary metric is replaced with multiclass alternative
gbm = lgb.LGBMClassifier(objective='invalid_obj',
**params).fit(eval_metric='binary_error', **params_fit)
Copy link
Collaborator Author

@StrikerRUS StrikerRUS Nov 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this behavior is controversial and I'd better remove such silent replacement of objective function and instead raise error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. We should explicitly tell the user what they intended to use is not valid. Or at least we should give a warning.

@StrikerRUS StrikerRUS marked this pull request as ready for review November 1, 2021 21:38
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally support these changes and agree that this makes the scikit-learn API behave in a way that's closer to what we've told users to expect (where parameter aliases can always be used to override keyword arguments).

I just left one minor suggestion about some of the unrelated whitespace changes.

python-package/lightgbm/sklearn.py Outdated Show resolved Hide resolved
for alias in _ConfigAliases.get('objective'):
if alias in params:
self._objective = params.pop(alias)
_log_warning(f"Found `{alias}` in params. Will use it instead of argument")
Copy link
Collaborator

@shiyu1994 shiyu1994 Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this sentence complete? Should it be Will use it instead of argument `objective` ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this particular case I agree with you. Warning message was improved in 2d0392e.

However, please note that in this similar warning

for alias in _ConfigAliases.get("num_iterations"):
if alias in params:
_log_warning(f"Found `{alias}` in params. Will use it instead of argument")
num_boost_round = params.pop(alias)

we cannot add specific argument name which is being overwritten because it depends on whether user uses train() function directly or indirectly from the sklearn-wrapper. Please consider the following example:

import lightgbm as lgb

from sklearn.datasets import load_boston

X, y = load_boston(return_X_y=True)
lgb_train = lgb.Dataset(X, y)

lgb.train({'n_iter': 5}, lgb_train)  # here n_iter is used instead of num_boost_round argument

lgb.LGBMRegressor(n_iter=5).fit(X, y)  # here n_iter is used instead of n_estimators argument

Comment on lines -913 to -916
# invalid objective is replaced with default multiclass one
# and invalid binary metric is replaced with multiclass alternative
gbm = lgb.LGBMClassifier(objective='invalid_obj',
**params).fit(eval_metric='binary_error', **params_fit)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. We should explicitly tell the user what they intended to use is not valid. Or at least we should give a warning.

@jameslamb jameslamb self-requested a review November 3, 2021 14:14
@StrikerRUS
Copy link
Collaborator Author

Should we block merging of this PR for one week similarly to #4580 (comment)?

@StrikerRUS StrikerRUS requested a review from shiyu1994 November 3, 2021 15:51
@jameslamb
Copy link
Collaborator

Should we block merging of this PR for one week similarly to #4580 (comment)?

I support not merging this for a few more days, yes.

@StrikerRUS StrikerRUS merged commit 0a4d190 into master Nov 10, 2021
@StrikerRUS StrikerRUS deleted the sklearn_objective branch November 10, 2021 13:15
@StrikerRUS StrikerRUS mentioned this pull request Jan 6, 2022
13 tasks
@jameslamb jameslamb mentioned this pull request Oct 7, 2022
40 tasks
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants