-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python][sklearn] respect objective aliases #4758
Conversation
# invalid objective is replaced with default multiclass one | ||
# and invalid binary metric is replaced with multiclass alternative | ||
gbm = lgb.LGBMClassifier(objective='invalid_obj', | ||
**params).fit(eval_metric='binary_error', **params_fit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this behavior is controversial and I'd better remove such silent replacement of objective function and instead raise error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. We should explicitly tell the user what they intended to use is not valid. Or at least we should give a warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally support these changes and agree that this makes the scikit-learn API behave in a way that's closer to what we've told users to expect (where parameter aliases can always be used to override keyword arguments).
I just left one minor suggestion about some of the unrelated whitespace changes.
python-package/lightgbm/sklearn.py
Outdated
for alias in _ConfigAliases.get('objective'): | ||
if alias in params: | ||
self._objective = params.pop(alias) | ||
_log_warning(f"Found `{alias}` in params. Will use it instead of argument") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this sentence complete? Should it be Will use it instead of argument `objective`
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this particular case I agree with you. Warning message was improved in 2d0392e.
However, please note that in this similar warning
LightGBM/python-package/lightgbm/engine.py
Lines 571 to 574 in da98f24
for alias in _ConfigAliases.get("num_iterations"): | |
if alias in params: | |
_log_warning(f"Found `{alias}` in params. Will use it instead of argument") | |
num_boost_round = params.pop(alias) |
we cannot add specific argument name which is being overwritten because it depends on whether user uses train()
function directly or indirectly from the sklearn-wrapper. Please consider the following example:
import lightgbm as lgb
from sklearn.datasets import load_boston
X, y = load_boston(return_X_y=True)
lgb_train = lgb.Dataset(X, y)
lgb.train({'n_iter': 5}, lgb_train) # here n_iter is used instead of num_boost_round argument
lgb.LGBMRegressor(n_iter=5).fit(X, y) # here n_iter is used instead of n_estimators argument
# invalid objective is replaced with default multiclass one | ||
# and invalid binary metric is replaced with multiclass alternative | ||
gbm = lgb.LGBMClassifier(objective='invalid_obj', | ||
**params).fit(eval_metric='binary_error', **params_fit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. We should explicitly tell the user what they intended to use is not valid. Or at least we should give a warning.
Should we block merging of this PR for one week similarly to #4580 (comment)? |
I support not merging this for a few more days, yes. |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
I believe this is quite odd to add objective alias used in sklearn (
loss
#4637) and simply ignore any objective aliases at the same time.