Description
When feature_fraction is set to a small value (e.g. 0.6), the last feature in the data set never gets selected. I was expecting the feature subset to be re-sampled at each iteration.
In general, with a feature_fraction that is not close to 1, some features end up with a feature importance of 0 on my data set, and not necessarily the last one. I suspect there is a bug, either in the description of how this parameter works or in the implementation.
Reproducible example
I create three random features; only the last one is used in the response, but it never gets picked up unless feature_fraction is increased. This is visible both from its feature importance being 0 and from the response plot. Please increase feature_fraction or change the order of the predictors (e.g. all_preds = ["z", "x", "y"]) to see how the model suddenly learn the function. A minimal sketch of the script is included after the environment info below.
Environment info
Release version:
pip install lightgbm==3.2.1
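As a rough reconstruction of the setup described above (the reporter's exact script is not shown here), the following sketch assumes three standard-normal features named x, y, z with z last, a target that is an arbitrary function of z alone, and the native lgb.train interface; everything except feature_fraction=0.6 and the predictor ordering is an illustrative assumption rather than the original code.

```python
# Minimal sketch of the reported setup. Assumed details (feature names,
# sample size, target function, native lgb.train API) are reconstructions
# from the description, not the reporter's original script.
import numpy as np
import pandas as pd
import lightgbm as lgb

rng = np.random.default_rng(0)
n = 10_000

# Three independent random features; the response depends only on "z",
# which is deliberately the last predictor.
df = pd.DataFrame({
    "x": rng.normal(size=n),
    "y": rng.normal(size=n),
    "z": rng.normal(size=n),
})
all_preds = ["x", "y", "z"]      # try ["z", "x", "y"] to see the behaviour change
target = np.sin(3.0 * df["z"])   # arbitrary function of the last feature only

params = {
    "objective": "regression",
    "feature_fraction": 0.6,     # the parameter under discussion
    "verbosity": -1,
}
dtrain = lgb.Dataset(df[all_preds], label=target)
booster = lgb.train(params, dtrain, num_boost_round=200)

# Per the report, with lightgbm==3.2.1 the last feature can end up with
# importance 0 here, even though it is the only informative one.
print(dict(zip(all_preds, booster.feature_importance())))
```

Per the report, raising feature_fraction toward 1.0 or moving "z" earlier in all_preds lets the model pick it up, while with feature_fraction=0.6 on lightgbm 3.2.1 its importance stays at 0.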
Hey @draphi! Thanks a lot for posting this issue with a detailed reproducible example!
I can confirm that one feature is unused in version 3.2.1, but I think this issue has already been fixed in master via #4450. Also linking #4371 as the same issue.
Here is what I get with a nightly build of LightGBM:
This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this one.