Feature importance #12
Are the feature scores the same as the feature weights described in http://www.cs.cornell.edu/~yinlou/papers/lou-kdd13.pdf? Namely, is it the L2/l2 norm of each feature's (or feature pair's) function in the $GA^2M$ framework?
Hi pjk645, thanks for the question! Just to make sure we understand correctly, are you asking about the overall feature importance ranking we assign in the EBM summary? Currently, we calculate this ranking as the average absolute predicted value of each feature over the training dataset. In other words, per feature, we calculate the score that feature assigns to each training data point. We then take the absolute value of all of these scores and average the result. Sorting these averages gives the final ranking of features for the model. Because EBM is an additive model, this ranking reflects which features have the largest impact on predictions in the training set. There are many ways to calculate overall feature importance, and we are considering including alternative methods (e.g., AUROC per feature) in future releases!
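A minimal sketch of that ranking, assuming the per-sample additive contributions of each feature are already available as a matrix (the function and variable names here are hypothetical, not the interpret library's API):

```python
import numpy as np

def rank_features_by_mean_abs_score(per_feature_scores, feature_names):
    """Rank features by the mean absolute per-feature score.

    per_feature_scores: array of shape (n_samples, n_features), where
    entry [i, j] is the additive contribution of feature j to the
    model's prediction for training sample i.
    """
    importances = np.mean(np.abs(per_feature_scores), axis=0)
    order = np.argsort(importances)[::-1]  # sort descending by importance
    return [(feature_names[j], float(importances[j])) for j in order]
```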
Firstly, thanks for responding and for making this module. The EBM fit works really well right out of the box on several toy and work-related datasets. I'm quite intrigued by both the high performance and the feature importance aspects of the model. Yes, I am inquiring about the overall importance scores. From your response, it sounds like you are using the L1/l1 norm rather than the L2/l2 norm I had in mind, but I'm not sure. By L1/l1 norm, I mean the average value of the integral of the absolute value of a function, which in the discrete case simplifies to the average absolute value at the discrete points. By L2/l2 norm, I mean the square root of the average value of the integral of the function squared, which in the discrete case simplifies to the square root of the average of the squared values of the function. In http://www.cs.cornell.edu/~yinlou/papers/lou-kdd13.pdf, the authors describe using the L2/l2 norm of each feature's component as its "weight" and then ranking each feature's importance to the model by that weight. Am I correct in interpreting that you are doing essentially the same thing, but with the L1/l1 norm instead of the L2/l2 norm?
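Written out (the notation here is mine, following the definitions above): for the shape function $f_j$ of feature $j$, evaluated at the training values $x_{1j}, \dots, x_{nj}$, the two candidate importance measures are

$$
\|f_j\|_1 = \frac{1}{n}\sum_{i=1}^{n}\bigl|f_j(x_{ij})\bigr|,
\qquad
\|f_j\|_2 = \sqrt{\frac{1}{n}\sum_{i=1}^{n} f_j(x_{ij})^2}.
$$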
This is correct, except we compute a weighted average absolute value across the function (weighted by the density of the training dataset). So if 90% of the training data took a value of "0" for a feature, and 10% took the value "1", the value of "0" has 9x the weight of the value of "1" before we compute the average absolute value. The main idea is to prevent functions that take an extreme value in sparse regions from getting high scores. This is the key difference between the "weights" described in the paper and our current methodology (alongside the L1/L2 difference). Of course, if you want to appropriately highlight different cases, there are other choices of feature importance (like the weights described in the paper, average ROC per feature on a validation set, etc.), so we plan to make more options available. Thanks for the insightful question!
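A sketch of that density weighting, stated over a feature's bins (a hypothetical helper, not the library's internal code): given the shape function's score per bin and the number of training samples falling in each bin, the importance is the density-weighted mean absolute score.

```python
import numpy as np

def weighted_mean_abs(bin_scores, bin_counts):
    """Density-weighted mean absolute value of a shape function.

    bin_scores: score the shape function assigns to each bin.
    bin_counts: number of training samples falling in each bin.
    """
    weights = np.asarray(bin_counts) / np.sum(bin_counts)
    return float(np.sum(weights * np.abs(bin_scores)))

# Example from the comment above: 90% of samples take value "0", 10% take "1",
# so the score at "0" gets 9x the weight of the score at "1".
print(weighted_mean_abs(bin_scores=[0.2, 5.0], bin_counts=[900, 100]))
# -> 0.9 * 0.2 + 0.1 * 5.0 = 0.68, versus the unweighted (0.2 + 5.0) / 2 = 2.6
```

The example shows the intended effect: an extreme score reached only in a sparse region (the rare "1" bin) is damped rather than dominating the feature's importance.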
Great, thanks for the feedback.