ENH add an option to center ICE and PD (scikit-learn#18310)
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
6 people committed Apr 29, 2022
1 parent cfec91b commit 0f6cd8f
Showing 5 changed files with 312 additions and 66 deletions.
48 changes: 32 additions & 16 deletions doc/modules/partial_dependence.rst
@@ -11,10 +11,10 @@ Partial dependence plots (PDP) and individual conditional expectation (ICE)
plots can be used to visualize and analyze interaction between the target
response [1]_ and a set of input features of interest.

Both PDPs and ICEs assume that the input features of interest are independent
from the complement features, and this assumption is often violated in practice.
Thus, in the case of correlated features, we will create absurd data points to
compute the PDP/ICE.
Both PDPs [H2009]_ and ICEs [G2015]_ assume that the input features of interest
are independent from the complement features, and this assumption is often
violated in practice. Thus, in the case of correlated features, we will
create absurd data points to compute the PDP/ICE [M2019]_.
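
To make the "absurd data points" concrete, here is a minimal NumPy sketch, not taken from the commit. It assumes a hypothetical two-feature dataset in which `x1` is almost a copy of `x0`, then builds the grid-substituted samples that a brute-force PDP/ICE computation would evaluate. The variable names and the synthetic data are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: feature x1 is almost a copy of feature x0,
# so the two features are strongly correlated.
x0 = rng.uniform(0.0, 1.0, size=200)
x1 = x0 + rng.normal(0.0, 0.01, size=200)
X = np.column_stack([x0, x1])

# Grid of values for the feature of interest (feature 0).
grid = np.linspace(x0.min(), x0.max(), num=5)

# For each grid value, copy the whole dataset and overwrite feature 0.
# The complement feature (feature 1) keeps its observed values.
X_grid = np.repeat(X[np.newaxis, :, :], grid.size, axis=0)  # (5, 200, 2)
X_grid[:, :, 0] = grid[:, np.newaxis]

# In the observed data the two features never differ by much ...
observed_gap = np.abs(X[:, 0] - X[:, 1]).max()
# ... but the substituted samples pair small x0 with large x1 and vice versa.
synthetic_gap = np.abs(X_grid[:, :, 0] - X_grid[:, :, 1]).max()

print(f"largest |x0 - x1| in observed data:     {observed_gap:.3f}")
print(f"largest |x0 - x1| in substituted data:  {synthetic_gap:.3f}")
```

The substituted samples combine feature values that never occur together in the observed data, which is why predictions averaged over them can be misleading when features are correlated.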

Partial dependence plots
========================
@@ -164,6 +164,18 @@ PDPs. They can be plotted together with
... kind='both')
<...>

If there are too many lines in an ICE plot, it can be difficult to see
differences between individual samples and interpret the model. Centering the
ICE at the first value on the x-axis produces centered Individual Conditional
Expectation (cICE) plots [G2015]_. This puts emphasis on the divergence of
individual conditional expectations from the mean line, thus making it easier
to explore heterogeneous relationships. cICE plots can be plotted by setting
`centered=True`:

>>> PartialDependenceDisplay.from_estimator(clf, X, features,
... kind='both', centered=True)
<...>
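
As a rough illustration of what centering at the first x-axis value means, the sketch below subtracts each curve's value at the first grid point from the whole curve. This is a minimal NumPy sketch under stated assumptions, not the library's implementation: the helper `center_curves` and the ICE values are hypothetical.

```python
import numpy as np


def center_curves(ice_values):
    """Shift every curve so that it equals zero at the first grid point.

    Hypothetical helper for illustration; `ice_values` is assumed to have
    shape (n_samples, n_grid_points).
    """
    ice_values = np.asarray(ice_values, dtype=float)
    # Subtract each row's first value; keeping the column axis lets
    # broadcasting apply the shift along the whole curve.
    return ice_values - ice_values[:, [0]]


# Made-up ICE values: 3 samples evaluated on a 4-point grid.
ice = np.array(
    [
        [10.0, 11.0, 12.0, 13.0],
        [20.0, 20.5, 21.0, 21.5],
        [5.0, 5.0, 5.0, 5.0],
    ]
)

centered_ice = center_curves(ice)
# The centered PD curve is the average of the centered ICE curves.
centered_pd = centered_ice.mean(axis=0)

print(centered_ice)
print(centered_pd)
```

After centering, all curves start at zero, so differences in slope between samples are no longer hidden by differences in their starting level.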

Mathematical Definition
=======================

@@ -255,15 +267,19 @@ estimators that support it, and 'brute' is used for the rest.
.. topic:: References

T. Hastie, R. Tibshirani and J. Friedman, `The Elements of
Statistical Learning <https://web.stanford.edu/~hastie/ElemStatLearn//>`_,
Second Edition, Section 10.13.2, Springer, 2009.

C. Molnar, `Interpretable Machine Learning
<https://christophm.github.io/interpretable-ml-book/>`_, Section 5.1, 2019.

A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin, :arxiv:`Peeking Inside the
Black Box: Visualizing Statistical Learning With Plots of Individual
Conditional Expectation <1309.6392>`,
Journal of Computational and Graphical Statistics, 24(1): 44-65, Springer,
2015.
.. [H2009] T. Hastie, R. Tibshirani and J. Friedman,
`The Elements of Statistical Learning
<https://web.stanford.edu/~hastie/ElemStatLearn//>`_,
Second Edition, Section 10.13.2, Springer, 2009.
.. [M2019] C. Molnar,
`Interpretable Machine Learning
<https://christophm.github.io/interpretable-ml-book/>`_,
Section 5.1, 2019.
.. [G2015] :arxiv:`A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin,
"Peeking Inside the Black Box: Visualizing Statistical
Learning With Plots of Individual Conditional Expectation"
Journal of Computational and Graphical Statistics,
24(1): 44-65, Springer, 2015.
<1309.6392>`
11 changes: 9 additions & 2 deletions doc/whats_new/v1.1.rst
@@ -581,12 +581,19 @@ Changelog
:pr:`16061` by `Thomas Fan`_.

- |Enhancement| In
:meth:`~sklearn.inspection.PartialDependenceDisplay.from_estimator` and
:meth:`~sklearn.inspection.PartialDependenceDisplay.from_predictions`, allow
:meth:`inspection.PartialDependenceDisplay.from_estimator`, allow
`kind` to accept a list of strings to specify which type of
plot to draw for each feature interaction.
:pr:`19438` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Enhancement| :meth:`inspection.PartialDependenceDisplay.from_estimator`,
:meth:`inspection.PartialDependenceDisplay.plot`, and
:func:`inspection.plot_partial_dependence` now support plotting centered
Individual Conditional Expectation (cICE) and centered PDP curves controlled
by setting the parameter `centered`.
:pr:`18310` by :user:`Johannes Elfner <JoElfner>` and
:user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.isotonic`
.......................

18 changes: 13 additions & 5 deletions examples/inspection/plot_partial_dependence.py
@@ -113,7 +113,13 @@

from sklearn.inspection import PartialDependenceDisplay

common_params = {"subsample": 50, "n_jobs": 2, "grid_resolution": 20, "random_state": 0}
common_params = {
    "subsample": 50,
    "n_jobs": 2,
    "grid_resolution": 20,
    "centered": True,
    "random_state": 0,
}

print("Computing partial dependence plots...")
tic = time()
@@ -188,10 +194,12 @@
# rooms per household.
#
# The ICE curves (light blue lines) complement the analysis: we can see that
# there are some exceptions, where the house price remain constant with median
# income and average occupants. On the other hand, while the house age (top
# right) does not have a strong influence on the median house price on average,
# there seems to be a number of exceptions where the house price increase when
# there are some exceptions (which are better highlighted with the option
# `centered=True`), where the house price remains constant with respect to
# median income and average occupants variations.
# On the other hand, while the house age (top right) does not have a strong
# influence on the median house price on average, there seems to be a number
# of exceptions where the house price increases when
# the house is between 15 and 25 years old. Similar exceptions can be observed
# for the average number of rooms (bottom left). Therefore, ICE plots show some
# individual effects that are attenuated by taking the averages.