As models become more complex, it's increasingly important to develop methods for interpreting their decisions. In this repository, I provide an overview of some libraries you can use to interpret machine learning models.
If you want more information about the topic, you can check out my articles:
- Introduction to Machine Learning Model Interpretation
- Hands-on Global Model Interpretation
- Local Model Interpretation: An Introduction
- Interpreting PyTorch models with Captum
Captum is a flexible, easy-to-use model interpretability library for PyTorch, providing state-of-the-art tools for understanding how specific neurons and layers affect predictions.
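As a quick illustration, here is a minimal sketch of attributing a prediction with Captum's Integrated Gradients; the tiny network and random input are stand-ins for your own model and data.

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Placeholder network standing in for your own trained PyTorch model
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 3))
model.eval()

# A single random input standing in for real data
inputs = torch.randn(1, 10)

# Integrated Gradients attributes the prediction for the chosen class (target=0)
# back to the individual input features
ig = IntegratedGradients(model)
attributions, delta = ig.attribute(inputs, target=0, return_convergence_delta=True)
print(attributions)
```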
tf-explain implements interpretability methods for Tensorflow models. It supports two APIs: the Core API, which allows you to interpret a model after training, and a Callback API which lets you use callbacks to monitor the model while training.
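Here is a sketch of the Core API with Grad-CAM; the VGG16 model, random image, and layer name are assumptions made for the example, and the exact arguments may differ between tf-explain versions.

```python
import numpy as np
import tensorflow as tf
from tf_explain.core.grad_cam import GradCAM

# Any trained Keras image model works here; VGG16 is just a convenient stand-in
model = tf.keras.applications.VGG16(weights="imagenet", include_top=True)

# A single random image standing in for a real preprocessed input
images = np.random.rand(1, 224, 224, 3).astype("float32")

# Highlight the image regions that drive the prediction for class 281 ("tabby cat")
explainer = GradCAM()
grid = explainer.explain((images, None), model, class_index=281, layer_name="block5_conv3")
explainer.save(grid, ".", "grad_cam.png")
```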
SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explain any machine learning model's output. It connects optimal credit allocation with local explanations using the classical Shapley values from game theory and their related extensions (see papers for details and citations).
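A minimal sketch of the classic TreeExplainer workflow, using the scikit-learn diabetes dataset and an XGBoost regressor as stand-ins for your own data and model:

```python
import shap
import xgboost
from sklearn.datasets import load_diabetes

# Placeholder data and model
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = xgboost.XGBRegressor().fit(X, y)

# TreeExplainer computes SHAP values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global summary: feature importance and the direction of each feature's effect
shap.summary_plot(shap_values, X)
```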
Lime (Local Interpretable Model-agnostic Explanations) is a local model interpretation technique that uses local surrogate models to approximate the predictions of the underlying black-box model.
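A minimal sketch of a local explanation on tabular data; the iris dataset and random forest are placeholders for your own data and black-box model.

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Placeholder data and black-box model
data = load_iris()
model = RandomForestClassifier().fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)

# Fit a local surrogate around a single instance and inspect its feature weights
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
print(exp.as_list())
```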
ELI5 helps debug machine learning classifiers and explain their predictions.
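A minimal sketch using ELI5's text formatters on a scikit-learn classifier; the iris data and logistic regression are stand-ins for your own setup.

```python
import eli5
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Placeholder data and model
data = load_iris()
model = LogisticRegression(max_iter=1000).fit(data.data, data.target)

# Global view: which features carry the most weight in the model
print(eli5.format_as_text(eli5.explain_weights(model, feature_names=data.feature_names)))

# Local view: why the model scored this particular instance the way it did
print(eli5.format_as_text(
    eli5.explain_prediction(model, data.data[0], feature_names=data.feature_names)
))
```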
PDPbox lets you visualize the impact of certain features on model predictions using partial dependence plots and information plots.
> The partial dependence plot (short PDP or PD plot) shows the marginal effect one or two features have on the predicted outcome of a machine learning model. — J. H. Friedman
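A sketch of a one-feature partial dependence plot, assuming PDPbox 0.2's pdp_isolate/pdp_plot functions (newer releases changed this interface); the diabetes data and random forest are placeholders.

```python
from pdpbox import pdp
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Placeholder data and model
data = load_diabetes(as_frame=True)
X, y = data.data, data.target
model = RandomForestRegressor().fit(X, y)

# Partial dependence of the prediction on a single feature ("bmi")
# NOTE: pdp_isolate / pdp_plot are the PDPbox 0.2.x names; newer versions renamed them
pdp_vals = pdp.pdp_isolate(
    model=model,
    dataset=X,
    model_features=X.columns.tolist(),
    feature="bmi",
)
pdp.pdp_plot(pdp_vals, "bmi")
```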
Credit goes to the people who wrote the libraries covered in this repository. I'm merely showing you some examples so you can get an idea of how the libraries work.