
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

hila-chefer/Transformer-Explainability

Faster, more general, and can be applied to any type of attention! Among the features:

  • We remove LRP in favor of a simpler, faster solution, and show that the strong results from our first paper still hold!
  • We extend our work to any type of Transformer: not just self-attention-based encoders, but also co-attention encoders and encoder-decoders!
  • We show that VQA models can actually understand both image and text and make connections!
  • We use a DETR object detector and create segmentation masks from our explanations!
  • We provide a colab notebook with all the examples. You can very easily add images and questions of your own!


ViT explainability notebook:

Open In Colab

BERT explainability notebook:

Open In Colab

Updates

April 5 2021: Check out this new post about our paper! A great resource for understanding the main concepts behind our work.

March 15 2021: A Colab notebook for BERT for sentiment analysis added!

Feb 28 2021: Our paper was accepted to CVPR 2021!

Feb 17 2021: A Colab notebook with all examples added!

Jan 5 2021: A Jupyter notebook for DeiT added!

Introduction

Official implementation of Transformer Interpretability Beyond Attention Visualization.

We introduce a novel method for visualizing the classifications made by a Transformer-based model, for both vision and NLP tasks. Our method can also produce an explanation per class.

The method consists of three phases:
  1. Calculating relevance for each attention matrix using our novel formulation of LRP.

  2. Backpropagation of gradients for each attention matrix w.r.t. the visualized class. Gradients are used to average attention heads.

  3. Layer aggregation with rollout.
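The three phases above can be sketched as follows. This is a minimal, illustrative implementation, not the repository's actual code: it assumes the per-layer attention relevances (phase 1, computed via LRP in the real code) and the gradients of each attention matrix w.r.t. the visualized class (phase 2) have already been collected as lists of `(heads, seq, seq)` tensors.

```python
import torch

def rollout_with_relevance(grads, rels):
    """grads, rels: lists of (heads, seq, seq) tensors, one per layer."""
    seq = grads[0].shape[-1]
    joint = torch.eye(seq)
    for grad, rel in zip(grads, rels):
        # Phase 2: weight each head's relevance by its gradient, keep the
        # positive part, and average over the attention heads.
        cam = (grad * rel).clamp(min=0).mean(dim=0)
        # Phase 3: rollout -- account for the residual (identity) connection,
        # row-normalize, and aggregate across layers by matrix product.
        cam = cam + torch.eye(seq)
        cam = cam / cam.sum(dim=-1, keepdim=True)
        joint = cam @ joint
    return joint  # (seq, seq): relevance of every token w.r.t. every token

# Toy usage with random tensors standing in for real relevances/gradients:
layers, heads, seq = 4, 3, 6
grads = [torch.randn(heads, seq, seq) for _ in range(layers)]
rels = [torch.rand(heads, seq, seq) for _ in range(layers)]
joint = rollout_with_relevance(grads, rels)
print(joint.shape)  # torch.Size([6, 6])
```

Because each per-layer map is row-normalized before the product, the aggregated map stays row-stochastic, i.e. each token's outgoing relevance sums to one.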

Please see our Jupyter notebook, where you can run the two class-specific examples from the paper.


To add another input image, simply place it in the samples folder and call the generate_visualization function with your class of interest (using class_index={class_idx}); omitting the index visualizes the top-scoring class.

Credits

ViT implementation is based on:

BERT implementation is taken from the huggingface Transformers library: https://huggingface.co/transformers/

ERASER benchmark code adapted from the ERASER GitHub implementation: https://github.com/jayded/eraserbenchmark

Text visualizations in supplementary were created using TAHV heatmap generator for text: https://github.com/jiesutd/Text-Attention-Heatmap-Visualization

Reproducing results on ViT

Section A. Segmentation Results

Example:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=./:$PYTHONPATH python3 baselines/ViT/imagenet_seg_eval.py --method transformer_attribution --imagenet-seg-path /path/to/gtsegs_ijcv.mat

Link to download dataset.

In the example above we run a segmentation test with our method. You can choose which method to run via the --method argument, and you must provide the path to the ImageNet segmentation data via --imagenet-seg-path.
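To clarify what the segmentation test measures, here is a minimal sketch (not the repository's evaluation code, which reports additional metrics): the explanation map is thresholded into a binary foreground mask and scored against the ground-truth segmentation.

```python
import numpy as np

def segmentation_scores(relevance, gt_mask):
    """relevance: (H, W) float map; gt_mask: (H, W) boolean foreground mask."""
    pred = relevance > relevance.mean()   # threshold at the map's mean value
    pix_acc = (pred == gt_mask).mean()    # fraction of correctly labeled pixels
    inter = np.logical_and(pred, gt_mask).sum()
    union = np.logical_or(pred, gt_mask).sum()
    iou = inter / union if union else 1.0  # intersection-over-union
    return pix_acc, iou

# Toy example: a relevance map that is high exactly on the true object.
gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 2:6] = True
rel = gt.astype(float)                    # a perfect explanation map
acc, iou = segmentation_scores(rel, gt)
print(acc, iou)  # 1.0 1.0
```

An explanation that highlights the object more precisely yields higher pixel accuracy and IoU.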

Section B. Perturbation Results

Example:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=./:$PYTHONPATH python3 baselines/ViT/generate_visualizations.py --method transformer_attribution --imagenet-validation-path /path/to/imagenet_validation_directory

Notice that you can choose to visualize by target or top class by using the --vis-cls argument.

Now run the perturbation test with the following command:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=./:$PYTHONPATH python3 baselines/ViT/pertubation_eval_from_hdf5.py --method transformer_attribution

Notice that you can use the --neg argument to switch between positive and negative perturbation.
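For intuition, here is an illustrative sketch of how the perturbation order is derived from a relevance map (the actual evaluation additionally re-runs the model on each perturbed input and tracks the accuracy curve):

```python
import numpy as np

def perturbation_order(relevance, neg=False):
    """Return patch indices in the order they are masked out.

    Positive perturbation (neg=False) removes the MOST relevant patches
    first, so a faithful explanation makes accuracy drop quickly;
    negative perturbation (neg=True) removes the LEAST relevant first,
    so accuracy should stay high for longer.
    """
    order = np.argsort(relevance.ravel())  # ascending relevance
    return order if neg else order[::-1]

rel = np.array([0.1, 0.9, 0.4, 0.7])      # toy per-patch relevance scores
print(perturbation_order(rel))            # [1 3 2 0]
print(perturbation_order(rel, neg=True))  # [0 2 3 1]
```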

Reproducing results on BERT

  1. Download the pretrained weights:
  2. Download the dataset pkl file:
  3. Download the dataset:
  4. Now you can run the model.

Example:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=./:$PYTHONPATH python3 BERT_rationale_benchmark/models/pipeline/bert_pipeline.py --data_dir data/movies/ --output_dir bert_models/movies/ --model_params BERT_params/movies_bert.json

To control which algorithm is used for explanations, change the method variable in BERT_rationale_benchmark/models/pipeline/bert_pipeline.py (defaults to 'transformer_attribution', which is our method). Running this command creates a directory for the method in bert_models/movies/<method_name>.

To run the F1 test with k, run the following command:

PYTHONPATH=./:$PYTHONPATH python3 BERT_rationale_benchmark/metrics.py --data_dir data/movies/ --split test --results bert_models/movies/<method_name>/identifier_results_k.json

The method directory will also contain .tex files with the explanations extracted for each example; these correspond to our visualizations in the supplementary material.

Citing our paper

If you make use of our work, please cite our paper:

@InProceedings{Chefer_2021_CVPR,
    author    = {Chefer, Hila and Gur, Shir and Wolf, Lior},
    title     = {Transformer Interpretability Beyond Attention Visualization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {782-791}
}