MATK (Meme Analysis Toolkit) aims to train, analyze, and compare state-of-the-art vision-language models on various downstream meme tasks (e.g., hateful meme classification, attacked-group classification, and hateful meme explanation generation).
Features
- Provides a framework for training and evaluating different language and vision-language models on well-known hateful memes datasets.
- Allows for efficient experimentation and parameter tuning through modification of configuration files.
- Evaluates models using standard evaluation metrics such as Accuracy and AUROC.
- Supports visualization through TensorBoard integration, allowing users to easily view and analyze metrics in a user-friendly GUI.
| Dataset | BART | FLAVA | LXMERT | VisualBERT | Remarks |
|---|---|---|---|---|---|
| FHM | ✔ | ✔ | ✔ | ✔ | |
| Fine-grained FHM | ✔ | ✔ | ✔ | ✔ | Protected target and protected group not supported |
| HarMeme | ✔ | ✔ | ✔ | ✔ | |
| Harm-C + Harm-P | ✔ | ✔ | ✔ | ✔ | |
| MAMI | ✔ | ✔ | ✔ | ✔ | |
To get started, run the following command:
```bash
pip install -r requirements.txt
```
For installation instructions related to image feature extraction, inpainting, and captioning, please refer to the preprocessing directory.
This section covers how to use the toolkit to run model training and inference with the currently supported models and datasets. The toolkit uses the Hydra framework, which provides a composable and hierarchical configuration setup. Preconfigured settings for the existing models and datasets are available within the configs/experiments directory.
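For orientation, an experiment file under configs/experiments composes the dataset, datamodule, model, metric, and trainer configurations via Hydra's defaults list. The sketch below is illustrative only; the actual group entries and file names in the repository may differ:

```yaml
# @package _global_
# configs/experiments/fhm_finegrained/flava.yaml — an illustrative sketch of
# the Hydra composition pattern; the group names and entries here are
# assumptions, not the repository's exact layout.
defaults:
  - /dataset: fhm_finegrained
  - /datamodule: flava
  - /model: flava
  - /metric: classification
  - /trainer: default
  - _self_
```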
Although preconfigured settings for both models and datasets are provided, you will need to (1) download the datasets and (2) update the directory paths pointing to the datasets and their accompanying auxiliary information. Once you have downloaded a dataset, identify and update the respective configuration file under the dataset folder (e.g., fhm_finegrained).
Once you have identified the relevant experiment configuration (e.g., fhm_finegrained/flava), you can train the model using the following command:
```bash
python3 main.py \
    +experiment=fhm_finegrained/flava \
    action=fit
```
Similarly, you can run the model on your test set using the following command:
```bash
python3 main.py \
    +experiment=fhm_finegrained/flava \
    action=test
```
If you encounter issues stemming from hardware limitations or want to experiment with alternative hyperparameters, you can modify the settings through either (1) the composed configuration file or (2) the command line interface. For one-time overrides, use a command like the following:
```bash
python3 main.py \
    +experiment=fhm_finegrained/flava \
    action=fit \
    datamodule.batch_size=16 \
    trainer.accumulate_grad_batches=1 \
    model.optimizers.0.lr=2e-5
```
As researchers, you may wish to introduce and experiment with either new models or new datasets. MATK offers an intuitive and modular framework equipped with designated components to streamline such implementations.
The directory tree below outlines the core configuration folders and Python code involved in a composed experiment configuration.
```
MATK
├── configs
│   ├── dataset
│   ├── datamodule
│   ├── model
│   ├── metric
│   └── trainer
├── datasets
├── datamodules
└── models
```
To introduce a new dataset (e.g., fhm_finegrained), it is necessary to create the following files:
- datasets/fhm_finegrained.py
- configs/dataset/fhm_finegrained.yaml
The Python code handles (1) the loading of annotation files, (2) the loading of auxiliary files, and (3) dataset preprocessing (e.g., stopword removal, lowercasing). To establish a unified interface for diverse model types, including unimodal and multimodal models, three common base classes are introduced in datasets/base.py: "ImageBase", "FeatureBase", and "TextBase".
For most use cases, you can inherit from one of these three base classes and implement the required core functions:
- __len__(self)
- __getitem__(self, idx: int)
You can examine the existing implementations under the datasets folder for reference.
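As a rough sketch of what such a class can look like (the class name, annotation format, and base class below are illustrative assumptions, not MATK's actual interfaces; in MATK you would inherit from ImageBase, FeatureBase, or TextBase instead of torch.utils.data.Dataset):

```python
# A hypothetical sketch of datasets/my_memes.py; names and formats are
# assumptions for illustration only.
import json
from torch.utils.data import Dataset


class MyMemesDataset(Dataset):
    def __init__(self, annotation_filepath: str):
        # (1) Load the annotation file, assumed here to be JSON Lines
        # with one record per meme.
        with open(annotation_filepath) as f:
            self.annotations = [json.loads(line) for line in f]

    def __len__(self) -> int:
        # Number of memes in this split.
        return len(self.annotations)

    def __getitem__(self, idx: int) -> dict:
        record = self.annotations[idx]
        # (3) Light preprocessing, e.g., lowercasing the meme text.
        return {
            "id": record["id"],
            "text": record["text"].lower(),
            "label": record["label"],
        }
```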
The configuration file stores the filepaths to the dataset and the relevant auxiliary information. In essence, you are required to provide:
- annotation_filepaths (dict)
- image_dirs (dict)
- auxiliary_dicts (dict)
- feats_dir (dict)
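As a concrete (hypothetical) illustration, such a file might look like the following; the split names and paths are placeholders, not the toolkit's exact schema:

```yaml
# configs/dataset/fhm_finegrained.yaml — illustrative sketch; replace the
# split names and filepaths with your local paths.
annotation_filepaths:
  train: /path/to/fhm_finegrained/train.jsonl
  validate: /path/to/fhm_finegrained/dev.jsonl
  test: /path/to/fhm_finegrained/test.jsonl

image_dirs:
  train: /path/to/fhm_finegrained/images
  validate: /path/to/fhm_finegrained/images
  test: /path/to/fhm_finegrained/images

# Auxiliary information (e.g., captions) and precomputed image features,
# keyed by name; left empty here for brevity.
auxiliary_dicts: {}
feats_dir: {}
```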
To introduce a new model (e.g., flava), it is necessary to create the following files:
- models/flava.py
- configs/model/flava.yaml
The Python code controls (1) the model architecture and (2) the various model training stages (i.e., training, validation, and testing). Under the hood, we use PyTorch Lightning's LightningModule to handle these processes.
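A minimal sketch of such a module is shown below. The class name, architecture, batch format, and hyperparameters are hypothetical; MATK's actual modules wrap the corresponding pretrained models:

```python
# A hypothetical sketch of models/my_model.py, not MATK's implementation.
import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class MyMemeClassifier(pl.LightningModule):
    def __init__(self, encoder: torch.nn.Module, num_classes: int = 2, lr: float = 2e-5):
        super().__init__()
        self.encoder = encoder          # e.g., a pretrained vision-language backbone
        self.head = torch.nn.LazyLinear(num_classes)
        self.lr = lr

    def forward(self, batch: dict) -> torch.Tensor:
        features = self.encoder(batch)  # assumed to return pooled features
        return self.head(features)

    def training_step(self, batch: dict, batch_idx: int) -> torch.Tensor:
        loss = F.cross_entropy(self(batch), batch["label"])
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch: dict, batch_idx: int) -> None:
        self.log("val_loss", F.cross_entropy(self(batch), batch["label"]))

    def test_step(self, batch: dict, batch_idx: int) -> None:
        preds = self(batch).argmax(dim=-1)
        self.log("test_acc", (preds == batch["label"]).float().mean())

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)
```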
You can examine the existing implementations under the models folder for reference.
The configuration file defines the model classes and specifies the models' hyperparameters.
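For instance, assuming Hydra's _target_ instantiation convention, a model configuration might look like the sketch below; the class path and field names are placeholders, though the optimizers list mirrors the model.optimizers.0.lr override shown earlier:

```yaml
# configs/model/flava.yaml — hypothetical sketch; the class path and
# hyperparameter names are placeholders.
_target_: models.flava.FlavaClassificationModel
num_classes: 2
optimizers:
  - lr: 2.0e-5
```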
| AUROC | FHM | FHM Finegrained | HarMeme | MAMI |
|---|---|---|---|---|
| LXMERT | 0.689 (0.014) | 0.680 (0.007) | 0.818 (0.014) | 0.763 (0.007) |
| VisualBERT | 0.708 (0.014) | 0.672 (0.013) | 0.821 (0.015) | 0.779 (0.007) |
| FLAVA | 0.786 (0.009) | 0.765 (0.011) | 0.846 (0.015) | 0.803 (0.006) |
The AUROC scores are presented in the format average (std.dev), where both the average and standard deviation values are calculated across 10 random seeds, ranging from 1111 to 1120.
- The MATK package does not work with Python 3.10; please use Python 3.8 for now.
- Ming Shan HEE, Singapore University of Technology and Design (SUTD)
- Aditi KUMARESAN, Singapore University of Technology and Design (SUTD)
- Nirmalendu PRAKASH, Singapore University of Technology and Design (SUTD)
- Rui CAO, Singapore Management University (SMU)
- Prof. Roy Ka-Wei LEE, Singapore University of Technology and Design (SUTD)