# Hierarchical Visual Context Fusion Transformer

The source code for *Multimodal Relation Extraction via a Mixture of Hierarchical Visual Context Learners*.

## Data preprocessing

### MNRE dataset

Because the MNRE dataset is large, please download it from the original repository.

Unzip the data and rename the directory to `mnre`; it should be placed in the `data` directory. Create the working directories first:

```shell
mkdir data logs ckpt
```
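After the steps above, the working tree should look roughly like this (the contents of `mnre/` depend on the release you downloaded, so they are left unspecified):

```
.
├── data/
│   └── mnre/    # renamed MNRE dataset
├── logs/
└── ckpt/
```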

We also use the detected visual objects provided by previous work, which can be downloaded with the following commands:

```shell
cd data/
wget 120.27.214.45/Data/re/multimodal/data.tar.gz
tar -xzvf data.tar.gz
```

## Dependencies

Install all necessary dependencies:

```shell
pip install -r requirements.txt
```
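To confirm the installation succeeded, a quick check like the one below can verify that the key packages resolve. The package names listed are assumptions (typical for a multimodal Transformer codebase); adjust them to match `requirements.txt`:

```python
# Sanity check that core dependencies are importable after `pip install`.
# NOTE: the package names passed in below are assumptions, not taken from
# requirements.txt -- edit the list to match the actual file.
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found by the import system."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages(["torch", "transformers", "torchvision"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All core dependencies found.")
```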

## Training the model

The best hyperparameters we found are written in the `run_mre.sh` file.

You can simply run the bash script for multimodal relation extraction:

```shell
bash run_mre.sh
```