This repository is the PyTorch implementation of the paper:
Joint Wasserstein Autoencoders for Aligning Multimodal Embeddings (ICCV Workshops 2019)
Shweta Mahajan, Teresa Botschen, Iryna Gurevych and Stefan Roth
This repository is built on top of SCAN and VSE++ in PyTorch.
The code is written in Python 2.7.0 and requires CUDA 9.0.
Requirements:
- torch 0.3
- torchvision 0.3.0
- nltk 3.5
- gensim
- Punkt Sentence Tokenizer, downloaded via the interactive NLTK downloader:
import nltk
nltk.download()
> d punkt
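Alternatively, the tokenizer can be fetched non-interactively with the standard NLTK download call (equivalent to the d punkt step above):
python -c "import nltk; nltk.download('punkt')"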
To install requirements:
conda config --add channels pytorch
conda config --add channels anaconda
conda config --add channels conda-forge
conda config --add channels conda-forge/label/cf202003
conda create -n <environment_name> --file requirements.txt
conda activate <environment_name>
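For example, with an environment named jwae (the name is arbitrary):
conda create -n jwae --file requirements.txt
conda activate jwae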
- The preprocessed COCO and Flickr30K datasets used in the experiments are based on SCAN and can be downloaded at COCO_Precomp and F30k_Precomp. The downloaded datasets should be placed in the data folder.
- Run vocab.py to generate the vocabulary for the datasets:
python vocab.py --data_path data --data_name f30k_precomp
python vocab.py --data_path data --data_name coco_precomp
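A minimal sketch for inspecting the resulting vocabulary, assuming vocab.py pickles a Vocabulary object to vocab/coco_precomp_vocab.pkl as in VSE++ (both the output path and the pickle format are assumptions; adjust to what vocab.py actually writes):
import pickle
from vocab import Vocabulary  # the class must be importable so the pickle can be deserialized

# Assumed output location of vocab.py; change the path if your setup differs.
with open('vocab/coco_precomp_vocab.pkl', 'rb') as f:
    vocab = pickle.load(f)
print('Vocabulary size: %d' % len(vocab))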
A new JWAE model can be trained using the following:
python train.py --data_path "$DATA_PATH" --data_name coco_precomp --vocab_path "$VOCAB_PATH"
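For example, with the downloaded data in data and the generated vocabulary in vocab (both directory names are assumptions about the local layout):
DATA_PATH=data
VOCAB_PATH=vocab
python train.py --data_path "$DATA_PATH" --data_name coco_precomp --vocab_path "$VOCAB_PATH"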
The trained model can then be evaluated with the following Python snippet:
from vocab import Vocabulary
import evaluation
evaluation.evalrank("$CHECKPOINT_PATH", data_path="$DATA_PATH", split="test")
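The same evaluation can be run from the shell in one line; the checkpoint path below is a hypothetical example (the actual location depends on where train.py saves its checkpoints):
python -c "from vocab import Vocabulary; import evaluation; evaluation.evalrank('runs/coco_jwae/model_best.pth.tar', data_path='data', split='test')"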
If you find this code useful, please cite the paper:
@inproceedings{Mahajan:2019:JWA,
author = {Shweta Mahajan and Teresa Botschen and Iryna Gurevych and Stefan Roth},
booktitle = {ICCV Workshop on Cross-Modal Learning in Real World},
title = {Joint {W}asserstein Autoencoders for Aligning Multi-modal Embeddings},
year = {2019}
}