phrase_detection

A TensorFlow implementation of the phrase detection framework by Bryan Plummer (bplum@bu.edu), as described in "Revisiting Image-Language Networks for Open-ended Phrase Detection". This repository is based on the TensorFlow implementation of Faster R-CNN available here, which in turn was based on the Python Caffe implementation of Faster R-CNN available here.

Prerequisites

A basic TensorFlow installation. The code follows the r1.2 format.
Python packages you might not have: nltk cython opencv-python easydict==1.6 scikit-image pyyaml

Code was tested using python 2.7
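
Several of the packages listed above are imported under a different name than the one used to install them (for example, opencv-python is imported as cv2). The following is a minimal sketch for checking which of them are missing from your environment. The helper names import_name and missing_packages, and the PYTHON override variable, are hypothetical and not part of this repository.

```bash
# Map a pip package name (any ==version pin is stripped) to the
# module name used in a Python `import` statement.
import_name() {
    case "${1%%=*}" in
        opencv-python) echo cv2 ;;
        scikit-image)  echo skimage ;;
        pyyaml)        echo yaml ;;
        cython)        echo Cython ;;
        *)             echo "${1%%=*}" ;;
    esac
}

# Print each listed package whose module fails to import.
# PYTHON is a hypothetical override for the interpreter to check.
missing_packages() {
    local pkg
    for pkg in "$@"; do
        "${PYTHON:-python}" -c "import $(import_name "$pkg")" 2>/dev/null || echo "$pkg"
    done
}
```

Running `missing_packages nltk cython opencv-python easydict==1.6 scikit-image pyyaml` prints only the packages that still need to be installed, which you can then pass to pip.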

Installation

Clone the repository

git clone --recursive https://github.com/BryanPlummer/phrase_detection.git

We shall refer to the repo's root directory as $ROOTDIR

Update the -arch flag in the setup script to match your GPU

cd $ROOTDIR/lib
# Change the GPU architecture (-arch) if necessary
vim setup.py
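
If you are unsure which -arch value applies, it can usually be derived from the GPU's CUDA compute capability: capability 6.1 corresponds to sm_61, 7.5 to sm_75, and so on. The sketch below assumes an nvidia-smi recent enough to support the compute_cap query field; on older drivers you would need to look the capability up manually. The function names are hypothetical.

```bash
# Convert a CUDA compute capability such as "6.1" into the matching
# nvcc architecture name ("sm_61") by dropping the dot.
cap_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Print the -arch value for every visible GPU.
# Assumption: this nvidia-smi supports --query-gpu=compute_cap.
detect_arch() {
    nvidia-smi --query-gpu=compute_cap --format=csv,noheader |
        while read -r cap; do
            cap_to_arch "$cap"
        done
}
```

Calling `detect_arch` on the training machine prints one value per GPU, which you can then set as -arch in setup.py.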

Download the pretrained COCO models for the desired network, which were released in this repo. By default, the code assumes they have been unpacked in a directory called pretrained. For example, after downloading the res101 COCO models, you would use:

mkdir $ROOTDIR/pretrained
cd $ROOTDIR/pretrained
tar zxvf $DOWNLOADDIR/coco_900-1190k.tgz

Download a pretrained word embedding. By default, the code assumes you have downloaded the HGLMM 6K-D vectors from here and placed the unzipped file in the data directory. If you want to use a different word embedding, please update the pointer to the embedding file and its dimensions in lib/model/config.py. E.g.,

cd $ROOTDIR/data
unzip $DOWNLOADDIR/hglmm_6kd.zip

Download and unpack the Flickr30K Entities and ReferIt Game datasets and build the modules and vocabularies from $ROOTDIR using,

./data/scripts/fetch_datasets.sh

Train your own model

Assuming you completed the Installation setup correctly, you should be able to train a model with,

./experiments/scripts/train_phrase_detector.sh [GPU_ID] [DATASET] [NET] [TAG]
# GPU_ID is the GPU you want to train on
# NET in {vgg16, res50, res101, res152} is the network arch to use
# DATASET {flickr, referit} is defined in train_phrase_detector.sh
# TAG is an experiment name
# Examples:
./experiments/scripts/train_phrase_detector.sh 0 flickr res101 default
./experiments/scripts/train_phrase_detector.sh 1 referit res101 default

This will train the model without the augmented phrases. To train with augmented phrases, use:

./experiments/scripts/train_augmented_phrase_detector.sh [GPU_ID] [DATASET] [NET] [TAG]

Test and evaluate

You can test your models using,

./experiments/scripts/test_phrase_detector.sh [GPU_ID] [DATASET] [NET] [TAG]
# GPU_ID is the GPU you want to test on
# NET in {vgg16, res50, res101, res152} is the network arch to use
# DATASET {flickr, referit} is defined in test_phrase_detector.sh
# TAG is an experiment name
# Examples:
./experiments/scripts/test_phrase_detector.sh 0 flickr res101 default
./experiments/scripts/test_phrase_detector.sh 1 referit res101 default

Analogously, to test with augmented phrases use:

./experiments/scripts/test_augmented_phrase_detector.sh [GPU_ID] [DATASET] [NET] [TAG]
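
The train and test scripts take the same four arguments, so a typo in DATASET or NET is easy to carry from one step to the next. As a minimal sketch, the hypothetical helpers below check the arguments against the values listed in this README and print, without running, the matching train and test commands. They are not part of the repository, and the scripts themselves may accept or reject other values.

```bash
# Return 0 if DATASET and NET are values listed in this README, 1 otherwise.
check_args() {
    local dataset="$1" net="$2"
    case "$dataset" in
        flickr|referit) ;;
        *) echo "unknown DATASET: $dataset" >&2; return 1 ;;
    esac
    case "$net" in
        vgg16|res50|res101|res152) ;;
        *) echo "unknown NET: $net" >&2; return 1 ;;
    esac
}

# Print (without running) the train and test commands for one configuration.
plan_run() {
    local gpu="$1" dataset="$2" net="$3" tag="$4"
    check_args "$dataset" "$net" || return 1
    echo "./experiments/scripts/train_phrase_detector.sh $gpu $dataset $net $tag"
    echo "./experiments/scripts/test_phrase_detector.sh $gpu $dataset $net $tag"
}
```

For example, `plan_run 0 flickr res101 default` prints the two commands from the examples above, while `plan_run 0 imagenet res101 default` reports the unknown dataset and prints nothing else.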

By default, trained networks are saved under:

output/[NET]/[DATASET]/[TAG]/
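
To locate the saved networks for a particular run, the path can be assembled from the same arguments passed to the training script. This is a small sketch with a hypothetical helper name. It assumes the output directory is relative to $ROOTDIR, which this README does not state explicitly.

```bash
# Build the directory that holds the saved networks for one experiment,
# following the output/NET/DATASET/TAG/ layout described above.
model_dir() {
    local net="$1" dataset="$2" tag="$3"
    printf 'output/%s/%s/%s/\n' "$net" "$dataset" "$tag"
}
```

For instance, `ls "$ROOTDIR/$(model_dir res101 flickr default)"` would list the files saved by the first training example.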

Citation

If you find our code useful please consider citing:

@article{plummerPhrasedetection,
  title={Revisiting Image-Language Networks for Open-ended Phrase Detection},
  author={Bryan A. Plummer and Kevin J. Shih and Yichen Li and Ke Xu and Svetlana Lazebnik and Stan Sclaroff and Kate Saenko},
  journal={arXiv:1811.07212},
  year={2018}
}
