S2VT + Attention

This is a re-tooled fork of a discontinued repo by Sundrops.

The goal is to provide something basic for quick video captioning experiments in PyTorch that still implements an encoder-decoder framework and an attention mechanism, and that is reasonably up to date (supporting the latest PyTorch and CUDA 9, for starters).

News:

Click here for checkpoint data (OneDrive)

Click here for processed data (OneDrive)

save/: model checkpoints
data/: preprocessed features for datasets along with their JSON meta files.

For any model folder, just check opt_info.json in the folder for model configuration details.
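As a minimal sketch of that check, the snippet below loads opt_info.json from a model folder so its settings can be inspected. It assumes the file holds a single JSON object; the helper name `load_model_options` is illustrative and not part of this repo.

```python
import json
from pathlib import Path


def load_model_options(model_dir):
    """Return the contents of opt_info.json in model_dir as a dict."""
    opt_path = Path(model_dir) / "opt_info.json"
    with opt_path.open("r", encoding="utf-8") as f:
        return json.load(f)


def format_options(options):
    """Render the options as sorted 'key: value' lines for easy reading."""
    return "\n".join(f"{key}: {options[key]}" for key in sorted(options))
```

For example, `print(format_options(load_model_options("save/my_model")))` would list the configuration of a (hypothetical) model folder named save/my_model.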

Date	Announcement
Nov 8 2018	Pretrained model and NASNet-A Large features now available for MSVD. Trained for 1000 epochs. BLEU-4 = 0.38.
Oct 8 2020	Updated links for checkpoints and processed data.

Setup:

If you just want to caption some of your own videos, skip to step 5. To train and evaluate on datasets, start at step 1.

1. Data processing

Some scripts are borrowed from VideoToTextDNN.

First process frames of MSVD:

python process_frames.py /path/to/videos /path/to/write/frames/directories 0 n

Here n is the size of the dataset.
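If you are unsure what to pass for n, one option is to count the video files. This is only a sketch: it assumes that n is simply the number of video files sitting directly in /path/to/videos, and both the helper `count_videos` and the extension list are illustrative rather than part of this repo.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to match your copy of the dataset.
VIDEO_EXTENSIONS = {".avi", ".mp4", ".mkv", ".mov"}


def count_videos(video_dir, extensions=VIDEO_EXTENSIONS):
    """Count files directly inside video_dir whose extension looks like a video."""
    return sum(
        1
        for p in Path(video_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
```

The result of `count_videos("/path/to/videos")` can then be used as the last argument to process_frames.py.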

Next, process the features. This will take a while, depending on how many GPUs you have (nasnet-large shown):

python process_features.py /path/to/written/frames /path/to/write/features --type nasnetalarge

The batch size parameter -bs can be used to reduce the batch size if you run out of memory. The default of 8 uses around 10 GB. A smaller batch size uses less memory but takes longer.
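As a rough way to pick a starting value for -bs, the sketch below scales the reference point above (batch size 8, about 10 GB) linearly to the memory you have free. Linear scaling is an assumption, not a measurement: the model weights take a fixed amount of memory regardless of batch size, so on smaller GPUs the estimate can be optimistic. The helper `suggest_batch_size` is hypothetical.

```python
def suggest_batch_size(free_gb, ref_batch=8, ref_gb=10.0):
    """Estimate a batch size that fits in free_gb of GPU memory.

    Assumes memory use grows linearly with batch size, anchored at the
    reference point from the text (batch size 8 uses about 10 GB).
    """
    if free_gb <= 0:
        raise ValueError("free_gb must be positive")
    per_item_gb = ref_gb / ref_batch
    # Never suggest less than one item per batch.
    return max(1, int(free_gb // per_item_gb))
```

Treat the result as a starting point and lower it further if you still run out of memory.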

Now process the dataset file. We will use a pkl file from arctic-capgen-vid. Download Link

python process_dataset.py --gtdict /path/to/downloaded/youtube2text_iccv15/dict_movieID_caption.pkl
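Before running the script, it can help to confirm that the downloaded file loads and to look at a few entries. The sketch below assumes the file is a pickled dict and that it may have been written under Python 2, in which case Python 3 needs an explicit encoding. The helper names are illustrative, and the structure of the values is not documented here, so inspect it rather than rely on it. Only unpickle files from a source you trust.

```python
import pickle


def load_caption_dict(pkl_path):
    """Load a pickle file, retrying with latin1 for Python 2 era pickles."""
    with open(pkl_path, "rb") as f:
        try:
            return pickle.load(f)
        except UnicodeDecodeError:
            # Python 2 pickles containing byte strings need an explicit encoding.
            f.seek(0)
            return pickle.load(f, encoding="latin1")


def preview(caption_dict, limit=3):
    """Return the first `limit` (key, value) pairs for a quick look."""
    return list(caption_dict.items())[:limit]
```

For example, `preview(load_caption_dict("/path/to/downloaded/youtube2text_iccv15/dict_movieID_caption.pkl"))` returns a small sample of entries.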

2. Patches

Before continuing, run a quick but important test of coco-caption, the caption evaluation scorer.

Enter coco-caption/pycocoevalcap/meteor and try running:

java -jar -Xmx2G meteor-1.5.jar -- -stdio -l en -norm

You might get an error complaining about a data file. To fix this, re-download the .gz file it expects from https://github.com/lichengunc/refer/tree/master/evaluation/meteor/data into coco-caption/pycocoevalcap/meteor/data/

If the error is Java-related, make sure you have a working Java installation first, since coco-caption relies on it.
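Both conditions can be checked up front. The following is a small sketch, not part of the repo: `java_available` looks for a `java` executable on PATH and runs `java -version`, and `meteor_data_files` lists whatever .gz files are present in the METEOR data directory. The helper names and the default directory argument are assumptions for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def java_available():
    """Return True if a `java` executable is on PATH and `java -version` succeeds."""
    java = shutil.which("java")
    if java is None:
        return False
    try:
        result = subprocess.run([java, "-version"], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def meteor_data_files(meteor_dir="coco-caption/pycocoevalcap/meteor"):
    """List the .gz files in the METEOR data directory (empty if none or missing)."""
    data_dir = Path(meteor_dir) / "data"
    if not data_dir.is_dir():
        return []
    return sorted(p.name for p in data_dir.glob("*.gz"))
```

If `java_available()` returns False, fix the Java installation first. If `meteor_data_files()` returns an empty list, the data file still needs to be downloaded.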

3. Training

Use train.py; see opts.py for the available arguments.

4. Eval

Use eval.py; see the bottom of the script for the available arguments.

5. Inference

Use sample.py; see the bottom of the script for the available arguments.

