SS-VideoCaptioning

This repository contains the TensorFlow implementation of our model, "Semantically Sensible Video Captioning (SSVC)".
[Code] [Paper] [ArXiv]

Authors

Md. Mushfiqur Rahman, Thasin Abedin, Khondokar S. S. Prottoy, Ayana Moshruba, Fazlul Hasan Siddiqui

Requirements

Install the following dependencies before running the model:

TensorFlow 2.0 (install)
tqdm: pip install tqdm
sklearn: pip install -U scikit-learn
nltk: pip install nltk
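
Before opening the notebook, it can help to confirm that the packages listed above are visible to the Python interpreter. The snippet below is a minimal sketch, not part of the repository: the function name `find_missing_modules` and the `REQUIRED_MODULES` list are hypothetical, and the check only looks the modules up without importing them or verifying their versions.

```python
import importlib.util

# Import names for the packages listed above (an assumption of this sketch).
# scikit-learn is imported as "sklearn", which differs from its pip name.
REQUIRED_MODULES = ["tensorflow", "tqdm", "sklearn", "nltk"]


def find_missing_modules(module_names):
    """Return the names from module_names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

An empty result only means the modules can be found; it does not confirm that the installed TensorFlow is a 2.x release.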

Directory structure

-root
  -glove.6B.100d.txt
  -MSVD_captions.csv
  -models_and_utils
    -models.py
    -utils.py
  -data_pickle
    -train
      -filename1.pkl
      -filename2.pkl
      ...
    -test
      -filename1.pkl
      -filename2.pkl
      ...
    -validation
      -filename1.pkl
      -filename2.pkl
      ...
    -train.csv
    -test.csv
    -validation.csv
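
The layout above can be checked with a short script before training. This is a sketch under assumptions, not code from the repository: `find_missing_entries` is a hypothetical helper, and the name of the pickle folder is passed as an argument so that it matches whatever the folder is called on disk. It only checks that the expected files and folders exist, not that the pickles are valid.

```python
from pathlib import Path


def find_missing_entries(root, data_dir):
    """Return relative paths from the expected layout that are absent.

    root     -- repository root folder
    data_dir -- name of the folder holding the train/test/validation pickles
    """
    root = Path(root)
    data = root / data_dir

    expected_files = [
        root / "glove.6B.100d.txt",
        root / "MSVD_captions.csv",
        root / "models_and_utils" / "models.py",
        root / "models_and_utils" / "utils.py",
    ]
    expected_dirs = []
    for split in ("train", "test", "validation"):
        expected_dirs.append(data / split)            # folder of .pkl files
        expected_files.append(data / f"{split}.csv")  # index file per split

    missing = [str(p.relative_to(root)) for p in expected_files if not p.is_file()]
    missing += [str(p.relative_to(root)) for p in expected_dirs if not p.is_dir()]
    return missing
```

Call it with the repository root and the pickle folder name; an empty list means every expected entry was found.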

Train and Evaluate

Download and extract 'glove.6B.100d.txt' (link).
Download the MSVD dataset and create the corresponding pickle files using vid2frames.ipynb. Split the data into train, test, and validation sets.

Alternate step: Download and extract 'data_pickle.zip'. This compressed file already contains the pickle files of the MSVD dataset.
Run the train.ipynb file.

This file has a detailed list of options. Change the options to adjust the model to your requirements.
The training and evaluation code is inside the notebook.
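
For readers unfamiliar with the GloVe file used in the first step: each line of 'glove.6B.100d.txt' holds a word followed by its 100 space-separated vector components. The sketch below shows one way to read it into a dictionary. It is an illustration only; `load_glove` is a hypothetical name, and the notebook may load the embeddings differently.

```python
def load_glove(path, expected_dim=100):
    """Parse a GloVe text file into a dict mapping word -> list of floats.

    Lines whose vector length differs from expected_dim are skipped.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if len(values) != expected_dim:
                continue  # blank or malformed line
            embeddings[word] = [float(v) for v in values]
    return embeddings
```

Words from the captions that are absent from the returned dictionary need a fallback vector; how that is handled is a modelling choice this sketch does not make.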

Sample Outputs

SSVC: "A woman is cutting a piece of meat"
GT: "a woman is cutting into the fatty areas of a pork chop"
SS score: 1.0, BLEU1: 1.0, BLEU2: 1.0, BLEU3: 1.0, BLEU4: 1.0

SSVC: "A person is slicing tomato"
GT: "Someone wearing blue rubber gloves is slicing a tomato with a large knife"
SS score: 0.825, BLEU1: 1.0, BLEU2: 1.0, BLEU3: 1.0, BLEU4: 1.0

SSVC: "A woman is cutting a piece of meat"
GT: "a woman is cutting into the fatty areas of a pork chop"
SS score: 0.94, BLEU1: 1.0, BLEU2: 0.84, BLEU3: 0.61, BLEU4: 0.0
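
For context on the BLEU columns above: BLEU-n combines the clipped n-gram precisions for orders 1 to n through a geometric mean and multiplies the result by a brevity penalty. The sketch below is a simplified, unsmoothed, pure-Python version for a single caption, assuming lowercased whitespace tokenization and at least one reference. It is not the scoring code from the notebook and is not expected to reproduce the numbers above, which may be computed against several reference captions. The SS score is the paper's own metric and is not covered here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n):
    """Unsmoothed sentence-level BLEU-max_n with uniform weights."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate is shorter than n tokens

        # Clip each candidate n-gram by its highest count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram])
                      for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # no overlap at this order; unsmoothed score is zero
        log_precision_sum += math.log(clipped / total)

    # Brevity penalty against the reference closest in length to the candidate.
    closest = min((len(r) for r in refs),
                  key=lambda length: (abs(length - len(cand)), length))
    penalty = 1.0 if len(cand) > closest else math.exp(1 - closest / len(cand))
    return penalty * math.exp(log_precision_sum / max_n)
```

Because no smoothing is applied, a caption with no matching 4-gram scores 0 for BLEU4 regardless of how well the lower orders match.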

Please cite the following:

@article{rahman2021video,
  title={Video captioning with stacked attention and semantic hard pull},
  author={Rahman, Md Mushfiqur and Abedin, Thasin and Prottoy, Khondokar SS and Moshruba, Ayana and Siddiqui, Fazlul Hasan},
  journal={PeerJ Computer Science},
  volume={7},
  pages={e664},
  year={2021},
  publisher={PeerJ Inc.}
}
