SS-VideoCaptioning

This repository contains the implementation code for our paper "Semantically Sensible Video Captioning (SSVC)", written in TensorFlow 2. The model was trained on Google Cloud Platform without a GPU. The steps to run the code are as follows:

  • Run the vid2frames.ipynb notebook to convert the videos of the MSVD dataset into successive image frames; the frames are stored in the "framed_dataset" directory, and a CSV file listing the frame names is generated (see the first sketch after this list).
  • Pass the extracted frames through the VGG16 architecture, save the resulting features as pickle files, and store them in a folder named "pickle_files" (see the second sketch below).
  • Download "glove.6B.100d.txt" to use as the pretrained embedding layer (see the embedding sketch below).
  • Once the required files are in place, train the video captioning model by running the train.ipynb notebook. The code is flexible enough to reproduce the different architecture combinations reported in the paper by changing options in train.ipynb, for example the encoder and decoder type, the number of encoder layers, the temporal and maximum caption length, and the spatial hard pull units (join_seq_out); an illustrative snippet is given further below. However, due to the lack of a GPU and other computational constraints, we kept some of these parameters constant in our study, as described in the paper.
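
The frame-extraction step roughly corresponds to the sketch below. It is a minimal, illustrative example only, assuming OpenCV is installed; the video directory, frame-naming scheme, and CSV layout are placeholders rather than the exact ones used in vid2frames.ipynb.

```python
# Illustrative sketch of step 1: extract frames from MSVD videos with OpenCV.
# "YouTubeClips" and "frames.csv" are placeholder names; "framed_dataset" follows the README.
import os
import csv
import cv2

VIDEO_DIR = "YouTubeClips"    # hypothetical location of the MSVD video files
FRAME_DIR = "framed_dataset"
os.makedirs(FRAME_DIR, exist_ok=True)

rows = []
for video_name in sorted(os.listdir(VIDEO_DIR)):
    cap = cv2.VideoCapture(os.path.join(VIDEO_DIR, video_name))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame_name = f"{os.path.splitext(video_name)[0]}_frame{idx:04d}.jpg"
        cv2.imwrite(os.path.join(FRAME_DIR, frame_name), frame)
        rows.append([video_name, frame_name])
        idx += 1
    cap.release()

# Write the CSV listing the generated frame names
with open("frames.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["video", "frame"])
    writer.writerows(rows)
```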

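Step 2 (extracting VGG16 features and pickling them) might look like the following sketch. The use of the fc2 layer as the frame feature and the per-video pickle naming are assumptions for illustration, not necessarily the exact setup used in the paper.

```python
# Illustrative sketch of step 2: run frames through VGG16 and pickle the features.
# The fc2 feature layer and the file naming are assumptions for demonstration purposes.
import os
import pickle
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image

base = VGG16(weights="imagenet")
# Use an intermediate fully connected layer of VGG16 as the frame feature extractor
feature_model = tf.keras.Model(base.input, base.get_layer("fc2").output)

os.makedirs("pickle_files", exist_ok=True)

def extract_video_features(frame_paths, out_name):
    """Extract one feature vector per frame and dump them as a single pickle file."""
    feats = []
    for p in frame_paths:
        img = image.load_img(p, target_size=(224, 224))
        x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
        feats.append(feature_model.predict(x, verbose=0)[0])
    with open(os.path.join("pickle_files", out_name + ".pkl"), "wb") as f:
        pickle.dump(np.array(feats), f)
```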
Note: You can skip the first two steps if you use the provided pickle files and CSV files along with our code to save time.
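
For step 3, a common way to turn glove.6B.100d.txt into the weight matrix of a non-trainable Keras Embedding layer is sketched below; the word_index handling is illustrative and depends on how the captions are tokenized in train.ipynb.

```python
# Illustrative sketch of step 3: build an embedding matrix from glove.6B.100d.txt.
import numpy as np

EMBEDDING_DIM = 100

def load_glove(path="glove.6B.100d.txt"):
    """Map each word to its 100-dimensional GloVe vector."""
    index = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            index[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return index

def build_embedding_matrix(word_index, glove_index):
    # word_index maps caption tokens to integer ids (e.g. from a Keras Tokenizer);
    # rows for words missing from GloVe are left as zeros.
    matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))
    for word, i in word_index.items():
        vec = glove_index.get(word)
        if vec is not None:
            matrix[i] = vec
    return matrix  # typically passed to an Embedding layer with trainable=False
```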

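Purely for illustration, the kind of options mentioned in the last step could be grouped as below. Apart from join_seq_out, which is named in the list above, the option names and values here are hypothetical; the actual switches live in train.ipynb.

```python
# Hypothetical illustration of the architecture options described above;
# the real option names and defaults are defined in train.ipynb.
options = {
    "encoder_type": "BiLSTM",     # encoder variant to compare
    "decoder_type": "LSTM",       # decoder variant to compare
    "num_encoder_layers": 1,      # number of stacked encoder layers
    "max_caption_length": 10,     # temporal / maximum caption length
    "join_seq_out": 512,          # spatial hard pull units (named in the list above)
}
```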