A deep learning model that generates a caption for a given image.
See report.pdf for more details about the architecture of the model.
VGG16 and InceptionV3 models are used to extract features from the image.
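As a rough illustration of the feature-extraction step, the sketch below builds a VGG16-based extractor with Keras and turns one image into a feature vector. It is not taken from this repository: the function names `build_extractor` and `extract_features` are hypothetical, and the choice of the `fc2` layer (a 4096-dimensional output) is an assumption, since the layer actually used is described in report.pdf rather than here.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.utils import img_to_array, load_img


def build_extractor(weights="imagenet"):
    """Return VGG16 truncated at its second fully connected layer.

    Assumption: features are taken from the 'fc2' layer; the repository
    may use a different layer.
    """
    base = VGG16(weights=weights)
    return Model(inputs=base.input, outputs=base.get_layer("fc2").output)


def extract_features(extractor, image_path):
    """Load an image, preprocess it for VGG16, and return a 1-D feature vector."""
    image = load_img(image_path, target_size=(224, 224))  # VGG16 input size
    batch = np.expand_dims(img_to_array(image), axis=0)   # shape (1, 224, 224, 3)
    batch = preprocess_input(batch)                        # VGG16-specific scaling
    return extractor.predict(batch, verbose=0)[0]
```

Calling `extract_features(build_extractor(), "dog.jpg")` would return a vector of length 4096 under these assumptions. An InceptionV3 variant would follow the same pattern with its own `preprocess_input` and a 299x299 input size.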
To just see the model's output, download the Caption folder, which contains a pre-trained model.
To train the model from scratch, download the Flickr datasets:
Flickr 8k
Flickr 30k
cd Caption/
python3 caption.py dog.jpg
cd Training/
python3 features.py  # Saves the extracted features to features.pkl
python3 train.py
The first time you train the model, Keras downloads the model weights from the Internet, which are about 500 MB. This may take a few minutes.
If you don't want to train the model, just download the Caption folder to save time.
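The features.py step in the training commands stores the extracted features in features.pkl so that train.py does not have to recompute them. A minimal sketch of that save/load round trip with Python's pickle module is shown below. The helper names and the dictionary layout (image file name mapped to a feature vector) are assumptions for illustration, not the repository's actual format.

```python
import pickle


def save_features(features, path="features.pkl"):
    """Write a {image_name: feature_vector} mapping to disk (assumed layout)."""
    with open(path, "wb") as f:
        pickle.dump(features, f)


def load_features(path="features.pkl"):
    """Read the mapping back, e.g. at the start of training."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

For example, `save_features({"dog.jpg": [0.1, 0.2]})` followed by `load_features()` returns the same dictionary. Only load pickle files you created yourself or trust, since unpickling can execute arbitrary code.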