yudi09/pytorch-image-captioning

pytorch-image-captioning

Abstract

In this project, I have implemented an end-to-end deep learning model for image captioning. The architecture consists of an encoder network and a decoder network. The encoder is one of several pre-trained CNN architectures and produces the image embedding. The decoder is an LSTM network with uninitialized (that is, not pre-trained) word embeddings.
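To make the encoder/decoder split concrete, here is a minimal PyTorch sketch. It is not the code from this repository: the class names, the tiny stand-in backbone, and the choice to feed the image embedding as the first LSTM input step are all assumptions made for illustration. In practice the backbone would be one of the pre-trained CNNs with its classifier removed.

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Wraps a CNN backbone and projects its features to embedding_dim."""

    def __init__(self, backbone, feature_dim, embedding_dim=512):
        super().__init__()
        self.backbone = backbone  # assumed to return (batch, feature_dim)
        self.project = nn.Linear(feature_dim, embedding_dim)

    def forward(self, images):
        return self.project(self.backbone(images))  # (batch, embedding_dim)


class Decoder(nn.Module):
    """LSTM decoder with word embeddings learned from scratch."""

    def __init__(self, vocab_size, embedding_dim=512, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embedding_dim)  # random init
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_embedding, captions):
        words = self.embed(captions)  # (batch, T, embedding_dim)
        # Assumption: the image embedding is the first input step.
        inputs = torch.cat([image_embedding.unsqueeze(1), words], dim=1)
        hidden, _ = self.lstm(inputs)  # (batch, T + 1, hidden_dim)
        return self.out(hidden)  # (batch, T + 1, vocab_size)


# Tiny stand-in backbone so the sketch runs without downloading weights.
tiny_backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
encoder = Encoder(tiny_backbone, feature_dim=8)
decoder = Decoder(vocab_size=100)

images = torch.randn(2, 3, 64, 64)
captions = torch.randint(0, 100, (2, 7))
logits = decoder(encoder(images), captions)
print(logits.shape)
```

The decoder returns one score vector over the vocabulary per input step, which is the shape a cross-entropy loss over next-word targets would expect.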

Requirements

  1. Python 3.6
  2. pytorch
  3. torchvision
  4. pillow
  5. nltk
  6. pickle (part of the Python standard library, no installation needed)
  7. CUDA version 9.0/9.1
  8. cuDNN >=7.0

pip install http://download.pytorch.org/whl/cu90/torch-0.4.0-cp36-cp36m-linux_x86_64.whl torchvision pillow nltk

Dataset

Flickr8K
#train : 6000
#dev : 1000
#test : 1000
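
As an illustration of how the split could be loaded, the sketch below assumes the standard Flickr8k text distribution: a caption file whose lines look like `image.jpg#0<TAB>caption`, and one split file per subset listing one image name per line. The function names and the sample image names are hypothetical and are not taken from this repository.

```python
from collections import defaultdict


def parse_captions(lines):
    """Group captions by image name from lines like 'img.jpg#0<TAB>caption'."""
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, caption = line.split("\t", 1)
        image_name = key.split("#", 1)[0]  # drop the '#<caption index>' suffix
        captions[image_name].append(caption)
    return dict(captions)


def select_split(captions, split_names):
    """Keep only the images listed in a split file (one name per line)."""
    wanted = {name.strip() for name in split_names if name.strip()}
    return {name: caps for name, caps in captions.items() if name in wanted}


# Hypothetical in-memory stand-ins for the caption file and a split file.
token_lines = [
    "dogs.jpg#0\tTwo dogs are wrestling in the grass",
    "dogs.jpg#1\tTwo puppies play in the grass",
    "boat.jpg#0\tA boat on the water .",
]
train_names = ["dogs.jpg\n"]

all_captions = parse_captions(token_lines)
train = select_split(all_captions, train_names)
print(train)
```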

Instructions to run the code

1. Pre-Processing

python3 Preprocess.py
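
The predicted captions in the Results section contain `<unk>` tokens, which suggests that pre-processing maps rare words to an unknown-word token. The sketch below shows one common way to do that. It is an assumption, not the contents of `Preprocess.py`: the frequency threshold, the special tokens, and the whitespace tokenizer (used here instead of nltk to stay dependency-free) are all illustrative.

```python
from collections import Counter

SPECIALS = ("<pad>", "<start>", "<end>", "<unk>")


def build_vocab(captions, min_count=2):
    """Map every word seen at least min_count times to an integer id."""
    counts = Counter(w for caption in captions for w in caption.lower().split())
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {token: idx for idx, token in enumerate(list(SPECIALS) + words)}


def encode(caption, vocab):
    """Convert a caption to ids, replacing out-of-vocabulary words by <unk>."""
    unk = vocab["<unk>"]
    ids = [vocab.get(w, unk) for w in caption.lower().split()]
    return [vocab["<start>"]] + ids + [vocab["<end>"]]


train_captions = [
    "A boat on the water .",
    "A lone boat sitting in the water .",
]
vocab = build_vocab(train_captions)
encoded = encode("A white boat on the water .", vocab)
print(encoded)  # [1, 5, 3, 6, 3, 7, 8, 4, 2]
```

Here "white" never appears in the training captions and "on" appears only once, so both are encoded as `<unk>` (id 3).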

2. Train

python3 train.py -model <encoder_architecture> -dir <train_dir_path> -save_iter <model_checkpoint> -learning_rate <learning_rate> -epoch <re-train_epoch> -gpu_device <gpu_device_number> -hidden_dim <lstm_hidden_state_dim> -embedding_dim <encoder_output>
args:

-model : one of the CNN architectures: alexnet, resnet18, resnet152, vgg, inception, squeeze, dense
-dir : path to the training directory
-save_iter : save a model checkpoint every N iterations, default = 10
-learning_rate: default = 1e-5
-epoch : saved checkpoint epoch from which to resume training
-gpu_device : GPU device number, for servers with multiple GPUs installed
-hidden_dim : size of the LSTM hidden state, default = 512
-embedding_dim: output dimension of the CNN encoder, default = 512
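
The flags and defaults listed above map naturally onto `argparse`. The following is a sketch of such a parser, not the parser in `train.py`: which flags are required, the argument types, and the `None` defaults for `-epoch` and `-gpu_device` are assumptions.

```python
import argparse

ENCODERS = ["alexnet", "resnet18", "resnet152", "vgg", "inception", "squeeze", "dense"]


def build_parser():
    """Build a parser mirroring the documented training flags."""
    parser = argparse.ArgumentParser(description="Train the image captioning model")
    parser.add_argument("-model", choices=ENCODERS, required=True,
                        help="CNN encoder architecture")
    parser.add_argument("-dir", required=True, help="training directory path")
    parser.add_argument("-save_iter", type=int, default=10,
                        help="checkpoint interval in iterations")
    parser.add_argument("-learning_rate", type=float, default=1e-5)
    parser.add_argument("-epoch", type=int, default=None,
                        help="checkpoint epoch to resume from (assumed optional)")
    parser.add_argument("-gpu_device", type=int, default=None,
                        help="GPU device number (assumed optional)")
    parser.add_argument("-hidden_dim", type=int, default=512)
    parser.add_argument("-embedding_dim", type=int, default=512)
    return parser


args = build_parser().parse_args(["-model", "resnet18", "-dir", "data/train"])
print(args.model, args.save_iter, args.learning_rate, args.hidden_dim)
```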

3. Test

python3 test.py -model <encoder_architecture> -i <image_path> -epoch <saved_model> -gpu_device <gpu_device_number>
args:

-i : image path for generating caption
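
This README does not say how `test.py` turns decoder outputs into a caption. A common choice is greedy decoding, sketched below under that assumption. `step_fn` is a hypothetical stand-in for one LSTM step conditioned on the image; here it is replaced by a toy lookup table so the example runs on its own.

```python
def greedy_decode(step_fn, start_id, end_id, max_len=20):
    """Pick the highest-scoring token at every step until <end> or max_len.

    step_fn(prev_id, state) must return (scores, new_state), where scores
    holds one score per vocabulary id.
    """
    tokens = []
    prev, state = start_id, None
    for _ in range(max_len):
        scores, state = step_fn(prev, state)
        prev = max(range(len(scores)), key=scores.__getitem__)  # argmax
        if prev == end_id:
            break
        tokens.append(prev)
    return tokens


# Toy stand-in for the decoder: a fixed next-token table.
VOCAB = ["<start>", "<end>", "a", "brown", "dog"]
NEXT = {0: 2, 2: 3, 3: 4, 4: 1}


def toy_step(prev_id, state):
    scores = [0.0] * len(VOCAB)
    scores[NEXT[prev_id]] = 1.0
    return scores, state


ids = greedy_decode(toy_step, start_id=0, end_id=1)
caption = " ".join(VOCAB[i] for i in ids)
print(caption)  # a brown dog
```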

Download trained model: trained for ~24 hours (230 iterations) on a single NVIDIA GTX 1080 (8GB) GPU.

Results

As a sanity check, verify that the model is able to learn by overfitting it on a small dataset.

Screen Shot

Since the training error is decreasing, the model appears to be working correctly.
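
The overfitting check can be sketched in a few lines of PyTorch. This is a generic illustration, not code from this repository: the small classifier and the random batch are stand-ins for the captioning model and a handful of training examples, and the optimizer settings are assumptions.

```python
import torch
import torch.nn as nn


def overfit_small_batch(model, inputs, targets, steps=200, lr=1e-2):
    """Train on one small fixed batch and return the loss at every step."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)
# Stand-in model and data: 8 random samples, 5 classes.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 5))
inputs = torch.randn(8, 16)
targets = torch.randint(0, 5, (8,))

losses = overfit_small_batch(model, inputs, targets)
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

If the loss on such a tiny batch does not fall steadily, the usual suspects are the data pipeline, the loss computation, or the optimizer setup rather than model capacity.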

Train vs validation loss

Screen Shot

Image, Original Captions, Predicted Captions

Image: Screen Shot
Original Captions:
  1. a beagle and a golden retriever wrestling in the grass
  2. Two dogs are wrestling in the grass
  3. Two puppies are playing in the green grass
  4. two puppies playing around in the grass
  5. Two puppies play in the grass
Predicted Captions (numbered by training iteration):
  50. a brown and white dog is running through a grassy field .
  100. a brown dog in a field .
  150. a brown dog is running through a grassy field .
  200. a brown and white dog is laying with its mouth open and people up in the grass .
  230. a brown dog running through grass .

Image: Screen Shot
Original Captions:
  1. a brightly decorated bicycle with cart with people walking around in the background
  2. A street vending machine is parked while people walk by
  3. A street vendor on the corner of a busy intersection
  4. People on the city street walk past a puppet theater
  5. People walk around a mobile puppet theater in a big city .
Predicted Captions (numbered by training iteration):
  50. a man with a green shirt is standing in front of a <unk> at a <unk> .
  100. a group of people standing outside a building .
  150. a group of people standing around a outside of building .
  200. a group of people are standing around a city street .
  230. a man in a green shirt <unk> a <unk> at a carnival .

Image: Screen Shot
Original Captions:
  1. A boat is on the water , with mountains in the background .
  2. A boat on the water .
  3. A lone boat sitting in the water .
  4. A white boat on glassy water with mountains in the background .
  5. This is a boat on the water with mountains in the background .
Predicted Captions (numbered by training iteration):
  0. a man is on a <unk> .
  30. a person on a surfboard is standing on a beach .
  130. a person is standing on a mountain and overlooking the ocean .
  230. a person is standing on a rock and overlooking the ocean .

Image: Screen Shot
Original Captions:
  1. A woman climbs up a cliff.
  2. A woman rock climber scales a cliff far above pastures .
  3. A woman rock-climbing on a cliff .
  4. A woman rock-climbs in a rural area .
  5. Woman climbing a cliff in a rural area
Predicted Captions (numbered by training iteration):
  0. a man in a red and a <unk> is on a <unk>
  30. a man in a red shirt is climbing a rock .
  130. a man in a red shirt is rock climbing .
  230. a man in a red shirt and green pants climbs a rock cliff .

Image: Screen Shot
Original Captions:
  1. Hikers cross a bridge over a fast moving stream and rocky scenery .
  2. People crossing a long bridge over a canyon with a river .
  3. People walk across a rope bridge over a rocky stream .
  4. Some hikers are crossing a wood and wire bridge over a river .
  5. Three people are looking across a rope and wood bridge over a river .
Predicted Captions (numbered by training iteration):
  0. a man in a red of a <unk> .
  30. a person in a blue jacket is jumping in the snow .
  130. a person in the snow .
  230. a person on a snowboard in the air

Image: Screen Shot
Original Captions:
  1. Two men in ethnic dress standing in a barren landscape .
  2. Two men in keffiyahs stand next to car in the desert and wave at a passing vehicle .
  3. Two men in robes wave at an approaching jeep traveling through the sand .
  4. Two men in traditional Arab dress standing near a car wave at an SUV in the desert .
  5. Two people with head coverings stand in a sandy field .
Predicted Captions (numbered by training iteration):
  0. a man in a red and a white and a dog is on a <unk> .
  30. a man and a woman are standing on a bench in a park .
  130. a man and a woman dressed in <unk> are walking along a dirt road .
  230. a man holding a camera and a woman is walking with her hands on a jumping away from a

Image: Screen Shot
Original Captions:
  1. A man mountain climbing up an icy mountain .
  2. An climber is ascending an ice covered rock face .
  3. A person in orange climbs a sheer cliff face covered in snow and ice .
  4. Person in a yellow jacket is climbing up snow covered rocks .
  5. There is a climber scaling a snowy mountainside .
Predicted Captions (numbered by training iteration):
  0. a dog is in the water .
  30. a man in a yellow shirt is standing in front of a waterfall .
  130. a lone climber walks along a rocky path with mountains in the background .
  230. a man climbing a huge mountain .

Image: Screen Shot
Original Captions:
  1. A boy with a stick kneeling in front of a goalie net
  2. A child in a red jacket playing street hockey guarding a goal .
  3. A young kid playing the goalie in a hockey rink .
  4. A young male kneeling in front of a hockey goal with a hockey stick in his right hand .
  5. Hockey goalie boy in red jacket crouches by goal , with stick .
Predicted Captions (numbered by training iteration):
  0. a man in a red shirt and a red and a white dog is on a <unk> .
  30. aa man and a woman are sitting on a red bench .
  130. a man in a red shirt and a white helmet is sitting on a red leash .
  230. a man in a red shirt and blue jeans is sitting on a green wall .

Image: Screen Shot
Original Captions:
  1. A group of eight people are gathered around a table at night .
  2. A group of people gathered around in the dark .
  3. A group of people sit around a table outside on a porch at night .
  4. A group of people sit outdoors together at night .
  5. A group of people sitting at a table in a darkened room .
Predicted Captions (numbered by training iteration):
  0. a man in a <unk> .
  30. a man is sitting on a bench in front of a crowd .
  130. a man in a <unk> room with his closeup of two women .
  230. a group of people are standing in front of a large window .
