Welcome to the Medical Image Captioning Tool repository!
This repository contains the documentation, design specifications, implementation details, and related tools for an image captioning tool that generates natural language captions for chest X-ray images.
You can find the official model implementation in this Kaggle notebook: Link
The model architecture is inspired by "Show and Tell" by Vinyals et al. [1] and is built with the TensorFlow library.
The overall model consists of two stages (a minimal sketch of how they fit together follows this list):
1- Image feature extraction, using CheXNet.
2- Caption generation, using an LSTM.
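For orientation, here is a minimal sketch of how such a two-stage model could be wired together in TensorFlow/Keras using a merge-style decoder. The vocabulary size, caption length, and layer widths below are illustrative assumptions, not the exact values from the notebook:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE = 5000   # report vocabulary size (assumption)
MAX_LEN = 40        # maximum caption length in tokens (assumption)
FEATURE_DIM = 1024  # size of the pooled CheXNet feature vector

# Stage 1 output: a precomputed CheXNet feature vector per image.
image_input = layers.Input(shape=(FEATURE_DIM,), name="chexnet_features")
img_embed = layers.Dense(256, activation="relu")(image_input)

# Stage 2: an LSTM encodes the partial caption, the two representations
# are merged, and a softmax predicts the next word.
caption_input = layers.Input(shape=(MAX_LEN,), name="caption_tokens")
txt_embed = layers.Embedding(VOCAB_SIZE, 256, mask_zero=True)(caption_input)
txt_encoded = layers.LSTM(256)(txt_embed)

merged = layers.add([img_embed, txt_encoded])
next_word = layers.Dense(VOCAB_SIZE, activation="softmax")(merged)

model = Model([image_input, caption_input], next_word)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```

At inference time the decoder runs word by word, feeding each predicted token back in until an end-of-sequence token or `MAX_LEN` is reached.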
CheXNet: a DenseNet-121 convolutional neural network pre-trained on a large chest X-ray (CXR) dataset.
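A minimal sketch of stage 1 in tf.keras, assuming a DenseNet-121 backbone with global average pooling; the CheXNet weight file path is hypothetical and stands in for weights fine-tuned on the NIH dataset:

```python
from tensorflow.keras.applications import DenseNet121

# DenseNet-121 without its classification head; global average pooling
# yields one 1024-dimensional feature vector per image.
backbone = DenseNet121(include_top=False, weights="imagenet",
                       input_shape=(224, 224, 3), pooling="avg")
# backbone.load_weights("chexnet_weights.h5")  # hypothetical CheXNet weights

def extract_features(image_batch):
    """Map a batch of preprocessed X-rays to 1024-d feature vectors."""
    return backbone(image_batch, training=False)
```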
The project also contains code for an attention LSTM layer [2], although it is not integrated into the final model.
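The repository's layer is not reproduced here, but a Bahdanau-style additive attention layer in the spirit of "Show, Attend and Tell" [2] might look like this sketch (all names and sizes are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

class BahdanauAttention(layers.Layer):
    """Additive attention over spatial CNN features (illustrative sketch)."""

    def __init__(self, units):
        super().__init__()
        self.W1 = layers.Dense(units)  # projects image features
        self.W2 = layers.Dense(units)  # projects the decoder hidden state
        self.V = layers.Dense(1)       # scores each spatial location

    def call(self, features, hidden):
        # features: (batch, num_locations, feature_dim)
        # hidden:   (batch, hidden_dim), the decoder LSTM state
        hidden_t = tf.expand_dims(hidden, 1)
        scores = self.V(tf.nn.tanh(self.W1(features) + self.W2(hidden_t)))
        weights = tf.nn.softmax(scores, axis=1)
        # Weighted sum of feature vectors: the context for the next word.
        context = tf.reduce_sum(weights * features, axis=1)
        return context, weights
```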
Two datasets are used in this project: the National Institutes of Health (NIH) Chest X-ray dataset, to train the CNN feature extractor (CheXNet), and the Chest X-rays (Indiana University) dataset [4], to train the caption generator. The same pipeline can also be trained on other medical imaging datasets.
- The BLEU score on the test set is 0.64 (an evaluation sketch follows this list).
- Model loss: decreased from 12 to 2.0831 over training.
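For reference, a corpus-level BLEU score of this kind can be computed with NLTK. The tokenized reports below are toy examples, and unigram/bigram weights are used so the tiny example does not degenerate to zero; the repository's exact BLEU configuration is not specified here:

```python
from nltk.translate.bleu_score import corpus_bleu

# One list of reference reports per image, plus one generated caption each
# (toy data; real evaluation would use the Indiana University test split).
references = [[["no", "acute", "cardiopulmonary", "abnormality"]]]
hypotheses = [["no", "acute", "cardiopulmonary", "disease"]]

score = corpus_bleu(references, hypotheses, weights=(0.5, 0.5))
print(f"BLEU: {score:.2f}")
```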
- tensorflow
- keras
- numpy
- h5py
- progressbar2
These requirements can be installed with: `pip install -r requirements.txt`
[1] Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan. Show and Tell: A Neural Image Caption Generator.
[2] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.
[3] Kaggle, Official Model Implementation.
[4] Official Dataset Link, Chest X-rays (Indiana University).