This repository contains the code necessary to train and test convolutional neural networks (CNNs) for predicting the rotation angle of an image to correct its orientation. There are scripts to train two models, one on MNIST and another one on the Google Street View dataset. Since the data for this application is generated on-the-fly, you can also train using your own images in a similar way. A detailed explanation of the code and motivation for this project can be found in my blog.
The code mainly relies on Keras to train and test the CNN models, and OpenCV for image manipulation.
The recommended way to use Keras is with the TensorFlow backend. If you want to use it with the Theano backend you will need to make some minor modifications to the code to make it work.
Run python train/train_mnist.py to train on MNIST, or python train/train_street_view.py to train on the Google Street View dataset. The first run of either script takes longer because the dataset is downloaded automatically. You will also need a decent GPU to train the ResNet50 model used in train_street_view.py; otherwise training will take a long time.
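Because the training data is generated on the fly, training on your own images comes down to rotating each image by a random angle and using that angle as the label. The sketch below illustrates the idea only and is not the repository's generator: it assumes NumPy, swaps in a simple nearest-neighbour rotation instead of OpenCV, and the names rotate_nearest and rotated_batches are hypothetical.

```python
import numpy as np


def rotate_nearest(image, angle_deg):
    """Rotate an (H, W) or (H, W, C) array about its centre.

    Nearest-neighbour sampling; pixels with no source are left at zero (black).
    Illustrative stand-in for an OpenCV-based rotation.
    """
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    x0, y0 = xs - cx, ys - cy
    # Inverse mapping: for every output pixel, look up its source pixel.
    src_x = np.rint(np.cos(theta) * x0 - np.sin(theta) * y0 + cx).astype(int)
    src_y = np.rint(np.sin(theta) * x0 + np.cos(theta) * y0 + cy).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)
    out[valid] = image[src_y[valid], src_x[valid]]
    return out


def rotated_batches(images, batch_size=32, seed=None):
    """Yield (rotated_images, angles) batches forever.

    Angles are random integers in [0, 360) and serve as the labels.
    """
    rng = np.random.default_rng(seed)
    while True:
        idx = rng.integers(0, len(images), size=batch_size)
        angles = rng.integers(0, 360, size=batch_size)
        batch = np.stack([rotate_nearest(images[i], a) for i, a in zip(idx, angles)])
        yield batch, angles
```

Since each batch is produced on demand, no rotated copies need to be stored on disk, and the same source images yield different rotations in every epoch.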
If you only want to test the models, you can download pre-trained versions here.
Note also that the regression models (train_mnist_regression.py and train_street_view_regression.py) don't achieve good accuracy and are included only for illustration purposes.
You can evaluate the models and display examples using the provided Jupyter notebooks. Simply run jupyter notebook from the root directory and navigate to test/test_mnist.ipynb or test/test_street_view.ipynb.
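When evaluating predicted angles, keep in mind that angles wrap around: a prediction of 359 degrees for a true angle of 1 degree is off by 2 degrees, not 358. The following is a minimal sketch of such an error measure in plain Python. The function names are hypothetical, and the notebooks may compute their metrics differently.

```python
def angle_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def mean_angle_error(true_angles, predicted_angles):
    """Average wrap-around error over paired true and predicted angles."""
    pairs = list(zip(true_angles, predicted_angles))
    return sum(angle_difference(t, p) for t, p in pairs) / len(pairs)
```

For example, mean_angle_error([0, 359], [10, 1]) averages errors of 10 and 2 degrees.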
Finally, you can use the correct_rotation.py script to correct the orientation of your own images. You can run it as follows:
python correct_rotation.py <path_to_hdf5_model> <path_to_input_image_or_directory>
You can also specify the following command line arguments:
-o, --output to specify the output image or directory.
-b, --batch_size to specify the batch size used to run the model.
-c, --crop to crop out the black borders after rotating the images.
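For reference, a command-line interface with these arguments could be declared with argparse roughly as follows. This is a sketch rather than the parser in correct_rotation.py: the positional names model and input_path, the function name build_parser, and the default batch size of 64 are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Correct the orientation of images."
    )
    # Positional arguments; the names are assumed, not taken from the script.
    parser.add_argument("model", help="path to the HDF5 model")
    parser.add_argument("input_path", help="path to the input image or directory")
    parser.add_argument("-o", "--output", help="output image or directory")
    # The default of 64 is an assumption for this sketch.
    parser.add_argument("-b", "--batch_size", type=int, default=64,
                        help="batch size used to run the model")
    parser.add_argument("-c", "--crop", action="store_true",
                        help="crop out the black borders after rotating")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Declaring --crop with action="store_true" makes it a plain switch, so cropping stays off unless the flag is passed.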