End-to-End Learning of Geometry and Context for Deep Stereo Regression
This is a simple TensorFlow implementation of the paper https://arxiv.org/pdf/1703.04309.pdf.
Tested on images from the Middlebury Stereo Dataset.
To train this model from scratch, download FlyingThings3D (cleanpass images, 37GB) and FlyingThings3D (disparity, 87GB).
Then run FlyingThings_TFRecord.py to convert the data to TFRecord format.
The directory is assumed to be:
FlyingThings_TFRecord.py
flyingthings3d_frames_cleanpass
    TEST
    TRAIN
flyingthings3d__disparity
    disparity
        TEST
        TRAIN
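Before running the conversion, it can help to confirm that the folders are where the script expects them. The following is a minimal sketch that assumes the layout listed above. The helper name `missing_dataset_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Sub-directories expected next to FlyingThings_TFRecord.py,
# copied from the layout listed above.
EXPECTED_DIRS = [
    "flyingthings3d_frames_cleanpass/TEST",
    "flyingthings3d_frames_cleanpass/TRAIN",
    "flyingthings3d__disparity/disparity/TEST",
    "flyingthings3d__disparity/disparity/TRAIN",
]


def missing_dataset_dirs(root="."):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dataset_dirs(".")` from the directory that contains FlyingThings_TFRecord.py returns an empty list when every folder is in place, and otherwise the relative paths that still need to be created or extracted.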
Once fly_train.tfrecords and fly_test.tfrecords have been generated, run train.py to train the model. Intermediate model files are saved in the saved_model directory.
A pre-trained model can be downloaded here
To load the pre-trained model (trained for 60k steps), create a directory named saved_model and put all the downloaded files inside:
-60000.data-00000-of-00001
-60000.index
-60000.meta
checkpoint
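A partial download is easy to miss, so a quick check before running test.py may save some confusion. This is a small sketch that only looks for the four file names listed above. The function name `missing_checkpoint_files` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names copied from the list above.
REQUIRED_FILES = [
    "-60000.data-00000-of-00001",
    "-60000.index",
    "-60000.meta",
    "checkpoint",
]


def missing_checkpoint_files(model_dir="saved_model"):
    """Return the required file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]
```

An empty list from `missing_checkpoint_files()` means all four files are present in saved_model.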
Run test.py to test on new images. The default test images are from the Middlebury Stereo Dataset. To test on your own data, change the file name and directory in test.py.
Sample outputs are also provided in the middlebury folder.
Training converges quickly. The training error, testing error, and training time are close to those reported in the paper.
However, you may need a GPU such as a Titan X or 1080 Ti; otherwise there might not be enough GPU memory.
The code was written about a year ago, so it uses TensorFlow 1.3.0 and Python 3.5.
I forgot to give names to the placeholders and output of the graph, so test.py is quite cumbersome.
I will write a function to load the graph from meta file directly later.
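Until that function exists, one possible approach is sketched below. It is not the repository's code. It assumes TensorFlow 1.x and checkpoints named `-<step>.meta`, as in the pre-trained files listed earlier. The function names are hypothetical, and TensorFlow is imported lazily so that the file-lookup helper can be used without it. Because the placeholders have no explicit names, `placeholder_names` simply lists whatever placeholder operations the restored graph contains, so they can be identified by hand.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(model_dir="saved_model"):
    """Return the '-<step>' checkpoint prefix with the highest step, or None."""
    best_step, best_prefix = -1, None
    for meta in Path(model_dir).glob("*.meta"):
        match = re.fullmatch(r"-(\d+)", meta.stem)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            best_prefix = str(meta.with_suffix(""))
    return best_prefix


def load_graph(model_dir="saved_model"):
    """Rebuild the graph from the .meta file and restore the weights."""
    import tensorflow as tf  # assumption: TensorFlow 1.x API

    prefix = latest_checkpoint_prefix(model_dir)
    if prefix is None:
        raise FileNotFoundError("no '-<step>.meta' file in " + str(model_dir))
    # import_meta_graph adds the saved graph to the default graph
    # and returns a Saver that can restore its variables.
    saver = tf.train.import_meta_graph(prefix + ".meta")
    sess = tf.Session()
    saver.restore(sess, prefix)
    return sess


def placeholder_names(sess):
    """List the names of all placeholder ops in the restored graph."""
    return [op.name for op in sess.graph.get_operations()
            if op.type == "Placeholder"]
```

With the pre-trained files in place, `sess = load_graph()` followed by `placeholder_names(sess)` would show the auto-generated input names, which can then be passed to `sess.graph.get_tensor_by_name` (with the `:0` output suffix appended) to build a feed dictionary.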
Kendall, Alex, et al. "End-to-End Learning of Geometry and Context for Deep Stereo Regression." arXiv preprint arXiv:1703.04309 (2017).