This is the official implementation code for MFuseNet. For technical details, please refer to:
MFuseNet: Robust Depth Estimation with Learned Multiscopic Fusion
Weihao Yuan, Rui Fan, Michael Yu Wang, Qifeng Chen
ICRA 2020, RA-L
[Paper] [Project Page]
If you find this code useful, please consider citing:
@article{yuan2020mfusenet,
title={MFuseNet: Robust Depth Estimation With Learned Multiscopic Fusion},
author={Yuan, Weihao and Fan, Rui and Wang, Michael Yu and Chen, Qifeng},
journal={IEEE Robotics and Automation Letters},
volume={5},
number={2},
pages={3113--3120},
year={2020},
publisher={IEEE}
}
This code has been tested on Ubuntu 16.04 with CUDA 9.0 and two GTX 1080 Ti GPUs.
Dependencies:
- Python 2.7
- PyTorch (0.4.0+)
- torchvision (0.2.0+)
- os, time, numpy, argparse, cv2, matplotlib, PIL
The inputs to the network are the cost volumes produced by the cost-computation step of a stereo matching algorithm. They can be computed with block matching, semi-global matching, graph cuts, deep-network-based methods, etc. The default costs are obtained with MC-CNN; please refer to MC-CNN for computing the cost volumes.
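To make the input concrete: a cost volume is a 3-D array of matching costs indexed by disparity and pixel position. Below is a minimal, purely illustrative block-matching (SAD) sketch in NumPy/OpenCV; it is not the MC-CNN pipeline this repository actually uses, and the `(disparity, height, width)` layout and the helper name `sad_cost_volume` are assumptions for illustration only.

```python
import cv2
import numpy as np

def sad_cost_volume(ref_gray, tgt_gray, max_disp, patch=5):
    """Toy SAD block-matching cost volume (illustrative only, not MC-CNN).

    cost[d, y, x] is the patch-averaged absolute difference between the
    reference image at (y, x) and the target image shifted by disparity d.
    """
    h, w = ref_gray.shape
    ref = ref_gray.astype(np.float32)
    tgt = tgt_gray.astype(np.float32)
    kernel = np.ones((patch, patch), np.float32) / (patch * patch)
    cost = np.zeros((max_disp, h, w), np.float32)
    for d in range(max_disp):
        # Shift the target image by d pixels before comparing.
        shifted = np.zeros_like(tgt)
        shifted[:, d:] = tgt[:, :w - d]
        cost[d] = cv2.filter2D(np.abs(ref - shifted), -1, kernel)
    return cost
```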
The training data for three-view fusion are organized as follows:

    dataset/
        TRAIN/
            scene1/
                view0.png
                view1.png
                view2.png
                disp1.png
                left.bin
                right.bin
        TEST/
        EVAL/
view0.png, view1.png, and view2.png are the color images of the left, center, and right views. disp1.png is the ground-truth disparity map for view1. left.bin and right.bin are the cost volumes obtained by MC-CNN for matching the left and right views against the center view.
For five-view fusion, there are additionally view3.png for the bottom view and view4.png for the top view, together with their corresponding cost volumes bottom.bin and top.bin.
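As a reading aid, here is a minimal sketch of how one three-view scene directory could be loaded. The helper name `load_scene`, the use of cv2 for image reading, and the .bin layout (raw float32, shape `(max_disp, height, width)`) are assumptions, not the repository's actual data loader; check the MC-CNN output format used to generate the cost volumes.

```python
import os
import numpy as np
import cv2

def load_scene(scene_dir, height, width, max_disp):
    """Sketch: load the three views, ground-truth disparity, and cost volumes
    of one TRAIN scene (the .bin layout below is an assumption)."""
    views = [cv2.imread(os.path.join(scene_dir, 'view%d.png' % i))
             for i in range(3)]
    # Ground-truth disparity for the center view (view1).
    disp = cv2.imread(os.path.join(scene_dir, 'disp1.png'), cv2.IMREAD_UNCHANGED)
    costs = {}
    for name in ('left', 'right'):
        raw = np.fromfile(os.path.join(scene_dir, name + '.bin'), dtype=np.float32)
        costs[name] = raw.reshape(max_disp, height, width)
    return views, disp, costs
```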
Example data are available here.
To train the network, run:

    . train.sh
Results on Middlebury 2006:
| Model | AvgErr (px) | RMS (px) | Bad 0.5 | Bad 1 | Bad 2 |
|---|---|---|---|---|---|
| Model_3view | 0.250 | 1.036 | 4.08% | 1.83% | 1.15% |
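For reference, these columns follow the usual Middlebury conventions: AvgErr is the mean absolute disparity error, RMS the root-mean-square error, and Bad t the percentage of pixels whose error exceeds t pixels. A small sketch of how such numbers can be computed is below; treating `gt == 0` as invalid ground truth is an assumption about how missing pixels are encoded.

```python
import numpy as np

def disparity_metrics(pred, gt, thresholds=(0.5, 1.0, 2.0)):
    """Middlebury-style disparity metrics; gt == 0 is treated as invalid
    (an assumption about how missing ground truth is encoded)."""
    valid = gt > 0
    err = np.abs(pred[valid].astype(np.float64) - gt[valid].astype(np.float64))
    results = {'AvgErr': err.mean(), 'RMS': np.sqrt((err ** 2).mean())}
    for t in thresholds:
        # Percentage of valid pixels with error above the threshold.
        results['Bad %g' % t] = 100.0 * (err > t).mean()
    return results
```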
Licensed under the MIT License.