Authors:
Sascha Kirch, Valeria Olyunina, Jan Ondřej, Rafael Pagés, Sergio Martín & Clara Pérez-Molina
TensorFlow implementation for RGB-D-Fusion. For details, see the paper RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects.
- We provide a framework for high resolution dense monocular depth estimation using diffusion models.
- We perform super-resolution for dense depth data conditioned on a multi-modal RGB-D input condition using diffusion models.
- We introduce a novel augmentation technique, namely depth noise, to enhance the robustness of the depth super-resolution model.
- We perform rigorous ablations and experiments to validate our design choices.
- 2023/10/14: Code is now available!
- 2023/09/04: Our paper is now published in IEEE Access!
- 2023/07/29: We release our pre-print on arXiv.
We recommend using a Docker environment. We provide one Dockerfile based on TensorFlow and one based on NVIDIA. The latter is larger but includes NVIDIA's performance optimizations. Ensure Docker is installed, including NVIDIA's GPU extension.
- Build the image
docker build -t <IMAGE_NAME>/<VERSION> -f <PATH_TO_DOCKERFILE> .
- Create the container
docker container create --gpus all -u 1000:1000 --name rgb-d-fusion -p 8888:8888 -v <PATH_TO_tf_DIR>:/tf -v <PATH_TO_YOUR_GIT_DIR>:/tf/GitHub -it <IMAGE_NAME>/<VERSION>
- Start the container
docker start rgb-d-fusion
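As a concrete sketch of the three steps above — the image name, version tag, and host paths below are examples, not fixed by the repo:

```shell
# 1. Build the image (Dockerfile path and tag are example values)
docker build -t rgb-d-fusion/1.0 -f docker/Dockerfile .

# 2. Create the container: expose port 8888 (e.g. for Jupyter) and mount
#    the data directory to /tf and the git checkout to /tf/GitHub
docker container create --gpus all -u 1000:1000 --name rgb-d-fusion \
  -p 8888:8888 \
  -v "$HOME/tf":/tf \
  -v "$HOME/GitHub":/tf/GitHub \
  -it rgb-d-fusion/1.0

# 3. Start the container
docker start rgb-d-fusion
```

The `-u 1000:1000` flag runs the container as an unprivileged user so files written to the mounted volumes are owned by your host user rather than root.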
The directory hierarchy should look as follows:
|- tf
|  |- manual_datasets
|  |  |- <DATASET 1>
|  |  |  |- test
|  |  |  |  |- DEPTH_RENDER_EXR
|  |  |  |  |- MASK
|  |  |  |  |- PARAM
|  |  |  |  |- RENDER
|  |  |  |- train                  # same hierarchy as in test
|  |  |- <DATASET 2>               # same hierarchy as <DATASET 1>
|  |- GitHub
|  |  |- ConditionalDepthDiffusion # this repo
|  |- output_runs                  # auto-generated directory to store results
|  |  |- DepthDiffusion
|  |  |  |- checkpoints            # stores saved model checkpoints
|  |  |  |- illustrations          # illustrations generated during or after training
|  |  |  |- diffusion_output       # stores data sampled from the model during inference
|  |  |- SuperResolution           # same hierarchy as in DepthDiffusion
The hierarchy might be created in one place or spread across different directories; when starting the Docker container, different directories can be mounted together.
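For instance, the auto-generated output hierarchy from the tree above can be created up front with a short POSIX-compatible loop:

```shell
# Create the output hierarchy for both models
# (directory names taken from the tree above)
for model in DepthDiffusion SuperResolution; do
  for sub in checkpoints illustrations diffusion_output; do
    mkdir -p "output_runs/$model/$sub"
  done
done

# Inspect the result
find output_runs -type d | sort
```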
Scripts are located under scripts. Currently there are two types of models:
- Depth Diffusion Model, a diffusion model that generates a depth map conditioned on an RGB image
- Superresolution Diffusion Model, a diffusion model that generates a high-resolution RGB-D image from a low-resolution RGB-D input.
Each model has its dedicated training, evaluation and inference scripts written in Python. You can check the functionality and parameters via `python <SCRIPT> -h`.
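A typical way to discover the scripts and their parameters — the exact filenames depend on the checkout, so the name used below is only a placeholder:

```shell
# List the available scripts in the repo
ls scripts/

# Print usage and parameters of one of them
# (script filename is a placeholder, not the repo's actual name)
python scripts/<SCRIPT>.py -h
```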
If you find our work helpful for your research, please consider citing the following BibTeX entry.
@article{kirch_rgb-d-fusion_2023,
title = {RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects},
author = {Kirch, Sascha and Olyunina, Valeria and Ondřej, Jan and Pagés, Rafael and Martín, Sergio and Pérez-Molina, Clara},
journal = {IEEE Access},
year = {2023},
volume = {11},
issn = {2169-3536},
doi = {10.1109/ACCESS.2023.3312017},
pages = {99111--99129},
url = {https://ieeexplore.ieee.org/document/10239167},
}