Official implementation of the paper "RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects"

Volograms/rgb-d-fusion

 
 


🌈 RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects

Authors:
Sascha Kirch, Valeria Olyunina, Jan Ondřej, Rafael Pagés, Sergio Martín & Clara Pérez-Molina

[Paper] [BibTex]

TensorFlow implementation for RGB-D-Fusion. For details, see the paper RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects.

💡 Contribution

  • We provide a framework for high-resolution dense monocular depth estimation using diffusion models.
  • We perform super-resolution for dense depth data conditioned on a multi-modal RGB-D input using diffusion models.
  • We introduce a novel augmentation technique, namely depth noise, to enhance the robustness of the depth super-resolution model.
  • We perform rigorous ablations and experiments to validate our design choices.

🔥 News

  • 2023/10/14: Code is now available!
  • 2023/09/04: Our paper is now published in IEEE Access!
  • 2023/07/29: We released our pre-print on arXiv.

⭐ Framework

rgb-d-fusion framework

🎖️ Results

Prediction vs. GT

results_gt

In the wild predictions

results_in_the_wild_1 results_in_the_wild_1 results_in_the_wild_1

🛠️ Installation

We recommend using a Docker environment. We provide two Dockerfiles: one based on the TensorFlow image and one based on the NVIDIA image. The latter is larger but includes NVIDIA's performance optimizations. Make sure Docker is installed together with NVIDIA's GPU extension.

  1. Build the image (the trailing . is the build context, which docker build requires)
docker build -t <IMAGE_NAME>/<VERSION> -f <PATH_TO_DOCKERFILE> .
  2. Create the container
docker container create --gpus all -u 1000:1000 --name rgb-d-fusion -p 8888:8888 -v <PATH_TO_tf_DIR>:/tf -v <PATH_TO_YOUR_GIT_DIR>:/tf/GitHub -it <IMAGE_NAME>/<VERSION>
  3. Start the container
docker start rgb-d-fusion
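
The three Docker steps can also be assembled programmatically, which helps when the mount paths differ between machines. The following is a minimal sketch, not part of the repository: the function name `docker_commands` and the example values in the `__main__` block are hypothetical, and the build context `.` is an assumption (`docker build` needs a context path, which the command above does not spell out). The helper only builds argument lists; it does not run Docker.

```python
import shlex


def docker_commands(image, dockerfile, tf_dir, git_dir, context=".", name="rgb-d-fusion"):
    """Return the build, create and start commands as argument lists.

    `context` is an assumed build context; adjust it to wherever the
    Dockerfile expects its files to be.
    """
    build = ["docker", "build", "-t", image, "-f", dockerfile, context]
    create = [
        "docker", "container", "create",
        "--gpus", "all",                 # expose all GPUs to the container
        "-u", "1000:1000",               # run as UID:GID 1000
        "--name", name,
        "-p", "8888:8888",               # publish port 8888
        "-v", f"{tf_dir}:/tf",           # data and output root
        "-v", f"{git_dir}:/tf/GitHub",   # checkout of this repository
        "-it", image,
    ]
    start = ["docker", "start", name]
    return [build, create, start]


if __name__ == "__main__":
    # Hypothetical example values; replace them with your own.
    for cmd in docker_commands("my-image/1.0", "path/to/Dockerfile", "/data/tf", "/data/git"):
        print(shlex.join(cmd))
```

Printing the commands before running them makes it easy to check the two `-v` mounts, since a wrong host path would otherwise leave `/tf` without the expected datasets.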

The directory hierarchy should look as follows:

|- tf
   |- manual_datasets
      |- <DATASET 1> 
         |- test
            |- DEPTH_RENDER_EXR
            |- MASK
            |- PARAM
            |- RENDER
         |- train                     # same hierarchy as in test
      |- <DATASET 2>                  # same hierarchy as in <DATASET 1>
   |- GitHub
      |- ConditionalDepthDiffusion    # This Repo
   |- output_runs                     # Auto generated directory to store results
      |- DepthDiffusion
         |- checkpoints               # stores saved model checkpoints
         |- illustrations             # illustrations that are generated during or after training
         |- diffusion_output          # used for inference to store data sampled from the model
      |- SuperResolution              # same hierarchy as in DepthDiffusion

The hierarchy can be created in one place or spread across several directories. When creating the Docker container, separate directories can be mounted together into a single tree using the -v options.
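
Before training, it can be useful to confirm that a dataset follows the expected layout. The sketch below is an illustration rather than repository code: the function names `missing_dataset_dirs` and `create_skeleton` are hypothetical, and the folder names are taken from the tree above. `output_runs` is left out on purpose because it is generated automatically.

```python
from pathlib import Path

# Folder names copied from the directory tree above.
DATASET_SUBDIRS = ("DEPTH_RENDER_EXR", "MASK", "PARAM", "RENDER")
SPLITS = ("test", "train")


def missing_dataset_dirs(tf_root, dataset_name):
    """Return the expected dataset folders that do not exist under tf_root."""
    base = Path(tf_root) / "manual_datasets" / dataset_name
    expected = [base / split / sub for split in SPLITS for sub in DATASET_SUBDIRS]
    return [path for path in expected if not path.is_dir()]


def create_skeleton(tf_root, dataset_name):
    """Create the empty dataset folders and the GitHub mount point."""
    root = Path(tf_root)
    for path in missing_dataset_dirs(root, dataset_name):
        path.mkdir(parents=True, exist_ok=True)
    (root / "GitHub").mkdir(parents=True, exist_ok=True)
```

Calling `missing_dataset_dirs(<PATH_TO_tf_DIR>, <DATASET 1>)` on the host returns an empty list when every split contains all four folders.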

Run Training, Evaluation and/or Inference scripts

Scripts are located under scripts. Currently there are two types of models:

  1. Depth Diffusion Model, a diffusion model that generates a depth map conditioned on an RGB image.
  2. Super-Resolution Diffusion Model, a diffusion model that generates high-resolution RGB-D from low-resolution RGB-D.
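
To make the idea of depth diffusion concrete, the sketch below shows the forward noising step of a generic DDPM-style diffusion process applied to depth values. It is not the repository's implementation: the linear beta schedule, its default values, and all function names are assumptions chosen for illustration, and plain Python lists stand in for TensorFlow tensors. In a conditional setup of this kind, only the depth is noised, while the RGB condition would be passed to the denoising network unchanged.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice; num_steps must be >= 2)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one value per time step."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_depth(depth, t, alpha_bar, rng):
    """Sample x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps.

    `depth` is a flat list of normalized depth values (x_0).
    Returns the noisy depth and the noise eps that a denoiser would be
    trained to predict.
    """
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    eps = [rng.gauss(0.0, 1.0) for _ in depth]
    noisy = [signal * d + sigma * e for d, e in zip(depth, eps)]
    return noisy, eps


if __name__ == "__main__":
    rng = random.Random(0)
    a_bar = alpha_bars(linear_betas(1000))
    depth = [0.1, 0.5, -0.3, 0.9]  # toy depth values, not real data
    noisy, eps = noise_depth(depth, t=500, alpha_bar=a_bar, rng=rng)
    print(noisy)
```

At small `t` the noisy depth stays close to the input, and at the last step it is almost pure Gaussian noise, which is the starting point for sampling.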

Each model has its own dedicated training, evaluation and inference scripts written in Python. You can check the functionality and parameters of a script via python <SCRIPT> -h.

✒️ Citation

If you find our work helpful for your research, please consider citing the following BibTeX entry.

@article{kirch_rgb-d-fusion_2023,
 title = {RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects},
 author = {Kirch, Sascha and Olyunina, Valeria and Ondřej, Jan and Pagés, Rafael and Martín, Sergio and Pérez-Molina, Clara},
 journal = {IEEE Access},
 year = {2023},
 volume = {11},
 issn = {2169-3536},
 doi = {10.1109/ACCESS.2023.3312017},
 pages = {99111--99129},
 url = {https://ieeexplore.ieee.org/document/10239167},
}
