Depth Any Video introduces a scalable synthetic data pipeline that captures 40,000 video clips from diverse games, and it leverages the strong priors of generative video diffusion models to advance video depth estimation. By incorporating rotary position encoding, flow matching, and a mixed-duration training strategy, it robustly handles varying video lengths and frame rates. Additionally, a novel depth interpolation method enables high-resolution depth inference, achieving superior spatial accuracy and temporal consistency over previous models.
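For intuition, flow matching trains the network to predict a velocity field that transports Gaussian noise to data, and sampling integrates that field with a few ODE steps. The sketch below is a minimal Euler sampler under that formulation; `velocity_model` and its `(x, t)` signature are stand-ins for exposition, not this repository's API:

```python
# Minimal rectified-flow / flow-matching Euler sampler (illustrative only).
import torch

@torch.no_grad()
def flow_matching_sample(velocity_model, shape, num_steps=20, device="cpu"):
    """Integrate x from t=1 (pure noise) to t=0 (data) with Euler steps."""
    x = torch.randn(shape, device=device)        # start from Gaussian noise
    ts = torch.linspace(1.0, 0.0, num_steps + 1, device=device)
    for i in range(num_steps):
        t, t_next = ts[i], ts[i + 1]
        v = velocity_model(x, t)                 # predicted velocity dx_t/dt
        x = x + (t_next - t) * v                 # Euler step (dt is negative)
    return x

# Dummy velocity field standing in for the trained network:
# sample = flow_matching_sample(lambda x, t: -x, shape=(1, 3, 64, 64))
```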
This repository is the official implementation of the paper:
Depth Any Video with Scalable Synthetic Data
Honghui Yang*, Di Huang*, Wei Yin, Chunhua Shen, Haifeng Liu, Xiaofei He, Binbin Lin+, Wanli Ouyang, Tong He+
[2024-10-20] The Replicate demo and API are available here.
[2024-10-20] The Hugging Face online demo is live here.
[2024-10-15] The arXiv submission is available here.
Set up the environment with conda; the steps below also install Gradio so the demo app can run locally.
```bash
git clone https://github.com/Nightmare-n/DepthAnyVideo
cd DepthAnyVideo

# create and activate the environment with conda
conda create -n dav python=3.10
conda activate dav

# install dependencies (Gradio is needed for the demo app)
pip install -r requirements.txt
pip install gradio
```
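To try the demo locally, a Gradio entry point is the usual pattern for repositories with a Hugging Face Space; the file name below is an assumption, so check the repository root first:

```bash
# Hypothetical entry point; verify the actual script name in the repo.
python app.py
```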
- To run inference on an image, use the following command:

```bash
python run_infer.py --data_path ./demos/arch_2.jpg --output_dir ./outputs/ --max_resolution 2048
```
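Results are written under `--output_dir`. The snippet below is a minimal way to inspect a saved depth map, assuming it was stored as a NumPy array; the file name is hypothetical, so adapt it to whatever `run_infer.py` actually produces in `./outputs/`:

```python
# Illustrative only: assumes the depth was saved as a .npy array.
import numpy as np
import matplotlib.pyplot as plt

depth = np.load("./outputs/arch_2_depth.npy")  # hypothetical file name
plt.imshow(depth, cmap="inferno")              # visualize depth as a heat map
plt.colorbar(label="relative depth")
plt.axis("off")
plt.savefig("./outputs/arch_2_depth_vis.png", bbox_inches="tight")
```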
- To run inference on a video, use the following command:

```bash
python run_infer.py --data_path ./demos/wooly_mammoth.mp4 --output_dir ./outputs/ --max_resolution 960
```
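Since `run_infer.py` takes one `--data_path` at a time, a shell loop is a simple way to process a folder of clips, using only the flags shown above:

```bash
# Run inference on every mp4 in ./demos/, one clip per invocation.
for f in ./demos/*.mp4; do
    python run_infer.py --data_path "$f" --output_dir ./outputs/ --max_resolution 960
done
```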
If you find our work useful, please cite:
```bibtex
@article{yang2024depthanyvideo,
  author  = {Honghui Yang and Di Huang and Wei Yin and Chunhua Shen and Haifeng Liu and Xiaofei He and Binbin Lin and Wanli Ouyang and Tong He},
  title   = {Depth Any Video with Scalable Synthetic Data},
  journal = {arXiv preprint arXiv:2410.10815},
  year    = {2024}
}
```