Fine-tuning code for SV3D
*(Results gallery: input image, generation before fine-tuning, generation after fine-tuning.)*
Set up the environment and install the dependencies:

```bash
conda create -n sv3d python=3.10.14
conda activate sv3d
pip3 install -r requirements.txt
pip3 install deepspeed
```
Download the SV3D checkpoint `sv3d_p.safetensors` and store it in the following structure at the repository root:

```
SV3D-fine-tuning
└── checkpoints
    └── sv3d_p.safetensors
```
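If the checkpoint is hosted on Hugging Face, a minimal sketch to fetch it (the `stabilityai/sv3d` repo id is an assumption, not stated in this README):

```python
# Hedged sketch: download sv3d_p.safetensors into ./checkpoints.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="stabilityai/sv3d",       # assumed hosting repo
    filename="sv3d_p.safetensors",
    local_dir="checkpoints",
)
```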
Prepare the dataset as follows. We use the Objaverse 1.0 dataset with a preprocessing pipeline; see the objaverse dataloader for details. `orbit_frame_0020.png` is the input image, and `orbit_frame.pt` is the video latent encoded by the SV3D encoder without regularization (i.e., it has 8 channels). A quick sanity-check sketch follows the tree below.
```
dataset
├── 000-000
│   ├── orbit_frame_0020.png   # input image
│   └── orbit_frame.pt         # video latent
├── 000-001
│   ├── orbit_frame_0020.png
│   └── orbit_frame.pt
└── ...
```
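A minimal sanity check on one prepared sample (paths match the tree above; the exact latent shape is an assumption, but the 8-channel dimension follows from storing the unregularized encoder output):

```python
# Hedged sketch: verify one prepared dataset sample.
import torch
from PIL import Image

img = Image.open("dataset/000-000/orbit_frame_0020.png")  # input image
lat = torch.load("dataset/000-000/orbit_frame.pt")        # video latent

# The latent is the raw encoder output (mean and logvar stacked),
# so the channel dimension should be 8, not the 4 of a sampled latent.
# The (T, 8, H/8, W/8) layout is an assumption for illustration.
assert lat.shape[1] == 8, f"unexpected latent shape: {lat.shape}"
print(img.size, lat.shape)
```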
We used a single A6000 GPU (48 GB VRAM) to fine-tune. Launch training with:

```bash
sh scripts/sv3d_finetune.sh
```
Store the input images in the `assets` folder, then run:

```bash
sh scripts/inference.sh
```
- The encoder weights of the VAE are not provided in `sv3d_p.safetensors`.
- To obtain the video latents, run the encoder separately and feed its outputs to the training pipeline; precomputing them saves training time and GPU VRAM.
- Note that you should use the output of the VAE encoder, not a sample from the distribution defined by the encoder's mean and variance. In our case, we used `AutoencoderKLTemporalDecoder`, the same VAE used in the SVD pipeline; a sketch of this precomputation follows below.
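A minimal sketch of that precomputation, assuming the VAE weights come from the `stabilityai/stable-video-diffusion-img2vid` Hugging Face repo (an assumption; use whichever checkpoint holds your encoder weights). In diffusers, `latent_dist.parameters` is the raw 8-channel encoder output (mean and log-variance), while `.sample()` would draw the regularized 4-channel latent that the note above says to avoid:

```python
# Hedged sketch: precompute 8-channel video latents for fine-tuning.
# The checkpoint id is an assumption; the key point is saving
# latent_dist.parameters (raw encoder output), not latent_dist.sample().
import torch
from diffusers import AutoencoderKLTemporalDecoder

device = "cuda"
vae = AutoencoderKLTemporalDecoder.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",  # assumed source of VAE weights
    subfolder="vae",
    torch_dtype=torch.float16,
).to(device).eval()

@torch.no_grad()
def encode_orbit(frames: torch.Tensor) -> torch.Tensor:
    """frames: (T, 3, H, W) in [-1, 1]. Returns (T, 8, H/8, W/8) moments."""
    moments = []
    for frame in frames:
        dist = vae.encode(frame[None].to(device, torch.float16)).latent_dist
        moments.append(dist.parameters.cpu())  # mean + logvar, no sampling
    return torch.cat(moments, dim=0)

# Example: latents = encode_orbit(frames)
#          torch.save(latents, "dataset/000-000/orbit_frame.pt")
```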
The source code is based on SV3D. Thanks for the wonderful codebase!
Additionally, GPU and NFS resources for training are supported by fal.ai 🔥.
Feel free to refer to the fal Research Grants!