
Large-Vocabulary 3D Diffusion Model with Transformer

¹S-Lab, Nanyang Technological University   ²The Chinese University of Hong Kong   ³Shanghai AI Laboratory

Official PyTorch implementation of DiffTF (accepted by ICLR 2024).

DiffTF can generate large-vocabulary 3D objects with rich semantics and realistic textures.

📖 For more visual results, check out our project page

Installation

Clone this repository and navigate to it in your terminal. Then run:

bash install_difftf.sh

This will install the Python packages that the scripts depend on.
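If the script does not work on your platform, the environment can usually be reproduced by hand. A minimal sketch of an equivalent setup (the Python version and package list are assumptions, not read from install_difftf.sh):

# Hypothetical manual alternative to install_difftf.sh
git clone https://github.com/ziangcao0312/DiffTF.git
cd DiffTF
conda create -n difftf python=3.9 -y
conda activate difftf
pip install torch torchvision    # choose the build matching your CUDA version
pip install -r requirements.txt  # assumes the repo ships a requirements file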

Preparing data

Training

I. Triplane fitting

1. Training the shared decoder
conda activate difftf
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Omniobject3D
# --datadir: dataset path; --basedir: output root
# --expname: checkpoints are saved to ./Checkpoint/omni_sharedecoder
python -m torch.distributed.launch --nproc_per_node 8 ./Triplanerecon/train.py \
  --config ./Triplanerecon/configs/omni/train.txt \
  --datadir ./dataset/Omniobject3D/renders \
  --basedir ./Checkpoint \
  --expname omni_sharedecoder

# ShapeNet
# --expname: checkpoints are saved to ./Checkpoint/shapenet_sharedecoder
python -m torch.distributed.launch --nproc_per_node 8 ./Triplanerecon/train.py \
  --config ./Triplanerecon/configs/shapenet_car/train.txt \
  --datadir ./dataset/ShapeNet/renders_car \
  --basedir ./Checkpoint \
  --expname shapenet_sharedecoder
2. Triplane fitting
conda activate difftf

# Omniobject3D
# --num_gpu 1 --idx 0: fit all triplanes with a single GPU
# --datadir: dataset path; --basedir: output root
# --expname: triplanes are saved to ./Checkpoint/omni_triplane
# --decoderdir: checkpoint of the shared decoder
python ./Triplanerecon/train_single_omni.py \
  --config ./Triplanerecon/configs/omni/train_single.txt \
  --num_gpu 1 --idx 0 \
  --datadir ./dataset/Omniobject3D/renders \
  --basedir ./Checkpoint \
  --expname omni_triplane \
  --decoderdir ./Checkpoint/omni_sharedecoder/300000.tar

# ShapeNet
# --expname: triplanes are saved to ./Checkpoint/shapenet_triplane
python ./Triplanerecon/train_single_shapenet.py \
  --config ./Triplanerecon/configs/shapenet_car/train_single.txt \
  --num_gpu 1 --idx 0 \
  --datadir ./dataset/ShapeNet/renders_car \
  --basedir ./Checkpoint \
  --expname shapenet_triplane \
  --decoderdir ./Checkpoint/shapenet_sharedecoder/300000.tar

# Fitting with 8 GPUs (see the sketch below)
bash multi_omni.sh 8
bash multi_shapenet.sh 8
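For reference, the multi-GPU scripts presumably just shard the objects across GPUs through the --num_gpu/--idx flags above. A hypothetical sketch of such a launcher (not the actual multi_omni.sh):

# Hypothetical launcher: one fitting process per GPU, GPU count passed as $1
N=$1
for ((i=0; i<N; i++)); do
  CUDA_VISIBLE_DEVICES=$i python ./Triplanerecon/train_single_omni.py \
    --config ./Triplanerecon/configs/omni/train_single.txt \
    --num_gpu "$N" --idx "$i" \
    --datadir ./dataset/Omniobject3D/renders \
    --basedir ./Checkpoint \
    --expname omni_triplane \
    --decoderdir ./Checkpoint/omni_sharedecoder/300000.tar &
done
wait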

Note: The related hyperparameters and settings are set in the config files, which you can find under ./Triplanerecon/configs (e.g., ./Triplanerecon/configs/omni and ./Triplanerecon/configs/shapenet_car).
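Since the triplane-fitting code builds on nerf-pytorch, these configs are plain "key = value" text files. A hypothetical excerpt for illustration (the keys follow nerf-pytorch conventions; the values are made up, so check the actual files):

# hypothetical config excerpt; not copied from the repository
expname = omni_sharedecoder
basedir = ./Checkpoint
datadir = ./dataset/Omniobject3D/renders
N_rand = 4096    # rays per training batch
lrate = 5e-4     # learning rate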

3. Preparing triplanes for diffusion
# Prepare the fitted triplanes for training the diffusion model
# --basepath: path of the fitted triplanes
# --mode: dataset name (omni or shapenet)
# --newpath: new path of the processed triplanes
python ./Triplanerecon/extract.py \
  --basepath ./Checkpoint/omni_triplane \
  --mode omni \
  --newpath ./Checkpoint/omni_triplane_fordiffusion

II. Training Diffusion

cd ./3dDiffusion
export PYTHONPATH=$PWD:$PYTHONPATH
conda activate difftf
cd scripts

# --datasetdir: path to the fitted triplanes
# --expname: checkpoints are saved to ./Checkpoint/difftf_omni
python image_train.py \
  --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
  --expname difftf_omni

You may also want to train in a distributed manner. In this case, run the same command with mpiexec:

mpiexec -n 8 python image_train.py \
  --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
  --expname difftf_omni

Note: Training hyperparameters are set in image_train.py, while architecture hyperparameters are set in ./improved_diffusion/script_util.py.
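Because the diffusion code builds on improved-diffusion, most of these defaults can typically also be overridden on the command line. A hypothetical example (the flag names follow upstream improved-diffusion and may differ in this fork):

# hypothetical override of training defaults; flags follow improved-diffusion
python image_train.py \
  --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
  --expname difftf_omni \
  --lr 1e-4 --batch_size 8 \
  --diffusion_steps 1000 --noise_schedule cosine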

Note: Our fitted triplanes can be downloaded via this link.

Inference

I. Sampling triplanes using the trained diffusion model

Our pre-trained model can be found in difftf_checkpoint/omni.

# --model_path: checkpoint path
# --save_path: where the generated triplanes are saved
python image_sample.py \
  --model_path ./Checkpoint/difftf_omni/model.pt \
  --num_samples 5000 \
  --save_path ./Checkpoint/difftf_omni
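The name of the resulting .npz file encodes the sample array's shape (samples × channels × height × width). A quick sanity check (assuming the array sits under NumPy's default arr_0 key, as in upstream improved-diffusion):

python - <<'EOF'
# hypothetical sanity check; the 'arr_0' key is an assumption
import numpy as np
samples = np.load('./Checkpoint/difftf_omni/samples_5000x18x256x256.npz')['arr_0']
print(samples.shape, samples.dtype)  # expect (5000, 18, 256, 256)
EOF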

II. Rendering triplanes using the shared decoder

Our pre-trained shared decoder can be found in difftf_checkpoint/triplane decoder.zip.

# Omniobject3D
# --ft_path: checkpoint of the shared decoder
# --triplanepath: path of the generated triplanes
# --expname: results are saved to ./Checkpoint/ddpm_omni_vis
# --mesh 0: whether to export meshes
# --testvideo: whether to save all rendered images as a video
python ddpm_vis.py --config ./configs/omni/ddpm.txt \
  --ft_path ./Checkpoint/omni_triplane_fordiffusion/003000.tar \
  --triplanepath ./Checkpoint/difftf_omni/samples_5000x18x256x256.npz \
  --basedir ./Checkpoint \
  --expname ddpm_omni_vis \
  --mesh 0 \
  --testvideo

# ShapeNet
# --expname: results are saved to ./Checkpoint/ddpm_shapenet_vis
python ddpm_vis.py --config ./configs/shapenet_car/ddpm.txt \
  --ft_path ./Checkpoint/shapenet_car_triplane_fordiffusion/003000.tar \
  --triplanepath ./Checkpoint/difftf_shapenet/samples_5000x18x256x256.npz \
  --basedir ./Checkpoint \
  --expname ddpm_shapenet_vis \
  --mesh 0 \
  --testvideo

References

If you find DiffTF useful for your work, please cite:

@article{cao2023large,
  title={Large-Vocabulary 3D Diffusion Model with Transformer},
  author={Cao, Ziang and Hong, Fangzhou and Wu, Tong and Pan, Liang and Liu, Ziwei},
  journal={arXiv preprint arXiv:2309.07920},
  year={2023}
}
Acknowledgement

The code is built on improved-diffusion and nerf-pytorch. We sincerely thank their contributors.

🗞️ License

Distributed under the S-Lab License. See LICENSE for more information.

