Project page | arXiv:2409.11635
Due to the privacy policies of the BioVid Database, we can only release the checkpoints, training code, and inference code. To limit maintenance effort, the training and preprocessing code is released as is, for reference only. We have only tested the inference code, which runs with a Gradio demo.
Install inferno, which provides the EMOCA decoder. Follow the instructions here, then follow this to download the models needed for facial reconstruction. We slightly modified the original code so that the face reconstruction app generates the latents we use, and to serve the script.
Installing pytorch3d may fail, which can be caused by mismatched versions of CUDA, PyTorch, and pytorch3d. If it does, install pytorch3d separately.
git clone
cd inferno/
conda create python=3.10 -n paindiff
conda activate paindiff
# Install pytorch and pytorch3d
# Make sure the CUDA version of the PyTorch build matches the CUDA version installed on your system.
pip3 install torch torchvision torchaudio --index-url
FORCE_CUDA=1 pip install git+
conda env update --name paindiff --file conda-environment_py39_cu12_torch2.yaml
pip install -e .
# Download the pretrained EMOCA
cd inferno_apps/FaceReconstruction
# Go back to the paindiffusion folder and install the requirements
cd ../../..
pip install -r requirements.txt
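If the pytorch3d build fails, it can help to compare the CUDA toolkit on your system with the CUDA version your PyTorch build targets. The following is a minimal sketch, not part of this repository: it assumes the `paindiff` environment is active and that `nvcc` is on your `PATH`, and the helper name `parse_nvcc_release` is made up for illustration.

```shell
# Sketch: compare the system CUDA toolkit with the CUDA version PyTorch was built for.

# Extract "12.1" from nvcc output such as "Cuda compilation tools, release 12.1, V12.1.105".
# (Hypothetical helper, not part of inferno or this project.)
parse_nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# CUDA toolkit version seen by the compiler (stays empty if nvcc is not on PATH).
system_cuda=""
if command -v nvcc >/dev/null 2>&1; then
  system_cuda="$(nvcc --version | parse_nvcc_release)"
fi

# CUDA version of the installed PyTorch build (empty for CPU-only builds
# or if torch cannot be imported).
torch_cuda="$(python -c 'import torch; print(torch.version.cuda or "")' 2>/dev/null || true)"

echo "system CUDA (nvcc): ${system_cuda:-not found}"
echo "PyTorch CUDA build: ${torch_cuda:-not found}"

# A difference here, especially in the major version, is a likely cause of build errors.
if [ -n "$system_cuda" ] && [ -n "$torch_cuda" ] && [ "$system_cuda" != "$torch_cuda" ]; then
  echo "CUDA versions differ: the pytorch3d source build may fail."
fi
```

If the two versions differ, reinstalling PyTorch from the index that matches your system CUDA before building pytorch3d is usually the simplest thing to try.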
This project is heavily based on these beautiful implementations of diffusion models: modular-diffusion, denoising-diffusion-pytorch, and k-diffusion.