This repository contains the official implementation of the CVPR 2024 paper, "🎨 Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering".
Paint-it is a text-driven high-quality PBR texture map synthesis method.
🌟 Our texture maps are ready for practical use in popular graphics engines like Blender and Unity, thanks to our Physically-Based Rendering (PBR) parameterization, which includes diffuse, roughness, metalness, and normal information.
🎨 With our approach, the resulting texture maps are not only of superior quality but also offer the flexibility of relighting and material editing.
🔥 We've achieved impressive results without modifying the well-known Score Distillation Sampling (SDS), instead focusing on how the optimized variables are represented through our texture map parameterization.
🔊 While many researchers are working on denoising the gradients from SDS, our work leverages the power of architectural bias, specifically Deep Image Prior, to robustly learn from noisy SDS gradients, even when dealing with PBR representations.
This code was developed on Ubuntu 18.04 with Python 3.8, CUDA 11.3, and PyTorch 1.12.0, on an NVIDIA RTX A6000 (48GB) GPU. Later versions should work but have not been tested.
conda create -n paint_it python=3.8
conda activate paint_it
# pytorch installation
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
# for pytorch3d installation
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
# for python3.8, cuda 11.3, pytorch 1.12 (py38_cu113_pyt1120) -> need to install pytorch3d-0.7.2
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1120/download.html
pip install git+https://github.com/NVlabs/nvdiffrast/
pip install diffusers==0.12.1 huggingface-hub==0.11.1 transformers==4.21.1 sentence-transformers==2.2.2
pip install PyOpenGL PyOpenGL_accelerate accelerate rich ninja scipy trimesh imageio matplotlib chumpy opencv-python smplx
pip install numpy==1.23.1
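After installation, it can help to confirm that the pinned versions actually ended up in the environment, since later `pip install` steps may silently upgrade earlier ones. The following is a minimal sketch, not part of the repository: `PINNED` and `check_pins` are hypothetical names, and the pins are copied from the commands above. It compares installed versions against the pins and ignores local build tags such as `+cu113`.

```python
from importlib import metadata

# Pins copied from the install commands above (a subset is enough for a sanity check).
PINNED = {
    "torch": "1.12.0",
    "torchvision": "0.13.0",
    "torchaudio": "0.12.0",
    "diffusers": "0.12.1",
    "huggingface-hub": "0.11.1",
    "transformers": "4.21.1",
    "sentence-transformers": "2.2.2",
    "numpy": "1.23.1",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that deviates from its pin.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build tags such as "+cu113" when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed package matches its pin. A mismatch is not necessarily fatal, since later versions should work but have not been tested.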
Currently, this repository contains a sample mesh from the Objaverse dataset. To download a subset of Objaverse, you can refer to the scripts provided here.
Given a 3D mesh in .obj format and a text prompt, you can run the command below to generate PBR texture maps.
# Generate PBR textures for .obj meshes
python paint_it.py
When generating PBR texture maps for a subset of Objaverse meshes, you can modify the dictionary below (paint_it.py, L294) to handle multiple mesh object IDs and their corresponding text prompts.
mesh_dicts = {
    '9ce8ab24383c4c93b4c1c7c3848abc52': 'a pretzel',
}
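To illustrate what a multi-mesh dictionary might look like, here is a small sketch with a second entry and a sanity check to run before a long job. Everything beyond the pretzel entry is an assumption: the second ID is a placeholder and not a real Objaverse asset, `validate_mesh_dicts` is a hypothetical helper that does not exist in paint_it.py, and the 32-character lowercase hex format is inferred only from the single sample ID above.

```python
import re

mesh_dicts = {
    "9ce8ab24383c4c93b4c1c7c3848abc52": "a pretzel",  # sample entry from this README
    "00000000000000000000000000000000": "a wooden treasure chest",  # placeholder ID, not a real asset
}

# Assumption: mesh IDs are 32 lowercase hex characters, as in the sample entry.
UID_PATTERN = re.compile(r"[0-9a-f]{32}")


def validate_mesh_dicts(entries):
    """Return a list of human-readable problems; an empty list means the dict looks sane."""
    problems = []
    for uid, prompt in entries.items():
        if not UID_PATTERN.fullmatch(uid):
            problems.append(f"{uid!r}: not a 32-character hex ID")
        if not isinstance(prompt, str) or not prompt.strip():
            problems.append(f"{uid!r}: empty prompt")
    return problems


if __name__ == "__main__":
    for problem in validate_mesh_dicts(mesh_dicts):
        print(problem)
```

A check like this catches a mistyped ID or a blank prompt before any GPU time is spent. It does not verify that the mesh files are actually present on disk.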
Before you proceed, you need to download the SMPL-related materials. Register on the SMPL webpage and download the following files.
- SMPL neutral model in .pkl format: You can find it under "Download version 1.1.0 for Python 2.7 (female/male/neutral, 300 shape PCs)". Rename the downloaded basicmodel_neutral_lbs_10_207_0_v1.1.0.pkl to SMPL_NEUTRAL.pkl and place it under the ./smpl directory.
- SMPL UV map in .obj format: You can find it under "Download UV map in OBJ format". Move the downloaded smpl_uv.obj into the ./data directory.
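The renaming and placement steps above can also be scripted. The sketch below is one possible way to do it, not something shipped with the repository: `place_smpl_files` is a hypothetical helper, and it assumes both downloaded files have already been extracted into a single folder (`download_dir`). It copies rather than moves, so the original downloads stay untouched.

```python
import shutil
from pathlib import Path


def place_smpl_files(download_dir, repo_root="."):
    """Copy the two SMPL downloads to the locations described above.

    Assumes both files sit directly inside `download_dir`.
    Returns the list of destination paths.
    """
    download_dir, repo_root = Path(download_dir), Path(repo_root)
    # (downloaded file name, destination relative to the repository root)
    plan = [
        ("basicmodel_neutral_lbs_10_207_0_v1.1.0.pkl", Path("smpl") / "SMPL_NEUTRAL.pkl"),
        ("smpl_uv.obj", Path("data") / "smpl_uv.obj"),
    ]
    placed = []
    for src_name, rel_dst in plan:
        src = download_dir / src_name
        if not src.is_file():
            raise FileNotFoundError(f"expected downloaded file not found: {src}")
        dst = repo_root / rel_dst
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy rather than move, so the download stays intact
        placed.append(dst)
    return placed
```

For example, `place_smpl_files("/path/to/extracted/downloads")` run from the repository root would create ./smpl/SMPL_NEUTRAL.pkl and ./data/smpl_uv.obj.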
Given a 3D human mesh in SMPL parameter .npz format and a text prompt, you can run the command below to generate PBR texture maps.
The example .npz file is located under ./data/smpld_example.
# Generate PBR textures for 3D human meshes
python paint_it_human.py
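When preparing your own SMPL parameter file, it can be useful to compare its contents against the provided example. An .npz file is a zip archive holding one .npy entry per array, so the array names can be listed with the standard library alone. The sketch below does only that; `list_npz_arrays` is a hypothetical helper, and which array names paint_it_human.py actually expects is not stated here, so check the example file rather than relying on assumed key names.

```python
import zipfile


def list_npz_arrays(npz_path):
    """Return the array names stored in an .npz archive, sorted alphabetically."""
    with zipfile.ZipFile(npz_path) as archive:
        return sorted(
            name[: -len(".npy")]  # strip the per-array ".npy" suffix
            for name in archive.namelist()
            if name.endswith(".npy")
        )
```

Running it on both the example file and your own file, then comparing the two lists, is a quick way to spot a missing or misnamed array. `numpy.load` is the usual way to read the array values themselves.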
If you have 3D human scans (e.g., RenderPeople) but no SMPL parameters for them, try using this mesh registration tool.
If you find our code or paper helpful, please consider citing:
@inproceedings{youwang2024paintit,
    title = {Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering},
    author = {Youwang, Kim and Oh, Tae-Hyun and Pons-Moll, Gerard},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2024}
}
Kim Youwang (youwang.kim@postech.ac.kr)
We thank the members of AMILab and RVH group for their helpful discussions and proofreading.
The implementation of Paint-it is largely inspired by, and adapted from, the following seminal projects. We would like to express our sincere gratitude to the authors for making their code public.
- Deep Image Prior (https://github.com/DmitryUlyanov/deep-image-prior)
- Stable-Dreamfusion (https://github.com/ashawkey/stable-dreamfusion)
- Fantasia3D (https://github.com/Gorilla-Lab-SCUT/Fantasia3D)
The project was made possible by funding from the Carl Zeiss Foundation. This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 409792180 (Emmy Noether Programme, project: Real Virtual Humans), and the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039A. Gerard Pons-Moll is a Professor at the University of Tübingen endowed by the Carl Zeiss Foundation, at the Department of Computer Science and a member of the Machine Learning Cluster of Excellence, EXC number 2064/1 – Project number 390727645. Kim Youwang and Tae-Hyun Oh were supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government(MSIT) (No.RS-2023-00225630, Development of Artificial Intelligence for Text-based 3D Movie Generation; No.2022-0-00290, Visual Intelligence for SpaceTime Understanding and Generation based on Multi-layered Visual Common Sense; No.2021-0-02068, Artificial Intelligence Innovation Hub).