This repository provides a straightforward pipeline for facial attribute editing. It combines classic techniques with off-the-shelf deep learning models, which makes it easy for beginners to learn from. [Chinese documentation]
.
├── assets
│ └── ...
├── attributes
│ └── ...
├── common
│ ├── basemodel.py
│ ├── __init__.py
│ ├── svm_util.py
│ └── tools.py
├── config
│ └── cfg.yaml
├── scripts
│ ├── run_generate_data.py
│ ├── run_predict_score.py
│ └── run_svm.py
├── input
│ └── ...
├── modules
│ ├── models
│ │ ├── face_alignment
│ │ │ ├── face_alignment.py
│ │ │ └── networks
│ │ │ └── ...
│ │ ├── face_editing
│ │ │ └── face_editing.py
│ │ ├── face_generator
│ │ │ ├── face_generator.py
│ │ │ └── networks
│ │ │ └── ...
│ │ ├── face_inversion
│ │ │ └── face_inversion.py
│ │ ├── face_parsing
│ │ │ ├── face_parsing.py
│ │ │ └── networks
│ │ │ └── ...
│ │ └── face_paste
│ │ └── face_paste.py
│ └── weights
│ └── ...
├── output
│ └── ...
└── edit.py
Pretrained models are stored in ./modules/weights with the following arrangement:
.
├── face_alignment
│ ├── 2DFAN4-11f355bf06.pth.tar
│ └── s3fd-619a316812.pth
├── face_editing
├── face_generator
│ └── ffhq.pkl
├── face_inversion
│ └── vgg16.pt
├── face_parsing
│ ├── 79999_iter.pth
│ └── resnet18-5c106cde.pth
└── face_paste
- face_alignment
Download 2DFAN4-11f355bf06.pth.tar from here and s3fd-619a316812.pth from here.
They are widely used models for facial landmark detection (2DFAN4) and face detection (S3FD).
- face_generator
Download ffhq.pkl from here. It is based on StyleGAN2 and trained on the FFHQ dataset.
- face_inversion
Download vgg16.pt from here. It is used to calculate the LPIPS loss.
- face_parsing
Download 79999_iter.pth from here and resnet18-5c106cde.pth from here. The parsing model is based on a modified BiSeNet and trained on the CelebAMask-HQ dataset. It performs face semantic segmentation.
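Before running the pipeline, it can help to confirm that every file from the arrangement listed above is in place. The following is a minimal sketch of such a check, not part of the repository: the helper name `missing_weights` and the `EXPECTED_WEIGHTS` table are hypothetical, and the file names are simply copied from the listing.

```python
from pathlib import Path

# Expected files per sub-folder, copied from the arrangement listed above.
# face_editing and face_paste are omitted because no weight files are listed for them.
EXPECTED_WEIGHTS = {
    "face_alignment": ["2DFAN4-11f355bf06.pth.tar", "s3fd-619a316812.pth"],
    "face_generator": ["ffhq.pkl"],
    "face_inversion": ["vgg16.pt"],
    "face_parsing": ["79999_iter.pth", "resnet18-5c106cde.pth"],
}


def missing_weights(root="./modules/weights"):
    """Return the expected weight files (relative paths) not found under root."""
    root = Path(root)
    return [
        (Path(folder) / name).as_posix()
        for folder, names in EXPECTED_WEIGHTS.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected weight files are present.")
```

Running the script from the repository root prints the relative paths that still need to be downloaded.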
The environment is that of a typical deep learning project, so only the main components are listed here. If a library such as opencv-python or scipy is missing from your environment, install it with pip.
- Python == 3.8
- PyTorch == 1.11
- CUDA == 11.3
- imageio-ffmpeg == 0.4.8 (for saving .mp4)
- ninja == 1.11.1.1 (for compiling StyleGAN ops)
Run the project with the following command:
python edit.py --input_dir ./input --output_dir ./output --dir_path ./attributes/smile.npy --config ./config/cfg.yaml --gamma 1.5 [--save_cache] [--save_media] [--change_hair]
- --input_dir: Path to the folder containing input images; default is ./input. (The width and height of each image must be divisible by 2 because of the H.264 codec.)
- --output_dir: Path to the folder where the results will be saved; default is ./output.
- --dir_path: Path to the latent direction file for a specific attribute; default is ./attributes/smile.npy.
- --config: Path to the configuration file; default is ./config/cfg.yaml.
- --gamma: Editing strength; default is 1.5.
- --save_cache: Flag that controls whether to save the cache (fine-tuned generator, projected w, and so on); options are True or False, default is False.
- --save_media: Flag that controls whether to save .mp4 media files; options are True or False, default is False.
- --change_hair: Flag that controls whether to mask hair in Face_Parsing; default is False. For subjects with long hair, it is better to leave this as False.
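Because input images need an even width and height, an odd-sized image has to be trimmed or resized before it is placed in the input folder. The sketch below computes a crop box that removes at most one pixel from the right and bottom edges. The helper name `even_crop_box` is hypothetical and not part of this repository.

```python
def even_crop_box(width, height):
    """Return a (left, top, right, bottom) box with even width and height.

    At most one pixel is trimmed from the right edge and one from the
    bottom edge; images that are already even-sized are left unchanged.
    """
    if width < 2 or height < 2:
        raise ValueError("image must be at least 2x2 pixels")
    return (0, 0, width - width % 2, height - height % 2)


if __name__ == "__main__":
    print(even_crop_box(1025, 768))  # (0, 0, 1024, 768)
    print(even_crop_box(640, 480))   # (0, 0, 640, 480)
```

The returned tuple follows the (left, upper, right, lower) convention used by Pillow's `Image.crop`, so it can be passed to that method directly if Pillow is used for preprocessing.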
This repo uses precomputed latent directions from generators-with-stylegan2; thanks to its authors for kindly sharing them. In addition, ./scripts contains simple scripts for calculating a latent direction with an SVM and CLIP.
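A common way to use a latent direction is to shift the projected latent code linearly, w' = w + gamma * direction, where gamma plays the role of the editing strength. The sketch below illustrates that idea only. It is an assumption that edit.py applies the direction this way; the function name `apply_direction` is hypothetical, and the array shapes in the demo are made up rather than read from ./attributes/smile.npy.

```python
import numpy as np


def apply_direction(w, direction, gamma=1.5):
    """Shift latent code(s) w along direction, scaled by gamma.

    The two arrays only need to be broadcastable against each other;
    NumPy raises a ValueError if they are not.
    """
    w = np.asarray(w, dtype=np.float32)
    direction = np.asarray(direction, dtype=np.float32)
    return w + gamma * direction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed layout for the demo: one latent code with 18 layers of 512 values.
    w = rng.standard_normal((1, 18, 512)).astype(np.float32)
    # In practice the direction would come from a file such as
    # np.load("./attributes/smile.npy"); random values stand in for it here.
    direction = rng.standard_normal((18, 512)).astype(np.float32)
    edited = apply_direction(w, direction, gamma=1.5)
    print(edited.shape)
```

Under this linear model, gamma = 0 leaves the latent code unchanged, and a negative gamma moves it in the opposite direction.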
Download ViT-B-32.pt from here, then run the following commands:
# generate images and latent code w
python ./scripts/run_generate_data.py
# predict score for each image by CLIP
python ./scripts/run_predict_score.py --pos_text <positive prompt> --neg_text <negative prompt>
# run svm to seek boundary
python ./scripts/run_svm.py -s <predict score path>
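The last step above, seeking a boundary with an SVM, can be sketched roughly as follows: label the highest- and lowest-scored latent codes as positive and negative, fit a linear classifier, and take the unit normal of its boundary as the latent direction. This is a sketch under stated assumptions, not the code in ./scripts or ./common/svm_util.py. A plain NumPy hinge-loss classifier stands in for whatever SVM implementation the repository uses; `fit_direction` and all of its parameters are hypothetical, and synthetic data stands in for real latent codes and CLIP scores.

```python
import numpy as np


def fit_direction(latents, scores, keep_ratio=0.2, lr=0.1, reg=1e-3, epochs=200):
    """Fit a linear boundary between high- and low-scoring latents.

    Returns the unit normal of the boundary, pointing towards high scores.
    All parameter names and defaults are illustrative assumptions.
    """
    latents = np.asarray(latents, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    if len(scores) < 2:
        raise ValueError("need at least two samples")

    # Keep only the most confident samples at both ends of the score range.
    n_keep = max(1, int(len(scores) * keep_ratio))
    order = np.argsort(scores)
    chosen = np.concatenate([order[:n_keep], order[-n_keep:]])
    x = latents[chosen]
    x = x - x.mean(axis=0)
    y = np.concatenate([-np.ones(n_keep), np.ones(n_keep)])

    # Full-batch subgradient descent on the L2-regularised hinge loss.
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        active = y * (x @ w + b) < 1.0  # samples inside the margin
        grad_w = reg * w - (y[active][:, None] * x[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w / np.linalg.norm(w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_direction = np.zeros(8)
    true_direction[0] = 1.0
    latents = rng.standard_normal((1000, 8))  # stand-in for generated w codes
    scores = latents @ true_direction         # stand-in for CLIP scores
    direction = fit_direction(latents, scores)
    print("cosine with the true direction:", float(direction @ true_direction))
```

Discarding the middle of the score range is a design choice in this sketch: samples with ambiguous scores carry noisy labels, so only the clearest positives and negatives are used to fit the boundary.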
Since this repository only applies basic techniques, there is significant room for improvement in the edited results. In addition, because most of the models are trained on datasets that are biased with respect to the race of the faces they contain, not every in-the-wild image will yield a satisfactory edit.
- Smile
smile1.mp4
smile2.mp4
- Glasses
glasses1.mp4
glasses2.mp4
- Age
age1.mp4
age2.mp4
Thanks to DongFang Hu@OPPO for reviewing this code.