tellurion-kanata/sketch_colorizer

Introduction

This repository is the official implementation of the paper Two-step Training: Adjustable Sketch Colorization via Reference Image and Text Tag.

[Overview figure]

An improved version based on latent diffusion models (LDM) has been released as ColorizeDiffusion.

Instruction

To run our code, please install all the required libraries by running:

pip install -r requirements.txt

1. Data preparation

The training dataset can be downloaded from Danbooru2019 figures, or by using the rsync command:

rsync --verbose --recursive rsync://176.9.41.242:873/biggan/danbooru2019-figures ./danbooru2019-figures/

Generate sketch and reference images with any method you like, or use data/data_augmentation.py from this repository to create reference images and sketchKeras to create sketch images. Please organize the training/validation/testing dataset as follows (a minimal layout check is sketched below the tree):
dataroot
├─ color
├─ reference
└─ sketch
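As a quick sanity check, the hypothetical script below (not part of this repository) verifies that every sample has a matching color/reference/sketch triplet; it assumes the three folders contain files with identical base names:

# check_dataset.py -- hypothetical layout check, not part of this repository.
import os
import sys

def check_layout(dataroot):
    subdirs = ["color", "reference", "sketch"]
    names = {}
    for sub in subdirs:
        path = os.path.join(dataroot, sub)
        if not os.path.isdir(path):
            sys.exit("missing directory: " + path)
        # Compare base names so .png/.jpg mixtures still pair up.
        names[sub] = {os.path.splitext(f)[0] for f in os.listdir(path)}
    common = set.intersection(*names.values())
    for sub in subdirs:
        extra = names[sub] - common
        if extra:
            print(f"{sub}: {len(extra)} files without counterparts")
    print(f"{len(common)} complete triplets found")

if __name__ == "__main__":
    check_layout(sys.argv[1] if len(sys.argv) > 1 else "./dataroot")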

If you don't want to generate reference images for training (which is somewhat time-consuming), you can instead use the latent shuffle function in models/modules.py during training by uncommenting the corresponding call in the forward function in draft.py. Note that this only works when the reference encoder is semantically aware (for example, pre-trained for classification or segmentation), and expect an approximately 10%-20% drop in colorization quality.
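For intuition, latent shuffle can be understood as randomly permuting the spatial positions of the reference feature map, which destroys its spatial alignment with the sketch while preserving the semantic channel statistics. Below is a minimal sketch assuming (B, C, H, W) features; the actual function in models/modules.py may differ:

import torch

def latent_shuffle(features):
    # features: (B, C, H, W) output of the reference encoder.
    b, c, h, w = features.shape
    flat = features.reshape(b, c, h * w)
    # One random permutation per sample, shared across channels, so the
    # shuffle stays consistent within each feature map.
    perm = torch.argsort(torch.rand(b, 1, h * w, device=features.device), dim=-1)
    shuffled = torch.gather(flat, -1, perm.expand(b, c, h * w))
    return shuffled.reshape(b, c, h, w)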

2. Training

Before training our colorization model, you need to prepare a pre-trained CNN to serve as the reference encoder. We suggest adopting a CNN pre-trained on both ImageNet and Danbooru2020 for the best colorization performance. Note that the CLIP image encoder was found to be ineffective in our paper, although according to our latest experiments this shortcoming can be resolved by combining our models with latent diffusion.

Use the following command for the first training step:

python train.py --name [project_name] -d [dataset_path] -pre [pretrained_CNN_path] -bs [batch_size] -nt [threads used to read input] -at [add,fc(default)]
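For example (all paths and hyperparameter values below are placeholders, not recommended settings):

python train.py --name draft_model -d ./danbooru2019-figures -pre ./pretrained/encoder.pth -bs 16 -nt 8 -at fc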

and the following command for the second training step:

python train.py --name [project_name] -d [dataset_path] -pre [pretrained_CNN_path] -pg [pretrained_first_stage_model] -bs [batch_size] -nt [threads used to read input] -m mapping
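For example, continuing the hypothetical run above (the first-stage checkpoint path is illustrative):

python train.py --name tag_model -d ./danbooru2019-figures -pre ./pretrained/encoder.pth -pg ./checkpoints/draft_model/latest.pth -bs 16 -nt 8 -m mapping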

More information regarding training and testing options can be found in options/options.py or by using --help.

3. Testing

To use our pre-trained models for sketch colorization, download them from Releases and run the following commands. For the colorization model:

python test.py --name [project_name] -d [dataset_path] -pre [pretrained_CNN_path] 

For the controllable model:

python test.py --name [project_name] -d [dataset_path] -pre [pretrained_CNN_path] -m mapping
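For example, reusing the placeholder names from the training examples above:

python test.py --name tag_model -d ./test_data -pre ./pretrained/encoder.pth -m mapping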

We did not implement a user interface for changing tag values, so you need to edit them manually in the modify_tags function in mapping.py. The corresponding tag IDs can be found in materials. Besides, you can activate --resize to control the output image size via load_size. All generated images are saved in checkpoints/[model_name]/test/fake. Details of the training/testing options can be found in options/options.py.
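As an illustration of the kind of edit involved, the hypothetical snippet below overwrites selected entries of a per-image tag-score tensor; the real modify_tags in mapping.py may use different names and data structures:

# Hypothetical illustration only -- mirror the actual modify_tags in mapping.py.
def modify_tags(tag_scores, edits):
    # tag_scores: (batch, num_tags) tensor of tag values.
    # edits: maps a tag id (see materials for the id list) to its new value,
    # e.g. {3: 1.0, 17: 0.0} to strengthen tag 3 and suppress tag 17.
    for tag_id, value in edits.items():
        tag_scores[:, tag_id] = value
    return tag_scores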

4. Evaluation

We offer evaluation based on the FID distance; use the following command. Activate --resize if you want to change the evaluation image size.

python evaluate.py --name [project_name] --dataroot [dataset_path]
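For example, reusing the placeholder names above and resizing the evaluation images:

python evaluate.py --name tag_model --dataroot ./test_data --resize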

Code Reference

  1. vit-pytorch
  2. pytorch-CycleGAN-and-pix2pix
  3. pretrained-models.pytorch
  4. pytorch-vision
  5. pytorch-spectral-normalization-gan
