ArtFusion: Controllable Arbitrary Style Transfer using Dual Conditional Latent Diffusion Models
Official PyTorch Implementation
Author: Dar-Yen Chen
This implementation is based on the CompVis/latent-diffusion repository.
Our paper presents the first learning-based diffusion model for arbitrary style transfer. ArtFusion exhibits outstanding controllability and faithful representation of artistic details.
ArtFusion gives users the flexibility to balance source content and reference style in the output, catering to diverse stylization preferences: results range from distinct content structures to pronounced stylization.
ArtFusion can capture core style characteristics that are typically overlooked by state-of-the-art methods, such as the blurry edges typical of Impressionist art, the texture of oil paint, and characteristic brush strokes.
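As a rough illustration of the content/style trade-off, the sketch below shows one plausible way dual classifier-free guidance could combine separate content- and style-conditioned noise predictions. The denoise wrapper, the null conditions, and the exact guidance formulation are assumptions for illustration and do not necessarily match the implementation in this repository.

def dual_guided_eps(denoise, z_t, t, c_content, c_style, s_content=1.0, s_style=3.0):
    # Hypothetical two-dimensional classifier-free guidance.
    # denoise(z_t, t, content_cond, style_cond) is an assumed wrapper around the
    # model's noise prediction; None stands for a dropped (null) condition.
    # A larger s_style pushes the result toward the reference style, while a
    # larger s_content preserves more of the source structure.
    eps_uncond = denoise(z_t, t, None, None)
    eps_content = denoise(z_t, t, c_content, None)
    eps_style = denoise(z_t, t, None, c_style)
    return (eps_uncond
            + s_content * (eps_content - eps_uncond)
            + s_style * (eps_style - eps_uncond))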
Create and activate the conda environment:
conda env create -f environment.yaml
conda activate artfusion
The style dataset we use is WikiArt from Kaggle, which is collected from WIKIART.
The content dataset is MS COCO 2017.
Please download and place the datasets as:
└── datasets
├── ms_coco
└── wiki-art
Download the first-stage VAE used in LDM and place it at ./checkpoints/vae/kl-f16.ckpt.
Then run the training command:
python main.py \
--name experiment_name \
--base ./configs/kl16_content12.yaml \
--basedir ./checkpoints \
-t True \
--gpus 0,
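The --gpus 0, argument follows the PyTorch Lightning convention used by the latent-diffusion training script this repository builds on; to train on multiple GPUs, a comma-separated list such as --gpus 0,1, should presumably work.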
The pretrained model can be downloaded here. Please place it in the folder ./checkpoints/artfusion/.
Inference can be done via the notebook.
Run the following to register the conda environment as a kernel for the Jupyter notebook:
python -m ipykernel install --user --name artfusion
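As a rough sketch of what loading the pretrained model inside the notebook might look like, following the CompVis/latent-diffusion conventions this repository builds on (the checkpoint filename below is a placeholder; the actual notebook code may differ):

import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config  # helper provided by the latent-diffusion codebase

# Build the model from the training config and load the pretrained weights.
config = OmegaConf.load("./configs/kl16_content12.yaml")
model = instantiate_from_config(config.model)
ckpt = torch.load("./checkpoints/artfusion/artfusion.ckpt", map_location="cpu")  # filename is an assumption
model.load_state_dict(ckpt["state_dict"], strict=False)
model = model.cuda().eval()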
This project is released under the MIT License.
If you find this repository useful for your research, please cite it using the following BibTeX entry:
@misc{chen2023artfusion,
title={ArtFusion: Controllable Arbitrary Style Transfer using Dual Conditional Latent Diffusion Models},
author={Dar-Yen Chen},
year={2023},
eprint={2306.09330},
archivePrefix={arXiv},
primaryClass={cs.CV}
}