MasaCtrl: Tuning-free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

PyTorch implementation of MasaCtrl: Tuning-free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng

arXiv | Project page

MasaCtrl performs consistent, non-rigid image synthesis and editing without fine-tuning or optimization.

Updates

[2023/4/25] Code released.
[2023/4/17] Paper is available on arXiv (2304.08465).

Introduction

We propose MasaCtrl, a tuning-free method for consistent, non-rigid image synthesis and editing. The key idea is to use Mutual Self-Attention Control to combine the contents of the source image with the layout synthesized from the text prompt and additional controls, producing the desired synthesized or edited image.
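
The key idea can be illustrated with a toy example. The sketch below is an illustration, not the repository's implementation: queries from the target (layout) branch attend to keys and values taken from the source branch, so the output is assembled from source content. All function and variable names are hypothetical, and details such as which layers and denoising steps apply the control are not covered here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Each output vector is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


def mutual_self_attention(q_target, k_source, v_source):
    """Target queries read content from the source branch's keys and values."""
    return attention(q_target, k_source, v_source)


# Toy data (made up for illustration): two target queries, two source tokens.
q_target = [[1.0, 0.0], [0.0, 1.0]]
k_source = [[10.0, 0.0], [0.0, 10.0]]
v_source = [[1.0, 2.0], [3.0, 4.0]]
edited = mutual_self_attention(q_target, k_source, v_source)
```

Because the values come only from the source, every row of `edited` is a weighted average of source features; the target branch decides where each piece of content goes, not what it is.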

Main Features

1 Consistent Image Synthesis and Editing

MasaCtrl can perform prompt-based image synthesis and editing that changes the layout while maintaining the contents of the source image.

The target layout is synthesized directly from the target prompt.

Consistent synthesis results

Real image editing results

2 Integration to Controllable Diffusion Models

Directly modifying the text prompt often fails to produce the target layout of the desired image. We therefore integrate our method into existing controllable diffusion pipelines (such as T2I-Adapter and ControlNet) to obtain stable synthesis and editing results.

The target layout is controlled by additional guidance.

Synthesis (left part) and editing (right part) results with T2I-Adapter

3 Generalization to Other Models: Anything-V4

Our method also generalizes well to other Stable-Diffusion-based models.

Results on Anything-V4

Usage

Requirements

We implement our method on the diffusers code base, with a code structure similar to Prompt-to-Prompt. The code runs on Python 3.8.5 with PyTorch 1.11. A Conda environment is highly recommended.

pip install -r requirements.txt
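
After installing, it can be useful to compare the local environment with the versions reported above. The following is a minimal sketch; the helper names `major_minor` and `describe_environment` are hypothetical and not part of this repository, and a mismatch does not necessarily mean the code will fail.

```python
import sys

# Versions reported in this README.
DOCUMENTED_PYTHON = (3, 8)
DOCUMENTED_TORCH = (1, 11)


def major_minor(version):
    """Parse a version string such as '1.11.0+cu113' into (1, 11)."""
    core = version.split("+")[0]  # drop a local build tag such as '+cu113'
    parts = core.split(".")
    return int(parts[0]), int(parts[1])


def describe_environment():
    """Return human-readable lines comparing local versions with the README."""
    python_version = tuple(sys.version_info[:2])
    lines = [
        "Python %d.%d (matches README: %s)"
        % (python_version[0], python_version[1],
           python_version == DOCUMENTED_PYTHON)
    ]
    try:
        import torch  # imported lazily so the check works before installation
    except ImportError:
        lines.append("PyTorch is not installed")
    else:
        matches = major_minor(torch.__version__) == DOCUMENTED_TORCH
        lines.append(
            "PyTorch %s (matches README: %s)" % (torch.__version__, matches)
        )
    return lines
```

Printing the result of `describe_environment()` line by line gives a quick summary before running the notebooks.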

Checkpoints

Stable Diffusion: We mainly conduct experiments on Stable Diffusion v1-4, but our method can generalize to other versions (such as v1-5).

You can download these checkpoints from their official repository and Hugging Face.

Personalized Models: You can download personalized models from CIVITAI or train your own customized models.
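
As a rough illustration of fetching the Stable Diffusion v1-4 checkpoint from Hugging Face, the sketch below uses the stock diffusers `StableDiffusionPipeline`. It is an assumption about a convenient loading path, not this repository's own loading code, which may wrap or replace the pipeline class. The helper name `load_pipeline` is hypothetical.

```python
# Hugging Face Hub ID of the Stable Diffusion v1-4 weights.
DEFAULT_MODEL_ID = "CompVis/stable-diffusion-v1-4"


def load_pipeline(model_id=DEFAULT_MODEL_ID, device="cuda"):
    """Load a Stable Diffusion checkpoint with the stock diffusers pipeline.

    `model_id` may be a Hub ID or a local directory holding a checkpoint in
    the diffusers format.
    """
    # Imported lazily so this module can be imported without diffusers.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    return pipe.to(device)
```

Calling `load_pipeline()` downloads the weights on first use and caches them locally. Passing a different `model_id` is one way to try another Stable-Diffusion-based checkpoint.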

Notebook Demo

To run synthesis with MasaCtrl, a single GPU with at least 16 GB of VRAM is required.
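
To see how much memory the local GPUs offer, a small check such as the following can help. It is a sketch with hypothetical helper names; it relies on `torch.cuda.get_device_properties` and returns an empty list when PyTorch or CUDA is unavailable.

```python
def bytes_to_gib(num_bytes):
    """Convert a byte count to gibibytes (1 GiB = 1024**3 bytes)."""
    return num_bytes / 1024 ** 3


def list_gpu_memory():
    """Return [(device name, total memory in GiB)] for visible CUDA devices."""
    try:
        import torch  # imported lazily so the helper degrades gracefully
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    devices = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        devices.append((props.name, bytes_to_gib(props.total_memory)))
    return devices
```

Cards marketed as 16 GB often report slightly less than 16 GiB, so treat the printed figure as a guide and not as a strict pass or fail.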

The notebook playground.ipynb provides the synthesis samples.

MasaCtrl with T2I-Adapter

Will be released soon.

Acknowledgements

We thank the authors of the excellent research works Prompt-to-Prompt and T2I-Adapter.

Citation

@misc{cao2023masactrl,
  title={MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing},
  author={Mingdeng Cao and Xintao Wang and Zhongang Qi and Ying Shan and Xiaohu Qie and Yinqiang Zheng},
  year={2023},
  eprint={2304.08465},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Contact

If you have any comments or questions, please open a new issue or feel free to contact Mingdeng Cao and Xintao Wang.

