
Support LoRA for clip text encoder in diffusers #21770

Closed
wants to merge 5 commits

Conversation

haofanwang

What does this PR do?

This PR supports the feature requested in huggingface/diffusers#2469. Stable Diffusion uses CLIPTextModel as its text encoder, which currently does not support adding LoRA layers. The changes here closely mirror what UNet2DConditionModel already does.

What to expect after this PR?

import torch
from transformers import CLIPTextModel, CLIPTokenizer
from diffusers.models.cross_attention import LoRACrossAttnProcessor

tokenizer = CLIPTokenizer.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="text_encoder")
text_encoder.requires_grad_(False)

# add LoRA layers; attn_processors / set_attn_processor are the accessors
# this PR adds to CLIPTextModel, mirroring UNet2DConditionModel
lora_attn_procs = {}
for name in text_encoder.attn_processors.keys():
    # CLIP blocks use self-attention only, so cross_attention_dim stays None there
    cross_attention_dim = None if name.endswith("self_attn.processor") else text_encoder.config.hidden_size
    hidden_size = text_encoder.config.hidden_size
    lora_attn_procs[name] = LoRACrossAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim
    )
text_encoder.set_attn_processor(lora_attn_procs)

inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
outputs = text_encoder(**inputs)

# only added LoRA weights require gradients
for name, param in text_encoder.named_parameters():
    print(name, param.requires_grad) 
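The snippet above wires a `LoRACrossAttnProcessor` into each attention block while the base weights stay frozen. The underlying update rule is just a low-rank correction to the frozen weight, `W + (alpha / r) * B @ A`. A dependency-free sketch of that rule (all names here are illustrative, not diffusers API):

```python
# Minimal sketch of the LoRA update rule: the frozen weight W is augmented
# with a low-rank product (alpha / r) * B @ A. Matrices are lists of rows.

def matmul(M, N):
    """Multiply two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def lora_forward(W, A, B, x, alpha, r):
    """Compute (W + (alpha / r) * B @ A) @ x without materialising the sum."""
    base = matmul(W, x)
    low_rank = matmul(B, matmul(A, x))  # rank-r path, cheap when r is small
    scale = alpha / r
    return [[base[i][j] + scale * low_rank[i][j]
             for j in range(len(base[0]))] for i in range(len(base))]

# 2x2 frozen weight, rank-1 adapter
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]             # r x in_features
B = [[0.5], [0.5]]           # out_features x r
x = [[2.0], [3.0]]           # input column vector

print(lora_forward(W, A, B, x, alpha=1.0, r=1))  # → [[4.5], [5.5]]
```

Because only `A` and `B` receive gradients, training touches `r * (in + out)` parameters per layer instead of `in * out`, which is why the last loop in the example prints `True` only for the LoRA weights.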

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Feb 23, 2023

The documentation is not available anymore as the PR was closed or merged.

@sgugger
Collaborator

sgugger commented Feb 24, 2023

The support for LoRA should be done using our new peft library. We won't change Transformers models directly. cc @pacman100 @patrickvonplaten

@haofanwang
Author

Sure, that makes sense to me, and good to know. I will open a new PR directly in diffusers instead.
