[Community Pipeline] IPAdapter FaceID #6276
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thank you so much for adding this @fabiorigano! I was trying to work on this as well, so it's great to see the quick addition. Sharing some more results from the PR with EpicRealism-v4. It will be interesting to see if multiple reference images could be provided as input to improve quality in the future.
@fabiorigano thanks so much for this. Let's try to get the CI green.
src/diffusers/loaders/ip_adapter.py
Outdated
weight_name: str,
subfolder: Optional[str] = None,
We can pass this to the kwargs, right? I don't think there's a need to expose this specifically. WDYT?
Yes, we can pass it as a keyword argument. The other IP Adapter models need the image encoder, stored in a subfolder of H94/IP-Adapter, while the FaceID model doesn't require an image encoder, so I made it Optional. I will remove it in the next update.
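A minimal sketch of the design discussed above: instead of exposing `subfolder` as a dedicated parameter, the loader pops it from `**kwargs`, so adapters without an image encoder (like FaceID) can simply omit it. The function name and return value are illustrative, not the actual `diffusers` signature.

```python
from typing import Optional


def load_ip_adapter_sketch(weight_name: str, **kwargs) -> Optional[str]:
    # `subfolder` is optional: FaceID checkpoints ship no image encoder,
    # so callers may simply omit it.
    subfolder = kwargs.pop("subfolder", None)
    if subfolder is None:
        # No encoder to load; the pipeline falls back to precomputed image embeds.
        return None
    # Other IP Adapter models resolve the encoder under the given subfolder.
    return f"{subfolder}/image_encoder"
```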
src/diffusers/loaders/unet.py
Outdated
@@ -684,13 +684,20 @@ def _convert_ip_adapter_image_proj_to_diffusers(self, state_dict):
diffusers_name = key.replace("proj", "image_embeds")
updated_state_dict[diffusers_name] = value

elif "proj.3.weight" in state_dict:
elif "proj.0.weight" in state_dict:
Is using this key a better option? Can we use a more resilient condition here to avoid side-effects?
I changed to proj.0.weight because both IPAdapter Full and FaceID state dicts have it, while the IPAdapter Full proj.3.weight key is named norm.weight in the FaceID model.
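The key-detection logic described above can be sketched with toy state dicts (real checkpoints hold tensors; plain values stand in here, and the helper name is hypothetical):

```python
# Toy layouts mirroring the two checkpoint formats discussed above.
full_sd = {"proj.0.weight": 0, "proj.3.weight": 0}   # IPAdapter Full
faceid_sd = {"proj.0.weight": 0, "norm.weight": 0}   # FaceID


def detect_proj_layout(state_dict):
    if "proj.0.weight" not in state_dict:
        return "other"
    # Both layouts share "proj.0.weight"; FaceID names the layer that
    # Full calls "proj.3.weight" as "norm.weight", so that key is the
    # discriminator between the two.
    return "faceid" if "norm.weight" in state_dict else "full"
```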
It would also be nice to have some documentation about it here. WDYT?
src/diffusers/loaders/unet.py
Outdated
hidden_size=hidden_size,
cross_attention_dim=cross_attention_dim,
scale=1.0,
rank=128,
Do we need to make rank an argument?
Not for now, but perhaps new IP Adapter models will be released in the future using different LoRA ranks. Do you think it is better to remove it for now?
Sure, I can add the snippet on top of this PR.
@yiyixuxu could you take a look here?
src/diffusers/loaders/ip_adapter.py
Outdated
@@ -46,7 +48,6 @@ class IPAdapterMixin:
def load_ip_adapter(
    self,
    pretrained_model_name_or_path_or_dict: Union[str, Dict[str, torch.Tensor]],
    subfolder: str,
This is a breaking change; can we leave it as it was here?
done as you suggested!
src/diffusers/loaders/ip_adapter.py
Outdated
try:
    logger.info(f"loading image_encoder from {pretrained_model_name_or_path_or_dict}")
    image_encoder = CLIPVisionModelWithProjection.from_pretrained(
        pretrained_model_name_or_path_or_dict,
        subfolder=os.path.join(subfolder, "image_encoder"),
    ).to(self.device, dtype=self.dtype)
    self.image_encoder = image_encoder
except TypeError:
    print("IPAdapter: `subfolder` not found, `image_encoder` is None, use image_embeds.")
Let's try to not use try...except here please.
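One way to honor this request is an explicit `None` check instead of catching `TypeError` from `from_pretrained`. This is a hedged sketch, not the code that landed: the helper name and `load_fn` callback are stand-ins so the branching logic is testable without model downloads.

```python
def load_image_encoder(subfolder, load_fn):
    # Explicit branch instead of try...except: when no subfolder is given
    # (the FaceID case), skip the image encoder entirely and let the
    # caller fall back to precomputed image_embeds.
    if subfolder is None:
        return None
    return load_fn(subfolder + "/image_encoder")
```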
@yiyixuxu can you take a look here?
Btw, FaceID v2 and SDXL versions were made public last week. Maybe we could look into supporting and testing both in this PR too?
One at a time please.
I just fixed the loading of the LoRA weights, because some of them were missing. I updated set_default_attn_processor to restore processors with torch sdpa when using unload_ip_adapter.
@patrickvonplaten
@sayakpaul, I completely agree that this can enable real use cases and have huge potential. However, this model was released as an experimental checkpoint, so I think it's too early for us to integrate. I think we have two options here:
I prefer option 1. It will require a little bit of effort to move all the files to the community folder, but it will allow users to try this pipeline out in the meantime. What do you think? @fabiorigano it will be completely up to you :)
hi @yiyixuxu, no problem for me, I can move everything to the community folder. I would leave some lines in common with IPAdapterFull unchanged, because in my opinion the current PR implementation is more readable and general. I also think that all classes copying
ok by me - let's review it again once moved to the community folder :)
@fabiorigano
YiYi |
Ok perfect! Will do it tomorrow.
Makes sense, yeah!
@fabiorigano Thank you for your contribution. I took your branch for a test run but haven't had success loading the FaceID SDXL weights. ip-adapter-faceid_sdxl.bin is loaded as a LoRA and throws an error:
ip-adapter-faceid_sdxl_lora.safetensors throws:
hi @okaris, when diffusers.utils.USE_PEFT_BACKEND is True, this model can't be loaded. To bypass this constant, I made a small change to its definition, so you can set it to False (@yiyixuxu @sayakpaul I would like to know if this is acceptable to you). In any case, this implementation only supports SD; I am quite sure something must be updated for SDXL. I moved everything to the community folder.
@fabiorigano do you mind sharing an example of how to use that to disable it?
hi @fabiorigano, do you know why it won't work when this constant is True? i.e. which line of code is causing the problem?
you have the full example in the README
here, torch.nn.Linear is used instead of LoRACompatibleLinear, so I cannot call set_lora_layer
@fabiorigano cc @sayakpaul here for his expertise |
@yiyixuxu yes, at first I was probably setting it after creating the pipeline, but it works if set before, as I did in the README example. Sorry for not checking before committing 😅
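The ordering constraint described above (flip the flag before building the pipeline, because it is read at construction time) can be demonstrated with a pure-Python stand-in; `PipelineSketch` is hypothetical and only mimics how a module-level constant like diffusers.utils.USE_PEFT_BACKEND behaves:

```python
class PipelineSketch:
    use_peft_backend = True  # stand-in for diffusers.utils.USE_PEFT_BACKEND

    def __init__(self):
        # The flag is read once, when the pipeline is built.
        self.peft_enabled = PipelineSketch.use_peft_backend


PipelineSketch.use_peft_backend = False  # flip BEFORE constructing the pipeline
pipe = PipelineSketch()

PipelineSketch.use_peft_backend = True   # flipping AFTER has no effect on `pipe`
```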
Alright, great! I think it is good to merge then.
Hi @fabiorigano, how can I test the community pipeline version before it's merged?
Hi @okaris, pass the path to the python file as custom_pipeline:
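A hedged usage sketch of what that invocation could look like: `DiffusionPipeline.from_pretrained` accepts a local Python file via `custom_pipeline`. The base model ID and file name below are assumptions for illustration; substitute the actual SD 1.5 checkpoint and the pipeline file path from this branch.

```python
from diffusers import DiffusionPipeline

# Load the not-yet-merged community pipeline directly from a local file.
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",          # assumed SD 1.5 base model
    custom_pipeline="./ip_adapter_face_id.py", # assumed local path to the PR's pipeline file
)
```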
Great job @fabiorigano!
Is there a way to make this work with the PEFT backend enabled, @yiyixuxu @fabiorigano?
* Add support for IPAdapter FaceID
* Add docs
* Move subfolder to kwargs
* Fix quality
* Fix image encoder loading
* Fix loading + add test
* Move to community folder
* Fix style
* Revert constant update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
What does this PR do?
Add support for IPAdapter FaceID
Fixes #6243
Who can review?
@patrickvonplaten, @sayakpaul, @yiyixuxu