[feat] Reduce peak VRAM memory usage of IP adapter #6453

Merged
hipsterusername merged 12 commits into main from lstein/optimization/ip-image-encoder-vram on Jun 3, 2024

Conversation

Collaborator

@lstein lstein commented May 29, 2024

Summary

On my 12 GB GPU, I was unable to apply both an IP Adapter and an OpenPose ControlNet module to an SDXL model simultaneously without running out of VRAM. Digging into it a bit, I found this remark in latent.py:

# TODO(ryand): With some effort, the step of running the CLIP Vision encoder could be done before any other
# models are needed in memory. This would help to reduce peak memory utilization in low-memory environments.

As suggested by @RyanJDick, I put some effort into this and moved the encoding step so that it runs before the main model execution context is entered, thereby reducing VRAM requirements. This solved the out-of-memory issue!
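To make the memory argument concrete, here is a toy sketch of why the ordering matters. It is not the InvokeAI code: `VramTracker`, the two functions, and all model sizes are hypothetical, and it assumes each model is released from VRAM as soon as its context exits.

```python
from contextlib import contextmanager


class VramTracker:
    """Toy stand-in for a GPU that records current and peak usage in GB."""

    def __init__(self):
        self.current = 0.0
        self.peak = 0.0

    @contextmanager
    def loaded(self, size_gb):
        # "Load" a model on entry and release it on exit.
        self.current += size_gb
        self.peak = max(self.peak, self.current)
        try:
            yield
        finally:
            self.current -= size_gb


# Hypothetical sizes in GB, chosen only for illustration.
UNET, CONTROLNET, IP_ADAPTER, CLIP_VISION = 5.0, 2.5, 1.0, 2.5


def encode_inside_unet_context():
    """Run the CLIP Vision encoder while the UNet and ControlNet are resident."""
    gpu = VramTracker()
    with gpu.loaded(UNET), gpu.loaded(CONTROLNET), gpu.loaded(IP_ADAPTER):
        with gpu.loaded(CLIP_VISION):
            pass  # encode the image prompt here
        # denoising would run here
    return gpu.peak


def encode_before_unet_context():
    """Run the CLIP Vision encoder first, then load the UNet and ControlNet."""
    gpu = VramTracker()
    with gpu.loaded(IP_ADAPTER), gpu.loaded(CLIP_VISION):
        pass  # encode the image prompt here
    with gpu.loaded(UNET), gpu.loaded(CONTROLNET), gpu.loaded(IP_ADAPTER):
        pass  # denoising would run here
    return gpu.peak


print(encode_inside_unet_context(), encode_before_unet_context())  # prints "11.0 8.5"
```

With these made-up numbers the peak drops by exactly the size of the CLIP Vision encoder, because it is no longer resident at the same time as the UNet and ControlNet.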

Related Issues / Discussions

There are a number of mypy-detected typecheck errors in latent.py that predate this PR. I have not tracked these down.

I’d like to move the whole prep_ip_adapter_data() call outside the model loader context (rather than just the code that generates the image prompt embeds), but this will take some more effort.

QA Instructions

Run with various combinations of IP Adapters and ControlNets, and compare peak VRAM usage before and after applying this PR. Check for stability.

Merge Plan

Merge when approved.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)

@github-actions github-actions bot added the python (PRs that change python files) and invocations (PRs that change invocations) labels May 29, 2024
Collaborator

@RyanJDick RyanJDick left a comment

Careful, I'm going to start leaving more TODO comments if they magically get addressed! 😉 Thanks for looking into this.

This wasn't quite what I had in mind, but I'm glad that it enables you to run more workflows. As a future improvement, it would be nice to reduce the coupling between the core IP-Adapter model, the IP-Adapter image projection model, and the CLIP Vision model. Then we can run the CLIP Vision model in its own node, and won't have to lock->unlock->relock the IP-Adapter model like we are doing now.

I left a few minor comments. Once those are addressed, this looks good to me.

I ran a quick smoke test - no smoke.

invokeai/app/invocations/latent.py (outdated, resolved)
invokeai/app/invocations/latent.py (outdated, resolved)
invokeai/app/invocations/latent.py (outdated, resolved)
@lstein lstein requested a review from RyanJDick May 29, 2024 23:45
@github-actions github-actions bot added the services (PRs that change app services) label May 29, 2024
Collaborator Author

lstein commented May 30, 2024

This wasn't quite what I had in mind, but I'm glad that it enables you to run more workflows. As a future improvement, it would be nice to reduce the coupling between the core IP-Adapter model, the IP-Adapter image projection model, and the CLIP Vision model. Then we can run the CLIP Vision model in its own node, and won't have to lock->unlock->relock the IP-Adapter model like we are doing now.

This is one place where lazy_offloading is handy. The IP adapter probably stays in VRAM through the lock->unlock->lock cycle.
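As a rough illustration of that point, the sketch below models a cache in which unlocking a model does not evict it, so a lock->unlock->lock cycle costs only one transfer to VRAM unless space is needed in between. `LazyOffloadCache` and its methods are hypothetical names for this example and are not the InvokeAI model cache API.

```python
class LazyOffloadCache:
    """Toy model cache: unlocked models stay in VRAM until space is needed."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.in_vram = {}    # model name -> size in GB
        self.locked = set()  # names of models currently in use
        self.transfers = 0   # number of loads into VRAM

    def lock(self, name, size_gb):
        # Only pay for a transfer if the model is not already resident.
        if name not in self.in_vram:
            self._make_room(size_gb)
            self.in_vram[name] = size_gb
            self.transfers += 1
        self.locked.add(name)

    def unlock(self, name):
        # Lazy offloading: mark the model as evictable but leave it in VRAM.
        self.locked.discard(name)

    def _make_room(self, size_gb):
        # Evict unlocked models, oldest first, until the new model fits.
        for victim in list(self.in_vram):
            if sum(self.in_vram.values()) + size_gb <= self.capacity_gb:
                break
            if victim not in self.locked:
                del self.in_vram[victim]
        if sum(self.in_vram.values()) + size_gb > self.capacity_gb:
            raise MemoryError("not enough VRAM even after evicting unlocked models")


cache = LazyOffloadCache(capacity_gb=12.0)
cache.lock("ip_adapter", 1.0)
cache.unlock("ip_adapter")
cache.lock("ip_adapter", 1.0)  # still resident, so no second transfer
print(cache.transfers)  # prints "1"
```

The saving only holds if nothing large is loaded between the unlock and the relock; under memory pressure the unlocked model is evicted and has to be transferred again.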

Collaborator

@RyanJDick RyanJDick left a comment

Just one minor comment on the latest changes.

invokeai/app/invocations/latent.py (outdated, resolved)
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
@lstein lstein requested a review from RyanJDick June 1, 2024 12:54
Collaborator Author

lstein commented Jun 3, 2024

@RyanJDick All comments are addressed. You want to give this a quick once-over?

@hipsterusername hipsterusername merged commit 756108f into main Jun 3, 2024
14 checks passed
@hipsterusername hipsterusername deleted the lstein/optimization/ip-image-encoder-vram branch June 3, 2024 18:41
Labels
invocations (PRs that change invocations), python (PRs that change python files), services (PRs that change app services)
3 participants