[feat] Reduce peak VRAM memory usage of IP adapter #6453
Conversation
Careful, I'm going to start leaving more TODO comments if they magically get addressed! 😉 Thanks for looking into this.
This wasn't quite what I had in mind, but I'm glad that it enables you to run more workflows. As a future improvement, it would be nice to reduce the coupling between the core IP-Adapter model, the IP-Adapter image projection model, and the CLIP Vision model. Then we could run the CLIP Vision model in its own node, and wouldn't have to lock->unlock->relock the IP-Adapter model like we are doing now.
I left a few minor comments. Once those are addressed, this looks good to me.
I ran a quick smoke test - no smoke.
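A rough sketch of the decoupling suggested above: if the CLIP Vision encoder ran as its own node, its output embedding could be passed to the IP-Adapter node as plain data, so the lock->unlock->relock dance would not be needed. All class and function names here are illustrative placeholders, not InvokeAI's actual API.

```python
from dataclasses import dataclass


@dataclass
class ImageEmbedding:
    """Placeholder for the tensor a CLIP Vision encoder would produce."""
    data: list


def clip_vision_node(image: str) -> ImageEmbedding:
    # Runs as an independent node; the CLIP Vision model could be
    # unloaded from VRAM as soon as this returns.
    return ImageEmbedding(data=[len(image)])


def ip_adapter_node(embedding: ImageEmbedding) -> str:
    # Consumes the precomputed embedding as plain data; never needs
    # the CLIP Vision model resident in VRAM.
    return f"conditioning({embedding.data})"
```

The key design point is that the edge between the two nodes carries an embedding, not a live model handle, so each model's VRAM lifetime is confined to its own node.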
This is one place where
Just one minor comment on the latest changes.
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
@RyanJDick All comments are addressed. You want to give this a quick once-over?
Summary
On my 12 GB GPU I was unable to simultaneously apply both an IP Adapter and an OpenPose ControlNet module to an SDXL model without running out of VRAM. Digging into it a bit, I found this remark in
latent.py
As suggested by @RyanJDick, I put some effort into this and moved the encoding step to occur outside the main model execution context, thereby reducing peak VRAM requirements. This solved the out-of-memory issue!
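The change described above can be sketched as follows, using a toy context manager in place of InvokeAI's model manager (the names `load_model`, `run_before`, and `run_after` are illustrative, not the real API): the image prompt encoding moves out of the main model's execution context, so the CLIP Vision model and the UNet are never resident in VRAM at the same time.

```python
from contextlib import contextmanager

LOG = []  # records model load/unload order for illustration


@contextmanager
def load_model(name):
    # Stand-in for a model-manager context: "loads" the model onto the
    # GPU on entry and releases it on exit.
    LOG.append(f"+{name}")
    try:
        yield name
    finally:
        LOG.append(f"-{name}")


def run_before(image):
    # Before this PR: encoding ran inside the main model context, so
    # the SDXL UNet and CLIP Vision were resident simultaneously.
    with load_model("main_unet"):
        with load_model("clip_vision"):
            embeds = f"embeds({image})"
        return f"denoise({embeds})"


def run_after(image):
    # After this PR: the image prompt embeds are computed first, so
    # only one model needs to be in VRAM at any moment.
    with load_model("clip_vision"):
        embeds = f"embeds({image})"
    with load_model("main_unet"):
        return f"denoise({embeds})"
```

Both versions produce the same conditioning; only the peak residency differs, which is why this shows up as a reduction in peak VRAM rather than a behavioral change.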
Related Issues / Discussions
There are a number of mypy-detected typecheck errors in
latents.py
that precede this PR. I have not tracked these down. I'd like to move the whole
prep_ip_adapter_data()
call outside the model loader context (rather than just the code that generates the image prompt embeds), but this will take some more effort.
QA Instructions
Run with various combinations of IP Adapters and controlnets, and compare peak VRAM usage before and after applying this PR. Check for stability.
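A minimal harness for the comparison above. On a real GPU you would bracket each run with `torch.cuda.reset_peak_memory_stats()` and read `torch.cuda.max_memory_allocated()`; here a simulated tracker stands in so the pattern (and the expected before/after difference) is clear. The 8 GB / 2 GB figures are illustrative placeholders, not measured sizes.

```python
class PeakTracker:
    """Toy allocator that records peak simultaneous memory use (in MB)."""

    def __init__(self):
        self.current = 0
        self.peak = 0

    def alloc(self, mb):
        self.current += mb
        self.peak = max(self.peak, self.current)

    def free(self, mb):
        self.current -= mb


def measure(workflow):
    # With PyTorch this would be reset_peak_memory_stats() +
    # max_memory_allocated() around the actual generation.
    tracker = PeakTracker()
    workflow(tracker)
    return tracker.peak


def overlapping_workflow(t):
    # Before the PR: CLIP Vision encodes while the UNet is resident.
    t.alloc(8000)  # SDXL UNet
    t.alloc(2000)  # CLIP Vision
    t.free(2000)
    t.free(8000)


def sequential_workflow(t):
    # After the PR: CLIP Vision loads, encodes, and unloads first.
    t.alloc(2000)  # CLIP Vision
    t.free(2000)
    t.alloc(8000)  # SDXL UNet
    t.free(8000)
```

Running the same prompt on both branches and comparing the two peak readings is the quickest way to confirm the PR's effect.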
Merge Plan
Merge when approved.
Checklist