Enable even larger images with one simple torch.nn.functional.silu import #653

Merged 1 commit on Sep 17, 2022

@mh-dm mh-dm commented Sep 17, 2022

Fixes:
File "stable-diffusion/ldm/modules/diffusionmodules/model.py", line 37, in nonlinearity
    return x*torch.sigmoid(x)
RuntimeError: CUDA out of memory. Tried to allocate 1.56 GiB [..]

Now up to 1536x1280 is possible on 8GB VRAM.
Also remove unused SiLU class.

silu(x) computes x * torch.sigmoid(x), also known as the swish function:
https://pytorch.org/docs/stable/generated/torch.nn.functional.silu.html
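
As a rough illustration of the change, the sketch below compares the original expression with torch.nn.functional.silu. The function names nonlinearity_old and nonlinearity_new and the tensor shape are made up for this example and are not taken from the repository. The likely reason for the memory saving is that x*torch.sigmoid(x) materializes sigmoid(x) as a full-size temporary tensor before the multiply, while the single silu call does not need that intermediate; treat that explanation as an assumption rather than a measured result.

```python
import torch
import torch.nn.functional as F


def nonlinearity_old(x):
    # Original form: sigmoid(x) is allocated as a separate tensor,
    # then multiplied by x into yet another tensor.
    return x * torch.sigmoid(x)


def nonlinearity_new(x):
    # Replacement form: one call to the built-in SiLU (swish) function.
    return F.silu(x)


torch.manual_seed(0)
x = torch.randn(2, 8, 16, 16)  # arbitrary small shape, for illustration only

# Both forms should agree up to floating-point rounding.
max_abs_diff = (nonlinearity_old(x) - nonlinearity_new(x)).abs().max().item()
outputs_match = torch.allclose(nonlinearity_old(x), nonlinearity_new(x), atol=1e-6)
print("outputs match:", outputs_match)
```

Because the two forms are numerically equivalent, the swap should not change generated images beyond rounding noise.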

@tildebyte tildebyte requested review from lstein and bakkot September 17, 2022 21:27
@netsvetaev
Contributor

Does this apply only to CUDA, or to MPS too?

@mh-dm
Contributor Author

mh-dm commented Sep 17, 2022

The VRAM error only happens on CUDA. On MPS there might be a small speed benefit, but this code only runs once, at the end of generation.
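
For anyone who wants to check the CUDA behavior on their own card, here is a minimal sketch that measures peak allocated memory for both forms. The helper name peak_bytes and the tensor shape are assumptions made for this example; it relies only on the standard torch.cuda memory statistics calls and skips the measurement when no CUDA device is present.

```python
import torch
import torch.nn.functional as F


def peak_bytes(fn, x):
    """Peak CUDA memory allocated above the starting level while running fn(x)."""
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    baseline = torch.cuda.memory_allocated()
    with torch.no_grad():
        out = fn(x)
    torch.cuda.synchronize()
    peak = torch.cuda.max_memory_allocated()
    del out
    return peak - baseline


if torch.cuda.is_available():
    # Hypothetical activation size; pick something that fits on your GPU.
    x = torch.randn(1, 128, 512, 512, device="cuda")
    unfused = peak_bytes(lambda t: t * torch.sigmoid(t), x)
    fused = peak_bytes(F.silu, x)
    print(f"x*sigmoid(x): {unfused / 2**20:.1f} MiB above baseline")
    print(f"F.silu(x):    {fused / 2**20:.1f} MiB above baseline")
else:
    print("CUDA not available; skipping the measurement.")
```

The numbers depend on the PyTorch version and allocator state, so they are best read as a relative comparison between the two forms.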

Collaborator

@lstein lstein left a comment

Amazing how much performance you're able to wring out of this code!

Tested on 12 GB P100 and worked as advertised.

@lstein lstein merged commit 071f65a into invoke-ai:development Sep 17, 2022
@mh-dm mh-dm deleted the silu branch September 17, 2022 23:29
afiaka87 pushed a commit to afiaka87/lstein-stable-diffusion that referenced this pull request Sep 19, 2022
afiaka87 pushed a commit to afiaka87/lstein-stable-diffusion that referenced this pull request Sep 21, 2022
austinbrown34 pushed a commit to cognidesign/InvokeAI that referenced this pull request Dec 30, 2022