Fix AMD GPUs not being detected #7147
Conversation
We don't have any way of testing this on a 6xxx architecture, but this seems like a worthwhile and correct change nonetheless. We do know that 7xxx (Navi32/33 / RDNA3) chips will need some more special handling, but that shouldn't block this update. @max-maag - we also have this index URL set in the Dockerfile - could you please update that in this PR?
Force-pushed from 9b9632e to e3a7e5f
I saw that someone reported that in #7006, but because I'm not using Docker and am not familiar with it at all, I kept it out of this PR until now. I changed the URL in the Dockerfile and added the relevant issue to this PR's related issue list. I didn't test the Dockerfile change, though. I don't see any reason why it shouldn't work, but maybe someone should verify the fix just to be sure.
Force-pushed from e3a7e5f to d8b0730
Approved - LGTM from my perspective. Thanks again for the contribution. Just FYI - I finally got it to generate on a recent AMD GPU (W7900). Here's a full write-up: https://gist.github.com/ebr/e4e4118b603bd95bfd2408ee30c27f0a. It's not pretty, but it works.
Force-pushed from d8b0730 to b41762c
Each version of torch is only available for specific versions of CUDA and ROCm. The Invoke installer and Dockerfile try to install torch 2.4.1 with ROCm 5.6 support, which does not exist. As a result, the installation falls back to the default CUDA version, so AMD GPUs aren't detected. This commit fixes that by bumping the ROCm version to 6.1, as suggested by the PyTorch documentation. [1] The specified CUDA version of 12.4 is still correct according to [1], so it does not need to be changed. Closes invoke-ai#7006 Closes invoke-ai#7146 [1]: https://pytorch.org/get-started/previous-versions/#v241
Force-pushed from b41762c to dd3f044
Summary
Each version of torch is only available for specific versions of CUDA and ROCm. The Invoke installer tries to install torch 2.4.1 with ROCm 5.6 support, which does not exist. As a result, the installation falls back to the default CUDA version, so AMD GPUs aren't detected. This commit fixes that by bumping the ROCm version to 6.1, as suggested by the PyTorch documentation.[1] Torch 2.4.1 does not appear to be available for ROCm 6.2.
The specified CUDA version of 12.4 is still correct according to [1], so it does not need to be changed.
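In practice the fix amounts to changing the package index URL the installer passes to pip. A rough sketch of the before/after (the "before" line is illustrative; the ROCm 6.1 index URL follows PyTorch's published install commands for v2.4.1):

```shell
# Before: no torch 2.4.1 build exists on the ROCm 5.6 index,
# so pip silently falls back to the default (CUDA) wheel.
#   pip install torch==2.4.1 --index-url https://download.pytorch.org/whl/rocm5.6

# After: ROCm 6.1 wheels do exist for torch 2.4.1.
pip install torch==2.4.1 --index-url https://download.pytorch.org/whl/rocm6.1
```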
Related Issues / Discussions
Closes #7006
Closes #7146
QA Instructions
Without this fix, the CPU is used to generate images. This can be seen in the log output. Image generation also takes forever.
I did not test the changes to the Dockerfile since I am not familiar with Docker.
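One quick way to confirm that the ROCm build (rather than the CUDA fallback) actually got installed, without running a full generation - hedged, since exact wheel tags can vary by platform:

```shell
# A ROCm build of torch carries a "+rocm" local version tag and a non-None HIP version;
# the CUDA fallback reports a "+cu" tag and None instead.
python -c "import torch; print(torch.__version__, torch.version.hip)"
```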
Merge Plan
n/a
Checklist
Footnotes
[1]: https://pytorch.org/get-started/previous-versions/#v241