-
It looks like CUDA can be much faster than --device cpu, especially as model size increases. At least one person appears to have had success using an AMD consumer GPU (RX 6800): https://news.ycombinator.com/item?id=32933236
Options for --device appear to include: cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, ort, mps, xla, lazy, vulkan, meta, hpu, privateuseone
I have the appropriate version of torch installed on Linux, but I'm not sure how to enable ROCm using --device.
-
We haven't tested ROCm, but from this documentation it seems that you can keep using cuda if the ROCm version is properly installed.
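For example (a minimal sketch, assuming a ROCm build of torch is installed; audio.wav stands in for your input file):

    # Verify the ROCm build sees the GPU; ROCm builds of torch expose
    # the device through the CUDA API and report a HIP version.
    python -c "import torch; print(torch.cuda.is_available(), torch.version.hip)"
    # If that prints True, the usual flag applies unchanged:
    whisper audio.wav --device cuda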
-
If it supports ROCm, that will be great.
-
You can already use pytorch-rocm to take advantage of AMD GPUs. Install it through pip before installing openai-whisper.
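A minimal sketch of that order (the rocm5.2 wheel index here is an assumption; match it to your installed ROCm version):

    # Install the ROCm build of torch first, from the ROCm wheel index:
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.2
    # Then install whisper; pip keeps the already-satisfied torch
    # instead of pulling in the default CUDA build:
    pip install openai-whisper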
-
ROCm works fine with Whisper. I'm using it with "Radeon RX Vega (VEGA10, DRM 3.42.0, 5.15.0-48-generic, LLVM 12.0.0)". A few important steps in getting it working (I'm on Kubuntu 20.04.5 w/ kernel 5.15.0-56-generic, YMMV): (1)
(6) Now install whisper and check "whisper --help" to see if it outputs:
With my GPU, whisper outputs the following warning: MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx900_56.kdb Performance may degrade. Please follow instructions to install:
However, I've seen suggestions that the above warning is spurious, and that in either case it applies to the AMD proprietary drivers, whereas I'm using the open-source default 'amdgpu' drivers. Does anybody have further info on resolving this warning for open-source 'amdgpu' users? Will I see better performance using AMD's proprietary drivers?
On a different system, I had the hare-brained idea of trying to use my AMD 4750G "APU" (a Ryzen with an integrated AMD GPU) with pytorch. In this case I ended up using the "nightly" ROCm builds of "torch 2.0.0.dev20221219+rocm5" plus torchaudio and torchvision, as suggested by https://pytorch.org/ . After much failure, I tried a hint from https://stackoverflow.com/questions/73229163/amd-rocm-with-pytorch-on-navi10-rx-5700-rx-5700-xt and set the HSA_OVERRIDE_GFX_VERSION environment variable.
Note that without the env-var, whisper pukes:
With the env-var, whisper claims cuda works:
However, running a transcription with most models runs out of memory and dumps core (is it a bug that it dumps core?):
With the 'base' or 'tiny' models, it no longer dumps core, but instead hangs forever at 100% CPU, outputting nothing. Furthermore, in this 'hung' state, whisper can no longer be ^C'd to kill the hanging process (bug?). Instead the process needs to be killed externally, or you need to background the process and then "kill %1" it. Note that radeontop(1) (from https://github.com/clbr/radeontop, not apt) indicates how little memory is available on the AMD 4750G with a running KDE desktop:
(NB: I've also tried this AMD 4750G APU system "headless", so that the GPU is only used for whisper and not to run the KDE desktop, and it still hangs or crashes as above.) With that, I've given up on the AMD 4750G APU and continue to use my old power-hog RxVega 56 (165W) with the 'medium' model quite successfully: https://www.youtube.com/watch?v=AFk5g7NJ1Ko https://rumble.com/v1n7cx8-trainspodder-and-whisper-transcribes-radio-w-good-proper-noun-spelling-infe.html In contrast, for the RxVega56, radeontop(1) reports (on a "headless" system where I'm using the GPU 100% for whisper).
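As a rough sketch of the env-var approach above (the 9.0.0 value is an assumption for Vega-class cards; Navi10 cards use 10.3.0 per the StackOverflow link):

    # Make the ROCm runtime treat the GPU as a supported gfx target
    # (9.0.0 = gfx900/Vega; adjust for your card - this is an assumption):
    export HSA_OVERRIDE_GFX_VERSION=9.0.0
    # Check free VRAM before picking a model size:
    rocm-smi --showmeminfo vram
    # The 'medium' model fits in the Vega 56's 8GB; smaller cards may
    # need --model base or --model tiny:
    whisper audio.wav --device cuda --model medium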
-
Hijacking this thread: I had a hard time getting things to work in docker. Please consider that I know nothing, so there may be errors or unnecessary steps in these files, but it took me a long time to figure out. You also need to install the rocm drivers on your host machine first. First I suggest to get:
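For anyone trying the same, a minimal sketch of the docker invocation (the image name is an assumption; the device mounts are what ROCm containers generally need):

    # Pass the kernel's ROCm interfaces through to the container:
    # /dev/kfd is the compute interface, /dev/dri the GPU devices.
    docker run -it \
      --device=/dev/kfd --device=/dev/dri \
      --group-add video --security-opt seccomp=unconfined \
      rocm/pytorch:latest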
-
Here is a short description of how to use Whisper with older AMD cards (GFX803) like the RX 580. Yesterday I managed to get Whisper (or Whisper-WebUI) up and running on a GFX803 RX 580 (8GB) GPU.
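(The usual catch with GFX803 is that the official torch wheels no longer include gfx803 kernels, so torch has to be rebuilt for that target. A rough sketch, assuming a pytorch source checkout:)

    # Hipify the CUDA sources, then build a wheel with the old
    # card's architecture enabled:
    export PYTORCH_ROCM_ARCH=gfx803
    python tools/amd_build/build_amd.py
    python setup.py bdist_wheel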
-
@BernardBurke thanks for your feedback, it makes me glad to hear that my instructions helped! Using docker is a good idea. I haven't used docker myself, but it would make sense. I have stable diffusion with comfyui installed on the same desktop, and I sometimes had trouble with different torch versions between whisper and stable diffusion. There is a virtual environment, but some nodes in comfyui and some extensions in the automatic1111 web-ui still had problems. Probably I made some mistakes by not activating the virtual environment before installing some dependencies, but docker would solve all of these problems. Another point in favour is that some extensions and custom nodes in comfyui / automatic1111 web-ui are a risk (potential malware); docker would improve this risky situation. The same applies, in weaker form (less risky), to dependencies in whisper-webui.
Here is a link to the instructions for stable diffusion: https://github.com/viebrix/pytorch-gfx803 (but this could break your whisper installation).
Greetings from the other side of the world (from Vienna / Austria), viebrix
-
Thanks viebrix,
I've done a little bit of docker work before. I should take this one on (build a docker image for people who, like you and me, have an older AMD GPU that is going to waste for some self-hosted ML workloads). And good choice on stable diffusion as the next target; I'll take a look.
One question: doing the build of torch and torchvision, I received hundreds of compiler (and other) warnings, and I didn't even bother to track what was going on. If I'm to build a docker script based on headless Linux Mint, for example (I'll explain my choice below), I'm inclined to create a venv with a specific version of python etc. and just install the prebuilt wheels (that I built on the same version of Mint). If I do a good job, regenerating the docker build should be mostly automated (though I'm wary of the many compiler warnings from the build; I should document them for potential users, but I don't think anyone is going to 'fix' this kind of thing). What do you think?
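Roughly what I have in mind (the wheel filenames are placeholders for whatever the build produces):

    # Pin a specific python in a venv, then install the locally
    # built wheels instead of rebuilding inside the image:
    python3 -m venv ~/whisper-venv
    source ~/whisper-venv/bin/activate
    pip install ./torch-*.whl ./torchvision-*.whl
    pip install openai-whisper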
On my choice of Mint: I remain ever hopeful that more people will move their daily driver to linux (it's rare, but it still happens). I chose Mint as my daily driver because I think most Windows or OSX folks would be able to use it more or less immediately... and because I use it all day every day, I'm pretty experienced in answering 'user' questions.
I'm part way through an attempted daily-driver move to NixOS... are you familiar with the nix package manager? I really think it's worth learning. It has the potential to make OSX a decent opensource option (which it just isn't today, IMO).
Cheers... I'll let you know where I'm at with the docker world.
PS: on one final note, I have a *really* old GM204 [GeForce GTX 970] that came from a daughter's ex (gamer) boyfriend, hooked up to a 12-year-old 4GHz i7 with 16 GB of RAM. It is *much* faster at whisper transcribing (almost twice as fast from my first few tries) than the 580 we're discussing... and these old GPUs are dirt cheap on eBay.
-
@BernardBurke Edit: the project I read about is ZLUDA:
-
viebrix,
Thanks for that - I think your English is awesome (and I don't have any other languages; I always feel somewhat dumb for this reason. Even when I've worked overseas, everyone in the IT field spoke English... anyway).
I see what you are saying, and as I learn more about the GPU (and NPU) world, I'll no doubt spend some more money. One part of my life now (after 41 years in tech) is trying to make IT accessible for young people, especially those who don't have family money (or much of their own). Getting some kind of GPU-assisted machine learning running locally sounds like a good plan.
I just ran across this - it seems someone has done much of the docker work already: https://github.com/robertrosenbusch/gfx803_rocm61_pt24
Cheers
On Sun, 14 Jul 2024 at 22:55, viebrix wrote:
@BernardBurke
I think NVIDIA's cuda runs much better than rocm. Whisper does not need much VRAM, so it could be that your GeForce runs faster than the newer AMD. But Stable Diffusion needs more VRAM (not for very simple examples, but for more ambitious ones, with controlnet or with the SDXL model), so a card with a minimum of 8GB VRAM (better 12GB) would be preferable. I switched two weeks ago to an RTX 4060 Ti (with 16GB VRAM). Whisper was no problem with my rx580, but complex workflows in stable diffusion constantly crashed my linux mint, because rocm is not error-free: not all cuda commands are implemented straightforwardly in rocm, and so there are memory leaks. After examining these problems I waited months to buy this new NVIDIA gpu, because I hoped AMD would invest more in compatibility with pytorch. But then I read that they stopped a project that had implemented this compatibility completely, independently of HIP, for one and a half years. I do not remember the name of the project. They open-sourced it, if I remember correctly, but they don't support it any more.
I like AMD - I have a cpu from AMD and I'm happy with it - but AI, and especially the pytorch topic, is a real problem. Just the fact that you have to compile the whole rocm stack with special flags for older cards is a harassment.
But I think: if you already have amd cards, why not use them for AI? They can successfully do a lot of AI tasks; there are only some limits which NVIDIA does not have. In comfy-ui (stable diffusion) there are a lot of optimizations for lower vram and amd, but OS crashes still happen sometimes.
I'm also using Mint on my private PC daily. At work I have to use Windows.
There are a lot of other thoughts and answers I have to your words, but I don't want to "spam" this issue here with topics that do not fit. (I hope everything is readable; my English is not very good.)
-
A little update to NielsMayer's instructions: rocm5.2 isn't working for me (I guess it's outdated), so I tried installing rocm5.6 instead (just copied this from here): pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6 After this, install whisper and set the HSA_OVERRIDE_GFX_VERSION env-var. Tested on Arch Linux, 6700XT.
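Spelled out (the 10.3.0 override is the value commonly used for the 6700XT and other Navi 2x cards; treat it as an assumption for other hardware):

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
    pip install openai-whisper
    # The 6700XT (gfx1031) isn't an officially supported target, so
    # report it to the runtime as gfx1030:
    export HSA_OVERRIDE_GFX_VERSION=10.3.0
    whisper audio.wav --device cuda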