AMD GPU machine learning? #4883
-
cc @mertalev, I think we already support AMD, right?
-
Thanks for the tips. Here is my change to the ML Dockerfile:

```diff
 RUN poetry config installer.max-workers 10 && \
     poetry config virtualenvs.create false
+RUN apt purge -y libmimalloc2.0
+RUN apt autoremove -y
 COPY poetry.lock pyproject.toml ./
 RUN poetry install --no-interaction --no-ansi --no-root --with rocm --without dev
 RUN rm -rf /var/lib/apt/lists/*
```

For the moment, my image is still big, but I will dig into this later, or someone interested could merge this into the ML Dockerfile.
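If it helps anyone reproducing this: a quick way to confirm the resulting image can actually use the GPU is to ask ONNX Runtime which execution providers it registered. A minimal sketch, assuming the `rocm` poetry group installs a ROCm-enabled onnxruntime build:

```python
# Sanity check inside the built image: list the execution providers this
# ONNX Runtime build supports. If the ROCm build installed correctly,
# ROCMExecutionProvider should appear alongside CPUExecutionProvider.
import onnxruntime as ort

providers = ort.get_available_providers()
print(providers)
assert "ROCMExecutionProvider" in providers, "ROCm EP missing - check the image build"
```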
-
Awesome! Someone tried this a few months ago but ran into this issue, which led to this PR, so I'm glad to hear you aren't hitting it. Regarding concurrency, could you open an ONNX Runtime issue for that? I think it's a thread-safety bug. It's not a blocker for getting this into immich, of course, but it would be good to solve.
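One way to narrow it down before filing the issue: serialize the `run()` calls behind a lock and see whether the crash disappears. A rough sketch under that assumption ("model.onnx" and the input name are placeholders, not immich's actual models):

```python
# If inference only crashes when run() is called concurrently, but works
# when serialized behind a lock, that points at a thread-safety bug in
# the execution provider rather than in the calling code.
import threading

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["ROCMExecutionProvider"])
lock = threading.Lock()

def infer(x: np.ndarray) -> list:
    with lock:  # drop the lock to try to reproduce the concurrent crash
        return session.run(None, {"input": x})

threads = [
    threading.Thread(target=infer, args=(np.zeros((1, 3, 224, 224), np.float32),))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```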
-
Oh, I didn't see the first log entry. It looks like it's actually the same issue. You can add a comment about it to the issue I linked. You can also try building with that PR. Edit: but merge/rebase the PR onto main first.
-
Wow, I'm so grateful for your tips. I managed to build ONNX Runtime using the PR, plus a small patch that was needed after the rebase, and now I can run in parallel (I'm currently running both …). I'll keep you up to date in this channel!
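For anyone who wants to replicate the parallel test, this is roughly its shape: two independent sessions driven concurrently from a thread pool. The model file names and input shapes below are made up for illustration:

```python
# Two independent sessions exercised concurrently on the ROCm EP,
# roughly the workload described above. File names and shapes are
# illustrative only.
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import onnxruntime as ort

clip = ort.InferenceSession("clip.onnx", providers=["ROCMExecutionProvider"])
faces = ort.InferenceSession("faces.onnx", providers=["ROCMExecutionProvider"])

def run(sess: ort.InferenceSession, shape: tuple) -> list:
    x = np.random.rand(*shape).astype(np.float32)
    return sess.run(None, {sess.get_inputs()[0].name: x})

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(run, clip, (1, 3, 224, 224)) for _ in range(4)]
    futures += [pool.submit(run, faces, (1, 3, 640, 640)) for _ in range(4)]
    for f in futures:
        f.result()  # re-raises any inference error from the worker threads
```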
-
I'm finally back, with great news!
-
So, AMD is producing Docker images that contain the ROCm core, Python, and a few other things along the way. How hard would it be to take what is being done in the current ML container and shift its base functionality onto a more AI-worthy compute base?
Happy to be a test monkey and help out where needed.
Let me know what the gaps are and whether it would be hard to transition to an 'immich-ml-accel-cuda' / 'immich-ml-accel-rocm' style of container :)