Distributed Inference and Imatrix Creation #529
oldgithubman started this conversation in Ideas
@jart
According to:
ggerganov/llama.cpp#7488 (comment)
you've implemented BF16 CUDA support, while llama.cpp still hasn't. I've been waiting for BF16 CUDA support for months, and the llama.cpp devs seem hell-bent on not implementing it (at this point, I'm starting to think it's specifically to spite me - they also banned me without explanation).
Do you plan to implement the ability to create imatrices (importance matrices)? BF16 CUDA support would save me a ton of resources (literally weeks at this point).
Also, do you plan to implement distributed inference? This is the main feature preventing me from forgetting about llama.cpp.
Combining the two would be even cooler (distributed imatrix creation).