Distributed Inference and Imatrix Creation #529
oldgithubman started this conversation in Ideas
@jart
According to:
ggerganov/llama.cpp#7488 (comment)
you've implemented BF16 CUDA support, while llama.cpp still hasn't. I've been waiting for BF16 CUDA support for months, and the llama.cpp devs seem hell-bent on not implementing it (at this point, I'm starting to think it's specifically to spite me - they also banned me without explanation).
Do you plan to implement the ability to create imatrices (importance matrices)? BF16 CUDA support would save me a ton of resources (literally weeks at this point).
Also, do you plan to implement distributed inference? This is the main feature preventing me from forgetting about llama.cpp.
Combining the two would be even cooler (distributed imatrix creation).