It's good news to hear that llama.cpp supports this feature. However, I have some disagreements.
If the user does not have CUDA installed, using the CUDA backend makes little sense. Besides, the CPU backend is much smaller than the CUDA backend, which may be desirable in some situations. I'd prefer to keep these two backends separate. :)
As of this PR to llama.cpp, the CUDA binaries are capable of running on CPU only, as long as `n_gpu_layers = 0`. This might mean that we can significantly simplify our distribution of binaries by removing the CPU-only variants and shipping only the CUDA ones.
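For illustration, here is a minimal sketch of what CPU-only execution with a CUDA-enabled build looks like, using the llama-cpp-python bindings (this project may expose the setting differently; the equivalent llama.cpp CLI flag is `-ngl 0`). The model path is a placeholder.

```python
from llama_cpp import Llama

# CUDA-enabled build, but with zero layers offloaded to the GPU,
# so inference runs entirely on the CPU.
llm = Llama(
    model_path="models/model.gguf",  # placeholder path
    n_gpu_layers=0,                  # 0 => no GPU offload, CPU-only execution
)

result = llm("Q: What is 2 + 2? A:", max_tokens=16)
print(result["choices"][0]["text"])
```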