Not Compatible with Models quantized with updated llama.cpp q4 and q5 quantization. #227
Comments
@abetlen Any ideas on how to use the new models?
@abetlen Updating the version to 0.1.50 resolved the issue. Thanks!
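For anyone checking whether the fix has landed in their environment, a quick way to confirm which release of the bindings is actually installed (standard library only; `llama-cpp-python` is the PyPI distribution name):

```python
from importlib.metadata import version

# Print the installed release of the bindings.
print(version("llama-cpp-python"))  # expect "0.1.50" or newer
```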
When using versions 0.1.51 and 0.1.52 the problem still persists.
When I use old models, though, everything works fine from Python, and the new models run fine when I use llama.cpp directly. Am I missing something, or is there a bug somewhere?
Similarly, same problem here: calling ./main via os.system works, but going through Llama() does not.
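A minimal sketch of the two invocation paths being compared (the model path and prompt are placeholders; `Llama` is the class exported by the llama_cpp package):

```python
import os
from llama_cpp import Llama

MODEL = "./models/7B/ggml-model-q5_1.bin"  # placeholder path to a re-quantized model

# Shelling out to llama.cpp's ./main binary loads the new file fine:
os.system(f'./main -m {MODEL} -p "Hello" -n 16')

# Loading the same file through the bindings fails on the new format:
llm = Llama(model_path=MODEL)  # raises until the bindings support the new file version
print(llm("Hello", max_tokens=16))
```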
There are now three versions of llama model files, following rapid development of the quantization code in llama.cpp. About a week ago it changed to v2, which required re-quantizing models. Below is a quick way to verify your model versions.
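A minimal sketch of such a check, assuming the header layout llama.cpp writes (a little-endian uint32 magic, followed by a uint32 file version for the versioned 'ggmf'/'ggjt' formats):

```python
import struct
import sys

def ggml_file_version(path: str) -> str:
    """Report the format and version of a llama.cpp model file from its header."""
    with open(path, "rb") as f:
        magic, version = struct.unpack("<II", f.read(8))
    if magic == 0x67676A74:        # 'ggjt' (versioned, mmap-able format)
        return f"ggjt v{version}"  # v1 = old quantization, v2/v3 = updated formats
    if magic == 0x67676D66:        # 'ggmf' (older versioned format)
        return f"ggmf v{version}"
    if magic == 0x67676D6C:        # 'ggml' (original format, no version field)
        return "ggml (unversioned)"
    return f"unknown magic {magic:#x}"

if __name__ == "__main__":
    print(ggml_file_version(sys.argv[1]))
```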
Works with commit 08737ef, but not when quantizing with newer versions of ggerganov/llama.cpp. I guess llama-cpp-python is not yet ready to work with version 3?
Not compatible with models quantized with the updated llama.cpp q4 and q5 quantization released in llama.cpp PR 1405.