Latest release crashes on start #903
Comments
Can confirm, I get the exact same error. Rolling back to the linked release works.
Same issue, introduced in #709:

    static const char *llama_ftype_name(enum llama_ftype ftype) {
        switch (ftype) {
            case LLAMA_FTYPE_ALL_F32:     return "all F32";
            case LLAMA_FTYPE_MOSTLY_F16:  return "mostly F16";
            case LLAMA_FTYPE_MOSTLY_Q4_0: return "mostly Q4_0";
            case LLAMA_FTYPE_MOSTLY_Q4_1: return "mostly Q4_1";
            default:                      LLAMA_ASSERT(false);
        }
    }

ftype for my q4_1 model is 4 when this function is called, which matches none of the cases above, so the default assertion fires. This is a GPTQ model converted to q4_1, and interestingly, the convert-gptq-to-ggml.py script does do

Ah, so #801 removed checking for GPTQ models.

For the actual fix, I guess another llama_ftype could be added?

Temp fix for anyone waiting:

    static const char *llama_ftype_name(enum llama_ftype ftype) {
        switch (ftype) {
            case LLAMA_FTYPE_ALL_F32:     return "all F32";
            case LLAMA_FTYPE_MOSTLY_F16:  return "mostly F16";
            case LLAMA_FTYPE_MOSTLY_Q4_0: return "mostly Q4_0";
            case LLAMA_FTYPE_MOSTLY_Q4_1: return "mostly Q4_1";
    +       case 4:                       return "mostly Q4_1 and some f16";
            default:                      LLAMA_ASSERT(false);
        }
    }

There is no negative effect from just bypassing this assertion; the f16/ftype hparam isn't used anymore.
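For anyone wondering what the "another llama_ftype" idea could look like, here is a minimal standalone sketch. It is not the project's actual fix. The numeric enum values, the name `LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16`, and the choice to return a fallback string instead of asserting are all assumptions made for illustration.

```c
/* Standalone sketch, not the real llama.cpp header.
 * ASSUMPTION: the numeric values 0..3 are illustrative only. */
enum llama_ftype {
    LLAMA_FTYPE_ALL_F32              = 0,
    LLAMA_FTYPE_MOSTLY_F16           = 1,
    LLAMA_FTYPE_MOSTLY_Q4_0          = 2,
    LLAMA_FTYPE_MOSTLY_Q4_1          = 3,
    /* HYPOTHETICAL name for the value 4 that GPTQ-converted files carry. */
    LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16 = 4
};

/* Maps an ftype to a display name. Unknown values get a fallback string
 * instead of tripping an assertion, since the name is only used for
 * printing. */
static const char *llama_ftype_name(enum llama_ftype ftype) {
    switch (ftype) {
        case LLAMA_FTYPE_ALL_F32:              return "all F32";
        case LLAMA_FTYPE_MOSTLY_F16:           return "mostly F16";
        case LLAMA_FTYPE_MOSTLY_Q4_0:          return "mostly Q4_0";
        case LLAMA_FTYPE_MOSTLY_Q4_1:          return "mostly Q4_1";
        case LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16: return "mostly Q4_1, some F16";
        default:                               return "unknown, may not work";
    }
}
```

With a named value the switch no longer needs a bare `case 4`, and the non-asserting default means a future unrecognized ftype would only affect the printed label rather than abort the load.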
Yes, I am having this issue as well, with GPTQ models.
If you comment out |
For now I just rolled back to the commit before with:
|
My apologies, I assumed that the "4" format was no longer supported by the new loader code in #801, that's why I didn't make a value in |
I'm experiencing this error. Does anyone know what the issue is? I think this is a bug, since one of the previous releases, master-2663d2c, doesn't have this problem.