
quantize : fail fast on write errors #3521

Merged 1 commit into ggerganov:master on Oct 7, 2023

Conversation

cebtenzzre (Collaborator)

This causes quantize to fail early with an error message if you run out of disk space, instead of appearing to succeed while silently producing a corrupt output file.

[ 170/ 543]                 blk.18.ffn_up.weight - [ 6656, 17920,     1,     1], type =    f16, quantizing to q5_0 .. size =   227.50 MB ->    78.20 MB | hist: 0.078 0.062 0.061 0.059 0.059 0.062 0.067 0.070 0.078 0.063 0.060 0.056 0.054 0.055 0.057 0.059 
llama_model_quantize: failed to quantize: basic_ios::clear: iostream error
main: failed to quantize model from 'alpacino-33B.f16.gguf'
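The change behind this is small: quantize writes its output through a std::ofstream, and enabling the stream's exception flags makes any failed write (for example, a full disk) throw std::ios_base::failure immediately instead of only setting failbit/badbit silently. That is where the "basic_ios::clear: iostream error" text above comes from; it is libstdc++'s what() message for the failure thrown by clear(). Below is a minimal sketch of the technique, not the exact diff from this PR; the file name and buffer are made up for illustration:

```cpp
#include <cstdio>
#include <exception>
#include <fstream>
#include <vector>

int main() {
    // Open the output file, then enable exceptions so any failed write
    // throws std::ios_base::failure right away.
    std::ofstream fout("out.gguf", std::ios::binary);
    fout.exceptions(std::ofstream::failbit | std::ofstream::badbit);

    std::vector<char> buf(1 << 20, 0); // placeholder tensor data

    try {
        fout.write(buf.data(), static_cast<std::streamsize>(buf.size()));
        fout.close(); // close() flushes; deferred write errors also throw here
    } catch (const std::exception & err) {
        // With libstdc++ this prints "basic_ios::clear: iostream error"
        fprintf(stderr, "failed to quantize: %s\n", err.what());
        return 1;
    }
    return 0;
}
```

Catching std::exception (rather than std::ios_base::failure specifically) keeps the error path simple and matches the kind of top-level catch that produced the log output above.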

@ggerganov ggerganov merged commit f1782c6 into ggerganov:master Oct 7, 2023
36 checks passed
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request on Oct 12, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  py : change version of numpy requirement to 1.24.4 (ggerganov#3515)
  quantize : fail fast on write errors (ggerganov#3521)
  metal : support default.metallib load & reuse code for swift package (ggerganov#3522)
  llm : support Adept Persimmon 8B (ggerganov#3410)
  Fix for ggerganov#3454 (ggerganov#3455)
  readme : update models, cuda + ppl instructions (ggerganov#3510)
  server : docs fix default values and add n_probs (ggerganov#3506)