
quantize : fail fast on write errors #3521

Merged 1 commit into ggerganov:master on Oct 7, 2023

Conversation

cebtenzzre (Collaborator)

This causes quantize to fail early with an error message if you run out of disk space, instead of appearing to succeed while silently producing a corrupt output file.

[ 170/ 543]                 blk.18.ffn_up.weight - [ 6656, 17920,     1,     1], type =    f16, quantizing to q5_0 .. size =   227.50 MB ->    78.20 MB | hist: 0.078 0.062 0.061 0.059 0.059 0.062 0.067 0.070 0.078 0.063 0.060 0.056 0.054 0.055 0.057 0.059 
llama_model_quantize: failed to quantize: basic_ios::clear: iostream error
main: failed to quantize model from 'alpacino-33B.f16.gguf'
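The change behind this is small: quantize writes its output through a std::ofstream, and enabling the stream's exception flags makes any failed write (for example, a full disk) throw std::ios_base::failure immediately instead of only setting failbit/badbit silently. That is where the "basic_ios::clear: iostream error" text above comes from; it is libstdc++'s what() message for the failure thrown by clear(). Below is a minimal sketch of the technique, not the exact diff from this PR; the file name and buffer are made up for illustration:

```cpp
#include <cstdio>
#include <exception>
#include <fstream>
#include <vector>

int main() {
    // Open the output file, then enable exceptions so any failed write
    // throws std::ios_base::failure right away.
    std::ofstream fout("out.gguf", std::ios::binary);
    fout.exceptions(std::ofstream::failbit | std::ofstream::badbit);

    std::vector<char> buf(1 << 20, 0); // placeholder tensor data

    try {
        fout.write(buf.data(), static_cast<std::streamsize>(buf.size()));
        fout.close(); // close() flushes; deferred write errors also throw here
    } catch (const std::exception & err) {
        // With libstdc++ this prints "basic_ios::clear: iostream error"
        fprintf(stderr, "failed to quantize: %s\n", err.what());
        return 1;
    }
    return 0;
}
```

Catching std::exception (rather than std::ios_base::failure specifically) keeps the error path simple and matches the kind of top-level catch that produced the log output above.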

@ggerganov ggerganov merged commit f1782c6 into ggerganov:master Oct 7, 2023
36 checks passed
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request on Oct 12, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  py : change version of numpy requirement to 1.24.4 (ggerganov#3515)
  quantize : fail fast on write errors (ggerganov#3521)
  metal : support default.metallib load & reuse code for swift package (ggerganov#3522)
  llm : support Adept Persimmon 8B (ggerganov#3410)
  Fix for ggerganov#3454 (ggerganov#3455)
  readme : update models, cuda + ppl instructions (ggerganov#3510)
  server : docs fix default values and add n_probs (ggerganov#3506)