Directly use low bit gguf for continuous training? #10199
Any clues?
Answered by BarfingLemurs, Nov 7, 2024
Answer selected by FNsi
Here's a quantized GGUF conversion script: https://github.com/PygmalionAI/aphrodite-engine/blob/main/examples/gguf_to_torch.py
Use transformers for the training.
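For reference, a minimal sketch of the transformers route, assuming a recent transformers release with built-in GGUF loading (it needs the `gguf` package installed; the repo and file names below are placeholders, not from this thread). Note that the GGUF weights are dequantized to full precision on load, so the subsequent training is ordinary torch training, not training directly in Q8_0:

```python
# Sketch: load a quantized GGUF checkpoint into a torch model via
# transformers' GGUF support, then fine-tune as usual.
# Requires: pip install transformers gguf torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "openlm-research/open_llama_3b-GGUF"  # hypothetical repo name
gguf_file = "open_llama_3b.Q8_0.gguf"           # hypothetical file name

# Both calls dequantize the GGUF tensors to full-precision torch weights.
tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)

model.train()  # from here, use Trainer / peft / a plain training loop
```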
(See also the earlier discussion: Finetuning/Training gguf models, #2632.)
I tried Q8_0 training on OpenLLaMA 3B.
Perhaps you are looking for quantization-aware training, like this one? https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8
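If low-bit fine-tuning is the goal, a common recipe is QLoRA-style training: freeze the base weights in 4-bit and train LoRA adapters on top with bitsandbytes and peft. This is a sketch of that general technique, not the exact recipe behind the linked meta-llama checkpoint (which was produced with quantization-aware training); the model name and hyperparameters are illustrative:

```python
# Sketch: QLoRA-style fine-tuning — 4-bit frozen base model plus trainable
# LoRA adapters. Requires: pip install transformers peft bitsandbytes
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute dtype for matmuls
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",      # gated repo; assumes access
    quantization_config=bnb_config,
)

lora_config = LoraConfig(
    r=16,                                    # illustrative rank
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()           # only adapters are trainable
```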