How much VRAM would the LLaMA 3 400B model require for Chinese-LLaMA-style training? #563
StephennFernandes started this conversation in General
Hi there, I just wanted to get an estimate of how much VRAM would be needed for Chinese-LLaMA-style training on the 400B LLaMA model. Since extending the tokenizer also increases the vocab_size in the model parameters, those extra values would need to be accounted for as well.
I currently have 8x A6000s and wanted to know if these would suffice. Additionally, can I load the model in 4-bit and run the training script on it?
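For context, here is a rough back-of-envelope calculation I sketched. The byte-per-parameter figures are common rules of thumb, not measurements: mixed-precision AdamW full fine-tuning is often estimated at ~16 bytes/param (weights, gradients, optimizer states, fp32 master copy, excluding activations), and 4-bit quantization holds frozen base weights at ~0.5 bytes/param; the adapter overhead figure is an assumption.

```python
# Back-of-envelope VRAM estimates for training an N-parameter model.
# All byte counts are rules of thumb, not measured values.

def full_finetune_gb(n_params: float, bytes_per_param: float = 16.0) -> float:
    """Approx. VRAM (GB) for full mixed-precision AdamW fine-tuning."""
    return n_params * bytes_per_param / 1e9

def qlora_gb(n_params: float, weight_bytes: float = 0.5,
             adapter_overhead_gb: float = 20.0) -> float:
    """Approx. VRAM (GB) for 4-bit QLoRA: frozen quantized base weights
    plus an assumed overhead for adapters, optimizer states, activations."""
    return n_params * weight_bytes / 1e9 + adapter_overhead_gb

n = 400e9  # 400B parameters (vocab extension adds slightly more)
print(f"Full fine-tune: ~{full_finetune_gb(n):,.0f} GB")  # ~6,400 GB
print(f"4-bit QLoRA:    ~{qlora_gb(n):,.0f} GB")          # ~220 GB
print(f"8x A6000 @ 48 GB: {8 * 48} GB total")             # 384 GB
```

By this sketch, full fine-tuning at 400B is far beyond 8x A6000s (384 GB total), and even a 4-bit parameter-efficient setup would be tight once activations and longer sequence lengths are included.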