How much VRAM would the LLaMA 3 400B model require for Chinese-LLaMA-style training? #563
StephennFernandes started this conversation in General
Hi there, I just wanted to get an estimate of how much VRAM would be needed for Chinese-LLaMA-style training on the 400B LLaMA model. Since extending the tokenizer also increases the vocab_size in the model parameters, those extra values would need to be accounted for as well.
I currently have 8x A6000s and wanted to know if these would suffice. Additionally, can I load the model in 4-bit and run the training script on it?
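For context, here is a rough back-of-envelope calculation I sketched. The byte-per-parameter figures are common rules of thumb, not measurements: mixed-precision AdamW full fine-tuning is often estimated at ~16 bytes/param (weights, gradients, optimizer states, fp32 master copy, excluding activations), and 4-bit quantization holds frozen base weights at ~0.5 bytes/param; the adapter overhead figure is an assumption.

```python
# Back-of-envelope VRAM estimates for training an N-parameter model.
# All byte counts are rules of thumb, not measured values.

def full_finetune_gb(n_params: float, bytes_per_param: float = 16.0) -> float:
    """Approx. VRAM (GB) for full mixed-precision AdamW fine-tuning."""
    return n_params * bytes_per_param / 1e9

def qlora_gb(n_params: float, weight_bytes: float = 0.5,
             adapter_overhead_gb: float = 20.0) -> float:
    """Approx. VRAM (GB) for 4-bit QLoRA: frozen quantized base weights
    plus an assumed overhead for adapters, optimizer states, activations."""
    return n_params * weight_bytes / 1e9 + adapter_overhead_gb

n = 400e9  # 400B parameters (vocab extension adds slightly more)
print(f"Full fine-tune: ~{full_finetune_gb(n):,.0f} GB")  # ~6,400 GB
print(f"4-bit QLoRA:    ~{qlora_gb(n):,.0f} GB")          # ~220 GB
print(f"8x A6000 @ 48 GB: {8 * 48} GB total")             # 384 GB
```

By this sketch, full fine-tuning at 400B is far beyond 8x A6000s (384 GB total), and even a 4-bit parameter-efficient setup would be tight once activations and longer sequence lengths are included.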