Various Tricks for Training VisualGLM #5
Replies: 6 comments 1 reply
-
In large-model training, the numerical precision used (int4, int8, float16, ...) matters relatively little; the parameter count (6B, 13B, ...) has a much larger impact on model performance.
-
At the moment, the GLM series only offers a 6B model.
-
With QLoRA the minimum GPU memory requirement is 9.8 GB. Multi-GPU training is also supported: change --include localhost:0 in the script to the GPUs you want to use, e.g. --include localhost:0,1,2, or remove the flag entirely to use all GPUs (see the sketch below): THUDM/VisualGLM-6B@5d368f6#commitcomment-115614063
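A minimal sketch of that change, assuming the stock QLoRA finetune script launches training through the deepspeed launcher with finetune_visualglm.py as the entry point and collects the remaining options in a shell variable; adapt the script name and options to your local copy:

```bash
#!/bin/bash
# Sketch: switching the QLoRA finetune launch between GPU sets.
# $train_options stands in for the real training arguments from the stock script.
train_options="--use_qlora"   # placeholder

# Default: single GPU (GPU 0), roughly 9.8 GB of VRAM with QLoRA.
deepspeed --include localhost:0 finetune_visualglm.py ${train_options}

# GPUs 0, 1 and 2 instead:
# deepspeed --include localhost:0,1,2 finetune_visualglm.py ${train_options}

# Every visible GPU: drop --include entirely.
# deepspeed finetune_visualglm.py ${train_options}
```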
-
Multi-turn dialogue finetuning is supported: THUDM/VisualGLM-6B#118
-
On the image size used for finetuning: THUDM/VisualGLM-6B#82
-
If you have a lot of data, consider increasing the number of trainable parameters; the current training script only applies LoRA to 2 layers (see the sketch below): THUDM/VisualGLM-6B#61 (comment)
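A rough sketch of one way to enlarge the trainable LoRA footprint, assuming the finetune script exposes --layer_range (the layer indices that receive LoRA adapters) and --lora_rank, as the repo's finetune script did at the time of the linked comment; the flag names and layer indices here are assumptions, so check them against your version of the script and the linked comment:

```bash
#!/bin/bash
# Stock setting reportedly puts LoRA on just two layers (e.g. --layer_range 0 14).
# With more data, spread LoRA over more layers and/or raise the rank.
train_options="--use_lora"   # placeholder for the remaining stock options

deepspeed --include localhost:0 finetune_visualglm.py ${train_options} \
    --lora_rank 32 \
    --layer_range 0 2 4 6 8 10 12 14 16 18 20 22 24 26   # assumed flag; adjust indices
```

More adapted layers and a higher rank mean more trainable parameters, which generally needs more data (and more VRAM) to pay off.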