After updating to the latest transformers library (and the matching trl library), which fixes the serious multi-GPU gradient accumulation bug, the DPO training loss becomes several times larger than before #5747
Comments
Is the trl library here the main branch? |
Yes. But I don't think TRL is the cause; the main reason should be the transformers library update. I switched to another repo and saw the same sharp increase in loss.
Thanks.
|
+1. For SFT with bs=4 and ga=4 the initial loss is 4.x, while with bs=16 it is only 1.x |
If the model still converges normally, isn't that fine? Or is training actually broken? They changed the denominator normalization of the cross-entropy loss, so a numerical change in the loss value is expected |
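The normalization change described above can be illustrated with plain arithmetic. This is only a sketch with made-up per-token loss values; `num_items_in_batch` stands for the total token count of the accumulated batch that the fixed normalization divides by:

```python
# Illustrative per-token cross-entropy values for two micro-batches of one
# gradient-accumulation step. The token counts are deliberately unequal,
# as happens when padding differs between micro-batches.
mb1 = [2.0, 4.0]                      # 2 tokens
mb2 = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # 6 tokens

# Old behavior: reduce each micro-batch with "mean", then average across
# accumulation steps -- tokens in the small micro-batch weigh 3x more.
old = (sum(mb1) / len(mb1) + sum(mb2) / len(mb2)) / 2   # (3.0 + 1.0) / 2

# Fixed behavior: sum all per-token losses and divide once by the total
# token count of the accumulated batch (num_items_in_batch).
new = (sum(mb1) + sum(mb2)) / (len(mb1) + len(mb2))     # 12.0 / 8

print(old, new)  # 2.0 1.5
```

The two normalizations agree only when every micro-batch has the same token count, which is why the reported loss curve (and the loss-vs-LR tradeoff) shifts after the fix.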
Convergence is normal, but once the loss values change, the corresponding optimal LR also has to be retuned, which is inconvenient. I'm not sure whether this is working as designed |
Why did the original author comment that out? And does every model now have to be changed accordingly to pass the loss parameters? |
A transformers PR has already fixed this issue; see huggingface/transformers#34263 for details |
I'll leave this open for now and take a closer look later |
I have re-installed and fine-tuned QWen2VL on LLaMA-Factory with LoRA and gradient accumulation on 8x H100 NVL (with the version check forcibly ignored). The training loss is still way too high, nearly 10 times higher than before, although the eval loss is as small as before. Do you know why? Thanks |
It should have been fixed here:
In your forward() function, add a "loss_kwargs" parameter to the signature; then the loss value will be "normal" again. Alternatively, after installing transformers, go to trainer.py and hardcode 'model_accepts_loss_kwargs' to True |
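The signature-based detection this workaround targets can be sketched as follows. The toy classes and the `accepts_loss_kwargs` helper below are illustrative, not the real Trainer code; the idea is that transformers inspects forward() for a variadic keyword parameter before deciding to pass `num_items_in_batch`:

```python
import inspect

class OldModel:
    # No way to receive extra loss kwargs -> Trainer falls back to the
    # old per-micro-batch normalization, and logged loss values change.
    def forward(self, input_ids, labels=None):
        ...

class PatchedModel:
    # Adding **loss_kwargs lets the signature inspection succeed, so the
    # Trainer can pass num_items_in_batch for the fixed normalization.
    def forward(self, input_ids, labels=None, **loss_kwargs):
        ...

def accepts_loss_kwargs(model):
    # Hypothetical stand-in for the Trainer's model_accepts_loss_kwargs
    # check: look for a **kwargs-style parameter on forward().
    params = inspect.signature(model.forward).parameters.values()
    return any(p.kind == inspect.Parameter.VAR_KEYWORD for p in params)

print(accepts_loss_kwargs(OldModel()))      # False
print(accepts_loss_kwargs(PatchedModel()))  # True
```

This is also why hardcoding 'model_accepts_loss_kwargs' to True has the same effect: it skips the signature check entirely.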
Reminder
System Info
8XH100
Reproduction
After updating to the latest transformers & trl libraries from the master branch, the DPO training loss changed from the previous 1.0 -> 0.3 to 9 -> 3
See huggingface/transformers#34191 for details
Expected behavior
No response
Others
No response