[DEV] Support sync params in tensor parallel config #8311
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8311      +/-   ##
===========================================
- Coverage    55.33%   55.32%   -0.01%
===========================================
  Files          614      614
  Lines        95341    95353      +12
===========================================
  Hits         52753    52753
- Misses       42588    42600      +12
View full report in Codecov by Sentry.
LGTM
ef09db4 to 69f7d22
69f7d22 to 4d6e782
LGTM
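Please make sure that, with the official Paddle release, no error is raised when these features are not enabled.
sync_param : in the optimizer stage, use broadcast to synchronize all parameters with is_distributed=False
sync_grad : in the optimizer stage, use broadcast to synchronize the gradients of all parameters with is_distributed=False
sync_moment : in the optimizer stage, use broadcast to synchronize the momentum of all parameters with is_distributed=False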
Aren't these enabled by default?
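When PaddleNLP does not configure these parameters, the current logic is unaffected.
In the framework, sync_param is enabled by default, while sync_grad and sync_moment are disabled by default. The names of the parameters to synchronize default to sync_param_name = ["embedding", "layer_norm", ".b_"], and no other parameters are synchronized.
When PaddleNLP is configured to turn these switches on, all parameters are forcibly synchronized.
The code contains comments explaining this: https://github.com/PaddlePaddle/PaddleNLP/pull/8311/files#diff-477a2a51a1a5694f5db999c8695a3f6ec8fd4f08ded299fb66176651e9d6ebadR1100-R1102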
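To illustrate the difference between the framework default and the forced mode described above, here is a minimal pure-Python sketch. It does not call Paddle. `Param`, `select_params_to_sync`, and the sample parameter names are hypothetical, and substring matching on parameter names is an assumption about how `sync_param_name` is applied. Using `[""]` as a match-everything pattern is likewise just one way to model "synchronize all parameters", relying on the fact that the empty string is a substring of every string.

```python
from dataclasses import dataclass

# Default name patterns quoted in the reply above.
DEFAULT_SYNC_PARAM_NAME = ["embedding", "layer_norm", ".b_"]


@dataclass
class Param:
    """Hypothetical stand-in for a model parameter."""
    name: str
    is_distributed: bool


def select_params_to_sync(params, name_patterns):
    """Return names of non-distributed params whose name contains any pattern.

    Assumption: patterns are matched as plain substrings of the name.
    """
    return [
        p.name
        for p in params
        if not p.is_distributed and any(pat in p.name for pat in name_patterns)
    ]


# Hypothetical parameter names, for illustration only.
params = [
    Param("embedding_0.w_0", False),
    Param("layer_norm_0.w_0", False),
    Param("linear_0.b_0", False),
    Param("linear_0.w_0", False),
    Param("column_parallel_linear_0.w_0", True),  # sharded, never broadcast
]

# Default behaviour: only names matching the three default patterns.
default_selection = select_params_to_sync(params, DEFAULT_SYNC_PARAM_NAME)

# Forced behaviour: "" is a substring of every name, so all
# non-distributed parameters are selected.
forced_selection = select_params_to_sync(params, [""])

print(default_selection)
print(forced_selection)
```

In this sketch `linear_0.w_0` is skipped under the default patterns but included once synchronization is forced, while the sharded parameter is excluded in both cases.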
PR types
New features
PR changes
Others
Description
Migrated from #8306. Adds support for configuring a forced synchronization strategy for mp (model parallel) parameters.
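As a rough sketch of how such switches could be read from a tensor parallel config, the snippet below parses an option string into explicit overrides. It assumes the options are passed as a space-separated string. `parse_sync_options` is a hypothetical helper, not a PaddleNLP function. Only the switches that are explicitly listed are returned, so an empty config leaves the framework defaults untouched.

```python
# The three switches discussed in this PR.
SYNC_OPTIONS = ("sync_param", "sync_grad", "sync_moment")


def parse_sync_options(tensor_parallel_config: str) -> dict:
    """Return only the sync switches that were explicitly requested.

    Assumption: options arrive as a space-separated string. Switches that
    are not listed are omitted, so the caller keeps the framework defaults.
    """
    requested = set(tensor_parallel_config.split())
    return {name: True for name in SYNC_OPTIONS if name in requested}


# Nothing configured: no overrides, current logic is unaffected.
no_overrides = parse_sync_options("")

# Two switches configured: only those two are overridden.
some_overrides = parse_sync_options("sync_param sync_moment")

print(no_overrides)
print(some_overrides)
```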