[Auto Parallel] Support dynamic semi-auto training in Llama2 model #7851
Thanks for your contribution!
Force-pushed from e16875e to de9bb93.
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #7851     +/-   ##
===========================================
- Coverage    56.96%   56.67%   -0.29%
===========================================
  Files          587      588       +1
  Lines        88647    89243     +596
===========================================
+ Hits         50494    50580      +86
- Misses       38153    38663     +510
View full report in Codecov by Sentry.
Force-pushed from cf7d444 to b9acdf0.
Force-pushed from b9acdf0 to 327d788.
auto variance_shape = x_shape;
variance_shape.pop_back();
auto invvar = paddle::empty(variance_shape, paddle::DataType::FLOAT32, place);
What is the reason for these two changes?
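The fused_ln change is needed because the original infer shape for variance was wrong. Under dynamic-graph manual parallelism this simply did not raise an error; under dynamic-graph semi-auto parallelism, once the sharding inference rules are applied, it crashes, so it has to be fixed. The layer norm operator in the framework received the same fix; see PR-58776.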
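To illustrate the shape relationship the fix relies on, here is a minimal NumPy sketch rather than the actual fused_ln kernel. It assumes normalization over the last axis only, which is what `variance_shape.pop_back()` in the snippet above suggests; the helper names `layer_norm_stats_shape` and `layer_norm_last_axis` and the `eps` default are hypothetical.

```python
import numpy as np


def layer_norm_stats_shape(x_shape):
    """Shape of the per-row statistics (mean, inverse variance).

    Assumption: normalization runs over the last axis only, so the
    statistics keep every dimension of the input except the last one.
    """
    return list(x_shape[:-1])


def layer_norm_last_axis(x, eps=1e-5):
    """Reference layer norm over the last axis (hypothetical helper)."""
    mean = x.mean(axis=-1)                 # shape: x.shape[:-1]
    var = x.var(axis=-1)                   # shape: x.shape[:-1]
    invvar = 1.0 / np.sqrt(var + eps)      # shape: x.shape[:-1]
    y = (x - mean[..., None]) * invvar[..., None]
    return y, mean, invvar


x = np.random.rand(2, 3, 8).astype(np.float32)
y, mean, invvar = layer_norm_last_axis(x)
print(layer_norm_stats_shape(x.shape), invvar.shape, y.shape)
```

If the declared shape of the statistics disagrees with what the kernel produces, a shape-based sharding rule has nothing consistent to propagate, which matches the failure described in the reply.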
return outputs


class LlamaDecoderLayerAuto(nn.Layer):
Don't these model names conflict with the ones in modeling_auto?
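These model names are used only inside this file. They are not added to the __all__ list, so they are not exposed to users or to the training side. This is also only an intermediate state: once the unified dynamic/static semi-auto execution code is integrated, only modeling_3D_auto.py will be kept, and the original purely static semi-auto modeling_auto.py will be deleted.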
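The reply relies on standard Python behavior: `from module import *` binds only the names listed in the module's `__all__`. A small self-contained sketch of that rule follows; the module name `demo_modeling_3d_auto` and the exported class name `LlamaForCausalLM3DAuto` are hypothetical stand-ins, not taken from this PR.

```python
import sys
import types

# Hypothetical stand-in for a module such as modeling_3D_auto.py.
source = '''
__all__ = ["LlamaForCausalLM3DAuto"]

class LlamaDecoderLayerAuto:      # internal helper, not listed in __all__
    pass

class LlamaForCausalLM3DAuto:     # public entry point, listed in __all__
    pass
'''

# Build the module in memory and register it so it can be imported by name.
demo = types.ModuleType("demo_modeling_3d_auto")
exec(source, demo.__dict__)
sys.modules["demo_modeling_3d_auto"] = demo

# Run a star import into a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_modeling_3d_auto import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Only the class listed in `__all__` lands in the importing namespace, so an internal class with the same name in two sibling modules does not collide through star imports in the package `__init__.py`.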
@@ -14,6 +14,7 @@

from .configuration import *
from .modeling import *
from .modeling_3D_auto import *
Will some of the names conflict?
Please see the reply to the previous comment.
LGTM
LGTM
PR types
Bug fixes
PR changes
Others
Description
[Auto Parallel] Support dynamic semi-auto training in Llama2 model