[ZeroPadding] revert zero_padding #8973 #9003
Conversation
Thanks for your contribution!
Codecov Report

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           develop    #9003      +/-   ##
===========================================
- Coverage    54.78%   54.14%   -0.65%
===========================================
  Files          647      650       +3
  Lines       102502   103871    +1369
===========================================
+ Hits         56160    56237      +77
- Misses       46342    47634    +1292
```

View full report in Codecov by Sentry.
```diff
@@ -53,42 +53,7 @@ class ZeroPadding:
         ]

-    @classmethod
-    def _pad_batch_records_to_max_length(cls, batch_records, max_length, pad_token=0):
```
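For context, a helper with this signature would typically right-pad every record in a batch to `max_length`. A minimal hypothetical sketch of that behavior (this is not the removed implementation; the `-100` label ignore index is an assumption):

```python
@classmethod
def _pad_batch_records_to_max_length(cls, batch_records, max_length, pad_token=0):
    # Right-pad each record's token ids (and labels, if present) to max_length.
    for record in batch_records:
        pad_len = max_length - len(record["input_ids"])
        record["input_ids"] = record["input_ids"] + [pad_token] * pad_len
        if "labels" in record:
            # -100 is the usual loss ignore index; the removed code may have differed.
            record["labels"] = record["labels"] + [-100] * pad_len
    return batch_records
```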
Was this code moved somewhere else, or is it no longer needed at all?
It is no longer needed; we now use the tokenizer's pad to handle the padding instead.
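A minimal sketch of padding through the tokenizer rather than a custom helper, assuming a PaddleNLP tokenizer with a Hugging-Face-style `pad` method; the checkpoint name and batch contents are purely illustrative:

```python
from paddlenlp.transformers import AutoTokenizer

# Illustrative checkpoint; any PaddleNLP-supported tokenizer works the same way.
tokenizer = AutoTokenizer.from_pretrained("facebook/llama-7b")

batch = [
    {"input_ids": [1, 2, 3, 4]},
    {"input_ids": [5, 6]},
]

# Pad every example to a fixed length using the tokenizer's own pad token,
# which replaces the removed _pad_batch_records_to_max_length helper.
padded = tokenizer.pad(batch, padding="max_length", max_length=8)
print(padded["input_ids"])
```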
```python
    padding = "max_length"
else:
    max_length = None
    padding = True
```
Hmm, in what case do we not pad to the maximum length? Do you mean the maximum length changes between batches? (That case has repeatedly been reported to cause GPU memory leaks.)
But if zero padding is not used, there is no need to pad to the maximum length.
sequence_parallel should pad to the maximum length; otherwise the varying intermediate sequence length may not be divisible by tensor_parallel_degree.
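A hedged sketch of the padding decision being discussed (flag and variable names are illustrative, not the exact code in `llm/run_finetune.py`): pad to a fixed `max_length` when sequence parallelism or zero padding is enabled, so the sequence length stays constant and divisible by `tensor_parallel_degree`; otherwise pad dynamically to the longest example in each batch.

```python
# Illustrative flag names; the real script reads these from its argument dataclasses.
if training_args.sequence_parallel or data_args.zero_padding:
    # Fixed-length padding keeps every batch's sequence length constant and
    # divisible by tensor_parallel_degree, avoiding shape changes across steps.
    max_length = data_args.max_length
    padding = "max_length"
else:
    # Dynamic padding: pad only up to the longest sequence in the batch.
    max_length = None
    padding = True
```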
lgtm
PR types
Bug fixes

PR changes
Others

Description
Set `max_length` for `sequence_parallel` in `llm/run_finetune.py`.