feat: DPO support for global padding of seq_len to a multiple #386
Conversation
LGTM, just a few minor questions. Also, to confirm: this PR just lets us pad DPO sequences to a certain multiple; it is not the PR that adds sequence parallelism support to DPO, correct?
Just one additional question.
Sounds good.
Signed-off-by: Terry Kong <terryk@nvidia.com>
dpo pad fix if none
Signed-off-by: Terry Kong <terryk@nvidia.com>
rm variable_seq_len && fix comment on pad_multiple
Signed-off-by: Terry Kong <terryk@nvidia.com>
rm not resolver
Signed-off-by: Terry Kong <terryk@nvidia.com>
typo
Signed-off-by: Terry Kong <terryk@nvidia.com>
What does this PR do?
`pad_length_to_multiple_of` will pad all minibatches to the same length if > 0 (see the illustrative sketch after the list below). If == 0, the behavior is the same as before.

Needed for:
Rebase stack
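The following is a minimal, illustrative sketch of the pad-to-multiple scheme described above. It is not the code from this PR; the helper names `round_up_to_multiple` and `pad_batch_to_length` are hypothetical.

```python
import math

import torch
import torch.nn.functional as F


def round_up_to_multiple(seq_len: int, multiple: int) -> int:
    """Round seq_len up to the nearest multiple; multiple <= 0 leaves it unchanged."""
    if multiple <= 0:
        return seq_len
    return math.ceil(seq_len / multiple) * multiple


def pad_batch_to_length(tokens: torch.Tensor, target_len: int, pad_id: int = 0) -> torch.Tensor:
    """Right-pad a [batch, seq_len] token tensor along the sequence dimension to target_len."""
    # F.pad takes (left, right) padding amounts for the last dimension.
    return F.pad(tokens, (0, target_len - tokens.size(1)), value=pad_id)
```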
Changelog
Usage
# Add a code snippet demonstrating how to use this
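A hypothetical usage example built on the sketch above; the batch shapes and the `pad_length_to_multiple_of=8` value are made up for illustration:

```python
# Two minibatches of a global batch with different raw sequence lengths.
minibatches = [
    torch.randint(0, 100, (4, 37)),
    torch.randint(0, 100, (4, 61)),
]

# With pad_length_to_multiple_of > 0, every minibatch is padded to the same
# length: the longest sequence length, rounded up to the requested multiple.
pad_length_to_multiple_of = 8
target_len = round_up_to_multiple(max(mb.size(1) for mb in minibatches), pad_length_to_multiple_of)
padded = [pad_batch_to_length(mb, target_len) for mb in minibatches]
print([tuple(mb.shape) for mb in padded])  # [(4, 64), (4, 64)]
```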
Before your PR is "Ready for review"
Pre checks:
Checklist when contributing a new algorithm
`max_steps=-1` and validation?
Additional Information