
feat: DPO support for global padding of seq_len to a multiple #386

Merged: 1 commit merged into main on Nov 15, 2024

Conversation

@terrykong (Collaborator) commented on Nov 7, 2024

What does this PR do ?

  • adds `pad_length_to_multiple_of` for DPO, which is required for sequence parallelism (which is in turn required for MoE models with TP)
    • when `pad_length_to_multiple_of` is > 0, all minibatches are padded to the same length, rounded up to that multiple; when it is 0, the behavior is unchanged

Needed for:

  • sequence parallel
  • mamba
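The padding rule described above can be sketched roughly as follows. This is an illustrative, per-minibatch sketch, not the actual NeMo-Aligner implementation; the function name, `pad_id` parameter, and list-based batch representation are all assumptions (the real code operates on the DPO data pipeline, and pads all minibatches to a common length globally):

```python
import math

def pad_batch_to_multiple(batch, pad_length_to_multiple_of, pad_id=0):
    """Pad every sequence in a minibatch to the same length, rounded up
    to the next multiple of `pad_length_to_multiple_of`.

    With pad_length_to_multiple_of == 0, the batch is returned unchanged
    (the pre-PR behavior).
    """
    if pad_length_to_multiple_of <= 0:
        return batch
    max_len = max(len(seq) for seq in batch)
    target = math.ceil(max_len / pad_length_to_multiple_of) * pad_length_to_multiple_of
    # Right-pad each sequence with pad_id up to the common target length.
    return [seq + [pad_id] * (target - len(seq)) for seq in batch]
```

The motivation: sequence parallelism shards the sequence dimension across tensor-parallel ranks, so the sequence length must divide evenly; rounding every batch up to a common multiple guarantees that.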

Rebase stack

Changelog

  • Please update the CHANGELOG.md under next version with high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 
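As a hypothetical usage sketch (the exact key path is an assumption, not confirmed by this PR; consult the modified examples/nlp/gpt/conf/gpt_dpo.yaml for the real location):

```yaml
# Hypothetical placement of the new option in the DPO config
model:
  data:
    pad_length_to_multiple_of: 8   # 0 keeps the previous (no-padding) behavior
```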

Before your PR is "Ready for review"

Pre checks:

Checklist when contributing a new algorithm

  • Does the trainer resume and restore all model states?
  • Does the trainer support all parallelism techniques (PP, TP, DP)?
  • Does the trainer support max_steps=-1 and validation?
  • Does the trainer only call APIs defined in alignable_interface.py?
  • Does the trainer have proper logging?

Additional Information

  • Related to # (issue)

@trias702 (Collaborator) left a comment:

LGTM, just a few minor questions. Also, to confirm: this PR just lets us pad DPO sequences to a certain multiple; it is not the PR that adds sequence-parallelism support to DPO, correct?

Review threads:
examples/nlp/gpt/conf/gpt_dpo.yaml (outdated; resolved)
examples/nlp/gpt/train_gpt_dpo.py (resolved)
@terrykong terrykong force-pushed the tk/dpo-pad-to-multiple branch 2 times, most recently from 9079d4b to cbf87d8 Compare November 14, 2024 22:18
@terrykong terrykong added Run CICD Set + un-set to retrigger and removed Run CICD Set + un-set to retrigger labels Nov 14, 2024
@trias702 (Collaborator) left a comment:

Just one added question

Review threads:
examples/nlp/gpt/conf/gpt_dpo.yaml (outdated; resolved)
nemo_aligner/data/nlp/builders.py (resolved)
@terrykong terrykong force-pushed the tk/dpo-pad-to-multiple branch from cbf87d8 to 32da0a9 Compare November 14, 2024 23:04
trias702 previously approved these changes Nov 14, 2024
@trias702 (Collaborator) left a comment:

Sound

ashors1 previously approved these changes Nov 15, 2024
Signed-off-by: Terry Kong <terryk@nvidia.com>

dpo pad fix if none

Signed-off-by: Terry Kong <terryk@nvidia.com>

rm variable_seq_len && fix comment on pad_multiple

Signed-off-by: Terry Kong <terryk@nvidia.com>

rm not resolver

Signed-off-by: Terry Kong <terryk@nvidia.com>

typo

Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong dismissed stale reviews from ashors1 and trias702 via 1f4f3e6 November 15, 2024 00:24
@terrykong terrykong force-pushed the tk/dpo-pad-to-multiple branch from 32da0a9 to 1f4f3e6 Compare November 15, 2024 00:24
@terrykong terrykong added Run CICD Set + un-set to retrigger and removed Run CICD Set + un-set to retrigger labels Nov 15, 2024
@terrykong terrykong enabled auto-merge (squash) November 15, 2024 00:41
@terrykong terrykong merged commit e2b4b3f into main Nov 15, 2024
17 checks passed
@terrykong terrykong deleted the tk/dpo-pad-to-multiple branch November 15, 2024 00:47
abukharin3 pushed a commit to abukharin3/NeMo-Aligner that referenced this pull request Nov 22, 2024
…#386)

Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: abukharin <abukharin@nvidia.com>
Labels: Algorithms, Run CICD Set + un-set to retrigger
3 participants