
peft_config & is_loaded_in_4bit check added to Reward_Trainer #2427

Open · wants to merge 2 commits into main
Conversation

@shirinyamani commented Dec 2, 2024

What does this PR do?

This PR adds a peft_config argument to reward modeling, allowing the user to pick a PEFT config of choice, e.g. LoRA, DoRA, etc. It also checks is_loaded_in_4bit to make sure the user does not combine QLoRA with FSDP, which would conflict with prepare_model_for_kbit_training and peft_module_casting_to_bf16, both of which assume access to the full model.
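For illustration, a hedged sketch of the kind of guard this description refers to (names and exact behavior are assumptions here, not the PR diff): detect an FSDP-sharded 4-bit model and fail fast instead of calling utilities that assume the full model is available.

from peft import prepare_model_for_kbit_training

def _prepare_kbit_model(model, args):
    if getattr(model, "is_loaded_in_4bit", False):
        # Heuristic: under FSDP + QLoRA the 4-bit params are still on "cpu" or
        # "meta" devices at this point, since the full model is never materialized
        # on a single rank.
        is_sharded_qlora = any(
            p.__class__.__name__ == "Params4bit" and p.data.device.type in {"cpu", "meta"}
            for _, p in model.named_parameters()
        )
        if is_sharded_qlora:
            raise ValueError(
                "QLoRA + FSDP is not supported: prepare_model_for_kbit_training "
                "and peft_module_casting_to_bf16 assume access to the full model."
            )
        model = prepare_model_for_kbit_training(
            model, use_gradient_checkpointing=args.gradient_checkpointing
        )
    return model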

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@shirinyamani shirinyamani changed the title peft_config added to Reward_Trainer peft_config & is_loaded_in_4bit check added to Reward_Trainer Dec 3, 2024
Comment on lines +148 to +153
if not isinstance(peft_config, PeftConfig):
    raise ValueError(
        "If you want to use the PeftModel, you need to pass a valid PeftConfig object to the RewardTrainer,"
        f" and you passed a {type(peft_config)}."
    )

Member

Suggested change
if not isinstance(peft_config, PeftConfig):
    raise ValueError(
        "If you want to use the PeftModel, you need to pass a valid PeftConfig object to the RewardTrainer,"
        f" and you passed a {type(peft_config)}."
    )

I don't think it's necessary here. The code will eventually fail if the peft_config doesn't have the right type.

@qgallouedec (Member)

Thanks for this addition @shirinyamani!

Is there a simple way to test it? Ideally a small piece of code that would fail before this PR but passes after?
We probably won't be able to add it to the tests because it requires multi-GPU, but at least we'll have it ready. (I'll have to reinvest in multi-GPU testing in the future.)

@shirinyamani (Author)

The simplest way might be testing it with an assert to make sure load_in_4bit is the case? What do you think about it?
I can also look into a multi-GPU setup! @qgallouedec
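For illustration, a hedged sketch of the assert-style smoke test suggested above (the model name and config are assumptions; it needs a GPU with bitsandbytes installed):

from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2-0.5B",  # any small model should do for this smoke test
    num_labels=1,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
# The flag that the new RewardTrainer check relies on:
assert getattr(model, "is_loaded_in_4bit", False), "model was not loaded in 4-bit"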


@qgallouedec (Member)

To clarify: I think it's okay not to add a test to our unit tests for this PR (because it's specific to a multi-GPU configuration, which is not trivial to set up with GitHub Actions, but we'll do it anyway in the future).

However, we should check “by hand” that it works. Do you have a small command line / example script that I can run on my local multi-GPU configuration to check that it works as expected?

@qgallouedec (Member)

Btw, if you don't have a multi-GPU setup available, feel free to share code that might not be correct; I can test it quickly on my side.

@shirinyamani (Author)

Right now I can think of two approaches, both for when we do NOT want to run manually using torch.distributed.run:

  1. I just checked that we have nice accelerate_config files in trl, so we can launch the multi_gpu config from examples/accelerate_configs and then set up the sft modeling (preferred approach; not sure about the outcome as I do not have access to multiple GPUs atm!):

accelerate launch --config_file examples/accelerate_configs/multi_gpu.yaml --num_processes=8 examples/scripts/sft.py --model_name_or_path Qwen/Qwen2-0.5B --dataset_name trl-lib/Capybara

  2. (Less sure/feasible approach) Directly launch multi-GPU from the accelerate lib; the command for, say, 8 GPUs would be the following, but I'm not really sure it is correct (I think it is not, though); reference:

accelerate launch --multi_gpu --num_processes 8 sft --model_name_or_path Qwen/Qwen2.5-0.5B --dataset_name trl-lib/Capybara --output_dir Qwen2.5-0.5B-SFT
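For reference, a hedged sketch (model, dataset, and arguments are assumptions, not from the PR) of a minimal reward-modeling script that such a launch command could point at; under an FSDP accelerate config it should now fail fast with the new QLoRA + FSDP error instead of breaking later inside prepare_model_for_kbit_training:

# reward_qlora_repro.py
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForSequenceClassification, AutoTokenizer, BitsAndBytesConfig
from trl import RewardConfig, RewardTrainer

model_name = "Qwen/Qwen2-0.5B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=1,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # QLoRA-style 4-bit load
)
model.config.pad_token_id = tokenizer.pad_token_id  # needed for sequence classification

trainer = RewardTrainer(
    model=model,
    args=RewardConfig(output_dir="Qwen2-0.5B-Reward-QLoRA", max_steps=10),
    processing_class=tokenizer,
    train_dataset=load_dataset("trl-lib/ultrafeedback_binarized", split="train[:1000]"),
    peft_config=LoraConfig(task_type="SEQ_CLS", r=16),  # the new peft_config argument
)
trainer.train()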
