peft_config & is_loaded_in_4bit check added to Reward_Trainer #2427
base: main
Conversation
if not isinstance(peft_config, PeftConfig):
    raise ValueError(
        "If you want to use the PeftModel, you need to pass a valid PeftConfig object to the RewardTrainer,"
        f" and you passed a {type(peft_config)}."
    )
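For context, here is a hedged, standalone sketch of how this guard behaves. It uses a stand-in `PeftConfig` class so it runs without installing `peft`; the assumption is that the real check relies only on `isinstance`.

```python
class PeftConfig:  # stand-in for peft.PeftConfig, for illustration only
    pass


def validate_peft_config(peft_config):
    # Mirrors the type guard proposed in the PR: reject anything that
    # is not a PeftConfig instance with an explicit error message.
    if not isinstance(peft_config, PeftConfig):
        raise ValueError(
            "If you want to use the PeftModel, you need to pass a valid "
            f"PeftConfig object to the RewardTrainer, and you passed a {type(peft_config)}."
        )


validate_peft_config(PeftConfig())  # a proper config passes silently

try:
    validate_peft_config({"r": 16})  # a plain dict is rejected
except ValueError as err:
    print("rejected:", err)
```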
I don't think it's necessary here. The code will eventually fail if the peft config doesn't have the right type.
Thanks for this addition @shirinyamani! Is there a simple way to test it? Ideally a small piece of code that would fail before this PR but passes after?
The simplest way might be testing it with
To clarify: I think it's okay not to add a test in our unit tests for this PR (because it's specific to multi-GPU configurations, and it's not trivial to set up with GitHub Actions, but we'll do it anyway in the future). However, we should check "by hand" that it works. Do you have a small command line / example script that I can run on my local multi-GPU setup to check that it works as expected?
Btw, if you don't have a multi-GPU setup available, feel free to share code that might not be correct; I can test it quickly on my side.
Right now I can think of two approaches, both for when we do NOT want to run manually using
What does this PR do?
This PR adds the peft_config to reward modeling to allow users to pick a peft_config of their choice, e.g. LoRA, DoRA, etc. It also checks
is_loaded_in_4bit
and makes sure the user does NOT use QLoRA + FSDP, to avoid conflicts in prepare_model_for_kbit_training
or peft_module_casting_to_bf16,
which assume access to the full model!

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
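The QLoRA + FSDP guard described in this PR can be sketched as follows. This is a minimal, self-contained illustration: `is_loaded_in_4bit` comes from the PR text, while the FSDP lookup is simplified to a plain boolean rather than the real Accelerate state query.

```python
def check_qlora_fsdp(is_loaded_in_4bit: bool, using_fsdp: bool) -> None:
    # prepare_model_for_kbit_training and peft_module_casting_to_bf16
    # assume access to the full (non-sharded) model, so the 4-bit + FSDP
    # combination is rejected up front with a clear error.
    if is_loaded_in_4bit and using_fsdp:
        raise ValueError(
            "QLoRA (4-bit loading) cannot be combined with FSDP, because "
            "prepare_model_for_kbit_training and peft_module_casting_to_bf16 "
            "assume access to the full model."
        )


check_qlora_fsdp(is_loaded_in_4bit=True, using_fsdp=False)  # allowed

try:
    check_qlora_fsdp(is_loaded_in_4bit=True, using_fsdp=True)
except ValueError as err:
    print("rejected:", err)
```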
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.