
Add lora+ implementation #1915

Merged: 34 commits from the 202407-loraplus branch into huggingface:main on Jul 29, 2024
Conversation

@kallewoof (Contributor) commented Jul 8, 2024

LoRA+

Builds on #1509.

@kallewoof changed the title from "202407 loraplus" to "Add lora+ implementation" on Jul 8, 2024
(Resolved review thread: src/peft/helpers.py)
@kallewoof force-pushed the 202407-loraplus branch 2 times, most recently from 64dc23c to 6f42ed4, on July 9, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan (Member)

@kallewoof Thanks for pushing this PR forward. By now, we have waited long enough that I feel it's fair to continue with this PR.

I haven't done a full review yet, but it seems that tests are failing. Could you please take a look?

@kallewoof (Contributor, Author)

@BenjaminBossan Thanks. I think I addressed the errors.

@BenjaminBossan (Member)

We get an error in Python 3.8 as we're missing a from __future__ import annotations.
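For context, a minimal illustration of that fix (make_optimizer is a placeholder name, not the PR's actual function):

# Keeping this as the first statement makes all annotations lazily evaluated
# strings, so `type[Optimizer]` parses on Python 3.8, where built-in types
# are not yet subscriptable.
from __future__ import annotations

from torch.optim import Optimizer

def make_optimizer(optimizer_cls: type[Optimizer]) -> Optimizer:
    ...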

@kallewoof (Contributor, Author)

Thanks, fixed. Also added a license header.

@BenjaminBossan (Member) left a comment

Thanks for the updates. No in-depth review yet, but I found a couple of issues that should be quick to address.

(Resolved review threads: src/peft/optimizers/loraplus.py, src/peft/optimizers/__init__.py, src/peft/tuners/lora/__init__.py, src/peft/tuners/__init__.py, src/peft/utils/peft_types.py)
@kallewoof (Contributor, Author)

Failing tests appear unrelated to the PR.

@BenjaminBossan (Member) left a comment

Thanks for continuing the work on LoRA+. I found a few areas for improvement, but overall it looks good already. Tested it on a small example and results were slightly improved.

(Resolved review threads: src/peft/optimizers/loraplus.py, src/peft/optimizers/__init__.py, tests/test_loraplus_helper.py)
@BenjaminBossan (Member)

@kallewoof LMK when this is ready for another review.

@kallewoof (Contributor, Author)

@BenjaminBossan Should be ready! Sorry if I missed anything.

@BenjaminBossan (Member) left a comment

Fantastic, I think we're almost done. I only have two small comments left, the rest looks good.

(Resolved review threads on src/peft/optimizers/loraplus.py)
@kallewoof force-pushed the 202407-loraplus branch 3 times, most recently from fa6d661 to f983257, on July 18, 2024
@BenjaminBossan (Member)

@kallewoof Please ping me once this is ready for review. Otherwise, I don't know if you're still working on some changes or not :)

@kallewoof (Contributor, Author)

@BenjaminBossan Ping! Sorry about all the force-pushes.

@kallewoof (Contributor, Author)

@BenjaminBossan Sorry, I thought about this overnight and I think you're right that weight_decay should be popped. If someone provides a weight decay meant for something other than LoRA+ then it will be picked up by both, which is most likely undesired.

@kallewoof (Contributor, Author) commented Jul 22, 2024

Playing around with this, I noticed that LoraPlusConfig is actually not used anywhere. I think the code is missing the part where you create a LoraConfig with the LoRA+ feature enabled and it initializes the optimizer and so on.

@BenjaminBossan (Member) commented Jul 22, 2024

> Sorry about all the force-pushes.

Not a big deal for this PR, but especially on bigger ones it's better to avoid them, since that makes reviews easier. Note that there is no need to clean up the git history, if that was what you were going for, as we squash before merging.

> Sorry, I thought about this overnight and I think you're right that weight_decay should be popped. If someone provides a weight decay meant for something other than LoRA+ then it will be picked up by both, which is most likely undesired.

What do you mean by "both"?

> Playing around with this, I noticed that LoraPlusConfig is actually not used anywhere.

Hmm, yes, you're right. How about removing it completely then? I guess an argument could be made that something like this API could be useful:

from peft import LoraPlusConfig

optimizer_config = LoraPlusConfig(...)
optimizer = create_loraplus_optimizer(model, optimizer_config)

to make it easier to share the config settings, but IMO the value is very marginal.

@kallewoof (Contributor, Author) commented Jul 22, 2024

> Sorry about all the force-pushes.
>
> Not a big deal for this PR, but especially on bigger ones it's better to avoid them, since that makes reviews easier. Note that there is no need to clean up the git history, if that was what you were going for, as we squash before merging.

Right. It's a very ingrained habit from other projects where they don't squash.

> Sorry, I thought about this overnight and I think you're right that weight_decay should be popped. If someone provides a weight decay meant for something other than LoRA+ then it will be picked up by both, which is most likely undesired.
>
> What do you mean by "both"?

Imagine a FutureOptimizerX which has a fancy weight_decay parameter that does something, and now imagine someone passing it as optimizer_cls to create_loraplus_optimizer:

def create_loraplus_optimizer(
    model: PeftModel, optimizer_cls: type[Optimizer], *, lr: float, loraplus_lr_ratio: float, **kwargs
) -> Optimizer:

The LoRA+ optimizer picks out and uses the passed-in weight decay value in its setup. Then, if we don't pop it,

    optimizer = optimizer_cls(optimizer_grouped_parameters, **kwargs)

the FutureOptimizerX optimizer will now also use our weight decay param. It is unclear which optimizer the user intended the weight decay for, and it is unlikely that they wanted both FutureOptimizerX and the LoRA+ parameter groups to apply the same weight decay value.

Maybe I'm overcomplicating this?
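For illustration, a minimal sketch of the pop-before-forwarding idea under discussion; the parameter grouping below is simplified and is not the PR's actual implementation:

def create_loraplus_optimizer(model, optimizer_cls, *, lr, loraplus_lr_ratio, **kwargs):
    # Consume weight_decay here so it is only applied via the parameter groups
    # below and is not forwarded a second time to optimizer_cls.
    weight_decay = kwargs.pop("weight_decay", 0.0)

    lora_b_params = [p for n, p in model.named_parameters() if "lora_B" in n]
    other_params = [
        p for n, p in model.named_parameters() if "lora_B" not in n and p.requires_grad
    ]
    param_groups = [
        # Simplified: the real code also handles lora_A, embeddings, etc. separately.
        {"params": other_params, "lr": lr, "weight_decay": weight_decay},
        {"params": lora_b_params, "lr": lr * loraplus_lr_ratio, "weight_decay": weight_decay},
    ]
    # kwargs no longer contains weight_decay, so optimizer_cls will not apply it
    # globally on top of the per-group values.
    return optimizer_cls(param_groups, **kwargs)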

> Playing around with this, I noticed that LoraPlusConfig is actually not used anywhere.
>
> Hmm, yes, you're right. How about removing it completely then? I guess an argument could be made that something like this API could be useful:
>
>     from peft import LoraPlusConfig
>
>     optimizer_config = LoraPlusConfig(...)
>     optimizer = create_loraplus_optimizer(model, optimizer_config)
>
> to make it easier to share the config settings, but IMO the value is very marginal.

I think the cleanest approach is to remove it in this PR and then make a follow-up where we make it easier to use, if necessary.

Ultimately, without some tweaks to transformers, I don't think we can do this automatically e.g. from get_peft_model because we need to actually access the Trainer. I think.

Edit: We probably should add an example to the docs on how to use it though, at least. Let me look into that.
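For reference, a minimal usage sketch along those lines, assuming create_loraplus_optimizer is exported from peft.optimizers as in this PR; the hyperparameter values and the placeholder names (base_model, training_args, train_dataset) are illustrative only:

import torch
from transformers import Trainer
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_loraplus_optimizer

# base_model, training_args, and train_dataset are assumed to be defined elsewhere.
model = get_peft_model(base_model, LoraConfig(task_type="CAUSAL_LM"))

# lora_B parameters are trained with lr * loraplus_lr_ratio; the remaining
# trainable parameters use lr.
optimizer = create_loraplus_optimizer(
    model=model,
    optimizer_cls=torch.optim.AdamW,
    lr=5e-5,
    loraplus_lr_ratio=16,
)

# Hand the pre-built optimizer to the Trainer (no LR scheduler here).
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    optimizers=(optimizer, None),
)
trainer.train()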

@kallewoof (Contributor, Author)

@BenjaminBossan Sounds good. I rebased on main. If it is easier, I will merge main instead of rebasing in the future.

@BenjaminBossan merged commit 273acf0 into huggingface:main on Jul 29, 2024. 14 checks passed.
@BenjaminBossan (Member)

Thanks everyone involved in this PR, good work.
