Add lora+ implementation #1915
Conversation
force-pushed from 64dc23c to 6f42ed4
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@kallewoof Thanks for pushing this PR forward. By now, we have waited long enough that I feel it's fair to continue with this PR. I haven't done a full review yet, but it seems that tests are failing. Could you please take a look?
force-pushed from 6f42ed4 to bbefbec
@BenjaminBossan Thanks. I think I addressed the errors.
We get an error in Python 3.8 as we're missing a […]
Thanks, fixed. Also added a license header.
Thanks for the updates. No in-depth review yet, but I found a couple of issues that should be quick to address.
force-pushed from 949793b to 691178e
Failing tests appear unrelated to the PR.
Thanks for continuing the work on LoRA+. I found a few areas for improvement, but overall it looks good already. Tested it on a small example and results were slightly improved.
force-pushed from ed6095a to e58b876
force-pushed from e58b876 to daddf0b
@kallewoof LMK when this is ready for another review.
@BenjaminBossan Should be ready! Sorry if I missed anything.
Fantastic, I think we're almost done. I only have two small comments left, the rest looks good.
force-pushed from fa6d661 to f983257
@kallewoof Please ping me once this is ready for review. Otherwise, I don't know if you're still working on some changes or not :)
@BenjaminBossan Ping! Sorry about all the force-pushes.
@BenjaminBossan Sorry, I thought about this overnight and I think you're right that […]
Playing around with this, I noticed that […]
Not a big deal in this PR, but especially on bigger ones, force-pushes are best avoided to make reviews easier. Note that there is no need to clean up the git history, if that is what you were going for, as we squash before merging.
What do you mean by "both"?
Hmm, yes, you're right. How about removing it completely then? I guess an argument could be made that something like this API could be useful:

```python
from peft import LoraPlusConfig

optimizer_config = LoraPlusConfig(...)
optimizer = create_loraplus_optimizer(model, optimizer_config)
```

to make it easier to share the config settings, but IMO the value is very marginal.
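For comparison, here is a minimal sketch of the direct-kwargs alternative being weighed against the config object; the import path and keyword names (`optimizer_cls`, `loraplus_lr_ratio`) are assumptions for illustration, not necessarily the PR's final API:

```python
import torch
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_loraplus_optimizer  # import path is an assumption
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model = get_peft_model(base_model, LoraConfig(task_type="CAUSAL_LM"))

# Hypothetical call: all settings are passed directly, no config object needed.
optimizer = create_loraplus_optimizer(
    model=model,
    optimizer_cls=torch.optim.AdamW,
    lr=2e-4,
    loraplus_lr_ratio=16,  # LoRA B matrices are trained at lr * ratio
    weight_decay=0.01,
)
```

The upside of this style is that there is nothing extra to construct or pass around; the downside, as noted above, is that the settings cannot be bundled into a reusable object for sharing.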
Right. It's a very ingrained habit from other projects where they don't squash.
Imagine the user passes a weight decay value in the optimizer kwargs. The LoRA+ optimizer picks out and uses the passed-in weight decay value in its setup. Then, if we don't pop it, the underlying optimizer also receives it as a default. Maybe I'm overcomplicating this?
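To make the pop-versus-read distinction concrete, here is a sketch of the pattern being described; the function name and param-group layout are illustrative, not the PR's actual code:

```python
import torch

def build_loraplus_param_groups(model, lr, loraplus_lr_ratio, **optimizer_kwargs):
    # pop() (rather than a plain lookup) removes weight_decay from the kwargs
    # that will later be forwarded to the optimizer class, so the value is
    # applied once via the param groups rather than also as an optimizer-level
    # default.
    weight_decay = optimizer_kwargs.pop("weight_decay", 0.0)
    param_groups = [
        {"params": [p for n, p in model.named_parameters() if "lora_A" in n],
         "lr": lr, "weight_decay": weight_decay},
        # LoRA+ trains the B matrices with a higher learning rate.
        {"params": [p for n, p in model.named_parameters() if "lora_B" in n],
         "lr": lr * loraplus_lr_ratio, "weight_decay": weight_decay},
    ]
    return param_groups, optimizer_kwargs

# The leftover kwargs (now without weight_decay) go to the optimizer constructor:
# groups, extra = build_loraplus_param_groups(model, 2e-4, 16, weight_decay=0.01)
# optimizer = torch.optim.AdamW(groups, **extra)
```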
I think the cleanest approach is to remove it in this PR and then make a follow-up where we make it easier to use, if necessary. Ultimately, without some tweaks to […].
Edit: We probably should add an example to the docs on how to use it though, at least. Let me look into that.
Co-authored-by: Chris Hua <stillmatic@users.noreply.github.com>
force-pushed from d2b3f9e to babd89b
@BenjaminBossan Sounds good. I rebased on main. If it's easier, I'll merge instead in the future.
Thanks to everyone involved in this PR, good work.
LoRA+
Builds on #1509.