Decrease memory usage of merge_and_unload #1944

Merged


snarayan21
Contributor

Addresses #1939

@snarayan21
Contributor Author

@BenjaminBossan Would be great to get your approval here!

@BenjaminBossan
Member

BenjaminBossan commented Jul 23, 2024

I checked the PEFT code to see why we merge this way. It turns out we used to do it in-place, but #1372 changed that. IMO merging should not affect the error reported there, so I think we can undo the change, but I wanted to provide the necessary context.

Could you please also update the other merge functions?

base_layer.weight.data = base_layer.weight.data + self.get_delta_weight(active_adapter)

base_layer.weight.data = base_layer.weight.data + delta_weight

Thanks.

I plan to make a release soon, aiming for tomorrow. Let's get this fix ready before the release.
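For readers following along, the difference between the two merge styles discussed above can be sketched roughly as follows. This is a minimal illustration, not the PEFT code itself: the helper names `merge_out_of_place` and `merge_in_place` are hypothetical, a plain `torch.nn.Linear` stands in for the base layer, and a random tensor stands in for the result of `get_delta_weight(...)`.

```python
import torch


def merge_out_of_place(layer: torch.nn.Linear, delta: torch.Tensor) -> None:
    # `a = a + b` builds the sum in a fresh tensor, so the old and new
    # weights both exist in memory until the assignment completes.
    layer.weight.data = layer.weight.data + delta


def merge_in_place(layer: torch.nn.Linear, delta: torch.Tensor) -> None:
    # `a += b` writes the sum into the existing weight storage, so no
    # second weight-sized buffer is allocated.
    layer.weight.data += delta


torch.manual_seed(0)
delta = torch.randn(8, 8)  # hypothetical stand-in for get_delta_weight(...)

# Two layers with identical starting weights (assumed setup for comparison).
layer_a = torch.nn.Linear(8, 8, bias=False)
layer_b = torch.nn.Linear(8, 8, bias=False)
layer_b.load_state_dict(layer_a.state_dict())

# Remember where each weight lives before merging.
ptr_a = layer_a.weight.data_ptr()
ptr_b = layer_b.weight.data_ptr()

merge_out_of_place(layer_a, delta)
merge_in_place(layer_b, delta)

same_result = torch.allclose(layer_a.weight, layer_b.weight)
out_of_place_kept_storage = layer_a.weight.data_ptr() == ptr_a
in_place_kept_storage = layer_b.weight.data_ptr() == ptr_b
print(same_result, out_of_place_kept_storage, in_place_kept_storage)
```

Both variants should give the same merged weights; only the in-place one keeps the original storage. For large models the extra weight-sized allocation per layer in the out-of-place form is presumably what drives the higher peak memory of `merge_and_unload`.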

@snarayan21
Contributor Author

@BenjaminBossan Thanks for the pointers. I found some additional places where this change is also needed.
And yeah, since #1372 was a training-time error, it shouldn't apply to this fix for merging adapters.

Hope this looks good to you.

@snarayan21
Contributor Author

A release tomorrow with this fix would be ideal.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan
Member

Strange that the Windows CI is consistently failing. It's hard to imagine that this PR causes it, but I'm investigating further (though I don't have a Windows machine myself). Hopefully this is just a temporary issue.

@snarayan21
Contributor Author

Yeah quite weird:

The directory name is invalid: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpv612vppk\\model.safetensors'

@BenjaminBossan
Member

Okay, so the same Windows errors as in #1947, which has no code changes.

Member

@BenjaminBossan BenjaminBossan left a comment


Thanks for restoring the more memory-efficient merging behavior. A release tomorrow should be possible.

@BenjaminBossan BenjaminBossan merged commit 2ce83e0 into huggingface:main Jul 23, 2024
10 of 14 checks passed

3 participants