QDoRA: Support DoRA with BnB quantization #1518
Conversation
WIP: Adds support for DoRA on 4-bit and 8-bit quantized models with BnB. For now, merging is not implemented; I'll investigate this next.
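For context, here is a rough sketch of how DoRA on a bnb-quantized model would be set up with the `use_dora=True` flag; the model id and target modules below are placeholders, not prescribed by this PR:

```py
# Hedged QDoRA sketch (model id and target_modules are placeholders).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# load the base model in 4-bit with bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m", quantization_config=bnb_config
)

# use_dora=True enables DoRA on the targeted (quantized) linear layers
lora_config = LoraConfig(use_dora=True, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```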
Looks great to me, thank you @BenjaminBossan for adding quantization support for DoRA!
Thank you @BenjaminBossan for adding support for using DoRA with bnb-quantized layers, and for the thorough tests! 🤗
docs/source/developer_guides/lora.md (Outdated)

```py
from peft import LoraConfig

config = LoraConfig(use_dora=True, ...)
```

DoRA should work with weights quantized with bitsandbytes ("QDoRA"). Issues have been reported when using QDoRA with DeepSpeed Zero2.
This should go in a Caveats section where all such notes can be collated in one place.
I added a caveats section directly below, within the DoRA section. Is this what you had in mind?
Thank you @BenjaminBossan! ✨
Adds support for DoRA on 4-bit and 8-bit quantized models with BnB. Merging also works, with the usual caveats for quantized weights (results are not 100% identical), but it's not worse than vanilla LoRA.
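As an illustration, a hedged sketch of what merging a trained QDoRA adapter could look like with PEFT's existing merge_and_unload API; the model id and adapter path are placeholders:

```py
# Hedged sketch of merging a QDoRA adapter into a bnb-quantized base model.
# Merging into quantized weights is lossy, which is why the results are not
# 100% identical to the unmerged model (see the note above).
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",  # placeholder model id
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
peft_model = PeftModel.from_pretrained(base_model, "path/to/qdora-adapter")
merged_model = peft_model.merge_and_unload()
```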
QDoRA seems to be working for me. However, I am noticing a large slowdown of around 2x when comparing QLoRA and QDoRA. I am not sure if that is expected or not, but this seems as good a place as any to share this finding.
Yes, QDoRA unfortunately requires an additional dequantization step on the quantized weights to calculate the weight norm. I wouldn't expect this to slow down training by 2x, but a significant slowdown is expected. Maybe you can run a profiler to check further if you think it's worth investigating.
You can also create new issues or discussions (if there aren't already existing ones) for this type of question.
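To make the extra cost concrete, here is a rough, non-authoritative sketch of the additional work a DoRA layer does on quantized weights; this is not PEFT's actual code, and `dequantize_fn` stands in for whatever bitsandbytes dequantization routine applies:

```py
# Rough sketch (not PEFT's implementation) of why QDoRA is slower than QLoRA:
# DoRA needs the norm of W + scaling * B @ A, but W is stored quantized,
# so it must be dequantized first.
import torch

def dora_weight_norm(quantized_weight, lora_A, lora_B, scaling, dequantize_fn):
    # dequantize_fn is assumed to wrap the appropriate bitsandbytes routine,
    # e.g. turning a 4-bit weight back into a dense bf16 tensor
    weight = dequantize_fn(quantized_weight)       # extra step compared to QLoRA
    merged = weight + scaling * (lora_B @ lora_A)  # W + delta_W
    # one norm per output feature (rows of an nn.Linear weight)
    return torch.linalg.norm(merged, dim=1)
```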
Don't pass load_in_8bit to AutoModel.from_pretrained; instead, use BitsAndBytesConfig. There was already a PR to clean this up (huggingface#1552), but a slightly later PR (huggingface#1518) re-added this usage.
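For reference, a minimal sketch of the recommended pattern; the model id is a placeholder:

```py
# Pass a BitsAndBytesConfig via quantization_config instead of passing
# load_in_8bit=True directly to from_pretrained.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m", quantization_config=bnb_config
)
```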
Adds support for DoRA on 4-bit and 8-bit quantized models with bitsandbytes. Merging also works, with the usual caveats for quantized weights (results are not 100% identical), but it's not worse than vanilla LoRA.
I did some quick tests and could see the expected memory savings with bnb. As with DoRA on non-quantized layers, using DoRA on quantized layers leads to a moderate increase in runtime.