Support for different LoRA formats #47
Comments
Has anyone gotten LoRAs to work on the 4- and 8-bit models? They work for the dev and schnell models, but all quantized models give format errors.
I'm using lora.safetensors files created on replicate.com with dev and schnell; works like a charm.
Yes, but it doesn't work on the quantized models.
Just my two cents: without support for quantized models, this feature has limited value. On the other hand, once LoRA support for quantized models is added, it will be a game changer, and I'm eagerly waiting for Mflux to add it, because there is currently no efficient way to run Flux with LoRA support on Macs. I work on pinokio.computer (which lets people run AI tools locally easily), and I can tell you there are TONS of people waiting for this kind of thing (flux1-schnell/dev-fp8 + full LoRA support), because there are no good alternatives. You can use ComfyUI with fp8, but I have personally been looking forward to native MLX support that also includes LoRA support. When that happens, and flux1-dev-fp8 + LoRA runs faster here than on ComfyUI, it will be a huge reason to start using MLX. Really hope this is prioritized over anything else. Thank you.
@CharafChnioune What kind of error did you get? Was it perhaps similar to this #49? Like @kaimerklein said, the LoRA feature should typically work with the quantized models (assuming we have support for the given LoRA format). However, at the moment it does not support loading a pre-quantized model together with non-quantized LoRA weights. But if you load the original weights and simply pass the quantization flag at generation time, it should work. This can probably be fixed, but it might require some restructuring/rethinking on the backend side. I will keep this in mind as something to update, since more people have been requesting it. @cocktailpeanut Very interesting to hear how you are using the project, I will check it out! For now, I think you are stuck with the solution described above, but at least it should work. And even though you have to store the full weights, you should still see the speedup that quantization brings.
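For reference, the workaround looks roughly like this in the Python API. This is only a minimal sketch: the `Flux1.from_alias` constructor and its `quantize`, `lora_paths`, and `lora_scales` arguments, as well as the `Config` fields, are assumptions based on the README-style API and may differ between mflux versions.

```python
# Minimal sketch of the workaround: load the full (non-quantized) base
# weights, attach the LoRA, and let mflux quantize at load time.
# NOTE: from_alias, quantize, lora_paths, lora_scales and the Config fields
# are assumed from the README-style API and may differ by version.
from mflux import Flux1, Config

flux = Flux1.from_alias(
    alias="dev",                              # or "schnell"
    quantize=8,                               # quantize on the fly instead of loading a pre-quantized model
    lora_paths=["path/to/lora.safetensors"],  # regular (non-quantized) LoRA weights
    lora_scales=[1.0],
)

image = flux.generate_image(
    seed=42,
    prompt="a photo of a forest at dawn",
    config=Config(num_inference_steps=20, height=1024, width=1024),
)
image.save(path="output.png")
```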
@filipstrand Yes, exactly the same error. The LoRAs work great on the normal schnell and dev models, but for the 4- and 8-bit models I always get that shape error. Will test it when I get home and let you know if that fixed the error. Update: worked, thanks!
As discussed in #40 there are different types of LoRA formats. To keep track of what works and what doesn't, I have added a table in the README called Supported LoRA formats. I am not sure what the best way is to categorize the formats, but I thought this issue could serve as a place for anyone to post what they have tried and whether it worked or not, along with any other information. It would also be really interesting to hear which are the most common sources for fine-tuning online (e.g. civitai.com or fal.ai), so we can prioritize support for those weights.
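To make reports easier to compare, here is a small sketch that prints the tensor key prefixes inside a LoRA `.safetensors` file; the prefixes are usually enough to tell formats apart (for example diffusers-style `transformer.*` keys versus kohya-style `lora_unet_*` keys). It only relies on the `safetensors` package; the file path is a placeholder, and the prefix examples are illustrative rather than an exhaustive list of what mflux recognizes.

```python
# Sketch: summarize the tensor key prefixes inside a LoRA .safetensors file
# so the format can be reported in this issue. Requires only the safetensors
# package; "lora.safetensors" below is a placeholder path.
from collections import Counter
from safetensors import safe_open

def summarize_lora_keys(path: str, depth: int = 2) -> Counter:
    """Count tensor keys grouped by their first `depth` dot-separated parts."""
    prefixes = Counter()
    with safe_open(path, framework="np") as f:
        for key in f.keys():
            prefixes[".".join(key.split(".")[:depth])] += 1
    return prefixes

if __name__ == "__main__":
    for prefix, count in summarize_lora_keys("lora.safetensors").most_common(10):
        print(f"{count:5d}  {prefix}")
```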