
[Feature Request] Fork fastchat/serve/model_worker.py to support multiple LoRA models #1805

Closed
fozziethebeat opened this issue Jun 29, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@fozziethebeat
Collaborator

Right now fastchat/serve/model_worker.py serves a single model. With LoRA (or other PEFT-trained) adapters, we could in theory load one base model and multiple adapters per worker, reducing the number of times the base model needs to be loaded. Adding this directly to fastchat/serve/model_worker.py is probably a bad idea since it would complicate that script, but forking it into something like fastchat/serve/multi_model_worker.py should be fairly easy.
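For illustration, a minimal sketch of what "one base model, many adapters" could look like, assuming peft's multi-adapter API (PeftModel.from_pretrained with adapter_name, load_adapter, set_adapter). All paths and model names below are placeholders, and the peft bug linked further down may affect whether this actually works today:

```python
# Rough sketch (not FastChat's actual implementation): load the base model once
# and attach several LoRA adapters to it via peft's multi-adapter API.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "path/to/base-model"          # hypothetical
adapters = {                              # model_name -> adapter path (hypothetical)
    "vicuna-lora-a": "path/to/lora-a",
    "vicuna-lora-b": "path/to/lora-b",
}

tokenizer = AutoTokenizer.from_pretrained(base_path)
base = AutoModelForCausalLM.from_pretrained(base_path)

# Wrap the base model once, registering the first adapter under its model name.
names = list(adapters)
model = PeftModel.from_pretrained(base, adapters[names[0]], adapter_name=names[0])
# Attach the remaining adapters without reloading the base weights.
for name in names[1:]:
    model.load_adapter(adapters[name], adapter_name=name)

def generate_for(model_name: str, prompt: str) -> str:
    # Switch the active adapter per request before generating.
    model.set_adapter(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```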

The new options would need to accept a list of model_path:model_name pairs. A reasonable hard assumption is that all listed models share the same base model, which would be loaded only once. The worker can then register each LoRA model with the controller.
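A hypothetical sketch of what that option parsing could look like; the flag name and pair format here are assumptions, not FastChat's actual interface:

```python
# Hypothetical CLI: accept repeated --model "model_path:model_name" pairs and
# build the name -> path mapping the worker would register with the controller.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--model",
    action="append",
    required=True,
    help="A model_path:model_name pair; repeat the flag once per LoRA adapter.",
)
args = parser.parse_args()

model_registry = {}
for pair in args.model:
    # rsplit so any colon in the path stays with the path; the name has no colon.
    model_path, model_name = pair.rsplit(":", 1)
    model_registry[model_name] = model_path

# All entries are assumed to share one base model; the worker would load that
# base once and then register each model_name with the controller.
print(model_registry)
```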

Would a setup like this break anything within the controller? I'm assuming not, but I haven't checked this directly.

If this seems reasonable, I'm happy to draft a PR.

@fozziethebeat
Collaborator Author

Note: A major Peft bug seems to block this from working right now: huggingface/peft#430

@merrymercy
Member

Yes, this is definitely a feature we want.
cc @ZYHowell

@merrymercy added the enhancement (New feature or request) label on Jul 1, 2023
@fozziethebeat
Collaborator Author

Cool, then I can put it together; it's a high-priority item for me. I think it won't be memory efficient right now, since the same base model can't be shared by two different PeftModels, but I'll set up the basic server runner and find out what breaks.
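For clarity, a sketch of the memory-inefficient fallback being described here, where each adapter wraps its own freshly loaded copy of the base weights (paths are placeholders):

```python
# Fallback sketch: one PeftModel per adapter, each with its own copy of the
# base model, so the base weights are duplicated in memory once per adapter.
from transformers import AutoModelForCausalLM
from peft import PeftModel

adapters = {"lora-a": "path/to/lora-a", "lora-b": "path/to/lora-b"}  # hypothetical

models = {}
for name, adapter_path in adapters.items():
    base = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # reloaded per adapter
    models[name] = PeftModel.from_pretrained(base, adapter_path)
```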

@Ying1123
Member

Closing this issue because it was completed by #1866 and #1905.
