
add changes for lora adapter support and /v1/models endpoint #121

Merged (2 commits) on Oct 31, 2024

Conversation

@sven-knoblauch (Contributor) commented Oct 9, 2024

Small changes to add the LORA_MODULES env variable (supporting one LoRA adapter) as a solution for #119, with the format: {"name": "xxx", "path": "xxx/xxxxx", "base_model_name": "xxx/xxxx"}

Also changes the /v1/models endpoint to return all models.
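For illustration, a minimal sketch of how a worker could consume such an env variable, assuming the JSON format described above (the variable names here are placeholders, not necessarily what this PR uses):

```python
import json
import os

# Illustrative only: read a single LoRA adapter definition from LORA_MODULES,
# e.g. {"name": "...", "path": "...", "base_model_name": "..."}
raw = os.getenv("LORA_MODULES")
lora_module = json.loads(raw) if raw else None
if lora_module:
    print(lora_module["name"], lora_module["path"], lora_module["base_model_name"])
```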

sven-knoblauch marked this pull request as draft on October 9, 2024 11:42
sven-knoblauch marked this pull request as ready for review on October 9, 2024 12:53
@pandyamarut (Collaborator) commented

Thanks for the PR, @sven-knoblauch. Can you please also describe how you tested it?

@sven-knoblauch (Contributor, Author) commented

I built a Docker container from the given Dockerfile (on Docker Hub: svenknob/runpod-vllm-worker) and tested it on RunPod serverless. It worked with a custom-trained LoRA adapter (added in the RunPod GUI as the LORA_MODULES env variable) on an AWQ Mistral model. The LoRA adapter is also visible in the /v1/models endpoint.
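A rough sketch of what such a test could look like against a RunPod serverless endpoint; the base URL, API key, and adapter name below are placeholders, and the OpenAI-compatible route is assumed rather than taken from this PR:

```python
import requests

# Placeholders: replace with your own endpoint ID, API key, and adapter name.
BASE_URL = "https://api.runpod.ai/v2/<endpoint_id>/openai/v1"
HEADERS = {"Authorization": "Bearer <runpod_api_key>"}

# The LoRA adapter should appear alongside the base model in /v1/models.
models = requests.get(f"{BASE_URL}/models", headers=HEADERS).json()
print([m["id"] for m in models["data"]])

# Request inference against the adapter by using its "name" as the model.
resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers=HEADERS,
    json={
        "model": "<lora_adapter_name>",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json())
```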

pandyamarut merged commit 6e8696c into runpod-workers:main on Oct 31, 2024

@nerdylive123 commented

Hi, is there documentation for this env variable's usage? In the markdown, perhaps?

@nielsrolf commented

Is there a publicly available Docker image that contains this PR?

@sven-knoblauch (Contributor, Author) commented

I added a pull request to update the README: #130.
Usage is similar to the "original" vLLM server. The env variable name is LORA_MODULES and the format is {"name": "xxx", "path": "xxx/xxxx", "base_model_name": "xxx/xxxx"}, where name is the model name that HTTP requests target, path is the Hugging Face path of the adapter, and base_model_name is the name of the base model it was trained on.

For now you can use my Docker image svenknob/runpod-vllm-worker until an official one has been published.
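A hedged example of what a concrete LORA_MODULES value might look like; the adapter and base model names below are made up, chosen only to match the documented format:

```python
import json

# Made-up names, following the format described above.
lora_modules = {
    "name": "my-adapter",                             # model name used in requests
    "path": "my-org/my-lora-adapter",                 # Hugging Face path of the adapter
    "base_model_name": "mistralai/Mistral-7B-v0.1",   # base model it was trained on
}
print(json.dumps(lora_modules))  # set this JSON string as the LORA_MODULES env variable
```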

4 participants