Multimodal Support (Llava 1.5) (#821)
* llava v1.5 integration
* Point llama.cpp to fork
* Add llava shared library target
* Fix type
* Update llama.cpp
* Add llava api
* Revert changes to llama and llama_cpp
* Update llava example
* Add types for new gpt-4-vision-preview api
* Fix typo
* Update llama.cpp
* Update llama_types to match OpenAI v1 API
* Update ChatCompletionFunction type
* Reorder request parameters
* More API type fixes
* Even More Type Updates
* Add parameter for custom chat_handler to Llama class
* Fix circular import
* Convert to absolute imports
* Fix
* Fix pydantic Jsontype bug
* Accept list of prompt tokens in create_completion
* Add llava1.5 chat handler
* Add Multimodal notebook
* Clean up examples
* Add server docs

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
1 parent 56171cf · commit aab74f0 · Showing 10 changed files with 796 additions and 102 deletions.
# OpenAI Compatible Server

`llama-cpp-python` offers an OpenAI API compatible web server.

This web server can be used to serve local models and easily connect them to existing clients.

## Setup

### Installation

The server can be installed by running the following command:

```bash
pip install llama-cpp-python[server]
```

(In shells such as zsh that treat square brackets specially, quote the extra: `pip install 'llama-cpp-python[server]'`.)
### Running the server

The server can then be started by running the following command:

```bash
python3 -m llama_cpp.server --model <model_path>
```
### Server options

For a full list of options, run:

```bash
python3 -m llama_cpp.server --help
```

NOTE: All server options are also available as environment variables. For example, `--model` can be set via the `MODEL` environment variable.
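As a concrete sketch of that convention (the model path below is a hypothetical example):

```shell
# Each server option maps to an upper-case environment variable.
# Setting MODEL here stands in for passing --model on the command line
# (the model path is a hypothetical example):
export MODEL=./models/llama-2-7b.Q4_K_M.gguf

# The server can then be started without the --model flag:
# python3 -m llama_cpp.server
```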
## Guides

### Multi-modal Models

`llama-cpp-python` supports the llava1.5 family of multi-modal models, which allow the language model to read information from both text and images.

You'll first need to download one of the available multi-modal models in GGUF format:

- [llava1.5 7b](https://huggingface.co/mys/ggml_llava-v1.5-7b)
- [llava1.5 13b](https://huggingface.co/mys/ggml_llava-v1.5-13b)

Then, when you run the server, you'll also need to specify the path to the clip model used for image embedding:

```bash
python3 -m llama_cpp.server --model <model_path> --clip-model-path <clip_model_path>
```
Then you can use the OpenAI API as normal:

```python
from openai import OpenAI

client = OpenAI(base_url="http://<host>:<port>/v1", api_key="sk-xxx")
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "<image_url>"
                    },
                },
                {"type": "text", "text": "What does the image say?"},
            ],
        }
    ],
)
print(response)
```
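Remote image URLs require the server to be able to fetch the image. As in the notebook added by this commit, an image can instead be embedded directly as a base64 data URL. A minimal sketch (helper name is hypothetical):

```python
import base64

def image_to_data_url(data: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URL usable in the image_url field."""
    return f"data:{mime};base64," + base64.b64encode(data).decode("utf-8")

# The returned string can be passed as {"image_url": {"url": image_to_data_url(...)}}
print(image_to_data_url(b"hello"))  # data:image/png;base64,aGVsbG8=
```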
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ChatCompletion(id='chatcmpl-65a710ba-41d1-4d0a-a124-a44b2b4a0189', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content=' The image reads \"LlamaC++.\"', role='assistant', function_call=None, tool_calls=None))], created=1699413274, model='gpt-4-vision-preview', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=10, prompt_tokens=624, total_tokens=634))\n"
     ]
    }
   ],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "import urllib.request\n",
    "import base64\n",
    "\n",
    "def get_data_url(url):\n",
    "    return \"data:image/png;base64,\" + base64.b64encode(urllib.request.urlopen(url).read()).decode(\"utf-8\")\n",
    "\n",
    "client = OpenAI(base_url=\"http://100.64.159.73:8000/v1\", api_key=\"sk-1234\")\n",
    "response = client.chat.completions.create(\n",
    "    model=\"gpt-4-vision-preview\",\n",
    "    messages=[\n",
    "        {\n",
    "            \"role\": \"user\",\n",
    "            \"content\": [\n",
    "                {\n",
    "                    \"type\": \"image_url\",\n",
    "                    \"image_url\": {\n",
    "                        \"url\": get_data_url(\"https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png\"),\n",
    "                        # \"url\": \"https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png\",\n",
    "                    },\n",
    "                },\n",
    "                {\"type\": \"text\", \"text\": \"What does the image say\"},\n",
    "            ],\n",
    "        }\n",
    "    ],\n",
    ")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5+"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
This seems to have broken building on Windows for me... sigh
@tk-master do you mind creating an issue in llama.cpp and sending me a link? This looks to be an upstream cmake bug.
@abetlen the thing is, I tried building the latest llama.cpp from the master branch and it was successful... not sure what's going on. I can't open an issue in llama.cpp then.
I see you updated to the latest vendor commit in main too, and Windows builds are still failing. Is there a formal LCP issue about this? Even if it's an upstream failure, is an issue inappropriate? Right now I am following comments on a commit to stay abreast of the trouble.
It builds for me after I removed the changes in CMakeLists.txt, but I'm guessing it didn't build llava now...
@tk-master, I got it to compile by adding this line:
Still have to test it, but figured I'd report progress... Will try it with non-OFF arch later...
EDIT: Not working with the example setup... gonna give this a rest for a while.
@bioshazard changing

`set_target_properties(llava_shared PROPERTIES OUTPUT_NAME "llava")`

to

`set_target_properties(llava_shared PROPERTIES OUTPUT_NAME "llava" CUDA_ARCHITECTURES OFF)`

..worked for building it at least!
Someone should test if llava works as expected on Windows with this though.
@tk-master I tried to follow the README after attempting to compile like that, but got some other error... Might try to mess with it more next week, and I will open an issue in this repo if I can get it working on Linux but not Windows. Also might attempt to compile without the CUBLAS env var.
@bioshazard does the latest release work for you? It seems to have fixed the issue for some others.
@abetlen yes! Just noticed the latest release totally builds on Windows. Thanks for following up! Haven't tried a llava model yet, but the build issues are fixed in 0.2.17 on Windows, tyty.
@tk-master ^^