Allow using locally-running OpenAI API-compatible service #277
Comments
Thanks for raising the issue and suggesting a great solution for supporting other LLM services! We'll definitely take a look and think about it.
@igor-elbert There is another issue concerning how we support embedding models other than OpenAI's. As of now, I suppose we can't directly use Ollama's supported embedding models, since they don't conform to OpenAI's API. Am I correct? Reference: https://ollama.com/blog/embedding-models. In the "Coming soon" section, OpenAI API compatibility is one of the items.
I have created a branch for this issue: https://github.com/Canner/WrenAI/tree/feature/ai-service/changing-providers

However, I think one issue we need to tackle first is letting community members more easily use their preferred embedding models. As of now, we only use OpenAI's embedding models. There are three things community members would likely want to change on their own: generators, vector databases, and embedding models. One caveat in our current design is that the generator and the embedding model must come from the same LLM provider, such as OpenAI. What are your thoughts?
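To make the decoupling concrete, here is a hypothetical sketch of a configuration that treats the three components independently. The keys and provider names are illustrative only, not WrenAI's actual settings:

```python
# Hypothetical configuration: each component could come from a different provider.
pipeline_config = {
    "generator": {"provider": "ollama", "model": "llama3", "url": "http://localhost:11434"},
    "embedder": {"provider": "openai", "model": "text-embedding-3-small"},
    "document_store": {"provider": "qdrant", "url": "http://localhost:6333"},
}
```

Under the caveat described above, the `generator` and `embedder` entries would currently be forced to share the same `provider` value.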
Apropos of this, support for Defog's SQLCoder would be nice.
I think Ollama does support it: see "OpenAI compatibility" on the Ollama blog (https://ollama.com/blog/openai-compatibility).
Yup. Found it here.
@igor-elbert @ccollie I've tested Ollama's text generation model and embedding model support for OpenAI API compatibility. The result is that the embedding model can't be used via the OpenAI API. Please check out the attached gist for reproduction, and correct me if I'm wrong. Thanks: https://gist.github.com/cyyeh/b1042006b4ca067f2a75abd97e3749fb
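For anyone reproducing this without the gist, here is a minimal sketch of the behavior described above. It assumes Ollama is serving on its default local port and has already pulled the models; the model names (`llama3`, `nomic-embed-text`) are illustrative:

```python
import requests
from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint under /v1.
# An api_key is required by the client but ignored by Ollama.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Chat completions work through the OpenAI-compatible API.
chat = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(chat.choices[0].message.content)

# Embeddings did not work through it at the time of this thread;
# this call is expected to fail.
try:
    client.embeddings.create(model="nomic-embed-text", input="hello world")
except Exception as exc:
    print(f"OpenAI-style embeddings failed: {exc}")

# Ollama's native embeddings endpoint works instead.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello world"},
)
print(len(resp.json()["embedding"]))
```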
Unfortunately I don't (yet) have an Ollama setup. However, I did find this, which led me to believe it is possible.
Sorry, I misread the gist (re: embeddings). But as far as the Ollama docs go, they currently only support the chat completions API.
Hi, we just refined how you can add your preferred LLM and document store. You only need to define the LLM and document store along with their environment variables! For details, please check out the guide here: https://docs.getwren.ai/installation/custom_llm

For adding Ollama, I've created a branch with a minimal implementation; feel free to check it out: https://github.com/Canner/WrenAI/tree/feature/ai-service/add-ollama

ONE CAVEAT: after you define your own LLM, you may find the AI pipelines break. That's because your LLM may not suit the existing prompts, so for now you need to do some prompt engineering yourself. In the future, we'll come up with ways for you to easily extend and customize your prompts; you're welcome to share your thoughts here. As of now, I suppose the prompts and the respective LLM should match to get the best performance.
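As a rough illustration of the "define your own LLM" idea, here is a hypothetical sketch of a custom provider wrapping an OpenAI API-compatible endpoint. The class, method, and environment variable names are made up for illustration and are not WrenAI's actual interface (see the guide linked above for the real one):

```python
import os
from openai import AsyncOpenAI

class CustomLLMProvider:
    """Hypothetical provider wrapping any OpenAI API-compatible endpoint."""

    def __init__(self):
        # Environment variable names here are illustrative.
        self._client = AsyncOpenAI(
            base_url=os.getenv("CUSTOM_LLM_URL", "http://localhost:11434/v1"),
            api_key=os.getenv("CUSTOM_LLM_API_KEY", "unused"),
        )
        self._model = os.getenv("CUSTOM_LLM_MODEL", "llama3")

    async def generate(self, prompt: str) -> str:
        # Forward the prompt to the configured endpoint and return the text.
        resp = await self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content
```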
If there are no more issues, I'll close this issue then. Thank you :)
You might also be interested in these models, which you can run locally to generate SQL: https://ollama.com/library/sqlcoder
We'll merge the
Might also be useful for this project, since WrenAI already has DuckDB.
To enforce a particular output format, you might need a model that supports function calling, such as Mistral 7B v0.3. Please note this model might not be particularly strong at SQL generation. @cyyeh
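A minimal sketch of the function-calling idea: declare a tool whose only argument is the SQL string, so the model is forced into a structured reply. It assumes the model is served through an OpenAI API-compatible endpoint and actually supports tool calls; the endpoint, model name, and tool schema are illustrative:

```python
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

# A tool whose single argument carries the generated SQL.
tools = [{
    "type": "function",
    "function": {
        "name": "submit_sql",
        "description": "Return the generated SQL query.",
        "parameters": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    },
}]

resp = client.chat.completions.create(
    model="mistral",
    messages=[{"role": "user", "content": "List all users created in 2024."}],
    tools=tools,
    # Force the model to call the tool rather than reply in free text.
    tool_choice={"type": "function", "function": {"name": "submit_sql"}},
)
call = resp.choices[0].message.tool_calls[0]
print(json.loads(call.function.arguments)["sql"])
```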
FYI, there are two popular inference engines. I would suggest using the LiteLLM framework, which can bridge different LLM providers and makes it easier to maintain them and add new ones.
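For illustration, LiteLLM exposes one `completion` call across providers, so swapping a hosted model for a local one is mostly a change of model string. A minimal sketch, assuming an OpenAI API key is set and a local Ollama instance has pulled `llama3` (model names are illustrative):

```python
from litellm import completion

messages = [{"role": "user", "content": "Write a SQL query that counts rows in `orders`."}]

# Same call shape for a hosted provider...
openai_resp = completion(model="gpt-3.5-turbo", messages=messages)

# ...and for a local Ollama model; only the model string and base URL change.
ollama_resp = completion(
    model="ollama/llama3",
    messages=messages,
    api_base="http://localhost:11434",
)
print(ollama_resp.choices[0].message.content)
```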
All, Ollama has been integrated in this branch, and you can also use an OpenAI API-compatible LLM. Related PR: #376
All, we now support Ollama and OpenAI API-compatible LLMs with the latest release: https://github.com/Canner/WrenAI/releases/tag/0.6.0

Instructions for running Wren AI with custom LLMs: https://docs.getwren.ai/installation/custom_llm#running-wren-ai-with-your-custom-llm-or-document-store

Currently, there is one obvious limitation for custom LLMs: you need to use the same provider (such as OpenAI or Ollama) for both the LLM and the embedding model. We'll fix that and release a new version soon. Stay tuned 🙂 I'll close this issue as completed now.
Is your feature request related to a problem? Please describe.
We have Ollama and Jan.ai running locally and want to use them instead of OpenAI for data privacy reasons.
Describe the solution you'd like
Please allow adding a URL to the configuration. If the service conforms to the OpenAI API, the rest of the code should work.
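To illustrate why this is typically a small change: the official OpenAI Python client already accepts a custom base URL, so pointing it at a local OpenAI-compatible server is one parameter. A minimal sketch, assuming Ollama's default host and port and an illustrative model name:

```python
from openai import OpenAI

# Same client, same code paths; only the base URL differs from the hosted service.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed-locally")
resp = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```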
Describe alternatives you've considered
IP forwarding, but it's clunky.
Additional context
Other similar services (e.g., CodeGPT) allow custom URLs for LLM services.