
Allow using locally-running OpenAI API-compatible service #277

Closed
igor-elbert opened this issue May 20, 2024 · 18 comments

Comments

@igor-elbert

Is your feature request related to a problem? Please describe.
We have Ollama and Jan.ai running locally and want to use them instead of OpenAI for data privacy reasons.

Describe the solution you'd like
Please allow adding a URL to the configuration. If the service conforms to the OpenAI API, the rest of the code should work.

Describe alternatives you've considered
IP forwarding, but it's clunky.

Additional context
Other similar services (e.g. CodeGPT) allow custom URLs for the LLM services.
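
For illustration, here is a minimal sketch of what such a configuration option would enable: pointing the standard openai Python client at a locally running, OpenAI API-compatible endpoint. The base URL and model name below are assumptions for a default local Ollama install.

```python
# Sketch: point the standard OpenAI client at a locally running,
# OpenAI API-compatible service. Base URL and model name below are
# illustrative for a default local Ollama install.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # ignored by Ollama, but the client requires a value
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```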

@cyyeh
Member

cyyeh commented May 20, 2024

Thanks for raising this issue and suggesting a great solution for supporting other LLM services! We'll definitely take a look and think about it.

@cyyeh
Member

cyyeh commented May 20, 2024

@igor-elbert There is another issue concerning how we support embedding models other than OpenAI's. As of now, I suppose we can't directly use Ollama's embedding models, since they don't conform to OpenAI's API. Am I correct?

Reference: https://ollama.com/blog/embedding-models. In the "Coming soon" section, OpenAI API Compatibility is one of the items.
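
For reference, the blog post above documents only Ollama's native embeddings endpoint, which looks roughly like the sketch below (the model name follows that post; this is the non-OpenAI-compatible API):

```python
# Sketch: Ollama's native embeddings API (not OpenAI-compatible),
# per the blog post linked above. Model name follows that post.
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "mxbai-embed-large", "prompt": "Represent this sentence"},
)
# Native response shape: {"embedding": [...]}; note it differs from
# OpenAI's {"data": [{"embedding": [...]}]} envelope.
print(len(resp.json()["embedding"]))
```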

@cyyeh
Member

cyyeh commented May 20, 2024

I have created a branch for this issue: https://github.com/Canner/WrenAI/tree/feature/ai-service/changing-providers

However, I think one issue we need to tackle first is letting community members use their preferred embedding models more easily. As of now, we only use OpenAI's embedding models. There are three things community members would likely want to change on their own: generators, vector databases, and embedding models. One caveat in our current design is that the generator and the embedding model must come from the same LLM provider, such as OpenAI.

What are your thoughts about it?

@ccollie

ccollie commented May 20, 2024

Apropos of this, support for Defog's SQLCoder would be nice.

@igor-elbert
Author

igor-elbert commented May 20, 2024 via email

@ccollie

ccollie commented May 20, 2024

I think Ollama does support it: OpenAI compatibility

Yup. Found it here

@cyyeh
Member

cyyeh commented May 21, 2024

@igor-elbert @ccollie I've tested Ollama's text generation model and embedding model support for OpenAI API compatibility. The result is that the embedding model can't be used via the OpenAI API. Please check out the attached gist URL for reproduction. Please correct me if I am wrong, thanks.

https://gist.github.com/cyyeh/b1042006b4ca067f2a75abd97e3749fb
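
For readers who don't open the gist, the result can be reproduced along these lines (a sketch of the same kind of test, not the gist verbatim; model names are illustrative):

```python
# Sketch: at the time of this thread, Ollama's OpenAI-compatible
# endpoint served chat completions but not embeddings.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Chat completions work through the compatibility layer.
chat = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "ping"}],
)
print(chat.choices[0].message.content)

# Embeddings did not: the /v1/embeddings route was not implemented,
# so this call raised an API error instead of returning vectors.
try:
    client.embeddings.create(model="mxbai-embed-large", input="ping")
except Exception as exc:
    print(f"embeddings via the OpenAI API failed: {exc}")
```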

@ccollie

ccollie commented May 21, 2024

Unfortunately I don't (yet) have an Ollama setup. However, I did find this, which led me to believe this is possible.

@ccollie

ccollie commented May 21, 2024

Sorry, I misread the gist (re embeddings), but as far as the Ollama docs go, they currently only support the chat completions API.

@cyyeh
Member

cyyeh commented May 21, 2024

@igor-elbert @ccollie

Hi, we just refined how you can add your preferred LLM and Document Store. You only need to define the LLM and Document Store and their environment variables! For details, please check out the guide here: https://docs.getwren.ai/installation/custom_llm

For adding Ollama, I've created a branch and made a minimal implementation; feel free to check it out: https://github.com/Canner/WrenAI/tree/feature/ai-service/add-ollama

ONE CAVEAT: after you define your own LLM, you may find the AI pipelines break. That's because your LLM may not be suitable for the prompts, so at the moment you need to do prompt engineering yourself. In the future, we'll come up with ways for you to easily extend and customize your prompts, or you're welcome to share your thoughts here. As of now, I suppose the prompts and the respective LLM should match to get the best performance.
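
For a rough illustration of the "define your LLM via environment variables" flow described in the guide above, a hypothetical configuration could look like the sketch below. The variable names (LLM_PROVIDER, LLM_BASE_URL, etc.) are illustrative, not the actual keys Wren AI uses; see the linked guide for the real interface.

```python
# Hypothetical sketch of the "provider + environment variables"
# pattern the guide describes; the variable names are illustrative,
# not Wren AI's actual configuration keys.
import os

from openai import OpenAI

LLM_PROVIDER = os.getenv("LLM_PROVIDER", "openai")  # e.g. "openai", "ollama"
LLM_BASE_URL = os.getenv("LLM_BASE_URL", "https://api.openai.com/v1")
LLM_MODEL = os.getenv("LLM_MODEL", "gpt-3.5-turbo")

def make_client() -> OpenAI:
    # Any OpenAI API-compatible service can then be selected purely
    # through environment variables, with no code changes.
    return OpenAI(base_url=LLM_BASE_URL, api_key=os.getenv("LLM_API_KEY", "ollama"))
```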

@cyyeh
Member

cyyeh commented May 21, 2024

If there are no more issues, I'll close this issue then. Thank you :)

@qdrddr
Contributor

qdrddr commented May 22, 2024

You might also be interested in these models, which you can run locally to generate SQL:

https://ollama.com/library/sqlcoder
https://ollama.com/library/codeqwen
https://ollama.com/library/starcoder2

@cyyeh
Member

cyyeh commented May 22, 2024

We'll merge the add-ollama branch into the main branch after we make sure it won't break our current AI pipelines. We will investigate some ways to solve the issue, for example https://github.com/noamgat/lm-format-enforcer
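
For context, lm-format-enforcer constrains token-by-token generation so the output must conform to a schema. Its transformers integration looks roughly like the sketch below, adapted from that project's README; the model and schema here are illustrative.

```python
# Sketch of lm-format-enforcer's transformers integration, adapted
# from that project's README: generation is constrained so the output
# stays valid against a JSON schema. Model and schema are illustrative.
from pydantic import BaseModel
from transformers import pipeline
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import (
    build_transformers_prefix_allowed_tokens_fn,
)

class SqlAnswer(BaseModel):
    sql: str

hf_pipeline = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
parser = JsonSchemaParser(SqlAnswer.model_json_schema())
prefix_fn = build_transformers_prefix_allowed_tokens_fn(hf_pipeline.tokenizer, parser)

output = hf_pipeline(
    'Answer in JSON with a single "sql" field: how many users are there?',
    prefix_allowed_tokens_fn=prefix_fn,
)
print(output[0]["generated_text"])
```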

@qdrddr
Contributor

qdrddr commented May 31, 2024

The duckdb-nsql model might also be useful for this project, since WrenAI already has DuckDB.

@qdrddr
Contributor

qdrddr commented May 31, 2024

We'll merge the add-ollama branch into the main branch after we make sure it won't break our current AI pipelines. We will investigate some ways to solve the issue, for example https://github.com/noamgat/lm-format-enforcer

To enforce some format you might need a model that supports function calling, such as Mistral 7B v0.3. Please note this model might not be particularly strong at SQL generation. @cyyeh
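
For illustration, OpenAI-style function calling against a local endpoint would look roughly like the sketch below. It assumes the local server exposes the OpenAI tools parameter and that the model (e.g. Mistral 7B v0.3) supports it; the tool definition is hypothetical.

```python
# Sketch: OpenAI-style tool/function calling against a local,
# OpenAI API-compatible endpoint. Assumes the server and model both
# support the `tools` parameter; the tool definition is hypothetical.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

tools = [{
    "type": "function",
    "function": {
        "name": "run_sql",
        "description": "Execute a SQL query against the warehouse",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistral",
    messages=[{"role": "user", "content": "How many users signed up last week?"}],
    tools=tools,
)
# If the model chose to call the tool, the structured arguments are here:
print(response.choices[0].message.tool_calls)
```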

cyyeh added the module/ai-service (ai-service related) label on Jun 1, 2024
@qdrddr
Contributor

qdrddr commented Jun 6, 2024

FYI, the two most popular inference engines are:
Ollama (partly compatible with the OpenAI APIs; it mostly uses its own APIs)
and LocalAI (tends to be almost fully compatible with the OpenAI APIs).

I would suggest using the LiteLLM framework, which can bridge different LLM providers and make it easier to maintain them and add new ones.
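
For reference, LiteLLM exposes a single completion() call across providers, so switching backends is mostly a matter of changing the model string. A sketch, with illustrative model names:

```python
# Sketch: LiteLLM routes a single completion() interface to many
# providers; swapping backends only changes the model string.
# Model names are illustrative.
from litellm import completion

messages = [{"role": "user", "content": "Generate SQL to count users."}]

# OpenAI backend (reads OPENAI_API_KEY from the environment).
openai_resp = completion(model="gpt-3.5-turbo", messages=messages)

# Local Ollama backend: same call shape, different model string.
ollama_resp = completion(
    model="ollama/llama3",
    messages=messages,
    api_base="http://localhost:11434",
)
print(ollama_resp.choices[0].message.content)
```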

@cyyeh
Member

cyyeh commented Jun 11, 2024

All, Ollama has been integrated in this branch, and you can also use any OpenAI API-compatible LLM: chore/ai-service/update-env
We'll merge this branch into the main branch in the near future and update the documentation.
For now, I'll delete the original ollama branch.
Thank you all for your patience.

related pr: #376

@cyyeh
Member

cyyeh commented Jun 28, 2024

All, we now support using Ollama and OpenAI API-compatible LLMs with the latest release: https://github.com/Canner/WrenAI/releases/tag/0.6.0

Setup instructions for running Wren AI with your custom LLM: https://docs.getwren.ai/installation/custom_llm#running-wren-ai-with-your-custom-llm-or-document-store

Currently, there is one obvious limitation for custom LLMs: you need to use the same provider (such as OpenAI or Ollama) for both the LLM and the embedding model. We'll fix that and release a new version soon. Stay tuned 🙂

I'll close this issue as completed now.

cyyeh closed this as completed on Jun 28, 2024