
.Net: Test local model with Semantic Kernel (i.e., Llama via Ollama) #3990

Closed
matthewbolanos opened this issue Dec 5, 2023 · 4 comments

matthewbolanos (Member) commented Dec 5, 2023

Deploy a Llama model locally with Ollama and validate that it works with Semantic Kernel and the existing IChatCompletionService interface.
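
For context, the kind of wiring under test looks roughly like the sketch below. This is a hedged sketch, not a verified setup: it assumes a .NET console app with implicit usings, a local Ollama instance on its default port (11434) exposing an OpenAI-compatible chat completions route, and it uses a hypothetical `OllamaRedirectHandler` plus a placeholder model name to point Semantic Kernel's OpenAI connector at the local server; exact connector overloads vary by Semantic Kernel version.

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Route the OpenAI connector's HTTP traffic to the local Ollama server instead of api.openai.com.
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(
    modelId: "llama2",       // placeholder: any model pulled locally, e.g. via `ollama pull llama2`
    apiKey: "unused",        // Ollama ignores the key, but the connector requires a value
    httpClient: new HttpClient(new OllamaRedirectHandler()));
Kernel kernel = builder.Build();

// Exercise the existing IChatCompletionService abstraction against the local model.
var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("Why is the sky blue?");
var reply = await chat.GetChatMessageContentAsync(history);
Console.WriteLine(reply.Content);

// Hypothetical handler: keeps the request path and query, but points the request at the
// local Ollama instance, assumed to expose an OpenAI-compatible chat completions endpoint.
internal sealed class OllamaRedirectHandler : DelegatingHandler
{
    public OllamaRedirectHandler() : base(new HttpClientHandler()) { }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        request.RequestUri = new Uri(new Uri("http://localhost:11434"), request.RequestUri!.PathAndQuery);
        return base.SendAsync(request, cancellationToken);
    }
}
```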

@matthewbolanos matthewbolanos converted this from a draft issue Dec 5, 2023
@shawncal shawncal added the triage label Dec 5, 2023
@matthewbolanos matthewbolanos added the .NET Issue or Pull requests regarding .NET code label Dec 5, 2023
@github-actions github-actions bot changed the title Test local model with Semantic Kernel (i.e., Llama via Ollama) .Net: Test local model with Semantic Kernel (i.e., Llama via Ollama) Dec 5, 2023
@alliscode alliscode self-assigned this Dec 6, 2023
matthewbolanos (Member, Author) commented

Doesn't need to test function calling.

@markwallace-microsoft markwallace-microsoft added v1.0.1 Required for the Semantic Kernel v1.0.1 release and removed triage v1 bugbash labels Dec 14, 2023
alliscode (Member) commented

I tested our IChatCompletionService with Ollama using mistral. My conclusion is that our abstractions are good enough to allow this to work, but it does not currently work due to some implementation details in the Azure OpenAI SDK:

- The Ollama chat API uses the same interface as OpenAI's; however, the Ollama response streams by default, whereas the OpenAI response does not. This is a problem because the only way to disable streaming in Ollama is to set stream=false in the request body, and the Azure OpenAI SDK never does that: the value it sends is either true or null (missing). (A sketch of such a request appears after this comment.)
- Ollama streaming uses Server-Sent Events just like OpenAI, but Ollama uses named events, also known as multi-line events (see this), and the Azure OpenAI SDK does not support them; it throws an exception.

Once these issues are fixed in the Azure OpenAI SDK, our IChatCompletionService will support Ollama.
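
For reference, a minimal sketch (not from the original comment) of the knob in question: calling Ollama's /api/chat endpoint directly and opting out of streaming by putting stream=false in the request body, which is exactly the value the Azure OpenAI SDK never sends. It assumes a local Ollama instance on the default port with the mistral model pulled.

```csharp
using System.Net.Http.Json;
using System.Text.Json;

using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434") };

// "stream": false is the only way to get a single, non-streaming response from Ollama;
// the Azure OpenAI SDK only ever sends true or omits the field entirely.
var response = await http.PostAsJsonAsync("/api/chat", new
{
    model = "mistral",
    messages = new[] { new { role = "user", content = "Why is the sky blue?" } },
    stream = false
});
response.EnsureSuccessStatusCode();

// The non-streaming reply is a single JSON object whose "message" field holds the assistant turn.
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
Console.WriteLine(doc.RootElement.GetProperty("message").GetProperty("content").GetString());
```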

@github-project-automation github-project-automation bot moved this to Sprint: Done in Semantic Kernel Dec 15, 2023
stephentoub (Member) commented

> Once these issues are fixed in the Azure OpenAI SDK, our IChatCompletionService will support Ollama.

Are there issues open on that for Azure.AI.OpenAI? Is anyone working on it? Timeframe?

clement128 commented

Hello, any update on this?

Labels
.NET: Issue or Pull requests regarding .NET code
v1.0.1: Required for the Semantic Kernel v1.0.1 release

6 participants