Add local models to non-streaming accept list #14420
Conversation
Thanks for this hotfix.
In general we should make this configurable per (custom) model with an additional attribute. It is not only "custom" models; o1-preview also does not support streaming at the moment.
Middle-ground suggestion to get this in quickly:
Custom OpenAI models can now be configured with `disableStreaming: true` to indicate that streaming shall not be used. This is especially useful for models which do not support streaming at all.

Co-authored-by: Matthew Khouzam <matthew.khouzam@ericsson.com>
Adapted the PR. @MatthewKhouzam, can you check whether this works for you?
@MatthewKhouzam Did you have a chance to look at the changes? Can we merge?
I am OK with this... but should it be `enableStreaming`? I'm imagining tech support saying "yeah, go ahead and enable disableStreaming" and people having a hard time. It's a bit more error-prone when a boolean is negated.
I want to be clear: I approved on my side. Anything else left to do?
What it does
Allows local model orchestrators like GPT4All to serve as a back end for Theia's AI features by making `disableStreaming` configurable for them.

Starts to address #14413.
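For illustration, a minimal sketch of how a model wrapper might honor such a per-model flag. All names here (`ChatClient`, `requestChat`, the config shape) are hypothetical and chosen for this sketch; they are not Theia's actual implementation:

```typescript
// Hypothetical sketch, not Theia's actual code: a chat request helper
// that falls back to a single non-streaming request when the model's
// configuration sets `disableStreaming`.

interface ChatMessage {
    role: 'system' | 'user' | 'assistant';
    content: string;
}

interface ChatClient {
    // Non-streaming: resolves with the complete answer in one response.
    complete(model: string, messages: ChatMessage[]): Promise<string>;
    // Streaming: yields partial chunks as they arrive.
    stream(model: string, messages: ChatMessage[]): AsyncIterable<string>;
}

interface CustomModelConfig {
    model: string;
    disableStreaming?: boolean; // opt out for back ends without streaming support
}

async function requestChat(
    client: ChatClient,
    cfg: CustomModelConfig,
    messages: ChatMessage[]
): Promise<string> {
    if (cfg.disableStreaming) {
        // One round trip; the whole answer arrives at once.
        return client.complete(cfg.model, messages);
    }
    // Default path: accumulate streamed chunks into the final answer.
    let result = '';
    for await (const chunk of client.stream(cfg.model, messages)) {
        result += chunk;
    }
    return result;
}
```

The point of keeping the flag on the model configuration (rather than globally) is that a single Theia installation can mix streaming-capable remote models with local orchestrators that only support blocking requests.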
How to test
Configure a custom model with `disableStreaming: true` and verify that requests still succeed.
Example OpenAI configuration (using the official OpenAI endpoint):
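The original configuration example was not captured here. As a rough sketch only, a custom-model entry with the new attribute might look like the following; the preference key and field names are assumptions based on Theia's custom OpenAI model settings and may differ from the actual implementation:

```json
{
    "ai-features.openAiCustom.customOpenAiModels": [
        {
            "model": "gpt-4o",
            "url": "https://api.openai.com/v1",
            "id": "openai-no-streaming",
            "disableStreaming": true
        }
    ]
}
```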
Follow-ups
We need to make `max_tokens` configurable, as different orchestrators stop at different lengths.