-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Dall-E 3 Image Generation #185
base: main
Are you sure you want to change the base?
Conversation
0ce7e95
to
d61142a
Compare
@@ -207,6 +207,24 @@ QAzureOpenAIChatOptions.OpenAIDeploymentConnection connection in this._qAzureOpe | |||
return chatCompletionDeployments; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't actually use this yet... wondering if we are going to eventually select deployments like the chat completion model, or if it will always be one (i.e dalle-3)
@@ -971,11 +993,40 @@ private async Task<CopilotChatMessage> StreamResponseToClientAsync( | |||
); | |||
} | |||
} | |||
var chatHistory = prompt.MetaPromptTemplate; | |||
|
|||
// logic to get the last user message and extract relevant text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like this (lines 998-1003), must be a better way. Throwing it up in the meantime to get some eyes on it. It essentially does some string manipulation to the prompt object to get the description text that dall-e-3
will be called with
@@ -905,6 +906,23 @@ private Dictionary<string, int> GetTokenUsages(KernelArguments kernelArguments, | |||
return tokenUsageDict; | |||
} | |||
|
|||
private bool IsImageRequest(string prompt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally would have the semantic kernel interpret the prompt and use its built in capabilities to determine what the appropriate service is. For now we have custom logic to determine if it is an image request. Would like to iterate on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason we cant have the semantic kernel interpret the prompt at the moment?
@@ -905,6 +906,23 @@ private Dictionary<string, int> GetTokenUsages(KernelArguments kernelArguments, | |||
return tokenUsageDict; | |||
} | |||
|
|||
private bool IsImageRequest(string prompt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason we cant have the semantic kernel interpret the prompt at the moment?
/// </summary> | ||
public List<string> GetAllTextToImageDeployments() | ||
{ | ||
var textToImageDeployments = new List<string>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see duplicate code between this and GetAllChatCompletionDeployments
that could be shaved down
@@ -53,9 +53,18 @@ public class OpenAIDeploymentConnection | |||
public string APIKey { get; set; } = string.Empty; | |||
public IList<ChatCompletionDeployment> ChatCompletionDeployments { get; set; } = | |||
new List<ChatCompletionDeployment>(); | |||
public IList<string> ImageGenerationDeployments { get; set; } = new List<string>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of IList<string>
I think we should be doing something similar to the way ChatCompletionDeployments
is declared
appsettings.Development.json
changes needed after mergeSmall changes to appsettings will be needed when this is merged, refer to the helm changes. Only need to apply this to the eastus endpoint.
Motivation and Context
Allow the user to generate images while chatting with Q-Pilot. I am making a separate ticket for enabling this only for certain specializations to keep the PRs a bit more compact.
Description
Dall-E 3 has been added as a deployment; however, it is only available with our east-us services. Currently it just looks for trigger words when the user is chatting with the bot, I am going to further investigate if we can get the Semantic Kernel to automatically detect and decide which deployment to use.
A flag is set on the bot response for
IsImage
and this flag is interpreted by the frontend to display the source given from the dall-e-3 deployment in an<img>
.Contribution Checklist