Add Dall-E 3 Image Generation #185

jdtoombs · 2024-11-14T01:06:34Z

`appsettings.Development.json` changes needed after merge

Small changes to appsettings will be needed when this is merged; refer to the helm changes. They only need to be applied to the eastus endpoint.

Motivation and Context

Allow the user to generate images while chatting with Q-Pilot. I am making a separate ticket for enabling this only for certain specializations to keep the PRs a bit more compact.

Description

Dall-E 3 has been added as a deployment; however, it is only available with our east-us services. Currently the code just looks for trigger words when the user is chatting with the bot. I am going to investigate further whether we can get Semantic Kernel to automatically detect and decide which deployment to use.

An `IsImage` flag is set on the bot response. The frontend reads this flag and displays the source returned by the dall-e-3 deployment in an `<img>` element.

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines and the pre-submission formatting script raises no violations
All unit tests pass, and I have added new tests where possible
I didn't break anyone 😄

jdtoombs · 2024-11-14T01:32:00Z

webapi/Plugins/Chat/Ext/QAzureOpenAIChatExtension.cs

@@ -207,6 +207,24 @@ QAzureOpenAIChatOptions.OpenAIDeploymentConnection connection in this._qAzureOpe
        return chatCompletionDeployments;


We don't actually use this yet. I'm wondering whether we will eventually select deployments the way we do for the chat completion model, or whether it will always be a single one (i.e., dall-e-3).

jdtoombs · 2024-11-14T05:21:39Z

webapi/Plugins/Chat/ChatPlugin.cs

@@ -971,11 +993,40 @@ private async Task<CopilotChatMessage> StreamResponseToClientAsync(
                );
            }
        }
+        var chatHistory = prompt.MetaPromptTemplate;
+
+        // logic to get the last user message and extract relevant text


I don't really like this (lines 998-1003); there must be a better way. Throwing it up in the meantime to get some eyes on it. It essentially does some string manipulation on the prompt object to get the description text that dall-e-3 will be called with.
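One possible alternative to string manipulation, sketched under the assumption that the chat history is available as structured entries with a role and content. `ChatEntry` and `ImageDescriptionExtractor` are hypothetical stand-ins and are not types from this PR; the real prompt object may be shaped differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one chat history entry (not a type from this PR).
public class ChatEntry
{
    public string Role { get; set; } = string.Empty;
    public string Content { get; set; } = string.Empty;
}

public static class ImageDescriptionExtractor
{
    // Returns the content of the most recent user entry,
    // or an empty string when the history has no user entry.
    public static string GetLastUserMessage(IEnumerable<ChatEntry> history)
    {
        return history
            .LastOrDefault(e => string.Equals(e.Role, "user", StringComparison.OrdinalIgnoreCase))
            ?.Content ?? string.Empty;
    }
}
```

Selecting by role avoids depending on how the prompt happens to be serialized, at the cost of needing the history in structured form.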

jdtoombs · 2024-11-14T05:23:12Z

webapi/Plugins/Chat/ChatPlugin.cs

@@ -905,6 +906,23 @@ private Dictionary<string, int> GetTokenUsages(KernelArguments kernelArguments,
        return tokenUsageDict;
    }

+    private bool IsImageRequest(string prompt)


Ideally, Semantic Kernel would interpret the prompt and use its built-in capabilities to determine the appropriate service. For now, we have custom logic to determine whether it is an image request. I would like to iterate on this.
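A minimal sketch of what trigger-word detection of this kind could look like. The phrase list and the `ImageRequestDetector` class name are assumptions for illustration; the actual body of `IsImageRequest` is not shown in this hunk.

```csharp
using System;
using System.Linq;

public static class ImageRequestDetector
{
    // Hypothetical trigger phrases; the PR's actual list is not shown.
    private static readonly string[] TriggerPhrases =
    {
        "generate an image",
        "create an image",
        "draw a picture",
    };

    // Returns true when the prompt contains any trigger phrase, ignoring case.
    public static bool IsImageRequest(string prompt)
    {
        if (string.IsNullOrWhiteSpace(prompt))
        {
            return false;
        }

        return TriggerPhrases.Any(phrase =>
            prompt.Contains(phrase, StringComparison.OrdinalIgnoreCase));
    }
}
```

Plain substring matching like this is simple but brittle: a prompt that merely mentions "generate an image" also matches, which is one reason to move the decision into the kernel later.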


JTraill · 2024-11-15T17:49:19Z

webapi/Plugins/Chat/ChatPlugin.cs

@@ -905,6 +906,23 @@ private Dictionary<string, int> GetTokenUsages(KernelArguments kernelArguments,
        return tokenUsageDict;
    }

+    private bool IsImageRequest(string prompt)


What's the reason we can't have Semantic Kernel interpret the prompt at the moment?

JTraill · 2024-11-15T18:23:43Z

webapi/Plugins/Chat/Ext/QAzureOpenAIChatExtension.cs

+    /// </summary>
+    public List<string> GetAllTextToImageDeployments()
+    {
+        var textToImageDeployments = new List<string>();


I see duplicate code between this and `GetAllChatCompletionDeployments` that could be shaved down.
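One way the duplication might be reduced, sketched with simplified stand-in types: a shared helper takes a selector for the deployment list to flatten. `DeploymentConnection`, `DeploymentCatalog`, and `CollectDeployments` are hypothetical names, and both lists are plain strings here, which is not how `ChatCompletionDeployments` is declared in the real options class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for QAzureOpenAIChatOptions.OpenAIDeploymentConnection.
public class DeploymentConnection
{
    public IList<string> ChatCompletionDeployments { get; set; } = new List<string>();
    public IList<string> ImageGenerationDeployments { get; set; } = new List<string>();
}

public class DeploymentCatalog
{
    private readonly IReadOnlyList<DeploymentConnection> _connections;

    public DeploymentCatalog(IReadOnlyList<DeploymentConnection> connections)
    {
        this._connections = connections;
    }

    // Shared helper: flattens whichever per-connection list the selector picks.
    private List<string> CollectDeployments(
        Func<DeploymentConnection, IEnumerable<string>> selector)
    {
        return this._connections.SelectMany(selector).ToList();
    }

    public List<string> GetAllChatCompletionDeployments() =>
        this.CollectDeployments(c => c.ChatCompletionDeployments);

    public List<string> GetAllTextToImageDeployments() =>
        this.CollectDeployments(c => c.ImageGenerationDeployments);
}
```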

JTraill · 2024-11-15T18:50:44Z

webapi/Plugins/Chat/Ext/QAzureOpenAIChatOptions.cs

@@ -53,9 +53,18 @@ public class OpenAIDeploymentConnection
        public string APIKey { get; set; } = string.Empty;
        public IList<ChatCompletionDeployment> ChatCompletionDeployments { get; set; } =
            new List<ChatCompletionDeployment>();
+        public IList<string> ImageGenerationDeployments { get; set; } = new List<string>();


Instead of `IList<string>`, I think we should do something similar to the way `ChatCompletionDeployments` is declared.
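A rough sketch of what a typed declaration could look like. The `ImageGenerationDeployment` class and its `Name` property are assumptions for illustration only; the fields of the existing `ChatCompletionDeployment` class are not shown in this hunk, so the real shape may differ.

```csharp
using System.Collections.Generic;

// Hypothetical deployment type; the property name is an assumption.
public class ImageGenerationDeployment
{
    public string Name { get; set; } = string.Empty;
}

// Trimmed-down stand-in for the connection class in the diff above.
public class OpenAIDeploymentConnection
{
    public string APIKey { get; set; } = string.Empty;

    // Typed list, mirroring how ChatCompletionDeployments is declared.
    public IList<ImageGenerationDeployment> ImageGenerationDeployments { get; set; } =
        new List<ImageGenerationDeployment>();
}
```

A dedicated type leaves room for per-deployment settings later without changing the configuration schema again.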

add image generation

f6c6ff4

jdtoombs self-assigned this Nov 14, 2024

github-actions bot added webapp webapi helm labels Nov 14, 2024

jdtoombs force-pushed the image-gen branch 2 times, most recently from 0ce7e95 to d61142a Compare November 14, 2024 01:24

cleanup

def1571

jdtoombs force-pushed the image-gen branch from 133f9f0 to def1571 Compare November 14, 2024 01:35

JTraill reviewed Nov 15, 2024
