2.1.0-beta.1 staging: RealtimeConversationClient #238

Merged
merged 2 commits into from
Oct 1, 2024
703 changes: 703 additions & 0 deletions api/OpenAI.netstandard2.0.cs

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions src/Custom/OpenAIClient.cs
@@ -8,6 +8,7 @@
using OpenAI.Images;
using OpenAI.Models;
using OpenAI.Moderations;
using OpenAI.RealtimeConversation;
using OpenAI.VectorStores;
using System;
using System.ClientModel;
@@ -42,6 +43,7 @@ namespace OpenAI;
[CodeGenSuppress("_cachedLegacyCompletionClient")]
[CodeGenSuppress("_cachedOpenAIModelClient")]
[CodeGenSuppress("_cachedModerationClient")]
[CodeGenSuppress("_cachedRealtimeConversationClient")]
[CodeGenSuppress("_cachedVectorStoreClient")]
[CodeGenSuppress("GetAssistantClient")]
[CodeGenSuppress("GetAudioClient")]
@@ -58,6 +60,7 @@ namespace OpenAI;
[CodeGenSuppress("GetLegacyCompletionClient")]
[CodeGenSuppress("GetModelClient")]
[CodeGenSuppress("GetModerationClient")]
[CodeGenSuppress("GetRealtimeConversationClient")]
[CodeGenSuppress("GetVectorStoreClient")]
public partial class OpenAIClient
{
@@ -110,6 +113,7 @@ public OpenAIClient(ApiKeyCredential credential, OpenAIClientOptions options)
Argument.AssertNotNull(credential, nameof(credential));
options ??= new OpenAIClientOptions();

_keyCredential = credential;
_pipeline = OpenAIClient.CreatePipeline(credential, options);
_endpoint = OpenAIClient.GetEndpoint(options);
_options = options;
@@ -255,6 +259,9 @@ protected internal OpenAIClient(ClientPipeline pipeline, OpenAIClientOptions opt
[Experimental("OPENAI001")]
public virtual VectorStoreClient GetVectorStoreClient() => new(_pipeline, _options);

[Experimental("OPENAI002")]
public virtual RealtimeConversationClient GetRealtimeConversationClient(string model) => new(model, _keyCredential, _options);

internal static ClientPipeline CreatePipeline(ApiKeyCredential credential, OpenAIClientOptions options)
{
return ClientPipeline.Create(
36 changes: 36 additions & 0 deletions src/Custom/RealtimeConversation/AlphaDesignNotes.md
@@ -0,0 +1,36 @@
# Development notes for .NET `/realtime` -- alpha

This document is intended to capture some of the exploratory design choices made when exposing the `/realtime` API in the .NET library.

## Naming and structure

"Realtime" describes not *what* the capability does, but rather *how* it does it; `RealtimeClient`, while a faithful translation from REST, would be neither descriptive nor idiomatic. `AudioClient` has operations that let you send or receive audio; `ChatClient` is all about getting chat (completion) responses; `EmbeddingClient` generates embeddings; a `${NAME}Client` does *not* let you send or receive "realtimes."

A number of names could work. `Conversation` was chosen as an expedient placeholder.

Because the `/realtime` API involves simultaneously sending and receiving data on a single WebSocket, the primary logic vehicle is an `IDisposable` `ConversationSession` type -- this is configured by its originating `ConversationClient` and manages a `ClientWebSocket` instance. `ConversationSession` then provides task-based methods like `SendText` and `SubmitToolResponse` -- methods that abstract client-originated request messages -- while exposing an `IAsyncEnumerable` collection of (response) `ConversationMessage` instances via `ReceiveMessagesAsync`.

The initial design for `ConversationMessage` uses a "squish" strategy: the many variant concrete message types are internalized, then composed into a single wrapper type that conditionally populates the appropriate properties based on the underlying message. This reapplies the general principles used for Chat Completion and Assistants streaming, though it is a larger single-type "squish" than previously pursued.

This is intended to keep the barrier to entry low: explicit knowledge of the different message types is not necessary to work with the operation. For example, a basic "hello world" may just do something like the following:

```csharp
using ConversationSession conversation = await client.StartConversationAsync();

await conversation.SendTextAsync("Hello, world!");

await foreach (ConversationMessage message in conversation.ReceiveMessagesAsync())
{
Console.Write(message.Text);
}
```
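The single-wrapper "squish" described above can be sketched in isolation as follows. These variant types are illustrative stand-ins, not the actual generated internal types:

```csharp
using System;

// Illustrative sketch of the "squish" pattern: several internal variant
// payloads are wrapped by one public type that conditionally surfaces
// whichever properties the underlying variant actually carries.
public abstract class VariantPayload { }

public sealed class TextPayload : VariantPayload
{
    public string Text { get; init; }
}

public sealed class AudioPayload : VariantPayload
{
    public byte[] Audio { get; init; }
}

public sealed class SquishedMessage
{
    private readonly VariantPayload _payload;

    public SquishedMessage(VariantPayload payload) => _payload = payload;

    // Each property is non-null only when the wrapped variant supports it,
    // mirroring the "this as InternalX" pattern used in the diff below.
    public string Text => (_payload as TextPayload)?.Text;
    public byte[] Audio => (_payload as AudioPayload)?.Audio;
}
```

Callers then branch on which properties are populated rather than on the concrete runtime type.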

## Turn-based data buffering

A repeated piece of early alpha feedback was that a client-integrated mechanism to automatically accumulate incoming response data (not requiring manual, do-it-yourself accumulation) would be valuable.

To explore this, `ConversationSession` includes a pair of properties, `LastTurnFullResponseText` and `LastTurnFullResponseAudio`, that are automatically populated with accumulated data when a `turn_finished` event is received. This is consistent with the "snapshot" mechanism used in several Stainless SDK libraries, which likewise populate automatically accumulated data into an appropriate location.

As this requires visibility into the response body, automatic accumulation is only performed when using the convenience method variant of `ReceiveMessagesAsync`.

Because this accumulated text and (especially) audio data can quickly grow to hundreds of kilobytes, a client-only `LastTurnResponseAccumulationEnabled` property is added to `ConversationOptions`. In contexts with many parallel operations and high sensitivity to memory footprint, callers can thus opt out of the behavior.
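Using the option and property names described above, opting out might look like the sketch below. This is an alpha surface, and an options-taking overload of `StartConversationAsync` is assumed here; exact signatures may differ:

```csharp
// Sketch only: alpha API names from the notes above; signatures may change.
ConversationOptions options = new()
{
    // Disable automatic turn accumulation for memory-sensitive,
    // highly parallel workloads.
    LastTurnResponseAccumulationEnabled = false,
};

using ConversationSession conversation = await client.StartConversationAsync(options);

await foreach (ConversationMessage message in conversation.ReceiveMessagesAsync())
{
    // With accumulation disabled, LastTurnFullResponseText and
    // LastTurnFullResponseAudio stay unpopulated; accumulate manually if needed.
}
```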
@@ -0,0 +1,39 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
internal static partial class ConversationContentModalitiesExtensions
{
internal static void ToInternalModalities(this ConversationContentModalities modalities, IList<InternalRealtimeRequestSessionUpdateCommandSessionModality> internalModalities)
{
internalModalities.Clear();
if (modalities.HasFlag(ConversationContentModalities.Text))
{
internalModalities.Add(InternalRealtimeRequestSessionUpdateCommandSessionModality.Text);
}
if (modalities.HasFlag(ConversationContentModalities.Audio))
{
internalModalities.Add(InternalRealtimeRequestSessionUpdateCommandSessionModality.Audio);
}
}

internal static ConversationContentModalities FromInternalModalities(IEnumerable<InternalRealtimeRequestSessionUpdateCommandSessionModality> internalModalities)
{
ConversationContentModalities result = 0;
foreach (InternalRealtimeRequestSessionUpdateCommandSessionModality internalModality in internalModalities ?? [])
{
if (internalModality == InternalRealtimeRequestSessionUpdateCommandSessionModality.Text)
{
result |= ConversationContentModalities.Text;
}
else if (internalModality == InternalRealtimeRequestSessionUpdateCommandSessionModality.Audio)
{
result |= ConversationContentModalities.Audio;
}
}
return result;
}
}
@@ -0,0 +1,13 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[Flags]
public enum ConversationContentModalities : int
{
Text = 1 << 0,
Audio = 1 << 1,
}
36 changes: 36 additions & 0 deletions src/Custom/RealtimeConversation/ConversationContentPart.cs
@@ -0,0 +1,36 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[CodeGenModel("RealtimeContentPart")]
public partial class ConversationContentPart
{
[CodeGenMember("Type")]
internal ConversationContentPartKind Type;

public ConversationContentPartKind Kind => Type;

public string TextValue =>
(this as InternalRealtimeRequestTextContentPart)?.Text
?? (this as InternalRealtimeResponseTextContentPart)?.Text;

public string AudioTranscriptValue =>
(this as InternalRealtimeRequestAudioContentPart)?.Transcript
?? (this as InternalRealtimeResponseAudioContentPart)?.Transcript;

public static ConversationContentPart FromInputText(string text)
=> new InternalRealtimeRequestTextContentPart(text);
public static ConversationContentPart FromInputAudioTranscript(string transcript = null) => new InternalRealtimeRequestAudioContentPart()
{
Transcript = transcript,
};
public static ConversationContentPart FromOutputText(string text)
=> new InternalRealtimeResponseTextContentPart(text);
public static ConversationContentPart FromOutputAudioTranscript(string transcript = null)
=> new InternalRealtimeResponseAudioContentPart(transcript);

public static implicit operator ConversationContentPart(string text) => FromInputText(text);
}
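For reference, a short usage sketch against the factory methods and implicit conversion above (alpha surface; not standalone code):

```csharp
// Usage sketch for ConversationContentPart's factories.
ConversationContentPart textPart = ConversationContentPart.FromInputText("Hello!");

// The implicit string operator makes the common text-input case terser:
ConversationContentPart samePart = "Hello!";
```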
55 changes: 55 additions & 0 deletions src/Custom/RealtimeConversation/ConversationFunctionTool.cs
@@ -0,0 +1,55 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[CodeGenModel("RealtimeFunctionTool")]
public partial class ConversationFunctionTool : ConversationTool
{
[CodeGenMember("Name")]
private string _name;
public required string Name
{
get => _name;
set => _name = value;
}

[CodeGenMember("Description")]
private string _description;

public string Description
{
get => _description;
set => _description = value;
}

[CodeGenMember("Parameters")]
private BinaryData _parameters;

public BinaryData Parameters
{
get => _parameters;
set => _parameters = value;
}

public ConversationFunctionTool() : base(ConversationToolKind.Function, null)
{
}

[SetsRequiredMembers]
public ConversationFunctionTool(string name)
: this(ConversationToolKind.Function, null, name, null, null)
{
Argument.AssertNotNull(name, nameof(name));
}

[SetsRequiredMembers]
internal ConversationFunctionTool(ConversationToolKind kind, IDictionary<string, BinaryData> serializedAdditionalRawData, string name, string description, BinaryData parameters) : base(kind, serializedAdditionalRawData)
{
_name = name;
_description = description;
_parameters = parameters;
}
}
46 changes: 46 additions & 0 deletions src/Custom/RealtimeConversation/ConversationItem.cs
@@ -0,0 +1,46 @@
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using System.Linq;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[CodeGenModel("RealtimeRequestItem")]
public partial class ConversationItem
{
public string FunctionCallId => (this as InternalRealtimeRequestFunctionCallItem)?.CallId;
public string FunctionName => (this as InternalRealtimeRequestFunctionCallItem)?.Name;
public string FunctionArguments => (this as InternalRealtimeRequestFunctionCallItem)?.Arguments;

public IReadOnlyList<ConversationContentPart> MessageContentParts
=> (this as InternalRealtimeRequestAssistantMessageItem)?.Content.ToList().AsReadOnly()
?? (this as InternalRealtimeRequestSystemMessageItem)?.Content?.ToList().AsReadOnly()
?? (this as InternalRealtimeRequestUserMessageItem)?.Content?.ToList().AsReadOnly();
public ConversationMessageRole? MessageRole
=> (this as InternalRealtimeRequestMessageItem)?.Role;

public static ConversationItem CreateUserMessage(IEnumerable<ConversationContentPart> contentItems)
{
return new InternalRealtimeRequestUserMessageItem(contentItems);
}

public static ConversationItem CreateSystemMessage(string toolCallId, IEnumerable<ConversationContentPart> contentItems)
{
return new InternalRealtimeRequestSystemMessageItem(contentItems);
}

public static ConversationItem CreateAssistantMessage(IEnumerable<ConversationContentPart> contentItems)
{
return new InternalRealtimeRequestAssistantMessageItem(contentItems);
}

public static ConversationItem CreateFunctionCall(string name, string callId, string arguments)
{
return new InternalRealtimeRequestFunctionCallItem(name, callId, arguments);
}

public static ConversationItem CreateFunctionCallOutput(string callId, string output)
{
return new InternalRealtimeRequestFunctionCallOutputItem(callId, output);
}
}
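A usage sketch for the item factories above; the call ID and output JSON are illustrative values, not real ones (alpha surface; not standalone code):

```csharp
// Usage sketch: composing conversation items with the factory methods above.
ConversationItem userMessage = ConversationItem.CreateUserMessage(
    new ConversationContentPart[]
    {
        ConversationContentPart.FromInputText("What's the weather like?"),
    });

// Pairing a model-issued function call with its output; "call_abc123" is
// an illustrative call ID.
ConversationItem toolOutput = ConversationItem.CreateFunctionCallOutput(
    callId: "call_abc123", output: "{\"temperature_f\": 68}");
```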
@@ -0,0 +1,66 @@
using System;
using System.ClientModel.Primitives;
using System.Diagnostics.CodeAnalysis;
using System.Text.Json;

namespace OpenAI.RealtimeConversation;

public partial class ConversationMaxTokensChoice : IJsonModel<ConversationMaxTokensChoice>
{
void IJsonModel<ConversationMaxTokensChoice>.Write(Utf8JsonWriter writer, ModelReaderWriterOptions options)
=> CustomSerializationHelpers.SerializeInstance(this, SerializeConversationMaxTokensChoice, writer, options);

ConversationMaxTokensChoice IJsonModel<ConversationMaxTokensChoice>.Create(ref Utf8JsonReader reader, ModelReaderWriterOptions options)
=> CustomSerializationHelpers.DeserializeNewInstance(this, DeserializeConversationMaxTokensChoice, ref reader, options);

BinaryData IPersistableModel<ConversationMaxTokensChoice>.Write(ModelReaderWriterOptions options)
=> CustomSerializationHelpers.SerializeInstance(this, options);

ConversationMaxTokensChoice IPersistableModel<ConversationMaxTokensChoice>.Create(BinaryData data, ModelReaderWriterOptions options)
=> CustomSerializationHelpers.DeserializeNewInstance(this, DeserializeConversationMaxTokensChoice, data, options);

string IPersistableModel<ConversationMaxTokensChoice>.GetFormatFromOptions(ModelReaderWriterOptions options) => "J";

internal static void SerializeConversationMaxTokensChoice(ConversationMaxTokensChoice instance, Utf8JsonWriter writer, ModelReaderWriterOptions options)
{
if (instance._isDefaultNullValue == true)
{
writer.WriteNullValue();
}
else if (instance._stringValue is not null)
{
writer.WriteStringValue(instance._stringValue);
}
else if (instance.NumericValue.HasValue)
{
writer.WriteNumberValue(instance.NumericValue.Value);
}
}

internal static ConversationMaxTokensChoice DeserializeConversationMaxTokensChoice(JsonElement element, ModelReaderWriterOptions options = null)
{
if (element.ValueKind == JsonValueKind.Null)
{
return new ConversationMaxTokensChoice(isDefaultNullValue: true);
}
if (element.ValueKind == JsonValueKind.String)
{
return new ConversationMaxTokensChoice(stringValue: element.GetString());
}
if (element.ValueKind == JsonValueKind.Number)
{
return new ConversationMaxTokensChoice(numberValue: element.GetInt32());
}
return null;
}

internal static ConversationMaxTokensChoice FromBinaryData(BinaryData bytes)
{
if (bytes is null)
{
return new ConversationMaxTokensChoice(isDefaultNullValue: true);
}
using JsonDocument document = JsonDocument.Parse(bytes);
return DeserializeConversationMaxTokensChoice(document.RootElement);
}
}
39 changes: 39 additions & 0 deletions src/Custom/RealtimeConversation/ConversationMaxTokensChoice.cs
@@ -0,0 +1,39 @@
using System;
using System.Collections.Generic;
using System.ClientModel.Primitives;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
public partial class ConversationMaxTokensChoice
{
public int? NumericValue { get; }
private readonly bool? _isDefaultNullValue;
private readonly string _stringValue;

public static ConversationMaxTokensChoice CreateInfiniteMaxTokensChoice()
=> new("inf");
public static ConversationMaxTokensChoice CreateDefaultMaxTokensChoice()
=> new(isDefaultNullValue: true);
public static ConversationMaxTokensChoice CreateNumericMaxTokensChoice(int maxTokens)
=> new(numberValue: maxTokens);

public ConversationMaxTokensChoice(int numberValue)
{
NumericValue = numberValue;
}

internal ConversationMaxTokensChoice(string stringValue)
{
_stringValue = stringValue;
}

internal ConversationMaxTokensChoice(bool isDefaultNullValue)
{
_isDefaultNullValue = true;
}

public static implicit operator ConversationMaxTokensChoice(int maxTokens)
=> CreateNumericMaxTokensChoice(maxTokens);
}
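Tying the union type to its serializer above, the three wire shapes are a number, the literal `"inf"`, and JSON `null`. A sketch (the implicit `int` operator and factories come from the class above):

```csharp
// The three wire shapes of ConversationMaxTokensChoice, per the serializer above:
ConversationMaxTokensChoice numeric = 1024;                         // serializes as 1024
ConversationMaxTokensChoice infinite =
    ConversationMaxTokensChoice.CreateInfiniteMaxTokensChoice();    // serializes as "inf"
ConversationMaxTokensChoice defaulted =
    ConversationMaxTokensChoice.CreateDefaultMaxTokensChoice();     // serializes as null
```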