2.1.0-beta.1 staging: RealtimeConversationClient (#238)
* realtime client staging

* add non-version-impacting .csproj update
trrwilson authored Oct 1, 2024
1 parent c28661f commit ff75da4
Showing 251 changed files with 21,795 additions and 1 deletion.
703 changes: 703 additions & 0 deletions api/OpenAI.netstandard2.0.cs

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions src/Custom/OpenAIClient.cs
@@ -8,6 +8,7 @@
using OpenAI.Images;
using OpenAI.Models;
using OpenAI.Moderations;
using OpenAI.RealtimeConversation;
using OpenAI.VectorStores;
using System;
using System.ClientModel;
@@ -42,6 +43,7 @@ namespace OpenAI;
[CodeGenSuppress("_cachedLegacyCompletionClient")]
[CodeGenSuppress("_cachedOpenAIModelClient")]
[CodeGenSuppress("_cachedModerationClient")]
[CodeGenSuppress("_cachedRealtimeConversationClient")]
[CodeGenSuppress("_cachedVectorStoreClient")]
[CodeGenSuppress("GetAssistantClient")]
[CodeGenSuppress("GetAudioClient")]
@@ -58,6 +60,7 @@ namespace OpenAI;
[CodeGenSuppress("GetLegacyCompletionClient")]
[CodeGenSuppress("GetModelClient")]
[CodeGenSuppress("GetModerationClient")]
[CodeGenSuppress("GetRealtimeConversationClient")]
[CodeGenSuppress("GetVectorStoreClient")]
public partial class OpenAIClient
{
@@ -110,6 +113,7 @@ public OpenAIClient(ApiKeyCredential credential, OpenAIClientOptions options)
        Argument.AssertNotNull(credential, nameof(credential));
        options ??= new OpenAIClientOptions();

        _keyCredential = credential;
        _pipeline = OpenAIClient.CreatePipeline(credential, options);
        _endpoint = OpenAIClient.GetEndpoint(options);
        _options = options;
@@ -255,6 +259,9 @@ protected internal OpenAIClient(ClientPipeline pipeline, OpenAIClientOptions opt
    [Experimental("OPENAI001")]
    public virtual VectorStoreClient GetVectorStoreClient() => new(_pipeline, _options);

    [Experimental("OPENAI002")]
    public virtual RealtimeConversationClient GetRealtimeConversationClient(string model) => new(model, _keyCredential, _options);

    internal static ClientPipeline CreatePipeline(ApiKeyCredential credential, OpenAIClientOptions options)
    {
        return ClientPipeline.Create(
36 changes: 36 additions & 0 deletions src/Custom/RealtimeConversation/AlphaDesignNotes.md
@@ -0,0 +1,36 @@
# Development notes for .NET `/realtime` -- alpha

This document is intended to capture some of the exploratory design choices made when exposing the `/realtime` API in the .NET library.

## Naming and structure

"Realtime" does not describe "what" the capability does, but rather "how" it does it; `RealtimeClient`, while a faithful translation from REST, would not be descriptive or idiomatic. `AudioClient` has operations that let you send or receive audio; `ChatClient` is all about getting chat (completion) responses; `EmbeddingClient` generates embeddings; `${NAME}Client` does *not* let you send or receive "realtimes."

A number of names could work. `Conversation` was chosen as an expedient placeholder.

Because the `/realtime` API involves simultaneously sending and receiving data on a single WebSocket, the primary logic vehicle is an `IDisposable` `ConversationSession` type -- this is configured by its originating `ConversationClient` and manages a `ClientWebSocket` instance. `ConversationClient` then provides task-based methods like `SendText` and `SubmitToolResponse` -- methods that allow the abstraction of client-originated request messages -- while exposing an `IAsyncEnumerable` collection of (response) `ConversationMessage` instances via `ReceiveMessagesAsync`.

The initial design approach for `ConversationMessage` uses a "squish" strategy: the many variant concrete message types are internalized, then composed into a single wrapper that conditionally populates the appropriate properties based on the underlying message. This is a reapplication of the general principles applied to Chat Completion and Assistants streaming, though it's a larger single-type "squish" than previously pursued.

This is intended to facilitate a low barrier to entry, as explicit knowledge about different message types is not necessary to work with the operation. For example, a basic "hello world" may just do something like the following:

```csharp
using ConversationSession conversation = await client.StartConversationAsync();

await conversation.SendTextAsync("Hello, world!");

await foreach (ConversationMessage message in client.ReceiveMessagesAsync())
{
    Console.Write(message.Text);
}
```
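
The "squish" strategy described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `InternalTextDelta`, `InternalAudioDelta`, `InternalTurnFinished`, and `SquishedMessage` are stand-in names chosen for illustration and are not the library's actual message types. The sketch only shows the general shape of the idea, where internal variants sit behind one public wrapper whose properties return `null` (or `false`) when they do not apply.

```csharp
using System;

// Hypothetical internal variants standing in for the many concrete message types.
internal abstract record InternalMessage;
internal sealed record InternalTextDelta(string Delta) : InternalMessage;
internal sealed record InternalAudioDelta(byte[] Bytes) : InternalMessage;
internal sealed record InternalTurnFinished : InternalMessage;

// Hypothetical single public wrapper: each property is populated only when
// the underlying internal message actually carries that kind of data.
public sealed class SquishedMessage
{
    private readonly InternalMessage _inner;

    internal SquishedMessage(InternalMessage inner) => _inner = inner;

    // Null unless the underlying message is a text delta.
    public string Text => (_inner as InternalTextDelta)?.Delta;

    // Null unless the underlying message is an audio delta.
    public byte[] AudioBytes => (_inner as InternalAudioDelta)?.Bytes;

    // True only for the end-of-turn variant.
    public bool IsTurnFinished => _inner is InternalTurnFinished;
}
```

A caller can then write `Console.Write(message.Text)` for every message without switching on the concrete type, because writing a `null` string prints nothing. The trade-off is that the wrapper's surface grows with every new variant.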

## Turn-based data buffering

A repeated piece of early alpha feedback was that a client-integrated mechanism to automatically accumulate incoming response data (not requiring manual, do-it-yourself accumulation) would be valuable.

To explore accomplishing this, `ConversationSession` includes a pair of properties, `LastTurnFullResponseText` and `LastTurnFullResponseAudio`, that will automatically be populated with accumulated data when a `turn_finished` event is received. This is consistent with the "snapshot" mechanism used in several instances within Stainless SDK libraries, which likewise feature automatically accumulated data being populated into an appropriate location.

As this requires visibility into the response body, automatic accumulation is only performed when using the convenience method variant of `ReceiveMessagesAsync`.

Because this accumulated text and (especially) audio data can quickly grow to hundreds of kilobytes, a client-only `LastTurnResponseAccumulationEnabled` property is included in `ConversationOptions`. In contexts with many parallel operations and high sensitivity to memory footprint, callers can use this setting to opt out of the behavior.
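
As a rough sketch of the accumulation behavior described in this section, the following standalone class buffers text and audio deltas and publishes a snapshot when a turn finishes. It is an illustration under assumptions, not the library's implementation: `TurnAccumulator`, its `On...` methods, and the `string`/`byte[]` property types are hypothetical, and only the property names `LastTurnFullResponseText` and `LastTurnFullResponseAudio` and the idea of an opt-out flag come from the notes above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper that accumulates one turn's worth of response data.
public sealed class TurnAccumulator
{
    private readonly bool _enabled;
    private readonly StringBuilder _text = new();
    private readonly MemoryStream _audio = new();

    // The flag mirrors the opt-out idea of LastTurnResponseAccumulationEnabled.
    public TurnAccumulator(bool accumulationEnabled = true) => _enabled = accumulationEnabled;

    // Snapshots of the most recently completed turn; null until a turn finishes.
    public string LastTurnFullResponseText { get; private set; }
    public byte[] LastTurnFullResponseAudio { get; private set; }

    public void OnTextDelta(string delta)
    {
        if (_enabled) _text.Append(delta);
    }

    public void OnAudioDelta(byte[] chunk)
    {
        if (_enabled) _audio.Write(chunk, 0, chunk.Length);
    }

    // Called when the end-of-turn event arrives: publish the snapshot, then reset the buffers.
    public void OnTurnFinished()
    {
        if (!_enabled) return;
        LastTurnFullResponseText = _text.ToString();
        LastTurnFullResponseAudio = _audio.ToArray();
        _text.Clear();
        _audio.SetLength(0);
    }
}
```

With accumulation disabled, the buffers are never written and the snapshot properties stay `null`, which keeps the per-session memory footprint flat at the cost of requiring manual accumulation.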
@@ -0,0 +1,39 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
internal static partial class ConversationContentModalitiesExtensions
{
    internal static void ToInternalModalities(this ConversationContentModalities modalities, IList<InternalRealtimeRequestSessionUpdateCommandSessionModality> internalModalities)
    {
        internalModalities.Clear();
        if (modalities.HasFlag(ConversationContentModalities.Text))
        {
            internalModalities.Add(InternalRealtimeRequestSessionUpdateCommandSessionModality.Text);
        }
        if (modalities.HasFlag(ConversationContentModalities.Audio))
        {
            internalModalities.Add(InternalRealtimeRequestSessionUpdateCommandSessionModality.Audio);
        }
    }

    internal static ConversationContentModalities FromInternalModalities(IEnumerable<InternalRealtimeRequestSessionUpdateCommandSessionModality> internalModalities)
    {
        ConversationContentModalities result = 0;
        foreach (InternalRealtimeRequestSessionUpdateCommandSessionModality internalModality in internalModalities ?? [])
        {
            if (internalModality == InternalRealtimeRequestSessionUpdateCommandSessionModality.Text)
            {
                result |= ConversationContentModalities.Text;
            }
            else if (internalModality == InternalRealtimeRequestSessionUpdateCommandSessionModality.Audio)
            {
                result |= ConversationContentModalities.Audio;
            }
        }
        return result;
    }
}
13 changes: 13 additions & 0 deletions src/Custom/RealtimeConversation/ConversationContentModalities.cs
@@ -0,0 +1,13 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[Flags]
public enum ConversationContentModalities : int
{
    Text = 1 << 0,
    Audio = 1 << 1,
}
36 changes: 36 additions & 0 deletions src/Custom/RealtimeConversation/ConversationContentPart.cs
@@ -0,0 +1,36 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[CodeGenModel("RealtimeContentPart")]
public partial class ConversationContentPart
{
    [CodeGenMember("Type")]
    internal ConversationContentPartKind Type;

    public ConversationContentPartKind Kind => Type;

    public string TextValue =>
        (this as InternalRealtimeRequestTextContentPart)?.Text
            ?? (this as InternalRealtimeResponseTextContentPart)?.Text;

    public string AudioTranscriptValue =>
        (this as InternalRealtimeRequestAudioContentPart)?.Transcript
            ?? (this as InternalRealtimeResponseAudioContentPart)?.Transcript;

    public static ConversationContentPart FromInputText(string text)
        => new InternalRealtimeRequestTextContentPart(text);
    public static ConversationContentPart FromInputAudioTranscript(string transcript = null) => new InternalRealtimeRequestAudioContentPart()
    {
        Transcript = transcript,
    };
    public static ConversationContentPart FromOutputText(string text)
        => new InternalRealtimeResponseTextContentPart(text);
    public static ConversationContentPart FromOutputAudioTranscript(string transcript = null)
        => new InternalRealtimeResponseAudioContentPart(transcript);

    public static implicit operator ConversationContentPart(string text) => FromInputText(text);
}
55 changes: 55 additions & 0 deletions src/Custom/RealtimeConversation/ConversationFunctionTool.cs
@@ -0,0 +1,55 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[CodeGenModel("RealtimeFunctionTool")]
public partial class ConversationFunctionTool : ConversationTool
{
    [CodeGenMember("Name")]
    private string _name;

    public required string Name
    {
        get => _name;
        set => _name = value;
    }

    [CodeGenMember("Description")]
    private string _description;

    public string Description
    {
        get => _description;
        set => _description = value;
    }

    [CodeGenMember("Parameters")]
    private BinaryData _parameters;

    public BinaryData Parameters
    {
        get => _parameters;
        set => _parameters = value;
    }

    public ConversationFunctionTool() : base(ConversationToolKind.Function, null)
    {
    }

    [SetsRequiredMembers]
    public ConversationFunctionTool(string name)
        : this(ConversationToolKind.Function, null, name, null, null)
    {
        Argument.AssertNotNull(name, nameof(name));
    }

    [SetsRequiredMembers]
    internal ConversationFunctionTool(ConversationToolKind kind, IDictionary<string, BinaryData> serializedAdditionalRawData, string name, string description, BinaryData parameters) : base(kind, serializedAdditionalRawData)
    {
        _name = name;
        _description = description;
        _parameters = parameters;
    }
}
46 changes: 46 additions & 0 deletions src/Custom/RealtimeConversation/ConversationItem.cs
@@ -0,0 +1,46 @@
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using System.Linq;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
[CodeGenModel("RealtimeRequestItem")]
public partial class ConversationItem
{
    public string FunctionCallId => (this as InternalRealtimeRequestFunctionCallItem)?.CallId;
    public string FunctionName => (this as InternalRealtimeRequestFunctionCallItem)?.Name;
    public string FunctionArguments => (this as InternalRealtimeRequestFunctionCallItem)?.Arguments;

    // Content is accessed null-conditionally for every message kind so that a
    // null content collection yields null rather than a NullReferenceException.
    public IReadOnlyList<ConversationContentPart> MessageContentParts
        => (this as InternalRealtimeRequestAssistantMessageItem)?.Content?.ToList().AsReadOnly()
        ?? (this as InternalRealtimeRequestSystemMessageItem)?.Content?.ToList().AsReadOnly()
        ?? (this as InternalRealtimeRequestUserMessageItem)?.Content?.ToList().AsReadOnly();
    public ConversationMessageRole? MessageRole
        => (this as InternalRealtimeRequestMessageItem)?.Role;

    public static ConversationItem CreateUserMessage(IEnumerable<ConversationContentPart> contentItems)
    {
        return new InternalRealtimeRequestUserMessageItem(contentItems);
    }

    // Note: toolCallId is not used by this factory.
    public static ConversationItem CreateSystemMessage(string toolCallId, IEnumerable<ConversationContentPart> contentItems)
    {
        return new InternalRealtimeRequestSystemMessageItem(contentItems);
    }

    public static ConversationItem CreateAssistantMessage(IEnumerable<ConversationContentPart> contentItems)
    {
        return new InternalRealtimeRequestAssistantMessageItem(contentItems);
    }

    public static ConversationItem CreateFunctionCall(string name, string callId, string arguments)
    {
        return new InternalRealtimeRequestFunctionCallItem(name, callId, arguments);
    }

    public static ConversationItem CreateFunctionCallOutput(string callId, string output)
    {
        return new InternalRealtimeRequestFunctionCallOutputItem(callId, output);
    }
}
@@ -0,0 +1,66 @@
using System;
using System.ClientModel.Primitives;
using System.Diagnostics.CodeAnalysis;
using System.Text.Json;

namespace OpenAI.RealtimeConversation;

public partial class ConversationMaxTokensChoice : IJsonModel<ConversationMaxTokensChoice>
{
    void IJsonModel<ConversationMaxTokensChoice>.Write(Utf8JsonWriter writer, ModelReaderWriterOptions options)
        => CustomSerializationHelpers.SerializeInstance(this, SerializeConversationMaxTokensChoice, writer, options);

    ConversationMaxTokensChoice IJsonModel<ConversationMaxTokensChoice>.Create(ref Utf8JsonReader reader, ModelReaderWriterOptions options)
        => CustomSerializationHelpers.DeserializeNewInstance(this, DeserializeConversationMaxTokensChoice, ref reader, options);

    BinaryData IPersistableModel<ConversationMaxTokensChoice>.Write(ModelReaderWriterOptions options)
        => CustomSerializationHelpers.SerializeInstance(this, options);

    ConversationMaxTokensChoice IPersistableModel<ConversationMaxTokensChoice>.Create(BinaryData data, ModelReaderWriterOptions options)
        => CustomSerializationHelpers.DeserializeNewInstance(this, DeserializeConversationMaxTokensChoice, data, options);

    string IPersistableModel<ConversationMaxTokensChoice>.GetFormatFromOptions(ModelReaderWriterOptions options) => "J";

    internal static void SerializeConversationMaxTokensChoice(ConversationMaxTokensChoice instance, Utf8JsonWriter writer, ModelReaderWriterOptions options)
    {
        if (instance._isDefaultNullValue == true)
        {
            writer.WriteNullValue();
        }
        else if (instance._stringValue is not null)
        {
            writer.WriteStringValue(instance._stringValue);
        }
        else if (instance.NumericValue.HasValue)
        {
            writer.WriteNumberValue(instance.NumericValue.Value);
        }
    }

    internal static ConversationMaxTokensChoice DeserializeConversationMaxTokensChoice(JsonElement element, ModelReaderWriterOptions options = null)
    {
        if (element.ValueKind == JsonValueKind.Null)
        {
            return new ConversationMaxTokensChoice(isDefaultNullValue: true);
        }
        if (element.ValueKind == JsonValueKind.String)
        {
            return new ConversationMaxTokensChoice(stringValue: element.GetString());
        }
        if (element.ValueKind == JsonValueKind.Number)
        {
            return new ConversationMaxTokensChoice(numberValue: element.GetInt32());
        }
        return null;
    }

    internal static ConversationMaxTokensChoice FromBinaryData(BinaryData bytes)
    {
        if (bytes is null)
        {
            return new ConversationMaxTokensChoice(isDefaultNullValue: true);
        }
        using JsonDocument document = JsonDocument.Parse(bytes);
        return DeserializeConversationMaxTokensChoice(document.RootElement);
    }
}
39 changes: 39 additions & 0 deletions src/Custom/RealtimeConversation/ConversationMaxTokensChoice.cs
@@ -0,0 +1,39 @@
using System;
using System.Collections.Generic;
using System.ClientModel.Primitives;
using System.Diagnostics.CodeAnalysis;

namespace OpenAI.RealtimeConversation;

[Experimental("OPENAI002")]
public partial class ConversationMaxTokensChoice
{
    public int? NumericValue { get; }
    private readonly bool? _isDefaultNullValue;
    private readonly string _stringValue;

    public static ConversationMaxTokensChoice CreateInfiniteMaxTokensChoice()
        => new("inf");
    public static ConversationMaxTokensChoice CreateDefaultMaxTokensChoice()
        => new(isDefaultNullValue: true);
    public static ConversationMaxTokensChoice CreateNumericMaxTokensChoice(int maxTokens)
        => new(numberValue: maxTokens);

    public ConversationMaxTokensChoice(int numberValue)
    {
        NumericValue = numberValue;
    }

    internal ConversationMaxTokensChoice(string stringValue)
    {
        _stringValue = stringValue;
    }

    internal ConversationMaxTokensChoice(bool isDefaultNullValue)
    {
        // Honor the argument instead of unconditionally storing true.
        _isDefaultNullValue = isDefaultNullValue;
    }

    public static implicit operator ConversationMaxTokensChoice(int maxTokens)
        => CreateNumericMaxTokensChoice(maxTokens);
}