Commit
Update docs to describe newer relationship to SK
SteveSandersonMS committed Aug 15, 2024
1 parent c7037dc commit 6835563
Showing 4 changed files with 48 additions and 13 deletions.
1 change: 1 addition & 0 deletions Directory.Packages.props
@@ -12,6 +12,7 @@
<PackageVersion Include="Microsoft.AspNetCore.Components.WebAssembly.Server" Version="8.0.0" />
<PackageVersion Include="Azure.AI.OpenAI" Version="1.0.0-beta.11" />
<PackageVersion Include="Microsoft.SemanticKernel.Connectors.Onnx" Version="1.18.0-alpha" />
<PackageVersion Include="Microsoft.SemanticKernel.Plugins.Memory" Version="1.18.0-alpha" />
<PackageVersion Include="System.Numerics.Tensors" Version="8.0.0" />
<PackageVersion Include="System.Runtime.Caching" Version="8.0.0" />
<PackageVersion Include="System.Text.Json" Version="8.0.4" />
30 changes: 18 additions & 12 deletions docs/local-embeddings.md
@@ -6,6 +6,18 @@ While you can use an external AI service to compute embeddings, in many cases yo

With `SmartComponents.LocalEmbeddings`, you can compute embeddings in under a millisecond, and perform semantic search over hundreds of thousands of candidates in single-digit milliseconds. However, there are limits. To understand the performance characteristics and when you might benefit from moving to an external vector database, see *Performance* below.

## Relationship to Semantic Kernel

Originally, `SmartComponents.LocalEmbeddings` was a standalone library. It has since been changed to be a wrapper around Semantic Kernel's own ability to compute embeddings locally using ONNX Runtime.

As such, `SmartComponents.LocalEmbeddings` is now equivalent to using Semantic Kernel's `BertOnnxTextEmbeddingGenerationService`, with the following additional features:

* **Acquiring the embeddings model automatically at build time**. If you use SK directly, you need to take care of downloading a suitable `.onnx` file for the embeddings model and making it available at runtime. `LocalEmbeddings` handles this for you - see below for details of how to customize it.
* **Helper methods for finding the closest match from a set of candidates**. If you use SK directly, you can use `TensorPrimitives.CosineSimilarity` and similar methods to compute similarity between two embeddings, or `SemanticTextMemory.SearchAsync` to find the closest match from a precomputed set of embeddings. In comparison, `LocalEmbeddings` provides `LocalEmbedder.FindClosest` (described below) as an alternative way to search through a set of candidates. Both approaches will perform the same, but are convenient in different circumstances. If you're using SK, it's best to stick with the SK APIs, but if you're not using SK, the `LocalEmbedder.FindClosest` helper may be easier to use.
* **Alternative representations for embeddings**. With Semantic Kernel, the convention is to represent embeddings as `Span<float>` or `ReadOnlyMemory<float>`, which are equivalent in space/accuracy to `EmbeddingF32`. Beyond this, `SmartComponents.LocalEmbeddings` offers other representations, `EmbeddingI8` and `EmbeddingI1` (described below), which give you different space/accuracy tradeoffs. For example, `EmbeddingI1` takes up only 1/32 of the memory of `EmbeddingF32` or `Span<float>` and can be used in nearest-neighbour searches considerably faster, at the cost of reduced accuracy. This is described in detail below.
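
To illustrate the SK-direct route mentioned in the list above, here is a minimal sketch of comparing embeddings with `TensorPrimitives.CosineSimilarity`. The three-dimensional vectors are made-up placeholders rather than real model output (an assumption to keep the example short), and the snippet assumes your project references the `System.Numerics.Tensors` package.

```cs
using System;
using System.Numerics.Tensors;

// Hypothetical placeholder vectors. Real embeddings come from an embedding
// generator and have many more dimensions.
float[] cat = { 1f, 0f, 1f };
float[] kitten = { 1f, 0f, 1f };
float[] biscuit = { 0f, 1f, 0f };

// Cosine similarity is close to 1 for vectors pointing the same way
// and 0 for orthogonal vectors.
float catVsKitten = TensorPrimitives.CosineSimilarity(cat, kitten);
float catVsBiscuit = TensorPrimitives.CosineSimilarity(cat, biscuit);

Console.WriteLine($"cat vs kitten:  {catVsKitten:F3}");
Console.WriteLine($"cat vs biscuit: {catVsBiscuit:F3}");
```

Searching a large candidate set this way means computing one similarity per candidate and keeping the best scores, which is the kind of loop that a helper such as `LocalEmbedder.FindClosest` is meant to save you from writing.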

**Recommendation**: `SmartComponents.LocalEmbeddings` is now a set of samples of ways you can build further capabilities and conveniences on top of Semantic Kernel's `BertOnnxTextEmbeddingGenerationService`. If you find these useful, you can use them in your own applications. But if SK's APIs are sufficient for your use cases, you should simply use them directly without using `SmartComponents.LocalEmbeddings`.

## Getting started

Add the `SmartComponents.LocalEmbeddings` project from this repo to your solution and reference it from your app.
@@ -262,20 +274,14 @@ The overall goal for `SmartComponents.LocalEmbeddings` is to make semantic searc

## Usage with Semantic Kernel

If you want to use this ONNX-based local embeddings generator with [Semantic Kernel](https://learn.microsoft.com/en-us/semantic-kernel/overview/), then you can use the `SmartComponents.LocalEmbeddings.SemanticKernel` library.
As mentioned in the introduction to this document, `SmartComponents.LocalEmbeddings` is simply a wrapper around Semantic Kernel's `BertOnnxTextEmbeddingGenerationService`, showing ways to add further conveniences and capabilities.

Add the `SmartComponents.LocalEmbeddings.SemanticKernel` project to your solution and reference it from your app. Then use `AddLocalTextEmbeddingGeneration` to add a local embeddings generator to your `Kernel`:
The `LocalEmbedder` type implements SK's `ITextEmbeddingGenerationService` interface, so it can be used directly with any Semantic Kernel API that needs to generate embeddings. For example, when constructing a `SemanticTextMemory`, you can pass an instance of `LocalEmbedder` as the `embeddingGenerator` constructor argument:

```cs
var builder = Kernel.CreateBuilder();
builder.AddLocalTextEmbeddingGeneration();
```

You can then generate embeddings in the usual way for Semantic Kernel:

```cs
var kernel = builder.Build();
var embeddingGenerator = kernel.Services.GetRequiredService<ITextEmbeddingGenerationService>();
var storage = new VolatileMemoryStore(); // Requires a reference to Microsoft.SemanticKernel.Plugins.Memory
using var embedder = new LocalEmbedder();
var semanticTextMemory = new SemanticTextMemory(storage, embedder);

var embedding = await embeddingGenerator.GenerateEmbeddingAsync("Some text here");
// ... and now use semanticTextMemory to store and search for items
```
@@ -3,6 +3,7 @@

using System.Numerics.Tensors;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;

namespace SmartComponents.LocalEmbeddings.SemanticKernel.Test;

@@ -65,4 +66,30 @@ public async Task CanBeConfiguredAsCaseSensitive()
var similarity = TensorPrimitives.CosineSimilarity(catLower.Span, catUpper.Span);
Assert.NotEqual(1, MathF.Round(similarity, 3));
}

[Fact]
public async Task CanBeUsedWithSemanticTextMemory()
{
// Construct an in-memory SK SemanticTextMemory that uses LocalEmbedder
var storage = new VolatileMemoryStore();
using var embedder = new LocalEmbedder();
var semanticTextMemory = new SemanticTextMemory(storage, embedder);

// Populate the memory with some information
await semanticTextMemory.SaveInformationAsync("animals", "Dog", "id_1");
await semanticTextMemory.SaveInformationAsync("animals", "Cat", "id_2");
await semanticTextMemory.SaveInformationAsync("animals", "Biscuit", "id_3");

// Do a nearest-neighbour search
MemoryQueryResult? first = null;
await foreach (var item in semanticTextMemory.SearchAsync("animals", "Kitten"))
{
first = item;
break;
}

// See that "Cat" was the closest to "Kitten"
Assert.Equal("id_2", first?.Metadata.Id);
Assert.Equal("Cat", first?.Metadata.Text);
}
}
@@ -7,11 +7,12 @@

<IsPackable>false</IsPackable>
<IsTestProject>true</IsTestProject>
<NoWarn>$(NoWarn);SKEXP0001</NoWarn>
<NoWarn>$(NoWarn);SKEXP0001;SKEXP0050</NoWarn>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Microsoft.NET.Test.Sdk" />
<PackageReference Include="Microsoft.SemanticKernel.Plugins.Memory" />
<PackageReference Include="xunit" />
<PackageReference Include="xunit.runner.visualstudio">
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>