Speechly is an API for building voice features and voice chat moderation into games, XR, applications and web sites. Speechly Client Library for Unity and C# streams audio for analysis and provides real-time speech-to-text transcription and information extracted from the speech via the library's C# API.
Speech recognition runs by default in Speechly cloud (online). On-device (offline) capabilities are available via separate libSpeechly add-on that runs machine learning models on the device itself using ONNX runtime or TensorFlow Lite.
- speechly-dotnet/ contains the C# SpeechlyClient without Unity dependencies
- speechly-unity/Assets/Speechly/ folder contains the C# SpeechlyClient plus Unity-specifics like
Speechly.prefab
andMicToSpeechly.cs
script and Unity sample projects:
- Download speechly.unitypackage
- Select Assets > Import Package > Custom Package... and select the downloaded file.
- Import
Speechly/
folder from the package to your Unity project
Refer to https://docs.unity3d.com/Manual/upm-ui-install.html for instructions on using Unity packages.
- Unity 2018.1 or later (tested with 2019.4.36f1 and 2021.3.27f1)
- TextMeshPro (examples only)
- XR Plug-in Management (VR example only)
See language support here.
General usage pattern for Unity is to add Speechly.prefab
to your scene, configure it properly, and then interact
with it using either MicToSpeechly
or the internal SpeechlyClient
which is accessible as MicToSpeechly.Instance.SpeechlyClient
.
Speechly.prefab
is by default set with Don't destroy on load
so it's available in every scene. It creates a SpeechlyClient
singleton which you can can access with MicToSpeechly.Instance.SpeechlyClient
.
When configured in voice-activated mode, MicToSpeechly
will start listening to the microphone as soon as it is enabled.
If you want to stop the speech recognition, you can disable the game object associated with MicToSpeechly
:
MicToSpeechly.Instance?.gameObject.SetActive(false);
Similarly, enable the listening again by setting the object active. Because MicToSpeechly
is not destroyed on scene changes,
do not add your own scripts to the same game object as MicToSpeechly
so that they will operate correctly.
- Add the
Speechly.prefab
to your scene and select it. - In the inspector, enter a valid
App id
acquired from Speechly dashboard - Check
VAD controls listening
anddebug print
. - Run the app, speak and see the console for the basic transcription results.
- Add the
Speechly.prefab
to your scene and select it. - In the inspector, enter a valid
App id
acquired from Speechly dashboard - Check
debug print
to see basic transcription results in the console. - Create the following script to listen only when mouse/finger is pressed. Ensure that
VAD controls listening
is unchecked as we're controlling listening manually. - Run the app, press and hold anywhere on the screen and see the console for the basic transcription results.
void Update()
{
SpeechlyClient speechlyClient = MicToSpeechly.Instance.SpeechlyClient;
if (Input.GetMouseButton(0))
{
if (!speechlyClient.IsActive)
{
_ = speechlyClient.Start();
}
}
else
{
if (speechlyClient.IsActive)
{
_ = speechlyClient.Stop();
}
}
}
- Use either hands-free listening or manual listening as described above.
- To handle the speech-to-text results, attach a callback for
OnSegmentChange
. The following example displays the speech transcription in a TMP text field.
void Start()
{
SpeechlyClient speechlyClient = MicToSpeechly.Instance.SpeechlyClient;
speechlyClient.OnSegmentChange += (segment) =>
{
Debug.Log(segment.ToString());
TranscriptText.text = segment.ToString(
(intent) => "",
(words, entityType) => $"<color=#15e8b5>{words}<color=#ffffff>",
"."
);
};
}
You can send audio from other audio sources by calling SpeechlyClient.Start()
, sending any number of packets containing samples as float32 array (mono 16kHz by default) using SpeechlyClient.ProcessAudio()
. Finally, call SpeechlyClient.Stop()
.
In this case, you need to handle SpeechlyClient
creation, initialization, and shutdown yourself, or modify MicToSpeechly
to not process the microphone audio at the same time.
An Unity sample scene that streams data from microphone to Speechly using MicToSpeechly.cs script running on a GameObject. App-specific logic is in UseSpeechly.cs
which registers a callback and shows speech-to-text results in the UI.
An Unity sample scene that showcases a point-and-talk interface: target an object and hold the mouse button to issue speech commands like "make it big and red" or "delete". Again, app-specific logic is in UseSpeechly.cs
which registers a callback to respond to detected intents and keywords (entities).
AudioFileToSpeechly
streams a pre-recorded audio file for speech recognition.HandsFreeListening
demonstrates hands-free voice input.HandsFreeListeningVR
demonstrates hands-free voice input on a VR headset. Tested on Meta Quest 2.
Speechly Client for Unity is on-device speech recognition ready. Enabling on-device support requires add-on files (libSpeechly, speech recognition model) from Speechly.
- OS X (Intel): Replace the zero-length placeholder file
libSpeechlyDecoder.dylib
inAssets/Speechly/SpeechlyOnDevice/libSpeechly/OS_X-x86-64/libSpeechlyDecoder.dylib
- Android (arm64): Replace the zero-length placeholder file
libSpeechlyDecoder.so
inAssets/Speechly/SpeechlyOnDevice/libSpeechly/Android-ARM64/libSpeechlyDecoder.so
- Other platforms: Follow the installation instruction provided with the files.
- Add the
my-custom-model.bundle
file intospeechly-unity/Assets/StreamingAssets/SpeechlyOnDevice/Models/
- Select the
Speechly.prefab
and enter the model's filename (e.g.my-custom-model.bundle
) as the value for Model Bundle property.
To enable microphone input on OS X, set Player Settings > Settings for PC, Mac & Linux Standalone > Other Settings > Microphone Usage Description
, to for example, "Voice input is automatically processed by Speechly.com".
To diagnose problems with device builds, you can do the following:
- First try running MicToSpeechlyScene.unity in the editor without errors.
- Change to Android player, set MicToSpeechlyScene.unity as the main scene and do a
build and run
to deploy and start the build to on a device. - On the terminal start following Unity-related log lines:
adb logcat -s Unity:D
- Use the app on device. Ensure it's listening (e.g. keep
Hold to talk
button pressed) and say "ONE, TWO, THREE". Then release the button. - You should see "ONE, TWO, THREE" displayed in the top-left corner of the screen. If not, see the terminal for errors.
Exception: Could not open microphone
and green VU meter won't move. Cause: There's no implementation in place to wait for permission prompt to complete so mic permission is not given on the first run and Microphone.Start() fails. Fix: Implement platform specific permission check, or, restart app after granting the permission.WebException: Error: NameResolutionFailure
and transcript won't change when button held and app is spoken to. Cause: Production builds restric access to internet. Fix: With Android target active, go Player settings and find "Internet Access" and change it to "required".- IL2CPP build fails with
NullReferenceException
atSystem.Runtime.Serialization.Json.JsonFormatWriterInterpreter.TryWritePrimitive
. Cause: System.Runtime.Serialization.dll uses reflection to access some methods. Fix: To prevent Unity managed code linker from stripping away these methods add the filelink.xml
with the following content:
<linker>
<assembly fullname="System.Runtime.Serialization" preserve="all"/>
</linker>
- A C# development environment with .NET Standard 2.0 API:
- Microsoft .NET Core 3 or later (tested with .NET 6.0.200)
SpeechlyClient features can be run with prerecorded audio on the command line in speechly-dotnet/
folder:
dotnet run test
processes an example file, sends to Speechly cloud SLU and prints the received results in console.dotnet run vad
processes an example file, sends the utterances audio to files intemp/
folder as 16 bit raw and creates an utterance timestamp.tsv
(tab-separated values) for each audio file processed.dotnet run vad myaudiofiles/*.raw
processes a set of files with VAD.
We are happy to receive community contributions! For small fixes, feel free to file a pull request. For bigger changes or new features start by filing an issue.
./link-speechly-sources.sh
shell script will create hard links fromspeechly-dotnet/Speechly/
tospeechly-unity/Assets/Speechly/
so shared .NET code remains is in sync. Please run the script after checking out the repo and before making any changes. If you can't use the script please ensure that the files are identical manually before opening a PR../build-docs.sh
generates public API documentation using DocFX from triple-slash///
comments with C# XML documentation tags.
An unitypackage release is created with Unity's Export package feature. It should contain Speechly
and StreamingAssets
folders which contain Speechly-specific code and assets. Other files and folders in the project should not be included.
- Open the folder
speechly-unity/
subfolder (or any Scene file within that solution) using Unity Hub. The project is currently in Unity 2021.3.27f LTS. - In the Project window, select
Speechly
andStreamingAssets
folders, then right click them to open the context menu. - In the context menu, select
Export package...
. - In the Export package window, uncheck
Include dependencies
, then clickExport...
- Name the file
speechly.unitypackage
in the root folder - Add to git and push.
- These Unity SDK files are distributed under MIT License.
- Data sent to Speechly cloud is processed according to Speechly terms of use https://www.speechly.com/privacy
Copyright 2022 Speechly
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.