gcp-stt-discord

Google Cloud Platform Speech-To-Text for Discord

About

This is a helper library for extending the discord.js and discord.js voice libraries to support easy access to speech-to-text processing via the Google Cloud Platform.

The advantage of this library is that it provides a simple interface for processing speech-to-text data from Discord users.

Copyright & Licensing

This codebase is licensed under the MIT License.

Contributing

Open-source contributions to this project are encouraged.

Please open an issue or pull request if you would like to contribute.

Installation

This library is not published to npm.

To install this library, you must use the following command:

npm i github:@minesuperior/gcp-stt-discord

Documentation

Only the most commonly used features are documented here.

Other features can be found in the source code or via your editor's autocomplete.

Click to expand

`attachSpeechEvent`

Attaches custom event listeners to the client.

This should be done before the client is ready.

Parameters

Name	Type	Description
`options`	`AttachSpeechEventOptions`	Options for attaching the speech event.

`AttachSpeechEventOptions`

Name	Type	Description
`key`	`string`	The Google Cloud Platform API key.
`client`	`Discord.Client`	The discord.js client.
`shouldProcessUserId`	`(discordUserId: string) => Promise<boolean>`	A promise that should return true if the user's voice should be processed.

Returns

void

`Events`

The events emitted by the discord.js client injected by this library.

Name	Parameters	Description
`Error`	`SpeechError`	Emitted when an error occurs.
`VoiceMessage`	`VoiceMessage`	Emitted when a user finishes speaking.

`SpeechError`

An error that occurred within this library or dependencies.

Name	Type	Description
`code`	`SpeechErrorCode`	A custom error code provided by this library.
`message`	`string`	The error message.
`error`	`Error`	The error that occurred.

`SpeechErrorCode`

Name	Description
`NetworkRequest`	An error occurred while making a network request.
`NetworkResponse`	An error occurred while receiving a network response.
`CreateVoiceMessage`	An error occurred while creating a voice message.
`VoiceConnectionStatusTimeout`	The voice connection status did not change in time.
`OpusStream`	An error occurred within an opus stream.

`VoiceMessage`

A voice message is an instance of a user's voice data.

Properties

Name	Type	Description
`client`	`Discord.Client`	The discord.js client.
`userId`	`string`	The user's identifier.
`channelId`	`string`	The channel's identifier.
`connection`	`DiscordVoice.VoiceConnection`	The voice connection.
`monoBuffer`	`Buffer`	The audio represented as a 48kHz PCM mono audio buffer.
`duration`	`number`	The duration of the voice message in seconds.

Methods

Name	Parameters	Return Type	Description
`recognize`	n/a	`Promise<SpeechToTextApiResponsePayload>`	A promise to resolve a transcribe from GCP STT.

`SpeechToTextApiResponsePayload`

Refer to the Google Cloud Platform's Speech-To-Text Api's documentation for more information.

Example

This example assumes that you are using Discord.js v14, and Discord.js voice 0.14.

Click to expand

import * as Discord from 'discord.js';

import * as DiscordSpeechRecognition from '@minesuperior/gcp-stt-discord';

//------------------------------------------------------------//

const gcp_stt_api_key = process.env.GCP_STT_API_KEY as string | undefined;
if (!gcp_stt_api_key?.length) throw new Error('GCP_STT_API_KEY is not set');

const discord_bot_token = process.env.DISCORD_BOT_TOKEN as string | undefined;
if (!discord_bot_token?.length) throw new Error('DISCORD_BOT_TOKEN is not set');

//------------------------------------------------------------//

const discord_client = new Discord.Client({
  intents: [
    Discord.GatewayIntentBits.Guilds,
    Discord.GatewayIntentBits.GuildVoiceStates,
  ],
});

// Attaches custom event listeners to the client.
// This should be done before the client is ready.
DiscordSpeechRecognition.attachSpeechEvent({
  key: gcp_stt_api_key,
  client: discord_client,
  shouldProcessUserId: (userId) => {
    // Assuming that the bot is currently in a voice channel with a user,
    // every time that a user in the voice channel starts speaking,
    // this function will be called to check if the user's voice should be processed

    // You should ignore bots, system accounts, and people who do not want to be processed.
    // That includes preventing this bot from processing its own voice.

    // Due to laws such as GDPR, it is highly recommended to
    // have users opt-in to your bot's speech recognition.

    // return true to process the user's voice.
    // return false to not process the user's voice.

    return true; // process all users
  },
});

//------------------------------------------------------------//

client.on(DiscordSpeechRecognition.Events.Error, (speechError: DiscordSpeechRecognition.SpeechError) => {
  // It is highly recommended to only listen to errors that you care about.
  // Use `speechError.code` and the enum `SpeechErrorCode` to filter.
});

client.on(DiscordSpeechRecognition.Events.VoiceMessage, (voiceMessage: DiscordSpeechRecognition.VoiceMessage) => {
  // This event is fired every time a user finishes speaking.
  // The `voiceMessage` parameter contains the user's voice data.
  // You should check the duration of the voice message to make sure it isn't too short or too long.

  if (
    voiceMessage.duration < 1 ||
    voiceMessage.duration > 30
  ) return; // voice message is too short or too long

  const { results, totalBilledTime } = await voiceMessage.recognize();

  const [ firstResult ] = results;
  if (!firstResult) return; // first result is not present

  if (!firstResult.isFinal) return; // result is not final (not done processing)

  const [ firstAlternative ] = firstResult.alternatives;
  if (!firstAlternative) return; // first alternative is not present

  const { transcript, confidence } = firstAlternative;

  // do something with the `transcript`, `confidence`, and `totalBilledTime`
});

client.on(Discord.Events.ClientReady, () => {
  // have the bot join a voice channel somehow
  // then, the bot will start processing voice messages
});

//------------------------------------------------------------//

client.login(discord_bot_token);

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

gcp-stt-discord

About

Copyright & Licensing

Contributing

Installation

Documentation

`attachSpeechEvent`

Parameters

`AttachSpeechEventOptions`

Returns

`Events`

`SpeechError`

`SpeechErrorCode`

`VoiceMessage`

Properties

Methods

`SpeechToTextApiResponsePayload`

Example

About

Releases

Packages

Languages

License

MineSuperior/gcp-stt-discord

Folders and files

Latest commit

History

Repository files navigation

gcp-stt-discord

About

Copyright & Licensing

Contributing

Installation

Documentation

attachSpeechEvent

Parameters

AttachSpeechEventOptions

Returns

Events

SpeechError

SpeechErrorCode

VoiceMessage

Properties

Methods

SpeechToTextApiResponsePayload

Example

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

`attachSpeechEvent`

`AttachSpeechEventOptions`

`Events`

`SpeechError`

`SpeechErrorCode`

`VoiceMessage`

`SpeechToTextApiResponsePayload`

Packages