API ‐ Standard TTS Generation API

This endpoint allows you to generate Text-to-Speech (TTS) audio based on text input. It supports both character and narrator speech generation.

To understand how TTS requests to this endpoint flow through AllTalk V2, see the request flowchart elsewhere in the AllTalk V2 documentation.

Endpoint Details

URL: http://{ipaddress}:{port}/api/tts-generate
Method: POST
Content-Type: application/x-www-form-urlencoded
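
The endpoint details above can be turned into a request object roughly as follows. This is a minimal sketch using only Python's standard library; `build_tts_request` is a hypothetical helper name (not part of AllTalk), and the host and port defaults are simply the `127.0.0.1:7851` values used in the examples on this page. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_tts_request(fields, ipaddress="127.0.0.1", port=7851):
    """Build (but do not send) a form-encoded POST for /api/tts-generate.

    Hypothetical helper for illustration; not part of AllTalk itself.
    """
    url = f"http://{ipaddress}:{port}/api/tts-generate"
    # urlencode produces an application/x-www-form-urlencoded body
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


req = build_tts_request({"text_input": "Hello world", "language": "en"})
print(req.full_url)      # http://127.0.0.1:7851/api/tts-generate
print(req.get_method())  # POST
print(req.data)          # b'text_input=Hello+world&language=en'
```

To actually send the request, pass `req` to `urllib.request.urlopen` while an AllTalk server is running.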

Request Parameters

| Parameter | Type | Description |
|---|---|---|
| `text_input` | string | The text you want the TTS engine to produce. |
| `text_filtering` | string | Filter for text. Options: `none`, `standard`, `html` |
| `character_voice_gen` | string | The name of the character's voice file (WAV format). |
| `rvccharacter_voice_gen` | string | The name of the RVC voice file for the character. Format: `folder\file.pth` or `Disabled` |
| `rvccharacter_pitch` | integer | The pitch for the RVC voice for the character. Range: -24 to 24 |
| `narrator_enabled` | boolean | Enable or disable the narrator function. |
| `narrator_voice_gen` | string | The name of the narrator's voice file (WAV format). |
| `rvcnarrator_voice_gen` | string | The name of the RVC voice file for the narrator. Format: `folder\file.pth` or `Disabled` |
| `rvcnarrator_pitch` | integer | The pitch for the RVC voice for the narrator. Range: -24 to 24 |
| `text_not_inside` | string | Specify handling of lines not inside quotes or asterisks. Options: `character`, `narrator`, `silent` |
| `language` | string | Choose the language for TTS. (See supported languages below) |
| `output_file_name` | string | The name of the output file (excluding the .wav extension). |
| `output_file_timestamp` | boolean | Add a timestamp to the output file name. |
| `autoplay` | boolean | Enable or disable playing the generated TTS to your standard sound output device. |
| `autoplay_volume` | float | Set the autoplay volume. Range: 0.1 to 1.0 |
| `speed` | float | Set the speed of the generated audio. Range: 0.25 to 2.0 |
| `pitch` | integer | Set the pitch of the generated audio. Range: -10 to 10 |
| `temperature` | float | Set the temperature for the TTS engine. Range: 0.1 to 1.0 |
| `repetition_penalty` | float | Set the repetition penalty for the TTS engine. Range: 1.0 to 20.0 |
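
The ranges and option lists in the table can be checked on the client before a request is sent. The sketch below is one possible way to do that; `find_payload_problems` is a hypothetical helper, and the server's own validation may behave differently from this client-side check.

```python
# Numeric ranges copied from the parameter table above.
NUMERIC_RANGES = {
    "rvccharacter_pitch": (-24, 24),
    "rvcnarrator_pitch": (-24, 24),
    "autoplay_volume": (0.1, 1.0),
    "speed": (0.25, 2.0),
    "pitch": (-10, 10),
    "temperature": (0.1, 1.0),
    "repetition_penalty": (1.0, 20.0),
}

# Allowed option values copied from the parameter table above.
ALLOWED_OPTIONS = {
    "text_filtering": {"none", "standard", "html"},
    "text_not_inside": {"character", "narrator", "silent"},
}


def find_payload_problems(payload):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name in payload and not low <= float(payload[name]) <= high:
            problems.append(f"{name}={payload[name]} is outside {low} to {high}")
    for name, options in ALLOWED_OPTIONS.items():
        if name in payload and payload[name] not in options:
            problems.append(f"{name}={payload[name]!r} is not one of {sorted(options)}")
    return problems


print(find_payload_problems({"speed": 3.0, "text_filtering": "standard"}))
# ['speed=3.0 is outside 0.25 to 2.0']
print(find_payload_problems({"pitch": 2, "text_not_inside": "narrator"}))
# []
```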

Supported Languages

| Code | Language |
|---|---|
| `ar` | Arabic |
| `zh-cn` | Chinese (Simplified) |
| `cs` | Czech |
| `nl` | Dutch |
| `en` | English |
| `fr` | French |
| `de` | German |
| `hi` | Hindi (limited support) |
| `hu` | Hungarian |
| `it` | Italian |
| `ja` | Japanese |
| `ko` | Korean |
| `pl` | Polish |
| `pt` | Portuguese |
| `ru` | Russian |
| `es` | Spanish |
| `tr` | Turkish |
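
A client can keep the language table as a lookup and reject unknown codes early. The following is a small sketch under that assumption; `normalize_language` is a hypothetical helper, and lower-casing the input is a convenience added here, not documented server behaviour.

```python
# Language codes copied from the table above.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "zh-cn": "Chinese (Simplified)", "cs": "Czech", "nl": "Dutch",
    "en": "English", "fr": "French", "de": "German", "hi": "Hindi (limited support)",
    "hu": "Hungarian", "it": "Italian", "ja": "Japanese", "ko": "Korean",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "es": "Spanish",
    "tr": "Turkish",
}


def normalize_language(code):
    """Return the trimmed, lower-cased code if it is in the table, else raise ValueError."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return normalized


print(normalize_language(" EN "))          # en
print(SUPPORTED_LANGUAGES["zh-cn"])        # Chinese (Simplified)
```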

Example Requests

Standard TTS Speech Example

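Generate a time-stamped file for standard text from the command prompt/terminal. With `autoplay=false`, as used here, the audio is saved but not played; set `autoplay=true` to also play it on the standard sound output device: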

curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesn't matter in the slightest" \
     -d "text_filtering=standard" \
     -d "character_voice_gen=female_01.wav" \
     -d "narrator_enabled=false" \
     -d "narrator_voice_gen=male_01.wav" \
     -d "text_not_inside=character" \
     -d "language=en" \
     -d "output_file_name=myoutputfile" \
     -d "output_file_timestamp=true" \
     -d "autoplay=false" \
     -d "autoplay_volume=0.8"

Narrator Example

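Generate a time-stamped file for text with narrator and character speech from the command prompt/terminal. With `autoplay=false`, as used here, the audio is saved but not played; set `autoplay=true` to also play it on the standard sound output device: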

curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=*This is text spoken by the narrator* \"This is text spoken by the character\". This is text not inside quotes." \
     -d "text_filtering=standard" \
     -d "character_voice_gen=female_01.wav" \
     -d "narrator_enabled=true" \
     -d "narrator_voice_gen=male_01.wav" \
     -d "text_not_inside=character" \
     -d "language=en" \
     -d "output_file_name=myoutputfile" \
     -d "output_file_timestamp=true" \
     -d "autoplay=false" \
     -d "autoplay_volume=0.8"

Note: If your text contains double quotes, escape them with \" (see the narrator example).
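
The backslash escaping in the note is a shell concern. When the body is built in code instead, form encoding takes care of quotes and asterisks. The sketch below, using only Python's standard library, encodes the narrator example text and decodes it again to show that nothing is lost in the round trip.

```python
from urllib.parse import parse_qs, urlencode

# Single-quoted Python string, so the double quotes need no escaping here.
text = (
    '*This is text spoken by the narrator* '
    '"This is text spoken by the character". '
    'This is text not inside quotes.'
)

# urlencode percent-encodes the double quotes (as %22) and the asterisks.
body = urlencode({"text_input": text, "narrator_enabled": "true"})
print(body)

# Decoding the body recovers the original text, quotes and asterisks included.
decoded = parse_qs(body)["text_input"][0]
print(decoded == text)  # True
```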

Minimal Request Example

You can send a request with any mix of settings you wish. Missing fields will be populated using default API Global settings and default TTS engine settings:

curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesn't matter in the slightest"
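
Because missing fields fall back to defaults, a client can leave out anything the caller did not set rather than sending placeholder values. The sketch below shows one way to do this; `build_form_body` is a hypothetical helper, and treating `None` as "not set" is a convention chosen for this example.

```python
from urllib.parse import urlencode


def build_form_body(text_input, **overrides):
    """Encode only the fields the caller set; None values are left out entirely."""
    fields = {"text_input": text_input}
    for name, value in overrides.items():
        if value is None:
            continue  # omitted fields fall back to the configured defaults
        if isinstance(value, bool):
            # Use the lowercase true/false form seen in the curl examples
            value = "true" if value else "false"
        fields[name] = value
    return urlencode(fields)


print(build_form_body("Hello", language=None, autoplay=False, speed=1.2))
# text_input=Hello&autoplay=false&speed=1.2
```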

Response

The API returns a JSON object with the following properties:

| Property | Description |
|---|---|
| `status` | Indicates whether the generation was successful (`generate-success`) or failed (`generate-failure`). |
| `output_file_path` | The on-disk location of the generated WAV file. |
| `output_file_url` | The HTTP location for accessing the generated WAV file for browser playback. |
| `output_cache_url` | The HTTP location for accessing the generated WAV file as a pushed download. |

Example response:

{
    "status": "generate-success",
    "output_file_path": "C:\\text-generation-webui\\extensions\\alltalk_tts\\outputs\\myoutputfile_1704141936.wav",
    "output_file_url": "/audio/myoutputfile_1704141936.wav",
    "output_cache_url": "/audiocache/myoutputfile_1704141936.wav"
}

Note: The response no longer includes the IP address and port number. You will need to add these in your own software/extension.
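
Since the URLs in the response are relative, the client has to prefix them with the server address. A minimal sketch of that step follows; `absolute_audio_urls` is a hypothetical helper, the base URL is the `127.0.0.1:7851` address used in the examples on this page, and the sample JSON is the example response shown above.

```python
import json
from urllib.parse import urljoin


def absolute_audio_urls(response_text, base_url="http://127.0.0.1:7851"):
    """Parse a tts-generate response and prefix its relative URLs with base_url."""
    result = json.loads(response_text)
    if result.get("status") != "generate-success":
        raise RuntimeError(f"Generation did not succeed: {result.get('status')!r}")
    return {
        "file_url": urljoin(base_url, result["output_file_url"]),
        "cache_url": urljoin(base_url, result["output_cache_url"]),
    }


# The example response from this page, serialised back to JSON text.
sample = json.dumps({
    "status": "generate-success",
    "output_file_path": "C:\\text-generation-webui\\extensions\\alltalk_tts\\outputs\\myoutputfile_1704141936.wav",
    "output_file_url": "/audio/myoutputfile_1704141936.wav",
    "output_cache_url": "/audiocache/myoutputfile_1704141936.wav",
})

urls = absolute_audio_urls(sample)
print(urls["file_url"])   # http://127.0.0.1:7851/audio/myoutputfile_1704141936.wav
print(urls["cache_url"])  # http://127.0.0.1:7851/audiocache/myoutputfile_1704141936.wav
```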

Additional Notes

All global settings for the API endpoint can be configured within the AllTalk interface under Global Settings > AllTalk API Defaults.
TTS engine-specific settings, such as voices to use or engine parameters, can be set on an engine-by-engine basis in TTS Engine Settings > TTS Engine of your choice.
Although you can send all variables/settings, the loaded TTS engine will only support them if it is capable. For example, you can request a TTS generation in Russian, but if the TTS model that is loaded only supports English, it will only generate English-sounding text-to-speech.
Voices sent in the request have to match the voices available within the TTS engine loaded. Generation requests where the voices don't match will result in nothing being generated and possibly an error message.
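
Because a mismatched voice produces no audio, it can help to compare the requested voices against the engine's voice list before sending. The sketch below assumes the caller already has that list (how it is obtained is outside this example); `check_voices` is a hypothetical helper and the voice names are placeholders.

```python
def check_voices(payload, available_voices):
    """Return the voice fields in payload whose value is not in available_voices."""
    voice_fields = ("character_voice_gen", "narrator_voice_gen")
    return {
        field: payload[field]
        for field in voice_fields
        if field in payload and payload[field] not in available_voices
    }


# Placeholder voice list for illustration only.
available = {"female_01.wav", "male_01.wav"}
payload = {"character_voice_gen": "female_01.wav", "narrator_voice_gen": "robot.wav"}

print(check_voices(payload, available))
# {'narrator_voice_gen': 'robot.wav'}
```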

Code Examples

Python Example

import requests
import json

# API endpoint
API_URL = "http://127.0.0.1:7851/api/tts-generate"

# Function to generate TTS
def generate_tts(text, character_voice, narrator_voice=None, language="en", output_file="output", autoplay=False):
    # Prepare the payload
    payload = {
        "text_input": text,
        "text_filtering": "standard",
        "character_voice_gen": character_voice,
        "narrator_enabled": "true" if narrator_voice else "false",
        "narrator_voice_gen": narrator_voice if narrator_voice else "",
        "text_not_inside": "character",
        "language": language,
        "output_file_name": output_file,
        "output_file_timestamp": "true",
        "autoplay": str(autoplay).lower(),
        "autoplay_volume": "0.8"
    }

    # Send POST request to the API
    response = requests.post(API_URL, data=payload)

    # Check if the request was successful
    if response.status_code == 200:
        result = json.loads(response.text)
        if result["status"] == "generate-success":
            print("TTS generated successfully!")
            print(f"File path: {result['output_file_path']}")
            print(f"File URL: {result['output_file_url']}")
            print(f"Cache URL: {result['output_cache_url']}")
            # Return the parsed response so the caller can use the paths/URLs
            return result
        else:
            print("TTS generation failed.")
    else:
        print(f"Error: {response.status_code} - {response.text}")

    # Signal failure to the caller
    return None

# Example usage
if __name__ == "__main__":
    text = "Hello, this is a test of the TTS API. *This part is narrated.* \"And this is spoken by a character.\""
    character_voice = "female_01.wav"
    narrator_voice = "male_01.wav"

    generate_tts(text, character_voice, narrator_voice)

# Note: Make sure to replace the API_URL with the correct IP address and port if different from the default
# You can customize the payload further by adding more parameters as needed (e.g., pitch, speed, temperature)
# Error handling can be improved for production use

JavaScript Example

// API endpoint
const API_URL = "http://127.0.0.1:7851/api/tts-generate";

// Function to generate TTS
async function generateTTS(text, characterVoice, narratorVoice = null, language = "en", outputFile = "output", autoplay = false) {
    // Prepare the payload
    const payload = new URLSearchParams({
        text_input: text,
        text_filtering: "standard",
        character_voice_gen: characterVoice,
        narrator_enabled: narratorVoice ? "true" : "false",
        narrator_voice_gen: narratorVoice || "",
        text_not_inside: "character",
        language: language,
        output_file_name: outputFile,
        output_file_timestamp: "true",
        autoplay: autoplay.toString(),
        autoplay_volume: "0.8"
    });

    try {
        // Send POST request to the API
        const response = await fetch(API_URL, {
            method: 'POST',
            body: payload,
            headers: {
                'Content-Type': 'application/x-www-form-urlencoded',
            },
        });

        if (!response.ok) {
            throw new Error(`HTTP error! status: ${response.status}`);
        }

        const result = await response.json();

        if (result.status === "generate-success") {
            console.log("TTS generated successfully!");
            console.log(`File path: ${result.output_file_path}`);
            console.log(`File URL: ${result.output_file_url}`);
            console.log(`Cache URL: ${result.output_cache_url}`);
            return result;
        } else {
            console.error("TTS generation failed.");
            return null;
        }
    } catch (error) {
        console.error("Error:", error);
        return null;
    }
}

// Example usage
const text = "Hello, this is a test of the TTS API. *This part is narrated.* \"And this is spoken by a character.\"";
const characterVoice = "female_01.wav";
const narratorVoice = "male_01.wav";

generateTTS(text, characterVoice, narratorVoice)
    .then(result => {
        if (result) {
            // Handle successful generation, e.g., play audio or update UI
        }
    });

// Note: Make sure to replace the API_URL with the correct IP address and port if different from the default
// You can customize the payload further by adding more parameters as needed (e.g., pitch, speed, temperature)
// This example uses async/await for better readability, but you can also use .then() chains if preferred
// Error handling can be improved for production use
// For browser usage, ensure CORS is properly configured on the server side
