AllTalk V2 QuickStart Guide

erew123 edited this page Nov 27, 2024 · 34 revisions

1. Starting AllTalk

To start AllTalk, run one of the start_xxx files, such as start_alltalk.


2. Console/Terminal Information & API

After you run start_alltalk, the console displays key details: the GitHub update status, the API address that external applications can use to talk to AllTalk, and the Gradio links that open the main AllTalk web user interface.

Key Console/Terminal Details:

  • GitHub Updated: Shows when AllTalk was last updated on GitHub.
    • To pull new changes, run git pull in the alltalk_tts folder, then run atsetup to update the installation requirements.
  • API Address: 127.0.0.1:7851 - the API endpoint for TTS calls and a basic web interface.
  • Gradio Interface: The main web interface for AllTalk.
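
If an external application cannot reach AllTalk, a quick first check is whether anything is listening on the API address at all. The following is a minimal sketch using only the Python standard library. It assumes the default address 127.0.0.1:7851 shown above, and the function name is_listening is made up for this example. It only tests that a TCP connection can be opened; it does not call any AllTalk endpoint.

```python
import socket

# Host and port taken from the API address shown in the console (127.0.0.1:7851).
API_HOST = "127.0.0.1"
API_PORT = 7851


def is_listening(host: str = API_HOST, port: int = API_PORT, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening() else "not reachable"
    print(f"AllTalk API address {API_HOST}:{API_PORT} is {state}")
```

If this reports "not reachable" while AllTalk is running, check the console for the address actually in use and for any start-up errors.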

💡 Tip: Open the Gradio links by holding CTRL and left-clicking them.
💡 Tip: Errors/issues are displayed in the terminal/console. The Known Errors page is here on the Wiki.


3. Using the Gradio Interface

AllTalk's Gradio web user interface has multiple areas. The primary areas of interest are:

  • Generation Tab - Here you can generate TTS and choose which TTS engine is loaded, along with the model it is using.
    • Some TTS Engines do not have multiple model files, as their actual voice files are the models.

  • Global Settings - Here you can configure/manage many features that are core to AllTalk.
    • Some features/settings are only for advanced users or specific use cases.
    • You can tweak many settings to change the behaviour of AllTalk.
    • If you wish to use RVC voices, you need to Enable it in the RVC Settings Tab.

  • TTS Engine Settings - Here you can make changes to each TTS engine, download its model files & find help about that TTS engine.
    • Typically you will want to download/set up your chosen TTS engine right after a fresh install of AllTalk.
    • If you wish to install your own models/voices for a TTS engine, you can read that engine's documentation here.
    • An overview page for each TTS engine can also be found here, along with links to the developer of that TTS engine.

  • Help Accordions - Throughout the AllTalk interface are expandable help accordions with information about the page you are on and its settings.
    • Some help/information isn't in the interface but on the GitHub Wiki which you are on right now.
    • A Known errors and issues list is maintained on the GitHub Wiki here

4. Generate TTS Tab

This is where you control which TTS engine AllTalk has loaded for generating TTS & can generate basic TTS.

  • Change TTS Engine: To switch between different TTS engines, go to "Generate TTS" > "Generate" tab.
    • Swap TTS Engine: Use the "Swap TTS Engine" button to select different TTS engines (e.g. XTTS, Piper, VITS).
    • Load Different Model: Click "Load Different Model" to change the model for the chosen TTS engine
  • Advanced Engine/Model settings: This is where you can control/change other settings like TTS speed, language etc.
    • The settings here will change depending on what the currently loaded TTS engine supports. Some features may be greyed out.

💡 Tip: Click the Refresh Server Settings button to update all the voice lists etc.
💡 Tip: To download models for each TTS engine and manage each engine's settings, go to the TTS Engine Settings tab.
💡 Tip: Voice cloning TTS engines store wav/mp3 files to be used for cloning in the alltalk_tts/voices/ folder.
💡 Tip: RVC will not be enabled until you Enable it in the Global Settings > RVC Settings area.
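
To see which cloning samples are available outside the interface, the voices folder can be listed directly. This is a minimal sketch that only looks for the wav/mp3 file types named in the tip above. The function name list_voice_samples is invented for this example, and the path in the usage section assumes the script is run from the directory that contains alltalk_tts/.

```python
from pathlib import Path

# File types named in the tip above; other formats are not assumed here.
SAMPLE_EXTENSIONS = {".wav", ".mp3"}


def list_voice_samples(voices_dir: Path) -> list[str]:
    """Return the sorted file names of wav/mp3 samples directly inside voices_dir."""
    if not voices_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in voices_dir.iterdir()
        if p.is_file() and p.suffix.lower() in SAMPLE_EXTENSIONS
    )


if __name__ == "__main__":
    # Assumption: the current directory contains the alltalk_tts folder.
    for name in list_voice_samples(Path("alltalk_tts") / "voices"):
        print(name)
```

After adding or removing files, use the Refresh Server Settings button so the voice lists in the interface pick up the change.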


5. Global Settings

Adjust AllTalk's central default behaviour and settings. Examples include:

  • Audio Transcoding: Convert output audio to different file formats as TTS is generated, e.g. MP3, Ogg, FLAC.
  • Delete Old WAVs: Automatically delete old generated TTS files on start-up.
  • Disk Space Use: Find out how much disk space is being used and where it is being used.
  • RVC Pipeline: Enable RVC and set its default settings in the "RVC Settings" tab.
    • Enabling RVC downloads a few model files that it needs and also sets up the rvc_voices folder, where you can place RVC voice files for use.
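
The Disk Space Use feature reports usage inside the interface. For a rough idea of what such a report involves, here is a minimal sketch that totals file sizes per folder. It is not AllTalk's own implementation. The function names are invented for this example, and the default folder names (models, voices, outputs) are the ones listed under Folder Structure in this guide.

```python
from pathlib import Path


def folder_size_bytes(folder: Path) -> int:
    """Sum the sizes of all regular files under folder, recursively."""
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def usage_report(
    root: Path,
    subfolders: tuple[str, ...] = ("models", "voices", "outputs"),
) -> dict[str, int]:
    """Map each existing subfolder name to its total size in bytes."""
    return {
        name: folder_size_bytes(root / name)
        for name in subfolders
        if (root / name).is_dir()
    }


if __name__ == "__main__":
    # Assumption: the current directory contains the alltalk_tts folder.
    for name, size in usage_report(Path("alltalk_tts")).items():
        print(f"{name:10s} {size / 1_048_576:10.1f} MiB")
```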

⚠️ Warning: Some features in the Global Settings are for advanced uses or special cases. Whilst you cannot damage anything, you can change how AllTalk behaves. If you are uncertain what a setting does, please read the help for that section.


6. TTS Engine Settings

Here you can set the custom settings for each TTS engine that AllTalk works with, download models/voices (some engines use a voice-cloning model, others use individual voice models), and read about each TTS engine and its settings. If you wish to use the OpenAI-compatible TTS endpoint, you can also map the voice names used by OpenAI's API to the voices of the underlying TTS engine.

  1. Engine Information: Detailed descriptions of each engine (F5-TTS, Piper, XTTS, Parler, etc.) and links to developer sites.
  2. Models/Voices Download: Download models or voices specific to the chosen TTS engine.
  3. Default Settings: Set default parameters, including temperature, pitch, and repetition.
  4. Engine Help: Instructions on using each engine, managing models, and troubleshooting.
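
The voice mapping for the OpenAI-compatible endpoint, mentioned at the top of this section, can be pictured as a simple lookup from an OpenAI voice name to an engine voice. The sketch below is illustrative only: AllTalk manages the real mapping through this settings page, and the file names on the right-hand side are hypothetical placeholders, as are the names VOICE_MAP and resolve_voice. The six keys are the voice names originally offered by OpenAI's TTS API.

```python
# The six voice names originally offered by OpenAI's TTS API.
OPENAI_VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")

# Hypothetical mapping: the values are placeholder file names, not files
# that are known to ship with AllTalk or any TTS engine.
VOICE_MAP = {
    "alloy": "my_voice_a.wav",
    "echo": "my_voice_b.wav",
    "fable": "my_voice_c.wav",
    "onyx": "my_voice_d.wav",
    "nova": "my_voice_e.wav",
    "shimmer": "my_voice_f.wav",
}


def resolve_voice(openai_voice: str, default: str = "my_voice_a.wav") -> str:
    """Return the engine voice mapped to an OpenAI voice name, or a default."""
    return VOICE_MAP.get(openai_voice.lower(), default)


if __name__ == "__main__":
    for voice in OPENAI_VOICES:
        print(f"{voice:8s} -> {resolve_voice(voice)}")
```

A fallback default keeps requests working when a client asks for a voice name that has not been mapped.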

Currently Available TTS Engines

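| Model      | DeepSpeed | Pitch | Speed | RepPen | MultiLang | Streaming | Low VRAM | Temp | Multi Model | Voice Clone |
|------------|-----------|-------|-------|--------|-----------|-----------|----------|------|-------------|-------------|
| F5-TTS     | No        | No    | Yes   | No     | *Yes      | No        | Yes      | No   | Yes         | Yes         |
| Parler-TTS | No        | No    | No    | No     | No        | No        | Yes      | No   | Yes         | No          |
| Piper      | No        | No    | Yes   | No     | *No       | No        | No       | No   | Yes         | No          |
| Coqui VITS | No        | No    | No    | No     | *No       | No        | Yes      | No   | Yes         | No          |
| Coqui XTTS | Yes       | No    | Yes   | Yes    | Yes       | Yes       | Yes      | Yes  | Yes         | Yes         |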

Notes

  • F5-TTS: Voice Cloning from an audio sample. Supports only the Chinese and English languages.
  • Coqui XTTS: Voice Cloning from an audio sample. Multi-language.
  • Parler-TTS: Voices created by written text instructions. English language only.
  • Piper TTS: Individual single voice model files. Multi-language (depends on the voice model file).
  • Coqui VITS: Individual single voice model files. Multi-language (depends on the voice model file).
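
When choosing an engine, it can help to treat the table above as data and filter it by the features you need. The sketch below transcribes the Yes/No columns of that table into a dictionary. The MultiLang column is left out because several of its entries carry a "*" qualifier that a plain boolean would misrepresent. The names COLUMNS, ROWS, CAPABILITIES and engines_with are invented for this example and are not part of AllTalk.

```python
# Column order matches the table above, with MultiLang omitted.
COLUMNS = (
    "DeepSpeed", "Pitch", "Speed", "RepPen", "Streaming",
    "Low VRAM", "Temp", "Multi Model", "Voice Clone",
)

# One row per engine, transcribed from the table above.
ROWS = {
    "F5-TTS":     "No No Yes No No Yes No Yes Yes",
    "Parler-TTS": "No No No No No Yes No Yes No",
    "Piper":      "No No Yes No No No No Yes No",
    "Coqui VITS": "No No No No No Yes No Yes No",
    "Coqui XTTS": "Yes No Yes Yes Yes Yes Yes Yes Yes",
}

# Engine name -> {feature name: True/False}.
CAPABILITIES = {
    engine: dict(zip(COLUMNS, (value == "Yes" for value in row.split())))
    for engine, row in ROWS.items()
}


def engines_with(*features: str) -> list[str]:
    """Return the engines that support every requested feature, in table order."""
    return [
        engine
        for engine, caps in CAPABILITIES.items()
        if all(caps[feature] for feature in features)
    ]


if __name__ == "__main__":
    print("Voice cloning:", engines_with("Voice Clone"))
    print("Streaming and speed control:", engines_with("Streaming", "Speed"))
```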

💡 Tip: Information on each TTS engine is available in the Gradio Interface for each model.
💡 Tip: For more detailed information, links to the TTS engine developers' websites are on the chart above.


7. Folder Structure

AllTalk organizes files in the following structure:

📁 alltalk_tts/
    ├── 📁 .GitHub/                 # Git's version management tracking folder
    ├── 📁 alltalk_environment/     # AllTalk's Python environment folder
    ├── 📁 finetune/                # Coqui XTTS finetuning dataset files
    ├── 📁 models/                  # 🚨 TTS Engines model files are stored in here
    │   ├── 📁 f5tts/               # F5-TTS's model files/folders
    │   ├── 📁 piper/               # Piper TTS's voice files/folders
    │   ├── 📁 parler/              # Parler's model files/folders
    │   ├── 📁 rvc_base/            # RVC's core model files
    │   ├── 📁 rvc_voices/          # RVC's voice models (where you can place them)
    │   ├── 📁 xtts/                # Coqui XTTS's model files/folders
    │   ├── 📁 vits/                # Coqui VITS's voice files/folders
    │   └── etc.../
    ├── 📁 system/                  
    │   ├── 📁 espeak-ng/           # Windows installer for espeak-ng
    │   ├── 📁 gradio_pages/        
    │   ├── 📁 requirements/        # Requirement files
    │   ├── 📁 TGWUI Extension/     # TGWUI remote extension
    │   └── 📁 tts_engines/         # Individual TTS engine's core code
    │       ├── 📁 f5tts/
    │       ├── 📁 parler/
    │       ├── 📁 piper/
    │       ├── 📁 rvc/
    │       ├── 📁 template-tts-engine/ # Template code for adding a new TTS engine
    │       ├── 📁 vits/
    │       ├── 📁 xtts/
    │       ├── 🗎 tts_engines.json  # TTS engine configuration file
    │       └── 🗎 new_engines.json  # New TTS engine configuration file
    ├── 📁 voices/                  # 🚨 Audio samples for voice cloning engines are stored in here.
    ├── 📁 outputs/                 # 🚨 TTS output audio files
    ├── 🗎 confignew.json            # AllTalk's central configuration file
    ├── 🗎 atsetup.bat               # Windows setup file
    ├── 🗎 atsetup.sh                # Linux setup file
    ├── 🗎 etc...
    ├── 🗎 script.py                 # Main start-up script
    └── 🗎 tts_server.py             # Engine management script

💡 Tip: Pay attention to the folders marked with 🚨 as these will be important for using the TTS engines.
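
To confirm that an installation has the key items from the tree above, a short script can check for them. This is a minimal sketch, not an official AllTalk tool. It checks only the three 🚨 folders and confignew.json, and it reads the configuration file solely to list its top-level keys, since this guide does not document the keys themselves. The function names are invented for this example. Some folders may only appear after first use, so a missing entry is a prompt to investigate rather than proof of a broken install.

```python
import json
from pathlib import Path

# Folders marked with 🚨 in the tree above, plus the central config file.
REQUIRED_DIRS = ("models", "voices", "outputs")
CONFIG_FILE = "confignew.json"


def check_install(root: Path) -> dict[str, bool]:
    """Report which of the expected folders and files exist under root."""
    status = {f"{name}/": (root / name).is_dir() for name in REQUIRED_DIRS}
    status[CONFIG_FILE] = (root / CONFIG_FILE).is_file()
    return status


def config_keys(root: Path) -> list[str]:
    """Return the sorted top-level keys of the central configuration file."""
    with open(root / CONFIG_FILE, encoding="utf-8") as fh:
        data = json.load(fh)
    return sorted(data) if isinstance(data, dict) else []


if __name__ == "__main__":
    # Assumption: the current directory contains the alltalk_tts folder.
    for item, present in check_install(Path("alltalk_tts")).items():
        print(f"{item:16s} {'found' if present else 'missing'}")
```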


8. Additional Information

  • Auto-Delete WAVs: Set in Global Settings; controls automatic deletion of old output files.
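
To preview what an age-based clean-up would affect before enabling the setting, the outputs folder can be scanned for old files. This is a sketch of the general idea only, and AllTalk's own deletion logic may differ. The function name find_old_wavs and the 7-day threshold in the usage section are assumptions made for this example. The function lists files and does not delete anything.

```python
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86_400


def find_old_wavs(
    outputs_dir: Path,
    max_age_days: float,
    now: Optional[float] = None,
) -> list[Path]:
    """Return .wav files in outputs_dir last modified more than max_age_days ago."""
    if not outputs_dir.is_dir():
        return []
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p
        for p in outputs_dir.glob("*.wav")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Assumptions: run from the directory containing alltalk_tts; 7 days is arbitrary.
    for path in find_old_wavs(Path("alltalk_tts") / "outputs", max_age_days=7):
        print(path.name)
```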

For detailed help refer to the relevant tabs in the Gradio interface or the AllTalk Wiki.

