LIVE-CHAT

LIVE-CHAT is a voice-based conversational assistant application that uses Speech-to-Text (STT), Large Language Models (LLMs), and Text-to-Speech (TTS) to chat with you in your terminal. It is designed to simulate a live conversation with short, conversational responses, and it now includes TTS speaker selection and LLM selection.

Example video: LIVE-CHAT demo

Features

  • Text-to-Speech (TTS) support with multiple providers: Microsoft Edge TTS, Deepgram.com, Coqui XTTSv2 (Offline). You can now select your preferred TTS speaker, and audio files placed in /voices can be used for custom voice cloning with Coqui.
  • Language model processing for conversational responses, with support for multiple providers: Groq, OpenAI API, Ollama (Offline). You can now select your preferred Language Model (LLM).
  • Speech-to-Text (STT) support with multiple providers: Deepgram.com, Whisper (Offline).
  • Enhanced user customization options for a more personalized experience; a rough sketch of how provider selection might look follows this list.
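
As an illustration of the selection features above, here is a minimal sketch of how a TTS provider registry and speaker picker might look. The provider names mirror this list, but the TTS_PROVIDERS dictionary, the choose helper, and the example voice names are hypothetical and are not taken from app.py.

```python
# Hypothetical sketch: a TTS provider registry and an interactive picker.
# The structure and voice names are illustrative only; see app.py for the real logic.
TTS_PROVIDERS = {
    "edge": ["en-US-GuyNeural", "en-GB-SoniaNeural"],  # example Edge TTS voices
    "deepgram": ["aura-asteria-en"],                   # example Deepgram Aura voice
    "coqui": ["default", "cloned-from-voices-dir"],    # offline XTTSv2, /voices cloning
}

def choose(prompt, options):
    """Print numbered options and return the one the user picks."""
    for i, name in enumerate(options, start=1):
        print(f"{i}. {name}")
    index = int(input(f"{prompt}: ")) - 1
    return options[index]

if __name__ == "__main__":
    provider = choose("Select a TTS provider", list(TTS_PROVIDERS))
    speaker = choose("Select a speaker", TTS_PROVIDERS[provider])
    print(f"Using {provider} with speaker {speaker}")
```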

Setup

  1. Clone the repository and cd live-chat
  2. Create a python environment
    • Conda: conda create -n live-chat python=3.11 and activate with conda activate live-chat
    • Python Virtual Environment: python -m venv venv and activate with source venv/bin/activate (Linux) or venv\Scripts\activate (Windows)
  3. Install torch as per your hardware
  4. Install the required Python packages by running pip install -r requirements.txt
  5. Set up your environment variables in a .env file. You'll need to provide your API keys for the TTS, LLM, and STT services you plan to use. You can also use the offline modes without any API keys; however, you will have to install and configure Ollama. A sketch of how these keys might be loaded is shown after this list.
  6. Install ffmpeg and ensure you can run it in your command line
  7. Run the application with python app.py
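
As a minimal sketch of step 5, the snippet below shows how a .env file might be read with python-dotenv. The variable names (DEEPGRAM_API_KEY, GROQ_API_KEY, OPENAI_API_KEY) are assumptions for illustration; check app.py for the names it actually expects.

```python
# Hypothetical sketch of reading API keys from a .env file with python-dotenv.
# The variable names are assumptions; app.py may expect different ones.
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the process environment

DEEPGRAM_API_KEY = os.getenv("DEEPGRAM_API_KEY")  # Deepgram STT/TTS
GROQ_API_KEY = os.getenv("GROQ_API_KEY")          # Groq LLM
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")      # OpenAI API LLM

# The offline providers (Whisper, Coqui XTTSv2, Ollama) need none of these keys.
if not any([DEEPGRAM_API_KEY, GROQ_API_KEY, OPENAI_API_KEY]):
    print("No API keys found; only the offline providers will be usable.")
```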

Usage

When you run the application, you'll be prompted to choose your preferred TTS and STT providers, and you can also select your preferred TTS speaker and Language Model (LLM). After that, the application starts a conversation, which you can end by saying "goodbye". A rough sketch of this loop is shown below.
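
The following is a rough sketch of that conversation loop, not the implementation in app.py: record_audio, transcribe, generate_reply, and speak are hypothetical placeholders standing in for whichever STT, LLM, and TTS providers you selected.

```python
# Hypothetical conversation loop: listen -> transcribe -> generate -> speak.
# All four helpers are placeholders, not functions from this repository.
def run_conversation(record_audio, transcribe, generate_reply, speak):
    history = []  # running chat history passed to the LLM for context
    while True:
        audio = record_audio()              # capture microphone input
        user_text = transcribe(audio)       # speech -> text (STT provider)
        print(f"You: {user_text}")

        if "goodbye" in user_text.lower():  # saying "goodbye" ends the chat
            speak("Goodbye!")
            break

        history.append({"role": "user", "content": user_text})
        reply = generate_reply(history)     # short, conversational LLM response
        history.append({"role": "assistant", "content": reply})

        print(f"Assistant: {reply}")
        speak(reply)                        # text -> speech (TTS provider)
```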

The fastest combination of tools that I have found is STT using Deepgram, LLM with Groq, and TTS with Deepgram.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the terms of the MIT license.