Speech-to-Speech Translator

This project uses the Google Cloud Speech-to-Text API to transcribe speech, the DeepL API to translate the transcribed text, and the ElevenLabs API to convert the translated text back to speech. Together these form a speech-to-speech translation pipeline.

Prerequisites

Before running this project, make sure you have the following software installed and API keys available:

Python 3.7 or later
Google Cloud SDK (gcloud)
PyAudio
Requests
Pygame
DeepL API key
ElevenLabs API key

Installation

Clone the repository:

git clone https://github.com/bykemalh/S2ST.git
cd S2ST

Set up a virtual environment:

python3 -m venv env
source env/bin/activate  # On Windows use `env\Scripts\activate`

Install the required Python packages:

pip install google-cloud-speech pyaudio deepl requests pygame

Install Google Cloud SDK: Follow Google's official installation instructions for your operating system.

Authenticate with Google Cloud:

gcloud auth login
gcloud auth application-default login

Enable the Google Cloud Speech-to-Text API:

gcloud services enable speech.googleapis.com

Set up API keys: Replace the placeholder values in the script with your actual DeepL and ElevenLabs API keys.
```
auth_key = "your-deepl-auth-key"
xi_api_key = "your-elevenlabs-api-key"
```
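
As an alternative to hard-coding the keys, they can be read from environment variables so they never end up in version control. The following is a minimal sketch, not part of the original script: the variable names `DEEPL_AUTH_KEY` and `ELEVENLABS_API_KEY` and the helper `load_api_keys` are assumptions chosen for illustration.

```python
import os

# Hypothetical environment variable names; the script itself only
# defines the Python variables auth_key and xi_api_key.
DEEPL_ENV = "DEEPL_AUTH_KEY"
ELEVENLABS_ENV = "ELEVENLABS_API_KEY"


def load_api_keys(env=os.environ):
    """Return (auth_key, xi_api_key), failing early if either is missing or empty."""
    missing = [name for name in (DEEPL_ENV, ELEVENLABS_ENV) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return env[DEEPL_ENV], env[ELEVENLABS_ENV]
```

With both variables exported in the shell, `auth_key, xi_api_key = load_api_keys()` could replace the two placeholder assignments.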

Running the Application

To run the application, execute the main script (the repository also contains S2ST_Advanced.py and S2ST_Basic.py variants):

python S2ST_Main.py

How It Works

Audio Input:
- The application opens a microphone stream using the pyaudio library and captures audio in real-time.
Speech-to-Text:
- The captured audio is sent to the Google Cloud Speech-to-Text API, which returns the transcribed text.
Translation:
- The transcribed text is translated to English using the DeepL API.
Text-to-Speech:
- The translated text is sent to the ElevenLabs API, which converts it to speech and plays it back.
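
The four steps above can be sketched as a small pipeline in which each stage is a plain function. This is an illustrative sketch and not the code in the repository: `run_pipeline`, `make_deepl_translate`, and `make_elevenlabs_synthesize` are hypothetical names, the `EN-US` target and the bare `{"text": ...}` request body are assumptions, and the ElevenLabs URL follows its public REST text-to-speech endpoint. The third-party imports are done lazily so the pipeline itself can be exercised with stub stages.

```python
def run_pipeline(audio_chunks, transcribe, translate, synthesize):
    """Yield synthesized audio for each non-empty transcript.

    transcribe: iterable of audio chunks -> iterable of transcript strings
    translate:  str -> str
    synthesize: str -> audio (for example, MP3 bytes)
    """
    for text in transcribe(audio_chunks):
        text = text.strip()
        if not text:
            continue  # skip empty recognition results
        yield synthesize(translate(text))


def make_deepl_translate(auth_key, target_lang="EN-US"):
    """Build a translate stage backed by the deepl package."""
    import deepl  # third-party, imported only when this stage is used

    translator = deepl.Translator(auth_key)

    def translate(text):
        return translator.translate_text(text, target_lang=target_lang).text

    return translate


def make_elevenlabs_synthesize(xi_api_key, voice_id):
    """Build a synthesize stage that calls the ElevenLabs REST API."""
    import requests  # third-party, imported only when this stage is used

    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    headers = {"xi-api-key": xi_api_key, "Content-Type": "application/json"}

    def synthesize(text):
        response = requests.post(url, headers=headers, json={"text": text}, timeout=30)
        response.raise_for_status()
        return response.content  # encoded audio bytes

    return synthesize
```

Keeping the stages as injected callables means the control flow can be tested without a microphone or network access, and any one service can be swapped without touching the others.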

Dependencies

Ensure you have the following libraries installed:

google-cloud-speech
pyaudio
deepl
requests
pygame

You can install these dependencies using the following command:

pip install google-cloud-speech pyaudio deepl requests pygame

Configuration

Modify the following variables in the script to match your settings:

auth_key: Your DeepL API key.
xi_api_key: Your ElevenLabs API key.
voice_id: The voice ID to be used with ElevenLabs API.
RATE: The audio sample rate (default is 16000).
CHUNK: The audio chunk size (default is 1600).
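
To see how RATE and CHUNK interact, the sketch below computes how much audio one chunk holds. It assumes 16-bit mono samples, which the README does not state; the helper names are hypothetical.

```python
RATE = 16000   # samples per second (default listed above)
CHUNK = 1600   # frames read from the microphone per buffer (default listed above)


def chunk_duration_ms(rate, chunk):
    """Duration of one chunk in milliseconds."""
    return 1000.0 * chunk / rate


def chunk_size_bytes(chunk, sample_width=2, channels=1):
    """Size of one chunk in bytes; 2-byte (16-bit) mono samples are an assumption."""
    return chunk * sample_width * channels


print(chunk_duration_ms(RATE, CHUNK))  # 100.0
print(chunk_size_bytes(CHUNK))         # 3200
```

With the defaults, each chunk carries 100 ms of audio. A smaller CHUNK lowers latency at the cost of more frequent reads and requests.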

Logging

Logging is set up in the script to capture errors during the text-to-speech conversion process. You can enable more detailed logging by uncommenting the logging configuration line.

# logging.basicConfig(level=logging.DEBUG)
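
Instead of editing the script each time, the log level could be switched with an environment variable. This is a small sketch under that assumption: the variable name `S2ST_DEBUG` and the helper `configure_logging` are hypothetical and do not appear in the script.

```python
import logging
import os


def configure_logging(env=os.environ):
    """Use DEBUG when S2ST_DEBUG=1 is set, WARNING otherwise; return the chosen level."""
    level = logging.DEBUG if env.get("S2ST_DEBUG") == "1" else logging.WARNING
    # basicConfig has no effect if the root logger already has handlers.
    logging.basicConfig(level=level)
    return level
```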

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

If you wish to contribute to this project, please fork the repository and create a pull request.

Acknowledgments

Developed By

This algorithm was developed by Kemal Hafızoğlu.
