TTS-RVC-API

Yes, we can use Coqui with RVC!

Why combine the two frameworks?

Coqui is a text-to-speech framework (vocoder and encoder), but cloning your own voice takes decades and offers no guarantee of better results. That's why we use RVC (Retrieval-Based Voice Conversion), which works only for speech-to-speech. You can train the model with just 2-3 minutes of dataset as it uses Hubert (a pre-trained model to fine-tune quickly and provide better results).

Installation

How to use Coqui + RVC api?

https://github.com/skshadan/TTS-RVC-API.git

python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
pip install TTS
python -m uvicorn app.main:app

Now update config.toml with relative paths config model_dir path or set a speaker_name in the request body

Where the RVC v2 model is mounted on the container at:

  /
└── models
      └── speaker1
          ├── speaker1.pth
          └── speaker1.index

Now Run this

  python -m uvicorn app.main:app

POST REQUEST

  http://localhost:8000/generate

    emotions : happy,sad,angry,dull
    speed = 1.0 - 2.0

  {
  "speaker_name": "speaker3",
  "input_text": "Hey there! Welcome to the world",
  "emotion": "Surprise",
  "speed": 1.0
}

CODE SNIPPET

import requests
import json
import time

url = "http://127.0.0.1:8000/generate"

payload = json.dumps({
  "speaker_name": "speaker3",
  "input_text": "Are you mad? The way you've betrayed me is beyond comprehension, a slap in the face that's left me boiling with an anger so intense it's as if you've thrown gasoline on a fire, utterly destroying any trust that was left.",
  "emotion": "Dull",
  "speed": 1.0
})
headers = {
  'Content-Type': 'application/json'
}

start_time = time.time()  # Start the timer

response = requests.request("POST", url, headers=headers, data=payload)

end_time = time.time()  # Stop the timer

if response.status_code == 200:
    audio_content = response.content
    
    # Save the audio to a file
    with open("generated_audio.wav", "wb") as audio_file:
        audio_file.write(audio_content)
        
    print("Audio saved successfully.")
    print("Time taken:", end_time - start_time, "seconds")
else:
    print("Error:", response.text)

Feedback

If you have any feedback, issues please reach out to shadankhantech@gmail.com

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
app		app
models		models
tmp		tmp
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
TODO.md		TODO.md
api.py		api.py
build.sh		build.sh
config.toml		config.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TTS-RVC-API

Why combine the two frameworks?

Installation

POST REQUEST

CODE SNIPPET

Feedback

About

Releases

Packages

Contributors 2

Languages

License

skshadan/TTS-RVC-API

Folders and files

Latest commit

History

Repository files navigation

TTS-RVC-API

Why combine the two frameworks?

Installation

POST REQUEST

CODE SNIPPET

Feedback

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages