espnet-tts-serving


Serves an ESPnet (version 2) TTS model file. The Python code is packed into a Docker container that runs PyTorch on CPU or GPU. The service performs PyTorch model inference only; no text frontend is included. Input is a list of phonemes: {"text":"a <space> b", "voice": "sample.v"}. Output is a base64-encoded spectrogram prediction: {"data":"T5CE ...<truncated>... AAA=="}.
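The request and response shapes above can be handled with the Python standard library alone. The following is a minimal sketch, not code from this repository: the helper names build_request and decode_response are hypothetical, and the byte layout of the decoded spectrogram (dtype, shape, serialization) is not specified here, so the sketch stops at raw bytes.

```python
import base64
import json


def build_request(phonemes, voice):
    """Serialize a list of phoneme symbols into the JSON request body."""
    # Symbols are joined with single spaces, matching the "a <space> b" sample.
    return json.dumps({"text": " ".join(phonemes), "voice": voice})


def decode_response(body):
    """Return the raw spectrogram bytes from a JSON response body."""
    payload = json.loads(body)
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(payload["data"], validate=True)


if __name__ == "__main__":
    print(build_request(["a", "<space>", "b"], "sample.v"))
    # Round trip with a made-up payload standing in for a real response.
    fake_body = json.dumps(
        {"data": base64.b64encode(b"\x00\x01\x02").decode("ascii")}
    )
    print(decode_response(fake_body))
```

How the decoded bytes map to a spectrogram array depends on how the service serializes its output; check the deployment samples before interpreting them.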

Configuration

The service can load several models. It takes a configuration file as input; see deploy/cpu/voices.yaml for a sample. The service loads the model for the requested voice name and keeps it in memory until a request for a different voice arrives. Several models can be kept in memory at once by setting the WORKERS environment variable.
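The "keep the model until another voice is requested" behaviour can be illustrated with a small holder object. This is only a sketch of the idea under assumed names: SingleModelHolder and the loader callable are hypothetical and do not reflect the service's actual classes or the format of voices.yaml.

```python
class SingleModelHolder:
    """Keeps one loaded model and swaps it when the requested voice changes."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: voice name -> model
        self._voice = None
        self._model = None

    def get(self, voice):
        if voice != self._voice:
            # A different voice was requested: replace the resident model.
            self._model = self._loader(voice)
            self._voice = voice
        return self._model


if __name__ == "__main__":
    loads = []

    def fake_loader(voice):
        # Stand-in for real model loading; records each load.
        loads.append(voice)
        return f"model:{voice}"

    holder = SingleModelHolder(fake_loader)
    for v in ["a.v", "a.v", "b.v", "a.v"]:
        holder.get(v)
    print(loads)
```

With a single holder, alternating voices force a reload on every switch, which is the cost that keeping several models in memory is meant to reduce.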

A simple load balancer is also implemented. It tries to keep a model in memory while many requests are waiting for the same voice.
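One way to picture such a policy is a function that chooses which waiting request to serve next. The sketch below is an assumption about how a voice-aware balancer could work, not the algorithm used in this repository; pick_next and its arguments are hypothetical.

```python
from collections import Counter


def pick_next(pending, loaded_voice):
    """Return the index of the pending request to serve next.

    pending: list of voice names in arrival order.
    loaded_voice: the voice currently in memory, or None.
    """
    if not pending:
        raise ValueError("no pending requests")
    # Prefer the oldest request that needs no model swap.
    if loaded_voice in pending:
        return pending.index(loaded_voice)
    # Otherwise swap to the voice with the most waiting requests;
    # ties go to the voice that arrived first.
    counts = Counter(pending)
    best = max(counts, key=lambda v: counts[v])
    return pending.index(best)


if __name__ == "__main__":
    print(pick_next(["a.v", "b.v", "a.v"], "b.v"))
    print(pick_next(["a.v", "b.v", "b.v"], None))
```

A policy this simple can starve requests for less popular voices, so a practical balancer would also bound how long any request may wait.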

Sample usage

See deploy/cpu or deploy/gpu for deployment and testing samples.

ESPnet version 1

For ESPnet (version 1), see the ESPnet1 branch.


License

Copyright © 2021, Airenas Vaičiūnas.

Released under the 3-Clause BSD License.