High-quality multi-lingual text-to-speech library by MyShell.ai.

teedihuni/Melo_TTS_kor

Korean TTS

This repo uses MeloTTS to synthesize Korean text.

step 1 Install

The repo is developed and tested on Ubuntu 20.04 and Python 3.9.

pip install -e 
python -m unidic download

step 2 Weight download

Download the model weights from the MeloTTS Hugging Face page.

The 'config.json' file must also be downloaded.
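Before running inference, it can help to confirm that both files are in place. The sketch below assumes the checkpoint is saved as `checkpoint.pth` (the name used in the launch.json example in this README) next to `config.json` in one directory. `missing_weight_files` is a hypothetical helper, not part of MeloTTS.

```python
from pathlib import Path

# Assumed file names: "checkpoint.pth" follows the launch.json example in
# this README; adjust it if your downloaded checkpoint is named differently.
REQUIRED_FILES = ("checkpoint.pth", "config.json")


def missing_weight_files(weight_dir):
    """Return the required file names that are not present in weight_dir."""
    weight_dir = Path(weight_dir)
    return [name for name in REQUIRED_FILES if not (weight_dir / name).is_file()]
```

An empty list means both files were found; otherwise the list names what still needs to be downloaded.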

step 3 Inference

method 1

cd melo
python infer.py -t "<TEXT EXAMPLES>" -m "<weight_path>" -o "<result_path>" -l 'KR'

You can also change the voice speed.

The original infer.py does not take a speed argument, and the default speed is too slow for Korean.

So I added a speed argument (-sp) to customize it. A speed of 1.2 works well for Korean voices.

python infer.py -t "<TEXT EXAMPLES>" -m "<weight_path>" -o "<result_path>" -l 'KR' -sp 1.3
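For readers curious how a flag like `-sp` can be wired up, here is a minimal sketch using `argparse`. It is not the actual infer.py from this repo, which may use a different CLI library. Only the short flags (`-t`, `-m`, `-o`, `-l`, `-sp`) come from the commands above; the long option names and the default speed of 1.0 are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that accepts the short flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Korean TTS inference (sketch)")
    # Long option names are hypothetical; only the short flags come from the README.
    parser.add_argument("-t", "--text", required=True, help="text to synthesize")
    parser.add_argument("-m", "--ckpt_path", required=True, help="path to the checkpoint")
    parser.add_argument("-o", "--output_dir", required=True, help="directory for results")
    parser.add_argument("-l", "--language", default="KR", help="language code")
    # Assumed default of 1.0, meaning the speed is left unchanged.
    parser.add_argument("-sp", "--speed", type=float, default=1.0,
                        help="speech speed multiplier")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs as-is.
args = build_parser().parse_args(
    ["-t", "안녕하세요", "-m", "weight/checkpoint.pth", "-o", "result", "-sp", "1.2"]
)
print(args.speed)  # 1.2
```

The parsed `args.speed` value would then be passed on to the synthesis call.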

method 2

cd test
python test_base_model_tts_package.py

If you use this method, you need to pass the config and checkpoint paths when you create the TTS model.

That is, instead of 'model = TTS(language=language)', use 'model = TTS(language=language, config_path=config_path, ckpt_path=ckpt_path)'.
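A minimal sketch of method 2 as a standalone script is shown below. It assumes the weights live in one directory as `checkpoint.pth` and `config.json`, that the package is importable as `melo.api`, and that the Korean model exposes a speaker named "KR". `build_tts_kwargs` and `synthesize` are hypothetical helpers written for this example, not functions from the repo.

```python
from pathlib import Path


def build_tts_kwargs(weight_dir, language="KR"):
    """Collect the keyword arguments for TTS(...) from a local weight directory."""
    weight_dir = Path(weight_dir)
    return {
        "language": language,
        "config_path": str(weight_dir / "config.json"),
        "ckpt_path": str(weight_dir / "checkpoint.pth"),  # assumed file name
    }


def synthesize(text, weight_dir, output_path, speed=1.2):
    """Synthesize text to an audio file using locally downloaded weights."""
    # Imported lazily so build_tts_kwargs can be used without MeloTTS installed.
    from melo.api import TTS

    model = TTS(**build_tts_kwargs(weight_dir))
    # Assumption: the Korean model lists a speaker called "KR" in spk2id.
    speaker_id = model.hps.data.spk2id["KR"]
    model.tts_to_file(text, speaker_id, output_path, speed=speed)
```

For example, `synthesize("안녕하세요", "melo/weight", "result/hello.wav")` would write the audio to `result/hello.wav`, provided the paths match your setup.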

launch.json example for debugging in VS Code

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python Debugger: Current File",
            "type": "debugpy",
            "request": "launch",
            "program": "${file}",
            "args": [
                "-t","[TEXT]",
                "-m","MeloTTS/melo/weight/checkpoint.pth",
                "-o","MeloTTS/test/result",
                "-l","KR",
                "-sp","1.23"

            ],
            "console": "integratedTerminal",
            "justMyCode": false
        }

    ]
}

todo list

  • inference test [2024.05.02]
  • voice speed [2024.05.02]
  • voice conversion (in progress)
  • train code test
 

Introduction

MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai. It supports several languages, including Korean (language code 'KR'), which is the focus of this repo.

Usage

The Python API and model cards can be found in this repo or on HuggingFace.

Citation

@software{zhao2024melo,
  author={Zhao, Wenliang and Yu, Xumin and Qin, Zengyi},
  title = {MeloTTS: High-quality Multi-lingual Multi-accent Text-to-Speech},
  url = {https://github.com/myshell-ai/MeloTTS},
  year = {2023}
}

License

This library is under the MIT License, which means it is free for both commercial and non-commercial use.

Acknowledgements

This implementation is based on TTS, VITS, VITS2 and Bert-VITS2. We appreciate their awesome work.
