Tacotron: Train TWEB dataset #22
Using the master branch gives very poor performance because of the very long sequence lengths in this dataset. To alleviate the problem I am trying Truncated Backpropagation Through Time (TBPTT).
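For reference, here is a minimal sketch of the TBPTT idea in PyTorch: the target mel sequence is split into fixed-size chunks and the recurrent state is detached between chunks so gradients never flow across chunk boundaries. The `model(text, mel_chunk, state)` interface is hypothetical and not how the decoder in this repo is actually organized; it only illustrates the chunk-and-detach pattern.

```python
import torch

def tbptt_train_step(model, optimizer, criterion, text, mels, chunk_size=64):
    """Truncated BPTT over the decoder time axis (illustrative sketch).

    Assumes a hypothetical model whose decoder exposes its recurrent
    state explicitly: model(text, mel_chunk, state) -> (mel_pred, state).
    """
    state = None
    total_loss = 0.0
    for start in range(0, mels.size(1), chunk_size):
        mel_chunk = mels[:, start:start + chunk_size]

        optimizer.zero_grad()
        mel_pred, state = model(text, mel_chunk, state)
        loss = criterion(mel_pred, mel_chunk)
        loss.backward()
        optimizer.step()

        # Detach the recurrent state so gradients stop at the chunk
        # boundary -- this is the "truncation" in TBPTT.
        state = tuple(s.detach() for s in state)
        total_loss += loss.item()
    return total_loss
```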
This dataset is of rather low quality: a low-pass filter has been applied to the audio. This leads to poor stop-token prediction and pronunciation errors, especially for novel words. Training with phonemes might improve the results. I also replaced ReLU with RReLU and removed Dropout in the prenet (see the sketch below). These changes improved the results, but they have yet to be tested on other datasets. Sound example: https://soundcloud.com/user-565970875/tweb-example-108k-iters-2810d57
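A minimal sketch of the prenet variant described above, with `nn.RReLU` in place of ReLU and no Dropout. The layer sizes (256 → 128) follow the common Tacotron prenet; the actual dimensions and module layout in this repo may differ.

```python
import torch.nn as nn

class Prenet(nn.Module):
    """Prenet sketch: RReLU activation, Dropout removed."""

    def __init__(self, in_dim, sizes=(256, 128)):
        super().__init__()
        dims = (in_dim,) + tuple(sizes)
        self.layers = nn.ModuleList(
            nn.Linear(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )
        self.activation = nn.RReLU()  # randomized leaky ReLU

    def forward(self, x):
        for linear in self.layers:
            # No Dropout here, unlike the original Tacotron prenet.
            x = self.activation(linear(x))
        return x
```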
As I just discovered, I trained the model with a 22050 Hz sampling rate, which is the default for LJSpeech, but TWEB is recorded at 12000 Hz. That might be an important bug.
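One hedged way to catch this kind of mismatch is to load the audio at its native rate and resample explicitly instead of letting the loader assume the LJSpeech default. The file name below is a placeholder.

```python
import librosa

# Load at the file's native rate (sr=None avoids silent resampling).
wav, native_sr = librosa.load("tweb_sample.wav", sr=None)
assert native_sr == 12000, f"unexpected sampling rate: {native_sr}"

# Resample only if the model config really expects 22050 Hz;
# otherwise set sample_rate=12000 in the config instead.
wav_22k = librosa.resample(wav, orig_sr=native_sr, target_sr=22050)
```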
@erogol Is there any pretrained TTS model that has been trained on a male voice? I have been trying to synthesize audio with the pretrained model (Tacotron-iter-108K on the TWEB dataset), but the commit it refers to (2810d57) is no longer present, and I get this error:
Dataset: https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset