Trained a model on a dataset with multiple speakers.
The quality is OK, but the model produces a random speaker's voice at inference.
Is there any way to control this? Is it possible to choose the voice?
How does the model choose which voice to use for inference?
What's interesting: the model picks the same voice for a given text (unless I edit anything in it, even a period or comma).
Hello, have you solved this issue? I'd like to use WaveGlow as my vocoder in a multi-speaker setting (VCTK Corpus).