any list of all 36 voices? #95

Closed
OpenMachinesAI opened this issue Aug 9, 2024 · 7 comments

Comments

@OpenMachinesAI

just want a list

@LuckyMcBeast

Yes, please. I looked everywhere to find one.

@Yazorp

Yazorp commented Aug 9, 2024

From this Reddit discussion: https://www.reddit.com/r/LocalLLaMA/comments/1encx98/improved_text_to_speech_model_parler_tts_v1_by/

Laura
Gary
Jon
Lea
Karen
Rick
Brenda
David
Eileen
Jordan
Mike
Yann
Joy
James
Eric
Lauren
Rose
Will
Jason
Aaron
Naomie
Alisa
Patrick
Jerry
Tina
Jenna
Bill
Tom
Carol
Barbara
Rebecca
Anna
Bruce
Emily

@ylacombe
Contributor

Hey, the previous list is indeed correct!
However, I've found that the models reproduce some speakers more consistently than others, namely:

Large - Top 20:

Will       0.906055
Eric       0.887598
Laura      0.877930
Alisa      0.877393
Patrick    0.873682
Rose       0.873047
Jerry      0.871582
Jordan     0.870703
Lauren     0.867432
Jenna      0.866455
Karen      0.866309
Rick       0.863135
Bill       0.862207
James      0.856934
Yann       0.856787
Emily      0.856543
Anna       0.848877
Jon        0.848828
Brenda     0.848291
Barbara    0.847998

Mini - Top 20:

Jon        0.908301
Lea        0.904785
Gary       0.903516
Jenna      0.901807
Mike       0.885742
Laura      0.882666
Lauren     0.878320
Eileen     0.875635
Alisa      0.874219
Karen      0.872363
Barbara    0.871509
Carol      0.863623
Emily      0.854932
Rose       0.852246
Will       0.851074
Patrick    0.850977
Eric       0.845459
Rick       0.845020
Anna       0.844922
Tina       0.839160

Would you like to add all of this information to the repo somewhere? If so, feel free to open a PR!
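For anyone who wants a voice that works on both checkpoints, the two top-20 lists above can be intersected. The following is only a small sketch: the scores are copied from the lists in this comment, while the variable names and the choice to rank by the lower of the two scores are illustrative assumptions, not something the maintainers prescribe.

```python
# Scores copied verbatim from the two top-20 lists in this thread.
LARGE_TOP20 = {
    "Will": 0.906055, "Eric": 0.887598, "Laura": 0.877930, "Alisa": 0.877393,
    "Patrick": 0.873682, "Rose": 0.873047, "Jerry": 0.871582, "Jordan": 0.870703,
    "Lauren": 0.867432, "Jenna": 0.866455, "Karen": 0.866309, "Rick": 0.863135,
    "Bill": 0.862207, "James": 0.856934, "Yann": 0.856787, "Emily": 0.856543,
    "Anna": 0.848877, "Jon": 0.848828, "Brenda": 0.848291, "Barbara": 0.847998,
}
MINI_TOP20 = {
    "Jon": 0.908301, "Lea": 0.904785, "Gary": 0.903516, "Jenna": 0.901807,
    "Mike": 0.885742, "Laura": 0.882666, "Lauren": 0.878320, "Eileen": 0.875635,
    "Alisa": 0.874219, "Karen": 0.872363, "Barbara": 0.871509, "Carol": 0.863623,
    "Emily": 0.854932, "Rose": 0.852246, "Will": 0.851074, "Patrick": 0.850977,
    "Eric": 0.845459, "Rick": 0.845020, "Anna": 0.844922, "Tina": 0.839160,
}

# Speakers that appear in both top-20 lists.
common = set(LARGE_TOP20) & set(MINI_TOP20)

# Rank by the worse of the two scores (an assumed heuristic), so a speaker
# only ranks high if both models reproduce the voice well.
ranked = sorted(
    ((name, min(LARGE_TOP20[name], MINI_TOP20[name])) for name in common),
    key=lambda item: item[1],
    reverse=True,
)

for name, worst_score in ranked:
    print(f"{name:<8} {worst_score:.6f}")
```

Fourteen speakers appear in both lists. Ranking by the lower score is just one option; averaging the two scores would be an equally reasonable choice.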

@dgm3333

dgm3333 commented Aug 13, 2024

What do the numbers you've included represent? I'm guessing they might be WER, generation speed, or some other accuracy measure.
The list of names is already here: examples/prompt_creation/speaker_ids_to_names.json
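A rough sketch of reading the names out of that file. The layout of `speaker_ids_to_names.json` is not shown in this thread, so the code assumes a flat JSON object mapping speaker IDs to names; `load_speaker_names` is a hypothetical helper, not part of the repository.

```python
import json


def load_speaker_names(path):
    """Return the sorted, de-duplicated speaker names found in the JSON file.

    Assumes (unverified) that the file is a flat object such as
    {"<speaker id>": "<name>", ...}.
    """
    with open(path, encoding="utf-8") as handle:
        mapping = json.load(handle)
    if not isinstance(mapping, dict):
        raise TypeError("expected a JSON object mapping speaker IDs to names")
    return sorted({str(name) for name in mapping.values()})


if __name__ == "__main__":
    # Path as given in this comment, relative to the repository root.
    for name in load_speaker_names("examples/prompt_creation/speaker_ids_to_names.json"):
        print(name)
```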

@ylacombe
Contributor

ylacombe commented Aug 13, 2024

The numbers represent the average speaker similarity between a random snippet of the real person speaking and a random Parler-generated snippet. The higher the score, the better the model keeps the voice consistent.
Numbers are from this dataset for Mini and this dataset for Large.
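To make the idea of an "average speaker similarity" concrete, here is a minimal sketch. The thread does not say which speaker-embedding model or similarity metric was used, so cosine similarity over precomputed embedding vectors is an assumption, and both function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def average_speaker_similarity(reference_embeddings, generated_embeddings):
    """Mean cosine similarity over (real snippet, generated snippet) pairs.

    Each argument is a list of embedding vectors; item i of one list is
    compared with item i of the other. How the snippets are sampled and
    embedded is left to the caller (assumed, not taken from the thread).
    """
    if len(reference_embeddings) != len(generated_embeddings):
        raise ValueError("need the same number of reference and generated snippets")
    if not reference_embeddings:
        raise ValueError("need at least one pair of snippets")
    scores = [
        cosine_similarity(ref, gen)
        for ref, gen in zip(reference_embeddings, generated_embeddings)
    ]
    return sum(scores) / len(scores)
```

Under this assumed metric, a score near 1 means the generated snippets sit close to the real speaker in embedding space, which matches the "higher is better" reading above.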

@kdcyberdude

@ylacombe, how is the similarity score calculated? Did you use a specific speaker embedding model to obtain it?

@ylacombe
Contributor

Closed by #141!

6 participants