
Add pretrained models #435

Closed
breandan opened this issue Mar 11, 2017 · 41 comments

@breandan

Is it possible to add links to some pretrained models? I would like to test the performance on some real world speech, but could not find any reference to these in the docs. Thanks!

@kdavis-mozilla
Contributor

We're planning on adding links to some pre-trained models as soon as we're satisfied with their word error rate (WER).

We're targeting a WER of at most 10% on the TED test set and hope that, within the next few weeks, we can release a model.
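For reference, WER is the word-level edit distance (insertions, deletions, substitutions) between the hypothesis and the reference transcript, divided by the number of reference words. A minimal sketch of the metric, not the project's own evaluation code:

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j]: edit distance between first i reference words and first j hypothesis words
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1] / len(r)

# Two deleted words out of six reference words:
print(round(wer("the cat sat on the mat", "the cat sat mat"), 3))  # → 0.333
```

So "at most 10%" means at most one word error per ten reference words on average.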

@gvoysey
Contributor

gvoysey commented Apr 4, 2017

@kdavis-mozilla is there any update on this line of work?

@kdavis-mozilla
Contributor

@gvoysey I wish I could give you the models right now. But I can't. Sorry. The WER still needs to be tuned.

We're still waiting on our new hardware, which will allow us to tune hyperparameters with a quicker turnaround. But it's not here yet.

Unfortunately the ETA for the hardware looks to be about 4 weeks out now. Then, once we have the hardware, we'll still need to tune for about a week or two.

Again, sorry.

@gvoysey
Contributor

gvoysey commented Apr 5, 2017

@kdavis-mozilla no worries. I have been thrashing some GPUs myself (on LibriVox); hopefully they'll finish soon!

@jacobjennings

I'm curious how large a trained model from a large dataset like TED is on disk, as a rough estimate?

@gvoysey
Contributor

gvoysey commented Apr 14, 2017

I have a trained LibriVox model which is ~800 MB. I've found DeepSpeech 2 implementations in other frameworks (Torch) that are ~600 MB. These facts are anecdata. :)
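Those sizes are consistent with a simple rule of thumb: a float32 checkpoint weighs roughly 4 bytes per parameter (more if optimizer state is saved alongside the weights). A quick sketch of the arithmetic, with illustrative numbers only:

```python
def model_size_mb(n_params, bytes_per_param=4):
    """Approximate on-disk size in MB for raw float32 weights."""
    return n_params * bytes_per_param / 1e6

# An ~800 MB float32 model implies on the order of 200 million parameters:
implied_params = 800e6 / 4
print(implied_params / 1e6)      # → 200.0 (millions of parameters)
print(model_size_mb(implied_params))  # → 800.0
```

Checkpoints that also store optimizer state (e.g. Adam's two moment vectors per weight) can be roughly three times larger than the exported inference graph.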

@phasnox

phasnox commented May 3, 2017

Do you know how long it will take to train on the TED dataset if I have a GTX 1080?

@kdavis-mozilla
Contributor

@phasnox We use 4 Titan X's and it takes about 5 days. So, I'd guess on the order of 20 days.
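That estimate is simple GPU-count scaling (4 GPUs × 5 days ≈ 20 GPU-days). A back-of-envelope sketch, assuming near-linear multi-GPU scaling; the per-card FP32 throughput figures below are rough assumptions, not numbers from this thread:

```python
# Scale "5 days on 4x Titan X (Pascal)" to a single GTX 1080.
titan_x_pascal_tflops = 11.0  # approximate FP32 TFLOPS (assumption)
gtx_1080_tflops = 9.0         # approximate FP32 TFLOPS (assumption)

days_on_four_titans = 5.0
est_days_single_1080 = days_on_four_titans * 4 * (titan_x_pascal_tflops / gtx_1080_tflops)
print(round(est_days_single_1080, 1))  # → 24.4
```

Pure GPU-count scaling gives 20 days; accounting for the GTX 1080's somewhat lower throughput nudges that toward ~24 days, so "on the order of 20 days" is a fair summary.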

@getnamo

getnamo commented May 15, 2017

@kdavis-mozilla Maxwell Titan X, Titan X (Pascal), or Titan Xp? I hate that we need the distinction...

@kdavis-mozilla
Contributor

@getnamo Titan X (Pascal)

There are only two hard things in Computer Science:
cache invalidation and naming things.
                                -- Phil Karlton

But I don't think NVIDIA even tried on this one.

@gvoysey
Contributor

gvoysey commented May 16, 2017 via email

@striki70

Any news on links to trained models?

@striki70

@phasnox 20 days for how many epochs? WER achieved?

@phasnox

phasnox commented May 16, 2017

@striki70 Yep, the WER achieved would be nice to know. But I haven't tried yet; I need to get a better cooling system first.

@gvoysey
Contributor

gvoysey commented May 22, 2017

@kdavis-mozilla just checking in on any updates you have on trained models.

@kdavis-mozilla
Contributor

Just got the hardware and installed it last week. Now we're working through a few OOM issues which appear in the cluster setting and not in the single node setting. So, it's a work in progress.

@pythonmobile

Waiting for it :) Thanks.

@AnimeshKoratana

Are there any updates regarding the pretrained models?

@reuben
Contributor

reuben commented Jun 29, 2017

We're training a bunch of models and getting ready to share them. It's coming soon! :)

@ThejanW

ThejanW commented Jul 3, 2017

Are they ready now?

@reuben
Contributor

reuben commented Jul 3, 2017

We'll comment here when they're available.

@gardenia22
Contributor

I tested the TED dataset against the Google API using the autosub tool with a little modification. The Google API achieved a 27.3162% WER. Is 10% WER aiming too high?

@mozilla mozilla locked and limited conversation to collaborators Jul 3, 2017
@reuben
Contributor

reuben commented Jul 3, 2017

That's off topic for this issue. Please join our IRC channel for questions and discussions. I'm locking this issue to try to get people to stick to the proper communication channels.

@reuben
Contributor

reuben commented Nov 26, 2017

We have a first release of an American English model available on our releases page: https://github.com/mozilla/DeepSpeech/releases/latest

Please check it out and experiment; we're excited to see what you can do with it. We also set up discussion forums on Discourse; check the release notes for links.

@reuben reuben closed this as completed Nov 26, 2017
@saikishor

@reuben We really appreciate the work and effort you have put in to provide us with the model. It works amazingly well. It would be even more helpful if you could provide the checkpoints for the above model.

@nicolaspanel
Contributor

@reuben @kdavis-mozilla and others, thanks for the great work (amazing, really!)
Do you plan to share pre-trained models for other languages (e.g. French) in future releases as well?

@kdavis-mozilla
Contributor

@nicolaspanel Thanks, and yes we want to share models for as many languages as we can!

@nicolaspanel
Contributor

@kdavis-mozilla great 👍 Do you have any ETA (none mentioned in current projects)?
PS: I would be happy to help clean up/prepare datasets if needed.

@kdavis-mozilla
Contributor

@nicolaspanel For other languages the timing depends upon the rate at which Common Voice collects data for the language.

For example, a couple of weeks ago data donation started for French. The faster Common Voice collects data for a particular language, e.g. French, the faster we can bring you a model for that language.

So to a very large extent the timings will be determined by the Common Voice community.

@lissyx
Collaborator

lissyx commented Jun 13, 2018

@nicolaspanel For French we also deeply need to diversify our sources. If you're interested, you can find information at https://github.com/mozfr/besogne/wiki/Common-Voice-Fr

@Sorkanius

Is there any news on the next trained model release?

Thanks a lot, your work is amazing!

@kdavis-mozilla
Contributor

No real updates yet, but as we just started collecting in other languages, we need to give it a bit more time before we have enough data to train on.

@kdavis-mozilla
Contributor

PS: And thanks for the compliment!

@Sorkanius

Are you working towards increasing the dataset or trying new networks? Or both maybe?

@kdavis-mozilla
Contributor

Right now we're trying to get more data through Common Voice.

@Sorkanius

Have you thought about data mining movies/series with their subtitles?

@kdavis-mozilla
Contributor

Licensing issues prevent this.

@Sorkanius

Oh, I understand. Thanks again for your time, and keep up the great work!

@akshat9425

Is there any pretrained model for Indian English? Please share it if it exists.

@b-ak
Contributor

b-ak commented Sep 17, 2018

@akshat9425 A couple of us are working towards something along similar lines. We could explore the possibility of collaborating. Write to me: bak0@protonmail.com

@lock

lock bot commented Jan 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Jan 2, 2019