RNN vs LSTM? #498
Comments
Generally LSTMs out-perform vanilla RNNs. Also, we don't do low-level parallelization as Baidu did, with forward and backward passes of the bi-directional RNN occurring on different GPUs, then having the GPUs exchange roles. However, we do have an open issue #362 to explore the difference in LSTM WER vs RNN WER. If you have access to training hardware and want to tackle issue #362, feel free to explore on the TED data set. However, as a warning, it will take some compute power to tune all the hyperparameters.
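For readers unfamiliar with the difference being discussed, the two recurrences can be sketched for a single scalar hidden unit. This is an illustrative toy, not DeepSpeech's actual implementation; the function names and the per-gate parameter layout are assumptions made for the sketch:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def rnn_step(x, h, w_x, w_h, b):
    # Vanilla RNN: a single tanh update. Cheap and easy to parallelize,
    # but gradients tend to vanish over long sequences.
    return math.tanh(w_x * x + w_h * h + b)

def lstm_step(x, h, c, p):
    # LSTM: input (i), forget (f), and output (o) gates plus a candidate (g).
    # p is a dict of per-gate weights, e.g. p["i"] = (w_x, w_h, b)  (illustrative layout).
    z = {k: p[k][0] * x + p[k][1] * h + p[k][2] for k in ("i", "f", "o", "g")}
    i, f, o = sigmoid(z["i"]), sigmoid(z["f"]), sigmoid(z["o"])
    g = math.tanh(z["g"])
    c_new = f * c + i * g            # additive cell-state path: long-range memory
    h_new = o * math.tanh(c_new)
    return h_new, c_new
```

The extra gates roughly quadruple the parameters and compute per step, which is the trade-off behind both Baidu's choice to skip LSTM circuits and the open question in #362.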
Thanks for the fast reply. May I ask what your current WER with LSTM is?
It depends on the data set. For example, on the full librivox test set the WER was about 22%, and on the clean subset of the librivox test data it was about 12%. However, we haven't really had time to tune on the librivox data set, as we're waiting on new hardware that would allow a quicker turn-around for training, and we need to tune the language model too. So these numbers are only a first pass on the data set.
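For reference, WER is conventionally the word-level Levenshtein distance (substitutions + insertions + deletions) between hypothesis and reference, normalized by the reference length. A minimal sketch; the function name is illustrative, not the project's actual scoring code:

```python
def wer(reference, hypothesis):
    # Word error rate via word-level Levenshtein distance (dynamic programming).
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                       # delete all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j                       # insert all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("a b c d", "a x c")` is 0.5: one substitution and one deletion over a four-word reference.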
@joyousrabbit I'm going to close this as it seems the associated question has been answered.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Hello, Mozilla,
Why does your project use LSTM instead of a plain RNN?
The paper says: "we have limited ourselves to a single recurrent layer (which is the hardest to parallelize) and we do not use Long-Short-Term-Memory (LSTM) circuits."