Below are pre-trained acoustic and language models from *Who Needs Words? Lexicon-free Speech Recognition* (Likhomanenko et al., 2019).
File | Dataset | Dev Set | Architecture | Lexicon | Tokens |
---|---|---|---|---|---|
baseline_dev-clean+other | LibriSpeech | dev-clean+dev-other | Archfile | Lexicon | Tokens |
baseline_nov93dev | WSJ | nov93dev | Archfile | Lexicon | Tokens |
Convolutional language models (ConvLM) are trained with the fairseq toolkit, and n-gram language models are trained with the KenLM toolkit. The language models below have been converted into a binary format compatible with the wav2letter++ decoder.
Name | Dataset | Type | Vocab |
---|---|---|---|
lm_librispeech_convlm_char_20B | LibriSpeech | ConvLM 20B | LM Vocab |
lm_librispeech_convlm_word_14B | LibriSpeech | ConvLM 14B | LM Vocab |
lm_librispeech_kenlm_char_15g_pruned | LibriSpeech | 15-gram | - |
lm_librispeech_kenlm_char_20g_pruned | LibriSpeech | 20-gram | - |
lm_librispeech_kenlm_word_4g_200kvocab | LibriSpeech | 4-gram | - |
lm_wsj_convlm_char_20B | WSJ | ConvLM 20B | LM Vocab |
lm_wsj_convlm_word_14B | WSJ | ConvLM 14B | LM Vocab |
lm_wsj_kenlm_char_15g_pruned | WSJ | 15-gram | - |
lm_wsj_kenlm_char_20g_pruned | WSJ | 20-gram | - |
lm_wsj_kenlm_word_4g | WSJ | 4-gram | - |
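The character-level KenLM models above score transcripts as character sequences rather than word sequences, which is what makes lexicon-free decoding possible. The sketch below illustrates that idea in pure Python with a toy add-one-smoothed character n-gram model; the `|` word-boundary token, the helper names, and the smoothing scheme are assumptions for illustration only (KenLM itself estimates modified Kneser-Ney models and stores them in a binary trie).

```python
import math
from collections import Counter

# "|" as the word-boundary token is an assumption, mirroring the
# convention commonly used in wav2letter character token sets.
BOUNDARY = "|"

def tokenize(text):
    """Map a transcript to character tokens; spaces become boundary tokens."""
    return [BOUNDARY if c == " " else c for c in text.lower()]

class CharNgramLM:
    """Toy count-based character n-gram LM with add-one smoothing.

    This only illustrates scoring character sequences; it is not
    KenLM's estimation or storage scheme.
    """

    def __init__(self, order, corpus):
        self.order = order
        self.ngrams = Counter()
        self.contexts = Counter()
        self.vocab = set()
        for line in corpus:
            toks = ["<s>"] * (order - 1) + tokenize(line) + ["</s>"]
            self.vocab.update(toks)
            for i in range(order - 1, len(toks)):
                ctx = tuple(toks[i - order + 1:i])
                self.ngrams[ctx + (toks[i],)] += 1
                self.contexts[ctx] += 1

    def log_prob(self, text):
        """Base-10 log probability of a transcript, character by character."""
        toks = ["<s>"] * (self.order - 1) + tokenize(text) + ["</s>"]
        v = len(self.vocab)
        lp = 0.0
        for i in range(self.order - 1, len(toks)):
            ctx = tuple(toks[i - self.order + 1:i])
            lp += math.log10((self.ngrams[ctx + (toks[i],)] + 1)
                             / (self.contexts[ctx] + v))
        return lp

# Tiny corpus: in-domain text scores higher than gibberish of equal length.
lm = CharNgramLM(3, ["the cat sat", "the cat ran", "a cat sat"])
in_domain = lm.log_prob("the cat")
gibberish = lm.log_prob("xqz zqx")
```

In the actual recipes, the decoder consumes the pre-built KenLM binaries listed above directly; this sketch is only meant to show why a 15- or 20-gram character model can stand in for a word lexicon.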
```bibtex
@article{likhomanenko2019needs,
  title={Who needs words? lexicon-free speech recognition},
  author={Likhomanenko, Tatiana and Synnaeve, Gabriel and Collobert, Ronan},
  journal={arXiv preprint arXiv:1904.04479},
  year={2019}
}
```