PyTorch implementation of the Conformer model, with a training script for end-to-end speech recognition on the LibriSpeech dataset.
python train.py --data_dir=./data --train_set=train-clean-100 --test_set=test-clean --checkpoint_path=model_best.pt
python train.py --load_checkpoint --checkpoint_path=model_best.pt
python train.py --use_amp
For a full list of command-line arguments, run python train.py --help. Smart batching is used by default but may need to be disabled for larger datasets. For valid train_set and test_set values, see torchaudio's LibriSpeech dataset. The model parameters default to the Conformer (S) configuration. For the Conformer (M) and Conformer (L) models, refer to the table below:
- Language Model (LM) implementation
- Multi-GPU support
- Support for the full LibriSpeech 960h train set
- Support for other decoders (e.g., a transformer decoder)
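
As a rough illustration of the smart batching mentioned above: a common approach is to group utterances of similar length into the same batch so that less padding is needed. The sketch below is a minimal, standalone version of that idea and is not taken from train.py; the names smart_batches and padding_cost and the example lengths are hypothetical. Sorting needs every utterance length up front, which can be slow to collect on a large dataset.

```python
import random


def smart_batches(lengths, batch_size, seed=0):
    """Group sample indices into batches of similar length.

    lengths: per-utterance lengths (e.g. number of audio frames).
    Returns a list of batches, each a list of indices into `lengths`.
    Hypothetical helper for illustration, not the repository's code.
    """
    # Sort indices so that neighbours in the order have similar lengths.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    # Cut the sorted order into contiguous chunks of at most batch_size.
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    # Shuffle the batch order so training does not always start with short clips.
    random.Random(seed).shuffle(batches)
    return batches


def padding_cost(lengths, batches):
    """Total padded frames when each batch is padded to its longest item."""
    total = 0
    for batch in batches:
        longest = max(lengths[i] for i in batch)
        total += sum(longest - lengths[i] for i in batch)
    return total


# Made-up utterance lengths: three short clips and three long ones, interleaved.
lengths = [120, 980, 130, 1000, 125, 990]
naive = [[0, 1, 2], [3, 4, 5]]  # batches taken in dataset order
smart = smart_batches(lengths, batch_size=3)

print("naive padding:", padding_cost(lengths, naive))
print("smart padding:", padding_cost(lengths, smart))
```

In this toy example, length-sorted batches keep the short clips together and the long clips together, so far fewer padded frames are needed than with batches taken in dataset order.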