❓ Questions & Help
I'm learning how to use the LibriSpeech dataset to train the Squeezeformer network.
After 20 epochs of training, both the evaluation WER (0.6368) and CER (0.4251) are still very high and no longer improving (training WER 0.6405, CER 0.4278).
These results seem inconsistent with the accuracy claimed in the paper (CER, WER < 0.1), so I suspect something is wrong with my setup.
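For reference, this is how I understand the two metrics I'm reporting: both are normalized edit distances, WER over words and CER over characters. A minimal sketch (my own, not OpenSpeech's implementation) that I use to double-check reported numbers:

```python
def edit_distance(ref, hyp):
    # classic one-row dynamic-programming Levenshtein distance;
    # works on strings (characters) or lists (words)
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # old dp[j] = deletion, new dp[j-1] = insertion, prev = substitution/match
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def wer(ref, hyp):
    # word error rate: word-level edit distance / number of reference words
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref, hyp):
    # character error rate: character-level edit distance / reference length
    return edit_distance(ref, hyp) / len(ref)

print(wer("the cat sat on the mat", "the cat sat on a mat"))  # 1 substitution out of 6 words
```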
(1) Has anyone achieved CER/WER below 0.1 using the OpenSpeech code with the LibriSpeech dataset?
(2) Which tokenizer should I use to get good accuracy (libri_subword or libri_character)? I am currently using libri_subword.
(3) Is my training script correct?
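For context on question (2), my understanding is that a character tokenizer emits one label per character, while a subword tokenizer (e.g. BPE/unigram, as in libri_subword) merges frequent pieces into single units, giving a larger vocabulary but shorter label sequences. A toy illustration of the difference (not the actual OpenSpeech tokenizers):

```python
def char_tokenize(text):
    # character-level: one label per character (space kept as its own token)
    return list(text)

def subword_tokenize(text, vocab):
    # greedy longest-match over a toy subword vocabulary (illustrative only);
    # real subword tokenizers (BPE/unigram) are trained on the corpus
    tokens, i = [], 0
    while i < len(text):
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab or end == i + 1:  # fall back to single characters
                tokens.append(piece)
                i = end
                break
    return tokens

print(char_tokenize("there"))                       # ['t', 'h', 'e', 'r', 'e']
print(subword_tokenize("there", {"the", "re"}))     # ['the', 're']
```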
Details
(a) The training and evaluation dataset settings in preprocess.py are as follows:
LIBRI_SPEECH_DATASETS = [
    "train-960",
    "dev-clean",
    "dev-other",
    "test-clean",
    "test-other",
]
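As a quick sanity check before preprocessing, I verify that every split listed above actually exists on disk. The dataset root here ("/dataSSD/LibriSpeech") is an assumption based on my manifest path; adjust it for your setup:

```python
import os

# same split list as in preprocess.py above
LIBRI_SPEECH_DATASETS = [
    "train-960", "dev-clean", "dev-other", "test-clean", "test-other",
]

def missing_splits(root, splits):
    # return the split directories that are not found under the dataset root
    return [s for s in splits if not os.path.isdir(os.path.join(root, s))]

# "/dataSSD/LibriSpeech" is assumed from my manifest path; change as needed
print(missing_splits("/dataSSD/LibriSpeech", LIBRI_SPEECH_DATASETS))
```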
(b) My training script is as follows:
python ./openspeech_cli/hydra_train.py \
    dataset="librispeech" \
    dataset.dataset_download=False \
    dataset.dataset_path="/dataSSD/" \
    dataset.manifest_file_path="/dataSSD/LibriSpeech/libri_subword_manifest.txt" \
    tokenizer=libri_subword \
    tokenizer.vocab_path="/dataSSD/LibriSpeech" \
    model=squeezeformer_lstm \
    audio=fbank \
    trainer=gpu \
    lr_scheduler=warmup_reduce_lr_on_plateau \
    ++trainer.accelerator=dp \
    ++trainer.batch_size=128 \
    ++trainer.num_workers=4 \
    criterion=cross_entropy
(c) The training results after 19 epochs are as follows: