Refine librispeech.py for DeepSpeech2. #78
Summary: 1. Add manifest line check. 2. Avoid re-unpacking if unpacked data already exists. 3. Add full_download (download all 7 sub-datasets of LibriSpeech).
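Item 2 of the summary could be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: the helper name `unpack_once` is hypothetical, and treating a non-empty target directory as "already unpacked" is an assumption.

```python
import os
import tarfile


def unpack_once(archive_path, target_dir):
    """Unpack archive_path into target_dir unless target_dir already has content.

    Returns True if the archive was unpacked, False if unpacking was skipped.
    """
    # Assumption: a non-empty target directory means an earlier run unpacked it.
    if os.path.isdir(target_dir) and os.listdir(target_dir):
        return False
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        tar.extractall(target_dir)
    return True
```

A stricter variant might write a marker file after a successful unpack, so that an interrupted extraction is not mistaken for a complete one.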
Almost LGTM. Please explain why the line count needs to be checked when the MD5 has already been checked.
@@ -18,6 +18,7 @@ For some machines, we also need to install libsndfile1. Details to be added.
```
cd data
python librispeech.py
cat manifest.libri.train-* > manifest.libri.train-all
Should we introduce the meaning of the manifest.libri.train-* files here?
I see that the details are given in the following section, but the manifest file is mentioned abruptly at this point.
Done.
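For reference, the `cat manifest.libri.train-* > manifest.libri.train-all` step can also be written in Python. This is a sketch only: the function name `concat_manifests` is hypothetical, and it deliberately skips the output file, since `manifest.libri.train-all` itself matches the `manifest.libri.train-*` pattern once it exists.

```python
import glob
import os


def concat_manifests(pattern, output_path):
    """Concatenate all files matching pattern into output_path, in sorted order.

    Returns the list of input files that were concatenated.
    """
    out_abs = os.path.abspath(output_path)
    # Resolve the inputs before opening the output, and skip the output file
    # in case it is left over from an earlier run and matches the pattern.
    parts = sorted(p for p in glob.glob(pattern) if os.path.abspath(p) != out_abs)
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                out.write(src.read())
    return parts
```

Calling `concat_manifests("manifest.libri.train-*", "manifest.libri.train-all")` from the `data` directory would then give the same result on the first run and on later runs.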
deep_speech_2/data/librispeech.py
Outdated
MD5_TRAIN_CLEAN_100 = "2a93770f6d5c6c964bc36631d331a522"
MD5_TRAIN_CLEAN_360 = "c0e676e450a7ff2f54aeade5171606fa"
MD5_TRAIN_OTHER_500 = "d1a0fd59409feb2c614ce4d30c387708"

NUM_LINES_TEST_CLEAN = 2620
Please explain why it's necessary to check the line number when MD5 has been checked.
I'm also confused by this.
Removed.
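As context for this thread, an MD5 check on a downloaded archive typically looks something like the sketch below. The helper names `md5_of_file` and `verify_download` are hypothetical and are not taken from librispeech.py. Constants such as `MD5_TRAIN_CLEAN_100` would be passed as `expected_md5`.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def verify_download(path, expected_md5):
    """Raise RuntimeError if the file at path does not match expected_md5."""
    actual = md5_of_file(path)
    if actual != expected_md5:
        raise RuntimeError(
            "MD5 mismatch for %s: expected %s, got %s" % (path, expected_md5, actual)
        )
```

If the archive passes this check and the manifest is generated deterministically from the unpacked files, a separate line-count constant such as `NUM_LINES_TEST_CLEAN` adds little, which appears to be the reviewers' concern.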
LGTM
resolve #77