Add deployment support wav2vec2.0 via torchaudio #3609
Conversation
@@ -199,6 +199,48 @@ loss = model(input_values, labels=labels).loss
loss.backward()
```

## Deploying wav2vec 2.0 with torchaudio

`torchaudio` has added a wav2vec 2.0 model definition that supports TorchScript, along with functions to import model instances from `fairseq` or 🤗 Transformers. By using TorchScript, you can deploy your wav2vec 2.0 model to ONNX Runtime, [C++](https://github.com/pytorch/audio/tree/master/examples/libtorchaudio/speech_recognition), [iOS](https://github.com/pytorch/ios-demo-app/tree/master/SpeechRecognition), and [Android](https://github.com/pytorch/android-demo-app/tree/master/SpeechRecognition).
Examples for iOS and Android are being updated to use torchaudio.
- torchaudio based wav2vec2 with no model input length limit pytorch/android-demo-app#141
- updated script and iOS code to use torchaudio 0.9 based wav2vec2 model with no input limit pytorch/ios-demo-app#53
ONNX support via TorchScript is reported to work here
looks great!
examples/wav2vec/README.md (outdated)
from torchaudio.models.wav2vec2.utils import import_fairseq_model

original, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task(
    ["wav2vec_small_960h.pt"], arg_overrides={'data': "<DIRECTORY_WITH_DICTIONARY>"})
in newer models, the dictionary is actually stored inside the model checkpoint rather than externally. we could probably convert older checkpoints to follow this new format if it will simplify things
@alexeib I removed the `arg_overrides`. Did you update the public checkpoints? My concern is that users who just want to try the published models (if those are not converted to the new format) will encounter an issue when loading them.
no, i have not updated the old checkpoints yet. maybe we can make the data dir an optional argument, rather than requiring users to always provide it (or always omit it)?
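The optional-data-dir idea above could look roughly like this; `build_arg_overrides` is a hypothetical helper written only to illustrate the suggestion, not code from this PR or from fairseq.

```python
def build_arg_overrides(data_dir=None):
    """Build fairseq-style arg_overrides for loading a wav2vec 2.0 checkpoint.

    Newer checkpoints bundle the dictionary, so no override is needed;
    older checkpoints still need a directory containing the dictionary.
    (Hypothetical helper sketching the suggestion above.)
    """
    return None if data_dir is None else {"data": data_dir}


# Usage sketch (requires fairseq; commented out for illustration):
# original, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task(
#     ["wav2vec_small_960h.pt"], arg_overrides=build_arg_overrides(data_dir))

print(build_arg_overrides())              # → None (new-format checkpoint)
print(build_arg_overrides("/data/dict"))  # → {'data': '/data/dict'}
```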
This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated.
Closing this pull request after a prolonged period of inactivity. If this issue is still present in the latest release, please ask for this pull request to be reopened. Thank you!
What does this PR do?
In the upcoming PyTorch 1.9 / torchaudio 0.9 release, torchaudio supports TorchScript-able wav2vec 2.0 model definitions. This PR adds an illustration of how to convert models from `fairseq` and `transformers` into a deployable package.

cc @myleott @alexeib