Add C++ runtime and Python APIs for Moonshine models #1473

Merged Oct 26, 2024 (7 commits)

csukuangfj (Collaborator)

Note that Moonshine tiny is about 3x faster than Whisper tiny.en in the single-threaded CPU test below (RTF 0.031 vs. 0.095; lower is faster).

RTF on my MacBook Pro (CPU, 1 thread)

| | Moonshine tiny | Whisper tiny.en |
|---|---|---|
| RTF | 0.031 | 0.095 |

Speed test: generating subtitles (1 thread, CPU, on my MacBook Pro)

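Please first download the model files from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models. The commands further down also expect ./silero_vad.onnx, which the download steps here do not fetch.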

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
tar xf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2
tar xf sherpa-onnx-whisper-tiny.en.tar.bz2
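For anyone scripting the setup, the download and extraction steps above can be approximated in Python using only the standard library. This is a minimal sketch, not part of the PR: the base URL and asset names are copied from the wget commands above, while the helper names `asset_url` and `fetch` are hypothetical.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL and asset names are copied from the wget commands above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models"
ASSETS = [
    "Obama.wav",
    "sherpa-onnx-moonshine-tiny-en-int8.tar.bz2",
    "sherpa-onnx-whisper-tiny.en.tar.bz2",
]


def asset_url(name: str) -> str:
    """Return the release download URL for one asset (hypothetical helper)."""
    return f"{BASE_URL}/{name}"


def fetch(name: str, dest: Path = Path(".")) -> Path:
    """Download one asset unless it already exists, then unpack .tar.bz2 archives."""
    target = dest / name
    if not target.exists():
        urllib.request.urlretrieve(asset_url(name), target)
    if name.endswith(".tar.bz2"):
        # Equivalent of `tar xf <archive>` for a bzip2-compressed tarball.
        with tarfile.open(target, "r:bz2") as tar:
            tar.extractall(dest)
    return target
```

Calling `fetch(name)` for each entry in `ASSETS` should leave the same files and directories in the working directory as the shell commands do.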

Moonshine tiny

./python-api-examples/generate-subtitles.py  \
  --silero-vad-model=./silero_vad.onnx \
  --moonshine-preprocessor=./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx \
  --moonshine-encoder=./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx \
  --moonshine-uncached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx \
  --moonshine-cached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx \
  --tokens=./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt \
  --num-threads=1 \
  /Users/fangjun/Desktop/Obama.mov

Output

Started!
Saved to /Users/fangjun/Desktop/Obama.srt
Audio duration: 335.235 s
Elapsed:        10.464 s
RTF = 10.464/335.235 = 0.031
Done!

Whisper tiny.en

./python-api-examples/generate-subtitles.py  \
  --silero-vad-model=./silero_vad.onnx \
  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
  --num-threads=1 \
  /Users/fangjun/Desktop/Obama.mov
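To repeat the comparison on another machine, the two invocations above could be assembled and timed from one small Python driver. This is a sketch under assumptions: the script path, flags, and model file names are copied from the commands above, but the helper names (`moonshine_cmd`, `whisper_cmd`, `run_timed`) are hypothetical, and the wall-clock time measured here covers the whole process (including model loading), so it need not match the "Elapsed" value the script prints.

```python
import subprocess
import time

# Script path as used in the commands above.
SCRIPT = "./python-api-examples/generate-subtitles.py"


def moonshine_cmd(model_dir, media, vad="./silero_vad.onnx", num_threads=1):
    """Build the Moonshine argv list, mirroring the flags shown above."""
    return [
        SCRIPT,
        f"--silero-vad-model={vad}",
        f"--moonshine-preprocessor={model_dir}/preprocess.onnx",
        f"--moonshine-encoder={model_dir}/encode.int8.onnx",
        f"--moonshine-uncached-decoder={model_dir}/uncached_decode.int8.onnx",
        f"--moonshine-cached-decoder={model_dir}/cached_decode.int8.onnx",
        f"--tokens={model_dir}/tokens.txt",
        f"--num-threads={num_threads}",
        media,
    ]


def whisper_cmd(model_dir, media, prefix="tiny.en",
                vad="./silero_vad.onnx", num_threads=1):
    """Build the Whisper argv list, mirroring the flags shown above."""
    return [
        SCRIPT,
        f"--silero-vad-model={vad}",
        f"--whisper-encoder={model_dir}/{prefix}-encoder.int8.onnx",
        f"--whisper-decoder={model_dir}/{prefix}-decoder.int8.onnx",
        f"--tokens={model_dir}/{prefix}-tokens.txt",
        f"--num-threads={num_threads}",
        media,
    ]


def run_timed(cmd):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start
```

For example, `run_timed(moonshine_cmd("./sherpa-onnx-moonshine-tiny-en-int8", "Obama.mov"))` would launch the first command with a local media file.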

Output

Started!
Saved to /Users/fangjun/Desktop/Obama.srt
Audio duration: 335.235 s
Elapsed:        31.809 s
RTF = 31.809/335.235 = 0.095
Done!

Config of my MacBook Pro

[Screenshot of the MacBook Pro configuration, 2024-10-26 at 14:18:31]

@csukuangfj csukuangfj merged commit 669f5ef into k2-fsa:master Oct 26, 2024
23 of 201 checks passed
@csukuangfj csukuangfj deleted the cpp-moonshine branch October 26, 2024 06:34