Harness the power of ONNX Runtime to transcribe audio into text effortlessly.
- Single Model:
- Combined Models (ASR + Speaker Identify):
- End-to-end speech recognition with built-in STFT processing.
  - Input: Audio file
  - Output: Transcription result
- Seamlessly integrate with these additional tools for improved performance:
- The Whisper models here do not support automatic language detection. Specify the target language explicitly.
- Visit the project overview for further details.
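
The audio-in, text-out flow starts from raw audio samples. As a minimal sketch, and not the project's actual loader, the snippet below splits a mono sample sequence into fixed chunks of 128000 samples (the chunk size quoted in the benchmark table) and zero-pads the final chunk. The 16 kHz sample rate is inferred from 128000 samples being 8 s. The name `split_into_chunks` is hypothetical, and whether the exported models expect padded fixed-size chunks is an assumption.

```python
from typing import List, Sequence

CHUNK_SIZE = 128000   # samples per chunk, as quoted in the benchmark table
SAMPLE_RATE = 16000   # assumption: 128000 samples / 8 s = 16 kHz


def split_into_chunks(samples: Sequence[int],
                      chunk_size: int = CHUNK_SIZE) -> List[List[int]]:
    """Split mono samples into equal-length chunks, zero-padding the last one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    for start in range(0, len(samples), chunk_size):
        chunk = list(samples[start:start + chunk_size])
        # Only the final chunk can be short; pad it with silence.
        chunk.extend([0] * (chunk_size - len(chunk)))
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # 20 s of silence at the assumed 16 kHz rate.
    audio = [0] * (20 * SAMPLE_RATE)
    chunks = split_into_chunks(audio)
    print(len(chunks), [len(c) for c in chunks])
```

Each chunk could then be passed to the model one at a time. Padding keeps the input shape constant, which matters only if the exported model uses a fixed input length.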
OS | Device | Backend | Model | Real-Time Factor (Chunk Size: 128000 or 8s) |
---|---|---|---|---|
Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall f32 | 0.037 |
Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall q8f32 | 0.075 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall f32 | 0.019 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall q8f32 | 0.022 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall + ERes2NetV2_w24s4ep4 f32 | 0.1 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-en q8f32 | 0.15 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-Turbo-en q8f32 | 0.073 |
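
Real-time factor is conventionally the processing time divided by the duration of the audio processed, so values below 1 mean faster than real time. The sketch below shows that arithmetic, assuming the table follows this convention and that the sample rate is 16 kHz (inferred from 128000 samples being 8 s). `real_time_factor` is a hypothetical helper, not part of the project.

```python
def real_time_factor(processing_seconds: float,
                     num_samples: int,
                     sample_rate: int = 16000) -> float:
    """Return processing time divided by audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Under these assumptions, an RTF of 0.037 on one 8 s chunk
    # corresponds to roughly 0.3 s of compute.
    chunk_seconds = 128000 / 16000
    print(f"{0.037 * chunk_seconds:.3f} s per chunk")
    print(real_time_factor(0.296, 128000))
```

To measure it in practice, one would typically wrap the inference call with `time.perf_counter()` and pass the elapsed time as `processing_seconds`.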