Automatic-Speech-Recognition-ASR-ONNX

Transcribe audio to text with ONNX Runtime.

Supported Models

  1. Single Model:

  2. Combined Models (ASR + Speaker Identification):

Features

  • End-to-end speech recognition with built-in STFT processing.
    Input: Audio file
    Output: Transcription result
  • Seamlessly integrate with these additional tools for improved performance:
  • The Whisper models here do not support automatic language detection. Specify the target language explicitly.
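
Because STFT processing is built into the model, the caller presumably only needs to supply raw waveform samples. The sketch below shows one way to read an audio file and cut it into fixed-size chunks using only the Python standard library. It is an illustration, not this repository's preprocessing: the function names `load_pcm16_mono` and `split_into_chunks` are hypothetical, the 16-bit mono WAV input is an assumption, and the default chunk size of 128000 is taken from the "Chunk Size: 128000 or 8s" note in the Performance section on the assumption that the figure counts samples.

```python
import array
import sys
import wave


def load_pcm16_mono(path):
    """Read a 16-bit mono WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian
    return samples, sample_rate


def split_into_chunks(samples, chunk_size=128000):
    """Split samples into fixed-size chunks, zero-padding the last one.

    chunk_size=128000 is an assumption based on the benchmark note.
    """
    chunks = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]  # slicing copies
        padding = chunk_size - len(chunk)
        if padding:
            chunk.extend([0] * padding)
        chunks.append(chunk)
    return chunks
```

Each chunk could then be converted to whatever dtype and shape the exported model expects before being passed to an ONNX Runtime session. Those details depend on the specific export and are not stated here.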

Learn More


Performance

| OS | Device | Backend | Model | Real-Time Factor (Chunk Size: 128000 or 8s) |
| --- | --- | --- | --- | --- |
| Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall f32 | 0.037 |
| Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall q8f32 | 0.075 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall f32 | 0.019 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall q8f32 | 0.022 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall + ERes2NetV2_w24s4ep4 f32 | 0.1 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-en q8f32 | 0.15 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-Turbo-en q8f32 | 0.073 |
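
The README does not define the real-time factor. The sketch below uses the conventional definition, processing time divided by audio duration, so values below 1 mean faster than real time. The function names are hypothetical, and the 16 kHz default sample rate is an assumption inferred from 128000 samples corresponding to 8 s.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Conventional RTF: time spent transcribing / duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def processing_seconds_for(rtf, num_samples, sample_rate=16000):
    """Estimate processing time for a chunk from a reported RTF.

    sample_rate=16000 is an assumption (128000 samples / 8 s).
    """
    return rtf * num_samples / sample_rate


if __name__ == "__main__":
    # One 128000-sample chunk at the reported SenseVoiceSmall f32 laptop RTF.
    print(processing_seconds_for(0.037, 128000))
```

Under these assumptions, an RTF of 0.037 corresponds to roughly 0.3 s of compute per 8 s chunk.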

Coming Soon 🚀

