DakeQQ/Voice-Activity-Detection-VAD-ONNX

Voice-Activity-Detection-VAD-ONNX

Speech activity detection powered by ONNX Runtime for high-performance applications.

Features

  1. Supported Models:

    • FSMN
    • Silero (Optimized for enhanced parallel computing performance)
  2. Recommendation and Note:

    • For best results in noisy environments, use the Audio Denoiser together with the VAD.
  3. End-to-End Processing:

    • This model includes internal STFT processing.
    • Input: Raw audio
    • Output: Detected speech timestamps
  4. Resources:
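
The end-to-end flow in item 3 (raw audio in, speech timestamps out) can be illustrated with a minimal sketch of the last step: turning per-chunk speech scores into timestamps. This is not the repository's code. The function name, the 0.5 threshold, and the per-chunk probability input are hypothetical, and the 16 kHz sample rate is an assumption inferred from the stated chunk size of 512 samples being 32 ms.

```python
from typing import List, Sequence, Tuple

CHUNK_SIZE = 512       # samples per chunk, as stated in the performance table
SAMPLE_RATE = 16_000   # assumption: 512 samples / 0.032 s = 16 kHz


def chunks_to_timestamps(
    speech_probs: Sequence[float],
    threshold: float = 0.5,  # hypothetical decision threshold
    chunk_size: int = CHUNK_SIZE,
    sample_rate: int = SAMPLE_RATE,
) -> List[Tuple[float, float]]:
    """Merge consecutive speech chunks into (start_s, end_s) segments."""
    chunk_s = chunk_size / sample_rate  # duration of one chunk in seconds
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first chunk of the currently open segment
    for i, prob in enumerate(speech_probs):
        if prob >= threshold and start is None:
            start = i  # speech begins
        elif prob < threshold and start is not None:
            segments.append((round(start * chunk_s, 3), round(i * chunk_s, 3)))
            start = None  # speech ends
    if start is not None:
        # Audio ended while speech was still active: close the segment.
        end = len(speech_probs) * chunk_s
        segments.append((round(start * chunk_s, 3), round(end, 3)))
    return segments


if __name__ == "__main__":
    print(chunks_to_timestamps([0.1, 0.9, 0.8, 0.2, 0.7]))
```

A real pipeline would typically add smoothing, such as minimum speech and silence durations, before emitting segments. The sketch omits that to keep the chunk-to-time mapping visible.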


Performance

| OS | Device | Backend | Model | Real-Time Factor (chunk size: 512 samples, or 32 ms) |
|---|---|---|---|---|
| Ubuntu-24.04 | Desktop | CPU i3-12300 | FSMN f32 | 0.0047 |
| Ubuntu-24.04 | Desktop | CPU i3-12300 | Silero f32 | 0.0026 |
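
To make the numbers above easier to interpret, here is a small sketch of how a real-time factor (RTF) could be computed and converted into per-chunk latency. It assumes the usual definition, RTF = processing time / audio duration, which the README does not spell out. The helper names are hypothetical and not part of this repository.

```python
import time
from typing import Callable


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF under the assumed definition: compute time per second of audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def per_chunk_latency_ms(rtf: float, chunk_ms: float = 32.0) -> float:
    """Average compute time spent on one chunk of `chunk_ms` milliseconds."""
    return rtf * chunk_ms


def measure_rtf(process: Callable[[], None], audio_seconds: float, runs: int = 10) -> float:
    """Time `process` (any callable that handles `audio_seconds` of audio)
    over several runs and return the average RTF."""
    start = time.perf_counter()
    for _ in range(runs):
        process()
    elapsed = (time.perf_counter() - start) / runs
    return real_time_factor(elapsed, audio_seconds)


if __name__ == "__main__":
    # RTF values reported in the table above.
    for name, rtf in [("FSMN", 0.0047), ("Silero", 0.0026)]:
        print(f"{name}: about {per_chunk_latency_ms(rtf):.3f} ms per 32 ms chunk")
```

Under that definition, the reported values correspond to roughly 0.15 ms (FSMN) and 0.083 ms (Silero) of compute per 32 ms chunk on the listed CPU.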

