Moving to faster-whisper #185
Replies: 6 comments 3 replies
-
😃 Sounds awesome!
-
As someone who needs transcriptions in Vietnamese, it's unfortunate that this program is very slow at transcribing that language. If this change brings a significant speed increase for non-English transcription, it will be of great use!
-
I've been comparing models, and so far Sherpa-ONNX stands out for me, despite some challenges with CUDA on Windows. It's impressively fast on CPU, so I haven't worried much about GPU support. whisper.cpp's 'small' model works but is quite slow. Lately I've been exploring the best way to use these models, particularly around the classic 16 kHz, mono pre-processing. I'm curious about the optimal pre-processing steps before feeding audio files into the model; understanding what the model expects seems crucial for proper use. To be honest, I don't want to jump straight to "use a bigger model" just because it gives better results; that's not a fun way to solve a problem. :-D https://alphacephei.com/nsh/2024/04/20/status-of-whisper.html
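The "16 kHz, mono" pre-processing mentioned above can be sketched in a few lines. This is a minimal illustration (real pipelines usually delegate to ffmpeg or a proper resampler with low-pass filtering); the function name and linear-interpolation approach are my own, not from any project discussed here:

```python
def to_mono_16k(samples, rate, channels, target_rate=16000):
    """Downmix interleaved PCM samples to mono, then resample.

    Whisper-family models expect 16 kHz mono input. This sketch uses
    plain linear interpolation; production code should band-limit
    (low-pass filter) before downsampling to avoid aliasing.
    """
    # Downmix: average the channels of each interleaved frame.
    mono = [
        sum(samples[i:i + channels]) / channels
        for i in range(0, len(samples), channels)
    ]
    if rate == target_rate:
        return mono
    # Resample: linear interpolation between neighbouring samples.
    ratio = rate / target_rate
    n_out = int(len(mono) * target_rate / rate)
    out = []
    for j in range(n_out):
        pos = j * ratio
        i = int(pos)
        frac = pos - i
        nxt = mono[min(i + 1, len(mono) - 1)]
        out.append(mono[i] * (1 - frac) + frac * nxt)
    return out
```

For example, 0.1 s of 48 kHz stereo (4800 frames) comes out as 1600 mono samples at 16 kHz.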
-
Interesting. I tried some FFT-based pre-processing and it improved the output, but the FFT is quite expensive.
-
Thank you all for writing here. I found that whisper.cpp has integrated Vulkan support, and that's great news since Vulkan is a cross-platform, cross-vendor GPU backend. That means we may be able to drop CUDA, ROCm, OpenCL, OpenBLAS, etc.: with Vulkan alone we get good speed on Linux and Windows across AMD, Nvidia, and Intel integrated GPUs. As for comparing it to faster-whisper and other projects, it turns out whisper.cpp is actually pretty fast. Most projects that want to call themselves fast run benchmarks with CUDA on powerful GPUs that most people don't have. I think comparing on ordinary hardware with a more popular model like medium is more interesting. Vulkan transcribed 20 s of audio in 14 s with the medium model on an AMD Ryzen 5 4500U on Windows.
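For anyone wanting to try the Vulkan backend, a plausible build invocation looks like the following. The flag and binary names (`GGML_VULKAN`, `whisper-cli`) reflect recent whisper.cpp versions and may differ in older releases, so check the repository's README:

```shell
# Build whisper.cpp with the Vulkan backend (requires the Vulkan SDK).
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build -DGGML_VULKAN=1
cmake --build build --config Release

# Transcribe with the medium model on whatever GPU Vulkan selects.
./build/bin/whisper-cli -m models/ggml-medium.bin -f samples/jfk.wav
```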
-
Hmm, the Vulkan API sounds really impressive, I hadn't heard about that. And maybe wgpu next :-P Then staying with whisper.cpp sounds reasonable.
-
Hey everyone!
I'm currently exploring an update for Vibe that may switch to faster-whisper instead of whisper.cpp. This potential change aims to significantly boost performance on a wider range of computers, not just Nvidia or macOS devices.
What this could mean:
If implemented, Vibe would run much faster on almost any computer. However, to achieve this improvement, some advanced features, such as translation and initialization prompts, might temporarily be unavailable.
GGML models also won't be supported. It's easy to export them to the required format, but keep in mind that it's less flexible.
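The format faster-whisper loads is CTranslate2's. Conversion is typically done from a Hugging Face Whisper checkpoint with the converter that ships in the `ctranslate2` package; the model name and quantization choice below are just an example:

```shell
pip install ctranslate2 transformers
# Convert a Hugging Face Whisper checkpoint to CTranslate2 format;
# int8 quantization keeps the converted files small.
ct2-transformers-converter --model openai/whisper-medium \
    --output_dir whisper-medium-ct2 --quantization int8
```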
What do you think about Vibe potentially moving to faster-whisper?
Update: see the main README; performance is now optimized for all platforms!