Moving to faster-whisper #185
Replies: 6 comments 3 replies
-
😃 Sounds awesome!
-
As someone who needs transcriptions in Vietnamese, it's unfortunate that this program is very slow at transcribing that language. If this change brings a significant speed increase for non-English transcription, it will be of great use!
-
I've been comparing models, and so far Sherpa-ONNX stands out for me, despite some challenges with CUDA on Windows. It's impressively fast on CPU, so I haven't worried much about GPU support. whisper.cpp's 'small' model works but is quite slow. Lately I've been exploring the best way to use these models, particularly around the classic 16 kHz, mono pre-processing. I'm curious about the optimal pre-processing steps before feeding audio files into the model; understanding what the model expects seems crucial for proper use. To be honest, I don't want to jump straight to "use a bigger model" just because it gives better results; that's not a fun way to solve a problem. :-D https://alphacephei.com/nsh/2024/04/20/status-of-whisper.html
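The "16 kHz, mono" pre-processing mentioned above can be sketched in a few lines. This is a minimal illustration (real pipelines usually delegate to ffmpeg or a proper resampler with low-pass filtering); the function name and linear-interpolation approach are my own, not from any project discussed here:

```python
def to_mono_16k(samples, rate, channels, target_rate=16000):
    """Downmix interleaved PCM samples to mono, then resample.

    Whisper-family models expect 16 kHz mono input. This sketch uses
    plain linear interpolation; production code should band-limit
    (low-pass filter) before downsampling to avoid aliasing.
    """
    # Downmix: average the channels of each interleaved frame.
    mono = [
        sum(samples[i:i + channels]) / channels
        for i in range(0, len(samples), channels)
    ]
    if rate == target_rate:
        return mono
    # Resample: linear interpolation between neighbouring samples.
    ratio = rate / target_rate
    n_out = int(len(mono) * target_rate / rate)
    out = []
    for j in range(n_out):
        pos = j * ratio
        i = int(pos)
        frac = pos - i
        nxt = mono[min(i + 1, len(mono) - 1)]
        out.append(mono[i] * (1 - frac) + frac * nxt)
    return out
```

For example, 0.1 s of 48 kHz stereo (4800 frames) comes out as 1600 mono samples at 16 kHz.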
-
Interesting. I tried some FFT-based pre-processing and it improved the output, but the FFT is quite expensive.
-
Thank you all for writing here. I found that whisper.cpp has integrated Vulkan support, and that's great news since Vulkan is a cross-platform, cross-vendor GPU backend. That means we may be able to drop CUDA, ROCm, OpenCL, OpenBLAS, etc.: with Vulkan alone we get good speed on Linux and Windows across AMD, Nvidia, and Intel integrated GPUs. As for comparing it to faster-whisper and other projects, it turns out whisper.cpp is actually pretty fast. Most projects that want to call themselves fast run benchmarks with CUDA on powerful GPUs that most people don't have. I think comparing on ordinary hardware with a more popular model like medium is more interesting. Vulkan transcribed 20 s of audio in 14 s with the medium model on an AMD Ryzen 5 4500U on Windows.
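For anyone wanting to try the Vulkan backend, a plausible build invocation looks like the following. The flag and binary names (`GGML_VULKAN`, `whisper-cli`) reflect recent whisper.cpp versions and may differ in older releases, so check the repository's README:

```shell
# Build whisper.cpp with the Vulkan backend (requires the Vulkan SDK).
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build -DGGML_VULKAN=1
cmake --build build --config Release

# Transcribe with the medium model on whatever GPU Vulkan selects.
./build/bin/whisper-cli -m models/ggml-medium.bin -f samples/jfk.wav
```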
-
Hmm, the Vulkan API sounds really impressive, I hadn't heard about that. And maybe wgpu next :-P Then staying with whisper.cpp sounds reasonable.
-
Hey everyone!
I'm currently exploring an update for Vibe that may switch to faster-whisper instead of whisper.cpp. This potential change aims to significantly boost performance on a wider range of computers, not just Nvidia or macOS devices.
What this could mean:
If implemented, Vibe would run much faster on almost any computer. However, to achieve this improvement, some advanced features, such as translation and initialization prompts, might temporarily be unavailable.
GGML models also won't be supported. It's easy to export them to the required format, but keep in mind that it's less flexible.
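The format faster-whisper loads is CTranslate2's. Conversion is typically done from a Hugging Face Whisper checkpoint with the converter that ships in the `ctranslate2` package; the model name and quantization choice below are just an example:

```shell
pip install ctranslate2 transformers
# Convert a Hugging Face Whisper checkpoint to CTranslate2 format;
# int8 quantization keeps the converted files small.
ct2-transformers-converter --model openai/whisper-medium \
    --output_dir whisper-medium-ct2 --quantization int8
```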
What do you think about Vibe potentially moving to faster-whisper?
Update: see the main README; performance is now optimized for all platforms!