Adding option to set n_threads for Whisper.cpp #892

Merged
merged 6 commits into from
Aug 25, 2024
20 changes: 6 additions & 14 deletions CONTRIBUTING.md
@@ -94,31 +94,23 @@ Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManage
```
2. Install the GNU make. `choco install make`
3. Install the ffmpeg. `choco install ffmpeg`
4. Install Poetry, paste this into Windows PowerShell line by line. [More info](https://python-poetry.org/docs/)
4. Install [MSYS2](https://www.msys2.org/), follow [this guide](https://sajidifti.medium.com/how-to-install-gcc-and-gdb-on-windows-using-msys2-tutorial-0fceb7e66454).
5. Install Poetry, paste this into Windows PowerShell line by line. [More info](https://python-poetry.org/docs/)
```
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -

[Environment]::SetEnvironmentVariable("Path", $env:Path + ";%APPDATA%\pypoetry\venv\Scripts", "User")

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
5. Restart Windows.
6. Restart Windows.

6. Clone the repository `git clone --recursive https://github.com/chidiwilliams/buzz.git`
7. Enter repo folder `cd buzz`
8. Copy `whisper.dll` from the repo backup to `buzz` folder.
```
cp -r .\dll_backup\ .\buzz\
```
7. Clone the repository `git clone --recursive https://github.com/chidiwilliams/buzz.git`
8. Enter repo folder `cd buzz`
9. Activate the virtual environment `poetry shell`
10. Install the dependencies `poetry install`
11. Build Buzz `poetry build`
12. Install Buzz
```
$whlFile = Get-ChildItem .\dist\buzz*.whl | Select-Object -First 1
pip install $whlFile
```
13. Run Buzz `python -m buzz`
12. Run Buzz `python -m buzz`

#### GPU Support

23 changes: 20 additions & 3 deletions Makefile
@@ -26,10 +26,17 @@ else
endif

clean:
ifeq ($(OS), Windows_NT)
del /f buzz\$(LIBWHISPER) 2> nul
del /f buzz\whisper_cpp.py 2> nul
rmdir /s /q whisper.cpp\build 2> nul
rmdir /s /q dist 2> nul
else
rm -f buzz/$(LIBWHISPER)
rm -f buzz/whisper_cpp.py
rm -rf whisper.cpp/build || true
rm -rf dist/* || true
endif

COVERAGE_THRESHOLD := 75

@@ -68,9 +75,9 @@ else
endif

buzz/$(LIBWHISPER):
ifeq ($(OS),Windows_NT)
cp dll_backup/whisper.dll buzz || true
cp dll_backup/SDL2.dll buzz || true
ifeq ($(OS), Windows_NT)
cp dll_backup/whisper.dll buzz || copy dll_backup\whisper.dll buzz\whisper.dll
cp dll_backup/SDL2.dll buzz || copy dll_backup\SDL2.dll buzz\SDL2.dll
else
cmake -S whisper.cpp -B whisper.cpp/build/ $(CMAKE_FLAGS)
cmake --build whisper.cpp/build --verbose
@@ -98,6 +105,7 @@ dmg_mac:
--app-drop-link 425 120 \
--codesign "$$BUZZ_CODESIGN_IDENTITY" \
--notarize "$$BUZZ_KEYCHAIN_NOTARY_PROFILE" \
--filesystem APFS \
"${mac_dmg_path}" \
"dist/dmg/"

@@ -188,10 +196,19 @@ translation_po:
sed -i.bak 's/CHARSET/UTF-8/' ${TMP_POT_FILE_PATH} && rm ${TMP_POT_FILE_PATH}.bak
msgmerge -U ${PO_FILE_PATH} ${TMP_POT_FILE_PATH}

# On windows we can have two ways to compile locales, one for CI the other for local builds
# Will try both and ignore errors if they fail
translation_mo:
ifeq ($(OS), Windows_NT)
-forfiles /p buzz\locale /c "cmd /c python ..\..\msgfmt.py -o @path\LC_MESSAGES\buzz.mo @path\LC_MESSAGES\buzz.po"
-for dir in buzz/locale/*/ ; do \
python msgfmt.py -o $$dir/LC_MESSAGES/buzz.mo $$dir/LC_MESSAGES/buzz.po; \
done
else
for dir in buzz/locale/*/ ; do \
python msgfmt.py -o $$dir/LC_MESSAGES/buzz.mo $$dir/LC_MESSAGES/buzz.po; \
done
endif

lint:
ruff check . --fix
2 changes: 2 additions & 0 deletions buzz/transcriber/whisper_cpp.py
@@ -1,3 +1,4 @@
import os
import ctypes
import logging
from typing import Union, Any, List
@@ -109,6 +110,7 @@ def whisper_cpp_params(
params = whisper_cpp.whisper_full_default_params(
whisper_cpp.WHISPER_SAMPLING_GREEDY
)
params.n_threads = int(os.getenv("BUZZ_WHISPERCPP_N_THREADS", 4))
params.print_realtime = print_realtime
params.print_progress = print_progress

32 changes: 31 additions & 1 deletion docs/docs/preferences.md
@@ -34,4 +34,34 @@ Available variables:

Live transcription export can be used to integrate Buzz with other applications like OBS Studio. When enabled, live text transcripts will be exported to a text file as they get generated and translated.

If AI translation is enabled for live recordings, the translated text will also be exported to the text file. Filename for the translated text will end with `.translated.txt`.
If AI translation is enabled for live recordings, the translated text will also be exported to the text file. Filename for the translated text will end with `.translated.txt`.

## Advanced Preferences

To keep the preferences section simple for new users, some more advanced preferences are set via OS environment variables. Set the necessary environment variables in your OS before starting Buzz, or create a script that sets them and then starts Buzz.

On macOS and Linux, create `run_buzz.sh` with the following content:

```bash
#!/bin/bash
export VARIABLE=value
export SOME_OTHER_VARIABLE=some_other_value
buzz
```

On Windows, create `run_buzz.bat` with the following content:

```bat
@echo off
set VARIABLE=value
set SOME_OTHER_VARIABLE=some_other_value
"C:\Program Files (x86)\Buzz\Buzz.exe"
```
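
The same idea can be sketched as a cross-platform Python launcher. This is only an illustration, not part of Buzz: `build_buzz_env` and `launch_buzz` are hypothetical helper names, and the sketch assumes Buzz is installed in the current Python environment so that `python -m buzz` starts it.

```python
import os
import subprocess
import sys


def build_buzz_env(overrides, base_env=None):
    """Return a copy of the environment with the given variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # Environment values must be strings, so convert numbers explicitly.
    env.update({name: str(value) for name, value in overrides.items()})
    return env


def launch_buzz(overrides):
    """Start Buzz with extra environment variables and return its exit code."""
    # Assumption: Buzz is importable by this interpreter (`python -m buzz`).
    command = [sys.executable, "-m", "buzz"]
    return subprocess.call(command, env=build_buzz_env(overrides))
```

Calling `launch_buzz({"BUZZ_WHISPERCPP_N_THREADS": 8})` would then start Buzz with that variable set for the Buzz process only, leaving the rest of the system environment untouched.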

### Available variables

**BUZZ_WHISPERCPP_N_THREADS** - Number of threads to use for the Whisper.cpp model. Default is `4`. Available from `v1.0.2`.
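
This PR reads the variable with `int(os.getenv("BUZZ_WHISPERCPP_N_THREADS", 4))`, which raises `ValueError` if the value is not a number. The sketch below is a more defensive variant for illustration only. The helper name `whispercpp_n_threads` and the choice to fall back to the default on invalid input are assumptions, not what Buzz does.

```python
import os

DEFAULT_N_THREADS = 4  # default documented for BUZZ_WHISPERCPP_N_THREADS


def whispercpp_n_threads(environ=None):
    """Read BUZZ_WHISPERCPP_N_THREADS, using the default on missing or bad input."""
    environ = os.environ if environ is None else environ
    raw = environ.get("BUZZ_WHISPERCPP_N_THREADS")
    if raw is None:
        return DEFAULT_N_THREADS
    try:
        value = int(raw)
    except ValueError:
        # Hypothetical policy: ignore non-numeric values instead of failing.
        return DEFAULT_N_THREADS
    # A thread count must be positive; otherwise use the default.
    return value if value > 0 else DEFAULT_N_THREADS
```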

On a laptop with 16 threads, setting `BUZZ_WHISPERCPP_N_THREADS=8` gives roughly a 15% speedup in transcription time.
Increasing the number of threads further makes transcription slower, because the results from the parallel threads have to be
combined to produce the final answer.
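
As a starting point for experimenting, the single measurement above (8 threads on a 16-thread laptop) suggests trying half of the logical CPU count. The sketch below encodes that rule of thumb. It is an assumption generalized from one data point, not a Buzz recommendation, and `suggested_n_threads` is a hypothetical helper.

```python
import os


def suggested_n_threads(logical_cpus=None):
    """Suggest a starting thread count: half the logical CPUs, at least 1."""
    if logical_cpus is None:
        # os.cpu_count() may return None when the count cannot be determined.
        logical_cpus = os.cpu_count() or 1
    return max(1, logical_cpus // 2)
```

For a machine with 16 logical CPUs, `suggested_n_threads(16)` returns `8`. Measure transcription time with a few nearby values before settling on one.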