All Changelogs

Isaak edited this page Feb 14, 2024 · 20 revisions

v1.11.2 Changelog

  • Reworked the entire websocket implementation.
  • The websocket implementation can now act as both a server and a client.

v1.11.1 Changelog

  • fix some issues with obs_only replacements
  • fix "is_finished" logic and add finished parameter to websocket

v1.11.0 Changelog

  • Custom Message formats for Translations #17
    • Lets you show the original transcription together with its translation. You can configure the output format in the Translation Settings.
      {1} is the original transcription and {2} is the translation. For example, with translation from English to German: Thank you! (Danke!)
  • SteamVR Overlay font changes #18
    • uses NotoSans by default now to support more languages.
    • Additionally, it switches between different NotoSans versions for CJK (Chinese, Japanese, Korean) when needed.
  • limiting max samples for longform transcription to reduce load on long sentences, which is especially useful when transcribing in real time.
  • some changes to fight hallucinations (things like spamming "you you you" or "Thank you for watching!" and similar).
  • small optimizations and fixes for inconsistencies.
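
The custom message format from the v1.11.0 entry can be illustrated with a short Python sketch. This is not TextboxSTT's actual code: the function name `format_message` is hypothetical, and the template `{1} ({2})` is only inferred from the example output in the changelog.

```python
import re


def format_message(template: str, original: str, translation: str) -> str:
    """Fill {1} with the original transcription and {2} with the translation.

    Hypothetical helper for illustration only. A single regex pass is used so
    that placeholder-like text inside the inserted values is not replaced again.
    """
    values = {"1": original, "2": translation}
    return re.sub(r"\{([12])\}", lambda match: values[match.group(1)], template)


# Template inferred from the changelog example (English -> German).
print(format_message("{1} ({2})", "Thank you!", "Danke!"))  # prints: Thank you! (Danke!)
```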

v1.10.1 Changelog

  • Added distil-small.en to models
  • bumped dependencies, some of them security fixes

v1.10.0 Changelog

  • Websocket support!
    • Just a simple websocket on port 8765. You can change the port to whatever you want.
  • 7tv support in the browser source!
    • Put your 7tv emote set ID into the OBS source settings. You can find the ID by clicking on your emote set on the 7tv website and taking the long string from the URL. Example: in "https://7tv.app/emote-sets/656419ccc27ad540196752b2" the ID is 656419ccc27ad540196752b2.
  • bump a bunch of dependencies.
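
If you prefer not to copy the emote set ID by hand, the step described in the 7tv entry above can be sketched in Python. The helper name `emote_set_id` is an assumption made for this example and is not part of TextboxSTT. It simply takes the last path segment of the URL.

```python
from urllib.parse import urlparse


def emote_set_id(url: str) -> str:
    """Return the last path segment of a 7tv emote set URL (hypothetical helper)."""
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


# URL taken from the changelog example.
print(emote_set_id("https://7tv.app/emote-sets/656419ccc27ad540196752b2"))
# prints: 656419ccc27ad540196752b2
```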

v1.9.1 Changelog

  • updating is now always done via force_update.bat to ensure consistency
  • a few little bug fixes in installing and updating logic
  • more logging for debugging

v1.9.0 Changelog

  • bump ctranslate2 to 3.21.0 to add: support for distil-whisper!
    • a distilled version of the whisper models that is 6 times faster, 49% smaller, all at the cost of being a bit less accurate.
    • Currently only distil-medium and distil-large-v2 are available. Neither is multilingual, so they are English only.
    • you need to select the models manually!

v1.8.5 Changelog

  • fix sending KAT_Pointer even though KAT is turned off in the settings (for real this time)
  • log used device for debugging purposes

v1.8.4 Changelog

  • potentially fix a cuda related problem.
  • force_update.bat now cleans some broken packages.

v1.8.3 Changelog

  • fix some KAT parameters being sent even though KAT is disabled #16
  • bumped a bunch of dependencies
  • fix small bug with obs only mode

v1.8.2 Changelog

  • fix translation quantization error
  • bump ctranslate2 to 3.19.0

v1.8.1 Changelog

  • fixed a dependency issue that caused the program to fail installing.

v1.8.0 Changelog

  • Support for OSCQuery! (Experimental)
    • Automatically finds a port to receive data from VRChat on. This removes the need for routing applications, if you currently need them.
    • Set OSC Server port to 0 in the settings to use the automatic discovery of OSCQuery.
    • If you don't want to use OSCQuery or don't need it, just leave the port at 9001.
  • You can now disable the OSC Server completely by setting the port to -1. This is useful if you aren't going to use KAT.
  • Fix timeout_time and pause_threshold always resetting
  • update force_update.bat for CPU only installs
  • the updater ignores Beta versions unless you are already on a Beta version, in which case it also offers future Beta versions. (opt-in in the works)
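
The three meanings of the OSC Server port setting in v1.8.0 can be summarized in a small Python sketch. The function name `osc_server_mode` and the returned labels are assumptions made for illustration. Only the special values 0 and -1 and the default 9001 come from the changelog.

```python
def osc_server_mode(port: int) -> str:
    """Classify the OSC Server port setting (hypothetical helper, not TextboxSTT code)."""
    if port == -1:
        return "disabled"  # OSC Server is turned off completely
    if port == 0:
        return "oscquery"  # port is discovered automatically via OSCQuery
    if 1 <= port <= 65535:
        return "fixed"     # listen on the given port, e.g. the default 9001
    raise ValueError(f"invalid OSC Server port: {port}")


for setting in (-1, 0, 9001):
    print(setting, osc_server_mode(setting))
```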

v1.7.3-Beta Changelog

  • Don't advertise the OSCQuery endpoint when using the default OSC port (9001)

v1.7.2-Beta Changelog

  • Fix timeout_time and pause_threshold always resetting

v1.7.1-Beta Changelog

  • Fix KAT charsync
  • update force_update.bat for CPU only installs
  • the updater ignores Beta versions unless you are already on a Beta version, in which case it also offers future Beta versions. (opt-in in the works)

v1.7.0-Beta Changelog

  • Support for OSCQuery!
    • Automatically finds a port to receive data from VRChat on. This removes the need for routing applications.
    • Set OSC Server port to 0 in the settings to use the automatic discovery of OSCQuery.
    • This change doesn't affect anyone who uses TextboxSTT exclusively with VRChat's textbox, only people who use KAT.

v1.6.2 Changelog

  • bump ctranslate2 to 3.18.0
  • add some exception handling to browser source

v1.6.1 Changelog

  • try to prevent hallucinations by enforcing a lower logprob
  • Limit temperatures to 0.0, as multiple temperatures cause occasional long latencies

v1.6.0 Changelog

  • added a small force update batch file in the src folder, just in case something breaks.
  • bump a few dependencies
    • bump ctranslate2 to 3.17.1
    • bump faster-whisper to 0.7.1
  • Clipboard actions! #8
    • Copy the last transcription to Clipboard, either manually or automatically
  • SteamVR Overlay Timeout #9
    • A separate timeout for the SteamVR overlay. You can find it in the overlay settings in the settings menu.

v1.5.6 Changelog

  • Fixes a logging issue with the newer ctranslate version, causing it not to start.

v1.5.5 Changelog

  • bump ctranslate and faster-whisper dependencies

v1.5.4 Changelog

  • Fixed a cast error, causing the settings to get saved incorrectly and crash the program.
  • Added some additional exception handling to getting audio devices, to prevent the settings menu from bugging out when a device can't be recognized.

v1.5.3 Changelog

  • pip cache and temp files are now cached locally in the python/cache folder and get cleared when not needed anymore.
  • Some exception handling for the Installer.
  • bump some dependencies.

v1.5.2 Changelog

  • bump some dependency versions.
  • fix audio feedback option not being loaded correctly, causing it to jump to the "ON" state when the settings menu is opened.

v1.5.1 Changelog

  • Shows an error when missing dependencies instead of just not showing anything.
  • bump some dependency versions (updating might take a bit longer this time because torch is updated)

v1.5.0 Changelog

  • Rewrite of Launcher code in C#, as an embedded batch file looks suspicious to a lot of antivirus programs.
    • This Launcher requires .NET 4.8.1, which should come preinstalled with Windows 10 and up.

v1.4.3 Changelog

  • Checking for updates is now done without command line windows popping up every time.
  • Update progress is now shown in the program.
  • Some requirements are locked to certain versions to prevent big dependency updates from breaking the program.
  • A few background fixes for the development side of things.

v1.4.2 Changelog

  • change some default config settings
    • mode is set to "once_continuous" by default now.
    • SteamVR Overlay enabled by default now.
  • Fixed setting VAD in the UI not saving.

v1.4.1 Changelog

  • removes the requirement to have git installed by using a portable instance.
    • If the portable instance doesn't exist, the locally installed version is used.

v1.4.0 Changelog

  • The whisper model is now preloaded on startup, which should remove the increased latency on the first transcription.
  • No redownloading of the program needed anymore! Download once, install once. You will get notified about new updates in TextboxSTT.
  • TextboxSTT now comes with a portable python instance, that means required dependencies are now installed when needed.
  • Added default word replacements for things like *, !, ?, .
  • The "Reset Settings" button now resets all settings and restarts the program. (Word Replacements and Emotes are kept)
  • The ⟳ button now restarts the program completely instead of reloading.
  • The hotkey used by TextboxSTT now ignores input whenever you hold a modifier (CTRL, SHIFT etc.)
  • You can now record a whole hotkey instead of just one key.

v1.3.1 Changelog

  • Fixes running into rate limit issues in mode "once". This fixes the issue of certain transcriptions not appearing in the Textbox.

v1.3.0 Changelog

  • Control some TextboxSTT parameters over OSC in VRChat. Following parameters can be controlled:
    • "use_kat" (boolean)
    • "use_textbox" (boolean)
    • "use_both" (boolean)
    • "mode" (OSC parameter name "stt_mode") (integer)
    • Add those parameters to your Expression Menu to control them.
  • Custom whisper models you add are now saved in the settings. To remove one, select it, clear the text field and press enter.
  • Autocorrection for spelling in the Text to Text field. Supported languages are English, Polish, Turkish, Russian, Ukrainian, Czech, Portuguese, Greek, Italian, Vietnamese, French and Spanish.
  • In mode "once_continuous" and "realtime", the program now tries to find sentence ends when transcriptions are taking too long, modifiable by the "max_transciption_time" setting for whisper.
  • Silero Voice activity detection. Further adds voice activity detection to filter out pauses and static noise.
  • OBS only script: running "obs_only.exe" will run TextboxSTT in OBS only mode, with a simple console window and real time transcription.

v1.2.0 Changelog

  • Translation between languages, powered by M2M-100 using ctranslate2.
    • Translate between any of the ~100 languages supported.
    • Translation requires downloading the M2M-100 model into cache, which is another ~2GB.
    • Inference is done on CPU by default. You can change this, but I would advise against it unless you have another 2GB of VRAM to spare.
  • Text timeout is now handled by TextboxSTT, for more consistency between KAT, Textbox and the SteamVR Overlay.
    • That means the Textbox/KAT stays populated until either the Text timeout is reached (30.0 seconds by default) or it is cleared manually. With a value <=0.0 the textbox is never cleared automatically, only manually.
  • Changed the default "phrase_time_limit" from 2.0 to 1.0, for more "real time" transcriptions in modes "once_continuous" and "realtime"
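
The text timeout rule from v1.2.0 can be written down as a small Python sketch. The function name `should_clear` and its parameter names are assumptions made for this example. Only the 30.0 second default and the "<=0.0 never clears" behavior come from the changelog.

```python
def should_clear(elapsed_seconds: float, text_timeout: float = 30.0) -> bool:
    """Decide whether the displayed text should be cleared automatically.

    Hypothetical helper, not TextboxSTT's actual code.
    """
    if text_timeout <= 0.0:
        # A non-positive timeout disables automatic clearing entirely.
        return False
    return elapsed_seconds >= text_timeout


print(should_clear(12.0))        # prints: False
print(should_clear(31.5))        # prints: True
print(should_clear(500.0, 0.0))  # prints: False
```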

v1.1.3 Changelog

  • Fixed obs not launching unless reloading the program.
  • added a typewriter effect to the OBS Source for better readability.

v1.1.2 Changelog

  • Fixed context managing issue with audio source in mode once_continuous and realtime
  • Try to prevent the SteamVR Overlay from freezing by switching the Application type to Overlay and reinitializing OVR when the error OverlayError_RequestFailed occurs

v1.1.1 Changelog

  • Automatically restarting the program when it is needed.
  • Fixed obs browser source not launching.
  • Fixed whisper transcribing random words when there is only noise. (maybe use VAD in the future to avoid this issue and generally get better transcription results)
  • Refactor and logging changes and fixes.
  • Reverted some default value changes

v1.1.0 Changelog

  • #2 allow use of user fine-tuned models on Hugging Face
    • translation to English does not work with those models, at least in my testing.
    • In the model section of the settings, select "custom" and enter a path to a Hugging Face model, e.g. "openai/whisper-base". You can return to the selection by pressing enter on an empty box.
  • complete config revamp, same (and more) config options but more organized!
    • Sadly, for this version you cannot automatically take your old config with you. If you have a lot of word replacements and/or emotes set, you can ask in the support Discord how to do that.
  • fast reload feature: click on the ⭯ button to quickly reload TextboxSTT
  • added audio settings: added a gain slider and an individual toggle for each audio feedback step.
  • Shows transcribe times in main UI now.
  • better log management, the program creates up to 5 logs, "latest.log" is the latest. logs are now saved in the "cache" folder.
  • added a program icon, wowee
  • Separate windows are now always positioned relative to the window they were opened from, not relative to the main window.
  • lots of refactoring and additional error logging.
  • updated to faster-whisper 0.3.0
  • some smaller bugfixes

v1.0.0 Changelog

  • Enforcing Single Instance by closing other instances of the program.
  • Switched from pyinstaller to cx_freeze for distributing (again).
    • Files are much more organized and clearer.
  • Switched from openai/whisper to guillaumekln/faster-whisper!
    • This implementation is up to 4 times faster than openai/whisper for the same accuracy while using less memory. (Benchmarks)
  • Added additional device settings for transcription
    • "compute_type"
    • "cpu_threads"
    • "num_workers"
  • Added Audio feedback toggle
  • Some OBS source fixes.
  • Delete cache after downloading model.
  • logging transcribe times.
  • You should now be able to take config.json files in between versions. Missing entries are added. Unused entries are removed.
  • create config if it doesn't exist.
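
The config carry-over behavior from v1.0.0 (missing entries are added, unused entries are removed) can be sketched in Python as below. This is only an illustration under assumptions: the function name `migrate_config`, the flat config layout, the default values, and the `legacy_option` key are all hypothetical. Only the setting names "compute_type", "cpu_threads" and "num_workers" appear in the changelog.

```python
def migrate_config(old: dict, defaults: dict) -> dict:
    """Keep known entries from the old config, add missing ones, drop unused ones.

    Hypothetical helper; assumes a flat config where `defaults` lists every valid key.
    """
    return {key: old.get(key, default) for key, default in defaults.items()}


# Default values here are made up for the example.
defaults = {"compute_type": "float32", "cpu_threads": 4, "num_workers": 1}
old_config = {"compute_type": "int8", "legacy_option": True}

print(migrate_config(old_config, defaults))
# prints: {'compute_type': 'int8', 'cpu_threads': 4, 'num_workers': 1}
```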