Releases: KoljaB/RealtimeSTT

v0.1.15

14 Apr 12:05
  • added parameter beam_size
    (int, default=5)
    The beam size to use for beam search decoding
  • added parameter beam_size_realtime
    (int, default=3)
    The beam size to use for real-time transcription beam search decoding.
  • added parameter initial_prompt
    (str or iterable of int, default=None)
    Initial prompt to be fed to the transcription models.
  • added parameter suppress_tokens
    (list of int, default=[-1])
    Tokens to be suppressed from the transcription output.
  • added method set_microphone(microphone_on=True)
    This method allows dynamic switching between recording from the input device configured in RealtimeSTT and chunks injected into the processing pipeline with the feed_audio method.
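
A minimal sketch of how the v0.1.15 options might be passed to the recorder. The parameter names and defaults come from the notes above; the helper names build_recorder_kwargs and transcribe_once are hypothetical, and lowering beam_size_realtime to 1 is only an illustration of trading accuracy for speed. The library import is done lazily so the helper can be inspected without a microphone or GPU.

```python
def build_recorder_kwargs(fast_realtime: bool = False) -> dict:
    """Collect the v0.1.15 decoding parameters (hypothetical helper)."""
    return {
        "beam_size": 5,                                 # default from the notes
        "beam_size_realtime": 1 if fast_realtime else 3,  # 3 is the default
        "initial_prompt": None,                         # str or iterable of int
        "suppress_tokens": [-1],                        # default from the notes
    }


def transcribe_once() -> str:
    """Record one utterance from the microphone and return its text."""
    # Imported here so this module loads even without RealtimeSTT installed.
    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(**build_recorder_kwargs(fast_realtime=True))
    try:
        # Record from the configured input device rather than fed chunks.
        recorder.set_microphone(True)
        return recorder.text()
    finally:
        recorder.shutdown()
```

Because the library uses multiprocessing, it is probably safest to call transcribe_once() from under an `if __name__ == "__main__":` guard.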

v0.1.13

08 Apr 20:54
  • added beam_size: int = 5 and beam_size_realtime: int = 3 parameters to the AudioToTextRecorder constructor, allowing faster (real-time) transcription by lowering the beam sizes
  • added last_transcription_bytes containing the raw bytes from the last transcription
    You can retrieve those bytes with recorder.last_transcription_bytes for further analysis, saving to a file, etc.
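
A small sketch of the "saving to a file" case. The helper save_last_transcription is hypothetical, and the exact layout of the data (sample format, sample rate) is not stated in these notes, so the function simply dumps whatever buffer the attribute holds without interpreting it.

```python
from pathlib import Path


def save_last_transcription(recorder, path: str) -> int:
    """Write recorder.last_transcription_bytes to `path`; return bytes written.

    Hypothetical helper: `recorder` is any object exposing the
    last_transcription_bytes attribute described above.
    """
    raw = recorder.last_transcription_bytes
    if raw is None:          # nothing has been transcribed yet
        return 0
    data = bytes(raw)        # accepts bytes or any buffer-like object
    Path(path).write_bytes(data)
    return len(data)
```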

v0.1.12

30 Mar 15:30
  • fixed qsize issue for macOS
  • upgraded requirements to torch 2.2.2

v0.1.11

16 Mar 19:47
  • added on_recorded_chunk callback so the client can process audio chunks recorded from the microphone
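
One way a client might use the callback is to collect the chunks for later processing. The ChunkCollector class below is a hypothetical illustration, and it assumes the callback receives a single bytes-like chunk per call.

```python
from typing import List


class ChunkCollector:
    """Callable that stores every audio chunk it is given (hypothetical)."""

    def __init__(self) -> None:
        self.chunks: List[bytes] = []

    def __call__(self, chunk: bytes) -> None:
        # Invoked once per recorded chunk.
        self.chunks.append(chunk)

    def total_bytes(self) -> int:
        """Total size of all chunks collected so far."""
        return sum(len(c) for c in self.chunks)


collector = ChunkCollector()
# With RealtimeSTT installed, the callback would be passed to the constructor:
#   recorder = AudioToTextRecorder(on_recorded_chunk=collector)
```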

v0.1.9

29 Jan 17:09
  • switched to torch.multiprocessing
  • added compute_type (#14), input_device_index (select input audio device) and gpu_device_index (select gpu device) parameters
  • recorder.text() can now be interrupted with recorder.abort()
  • fix for #20
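
A sketch of interrupting a blocking recorder.text() call after a timeout. The helper text_with_timeout is hypothetical, and it assumes recorder.abort() may be called from another thread; what text() returns after an abort is not specified in these notes.

```python
import threading


def text_with_timeout(recorder, timeout_s: float):
    """Call recorder.text(), aborting it from a timer thread after timeout_s."""
    timer = threading.Timer(timeout_s, recorder.abort)
    timer.start()
    try:
        return recorder.text()   # blocks until speech is transcribed or aborted
    finally:
        timer.cancel()           # no-op if the timer has already fired
```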

v0.1.8

15 Dec 12:51
  • added an example of real-time transcription from a browser microphone
  • large-v3 whisper model now supported (upgrade to faster_whisper 0.10.0)
  • added feed_audio() and use_microphone parameter to feed chunks
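
A minimal sketch of feeding audio without a microphone. The helpers iter_chunks and feed_buffer are hypothetical, as is the chunk size of 1024 bytes; the expected audio format is not stated in these notes (16-bit mono PCM at 16 kHz is an assumption worth checking against the project README).

```python
from typing import Iterator


def iter_chunks(pcm: bytes, chunk_size: int = 1024) -> Iterator[bytes]:
    """Split a PCM buffer into consecutive chunks of at most chunk_size bytes."""
    for start in range(0, len(pcm), chunk_size):
        yield pcm[start:start + chunk_size]


def feed_buffer(recorder, pcm: bytes, chunk_size: int = 1024) -> int:
    """Feed a whole buffer to recorder.feed_audio(); return the chunk count."""
    count = 0
    for chunk in iter_chunks(pcm, chunk_size):
        recorder.feed_audio(chunk)
        count += 1
    return count


# With RealtimeSTT installed, the recorder would be created without microphone
# input and then fed manually:
#   recorder = AudioToTextRecorder(use_microphone=False)
#   feed_buffer(recorder, pcm_bytes)
```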

Bugfixes and KeyboardInterrupt support

09 Nov 15:36
  • Bugfix for macOS installation (occurred in the context of multiprocessing with the usage of queue.qsize(); changed to use multiprocessing.Manager().Queue(), which should work on macOS)
  • KeyboardInterrupt handling (we can now abort the recorder with CTRL+C)
  • Bugfix for spinner handling (could raise an exception in some cases: AttributeError: 'NoneType' object has no attribute '_interval')

v0.1.6

17 Oct 10:03

Implements the context manager protocol, so a recorder instance can be used in a with statement, ensuring proper resource management. The constructor now waits for the transcription process to start. Fixed a bug in the shutdown method.
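
The context manager protocol can be illustrated with a stand-in class. RecorderLike below is hypothetical and is not the library's implementation; it only shows the general pattern of calling shutdown() on exit, which is presumably what the recorder does.

```python
class RecorderLike:
    """Stand-in for a recorder; only the context-manager plumbing is shown."""

    def __init__(self) -> None:
        self.is_shut_down = False

    def shutdown(self) -> None:
        # A real recorder would release audio devices and worker processes here.
        self.is_shut_down = True

    def __enter__(self) -> "RecorderLike":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.shutdown()   # runs on normal exit and when an exception is raised
        return False      # do not swallow exceptions


with RecorderLike() as rec:
    pass  # with the real library: print(recorder.text())
```

With the library itself, the equivalent usage would be `with AudioToTextRecorder() as recorder:` followed by calls such as recorder.text() inside the block.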

v0.1.5

04 Oct 17:37

Bugfix for a problem with the detection of short speech right after a sentence detection. In this case, the voice activity detection of the new sentence overlaps with the transcription of the previous sentence, which causes problems because of Python's global interpreter lock. This only happens when the overlap occurs within the same process, so the upgrade to multiprocessing should solve the issue.

Initial Release

06 Sep 20:37
v0.1.3

updated versions in batch files of example app