Releases: bricky149/tlap-rs
A Minor Update
Silence is Golden (with hotfix)
This release fixes the known issue where reading back files for streaming could result in the thread doing the reading not finishing before the next call. Audio files are now split based on assumed silence, instead of every 64k samples (four seconds at 16 kHz). For now, this only applies to recorded audio, not real-time captures.
This also contains a hotfix for a regression where poor-quality recordings were incorrectly marked as 'done' with no subtitles written, because the detected silence was not 'true' silence.
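As a rough illustration of the idea (not tlap's actual code), silence-based splitting could look something like the sketch below. The threshold and minimum run length are made-up values, and real recordings would need tuning:

```rust
/// Hypothetical values, not tlap's: amplitude below which a sample is
/// assumed to be silence, and how long a quiet run must be to split on.
const SILENCE_THRESHOLD: u16 = 200;
const MIN_SILENCE_SAMPLES: usize = 8_000; // half a second at 16 kHz

fn split_on_silence(samples: &[i16]) -> Vec<Vec<i16>> {
    let mut lines = Vec::new();
    let mut current = Vec::new();
    let mut quiet_run = 0;

    for &sample in samples {
        current.push(sample);
        // unsigned_abs() avoids the overflow abs() would hit on i16::MIN
        if sample.unsigned_abs() <= SILENCE_THRESHOLD {
            quiet_run += 1;
            // A long enough quiet run is assumed to be a gap between lines
            if quiet_run >= MIN_SILENCE_SAMPLES {
                lines.push(std::mem::take(&mut current));
                quiet_run = 0;
            }
        } else {
            quiet_run = 0;
        }
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```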
TL;DR
- Refactor streaming code to read only data written since the last read (see the sketch after this list)
- Split audio lines where we think there is silence, rather than every 64k samples
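A minimal sketch of that incremental read, using only the standard library; the polling interval and offset bookkeeping here are assumptions rather than tlap's actual implementation:

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};
use std::thread;
use std::time::Duration;

/// Hypothetical sketch: poll a file being written to elsewhere and
/// hand back only the bytes appended since the last read.
fn stream_new_data(path: &str) -> std::io::Result<()> {
    let mut file = File::open(path)?;
    let mut offset = 0u64;

    loop {
        // Resume from wherever the previous read stopped
        file.seek(SeekFrom::Start(offset))?;
        let mut new_bytes = Vec::new();
        let read = file.read_to_end(&mut new_bytes)? as u64;
        offset += read;

        if read > 0 {
            // Transcribe only the newly written samples here
        }
        thread::sleep(Duration::from_secs(4));
    }
}
```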
Strong and Stable
This release focuses on clearing up unwrap() calls and handling Err() cases. Despite additional code (and nesting), the resulting binary is smaller, thanks to refactoring the subtitle-writing calls into a single function and changing how timestamps are generated from durations.
There may be a future release that standardises the code style to match idiomatic Rust patterns. It depends on whether I would rather spend that time on other features I would appreciate, like punctuation and detecting silence instead of splitting a file's samples into four-second lines.
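For instance, generating SRT-style timestamps from a Duration fits in one small function. This is an illustrative sketch, not necessarily how tlap does it, with a small test of the kind mentioned below:

```rust
use std::time::Duration;

/// Illustrative helper: format a Duration as an SRT timestamp (HH:MM:SS,mmm).
fn srt_timestamp(t: Duration) -> String {
    let ms = t.as_millis();
    format!(
        "{:02}:{:02}:{:02},{:03}",
        ms / 3_600_000,
        (ms / 60_000) % 60,
        (ms / 1_000) % 60,
        ms % 1_000
    )
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn timestamp_format() {
        assert_eq!(srt_timestamp(Duration::from_millis(4_000)), "00:00:04,000");
    }
}
```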
TL;DR
- Add a test for subtitle output
- Change Options to Results as all data is needed during execution
- Reduce chance of panicking by covering every Err() case (see the sketch after this list)
- Miscellaneous SemVer-breaking changes, despite no new features
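As promised above, a sketch of the pattern; this is hypothetical code rather than tlap's, but it shows the shape of the change: subtitle writing funnels through one fallible function and callers handle the Err() case instead of unwrapping.

```rust
use std::fs::File;
use std::io::Write;

/// Hypothetical single entry point for subtitle writing, returning a
/// Result so callers can recover instead of panicking.
fn write_subtitle(path: &str, entry: &str) -> std::io::Result<()> {
    let mut file = File::options().create(true).append(true).open(path)?;
    file.write_all(entry.as_bytes())
}

fn main() {
    // Handle the Err() case rather than calling unwrap()
    if let Err(e) = write_subtitle("out.srt", "1\n00:00:00,000 --> 00:00:04,000\nHello\n\n") {
        eprintln!("Could not write subtitle: {e}");
    }
}
```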
Known issues
- Given a long enough file, the thread that reads input back might run longer than the sleep period
The (inevitable) Coqui Rewrite
When this program was first written in December 2021, DeepSpeech was the only library I could use as it had Rust bindings. In February 2022, Rust bindings appeared for Coqui. It only seemed right to switch to it.
As I was doing that, I found I could no longer feed an input stream into Coqui's intermediate_decode() due to an API change somewhere in either the library itself or its bindings. Instead, I opted to write the input to a file first and then read it back on another thread for Coqui's speech_to_text() to do its magic. This removed the eager looping code (thus saving CPU cycles) and the latency I was getting before. I had to switch from PortAudio to cpal to make that happen, a positive side effect being that this feature may now work on non-Linux platforms.
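For context, the capture side with cpal might look roughly like the following. This is a sketch against cpal 0.15's API that assumes the device supports i16 input, with the file handoff simplified; real code would avoid blocking I/O inside the audio callback:

```rust
use std::fs::File;
use std::io::Write;

use cpal::traits::{DeviceTrait, HostTrait, StreamTrait};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let host = cpal::default_host();
    let device = host.default_input_device().ok_or("no input device")?;
    let config = device.default_input_config()?;

    let mut file = File::create("capture.raw")?;
    let stream = device.build_input_stream(
        &config.into(),
        move |data: &[i16], _: &cpal::InputCallbackInfo| {
            // Simplified: dump raw little-endian samples straight to disk
            for &sample in data {
                let _ = file.write_all(&sample.to_le_bytes());
            }
        },
        |err| eprintln!("stream error: {err}"),
        None, // no timeout
    )?;
    stream.play()?;
    std::thread::sleep(std::time::Duration::from_secs(10));
    Ok(())
}
```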
I cut out code that wasn't part of the core functionality. For example, input would be resampled if it didn't match the 16kHz rate the model was using. ffmpeg does a better job of this, so I removed the related code and crates, cutting dependencies by around half. I then split the code into 'speech' and 'subtitle' domains, which forced me to refactor it somewhat.
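If you need the resampling that was removed, something like `ffmpeg -i input.mkv -ar 16000 -ac 1 output.wav` (an illustrative invocation, not a documented tlap workflow) should produce the 16 kHz mono audio the model expects.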
Unfortunately, Coqui doesn't link their library to a CUDA runtime. This means it'll only run on the CPU. Every CPU cycle that can be saved counts!
TL;DR
- Major codebase rewrite
- Migrated from deprecated DeepSpeech dependencies to coqui-stt
- Migrated from PortAudio to cpal, allowing for cross-platform feature parity
- Removed resampling functionality, cutting external crates almost in half
- Separated speech-related and subtitle-related code into their own files
- Reworked sub streaming so as not to pin CPU usage at 100%
- Threaded sub streaming to reduce transcription latency
- As Coqui does not offer CUDA binaries, CUDA support has been removed
Known issues
- Given a long enough file, the thread that reads input back might run longer than the sleep period