This CLI app transcribes audio and video for submission to the bitcointranscripts repo.
Available transcription models and services
- (local) Whisper
--model xxx [default: tiny.en]
- (remote) Deepgram (whisper-large)
--deepgram [default: False]
- summarization
--summarize
- diarization
--diarize
Transcription Workflow
This transcription tool operates through a structured four-stage process:
- Preprocess: Gathers all the available metadata for each source (supports YouTube videos and playlists, and RSS feeds)
- Process: Downloads and converts sources for transcription preparation
- Transcription: Utilizes openai-whisper or Deepgram to generate transcripts.
  - Converts audio to text.
  - Save as JSON: Preserves the output of the transcription service for future use.
  - Save as SRT: Generates SRT file [whisper only]
  - Summarize: Generates a summary of the transcript. [deepgram only]
  - Upload: Saves transcription service output in an AWS S3 Bucket [optional]
  - Finalizes the resulting transcript.
  - Process diarization. [deepgram only]
  - Process chapters.
- Postprocess: Offers multiple options for further actions:
  - Push to GitHub: Push transcripts to your fork of the bitcointranscripts repo.
  - Markdown: Saves transcripts in a markdown format supported by bitcointranscripts.
  - Upload: Saves transcripts in an AWS S3 Bucket.
  - Push to Queuer backend: Sends transcripts to a Queuer backend.
  - Save as JSON: Preserves transcripts for future use.
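The service-dependent sub-steps of the Transcription stage can be summarized in a few lines of Python. This is only an illustration of the list above, not the tool's actual code: the function transcription_steps, its parameters, and the step labels are hypothetical names chosen for this sketch.

```python
def transcription_steps(service, summarize=False, upload=False, diarize=False):
    """Return the Transcription sub-steps that apply, in the documented order.

    Hypothetical helper: it mirrors the workflow list and is not part of tstbtc.
    """
    if service not in ("whisper", "deepgram"):
        raise ValueError(f"unknown service: {service!r}")

    steps = ["convert audio to text", "save as JSON"]
    if service == "whisper":
        steps.append("save as SRT")          # SRT output is whisper only
    if service == "deepgram" and summarize:
        steps.append("summarize")            # summaries are deepgram only
    if upload:
        steps.append("upload to S3")         # optional
    steps.append("finalize transcript")
    if service == "deepgram" and diarize:
        steps.append("process diarization")  # diarization is deepgram only
    steps.append("process chapters")
    return steps
```

Calling transcription_steps("whisper", summarize=True) returns a list without a summarize step, which reflects the "deepgram only" note in the workflow.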
- To use deepgram as a transcription service, you must have a valid DEEPGRAM_API_KEY in the .env file.
- To push the resulting transcript to GitHub, you need to fork bitcointranscripts, clone your fork, and define your BITCOINTRANSCRIPTS_DIR in the .env file.
- To push the resulting transcript to a Queuer backend, you must have a valid QUEUE_ENDPOINT in the .env file. If not, you can instead save the payload in a json file using the --noqueue flag.
- To enable pushing the models to an S3 bucket,
- To be able to convert the intermediary media files to mp3, install FFmpeg
  - for macOS users, run brew install ffmpeg
  - for other users, follow the instructions on their site to install
- To use a specific configuration profile, set the PROFILE variable in your .env file.
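As a quick pre-flight check, the following sketch reports which of the variables named above are still unset for the features you plan to use. It assumes the .env values have already been loaded into the process environment; the helper missing_settings and the feature keys ("deepgram", "github", "queue") are hypothetical and not part of tstbtc.

```python
import os

# Variable names come from the prerequisites above; the feature keys are
# hypothetical labels used only in this sketch.
REQUIRED_BY_FEATURE = {
    "deepgram": "DEEPGRAM_API_KEY",
    "github": "BITCOINTRANSCRIPTS_DIR",
    "queue": "QUEUE_ENDPOINT",
}


def missing_settings(features, env=None):
    """Return the variable names that the requested features still need."""
    env = os.environ if env is None else env
    return [
        REQUIRED_BY_FEATURE[name]
        for name in features
        if not env.get(REQUIRED_BY_FEATURE[name])  # unset or empty counts as missing
    ]
```

For example, missing_settings(["deepgram", "queue"], env={"DEEPGRAM_API_KEY": "example-key"}) reports that QUEUE_ENDPOINT is still missing, in which case the --noqueue flag is the documented alternative.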
This application supports configuration via a config.ini file.
This file allows you to set default values for various options and flags, reducing the need to specify them on the command line every time.
Additionally, the configuration file can include options not available through the command line, offering greater flexibility and control over the application's behavior.
An example configuration file named config.ini.example is included in the repository.
To use it, copy it to config.ini and modify it according to your needs:
cp config.ini.example config.ini
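To illustrate how defaults in an INI file behave, here is a minimal sketch using Python's standard configparser module. The section names, option names, and the idea of treating each section as a named profile are assumptions made for this example; config.ini.example in the repository remains the authoritative reference for the real keys.

```python
import configparser

# Hypothetical file contents, for illustration only.
SAMPLE = """
[DEFAULT]
model = tiny.en
deepgram = no

[podcasts]
deepgram = yes
diarize = yes
"""


def load_profile(text, profile):
    """Read options for one section, falling back to [DEFAULT] values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Unknown profile names fall back to the DEFAULT section.
    section = parser[profile] if parser.has_section(profile) else parser["DEFAULT"]
    return {
        "model": section.get("model"),
        "deepgram": section.getboolean("deepgram"),
        "diarize": section.getboolean("diarize", fallback=False),
    }
```

With this sample, the podcasts section inherits model from [DEFAULT] while overriding deepgram, which is the same idea as setting a default once instead of repeating a flag on every command.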
Navigate to the application directory and run the commands below:
python3 -m venv venv                   # creates a virtual environment
source venv/bin/activate               # activates the virtual environment
pip3 install .                         # installs the application
tstbtc --version                       # shows the application version
tstbtc --help                          # shows the application help
pip3 uninstall tstbtc                  # uninstalls the application
tstbtc transcribe {source_file/url}    # transcribes the given source
Supported sources:
- YouTube videos and playlists
- Local and remote audio files
- JSON files containing individual sources
Note:
- The https links need to be wrapped in quotes when running the command on zsh
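The quoting matters because characters such as ? in a URL are glob characters in zsh, so an unquoted link can fail before tstbtc ever sees it. When a command line is generated from a script, Python's standard shlex.quote adds quotes only where they are needed. The helper name transcribe_command and the full YouTube URL form in the usage note are assumptions for this sketch.

```python
import shlex


def transcribe_command(source):
    """Build a shell-safe command line for one source (hypothetical helper)."""
    # shlex.quote leaves simple tokens untouched and single-quotes the rest.
    return "tstbtc transcribe " + shlex.quote(source)
```

A bare video ID such as Nq6WxJ0PgJ4 is passed through unchanged, while a link like https://www.youtube.com/watch?v=Nq6WxJ0PgJ4 comes back wrapped in single quotes.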
To include optional metadata in your transcript, you can add the following parameters:
- --loc: Add the location in the bitcointranscripts hierarchy that you want to associate the transcript with [default: "misc"]
- -t or --title: Add the title for the resulting transcript (required for audio files)
- -d or --date: Add the event date to transcript's metadata in format 'yyyy-mm-dd'
- can be used multiple times:
  - -T or --tags: Add a tag to transcript's metadata
  - -s or --speakers: Add a speaker to the transcript's metadata
  - -c or --category: Add a category to the transcript's metadata
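When transcripts are queued from a script, the metadata flags above can be assembled into an argument list, repeating the flag once per value for tags, speakers, and categories. The sketch below only builds the list; the function build_transcribe_args is hypothetical, and the date check is a convenience of this example rather than a description of how tstbtc validates input.

```python
from datetime import datetime


def build_transcribe_args(source, loc="misc", title=None, date=None,
                          tags=(), speakers=(), category=()):
    """Assemble a tstbtc transcribe argument list (hypothetical helper)."""
    args = ["tstbtc", "transcribe", source, "--loc", loc]
    if title:
        args += ["--title", title]
    if date:
        # Raises ValueError if the date does not parse as year-month-day.
        datetime.strptime(date, "%Y-%m-%d")
        args += ["--date", date]
    # Repeatable flags: emit the flag once for every value.
    for flag, values in (("--tags", tags), ("--speakers", speakers),
                         ("--category", category)):
        for value in values:
            args += [flag, value]
    return args
```

The resulting list can be handed to subprocess.run as-is, which avoids shell quoting issues for titles that contain spaces or punctuation.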
To configure the transcription process, you can use the following flags:
- -m or --model: Select which whisper model to use for the transcription [default: tiny.en]
- -D or --deepgram: Use deepgram for transcription, instead of using the whisper model [default: False]
- -M or --diarize: Supply this flag if the recording has multiple speakers and you want to diarize the content [only available with deepgram]
- -S or --summarize: Summarize the transcript [only available with deepgram]
- --github: Specify the GitHub operation mode
- -u or --upload: Upload processed model files to AWS S3
- --markdown: Save the resulting transcript to a markdown format supported by bitcointranscripts
- --noqueue: Do not push the resulting transcript to the Queuer, instead store the payload in a json file
- --nocleanup: Do not remove temp files on exit
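The "only available with deepgram" notes above imply a dependency between flags. A wrapper script could check that dependency before invoking the tool, as in this sketch. The function check_flags is hypothetical, and whether tstbtc itself rejects or silently ignores such combinations is not stated here.

```python
def check_flags(deepgram=False, diarize=False, summarize=False):
    """Return a list of problems with the chosen flag combination."""
    problems = []
    if diarize and not deepgram:
        problems.append("--diarize is only available with --deepgram")
    if summarize and not deepgram:
        problems.append("--summarize is only available with --deepgram")
    return problems
```

An empty list means the combination is consistent with the flag descriptions; otherwise each entry names the flag that needs --deepgram.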
To transcribe this YouTube episode of Stephan Livera's podcast and add the associated metadata, run either of the commands below. The first uses short flags and the second uses long flags; the result is the same.
tstbtc transcribe Nq6WxJ0PgJ4 --loc "stephan-livera-podcast" -t 'OP_Vault - A New Way to HODL?' -d '2023-01-30' -T 'script' -T 'op_vault' -s 'James O’Beirne' -s 'Stephan Livera' -c 'podcast'
tstbtc transcribe Nq6WxJ0PgJ4 --loc "stephan-livera-podcast" --title 'OP_Vault - A New Way to HODL?' --date '2023-01-30' --tags 'script' --tags 'op_vault' --speakers 'James O’Beirne' --speakers 'Stephan Livera' --category 'podcast'
You can also transcribe a remote audio/mp3 link, such as the following from Stephan Livera's podcast:
mp3_link="https://anchor.fm/s/7d083a4/podcast/play/64348045/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2023-1-1%2Ff7fafb12-9441-7d85-d557-e9e5d18ab788.mp3"
tstbtc transcribe $mp3_link --loc "stephan-livera-podcast" --title 'SLP455 Anant Tapadia - Single Sig or Multi Sig?' --date '2023-02-01' --tags 'multisig' --speakers 'Anant Tapadia' --speakers 'Stephan Livera' --category 'podcast'
To run the unit tests
pytest -v -m main -s
To run the feature tests
pytest -v -m feature -s
To run the full test suite
pytest -v -s
Transcriber to Bitcoin Transcript is released under the terms of the MIT license. See LICENSE for more information or see https://opensource.org/licenses/MIT.