Transcription of videos, including YouTube videos, with Whisper (OpenAI), based on this tutorial.
This repository contains two scripts:
transcript_youtube: transcribes a YouTube video; you only provide the link and the language.
transcript_mp4: transcribes your own videos in mp4 format.
Install ffmpeg, yt-dlp, and Whisper:
pip install yt-dlp openai-whisper==20231106 openai
sudo apt install -y ffmpeg
For transcript_youtube, specify the link and the language (en, pt, es, ...):
sh transcript_youtube.sh <link_youtube> <language>
For instance:
sh transcript_youtube.sh https://www.youtube.com/watch?v=AJhkLwMvgrg pt
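For readers who want to see what the download step might look like, here is a minimal Python sketch. It is an assumption about how a script like this could fetch the video, not a copy of transcript_youtube.sh; the function name build_download_command and the default file name video.mp4 are hypothetical.

```python
from __future__ import annotations

import subprocess


def build_download_command(url: str, output_path: str = "video.mp4") -> list[str]:
    """Build a yt-dlp command line that saves the video at `url` to `output_path`.

    Assumption: the video is wanted as an mp4 file; `output_path` is a
    hypothetical default, not a name taken from this repository.
    """
    return [
        "yt-dlp",
        "-f", "mp4",        # select a format with the mp4 extension
        "-o", output_path,  # output file name template
        url,
    ]


def download(url: str, output_path: str = "video.mp4") -> None:
    """Run yt-dlp; raises CalledProcessError if the download fails."""
    subprocess.run(build_download_command(url, output_path), check=True)
```

Because the URL is passed as a single list element rather than through a shell, characters such as ? and & in the link need no quoting.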
For transcript_mp4, specify the path to your file and the language (en, pt, es, ...):
sh transcript_mp4.sh <your_file.mp4> <language>
The output is a file named after the input file, followed by _srt.mp4. It contains the subtitles extracted with Whisper, aligned with the speech.
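As an illustration of how time-aligned subtitles can end up in an _srt.mp4 file, the sketch below formats Whisper-style segments as SRT text and builds an ffmpeg command that renders them onto the video. Several things here are assumptions: that the subtitles are burned into the picture (rather than added as a separate track), that talk.mp4 becomes talk_srt.mp4, and all function names, which are hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp format HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Turn segments with 'start', 'end' and 'text' keys into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


def subtitled_output_path(input_path: str) -> str:
    """Assumed naming rule: talk.mp4 -> talk_srt.mp4."""
    p = Path(input_path)
    return str(p.with_name(p.stem + "_srt.mp4"))


def build_burn_command(input_path: str, srt_path: str) -> list[str]:
    """Build an ffmpeg command that draws the subtitles onto the video.

    Uses ffmpeg's `subtitles` video filter; paths containing characters
    that are special to ffmpeg filters would need extra escaping.
    """
    return [
        "ffmpeg",
        "-i", input_path,
        "-vf", f"subtitles={srt_path}",
        "-c:a", "copy",  # keep the original audio stream unchanged
        subtitled_output_path(input_path),
    ]
```

For example, build_burn_command("talk.mp4", "talk.srt") yields a command whose last argument is talk_srt.mp4; it can be run with subprocess.run(..., check=True).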
You can choose the Whisper model in the script transcript.py. The medium model is set by default, but you can also choose small or large.
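A minimal sketch of how the model choice could look in Python, using the openai-whisper package installed above. This is not the actual content of transcript.py; the names SUPPORTED_MODELS, DEFAULT_MODEL, pick_model and transcribe are hypothetical, and the list of models is limited to the three mentioned here.

```python
from __future__ import annotations

# Model names mentioned in this README; "medium" is the stated default.
SUPPORTED_MODELS = ("small", "medium", "large")
DEFAULT_MODEL = "medium"


def pick_model(name: str | None = None) -> str:
    """Return a validated model name, falling back to the default."""
    name = name or DEFAULT_MODEL
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; expected one of {SUPPORTED_MODELS}"
        )
    return name


def transcribe(video_path: str, language: str, model_name: str | None = None) -> dict:
    """Transcribe a video with the chosen Whisper model.

    The result is a dict that includes the full "text" and a list of
    timed "segments".
    """
    import whisper  # imported here so pick_model works without Whisper installed

    model = whisper.load_model(pick_model(model_name))
    return model.transcribe(video_path, language=language)
```

Smaller models generally run faster and use less memory, while larger ones tend to be more accurate, so small is a reasonable choice for a quick first pass.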