Ara (think parrot 🦜) is a script and API to transcribe and diarize audio. It uses Whisper and Pyannote.

EdoardoPona/Ara

Ara 🦜

Overview

Ara is a script and API to transcribe ✍️ and diarize 📓 audio. The typical use case is transcribing interviews, podcasts, and any other recording in which several people speak. The output is 'easy' to read (if you like .txt files): each segment is labelled with its speaker.

It uses Whisper to transcribe the audio into text. It then uses Pyannote to diarize the audio, that is, to work out which speaker is talking when. Finally, it matches the segments from the two models and either writes the output to a file or returns it through the API.
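The matching step can be illustrated with a minimal sketch. This is not Ara's actual implementation: it assumes the transcription yields timed text segments and the diarization yields timed speaker turns, and it assigns each text segment to the speaker whose turn overlaps it the most. The names `TextSegment`, `SpeakerTurn`, `assign_speakers` and `format_transcript` are hypothetical, as is the output layout.

```python
from dataclasses import dataclass


@dataclass
class TextSegment:
    """A transcribed chunk of speech (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class SpeakerTurn:
    """A diarized speaker turn (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Pair each text segment with the speaker whose turn overlaps it most."""
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labelled.append((best_speaker, seg))
    return labelled


def format_transcript(labelled):
    """Render one line per segment: time range, speaker, text."""
    return "\n".join(
        f"[{seg.start:.1f}s - {seg.end:.1f}s] {speaker}: {seg.text.strip()}"
        for speaker, seg in labelled
    )


segments = [TextSegment(0.0, 2.0, " Hello there."), TextSegment(2.0, 5.0, " Hi, thanks.")]
turns = [SpeakerTurn(0.0, 2.2, "SPEAKER_00"), SpeakerTurn(2.2, 5.0, "SPEAKER_01")]
print(format_transcript(assign_speakers(segments, turns)))
```

Choosing the turn with the largest overlap is one simple policy; segments that fall in a gap between turns keep the `UNKNOWN` label rather than being guessed.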

Usage

Script

Call the script like so:

python script.py -i input.wav -o output.txt -l English 
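For orientation, the command-line flags above could be parsed roughly as follows. This is a sketch, not the contents of script.py: only the short flags `-i`, `-o` and `-l` come from the command shown, while the long option names, the help texts and the `English` default are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the example command."""
    parser = argparse.ArgumentParser(description="Transcribe and diarize an audio file.")
    # Long option names below are hypothetical; the README only shows -i, -o, -l.
    parser.add_argument("-i", "--input", required=True, help="path to the input audio file")
    parser.add_argument("-o", "--output", required=True, help="path of the .txt file to write")
    parser.add_argument("-l", "--language", default="English", help="spoken language (assumed default)")
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(["-i", "input.wav", "-o", "output.txt", "-l", "English"])
print(args.input, args.output, args.language)
```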

FastAPI

main.py defines a basic FastAPI app with an endpoint for transcription. Start the server:

uvicorn main:app --reload 

Then query the endpoint:

curl 127.0.0.1:8000/transcribe/sample_data.interview.wav

The API is useful for interacting with Ara through Docker, or for deploying the code.
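The same request can be made from Python using only the standard library. The sketch below assumes the server is running locally on port 8000 as in the curl example, and that the endpoint answers a GET with a text body; the response format is not documented here, so the body is returned undecoded beyond UTF-8. The helper names `transcribe_url` and `fetch_transcript` are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen


def transcribe_url(audio_path, base="http://127.0.0.1:8000"):
    """Build the /transcribe/ URL for a given audio path."""
    return f"{base}/transcribe/{quote(audio_path)}"


def fetch_transcript(audio_path, base="http://127.0.0.1:8000", timeout=600):
    """GET the endpoint and return the raw response body as text.

    Transcription can be slow, so the timeout is generous (an assumption).
    """
    with urlopen(transcribe_url(audio_path, base), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Same path as in the curl example; call fetch_transcript(...) once the server is up.
print(transcribe_url("sample_data.interview.wav"))
```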

The repo comes with a Dockerfile, which makes it easier to deploy Ara in a container. Build the image, then run it like so:

sudo docker run -p 80:80 --gpus all <IMAGE NAME>
