Run OpenAI Whisper as a Replicate Cog on Fly.io!
This app exposes the Whisper model via a simple HTTP server, thanks to Replicate Cog. Cog is an open-source tool that lets you package machine learning models in a standard, production-ready container. Once you're up and running, you can transcribe audio using the /predictions endpoint.
Create and deploy the app with a single command:
fly launch --from https://github.com/fly-apps/cog-whisper --no-public-ips
Assign a Flycast IP to the app:
fly ips allocate-v6 --private
That's it! You can now access the app at http://<APP_NAME>.flycast/predictions
Important
By default, the app runs on Fly GPUs, Nvidia L40S to be exact. This can be customized in the fly.toml vm settings. The app will also run on a standard Fly Machine, but with reduced performance.
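As a sketch of what that customization might look like, here is a minimal vm section for fly.toml (the exact keys and preset names depend on your fly.toml version and region availability; check the Fly.io docs before relying on this):

```toml
# Hypothetical fly.toml fragment: request an L40S GPU Machine.
# Drop or change "size" to fall back to a standard (CPU-only) Machine.
[[vm]]
  size = "l40s"
```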
For example, to transcribe an audio file:
curl -X PUT \
-H "Content-Type: application/json" \
-d '{
"input": {
"audio": "https://fly.storage.tigris.dev/cogs/bun_on_fly.mp3"
}
}' \
http://cog-whisper.flycast/predictions/test | jq
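The same request can be made from code. Here is a minimal Python sketch using only the standard library; it reuses the app name, prediction ID, and audio URL from the curl example above, and assumes Cog's standard prediction response JSON with an `output` field:

```python
import json
from urllib import request

# Same app, prediction ID, and audio URL as the curl example above.
url = "http://cog-whisper.flycast/predictions/test"
payload = {"input": {"audio": "https://fly.storage.tigris.dev/cogs/bun_on_fly.mp3"}}

req = request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# The .flycast address only resolves inside your Fly private network,
# so the actual call is left commented out here:
# with request.urlopen(req) as resp:
#     prediction = json.load(resp)
#     print(prediction["output"])  # the transcription text
```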
To develop locally:

- Clone the cog-whisper repository from GitHub: git clone git@github.com:fly-apps/cog-whisper.git
- Navigate into the cloned directory: cd cog-whisper
- Run locally. First, run ./scripts/get_weights.sh from the project root to download pre-trained weights, then build a container and run a prediction: cog predict -i audio="<path/to/your/audio/file>"
- Build the Docker image using cog: cog build -t whisper
Create an issue or ask a question here: https://community.fly.io/