This is an audio transcription app built with Flask and Python. It lets users upload audio files, transcribe them with a locally run OpenAI Whisper ASR model, and view the transcription status of each file.
- Upload audio files in various formats.
- Transcribe uploaded audio files using a locally run OpenAI Whisper ASR model.
- View the transcription status of each audio file.
- Python 3.7 or higher
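The local Whisper transcription listed above can be sketched as a single function. This is a minimal sketch, not the project's actual code: it assumes the `openai-whisper` package is what provides the model, and the function name `transcribe_file` and the default model size `"base"` are illustrative choices. A model object can be passed in so the function can be tried without downloading model weights.

```python
from typing import Any, Optional


def transcribe_file(path: str, model: Optional[Any] = None, model_name: str = "base") -> str:
    """Return the transcribed text for one audio file.

    `model` is any object with a Whisper-style `transcribe(path)` method that
    returns a dict containing a "text" key. When it is omitted, a local Whisper
    model is loaded (assumption: the `openai-whisper` package is installed).
    """
    if model is None:
        import whisper  # imported lazily so this module loads without the package

        model = whisper.load_model(model_name)
    result = model.transcribe(path)
    # Whisper-style results carry the full transcript under "text".
    return result["text"].strip()
```

Loading the model is the slow part, so a long-running app would typically load it once and reuse it across files rather than calling `whisper.load_model` per request.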
- Clone the repository:
  git clone https://github.com/lejacobroy/stenowhisper.git
- Navigate to the project directory:
  cd stenowhisper
- Install the required dependencies:
  pip install -r requirements.txt
- Start the Flask development server:
  python main.py
- Open your web browser and go to http://localhost:5000 to access the app.
- Upload an audio file by clicking on the "Upload" button and selecting the file from your local machine.
- Once the file is uploaded, you can view the transcription status by clicking on the "Status" button.
- To transcribe the uploaded audio file, click on the "Transcribe" button.
- To stop the transcription process, click on the "Stop Transcription" button.
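The upload, status, transcribe, and stop actions above could be backed by something like the following. It is a rough sketch under stated assumptions, not the app's real implementation: the class name `TranscriptionQueue`, the status strings, and the choice to honor a stop request between files (rather than interrupting a file mid-transcription) are all hypothetical. The transcription function is injected so the sketch runs without Whisper.

```python
import threading
from typing import Callable, Dict, Optional


class TranscriptionQueue:
    """Hypothetical helper: transcribes uploaded files in a background thread."""

    def __init__(self, transcribe: Callable[[str], str]) -> None:
        self._transcribe = transcribe
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._thread: Optional[threading.Thread] = None
        self.status: Dict[str, str] = {}   # filename -> status string
        self.results: Dict[str, str] = {}  # filename -> transcript

    def upload(self, filename: str) -> None:
        """Register a file; it waits in the 'uploaded' state."""
        self._set(filename, "uploaded")

    def start(self) -> None:
        """Begin transcribing all pending files in the background."""
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        """Request a stop; it takes effect before the next file starts."""
        self._stop.set()

    def wait(self, timeout: Optional[float] = None) -> None:
        """Block until the background thread finishes (or the timeout expires)."""
        if self._thread is not None:
            self._thread.join(timeout)

    def _set(self, filename: str, status: str) -> None:
        with self._lock:
            self.status[filename] = status

    def _run(self) -> None:
        with self._lock:
            pending = [n for n, s in self.status.items() if s == "uploaded"]
        for name in pending:
            if self._stop.is_set():
                break  # remaining files stay in the 'uploaded' state
            self._set(name, "transcribing")
            try:
                text = self._transcribe(name)
            except Exception:
                self._set(name, "failed")
                continue
            with self._lock:
                self.results[name] = text
                self.status[name] = "done"
```

A "Status" view would then only need to read `status`, while "Transcribe" and "Stop Transcription" map to `start()` and `stop()`.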
The project has the following folder structure:
audio-transcription-app/
├── app.py
├── audio_file.py
├── database.py
├── main.py
├── models.py
├── transcriber.py
├── templates/
│   └── index.html
└── static/
    └── styles.css
- app.py: Contains the Flask application routes and logic.
- audio_file.py: Defines the AudioFile data class for storing audio file information.
- database.py: Handles the SQLite database operations for storing and retrieving audio files.
- main.py: Entry point of the application.
- models.py: Contains the AudioLibrary and TranscriptionView classes.
- transcriber.py: Implements the Transcriber class for transcribing audio files using the OpenAI Whisper ASR local model.
- templates/: Directory containing HTML templates for the Flask views.
- static/: Directory containing static files such as CSS stylesheets.
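To make the roles of audio_file.py and database.py more concrete, here is one way a data class and its SQLite storage might fit together. Only the name AudioFile comes from the description above; its fields, the `AudioDatabase` class, the table layout, and the method names are assumptions made for illustration and may differ from the actual modules.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioFile:
    """Illustrative fields only; the real AudioFile class may differ."""
    filename: str
    status: str = "uploaded"
    transcript: Optional[str] = None


class AudioDatabase:
    """Hypothetical wrapper around a SQLite table of audio files."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS audio_files ("
            "filename TEXT PRIMARY KEY, status TEXT NOT NULL, transcript TEXT)"
        )
        self.conn.commit()

    def save(self, audio: AudioFile) -> None:
        """Insert a new record or overwrite the one with the same filename."""
        self.conn.execute(
            "INSERT OR REPLACE INTO audio_files (filename, status, transcript) "
            "VALUES (?, ?, ?)",
            (audio.filename, audio.status, audio.transcript),
        )
        self.conn.commit()

    def get(self, filename: str) -> Optional[AudioFile]:
        """Return the stored record, or None if the filename is unknown."""
        row = self.conn.execute(
            "SELECT filename, status, transcript FROM audio_files WHERE filename = ?",
            (filename,),
        ).fetchone()
        return AudioFile(*row) if row else None

    def all(self) -> List[AudioFile]:
        """Return every stored record, ordered by filename."""
        rows = self.conn.execute(
            "SELECT filename, status, transcript FROM audio_files ORDER BY filename"
        ).fetchall()
        return [AudioFile(*r) for r in rows]
```

Passing a file path instead of the default `":memory:"` would persist records across restarts; the in-memory default just keeps the sketch easy to try.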
Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for more information.