Real-time speech-to-text transcription using OpenAI Whisper


ajxv/rtstt


Real-Time Speech to Text (Using OpenAI Whisper)

Requirements

  • Python 3.1x

Setting Up

  • Install requirements using pip install -r requirements.txt

Running the Application

  • Run the Flask app using python3 app.py

Selecting the Appropriate Model

Whisper offers several models that balance speed and accuracy:

  • tiny: Fastest but least accurate
  • base: A balance between speed and accuracy
  • small: More accurate, slower than base
  • medium: Even more accurate, slower than small
  • large: Most accurate but slowest

Select a model by passing its name to whisper.load_model. For example:

self.model = whisper.load_model("medium")
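
To switch models without editing the source, one option is to read the model name from configuration and validate it before loading. The sketch below is only an illustration, not the way app.py is actually written: the WHISPER_MODEL environment variable and the resolve_model_name and load_model helpers are hypothetical names, and the allowed list covers only the five models named above. The only Whisper call used is whisper.load_model.

```python
import os

# Model names listed above, ordered from fastest to most accurate.
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def resolve_model_name(default="base"):
    """Return a validated model name.

    Reads the hypothetical WHISPER_MODEL environment variable and falls
    back to `default` when it is not set.
    """
    name = os.environ.get("WHISPER_MODEL", default).strip().lower()
    if name not in WHISPER_MODELS:
        raise ValueError(
            f"Unknown Whisper model {name!r}; expected one of {WHISPER_MODELS}"
        )
    return name


def load_model(name=None):
    """Load a Whisper model by name (hypothetical helper).

    whisper is imported lazily so resolve_model_name() can be used and
    tested without the openai-whisper package installed.
    """
    import whisper  # provided by the openai-whisper package

    return whisper.load_model(name or resolve_model_name())
```

With this in place, the line above could become self.model = load_model(), and the app could be started with, for example, WHISPER_MODEL=small python3 app.py. Validating the name first means a typo fails immediately with a readable error.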

Demo

(Demo image: sample)

To-Dos

  • Improve accuracy of transcription
  • Add support for multiple languages
  • Optimize performance for low-latency environments
  • Implement speaker recognition
  • Webhook - Create separate sessions(?) for each connected client

Contribution Guidelines

Contributions are welcome! Please follow these steps to contribute:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add new feature').
  5. Push to the branch (git push origin feature-branch).
  6. Create a pull request.

License

This project is licensed under the MIT License.
