This project implements a speech-to-text assistant named David that interacts with OpenAI's GPT-3.5-turbo model. The assistant can recognize speech, generate responses, capture the screen, and extract text from images.
- Recognizes speech using Google's speech recognition API.
- Generates text responses using OpenAI's GPT-3.5-turbo model.
- Captures the screen and extracts text using OCR.
- Converts text to speech using OpenAI's text-to-speech API.
- Python 3.6+
- OpenAI API Key
- Google Cloud Speech-to-Text API (for speech recognition)
- Tesseract OCR
-
Clone the repository:
git clone https://github.com/yourusername/david-assistant.git cd david-assistant
-
Create and activate a virtual environment:
python -m venv venv source venv/bin/activate # On Windows, use `venv\Scripts\activate`
-
Install the required packages:
pip install -r requirements.txt
-
Install Tesseract OCR:
- On macOS: brew install tesseract
- On Ubuntu: sudo apt-get install tesseract-ocr
- On Windows: Download and install from the official site.
-
Create an .env file and add your OpenAI API key:
touch .env
Plain text API_KEY=<your_openai_api_key>
-
Edit the chatbot.txt file:
Hello, my name is <your_name>. You are my assistant, <assistant_name>.