A smart assistant for the visually impaired, designed to make their lives easier.
The goal is to guide visually impaired users via audio: the assistant listens to their questions and queries and speaks back a response.
YouTube Link: https://youtu.be/YNT_FSTY52A
- DroidCam for streaming video from phone to the system
- OpenCV for capturing and processing real-time video
- OpenAI Whisper for converting speech to text
- OpenFlamingo (LLaMA 7B + CLIP ViT-L/14) vision-language model
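The components above can be wired into a simple capture–transcribe–answer loop. The sketch below shows only that orchestration, with the heavy backends (OpenCV capture, Whisper, OpenFlamingo, a text-to-speech engine) injected as plain callables; every function and parameter name here is illustrative, not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Assistant:
    """Hypothetical orchestration of the LookoutX pipeline.

    Each stage is injected as a callable so the real backends
    (e.g. cv2.VideoCapture on the DroidCam stream, Whisper for
    speech-to-text, OpenFlamingo for visual question answering)
    can be swapped in without changing the loop itself.
    """
    capture_frame: Callable[[], object]                   # grabs a frame from the video stream
    speech_to_text: Callable[[bytes], str]                # transcribes recorded audio to text
    vision_language_model: Callable[[object, str], str]   # (frame, question) -> answer
    text_to_speech: Callable[[str], None]                 # speaks the answer aloud

    def handle_query(self, audio: bytes) -> str:
        question = self.speech_to_text(audio)
        frame = self.capture_frame()
        answer = self.vision_language_model(frame, question)
        self.text_to_speech(answer)
        return answer
```

With stub callables in place of the real models, the loop can be exercised without any hardware or model weights, which is also how the control flow can be unit-tested.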
These instructions assume a working installation of Anaconda.
```shell
git clone git@github.com:shreayan98c/LookoutX.git
cd LookoutX
conda env create -f environment.yml
```
Depending on your desired configuration, you may need to install PyTorch separately: follow the instructions on the PyTorch website to install it into an empty conda environment, then install the remaining packages with:
```shell
conda activate lookoutx
pip install -r requirements.txt
pip install git+https://github.com/zphang/transformers.git@68d640f7c368bcaaaecfc678f11908ebbd3d6176
pip install -e .
```
These manual installation steps are only necessary if the installation from environment.yml fails.
```shell
python main.py train
```
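The `train` argument above suggests a subcommand-style command-line entry point. A minimal sketch of what the argument handling in `main.py` might look like, using `argparse`; the repository's actual CLI may differ, and any subcommand other than `train` is purely illustrative:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI matching the `python main.py train` invocation.
    parser = argparse.ArgumentParser(prog="main.py", description="LookoutX assistant")
    subcommands = parser.add_subparsers(dest="command", required=True)
    subcommands.add_parser("train", help="train the model")
    subcommands.add_parser("run", help="start the live assistant loop (illustrative)")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"dispatching subcommand: {args.command}")
```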
This project is licensed under the terms of the MIT license.