This repository contains the code used to build an Indian Sign Language translator satisfying the following criteria:
- Near-real-time operation
- Background independence
- Illumination independence
We achieve these goals through the following features:
- Face Detection: used as the activation mechanism. We use Haar cascade models from the OpenCV library to detect faces in the frame. When a face is detected, the program checks the following frames, and once a threshold number of consecutive frames containing a face is reached, the sign language system is triggered (sketched after this list).
- Hand Detection: the first preprocessing step. Every frame selected from the clip is passed through a pre-trained YOLO-v3 network that attempts to find hands. If any hands are found, an extended bounding box is created around them, the frame is cropped to the contents of the box, and the crop is passed on to the next preprocessing step, resizing. If no hands are found, the frame is discarded entirely (a detector sketch follows this list).
- Skin Segmentation: after cropping and resizing, the images are passed through a combination of HSV (hue, saturation, value) and YCbCr (luminance, chrominance) filters that segment out skin and remove background noise from the cropped input (sketched after this list).
- Sign Recognition: the processed input is passed through a SqueezeNet model trained via transfer learning on a synthesized and cleaned Indian Sign Language dataset of 10 classes, with roughly 2,300 images per class (an inference sketch follows this list).
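A minimal sketch of the face-detection trigger, assuming OpenCV's bundled frontal-face Haar cascade and an illustrative threshold of 10 consecutive frames (the actual cascade and threshold in `main.py` may differ):

```python
import cv2

# Load OpenCV's bundled frontal-face Haar cascade.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

TRIGGER_THRESHOLD = 10  # illustrative: consecutive face-frames needed to activate
consecutive_faces = 0

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)

    # Count consecutive frames containing a face; reset on a miss.
    consecutive_faces = consecutive_faces + 1 if len(faces) > 0 else 0
    if consecutive_faces >= TRIGGER_THRESHOLD:
        print("Face held long enough - activating sign recognition")
        break

cap.release()
```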
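Since ImageAI is a listed dependency, the hand detector can be sketched with its `CustomObjectDetection` API; the weight and configuration file names, and the padding margin, are assumptions:

```python
from imageai.Detection.Custom import CustomObjectDetection
import cv2

detector = CustomObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath("hand-yolo.h5")          # assumed weights file name
detector.setJsonPath("detection_config.json")  # assumed config file name
detector.loadModel()

def crop_hands(frame_path, margin=20):
    """Return crops of every detected hand, extended by `margin` pixels."""
    frame = cv2.imread(frame_path)
    detections = detector.detectObjectsFromImage(
        input_image=frame_path, output_image_path="annotated.jpg"
    )
    crops = []
    h, w = frame.shape[:2]
    for det in detections:
        x1, y1, x2, y2 = det["box_points"]
        # Extend the bounding box, clamped to the frame borders.
        x1, y1 = max(0, x1 - margin), max(0, y1 - margin)
        x2, y2 = min(w, x2 + margin), min(h, y2 + margin)
        crops.append(frame[y1:y2, x1:x2])
    return crops  # an empty list means the frame is discarded
```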
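A sketch of the combined HSV and YCbCr skin filter. Note that OpenCV orders the chrominance channels as YCrCb; the threshold values below are commonly used skin-tone ranges, not necessarily the ones in the repository's scripts:

```python
import cv2
import numpy as np

def segment_skin(bgr):
    """Keep pixels that look like skin in both HSV and YCrCb space."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)

    # Illustrative skin-tone ranges; the project's thresholds may differ.
    hsv_mask = cv2.inRange(hsv, (0, 48, 80), (20, 255, 255))
    ycrcb_mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))

    # A pixel must pass both filters; a small opening removes speckle noise.
    mask = cv2.bitwise_and(hsv_mask, ycrcb_mask)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    return cv2.bitwise_and(bgr, bgr, mask=mask)
```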
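Finally, inference on a segmented crop might look like the following; the model file name, the 227x227 input size (SqueezeNet's conventional resolution) and the placeholder class labels are assumptions:

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("squeezenet_isl.h5")  # assumed model file name
CLASSES = [str(i) for i in range(10)]    # placeholder labels for the 10 classes

def predict_sign(segmented_bgr):
    # SqueezeNet conventionally takes 227x227 inputs; adjust if the
    # repository's model was trained at a different resolution.
    img = cv2.resize(segmented_bgr, (227, 227)).astype("float32") / 255.0
    probs = model.predict(img[np.newaxis, ...])[0]
    return CLASSES[int(np.argmax(probs))], float(np.max(probs))
```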
The work is divided into the following folders:
The App section consists of the files required to run the standalone webcam implementation of the translator. It contains:
- The trained model
- The hand segmentation network
- Preprocessing scripts
- Main application (main.py)
The data generation section covers the scripts used for:
- Creating new data via modifications to brightness, clarity and picture quality (Synthesis); a sketch follows this list
- Cleaning noisy data generated in the previous step, using the YOLO-v3 hand detection network (Cleaning)
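As Pillow is a listed dependency, the synthesis step can be sketched with its `ImageEnhance` and `ImageFilter` modules; the enhancement factors below are illustrative, not the repository's actual settings:

```python
from PIL import Image, ImageEnhance, ImageFilter

def synthesize_variants(path):
    """Yield illustrative brightness / sharpness / quality variants of one image."""
    img = Image.open(path)
    for factor in (0.6, 1.4):                       # darker and brighter copies
        yield ImageEnhance.Brightness(img).enhance(factor)
    for factor in (0.5, 2.0):                       # softer and sharper copies
        yield ImageEnhance.Sharpness(img).enhance(factor)
    yield img.filter(ImageFilter.GaussianBlur(2))   # degraded picture quality
```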
The preprocessing section contains the scripts used to pre-process the input dataset, including image upscaling, skin segmentation and hand centralization. These tasks are performed before the image dataset enters the neural network.
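One plausible reading of hand centralization is padding each crop to a square so the hand sits in the centre, then upscaling; a sketch under that assumption:

```python
import cv2
import numpy as np

def centralize_and_upscale(hand_crop, size=227):
    """Pad a hand crop to a square so the hand sits centred, then upscale."""
    h, w = hand_crop.shape[:2]
    side = max(h, w)
    canvas = np.zeros((side, side, 3), dtype=hand_crop.dtype)
    y0, x0 = (side - h) // 2, (side - w) // 2
    canvas[y0:y0 + h, x0:x0 + w] = hand_crop
    # Cubic interpolation gives smoother results when upscaling small crops.
    return cv2.resize(canvas, (size, size), interpolation=cv2.INTER_CUBIC)
```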
The training section consists of the notebook used to train and save the SqueezeNet model used in the project. It was originally written in Google Colab.
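The notebook itself is not reproduced here, but the transfer-learning pattern typically looks like the following in Keras; `build_squeezenet()` stands in for whatever SqueezeNet implementation the notebook loads (Keras does not ship one), and the hyperparameters are illustrative:

```python
from tensorflow.keras import layers, models, optimizers

def add_classification_head(base_model, num_classes=10):
    """Freeze a pre-trained convolutional base and attach a 10-class head."""
    base_model.trainable = False  # transfer learning: keep pre-trained weights
    x = layers.GlobalAveragePooling2D()(base_model.output)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(base_model.input, outputs)
    model.compile(
        optimizer=optimizers.Adam(1e-4),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# model = add_classification_head(build_squeezenet())  # hypothetical base loader
# model.fit(train_data, validation_data=val_data, epochs=20)
# model.save("squeezenet_isl.h5")
```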
The project depends on the following libraries:
- OpenCV
- TensorFlow
- Keras
- NumPy
- Pillow
- ImageAI
The specific versions are listed in `requirements.txt`.
This translator was originally created as part of our final-year project by the following team members:
Name | GitHub ID |
---|---|
Abhishek Singh Dhadwal | AbhishekSinghDhadwal |
Saurabh Pujari | saurabh0719 |
Kopal Bhatnagar | kopalbhatnagar05 |
Yash Kumar | yashKumar2412 |
For further details on the implementation, refer to the Thesis folder, which contains both the project report and the final presentation.