- The datasets are not pushed to the repo!
- The result data (cascade xml file and cnn models) are not pushed to the repo!
- Gathering Dataset
- Train Cascade Classifier
- Train CNN
- Merge Face Detection and Facial Expression Recognition
We need 2 datasets for our project:
- Dataset for Cascade Classifier
- Dataset for CNN
The first dataset is used to detect faces among other objects.
So the faces are the positive data, and the other objects serve as the negative data.
The second dataset is used to feed the CNN. After the CNN training, the trained model can recognize five different emotions. Therefore, we separate the second dataset into the following expressions:
- neutral
- happy
- sad
- surprised
- angry
You can train the cascade classifier and the CNN with any type of data!
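For example, a dataset layout like the following is what Keras' `flow_from_directory` expects; the folder names match our five labels, but the exact tree here is an illustration, not our repo's actual structure (the train/validation/test split is explained in the CNN section below):

```
dataset/
├── train/
│   ├── neutral/
│   ├── happy/
│   ├── sad/
│   ├── surprised/
│   └── angry/
├── validation/   (same five subfolders)
└── test/         (same five subfolders)
```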
You can find the relevant code in the cascade folder of the repository.
To be honest, this part of the job was a little tricky and boring.
To use the HAAR or LBP `opencv_traincascade`, first you should:
- Get the opencv and opencv_contrib command-line tool collections
- Run the commands via the CLI rather than a Python program

So we decided to write some bash scripts to make the process easier.
These are the instructions I wrote:
- fill out the folders `neg_orig` and `pos_orig` with images
- run the `to_gray.python` program; it resizes and converts images to grayscale and puts them into the `pos` and `neg` folders
- run `create_bg.sh` and `create_info.sh` to create the initial information files: `bg.txt` for negatives and `info.dat` for positives
- run `create_neg.sh` and `create_pos.sh` to create samples from the single images in the `single_neg` and `single_pos` folders
- run `merge_neg_samples.sh` and `merge_pos_samples.sh` to merge the `.lst` files of the generated samples into the `bg.txt` and `info.dat` files
- copy `info/*.jpg` to the `pos` folder and `bg/*.jpg` to the `neg` folder
- run `create_samples.sh` to run the `opencv_createsamples` command, generating the `.vec` file
- run `nohup ./train.sh &` to run the training in the background
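For context, here is a minimal Python sketch of the two OpenCV commands that `create_samples.sh` and `train.sh` wrap; all paths, counts, and sizes are placeholder assumptions, so check the actual scripts in the repo for the real parameters:

```python
import subprocess

# Build the .vec training file from the positive annotations (what create_samples.sh wraps)
subprocess.run(['opencv_createsamples',
                '-info', 'info.dat',    # positive annotation file
                '-vec', 'samples.vec',  # output vector file
                '-num', '10000',        # number of samples to pack (assumption)
                '-w', '24', '-h', '24'], check=True)

# Train the cascade (what train.sh wraps); -featureType may be HAAR or LBP
subprocess.run(['opencv_traincascade',
                '-data', 'cascade_out',                # output folder for cascade.xml
                '-vec', 'samples.vec',
                '-bg', 'bg.txt',                       # background image list
                '-numPos', '9000', '-numNeg', '5000',  # per-stage counts (assumptions)
                '-numStages', '20',
                '-featureType', 'LBP',
                '-w', '24', '-h', '24'], check=True)
```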
AAAAH YEAH, that is sort of complicated, you might say, and yes, it is.
The concept is that you have some negative and positive images in two different folders, about 1000 images in each. However, you need to generate more images with different aspects (angles, perspectives, ...).
So what we've done is create an information data file for the original images.
Afterward, we generate more than 10000 images with different perspectives and angles from each single image.
Then, we make another information data file for these new images.
Finally, we merge all the information files and pass them to `opencv_traincascade`.
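For reference, the information files follow OpenCV's standard annotation format: each `info.dat` line holds an image path, the number of objects, and one `x y w h` bounding box per object, while `bg.txt` is simply a list of background image paths (the file names below are made-up examples):

```
info.dat:
pos/face_0001.jpg 1 120 80 48 48
pos/face_0002.jpg 2 30 40 50 50 200 60 45 45

bg.txt:
neg/street_0001.jpg
neg/street_0002.jpg
```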
If you are confused at this point, I suggest you read the following tutorials:
- https://docs.opencv.org/master/dc/d88/tutorial_traincascade.html
- https://www.learnopencv.com/training-better-haar-lbp-cascade-eye-detector-opencv/
After training the cascade classifier (whether with the HAAR or LBP algorithm), an XML file will be generated, which is needed for the final step.
Facial Expression Classification
You can find the code in the CNN folder of the repository. We've used the Keras library to train and predict.
The procedure of the CNN training is divided into the following steps:
- The dataset should be resized to 48x48 pixels.
- The dataset should be divided into three sections: train, validation, and test.
- We generate images with different angles, scales, and shifts, which are used as training and validation data (see the generator sketch after this list).
- We use a sequential model which contains 8 convolutional layers and 5 dense layers.
- We use Dropout, Batch Normalization, and MaxPool2D layers between the convolutional and dense layers.
- Finally, after the training, we test the trained model on the test dataset.
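As a rough sketch (the directory paths, batch size, augmentation ranges, and callback settings below are assumptions, not our exact configuration), the generators and callbacks referenced by the training call further down could be set up like this:

```python
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau

# Augment the training images with random rotations, shifts, zooms, and flips
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=15,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   zoom_range=0.1,
                                   horizontal_flip=True)
valid_datagen = ImageDataGenerator(rescale=1./255)  # no augmentation for validation

# 48x48 grayscale inputs, one subfolder per expression (see the layout above)
train_generator = train_datagen.flow_from_directory('dataset/train',
                                                    target_size=(48, 48),
                                                    color_mode='grayscale',
                                                    batch_size=32,
                                                    class_mode='categorical')
valid_generator = valid_datagen.flow_from_directory('dataset/validation',
                                                    target_size=(48, 48),
                                                    color_mode='grayscale',
                                                    batch_size=32,
                                                    class_mode='categorical')

# Callbacks: checkpoint the best model, stop early, reduce the LR on plateau
model_check = ModelCheckpoint('cnn_model.h5', monitor='val_loss', save_best_only=True)
earlystopping = EarlyStopping(monitor='val_loss', patience=20)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5)
```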
You can observe the CNN model configuration in the following code snippet.
```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, BatchNormalization, Dropout, Flatten, Dense
from keras.regularizers import l2

model = Sequential()
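
# Feature extractor: four conv blocks (64 -> 128 -> 256 -> 512 filters),
# each with batch normalization, 2x2 max pooling, and increasing dropout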
model.add(Conv2D(64, kernel_size = (3, 3), input_shape = (48, 48, 1), activation = 'relu', kernel_regularizer=l2(0.01)))
model.add(Conv2D(64, kernel_size = (3, 3), activation = 'relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.3))
model.add(Conv2D(128, kernel_size = (3, 3), activation = 'relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(128, kernel_size = (3, 3), activation = 'relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.4))
model.add(Conv2D(256, kernel_size = (3, 3), activation = 'relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(256, kernel_size = (3, 3), activation = 'relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D(512, kernel_size = (3, 3), activation = 'relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(512, kernel_size = (3, 3), activation = 'relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.5))
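
# Classifier head: flatten, four dense layers with dropout,
# ending in a 5-way softmax (one unit per expression)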
model.add(Flatten())
model.add(Dense(512, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(256, activation = 'relu'))
model.add(Dropout(0.4))
model.add(BatchNormalization())
model.add(Dense(128, activation = 'relu'))
model.add(Dropout(0.3))
model.add(Dense(64, activation = 'relu'))
model.add(Dropout(0.3))
model.add(BatchNormalization())
model.add(Dense(5, activation = 'softmax'))
model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
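
# train_generator, valid_generator, and the callbacks are assumed to be
# defined as in the sketch above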
model.fit_generator(generator=train_generator,
steps_per_epoch=4096,
validation_data=valid_generator,
validation_steps=1024,
epochs=1000,
callbacks= [model_check, earlystopping, reduce_lr]
)
```
The results of the training:
- Validation accuracy: 0.6767
- Validation loss: 0.8386
- Training accuracy: 0.7165
- Training loss: 0.7616
This stage uses the cascade.xml file and the trained CNN model to detect the face position and recognize the facial expression, respectively.
There is no extra complexity here, so we didn't push the merged code to the repo.
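As a minimal sketch of the idea (the file names `cascade.xml`, `cnn_model.h5`, and `input.jpg` are placeholders, and the label order is an assumption that depends on how the training generators indexed the classes):

```python
import cv2
import numpy as np
from keras.models import load_model

face_cascade = cv2.CascadeClassifier('cascade.xml')  # from the cascade training step
model = load_model('cnn_model.h5')                   # from the CNN training step
labels = ['angry', 'happy', 'neutral', 'sad', 'surprised']  # order is an assumption

frame = cv2.imread('input.jpg')
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Detect face positions with the cascade, then classify each face crop with the CNN
for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
    face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
    face = face.astype('float32') / 255.0  # same scaling as the training generators
    face = face.reshape(1, 48, 48, 1)      # batch of one 48x48 grayscale image
    expression = labels[int(np.argmax(model.predict(face)))]
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(frame, expression, (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

cv2.imwrite('output.jpg', frame)
```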
- Alireza Kavian
- Kamran Akbar