Video production styles used on MOOC platforms, such as talking head, presentation slides, and code/tutorial/demonstration, are classified using several deep models pre-trained on the ImageNet dataset. Transfer learning is used to leverage the pre-trained networks: features are extracted from the convolutional base of each network, and a new classifier is trained on top of them.
Comparison results are shown for the following pre-trained networks:
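The feature-extraction setup described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes Keras (via TensorFlow) as the framework and VGG16 as the example base. The class names, the head layout (global average pooling, a 256-unit dense layer, dropout) and the helper names `build_classifier` and `predict_styles` are all hypothetical.

```python
import numpy as np
from tensorflow import keras

# Hypothetical label set, loosely based on the styles named in the text.
CLASS_NAMES = ["talking_head", "slides", "code_tutorial"]


def build_classifier(num_classes, weights="imagenet", input_shape=(224, 224, 3)):
    """Frozen VGG16 convolutional base with a small new classifier on top."""
    # include_top=False drops the original ImageNet classification head.
    base = keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # keep the pre-trained features fixed

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)
    # Assumed head: layer sizes are illustrative, not taken from the project.
    x = keras.layers.Dense(256, activation="relu")(x)
    x = keras.layers.Dropout(0.5)(x)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def predict_styles(model, frames):
    """Return a style name per frame; frames is (N, H, W, 3) with values in 0-255."""
    x = keras.applications.vgg16.preprocess_input(np.array(frames, dtype="float32"))
    probs = model.predict(x, verbose=0)
    return [CLASS_NAMES[i] for i in np.argmax(probs, axis=1)]
```

Only the new dense layers are trained when `model.fit` is called. The convolutional base keeps its ImageNet weights. Passing `weights=None` builds the same architecture without downloading weights, which is handy for a quick shape check.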
- VGG16
- VGG19
- InceptionV3
- Xception
- ResNet50
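
To compare the networks listed above on equal terms, each one needs its own preprocessing function and input size. The sketch below shows one way to organise that, assuming the Keras `applications` module is used. The `BACKBONES` table and the `make_feature_extractor` helper are hypothetical names, and the input sizes are the Keras defaults rather than values confirmed by this project.

```python
from tensorflow import keras

apps = keras.applications

# name -> (constructor, matching preprocess_input, default input size)
BACKBONES = {
    "VGG16": (apps.VGG16, apps.vgg16.preprocess_input, (224, 224)),
    "VGG19": (apps.VGG19, apps.vgg19.preprocess_input, (224, 224)),
    "InceptionV3": (apps.InceptionV3, apps.inception_v3.preprocess_input, (299, 299)),
    "Xception": (apps.Xception, apps.xception.preprocess_input, (299, 299)),
    "ResNet50": (apps.ResNet50, apps.resnet50.preprocess_input, (224, 224)),
}


def make_feature_extractor(name, weights="imagenet"):
    """Return (frozen base, preprocess function, input size) for one network."""
    ctor, preprocess, size = BACKBONES[name]
    # pooling="avg" turns the final feature map into one vector per image.
    base = ctor(
        weights=weights,
        include_top=False,
        pooling="avg",
        input_shape=size + (3,),
    )
    base.trainable = False
    return base, preprocess, size
```

With this layout, the same classifier head and training loop can be reused for every network, so that differences in the results come from the convolutional base alone.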
The dataset is available here.