Emotional Video to Audio Transformation with ANFIS-DeepRNN (Vanilla RNN and LSTM-DeepRNN) [MPE 2020]


About

Repository for the paper "Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System" (Mathematical Problems in Engineering, 2020).

Contents

  • Requirements
  • Dataset
  • Model
  • How to Use
  • Acknowledgement

Requirements

MATLAB 2017 on macOS

Toolboxes: Fuzzy Logic Toolbox, Deep Learning Toolbox
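
A quick sanity check that both toolboxes are installed (a minimal sketch; 'fuzzy' and 'nnet' are the standard product directory names that MATLAB's ver accepts for these toolboxes):

    % Check required toolboxes; ver returns empty if a product is missing
    assert(~isempty(ver('fuzzy')), 'Fuzzy Logic Toolbox not found')
    assert(~isempty(ver('nnet')),  'Deep Learning Toolbox not found')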

Dataset

Both datasets carry emotion labels on a 2D valence-arousal axis; a label-loading sketch follows the list.

  • Lindsey Stirling Dataset: 8 music videos
    • Emotion labels: dataset/lindsey stirling dataset/user_response*.tsv
  • DEAP Dataset: 38 music videos
    • Emotion labels: dataset/deap dataset/participant_ratings.csv
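
For reference, a minimal sketch of loading the label files in MATLAB (variable names are hypothetical, and no assumptions are made about the column layouts inside the files):

    % Load DEAP emotion labels (CSV with per-participant ratings)
    deap = readtable('dataset/deap dataset/participant_ratings.csv');
    % Load one Lindsey Stirling user-response file (tab-separated)
    files = dir('dataset/lindsey stirling dataset/user_response*.tsv');
    resp  = readtable(fullfile(files(1).folder, files(1).name), ...
                      'FileType', 'text', 'Delimiter', '\t');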

Model

  • Extract audio and visual features
  • ANFIS for emotion classification of visual features
  • Seq2Seq for audio feature generation (multi-modal domain transformation)
  • Mapping of audio features to audio snippets for music generation (a toolbox-level sketch follows this list)
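
A minimal sketch of how these stages could fit together using the two toolboxes. Every variable below (visual features, label vectors, layer sizes) is hypothetical; the repository's actual entry points are the scripts listed under How to Use.

    % 1) ANFIS emotion classification: inputs are visual features,
    %    last column holds the emotion label (valence/arousal)
    fis = anfis([visualFeats emotionLabels]);   % Fuzzy Logic Toolbox
    emo = evalfis(visualFeats, fis);            % R2017 argument order

    % 2) Seq2Seq (LSTM) regression from visual to audio feature sequences
    layers = [sequenceInputLayer(numVisualFeats)
              lstmLayer(128, 'OutputMode', 'sequence')
              fullyConnectedLayer(numAudioFeats)
              regressionLayer];
    net = trainNetwork(XVisualSeqs, YAudioSeqs, layers, ...
                       trainingOptions('adam', 'MaxEpochs', 100));

    % 3) Predicted audio features are then matched to audio snippets
    YPred = predict(net, XVisualSeqs);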

How to Use

All scripts below target the Lindsey Stirling Dataset; the corresponding scripts for the DEAP Dataset are also available. A command-line sketch covering steps 3-5 follows the list.

  1. Change current folder to where this file is located

  2. Download datasets

  3. Extract audio and visual features

    • Extract sound features:
      scripts/emotion_from_sound/main_sound2feat_lindsey.m
      
    • Extract visual features:
      scripts/emotion_from_visual/main_video2feat_lindsey.m
      
  4. Train:

    • Settings and Load data:
      scripts/model/main_settings.m
      
    • ANFIS for emotion classification from HSL (visual features):
      scripts/model/main_anfis.m
      
    • Seq2Seq for domain transformation from visual to audio features:
      scripts/model/main_seq2seq_train.m
      
  5. Evaluation (music generation from visual features)

    • Extract sound features (test data):
      scripts/emotion_from_sound/main_sound2feat_lindsey_test_individual.m
      
    • Extract visual features (test data):
      scripts/emotion_from_visual/main_video2feat_lindsey_test_individual.m
      
    • Settings and Load data:
      scripts/model/main_settings.m
      
    • Evaluate:
      scripts/model/main_anfis_seq2seq_test.m
      
  6. Evaluate MTurk results: scripts/eval_mturk
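
Put together, a session from the MATLAB prompt could look like the following sketch, assuming the current folder is the repository root (step 1) and the scripts run independently in this order:

    % Feature extraction (step 3)
    run('scripts/emotion_from_sound/main_sound2feat_lindsey.m')
    run('scripts/emotion_from_visual/main_video2feat_lindsey.m')
    % Training (step 4)
    run('scripts/model/main_settings.m')
    run('scripts/model/main_anfis.m')
    run('scripts/model/main_seq2seq_train.m')
    % Evaluation (step 5)
    run('scripts/emotion_from_sound/main_sound2feat_lindsey_test_individual.m')
    run('scripts/emotion_from_visual/main_video2feat_lindsey_test_individual.m')
    run('scripts/model/main_settings.m')
    run('scripts/model/main_anfis_seq2seq_test.m')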

Acknowledgement

If you use this code, please cite it as follows:

@article{sergio2020mpe,
   AUTHOR={Sergio, G. C. and Lee, M.},
   TITLE={Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System},
   JOURNAL={Mathematical Problems in Engineering},
   VOLUME={2020},
   PAGES={1--15},
   DOI={10.1155/2020/8478527},
   YEAR={2020}
}

Contact: gwena.cs@gmail.com