Repository for the paper "Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System".
Requirements • Dataset • How to Use • How to Cite
MATLAB 2017, macOS
Toolboxes: Fuzzy Logic, Deep Learning
Both datasets provide emotion labels on two axes: valence and arousal.
Lindsey Stirling dataset:
- 8 music videos
- Emotion labels: dataset/lindsey stirling dataset/user_response*.tsv
DEAP dataset:
- 38 music videos
- Emotion labels: dataset/deap dataset/participant_ratings.csv
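To get a first look at the valence and arousal labels, the label files can be read with `readtable`. The snippet below is a minimal sketch, not the repository's own loading code. It assumes that it is run from the repository root, that the TSV files are tab-delimited with a header row, and that the DEAP ratings file has columns named `Experiment_id`, `Valence` and `Arousal`. Averaging the ratings per video is only one plausible way to summarize them.

```matlab
% Sketch only: inspect the emotion-label files from the repository root.
% ASSUMPTION: the TSV files are tab-delimited and start with a header row.
lsDir = fullfile('dataset', 'lindsey stirling dataset');
tsvFiles = dir(fullfile(lsDir, 'user_response*.tsv'));
for k = 1:numel(tsvFiles)
    responses = readtable(fullfile(lsDir, tsvFiles(k).name), ...
        'FileType', 'text', 'Delimiter', '\t');
    % Print the file name, row count and column names actually found.
    fprintf('%s: %d rows, columns: %s\n', tsvFiles(k).name, ...
        height(responses), ...
        strjoin(responses.Properties.VariableNames, ', '));
end

% ASSUMPTION: the DEAP file has columns Experiment_id, Valence and Arousal.
deapRatings = readtable( ...
    fullfile('dataset', 'deap dataset', 'participant_ratings.csv'));
% Group the ratings by video and average them over participants.
[videoGroup, videoId] = findgroups(deapRatings.Experiment_id);
meanValence = splitapply(@mean, deapRatings.Valence, videoGroup);
meanArousal = splitapply(@mean, deapRatings.Arousal, videoGroup);
disp(table(videoId, meanValence, meanArousal));
```

If the column names differ, the first loop's output pattern (printing `Properties.VariableNames`) can be applied to the DEAP table to find the correct ones.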
- Extract audio and visual features
- ANFIS for emotion classification of visual features
- Seq2Seq for audio feature generation (multi-modal domain transformation)
- Mapping of audio features to audio snippets for music generation
All the scripts listed here target the Lindsey Stirling dataset. Corresponding scripts for the DEAP dataset are also available.
- Change current folder to where this file is located
- Download datasets
- Extract audio and visual features
  - Extract sound features: scripts/emotion_from_sound/main_sound2feat_lindsey.m
  - Extract visual features: scripts/emotion_from_visual/main_video2feat_lindsey.m
- Train:
  - Settings and load data: scripts/model/main_settings.m
  - ANFIS for emotion classification from HSL (visual features): scripts/model/main_anfis.m
  - Seq2Seq for domain transformation from visual to audio features: scripts/model/main_seq2seq_train.m
- Evaluation (music generation from visual features):
  - Extract sound features (test data): scripts/emotion_from_sound/main_sound2feat_lindsey_test_individual.m
  - Extract visual features (test data): scripts/emotion_from_visual/main_video2feat_lindsey_test_individual.m
  - Settings and load data: scripts/model/main_settings.m
  - Evaluate: scripts/model/main_anfis_seq2seq_test.m
- Evaluation of MTurk results: scripts/eval_mturk
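The steps above can be chained in a single MATLAB session. The following is a minimal sketch under several assumptions: it is run from the repository root, each listed file is a plain script (not a function that needs arguments), and the scripts resolve data paths relative to the repository root. The script folders are added to the path and the scripts are called by name, because `run` with a file path temporarily changes the current folder to the script's folder, which could break relative paths.

```matlab
% Sketch only: run from the repository root (the folder containing this file).
% ASSUMPTION: every file below is a plain script that takes no arguments.
repoRoot = pwd;
addpath(fullfile(repoRoot, 'scripts', 'emotion_from_sound'));
addpath(fullfile(repoRoot, 'scripts', 'emotion_from_visual'));
addpath(fullfile(repoRoot, 'scripts', 'model'));

% Feature extraction (training data)
main_sound2feat_lindsey;
main_video2feat_lindsey;

% Training: settings and data loading, then ANFIS, then Seq2Seq
main_settings;
main_anfis;
main_seq2seq_train;

% Evaluation: test-data features, settings, then music generation
main_sound2feat_lindsey_test_individual;
main_video2feat_lindsey_test_individual;
main_settings;
main_anfis_seq2seq_test;
```

The scripts are called in the shared base workspace on purpose, since later steps may rely on variables created by `main_settings`. Running one step at a time is the safer choice when a step is slow or when intermediate results need checking.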
- Not included because of their size: generated music, video and audio data, and the .mat data files.
- LSTM for Matlab, https://kr.mathworks.com/help/nnet/ref/nnet.cnn.layer.lstmlayer.html
If you use this code, please cite:
@article{sergio2020mpe,
AUTHOR={Sergio, G. C. and Lee, M.},
TITLE={Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System},
JOURNAL={Mathematical Problems in Engineering},
VOLUME={2020},
PAGES={1--15},
DOI={10.1155/2020/8478527},
YEAR={2020}
}
Contact: gwena.cs@gmail.com