MediaEval 2016 - Emotion in Music Task: Lessons Learned
![Page 1: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/1.jpg)
Emotion in Music Task: Lessons Learned
Anna Aljanaki¹, Yi-Hsuan Yang², Mohammad Soleymani¹
¹University of Geneva, Switzerland
²Academia Sinica, Taiwan
20-21 October, MediaEval 2016
![Page 2: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/2.jpg)
Emotion in Music Task
- 2013: Emotion in Music Brave New Task
  - Organized by M. Soleymani, M.N. Caro, E.M. Schmidt and Y.-H. Yang
  - 2 subtasks: dynamic (per-second) music emotion recognition and song-level emotion recognition
  - 3 participating teams
![Page 3: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/3.jpg)
Emotion in Music Task
- Focused on audio analysis (optionally, metadata)
- Most attention was paid to recognizing how emotion changes over time
- Used the valence/arousal model
![Page 4: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/4.jpg)
Valence/Arousal model
![Page 5: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/5.jpg)
Dynamic emotion tracking (over duration of a piece)
![Page 6: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/6.jpg)
Emotion in Music Task
- 2013: Emotion in Music Brave New Task
  - Organized by M. Soleymani, M.N. Caro, E.M. Schmidt and Y.-H. Yang
  - 2 tasks: dynamic (per-second) music emotion recognition and song-level emotion recognition
  - 3 participating teams
- 2014: Emotion in Music Task, Second Edition
  - Organized by A. Aljanaki, Y.-H. Yang, M. Soleymani
  - 2 tasks: dynamic (per-second) music emotion recognition and feature design
  - 7 participating teams
![Page 7: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/7.jpg)
Emotion in Music Task
- 2013: Emotion in Music Brave New Task
  - Organized by M. Soleymani, M.N. Caro, E.M. Schmidt and Y.-H. Yang
  - 2 tasks: dynamic (per-second) music emotion recognition and song-level emotion recognition
  - 3 participating teams
- 2014: Emotion in Music Task, Second Edition
  - Organized by A. Aljanaki, Y.-H. Yang, M. Soleymani
  - 2 tasks: dynamic (per-second) music emotion recognition and feature design
  - 7 participating teams
- 2015: Emotion in Music Task, Third Edition
  - Organized by A. Aljanaki, Y.-H. Yang, M. Soleymani
  - 1 task: dynamic (per-second) music emotion recognition, with three submissions: features, prediction on baseline features, prediction on custom features
  - 11 participating teams
![Page 8: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/8.jpg)
Quality of the annotations
| Year | 2013 | 2014 | 2015 |
|---|---|---|---|
| Total length | 9h 18min | 12h 30min | 3h 46min |
| Cronbach’s α for arousal | .28 ± 0.28 | .31 ± 0.30 | .66 ± 0.26 |
| GAM’s R² for arousal | .13 ± 0.10 | .14 ± 0.11 | .44 ± 0.19 |
| Cronbach’s α for valence | .28 ± 0.29 | .20 ± 0.24 | .51 ± 0.35 |
| GAM’s R² for valence | .13 ± 0.10 | .10 ± 0.08 | .37 ± 0.21 |
![Page 9: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/9.jpg)
Quality of the annotations
| Year | 2013 | 2014 | 2015 |
|---|---|---|---|
| Total length | 9h 18min | 12h 30min | 3h 46min |
| Cronbach’s α for arousal | .28 ± 0.28 | .31 ± 0.30 | .66 ± 0.26 |
| GAM’s R² for arousal | .13 ± 0.10 | .14 ± 0.11 | .44 ± 0.19 |
| Cronbach’s α for valence | .28 ± 0.29 | .20 ± 0.24 | .51 ± 0.35 |
| GAM’s R² for valence | .13 ± 0.10 | .10 ± 0.08 | .37 ± 0.21 |

- 2013 & 2014: 45-second excerpts. 2015: full songs.
- 2013 & 2014: Amazon Mechanical Turk workers. 2015: both lab and AMT workers.
- 2015: introduced preliminary listening.
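
For readers unfamiliar with the agreement measure in the table, the following is a minimal sketch of the standard Cronbach's α formula, computed for one song. It assumes (this is not stated on the slide) that each annotator is treated as an "item" and each time sample as a "case"; the function name `cronbach_alpha` and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Standard Cronbach's alpha.

    ratings: list of per-annotator sequences for one song, all of equal
    length (one value per time sample). Assumption: annotators play the
    role of "items" and time samples the role of "cases".
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two annotators")
    n = len(ratings[0])
    if n < 2 or any(len(r) != n for r in ratings):
        raise ValueError("all annotators need the same number (>= 2) of samples")

    # Sum of the variances of each annotator's own time series.
    item_variance_sum = sum(variance(r) for r in ratings)

    # Variance of the per-sample totals across annotators.
    totals = [sum(column) for column in zip(*ratings)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("summed ratings are constant; alpha is undefined")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy example (made-up numbers): two annotators who mostly agree on shape.
print(cronbach_alpha([[1, 2, 3, 4], [1, 3, 2, 4]]))
```

Because the formula depends only on variances, two annotators whose curves differ by a constant offset still obtain α = 1 under this sketch.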
![Page 10: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/10.jpg)
Quality of the annotations - Arousal
![Page 11: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/11.jpg)
Quality of the annotations - Valence
![Page 12: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/12.jpg)
Continuous annotation interface
![Page 13: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/13.jpg)
Continuous annotation problems
- Absolute scale
- Reaction time
- Scaling (’zoom’ levels)
![Page 14: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/14.jpg)
Continuous annotation problems
Absolute scale ratings
![Page 15: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/15.jpg)
Continuous annotation problems
We tried to scale each annotation to the dynamic mean of the song: $a_{j,i} = a_{j,i} + (A_j - A)$
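
The slide does not define the symbols in this adjustment, so the following is only a sketch under one possible reading: A_j is taken to be the mean of all dynamic annotations of song j, and A the mean of the individual annotation being adjusted. Under that assumption, every annotator's curve is shifted by a constant so that its mean coincides with the song mean. The function name `shift_to_song_mean` and the data are hypothetical.

```python
from statistics import fmean


def shift_to_song_mean(annotations):
    """Shift each annotator's curve so its mean equals the song mean.

    annotations: list of per-annotator time series for ONE song.
    Assumed reading of the slide's formula: a + (A_j - A), where
    A_j is the mean over all annotators of the song and A is the
    mean of the annotation being shifted.
    """
    annotator_means = [fmean(a) for a in annotations]
    song_mean = fmean(annotator_means)
    return [
        [value + (song_mean - own_mean) for value in a]
        for a, own_mean in zip(annotations, annotator_means)
    ]


# Toy example (made-up numbers): same shape, different absolute levels.
shifted = shift_to_song_mean([[0.0, 0.2], [0.4, 0.6]])
print(shifted)
```

The shift changes only the absolute level of each curve; the shape of each annotator's trajectory is left untouched.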
![Page 16: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/16.jpg)
Continuous annotation problems
There is a reaction time in the annotations. Before listeners can give judgements on the emotional content of music, they need to listen to it for some time.
![Page 17: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/17.jpg)
Continuous annotation problems
There is a scaling problem: the unit of emotional expression can be a structural section, a phrase, or a single note.
![Page 18: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/18.jpg)
Best solutions
| Method | ρ | RMSE |
|---|---|---|
| 2013, BLSTM-RNN | .31 ± .37 | .08 ± .05 |
| 2014, LSTM | .35 ± .45 | .10 ± .05 |
| 2015, BLSTM-RNN | .66 ± .25 | .12 ± .06 |

Table: Winning algorithms on arousal, ordered by Spearman’s ρ. BLSTM-RNN: Bi-directional Long Short-Term Memory Recurrent Neural Networks.

| Method | ρ | RMSE |
|---|---|---|
| 2013, BLSTM-RNN | .19 ± .43 | .08 ± .04 |
| 2014, LSTM | .20 ± .49 | .08 ± .05 |
| 2015, BLSTM-RNN | .17 ± .09 | .12 ± .54 |

Table: Winning algorithms on valence, ordered by Spearman’s ρ.
![Page 19: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/19.jpg)
Possible solutions and modifications
- Change the task from emotion tracking to dynamics tracking (diminuendo, crescendo, rallentando)
![Page 20: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/20.jpg)
Possible solutions and modifications
- Change the task from emotion tracking to dynamics tracking (diminuendo, crescendo, rallentando)
- Change the data collection interface
![Page 21: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/21.jpg)
Categorical interface
![Page 22: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/22.jpg)
Possible solutions and modifications
- Change the task from emotion tracking to dynamics tracking (diminuendo, crescendo, rallentando)
- Change the data collection interface
- Find a practical task where continuous tracking is necessary
  - Retrieval by an emotional trajectory
  - Thumbnailing
  - Emotion prediction from physiological signals and audio
![Page 23: MediaEval 2016 - Emotion in Music Task: Lessons Learned](https://reader031.vdocuments.mx/reader031/viewer/2022021919/587366821a28abe7648b6f51/html5/thumbnails/23.jpg)
Acknowledgements
We thank Erik M. Schmidt, Mike N. Caro, Cheng-Ya Sha, Alexander Lansky, Sung-Yen Liu and Eduardo Coutinho for their contributions to task development, and the anonymous Turkers for their work.