Discovery and Characterization of Melodic Motives in Large Audio Music Collections

Discovery and Characterization of Melodic Motives in Large Audio Music Collections. PhD Proposal Defense, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain. Sankalp Gulati. Supervisor: Prof. Xavier Serra

Upload: sankalp-gulati

Post on 22-Nov-2014


Category:

Education



DESCRIPTION

Presentation for my PhD proposal defense at the Music Technology Group, UPF, Barcelona, Spain (2013).

TRANSCRIPT

  • 1. Discovery and Characterization of Melodic Motives in Large Audio Music Collections. PhD Proposal Defense, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain. Sankalp Gulati. Supervisor: Prof. Xavier Serra
  • 2. Patterns. Images in the right half taken from (Mueen & Keogh, 2009) and (Mueen & Keogh, 2010)
  • 3. Melodic Patterns. Top right image taken from (Mueen & Keogh, 2009) and (Mueen & Keogh, 2010)
  • 4. Melodic Motives (Patterns). Top right image taken from (Mueen & Keogh, 2009) and (Mueen & Keogh, 2010)
  • 5. Discovery of Melodic Motives. Discovery: induction, extraction, matching, retrieval. Image taken from (Mueen & Keogh, 2009)
  • 6. Discovery of Melodic Motives in Large Audio Music Collections. Large audio music collections: > 500,000; > 550 hours
  • 7. Discovery and Characterization of Melodic Motives in Large Audio Music Collections. Characterization: transform; N dimensions
  • 8. Discovery and Characterization of Melodic Motives in Large Audio Music Collections. PhD Proposal Defense, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain. Sankalp Gulati. Supervisor: Prof. Xavier Serra
  • 9. Introduction: Music Information Research (MIR)
  • 10. Introduction. Music -> Melody (pitch, loudness, timbre). "It is melody that enables us to distinguish one work from another. It is melody that human beings are innately able to reproduce by singing, humming, and whistling. It is melody that makes music memorable: we are likely to recall a tune long after we have forgotten its text." (Selfridge-Field, 1998) [Audio example]
  • 11. Introduction. Melodic Analysis: Melodic Motives. Computational melodic motivic analysis: Hungarian, Slovak, French, Sicilian, Bulgarian and Appalachian folk melodies (Juhász, 2006); Cretan, Nova Scotia and Essen folk melodies (Conklin and Anagnostopoulou, 2010, 2006); Tunisian modal music (Lartillot & Ayari, 2006).
  • 12. Introduction. Melodic motivic discovery in audio music signals? Is it needed? Why so little work? Solution?
  • 13. Introduction. Indian Art Music: Opportunities. Heterophonic music; melodic framework (Rāg); importance of melodic phrases (Pakads, Chalans); available audio music repertoire
  • 14. Introduction: Broad Research Goals. Computational methodology for melodic motivic discovery in large audio music collections, utilizing domain-specific knowledge: melodic motivic analysis methodology; similarity measures based on melodic motives; compilation of a sizeable audio music collection of Indian art music; summary and compilation of existing literature
  • 15. Introduction: Goals and Motivation. Motivation: lack of approaches for melodic motif extraction in audio signals; lack of utilization of domain-specific knowledge in computational methodologies; furthering the state of the art in pattern processing in MIR
  • 16. Proposed Methodology
  • 17. Proposed Methodology: Overview. Block diagram of the proposed methodology
  • 18. Proposed Methodology: Data Collection. Audio; metadata; annotations; > 550 hours
  • 19. Proposed Methodology: Melodic Feature Extraction. Pitch, loudness and timbre features. Pitch: F0 frequency contour of the predominant melodic source, using (Salamon & Gómez, 2012). Loudness: perceptual loudness computed using only the predominant melodic source, using (Zwicker, 1977). Timbre: centroid of the spectral envelope of the predominant melodic source, using (Röbel & Rodet, 2005). Block diagram labels: audio; predominant F0 frequency estimation; synthesize predominant melodic source; loudness feature extraction; timbre feature extraction
  • 20. Proposed Methodology: Melodic Feature Extraction. Evaluation of predominant F0 frequency estimation: 6 Hindustani music pieces, ~45 minutes
  • 21. Proposed Methodology: Melodic Representation. Compact + abstract/reduced. Challenges: heavy meandering around notes (Gamakas); svar intonation; Aroh-Avroh dependent svar intonation; F0 frequency contour vs. musical pitch perception. [Figure: predominant F0 frequency (cents) against time (1 sample = 10 ms)]
  • 22. Proposed Methodology: Melodic Representation. Continuous time-varying values of pitch, loudness and timbral features. Possibilities: melody transcription; SAX-based symbolic representation; parametric representation (no studies!); saddle-point-based representation. Domain knowledge: svar intonation profiles
  • 23. Proposed Methodology: Melodic Similarity. Challenges: melodic representation; large timing variations; pitch variations (ornamentations); differentiating a characteristic phrase from a melodic sequence using the same svars; fixing the similarity threshold. [Audio example] Initial experiments: Dynamic Time Warping (DTW); DTW > (SAX + Euclidean distance) (Ross, Vinutha, and Rao, 2012)
  • 24. Proposed Methodology: Melodic Similarity. Possibilities: Euclidean and Mahalanobis distance measures; HMM-based distance measures; dynamic time warping based distances (step and boundary conditions; constraints; context-dependent DTW). Domain knowledge: DTW constraint parameters; pattern-dependent similarity threshold; weighted distance measures
  • 25. Proposed Methodology: Pattern Extraction. Challenges: melodic segmentation; different motif lengths; large volume of audio data; exact melodic similarity ~ parametric melodic representation. [Figure: predominant F0 frequency (Hz) against time (1 sample = 10 ms); match matrix]
  • 26. Proposed Methodology: Pattern Extraction. Ongoing work: music parallelism; melodic segmentation; motif discovery in the time series analysis domain; fast brute-force exhaustive pattern search; pruning strategies. [Figure: predominant F0 frequency (Hz) against time (1 sample = 10 ms)]
  • 27. Proposed Methodology: Pattern Extraction. Possibilities: sparse similarity matrices; lower bounds on distance measures; phase space embedding/recurrence plots; suffix trees (~ parametric representation). Domain knowledge: probable phrase boundaries; pruning rules; motif characteristics
  • 28. Proposed Methodology: Melodic Motivic Analysis. Challenges: non-uniform length of motives. Directions: clustering (k-means clustering, self-organizing maps); fractal analysis. Application: Rāg characterization (Rāg-specific motives, shared motives). Transform; N dimensions
  • 29. Proposed Methodology: Evaluation. Challenges: no annotated corpus; human subjectivity in similarity-related tasks. Listening tests; feedback through Dunya users
  • 30. References. Selfridge-Field, E. (1998). Conceptual and representational issues in melodic comparison. Computing in Musicology: A Directory of Research, 11, 3–64. Juhász, Z. (2006, June). A systematic comparison of different European folk music traditions using self-organizing maps. Journal of New Music Research, 35(2), 95–112. Conklin, D., & Anagnostopoulou, C. (2006). Segmental pattern discovery in music. INFORMS Journal on Computing, 18(3), 285–293. Lartillot, O., & Ayari, M. (2006). Motivic pattern extraction in music, and application to the study of Tunisian modal music. South African Computer Journal, 36, 16–28. Salamon, J., & Gómez, E. (2012, August). Melody extraction from polyphonic music signals using pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20(6), 1759–1770. Zwicker, E. (1977). Procedure for calculating loudness of temporally variable sounds. The Journal of the Acoustical Society of America, 62(3), 675–682. Röbel, A., & Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In Proc. DAFx. Ross, J. C., Vinutha, T., & Rao, P. (2012). Detecting melodic motifs from audio for Hindustani classical music. In Proceedings of the 13th International Society for Music Information Retrieval Conference, Porto, Portugal. Mueen, A., Keogh, E. J., Zhu, Q., Cash, S., & Westover, M. B. (2009, April). Exact discovery of time series motifs. In SDM (pp. 473–484). Mueen, A., & Keogh, E. (2010, July). Online discovery and maintenance of time series motifs. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1089–1098). ACM.
  • 31. Work Plan
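Slide 21 plots the predominant F0 contour in cents, while slides 25 and 26 plot it in Hz. The standard relation between the two is cents = 1200 * log2(f / f_ref). The sketch below illustrates it under assumptions that are not in the slides: the function name `hz_to_cents`, the `tonic_hz` reference parameter, and the convention that non-positive values mark unvoiced frames are all hypothetical.

```python
import math


def hz_to_cents(f0_hz, tonic_hz):
    """Convert F0 values in Hz to cents relative to tonic_hz.

    Assumption: frames with a non-positive F0 are unvoiced and map to None.
    """
    if tonic_hz <= 0:
        raise ValueError("tonic_hz must be positive")
    cents = []
    for f in f0_hz:
        if f <= 0:
            cents.append(None)  # unvoiced frame, no pitch to convert
        else:
            cents.append(1200.0 * math.log2(f / tonic_hz))
    return cents


# One octave above the reference is 1200 cents.
contour = hz_to_cents([220.0, 440.0, 0.0], tonic_hz=220.0)
```

Expressing the contour relative to a per-recording reference makes phrases from different recordings comparable on one scale, which is one plausible reason for the cents axis on slide 21.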
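Slides 23 and 24 name dynamic time warping together with step conditions, boundary conditions and constraints. As a rough illustration, here is a textbook DTW with the symmetric step pattern (match, insertion, deletion), fixed start and end points, and an optional Sakoe-Chiba band as the global constraint. It is a sketch under those stated choices and is not the configuration used in the experiments reported on the slides; the function name `dtw_distance` and the `band` parameter are hypothetical.

```python
import math


def dtw_distance(a, b, band=None):
    """Cumulative DTW cost between two pitch sequences (e.g. in cents).

    band: maximum allowed |i - j| along the warping path (Sakoe-Chiba band).
          None means no constraint.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    if band is None:
        band = max(n, m)
    # The end point (n, m) must stay reachable inside the band.
    band = max(band, abs(n - m))

    inf = math.inf
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0  # boundary condition: paths start at the first samples
    for i in range(1, n + 1):
        lo = max(1, i - band)
        hi = min(m, i + band)
        for j in range(lo, hi + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Step condition: match, insertion, or deletion.
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    return cost[n][m]  # boundary condition: paths end at the last samples


# A phrase and a time-stretched copy align with zero cost.
stretched = dtw_distance([0, 100, 200], [0, 100, 100, 200])
```

A narrow band limits how much timing variation is tolerated and reduces the number of cells computed, so it is one concrete example of the "DTW constraint parameters" that slide 24 lists under domain knowledge.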
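Slide 26 mentions brute-force exhaustive pattern search with pruning strategies, in the setting of time series motif discovery. The sketch below illustrates that idea for a single fixed motif length: it compares every pair of non-overlapping windows and abandons a distance computation as soon as it exceeds the best distance found so far. The use of plain Euclidean distance on raw values, the fixed window length, and the names `find_motif_pair` and `early_abandon_sqdist` are assumptions made for illustration, not the method proposed on the slides.

```python
import math


def early_abandon_sqdist(a, b, best_sq):
    """Squared Euclidean distance, or inf once it reaches best_sq."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total >= best_sq:
            return math.inf  # pruned: cannot beat the current best pair
    return total


def find_motif_pair(series, length):
    """Return (i, j, distance) for the closest pair of non-overlapping
    windows of the given length, found by exhaustive search."""
    n_windows = len(series) - length + 1
    if length < 1 or n_windows < length + 1:
        raise ValueError("series too short for two non-overlapping windows")
    windows = [series[i:i + length] for i in range(n_windows)]

    best_sq = math.inf
    best_pair = None
    for i in range(n_windows):
        # Start at i + length to skip overlapping (trivial) matches.
        for j in range(i + length, n_windows):
            d = early_abandon_sqdist(windows[i], windows[j], best_sq)
            if d < best_sq:
                best_sq, best_pair = d, (i, j)
    return best_pair[0], best_pair[1], math.sqrt(best_sq)


# The pattern [0, 1, 2] occurs at positions 0 and 5.
motif = find_motif_pair([0, 1, 2, 9, 9, 0, 1, 2, 7], length=3)
```

The search is quadratic in the number of windows, which is why slides 26 and 27 pair it with pruning strategies and lower bounds; early abandoning is the simplest such pruning and leaves the result exact.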