Download - Landmark Detection in Hindustani Music Melodies

Landmark Detection in Hindustani Music Melodies

Sankalp Gulati1, Joan Serrà2, Kaustuv K. Ganguli3 and Xavier Serra1

[email protected], [email protected], [email protected], [email protected]

1Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain 2Artificial Intelligence Research Institute (IIIA-CSIC), Bellaterra, Barcelona, Spain

3Indian Institute of Technology Bombay, Mumbai, India

SMC-ICMC 2014, Athens, Greece

5

Page 2: Landmark Detection in Hindustani Music Melodies

Indian Art Music

§  Hindustani music (North Indian music)

§  Carnatic music

Page 3: Landmark Detection in Hindustani Music Melodies

Melodies in Hindustani Music

§  Rāg: melodic framework of Indian art music

Page 4: Landmark Detection in Hindustani Music Melodies

S r R g G m M P d D n N

Do Re Mi Fa Sol La Ti

Svaras

Page 5: Landmark Detection in Hindustani Music Melodies

Svaras

Bhairavi Thaat S r g m P d n

Page 6: Landmark Detection in Hindustani Music Melodies

Svaras

Bhairavi Thaat

*Nyās Svar (Rāg Bilaskhani todi) S r* g* m* P d* n

S r g m P d n

Nyās translates to home/residence

Page 7: Landmark Detection in Hindustani Music Melodies

§  Example

Melodic Landmark: Nyās Occurrences

178 180 182 184 186 188 190−200

0

200

400

600

800

Time (s)

F0 frequ

ency

(ce

nts)

N1 N5N4N3N2

A.K.Dey,Nyāsainrāga:thepleasantpauseinHindustanimusic.KanishkaPublishers,Distributors,2008.

Page 8: Landmark Detection in Hindustani Music Melodies

Goal and Motivation

§  Methodology for detecting nyās occurrences

Page 9: Landmark Detection in Hindustani Music Melodies

Goal and Motivation

§  Methodology for detecting nyās occurrences N N N N N

Page 10: Landmark Detection in Hindustani Music Melodies

Goal and Motivation

§  Methodology for detecting nyās occurrences

§  Motivation §  Melodic motif discovery [Ross and Rao 2012] §  Melodic segmentation §  Music transcription

N N N N N

J.C.RossandP.Rao,“DetecKonofraga-characterisKcphrasesfromHindustaniclassicalmusicaudio,”inProc.of2ndCompMusicWorkshop,2012,pp.133–138.

Page 11: Landmark Detection in Hindustani Music Melodies

Methodology: Block Diagram

Predominant pitch estimation Tonic identification

Audio

Nyās svars

Pred. pitch est. and representation

Histogram computation

Segmentation Svar identification Segmentation

Local Feature extraction Contextual Local + Contextual

Segment classification Segment fusion Segment classification and fusion

Blockdiagramoftheproposedmethodology

Page 12: Landmark Detection in Hindustani Music Melodies

Methodology: Pred. Pitch Estimation

Audio

Nyās svars

Page 13: Landmark Detection in Hindustani Music Melodies

Methodology: Segmentation

Audio

Nyās svars

Page 14: Landmark Detection in Hindustani Music Melodies

Methodology: Feature Extraction

Audio

Nyās svars

0.12, 0.34, 0.59, 0.23, 0.54

0.21, 0.24, 0.54, 0.54, 0.42

0.32, 0.23, 0.34, 0.41, 0.63

0.66, 0.98, 0.74, 0.33, 0.12

0.90, 0.42, 0.14, 0.83, 0.76

Page 15: Landmark Detection in Hindustani Music Melodies

Methodology: Segment Classification

Audio

Nyās svars

0.12, 0.34, 0.59, 0.23, 0.54

0.21, 0.24, 0.54, 0.54, 0.42

0.32, 0.23, 0.34, 0.41, 0.63

0.66, 0.98, 0.74, 0.33, 0.12

0.90, 0.42, 0.14, 0.83, 0.76

Page 16: Landmark Detection in Hindustani Music Melodies

Methodology: Segment Classification

Audio

Nyās svars

Nyās

Non nyās

Nyās

Non nyās

Page 17: Landmark Detection in Hindustani Music Melodies

Pred. Pitch Estimation and Representation

§  Predominant pitch estimation §  Method by Salamon and Gómez (2012) §  Favorable results in MIREX’11

§  Tonic Normalization §  Pitch values converted from Hertz to Cents §  Multi-pitch approach by Gulati et al. (2014)

J.SalamonandE.Gomez,“MelodyextracKonfrompolyphonicmusicsignalsusingpitchcontourcharac-terisKcs,”IEEETransacKonsonAudio,Speech,andLanguageProcessing,vol.20,no.6,pp.1759–1770,2012.S.GulaK,A.Bellur,J.Salamon,H.Ranjani,V.Ishwar,H.A.Murthy,andX.Serra,“AutomaKcTonicIdenK-ficaKoninIndianArtMusic:ApproachesandEvalua-Kon,”JournalofNewMusicResearch,vol.43,no.01,pp.55–73,2014.

Page 18: Landmark Detection in Hindustani Music Melodies

Melody Segmentation

1 2 3 4 5 6 7 8 9 10

850

900

950

1000

1050

1100

1150

T1

Sn

Sn-1

Sn+1

ε ρ1 T2 T3

ρ2

T4 T7 T8 T9

T5 T6

Nyās segment

Time (s)

Fund

amen

tal F

requ

ency

(cen

ts)

§  Baseline: Piecewise linear segmentation (PLS)

E.Keogh,S.Chu,D.Hart,andM.Pazzani,“SegmenKngKmeseries:Asurveyandnovelapproach,”DataMininginTimeSeriesDatabases,vol.57,pp.1–22,2004.

Page 19: Landmark Detection in Hindustani Music Melodies

Feature Extraction

Page 20: Landmark Detection in Hindustani Music Melodies

Feature Extraction

§  Local (9 features)

Page 21: Landmark Detection in Hindustani Music Melodies

Feature Extraction

§  Local (9 features) §  Contextual (24 features)

Page 22: Landmark Detection in Hindustani Music Melodies

Feature Extraction

§  Local (9 features) §  Contextual (24 features) §  Local + Contextual (33 features)

Page 23: Landmark Detection in Hindustani Music Melodies

Feature Extraction: Local

§  Segment Length

l

Page 24: Landmark Detection in Hindustani Music Melodies

§  Segment Length §  μ and σ of pitch values

μσ

Page 25: Landmark Detection in Hindustani Music Melodies

§  Segment Length §  μ and σ of pitch values §  μ and σ of difference in adjacent peaks and valley

locations

d1 d2 d3 d4 d5

Page 26: Landmark Detection in Hindustani Music Melodies

locations §  μ and σ of the peak and valley amplitudes

p1

p2

p3

p4

p5

p6

Page 27: Landmark Detection in Hindustani Music Melodies

locations §  μ and σ of the peak and valley amplitudes §  Temporal centroid (length normalized)

Page 28: Landmark Detection in Hindustani Music Melodies

locations §  μ and σ of the peak and valley amplitudes §  Temporal centroid (length normalized) §  Binary flatness measure

Flatornon-flat

Page 29: Landmark Detection in Hindustani Music Melodies

Feature Extraction: Contextual

§  4 different normalized segment lengths

L1

L2

BreathPause

Page 30: Landmark Detection in Hindustani Music Melodies

L1

L2

BreathPause

Page 31: Landmark Detection in Hindustani Music Melodies

L1

L2

BreathPause

Page 32: Landmark Detection in Hindustani Music Melodies

L1

L2

BreathPause

Page 33: Landmark Detection in Hindustani Music Melodies

§  4 different normalized segment lengths §  Time difference from the succeeding and preceding

breath pauses

d1 d2

BreathPause

Page 34: Landmark Detection in Hindustani Music Melodies

§  4 different normalized segment lengths §  Time difference from the succeeding and

proceeding unvoiced regions §  Local features of neighboring segments (9*2=18)

BreathPause

Page 35: Landmark Detection in Hindustani Music Melodies

Segment Classification

§  Class: Nyās and Non-nyās §  Classifiers:

§  Trees (min_sample_split=10) §  K nearest neighbors (n_neighbors=5) §  Naive bayes (fit_prior=False) §  Logistic regression (class_weight=‘auto’) §  Support vector machines (RBF)(class weight=‘auto’)

§  Testing methodology §  Cross-fold validation

§  Software: Scikit-learn, version 0.14.1

F.Pedregosa,G.Varoquaux,A.Gramfort,V.Michel,B.Thirion,O.Grisel,M.Blondel,P.Prejenhofer,R.Weiss,V.Dubourg,J.Vanderplas,A.Passos,D.Cournapeau,M.Brucher,M.Perrot,andE.Duches-nay,“Scikit-learn:machinelearninginPython,”JournalofMachineLearningResearch,vol.12,pp.2825–2830,2011

Page 36: Landmark Detection in Hindustani Music Melodies

Evaluation: Dataset

§  Audio: §  Number of recordings: 20 Ālap vocal pieces §  Duration of recordings: 1.5 hours §  Number of artists: 8 §  Number of rāgs: 16 §  Type of audio: 15 polyphonic commercial recordings, 5

in-house monophonic recordings**

§  Annotations: Musician with > 15 years of training §  Statistics:

§  1257 nyās segments §  150 ms to16.7 s §  mean 2.46 s, median 1.47 s.

**OpenlyavailableunderCClicenseinfreesound.org

Page 37: Landmark Detection in Hindustani Music Melodies

Evaluation: Measures

§  Boundary annotations (F-scores) §  Hit rate §  Allowed deviation: 100 ms

§  Label annotations (F-scores): §  Pairwise frame clustering method [Levy and Sandler

2008]

§  Statistical significance: Mann-Whitney U test (p=0.05)

§  Multiple comparison: Holm-Bonferroni method

M.LevyandM.Sandler,“StructuralsegmentaKonofmusicalaudiobyconstrainedclustering,”IEEETransacKonsonAudio,Speech,andLanguageProcessing,vol.16,no.2,pp.318–326,2008.H.B.MannandD.R.Whitney,“OnatestofwhetheroneoftworandomvariablesisstochasKcallylargerthantheother,”TheannalsofmathemaKcalstaKsKcs,vol.18,no.1,pp.50–60,1947.S.Holm,“AsimplesequenKallyrejecKvemulKpletestprocedure,”ScandinavianjournalofstaKsKcs,pp.65–70,1979.

Page 38: Landmark Detection in Hindustani Music Melodies

Evaluation: Baseline Approach

§  DTW based kNN classification (k=5) §  Frequently used for time series classification

§  Random baselines §  Randomly planting boundaries §  Evenly planting boundaries at every 100 ms * §  Ground truth boundaries, randomly assign class labels

X.Xi,E.J.Keogh,C.R.Shelton,L.Wei,andC.A.Ratanamahatana,“FastKmeseriesclassificaKonusingnumerosityreducKon,”inProc.oftheInt.Conf.onMa-chineLearning,2006,pp.1033–1040.X.Wang,A.Mueen,H.Ding,G.Trajcevski,P.Scheuermann,andE.J.Keogh,“Experimentalcom-parisonofrepresentaKonmethodsanddistancemea-suresforKmeseriesdata,”DataMiningandKnowl-edgeDiscovery,vol.26,no.2,pp.275–309,2013.

Page 39: Landmark Detection in Hindustani Music Melodies

Results: Nyās Boundary Annotation

Feat. DTW Tree KNN NB LR SVM

AL 0.356 0.407 0.447 0.248 0.449 0.453C 0.284 0.394 0.387 0.383 0.389 0.406

L+C 0.289 0.414 0.426 0.409 0.432 0.437

BL 0.524 0.672 0.719 0.491 0.736 0.749C 0.436 0.629 0.615 0.641 0.621 0.673

L+C 0.446 0.682 0.708 0.591 0.725 0.735

Table 1. F-scores for nyas boundary detection using PLSmethod (A) and the proposed segmentation method (B).Results are shown for different classifiers (Tree, KNN,NB, LR, SVM) and local (L), contextual (C) and local to-gether with contextual (L+C) features. DTW is the base-line method used for comparison. F-score for the randombaseline obtained using RB2 is 0.184.

AL 0.553 0.685 0.723 0.621 0.727 0.722C 0.251 0.639 0.631 0.690 0.688 0.674

L+C 0.389 0.694 0.693 0.708 0.722 0.706

BL 0.546 0.708 0.754 0.714 0.749 0.758C 0.281 0.671 0.611 0.697 0.689 0.697

L+C 0.332 0.672 0.710 0.730 0.743 0.731

Table 2. F-scores for nyas and non-nyas label annotationstask using PLS method (A) and the proposed segmenta-tion method (B). Results are shown for different classifiers(Tree, KNN, NB, LR, SVM) and local (L), contextual (C)and local together with contextual (L+C) features. DTW isthe baseline method used for comparison. The best randombaseline F-score is 0.153 obtained using RB2.

mentation method consistently performs better than PLS.However, the differences are not statistically significant.

In addition, we also investigate per-class accuracies forlabel annotations. We find that the performance for nyassegments is considerably better than non-nyas segments.This could be attributed to the fact that even though thesegment classification accuracy is balanced across classes,the differences in segment length of nyas and non-nyassegments (nyas segments being considerably longer thannon-nyas segments) can result in more number of matchedpairs for nyas segments.

In general, we see that the proposed segmentation methodimproves the performance over PLS method in both tasks,wherein the differences are statistically significant in theformer case. Furthermore, the local feature set, when com-bined with the proposed segmentation method, yields thebest accuracies. We also find that the contextual featuresdo not complement the local features to further improve theperformance. However, interestingly, they perform reason-ably good considering that they only use contextual infor-mation.

5. CONCLUSIONS AND FUTURE WORK

We proposed a method for detecting nyas segments in mel-odies of Hindustani music. We divided the task into twobroad steps: melody segmentation and segment classifi-cation. For melody segmentation we proposed a methodwhich incorporates domain knowledge to facilitate nyasboundary annotations. We evaluated three feature sets: lo-cal, contextual and the combination of both. We showedthat the performance of the proposed method is signifi-cantly better compared to a baseline method using stan-dard dynamic time warping based distance and a K near-est neighbor classifier. Furthermore, we showed that theproposed segmentation method outperforms a standard ap-proach based on piece-wise linear segmentation. A featureset that includes only the local features was found to per-form best. However, we showed that using just the con-textual information we could also achieve a reasonable ac-curacy. This indicates that nyas segments have a definedmelodic context which can be learned automatically. Inthe future we plan to perform this task on Bandish perfor-mances, which is a compositional form in Hindustani mu-sic. We also plan to investigate other melodic landmarksand different evaluation measures for label annotations.

Acknowledgments

This work is partly supported by the European ResearchCouncil under the European Union’s Seventh FrameworkProgram, as part of the CompMusic project (ERC grantagreement 267583). J.S. acknowledges 2009-SGR-1434from Generalitat de Catalunya, ICT-2011-8-318770 fromthe European Commission, JAEDOC069/2010 from CSIC,and European Social Funds.

6. REFERENCES

[1] A. D. Patel, Music, language, and the brain. Oxford,UK: Oxford University Press, 2007.

[2] W. S. Rockstro, G. Dyson, W. Drabkin, H. S. Pow-ers, and J. Rushton, “Cadence,” in Grove music online,L. Macy, Ed. Oxford University Press, 2001.

[3] P. Sambamoorthy, South Indian music vol. I-VI. TheIndian Music Publishing House, 1998.

[4] A. K. Dey, Nyasa in raga: the pleasant pause in Hin-dustani music. Kanishka Publishers, Distributors,2008.

[5] K. K. Ganguli, “How do we ‘see’ & ‘say’ a raga: aperspective canvas,” Samakalika Sangeetham, vol. 4,no. 2, pp. 112–119, 2013.

[6] G. K. Koduri, S. Gulati, P. Rao, and X. Serra, “Ragarecognition based on pitch distribution methods,” Jour-nal of New Music Research, vol. 41, no. 4, pp. 337–350, 2012.

[7] P. Rao, J. C. Ross, K. K. Ganguli, V. Pandit, V. Ish-war, A. Bellur, and H. A. Murthy, “Classificationof Melodic Motifs in Raga Music with Time-seriesMatching,” Journal of New Music Research, vol. 43,no. 1, pp. 115–131, Jan. 2014.

Page 40: Landmark Detection in Hindustani Music Melodies

AL 0.356 0.407 0.447 0.248 0.449 0.453C 0.284 0.394 0.387 0.383 0.389 0.406

L+C 0.289 0.414 0.426 0.409 0.432 0.437

BL 0.524 0.672 0.719 0.491 0.736 0.749C 0.436 0.629 0.615 0.641 0.621 0.673

L+C 0.446 0.682 0.708 0.591 0.725 0.735

AL 0.553 0.685 0.723 0.621 0.727 0.722C 0.251 0.639 0.631 0.690 0.688 0.674

L+C 0.389 0.694 0.693 0.708 0.722 0.706

BL 0.546 0.708 0.754 0.714 0.749 0.758C 0.281 0.671 0.611 0.697 0.689 0.697

L+C 0.332 0.672 0.710 0.730 0.743 0.731

Acknowledgments

6. REFERENCES

Page 41: Landmark Detection in Hindustani Music Melodies

AL 0.356 0.407 0.447 0.248 0.449 0.453C 0.284 0.394 0.387 0.383 0.389 0.406

L+C 0.289 0.414 0.426 0.409 0.432 0.437

BL 0.524 0.672 0.719 0.491 0.736 0.749C 0.436 0.629 0.615 0.641 0.621 0.673

L+C 0.446 0.682 0.708 0.591 0.725 0.735

AL 0.553 0.685 0.723 0.621 0.727 0.722C 0.251 0.639 0.631 0.690 0.688 0.674

L+C 0.389 0.694 0.693 0.708 0.722 0.706

BL 0.546 0.708 0.754 0.714 0.749 0.758C 0.281 0.671 0.611 0.697 0.689 0.697

L+C 0.332 0.672 0.710 0.730 0.743 0.731

Acknowledgments

6. REFERENCES

Page 42: Landmark Detection in Hindustani Music Melodies

AL 0.356 0.407 0.447 0.248 0.449 0.453C 0.284 0.394 0.387 0.383 0.389 0.406

L+C 0.289 0.414 0.426 0.409 0.432 0.437

BL 0.524 0.672 0.719 0.491 0.736 0.749C 0.436 0.629 0.615 0.641 0.621 0.673

L+C 0.446 0.682 0.708 0.591 0.725 0.735

AL 0.553 0.685 0.723 0.621 0.727 0.722C 0.251 0.639 0.631 0.690 0.688 0.674

L+C 0.389 0.694 0.693 0.708 0.722 0.706

BL 0.546 0.708 0.754 0.714 0.749 0.758C 0.281 0.671 0.611 0.697 0.689 0.697

L+C 0.332 0.672 0.710 0.730 0.743 0.731

Acknowledgments

6. REFERENCES

Page 43: Landmark Detection in Hindustani Music Melodies

AL 0.356 0.407 0.447 0.248 0.449 0.453C 0.284 0.394 0.387 0.383 0.389 0.406

L+C 0.289 0.414 0.426 0.409 0.432 0.437

BL 0.524 0.672 0.719 0.491 0.736 0.749C 0.436 0.629 0.615 0.641 0.621 0.673

L+C 0.446 0.682 0.708 0.591 0.725 0.735

AL 0.553 0.685 0.723 0.621 0.727 0.722C 0.251 0.639 0.631 0.690 0.688 0.674

L+C 0.389 0.694 0.693 0.708 0.722 0.706

BL 0.546 0.708 0.754 0.714 0.749 0.758C 0.281 0.671 0.611 0.697 0.689 0.697

L+C 0.332 0.672 0.710 0.730 0.743 0.731

Acknowledgments

6. REFERENCES

Page 44: Landmark Detection in Hindustani Music Melodies

AL 0.356 0.407 0.447 0.248 0.449 0.453C 0.284 0.394 0.387 0.383 0.389 0.406

L+C 0.289 0.414 0.426 0.409 0.432 0.437

BL 0.524 0.672 0.719 0.491 0.736 0.749C 0.436 0.629 0.615 0.641 0.621 0.673

L+C 0.446 0.682 0.708 0.591 0.725 0.735

AL 0.553 0.685 0.723 0.621 0.727 0.722C 0.251 0.639 0.631 0.690 0.688 0.674

L+C 0.389 0.694 0.693 0.708 0.722 0.706

BL 0.546 0.708 0.754 0.714 0.749 0.758C 0.281 0.671 0.611 0.697 0.689 0.697

L+C 0.332 0.672 0.710 0.730 0.743 0.731

Acknowledgments

6. REFERENCES

Page 45: Landmark Detection in Hindustani Music Melodies

Results: Nyās Label Annotation

AL 0.356 0.407 0.447 0.248 0.449 0.453C 0.284 0.394 0.387 0.383 0.389 0.406

L+C 0.289 0.414 0.426 0.409 0.432 0.437

BL 0.524 0.672 0.719 0.491 0.736 0.749C 0.436 0.629 0.615 0.641 0.621 0.673

L+C 0.446 0.682 0.708 0.591 0.725 0.735

AL 0.553 0.685 0.723 0.621 0.727 0.722C 0.251 0.639 0.631 0.690 0.688 0.674

L+C 0.389 0.694 0.693 0.708 0.722 0.706

BL 0.546 0.708 0.754 0.714 0.749 0.758C 0.281 0.671 0.611 0.697 0.689 0.697

L+C 0.332 0.672 0.710 0.730 0.743 0.731

Acknowledgments

6. REFERENCES

Page 46: Landmark Detection in Hindustani Music Melodies

Conclusions and Future work

§  Proposed segmentation better than PLS method §  Proposed methodology better than standard DTW based

kNN classification §  Local features yield highest accuracy §  Contextual features are also important (maybe not

complementary to local features)

§  Future work §  Perform similar analysis on Bandish performances §  Incorporate Rāga specific knowledge

Landmark Detection in Hindustani Music Melodies

Sankalp Gulati1, Joan Serrà2, Kaustuv K. Ganguli3 and Xavier Serra1

[email protected], [email protected], [email protected], [email protected]

1Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain 2Artificial Intelligence Research Institute (IIIA-CSIC), Bellaterra, Barcelona, Spain

3Indian Institute of Technology Bombay, Mumbai, India

SMC-ICMC 2014, Athens, Greece

5

Download - Landmark Detection in Hindustani Music Melodies

Top Related