2006 International Joint Conference on Neural Networks, Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada, July 16-21, 2006. 0-7803-9490-9/06/$20.00 ©2006 IEEE


Automatic Knowledge Acquisition: Recognizing Music Notation

with Methods of Centroids and Classifications Trees

Wladyslaw Homenda and Marcin Luckner

Abstract-This paper presents a pattern recognition study aimed at music symbols recognition. The study is focused on classification methods of music symbols based on decision trees and clustering method applied to classes of music symbols that face classification problems. Classification is made on the basis of extracted features. A comparison of selected classifiers was made on some classes of notation symbols distorted by a variety of factors as image noise, printing defects, different fonts, skew and curvature of scanning, overlapped symbols.

I. INTRODUCTION

[...]ies and computer science. Music notation recognition, as a case of pattern recognition, still faces research challenges. This area, by contrast to OCR, Optical Character Recognition, technologies of printed text recognition, is identified as OMR, Optical Music Recognition. OCR is now a well developed technology with almost 100% recognition efficiency. On the contrary, OMR is the technology under development. As it was stated in [7], [9], OMR raises difficulties common to general pattern recognition as well as domain specific problems. Scanned music notations subjected to recognition are blurred, noised, fragmented or overlapped in printing; rotated and shifted symbol placement; skewed and curved scanning, etc. On the other hand, music symbols' appearance is highly irregular: symbols may be densely crowded in one region and sparsely placed in others. Instances of the same symbol lie on, above or below staff lines. Thus, copies of the same symbol may be affected by staff lines or isolated from staff lines influence. A further difficulty is raised by irregular sizing and shaping of music symbols. Music notation includes symbols of the full range of size: starting from small dots (staccato symbol or rhythmic value prolongation of a note or a rest) and ending with page width staff lines, arcs or dynamic hairpins. Sophisticated shaping of music symbols would be illustrated by clefs, rests, articulation markings, etc., cf. Figure 1.

As a consequence, methods employed to music notation recognition must be flexible enough to overcome outlined irregularities, insensitive to all sorts of distortions, capable of producing results handy for storing in the computer memory and for further processing, cf. [7].

Fig. 1. An example of musical score

Wladyslaw Homenda is with the Faculty of Mathematics and Information Science, Warsaw University of Technology, pl. Politechniki 1, 00-661 Warsaw, Poland (phone: (48)(22) 621-93-12; fax: (48)(22) 625-74-60; email: homenda@mini.pw.edu.pl).

Marcin Luckner is with the Faculty of Geodesy and Cartography, Warsaw University of Technology, pl. Politechniki 1, 00-661 Warsaw, Poland (phone: (48)(22) 621-93-12; fax: (48)(22) 625-74-60; email: mluckner@gik.pw.edu.pl).

There is no universal pattern recognition method that could be effectively applied to music notation recognition. The variety of methods that could be used range from neural networks, cf. [9], through statistical methods, cf. [14], to syntactical structuring and granular computing, cf. [8]. A number of papers were published on music notation recognition, cf. selected papers [4], [5], [12], [14]. The critical survey of different early attempts to music notation recognition and its synthesis was made in [1], [3].

Despite the rich track of research, the OMR technology is still far from being perfect. The subject OMR deals with is far more difficult than character recognition. On the other hand, music notation does not have a unique definition. Publishers apply different rules and different fonts. Furthermore, music notation is a live language of communication and has been developing for the past centuries and will be evolving in the future.

Fig. 2. Classes of symbols subjected to recognition: forte, piano, quarter, eighth and sixteenth rest

II. METHODOLOGY

In this section the methods of centroids and classification trees are investigated. Simple symbol features such as projections, histograms and width/height proportions are contrasted with Zernike moments, the statistical PCA method and linear combinations of features. These methods are cast on two approaches: classification with and without rejection.

It is evident that the correct recognition of symbols decides about the recognizer quality. Thus, the recognition rate is the most important parameter of evaluation of recognizers. Recognition time is less evident though an important factor of recognizers. This factor is critical for online recognizers which should follow a live process. In case of offline recognizers, short recognition time is not critical though still important. Learning time is of a lesser importance in the evaluation of recognizers. Both classification methods, classification trees and centroids, were evaluated from this perspective. The discussed methods should be used instead of complex methods if their results are similar.

Neural networks classifiers are examples of complex methods. Results for the same issue produced by neural networks are given in [9]. The recognition rate for a Multi Layer Perceptron with 24-18-5 neurons structure exceeds 99.6% for the classification without rejection. The results for the classification with rejection were a little worse and attained 99.7%. However, for the second issue the structure of the network was much extended (222-4G4). In such a case the time of classification (complexity of the algorithms) becomes important as there are more features to compute. Research on a faster classifier seems to be justified.

Fig. 3. Results of classification without rejection for different methods and for various data sets.

A. Classification with and without rejection

Two different approaches were used to study music symbols classification. The first assumes that every symbol subjected to classification belongs to one of the given classes of symbols and, of course, it should be classified to one of those classes. This approach is called the classification without rejection. Classification without rejection is impractical in many circumstances since usually automatic recognizers should have extraneous symbols and garbage rejected. Nevertheless, classification without rejection can be used as a testing tool for the evaluation of recognizers and, in some cases, can be directly applied in practice.

Classification with rejection does not assume that every symbol should belong to one of the given classes of symbols. The classification should be able to decide that a symbol belongs either to one of the given classes or it is an extraneous symbol and should not be classified. Classification with rejection cannot be turned to classification without rejection by adding a supplementary class that collects all extraneous symbols. Classification without rejection requires a definition of every class in terms of features of symbols. An extra class collecting unclassified symbols of classification with rejection obviously cannot be defined a priori or by a learning procedure. The question how to reject symbols which do not belong to recognized classes is a fundamental one. A discussion on some aspects of this question is included.

B. Symbols and features

Five classes of music symbols were investigated in this study, namely: forte, piano, rest 1/4, rest 1/8, rest 1/16, cf. Figure 2. Each class included a set of 300 symbols extracted from 90 scores. Every class was evenly split into training and testing sets of 150 symbols in each one with a random choice of symbols.

Fig. 4. Classification tree, classification without rejection, split decision based on single features.

Symbols were characterized by the basic set of features. Because the recognized symbols are not restricted by musical rules, no significant feature can be created from prior knowledge about a score and universal features should be used. The vector of 278 features is represented by normalized values of the unit interval [0, 1]. In most cases discussed in the paper the dimensionality of the data space was reduced to 47. The features' set was based on the set that had been discussed in [13]. The inner variation of each class as well as the variation between classes were computed. The reduced set included the features with the minimum rate of inner variation to the variation between classes. This set includes such features as histograms, density (rate of black pixels to all pixels), symbol direction (a direction of a symbol is the direction of the longest run of black pixels taken in any vertical, horizontal and diagonal direction). The computation of any feature from the basic set is cheaper compared to computation of other sets of features.
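Two of the basic features named above, density and symbol direction, can be illustrated with a minimal sketch. It assumes a symbol stored as a binary image (a list of rows, 1 for a black pixel); the function names and the tiny test image are hypothetical and are not taken from the paper.

```python
def density(img):
    """Rate of black pixels (1) to all pixels in a binary image (list of rows)."""
    total = sum(len(row) for row in img)
    return sum(sum(row) for row in img) / total


def longest_run(img, dr, dc):
    """Length of the longest run of black pixels along the step (dr, dc)."""
    rows, cols = len(img), len(img[0])
    best = 0
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1:
                continue
            # Skip pixels that are not the first pixel of their run.
            pr, pc = r - dr, c - dc
            if 0 <= pr < rows and 0 <= pc < cols and img[pr][pc] == 1:
                continue
            length, rr, cc = 0, r, c
            while 0 <= rr < rows and 0 <= cc < cols and img[rr][cc] == 1:
                length += 1
                rr, cc = rr + dr, cc + dc
            best = max(best, length)
    return best


# Directions considered: horizontal, vertical and both diagonals.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal": (1, 1),
    "antidiagonal": (1, -1),
}


def symbol_direction(img):
    """Name of the direction that holds the longest run of black pixels."""
    return max(DIRECTIONS, key=lambda name: longest_run(img, *DIRECTIONS[name]))


# Hypothetical 4x4 symbol: a vertical stroke with a small foot.
stem = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
print(density(stem), symbol_direction(stem))
```

Both features need a single pass over the pixels per direction, which is in line with the remark that the basic set is cheap to compute.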

An alternative features' set was constructed using the first 47 Zernike's moments, cf. [10]. Zernike moments are often used as image features because they are invariant to translation, scaling and rotation. Rotation invariance distinguishes Zernike moments from other feature extractors which usually are sensitive not only to rotation, but often to translation and scaling. Unfortunately, in most cases the cost of computation of Zernike moments is relatively higher than the cost of computation of other features from the basic set.
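The rotation invariance of Zernike moment magnitudes can be checked with a small sketch. It assumes a square binary image mapped onto the unit disk by pixel centres and uses the standard radial polynomial; the pixel-area normalization is omitted, which does not affect the invariance check. The function names and the test image are hypothetical, and this is not the implementation used in the study.

```python
import cmath
import math


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho) for n >= |m| and n - |m| even."""
    m = abs(m)
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * math.factorial(n - k)
                 / (math.factorial(k)
                    * math.factorial((n + m) // 2 - k)
                    * math.factorial((n - m) // 2 - k)))
        total += coeff * rho ** (n - 2 * k)
    return total


def zernike_moment(img, n, m):
    """Complex Zernike moment of a square binary image mapped onto the unit disk."""
    size = len(img)
    acc = 0j
    for r in range(size):
        for c in range(size):
            if not img[r][c]:
                continue
            # Pixel centre in [-1, 1] x [-1, 1]; pixels outside the disk are skipped.
            x = (2 * c + 1 - size) / size
            y = (2 * r + 1 - size) / size
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue
            theta = math.atan2(y, x)
            acc += img[r][c] * radial_poly(n, m, rho) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * acc


def rotate90(img):
    """Rotate a square image by 90 degrees."""
    size = len(img)
    return [[img[size - 1 - c][r] for c in range(size)] for r in range(size)]


# Hypothetical 4x4 symbol.
symbol = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(abs(zernike_moment(symbol, 2, 2)), abs(zernike_moment(rotate90(symbol), 2, 2)))
```

The two printed magnitudes agree up to floating-point error, because a rotation only multiplies the complex moment by a phase factor. The nested loops over pixels and the factorial terms also show why these features cost more than the basic set.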

C. Method of centroids

In the method of centroids each class is represented by a central point of the class symbols of the learning set. A new symbol is classified to the class with a centroid closest to the classified symbol. The distance is expressed in terms of a metric in the space of features.
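The idea can be sketched in a few lines. The Euclidean metric, the two-dimensional feature vectors and all names below are assumptions made for illustration; the paper does not state which metric was used.

```python
import math


def centroid(vectors):
    """Coordinate-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train_centroids(learning_set):
    """Map each class label to the centroid of its learning symbols."""
    return {label: centroid(vectors) for label, vectors in learning_set.items()}


def classify(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean metric assumed)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


# Hypothetical learning set with two features per symbol.
learning_set = {
    "forte": [[0.9, 0.1], [0.8, 0.2]],
    "piano": [[0.1, 0.9], [0.2, 0.7]],
}
centroids = train_centroids(learning_set)
print(classify(centroids, [0.7, 0.3]))
```

Classification costs one distance computation per class, so the running time grows with the number of features used.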

The method of centroids is based on k-means algorithm, cf. [2]. Each class is represented by its centroid. The centroid's coordinates are calculated as means of the class's symbols of the learning set. As was stated, the classification of a new symbol relies on finding the closest centroid. Since the computation time of this process is proportional to the dimension of the data space, the features' space dimensionality was reduced to 10 (first features from the reduced set). Decreased dimensionality brings another benefit. It allows to diminish conditions under which a significant feature disappears due to its similarity to other ones or dependency on others.

Fig. 5. Classification tree, classification without rejection, split decision based on linear combination of features.

The QUEST (Quick, Unbiased, Efficient Statistical Trees) algorithm was used to build classification trees, cf. [11]. Due to the length of the description of the algorithm and its complexity only a brief description is given. Also, to simplify tests and decrease the complexity of computation it is assumed that all features used in this algorithm are ordered. However, the original algorithm does not require such an assumption.

The algorithm builds a binary tree. It starts from a single node that groups all learning examples. In each step two new nodes are created. First the best terminal node to split is chosen and a feature that will be of use to the split decision. For ordered features the p-levels are computed for ANOVAs of the relationship of the classes to the values of the examples that are present at the node. The smallest p-level shows the appropriate split feature. Usually an upper limit for the smallest p-value is raised. The typical value of this limit is equal to 0.05 of p-level. If no p-level smaller than the threshold p-level is found, p-levels are computed for statistical tests. This method gives p-levels robust to changes of distribution.
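The feature selection step can be sketched as follows. This is a simplified illustration, not the QUEST implementation: it ranks features by the one-way ANOVA F statistic instead of the p-level. Since every feature at a node is tested on the same examples and classes, the degrees of freedom are identical, so the largest F corresponds to the smallest p-level. The 0.05 limit and the fallback tests are left out, and the function names are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of per-class value lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if within == 0:
        # Degenerate case: no spread inside the classes.
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def best_split_feature(samples, labels):
    """Index of the feature with the largest F statistic at a node."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        groups = [[x[j] for x, y in zip(samples, labels) if y == c] for c in classes]
        scores.append(anova_f(groups))
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical node: feature 0 separates the classes, feature 1 does not.
samples = [[0.1, 0.5], [0.2, 0.4], [0.9, 0.5], [0.8, 0.4]]
labels = ["forte", "forte", "piano", "piano"]
print(best_split_feature(samples, labels))
```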

To determine a split threshold value the 2-mean clustering algorithm is used [6]. The algorithm creates two "metaclasses". Roots calculated for a quadratic equation, which describe the difference in the "metaclasses" means, are candidates for the split threshold value. Symbols are split by the threshold and create two new nodes. The entire procedure is repeated for a next terminal node.
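The paper does not spell out the quadratic equation. One common reading, used here purely as an assumption, models each "metaclass" as a one-dimensional Gaussian and takes the threshold where the two prior-weighted densities are equal, which leads to a quadratic in the feature value. The function name, its parameters and the rule for choosing between the two roots are hypothetical, and the sketch expects two distinct means.

```python
import math


def split_threshold(m1, s1, m2, s2, p1=0.5, p2=0.5):
    """Point where two prior-weighted Gaussian densities are equal.

    m, s and p are the mean, standard deviation and prior of each metaclass.
    """
    v1, v2 = s1 * s1, s2 * s2
    # Coefficients of a*x^2 + b*x + c = 0 from equating the log densities.
    a = 1 / (2 * v2) - 1 / (2 * v1)
    b = m1 / v1 - m2 / v2
    c = m2 * m2 / (2 * v2) - m1 * m1 / (2 * v1) + math.log(p1 * s2 / (p2 * s1))
    if abs(a) < 1e-12:
        # Equal variances: the equation degenerates to a linear one.
        return -c / b
    disc = b * b - 4 * a * c
    if disc < 0:
        # No real root: fall back to the midpoint of the means.
        return (m1 + m2) / 2
    root = math.sqrt(disc)
    candidates = [(-b + root) / (2 * a), (-b - root) / (2 * a)]
    # Of the two candidate roots, keep the one closest to the midpoint of the means.
    mid = (m1 + m2) / 2
    return min(candidates, key=lambda x: abs(x - mid))


print(split_threshold(0.0, 1.0, 4.0, 1.0))
print(split_threshold(0.0, 1.0, 4.0, 2.0))
```

With equal variances and priors the threshold is the midpoint of the two means; with unequal variances it moves towards the narrower metaclass.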

Each node is labeled by the "metaclass" with a bigger number of symbols. If the number of other cases is small enough, the node is taken as a leaf (five instances in our case).

The classification process starts from the root. A value of a feature corresponding to the node is compared with the threshold to decide which node should be chosen. A class of a symbol is determined when a leaf is found.
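The traversal described above can be sketched as a short loop. The node layout, the convention that values not greater than the threshold go left, and the example tree are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    label: str                       # class assigned to the node
    feature: Optional[int] = None    # index of the split feature, None for a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None    # followed when value <= threshold
    right: Optional["Node"] = None   # followed when value > threshold


def classify_with_tree(root: Node, x: Sequence[float]) -> str:
    """Walk from the root to a leaf, comparing one feature per decision node."""
    node = root
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Hypothetical tree with two decision nodes and three leaves.
tree = Node(
    label="forte", feature=0, threshold=0.5,
    left=Node(label="forte"),
    right=Node(
        label="rest 1/4", feature=1, threshold=0.3,
        left=Node(label="rest 1/4"),
        right=Node(label="rest 1/8"),
    ),
)
print(classify_with_tree(tree, [0.7, 0.6]))
```

The number of comparisons equals the depth of the reached leaf, which is why the balance of the tree matters for recognition time.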

The QUEST algorithm can be used to create trees with an alternative selection method based on a linear combination of features instead of a single feature. The usage of the linear combination of features requires features of continuous character. Due to the nature of the matter discussed in this paper we can apply such a classification.

Fig. 6. Classification tree, classification with rejection, split decision based on linear combination of features.

Methods discussed in Section II were employed to classify selected symbols of music notation. The obtained results are presented and discussed in this Section. Those results and discussion concern the classification of symbols that were already extracted from their environment. It is clear that in the real system the recognition rate will be diminished by errors of a stage of segmentation. This makes the segmentation of an image one of the critical stages of a real recognition system. Anyway, the segmentation of an image is out of the scope of this paper and will be a subject of further studies.

A. Classification without rejection

Classification without rejection is somewhat artificial. In real systems the classification module has to cope not only with symbols to be recognized, but also with other symbols and noise that could be extracted by the segmentation module and presented to the classifier. Nevertheless, classification without rejection is simpler than classification with rejection and would be used for a raw estimation of the classifier.

1) Method of centroids: Considering the set of basic features, as described in Section II-A, the method of centroids gave a relatively low classification rate. Only 598 out of 750 symbols (79.7%) were correctly classified, cf. Figure 3. Moreover, the classification rate fell down to 28.8% in the worst class (the class of 'rest 1/8')! Even though the classification rate was relatively high in other classes (about 83-97%), this method should be rejected. In fact, this method is a classifier of four classes only, at least for the discussed features.

Fig. 7. Classification tree, classification with rejection, split decision based on linear combination of features.

In contrast, Zernike moments used as features in the method of centroids produced much better results compared to the basic set of features. Though the computation of Zernike moments is more time consuming, they still could be efficiently used if the order of moments is limited. In this study only moments up to the tenth order are used to reduce computational time. A rate of correctly recognized symbols reached the level 92.8% (696 out of 750), cf. Figure 4. In most cases the mistaken symbols were heavily noised by staff lines.

Music notation symbols may be placed on staff lines as well as outside the staff lines. In the first case symbols are affected by staff lines while in the second case they are not. In this paper we do not distinguish whether the tested symbols lie on staff lines or not. Yet, it would be interesting to examine separated sets of stand alone symbols and symbols affected by staff lines. Apparently, in such a case the perfect recognition seems to be possible. Moreover, according to simple tests only four Zernike moments are needed to build almost a perfect classifier for non-noised set of symbols.

2) Classification trees: It could be concluded that classification trees are a better classifier than the method of centroids. The simple tree with a split decision based on single features reaches the accuracy at the level of 94.5% (709 correctly recognized symbols out of 750), cf. Figure 4.

The structure of the tree is given in Figure 4. Nodes of this tree are labeled by the name of the most numerous class while numbers represent the amount of symbols linked to all descendant leaves.

Fig. 8. Symbols not classified to any class: clefs, notes, chromatic symbols, rests and dynamic marks.

A feature placed in the root is the most important one as a classification factor. In the case of investigated symbols one of peripheral values is taken as such a feature. The peripheral value is the distance between the edge of the bounding box of an analyzed symbol to the closest black pixel. This value is calculated separately for all four edges of the bounding box. For every edge the picture is split into eight stripes, each one eight pixels wide. For each stripe the feature is calculated as a minimum peripheral. This feature differentiates for instance very well between dynamic markings and rests. This differentiating property is affected by the staff lines. About 90 percent of rest symbols are affected by at least one horizontal line from a staff in contrast to dynamic marks that are usually isolated.
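The peripheral feature can be sketched for one edge. The sketch assumes a binary image given as a list of rows and measures from the left edge only; the value returned for a stripe without black pixels (the image width) is an assumption, as are the function name and the example image. The other three edges would follow by mirroring or transposing the image.

```python
def left_peripherals(img, stripe_width=8):
    """Minimum distance from the left edge to a black pixel, per horizontal stripe."""
    width = len(img[0])
    features = []
    for top in range(0, len(img), stripe_width):
        best = width  # assumed value for a stripe that holds no black pixel
        for row in img[top:top + stripe_width]:
            if 1 in row:
                best = min(best, row.index(1))
        features.append(best)
    return features


# Hypothetical 4x4 image split into two stripes of two rows each.
small = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(left_peripherals(small, stripe_width=2))
```

For a 64 by 64 image and the default stripe width this yields eight values per edge, matching the eight stripes of eight pixels mentioned in the text.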

Discussing classification trees based on single features split decision, cf. Figure 4, we can notice that the tree has seven leaves and the structure based on six features. This means that we can choose six features reducing the feature space to six dimensions. Furthermore, the decision process requires a comparison of not more than five features. This makes the classification process accurate and fast. However, the classification tree with the split decision based on single features still has a problem with classes 'rest 1/4' and 'rest 1/16', cf. Figure 4.

Since the tree is significantly unbalanced, cf. Figure 4, the computation complexity is increased 2-3 times compared to the optimal version of the classification tree. The optimal structure (with a depth of four levels) could not be obtained because the classification tree with the split decision based on single features has a problem with classes 'rest 1/4' and 'rest 1/16'. For their effective separation by a single feature noises generated by other classes should be removed. Hence, additional not so important splits are made before.

The set of features was replaced by Zernike moments in order to remove this disadvantage. Unfortunately, this operation did not improve the classification itself. The number of correctly classified symbols reached 700 out of 750 cases (93.3%), cf. Figure 4. To make things worse, the classification tree grew and was based on twelve features.

The last test utilized the split decision made on linear combination of features. This method was the most advanced and gave the best results. Classification accuracy grew up to 98.9% (742 symbols out of 750 were correctly classified), cf. Figure 4. So, the number of mistakes is much smaller than in other tests. As in previous cases most of errors were generated by misclassification of 'rest 1/4' and 'rest 1/16'.
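The recognition rates quoted for classification without rejection follow directly from the reported counts and the testing set of 750 symbols (five classes of 150 symbols). The short check below only restates the figures given in the text; the dictionary keys are informal labels.

```python
TEST_SET_SIZE = 750  # 5 classes x 150 testing symbols

# Correctly classified symbols as reported for each method.
correct = {
    "centroids, basic features": 598,
    "centroids, Zernike moments": 696,
    "tree, single features": 709,
    "tree, Zernike moments": 700,
    "tree, linear combination": 742,
}

rates = {name: round(100 * hits / TEST_SET_SIZE, 1) for name, hits in correct.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}%")
```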

This method gave the simplest tree with five leaves (the minimum number) and four decision nodes, cf. Figure 5. In the worst case it is necessary to compute three linear combinations of 4X features' values. This method has a big time complexity due to a high dimensionality. But, in fact, a part of features in the linear combination has small coefficients what allows for reducing the dimensionality of the set of features.

Fig. 9. Results of classification with rejection for different data sets (linear trees) and for split decision based on single features.

B. Classification with rejection

Tests discussed in Section III-A are impractical because they do not consider symbols not included into classes subjected to classification. In real conditions recognizers must cope with such symbols as well as with garbage or noised regions. The ideal classifier should recognize all types of symbols. In practice it is enough to reject not considered symbols to avoid misclassifications. Tests discussed in this section included other symbols of music notation besides the set of symbols used in classification without rejection. Those extra symbols are common for music notation like, for instance, clefs, notes, chords, chromatic symbols, etc. These symbols should be similar to tested symbols. Similarity of those symbols is understood in terms of their features as, for example, rests and dynamic markings. Examples of extra symbols are given in Figure 8.

1) Method of centroids: The method of centroids seems to be a good tool for classification with rejection. We have two techniques to reject tested symbols. The first one relies on creating a new centroid corresponding with the 'other' class. This technique, in fact, turns classification with rejection into classification without rejection by adding a new class of rejected symbols. This technique has all drawbacks of classification without rejection. On the other hand, the 'other' class usually covers wide areas of features' space with classes of classified symbols interlacing areas of symbols to be rejected. It is clear that covering such a dispersed set of the 'other' symbols with one cluster is impossible. It is rather rare that symbols of the 'other' class are nested around one centroid. This observation proves that bunching symbols of the 'other' class in only one cluster cannot be a solution of the classification problem. To surmount this hurdle the class of 'other' symbols should be split into smaller clusters which may collect all 'other' symbols around supplementary centroids. The increased number of centroids and separated clusters of 'other' symbols tends to exhausting all symbols presented to the classifier. This method leads to a complete classification system recognizing all symbols. But this method is still a case of classification without rejection. Unfortunately, both approaches - this with the 'other' class and this with the split class of 'other' symbols - failed in our tests.

The alternative technique relies on rejecting symbols that are far enough from every centroid. An analyzed symbol is rejected if the distance between a classified symbol and every centroid exceeds the assumed threshold. This technique of rejection is the case of classification with rejection. The main question is how to estimate the value of the rejection distance.
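A minimal sketch of this rejection rule is given below. The Euclidean metric, the single global threshold `max_distance`, the centroid values and the function name are assumptions made for illustration; the paper leaves the choice of the rejection distance open.

```python
import math


def classify_with_rejection(centroids, x, max_distance):
    """Return the label of the closest centroid, or None when x is too far from all."""
    label = min(centroids, key=lambda name: math.dist(centroids[name], x))
    if math.dist(centroids[label], x) > max_distance:
        return None  # farther than the threshold from every centroid: reject
    return label


# Hypothetical centroids in a two-dimensional feature space.
centroids = {"forte": [0.85, 0.15], "piano": [0.15, 0.80]}
print(classify_with_rejection(centroids, [0.8, 0.2], max_distance=0.3))
print(classify_with_rejection(centroids, [0.5, 0.5], max_distance=0.3))
```

Checking only the closest centroid is enough: if it lies beyond the threshold, every other centroid does too.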

To make a limit for the distance, two factors should be optimized: the number of rejected symbols of the 'other' class and the number of accepted symbols from base classes. Instead of solving this complicated problem the number of symbols classified to the 'other' class was estimated. Ten features with minimum inside-class variation were considered. To be more restrictive, the variation of each class was limited separately for each class. Minimum and maximum values of features in the learning set were taken as limit boundaries. Checking if values of every feature satisfied assumed conditions showed that only 20 out of 66 'other' symbols were accepted (30.3%). Incorrectly accepted symbols of the 'other' class were linked to all classes except the 'forte' one. Even if it is assumed that recognition results remain the same, the recognition rate is still low (only 618 out of 816 cases were correctly classified, what gave 79.7%). Since the recognition rate could not be increased the subsequent tests with the centroids method applied were abandoned. Zernike's moments used instead of the basic set produced even worse results.
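The boundary check described above can be sketched as a per-class box test: a symbol is accepted by a class only when every feature lies between the minimum and maximum seen in that class's learning set. The function names and the sample data are hypothetical, and the sketch uses two features instead of ten.

```python
def feature_bounds(vectors):
    """Per-feature (min, max) pairs over the learning symbols of one class."""
    return [(min(col), max(col)) for col in zip(*vectors)]


def accepted_by(bounds, x):
    """True when every feature of x lies inside the class bounds."""
    return all(lo <= v <= hi for (lo, hi), v in zip(bounds, x))


def accepting_classes(class_bounds, x):
    """Labels of all classes whose bounds contain x; an empty list means rejection."""
    return [label for label, bounds in class_bounds.items() if accepted_by(bounds, x)]


# Hypothetical learning set with two features per symbol.
learning_set = {
    "forte": [[0.8, 0.1], [0.9, 0.2]],
    "piano": [[0.1, 0.7], [0.2, 0.9]],
}
class_bounds = {label: feature_bounds(v) for label, v in learning_set.items()}
print(accepting_classes(class_bounds, [0.85, 0.15]))
print(accepting_classes(class_bounds, [0.5, 0.5]))
```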

2) Classification trees: Classification trees with split decision based on the single feature cannot exceed 89% (no matter which data set is used - the basic one or the set of moments). The split decision based on the linear combination of features improved the performance of classification trees in classification with rejection. The classification accuracy was increased to 95.9% (785 correct classifications out of 816), cf. Figure 9, though it was still significantly worse than in the case of classification without rejection. Estimation of the feature effectiveness given by the QUEST algorithm leads to the conclusion that correlation between classified and rejected symbols does not allow for distinguishing between them.

To improve the performance of classification trees two alternative types of features' vectors were built. The first one was obtained as a projection of 278 dimensional features' space onto 47 dimensional space. The projection was done using the Principal Component Analysis algorithm as features' selector, cf. [2]. The separation property of new vectors is improved since the variance of them should be higher between classes compared to the original data set. The second type of features' vector was created by Zernike moments replacing those features that are useless in the tree structure. The p-levels calculated in the process of tree construction created the basis for replacement. This method produced the 45 dimensional data space.

For both types of features' vector results were improved. The PCA method gave 97.4% (797 cases out of 816) efficacy while Zernike moments came to 97.2% (794 cases out of 816), cf. Figure 8. Looking from classification perspective the accuracy of both methods is analogous: the PCA method produced slightly better results compared to the method of Zernike moments, but this difference is negligible. However, the PCA method is much more computation intensive. First of all, high dimensional data space is used in this method, therefore many more features must be calculated. Secondly, for each vector presented to this method its product with the transformation array must be computed. On the other hand, only a limited number of Zernike moments has to be computed in order to find a vector of the second type. Consequently, Zernike moments are computationally more efficient than the PCA method. Moreover, since vectors with Zernike moments give only three more misclassifications than the PCA method, it could be stated that the former method overcomes the latter one, especially in practical applications.

A structure and size of the tree of classification with rejection compared to the tree of classification without rejection is the last question that should be answered here. An answer to this question will give a look at increased complication of the decision tree when rejection is considered.

The tree of classification with rejection built on the basic data set has an extra leaf of the 'other' class compared to the corresponding tree of classification without rejection. It also has one more decision node, so the longest decision process is elongated to four tests. Unfortunately, rejection of an invalid case requires four decision steps, cf. Figure 6. For that reason a frequent rejection situation makes the classifier time-consuming.

The tree with the split decision based on moments has seven leaves since the class 'other' is represented twice. This tree also has six decision nodes. The longest decision process includes four tests. This means that the enlargement of the tree is sensible. Recognition improvement observed for this kind of tree comes from the double representation of the 'other' class. This effect is consistent with the observation that the 'other' class should be split into subclasses in the centroid method. It is possible to decide about the rejection of part of the invalid symbols in a three-step decision process. In tests a short rejection path was used by over 70 percent of rejections. The structure of the tree is given in Figure 7. To conclude, turning from classification without rejection to classification with rejection does not significantly complicate the tree structure.
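The tree sizes quoted above can be made concrete with a small sketch. The tree below is hypothetical: it is not the tree of Figure 7, and its class names, two-dimensional features, weights, and thresholds are invented for illustration. It only reproduces the quoted counts (seven leaves, six decision nodes, at most four tests, 'other' represented twice), and it assumes that the shorter rejection path takes three tests.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    weights: List[float]          # linear split: go left when w.x + b <= 0
    bias: float
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def classify(tree, x) -> Tuple[str, int]:
    """Return the leaf label and the number of tests made on the way."""
    tests = 0
    node = tree
    while isinstance(node, Node):
        score = sum(w * v for w, v in zip(node.weights, x)) + node.bias
        tests += 1
        node = node.left if score <= 0 else node.right
    return node.label, tests


def count_leaves(tree) -> int:
    if isinstance(tree, Leaf):
        return 1
    return count_leaves(tree.left) + count_leaves(tree.right)


def count_nodes(tree) -> int:
    if isinstance(tree, Leaf):
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


def longest_path(tree) -> int:
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(longest_path(tree.left), longest_path(tree.right))


# Hypothetical tree: five valid classes and two 'other' (rejection) leaves.
TREE = Node([1, 0], 0,
            Node([0, 1], 0,
                 Leaf("class_1"),
                 Node([1, 0], 2, Leaf("other"), Leaf("class_2"))),
            Node([0, 1], 0,
                 Leaf("class_3"),
                 Node([1, 0], -2,
                      Leaf("class_4"),
                      Node([0, 1], -2, Leaf("class_5"), Leaf("other")))))

if __name__ == "__main__":
    print(count_leaves(TREE), count_nodes(TREE), longest_path(TREE))
    print(classify(TREE, (-3, 1)))   # rejected on the short path
    print(classify(TREE, (3, 3)))    # rejected on the long path
```

Placing one 'other' leaf closer to the root is what lets part of the invalid symbols be rejected before the longest path is exhausted.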

The tree based on the PCA split decision could be explored in the same way. But, as was mentioned before, projection of the features' space onto a subspace is time-consuming. Still, this method has the highest potential to increase the recognition efficiency. It could be applied efficiently if run on high performance computers. Parallel algorithms used in matrix multiplication will increase the computing efficiency. But this kind of exploration is out of the scope of this paper.

IV. CONCLUSIONS

The paper investigates pattern recognition methods utilized to recognize the music notation symbols. Two approaches were tested: classification without rejection and classification with rejection. For each approach several tests were made using different descriptions of classified patterns and different classification methods.

The most interesting conclusions were drawn within the framework of practical application of classification with rejection. The most accurate method of linear classification trees uses the principal component analysis based on the full sets of features and the method of moments as features' extractors. Computationally intensive principal component analysis seems to be a less practical features' extractor than the slightly less accurate but much more efficient method of moments.

This work is supported under the State Committee for Scientific Research Grant no 3T11C00926, for the years 2004-2007.
