
Research Article

Classification System of Pathological Voices Using Correntropy

Aluisio I. R. Fontes,1,2 Pedro T. V. Souza,3 Adrião D. D. Neto,2 Allan de M. Martins,3 and Luiz F. Q. Silveira2

1 Postgraduate Program in Electrical and Computer Engineering (PPgEEC), Federal University of Rio Grande do Norte, 59078-970 Natal, RN, Brazil

2 Department of Computer Engineering, Federal University of Rio Grande do Norte, 59078-970 Natal, RN, Brazil

3 Department of Electrical Engineering, Federal University of Rio Grande do Norte, 59078-970 Natal, RN, Brazil

Correspondence should be addressed to Aluisio I. R. Fontes; aluisioigor@gmail.com

Received 16 April 2014; Accepted 22 July 2014; Published 19 August 2014

Academic Editor: Pak-Kin Wong

Copyright © 2014 Aluisio I. R. Fontes et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper proposes the use of a similarity measure based on information theory, called correntropy, for the automatic classification of pathological voices. By using correntropy, it is possible to obtain descriptors that aggregate distinct spectral characteristics for healthy and pathological voices. Experiments using computational simulation demonstrate that such descriptors are very efficient in the characterization of vocal dysfunctions, leading to a success rate of 97% in the classification. With this new architecture, the classification process of vocal pathologies becomes much simpler and more efficient.

1. Introduction

In the past few decades, medicine has received noteworthy contributions which have allowed for great advancement in medical activity within various contexts, for example, improvement of surgery techniques, description of the human genome, or even assistance in medical diagnosis. In particular, regarding medical diagnosis, digital signal processing techniques have been employed recently as an efficient, noninvasive, and low-cost tool to analyze vocal signals, with the aim of detecting and classifying alterations in the production of sounds that may be associated with larynx pathologies [1]. The human voice is an important communication tool, and any inadequate behavior may have deep implications in an individual's social and professional life.

Acoustic analysis is a complementary technique to methods based on the direct inspection of the vocal folds, and it may reduce the frequency of invasive exams, since such measurements are able to reveal important physiological characteristics of the vocal tract [2]. In general, most of the techniques for the analysis of dysfunctions available in the literature employ several preprocessing stages in order to extract useful characteristics for the classification of deviant patterns and to improve the performance of classifiers [2–11].

The existing systems for detection and classification of laryngeal pathologies are phoneme-dependent: the training step uses fixed phonemes, for example, "a" [2, 3], and diagnosis can be done by taking any vowel which can be phonated comfortably. Various techniques have been proposed for the extraction of signal features in order to improve the performance of automatic pathological speech detection systems. A method based on MPEG-7 low-level audio features is proposed in [3] for the extraction of features that can be classified by a support vector machine (SVM). Similarly, SVM classifiers have also been used with feature extraction based on the wavelet transform, nonlinear analysis of time series, and information theory [4–8]. Mel-frequency cepstral coefficients and linear prediction cepstral coefficients are used as acoustic features in [9], associated with a strategy based on combining classifiers with a Gaussian mixture model, a hidden Markov model (HMM), and an SVM. Specifically, feature extraction is achieved in [10] by using eight measures derived from nonlinear dynamic analysis (correlation dimension, four entropy measures, Hurst exponent, the largest Lyapunov exponent, and the first minimum of the mutual information function), with quadratic discriminant analysis (QDA) as the classifier. The nonlinear Teager-Kaiser energy operator has been used for feature extraction in [11], with a classifier based on a multilayer perceptron (MLP) neural network.

Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2014, Article ID 924786, 7 pages, http://dx.doi.org/10.1155/2014/924786

Thus, it is possible to state that all the aforementioned methods employ classifiers with complex architectures, for example, MLP, SVM, and HMM, and a complex feature extraction stage, which can restrict their application or even make the technique unfeasible in real-time systems. In addition, most research works address the discrimination between healthy and pathological voices, but not between pathology types.

This paper introduces a new low-complexity approach for the detection and classification of laryngeal pathologies. The system uses a measure from information theory known as correntropy to characterize vocal pathologies by means of spectral characteristics and high-order statistical moments of the vocal signal. The system is characterized by high success rates and low computational complexity, associated with a simple classification stage based on the Euclidean distance. The influence of the kernel size on the classification hit rate is analyzed, since it is the only free parameter of the method. Besides the classification between healthy and pathological voices, it is possible to determine which pathology is most likely to exist in the voice under analysis.

The paper is organized as follows. Section 2 presents the main concepts related to correntropy. The proposed architecture is described in Section 3. In Section 4, the performance results of the proposed technique are discussed, while Section 5 presents the final considerations.

2. Correntropy

The extraction of information from a database is a frequent and relevant problem in various applications involving signal processing. Within this context, statistical measures of similarity such as correntropy may be successfully used for the extraction of information and data characterization.

Correntropy is a generalization of the correlation measure between random signals, because it is able to extract both second-order and higher-order statistical information from the analyzed signals [12]. In the past few years, this concept has been successfully applied in the solution of various engineering problems, such as the modeling of time series [13], nonlinearity tests [14], object recognition [15], independent component analysis [16], and automatic modulation classification [17]. Although correntropy is similar to correlation by definition, recent studies have demonstrated that its performance is superior when dealing with nonlinear or non-Gaussian systems, without any significant increase in computational cost [12].

According to [12], the correntropy between random signals $\{x_t, t \in T\}$, where $t$ represents time and $T$ is the index set of interest, is defined as

$$V(t_1, t_2) = E\left[k_\sigma(x_{t_1}, x_{t_2})\right] = \iint k_\sigma(x_{t_1}, x_{t_2})\, p(x_{t_1}, x_{t_2})\, dx_{t_1}\, dx_{t_2}, \tag{1}$$

where $E[\cdot]$ is the expectation operator and $k_\sigma(\cdot, \cdot)$ corresponds to any symmetric positive-definite function. Note that correntropy is defined as the joint expectation of $k_\sigma(x_{t_1}, x_{t_2})$. In this work, $k_\sigma(\cdot, \cdot)$ is a Gaussian kernel given as

$$k_\sigma(x_{t_1}, x_{t_2}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_{t_1} - x_{t_2})^2}{2\sigma^2}\right), \tag{2}$$

where $\sigma$ is the kernel size, that is, the standard deviation of the Gaussian (its variance is $\sigma^2$). The kernel size may be interpreted as the resolution with which correntropy measures similarity in a high-dimensional feature space [12].
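As a minimal sketch of (1) and (2), the Python fragment below evaluates the Gaussian kernel and approximates the expectation in (1) by a sample mean over paired observations. The function names `gaussian_kernel` and `sample_correntropy` are illustrative assumptions and are not taken from the original work, whose experiments are reported as MATLAB simulations.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2) evaluated at the pair (a, b)."""
    if sigma <= 0:
        raise ValueError("kernel size sigma must be positive")
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return norm * math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def sample_correntropy(x, y, sigma):
    """Sample-mean approximation of E[k_sigma(X, Y)] in (1)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum(gaussian_kernel(a, b, sigma) for a, b in zip(x, y)) / len(x)
```

The kernel attains its maximum, $1/(\sqrt{2\pi}\,\sigma)$, when both arguments coincide, so the sample correntropy of a sequence with itself equals that peak value.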

In fact, by applying a Taylor series expansion to the correntropy measure, (1) can be rewritten as [12]

$$V(t_1, t_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{n=0}^{\infty} \frac{(-1)^n}{2^n \sigma^{2n} n!}\, E\left[\left\| x_{t_1} - x_{t_2} \right\|^{2n}\right]. \tag{3}$$

Note that (3) is a sum of infinitely many even-order moments, which therefore includes the conventional covariance through its second-order term. Consequently, correntropy contains information from infinitely many statistical moments of the data.

It is interesting to observe that the kernel size in (3) is a parameter that weights the influence of the second-order and higher-order moments. For sufficiently large values of $\sigma$, the second-order moments are dominant and the measure gets closer to the correlation.
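The role of the kernel size in (3) can be checked numerically for a fixed difference $d = x_{t_1} - x_{t_2}$. The sketch below, with illustrative function names that are not part of the original work, compares a partial sum of the series with the closed-form Gaussian kernel and reports the size of the fourth-order term relative to the second-order term, which equals $d^2/(4\sigma^2)$ and therefore vanishes as $\sigma$ grows.

```python
import math


def kernel_closed_form(d, sigma):
    """Gaussian kernel of (2) written as a function of the difference d."""
    return math.exp(-d * d / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def kernel_series(d, sigma, n_terms):
    """Partial sum of the series in (3) for a fixed difference d."""
    total = 0.0
    for n in range(n_terms):
        total += (-1) ** n * d ** (2 * n) / (2 ** n * sigma ** (2 * n) * math.factorial(n))
    return total / (math.sqrt(2.0 * math.pi) * sigma)


def fourth_to_second_order_ratio(d, sigma):
    """Magnitude of the n = 2 term of (3) relative to the n = 1 term."""
    return d * d / (4.0 * sigma ** 2)
```

For example, with $d = 1$ the ratio is 1 for $\sigma = 0.5$ but only 0.0025 for $\sigma = 10$, which illustrates why a large kernel size makes the measure behave like a second-order statistic.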

When only a finite amount of data is available, that is, $\{x(n)\}_{n=1}^{N}$, it is possible to define the estimator of the autocorrentropy function as [18]

$$\hat{V}(m) = \frac{1}{N - m} \sum_{n=m}^{N-1} k_\sigma(x(n), x(n - m)). \tag{4}$$
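The estimator in (4) translates almost directly into code. The following Python sketch assumes a zero-based sample index, so that the sum over $n = m, \ldots, N-1$ has exactly $N - m$ terms; the function names are illustrative.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def autocorrentropy(x, m, sigma):
    """Estimator (4): average kernel value between samples that are m lags apart."""
    n_samples = len(x)
    if not 0 <= m < n_samples:
        raise ValueError("lag m must satisfy 0 <= m < len(x)")
    total = sum(gaussian_kernel(x[n], x[n - m], sigma) for n in range(m, n_samples))
    return total / (n_samples - m)
```

At lag zero every sample is compared with itself, so the estimate equals the kernel peak $1/(\sqrt{2\pi}\,\sigma)$ regardless of the signal.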

The autocorrentropy estimator defined by (4) does not ensure zero mean, even when the input data are centralized, due to the nonlinear transformation produced by the kernel. However, a centralized estimator for correntropy, denoted here by $\hat{U}(m)$, has been defined in [18], which is given as

$$\hat{U}(m) = \frac{1}{N - m} \sum_{n=m}^{N-1} k_\sigma(x(n), x(n - m)) - \sum_{n=1}^{N} \sum_{m=1}^{N} k_\sigma(x(n), x(n - m)). \tag{5}$$
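A minimal Python sketch of a centralized estimator in the spirit of (5) follows. It assumes that the subtracted term is the mean of the kernel over all $N^2$ sample pairs, that is, a double sum normalized by $1/N^2$; this normalization is an assumption of the sketch, since it is not shown in the equation above, and the function names are illustrative.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def centered_autocorrentropy(x, m, sigma):
    """Lag-m autocorrentropy minus the mean kernel value over all sample pairs.

    The 1/N^2 normalization of the pairwise term is an assumption of this sketch.
    """
    n_samples = len(x)
    if not 0 <= m < n_samples:
        raise ValueError("lag m must satisfy 0 <= m < len(x)")
    lag_term = sum(gaussian_kernel(x[n], x[n - m], sigma)
                   for n in range(m, n_samples)) / (n_samples - m)
    pair_mean = sum(gaussian_kernel(a, b, sigma) for a in x for b in x) / n_samples ** 2
    return lag_term - pair_mean
```

Under this assumption a constant signal yields a centered value of zero at every lag, which is the behavior expected of a centralized measure.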

This work explores the spectral properties of the centralized autocorrentropy of voice signals in order to detect and classify vocal pathologies. Specifically, the spectral descriptors of healthy and pathological voices used in the classification are extracted from the analyzed signals by the correntropy spectral density (CSD) function, defined as [19]

$$P_\sigma(\omega) = \sum_{m=-\infty}^{\infty} \hat{U}(m) \exp(-j\omega m), \tag{6}$$

where $\hat{U}(m)$ denotes the centralized autocorrentropy estimate and $\omega$ is the digital frequency given in radians. The CSD function may be considered a generalization of the power spectral density (PSD) of a signal.
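In practice the infinite sum in (6) has to be truncated. The sketch below is one possible pure-Python realization under stated assumptions: lags are limited to $|m| < $ `max_lag`, the centralized autocorrentropy is taken to be even in $m$ (so the transform reduces to a cosine sum), the pairwise term is normalized by $1/N^2$, and the spectrum is sampled at `n_bins` equally spaced frequencies. None of these choices or names comes from the original work.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def centered_autocorrentropy_sequence(x, max_lag, sigma):
    """Centered autocorrentropy for lags 0 .. max_lag - 1 (1/N^2 pairwise mean assumed)."""
    n = len(x)
    if not 0 < max_lag <= n:
        raise ValueError("max_lag must satisfy 0 < max_lag <= len(x)")
    pair_mean = sum(gaussian_kernel(a, b, sigma) for a in x for b in x) / n ** 2
    sequence = []
    for m in range(max_lag):
        lag_term = sum(gaussian_kernel(x[i], x[i - m], sigma) for i in range(m, n)) / (n - m)
        sequence.append(lag_term - pair_mean)
    return sequence


def correntropy_spectral_density(x, max_lag, sigma, n_bins):
    """Truncated version of (6) evaluated at w_k = 2*pi*k/n_bins, k = 0 .. n_bins - 1."""
    u = centered_autocorrentropy_sequence(x, max_lag, sigma)
    spectrum = []
    for k in range(n_bins):
        w = 2.0 * math.pi * k / n_bins
        # Even symmetry of u(m) turns the complex exponential sum into a cosine sum.
        spectrum.append(u[0] + 2.0 * sum(u[m] * math.cos(w * m) for m in range(1, max_lag)))
    return spectrum
```

For a sinusoid with a period of 8 samples analyzed with 32 lags and 32 bins, the largest magnitude among the positive-frequency bins falls at bin 4, the bin of the fundamental frequency. A production implementation would typically replace the explicit cosine sum with an FFT.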


One of the advantages of using correntropy measures in the classification of vocal signals lies in the robustness of such measures against impulsive noise. This is due to the use of the Gaussian kernel in (5), which is close to zero, that is, $k_\sigma(x(n), x(n - m)) \approx 0$, when $x(n)$ or $x(n - m)$ is an outlier.

In addition, the correntropy function extracts information from the higher-order statistical moments present in the data, thus potentially increasing the classification efficiency of the proposed technique.
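The robustness argument can be illustrated with a small numerical comparison. The sketch below, whose function names and test signal are illustrative assumptions, replaces a single sample of a sinusoid with a gross outlier and measures how much the lag-one autocorrelation and the lag-one autocorrentropy change.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def autocorrelation(x, m):
    """Sample autocorrelation at lag m, with the same 1/(N - m) normalization as (4)."""
    return sum(x[n] * x[n - m] for n in range(m, len(x))) / (len(x) - m)


def autocorrentropy(x, m, sigma):
    """Sample autocorrentropy at lag m, as in (4)."""
    return sum(gaussian_kernel(x[n], x[n - m], sigma) for n in range(m, len(x))) / (len(x) - m)


def outlier_shift(clean, index, value, sigma, lag=1):
    """Absolute change of both measures when clean[index] is replaced by value."""
    corrupted = list(clean)
    corrupted[index] = value
    corr_shift = abs(autocorrelation(corrupted, lag) - autocorrelation(clean, lag))
    corrent_shift = abs(autocorrentropy(corrupted, lag, sigma) - autocorrentropy(clean, lag, sigma))
    return corr_shift, corrent_shift
```

With a 200-sample sinusoid of period 8, unit kernel size, and sample 98 replaced by the value 100, the autocorrelation changes by roughly 0.7, while the autocorrentropy changes by less than 0.01, because the kernel assigns an almost zero weight to the two pairs that contain the outlier.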

3. Proposed Architecture

The classification method for vocal pathologies proposed in this paper is composed of two stages. The first stage extracts descriptors for the voice signals based on the CSD defined in (6). The second stage classifies the voices using the simple metric of Euclidean distance.

Since the analyzed voice signals may vary in amplitude, all signals in the database are normalized. Then, the signals are grouped into two sets: one set with healthy voice signals and another one with pathological voice signals, which contains voices with edema and nodules. After the CSD is calculated for all signals in each set, the average CSD is calculated for each set. The average CSDs of the sets are used as descriptors for the healthy and pathological voices. This procedure is depicted in Figure 1. An equivalent methodology is applied to obtain the descriptors for the voices with edema and nodules. This procedure is detailed in Figure 2.
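The descriptor extraction described above can be sketched as follows. The text does not state how the amplitude normalization is performed, so peak normalization is assumed here; the function names and the `n_keep` parameter are likewise illustrative. The input `spectra` stands for the CSDs already computed for the signals of one class.

```python
def peak_normalize(signal):
    """Scale a signal so that its largest absolute value is 1 (assumed normalization)."""
    peak = max(abs(s) for s in signal)
    if peak == 0:
        raise ValueError("cannot normalize an all-zero signal")
    return [s / peak for s in signal]


def mean_descriptor(spectra, n_keep=50):
    """Element-wise average of equal-length spectra, truncated to the first n_keep samples."""
    if not spectra:
        raise ValueError("at least one spectrum is required")
    length = len(spectra[0])
    if any(len(s) != length for s in spectra):
        raise ValueError("all spectra must have the same length")
    kept = min(n_keep, length)
    return [sum(s[i] for s in spectra) / len(spectra) for i in range(kept)]
```

One descriptor is obtained per class by calling `mean_descriptor` on the spectra of that class, which yields four vectors in total: healthy, pathological, edema, and nodule.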

The CSDs of the healthy and pathological voices and the respective descriptors stored for the classification stage are shown in Figures 3 and 4. It is possible to observe in Figure 3 that the average CSDs of the healthy voices have correntropy and frequency values that are different from those of the average CSDs of pathological voices. On the other hand, it may be observed in Figure 4 that the average CSDs for voices with edema and nodules have different correntropy values, even though the frequencies are similar. In order to decrease computational cost and improve the success rate, the descriptors of the different classes considered in this work consist of the first fifty samples, because there is a clear distinction between the respective average CSDs of each class within this interval.

The classification architecture proposed in this work is presented in Figure 5. Initially, the voice signal to be classified is normalized and its CSD is obtained from (6). Then, the Euclidean distance between the voice CSD and each descriptor for the classes defined as healthy and pathological voices is calculated. The closest descriptor according to the Euclidean distance criterion defines the class of the signal. If the analyzed voice is classified as pathological, it is necessary to apply a second classification stage in order to distinguish between an edema and a nodule.
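The two-stage decision described above amounts to a nearest-descriptor rule applied twice. The following Python sketch shows one way to write it; the label strings, the dictionary layout, and the function names are assumptions made for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_label(features, descriptors):
    """Label of the descriptor closest to the feature vector."""
    return min(descriptors, key=lambda label: euclidean_distance(features, descriptors[label]))


def classify_voice(features, stage_one, stage_two):
    """Stage one: healthy versus pathological. Stage two: edema versus nodule."""
    first = nearest_label(features, stage_one)
    if first != "pathological":
        return first
    return nearest_label(features, stage_two)
```

Here `stage_one` would map `"healthy"` and `"pathological"` to their average CSD descriptors, and `stage_two` would map `"edema"` and `"nodule"` to theirs; the second dictionary is consulted only when the first decision is `"pathological"`.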

Figure 1: Extractor for the characteristics of healthy and pathological voices.

4. Experiments and Results

The proposed architecture has been validated by computer simulation performed in MATLAB. The architecture performance was evaluated according to the Monte Carlo method. For each experiment, a minimum of 100 trials was used. The set of voices for each class is divided into two subsets: one to extract the descriptors and another for the classification test. Cross-validation is used in this case, where 50% of the voices are used for the extraction of descriptors and the remaining 50% are used for test purposes. The evaluation of the architecture is based on the average success rate and also on the maximum, the minimum, and the standard deviation.
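The evaluation protocol, repeated random 50/50 splits with summary statistics over the trials, can be sketched in Python as follows. The sketch operates on precomputed feature vectors and uses a nearest-mean classifier; the function names, the seeding, and the use of the population standard deviation are assumptions, since the text does not specify these details.

```python
import math
import random
import statistics


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest_label(features, descriptors):
    """Label of the descriptor with the smallest Euclidean distance to features."""
    def dist(label):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(features, descriptors[label])))
    return min(descriptors, key=dist)


def monte_carlo_success_rates(features_by_class, n_trials=100, seed=0):
    """Repeat random half/half splits and summarize the success rate over the trials."""
    if any(len(v) < 2 for v in features_by_class.values()):
        raise ValueError("each class needs at least two feature vectors")
    rng = random.Random(seed)
    rates = []
    for _ in range(n_trials):
        descriptors, test_items = {}, []
        for label, vectors in features_by_class.items():
            shuffled = list(vectors)
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            descriptors[label] = mean_vector(shuffled[:half])  # descriptor extraction half
            test_items.extend((label, v) for v in shuffled[half:])  # held-out test half
        hits = sum(1 for label, v in test_items if nearest_label(v, descriptors) == label)
        rates.append(hits / len(test_items))
    return {"min": min(rates), "mean": statistics.mean(rates),
            "max": max(rates), "std": statistics.pstdev(rates)}
```

The returned dictionary mirrors the four quantities reported in the results tables: minimum, average, maximum, and standard deviation of the recognition rate.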

4.1. Database. The database used for this work was developed by the Massachusetts Eye and Ear Infirmary (MEEI) Voice and Speech Lab [20]. This database has been widely used in international research works on the acoustic analysis of disordered voices and on the discrimination between healthy and pathological voices. It contains the sustained pronunciation of the vowel "a", with 116 files from distinct speakers, where 53 are healthy voices, 43 are voices affected by edema, and 20 are voices with nodules. All signals used have a duration of 1 to 3 seconds, a sampling frequency of 25 kHz, and a resolution of 16 bits.

4.2. Experiments. Initially, the 43 voices affected by edema and the 20 voices with nodules are joined in one set called pathological. The first experiment measures the success rate in discriminating between healthy and pathological voices. Then, the success rate is assessed in discriminating between the 43 voices with edema and the 20 voices with nodules.

Figure 2: Extractor for the characteristics of voices with edema and nodules.

The only adjustable parameter in the architecture is the width of the Gaussian kernel, that is, the kernel size. Many systems use a heuristic known as Silverman's rule to determine this parameter [21]. However, this work has employed a numerical evaluation method to determine the influence of the kernel size on the correct classification rate of the proposed architecture. The performed experiments are represented in Figures 6 and 7.

From the aforementioned experiments, it is possible to determine suboptimal values for the kernel size associated with each class of vocal disease. The kernel size was tested by using a logarithmic scale with values ranging from 0.01 to 10. The kernel adjustment has provided an effective mechanism to eliminate outliers. The right choice of its size may increase the success rate considerably. The number of samples is also a fundamental factor in the classification rates, according to the results given in Figures 6 and 7.
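A numerical search over the kernel size of the kind described above can be organized as a logarithmic grid followed by a selection step. In the Python sketch below, the number of grid points and the `score_fn` callback (which would return the measured success rate for a given kernel size) are hypothetical; only the range from 0.01 to 10 comes from the text.

```python
import math


def log_spaced_kernel_sizes(lo=0.01, hi=10.0, n_points=13):
    """Kernel sizes equally spaced on a logarithmic scale between lo and hi."""
    if lo <= 0 or hi <= lo or n_points < 2:
        raise ValueError("require 0 < lo < hi and n_points >= 2")
    step = (math.log10(hi) - math.log10(lo)) / (n_points - 1)
    return [10.0 ** (math.log10(lo) + i * step) for i in range(n_points)]


def best_kernel_size(kernel_sizes, score_fn):
    """Kernel size that maximizes the score returned by score_fn."""
    return max(kernel_sizes, key=score_fn)
```

With 13 points the grid advances by a quarter of a decade per step. In a full experiment, `score_fn` would wrap the Monte Carlo evaluation for a fixed number of signal samples, and the search would be repeated for each sample size of interest.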

The results shown in Figure 6 demonstrate that, independently of the kernel size, the classifier for healthy and pathological voices gives unsatisfactory results when only a few samples are used. However, when the kernel size is equal to 0.77 and about 1000 signal samples are used, the classifier achieves a success rate of 93%.

One of the goals of this work is to reduce the complexity associated with the architecture, and so a classifier based on the Euclidean distance is used. However, it is sensitive to the sample size, so it is necessary to optimize this parameter. It is thus possible to state that sample sizes above 1000 affect the success rates between healthy and pathological voices.

Figure 3: Correntropy spectral density for healthy and pathological voices obtained for a kernel size equal to 0.003. (Panels: correntropy versus frequency in Hz and versus descriptor index; legend: healthy, pathological.)

Table 1: Performance of classification for healthy and pathological classes with kernel size of 0.77 and 1000 samples. Recognition rate (%).

| Method | Minimum | Average | Maximum | Standard deviation |
|---|---|---|---|---|
| Correntropy | 91.22 | 93.81 | 95.65 | 1.30 |
| Correlation | 60.42 | 71.43 | 82.33 | 9.12 |

The success rate in the classification between voices with edema and nodules varied from 65% (when the kernel size is 0.31 and the number of samples is 700) to 98% (when the kernel size is 1.77 and 1300 samples are adopted), according to Figure 7.

After adjusting the architecture with adequate values for the kernel size and number of samples, the correntropy measure was investigated regarding its capability of classifying and characterizing the statistical dependencies of the considered voice signals. Accordingly, correlation is adopted as a reference measure, since it can be seen as a particular case of correntropy and is very often used in classification problems. The obtained results are presented in Tables 1 and 2. The average success rate is determined considering 100 trials.

Figure 4: Correntropy spectral density for voices with edema and nodules obtained for a kernel size equal to 0.0129. (Panels: correntropy versus frequency in Hz and versus descriptor index; legend: edema, nodule.)

Based on Table 1, it is possible to observe that the architecture based on correntropy is able to distinguish between healthy and pathological voices with a success rate between 91.22% and 95.65% and an average rate of 93.81%, while the standard deviation is 1.30. On the other hand, when an architecture based on correlation is used, the success rate of the classifier is reduced considerably. Besides, the standard deviation of the recognition rate for the proposed architecture is much lower than that for the architecture based on correlation, thus indicating higher reliability of the classifier with correntropy.

From Table 2, it can be seen that the recognition rate between the voices with edema and nodules is considerably higher in the architecture based on correntropy. Once again, the success rate of the proposed architecture is within an acceptable range of values, with a low standard deviation.

4.3. Comparison with Existing Methods. Several methods for the detection of vocal pathologies have been proposed in the literature. With the aim of better assessing the classifier developed in this work, the obtained results are compared with those of the works in [6, 8, 10], which were also obtained using the MEEI database [20].

Figure 5: Automatic classification process of pathological voices. (Step 1: feature extraction and Decision 01 by Euclidean distance to descriptors P0 and P1, healthy versus pathological. Step 2: Decision 02 by Euclidean distance to descriptors P2 and P3, edema versus nodule.)

Figure 6: Influence of the kernel size and amount of samples on the success rates between healthy and pathological voices. (Axes: kernel size on a logarithmic scale from 10^-2 to 10^1, sample sizes (×100), and success rate.)

Table 2: Performance of classification for edemas and nodules with kernel size of 1.77 and 1300 samples.



The work in [6] considers a transform initially applied to the voice signal to obtain a space of smaller dimension by using a decomposition in singular values, that is, higher-order singular value decomposition (HOSVD). In the classification stage, a mutual information measure and an SVM are used. The success rate is around 94.1%, with a confidence interval of 0.28%.

Figure 7: Influence of the kernel size and amount of samples on the success rates between voices with edema and nodule. (Axes: kernel size on a logarithmic scale from 10^-2 to 10^1, sample sizes, and success rate.)

The classification architecture in [8] is developed from eleven characteristics extracted by means of nonlinear analysis of time series, where two are based on conventional nonlinear statistics, another two are based on the analysis of recurrence and fractal scaling, and the rest are obtained from different estimations of the entropy. The achieved success rate is 98%, using an SVM and Gaussian mixture models (GMM) in the classification stage.

Information measures are employed in [10], for example, Shannon entropy, correlation entropy, approximate entropy, Tsallis entropy, Hurst exponent, maximal Lyapunov exponent, and the first minimum of the mutual information function, in addition to LPC (linear prediction coding) coefficients. In the classification process, a quadratic discriminant analysis (QDA) is applied, with a success rate of about 96.50%.

Thus, it is possible to state that all the aforementioned methods employ a complex stage for the extraction of characteristics, with the calculation of a large set of variables that, in general, are sent to a neural network for classification purposes.

On the other hand, the architecture proposed in this paper uses only one extractor, defined by the CSD, and a very simple classification stage based on the Euclidean distance. Therefore, the introduced strategy presents low computational complexity, which implies simple implementation in real-time embedded systems. Besides, the proposed system presents a high success rate, of about 97%.

5. Conclusion

This paper has presented a novel method for the automatic classification of pathological voices based on the correntropy spectral density (CSD). It has been demonstrated that the CSD is adequate to characterize dynamic interdependencies among the voice signal samples, being able to extract distinct characteristics for healthy and pathological voices. Among the main characteristics of the method, it is worth mentioning that the classification stage becomes simpler through the use of the Euclidean distance, which effectively reduces its computational complexity. From the obtained results, it has been shown that the proposed classifier presents a high recognition rate, which is achieved after a simple adjustment of the kernel size employed by the feature extractor. The proposed method can be used as a valuable tool by researchers and speech pathologists. Future work aims at the development of experiments using other databases and also the implementation of an online diagnosis system.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. G. Aguiar Neto, S. C. Costa, J. M. Fechine, and M. Muppa, "Feature estimation for vocal fold edema detection using short-term cepstral analysis," in Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering (BIBE '07), pp. 1158–1162, January 2007.

[2] J. I. Godino-Llorente, P. Gomez-Vilda, and M. Blanco-Velasco, "Dimensionality reduction of a pathological voice quality assessment system based on gaussian mixture models and short-term cepstral parameters," IEEE Transactions on Biomedical Engineering, vol. 53, no. 10, pp. 1943–1953, 2006.

[3] G. Muhammad and M. Melhem, "Pathological voice detection and binary classification using MPEG-7 audio features," Biomedical Signal Processing and Control, vol. 11, pp. 1–9, 2014.

[4] R. T. S. Carvalho, C. C. Cavalcante, and P. C. Cortez, "Wavelet transform and artificial neural networks applied to voice disorders identification," in Proceedings of the 3rd World Congress on Nature and Biologically Inspired Computing (NaBIC '11), pp. 371–376, IEEE, October 2011.

[5] E. S. Fonseca, R. C. Guido, A. C. Silvestre, and J. C. Pereira, "Discrete wavelet transform and support vector machine applied to pathological voice signals identification," in Proceedings of the 7th IEEE International Symposium on Multimedia (ISM '05), pp. 785–789, December 2005.

[6] M. Markaki and Y. Stylianou, "Voice pathology detection and discrimination based on modulation spectral features," IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 7, pp. 1938–1948, 2011.

[7] E. S. Fonseca and J. C. Pereira, "Normal versus pathological voice signals," IEEE Engineering in Medicine and Biology Magazine, vol. 28, no. 5, pp. 44–48, 2009.

[8] J. D. Arias-Londono, J. I. Godino-Llorente, N. Saenz-Lechon, V. Osma-Ruiz, and G. Castellanos-Domínguez, "Automatic detection of pathological voices using complexity measures, noise parameters, and mel-cepstral coefficients," IEEE Transactions on Biomedical Engineering, vol. 58, no. 2, pp. 370–379, 2011.

[9] S. Jothilakshmi, "Automatic system to detect the type of voice pathology," Applied Soft Computing, vol. 21, pp. 244–249, 2014.

[10] W. C. A. Costa, S. L. N. C. Costa, F. M. Assis, and B. G. Aguiar, "Healthy and pathological voice assessment by means of nonlinear dynamic analysis measures and linear predictive coding," Brazilian Journal of Biomedical Engineering, vol. 29, no. 1, pp. 3–14, 2013.

[11] L. Salhi and A. Cherif, "Robustness of auditory teager energy cepstrum coefficients for classification of pathological and normal voices in noisy environments," The Scientific World Journal, vol. 2013, Article ID 435729, 8 pages, 2013.

[12] I. Santamaría, P. P. Pokharel, and J. C. Principe, "Generalized correlation function: definition, properties, and application to blind equalization," IEEE Transactions on Signal Processing, vol. 54, no. 6 I, pp. 2187–2197, 2006.

[13] A. Gunduz and J. C. Principe, "Correntropy as a novel measure for nonlinearity tests," Signal Processing, vol. 89, no. 1, pp. 14–23, 2009.

[14] I. Park and J. C. Príncipe, "Correntropy based Granger causality," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '08), pp. 3605–3608, April 2008.

[15] K. Jeong, W. Liu, S. Han, E. Hasanbelliu, and J. C. Principe, "The correntropy MACE filter," Pattern Recognition, vol. 42, no. 5, pp. 871–885, 2009.

[16] R. Li, W. Liu, and J. C. Principe, "A unifying criterion for instantaneous blind source separation based on correntropy," Signal Processing, vol. 87, no. 8, pp. 1872–1881, 2007.

[17] A. I. R. Fontes, L. A. Pasa, V. A. De Sousa Jr., F. M. Abinader Jr., J. A. F. Costa, and L. F. Q. Silveira, "Automatic modulation classification using information theoretic similarity measures," in Proceedings of the 76th IEEE Vehicular Technology Conference (VTC '12), pp. 1–5, September 2012.

[18] J. C. Principe, Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives, Springer, 2010.

[19] J. Xu, Nonlinear signal processing based on reproducing Kernel Hilbert space [Ph.D. thesis], University of Florida, 2007.

[20] Massachusetts Eye and Ear Infirmary, Elemetrics Disordered Voice Database (Version 1.03), Voice and Speech Lab, Boston, Mass, USA, 1994.

[21] B. W. Silverman, "Density estimation for statistics and data analysis," Technometrics, vol. 37, 1986.


information function) and quadratic discriminant analysis(QDA) by classifier [10] Nonlinear Teager-Kaiser energyoperator has been used for features extraction in [11] witha classifier based on neural system of multilayer perceptron(MLP)

Thus it is possible to state that all aforementionedmethods employ classifiers with complex architectures forexample MLP SVM and HMM and a complex stage forthe features extraction which can restrict their applicationor even make the technique unfeasible in real-time systemsIn addition most research works are based on the discrimi-nation between healthy and pathological voices but not thepathology types

This paper introduces a new low-complexity approachfor the detection and classification of laryngeal pathologiesThis system uses a measure of information theory known ascorrentropy to characterize vocal pathologies by means ofspectral characteristics and statistical moments of high orderinvolving the vocal signalThe system is characterized by highsuccess rates and low computational complexity associatedwith a simple classification stage based on Euclidian distanceThe influence of the kernel size in the classification hit rate isanalyzed since it is the only free parameter of the methodBesides the classification between healthy and pathologicalvoices it is possible to define which pathology is most likelyto exist in the voice under analysis

The paper is organized as follows Section 2 presentsthe main concepts related to correntropy The proposedarchitecture is described in Section 3 In Section 4 the perfor-mance results of the proposed technique are discussed whileSection 5 presents the final considerations

2 Correntropy

The extraction of information from a database is a frequentand relevant problem in various applications involving signalprocessing Within this context statistical measures of sim-ilarity such as correntropy may be successfully used for theextraction of information and data characterization

Correntropy is a generalization of the correlation mea-sure between random signals because such measure is ableto extract both second-order and higher-order statisticalinformation from the analyzed signals [12] In the past fewyears this concept been successfully applied in the solutionof various engineering problems such as the modeling oftemporal series [13] nonlinearity tests [14] objects recog-nition [15] analysis of independent components [16] andautomatic modulation classification [17] Although corren-tropy is similar to correlation by definition recent studieshave demonstrated that its performance is superior whendealing with nonlinear or non-Gaussian systems without anysignificant increase of computational cost [12]

According to [12], correntropy is a similarity measure between random signals $x_t$, $t \in T$, where $t$ represents time and $T$ is the set of indices of interest, defined as

$$V(t_1, t_2) = E\left[k_\sigma(x_{t_1}, x_{t_2})\right] = \iint k_\sigma(x_{t_1}, x_{t_2})\, p(x_{t_1}, x_{t_2})\, dx_{t_1}\, dx_{t_2}, \tag{1}$$

where $E[\cdot]$ is the expectation operator and $k_\sigma(\cdot, \cdot)$ is any symmetric positive-definite function. Correntropy is thus the joint expectation of $k_\sigma(x_{t_1}, x_{t_2})$. In this work, $k_\sigma(\cdot, \cdot)$ is a Gaussian kernel given as

$$k_\sigma(x_{t_1}, x_{t_2}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_{t_1} - x_{t_2})^2}{2\sigma^2}\right), \tag{2}$$

where $\sigma$ is the kernel size, that is, the standard deviation of the Gaussian kernel ($\sigma^2$ is its variance). The kernel size may be interpreted as the resolution with which correntropy measures similarity in a high-dimensional feature space [12].
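As an illustration of (1) and (2), the following minimal Python sketch evaluates the Gaussian kernel and approximates the expectation in (1) by a sample mean over paired observations. The function names `gaussian_kernel` and `sample_correntropy`, the use of NumPy, and the sample-mean approximation are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2): exp(-(a - b)^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.exp(-diff**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)


def sample_correntropy(x, y, sigma):
    """Approximate E[k_sigma(x, y)] in (1) by the mean over paired samples.

    Replacing the expectation by a sample mean is an assumption of this sketch.
    """
    return float(np.mean(gaussian_kernel(x, y, sigma)))


# Hypothetical usage: a signal compared with a noisy copy of itself.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y = x + 0.1 * rng.standard_normal(1000)
print(sample_correntropy(x, y, sigma=1.0))
```

The kernel attains its maximum, $1/(\sqrt{2\pi}\,\sigma)$, when both arguments coincide, so the estimate decreases as the two signals move apart relative to the kernel size.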

In fact, by applying a Taylor series expansion to the Gaussian kernel, the correntropy measure (1) can be rewritten as [12]

$$V(t_1, t_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{n=0}^{\infty} \frac{(-1)^n}{2^n \sigma^{2n} n!}\, E\left[\left\| x_{t_1} - x_{t_2} \right\|^{2n}\right]. \tag{3}$$

Note that (3) is an infinite sum of even-order moments, which therefore includes the conventional covariance. Consequently, correntropy contains information from infinitely many statistical moments of the data.

It is interesting to observe that the kernel size in (3) is a parameter that weights the influence of the second-order and higher-order moments. For sufficiently large kernel sizes, the second-order moments are dominant and the measure approaches the correlation.
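The effect of the kernel size can be checked numerically. The sketch below compares the sample correntropy with the truncation of (3) to its first two terms ($n = 0$ and $n = 1$), which involves only the second-order moment of the difference between the signals. The function names, the synthetic Gaussian data, and the use of sample means in place of expectations are assumptions made only for illustration.

```python
import numpy as np


def correntropy_exact(x, y, sigma):
    """Sample mean of the Gaussian kernel (2) over paired samples."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.mean(np.exp(-d**2 / (2.0 * sigma**2))) / (np.sqrt(2.0 * np.pi) * sigma))


def correntropy_second_order(x, y, sigma):
    """Terms n = 0 and n = 1 of the series (3), with E[.] replaced by sample means."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float((1.0 - np.mean(d**2) / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma))


def relative_gap(x, y, sigma):
    """Relative difference between the full kernel mean and its two-term truncation."""
    exact = correntropy_exact(x, y, sigma)
    return abs(exact - correntropy_second_order(x, y, sigma)) / exact


# Hypothetical data: two independent Gaussian signals.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = rng.standard_normal(2000)
for sigma in (1.0, 5.0, 50.0):
    print(sigma, relative_gap(x, y, sigma))
```

As the kernel size grows, the terms of order $n \geq 2$ are scaled by increasing powers of $1/\sigma^2$, so the gap between the two quantities shrinks and the measure is governed by the second-order moment.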

When only a finite amount of data is available, that is, $\{x_n\}_{n=1}^{N}$, it is possible to define the estimator of the autocorrentropy function as [18]

$$\hat{V}(m) = \frac{1}{N - m} \sum_{n=m}^{N-1} k_\sigma\big(x(n), x(n - m)\big). \tag{4}$$
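A direct implementation of the estimator (4) could look as follows. This is a minimal sketch, assuming zero-based sample indices $n = 0, \ldots, N-1$ and lags $m = 0, \ldots, M$; the function names and the `max_lag` parameter are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.exp(-diff**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)


def autocorrentropy(x, max_lag, sigma):
    """Estimator (4) for lags m = 0..max_lag.

    For each lag m, averages k_sigma(x(n), x(n - m)) over n = m..N-1,
    that is, over N - m terms.
    """
    x = np.asarray(x, dtype=float)
    n_samples = x.size
    if not 0 <= max_lag < n_samples:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(x)")
    v = np.empty(max_lag + 1)
    for m in range(max_lag + 1):
        # x[m:] holds x(n) for n = m..N-1; x[:N - m] holds the matching x(n - m).
        v[m] = gaussian_kernel(x[m:], x[: n_samples - m], sigma).mean()
    return v


# Hypothetical usage on a short periodic sequence.
signal = np.tile([0.0, 1.0, 0.0, -1.0], 10)
print(autocorrentropy(signal, max_lag=8, sigma=1.0))
```

For a periodic input, the estimate returns to its lag-zero value at multiples of the period, which is the kind of structure the spectral analysis later exploits.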

The autocorrentropy estimator defined by (4) does not ensure zero mean even when the input data are centralized, owing to the nonlinear transformation produced by the kernel. However, a centralized estimator for correntropy has been defined in [18], which is given as

$$\hat{V}(m) = \frac{1}{N - m} \sum_{n=m}^{N-1} k_\sigma\big(x(n), x(n - m)\big) - \sum_{n=1}^{N} \sum_{m=1}^{N} k_\sigma\big(x(n), x(n - m)\big). \tag{5}$$
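The centralized estimator can be sketched in Python as follows. Note the assumption made here: the subtracted term is computed as the mean of the kernel over all $N^2$ sample pairs, which is the usual form of centered correntropy; the $1/N^2$ normalization and the pairing of all samples are assumptions of this sketch rather than a transcription of (5). The function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of (2)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.exp(-diff**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)


def centered_autocorrentropy(x, max_lag, sigma):
    """Autocorrentropy (4) minus the mean kernel value over all sample pairs.

    Assumption: the offset is (1 / N^2) * sum_i sum_j k_sigma(x(i), x(j)).
    The pairwise matrix needs O(N^2) memory, so this suits short frames only.
    """
    x = np.asarray(x, dtype=float)
    n_samples = x.size
    if not 0 <= max_lag < n_samples:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(x)")
    # Mean of the kernel over every pair of samples (broadcast to an N x N matrix).
    offset = gaussian_kernel(x[:, None], x[None, :], sigma).mean()
    v = np.array([
        gaussian_kernel(x[m:], x[: n_samples - m], sigma).mean()
        for m in range(max_lag + 1)
    ])
    return v - offset


# Hypothetical usage on a short periodic sequence.
signal = np.tile([0.0, 1.0, 0.0, -1.0], 10)
print(centered_autocorrentropy(signal, max_lag=8, sigma=1.0))
```

With this choice, a constant signal yields a centered autocorrentropy of zero at every lag, since the lagged kernel mean and the pairwise mean coincide.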

This work explores the spectral properties of centralized autocorrentropy in voice signals in order to detect and classify vocal pathologies. Specifically, the spectral descriptors of healthy and pathological voices used in the classification are extracted from the analyzed signals by means of the correntropy spectral density (CSD) function, defined as [19]

$$P_\sigma(w) = \sum_{m=-\infty}^{\infty} \hat{V}(m) \exp(-jwm), \tag{6}$$

where $w$ is the digital frequency given in radians. The CSD function may be considered a generalization of the power spectral density (PSD) of a signal.
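In practice the sum in (6) has to be evaluated over a finite number of lags. The sketch below does so with an FFT, under two assumptions that are not stated in the text: lags beyond $M$ are treated as zero, and the autocorrentropy is taken as even in the lag, $\hat{V}(-m) = \hat{V}(m)$. The function name and the choice of frequency grid are hypothetical.

```python
import numpy as np


def correntropy_spectral_density(v):
    """Evaluate (6) on a finite, symmetric set of lags.

    v holds autocorrentropy values for lags 0..M. Lags beyond M are treated
    as zero and V(-m) = V(m) is assumed (assumptions of this sketch).
    Returns (omega, csd), with omega in radians per sample.
    """
    v = np.asarray(v, dtype=float)
    # Lags 0..M followed by lags -M..-1: the circular layout the FFT expects.
    symmetric = np.concatenate([v, v[:0:-1]])
    n_points = symmetric.size  # equals 2M + 1
    # The imaginary part vanishes because the sequence is circularly even.
    csd = np.fft.fft(symmetric).real
    omega = 2.0 * np.pi * np.arange(n_points) / n_points
    return omega, csd


# Hypothetical usage with a short, made-up autocorrentropy sequence.
omega, csd = correntropy_spectral_density([1.0, 0.5, 0.25])
print(omega)
print(csd)
```

To express the frequency axis in hertz, `omega` would be multiplied by the sampling rate divided by $2\pi$; the sampling rate of the recordings is not given in this section.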











[Figure: block diagram of the feature extraction stage. Blocks: Normalization, CSD, FFT, Average. Outputs: healthy voice features, pathological voices features, edema voices features, nodule voices features. Labels: P0, P1, P2, P3, V(P0), V(P1).]

[Figure: correntropy descriptors of healthy and pathological voices. Axis labels: Frequency (Hz), Descriptors, Correntropy.]

[Figure: correntropy descriptors of edema and nodule voices. Axis labels: Frequency (Hz), Descriptors, Correntropy.]

[Figure: two-step classifier. Voice signal, feature extractor, features, Euclidean distance. Step 1 uses P0 and P1; Step 2 (Decision 02: nodule or edema) uses P2 and P3.]

[Figure: success rate versus kernel size (two plots).]








5 Conclusion





References






























Volume 2014































