A new feature constituting approach to detection of vocal fold pathology

This article was downloaded by: [North Dakota State University] On: 22 November 2014, At: 17:39 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Click for updates International Journal of Systems Science Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tsys20 A new feature constituting approach to detection of vocal fold pathology M. Hariharan a , Kemal Polat b & Sazali Yaacob a a School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP), Campus Pauh Putra, Perlis, Malaysia b Department of Electrical and Electronics Engineering, Faculty of Engineering and Architecture, Abant Izzet Baysal University, Bolu, Turkey Published online: 14 May 2013. To cite this article: M. Hariharan, Kemal Polat & Sazali Yaacob (2014) A new feature constituting approach to detection of vocal fold pathology, International Journal of Systems Science, 45:8, 1622-1634, DOI: 10.1080/00207721.2013.794905 To link to this article: http://dx.doi.org/10.1080/00207721.2013.794905 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. 
Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions


International Journal of Systems Science, 2014, Vol. 45, No. 8, 1622–1634, http://dx.doi.org/10.1080/00207721.2013.794905

A new feature constituting approach to detection of vocal fold pathology

M. Hariharan a,*, Kemal Polat b and Sazali Yaacob a

a School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP), Campus Pauh Putra, Perlis, Malaysia; b Department of Electrical and Electronics Engineering, Faculty of Engineering and Architecture, Abant Izzet Baysal University, Bolu, Turkey

(Received 27 September 2012; final version received 3 April 2013)

In the last two decades, non-invasive methods based on acoustic analysis of the voice signal have proved to be excellent and reliable tools for diagnosing vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and the MAPACI speech pathology database, are used. Four different supervised classifiers, namely k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network, are employed for testing the proposed features. The experimental results show that the proposed features give a very promising classification accuracy of 100% for both the MEEI database and the MAPACI speech pathology database.

Keywords: acoustic analysis; vocal fold pathology; feature extraction; feature weighting and classification

1 Introduction

Voice is a highly multivariate component of speech, and its quantitative description has led to the development of clinical tools. To detect vocal fold pathology, medical professionals use subjective techniques (invasive methods) such as direct inspection of the vocal folds and observation of the vocal folds with endoscopic instruments. These techniques are expensive, risky, time consuming and uncomfortable for the patients, and they require costly resources such as special light sources, endoscopic instruments and specialised video camera equipment. To circumvent these problems, non-invasive methods have been developed to help medical professionals in the early detection of vocal fold pathology. With the rapid development of signal processing techniques, the vocal/voice signal can be used for the detection of vocal fold pathologies, and its quantitative information plays an important role in understanding the process of vocal fold pathology formation. In the last 20 years, many research works have been carried out on the automatic detection and classification of vocal fold pathologies by means of acoustic analysis, parametric and non-parametric feature extraction, automatic pattern recognition or statistical methods (Kasuya, Ogawa, Mashima, and Ebihara 1986; Feijoo and Hernandez 1990; Boyanov, Ivanov, Hadjitodorov, and Chollet 1993; Deliyski 1993; Kasuya, Endo, and Saliu 1993; Boyanov and Hadjitodorov 1997; Hernandez-Espinosa, Gomez-Vilda, Godino-Llorente, and Aguilera-Navarro 2000; Martinez and Rufiner 2000; Godino-Llorente, Gomez-Vilda, and Blanco-Velasco 2006).

*Corresponding author. Email: hari@unimap.edu.my

A large number of acoustic parameters have been proposed, and their effectiveness has been demonstrated experimentally. The important parameters are pitch (Boyanov et al. 1993), jitter (Feijoo and Hernandez 1990; Kasuya et al. 1993), shimmer (Ludlow, Bassich, Connor, Coulter, and Lee 1987; Kasuya et al. 1993), harmonics-to-noise ratio (Yumoto, Sasaki, and Okamura 1984; Krom 1993) and normalised noise energy (Kasuya et al. 1986). In the last two decades, time-frequency/scale analyses (wavelet/wavelet packet transforms) have been used as tools to analyse all kinds of problems in signal and image processing. The speech/voice signal is highly non-stationary, and the Fourier transform is not well suited to analysing non-stationary signals because the time-domain information is lost when the frequency transformation is performed (Pandian, Sazali, and Muthusamy 2008; Paulraj, Sazali, and Hariharan 2009). Time-frequency/scale analysis (wavelet/wavelet packet transform) is a good tool for analysing non-stationary signals in both time and frequency scale (Pandian et al. 2008; Paulraj et al. 2009). Hence, wavelet and wavelet packet analysis has the potential for the identification of vocal fold pathology.

Nayak and Bhat (2003) presented a procedure to identify pathological disorders of the larynx using wavelet analysis. Nayak, Bhat, Acharya, and Aithal (2005) proposed a method for the classification and analysis of speech abnormalities based on wavelet analysis and an artificial neural network. Fonseca, Guido, Scalassara, Maciel, and Pereira (2007) presented wavelets and a least-square support vector machine (LS-SVM) for the identification of vocal fold pathology. Salhi, Talbi, and Cherif (2008) proposed a hybrid approach using wavelet analysis and a multilayer neural network for vocal fold pathologies. Kukharchik, Kheidorov, Bovbel, and Ladeev (2008) presented the wavelet transform and a support vector machine (SVM) for vocal fold pathology detection. Crovato and Schuck (2007) presented a vocal fold pathology (dysphonic voice) classification system using the wavelet packet transform (WPT) and the best basis algorithm (BBA) for dimensionality reduction, with six artificial neural networks acting as classification systems. A hierarchical system for the diagnosis of vocal fold pathologies based on wavelets and SVM was proposed by Nikkhah-Bahrami, Ahmadi-Noubari, Seyed Aghazadeh, and Khadivi Heris (2009). Arjmandi and Pooyan (2011) applied linear discriminant analysis and wavelet packets for vocal fold pathology assessment. Erfanian Saeedi, Almasganj, and Torabinejad (2011) proposed adaptive wavelets for vocal fold pathology assessment. In Khadivi Heris, Seyed Aghazadeh, and Nikkhah-Bahrami (2009), WPT with non-linear features was used for optimal selection of features to assess vocal fold pathologies. Azadi and Almasganj (2011) proposed a partitioning and biased support vector machine (PBSVM)-based classifier for vocal fold pathology assessment using labelled and unlabelled data-sets. The accuracy of these results varied from approximately 85% to 100% under different experiments. From the previous work, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental researchers. It is not easy to compare their results, since they conducted the analysis with data-sets of different sizes, different databases and different presentations of results. A cross-validation scheme and the use of more than one database are possible ways to prove the reliability of the results of various feature extraction and classification methods (Saenz-Lechon, Godino-Llorente, Osma-Ruiz, and Gomez-Vilda 2006).

© 2013 Taylor & Francis

In this paper, a new feature vector based on WPT and singular value decomposition is proposed for the automatic detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the discrimination ability of the proposed features. Two different databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database (Kay Elemetrics Inc. 1994) and the MAPACI speech pathology database (MAPACI 2004), are used to test either the robustness or the independence of the algorithms with respect to the databases. Two different experiments (frame-based and file-based) are carried out using the voice samples of these databases. Four different supervised classifiers, namely k-nearest neighbour (k-NN), LS-SVM, probabilistic neural network (PNN) and general regression neural network (GRNN), are employed for the identification of vocal fold pathology. Two data validation schemes are used (conventional validation [ConV] and tenfold cross-validation [CrossV]) in order to test the effectiveness of the proposed features and the reliability of the classification results.

2 Database

In the area of automatic detection of vocal fold pathology, the only commercially available database is the MEEI voice disorders database (Kay Elemetrics Inc. 1994). The database contains 53 normal and 657 pathological voice samples and was developed by the MEEI Voice and Speech Labs. The voice samples are sustained phonations of the vowel /ah/ (1–3 s long) and readings (12 s) of the 'Rainbow Passage' from patients with normal voices and a wide variety of organic, neurological, traumatic and psychogenic voice disorders in different stages. All the voice samples were collected in a controlled environment and sampled at 25 kHz or 50 kHz with 16 bits of resolution. To test the effectiveness of the proposed method, a total of 226 voice samples of sustained phonation of the vowel /ah/ (173 pathological and 53 normal) are used and downsampled to 25 kHz for our analysis. Two different experiments are performed (frame-based and file-based). In frame-based analysis, voice samples are segmented into frames 40 ms long using a Hamming window with 50% overlap. Then each windowed frame is parameterised by means of WPT and SVD (singular value decomposition). In file-based analysis, the 173 pathological and 53 normal voice samples are subjected to feature extraction as whole files. The second data-set is taken from the MAPACI speech pathology database (MAPACI 2004), and all its speech samples were recorded at 44,100 Hz during the lifetime of the MAPACI project. The recording device was a Sennheiser headset microphone. The speech database consists of 48 voice samples (12 normal and 12 pathological males + 12 normal and 12 pathological females); the speakers' ages ranged from 20 to 68 years, with an average of 36.75 years and a standard deviation of 14.35 years. Frame-based and file-based analyses are conducted on this database as well. Figure 1 shows pathological and normal voice samples from the MEEI database and the MAPACI speech pathology database.
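The frame-based segmentation described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' code: the helper name `frame_signal` is hypothetical, the input is a synthetic tone rather than a database recording, and a 25 kHz sampling rate is assumed so that a 40 ms frame holds 1000 samples.

```python
import numpy as np

def frame_signal(x, fs=25_000, frame_ms=40.0, overlap=0.5):
    """Split x into Hamming-windowed frames (hypothetical helper, not from the paper)."""
    frame_len = int(round(fs * frame_ms / 1000.0))   # 1000 samples at 25 kHz
    hop = int(round(frame_len * (1.0 - overlap)))    # 500 samples for 50% overlap
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return frames * window

# One second of a synthetic 150 Hz tone stands in for a sustained vowel.
fs = 25_000
t = np.arange(fs) / fs
frames = frame_signal(np.sin(2 * np.pi * 150 * t), fs=fs)
print(frames.shape)
```

Each row of `frames` would then be passed to the feature extraction stage; trailing samples that do not fill a complete frame are simply dropped in this sketch.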

Figure 1. Plots of pathological and normal voice samples.

3 WPT and SVD

Various feature extraction methods have been proposed for robust representation of normal and pathological speech, and in the last decade time-frequency analysis has been used for this purpose. The wavelet transform provides a time-frequency representation of the signal by decomposing it over dilated and translated wavelets. A wavelet is a waveform of effectively limited duration that has an average value of zero. The wavelet transform is defined as the convolution of a signal f(t) with a wavelet function ψ(t), shifted in time by a translation parameter b and dilated by a scale parameter a. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Bopardikar 2000):

$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{1}$$

In the tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases at a parent node (i, p) are given by the following forms:

$$\psi_{i+1}^{2p}(k) = \sum_{n=-\infty}^{\infty} l[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{2}$$

where l[n] is a low-pass (scaling) filter, and

$$\psi_{i+1}^{2p+1}(k) = \sum_{n=-\infty}^{\infty} h[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{3}$$

where h[n] is the high-pass (wavelet) filter. The wavelet packet transform is an extension of the wavelet transform. In the decomposition procedure, a signal is decomposed into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). Wavelet packet decomposition partitions both the lower and the higher frequency bands into smaller bands, which cannot be achieved with the general discrete wavelet transform. Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.

In this work, the automatic detection of vocal fold pathology is carried out using WPT and SVD. In frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). In total, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame is decomposed to five levels using WPT, which yields 32 subbands. The fifth-level wavelet packet coefficients of the 32 subbands form a matrix of size 32 × M for further processing, whose rows correspond to the wavelet packet subbands and whose columns correspond to the wavelet packet coefficients:

$$A = \left[ C_{5}^{1}(M) \;\; C_{5}^{2}(M) \;\; \cdots \;\; C_{5}^{32}(M) \right]^{T} \tag{4}$$
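To make the shape of the coefficient matrix A concrete, here is a small sketch of a five-level wavelet packet decomposition. It rests on assumptions that are not taken from the paper: Haar filters are swapped in for the 'db4' wavelet so that the example needs only NumPy, the frame is zero-padded to a multiple of 32 samples, and the function name `haar_wpt_matrix` is hypothetical.

```python
import numpy as np

def haar_wpt_matrix(frame, levels=5):
    """Full Haar wavelet packet tree; returns a (2**levels, M) coefficient matrix.

    Assumptions (not from the paper): Haar filters replace 'db4', the frame is
    zero-padded to a multiple of 2**levels, and subbands are kept in natural
    (filter-bank) order rather than frequency order.
    """
    n_bands = 2 ** levels
    pad = (-len(frame)) % n_bands
    nodes = [np.pad(np.asarray(frame, dtype=float), (0, pad))]
    for _ in range(levels):
        children = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            children.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            children.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        nodes = children
    return np.vstack(nodes)

rng = np.random.default_rng(0)
frame = rng.standard_normal(1000)     # stand-in for one 40 ms frame at 25 kHz
A = haar_wpt_matrix(frame)
print(A.shape)                        # 32 subbands, M coefficients per subband
```

Unlike the ordinary discrete wavelet transform, every node is split at every level, which is what produces the balanced tree with 32 leaves. With a longer filter such as 'db4', the number of coefficients per subband M would also depend on the boundary handling.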


Figure 2. Overall diagram of the proposed system.

To derive a simple and effective feature vector, singular value decomposition is applied. In file-based analysis, each voice sample is decomposed to five levels using WPT, and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the 'db4' wavelet is used in our study. The order was chosen to be low in order to model the transients and rapid variations in a signal efficiently.

3.1 Singular value decomposition

The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix A (an n × p matrix) of wavelet packet coefficients into a much smaller invertible and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that the given n × p matrix is decomposed as

$$A = U D V^{T} \tag{5}$$

The columns of U are orthonormal eigenvectors of AA^T and are called left-singular vectors. The columns of V are orthonormal eigenvectors of A^T A and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A. In this work, 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
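As an illustration of how the 32 singular values summarise a 32 × M coefficient matrix, the sketch below uses NumPy's `numpy.linalg.svd` on a random matrix. The matrix contents and the helper name `svd_features` are hypothetical stand-ins; only the use of the singular values as the feature vector follows the text.

```python
import numpy as np

def svd_features(A):
    """Singular values of A in descending order, used as the feature vector."""
    return np.linalg.svd(A, compute_uv=False)

rng = np.random.default_rng(1)
A = rng.standard_normal((32, 40))     # hypothetical 32 x M coefficient matrix
features = svd_features(A)

# Full factorisation A = U D V^T, to check the decomposition itself.
U, d, Vt = np.linalg.svd(A, full_matrices=False)
print(features.shape, np.allclose(A, U @ np.diag(d) @ Vt))
```

The squared singular values equal the eigenvalues of AA^T, which matches the description of U as the eigenvectors of AA^T.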

3.2 k-Means clustering based feature weighting

In many practical machine learning applications, the performance of the learned models degrades due to irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method, where they transform the extracted non-linearly separable features into linearly separable features and also map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are used not only to study the similarity or dissimilarity of the features but are also useful for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as the feature weighting method, since it is simple and widely implemented in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering is used to partition n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set x_i, i = 1, 2, 3, ..., n, the k-means clustering algorithm finds the cluster centres c_i and the membership matrix U iteratively using the following four steps.

Step 1: Randomly initialise the cluster centres c_i, i = 1, 2, 3, ..., c.

Step 2: Determine the membership matrix U using Equation (6):

$$u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^2 \le \|x_j - c_k\|^2 \text{ for each } k \ne i \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^2 \tag{7}$$

Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8}$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.

Figure 3. Working of k-means clustering based feature weighting method.

After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
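The four k-means steps and the subsequent weighting can be sketched as below. This is a hedged illustration rather than the authors' implementation. In particular, the text does not spell out how "the centre of a feature" is obtained from several cluster centres, so `weight_features` assumes it is the average of the cluster centres along that feature; that reading, the function names and the synthetic data are all assumptions.

```python
import numpy as np

def kmeans(X, n_clusters, n_iter=100, tol=1e-9, seed=0):
    """Plain k-means following Steps 1-4; returns (centres, labels)."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), n_clusters, replace=False)].copy()  # Step 1
    prev_cost = np.inf
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)                    # Step 2: hard membership
        cost = d2[np.arange(len(X)), labels].sum()    # Step 3: cost J
        if prev_cost - cost < tol:                    # stop when J no longer improves
            break
        prev_cost = cost
        for i in range(n_clusters):                   # Step 4: update the centres
            if np.any(labels == i):
                centres[i] = X[labels == i].mean(axis=0)
    return centres, labels

def weight_features(X, centres):
    """Scale each feature by (feature mean) / (mean of its cluster centres).

    Assumption: the per-feature centre is the average of the cluster centres
    along that feature; the source text does not define it precisely.
    """
    ratios = X.mean(axis=0) / centres.mean(axis=0)
    return X * ratios, ratios

# Synthetic, strictly positive two-feature data with two well-separated groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (30, 2))])
centres, labels = kmeans(X, n_clusters=2)
Xw, ratios = weight_features(X, centres)
print(np.sort(centres[:, 0]).round(1), ratios.round(2))
```

Because the ratio divides by a centre value, this sketch is only meaningful for features whose centres are away from zero, which holds for singular values since they are non-negative.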

Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features is increased by applying the k-means clustering based feature weighting method.

4 Classifiers

In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and the k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.

Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).

Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2).

4.1 k-NN classifier

In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork 2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is thus determined by the classes of its k nearest neighbours.
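The majority-vote rule of the k-NN classifier can be written in a few lines. The sketch below assumes the Euclidean distance, which the text does not state explicitly, and uses a tiny hand-made training set; the function name `knn_predict` is hypothetical.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    dist = np.linalg.norm(X_train - x, axis=1)        # Euclidean distances (assumed)
    nearest = np.argsort(dist, kind="stable")[:k]     # indices of the k closest
    votes = Counter(int(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])                # 0 = normal, 1 = pathological
pred_a = knn_predict(X_train, y_train, np.array([0.15, 0.1]))
pred_b = knn_predict(X_train, y_train, np.array([0.95, 1.0]))
print(pred_a, pred_b)
```

Choosing an odd k avoids ties in a two-class problem such as normal versus pathological.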

4.2 Probabilistic neural network

Specht (1990) proposed the PNN based on Bayesian classification and classical estimators of the probability density function (PDF). The PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern units compute the distances from the input vector to the training input vectors and produce a vector whose elements indicate how close the input is to each training input. The summation units sum these contributions for each class of inputs and produce a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
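The flow from pattern units to output units can be illustrated with a compact Parzen-window classifier. This is a sketch under assumptions, not the toolbox implementation used in the paper: a Gaussian kernel of the form exp(-d²/(2σ²)) is assumed, and the function name `pnn_predict` and the toy data are hypothetical.

```python
import numpy as np

def pnn_predict(X_train, y_train, x, sigma=0.05):
    """Pick the class with the largest Parzen (Gaussian kernel) density at x."""
    classes = np.unique(y_train)
    d2 = ((X_train - x) ** 2).sum(axis=1)             # pattern units: squared distances
    k = np.exp(-d2 / (2.0 * sigma ** 2))              # exponential activation
    scores = np.array([k[y_train == c].mean() for c in classes])  # summation units
    return int(classes[scores.argmax()]), scores      # output unit: winner takes all

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])                # 0 = normal, 1 = pathological
pred_a, _ = pnn_predict(X_train, y_train, np.array([0.15, 0.1]))
pred_b, _ = pnn_predict(X_train, y_train, np.array([0.95, 1.0]))
print(pred_a, pred_b)
```

A small σ makes each training vector influence only its immediate neighbourhood, which is why the spread factor has such a strong effect on accuracy.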

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
| Frame-based | | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | | 173 pathological + 53 normal | 24 pathological + 24 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |

Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.

4.3 General regression neural network

GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than a multilayer perceptron (MLP), is more accurate than MLP and is relatively insensitive to outliers. The target variable for the GRNN network is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer, summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
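The one-pass nature of the GRNN is easiest to see in code: there is no training loop, only a kernel-weighted average of the stored targets. The sketch below assumes a Gaussian kernel and, as a further assumption for the two-class case, thresholds the continuous output at 0.5; the function name `grnn_predict` and the toy data are hypothetical.

```python
import numpy as np

def grnn_predict(X_train, y_train, x, sigma=0.05):
    """Kernel-weighted average of the training targets (Nadaraya-Watson form)."""
    d2 = ((X_train - x) ** 2).sum(axis=1)             # pattern layer
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation layer: weighted sum of targets over the sum of weights.
    # The tiny constant guards against 0/0 when x is far from every sample.
    return float(np.dot(w, y_train) / (w.sum() + 1e-300))

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])    # 0 = normal, 1 = pathological
out_a = grnn_predict(X_train, y_train, np.array([0.15, 0.1]))
out_b = grnn_predict(X_train, y_train, np.array([0.95, 1.0]))
print(out_a < 0.5, out_b > 0.5)                       # threshold at 0.5 (assumed)
```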

4.4 Support vector machine

In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training point. During testing, a query point is categorised according to the distance between the point and the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used, since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter 2010) is used to perform the classification of normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
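To show where γ (gam) and σ² (sig2) enter, here is a small NumPy sketch of a least-squares SVM classifier. It is not the LS-SVMlab code and does not reproduce that toolbox's API. It assumes the commonly cited LS-SVM formulation, in which training reduces to solving one linear system, and an RBF kernel of the form exp(-‖x − z‖²/σ²); the function names, parameter values and toy data are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sig2):
    """RBF kernel matrix, assumed here to be exp(-||a - b||^2 / sig2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sig2)

def lssvm_train(X, y, gam=10.0, sig2=0.5):
    """Solve the LS-SVM linear system for (alpha, b); labels y must be -1 or +1."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sig2)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y                          # constraint row: sum(alpha_i * y_i) = 0
    M[1:, 0] = y
    M[1:, 1:] = omega + np.eye(n) / gam   # regularisation enters through 1/gam
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[1:], sol[0]

def lssvm_predict(X, y, alpha, b, X_new, sig2=0.5):
    """Sign of the kernel expansion evaluated at the new points."""
    return np.sign(rbf_kernel(X_new, X, sig2) @ (alpha * y) + b)

X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])       # -1 = normal, +1 = pathological
alpha, b = lssvm_train(X, y)
pred = lssvm_predict(X, y, alpha, b, np.array([[0.15, 0.1], [0.95, 1.0]]))
print(pred)
```

A larger gam penalises training errors more heavily, while a smaller sig2 produces a more flexible decision boundary, so the two are usually tuned together.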

5 Results and discussions

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.

Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to prove the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.
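The two validation schemes can be sketched with the standard library alone. The helper names below are hypothetical, and the way the feature vectors are shuffled and assigned to folds is an assumption; only the 70%/30% proportion and the tenfold scheme come from Table 1.

```python
import random

def holdout_sizes(n, train_fraction=0.7):
    """Training and testing set sizes for conventional validation (ConV)."""
    n_train = round(train_fraction * n)
    return n_train, n - n_train

def tenfold_indices(n, n_folds=10, seed=0):
    """Yield (train, test) index lists for tenfold cross-validation (CrossV)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)                  # random division into sets
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# Frame-based MEEI data: 1557 pathological + 1537 normal = 3094 feature vectors.
print(holdout_sizes(3094))
splits = list(tenfold_indices(226))                   # file-based MEEI data
print(len(splits), len(splits[0][0]) + len(splits[0][1]))
```

The sizes returned by `holdout_sizes` agree with the training and testing counts listed in Table 1 for all four data configurations.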

To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from four counts: true positive (TP: the classifier outputs pathological when a pathological sample is present), true negative (TN: the classifier outputs normal when a normal sample is present), false positive (FP: the classifier outputs pathological when a normal sample is present) and false negative (FN: the classifier outputs normal when a pathological sample is present). The measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
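Equations (9) to (11) translate directly into code. The counts used in the example below are hypothetical and are not results from the paper.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, overall accuracy) as fractions."""
    se = tp / (tp + fn)                       # Equation (9)
    sp = tn / (tn + fp)                       # Equation (10)
    acc = (tp + tn) / (tp + tn + fp + fn)     # Equation (11)
    return se, sp, acc

# Hypothetical counts: 173 pathological and 53 normal samples in total.
se, sp, acc = confusion_metrics(tp=170, tn=50, fp=3, fn=3)
print(f"SE={100 * se:.2f}%  SP={100 * sp:.2f}%  ACC={100 * acc:.2f}%")
```

Because the MEEI subset is unbalanced (173 pathological against 53 normal), overall accuracy is dominated by sensitivity, so reporting all three measures is more informative than accuracy alone.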

In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. The sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. A maximum classification accuracy of above 95% is obtained using the raw features, and 100% is obtained using the weighted features, during the frame-based analysis.


Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

                                                   Raw features                                 Weighted features
Classifier  Experiment   Parameter     Validation  SE             SP             ACC            SE             SP             ACC
k-NN        Frame-based  k-value 1–10  CrossV      96.66 ± 0.55   91.69 ± 1.09   94.05 ± 0.83   99.81 ± 0.06   98.27 ± 0.34   99.03 ± 0.19
                                       ConV        95.96 ± 0.71   90.87 ± 1.54   93.27 ± 1.16   99.81 ± 0.06   97.98 ± 0.33   98.88 ± 0.15
            File-based   k-value 1–10  CrossV      99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19   99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19
                                       ConV        99.74 ± 0.25   100 ± 0.00     99.79 ± 0.20   100 ± 0.00     100 ± 0.00     100 ± 0.00
LS-SVM      Frame-based  10 000 01     CrossV      95.10 ± 0.21   95.41 ± 0.20   95.25 ± 0.12   99.75 ± 0.03   99.45 ± 0.07   99.60 ± 0.04
                                       ConV        94.92 ± 0.90   94.49 ± 1.05   94.70 ± 0.81   99.34 ± 0.29   99.46 ± 0.34   99.40 ± 0.26
            File-based   10 01         CrossV      98.92 ± 0.62   99.81 ± 0.60   99.12 ± 0.47   99.32 ± 0.36   100 ± 0.00     99.47 ± 0.28
                                       ConV        100 ± 0.00     98.87 ± 1.30   99.12 ± 1.03   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
PNN         Frame-based  0.01–0.055    CrossV      97.43 ± 0.26   93.01 ± 0.17   95.12 ± 0.17   99.88 ± 0.05   98.69 ± 0.16   99.28 ± 0.07
                                       ConV        96.79 ± 0.24   92.44 ± 0.33   94.51 ± 0.24   99.81 ± 0.06   98.41 ± 0.21   99.10 ± 0.11
            File-based   0.01–0.055    CrossV      97.44 ± 5.39   100 ± 0.00     97.74 ± 4.97   97.50 ± 5.37   100 ± 0.00     97.79 ± 4.95
                                       ConV        97.09 ± 6.10   100 ± 0.00     97.37 ± 5.79   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
GRNN        Frame-based  0.01–0.055    CrossV      93.01 ± 3.89   88.08 ± 4.98   90.15 ± 4.31   99.50 ± 0.41   97.53 ± 1.30   98.27 ± 0.89
                                       ConV        92.52 ± 3.78   87.07 ± 5.11   89.34 ± 4.35   99.39 ± 0.55   97.35 ± 1.16   98.10 ± 0.91
            File-based   0.01–0.055    CrossV      97.27 ± 5.99   100 ± 0.00     96.33 ± 8.00   98.07 ± 4.82   100 ± 0.00     97.92 ± 5.11
                                       ConV        96.69 ± 6.75   100 ± 0.00     95.81 ± 9.05   96.77 ± 6.23   100 ± 0.00     95.81 ± 8.15

Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

                                                   Raw features                                 Weighted features
Classifier  Experiment   Parameter     Validation  SE             SP             ACC            SE             SP             ACC
k-NN        Frame-based  k-value 1–10  CrossV      94.49 ± 0.36   96.58 ± 0.39   95.51 ± 0.21   98.67 ± 0.23   98.76 ± 0.25   98.71 ± 0.23
                                       ConV        94.20 ± 0.40   96.29 ± 0.33   95.21 ± 0.32   100 ± 0.00     100 ± 0.00     100 ± 0.00
            File-based   k-value 1–10  CrossV      77.10 ± 7.84   66.92 ± 3.30   70.42 ± 3.65   100 ± 0.00     100 ± 0.00     98.33 ± 1.64
                                       ConV        77.56 ± 7.11   67.51 ± 2.48   69.71 ± 2.70   99.88 ± 0.40   99.88 ± 0.40   97.50 ± 2.16
LS-SVM      Frame-based  100 01        CrossV      95.50 ± 0.18   97.28 ± 0.24   96.37 ± 0.16   99.04 ± 0.08   99.04 ± 0.08   99.24 ± 0.06
                                       ConV        96.86 ± 0.81   95.24 ± 0.67   96.03 ± 0.40   98.71 ± 0.68   98.71 ± 0.68   99.02 ± 0.41
            File-based   101           CrossV      79.31 ± 3.33   68.87 ± 2.40   72.92 ± 2.60   100 ± 0.00     100 ± 0.00     95.83 ± 0.00
                                       ConV        74.53 ± 7.82   84.00 ± 13.01  77.14 ± 8.78   100 ± 0.00     100 ± 0.00     96.43 ± 3.76
PNN         Frame-based  0.01–0.055    CrossV      94.88 ± 0.20   96.23 ± 0.26   95.55 ± 0.21   99.82 ± 0.07   99.82 ± 0.07   99.91 ± 0.04
                                       ConV        94.52 ± 0.31   96.03 ± 0.35   95.26 ± 0.25   99.76 ± 0.07   99.76 ± 0.07   99.88 ± 0.03
            File-based   0.01–0.055    CrossV      72.30 ± 3.96   70.53 ± 3.93   70.83 ± 1.70   98.57 ± 4.52   98.57 ± 4.52   99.17 ± 2.63
                                       ConV        69.02 ± 5.33   69.18 ± 4.10   67.93 ± 3.86   98.60 ± 2.73   98.60 ± 2.73   99.00 ± 1.72
GRNN        Frame-based  0.01–0.055    CrossV      90.41 ± 3.78   91.79 ± 3.35   90.98 ± 3.45   99.59 ± 0.29   99.59 ± 0.29   99.71 ± 0.23
                                       ConV        90.02 ± 3.61   91.78 ± 3.16   90.72 ± 3.24   99.53 ± 0.30   99.53 ± 0.30   99.66 ± 0.24
            File-based   0.01–0.055    CrossV      69.43 ± 8.51   70.75 ± 4.86   66.67 ± 7.73   97.13 ± 6.40   97.13 ± 6.40   93.75 ± 9.27
                                       ConV        62.05 ± 9.40   64.51 ± 3.83   59.57 ± 8.50   97.38 ± 5.68   97.38 ± 5.68   93.07 ± 9.91


Table 4. Computation time for feature extraction and classification.

Classification
                                                   MEEI database          MAPACI database
Classifier  Experiment   Parameter     Validation  Weighted   Raw         Weighted   Raw
k-NN        Frame-based  k-value 1–10  CrossV      0.338      0.341       0.342      0.346
                                       ConV        3.806      3.821       6.931      6.955
            File-based   k-value 1–10  CrossV      0.036      0.038       0.036      0.036
                                       ConV        1.759      1.754       3.001      3.206
LS-SVM      Frame-based  10 001        CrossV      1.139      1.175       1.191      1.214
                                       ConV        0.718      0.698       0.705      0.723
            File-based   10 1          CrossV      0.005      0.005       0.002      0.002
                                       ConV        0.006      0.006       0.004      0.004
PNN         Frame-based  0.01–0.055    CrossV      2.458      3.084       2.445      2.421
                                       ConV        10.340     10.356      11.180     11.202
            File-based   0.01–0.055    CrossV      1.078      1.124       1.093      1.062
                                       ConV        4.250      4.266       4.072      4.047
GRNN        Frame-based  0.01–0.055    CrossV      2.587      2.571       2.676      2.582
                                       ConV        10.726     10.648      11.634     11.584
            File-based   0.01–0.055    CrossV      1.098      1.056       1.065      1.062
                                       ConV        4.302      4.260       4.049      4.066

Feature extraction
                     MEEI                      MAPACI
Type of experiment   Normal     Pathological   Normal     Pathological
File-based           0.549      0.034          0.109      0.130
Frame-based          0.591      0.181          1.734      1.737

To prove the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features during the frame-based analysis. During the file-based analysis, a maximum classification accuracy of 92% is obtained using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under two different experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them have not presented the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been proven by much experimental research. The accuracy of those results varied from approximately 85% to 100% under different experiments. Those results depend highly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy as well for both the MEEI database and the MAPACI speech pathology database.

6 Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters, and hence in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.
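The sensitivity of k-means to its starting centres, and the remedy of running it several times, can be illustrated with a small sketch. This is a plain Lloyd's-algorithm implementation in pure Python under stated assumptions: the function names, the number of restarts, the seed handling and the toy data are all hypothetical, and the feature weighting step built on top of the cluster centres is not reproduced here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """One run of Lloyd's k-means; returns (centres, sum of squared errors)."""
    centres = [list(p) for p in rng.sample(points, k)]  # random initial centres
    for _ in range(max_iter):
        # Assignment step: index of the nearest centre for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                  for p in points]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centres.append(centres[j])  # keep the centre of an empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    sse = sum(min(sq_dist(p, c) for c in centres) for p in points)
    return centres, sse


def kmeans_with_restarts(points, k, restarts=10, seed=0):
    """Repeat k-means from different random starts and keep the lowest-SSE run."""
    best = None
    for r in range(restarts):
        centres, sse = kmeans(points, k, random.Random(seed + r))
        if best is None or sse < best[1]:
            best = (centres, sse)
    return best


# Toy data with two well-separated groups, invented for illustration only.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
centres, sse = kmeans_with_restarts(points, k=2)
print(sorted(centres), sse)
```

Keeping the run with the smallest within-cluster sum of squared errors is one common selection rule; the text does not state which criterion was used to decide that a set of cluster centres was better.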

Acknowledgements

This work is done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the same department with an MSc degree in 2004. He took his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), 'k-Means Nearest Neighbor Classifier for Voice Pathology', in Paper Presented at the Proceedings from INDICON'04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), 'An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine', Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), 'PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets', Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), 'Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases', IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), 'Robust Hybrid Pitch Detector', IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), 'A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM', Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), 'Fuzzy Model Identification Based on Cluster Estimation', Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), 'The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices', IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.


De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), 'LS-SVMlab Toolbox', http://www.esat.kuleuven.be/sista/lssvmlab.

Deliyski, D. (1993), 'Acoustic Model and Evaluation of Pathological Voice Production', in Paper Presented at the Proceedings from EUROSPEECH'93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), 'Support Vector Wavelet Adaptation for Pathological Voice Assessment', Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), 'A Comparison of Regression Methods for Remote Tracking of Parkinson's Disease Progression', Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), 'Short-Term Stability Measures for the Evaluation of Vocal Quality', Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), 'Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders', Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), 'Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors', IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), 'Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters', IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), 'Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting', Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), 'Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition', in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA'09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), 'Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network', International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), 'Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks', Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), 'Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology', Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), 'Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network', Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), 'Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network', Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), 'Diagnosis of Vocal and Voice Disorders by the Speech Signal', in Paper Presented at the Proceedings from IJCNN'00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), 'Detecting Earnings Management with Neural Networks', Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), 'A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting', Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), 'Robust Audio Hashing for Content Identification', in Paper Presented at the Proceedings from EUSIPCO'01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), 'Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice', Paper Presented at the Proceedings from EUROSPEECH'93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), 'Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice', The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc. (1994), 'Voice Disorders Database, Version 1.03 [CD-ROM]', http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.

Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), 'Optimal Feature Selection for the Assessment of Vocal Fold Disorders', Computers in Biology and Medicine, 39(10), 860–868.

Krom, G. (1993), 'A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals', Journal of Speech and Hearing Research, 36(2), 254–266.

Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), 'Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection', Lecture Notes in Computer Science (Springer), 5099, 192–199.

Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), 'Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)', Journal of Biomedical Informatics, 41(1), 15–23.

Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), 'The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology', Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.

MacQueen, J. (1967), 'Some Methods for Classification and Analysis of Multivariate Observations', in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.

MAPACI, P. (2004), 'Voice Disorder Database', http://www.mapaci.com/index-ingles.php.


Martinez, C., and Rufiner, H. (2000), 'Acoustic Analysis of Speech for Detection of Laryngeal Pathologies', in Paper Presented at the Proceedings from IEEE EMBS'00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.

Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), 'Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health', Expert Systems with Applications, 39(14), 11792–11800.

Nayak, J., and Bhat, P. (2003), 'Identification of Voice Disorders Using Speech Samples', in Paper Presented at the Proceedings from TENCON'03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.

Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), 'Classification and Analysis of Speech Abnormalities', ITBM-RBM, 26(5–6), 319–327.

Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), 'Hierarchical Diagnosis of Vocal Fold Disorders', Communications in Computer and Information Science (Springer), 6, 897–900.

Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), 'Perceptual Audio Hashing Functions', EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.

Pandian, M., Sazali, Y., and Hariharan, M. (2008), 'Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders', in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.

Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), 'Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network', International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.

Polat, K. (2012), 'Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets', Journal of Medical Systems, 36(4), 2657–2673.

Polat, K., and Durduran, S.S. (2012), 'Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting', Neural Computing & Applications, 21(6), 1271–1279.

Polat, K., and Gunes, S. (2006), 'A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System', Digital Signal Processing, 16(6), 913–921.

Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.

Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), 'Objective Assessment of Pathological Voice Quality', in Paper Presented at the Proceedings of IEEE SMC'99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.

Ritchings, T., McGillion, M., and Moore, C. (1999), 'Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons', in Paper Presented at the Proceedings from IEEE BMES/EMBS'99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.

Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), 'Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection', Biomedical Signal Processing and Control, 1(2), 120–128.

Salhi, L., Talbi, M., and Cherif, A. (2008), 'Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks', The World Academy of Science, Engineering and Technology (WASET'08), 35, 2070–3740.

Shama, K., and Cholayya, N. (2007), 'Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology', EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.

Specht, D. (1990), 'Probabilistic Neural Networks', Neural Networks, 3(1), 109–118.

Specht, D.F. (1991), 'A General Regression Neural Network', IEEE Transactions on Neural Networks, 2(6), 568–576.

Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.

Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), 'Discrimination of Pathological Voices Using a Time-Frequency Approach', IEEE Transactions on Biomedical Engineering, 52(3), 421–430.

Wester, M. (1998), 'Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models', in Paper Presented at the Proceedings from VOICEDATA'98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.

Wu, J.D., and Tsai, Y.J. (2011), 'Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network', Expert Systems with Applications, 38(5), 6112–6117.

Xu, R., and Wunsch, D. (2005), 'Survey of Clustering Algorithms', IEEE Transactions on Neural Networks, 16(3), 645–678.

Yager, R., and Filev, D. (1994), 'Generation of Fuzzy Rules by Mountain Clustering', Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.

Yu, L., Yao, X., Wang, S., and Lai, K. (2011), 'Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection', Expert Systems with Applications, 38(12), 15392–15399.

Yumoto, E., Sasaki, Y., and Okamura, H. (1984), 'Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness', Journal of Speech and Hearing Research, 27(1), 2–6.


International Journal of Systems Science, 2014, Vol. 45, No. 8, 1622–1634, http://dx.doi.org/10.1080/00207721.2013.794905

A new feature constituting approach to detection of vocal fold pathology

M. Hariharan a,*, Kemal Polat b and Sazali Yaacob a

a School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP), Campus Pauh Putra, Perlis, Malaysia; b Department of Electrical and Electronics Engineering, Faculty of Engineering and Architecture, Abant Izzet Baysal University, Bolu, Turkey

(Received 27 September 2012; final version received 3 April 2013)

In the last two decades, non-invasive methods based on acoustic analysis of the voice signal have proved to be excellent and reliable tools for diagnosing vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and the MAPACI speech pathology database, are used. Four different supervised classifiers, namely k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network, are employed for testing the proposed features. The experimental results show that the proposed features give a very promising classification accuracy of 100% for both the MEEI database and the MAPACI speech pathology database.

Keywords: acoustic analysis; vocal fold pathology; feature extraction; feature weighting and classification

1 Introduction

Voice is a highly multivariate component of speech, and its quantitative description has led to the development of clinical tools. To detect vocal fold pathology, medical professionals use subjective techniques (invasive methods) such as the direct inspection of the vocal folds and the observation of the vocal folds by endoscopic instruments. These techniques are expensive, risky, time consuming and uncomfortable for the patients, and they require costly resources such as special light sources, endoscopic instruments and specialised video camera equipment. To circumvent the above problems, non-invasive methods have been developed to help medical professionals with the early detection of vocal fold pathology. With the rapid development of signal processing techniques, the vocal/voice signal can be used for the detection of vocal fold pathologies, and its quantitative information plays an important role in understanding the process of vocal fold pathology formation. In the last 20 years, many research works have been carried out on the automatic detection and classification of vocal fold pathologies by means of acoustic analysis, parametric and non-parametric feature extraction, automatic pattern recognition or statistical methods (Kasuya, Ogawa, Mashima, and Ebihara 1986; Feijoo and Hernandez 1990; Boyanov, Ivanov, Hadjitodorov, and Chollet 1993; Deliyski 1993; Kasuya, Endo, and Saliu 1993; Boyanov and Hadjitodorov 1997; Hernandez-Espinosa, Gomez-Vilda, Godino-Llorente, and Aguilera-Navarro 2000; Martinez and Rufiner 2000; Godino-Llorente, Gomez-Vilda, and Blanco-Velasco 2006).

*Corresponding author. Email: hari@unimap.edu.my

A large number of acoustic parameters have been proposed, and their effectiveness has been proven by experimental research. The important parameters are pitch (Boyanov et al. 1993), jitter (Feijoo and Hernandez 1990; Kasuya et al. 1993), shimmer (Ludlow, Bassich, Connor, Coulter, and Lee 1987; Kasuya et al. 1993), harmonics-to-noise ratio (Yumoto, Sasaki, and Okamura 1984; Krom 1993) and normalised noise energy (Kasuya et al. 1986). In the last two decades, time-frequency/scale analyses (wavelet/wavelet packet transforms) have been used as tools to analyse all kinds of problems in signal and image processing. Since the speech/voice signal is highly non-stationary, the Fourier transform is not a very useful tool for analysing it, as the time-domain information is lost while performing the frequency transformation (Pandian, Sazali, and Muthusamy 2008; Paulraj, Sazali, and Hariharan 2009). Time-frequency/scale analysis (wavelet/wavelet packet transform) is a good tool for the analysis of non-stationary signals in both time and frequency scale (Pandian et al. 2008; Paulraj et al. 2009). Hence, wavelet and wavelet packet analysis has the potential for the identification of vocal fold pathology.
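To make the idea of a wavelet packet decomposition concrete, the sketch below splits a short signal into sub-band nodes and reports the energy in each. It is a minimal pure-Python illustration under stated assumptions: the Haar filter, the function names and the test signal are chosen only for simplicity and are not taken from the paper, and the singular value decomposition stage of the proposed features is not reproduced.

```python
import math


def haar_step(x):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(x, depth):
    """Split every node (not only the approximations) down to `depth`.

    Returns 2**depth leaf nodes in natural (filter-bank) order, which is
    not the same as increasing-frequency order.
    """
    if len(x) % (2 ** depth) != 0:
        raise ValueError("signal length must be a multiple of 2**depth")
    nodes = [list(x)]
    for _ in range(depth):
        nodes = [child for node in nodes for child in haar_step(node)]
    return nodes


def node_energies(nodes):
    """Energy (sum of squared coefficients) of each node."""
    return [sum(v * v for v in node) for node in nodes]


# Hypothetical short frame, invented for illustration only.
frame = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, 5.0]
leaves = wavelet_packet_leaves(frame, depth=2)
print(node_energies(leaves))
```

Unlike the ordinary wavelet transform, which keeps splitting only the approximation branch, the packet version also splits the detail branches, so the high-frequency content is resolved as finely as the low-frequency content. With an orthonormal filter, the node energies add up to the energy of the original frame.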

In Nayak and Bhat (2003), the authors presented a procedure to identify pathological disorders of the larynx using wavelet analysis. Nayak, Bhat, Acharya, and Aithal (2005) proposed a method for the classification and analysis of speech abnormalities based on wavelet analysis and an artificial neural network. Fonseca, Guido, Scalassara, Maciel, and Pereira (2007) presented wavelets and a least-squares support vector machine (LS-SVM) for the identification of vocal fold pathology. Salhi, Talbi, and Cherif (2008) proposed a hybrid approach using wavelet analysis and a multilayer neural network for vocal fold pathologies. Kukharchik, Kheidorov, Bovbel, and Ladeev (2008) presented the wavelet transform and a support vector machine (SVM) for vocal fold pathology detection. Crovato and Schuck (2007) presented a vocal fold pathology (dysphonic voice) classification system using the wavelet packet transform (WPT) and the best basis algorithm (BBA) for dimensionality reduction, with six artificial neural networks acting as classification systems. A hierarchical system for the diagnosis of vocal fold pathologies based on wavelets and SVM was proposed by Nikkhah-Bahrami, Ahmadi-Noubari, Seyed Aghazadeh, and Khadivi Heris (2009). Arjmandi and Pooyan (2011) applied linear discriminant analysis and wavelet packets to vocal fold pathology assessment. Erfanian Saeedi, Almasganj, and Torabinejad (2011) proposed adaptive wavelets for vocal fold pathology assessment. In Khadivi Heris, Seyed Aghazadeh, and Nikkhah-Bahrami (2009), WPT with non-linear features was used for optimal selection of features to assess vocal fold pathologies. Azadi and Almasganj (2011) proposed a partitioning and biased support vector machine (PBSVM)-based classifier for vocal fold pathology assessment using labelled and unlabelled data-sets. The accuracy of their results varied from approximately 85% to 100% under different experiments. From the previous work, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental researchers. It is not easy to compare their results, since they conducted the analysis with data-sets of different sizes, different databases and different presentations of results. A cross-validation scheme and the use of more than one database are possible ways to establish the reliability of the results of various feature extraction and classification methods (Saenz-Lechon, Godino-Llorente, Osma-Ruiz, and Gomez-Vilda 2006).

© 2013 Taylor & Francis

In this paper, a new feature vector based on WPT and singular value decomposition is proposed for the automatic detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the discrimination ability of the proposed features. Two different databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database (Kay Elemetrics Inc. 1994) and the MAPACI speech pathology database (MAPACI 2004), are used to test the robustness of the algorithms and their independence from the database. Two different experiments (frame-based and file-based) are carried out using the voice samples of these databases. Four different supervised classifiers, namely k-nearest neighbour (k-NN), LS-SVM, probabilistic neural network (PNN) and general regression neural network (GRNN), are employed for the identification of vocal fold pathology. Two data validation schemes (conventional validation [ConV] and tenfold cross-validation [CrossV]) are used to test the effectiveness of the proposed features and the reliability of the classification results.

2. Database

In the area of automatic detection of vocal fold pathology, the only commercially available database is the MEEI voice disorders database (Kay Elemetrics Inc. 1994). The database, developed by the MEEI Voice and Speech Labs, contains 53 normal and 657 pathological voice samples. The voice samples are sustained phonations of the vowel /ah/ (1–3 s long) and readings (12 s) of the 'Rainbow Passage' from patients with normal voices and a wide variety of organic, neurological, traumatic and psychogenic voice disorders in different stages. All the voice samples were collected in a controlled environment and sampled at 25 kHz or 50 kHz with 16 bits of resolution. To test the effectiveness of the proposed method, a total of 226 voice samples of sustained phonation of the vowel /ah/ (173 pathological and 53 normal) are used and downsampled to 25 kHz for our analysis. Two different experiments are performed (frame-based and file-based). In frame-based analysis, voice samples are segmented into frames 40 ms long using a Hamming window with 50% overlap. Then, each window is parameterised by means of WPT and SVD (singular value decomposition). In file-based analysis, the 173 pathological and 53 normal voice samples are subjected to feature extraction as a whole. The second data-set is taken from the MAPACI speech pathology database (MAPACI 2004); all of its speech samples were recorded at 44,100 Hz during the lifetime of the MAPACI project. The recording device was a Sennheiser headset microphone. The speech database consists of 48 voice samples (12 normal and 12 pathological males + 12 normal and 12 pathological females); the speakers' ages ranged from 20 to 68 years, with an average of 36.75 years and a standard deviation of 14.35 years. Frame-based and file-based analyses are conducted on this database as well. Figure 1 shows pathological and normal voice samples from the MEEI database and the MAPACI speech pathology database.
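The frame-based segmentation described above (40 ms frames, 50% overlap, Hamming window) can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB code: the function names `hamming` and `frame_signal` are hypothetical, the test tone is synthetic, and dropping the trailing partial frame is an assumption.

```python
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frame_signal(signal, fs, frame_ms=40.0, overlap=0.5):
    """Split `signal` into windowed frames; a trailing partial frame is dropped (assumption)."""
    frame_len = int(round(fs * frame_ms / 1000.0))   # 1000 samples at 25 kHz
    hop = int(round(frame_len * (1.0 - overlap)))    # 500 samples for 50% overlap
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(segment, window)])
    return frames

fs = 25000                                           # MEEI samples after downsampling
one_second = [math.sin(2 * math.pi * 150 * t / fs) for t in range(fs)]  # synthetic tone
frames = frame_signal(one_second, fs)
print(len(frames), len(frames[0]))
```

At 25 kHz each frame holds 1000 samples; at the MAPACI rate of 44,100 Hz the same call would give 1764-sample frames.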

3. WPT and SVD

Various feature extraction methods have been proposed for robust representation of normal and pathological speech. In the last decade, time-frequency analysis has been used for this purpose. The wavelet transform provides a time-frequency representation of the signal by decomposing it over dilated and translated wavelets. A wavelet is a waveform of effectively limited duration that has an average value of zero. The wavelet transform is defined as the convolution of a signal f(t) with a wavelet function ψ(t) shifted in time by a translation parameter b and dilated by a scale parameter a. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Bopardikar 2000)

$$W(a,b) = \int_{-\infty}^{\infty} f(t)\,\frac{1}{\sqrt{a}}\,\psi^{*}\!\left(\frac{t-b}{a}\right)dt \tag{1}$$

Figure 1. Plots of pathological and normal voice samples.

In the wavelet packet tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases at a parent node (i, p) are given by the following forms:

$$\psi_{i+1}^{2p}(k) = \sum_{n=-\infty}^{\infty} l[n]\,\psi_{i}^{p}\!\left(k - 2^{i}n\right) \tag{2}$$

where l[n] is a low-pass (scaling) filter, and

$$\psi_{i+1}^{2p+1}(k) = \sum_{n=-\infty}^{\infty} h[n]\,\psi_{i}^{p}\!\left(k - 2^{i}n\right) \tag{3}$$

where h[n] is the high-pass (wavelet) filter. The wavelet packet is an extension of the wavelet transform. In the WPT decomposition procedure, a signal is decomposed into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). Wavelet packet decomposition partitions both the lower and the higher frequency bands into smaller bands, which cannot be achieved with the general discrete wavelet transform. Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.
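The balanced binary tree can be illustrated with a short sketch. For brevity it uses the two-tap Haar filters in place of the 'db4' wavelet adopted in this work, so the coefficients differ from the paper's; only the tree structure (every node split into a low-pass and a high-pass child, giving 2^L subbands at level L) is the point. The function names and the 1024-sample test frame are hypothetical.

```python
import math

def haar_split(x):
    """One analysis step: low-pass and high-pass outputs, each downsampled by 2."""
    s = math.sqrt(2.0)
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) / s for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) / s for i in range(half)]
    return low, high

def wavelet_packet(x, levels):
    """Split every node at every level; returns the 2**levels leaf subbands
    in natural (tree) order, not frequency order."""
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes

# Synthetic frame whose length is divisible by 2**5 (assumption for this sketch).
frame = [math.sin(2 * math.pi * 5 * n / 1024) for n in range(1024)]
subbands = wavelet_packet(frame, 5)
print(len(subbands), len(subbands[0]))
```

Stacking the 32 leaf subbands row by row gives a 32 × M coefficient matrix of the kind used in the feature extraction step.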

In this work, the automatic detection of vocal fold pathology is carried out using WPT and SVD. In frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). In frame-based analysis, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame of the signal is decomposed into five levels using WPT, which yields 32 subbands. The fifth-level wavelet packet coefficients of the 32 subbands form a matrix of size 32 × M for further processing, whose rows pertain to wavelet packet subbands and whose columns pertain to wavelet packet coefficients:

$$A = \left[\,C_{5}^{1}(M)\;\; C_{5}^{2}(M)\;\; \cdots\;\; C_{5}^{32}(M)\,\right]^{T} \tag{4}$$


Figure 2. Overall diagram of the proposed system.

To derive a simple and effective feature vector, singular value decomposition is applied. In file-based analysis, each voice sample is decomposed into five levels using WPT, and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the 'db4' wavelet is used in our study. The order was chosen to be low in order to model the transients and rapid variations in a signal efficiently.

3.1. Singular value decomposition

The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix A (an n × p matrix) of wavelet packet coefficients to a much smaller, invertible and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that a given n × p matrix A can be decomposed as

$$A = U D V^{T} \tag{5}$$

The columns of U are orthonormal eigenvectors of $AA^{T}$ and are called left-singular vectors. The columns of V are orthonormal eigenvectors of $A^{T}A$ and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A. In this work, the 32 singular values give a good summary of the matrix of wavelet packet coefficients of a signal.
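As an illustration of how the singular values of a subband matrix could be obtained without a numerical library, the sketch below uses one-sided Jacobi rotations on the rows of the matrix. This is a swapped-in textbook technique, not necessarily what the authors' MATLAB code does; the function name `singular_values` and the small test matrices are hypothetical. For a 32 × M matrix with M ≥ 32 it returns 32 values, one per row.

```python
import math

def singular_values(rows, sweeps=60, tol=1e-12):
    """Singular values of the matrix whose rows are `rows`, via one-sided Jacobi.
    Pairs of rows are rotated until they are mutually orthogonal; the row norms
    are then the singular values (extra values are ~0 if rows outnumber columns)."""
    v = [[float(a) for a in r] for r in rows]
    n = len(v)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(a * a for a in v[p])
                beta = sum(b * b for b in v[q])
                gamma = sum(a * b for a, b in zip(v[p], v[q]))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue                      # rows already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                v[p], v[q] = ([c * a - s * b for a, b in zip(v[p], v[q])],
                              [s * a + c * b for a, b in zip(v[p], v[q])])
        if not rotated:
            break
    return sorted((math.sqrt(sum(a * a for a in r)) for r in v), reverse=True)

sv = singular_values([[1.0, 2.0], [3.0, 4.0]])
print(sv)
```

A quick sanity check for the 2 × 2 example: the product of the singular values equals |det A| = 2 and the sum of their squares equals the squared Frobenius norm, 30.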

3.2. k-Means clustering based feature weighting

In many practical machine learning applications, the performance of the learned models degrades because of irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method: they transform the extracted non-linearly separable features into linearly separable features and map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are useful not only for studying the similarity or dissimilarity of the features but also for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as a feature weighting method since it is simple and widely implemented in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering partitions n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set $x_i$, i = 1, 2, 3, ..., n, the k-means clustering algorithm finds the cluster centres $c_i$ and the membership matrix U iteratively using the following four steps.

Step 1: Randomly initialise the cluster centres $c_i$, i = 1, 2, 3, ..., c.

Step 2: Determine the membership matrix U using Equation (6):

$$u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^{2} \le \|x_j - c_k\|^{2} \text{ for each } k \ne i \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \; \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^{2} \tag{7}$$

Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8}$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.

Figure 3. Working of k-means clustering based feature weighting method.
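The four steps can be sketched in plain Python as follows. The function names, the seeded initialisation from randomly chosen data points, the thresholds and the rule of keeping an empty cluster's previous centre are assumptions made for this illustration.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, tol=1e-9, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial centres (here: k distinct data points, an assumption).
    centres = [list(p) for p in rng.sample(points, k)]
    prev_cost = float("inf")
    labels, cost = [], float("inf")
    for _ in range(max_iter):
        # Step 2: membership, Equation (6): each point joins its nearest centre.
        labels = [min(range(k), key=lambda i: sq_dist(p, centres[i])) for p in points]
        # Step 3: cost, Equation (7); stop when the cost or its improvement is small.
        cost = sum(sq_dist(p, centres[l]) for p, l in zip(points, labels))
        if cost < tol or prev_cost - cost < tol:
            break
        prev_cost = cost
        # Step 4: centre update, Equation (8); an empty cluster keeps its old centre.
        for i in range(k):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels, cost

data = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centres, labels, cost = kmeans(data, k=2)
print(sorted(centres), cost)
```

Because the result depends on the initial centres, running the function with several seeds and keeping the lowest-cost solution mirrors the repeated runs mentioned in the conclusions.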

After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
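The description above leaves some details open, so the following sketch shows only one possible reading of the weighting step. It assumes that the "centre" of a feature is the average of that feature's coordinate over all cluster centres, and that a zero centre leaves the feature unweighted; both choices, and the function name `weight_features`, are assumptions rather than the authors' stated procedure.

```python
def weight_features(features, centres):
    """Multiply every feature column by (mean of the feature) / (centre of the feature).
    `features` is a list of feature vectors, `centres` a list of cluster centres."""
    n_feat = len(features[0])
    ratios = []
    for j in range(n_feat):
        mean_j = sum(row[j] for row in features) / len(features)
        centre_j = sum(c[j] for c in centres) / len(centres)   # assumed definition
        ratios.append(mean_j / centre_j if centre_j != 0 else 1.0)
    weighted = [[x * r for x, r in zip(row, ratios)] for row in features]
    return weighted, ratios

raw = [[1.0, 10.0], [3.0, 30.0]]
example_centres = [[1.0, 10.0], [4.0, 40.0]]   # hypothetical centres for illustration
weighted, ratios = weight_features(raw, example_centres)
print(ratios, weighted)
```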

Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of raw and weighted SVD features for the MEEI database. The class distribution of raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.

4. Classifiers

In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and the k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.

Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).

Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Comparison of results (frame-based, MEEI database, raw features).

4.1. k-NN classifier

In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork 2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is determined by the classes of its k nearest neighbours.
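A minimal sketch of the majority-vote rule is given below, assuming Euclidean distance and resolving ties in favour of the class seen first among the nearest neighbours; the function name and the toy two-dimensional data are hypothetical.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], query))
    nearest = sorted(range(len(train_x)), key=sq_dist)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = ["normal", "normal", "pathological", "pathological"]
print(knn_predict(train_x, train_y, (0.2, 0.1)))
print(knn_predict(train_x, train_y, (5.5, 5.0)))
```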

4.2. Probabilistic neural network

Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four kinds of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to a training input. The summation unit sums these contributions for each class of inputs and produces a net output which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
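The pattern, summation and output units can be sketched as below. The Gaussian form exp(−d²/(2σ²)) of the pattern-unit activation, the averaging per class and the function name are assumptions; the toy one-dimensional data are hypothetical.

```python
import math

def pnn_predict(train_x, train_y, query, sigma=0.05):
    """Parzen-window classifier in the spirit of a PNN."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        # Pattern unit: exponential activation of the squared distance.
        sums[y] = sums.get(y, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[y] = counts.get(y, 0) + 1
    # Summation unit: average activation per class.
    scores = {y: sums[y] / counts[y] for y in sums}
    # Output unit: winner takes all.
    return max(scores, key=scores.get), scores

train_x = [(0.0,), (0.1,), (1.0,), (1.1,)]
train_y = [0, 0, 1, 1]
label_a, _ = pnn_predict(train_x, train_y, (0.05,))
label_b, _ = pnn_predict(train_x, train_y, (1.05,))
print(label_a, label_b)
```

With a very small σ every activation can underflow to zero for a distant query, in which case the first class encountered wins; this is one practical reason the spread factor has to be tuned to the feature scale.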

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | (samples) | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| Frame-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| Frame-based | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | (samples) | 173 pathological + 53 normal | 24 pathological + 24 normal |
| File-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| File-based | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |

Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.

4.3. General regression neural network

GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than a multilayer perceptron (MLP), is more accurate than an MLP and is relatively insensitive to outliers. The target variable for the GRNN network is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer, summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
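The one-pass nature of the GRNN can be seen in a short sketch: the output is a kernel-weighted average of the stored training targets, so no iterative training is involved. The Gaussian kernel form, the fallback to the mean target when all weights underflow, and thresholding the continuous output at 0.5 to obtain a class are assumptions of this illustration.

```python
import math

def grnn_predict(train_x, train_y, query, sigma=0.2):
    """Kernel-weighted average of training targets (Nadaraya-Watson form)."""
    num = den = 0.0
    for x, y in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        w = math.exp(-d2 / (2.0 * sigma ** 2))    # pattern layer
        num += w * y                               # summation layer (numerator)
        den += w                                   # summation layer (denominator)
    if den == 0.0:
        return sum(train_y) / len(train_y)         # assumed fallback
    return num / den                               # output layer

train_x = [(0.0,), (0.1,), (1.0,), (1.1,)]
train_y = [0.0, 0.0, 1.0, 1.0]                     # 0 = normal, 1 = pathological
estimate = grnn_predict(train_x, train_y, (0.05,))
print(estimate, "pathological" if estimate >= 0.5 else "normal")
```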

4.4. Support vector machine

In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples within two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space to maximise its distance from the closest training point. During testing, a query point is categorised according to the distance between the point and the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at a low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter 2010) is used to perform the classification of normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (the squared bandwidth of the RBF kernel).
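To make the roles of γ and σ² concrete, the sketch below trains a small LS-SVM classifier by solving the standard LS-SVM linear system with Gaussian elimination. It is not the LS-SVMlab implementation: the function names are hypothetical, the kernel is assumed to be exp(−‖x − z‖²/σ²) (some texts use 2σ² in the denominator), and the toy data are made up.

```python
import math

def rbf(x, z, sig2):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sig2)

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lssvm_train(xs, ys, gam=10.0, sig2=1.0):
    """Solve [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1],
    with Omega_ij = y_i * y_j * K(x_i, x_j) and labels y in {-1, +1}."""
    n = len(xs)
    a = [[0.0] + [float(y) for y in ys]]
    for i in range(n):
        row = [float(ys[i])]
        for j in range(n):
            k = ys[i] * ys[j] * rbf(xs[i], xs[j], sig2)
            row.append(k + (1.0 / gam if i == j else 0.0))
        a.append(row)
    sol = solve(a, [0.0] + [1.0] * n)
    return sol[0], sol[1:]          # bias b, multipliers alpha

def lssvm_predict(xs, ys, b, alpha, query, sig2=1.0):
    s = b + sum(al * y * rbf(query, x, sig2) for al, y, x in zip(alpha, ys, xs))
    return 1 if s >= 0 else -1

xs = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
ys = [-1, -1, 1, 1]                 # -1 = normal, +1 = pathological (toy labels)
b, alpha = lssvm_train(xs, ys, gam=10.0, sig2=1.0)
print(lssvm_predict(xs, ys, b, alpha, (0.5, 0.5)), lssvm_predict(xs, ys, b, alpha, (5.5, 5.0)))
```

A larger γ penalises training errors more heavily, while a smaller σ² makes the kernel narrower and the decision boundary more flexible.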

5. Results and discussions

Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes and the classification accuracy as well. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to establish the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
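The set sizes listed in Table 1 follow from a 70/30 split of the available feature vectors, and the tenfold scheme from a random partition into 10 sets. A minimal sketch, with hypothetical function names and an arbitrary random seed, is:

```python
import random

def holdout_split(n, train_fraction=0.7, seed=0):
    """Conventional validation: shuffled indices split into training and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]

def tenfold_indices(n, folds=10, seed=0):
    """Tenfold cross-validation: random partition into `folds` disjoint sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::folds] for i in range(folds)]

# Totals taken from Table 1: MEEI/MAPACI frame-based and file-based feature vectors.
for total in (1557 + 1537, 1560 + 1560, 173 + 53, 24 + 24):
    train, test = holdout_split(total)
    print(total, len(train), len(test))
```

In cross-validation each of the 10 sets is used once for testing while the remaining nine are used for training, and the reported figures are averaged over the repetitions.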

To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from the counts of true positives (TP: the classifier outputs pathological when pathological samples are present), true negatives (TN: the classifier outputs normal when normal samples are present), false positives (FP: the classifier outputs pathological when normal samples are present) and false negatives (FN: the classifier outputs normal when pathological samples are present). These measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
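Equations (9) to (11) translate directly into code. In the sketch below the label strings and the function name are hypothetical, and the toy predictions are made up; the function assumes that both classes occur in the true labels, otherwise a denominator would be zero.

```python
def classification_metrics(y_true, y_pred, positive="pathological"):
    """Sensitivity, specificity and overall accuracy from true and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return {
        "SE": tp / (tp + fn),                      # Equation (9)
        "SP": tn / (tn + fp),                      # Equation (10)
        "ACC": (tp + tn) / (tp + tn + fp + fn),    # Equation (11)
    }

y_true = ["pathological"] * 4 + ["normal"] * 4
y_pred = ["pathological"] * 3 + ["normal"] * 5     # one pathological sample missed
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```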

In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. The sensitivity, specificity and overall accuracy of frame-based and file-based analysis using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of file-based analysis is always better than that of frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features.


Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | k-value 1–10 | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | k-value 1–10 | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | k-value 1–10 | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | 10 000 01 | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | 10 000 01 | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | 10 01 | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | 10 01 | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | 0.01–0.055 | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | 0.01–0.055 | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | 0.01–0.055 | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | 0.01–0.055 | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | 0.01–0.055 | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |

Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | k-value 1–10 | Frame-based | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | k-value 1–10 | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | k-value 1–10 | File-based | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | 10, 0.01 | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | 10, 0.01 | Frame-based | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | 10, 1 | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | 10, 1 | File-based | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | 0.01–0.055 | Frame-based | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | 0.01–0.055 | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | 0.01–0.055 | File-based | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | 0.01–0.055 | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | 0.01–0.055 | File-based | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |


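Table 4. Computation time for feature extraction and classification (in seconds).

Classification

| Classifier | Parameter | Type of experiment | Type of validation | MEEI, weighted features | MEEI, raw features | MAPACI, weighted features | MAPACI, raw features |
|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | k-value 1–10 | Frame-based | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | k-value 1–10 | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | k-value 1–10 | File-based | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | 10, 0.01 | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | 10, 0.01 | Frame-based | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | 10, 1 | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | 10, 1 | File-based | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | 0.01–0.055 | Frame-based | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | 0.01–0.055 | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | 0.01–0.055 | File-based | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | 0.01–0.055 | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | 0.01–0.055 | File-based | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI, normal | MEEI, pathological | MAPACI, normal | MAPACI, pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |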

To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them have not presented the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with little computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.

6 Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. k-means


clustering based feature weighting approach can increase the distinguishing performance of raw WPT–SVD features. The experimental results show that the proposed weighted features give very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres, and hence in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could also be extended to detect specific types of disorders and to develop an online diagnosis system.

Acknowledgements

This work was done in part with data transferred from the MAPACI database, http://www.mapaci.com. The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from the Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Soft Computing, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.


De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab

Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html

Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.

Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.

Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.

Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.

Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.

MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.

MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php


Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.

Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.

Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.

Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.

Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.

Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.

Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.

Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.

Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.

Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.

Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.

Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.

Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.

Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.

Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.

Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.

Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.

Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.

Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.

Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.

Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.

Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.

Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.

Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.

Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.

Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.

Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.


of speech abnormalities based on wavelet analysis and artificial neural network. Fonseca, Guido, Scalassara, Maciel, and Pereira (2007) have presented wavelet and least-squares support vector machine (LS-SVM) for the identification of vocal fold pathology. Salhi, Talbi, and Cherif (2008) have proposed a hybrid approach using wavelet analysis and multilayer neural network for the vocal fold pathologies. Kukharchik, Kheidorov, Bovbel, and Ladeev (2008) have presented wavelet transform and support vector machine (SVM) for the vocal fold pathology detection. Crovato and Schuck (2007) presented a vocal fold pathology (dysphonic voice) classification system using the wavelet packet transform (WPT) and the best basis algorithm (BBA) for dimensionality reduction, with six artificial neural networks acting as classification systems. A hierarchical system for diagnosis of vocal fold pathologies based on wavelets and SVM has been proposed by Nikkhah-Bahrami, Ahmadi-Noubari, Seyed Aghazadeh, and Khadivi Heris (2009). Arjmandi and Pooyan (2011) have applied linear discriminant analysis and wavelet packets for vocal fold pathology assessment. Erfanian Saeedi, Almasganj, and Torabinejad (2011) have proposed adaptive wavelets for vocal fold pathology assessment. In Khadivi Heris, Seyed Aghazadeh, and Nikkhah-Bahrami (2009), WPT with non-linear features was used for optimal selection of features to assess the vocal fold pathologies. Azadi and Almasganj (2011) have proposed a partitioning and biased support vector machine (PBSVM)-based classifier for vocal fold pathology assessment using labelled and unlabelled data-sets. The accuracy of their results varied from approximately 85% to 100% under different experiments. From the previous work, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental researchers. It is not easy to compare their results, since they conducted the analysis with different sizes of data-set, different databases and different presentations of results. A cross-validation scheme and the use of more than one database are possible ways to prove the reliability of the results of various feature extraction and classification methods (Saenz-Lechon, Godino-Llorente, Osma-Ruiz, and Gomez-Vilda 2006).

In this paper, a new feature vector is proposed based on WPT and singular value decomposition for the automatic detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the discrimination ability of the proposed features. Two different databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database (Kay Elemetrics Inc. 1994) and the MAPACI speech pathology database (MAPACI 2004), are used to test the robustness of the algorithms and their independence from the database. Two different experiments (frame-based and file-based) are carried out using the voice samples of these databases. Four different supervised classifiers, namely k-nearest neighbour (k-NN), LS-SVM, probabilistic neural network (PNN) and general regression neural network (GRNN), are employed for the identification of vocal fold pathology. Two data validation schemes are used (conventional validation [ConV] and tenfold cross-validation [CrossV]) in order to test the effectiveness of the proposed features and the reliability of the classification results.
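The tenfold cross-validation scheme can be illustrated with a short sketch that partitions sample indices into ten disjoint test folds. This is only an assumed, unstratified shuffling scheme with a fixed seed; the paper does not state how the folds were drawn, and the function name `kfold_indices` is hypothetical.

```python
import random


def kfold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for n_folds-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold f takes every n_folds-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        test_idx = folds[f]
        train_idx = [i for g in range(n_folds) if g != f for i in folds[g]]
        yield train_idx, test_idx


# 226 is the number of MEEI voice samples used in the file-based experiment.
splits = list(kfold_indices(226))
```

Each sample appears in exactly one test fold, so every feature vector is used once for testing and nine times for training.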

2 Database

In the area of automatic detection of vocal fold pathology, the only commercially available database is the MEEI voice disorders database (Kay Elemetrics Inc. 1994). The database, developed by the MEEI Voice and Speech Labs, contains 53 normal and 657 pathological voice samples. The voice samples comprise sustained phonation of the vowel /ah/ (1–3 s long) and reading (12 s) of the ‘Rainbow Passage’ by subjects with normal voices and with a wide variety of organic, neurological, traumatic and psychogenic voice disorders in different stages. All the voice samples were collected in a controlled environment and sampled at 25 kHz or 50 kHz with 16 bits of resolution. To test the effectiveness of the proposed method, a total of 226 voice samples of sustained phonation of the vowel /ah/ (173 pathological and 53 normal) are used and downsampled to 25 kHz for our analysis. Two different experiments are performed (frame-based and file-based). In frame-based analysis, voice samples are segmented into frames 40 ms long using a Hamming window with 50% overlap. Each windowed frame is then parameterised by means of WPT and SVD (singular value decomposition). In file-based analysis, the 173 pathological and 53 normal voice samples are subjected to feature extraction directly. The second data-set is taken from the MAPACI database (MAPACI 2004), and all the speech samples were recorded at 44,100 Hz during the lifetime of the MAPACI project. The recording device was a Sennheiser headset microphone. The speech database consists of 48 voice samples (12 normal and 12 pathological males + 12 normal and 12 pathological females), and the speakers' ages ranged from 20 to 68 years, with an average of 36.75 years and a standard deviation of 14.35 years. Frame-based and file-based analyses are conducted on this database as well. Figure 1 shows the pathological and normal voice samples from the MEEI database and the MAPACI speech pathology database.
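At a sampling rate of 25 kHz, a 40 ms frame spans 1000 samples and a 50% overlap corresponds to a hop of 500 samples. The following sketch shows one way such a segmentation could be written. It assumes a symmetric Hamming window and drops any trailing samples that do not fill a whole frame; neither detail is stated in the paper, and `frame_signal` is a hypothetical name. A synthetic tone stands in for a sustained vowel.

```python
import math


def frame_signal(signal, fs=25000, frame_ms=40, overlap=0.5):
    """Split `signal` into overlapping, Hamming-windowed frames."""
    frame_len = int(round(fs * frame_ms / 1000.0))   # 1000 samples at 25 kHz
    hop = int(round(frame_len * (1.0 - overlap)))    # 500 samples for 50% overlap
    # Symmetric Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(segment, window)])
    return frames


# One second of a synthetic 100 Hz tone (illustrative input, not database audio).
fs = 25000
tone = [math.sin(2.0 * math.pi * 100.0 * n / fs) for n in range(fs)]
frames = frame_signal(tone, fs)
```

For this one-second input the sketch yields 49 frames of 1000 samples each.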

3 WPT and SVD

Various feature extraction methods have been proposed for robust representation of normal and pathological speech. In the last decade, time-frequency analysis has been used for this purpose. The wavelet transform provides a time-frequency representation of the signal. It decomposes the signal over dilated and translated wavelets. A wavelet is a waveform of effectively limited duration that has an average value of zero. The wavelet transform is defined as the convolution of a signal f(t) with a


Figure 1. Plots of pathological and normal voice samples.

wavelet function ψ(t), shifted in time by a translation parameter b and dilated by a scale parameter a. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Rao and Bopardikar 2000):

$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{1}$$

In the wavelet packet tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases at a parent node (i, p) are given by the following forms:

$$\psi_{i+1}^{2p}(k) = \sum_{n=-\infty}^{\infty} l[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{2}$$

where l[n] is a low-pass (scaling) filter, and

$$\psi_{i+1}^{2p+1}(k) = \sum_{n=-\infty}^{\infty} h[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{3}$$

where h[n] is the high-pass (wavelet) filter. The wavelet packet is an extension of the wavelet transform. In the wavelet decomposition procedure, a signal is decomposed into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). Wavelet packet decomposition partitions both the lower and the higher frequency bands into smaller bands, which cannot be achieved with the general discrete wavelet transform. Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.
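The balanced binary tree can be made concrete with a small sketch in which every node, approximation and detail alike, is split again at each level, so that five levels give 2^5 = 32 subbands. To keep the example short it uses the Haar filters, l = [1/√2, 1/√2] and h = [1/√2, −1/√2], rather than the 'db4' wavelet used in this work, and a frame length of 1024 so that each halving is exact. Both choices, and the function names, are assumptions made for illustration.

```python
import math


def haar_split(x):
    """One analysis step: return (approximation, detail), each half as long as x."""
    r = 1.0 / math.sqrt(2.0)
    approx = [(x[2 * n] + x[2 * n + 1]) * r for n in range(len(x) // 2)]
    detail = [(x[2 * n] - x[2 * n + 1]) * r for n in range(len(x) // 2)]
    return approx, detail


def wavelet_packet(x, levels):
    """Full wavelet packet tree: every node is split again at every level."""
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes   # 2**levels subbands, in natural (not frequency) order


# Illustrative 1024-sample frame; five levels give 32 subbands of 32 coefficients.
frame = [math.sin(0.05 * n) for n in range(1024)]
subbands = wavelet_packet(frame, levels=5)
```

Because the Haar filters are orthonormal, the total energy of the 32 subbands equals the energy of the input frame, which is a convenient sanity check for any implementation.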

In this work, the automatic detection of vocal fold pathology is carried out by using WPT and SVD. In frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). In frame-based analysis, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame is decomposed into five levels using WPT, which yields 32 subbands. The fifth-level wavelet packet coefficients of the 32 subbands are arranged into a matrix of size 32 × M for further processing, whose rows pertain to the wavelet packet subbands and whose columns pertain to the wavelet packet coefficients:

$$A = \left[\, C_{5}^{1}(M) \;\; C_{5}^{2}(M) \;\; \cdots \;\; C_{5}^{32}(M) \,\right]^{T} \tag{4}$$


Figure 2. Overall diagram of the proposed system.

To derive a simple and effective feature vector, singular value decomposition is applied. In file-based analysis, each voice sample is decomposed into five levels using WPT, and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering the fact that pathological voice signals contain more rapid variations, and the popularity of Daubechies wavelets in speech applications, the ‘db4’ wavelet is used in our study. The order was chosen to be low in order to model the transients and rapid variations in a signal efficiently.

3.1 Singular value decomposition

The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix A (an n × p matrix) of wavelet packet coefficients to a much smaller, invertible and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that a given n × p matrix A is decomposed as

$$A = U D V^{T} \tag{5}$$

The columns of U are orthonormal eigenvectors of $AA^{T}$ and are called left-singular vectors. The columns of V are orthonormal eigenvectors of $A^{T}A$ and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A. In this work, the 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
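As an illustration of how the singular values alone can be obtained from a coefficient matrix, the sketch below uses a one-sided Jacobi iteration: pairs of rows are rotated until all rows are mutually orthogonal, after which the row norms are the singular values. This algorithm is chosen here only because it needs no external library; it is not claimed to be the routine used in this work, and the function name and tolerances are assumptions. For a 32 × M matrix with M ≥ 32 it returns the 32 values that form the feature vector.

```python
import math


def singular_values(matrix, sweeps=50, tol=1e-12):
    """Singular values of `matrix` (a list of rows), largest first.

    One-sided Jacobi: rotate pairs of rows until they are mutually
    orthogonal; the row norms are then the singular values.
    Intended for matrices with no more rows than columns (e.g. 32 x M).
    """
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(v * v for v in a[p])
                beta = sum(v * v for v in a[q])
                gamma = sum(u * v for u, v in zip(a[p], a[q]))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue   # rows p and q are already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Both new rows are built from the old rows before assignment.
                a[p], a[q] = ([c * u - s * v for u, v in zip(a[p], a[q])],
                              [s * u + c * v for u, v in zip(a[p], a[q])])
        if not rotated:
            break
    return sorted((math.sqrt(sum(v * v for v in row)) for row in a), reverse=True)


# Small check matrix: A A^T has eigenvalues 45 and 5.
sv = singular_values([[3.0, 0.0], [4.0, 5.0]])
```

For the 2 × 2 check matrix the singular values are √45 and √5, the square roots of the eigenvalues of $AA^{T}$, in agreement with the definition above.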

3.2 k-Means clustering based feature weighting

In many practical machine learning applications, the performance of the learned models degrades due to irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as feature weighting methods, where they transform the extracted non-linearly separable features into linearly separable features and also map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are used not only to study the similarity or dissimilarity of the features but also for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as a feature weighting method, since it is simple and widely implemented in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering is used to partition n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set $x_i$, i = 1, 2, 3, …, n, the k-means clustering algorithm finds the cluster centres $c_i$ and the membership matrix U iteratively using the following four steps.

Step 1: Randomly initialise the cluster centres $c_i$, i = 1, 2, 3, …, c.

Step 2: Determine the membership matrix using Equation (6):

$$u_{ij} = \begin{cases} 1, & \text{if } \|x_j - c_i\|^2 \le \|x_j - c_k\|^2 \text{ for each } k \ne i \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^2 \tag{7}$$


Figure 3. Working of k-means clustering based feature weighting method.

Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8}$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.
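The four steps can be sketched in a few lines. The version below is illustrative only: it initialises the centres by sampling distinct data points with a fixed seed, lets an empty cluster keep its previous centre, and uses a hypothetical threshold and iteration cap, none of which are specified in the text.

```python
import random


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, n_clusters, seed=0, threshold=1e-9, max_iter=100):
    """Return (centres, labels, cost) following Steps 1-4."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, n_clusters)]   # Step 1
    previous_cost = float("inf")
    for _ in range(max_iter):
        # Step 2: hard membership, each point joins its nearest centre (Equation 6).
        labels = [min(range(n_clusters), key=lambda i: squared_distance(p, centres[i]))
                  for p in points]
        # Step 3: cost J, the sum of squared distances to assigned centres (Equation 7).
        cost = sum(squared_distance(p, centres[l]) for p, l in zip(points, labels))
        if cost < threshold or previous_cost - cost < threshold:
            break
        previous_cost = cost
        # Step 4: move each centre to the mean of its members (Equation 8).
        for i in range(n_clusters):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:   # an empty cluster keeps its old centre
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels, cost


# Two well-separated groups of hypothetical two-dimensional feature vectors.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centres, labels, cost = kmeans(points, 2)
```

Since the result of k-means depends on the initial centres, a practical run would typically repeat the procedure with several seeds and keep the solution with the lowest cost J.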

After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3: first, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
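The weighting step admits more than one reading when there are several cluster centres. The sketch below assumes that, for each feature, the centre value is the average of that feature's coordinate over all cluster centres, so that the weight is mean(feature) / mean(centres of that feature). This interpretation, the function names and the toy numbers are assumptions for illustration, and the cluster centres are taken as given.

```python
def kmeans_feature_weights(features, centres):
    """Per-feature weight = mean of the feature / mean of its cluster centres.

    Assumption: the centre of a feature is the average of that feature's
    coordinate over all cluster centres (must be non-zero).
    """
    n_features = len(features[0])
    weights = []
    for j in range(n_features):
        feature_mean = sum(row[j] for row in features) / len(features)
        centre_mean = sum(c[j] for c in centres) / len(centres)
        weights.append(feature_mean / centre_mean)
    return weights


def apply_weights(features, weights):
    """Multiply every feature column by its weight."""
    return [[value * w for value, w in zip(row, weights)] for row in features]


# Hypothetical two-feature data and cluster centres, for illustration only.
features = [[1.0, 10.0], [3.0, 30.0], [8.0, 20.0]]
centres = [[2.0, 20.0], [8.0, 20.0]]
weights = kmeans_feature_weights(features, centres)
weighted = apply_weights(features, weights)
```

In this toy case the feature means are (4, 20) and the centre means are (5, 20), so the first feature is scaled by 0.8 and the second is left unchanged.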

Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.

4. Classifiers

In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and the k-nearest neighbour (k-NN) classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.

4.1. k-NN classifier

Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).

Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Comparison of results (frame-based MEEI database, raw features).

In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork 2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is therefore determined by the classes of its k nearest neighbours.
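The majority-vote rule can be sketched in a few lines of Python. The Euclidean distance and the function name `knn_predict` are assumptions for illustration; the text does not specify the distance measure or tie-breaking rule used in the experiments.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Assign x to the class most common among its k nearest training vectors."""
    X_train = np.asarray(X_train, dtype=float)
    x = np.asarray(x, dtype=float)
    # Euclidean distance from x to every training vector (assumed metric).
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dist)[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

In the experiments reported later, k is varied between 1 and 10; an odd k avoids tied votes in a two-class problem.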

4.2. Probabilistic neural network

Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four kinds of units: input units, pattern units, summation units and output units. All the units are fully interconnected and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes the distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to each training input. The summation unit sums these contributions for each class of inputs and produces a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
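The pattern, summation and output stages described above can be sketched as follows. This is a generic Gaussian-kernel PNN written for illustration, not the toolbox routine used by the authors; the function name `pnn_predict`, the kernel scaling exp(-d²/(2σ²)) and the averaging within each class are assumptions.

```python
import numpy as np

# Spread factors 0.01, 0.015, ..., 0.055 (steps of 0.005), as in the text.
SIGMAS = np.linspace(0.01, 0.055, 10)

def pnn_predict(X_train, y_train, x, sigma=0.05):
    """Return (predicted class, per-class scores) for one input vector x."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    # Pattern units: exponential activation of the squared distance to
    # every training vector.
    d2 = ((X_train - np.asarray(x, dtype=float)) ** 2).sum(axis=1)
    activations = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation units: average activation per class (Parzen-style estimate).
    classes = np.unique(y_train)
    scores = np.array([activations[y_train == c].mean() for c in classes])
    # Output units: winner-take-all ("compete") decision.
    return classes[scores.argmax()], scores
```

With very small σ the activations of distant training vectors underflow towards zero, which is one reason the classification result is sensitive to the spread factor.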

4.3. General regression neural network

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | | (1557 pathological + 1537 normal) | (1560 pathological + 1560 normal) |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | | (173 pathological + 53 normal) | (24 pathological + 24 normal) |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |

Figure 6. Comparison of results: (a) frame-based MEEI database, raw features; (b) file-based MEEI database, raw features; (c) frame-based MAPACI speech pathology database, raw features; and (d) file-based MAPACI speech pathology database, raw features.

GRNN is a kind of radial basis network and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than a multilayer perceptron (MLP), it is more accurate than the MLP and it is relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer, summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
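A GRNN prediction can be written as a kernel-weighted average of the training targets. The sketch below is a generic formulation for illustration only; the function name `grnn_predict`, the Gaussian kernel scaling and the use of 0/1 targets with a 0.5 threshold for two-class decisions are assumptions, not details taken from the paper.

```python
import numpy as np

def grnn_predict(X_train, y_train, x, sigma=0.05):
    """Kernel-weighted average of the training targets for one input x."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Pattern layer: squared distance from x to each stored training vector.
    d2 = ((X_train - np.asarray(x, dtype=float)) ** 2).sum(axis=1)
    # Subtracting the minimum distance leaves the ratio below unchanged
    # and avoids all weights underflowing to zero for small sigma.
    w = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    # Summation layer: weighted sum of targets and sum of weights.
    # Output layer: their ratio.
    return float(w @ y_train / w.sum())
```

Since the output is continuous, a two-class decision can be obtained by coding the classes as 0 and 1 and thresholding the prediction at 0.5.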

4.4. Support vector machine

In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to the distance between the point and the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to perform the classification of normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
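To make the roles of γ and σ² concrete, the sketch below trains a two-class LS-SVM by solving the standard linear system of the least-squares formulation. It is not the LS-SVMlab code and does not reproduce its API; the function names are hypothetical, labels are assumed to be coded as -1 and +1, and the kernel scaling exp(-||x - z||²/(2σ²)) is an assumption, since toolboxes differ on the factor of 2.

```python
import numpy as np

def rbf_kernel(A, B, sig2):
    """RBF kernel matrix between the rows of A and B (assumed scaling)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sig2))

def lssvm_train(X, y, gam, sig2):
    """Solve [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1]."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sig2)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gam   # gam controls the regularisation
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]                # alpha, b

def lssvm_predict(X, y, alpha, b, sig2, X_new):
    """Sign of the decision function sum_i alpha_i y_i K(x, x_i) + b."""
    return np.sign(rbf_kernel(X_new, X, sig2) @ (alpha * y) + b)
```

A larger γ penalises training errors more strongly, while a smaller σ² makes the decision boundary more flexible; in practice both are tuned, for example by a grid search with cross-validation.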

5. Results and discussions

Figure 7. Comparison of results: (a) frame-based MEEI database, weighted features; (b) file-based MEEI database, weighted features; (c) frame-based MAPACI speech pathology database, weighted features; and (d) file-based MAPACI speech pathology database, weighted features.

Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes and the classification accuracy as well. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to assess the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.
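The two validation schemes listed in Table 1 (a 70%/30% split for ConV and a random partition into 10 sets for CrossV) can be sketched as index generators. The function names, the rounding of the training size and the use of a fixed random seed are assumptions made for illustration; the paper does not describe how the random division was implemented.

```python
import numpy as np

def conv_split(n, train_frac=0.7, seed=0):
    """Random hold-out split of range(n): returns (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = round(train_frac * n)
    return perm[:n_train], perm[n_train:]

def crossv_folds(n, n_folds=10, seed=0):
    """Random partition of range(n) into n_folds nearly equal sets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), n_folds)

def crossv_pairs(n, n_folds=10, seed=0):
    """Yield (train_idx, test_idx), holding out each set once."""
    folds = crossv_folds(n, n_folds, seed)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, test_idx
```

With the frame-based MEEI data (1557 + 1537 = 3094 feature vectors), `conv_split(3094)` gives 2166 training and 928 testing vectors, matching the counts in Table 1.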

To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from the following counts: true positive (TP: the classifier outputs pathological when a pathological sample is presented), true negative (TN: the classifier outputs normal when a normal sample is presented), false positive (FP: the classifier outputs pathological when a normal sample is presented) and false negative (FN: the classifier outputs normal when a pathological sample is presented). The measures are given in Equations (9), (10) and (11), respectively.

\[
\text{Sensitivity (SE)} = \frac{TP}{TP + FN}
\tag{9}
\]

\[
\text{Specificity (SP)} = \frac{TN}{TN + FP}
\tag{10}
\]

\[
\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN}
\tag{11}
\]
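Equations (9) to (11) translate directly into code. The function names below are chosen for illustration, and the counts in the usage note are made-up numbers, not results from the paper.

```python
def sensitivity(tp, fn):
    """Equation (9): fraction of pathological samples detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Equation (10): fraction of normal samples recognised as normal."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Equation (11): fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For a hypothetical test set with TP = 90, FN = 10, TN = 80 and FP = 20, these give SE = 0.90, SP = 0.80 and ACC = 0.85.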

In the k-NN classifier, different values of ‘k’ between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. The sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features.


Table 2. Comparison of results for the MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%).

| Classifier | Experiment | Parameter | Validation | SE (raw) | SP (raw) | ACC (raw) | SE (weighted) | SP (weighted) | ACC (weighted) |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | Frame-based | k-value 1–10 | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | File-based | k-value 1–10 | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | File-based | k-value 1–10 | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | Frame-based | 10, 0.0001 | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | Frame-based | 10, 0.0001 | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | File-based | 10, 0.1 | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | File-based | 10, 0.1 | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | Frame-based | 0.01–0.055 | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | File-based | 0.01–0.055 | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | File-based | 0.01–0.055 | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | File-based | 0.01–0.055 | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | File-based | 0.01–0.055 | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |

Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%).

| Classifier | Experiment | Parameter | Validation | SE (raw) | SP (raw) | ACC (raw) | SE (weighted) | SP (weighted) | ACC (weighted) |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | Frame-based | k-value 1–10 | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | File-based | k-value 1–10 | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | File-based | k-value 1–10 | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | Frame-based | 10, 0.01 | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | Frame-based | 10, 0.01 | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | File-based | 10, 1 | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | File-based | 10, 1 | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | Frame-based | 0.01–0.055 | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | File-based | 0.01–0.055 | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | File-based | 0.01–0.055 | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | File-based | 0.01–0.055 | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | File-based | 0.01–0.055 | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |


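Table 4. Computation time (in seconds) for feature extraction and classification.

Classification

| Classifier | Experiment | Parameter | Validation | MEEI, weighted features | MEEI, raw features | MAPACI, weighted features | MAPACI, raw features |
|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | Frame-based | k-value 1–10 | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | File-based | k-value 1–10 | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | File-based | k-value 1–10 | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | Frame-based | 10, 0.01 | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | Frame-based | 10, 0.01 | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | File-based | 10, 1 | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | File-based | 10, 1 | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | Frame-based | 0.01–0.055 | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | File-based | 0.01–0.055 | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | File-based | 0.01–0.055 | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | File-based | 0.01–0.055 | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | File-based | 0.01–0.055 | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI, normal | MEEI, pathological | MAPACI, normal | MAPACI, pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |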

To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features during the frame-based analysis. During the file-based analysis, a maximum classification accuracy of 92% is obtained using the raw features and 100% is obtained using the weighted features.

Figure 6(a)–(d) show the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during the frame-based and file-based analyses. Similarly, Figure 7(a)–(d) illustrate the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification under the two different experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of those works have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy as well, for both the MEEI database and the MAPACI speech pathology database.

6. Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres, and hence in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.

Acknowledgements

This work is done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and his Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals including Computers and Electrical Engineering; Computer Methods and Programs in Biomedicine; Artificial Intelligence in Biomedicine; International Journal of Phoniatrics, Speech Therapy and Communication Pathology; and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the same department with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals including Information Sciences; Pattern Recognition Letters; Digital Signal Processing; Soft Computing; Artificial Intelligence in Medicine; Applied Soft Computing; Computers in Biology and Medicine; Expert Systems; Computational Statistics & Data Analysis; IEEE Transactions on Biomedical Engineering; IEEE Transactions on Evolutionary Computation; TALANTA; Journal of Medical Systems; and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI Expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has also been a member of IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.


De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab.

Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.

Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.

Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.

Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.

Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.

Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.

MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.

MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.


Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.

Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.

Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.

Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.

Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.

Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.

Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.

Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.

Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.

Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.

Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.

Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.

Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.

Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.

Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.

Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.

Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.

Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.

Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.

Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.

Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.

Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.

Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.

Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.

Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.

Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.

Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.


Figure 1 Plots of pathological and normal voice samples

wavelet function ψ(t), shifted in time by a translation parameter and dilated by a scale parameter. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Bopardikar 2000)

$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \qquad (1)$$

In the tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases at a parent node (i, p) are given by the following forms:

$$\psi_{i+1}^{2p}(k) = \sum_{n=-\infty}^{\infty} l[n]\, \psi_{i}^{p}(k - 2^{i} n) \qquad (2)$$

where l[n] is a low-pass (scaling) filter, and

$$\psi_{i+1}^{2p+1}(k) = \sum_{n=-\infty}^{\infty} h[n]\, \psi_{i}^{p}(k - 2^{i} n) \qquad (3)$$

where h[n] is the high-pass (wavelet) filter. The wavelet packet transform (WPT) is an extension of the wavelet transform. In the WPT decomposition procedure, a signal is first decomposed into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). Wavelet packet decomposition then partitions both the lower and the higher frequency bands into smaller bands, which cannot be achieved with the general discrete wavelet transform, where only the approximation band is split further. Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.
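The balanced binary tree described above can be sketched in a few lines. This is a minimal illustration only: it swaps in the two-tap Haar filters for the ‘db4’ filters used in the study so that the example stays self-contained, and the function names `haar_split` and `wavelet_packet_subbands` are hypothetical. The input length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

def haar_split(x):
    """One analysis step: low-pass and high-pass Haar filtering, each downsampled by 2."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet_subbands(signal, levels=5):
    """Split every node at every level, so `levels` levels give 2**levels subbands."""
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)  # both children are kept and split again
            next_nodes.append(low)
            next_nodes.append(high)
        nodes = next_nodes
    return nodes

# Five levels on a 64-sample signal give 32 subbands of 2 coefficients each.
bands = wavelet_packet_subbands([float(i % 7) for i in range(64)], levels=5)
print(len(bands), len(bands[0]))
```

The subbands come out in the natural order of the tree (low child first), which is not the same as increasing frequency order; that distinction does not matter for the singular values computed later, since they do not depend on the order of the rows.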

In this work, the automatic detection of vocal fold pathology is carried out by using WPT and SVD. In frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). In frame-based analysis, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame of the signal is decomposed into five levels using WPT, which yields 32 subbands. A matrix of size 32 × M, composed of the fifth-level wavelet packet coefficients of each of the 32 wavelet packet subbands, is obtained for further processing; its rows pertain to wavelet packet subbands and its columns to wavelet packet coefficients:

$$A = \left[ C_{5}^{1}(M) \;\; C_{5}^{2}(M) \;\; \cdots \;\; C_{5}^{32}(M) \right]^{T} \qquad (4)$$
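The frame segmentation described above (40 ms frames, 50% overlap, Hamming window) can be sketched as follows. The sampling rate `fs` is an assumption of this example, since it is not stated in this section, and the helper names `hamming` and `frame_signal` are hypothetical.

```python
import math

def hamming(n):
    """Hamming window of length n, using the 0.54 - 0.46*cos form."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def frame_signal(signal, fs, frame_ms=40.0, overlap=0.5):
    """Cut `signal` into windowed frames of `frame_ms` milliseconds with the given overlap."""
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    window = hamming(frame_len)
    frames = []
    # Only complete frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames

# With an assumed fs of 1000 Hz, a 40 ms frame has 40 samples and the hop is 20 samples.
frames = frame_signal([1.0] * 100, fs=1000)
print(len(frames), len(frames[0]))
```

Each windowed frame would then be passed to the five-level wavelet packet decomposition to build the 32 × M matrix of Equation (4).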


Figure 2 Overall diagram of the proposed system

To derive a simple and effective feature vector, singular value decomposition is applied. In file-based analysis, each voice sample is decomposed into five levels using WPT, and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the ‘db4’ wavelet is used in our study. The order was chosen to be low to model the transients and rapid variations in a signal efficiently.

3.1. Singular value decomposition

The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix A (an n × p matrix) of wavelet packet coefficients into a much smaller, invertible, square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that the given n × p matrix is decomposed as

$$A = U D V^{T} \qquad (5)$$

The columns of U are orthonormal eigenvectors of AA^T and are called left-singular vectors. The columns of V are orthonormal eigenvectors of A^T A and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix. In this work, 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
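The link between the singular values and the eigenvectors mentioned above follows in one step from the orthonormality of U and V. Writing the decomposition of the coefficient matrix as A = U D V^T, with singular values denoted here by s_i (a symbol introduced only for this illustration):

$$A A^{T} = U D V^{T} V D^{T} U^{T} = U \left( D D^{T} \right) U^{T}, \qquad A^{T} A = V D^{T} U^{T} U D V^{T} = V \left( D^{T} D \right) V^{T}$$

$$s_{i} = \sqrt{\lambda_{i}\!\left( A A^{T} \right)}, \qquad i = 1, \ldots, \min(n, p)$$

So the columns of U diagonalise AA^T, the columns of V diagonalise A^T A, and each singular value is the square root of the corresponding eigenvalue. For the 32 × M coefficient matrix with M ≥ 32, AA^T is 32 × 32, which is why at most 32 singular values are available as features.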

3.2. k-Means clustering based feature weighting

In many practical machine learning applications, the performance of the learned models degrades gracefully due to irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method: they transform the extracted non-linearly separable features into linearly separable features and map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are used not only to study the similarity or dissimilarity of the features but are also useful for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as a feature weighting method since it is simple and widely implemented in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering is used to partition n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set x_i, i = 1, 2, 3, …, n, the k-means clustering algorithm finds the cluster centres c_i and the membership matrix U iteratively using the following four steps.

Step 1. Randomly initialise the cluster centres c_i, i = 1, 2, 3, …, c.

Step 2. Determine the membership matrix using Equation (6):

$$u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^{2} \le \|x_j - c_k\|^{2} \text{ for each } k \ne i \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

Step 3. Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \; \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^{2} \qquad (7)$$


Figure 3 Working of k-means clustering based feature weighting method

Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4. Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \qquad (8)$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.
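The four steps above map directly onto a short loop. The sketch below is one possible reading, not the authors' MATLAB implementation: the initial centres are drawn from the data points themselves, the stopping threshold `tol` and the iteration cap `max_iter` are assumed values, and the names `sq_dist` and `kmeans` are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans(points, c, max_iter=100, tol=1e-9, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial centres (here: c distinct data points).
    centres = [list(p) for p in rng.sample(points, c)]
    prev_cost = float("inf")
    labels, cost = [], prev_cost
    for _ in range(max_iter):
        # Step 2: hard membership, each point goes to its nearest centre (Equation 6).
        labels = [min(range(c), key=lambda i: sq_dist(x, centres[i])) for x in points]
        # Step 3: cost is the sum of squared distances to the assigned centres (Equation 7).
        cost = sum(sq_dist(x, centres[l]) for x, l in zip(points, labels))
        if cost < tol or prev_cost - cost < tol:
            break
        prev_cost = cost
        # Step 4: each centre becomes the mean of its members (Equation 8).
        for i in range(c):
            members = [x for x, l in zip(points, labels) if l == i]
            if members:  # an empty cluster keeps its previous centre
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels, cost

points = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
centres, labels, cost = kmeans(points, 2)
print(sorted(round(ctr[0], 3) for ctr in centres))
```

Because the result depends on the initial centres, running the function with several values of `seed` and keeping the run with the lowest cost is a simple way to reduce that sensitivity.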

After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
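The weighting step can be illustrated with a small sketch. The text does not say how several cluster centres are reduced to one centre value per feature, so this example assumes the centre of feature j is the average of the j-th coordinates of all cluster centres; that reduction, and the name `weight_features`, are assumptions of this illustration rather than the authors' stated procedure.

```python
def weight_features(samples, centres):
    """Multiply each feature by (mean of that feature) / (centre of that feature)."""
    n_features = len(samples[0])
    ratios = []
    for j in range(n_features):
        mean_j = sum(row[j] for row in samples) / len(samples)
        # Assumption: one centre value per feature, averaged over the cluster centres.
        centre_j = sum(ctr[j] for ctr in centres) / len(centres)
        ratios.append(mean_j / centre_j if centre_j != 0 else 1.0)
    weighted = [[v * r for v, r in zip(row, ratios)] for row in samples]
    return weighted, ratios

samples = [[1.0, 10.0], [3.0, 30.0]]
centres = [[1.0, 10.0], [4.0, 20.0]]  # hypothetical centres from a k-means run
weighted, ratios = weight_features(samples, centres)
print(ratios)
```

The centres passed in would come from a k-means run on the raw SVD features; the weighted feature vectors are what the classifiers receive.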

Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.

4. Classifiers

In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and the k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.

4.1. k-NN classifier

In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork

Figure 4 (a) Class distribution of raw SVD features (MEEI database frame-based) according to the first two features (Feature 1 andFeature 2) (b) Class distribution of weighted SVD features (MEEI database frame-based) according to the first two features (Feature 1and Feature 2)


Figure 5 (a) Class distribution of raw SVD features (MAPACI speech pathology frame-based) according to the first two features (Feature1 and Feature 2) (b) Comparison of results (frame-based MEEI database raw features)

2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is determined by the classes of its k nearest neighbours.
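The majority vote described above fits in a few lines. The sketch assumes Euclidean distance, which this section does not specify, and the function name `knn_predict` and the toy feature vectors are hypothetical.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Return the most common label among the k training vectors nearest to `query`."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], query))
    nearest = sorted(range(len(train_x)), key=sq_dist)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["normal", "normal", "normal", "pathological", "pathological", "pathological"]
print(knn_predict(train_x, train_y, [0.5, 0.5], k=3))
```

With an even k, ties are possible; here a tie is resolved in favour of the label that appears first among the nearest neighbours, which is one of several reasonable conventions.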

4.2. Probabilistic neural network

Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four kinds of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to a training input. The summation unit sums these contributions for each class of inputs and produces a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends highly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
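The pattern, summation and output units described above can be sketched as follows. A Gaussian kernel with spread `sigma` is assumed for the exponential activation, and each class sum is divided by its class size; both choices, and the name `pnn_predict`, are assumptions of this illustration and not necessarily what the MATLAB routines used in the study do internally.

```python
import math

def pnn_predict(train_x, train_y, query, sigma=0.05):
    """Return (winning label, per-class scores) for one query vector."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        # Pattern unit: exponential of the negative scaled squared distance.
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        activation = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation unit: accumulate the activations class by class.
        sums[y] = sums.get(y, 0.0) + activation
        counts[y] = counts.get(y, 0) + 1
    scores = {y: sums[y] / counts[y] for y in sums}
    # Output unit: the class with the largest score wins (the "compete" step).
    return max(scores, key=scores.get), scores

train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
train_y = ["normal", "normal", "pathological", "pathological"]
label, scores = pnn_predict(train_x, train_y, [0.2, 0.3], sigma=0.5)
print(label)
```

With spreads as small as 0.01 to 0.055, the exponential underflows to zero unless the features are scaled to a comparably small range, so some form of feature normalisation is implicitly assumed.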

4.3. General regression neural network

GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than a multilayer perceptron (MLP), is more accurate than MLP and is relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer,

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | (data) | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| Frame-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| Frame-based | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | (data) | 173 pathological + 53 normal | 24 pathological + 24 normal |
| File-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| File-based | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |


Figure 6 Comparison of results (a) frame-based MEEI database raw features (b) file-based MEEI database raw features (c) frame-based MAPACI speech pathology database raw features and (d) file-based MAPACI speech pathology database raw features

summation layer and output layer. The performance of the GRNN classifier depends highly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
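The one-pass nature of the GRNN is easiest to see in code: there is no training loop, only a kernel-weighted average of the stored targets. The sketch assumes a Gaussian kernel with spread `sigma` and class targets coded as 0 (normal) and 1 (pathological); the coding, the 0.5 decision threshold and the name `grnn_predict` are assumptions of this example.

```python
import math

def grnn_predict(train_x, train_y, query, sigma=0.05):
    """Kernel-weighted average of the training targets (Parzen-window regression)."""
    numerator = 0.0
    denominator = 0.0
    for x, y in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weight = math.exp(-d2 / (2.0 * sigma ** 2))  # pattern layer
        numerator += weight * y                      # summation layer, weighted targets
        denominator += weight                        # summation layer, plain weights
    if denominator == 0.0:
        # Every kernel underflowed: fall back to the mean target.
        return sum(train_y) / len(train_y)
    return numerator / denominator                   # output layer

train_x = [[0.0], [1.0]]
train_y = [0.0, 1.0]
estimate = grnn_predict(train_x, train_y, [0.5], sigma=0.5)
print(estimate, "pathological" if estimate >= 0.5 else "normal")
```

Since the output is continuous, a threshold is needed to turn it into a class decision; the midpoint between the two target codes is the natural choice for balanced classes.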

4.4. Support vector machine

In this work, SVM is used as a classifier; it is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples within two classes. It maps training samples of two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space to maximise its distance from the closest training point. While testing, a query point is categorised according to the distance between the point and the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation and low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it causes changes in the shape and flexion of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter 2010) is used to perform classification of normal and pathological speech samples. Two parameters must be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
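To show where γ (gam) and σ² (sig2) enter, the sketch below trains a least squares SVM classifier by solving its linear system directly. It is not the LS-SVMlab code: the kernel convention exp(−‖x − z‖²/σ²) is an assumption (other scalings of σ² are in use), the labels are coded as −1 and +1, and the names `rbf`, `solve`, `lssvm_train` and `lssvm_predict` are hypothetical.

```python
import math

def rbf(x, z, sig2):
    """RBF kernel; the exact scaling of sig2 is an assumed convention."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sig2)

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lssvm_train(xs, ys, gam, sig2):
    """Solve [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1] for b and alpha."""
    n = len(xs)
    a = [[0.0] + [float(y) for y in ys]]
    for i in range(n):
        row = [float(ys[i])]
        for j in range(n):
            omega = ys[i] * ys[j] * rbf(xs[i], xs[j], sig2)
            row.append(omega + (1.0 / gam if i == j else 0.0))
        a.append(row)
    sol = solve(a, [0.0] + [1.0] * n)
    return sol[0], sol[1:]

def lssvm_predict(xs, ys, b, alphas, query, sig2):
    s = b + sum(al * y * rbf(x, query, sig2) for al, y, x in zip(alphas, ys, xs))
    return 1 if s >= 0 else -1

xs = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]]
ys = [-1, -1, 1, 1]  # -1 = normal, +1 = pathological (assumed coding)
b, alphas = lssvm_train(xs, ys, gam=10.0, sig2=1.0)
print(lssvm_predict(xs, ys, b, alphas, [0.0, 0.5], sig2=1.0))
```

A larger γ penalises training errors more heavily, while a smaller σ² makes the decision boundary more flexible; this is why the two values have to be tuned together.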

5. Results and discussions

Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify the voice samples as normal or pathological. k-means


Figure 7 Comparison of results (a) Frame-based MEEI database weighted features (b) file-based MEEI database weighted features(c) Frame-based MAPACI speech pathology database weighted features and (d) file-based MAPACI speech pathology database weightedfeatures

clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes and the classification accuracy as well. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to demonstrate the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.
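The two validation schemes can be sketched as index splits. The sketch assumes ConV is a single random 70/30 split and CrossV is 10-fold cross-validation, as Table 1 describes; the shuffling seed and the names `conventional_split` and `k_fold_indices` are hypothetical.

```python
import random

def conventional_split(n, train_fraction=0.7, seed=0):
    """ConV: one random split of n sample indices into training and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]

def k_fold_indices(n, k=10, seed=0):
    """CrossV: k (train, test) pairs; every index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    pairs = []
    for i in range(k):
        train = [j for f in range(k) if f != i for j in folds[f]]
        pairs.append((train, folds[i]))
    return pairs

# 1557 + 1537 = 3094 MEEI frames reproduce the 2166 / 928 split of Table 1.
train, test = conventional_split(1557 + 1537)
print(len(train), len(test))
```

For cross-validation, the reported mean and standard deviation would be taken over the ten test folds.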

To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from the counts of true positives (TP: the classifier outputs pathological when pathological samples are present), true negatives (TN: the classifier outputs normal when normal samples are present), false positives (FP: the classifier outputs pathological when normal samples are present) and false negatives (FN: the classifier outputs normal when pathological samples are present). These measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \qquad (9)$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \qquad (10)$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (11)$$
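Equations (9) to (11) translate directly into code. In this sketch the pathological class is treated as the positive class, as in the definitions of TP and FN above; the label strings and the name `confusion_metrics` are hypothetical.

```python
def confusion_metrics(y_true, y_pred, positive="pathological"):
    """Return (sensitivity, specificity, accuracy) as fractions between 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # Equation (9)
    specificity = tn / (tn + fp) if tn + fp else 0.0   # Equation (10)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0  # Equation (11)
    return sensitivity, specificity, accuracy

y_true = ["pathological", "pathological", "pathological", "normal", "normal"]
y_pred = ["pathological", "pathological", "normal", "normal", "pathological"]
print(confusion_metrics(y_true, y_pred))
```

Multiplying the returned fractions by 100 gives percentages in the form used in the result tables.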

In the k-NN classifier, different values of ‘k’ between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen optimally during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of frame-based and file-based analysis using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of file-based analysis is always better than that of the frame-based analysis. Maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. Maximum classification accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features during the frame-based analysis.


Table 2. Comparison of results for the MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation, in %.

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k = 1–10 | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | k = 1–10 | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | k = 1–10 | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | k = 1–10 | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | 10 00001* | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | 10 00001* | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | 10 01* | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | 10 01* | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | σ = 0.01–0.055 | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | σ = 0.01–0.055 | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | σ = 0.01–0.055 | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | σ = 0.01–0.055 | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | σ = 0.01–0.055 | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | σ = 0.01–0.055 | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | σ = 0.01–0.055 | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | σ = 0.01–0.055 | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |

* LS-SVM [γ, σ²] settings are reproduced digit for digit as extracted; their separators and decimal points were lost.

Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation, in %.

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k = 1–10 | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | k = 1–10 | Frame-based | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | k = 1–10 | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | k = 1–10 | File-based | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | 10 001* | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | 10 001* | Frame-based | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | 10 1* | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | 10 1* | File-based | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | σ = 0.01–0.055 | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | σ = 0.01–0.055 | Frame-based | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | σ = 0.01–0.055 | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | σ = 0.01–0.055 | File-based | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | σ = 0.01–0.055 | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | σ = 0.01–0.055 | Frame-based | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | σ = 0.01–0.055 | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | σ = 0.01–0.055 | File-based | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |

* LS-SVM [γ, σ²] settings are reproduced digit for digit as extracted; their separators and decimal points were lost.


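Table 4. Computation time for feature extraction and classification (average time in seconds).

Classification

| Classifier | Parameter | Experiment | Validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features |
|---|---|---|---|---|---|---|---|
| k-NN | k = 1–10 | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | k = 1–10 | Frame-based | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | k = 1–10 | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | k = 1–10 | File-based | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | 10 001* | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | 10 001* | Frame-based | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | 10 1* | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | 10 1* | File-based | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | σ = 0.01–0.055 | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | σ = 0.01–0.055 | Frame-based | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | σ = 0.01–0.055 | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | σ = 0.01–0.055 | File-based | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | σ = 0.01–0.055 | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | σ = 0.01–0.055 | Frame-based | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | σ = 0.01–0.055 | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | σ = 0.01–0.055 | File-based | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |

* LS-SVM [γ, σ²] settings are reproduced digit for digit as extracted; their separators and decimal points were lost.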

To prove the robustness and database independency of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that maximum accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features during the frame-based analysis. During the file-based analysis, we obtained maximum classification accuracy of 92% using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better compared to raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed under the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under two different experiments (frame-based and file-based). Direct comparison of our work with previous works in the literature cannot be performed, since most of them have not presented the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend highly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy as well for both the MEEI database and the MAPACI speech pathology database.

6. Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. k-means

Dow

nloa

ded

by [

Nor

th D

akot

a St

ate

Uni

vers

ity]

at 1

739

22

Nov

embe

r 20

14

1632 M Hariharan et al

clustering based feature weighting approach can increasethe distinguishing performance of raw WPTndashSVD featuresThe experimental results show that the proposed weightedfeatures give very promising classification accuracy of100 for both MEEI voice disorder database and MAPACIspeech pathology database The proposed features can beused as additional acoustic indicators for researchers andspeech pathologists to detect the vocal fold pathology Theperformance of the k-means clustering algorithm is depend-ing on the initial positions of the clusters and hence in ourwork it was run for several times to obtain better clus-ter centres In the future work k-means clustering basedfeature weighting method could be improved by using ef-ficient initialisation of methods for better initial placementof the cluster centres and different kernel functions Theproposed method could be extended to detect the specifictype of disorders and also to develop an online diagnosingsystem

Acknowledgements

This work is done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from the Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Soft Computing, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.


De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab/.

Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.

Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.

Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.

Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.

Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.

Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.

MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.

MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.


Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.

Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.

Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.

Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.

Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.

Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.

Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.

Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.

Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.

Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.

Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.

Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.

Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.

Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.

Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.

Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.

Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.

Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.

Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.

Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.

Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.

Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.

Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.

Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.

Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.

Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.

Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.


Figure 2. Overall diagram of the proposed system.

To derive a simple and effective feature vector, singular value decomposition (SVD) is applied. In file-based analysis, each voice sample is decomposed into five levels using WPT, and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the ‘db4’ wavelet is used in our study. The order was chosen to be low in order to model the transients and rapid variations in a signal efficiently.

3.1. Singular value decomposition

The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix $A$ (an $n \times p$ matrix) of wavelet packet coefficients into a much smaller, invertible and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that the given $n \times p$ matrix is decomposed as

$$A = U D V^{T}. \tag{5}$$

The columns of $U$ are orthonormal eigenvectors of $AA^{T}$ and are called left-singular vectors. The columns of $V$ are orthonormal eigenvectors of $A^{T}A$ and are called right-singular vectors. The diagonal entries of $D$ are called the singular values of the $n \times p$ matrix $A$. In this work, 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
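As a rough illustration of how singular values can serve as a compact feature vector, the following Python sketch uses NumPy on a random stand-in matrix. The function name `svd_features`, the matrix shape (32 rows, one per terminal node of a five-level decomposition, by 64 coefficients) and the random data are assumptions made for illustration only; they are not taken from the authors' MATLAB implementation.

```python
import numpy as np

def svd_features(coeff_matrix, n_values=32):
    """Return the leading singular values of a coefficient matrix (hypothetical helper)."""
    # compute_uv=False returns only the singular values, sorted in descending order.
    s = np.linalg.svd(np.asarray(coeff_matrix, dtype=float), compute_uv=False)
    return s[:n_values]

# Stand-in for a matrix of wavelet packet coefficients (assumed shape 32 x 64).
rng = np.random.default_rng(0)
A = rng.standard_normal((32, 64))
features = svd_features(A)

# The squared singular values equal the eigenvalues of A A^T,
# which matches the description of U as the eigenvectors of A A^T.
eigvals = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
```

The last two lines only check the relationship between the singular values and the eigenvalues of $AA^{T}$; they are not needed for feature extraction.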

3.2. k-Means clustering based feature weighting

In many practical machine learning applications, the performance of the learned models degrades owing to irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method: they transform the extracted non-linearly separable features into linearly separable features and map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are useful not only for studying the similarity or dissimilarity of the features but also for compressing and reducing the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as the feature weighting method since it is simple and widely implemented in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering partitions n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set $x_j$, $j = 1, 2, 3, \ldots, n$, the k-means clustering algorithm finds the cluster centres $c_i$ and the membership matrix $U$ iteratively using the following four steps.

Step 1: Randomly initialise the cluster centres $c_i$, $i = 1, 2, 3, \ldots, c$.

Step 2: Determine the membership matrix using Equation (6):

$$u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^2 \le \|x_j - c_k\|^2 \text{ for each } k \ne i, \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^2. \tag{7}$$

Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k, \tag{8}$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.

Figure 3. Working of k-means clustering based feature weighting method.
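The four steps above can be sketched in Python with NumPy as follows. This is a minimal illustration, not the authors' code: the function name `kmeans`, the choice of random data points as initial centres, the threshold `tol` and the synthetic two-blob data are all assumptions.

```python
import numpy as np

def kmeans(X, c, n_iter=100, tol=1e-6, seed=0):
    """Plain k-means following Steps 1-4 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Step 1: pick c distinct data points as the initial centres (assumed strategy).
    centres = X[rng.choice(len(X), size=c, replace=False)]
    prev_cost = np.inf
    for _ in range(n_iter):
        # Step 2: assign each point to its nearest centre (Equation (6)).
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 3: cost J (Equation (7)); stop if J or its improvement is small.
        cost = d2[np.arange(len(X)), labels].sum()
        if cost < tol or prev_cost - cost < tol:
            break
        prev_cost = cost
        # Step 4: move each centre to the mean of its group (Equation (8)).
        for i in range(c):
            members = X[labels == i]
            if len(members):  # keep the old centre if a cluster is empty
                centres[i] = members.mean(axis=0)
    return centres, labels, cost

# Two synthetic, well-separated groups of points (illustrative data only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
centres, labels, cost = kmeans(X, c=2)
```

Because the result depends on the initial centres, the function could be called with several values of `seed` and the run with the lowest cost kept, which mirrors running the algorithm several times.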

After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
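The weighting step is described only briefly, so the following Python sketch shows one possible reading of it rather than the authors' exact procedure. It assumes that one ratio is formed per feature, as the mean of that feature over all samples divided by the mean of the cluster centres for that feature, and that each feature column is then multiplied by its ratio. The function name `weight_features` and the small example arrays are hypothetical.

```python
import numpy as np

def weight_features(X, centres):
    """Scale each feature by (feature mean / mean of its cluster centres).

    Assumed interpretation of the weighting step; centre means must be non-zero.
    """
    X = np.asarray(X, dtype=float)
    centres = np.asarray(centres, dtype=float)
    ratios = X.mean(axis=0) / centres.mean(axis=0)
    return X * ratios, ratios

# Tiny illustrative data: three samples with two features each.
X = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]])
centres = np.array([[2.0, 10.0], [4.0, 40.0]])  # hypothetical cluster centres
Xw, ratios = weight_features(X, centres)
```

Here the feature means are (3, 20) and the centre means are (3, 25), so the ratios are 1.0 and 0.8. Other readings, such as using the centre of the cluster each sample belongs to, are equally compatible with the text.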

Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figures 4(a) and (b) show the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figures 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.

4. Classifiers

In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and the k-nearest neighbour classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.

4.1. k-NN classifier

In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork 2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is thus determined by the classes of its k nearest neighbours.

Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).

Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2).
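The majority-vote rule of the k-NN classifier can be sketched in a few lines of Python. The function name `knn_predict`, the use of the Euclidean distance and the toy training points are assumptions for illustration; the paper does not state which distance measure was used.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by a majority vote among its k nearest training vectors."""
    X_train = np.asarray(X_train, dtype=float)
    # Euclidean distance from x to every training vector (assumed metric).
    d = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(d)[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy two-class training set (illustrative values only).
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = ["normal"] * 3 + ["pathological"] * 3
label_a = knn_predict(X_train, y_train, [0.5, 0.5], k=3)
label_b = knn_predict(X_train, y_train, [5.5, 5.5], k=3)
```

In the experiments reported later, k is varied between 1 and 10; the same function could simply be called in a loop over those values.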

4.2. Probabilistic neural network

Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern units compute the distances from the input vector to the training input vectors and produce a vector whose elements indicate how close the input is to each training input. The summation units sum these contributions for each class of inputs and produce a net output that is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends heavily on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
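A compact way to see the roles of the pattern, summation and output units is the following NumPy sketch. It assumes a Gaussian activation with spread σ in the pattern units and a per-class average in the summation units; the function name `pnn_predict` and the one-dimensional toy data are hypothetical and do not reproduce the MATLAB implementation used in the paper.

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma):
    """Minimal probabilistic neural network (assumed Gaussian pattern units)."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    # Pattern units: squared distances to every training vector, then an
    # exponential activation controlled by the spread factor sigma.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    act = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation units: average activation per class.
    scores = np.stack([act[:, y_train == c].mean(axis=1) for c in classes], axis=1)
    # Output units: winner-takes-all over the class scores.
    return classes[scores.argmax(axis=1)]

# Toy one-dimensional data with small feature values (illustrative only).
X_train = np.array([[0.00], [0.01], [0.02], [0.10], [0.11], [0.12]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.015], [0.105]])

# Spread factors from 0.01 to 0.055 in steps of 0.005, as in the text.
sigmas = np.linspace(0.01, 0.055, 10)
predictions = [pnn_predict(X_train, y_train, X_test, s) for s in sigmas]
```

With very small σ and widely spaced features, the exponentials can underflow to zero, so feature scaling matters in practice.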

4.3. General regression neural network

GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than the multilayer perceptron (MLP), is more accurate than the MLP and is relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer, summation layer and output layer. The performance of the GRNN classifier depends heavily on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | (samples) | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | (samples) | 173 pathological + 53 normal | 24 pathological + 24 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |

Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
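The one-pass nature of the GRNN described in Section 4.3 can be illustrated with a short NumPy sketch. It assumes the usual kernel-weighted average form with a Gaussian kernel of spread σ, and it thresholds the continuous output at 0.5 to obtain a class label; the function name `grnn_predict`, the threshold and the toy data are assumptions, not details taken from the paper.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_test, sigma):
    """Minimal GRNN: kernel-weighted average of the training targets."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Pattern layer: Gaussian weight of every training vector for each test vector.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation and output layers: weighted sum of targets divided by sum of weights.
    denom = np.maximum(w.sum(axis=1), np.finfo(float).tiny)  # guard against underflow
    return (w @ y_train) / denom

# Toy data: targets 0 (normal) and 1 (pathological), illustrative values only.
X_train = np.array([[0.00], [0.01], [0.02], [0.10], [0.11], [0.12]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.015], [0.105]])
outputs = grnn_predict(X_train, y_train, X_test, sigma=0.01)
labels = (outputs >= 0.5).astype(int)  # assumed decision threshold
```

No iterative training takes place: the training vectors are stored and used directly at prediction time, which is why only σ needs to be tuned.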

4.4. Support vector machine

In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM then seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to the distance between the point and the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to classify normal and pathological speech samples. Two parameters must be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
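To make the roles of γ (gam) and σ² (sig2) concrete, the following NumPy sketch solves the standard least squares SVM classifier as a single linear system. It is not the LS-SVMlab toolbox: the function names, the kernel convention $\exp(-\|x - z\|^2/\sigma^2)$ and the toy data are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sig2):
    """RBF kernel matrix; the scaling exp(-d^2 / sig2) is an assumed convention."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sig2)

def lssvm_train(X, y, gam, sig2):
    """Solve the LS-SVM classifier linear system for the bias b and multipliers alpha."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sig2)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = omega + np.eye(n) / gam  # gam controls the regularisation
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, y, b, alpha, X_test, sig2):
    """Sign of the kernel expansion gives the predicted class (-1 or +1)."""
    K = rbf_kernel(X_test, X_train, sig2)
    return np.sign(K @ (alpha * y) + b)

# Toy data with labels -1 (normal) and +1 (pathological), illustrative only.
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1], dtype=float)
b, alpha = lssvm_train(X, y, gam=10.0, sig2=1.0)
pred = lssvm_predict(X, y, b, alpha, np.array([[0.5, 0.5], [3.5, 3.5]]), sig2=1.0)
```

A larger `gam` penalises training errors more strongly, while `sig2` sets the width of the kernel; in practice both would be tuned, for example by a grid search with cross-validation.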

5. Results and discussions

Much research has already been done in the area of automatic detection of voice pathology. In this work, a new feature vector based on WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to demonstrate the reliability of the classification results. Details of the training and testing sets used in this work are given in Table 1.

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
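The two validation schemes listed in Table 1 (a 70%/30% split for ConV and ten random sets for CrossV) can be sketched as index generators in Python. The function names `conv_split` and `crossv_folds`, the rounding rule and the random seeds are assumptions for illustration.

```python
import numpy as np

def conv_split(n, train_frac=0.7, seed=0):
    """Random train/test split of n sample indices (ConV-style, assumed rounding)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(round(train_frac * n))
    return idx[:n_train], idx[n_train:]

def crossv_folds(n, n_folds=10, seed=0):
    """Yield (train, test) index arrays for n_folds random folds (CrossV-style)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    for i in range(n_folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

# 3094 frame-based MEEI feature vectors (1557 pathological + 1537 normal).
train_idx, test_idx = conv_split(3094)
# 48 file-based MAPACI samples divided into 10 sets.
folds = list(crossv_folds(48))
```

With this rounding rule, 3094 feature vectors give 2166 training and 928 testing vectors, which agrees with the frame-based MEEI entry in Table 1.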

To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from the following counts: true positive (TP: the classifier outputs pathological when a pathological sample is present), true negative (TN: the classifier outputs normal when a normal sample is present), false positive (FP: the classifier outputs pathological when a normal sample is present) and false negative (FN: the classifier outputs normal when a pathological sample is present). The measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
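As an illustration of Equations (9) to (11), the short Python sketch below counts TP, TN, FP and FN from true and predicted labels and returns the three measures. The function name `classification_measures` and the label coding (1 for pathological, anything else for normal) are assumptions made for this example, and both classes are assumed to be present so that no denominator is zero.

```python
def classification_measures(y_true, y_pred, positive=1):
    """Return (SE, SP, ACC); `positive` marks the pathological class (assumed coding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    se = tp / (tp + fn)                    # Equation (9)
    sp = tn / (tn + fp)                    # Equation (10)
    acc = (tp + tn) / (tp + tn + fp + fn)  # Equation (11)
    return se, sp, acc
```

For example, with three pathological and two normal samples, of which one of each is misclassified, the counts are TP = 2, FN = 1, TN = 1 and FP = 1, giving SE = 2/3, SP = 1/2 and ACC = 3/5.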

In the k-NN classifier, different values of 'k' between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the LS-SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity, and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of file-based analysis is always better than that of frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN, and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% using the weighted features.


Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%).

Classifier  Experiment   Validation  Raw SE         Raw SP         Raw ACC        Weighted SE    Weighted SP    Weighted ACC
k-NN        Frame-based  CrossV      96.66 ± 0.55   91.69 ± 1.09   94.05 ± 0.83   99.81 ± 0.06   98.27 ± 0.34   99.03 ± 0.19
k-NN        Frame-based  ConV        95.96 ± 0.71   90.87 ± 1.54   93.27 ± 1.16   99.81 ± 0.06   97.98 ± 0.33   98.88 ± 0.15
k-NN        File-based   CrossV      99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19   99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19
k-NN        File-based   ConV        99.74 ± 0.25   100 ± 0.00     99.79 ± 0.20   100 ± 0.00     100 ± 0.00     100 ± 0.00
LS-SVM      Frame-based  CrossV      95.10 ± 0.21   95.41 ± 0.20   95.25 ± 0.12   99.75 ± 0.03   99.45 ± 0.07   99.60 ± 0.04
LS-SVM      Frame-based  ConV        94.92 ± 0.90   94.49 ± 1.05   94.70 ± 0.81   99.34 ± 0.29   99.46 ± 0.34   99.40 ± 0.26
LS-SVM      File-based   CrossV      98.92 ± 0.62   99.81 ± 0.60   99.12 ± 0.47   99.32 ± 0.36   100 ± 0.00     99.47 ± 0.28
LS-SVM      File-based   ConV        100 ± 0.00     98.87 ± 1.30   99.12 ± 1.03   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
PNN         Frame-based  CrossV      97.43 ± 0.26   93.01 ± 0.17   95.12 ± 0.17   99.88 ± 0.05   98.69 ± 0.16   99.28 ± 0.07
PNN         Frame-based  ConV        96.79 ± 0.24   92.44 ± 0.33   94.51 ± 0.24   99.81 ± 0.06   98.41 ± 0.21   99.10 ± 0.11
PNN         File-based   CrossV      97.44 ± 5.39   100 ± 0.00     97.74 ± 4.97   97.50 ± 5.37   100 ± 0.00     97.79 ± 4.95
PNN         File-based   ConV        97.09 ± 6.10   100 ± 0.00     97.37 ± 5.79   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
GRNN        Frame-based  CrossV      93.01 ± 3.89   88.08 ± 4.98   90.15 ± 4.31   99.50 ± 0.41   97.53 ± 1.30   98.27 ± 0.89
GRNN        Frame-based  ConV        92.52 ± 3.78   87.07 ± 5.11   89.34 ± 4.35   99.39 ± 0.55   97.35 ± 1.16   98.10 ± 0.91
GRNN        File-based   CrossV      97.27 ± 5.99   100 ± 0.00     96.33 ± 8.00   98.07 ± 4.82   100 ± 0.00     97.92 ± 5.11
GRNN        File-based   ConV        96.69 ± 6.75   100 ± 0.00     95.81 ± 9.05   96.77 ± 6.23   100 ± 0.00     95.81 ± 8.15

Parameters: k-NN, k-value 1–10; PNN and GRNN, spread factor 0.01–0.055; LS-SVM, entries read "10 000 01" (frame-based) and "10 01" (file-based) in the source text, with the separators lost.

Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%).

Classifier  Experiment   Validation  Raw SE         Raw SP         Raw ACC        Weighted SE    Weighted SP    Weighted ACC
k-NN        Frame-based  CrossV      94.49 ± 0.36   96.58 ± 0.39   95.51 ± 0.21   98.67 ± 0.23   98.76 ± 0.25   98.71 ± 0.23
k-NN        Frame-based  ConV        94.20 ± 0.40   96.29 ± 0.33   95.21 ± 0.32   100 ± 0.00     100 ± 0.00     100 ± 0.00
k-NN        File-based   CrossV      77.10 ± 7.84   66.92 ± 3.30   70.42 ± 3.65   100 ± 0.00     100 ± 0.00     98.33 ± 1.64
k-NN        File-based   ConV        77.56 ± 7.11   67.51 ± 2.48   69.71 ± 2.70   99.88 ± 0.40   99.88 ± 0.40   97.50 ± 2.16
LS-SVM      Frame-based  CrossV      95.50 ± 0.18   97.28 ± 0.24   96.37 ± 0.16   99.04 ± 0.08   99.04 ± 0.08   99.24 ± 0.06
LS-SVM      Frame-based  ConV        96.86 ± 0.81   95.24 ± 0.67   96.03 ± 0.40   98.71 ± 0.68   98.71 ± 0.68   99.02 ± 0.41
LS-SVM      File-based   CrossV      79.31 ± 3.33   68.87 ± 2.40   72.92 ± 2.60   100 ± 0.00     100 ± 0.00     95.83 ± 0.00
LS-SVM      File-based   ConV        74.53 ± 7.82   84.00 ± 13.01  77.14 ± 8.78   100 ± 0.00     100 ± 0.00     96.43 ± 3.76
PNN         Frame-based  CrossV      94.88 ± 0.20   96.23 ± 0.26   95.55 ± 0.21   99.82 ± 0.07   99.82 ± 0.07   99.91 ± 0.04
PNN         Frame-based  ConV        94.52 ± 0.31   96.03 ± 0.35   95.26 ± 0.25   99.76 ± 0.07   99.76 ± 0.07   99.88 ± 0.03
PNN         File-based   CrossV      72.30 ± 3.96   70.53 ± 3.93   70.83 ± 1.70   98.57 ± 4.52   98.57 ± 4.52   99.17 ± 2.63
PNN         File-based   ConV        69.02 ± 5.33   69.18 ± 4.10   67.93 ± 3.86   98.60 ± 2.73   98.60 ± 2.73   99.00 ± 1.72
GRNN        Frame-based  CrossV      90.41 ± 3.78   91.79 ± 3.35   90.98 ± 3.45   99.59 ± 0.29   99.59 ± 0.29   99.71 ± 0.23
GRNN        Frame-based  ConV        90.02 ± 3.61   91.78 ± 3.16   90.72 ± 3.24   99.53 ± 0.30   99.53 ± 0.30   99.66 ± 0.24
GRNN        File-based   CrossV      69.43 ± 8.51   70.75 ± 4.86   66.67 ± 7.73   97.13 ± 6.40   97.13 ± 6.40   93.75 ± 9.27
GRNN        File-based   ConV        62.05 ± 9.40   64.51 ± 3.83   59.57 ± 8.50   97.38 ± 5.68   97.38 ± 5.68   93.07 ± 9.91

Parameters: k-NN, k-value 1–10; PNN and GRNN, spread factor 0.01–0.055; LS-SVM, 10, 0.01 (frame-based) and 10, 1 (file-based).


Table 4. Computation time for feature extraction and classification (average time in seconds).

Classification

Classifier  Parameters   Experiment   Validation  MEEI weighted  MEEI raw  MAPACI weighted  MAPACI raw
k-NN        k = 1–10     Frame-based  CrossV      0.338          0.341     0.342            0.346
k-NN        k = 1–10     Frame-based  ConV        3.806          3.821     6.931            6.955
k-NN        k = 1–10     File-based   CrossV      0.036          0.038     0.036            0.036
k-NN        k = 1–10     File-based   ConV        1.759          1.754     3.001            3.206
LS-SVM      10, 0.01     Frame-based  CrossV      1.139          1.175     1.191            1.214
LS-SVM      10, 0.01     Frame-based  ConV        0.718          0.698     0.705            0.723
LS-SVM      10, 1        File-based   CrossV      0.005          0.005     0.002            0.002
LS-SVM      10, 1        File-based   ConV        0.006          0.006     0.004            0.004
PNN         0.01–0.055   Frame-based  CrossV      2.458          3.084     2.445            2.421
PNN         0.01–0.055   Frame-based  ConV        10.340         10.356    11.180           11.202
PNN         0.01–0.055   File-based   CrossV      1.078          1.124     1.093            1.062
PNN         0.01–0.055   File-based   ConV        4.250          4.266     4.072            4.047
GRNN        0.01–0.055   Frame-based  CrossV      2.587          2.571     2.676            2.582
GRNN        0.01–0.055   Frame-based  ConV        10.726         10.648    11.634           11.584
GRNN        0.01–0.055   File-based   CrossV      1.098          1.056     1.065            1.062
GRNN        0.01–0.055   File-based   ConV        4.302          4.260     4.049            4.066

Feature extraction

Experiment   MEEI normal  MEEI pathological  MAPACI normal  MAPACI pathological
File-based   0.549        0.034              0.109          0.130
Frame-based  0.591        0.181              1.734          1.737

To prove the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features during the frame-based analysis. During the file-based analysis, a maximum classification accuracy of 92% is obtained using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both databases. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with low computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Those results depend heavily on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. The presented work, however, gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.

6 Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers (k-NN, LS-SVM, PNN, and GRNN) are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters; hence, in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could be extended to detect specific types of disorders and to develop an online diagnosing system.

Acknowledgements
This work is done in part with data transferred from the database MAPACI, http://www.mapaci.com. The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling, and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics, and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), 'k-Means Nearest Neighbor Classifier for Voice Pathology', in Paper Presented at the Proceedings from INDICON'04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), 'An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine', Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), 'PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets', Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), 'Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases', IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), 'Robust Hybrid Pitch Detector', IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), 'A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM', Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), 'Fuzzy Model Identification Based on Cluster Estimation', Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), 'The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices', IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.


De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), 'LS-SVMlab Toolbox', http://www.esat.kuleuven.be/sista/lssvmlab

Deliyski, D. (1993), 'Acoustic Model and Evaluation of Pathological Voice Production', in Paper Presented at the Proceedings from EUROSPEECH'93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), 'Support Vector Wavelet Adaptation for Pathological Voice Assessment', Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), 'A Comparison of Regression Methods for Remote Tracking of Parkinson's Disease Progression', Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), 'Short-Term Stability Measures for the Evaluation of Vocal Quality', Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), 'Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders', Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), 'Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors', IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), 'Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters', IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), 'Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting', Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), 'Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition', in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA'09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), 'Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network', International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), 'Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks', Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), 'Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology', Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), 'Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network', Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), 'Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network', Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), 'Diagnosis of Vocal and Voice Disorders by the Speech Signal', in Paper Presented at the Proceedings from IJCNN'00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), 'Detecting Earnings Management with Neural Networks', Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), 'A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting', Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), 'Robust Audio Hashing for Content Identification', in Paper Presented at the Proceedings from EUSIPCO'01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), 'Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice', Paper Presented at the Proceedings from EUROSPEECH'93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), 'Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice', The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc. (1994), 'Voice Disorders Database, Version 1.03 [CD-ROM]', http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html

Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), 'Optimal Feature Selection for the Assessment of Vocal Fold Disorders', Computers in Biology and Medicine, 39(10), 860–868.

Krom, G. (1993), 'A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals', Journal of Speech and Hearing Research, 36(2), 254–266.

Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), 'Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection', Lecture Notes in Computer Science (Springer), 5099, 192–199.

Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), 'Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN Based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)', Journal of Biomedical Informatics, 41(1), 15–23.

Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), 'The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology', Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.

MacQueen, J. (1967), 'Some Methods for Classification and Analysis of Multivariate Observations', in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.

MAPACI, P. (2004), 'Voice Disorder Database', http://www.mapaci.com/index-ingles.php


Martinez, C., and Rufiner, H. (2000), 'Acoustic Analysis of Speech for Detection of Laryngeal Pathologies', in Paper Presented at the Proceedings from IEEE EMBS'00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.

Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), 'Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health', Expert Systems with Applications, 39(14), 11792–11800.

Nayak, J., and Bhat, P. (2003), 'Identification of Voice Disorders Using Speech Samples', in Paper Presented at the Proceedings from TENCON'03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.

Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), 'Classification and Analysis of Speech Abnormalities', ITBM-RBM, 26(5–6), 319–327.

Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), 'Hierarchical Diagnosis of Vocal Fold Disorders', Communications in Computer and Information Science (Springer), 6, 897–900.

Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), 'Perceptual Audio Hashing Functions', EURASIP Journal of Applied Signal Processing, 2005(1), 1780–1793.

Pandian, M., Sazali, Y., and Hariharan, M. (2008), 'Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders', in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.

Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), 'Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network', International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.

Polat, K. (2012), 'Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets', Journal of Medical Systems, 36(4), 2657–2673.

Polat, K., and Durduran, S.S. (2012), 'Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting', Neural Computing & Applications, 21(6), 1271–1279.

Polat, K., and Gunes, S. (2006), 'A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System', Digital Signal Processing, 16(6), 913–921.

Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.

Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), 'Objective Assessment of Pathological Voice Quality', in Paper Presented at the Proceedings of IEEE SMC'99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.

Ritchings, T., McGillion, M., and Moore, C. (1999), 'Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons', in Paper Presented at the Proceedings from IEEE BMES/EMBS'99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.

Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), 'Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection', Biomedical Signal Processing and Control, 1(2), 120–128.

Salhi, L., Talbi, M., and Cherif, A. (2008), 'Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks', The World Academy of Science, Engineering and Technology (WASET'08), 35, 2070–3740.

Shama, K., and Cholayya, N. (2007), 'Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology', EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.

Specht, D. (1990), 'Probabilistic Neural Networks', Neural Networks, 3(1), 109–118.

Specht, D.F. (1991), 'A General Regression Neural Network', IEEE Transactions on Neural Networks, 2(6), 568–576.

Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.

Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), 'Discrimination of Pathological Voices Using a Time-Frequency Approach', IEEE Transactions on Biomedical Engineering, 52(3), 421–430.

Wester, M. (1998), 'Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models', in Paper Presented at the Proceedings from VOICEDATA'98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.

Wu, J.D., and Tsai, Y.J. (2011), 'Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network', Expert Systems with Applications, 38(5), 6112–6117.

Xu, R., and Wunsch, D. (2005), 'Survey of Clustering Algorithms', IEEE Transactions on Neural Networks, 16(3), 645–678.

Yager, R., and Filev, D. (1994), 'Generation of Fuzzy Rules by Mountain Clustering', Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.

Yu, L., Yao, X., Wang, S., and Lai, K. (2011), 'Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection', Expert Systems with Applications, 38(12), 15392–15399.

Yumoto, E., Sasaki, Y., and Okamura, H. (1984), 'Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness', Journal of Speech and Hearing Research, 27(1), 2–6.


Figure 3. Working of the k-means clustering based feature weighting method.

Stop if either this value or its improvement over the previous iteration is below a certain threshold.
Step 4: Update the cluster centres according to Equation (8) and go to Step 2.

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8}$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.
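The centre update of Equation (8) can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in this work; it assumes that the membership matrix U is binary, with U[i, j] = 1 when sample j belongs to cluster i, and that no cluster is empty. The function name `update_centres` is hypothetical.

```python
import numpy as np

def update_centres(X, U):
    """Equation (8): each centre is the mean of the samples assigned to it.

    X: (n, d) array of samples x_k.
    U: (c, n) binary membership matrix (assumed convention: U[i, j] = 1
       if sample j belongs to cluster i, otherwise 0).
    """
    sizes = U.sum(axis=1)            # |G_i| = sum_j u_ij
    return (U @ X) / sizes[:, None]  # c_i = (1 / |G_i|) * sum of members of G_i
```

With samples (0, 0), (2, 0) and (10, 10), where the first two belong to cluster 1 and the third to cluster 2, the updated centres are (1, 0) and (10, 10).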

After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012), as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
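The three weighting steps above can be illustrated with the following Python sketch. It is one possible reading of the procedure, not the authors' MATLAB implementation: it assumes that the ratio for each feature is the mean of that feature over all samples divided by the mean of the cluster centres along the same feature, and that this centre mean is non-zero. The function name `kmc_feature_weighting` is hypothetical, and the cluster centres are taken as given.

```python
import numpy as np

def kmc_feature_weighting(X, centres):
    """Weight each feature by (feature mean) / (mean of its cluster centres).

    X: (n, d) raw features; centres: (c, d) cluster centres from k-means.
    Returns the weighted features and the per-feature ratios.
    Assumption: the centre mean of every feature is non-zero.
    """
    feature_means = X.mean(axis=0)       # mean of each feature over all samples
    centre_means = centres.mean(axis=0)  # mean of the centres for each feature
    ratios = feature_means / centre_means
    return X * ratios, ratios            # each column is scaled by its own ratio
```

For instance, if a feature has mean 3 and its cluster centres average 1.5, every value of that feature is multiplied by 2, while a feature whose mean equals its centre mean is left unchanged.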

Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.

4 Classifiers

In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011), and the k-nearest neighbour classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN, and GRNN.

4.1 k-NN classifier

In pattern recognition the k-NN algorithm is a method forclassifying objects based on closest training examples inthe feature space (Fukunaga 1990 Duda Hart and Stork

Figure 4 (a) Class distribution of raw SVD features (MEEI database frame-based) according to the first two features (Feature 1 andFeature 2) (b) Class distribution of weighted SVD features (MEEI database frame-based) according to the first two features (Feature 1and Feature 2)

Dow

nloa

ded

by [

Nor

th D

akot

a St

ate

Uni

vers

ity]

at 1

739

22

Nov

embe

r 20

14

International Journal of Systems Science 1627

Figure 5 (a) Class distribution of raw SVD features (MAPACI speech pathology frame-based) according to the first two features (Feature1 and Feature 2) (b) Comparison of results (frame-based MEEI database raw features)

2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is therefore determined by the classes of its k nearest neighbours.
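The majority-vote rule described here can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation: the function name `knn_predict`, the Euclidean distance, and the toy feature values are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Pair each Euclidean distance with its class label, nearest first.
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Count the labels of the k nearest neighbours and take the most common one.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Toy two-dimensional feature vectors (illustrative values only).
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
train_y = ["normal", "normal", "pathological", "pathological"]

print(knn_predict(train_x, train_y, (0.15, 0.15), k=3))  # prints: normal
```

With k = 3 the query has two "normal" neighbours and one "pathological" neighbour, so the vote returns "normal". An odd k avoids ties in a two-class problem.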

4.2 Probabilistic neural network

Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to a training input. The summation unit sums these contributions for each class of inputs and produces a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
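The pattern, summation and output stages can be illustrated with a short Python sketch. It assumes a Gaussian pattern-unit activation exp(-d²/(2σ²)) and a per-class average in the summation stage. The function name `pnn_predict`, the toy data and the exact normalisation are assumptions for illustration, not details taken from the paper.

```python
import math


def pnn_predict(train_x, train_y, query, sigma=0.05):
    """Return (winning class, per-class scores) for one query vector."""
    sums, counts = {}, {}
    for x, label in zip(train_x, train_y):
        # Pattern unit: exponential activation of the squared distance.
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        activation = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation unit: accumulate activations separately for each class.
        sums[label] = sums.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    scores = {label: sums[label] / counts[label] for label in sums}
    # Output unit ("compete"): the class with the largest score wins.
    return max(scores, key=scores.get), scores


# Spread factors from 0.01 to 0.055 in steps of 0.005, as swept in the text.
sigmas = [round(0.01 + 0.005 * i, 3) for i in range(10)]

# Toy feature vectors (illustrative values only).
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
train_y = ["normal", "normal", "pathological", "pathological"]

for sigma in sigmas:
    label, _ = pnn_predict(train_x, train_y, (0.15, 0.15), sigma)
    print(sigma, label)
```

A small σ makes each training vector influence only its immediate neighbourhood, while a larger σ smooths the estimated class densities, which is why the spread factor has to be tuned.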

4.3 General regression neural network

GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than a multilayer perceptron (MLP), is more accurate than the MLP and is relatively insensitive to outliers. The target variable for the GRNN network is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer,

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | (data) | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| Frame-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| Frame-based | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | (data) | 173 pathological + 53 normal | 24 pathological + 24 normal |
| File-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| File-based | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |


Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.

summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
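A GRNN prediction can be sketched as a kernel-weighted average of the training targets. This is the usual textbook reading of Specht's model, written here under stated assumptions: a Gaussian kernel with spread σ, class targets coded as 0 (normal) and 1 (pathological), and a 0.5 threshold to turn the continuous output into a class. The name `grnn_predict` and the toy data are hypothetical.

```python
import math


def grnn_predict(train_x, train_t, query, sigma=0.05):
    """Return the kernel-weighted average of the training targets at query."""
    numerator = 0.0
    denominator = 0.0
    for x, target in zip(train_x, train_t):
        # Pattern layer: Gaussian weight based on squared distance to the query.
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weight = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation layer: weighted targets and the plain sum of weights.
        numerator += weight * target
        denominator += weight
    if denominator == 0.0:
        # All weights underflowed; fall back to the mean training target.
        return sum(train_t) / len(train_t)
    # Output layer: ratio of the two sums.
    return numerator / denominator


# Toy data: target 0 = normal, 1 = pathological (assumed coding).
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
train_t = [0, 0, 1, 1]

output = grnn_predict(train_x, train_t, (0.15, 0.15), sigma=0.05)
print("pathological" if output >= 0.5 else "normal")  # prints: normal
```

Because the estimate is computed directly from the stored training vectors, there is no iterative weight update, which matches the one-pass learning described in the text.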

4.4 Support vector machine

In this work, SVM is used as a classifier; it is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps training samples of two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to its position relative to the hyperplane, that is, the sign of its distance from the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely linear kernel, multilayer kernel and radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter 2010) is used to perform classification of normal and pathological speech samples. Two parameters must be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
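To make the roles of γ and σ² concrete, the sketch below trains a small LS-SVM classifier in pure Python by solving the standard LS-SVM linear system [[0, yᵀ], [y, Ω + I/γ]] [b; α] = [0; 1], with Ω_ij = y_i y_j K(x_i, x_j). It is not the LS-SVMlab API (LS-SVMlab is a MATLAB toolbox). The function names, the kernel form exp(-‖x - z‖²/σ²), the toy data and the parameter values are assumptions chosen for illustration.

```python
import math


def rbf(x, z, sig2):
    """RBF kernel with squared bandwidth sig2 (assumed form: exp(-d^2 / sig2))."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sig2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lssvm_train(xs, ys, gam, sig2):
    """Return (b, alpha) for labels ys in {-1, +1}."""
    n = len(xs)
    system = [[0.0] + [float(y) for y in ys]]
    for i in range(n):
        row = [float(ys[i])]
        for j in range(n):
            value = ys[i] * ys[j] * rbf(xs[i], xs[j], sig2)
            if i == j:
                value += 1.0 / gam  # regularisation term on the diagonal
            row.append(value)
        system.append(row)
    solution = solve(system, [0.0] + [1.0] * n)
    return solution[0], solution[1:]


def lssvm_predict(xs, ys, b, alpha, query, sig2):
    """Classify query by the sign of the LS-SVM decision function."""
    score = b + sum(a * y * rbf(x, query, sig2) for a, y, x in zip(alpha, ys, xs))
    return 1 if score >= 0 else -1


# Toy data: -1 = normal, +1 = pathological (assumed coding and values).
xs = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
ys = [-1, -1, 1, 1]
b, alpha = lssvm_train(xs, ys, gam=10.0, sig2=0.5)
print(lssvm_predict(xs, ys, b, alpha, (0.15, 0.15), sig2=0.5))
```

A larger γ penalises training errors more heavily, and a smaller σ² makes the decision boundary more local. Some texts write the RBF kernel with 2σ² in the denominator, which only rescales σ².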

5 Results and discussions

Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify the voice samples into normal or pathological. A k-means


Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.

clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to assess the reliability of the classification results. Details of the training and testing sets used in this work are given in Table 1.
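The two validation schemes can be sketched as index splits. The example below is a minimal illustration in Python, assuming a random 70%/30% hold-out for ConV and a random 10-fold partition for CrossV; the function names and the fixed seed are hypothetical and do not come from the paper.

```python
import random


def holdout_split(n, train_frac=0.7, seed=0):
    """Randomly split indices 0..n-1 into a training part and a testing part."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_frac)
    return indices[:cut], indices[cut:]


def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# Frame-based MEEI data: 1557 pathological + 1537 normal = 3094 feature vectors.
train_idx, test_idx = holdout_split(1557 + 1537)
print(len(train_idx), len(test_idx))  # prints: 2166 928

# 10-fold cross-validation: each vector is used for testing exactly once.
for train, test in kfold_indices(48, k=10):
    print(len(train), len(test))
```

Rounding 70% of each data-set size in this way gives the training and testing counts for all four ConV settings listed in Table 1.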

To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from the counts of true positives (TP: the classifier outputs pathological when pathological samples are present), true negatives (TN: the classifier outputs normal when normal samples are present), false positives (FP: the classifier outputs pathological when normal samples are present) and false negatives (FN: the classifier outputs normal when pathological samples are present). These measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \qquad (9)$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \qquad (10)$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (11)$$
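As a small worked example of these three measures, the sketch below counts TP, TN, FP and FN from a list of true and predicted labels and evaluates Equations (9) to (11). The function name, the label strings and the sample predictions are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive="pathological"):
    """Return sensitivity, specificity and overall accuracy as fractions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "SE": tp / (tp + fn),                    # Equation (9)
        "SP": tn / (tn + fp),                    # Equation (10)
        "ACC": (tp + tn) / (tp + tn + fp + fn),  # Equation (11)
    }


# Five hypothetical test samples: three pathological, two normal.
y_true = ["pathological", "pathological", "pathological", "normal", "normal"]
y_pred = ["pathological", "pathological", "normal", "normal", "pathological"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here TP = 2, FN = 1, TN = 1 and FP = 1, so SE = 2/3, SP = 1/2 and ACC = 3/5. Multiplying by 100 gives the percentages reported in the results tables.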

In the k-NN classifier, different values of 'k' between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen optimally during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. A maximum classification accuracy of above 95% is obtained using the raw features, and 100% is obtained using the weighted features, during the frame-based analysis.


Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

| Classifier | Parameters | Experiment | Validation | Raw SE (%) | Raw SP (%) | Raw ACC (%) | Weighted SE (%) | Weighted SP (%) | Weighted ACC (%) |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | k-value 1–10 | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | k-value 1–10 | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | k-value 1–10 | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | 1000001 [sic] | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | 1000001 [sic] | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | 1001 [sic] | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | 1001 [sic] | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | 0.01–0.055 | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | 0.01–0.055 | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | 0.01–0.055 | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | 0.01–0.055 | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | 0.01–0.055 | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |

Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

| Classifier | Parameters | Experiment | Validation | Raw SE (%) | Raw SP (%) | Raw ACC (%) | Weighted SE (%) | Weighted SP (%) | Weighted ACC (%) |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | k-value 1–10 | Frame-based | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | k-value 1–10 | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | k-value 1–10 | File-based | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | 10, 0.01 | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | 10, 0.01 | Frame-based | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | 10, 1 | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | 10, 1 | File-based | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | 0.01–0.055 | Frame-based | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | 0.01–0.055 | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | 0.01–0.055 | File-based | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | 0.01–0.055 | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | 0.01–0.055 | File-based | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |


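Table 4. Computation time (in seconds) for feature extraction and classification.

Classification

| Classifier | Parameters | Type of experiment | Type of validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features |
|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | k-value 1–10 | Frame-based | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | k-value 1–10 | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | k-value 1–10 | File-based | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | 10, 0.01 | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | 10, 0.01 | Frame-based | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | 10, 1 | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | 10, 1 | File-based | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | 0.01–0.055 | Frame-based | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | 0.01–0.055 | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | 0.01–0.055 | File-based | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | 0.01–0.055 | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | 0.01–0.055 | File-based | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |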

To prove the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features, and 100% is obtained using the weighted features, during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under two different experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of those works have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.

6 Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means


clustering based feature weighting approach can increase the distinguishing performance of raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters, and hence in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.

Acknowledgements

This work was done in part with data transferred from the database MAPACI (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the Electrical–Electronics Engineering Department of Selcuk University with an MSc degree in 2004. He took his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Soft Computing, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.


De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab

Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html

Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.

Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.

Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.

Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.

Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.

MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.

MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php


Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.

Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.

Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.

Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.

Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.

Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.

Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.

Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.

Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.

Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.

Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.

Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.

Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.

Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.

Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.

Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.

Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.

Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.

Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.

Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.

Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.

Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.

Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.

Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.

Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.

Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.

Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.


Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Comparison of results (frame-based, MEEI database, raw features).

2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is therefore determined by the classes of its k nearest neighbours.
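The majority-vote rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB implementation used in this work: the Euclidean distance, the function name `knn_predict` and the toy two-dimensional feature vectors are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance (an assumed metric) from the query to every training vector.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Count the class labels of the k closest vectors and return the most common one.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy feature vectors (hypothetical values, not taken from MEEI or MAPACI).
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.9, 1.1), (1.2, 1.0)]
train_y = ["normal"] * 3 + ["pathological"] * 3
prediction = knn_predict(train_x, train_y, (0.95, 1.0), k=3)
```

In the experiments reported later, k is varied between 1 and 10; in this sketch that would correspond to calling `knn_predict` once per value of `k`.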

4.2. Probabilistic neural network

Specht (1990) proposed the PNN, which is based on Bayesian classification and classical estimators of the probability density function (PDF). A PNN comprises four kinds of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern units compute the distances from the input vector to the training input vectors and produce a vector whose elements indicate how close the input is to each training input. The summation units sum these contributions for each class of inputs and produce a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter, or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
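The pattern, summation and output units described above can be illustrated with a short Python sketch. It assumes a Gaussian form for the exponential activation and averages the activations per class; the function name `pnn_predict` and the toy data are hypothetical, and this is not the MATLAB code used in this work.

```python
import math

def pnn_predict(train_x, train_y, query, sigma=0.05):
    """Return (predicted_class, class_probabilities) for one query vector."""
    sums, counts = {}, {}
    for x, label in zip(train_x, train_y):
        # Pattern unit: exponential activation of the squared distance (assumed Gaussian form).
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        sums[label] = sums.get(label, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    # Summation unit: average activation per class, normalised to a probability vector.
    scores = {c: sums[c] / counts[c] for c in sums}
    total = sum(scores.values())
    probs = {c: (s / total if total > 0 else 1.0 / len(scores)) for c, s in scores.items()}
    # Output unit: the class with the maximum probability wins (compete behaviour).
    return max(probs, key=probs.get), probs

# Toy feature vectors (hypothetical values).
train_x = [(0.10, 0.10), (0.12, 0.11), (0.30, 0.31), (0.32, 0.30)]
train_y = ["normal", "normal", "pathological", "pathological"]

# Spread factors from 0.01 to 0.055 in steps of 0.005, as in the text.
sigmas = [round(0.01 + 0.005 * i, 3) for i in range(10)]
results = {s: pnn_predict(train_x, train_y, (0.11, 0.10), sigma=s) for s in sigmas}
```

Sweeping `sigma` in this way mirrors the spread-factor search described above; a small σ makes each training vector influence only its immediate neighbourhood, while a larger σ smooths the estimated density.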

4.3. General regression neural network

GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. Because the network does not require an iterative training procedure, it learns much faster than a multilayer perceptron (MLP); it is also more accurate than an MLP and relatively insensitive to outliers. The target variable of the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer,

Table 1. Details of training and test data.

Type of experiment  Type of validation  MEEI database                                  MAPACI speech pathology database
Frame-based         -                   1557 pathological + 1537 normal                1560 pathological + 1560 normal
                    CrossV              Feature vectors are divided randomly into      Feature vectors are divided randomly into
                                        10 sets and training is repeated 10 times      10 sets and training is repeated 10 times
                    ConV                Training = 2166 (70%), testing = 928 (30%)     Training = 2184 (70%), testing = 936 (30%)
File-based          -                   173 pathological + 53 normal                   24 pathological + 24 normal
                    CrossV              Feature vectors are divided randomly into      Feature vectors are divided randomly into
                                        10 sets and training is repeated 10 times      10 sets and training is repeated 10 times
                    ConV                Training = 158 (70%), testing = 68 (30%)       Training = 34 (70%), testing = 14 (30%)


Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.

summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter, or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
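As a rough illustration of how a GRNN produces a continuous output in a single pass, the sketch below implements the kernel-weighted average that is commonly associated with Specht's model. The Gaussian kernel form, the 0/1 target coding, the 0.5 decision threshold and the function names are assumptions made for this example only.

```python
import math

def grnn_predict(train_x, train_t, query, sigma=0.05):
    """Kernel-weighted average of the training targets (one-pass, no iterative training)."""
    num = den = 0.0
    for x, t in zip(train_x, train_t):
        # Pattern layer: Gaussian weight from the squared distance to each training vector.
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation layer: accumulate the weighted targets and the weights.
        num += w * t
        den += w
    # Output layer: ratio of the two sums; fall back to the target mean if all weights vanish.
    return num / den if den > 0 else sum(train_t) / len(train_t)

def grnn_classify(train_x, train_t, query, sigma=0.05, threshold=0.5):
    """Turn the continuous GRNN output into a two-class decision (assumed threshold)."""
    return 1 if grnn_predict(train_x, train_t, query, sigma) >= threshold else 0

# Toy data: target 0 = normal, 1 = pathological (hypothetical values).
train_x = [(0.10, 0.10), (0.12, 0.11), (0.30, 0.31), (0.32, 0.30)]
train_t = [0, 0, 1, 1]
output = grnn_predict(train_x, train_t, (0.31, 0.30), sigma=0.05)
label = grnn_classify(train_x, train_t, (0.31, 0.30), sigma=0.05)
```

Because the output is a weighted average of the stored targets, it always stays within the range of the training targets, and σ controls how quickly the influence of each training vector decays.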

4.4. Support vector machine

In this work, SVM is used as a classifier; it is a promising method for solving non-linear classification, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM then seeks an optimal separating hyperplane in this new space, one that maximises its distance from the closest training points. During testing, a query point is categorised according to its distance from the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.
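To make the role of σ² concrete, the following sketch evaluates an RBF kernel for one fixed pair of vectors under several bandwidths. The kernel is written here as exp(−‖x − z‖² / σ²); this normalisation is an assumption for the example (some texts place 2σ² in the denominator), and the function name and sample vectors are hypothetical.

```python
import math

def rbf_kernel(x, z, sig2):
    """RBF kernel value for vectors x and z with squared bandwidth sig2 (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sig2)

x, z = (0.2, 0.4), (0.5, 0.8)  # squared distance = 0.09 + 0.16 = 0.25

# A small sig2 makes the kernel very local (similarity drops quickly with distance);
# a large sig2 makes distant points look similar, which flattens the decision boundary.
kernel_values = {sig2: rbf_kernel(x, z, sig2) for sig2 in (0.01, 0.1, 1.0, 10.0)}
```

The kernel equals 1 for identical vectors and approaches 0 as the vectors move apart, at a rate set by σ².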

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to classify normal and pathological speech samples. Two parameters must be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
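The way γ and σ² enter a least squares SVM classifier can be sketched without the toolbox. The example below follows the standard LS-SVM formulation, in which training reduces to one linear system in the bias b and the support values α. It is a minimal pure-Python sketch under stated assumptions (labels coded as −1/+1, RBF kernel written as exp(−‖x − z‖² / σ²), hypothetical function names and toy data) and does not reproduce the LS-SVMlab API.

```python
import math

def rbf(x, z, sig2):
    """RBF kernel with squared bandwidth sig2 (assumed normalisation)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sig2)

def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lssvm_train(xs, ys, gam, sig2):
    """Return (b, alpha) from the system [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1]."""
    n = len(xs)
    a = [[0.0] + [float(y) for y in ys]]
    for i in range(n):
        row = [float(ys[i])]
        for j in range(n):
            omega = ys[i] * ys[j] * rbf(xs[i], xs[j], sig2)
            row.append(omega + (1.0 / gam if i == j else 0.0))
        a.append(row)
    sol = solve(a, [0.0] + [1.0] * n)
    return sol[0], sol[1:]

def lssvm_predict(xs, ys, b, alpha, sig2, query):
    """Classify a query vector by the sign of the LS-SVM decision function."""
    s = b + sum(al * y * rbf(x, query, sig2) for al, y, x in zip(alpha, ys, xs))
    return 1 if s >= 0 else -1

# Toy data: -1 = normal, +1 = pathological (hypothetical values).
xs = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
ys = [-1, -1, 1, 1]
b, alpha = lssvm_train(xs, ys, gam=10.0, sig2=1.0)
```

A larger γ penalises training errors more heavily, while σ² sets the kernel width; in practice both would be tuned jointly, for example over a grid of candidate pairs.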

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.

5. Results and discussions

Many research works have already been carried out in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their ability to discriminate between classes and, in turn, the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to check the reliability of the classification results. Details of the training and testing sets used in this work are given in Table 1.
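The two validation schemes can be sketched as index-splitting helpers. The sketch below assumes that ConV is a single random 70%/30% split and that CrossV is a 10-fold scheme in which each of the 10 random sets serves once as the test set, as Table 1 suggests; the function names and the fixed random seed are hypothetical.

```python
import random

def holdout_split(n, train_fraction=0.7, seed=0):
    """ConV-style split: shuffle the indices 0..n-1 and cut them 70% / 30%."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(train_fraction * n)
    return idx[:n_train], idx[n_train:]

def kfold_indices(n, k=10, seed=0):
    """CrossV-style split: yield (train, test) index lists for each of k random folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# Frame-based MEEI data: 1557 pathological + 1537 normal = 3094 feature vectors.
train_idx, test_idx = holdout_split(3094)
folds = list(kfold_indices(3094, k=10))
```

With 3094 feature vectors, the 70% cut gives 2166 training and 928 testing vectors, which agrees with the frame-based MEEI entry in Table 1.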

To test classifier performance, three measures are considered: sensitivity, specificity and overall accuracy. They are calculated from the counts of true positives (TP: the classifier reports pathology when a pathological sample is present), true negatives (TN: the classifier reports normal when a normal sample is present), false positives (FP: the classifier reports pathology when a normal sample is present) and false negatives (FN: the classifier reports normal when a pathological sample is present). The measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
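Equations (9) to (11) translate directly into code. The following sketch counts TP, TN, FP and FN from a list of true and predicted labels and then evaluates the three measures; the label strings, the function names and the toy predictions are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred, positive="pathological"):
    """Count TP, TN, FP and FN, treating `positive` as the pathological class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, tn, fp, fn

def performance(tp, tn, fp, fn):
    """Sensitivity, specificity and overall accuracy as in Equations (9)-(11)."""
    se = tp / (tp + fn) if tp + fn else 0.0
    sp = tn / (tn + fp) if tn + fp else 0.0
    acc = (tp + tn) / (tp + tn + fp + fn)
    return se, sp, acc

# Toy labels: 4 pathological and 5 normal samples (hypothetical predictions).
y_true = ["pathological"] * 4 + ["normal"] * 5
y_pred = ["pathological"] * 3 + ["normal"] * 5 + ["pathological"]
counts = confusion_counts(y_true, y_pred)
se, sp, acc = performance(*counts)
```

For this toy case the counts are TP = 3, TN = 4, FP = 1 and FN = 1, so SE = 3/4, SP = 4/5 and ACC = 7/9.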

In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. The sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 for the MEEI database and in Table 3 for the MAPACI speech pathology database. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy above 95% is obtained using the raw features and 100% using the weighted features.


Table 2. Comparison of results for the MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are percentages (mean ± standard deviation).

                                                    Raw features                                   Weighted features
Classifier  Parameter     Experiment   Validation   SE             SP             ACC            SE             SP             ACC
k-NN        k-value 1–10  Frame-based  CrossV       96.66 ± 0.55   91.69 ± 1.09   94.05 ± 0.83   99.81 ± 0.06   98.27 ± 0.34   99.03 ± 0.19
k-NN        k-value 1–10  Frame-based  ConV         95.96 ± 0.71   90.87 ± 1.54   93.27 ± 1.16   99.81 ± 0.06   97.98 ± 0.33   98.88 ± 0.15
k-NN        k-value 1–10  File-based   CrossV       99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19   99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19
k-NN        k-value 1–10  File-based   ConV         99.74 ± 0.25   100 ± 0.00     99.79 ± 0.20   100 ± 0.00     100 ± 0.00     100 ± 0.00
LS-SVM      10 000 01     Frame-based  CrossV       95.10 ± 0.21   95.41 ± 0.20   95.25 ± 0.12   99.75 ± 0.03   99.45 ± 0.07   99.60 ± 0.04
LS-SVM      10 000 01     Frame-based  ConV         94.92 ± 0.90   94.49 ± 1.05   94.70 ± 0.81   99.34 ± 0.29   99.46 ± 0.34   99.40 ± 0.26
LS-SVM      10 01         File-based   CrossV       98.92 ± 0.62   99.81 ± 0.60   99.12 ± 0.47   99.32 ± 0.36   100 ± 0.00     99.47 ± 0.28
LS-SVM      10 01         File-based   ConV         100 ± 0.00     98.87 ± 1.30   99.12 ± 1.03   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
PNN         0.01–0.055    Frame-based  CrossV       97.43 ± 0.26   93.01 ± 0.17   95.12 ± 0.17   99.88 ± 0.05   98.69 ± 0.16   99.28 ± 0.07
PNN         0.01–0.055    Frame-based  ConV         96.79 ± 0.24   92.44 ± 0.33   94.51 ± 0.24   99.81 ± 0.06   98.41 ± 0.21   99.10 ± 0.11
PNN         0.01–0.055    File-based   CrossV       97.44 ± 5.39   100 ± 0.00     97.74 ± 4.97   97.50 ± 5.37   100 ± 0.00     97.79 ± 4.95
PNN         0.01–0.055    File-based   ConV         97.09 ± 6.10   100 ± 0.00     97.37 ± 5.79   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
GRNN        0.01–0.055    Frame-based  CrossV       93.01 ± 3.89   88.08 ± 4.98   90.15 ± 4.31   99.50 ± 0.41   97.53 ± 1.30   98.27 ± 0.89
GRNN        0.01–0.055    Frame-based  ConV         92.52 ± 3.78   87.07 ± 5.11   89.34 ± 4.35   99.39 ± 0.55   97.35 ± 1.16   98.10 ± 0.91
GRNN        0.01–0.055    File-based   CrossV       97.27 ± 5.99   100 ± 0.00     96.33 ± 8.00   98.07 ± 4.82   100 ± 0.00     97.92 ± 5.11
GRNN        0.01–0.055    File-based   ConV         96.69 ± 6.75   100 ± 0.00     95.81 ± 9.05   96.77 ± 6.23   100 ± 0.00     95.81 ± 8.15

Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are percentages (mean ± standard deviation).

                                                    Raw features                                   Weighted features
Classifier  Parameter     Experiment   Validation   SE             SP             ACC            SE             SP             ACC
k-NN        k-value 1–10  Frame-based  CrossV       94.49 ± 0.36   96.58 ± 0.39   95.51 ± 0.21   98.67 ± 0.23   98.76 ± 0.25   98.71 ± 0.23
k-NN        k-value 1–10  Frame-based  ConV         94.20 ± 0.40   96.29 ± 0.33   95.21 ± 0.32   100 ± 0.00     100 ± 0.00     100 ± 0.00
k-NN        k-value 1–10  File-based   CrossV       77.10 ± 7.84   66.92 ± 3.30   70.42 ± 3.65   100 ± 0.00     100 ± 0.00     98.33 ± 1.64
k-NN        k-value 1–10  File-based   ConV         77.56 ± 7.11   67.51 ± 2.48   69.71 ± 2.70   99.88 ± 0.40   99.88 ± 0.40   97.50 ± 2.16
LS-SVM      100 01        Frame-based  CrossV       95.50 ± 0.18   97.28 ± 0.24   96.37 ± 0.16   99.04 ± 0.08   99.04 ± 0.08   99.24 ± 0.06
LS-SVM      100 01        Frame-based  ConV         96.86 ± 0.81   95.24 ± 0.67   96.03 ± 0.40   98.71 ± 0.68   98.71 ± 0.68   99.02 ± 0.41
LS-SVM      101           File-based   CrossV       79.31 ± 3.33   68.87 ± 2.40   72.92 ± 2.60   100 ± 0.00     100 ± 0.00     95.83 ± 0.00
LS-SVM      101           File-based   ConV         74.53 ± 7.82   84.00 ± 13.01  77.14 ± 8.78   100 ± 0.00     100 ± 0.00     96.43 ± 3.76
PNN         0.01–0.055    Frame-based  CrossV       94.88 ± 0.20   96.23 ± 0.26   95.55 ± 0.21   99.82 ± 0.07   99.82 ± 0.07   99.91 ± 0.04
PNN         0.01–0.055    Frame-based  ConV         94.52 ± 0.31   96.03 ± 0.35   95.26 ± 0.25   99.76 ± 0.07   99.76 ± 0.07   99.88 ± 0.03
PNN         0.01–0.055    File-based   CrossV       72.30 ± 3.96   70.53 ± 3.93   70.83 ± 1.70   98.57 ± 4.52   98.57 ± 4.52   99.17 ± 2.63
PNN         0.01–0.055    File-based   ConV         69.02 ± 5.33   69.18 ± 4.10   67.93 ± 3.86   98.60 ± 2.73   98.60 ± 2.73   99.00 ± 1.72
GRNN        0.01–0.055    Frame-based  CrossV       90.41 ± 3.78   91.79 ± 3.35   90.98 ± 3.45   99.59 ± 0.29   99.59 ± 0.29   99.71 ± 0.23
GRNN        0.01–0.055    Frame-based  ConV         90.02 ± 3.61   91.78 ± 3.16   90.72 ± 3.24   99.53 ± 0.30   99.53 ± 0.30   99.66 ± 0.24
GRNN        0.01–0.055    File-based   CrossV       69.43 ± 8.51   70.75 ± 4.86   66.67 ± 7.73   97.13 ± 6.40   97.13 ± 6.40   93.75 ± 9.27
GRNN        0.01–0.055    File-based   ConV         62.05 ± 9.40   64.51 ± 3.83   59.57 ± 8.50   97.38 ± 5.68   97.38 ± 5.68   93.07 ± 9.91


Table 4. Computation time (in seconds) for feature extraction and classification.

Classification
                                                    MEEI database              MAPACI database
Classifier  Parameter     Experiment   Validation   Weighted      Raw          Weighted      Raw
k-NN        k-value 1–10  Frame-based  CrossV       0.338         0.341        0.342         0.346
k-NN        k-value 1–10  Frame-based  ConV         3.806         3.821        6.931         6.955
k-NN        k-value 1–10  File-based   CrossV       0.036         0.038        0.036         0.036
k-NN        k-value 1–10  File-based   ConV         1.759         1.754        3.001         3.206
LS-SVM      10 001        Frame-based  CrossV       1.139         1.175        1.191         1.214
LS-SVM      10 001        Frame-based  ConV         0.718         0.698        0.705         0.723
LS-SVM      10 1          File-based   CrossV       0.005         0.005        0.002         0.002
LS-SVM      10 1          File-based   ConV         0.006         0.006        0.004         0.004
PNN         0.01–0.055    Frame-based  CrossV       2.458         3.084        2.445         2.421
PNN         0.01–0.055    Frame-based  ConV         10.340        10.356       11.180        11.202
PNN         0.01–0.055    File-based   CrossV       1.078         1.124        1.093         1.062
PNN         0.01–0.055    File-based   ConV         4.250         4.266        4.072         4.047
GRNN        0.01–0.055    Frame-based  CrossV       2.587         2.571        2.676         2.582
GRNN        0.01–0.055    Frame-based  ConV         10.726        10.648       11.634        11.584
GRNN        0.01–0.055    File-based   CrossV       1.098         1.056        1.065         1.062
GRNN        0.01–0.055    File-based   ConV         4.302         4.260        4.049         4.066

Feature extraction
                      MEEI                       MAPACI
Type of experiments   Normal      Pathological   Normal      Pathological
File-based            0.549       0.034          0.109       0.130
Frame-based           0.591       0.181          1.734       1.737

To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used in addition to the MEEI database. From Table 3, it is inferred that, during the frame-based analysis, a maximum accuracy above 95% is obtained using the raw features and 100% using the weighted features. During the file-based analysis, a maximum classification accuracy of 92% is obtained using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during the frame-based and file-based analyses. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the two databases using the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB of RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them do not report the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. The presented work, however, gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.

6. Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could also be extended to detect the specific type of disorder and to develop an online diagnosing system.
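The practice of running k-means several times and keeping the best cluster centres can be sketched as follows. The selection criterion used here (lowest within-cluster sum of squared distances), the number of restarts, the function names and the toy points are assumptions for illustration; the paper does not state which criterion was used.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, iters=50):
    """One k-means run from random initial centres; returns (centres, sum of squared errors)."""
    centres = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centres[c]))
            clusters[nearest].append(p)
        # Recompute each centre as the mean of its cluster (keep the old centre if the cluster is empty).
        new = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centres[j]
            for j, cl in enumerate(clusters)
        ]
        if new == centres:
            break
        centres = new
    sse = sum(min(sq_dist(p, c) for c in centres) for p in points)
    return centres, sse

def best_of_restarts(points, k, restarts=10, seed=0):
    """Repeat k-means with different initial centres and keep the run with the lowest SSE."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(restarts)), key=lambda r: r[1])

# Two well-separated toy clusters (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centres, sse = best_of_restarts(points, k=2)
```

Smarter seeding strategies, as suggested above for future work, would aim to reduce the number of restarts needed to reach good centres.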

Acknowledgements

This work is done in part with data transferred from the MAPACI database, http://www.mapaci.com. The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and an MSc degree in 2004. He took his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.


De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab

Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html

Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.

Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.

Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.

Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN Based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.

Ludlow CL Bassich CJ Connor NP Coulter DC and LeeYJ (1987) lsquoThe Validity of Using Phonatory Jitter and Shim-mer to Detect Laryngeal Pathologyrsquo Laryngeal Functionin Phonation and Respiration Boston MA Brown amp Copp 492ndash508

MacQueen J (1967) lsquoSome Methods for Classification and Anal-ysis of Multivariate Observationsrsquo in Paper Presented at the5th Berkeley Symposium on Mathematical Statistics and Prob-ability Berkeley CA University California Press Vol 1pp 281ndash297

MAPACI P (2004) lsquoVoice Disorder Databasersquo httpwwwmapacicomindex-inglesphp

Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.

Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.

Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.

Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.

Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.

Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.

Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.

Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.

Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.

Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.

Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.

Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.

Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.

Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multi-layer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.

Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.

Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.

Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.

Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.

Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.

Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.

Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.

Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.

Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.

Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.

Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.

Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.

Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.

Page 8: A new feature constituting approach to detection of vocal fold pathology

Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.

summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter, or spread factor (σ). Based on experimental investigations, σ is varied between 0.01 and 0.055 in steps of 0.005.
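The spread-factor sweep described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' MATLAB implementation: the function names `spread_grid` and `grnn_predict`, the toy two-feature data, the 0.5 decision threshold and the fallback used when all kernel weights underflow are hypothetical choices made for the example.

```python
import math

def spread_grid(start=0.01, stop=0.055, step=0.005):
    """Spread factors from start to stop (inclusive) in fixed steps."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 3) for i in range(count)]

def grnn_predict(train_x, train_y, query, sigma):
    """GRNN output: Gaussian-weighted average of the training targets."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Assumption: if every weight underflows (query far from all
        # patterns for this sigma), fall back to the mean target.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Toy data (hypothetical): label 0 = normal, 1 = pathological.
train_x = [(0.10, 0.20), (0.12, 0.18), (0.80, 0.90), (0.82, 0.88)]
train_y = [0, 0, 1, 1]

for sigma in spread_grid():
    score = grnn_predict(train_x, train_y, (0.11, 0.19), sigma)
    label = 1 if score >= 0.5 else 0
    print(f"sigma={sigma:.3f} score={score:.4f} label={label}")
```

With spreads this small, each query is dominated by its nearest training patterns, which is why the result is sensitive to σ and why a sweep over several values is reported rather than a single setting.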

4.4 Support vector machine

In this work, SVM is used as a classifier; it is a promising method for non-linear classification, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher-dimensional space through a kernel function, and seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to its distance from the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel function, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter: it changes the shape (flexion) of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to classify normal and pathological speech samples. Two parameters must be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (the squared bandwidth of the RBF kernel).
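To make the roles of γ (gam) and σ² (sig2) concrete, the following is a minimal pure-Python sketch of a least squares SVM classifier. It is not the LS-SVMlab code: it assumes the RBF convention K(x, z) = exp(−‖x − z‖²/σ²) (some texts divide by 2σ² instead) and the standard LS-SVM linear system, and the function names, toy data and parameter values are hypothetical.

```python
import math

def rbf_kernel(x, z, sig2):
    """RBF kernel; assumes the convention exp(-||x - z||^2 / sig2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sig2)

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def lssvm_train(xs, ys, gam, sig2):
    """Solve [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1]."""
    n = len(xs)
    a = [[0.0] + [float(y) for y in ys]]
    for i in range(n):
        row = [float(ys[i])]
        for j in range(n):
            omega = ys[i] * ys[j] * rbf_kernel(xs[i], xs[j], sig2)
            row.append(omega + (1.0 / gam if i == j else 0.0))
        a.append(row)
    sol = solve_linear(a, [0.0] + [1.0] * n)
    return sol[0], sol[1:]  # bias b, support values alpha

def lssvm_predict(x, xs, ys, alpha, b, sig2):
    """Return +1 or -1 from the sign of the decision function."""
    s = sum(al * y * rbf_kernel(x, xi, sig2)
            for al, y, xi in zip(alpha, ys, xs))
    return 1 if s + b >= 0 else -1

# Toy data (hypothetical): -1 = normal, +1 = pathological.
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.0, 4.0), (4.0, 3.0)]
ys = [-1, -1, -1, 1, 1, 1]
b, alpha = lssvm_train(xs, ys, gam=10.0, sig2=1.0)
print([lssvm_predict(x, xs, ys, alpha, b, 1.0) for x in xs])
```

A larger γ weights the training errors more heavily relative to the regularisation term, while a smaller σ² makes the kernel more local and the decision boundary more flexible, so the two are usually tuned together.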

5 Results and discussions

Much research has already been done on the automatic detection of voice pathology. In this work, a new feature vector based on WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.

clustering based feature weighting method is proposed for preprocessing the features, which increases their ability to discriminate between classes and, in turn, the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to demonstrate the reliability of the classification results. Details of the training and testing sets used in this work are given in Table 1.

To test classifier performance, three measures are considered: sensitivity, specificity and overall accuracy. They are calculated from four counts: true positive (TP: a pathological sample classified as pathological), true negative (TN: a normal sample classified as normal), false positive (FP: a normal sample classified as pathological) and false negative (FN: a pathological sample classified as normal). The measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
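As a worked illustration of Equations (9) to (11), the short Python sketch below computes the three measures from the four counts. The function name `classification_measures` and the example counts are hypothetical and are not taken from the experiments reported here.

```python
def classification_measures(tp, tn, fp, fn):
    """Return (sensitivity, specificity, overall accuracy) as fractions."""
    se = tp / (tp + fn)                      # Equation (9)
    sp = tn / (tn + fp)                      # Equation (10)
    acc = (tp + tn) / (tp + tn + fp + fn)    # Equation (11)
    return se, sp, acc

# Hypothetical counts: 50 pathological and 50 normal test samples.
se, sp, acc = classification_measures(tp=45, tn=40, fp=10, fn=5)
print(f"SE={se:.2%} SP={sp:.2%} ACC={acc:.2%}")
```

Because the overall accuracy pools both classes, it always lies between the sensitivity and the specificity for a single test run.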

In the k-NN classifier, values of k between 1 and 10 are used. PNN and GRNN are trained with spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained in the file-based analysis with all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. In the frame-based analysis, a maximum classification accuracy above 95% is obtained using the raw features and 100% using the weighted features.
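The way a parameter sweep turns into a tabulated "mean ± standard deviation" entry can be sketched as below for k-NN with k from 1 to 10. This is an illustrative pure-Python sketch, not the authors' code: the toy one-dimensional data, the helper names and the use of the population standard deviation (`pstdev`) are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k):
    """Overall accuracy in percent for one value of k."""
    hits = sum(knn_predict(train_x, train_y, q, k) == t
               for q, t in zip(test_x, test_y))
    return 100.0 * hits / len(test_y)

# Toy data (hypothetical): two well-separated classes of 8 points each.
train_x = [(0.1 * i, 0.0) for i in range(8)] + [(5.0 + 0.1 * i, 0.0) for i in range(8)]
train_y = [0] * 8 + [1] * 8
test_x = [(0.35, 0.0), (5.35, 0.0)]
test_y = [0, 1]

# Sweep k = 1..10 and summarise the accuracies.
accs = [accuracy(train_x, train_y, test_x, test_y, k) for k in range(1, 11)]
print(f"ACC = {mean(accs):.2f} ± {pstdev(accs):.2f} %")
```

The same pattern applies to the spread factors of PNN and GRNN: one accuracy per parameter value, then a mean and a standard deviation over the sweep.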

Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation, in %.

| MEEI database | Experiment | Parameter | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | Frame-based | k-value 1–10 | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | File-based | k-value 1–10 | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | File-based | k-value 1–10 | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | Frame-based | 10 000 01 | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | Frame-based | 10 000 01 | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | File-based | 10 01 | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | File-based | 10 01 | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | Frame-based | 0.01–0.055 | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | File-based | 0.01–0.055 | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | File-based | 0.01–0.055 | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | File-based | 0.01–0.055 | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | File-based | 0.01–0.055 | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |

Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation, in %.

| MAPACI database | Experiment | Parameter | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | Frame-based | k-value 1–10 | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | File-based | k-value 1–10 | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | File-based | k-value 1–10 | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | Frame-based | 10, 0.01 | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | Frame-based | 10, 0.01 | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | File-based | 10, 1 | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | File-based | 10, 1 | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | Frame-based | 0.01–0.055 | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | File-based | 0.01–0.055 | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | File-based | 0.01–0.055 | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | File-based | 0.01–0.055 | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | File-based | 0.01–0.055 | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |

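Table 4. Computation time (in seconds) for feature extraction and classification.

Classification

| Classifiers | Type of experiments | Parameter | Type of validation | MEEI database, weighted features | MEEI database, raw features | MAPACI database, weighted features | MAPACI database, raw features |
|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | Frame-based | k-value 1–10 | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | File-based | k-value 1–10 | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | File-based | k-value 1–10 | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | Frame-based | 10, 0.01 | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | Frame-based | 10, 0.01 | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | File-based | 10, 1 | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | File-based | 10, 1 | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | Frame-based | 0.01–0.055 | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | File-based | 0.01–0.055 | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | File-based | 0.01–0.055 | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | File-based | 0.01–0.055 | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | File-based | 0.01–0.055 | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiments | MEEI, normal | MEEI, pathological | MAPACI, normal | MAPACI, pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |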

To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is investigated in addition to the MEEI database. From Table 3, it is inferred that, in the frame-based analysis, a maximum accuracy above 95% is obtained using the raw features and 100% using the weighted features. In the file-based analysis, a maximum classification accuracy of 92% is obtained using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features in the frame-based and file-based analyses. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed weighted features. The weighted features always perform better than the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed in MATLAB and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification in the two experiments (frame-based and file-based). A direct comparison with previous work cannot be made, since most published studies do not report the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy can be obtained using the suggested feature extraction and classification algorithms with little computation time.

From the literature, it is observed that the reliability of wavelet- and WPT-based features has been demonstrated by many experimental studies. The reported accuracies vary from approximately 85% to 100% under different experiments, and they depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. The present work, however, gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method can improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.

6 Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology, combining a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers (k-NN, LS-SVM, PNN and GRNN) are employed. The k-means

clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres; hence, in this work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres, and by using different kernel functions. The proposed method could also be extended to detect specific types of disorder and to develop an online diagnosis system.
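The sensitivity of k-means to its initial centres, and the remedy of running it several times, can be sketched in pure Python as follows. This is an illustrative sketch only. The restart strategy (keep the run with the lowest within-cluster sum of squared errors), the helper names and the toy data are assumptions. The weighting rule in `feature_weights` (mean of each feature divided by the mean of its cluster centres) is likewise an assumption made for illustration, since this section does not state the exact formula.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run from random initial centres; returns (centres, sse)."""
    centres = rng.sample(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centres[c]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty.
        updated = [mean_point(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    sse = sum(min(sq_dist(p, c) for c in centres) for p in points)
    return centres, sse

def kmeans_restarts(points, k, n_runs=20, seed=0):
    """Repeat k-means and keep the run with the lowest SSE."""
    rng = random.Random(seed)
    runs = (kmeans_once(points, k, rng) for _ in range(n_runs))
    return min(runs, key=lambda r: r[1])

def feature_weights(points, centres):
    """Assumed rule: feature mean divided by mean of the cluster centres."""
    data_mean = mean_point(points)
    centre_mean = mean_point(centres)
    return [d / c for d, c in zip(data_mean, centre_mean)]

# Toy two-feature data (hypothetical): clusters of 4 and 3 points.
points = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (1.0, 1.6),
          (5.0, 6.0), (5.2, 5.8), (4.8, 6.2)]
centres, sse = kmeans_restarts(points, k=2)
weights = feature_weights(points, centres)
weighted = [tuple(x * w for x, w in zip(p, weights)) for p in points]
print(sorted(centres), round(sse, 4), [round(w, 3) for w in weights])
```

Selecting the run with the lowest SSE is one simple way to reduce the dependence on initialisation. Seeding schemes that spread the initial centres apart are a common alternative.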

Acknowledgements

This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from the Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Soft Computing, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of the IET, UK, since 2003.

References

Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.

Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.

Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.

Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.

Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.

Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.

Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.

Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.

Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.

De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab

Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.

Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.

Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.

Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.

Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.

Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.

Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.

Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.

Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.

Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.

Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.

Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.

Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.

Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.

Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.

Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.

Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.

Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.

Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.

Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.

Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.

Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.

Kay Elemetrics Inc (1994) lsquoVoice Disorders DatabaseVersion 103 [CD-ROM]rsquo httpwwwkaypentaxcomProduct20InfoCSL20Options43374337html

Khadivi Heris H Seyed Aghazadeh B and Nikkhah-BahramiM (2009) lsquoOptimal Feature Selection for the Assessment ofVocal Fold Disordersrsquo Computers in Biology and Medicine39(10) 860ndash868

Krom G (1993) lsquoA Cepstrum-Based Technique for Determininga Harmonics-To-Noise Ratio in Speech Signalsrsquo Journal ofSpeech and Hearing Research 36(2) 254ndash266

Kukharchik P Kheidorov I Bovbel E and Ladeev D (2008)lsquoSpeech Signal Processing Based on Wavelets and SVM forVocal Tract Pathology Detectionrsquo Lecture Notes in ComputerScience (Springer) 5099 192ndash199

Latifoglu F Polat K Kara SI and Gunes S (2008) lsquoMedicalDiagnosis of Atherosclerosis from Carotid Artery DopplerSignals Using Principal Component Analysis (PCA) k-NN based Weighting Pre-Processing and Artificial ImmuneRecognition System (AIRS)rsquo Journal of Biomedical Infor-matics 41(1) 15ndash23

Ludlow CL Bassich CJ Connor NP Coulter DC and LeeYJ (1987) lsquoThe Validity of Using Phonatory Jitter and Shim-mer to Detect Laryngeal Pathologyrsquo Laryngeal Functionin Phonation and Respiration Boston MA Brown amp Copp 492ndash508

MacQueen J (1967) lsquoSome Methods for Classification and Anal-ysis of Multivariate Observationsrsquo in Paper Presented at the5th Berkeley Symposium on Mathematical Statistics and Prob-ability Berkeley CA University California Press Vol 1pp 281ndash297

MAPACI P (2004) lsquoVoice Disorder Databasersquo httpwwwmapacicomindex-inglesphp


Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372

Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800

Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953

Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327

Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900

Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793

Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793

Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60

Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673

Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279

Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921

Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia

Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345

Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925

Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128

Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740

Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p

Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118

Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576

Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd

Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430

Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97

Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117

Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678

Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219

Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399

Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6


Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.

clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes and the classification accuracy as well. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to assess the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.

To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from four counts: true positive (TP: the classifier outputs pathological when pathological samples are present), true negative (TN: the classifier outputs normal when normal samples are present), false positive (FP: the classifier outputs pathological when normal samples are present) and false negative (FN: the classifier outputs normal when pathological samples are present). The measures are given in Equations (9), (10) and (11), respectively.

$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
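As a minimal sketch, Equations (9) to (11) can be evaluated directly from the four confusion counts. The function name and the example counts below are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def classification_measures(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) as fractions in [0, 1]."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathological and one normal sample")
    sensitivity = tp / (tp + fn)                 # Equation (9)
    specificity = tn / (tn + fp)                 # Equation (10)
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # Equation (11)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 50 pathological and 50 normal test samples
se, sp, acc = classification_measures(tp=48, tn=45, fp=5, fn=2)
print(f"SE={se:.2%}  SP={sp:.2%}  ACC={acc:.2%}")
```

Multiplying the returned fractions by 100 gives the percentage form used in the results tables.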

In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the LS-SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 for the MEEI database and in Table 3 for the MAPACI speech pathology database. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% using the weighted features.
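The spread-factor sweep described above for PNN can be sketched as follows. This is a minimal Parzen-window PNN in NumPy, not the authors' MATLAB implementation. Three things are assumptions: the synthetic two-class data, the grid of 10 evenly spaced spread values, and the use of the spread directly as the Gaussian standard deviation (toolbox implementations may scale it differently).

```python
import numpy as np


def pnn_predict(train_x, train_y, test_x, spread):
    """Parzen-window PNN: pick the class with the largest mean Gaussian kernel response."""
    classes = np.unique(train_y)
    # Squared Euclidean distance between every test and training vector
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * spread ** 2))
    # One summation unit per class: average kernel response of that class
    scores = np.stack(
        [kernel[:, train_y == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]


# Synthetic two-class data in [0, 1] (illustrative only, not the paper's features)
rng = np.random.default_rng(0)
train_x = np.vstack([rng.normal(0.2, 0.02, (40, 2)), rng.normal(0.8, 0.02, (40, 2))])
train_y = np.repeat([0, 1], 40)
test_x = np.vstack([rng.normal(0.2, 0.02, (10, 2)), rng.normal(0.8, 0.02, (10, 2))])
test_y = np.repeat([0, 1], 10)

# Assumed grid: 10 evenly spaced spread factors between 0.01 and 0.055
accuracies = {}
for spread in np.linspace(0.01, 0.055, 10):
    predicted = pnn_predict(train_x, train_y, test_x, spread)
    accuracies[round(float(spread), 3)] = float((predicted == test_y).mean())

best_spread = max(accuracies, key=accuracies.get)
print(best_spread, accuracies[best_spread])
```

With small spreads the kernel is very narrow, so features are assumed to be scaled to a comparable range before classification; otherwise every kernel response can underflow to zero.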


Table 2. Comparison of results for the MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation in %. LS-SVM parameter entries are reproduced as extracted.

| Classifier | Experiment | Parameter | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k = 1–10 | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | Frame-based | k = 1–10 | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | File-based | k = 1–10 | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | File-based | k = 1–10 | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | Frame-based | 10 000 01 | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | Frame-based | 10 000 01 | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | File-based | 10 01 | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | File-based | 10 01 | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | Frame-based | spread 0.01–0.055 | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | Frame-based | spread 0.01–0.055 | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | File-based | spread 0.01–0.055 | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | File-based | spread 0.01–0.055 | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | Frame-based | spread 0.01–0.055 | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | Frame-based | spread 0.01–0.055 | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | File-based | spread 0.01–0.055 | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | File-based | spread 0.01–0.055 | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |

Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation in %. LS-SVM parameter entries are reproduced as extracted.

| Classifier | Experiment | Parameter | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k = 1–10 | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | Frame-based | k = 1–10 | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | File-based | k = 1–10 | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | File-based | k = 1–10 | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | Frame-based | 100 01 | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | Frame-based | 100 01 | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | File-based | 101 | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | File-based | 101 | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | Frame-based | spread 0.01–0.055 | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | Frame-based | spread 0.01–0.055 | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | File-based | spread 0.01–0.055 | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | File-based | spread 0.01–0.055 | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | Frame-based | spread 0.01–0.055 | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | Frame-based | spread 0.01–0.055 | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | File-based | spread 0.01–0.055 | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | File-based | spread 0.01–0.055 | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |


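Table 4. Computation time (in seconds) for feature extraction and classification. LS-SVM parameter entries are reproduced as extracted.

Classification

| Classifier | Experiment | Parameter | Validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features |
|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k = 1–10 | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | Frame-based | k = 1–10 | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | File-based | k = 1–10 | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | File-based | k = 1–10 | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | Frame-based | 10 001 | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | Frame-based | 10 001 | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | File-based | 10 1 | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | File-based | 10 1 | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | Frame-based | spread 0.01–0.055 | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | Frame-based | spread 0.01–0.055 | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | File-based | spread 0.01–0.055 | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | File-based | spread 0.01–0.055 | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | Frame-based | spread 0.01–0.055 | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | Frame-based | spread 0.01–0.055 | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | File-based | spread 0.01–0.055 | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | File-based | spread 0.01–0.055 | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |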

To assess the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.

Figures 6(a)–(d) show the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figures 7(a)–(d) illustrate the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database.

All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them do not report the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with little computation time.

From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated in many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy as well for both the MEEI database and the MAPACI speech pathology database.

6. Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters; hence, in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.
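The sensitivity of k-means to its initial centres, and the practice of running it several times, can be illustrated with a short sketch. Everything below is an illustration under stated assumptions rather than the authors' implementation: the restart count, the choice of the run with the lowest within-cluster sum of squares, the synthetic data, and in particular the weighting rule (per-feature mean divided by the mean of the cluster centres for that feature) are assumed here and should be checked against the paper's method section.

```python
import numpy as np


def kmeans(x, k, rng, n_iter=100):
    """One run of Lloyd's k-means from randomly chosen initial centres."""
    centres = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old centre if a cluster ends up empty
        new_centres = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    d2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    wcss = d2[np.arange(len(x)), labels].sum()  # within-cluster sum of squares
    return centres, labels, wcss


def best_of_restarts(x, k, n_restarts=10, seed=0):
    """Run k-means several times and keep the run with the lowest WCSS (assumed criterion)."""
    rng = np.random.default_rng(seed)
    runs = [kmeans(x, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[2])


def kmc_feature_weights(x, centres):
    """Assumed rule: mean of each feature divided by the mean of its cluster centres."""
    feature_mean = x.mean(axis=0)
    centre_mean = centres.mean(axis=0)
    # Fall back to a weight of 1 where the centre mean is zero
    return np.divide(feature_mean, centre_mean,
                     out=np.ones_like(feature_mean), where=centre_mean != 0)


# Synthetic, unbalanced two-cluster data (illustrative only)
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(0.2, 0.03, (70, 2)), rng.normal(0.8, 0.03, (30, 2))])

centres, labels, wcss = best_of_restarts(x, k=2)
weights = kmc_feature_weights(x, centres)
weighted = x * weights  # weighted features passed on to the classifiers
print(weights)
```

Seeding the restarts makes the selected centres reproducible, which matters when the weighted features are compared across classifiers.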

Acknowledgements

This work is done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from the Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering; Computer Methods and Programs in Biomedicine; Artificial Intelligence in Biomedicine; International Journal of Phoniatrics, Speech Therapy and Communication Pathology; and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the same department with an MSc degree in 2004. He took his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays, among others. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI Expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of the IET, UK, since 2003.



De Brabanter K Ojeda PKF Alzate C De Brabanter J Pel-ckmans K De Moor B Vandewalle J and Suykens JAK(2010) lsquoLS-SVMlab Toolboxrsquo httpwwwesatkuleuvenbesistalssvmlab

Deliyski D (1993) Acoustic Model and Evaluation of Patholog-ical Voice Productionrsquo in Paper Presented at the Proceed-ings from EUROSPEECHrsquo93 The 3rd Conference on SpeechCommunication and Technology Berlin Germany pp 1969ndash1972

Duda RO Hart PE and Stork DG (2001) Pattern Classifi-cation New York Wiley-Interscience

Erfanian Saeedi N Almasganj F and Torabinejad F (2011)lsquoSupport Vector Wavelet Adaptation for Pathological VoiceAssessmentrsquo Computers in Biology and Medicine 41(9)822ndash828

Eskidere O Ertas F and Hanilci C (2011) lsquoA Comparison ofRegression Methods for Remote Tracking of Parkinsonrsquos Dis-ease Progressionrsquo Expert Systems with Applications 39(5)5523ndash5528

Feijoo S and Hernandez C (1990) lsquoShort-Term Stability Mea-sures for the Evaluation of Vocal Qualityrsquo Journal of Speechand Hearing Research 33(2) 324ndash334

Fonseca E Guido R Scalassara P Maciel C and Pereira J(2007) lsquoWavelet Time-Frequency Analysis and Least SquaresSupport Vector Machines for the Identification of Voice Dis-ordersrsquo Computers in Biology and Medicine(Elsevier) 37(4)571ndash578

Fukunaga K (1990) Introduction to Statistical Pattern Recogni-tion (2nd ed) San Diego CA Academic Press

Godino-Llorente J and Gomez-Vilda P (2004) lsquoAutomaticDetection of Voice Impairments by Means of Short-TermCepstral Parameters and Neural Network Based DetectorsrsquoIEEE Transactions on Biomedical Engineering 51(2) 380ndash384

Godino-Llorente J Gomez-Vilda P and Blanco-Velasco M(2006) lsquoDimensionality Reduction of a Pathological VoiceQuality Assessment System Based on Gaussian Mixture Mod-els and Short-Term Cepstral Parametersrsquo IEEE Transactionson Biomedical Engineering 53(10) 1943ndash1953

Gunes S Polat K and Yosunkaya S (2010) lsquoEfficient SleepStage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weightingrsquo Expert Systemswith Applications 37(12) 7922ndash7928

Hariharan M Paulraj M and Yaacob S (2009) lsquoIdentifica-tion of Vocal Fold Pathology Based on Mel Frequency BandEnergy Coefficients and Singular Value Decompositionrsquo inPaper presented at the 2009 IEEE International Conferenceon Signal and Image Processing Applications (ICSIPArsquo09)Kuala Lumpur Malaysia pp 514ndash517

Hariharan M Paulraj M and Yaacob S (2011) lsquoDetection ofVocal Fold Paralysis and Oedema Using Time-Domain Fea-tures and Probabilistic Neural Networkrsquo International Jour-nal of Biomedical Engineering and Technology 6(1) 46ndash57

Hariharan M Saraswathy J Sindhu R Khairunizam W andYaacob S (2012) lsquoInfant Cry Classification to Identify As-phyxia Using Time-Frequency Analysis and Radial BasisNeural Networksrsquo Expert Systems with Applications 39(10)9515ndash9523

Hariharan M and Sazali Y (2010) lsquoTime-Domain Features andProbabilistic Neural Network for the Detection of Vocal FoldPathologyrsquo Malaysian Journal of Computer Science 23(1)60ndash67

Hariharan M Sindhu R and Yaacob S (2011) lsquoNormal andHypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Net-

workrsquo Computer Methods and Programs in Biomedicine108(2) 559ndash569

Hariharan M Yaacob S and Awang SA (2011) lsquoPathologi-cal Infant Cry Analysis Using Wavelet Packet Transform andProbabilistic Neural Networkrsquo Expert Systems with Applica-tions 38(12) 15377ndash15382

Hernandez-Espinosa C Gomez-Vilda P Godino-Llorente Jand Aguilera-Navarro S (2000) lsquoDiagnosis of Vocal andVoice Disorders by the Speech Signalrsquo in Paper Presentedat the Proceedings from IJCNNrsquo00 The IEEE-INNS-ENNSInternational Joint Conference on Neural Networks ComoItaly pp 253ndash258

Hoglund H (2012) lsquoDetecting Earnings Management with Neu-ral Networksrsquo Expert Systems with Applications 39(10)9564ndash9570

Ismail S Samsudin R and Shabri A (2011) lsquoA Hybrid Modelof Self-Organizing Maps (SOM) and Least Square SupportVector Machine (LSSVM) for Time Series Forecastingrsquo Ex-pert Systems with Applications 38(8) 10574ndash10578

Kalker T Haitsma J and Oostveen J (2001) lsquoRobust AudioHashing for Content Identificationrsquo in Paper Presented at theProceedings from EUSIPCOrsquo01 The 12th European SignalProcessing Conference Vienna Austria pp 2091ndash2094

Kasuya H Endo Y and Saliu S (1993) lsquoNovel AcousticMeasurements of Jitter and Shimmer Characteristics fromPathological Voicersquo Paper Presented at the Proceedings fromEUROSPEECHrsquo93 The 3rd European Conference on SpeechCommunication and Technology Berlin Germany pp 1973ndash1976

Kasuya H Ogawa S Mashima K and Ebihara S (1986)lsquoNormalized Noise Energy as an Acoustic Measure to Eval-uate Pathologic Voicersquo The Journal of the Acoustical Societyof America 80 1329ndash1334

Kay Elemetrics Inc (1994) lsquoVoice Disorders DatabaseVersion 103 [CD-ROM]rsquo httpwwwkaypentaxcomProduct20InfoCSL20Options43374337html

Khadivi Heris H Seyed Aghazadeh B and Nikkhah-BahramiM (2009) lsquoOptimal Feature Selection for the Assessment ofVocal Fold Disordersrsquo Computers in Biology and Medicine39(10) 860ndash868

Krom G (1993) lsquoA Cepstrum-Based Technique for Determininga Harmonics-To-Noise Ratio in Speech Signalsrsquo Journal ofSpeech and Hearing Research 36(2) 254ndash266

Kukharchik P Kheidorov I Bovbel E and Ladeev D (2008)lsquoSpeech Signal Processing Based on Wavelets and SVM forVocal Tract Pathology Detectionrsquo Lecture Notes in ComputerScience (Springer) 5099 192ndash199

Latifoglu F Polat K Kara SI and Gunes S (2008) lsquoMedicalDiagnosis of Atherosclerosis from Carotid Artery DopplerSignals Using Principal Component Analysis (PCA) k-NN based Weighting Pre-Processing and Artificial ImmuneRecognition System (AIRS)rsquo Journal of Biomedical Infor-matics 41(1) 15ndash23

Ludlow CL Bassich CJ Connor NP Coulter DC and LeeYJ (1987) lsquoThe Validity of Using Phonatory Jitter and Shim-mer to Detect Laryngeal Pathologyrsquo Laryngeal Functionin Phonation and Respiration Boston MA Brown amp Copp 492ndash508

MacQueen J (1967) lsquoSome Methods for Classification and Anal-ysis of Multivariate Observationsrsquo in Paper Presented at the5th Berkeley Symposium on Mathematical Statistics and Prob-ability Berkeley CA University California Press Vol 1pp 281ndash297

MAPACI P (2004) lsquoVoice Disorder Databasersquo httpwwwmapacicomindex-inglesphp

Dow

nloa

ded

by [

Nor

th D

akot

a St

ate

Uni

vers

ity]

at 1

739

22

Nov

embe

r 20

14

1634 M Hariharan et al

Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372

Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800

Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953

Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327

Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900

Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793

Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793

Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60

Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673

Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279

Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921

Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia

Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345

Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925

Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128

Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740

Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p

Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118

Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576

Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd

Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430

Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97

Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117

Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678

Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219

Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399

Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6


Table 2. Comparison of results for the MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

| Classifier | Experiment | Parameter | Validation | Raw SE (%) | Raw SP (%) | Raw ACC (%) | Weighted SE (%) | Weighted SP (%) | Weighted ACC (%) |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | Frame-based | k-value 1–10 | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | File-based | k-value 1–10 | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | File-based | k-value 1–10 | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | Frame-based | 1000001 | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | Frame-based | 1000001 | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | File-based | 1001 | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | File-based | 1001 | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | Frame-based | 0.01–0.055 | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | File-based | 0.01–0.055 | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | File-based | 0.01–0.055 | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | File-based | 0.01–0.055 | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | File-based | 0.01–0.055 | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |

Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

| Classifier | Experiment | Parameter | Validation | Raw SE (%) | Raw SP (%) | Raw ACC (%) | Weighted SE (%) | Weighted SP (%) | Weighted ACC (%) |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | Frame-based | k-value 1–10 | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | File-based | k-value 1–10 | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | File-based | k-value 1–10 | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | Frame-based | 10, 0.01 | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | Frame-based | 10, 0.01 | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | File-based | 10, 1 | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | File-based | 10, 1 | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | Frame-based | 0.01–0.055 | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | File-based | 0.01–0.055 | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | File-based | 0.01–0.055 | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | File-based | 0.01–0.055 | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | File-based | 0.01–0.055 | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |


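Table 4. Computation time (in seconds) for feature extraction and classification.

Classification

| Classifier | Experiment | Parameter | Validation | MEEI, weighted features | MEEI, raw features | MAPACI, weighted features | MAPACI, raw features |
|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | k-value 1–10 | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | Frame-based | k-value 1–10 | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | File-based | k-value 1–10 | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | File-based | k-value 1–10 | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | Frame-based | 10, 0.01 | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | Frame-based | 10, 0.01 | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | File-based | 10, 1 | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | File-based | 10, 1 | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | Frame-based | 0.01–0.055 | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | Frame-based | 0.01–0.055 | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | File-based | 0.01–0.055 | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | File-based | 0.01–0.055 | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | Frame-based | 0.01–0.055 | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | Frame-based | 0.01–0.055 | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | File-based | 0.01–0.055 | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | File-based | 0.01–0.055 | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Experiment | MEEI, normal | MEEI, pathological | MAPACI, normal | MAPACI, pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |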

To prove the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms were developed in MATLAB and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB of RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.
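The sensitivity (SE), specificity (SP) and accuracy (ACC) values compared above can be computed from true and predicted labels and then summarised as mean ± standard deviation over repeated runs. The following is a minimal sketch in Python rather than the MATLAB code used in this work. The function names, the label convention (1 = pathological, 0 = normal), the use of the sample standard deviation and the toy labels are all assumptions made for illustration.

```python
from statistics import mean, stdev


def se_sp_acc(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) in percent.

    Assumption: `positive` marks the pathological class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    se = 100.0 * tp / (tp + fn)          # true positive rate
    sp = 100.0 * tn / (tn + fp)          # true negative rate
    acc = 100.0 * (tp + tn) / len(pairs)
    return se, sp, acc


def summarise(values):
    """Mean and sample standard deviation of per-run scores."""
    return mean(values), (stdev(values) if len(values) > 1 else 0.0)


# Hypothetical labels for two runs (not data from the paper).
runs = [
    ([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]),
    ([1, 1, 0, 0, 0], [1, 1, 0, 0, 0]),
]
scores = [se_sp_acc(t, p) for t, p in runs]
acc_mean, acc_std = summarise([s[2] for s in scores])
print(f"ACC = {acc_mean:.2f} ± {acc_std:.2f}")
```

Reporting the spread alongside the mean matters most for the file-based experiments, where each run contains few test items and a single misclassification shifts the score noticeably.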

From the literature, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend heavily on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and also increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
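The weighting step referred to above can be sketched as follows. This section does not restate the weighting rule, so the sketch assumes one common formulation of k-means clustering based feature weighting: each feature is multiplied by the ratio of its mean to the mean of its cluster-centre values. The function names and the toy numbers are hypothetical, and the cluster centres are taken as given (they would come from a k-means run on the raw features).

```python
def kmc_feature_weights(samples, centres):
    """One weight per feature: mean(feature) / mean(centre values).

    Assumption: this ratio rule stands in for the paper's weighting
    formula, which is not given in this section.
    """
    n_features = len(samples[0])
    weights = []
    for f in range(n_features):
        feature_mean = sum(row[f] for row in samples) / len(samples)
        centre_mean = sum(c[f] for c in centres) / len(centres)
        if centre_mean == 0:
            raise ValueError(f"centre mean of feature {f} is zero")
        weights.append(feature_mean / centre_mean)
    return weights


def apply_weights(samples, weights):
    """Scale every column of `samples` by its feature weight."""
    return [[x * w for x, w in zip(row, weights)] for row in samples]


# Toy raw features: rows are frames, columns are features.
raw = [[1.0, 10.0], [2.0, 20.0], [6.0, 30.0]]
# Hypothetical cluster centres for k = 2.
centres = [[1.5, 15.0], [6.0, 30.0]]

weights = kmc_feature_weights(raw, centres)
weighted = apply_weights(raw, weights)
print(weights)
```

Under this assumed rule the weights depend only on the training data and the cluster centres, so the same weights would be reused unchanged when scaling test samples.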

6 Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosis system.
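Because the result of k-means depends on where the cluster centres start, running it several times and keeping the best run is a simple safeguard. The sketch below uses plain Lloyd iterations and keeps the run with the lowest within-cluster sum of squared distances. The selection criterion, the function names and the number of restarts are assumptions, since the text only states that the algorithm was run several times.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, max_iter=100):
    """Lloyd's algorithm from a random start; returns (centres, cost)."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centres[j]))
            groups[nearest].append(p)
        # Update step: an empty cluster keeps its old centre.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centres[j]
            for j, g in enumerate(groups)
        ]
        if updated == centres:
            break
        centres = updated
    cost = sum(min(sq_dist(p, c) for c in centres) for p in points)
    return centres, cost


def best_of_restarts(points, k, restarts=10):
    """Run k-means from several seeds and keep the lowest-cost run."""
    runs = [kmeans(points, k, seed) for seed in range(restarts)]
    return min(runs, key=lambda run: run[1])


# Toy data with two well-separated groups (hypothetical values).
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centres, cost = best_of_restarts(points, k=2)
print(sorted(centres), cost)
```

Seeding strategies such as k-means++ are one example of the more efficient initialisation mentioned as future work; they aim to reduce the number of restarts needed rather than replace the selection step.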

Acknowledgements

This work is done in part with data transferred from the database MAPACI (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and his Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering; Computer Methods and Programs in Biomedicine; Artificial Intelligence in Biomedicine; International Journal of Phoniatrics, Speech Therapy and Communication Pathology; and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the same department with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, biomedical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.


Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430

Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97

Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117

Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678

Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219

Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399

Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6

Page 11: A new feature constituting approach to detection of vocal fold pathology

International Journal of Systems Science 1631

Table 4. Computation time (in seconds) for feature extraction and classification.

Classification

                                                   MEEI database        MAPACI database
Classifier   Type of       Type of       Weighted   Raw        Weighted   Raw
(parameters) experiments   validation    features   features   features   features

k-NN         Frame-based   CrossV         0.338      0.341      0.342      0.346
(k-value:                  ConV           3.806      3.821      6.931      6.955
1–10)        File-based    CrossV         0.036      0.038      0.036      0.036
                           ConV           1.759      1.754      3.001      3.206

LS-SVM       Frame-based   CrossV         1.139      1.175      1.191      1.214
(10, 0.01)                 ConV           0.718      0.698      0.705      0.723
LS-SVM       File-based    CrossV         0.005      0.005      0.002      0.002
(10, 1)                    ConV           0.006      0.006      0.004      0.004

PNN          Frame-based   CrossV         2.458      3.084      2.445      2.421
(0.01–0.055)               ConV          10.340     10.356     11.180     11.202
             File-based    CrossV         1.078      1.124      1.093      1.062
                           ConV           4.250      4.266      4.072      4.047

GRNN         Frame-based   CrossV         2.587      2.571      2.676      2.582
(0.01–0.055)               ConV          10.726     10.648     11.634     11.584
             File-based    CrossV         1.098      1.056      1.065      1.062
                           ConV           4.302      4.260      4.049      4.066

Feature extraction

                       MEEI                      MAPACI
Type of experiments    Normal    Pathological    Normal    Pathological

File-based             0.549     0.034           0.109     0.130
Frame-based            0.591     0.181           1.734     1.737

To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.

Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of those works have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with low computation time.
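The accuracy, sensitivity and specificity compared above can be computed from the counts of correct and incorrect decisions. The following is a minimal Python sketch, assuming the pathological class is treated as the positive class; the function names and the label vectors are hypothetical and are not data from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a two-class problem.

    Assumption: `positive` marks the pathological class.
    """
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p != positive:
            tn += 1
        elif t != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def detection_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total
    # Sensitivity: share of pathological samples detected as pathological.
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of normal samples detected as normal.
    specificity = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical labels: 1 = pathological, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
acc, sens, spec = detection_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside the maximum accuracy, as the figures do, shows whether a high accuracy is achieved on both classes or is dominated by one of them.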

From the literature, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend highly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and also increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
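To make the weighting step concrete, the following is a minimal Python sketch of k-means clustering based feature weighting. It assumes the commonly described scheme in which each feature is multiplied by the ratio of its mean to the mean of the cluster centres along that feature. This rule, the function names (`kmeans`, `kmc_feature_weights`, `apply_weights`) and the toy data are illustrative assumptions, not the authors' MATLAB implementation.

```python
import random


def kmeans(data, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length feature vectors."""
    rng = random.Random(seed)
    centres = [list(row) for row in rng.sample(data, k)]
    for _ in range(n_iter):
        # Assignment step: index of the nearest centre for every sample.
        labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centres[c])))
            for row in data
        ]
        # Update step: each centre becomes the mean of its members.
        new_centres = []
        for c in range(k):
            members = [row for row, lab in zip(data, labels) if lab == c]
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    return centres


def kmc_feature_weights(data, k=2, seed=0):
    """Assumed rule: weight_j = mean(feature j) / mean(cluster centres along j)."""
    centres = kmeans(data, k, seed=seed)
    weights = []
    for j in range(len(data[0])):
        feature_mean = sum(row[j] for row in data) / len(data)
        centre_mean = sum(c[j] for c in centres) / k
        # Fall back to a neutral weight if the centre mean is zero.
        weights.append(feature_mean / centre_mean if centre_mean != 0 else 1.0)
    return weights


def apply_weights(data, weights):
    """Scale every feature column by its weight."""
    return [[x * w for x, w in zip(row, weights)] for row in data]


# Toy two-feature data (hypothetical values, not WPT-SVD features from the paper).
toy = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0], [5.0, 30.0]]
weights = kmc_feature_weights(toy, k=2)
weighted = apply_weights(toy, weights)
```

In this sketch the weights are computed once from the data passed in and then applied column by column, so the same weights can be reused on other samples. Whether the paper computes them on the training set only is not stated here.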

6 Conclusions

This paper presents a new feature constituting approach for efficient detection of vocal fold pathology that combines a feature extraction method (WPT–SVD features) with a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could also be extended to detect specific types of disorders and to develop an online diagnosing system.
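The dependence of k-means on its starting centres, and the remedy of running it several times, can be sketched as follows. This is a minimal Python illustration that assumes the run with the lowest within-cluster sum of squares is the one kept; the selection criterion, the function names and the toy data are assumptions, since the text does not state how the better cluster centres were chosen.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd(data, centres, n_iter=100):
    """Refine the given centres with Lloyd's iterations.

    Returns (centres, within-cluster sum of squares).
    """
    k = len(centres)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for row in data:
            nearest = min(range(k), key=lambda c: sq_dist(row, centres[c]))
            groups[nearest].append(row)
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centres[c]
            for c, g in enumerate(groups)
        ]
        if updated == centres:
            break
        centres = updated
    inertia = sum(min(sq_dist(row, c) for c in centres) for row in data)
    return centres, inertia


def best_of_restarts(data, k, n_restarts=10, seed=0):
    """Run k-means from several random starts and keep the tightest clustering."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        start = [list(row) for row in rng.sample(data, k)]
        centres, inertia = lloyd(data, start)
        if best is None or inertia < best[1]:
            best = (centres, inertia)
    return best


# Toy one-dimensional data with three obvious groups (hypothetical values).
toy = [[0.0], [1.0], [10.0], [11.0], [20.0], [21.0]]

# A poor start places two centres in the same group and stalls in a local minimum.
poor_centres, poor_inertia = lloyd(toy, [[0.0], [1.0], [10.0]])

# Restarting from several random starts recovers a much tighter clustering.
best_centres, best_inertia = best_of_restarts(toy, k=3)
```

A smarter seeding rule would reduce the number of restarts needed, which is the direction the future work above points to.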

Acknowledgements

This work is done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.

Notes on contributors

M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received his PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI Expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.


Dow

nloa

ded

by [

Nor

th D

akot

a St

ate

Uni

vers

ity]

at 1

739

22

Nov

embe

r 20

14

1634 M Hariharan et al

Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372

Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800

Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953

Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327

Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900

Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793

Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793

Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60

Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673

Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279

Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921

Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia

Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345

Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925

Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128

Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740

Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p

Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118

Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576

Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd

Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430

Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97

Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117

Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678

Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219

Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399

Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6

Dow

nloa

ded

by [

Nor

th D

akot

a St

ate

Uni

vers

ity]

at 1

739

22

Nov

embe

r 20

14

Page 12: A new feature constituting approach to detection of vocal fold pathology

1632 M Hariharan et al

clustering based feature weighting approach can increasethe distinguishing performance of raw WPTndashSVD featuresThe experimental results show that the proposed weightedfeatures give very promising classification accuracy of100 for both MEEI voice disorder database and MAPACIspeech pathology database The proposed features can beused as additional acoustic indicators for researchers andspeech pathologists to detect the vocal fold pathology Theperformance of the k-means clustering algorithm is depend-ing on the initial positions of the clusters and hence in ourwork it was run for several times to obtain better clus-ter centres In the future work k-means clustering basedfeature weighting method could be improved by using ef-ficient initialisation of methods for better initial placementof the cluster centres and different kernel functions Theproposed method could be extended to detect the specifictype of disorders and also to develop an online diagnosingsystem

AcknowledgementsThis work is done in part with data transferred from the databaseMAPACI httpwwwmapacicom The authors would like tothank the anonymous reviewers for their valuable comments

Notes on contributorsM Hariharan has received Bachelor ofEngineering in Electrical and ElectronicsEngineering and Master of Engineeringin Applied Electronics from GovernmentCollege of Technology Coimbatore TamilNadu India He has received a PhD degreein Mechatronic Engineering from UniversitiMalaysia Perlis (UniMAP) Malaysia He iscurrently working as a Senior Lecturer in

the School of Mechatronic Engineering UniMAP Malaysia Hehas published more than 50 papers in referred journals and con-ferences He has been a reviewer in various journals includingComputers and Electrical Engineering Computer Methods andPrograms in Biomedicine Artificial Intelligence in BiomedicineInternational Journal of Phoniatrics Speech Therapy and Com-munication Pathology and Medical Engineering and Physics Hisresearch interests include speech signal processing biomedicalsignal and image processing and artificial intelligence He is amember of IEEE

Kemal Polat was born in Kırıkkale on30 January 1981 Dr Polat has gradu-ated from ElectricalndashElectronics Engineer-ing Department of Selcuk University withBSc degree in 2002 and from ElectricalndashElectronics Engineering Department of Sel-cuk University with MSc degree in 2004He took his PhD degree in Electrical andElectronic Engineering at Selcuk University

in 2008 He is now working as an Assistant Professor in Electricaland Electronic Engineering Department Engineering of FacultyAbant Izzet Baysal University since September 2011 His currentresearch interests are biomedical signal classification statisticalsignal processing digital signal processing and pattern recognitionand classification He has 55 articles published in the SCI journalsand 16 international conference papers He is a reviewer in various

SCI journals including Information Sciences Pattern RecognitionLetters Digital Signal Processing Soft Computing Artificial In-telligence in Medicine Applied Soft Computing Computers inBiology and Medicine Expert Systems Computational Statisticsamp Data Analysis IEEE Transactions on Biomedical EngineeringIEEE Transactions on Evolutionary Computation TALANTASoft Computing Journal of Medical Systems and Scientific Re-search and Essays so on He is the member of editorial board ofJournal of Neural Computing and Applications (SCI expanded)

Sazali Yaacob received his PhD degreein Control Engineering from University ofSheffield UK He is currently working as aProfessor in the School of Mechatronic En-gineering UniMAP Malaysia He has suc-cessfully supervised 8 PhD candidates andmore than 20 MSc graduates through re-search mode Currently he has 10 PhD and8 MSc candidates His research interests

include control modelling and signal processing with applica-tions in the fields of satellite bio-medical applied mechanics androbotics He has published more than 70 papers in journals and200 papers in conference proceedings He received his profes-sional qualification as Charted Engineer from the EngineeringCouncil UK in 2005 and also a member of IET UK since 2003

