a new feature constituting approach to detection of vocal fold pathology
International Journal of Systems Science
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tsys20

A new feature constituting approach to detection of vocal fold pathology
M. Hariharan (a), Kemal Polat (b) & Sazali Yaacob (a)
(a) School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP), Campus Pauh Putra, Perlis, Malaysia
(b) Department of Electrical and Electronics Engineering, Faculty of Engineering and Architecture, Abant Izzet Baysal University, Bolu, Turkey
Published online: 14 May 2013.

To cite this article: M. Hariharan, Kemal Polat & Sazali Yaacob (2014) A new feature constituting approach to detection of vocal fold pathology, International Journal of Systems Science, 45:8, 1622-1634, DOI: 10.1080/00207721.2013.794905
To link to this article: http://dx.doi.org/10.1080/00207721.2013.794905
International Journal of Systems Science, 2014, Vol. 45, No. 8, 1622–1634, http://dx.doi.org/10.1080/00207721.2013.794905
A new feature constituting approach to detection of vocal fold pathology
M. Hariharan (a, *), Kemal Polat (b) and Sazali Yaacob (a)
(a) School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP), Campus Pauh Putra, Perlis, Malaysia; (b) Department of Electrical and Electronics Engineering, Faculty of Engineering and Architecture, Abant Izzet Baysal University, Bolu, Turkey
(Received 27 September 2012; final version received 3 April 2013)
In the last two decades, non-invasive methods based on acoustic analysis of the voice signal have proved to be an excellent and reliable tool for diagnosing vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the distinguishing performance of the proposed features. Two databases are used in this work: the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and the MAPACI speech pathology database. Four supervised classifiers, namely k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network, are employed for testing the proposed features. The experimental results show that the proposed features give a very promising classification accuracy of 100% for both the MEEI database and the MAPACI speech pathology database.
Keywords: acoustic analysis; vocal fold pathology; feature extraction; feature weighting and classification
1 Introduction
Voice is a highly multivariate component of speech, and its quantitative description has led to the development of clinical tools. To detect vocal fold pathology, medical professionals use subjective techniques (invasive methods) such as direct inspection of the vocal folds and observation of the vocal folds with endoscopic instruments. These techniques are expensive, risky, time consuming and uncomfortable for the patients, and they require costly resources such as special light sources, endoscopic instruments and specialised video camera equipment. To circumvent these problems, non-invasive methods have been developed to help medical professionals with the early detection of vocal fold pathology. With the rapid development of signal processing techniques, the vocal/voice signal can be used for the detection of vocal fold pathologies, and its quantitative information plays an important role in understanding the process of vocal fold pathology formation. In the last 20 years, many research works have been carried out on the automatic detection and classification of vocal fold pathologies by means of acoustic analysis, parametric and non-parametric feature extraction, automatic pattern recognition or statistical methods (Kasuya, Ogawa, Mashima, and Ebihara 1986; Feijoo and Hernandez 1990; Boyanov, Ivanov, Hadjitodorov, and Chollet 1993; Deliyski 1993; Kasuya, Endo, and Saliu 1993; Boyanov and Hadjitodorov 1997; Hernandez-Espinosa, Gomez-Vilda, Godino-Llorente, and Aguilera-Navarro 2000; Martinez and Rufiner 2000; Godino-Llorente, Gomez-Vilda, and Blanco-Velasco 2006).

*Corresponding author. Email: hari@unimap.edu.my
A large number of acoustic parameters have been proposed, and their effectiveness has been demonstrated experimentally. The important parameters are pitch (Boyanov et al. 1993), jitter (Feijoo and Hernandez 1990; Kasuya et al. 1993), shimmer (Ludlow, Bassich, Connor, Coulter, and Lee 1987; Kasuya et al. 1993), harmonics-to-noise ratio (Yumoto, Sasaki, and Okamura 1984; Krom 1993) and normalised noise energy (Kasuya et al. 1986). In the last two decades, time-frequency/scale analyses (wavelet/wavelet packet transforms) have been used as tools to analyse all kinds of problems in signal and image processing. Since the speech/voice signal is highly non-stationary, the Fourier transform is not a very useful tool for analysing it, as the time-domain information is lost when the frequency transformation is performed (Pandian, Sazali, and Muthusamy 2008; Paulraj, Sazali, and Hariharan 2009). Time-frequency/scale analysis (wavelet/wavelet packet transform) is a good tool for the analysis of non-stationary signals in both time and frequency scale (Pandian et al. 2008; Paulraj et al. 2009). Hence, wavelet and wavelet packet analysis has the potential for the identification of vocal fold pathology.
Nayak and Bhat (2003) have presented a procedure to identify pathological disorders of the larynx using wavelet analysis. Nayak, Bhat, Acharya, and Aithal (2005) have proposed a method for the classification and analysis of speech abnormalities based on wavelet analysis and an artificial neural network. Fonseca, Guido, Scalassara, Maciel, and Pereira (2007) have presented wavelets and a least-square support vector machine (LS-SVM) for the identification of vocal fold pathology. Salhi, Talbi, and Cherif (2008) have proposed a hybrid approach using wavelet analysis and a multilayer neural network for vocal fold pathologies. Kukharchik, Kheidorov, Bovbel, and Ladeev (2008) have presented a wavelet transform and support vector machine (SVM) for vocal fold pathology detection. Crovato and Schuck (2007) presented a vocal fold pathology (dysphonic voice) classification system using the wavelet packet transform (WPT) and the best basis algorithm (BBA) for dimensionality reduction, with six artificial neural networks acting as classification systems. A hierarchical system for the diagnosis of vocal fold pathologies based on wavelets and SVM has been proposed by Nikkhah-Bahrami, Ahmadi-Noubari, Seyed Aghazadeh, and Khadivi Heris (2009). Arjmandi and Pooyan (2011) have applied linear discriminant analysis and wavelet packets for vocal fold pathology assessment. Erfanian Saeedi, Almasganj, and Torabinejad (2011) have proposed adaptive wavelets for vocal fold pathology assessment. In Khadivi Heris, Seyed Aghazadeh, and Nikkhah-Bahrami (2009), WPT with non-linear features was used for optimal selection of features to assess vocal fold pathologies. Azadi and Almasganj (2011) have proposed a partitioning and biased support vector machine (PBSVM)-based classifier for vocal fold pathology assessment using labelled and unlabelled data-sets. The accuracy reported in these works varied from approximately 85% to 100% across different experiments. From the previous work, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental researchers. It is not easy to compare their results, since the analyses were conducted with different data-set sizes, different databases and different presentations of results. A cross-validation scheme and the use of more than one database are possible ways to establish the reliability of the results of various feature extraction and classification methods (Saenz-Lechon, Godino-Llorente, Osma-Ruiz, and Gomez-Vilda 2006).

© 2013 Taylor & Francis
In this paper, a new feature vector based on WPT and singular value decomposition is proposed for the automatic detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the discrimination ability of the proposed features. Two different databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database (Kay Elemetrics Inc. 1994) and the MAPACI speech pathology database (MAPACI 2004), are used to test the robustness of the algorithms and their independence from the database. Two different experiments (frame-based and file-based) are carried out using the voice samples of these databases. Four different supervised classifiers, namely k-nearest neighbour (k-NN), LS-SVM, probabilistic neural network (PNN) and general regression neural network (GRNN), are employed for the identification of vocal fold pathology. Two data validation schemes are used (conventional validation [ConV] and tenfold cross-validation [CrossV]) in order to test the effectiveness of the proposed features and the reliability of the classification results.
2 Database
In the area of automatic detection of vocal fold pathology, the only commercially available database is the MEEI voice disorders database (Kay Elemetrics Inc. 1994). The database contains 53 normal and 657 pathological voice samples and was developed by the MEEI Voice and Speech Labs. The voice samples comprise sustained phonation of the vowel ah (1–3 s long) and reading (12 s) of the 'Rainbow Passage' from patients with normal voices and a wide variety of organic, neurological, traumatic and psychogenic voice disorders in different stages. All the voice samples were collected in a controlled environment and sampled at 25 kHz or 50 kHz with 16 bits of resolution. To test the effectiveness of the proposed method, a total of 226 voice samples of sustained phonation of the vowel ah (173 pathological and 53 normal) are used and downsampled to 25 kHz for our analysis. Two different experiments are performed (frame-based and file-based). In frame-based analysis, voice samples are segmented into frames 40 ms long using a Hamming window with 50% overlap. Each windowed frame is then parameterised by means of WPT and SVD (singular value decomposition). In file-based analysis, the 173 pathological and 53 normal voice samples are subjected to feature extraction directly. The second data-set is taken from the MAPACI speech pathology database (MAPACI 2004), and all of its speech samples were recorded at 44,100 Hz during the lifetime of the MAPACI project. The recording device was a Sennheiser headset microphone. The speech database consists of 48 voice samples (12 normal and 12 pathological males + 12 normal and 12 pathological females); the speakers' ages ranged from 20 to 68 years, with an average of 36.75 years and a standard deviation of 14.35 years. Frame-based and file-based analyses are conducted on this database as well. Figure 1 shows the pathological and normal voice samples from the MEEI database and the MAPACI speech pathology database.
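The frame-based segmentation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `hamming` and `frame_signal` are hypothetical, the synthetic 150 Hz tone merely stands in for a sustained vowel, and dropping a trailing partial frame is an assumption the text does not address.

```python
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def frame_signal(signal, fs=25_000, frame_ms=40, overlap=0.5):
    """Split a signal into Hamming-windowed frames.

    Assumption: a trailing partial frame is simply dropped.
    """
    frame_len = int(round(fs * frame_ms / 1000.0))   # 1000 samples at 25 kHz
    hop = int(round(frame_len * (1.0 - overlap)))    # 500 samples for 50% overlap
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames

# One second of a synthetic 150 Hz tone stands in for a sustained vowel.
fs = 25_000
tone = [math.sin(2.0 * math.pi * 150.0 * t / fs) for t in range(fs)]
frames = frame_signal(tone, fs)
```

At 25 kHz a 40 ms frame holds 1000 samples, so a 50% overlap corresponds to a hop of 500 samples between adjacent frames.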
3 WPT and SVD
Various feature extraction methods have been proposed for robust representation of normal and pathological speech, and in the last decade time-frequency analysis has been used for this purpose. The wavelet transform provides a time-frequency representation of the signal. It decomposes the signal over dilated and translated wavelets. A wavelet is a waveform of effectively limited duration that has an average value of zero. The wavelet transform is defined as the convolution of a signal f(t) with a wavelet function ψ(t) shifted in time by a translation parameter b and dilated by a scale parameter a. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Bopardikar 2000)

Figure 1. Plots of pathological and normal voice samples.
\[ W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{1} \]
In the wavelet packet tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases generated from a parent node (i, p) are given by the following forms:
\[ \psi_{i+1}^{2p}(k) = \sum_{n=-\infty}^{\infty} l[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{2} \]

where l[n] is a low-pass (scaling) filter, and

\[ \psi_{i+1}^{2p+1}(k) = \sum_{n=-\infty}^{\infty} h[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{3} \]
where h[n] is the high-pass (wavelet) filter. The wavelet packet transform is an extension of the wavelet transform. In the WPT decomposition procedure, a signal is decomposed into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). Wavelet packet decomposition then partitions both the lower and the higher frequency bands into smaller bands, which cannot be achieved with the ordinary discrete wavelet transform. Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.
In this work, the automatic detection of vocal fold pathology is carried out using WPT and SVD. In frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). In frame-based analysis, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame of the signal is decomposed into five levels using WPT, which yields 32 sub-bands. A matrix of size 32 × M, composed of the fifth-level wavelet packet coefficients of each of the 32 wavelet packet sub-bands, is obtained for further processing; its rows pertain to the wavelet packet sub-bands and its columns to the wavelet packet coefficients:
\[ A = \left[\, C_{5}^{1}(M) \;\; C_{5}^{2}(M) \;\; \cdots \;\; C_{5}^{32}(M) \,\right]^{T} \tag{4} \]
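As a rough illustration of how the 32 × M matrix of Equation (4) arises, the sketch below builds a full five-level wavelet packet tree. To stay self-contained it substitutes the Haar filter pair for the 'db4' wavelet used in the paper, and it assumes a frame length that is a multiple of 32 (1024 samples here, whereas a 40 ms frame at 25 kHz has 1000 samples); with 'db4' and boundary extension the value of M would differ. The function names are hypothetical.

```python
import math

def haar_split(x):
    """One analysis step: low-pass (approximation) and high-pass (detail) halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet_matrix(frame, levels=5):
    """Return the 2**levels x M matrix of terminal-node coefficients.

    Rows are sub-bands in natural (tree) order, not sorted by frequency.
    Assumption: Haar filters are used instead of 'db4' for brevity.
    """
    if len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be a multiple of 2**levels")
    nodes = [list(frame)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # Both children are split again, unlike the plain DWT,
            # which only keeps splitting the low-pass branch.
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes

# Synthetic frame standing in for one windowed voice frame.
frame = [math.sin(0.05 * n) + 0.3 * math.sin(1.7 * n) for n in range(1024)]
A = wavelet_packet_matrix(frame)
```

Because the Haar pair is orthonormal, the total energy of the 32 rows equals the energy of the input frame, which is a convenient sanity check on the decomposition.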
Figure 2. Overall diagram of the proposed system.
To derive a simple and effective feature vector, singular value decomposition is applied. In file-based analysis, each voice sample is decomposed into five levels using WPT, and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the 'db4' wavelet is used in our study. The order was chosen to be low in order to model the transients and rapid variations in a signal efficiently.
3.1 Singular value decomposition
The singular value decomposition, a factorisation and summarisation technique, effectively reduces a rectangular n × p matrix A of wavelet packet coefficients to a much smaller, invertible, square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that a given n × p matrix A can be decomposed as

\[ A = U D V^{T} \tag{5} \]

The columns of U are orthonormal eigenvectors of AA^T and are called left-singular vectors. The columns of V are orthonormal eigenvectors of A^T A and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A. In this work, the 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
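A minimal sketch of turning a coefficient matrix into a 32-element feature vector is given below, assuming NumPy is available. The random matrix is only a stand-in for a real 32 × M matrix of wavelet packet coefficients, and `svd_features` is a hypothetical helper name.

```python
import numpy as np

def svd_features(A):
    """Singular values of A, in descending order, used as a feature vector."""
    return np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)

rng = np.random.default_rng(0)
A = rng.standard_normal((32, 40))   # stand-in for a 32 x M coefficient matrix
features = svd_features(A)

# Cross-check: squared singular values are the eigenvalues of A A^T.
eig = np.linalg.eigvalsh(A @ A.T)[::-1]
```

The cross-check mirrors the statement that the left-singular vectors are eigenvectors of AA^T: the corresponding eigenvalues are the squares of the singular values.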
3.2 k-Means clustering based feature weighting
In many practical machine learning applications, the performance of the learned models degrades because of irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method: they transform the extracted non-linearly separable features into linearly separable features and map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are used not only to study the similarity or dissimilarity of the features but also for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as the feature weighting method, since it is simple and widely implemented in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering partitions n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set x_i, i = 1, 2, 3, ..., n, the k-means clustering algorithm finds the cluster centres c_i and the membership matrix U iteratively using the following four steps.
Step 1: Randomly initialise the cluster centres c_i, i = 1, 2, 3, ..., c.

Step 2: Determine the membership matrix U using Equation (6):

\[ u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^2 \le \|x_j - c_k\|^2 \text{ for each } k \ne i \\ 0 & \text{otherwise} \end{cases} \tag{6} \]

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

\[ J = \sum_{i=1}^{c} \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^2 \tag{7} \]

Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

\[ c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8} \]

where \( |G_i| = \sum_{j=1}^{n} u_{ij} \).

Figure 3. Working of k-means clustering based feature weighting method.
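The four steps above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: the initial centres are drawn from the data points, an empty cluster keeps its previous centre, and the threshold `tol`, the iteration cap `max_iter` and the toy data are hypothetical choices.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def assign(data, centres):
    """Step 2: index of the nearest centre for every point (where u_ij = 1)."""
    return [min(range(len(centres)), key=lambda i: sq_dist(x, centres[i]))
            for x in data]

def cost_of(data, centres, labels):
    """Step 3: cost J, the summed squared distance of points to their centres."""
    return sum(sq_dist(x, centres[l]) for x, l in zip(data, labels))

def kmeans(data, c, tol=1e-9, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: random initialisation (assumption: centres drawn from the data).
    centres = [list(p) for p in rng.sample(data, c)]
    prev_cost = float("inf")
    for _ in range(max_iter):
        labels = assign(data, centres)
        cost = cost_of(data, centres, labels)
        # Stop when J, or its improvement over the previous iteration, is small.
        if cost < tol or prev_cost - cost < tol:
            break
        prev_cost = cost
        # Step 4: move each centre to the mean of its members.
        for i in range(c):
            members = [x for x, l in zip(data, labels) if l == i]
            if members:  # an empty cluster keeps its previous centre
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    else:
        # Iteration cap reached: refresh labels and cost for the final centres.
        labels = assign(data, centres)
        cost = cost_of(data, centres, labels)
    return centres, labels, cost

# Toy data: two groups of points, purely illustrative.
data = [[0.0, 0.0], [0.1, 0.05], [0.2, 0.0], [5.0, 0.0], [5.1, 0.05], [5.2, 0.0]]
centres, labels, cost = kmeans(data, c=2)
```

Like any k-means run, the result depends on the random initialisation, so in practice several restarts with different seeds are commonly compared by their final cost.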
After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.
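The weighting step itself (ratio of feature mean to centre, multiplied into the feature) can be sketched as below. The text does not specify how a single centre per feature is chosen when more than one cluster is used, so the per-feature centres are passed in as an argument here; the function name `weight_features`, the toy matrix and the centre values are all hypothetical.

```python
def weight_features(X, centres):
    """Multiply column j of X by (mean of column j) / centres[j].

    X is a list of feature vectors (rows); centres holds one centre per
    feature. How that centre is selected is an assumption left to the caller.
    """
    n = len(X)
    d = len(centres)
    means = [sum(row[j] for row in X) / n for j in range(d)]
    ratios = [means[j] / centres[j] for j in range(d)]
    weighted = [[row[j] * ratios[j] for j in range(d)] for row in X]
    return weighted, ratios

# Toy example with two features and hypothetical per-feature centres.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
centres = [1.5, 15.0]
Xw, ratios = weight_features(X, centres)
```

In this toy case the second feature has mean 30 and centre 15, so every value in that column is doubled; a feature whose centre coincides with its mean would be left unchanged.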
4 Classifiers
In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and the k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.
4.1 k-NN classifier
In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork 2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is therefore determined by the classes of its k nearest neighbours.

Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).

Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2).
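The majority-vote rule can be sketched in a few lines. The Euclidean distance, the function name `knn_predict` and the two-feature toy data are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label of `query` by majority vote among its k nearest training vectors."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], query)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy two-feature training set (hypothetical values).
train_X = [[0.1, 0.2], [0.0, 0.4], [0.3, 0.1], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
train_y = ["normal", "normal", "normal",
           "pathological", "pathological", "pathological"]

pred_a = knn_predict(train_X, train_y, [0.2, 0.3], k=3)
pred_b = knn_predict(train_X, train_y, [2.1, 2.0], k=3)
```

An odd k avoids tied votes in a two-class problem such as normal versus pathological.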
4.2 Probabilistic neural network
Specht (1990) proposed the PNN based on Bayesian classification and classical estimators of the probability density function (PDF). A PNN comprises four kinds of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern units compute the distances from the input vector to the training input vectors and produce a vector whose elements indicate how close the input is to each training input. The summation units sum these contributions for each class of inputs and produce a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
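A compact sketch of the pattern, summation and output stages is shown below. It assumes a Gaussian kernel with spread σ and equal class priors; the function name `pnn_predict` and the toy feature values (scaled to roughly [0, 1]) are hypothetical.

```python
import math

def pnn_predict(train_X, train_y, query, sigma=0.05):
    """Return (winning class, per-class scores) for one query vector.

    Pattern stage: Gaussian kernel on the distance to each training vector.
    Summation stage: average kernel response per class.
    Output stage: the class with the largest score wins.
    """
    sums, counts = {}, {}
    for x, label in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        sums[label] = sums.get(label, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    scores = {label: sums[label] / counts[label] for label in sums}
    return max(scores, key=scores.get), scores

# Toy training set (hypothetical values).
train_X = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.15],
           [0.80, 0.85], [0.75, 0.90], [0.85, 0.80]]
train_y = ["normal"] * 3 + ["pathological"] * 3

label_a, _ = pnn_predict(train_X, train_y, [0.12, 0.22], sigma=0.05)
label_b, _ = pnn_predict(train_X, train_y, [0.80, 0.86], sigma=0.05)
```

A small σ makes each training vector influence only its immediate neighbourhood, which is why the classification result is sensitive to the spread factor.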
4.3 General regression neural network
GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than the multilayer perceptron (MLP), it is more accurate than the MLP and it is relatively insensitive to outliers. The target variable for the GRNN network is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer, summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | | 173 pathological + 53 normal | 24 pathological + 24 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |

Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
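The one-pass character of GRNN can be seen in the sketch below, where prediction is a kernel-weighted average of the stored training targets. The Gaussian kernel, the 0/1 target coding with a 0.5 decision threshold, the fallback for all-zero weights and the toy data are assumptions made for illustration.

```python
import math

def grnn_predict(train_X, train_y, query, sigma=0.05):
    """Kernel-weighted average of training targets (no iterative training)."""
    num = den = 0.0
    for x, y in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        w = math.exp(-d2 / (2.0 * sigma ** 2))   # pattern-layer response
        num += w * y                             # summation layer, numerator
        den += w                                 # summation layer, denominator
    if den == 0.0:
        # Assumption: fall back to the mean target if every weight underflows.
        return sum(train_y) / len(train_y)
    return num / den

# Toy training set; targets coded 0 = normal, 1 = pathological (assumed coding).
train_X = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.15],
           [0.80, 0.85], [0.75, 0.90], [0.85, 0.80]]
train_y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

out_a = grnn_predict(train_X, train_y, [0.12, 0.22])
out_b = grnn_predict(train_X, train_y, [0.80, 0.86])
is_pathological_b = out_b > 0.5
```

Because the output is continuous, a threshold is needed to turn it into a class decision; 0.5 is the natural midpoint for 0/1 targets.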
4.4 Support vector machine
In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to the distance between the point and the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used, since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter 2010) is used to perform the classification of normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (the squared bandwidth of the RBF kernel).
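To show the roles of γ and σ², the sketch below trains a least-squares SVM classifier by solving its linear system directly with NumPy. It is not the LS-SVMlab implementation: the function names are hypothetical, the kernel is written as exp(-||x - z||² / (2σ²)) (toolboxes differ in the scaling constant), and the parameter values and toy data are arbitrary.

```python
import numpy as np

def rbf_kernel(A, B, sig2):
    """RBF kernel matrix between the rows of A and B (assumed scaling 2*sig2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sig2))

def lssvm_train(X, y, gam=10.0, sig2=0.5):
    """Solve the LS-SVM classifier system for the bias b and multipliers alpha."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sig2)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = omega + np.eye(n) / gam   # gam controls the regularisation
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]

def lssvm_predict(Xq, X, y, b, alpha, sig2=0.5):
    """Sign of the decision function for each row of Xq."""
    return np.sign(rbf_kernel(Xq, X, sig2) @ (alpha * y) + b)

# Toy data; labels coded -1 = normal, +1 = pathological (assumed coding).
X = np.array([[0.1, 0.2], [0.0, 0.4], [0.3, 0.1],
              [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])

b, alpha = lssvm_train(X, y, gam=10.0, sig2=0.5)
preds = lssvm_predict(np.array([[0.2, 0.3], [2.1, 2.0]]), X, y, b, alpha, sig2=0.5)
```

Unlike the standard SVM, which requires a quadratic programme, the least-squares formulation reduces training to one linear solve, so a grid search over γ and σ² is comparatively cheap.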
5 Results and discussions
Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to establish the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
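The two validation schemes can be sketched as index splits. The random shuffling, the rounding of the 70% share and the function names are assumptions; with rounding, the hold-out sizes agree with the training and testing counts listed in Table 1.

```python
import random

def holdout_split(n, train_frac=0.7, seed=0):
    """ConV: one random split into training and testing indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(train_frac * n)
    return idx[:n_train], idx[n_train:]

def tenfold_splits(n, folds=10, seed=0):
    """CrossV: yield (train, test) index lists; each sample is tested once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    parts = [idx[f::folds] for f in range(folds)]
    for f in range(folds):
        train = [i for g, part in enumerate(parts) if g != f for i in part]
        yield train, parts[f]

# Frame-based MEEI experiment: 1557 + 1537 = 3094 feature vectors.
train_idx, test_idx = holdout_split(3094)
cv_folds = list(tenfold_splits(226))   # file-based MEEI experiment
```

Averaging a classifier's accuracy over the ten folds gives the mean and standard deviation style of result reported for CrossV.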
To test the classifier performance, three measures are considered: sensitivity, specificity and overall accuracy. They are calculated from the counts of true positives (TP: the classifier outputs pathological when pathological samples are present), true negatives (TN: the classifier outputs normal when normal samples are present), false positives (FP: the classifier outputs pathological when normal samples are present) and false negatives (FN: the classifier outputs normal when pathological samples are present). These measures are given in Equations (9), (10) and (11), respectively.
\[ \text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9} \]

\[ \text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10} \]

\[ \text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11} \]
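Equations (9) to (11) translate directly into code. The counts in the example are hypothetical and chosen only to match the class sizes of the file-based MEEI set (173 pathological, 53 normal); they are not results from the paper.

```python
def confusion_counts(y_true, y_pred, positive="pathological"):
    """Count TP, TN, FP and FN, treating `positive` as the pathological class."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p != positive:
            tn += 1
        elif t != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and overall accuracy, Equations (9)-(11)."""
    return {
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
        "ACC": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical counts: 173 pathological and 53 normal samples.
m = metrics(tp=170, tn=52, fp=1, fn=3)
counts = confusion_counts(
    ["pathological", "pathological", "normal", "normal"],
    ["pathological", "normal", "normal", "pathological"],
)
```

With unbalanced classes, as in the file-based MEEI set, sensitivity and specificity are more informative together than overall accuracy alone.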
In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. The sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of file-based analysis is always better than that of frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. A maximum classification accuracy of above 95% is obtained using the raw features, and 100% is obtained using the weighted features, during the frame-based analysis
Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are percentages (mean ± standard deviation).

| Classifier | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC | Parameter |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 | k-value 1–10 |
| k-NN | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 | |
| k-NN | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | k-value 1–10 |
| k-NN | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | |
| LS-SVM | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 | 10 000 01 (*) |
| LS-SVM | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 | |
| LS-SVM | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 | 10 01 (*) |
| LS-SVM | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 | |
| PNN | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 | 0.01–0.055 |
| PNN | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 | |
| PNN | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 | 0.01–0.055 |
| PNN | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 | |
| GRNN | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 | 0.01–0.055 |
| GRNN | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 | |
| GRNN | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 | 0.01–0.055 |
| GRNN | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 | |

(*) LS-SVM parameter digits (γ and σ²) are reproduced as they appear in the source; decimal points and separators are not recoverable.
Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

                                                   Raw features                        Weighted features
Classifier  Experiment   Parameters    Validation  SE          SP          ACC         SE          SP          ACC
k-NN        Frame-based  k-value 1–10  CrossV      94.49±0.36  96.58±0.39  95.51±0.21  98.67±0.23  98.76±0.25  98.71±0.23
                                       ConV        94.20±0.40  96.29±0.33  95.21±0.32  100±0.00    100±0.00    100±0.00
            File-based   k-value 1–10  CrossV      77.10±7.84  66.92±3.30  70.42±3.65  100±0.00    100±0.00    98.33±1.64
                                       ConV        77.56±7.11  67.51±2.48  69.71±2.70  99.88±0.40  99.88±0.40  97.50±2.16
LS-SVM      Frame-based  10, 0.01      CrossV      95.50±0.18  97.28±0.24  96.37±0.16  99.04±0.08  99.04±0.08  99.24±0.06
                                       ConV        96.86±0.81  95.24±0.67  96.03±0.40  98.71±0.68  98.71±0.68  99.02±0.41
            File-based   10, 1         CrossV      79.31±3.33  68.87±2.40  72.92±2.60  100±0.00    100±0.00    95.83±0.00
                                       ConV        74.53±7.82  84.00±13.01 77.14±8.78  100±0.00    100±0.00    96.43±3.76
PNN         Frame-based  0.01–0.055    CrossV      94.88±0.20  96.23±0.26  95.55±0.21  99.82±0.07  99.82±0.07  99.91±0.04
                                       ConV        94.52±0.31  96.03±0.35  95.26±0.25  99.76±0.07  99.76±0.07  99.88±0.03
            File-based   0.01–0.055    CrossV      72.30±3.96  70.53±3.93  70.83±1.70  98.57±4.52  98.57±4.52  99.17±2.63
                                       ConV        69.02±5.33  69.18±4.10  67.93±3.86  98.60±2.73  98.60±2.73  99.00±1.72
GRNN        Frame-based  0.01–0.055    CrossV      90.41±3.78  91.79±3.35  90.98±3.45  99.59±0.29  99.59±0.29  99.71±0.23
                                       ConV        90.02±3.61  91.78±3.16  90.72±3.24  99.53±0.30  99.53±0.30  99.66±0.24
            File-based   0.01–0.055    CrossV      69.43±8.51  70.75±4.86  66.67±7.73  97.13±6.40  97.13±6.40  93.75±9.27
                                       ConV        62.05±9.40  64.51±3.83  59.57±8.50  97.38±5.68  97.38±5.68  93.07±9.91
Table 4. Computation time (in seconds) for feature extraction and classification.

Classification
                                                   MEEI database         MAPACI database
Classifier  Experiment   Parameters    Validation  Weighted   Raw        Weighted   Raw
k-NN        Frame-based  k-value 1–10  CrossV      0.338      0.341      0.342      0.346
                                       ConV        3.806      3.821      6.931      6.955
            File-based   k-value 1–10  CrossV      0.036      0.038      0.036      0.036
                                       ConV        1.759      1.754      3.001      3.206
LS-SVM      Frame-based  10, 0.01      CrossV      1.139      1.175      1.191      1.214
                                       ConV        0.718      0.698      0.705      0.723
            File-based   10, 1         CrossV      0.005      0.005      0.002      0.002
                                       ConV        0.006      0.006      0.004      0.004
PNN         Frame-based  0.01–0.055    CrossV      2.458      3.084      2.445      2.421
                                       ConV        10.340     10.356     11.180     11.202
            File-based   0.01–0.055    CrossV      1.078      1.124      1.093      1.062
                                       ConV        4.250      4.266      4.072      4.047
GRNN        Frame-based  0.01–0.055    CrossV      2.587      2.571      2.676      2.582
                                       ConV        10.726     10.648     11.634     11.584
            File-based   0.01–0.055    CrossV      1.098      1.056      1.065      1.062
                                       ConV        4.302      4.260      4.049      4.066

Feature extraction
                     MEEI                      MAPACI
Experiment           Normal     Pathological   Normal     Pathological
File-based           0.549      0.034          0.109      0.130
Frame-based          0.591      0.181          1.734      1.737
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.
Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed under the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them do not report the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.
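The sensitivity (SE), specificity (SP) and accuracy (ACC) columns of Tables 2 and 3 can be illustrated with a short sketch. It assumes the standard confusion-matrix definitions and treats the pathological class as the positive class; neither choice is stated in this section, and the function name `binary_metrics` and the toy labels are purely illustrative.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) in percent.

    Assumption: `positive` marks the pathological class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against empty classes so the ratios stay defined.
    se = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    sp = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")
    acc = 100.0 * (tp + tn) / len(pairs) if pairs else float("nan")
    return se, sp, acc


# Toy example: four pathological (1) and two normal (0) samples.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
se, sp, acc = binary_metrics(y_true, y_pred)
print(f"SE={se:.2f}% SP={sp:.2f}% ACC={acc:.2f}%")
```

In a cross-validated experiment, these three values would be computed per fold and then summarised as mean ± standard deviation, which is the form in which the tables report them.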
From the literature, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend heavily on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering-based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
6. Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means
clustering-based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorders database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering-based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres, and by using different kernel functions. The proposed method could also be extended to detect specific types of disorders and to develop an online diagnosing system.
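The sensitivity of k-means to its initial cluster positions, and the remedy of running it several times, can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB implementation: it assumes random initialisation from the data points and keeps the run with the lowest within-cluster sum of squared distances. The selection criterion, the function names and the toy data are all assumptions, and the feature weighting step itself is not shown.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda c: sq_dist(p, centres[c]))
            for p in points]


def kmeans(points, k, rng, n_iter=100):
    """One k-means run started from k randomly chosen data points."""
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = assign(points, centres)
        new_centres = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # An empty cluster keeps its previous centre.
            new_centres.append(
                [sum(col) / len(members) for col in zip(*members)]
                if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    labels = assign(points, centres)
    cost = sum(sq_dist(p, centres[l]) for p, l in zip(points, labels))
    return centres, labels, cost


def best_of_restarts(points, k, n_restarts=10, seed=0):
    """Repeat k-means and keep the run with the lowest cost."""
    rng = random.Random(seed)
    runs = [kmeans(points, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[2])


# Toy data: two well-separated groups of three points each.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
        [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centres, labels, cost = best_of_restarts(data, k=2)
print(sorted(centres), round(cost, 4))
```

A seeding scheme that spreads the initial centres apart would be one example of the "efficient initialisation" mentioned as future work; the restart loop above is only the simpler strategy described in the text.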
Acknowledgements
This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.
Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the same department with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).
Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has also been a member of IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab.
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
International Journal of Systems Science, 2014
Vol. 45, No. 8, 1622–1634, http://dx.doi.org/10.1080/00207721.2013.794905
A new feature constituting approach to detection of vocal fold pathology
M. Hariharan^a,*, Kemal Polat^b and Sazali Yaacob^a

^a School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP), Campus Pauh Putra, Perlis, Malaysia; ^b Department of Electrical and Electronics Engineering, Faculty of Engineering and Architecture, Abant Izzet Baysal University, Bolu, Turkey

(Received 27 September 2012; final version received 3 April 2013)
In the last two decades, non-invasive methods based on acoustic analysis of the voice signal have proved to be an excellent and reliable tool for diagnosing vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering-based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and the MAPACI speech pathology database, are used. Four different supervised classifiers, namely k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network, are employed for testing the proposed features. The experimental results show that the proposed features give a very promising classification accuracy of 100% for both the MEEI database and the MAPACI speech pathology database.

Keywords: acoustic analysis; vocal fold pathology; feature extraction; feature weighting and classification

1. Introduction
Voice is a highly multivariate component of speech, and its quantitative description has led to the development of clinical tools. To detect vocal fold pathology, medical professionals use subjective techniques (invasive methods) such as direct inspection of the vocal folds and observation of the vocal folds with endoscopic instruments. These techniques are expensive, risky, time consuming and uncomfortable for the patients, and they require costly resources such as special light sources, endoscopic instruments and specialised video camera equipment. To circumvent these problems, non-invasive methods have been developed to help medical professionals with early detection of vocal fold pathology. With the rapid development of signal processing techniques, the vocal/voice signal can be used for the detection of vocal fold pathologies, and its quantitative information plays an important role in understanding the process of vocal fold pathology formation. In the last 20 years, many research works have been carried out on the automatic detection and classification of vocal fold pathologies by means of acoustic analysis, parametric and non-parametric feature extraction, automatic pattern recognition or statistical methods (Kasuya, Ogawa, Mashima, and Ebihara 1986; Feijoo and Hernandez 1990; Boyanov, Ivanov, Hadjitodorov, and Chollet 1993; Deliyski 1993; Kasuya, Endo, and Saliu 1993; Boyanov and Hadjitodorov 1997; Hernandez-Espinosa, Gomez-Vilda, Godino-Llorente, and Aguilera-Navarro 2000; Martinez and Rufiner 2000; Godino-Llorente, Gomez-Vilda, and Blanco-Velasco 2006).

*Corresponding author. Email: hari@unimap.edu.my
A large number of acoustic parameters have been proposed, and their effectiveness has been proven by experimental research. The important parameters are pitch (Boyanov et al. 1993), jitter (Feijoo and Hernandez 1990; Kasuya et al. 1993), shimmer (Ludlow, Bassich, Connor, Coulter, and Lee 1987; Kasuya et al. 1993), harmonics-to-noise ratio (Yumoto, Sasaki, and Okamura 1984; Krom 1993) and normalised noise energy (Kasuya et al. 1986). In the last two decades, time-frequency/scale analyses (wavelet/wavelet packet transforms) have been used as tools to analyse all kinds of problems in signal and image processing. Since the speech/voice signal is highly non-stationary, the Fourier transform is not a very useful tool for analysing it, as the time-domain information is lost when the frequency transformation is performed (Pandian, Sazali, and Muthusamy 2008; Paulraj, Sazali, and Hariharan 2009). Time-frequency/scale analysis (wavelet/wavelet packet transform) is a good tool for the analysis of non-stationary signals in both the time and the frequency scale (Pandian et al. 2008; Paulraj et al. 2009). Hence, wavelet and wavelet packet analysis has potential for the identification of vocal fold pathology.
© 2013 Taylor & Francis

Nayak and Bhat (2003) have presented a procedure to identify pathological disorders of the larynx using wavelet analysis. Nayak, Bhat, Acharya, and Aithal (2005) have proposed a method for the classification and analysis of speech abnormalities based on wavelet analysis and an artificial neural network. Fonseca, Guido, Scalassara, Maciel, and Pereira (2007) have presented wavelets and a least-square support vector machine (LS-SVM) for the identification of vocal fold pathology. Salhi, Talbi, and Cherif (2008) have proposed a hybrid approach using wavelet analysis and a multilayer neural network for vocal fold pathologies. Kukharchik, Kheidorov, Bovbel, and Ladeev (2008) have presented the wavelet transform and a support vector machine (SVM) for vocal fold pathology detection. Crovato and Schuck (2007) presented a vocal fold pathology (dysphonic voice) classification system using the wavelet packet transform (WPT) and the best basis algorithm (BBA) for dimensionality reduction, with six artificial neural networks acting as classification systems. A hierarchical system for the diagnosis of vocal fold pathologies based on wavelets and SVM has been proposed by Nikkhah-Bahrami, Ahmadi-Noubari, Seyed Aghazadeh, and Khadivi Heris (2009). Arjmandi and Pooyan (2011) have applied linear discriminant analysis and wavelet packets for vocal fold pathology assessment. Erfanian Saeedi, Almasganj, and Torabinejad (2011) have proposed adaptive wavelets for vocal fold pathology assessment. In Khadivi Heris, Seyed Aghazadeh, and Nikkhah-Bahrami (2009), WPT with non-linear features was used for optimal selection of features to assess vocal fold pathologies. Azadi and Almasganj (2011) have proposed a partitioning and biased support vector machine (PBSVM)-based classifier for vocal fold pathology assessment using labelled and unlabelled data-sets. The accuracy of their results varied from approximately 85% to 100% under different experiments. From the previous work, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental researchers. It is not easy to compare their results, since they conducted the analysis with different data-set sizes, different databases and different presentations of results. A cross-validation scheme and the use of more than one database are the possible ways to prove the reliability of the results of various feature extraction and classification methods (Saenz-Lechon, Godino-Llorente, Osma-Ruiz, and Gomez-Vilda 2006).
In this paper, a new feature vector is proposed based on WPT and singular value decomposition for the automatic detection of vocal fold pathology. k-means clustering-based feature weighting is proposed to increase the discrimination ability of the proposed features. Two different databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database (Kay Elemetrics Inc. 1994) and the MAPACI speech pathology database (MAPACI 2004), are used to test the robustness of the algorithms and their independence from the databases. Two different experiments (frame-based and file-based) are carried out using the voice samples of these databases. Four different supervised classifiers, namely k-nearest neighbour (k-NN), LS-SVM, probabilistic neural network (PNN) and general regression neural network (GRNN), are employed for the identification of vocal fold pathology. Two data validation schemes are used (conventional validation [ConV] and tenfold cross-validation [CrossV]) in order to test the effectiveness of the proposed features and the reliability of the classification results.
2. Database
In the area of automatic detection of vocal fold pathology, the only commercially available database is the MEEI voice disorders database (Kay Elemetrics Inc. 1994). The database, developed by the MEEI Voice and Speech Labs, contains 53 normal and 657 pathological voice samples. The voice samples are sustained phonations of the vowel /ah/ (1–3 s long) and readings (12 s) of the 'Rainbow Passage' from patients with normal voices and a wide variety of organic, neurological, traumatic, and psychogenic voice disorders in different stages. All the voice samples were collected in a controlled environment and sampled at 25 kHz or 50 kHz with 16 bits of resolution. To test the effectiveness of the proposed method, a total of 226 voice samples of sustained phonation of the vowel /ah/ (173 pathological and 53 normal) are used and downsampled to 25 kHz for our analysis. Two different experiments are performed (frame-based and file-based). In frame-based analysis, voice samples are segmented into frames 40 ms long using a Hamming window with 50% overlap. Each windowed frame is then parameterised by means of WPT and SVD (singular value decomposition). In file-based analysis, the 173 pathological and 53 normal voice samples are subjected to feature extraction. The second data-set is taken from the MAPACI speech pathology database (MAPACI 2004); all its speech samples were recorded at 44,100 Hz during the lifetime of the MAPACI project. The recording device was a Sennheiser headset microphone. The speech database consists of 48 voice samples (12 normal and 12 pathological males + 12 normal and 12 pathological females); the speakers' ages ranged from 20 to 68 years, with an average of 36.75 years and a standard deviation of 14.35 years. Frame-based and file-based analyses are conducted on this database as well. Figure 1 shows pathological and normal voice samples from the MEEI database and the MAPACI speech pathology database.
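The frame-based segmentation described above can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB code: the function names, the synthetic 200 Hz test tone, and the choice to drop a trailing partial frame are assumptions made here. At 25 kHz, a 40 ms frame is 1000 samples and a 50% overlap gives a hop of 500 samples.

```python
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frame_signal(signal, fs=25000, frame_ms=40, overlap=0.5):
    """Split a signal into Hamming-windowed frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    frame_len = int(round(fs * frame_ms / 1000))   # 1000 samples at 25 kHz
    hop = int(round(frame_len * (1 - overlap)))    # 500 samples for 50% overlap
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames

# One second of a synthetic 200 Hz tone stands in for a voice sample.
one_second = [math.sin(2 * math.pi * 200 * t / 25000) for t in range(25000)]
frames = frame_signal(one_second)
print(len(frames), len(frames[0]))  # 49 frames of 1000 samples each
```

Each resulting frame would then be passed to the WPT and SVD stages described in Section 3.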
3. WPT and SVD
Various feature extraction methods have been proposed for robust representation of normal and pathological speech. In the last decade, time-frequency analysis has been used for this purpose. The wavelet transform provides a time-frequency representation of the signal. It decomposes the signal over dilated and translated wavelets. A wavelet is a waveform of effectively limited duration that has an average value of zero. The wavelet transform is defined as the convolution of a signal f(t) with a wavelet function ψ(t) shifted in time by a translation parameter and dilated by a scale parameter. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Bopardikar 2000):

Figure 1. Plots of pathological and normal voice samples.
$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{1}$$
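Here a is the scale (dilation) parameter and b is the translation parameter. In the wavelet packet tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases at a parent node (i, p) are given by the following forms: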
$$\psi_{i+1}^{2p}(k) = \sum_{n=-\infty}^{\infty} l[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{2}$$

where l[n] is a low-pass (scaling) filter, and

$$\psi_{i+1}^{2p+1}(k) = \sum_{n=-\infty}^{\infty} h[n]\, \psi_{i}^{p}(k - 2^{i} n) \tag{3}$$
where h[n] is the high-pass (wavelet) filter. The wavelet packet is an extension of the wavelet transform. In the WPT decomposition procedure, a signal is first decomposed into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). Wavelet packet decomposition then partitions both the lower and the higher frequency bands into smaller bands, which cannot be achieved with the general discrete wavelet transform (it splits only the approximation band further). Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.
In this work, the automatic detection of vocal fold pathology is carried out using WPT and SVD. In frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). In frame-based analysis, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame is decomposed to five levels using WPT, which yields 32 subbands. The fifth-level wavelet packet coefficients of the 32 subbands are arranged in a matrix of size 32 × M for further processing, whose rows correspond to wavelet packet subbands and whose columns correspond to wavelet packet coefficients:
$$A = \left[\, C_{5}^{1}(M) \;\; C_{5}^{2}(M) \;\; \cdots \;\; C_{5}^{32}(M) \,\right]^{T} \tag{4}$$
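To make the construction of the 32 × M coefficient matrix concrete, the following sketch builds a full five-level wavelet packet tree. It deliberately substitutes the two-tap Haar filter pair for the 'db4' wavelet used in the paper, so that no filter coefficients need to be quoted, and it assumes a frame length that is a multiple of 32. The function names and the test frame are illustrative only.

```python
import math

def haar_split(x):
    """One analysis step: low-pass and high-pass outputs, downsampled by 2."""
    s = 1.0 / math.sqrt(2.0)
    low = [s * (x[2 * n] + x[2 * n + 1]) for n in range(len(x) // 2)]
    high = [s * (x[2 * n] - x[2 * n + 1]) for n in range(len(x) // 2)]
    return low, high

def wavelet_packet_matrix(x, levels=5):
    """Full wavelet packet tree; returns 2**levels subbands of M coefficients.

    Subbands come out in natural (filter-bank) order, not frequency order.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels in this sketch")
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # Unlike the plain DWT, both children are split again.
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes

# Synthetic 1024-sample frame (hypothetical data, not from either database).
frame = [math.sin(0.05 * n) + 0.3 * math.sin(1.7 * n) for n in range(1024)]
A = wavelet_packet_matrix(frame, levels=5)
print(len(A), len(A[0]))  # 32 subbands, M = 32 coefficients each
```

Because the Haar pair is orthonormal, the total energy of the coefficients in A equals the energy of the input frame, which is a convenient sanity check for any implementation of the decomposition.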
Figure 2. Overall diagram of the proposed system.
To derive a simple and effective feature vector, singular value decomposition is applied. In file-based analysis, each voice sample is decomposed to five levels using WPT and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the 'db4' wavelet is used in our study. A low order was chosen in order to model the transients and rapid variations in a signal efficiently.
3.1. Singular value decomposition
Singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix A (an n × p matrix) of wavelet packet coefficients to a much smaller, invertible, and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that a given n × p matrix A is decomposed as

$$A = U D V^{T} \tag{5}$$

The columns of U are orthonormal eigenvectors of AA^T and are called left-singular vectors. The columns of V are orthonormal eigenvectors of A^TA and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A. In this work, the 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
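As a small worked illustration (not taken from the paper), the relation between the singular values and the eigenvalues of AA^T can be checked by hand on a 2 × 2 matrix. The singular values are the square roots of the eigenvalues of AA^T; for the 32 × M coefficient matrix, the same relation yields the 32 values that form the feature vector.

```latex
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
A A^{T} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\det(A A^{T} - \lambda I) = \lambda^{2} - 3\lambda + 1 = 0
\;\Rightarrow\; \lambda_{1,2} = \frac{3 \pm \sqrt{5}}{2}

\sigma_{1} = \sqrt{\lambda_{1}} \approx 1.618, \qquad
\sigma_{2} = \sqrt{\lambda_{2}} \approx 0.618, \qquad
\sigma_{1}\sigma_{2} = |\det A| = 1
```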
3.2. k-Means clustering based feature weighting
In many practical machine learning applications, the performance of the learned models degrades because of irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method: they transform the extracted non-linearly separable features into linearly separable features and map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are useful not only for studying the similarity or dissimilarity of the features but also for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as a feature weighting method, since it is simple and widely used in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering partitions n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set x_i, i = 1, 2, 3, ..., n, the k-means clustering algorithm finds the cluster centres c_i and the membership matrix U iteratively using the following four steps:
Step 1: Randomly initialise the cluster centres c_i, i = 1, 2, 3, ..., c.

Step 2: Determine the membership matrix U using Equation (6):

$$u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^{2} \le \|x_j - c_k\|^{2} \text{ for each } k \ne i \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \; \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^{2} \tag{7}$$

where G_i is the set of data points assigned to cluster i. Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8}$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.

Figure 3. Working of k-means clustering based feature weighting method.
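The four steps can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tolerance, the handling of empty clusters, and the option to pass initial centres (used here to make the example deterministic) are all choices made for illustration. The membership matrix U is stored compactly as one label per point.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, c, init_centres=None, tol=1e-9, max_iter=100, seed=0):
    """Steps 1-4 of k-means; returns (centres, labels, cost J)."""
    rng = random.Random(seed)
    # Step 1: initial centres (random data points unless supplied).
    centres = [list(p) for p in (init_centres or rng.sample(points, c))]
    prev_cost = None
    for _ in range(max_iter):
        # Step 2: u_ij = 1 for the nearest centre, kept as a label per point.
        labels = [min(range(c), key=lambda i: sq_dist(p, centres[i]))
                  for p in points]
        # Step 3: cost J, the sum of squared distances to assigned centres.
        cost = sum(sq_dist(p, centres[l]) for p, l in zip(points, labels))
        if cost < tol or (prev_cost is not None and prev_cost - cost < tol):
            break
        prev_cost = cost
        # Step 4: each centre becomes the mean of its group G_i.
        for i in range(c):
            group = [p for p, l in zip(points, labels) if l == i]
            if group:  # an empty cluster keeps its previous centre
                centres[i] = [sum(col) / len(group) for col in zip(*group)]
    return centres, labels, cost

points = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
centres, labels, cost = kmeans(points, 2,
                               init_centres=[[0.0, 0.0], [10.0, 10.0]])
print(centres, labels, cost)  # [[0.0, 1.0], [10.0, 11.0]] [0, 0, 1, 1] 4.0
```

With random initialisation the result can depend on the starting centres, which is why the clustering is typically repeated from several starts.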
After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
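The weighting step itself reduces to a per-feature scaling, sketched below. The text does not spell out how a single centre value per feature is obtained from the k-means output, so this sketch simply takes one centre value per feature as an input; that interface, the function name, and the numbers in the example are assumptions made for illustration.

```python
def weight_features(X, feature_centres):
    """Scale each feature column by mean(feature) / centre(feature).

    X is a list of feature vectors; feature_centres holds one centre value
    per feature (how it is derived from k-means is left open here).
    """
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    ratios = [m / c for m, c in zip(means, feature_centres)]
    weighted = [[r * v for r, v in zip(ratios, row)] for row in X]
    return weighted, ratios

X = [[1.0, 10.0], [3.0, 30.0]]
centres = [4.0, 10.0]   # hypothetical per-feature centre values
weighted, ratios = weight_features(X, centres)
print(ratios)    # [0.5, 2.0]
print(weighted)  # [[0.5, 20.0], [1.5, 60.0]]
```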
Raw features are obtained through feature extraction. k-means clustering based feature weighting then preprocesses the raw features to give the weighted features. Figure 4(a) and (b) shows the class distribution of raw and weighted SVD features for the MEEI database. The class distribution of raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features is increased by applying the k-means clustering based feature weighting method.
4. Classifiers
In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011), and the k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN, and GRNN.
4.1. k-NN classifier
In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork 2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is thus determined by the classes of its k nearest neighbours.

Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).

Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Comparison of results (frame-based, MEEI database, raw features).
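The majority-vote rule of the k-NN classifier can be sketched in a few lines. The function name, the use of squared Euclidean distance, and the toy two-dimensional data are assumptions for illustration; ties between classes are resolved in favour of the class encountered first among the nearest neighbours.

```python
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label of `query` by majority vote among its k nearest training vectors."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], query)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["normal"] * 3 + ["pathological"] * 3
print(knn_predict(train_X, train_y, [0.5, 0.5], k=3))  # normal
print(knn_predict(train_X, train_y, [5.5, 5.5], k=3))  # pathological
```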
4.2. Probabilistic neural network
Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). A PNN comprises four types of units: input units, pattern units, summation units, and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern units compute the distances from the input vector to the training input vectors and produce a vector whose elements indicate how close the input is to each training input. The summation units sum these contributions for each class of inputs and produce a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
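The pattern, summation, and output units described above can be sketched as a Parzen-window classifier with Gaussian kernels. This is an illustrative reading, not the MATLAB network used in the paper: the function name, the per-class averaging, and the toy one-dimensional data are assumptions, and the example uses a larger σ than the 0.01 to 0.055 range because the toy features are not normalised.

```python
import math

def pnn_predict(train_X, train_y, query, sigma=0.05):
    """Return (predicted class, per-class scores) for one query vector."""
    sums, counts = {}, {}
    for x, label in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        # Pattern unit: exponential activation of the squared distance.
        activation = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation unit: accumulate the activations per class.
        sums[label] = sums.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    scores = {label: sums[label] / counts[label] for label in sums}
    # Output unit: winner-takes-all over the class scores.
    return max(scores, key=scores.get), scores

train_X = [[0.0], [0.2], [1.0], [1.2]]
train_y = [0, 0, 1, 1]          # 0 = normal, 1 = pathological (toy labels)
label, scores = pnn_predict(train_X, train_y, [0.1], sigma=0.5)
print(label)  # 0
```

A very small σ makes every kernel extremely narrow, so a query far from all training vectors receives near-zero scores for every class; this is one reason the spread factor has to be tuned.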
4.3. General regression neural network
GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. Because this network does not require an iterative training procedure, it learns much faster than a multilayer perceptron (MLP); it has also been reported to be more accurate than MLP and relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer, summation layer, and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.

Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | (data) | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| Frame-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| Frame-based | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | (data) | 173 pathological + 53 normal | 24 pathological + 24 normal |
| File-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| File-based | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |

Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
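The GRNN output can be sketched as a kernel-weighted average of the training targets, which is the usual Parzen-window form of the estimator. The function name, the fallback used when every kernel underflows to zero, and the 0.5 decision threshold mentioned afterwards are assumptions of this sketch rather than details given in the paper.

```python
import math

def grnn_predict(train_X, train_y, query, sigma=0.05):
    """Kernel-weighted average of the training targets for one query vector."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_X]
    # Pattern layer: one Gaussian kernel per training vector.
    weights = [math.exp(-d / (2.0 * sigma ** 2)) for d in d2]
    # Summation layer: numerator and denominator of the weighted average.
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed (query far from all samples for this sigma):
        # fall back to the target of the nearest training vector.
        return float(train_y[d2.index(min(d2))])
    # Output layer: the ratio of the two sums.
    return sum(w * y for w, y in zip(weights, train_y)) / total

train_X = [[0.0], [0.1], [1.0], [1.1]]
train_y = [0, 0, 1, 1]          # 0 = normal, 1 = pathological (toy targets)
print(grnn_predict(train_X, train_y, [0.05], sigma=0.2) < 0.5)  # True
print(grnn_predict(train_X, train_y, [1.05], sigma=0.2) > 0.5)  # True
```

Since the target is continuous, a two-class decision can be obtained by thresholding the output, for example at 0.5 when the classes are coded as 0 and 1.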
4.4. Support vector machine
In this work, SVM is used as a classifier; it is a promising method for solving non-linear classification, function estimation, density estimation, and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to which side of the hyperplane it lies on, that is, the sign of its distance from the hyperplane.
SVM models are built around a kernel function that transforms the input data into an n-dimensional space in which a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel, and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used, since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, as it changes the shape (flexion) of the hyperplane.
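The role of σ² in the RBF kernel can be seen from a direct evaluation. The sketch below assumes the convention K(x, z) = exp(−‖x − z‖² / σ²); some toolboxes divide by 2σ² instead, so the exact scaling should be checked against the toolbox documentation. The function names and the toy vectors are illustrative.

```python
import math

def rbf_kernel(x, z, sig2):
    """K(x, z) = exp(-||x - z||^2 / sig2) (assumed scaling convention)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / sig2)

def kernel_matrix(X, sig2):
    """Symmetric matrix of pairwise kernel values for the rows of X."""
    return [[rbf_kernel(a, b, sig2) for b in X] for a in X]

X = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K_wide = kernel_matrix(X, sig2=1.0)
K_narrow = kernel_matrix(X, sig2=0.01)
print(round(K_wide[0][1], 4))   # 0.3679
print(K_narrow[0][1] < 1e-40)   # True
```

A small σ² makes the kernel matrix close to the identity, so each training point influences only its immediate neighbourhood, while a large σ² smooths the decision boundary.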
In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter 2010) is used to perform classification of normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
5. Results and discussions
Many research works have already been carried out in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to test the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
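The two validation schemes can be sketched as index splits. The paper does not state how the random partitioning was implemented, so the shuffling, the seed, and the rounding rule below are assumptions; with simple rounding of 70% of the sample count, the split sizes happen to reproduce those listed in Table 1.

```python
import random

def conv_split(n_samples, train_fraction=0.7, seed=0):
    """Conventional validation: one random train/test split of the indices."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(n_samples * train_fraction)
    return idx[:n_train], idx[n_train:]

def tenfold_splits(n_samples, folds=10, seed=0):
    """Cross-validation: yield (train, test) index lists, one pair per fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    for f in range(folds):
        test = idx[f::folds]          # every folds-th index, offset by f
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test

# Sample counts from Table 1: frame-based and file-based, MEEI and MAPACI.
sizes = {n: tuple(len(part) for part in conv_split(n))
         for n in (3094, 3120, 226, 48)}
print(sizes)
# {3094: (2166, 928), 3120: (2184, 936), 226: (158, 68), 48: (34, 14)}
```

In the tenfold scheme every sample is used for testing exactly once, which is why the cross-validated figures are less dependent on one particular split than the conventional 70/30 validation.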
To assess classifier performance, three measures are considered: sensitivity, specificity, and overall accuracy. They are calculated from the counts of true positives (TP: the classifier labels a sample as pathological when it is pathological), true negatives (TN: the classifier labels a sample as normal when it is normal), false positives (FP: the classifier labels a sample as pathological when it is normal), and false negatives (FN: the classifier labels a sample as normal when it is pathological). These measures are given in Equations (9), (10), and (11), respectively:
$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
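Equations (9) to (11) translate directly into code. In the sketch below, the function name, the string labels, and the ten-sample example are made up for illustration; the function assumes that both classes occur in the true labels, since otherwise one of the denominators is zero.

```python
def classification_measures(y_true, y_pred, positive="pathological"):
    """Return (sensitivity, specificity, overall accuracy) as fractions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    se = tp / (tp + fn)                      # Equation (9)
    sp = tn / (tn + fp)                      # Equation (10)
    acc = (tp + tn) / (tp + tn + fp + fn)    # Equation (11)
    return se, sp, acc

P, N = "pathological", "normal"
y_true = [P, P, P, P, N, N, N, N, N, N]
y_pred = [P, P, P, N, N, N, N, N, N, P]
se, sp, acc = classification_measures(y_true, y_pred)
print(se, round(sp, 4), acc)  # 0.75 0.8333 0.8
```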
In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity, and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of file-based analysis is generally better than that of frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN, and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy above 95% is obtained using the raw features and 100% using the weighted features.
Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | k-value 1–10 | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | k-value 1–10 | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | k-value 1–10 | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | 10 000 01 | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | 10 000 01 | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | 10 01 | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | 10 01 | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | 0.01–0.055 | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | 0.01–0.055 | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | 0.01–0.055 | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | 0.01–0.055 | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | 0.01–0.055 | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |
Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | k-value 1–10 | Frame-based | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | k-value 1–10 | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | k-value 1–10 | File-based | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | 10 001 | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | 10 001 | Frame-based | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | 10 1 | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | 10 1 | File-based | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | 0.01–0.055 | Frame-based | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | 0.01–0.055 | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | 0.01–0.055 | File-based | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | 0.01–0.055 | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | 0.01–0.055 | File-based | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |
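Table 4. Computation time for feature extraction and classification.

Classification

| Classifier | Parameter | Type of experiment | Type of validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features |
|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | k-value 1–10 | Frame-based | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | k-value 1–10 | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | k-value 1–10 | File-based | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | 10 001 | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | 10 001 | Frame-based | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | 10 1 | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | 10 1 | File-based | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | 0.01–0.055 | Frame-based | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | 0.01–0.055 | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | 0.01–0.055 | File-based | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | 0.01–0.055 | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | 0.01–0.055 | File-based | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |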
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that, during the frame-based analysis, a maximum accuracy above 95% is obtained using the raw features and 100% using the weighted features. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.
Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB of RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under the two different experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of those works have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with low computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. In comparison, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
6. Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers (k-NN, LS-SVM, PNN, and GRNN) are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorders database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could also be extended to detect specific types of disorders and to develop an online diagnosing system.
Acknowledgements
This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and his Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering; Computer Methods and Programs in Biomedicine; Artificial Intelligence in Biomedicine; International Journal of Phoniatrics, Speech Therapy and Communication Pathology; and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the same department with an MSc degree in 2004. He took his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling, and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics, and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, in Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co, pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
of speech abnormalities based on wavelet analysis and artificial neural network. Fonseca, Guido, Scalassara, Maciel, and Pereira (2007) have presented wavelet and least-squares support vector machine (LS-SVM) for the identification of vocal fold pathology. Salhi, Talbi, and Cherif (2008) have proposed a hybrid approach using wavelet analysis and multilayer neural network for the vocal fold pathologies. Kukharchik, Kheidorov, Bovbel, and Ladeev (2008) have presented wavelet transform and support vector machine (SVM) for vocal fold pathology detection. Crovato and Schuck (2007) presented a vocal fold pathology (dysphonic voice) classification system using the wavelet packet transform (WPT) and the best basis algorithm (BBA) for dimensionality reduction, with six artificial neural networks acting as classification systems. A hierarchical system for the diagnosis of vocal fold pathologies based on wavelets and SVM has been proposed by Nikkhah-Bahrami, Ahmadi-Noubari, Seyed Aghazadeh, and Khadivi Heris (2009). Arjmandi and Pooyan (2011) have applied linear discriminant analysis and wavelet packets for vocal fold pathology assessment. Erfanian Saeedi, Almasganj, and Torabinejad (2011) have proposed adaptive wavelets for vocal fold pathology assessment. In Khadivi Heris, Seyed Aghazadeh, and Nikkhah-Bahrami (2009), WPT with non-linear features was used for optimal selection of features to assess the vocal fold pathologies. Azadi and Almasganj (2011) have proposed a partitioning and biased support vector machine (PBSVM)-based classifier for vocal fold pathology assessment using labelled and unlabelled data-sets. The accuracy of their results varied from approximately 85% to 100% under different experiments. From the previous work, it is observed that the reliability of wavelet and WPT-based features has been proven by many experimental researchers. It is not easy to compare their results, since they conducted the analysis with different sizes of data-set, different databases and different presentations of results. A cross-validation scheme and the use of more than one database are the possible ways to prove the reliability of the results of various feature extraction and classification methods (Saenz-Lechon, Godino-Llorente, Osma-Ruiz, and Gomez-Vilda 2006).
In this paper, a new feature vector is proposed based on WPT and singular value decomposition for the automatic detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the discrimination ability of the proposed features. Two different databases, the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database (Kay Elemetrics Inc. 1994) and the MAPACI speech pathology database (MAPACI 2004), are used to test the robustness of the algorithms and their independence from the database. Two different experiments (frame-based and file-based) are carried out using the voice samples of these databases. Four different supervised classifiers, namely k-nearest neighbour (k-NN), LS-SVM, probabilistic neural network (PNN) and general regression neural network (GRNN), are employed for the identification of vocal fold pathology. Two data validation schemes are used (conventional validation [ConV] and tenfold cross-validation [CrossV]) in order to test the effectiveness of the proposed features and the reliability of the classification results.
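The two validation schemes can be sketched in a few lines. This is only an illustration, not the authors' code: the function names `holdout_sizes` and `tenfold_indices`, the fixed random seed and the rounding rule are assumptions, while the 70%/30% ratio and the sample counts are the ones reported in Table 1 of this paper.

```python
import random


def holdout_sizes(n_samples, train_percent=70):
    """Conventional validation (ConV): sizes of the training and test sets.

    Integer arithmetic with round-half-up is an assumption of this sketch.
    """
    n_train = (n_samples * train_percent + 50) // 100
    return n_train, n_samples - n_train


def tenfold_indices(n_samples, n_folds=10, seed=0):
    """Tenfold cross-validation (CrossV): shuffle the sample indices and
    deal them into n_folds disjoint sets. Each set serves once as the test
    set while the remaining sets are used for training."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[f::n_folds] for f in range(n_folds)]
```

With the MEEI frame-based data (1557 + 1537 = 3094 feature vectors), `holdout_sizes(3094)` gives the 2166/928 split listed in Table 1.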
2 Database
In the area of automatic detection of vocal fold pathology, the only commercially available database is the MEEI voice disorders database (Kay Elemetrics Inc. 1994). The database, developed by the MEEI Voice and Speech Labs, contains 53 normal and 657 pathological voice samples. The voice samples are sustained phonations of the vowel /ah/ (1–3 s long) and readings (12 s) of the ‘Rainbow Passage’ from patients with normal voices and a wide variety of organic, neurological, traumatic and psychogenic voice disorders in different stages. All the voice samples were collected in a controlled environment and sampled at 25 kHz or 50 kHz with 16 bits of resolution. To test the effectiveness of the proposed method, a total of 226 voice samples of sustained phonation of the vowel /ah/ (173 pathological and 53 normal) are used and downsampled to 25 kHz for our analysis. Two different experiments are performed (frame-based and file-based). In frame-based analysis, voice samples are segmented into frames 40 ms long using a Hamming window with 50% overlap. Then each windowed frame is parameterised by means of WPT and SVD (singular value decomposition). In file-based analysis, the 173 pathological and 53 normal voice samples are subjected to feature extraction. The second data-set is taken from the MAPACI speech pathology database (MAPACI 2004); all of its speech samples were recorded at 44,100 Hz during the lifetime of the MAPACI project. The recording device was a Sennheiser headset microphone. The speech database consists of 48 voice samples (12 normal and 12 pathological males + 12 normal and 12 pathological females); the speakers' ages ranged from 20 to 68 years, with an average of 36.75 years and a standard deviation of 14.35 years. Frame-based and file-based analyses are conducted on this database as well. Figure 1 shows pathological and normal voice samples from the MEEI database and the MAPACI speech pathology database.
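The frame-based segmentation described above can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the standard Hamming coefficients (0.54, 0.46) are used, and a trailing partial frame is simply dropped, which the paper does not specify. At 25 kHz, a 40 ms frame has 1000 samples and a 50% overlap corresponds to a hop of 500 samples.

```python
import math


def hamming(n_samples):
    """Standard Hamming window of length n_samples."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def frame_signal(signal, fs=25000, frame_ms=40, overlap=0.5):
    """Split a signal into overlapping, Hamming-windowed frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    frame_len = int(round(fs * frame_ms / 1000.0))   # 1000 samples at 25 kHz
    hop = int(round(frame_len * (1.0 - overlap)))    # 500 samples for 50% overlap
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(segment, window)])
    return frames
```

For a 1 s recording at 25 kHz (25,000 samples), this yields 49 frames of 1000 samples each.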
3 WPT and SVD
Various feature extraction methods have been proposed for robust representation of normal and pathological speech. In the last decade, time-frequency analysis has been used for this purpose. The wavelet transform provides a time-frequency representation of the signal. It decomposes the signal over dilated and translated wavelets. A wavelet is a waveform of effectively limited duration that has an average value of zero. The wavelet transform is defined as the convolution of a signal f(t) with a
Figure 1. Plots of pathological and normal voice samples.
wavelet function ψ(t), shifted in time by a translation parameter b and dilated by a scale parameter a. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Rao and Bopardikar 2000)

$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \qquad (1)$$
In the wavelet packet tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases at a parent node (i, p) are given by the following forms:

$$\psi_{i+1}^{2p}(k) = \sum_{n=-\infty}^{\infty} l[n]\, \psi_{i}^{p}(k - 2^{i} n) \qquad (2)$$

where l[n] is a low-pass (scaling) filter, and

$$\psi_{i+1}^{2p+1}(k) = \sum_{n=-\infty}^{\infty} h[n]\, \psi_{i}^{p}(k - 2^{i} n) \qquad (3)$$

where h[n] is the high-pass (wavelet) filter. The wavelet packet is an extension of the wavelet transform. In the WPT decomposition procedure, a signal is decomposed into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). Wavelet packet decomposition partitions both the lower and the higher frequency bands into smaller bands, which cannot be achieved by using the general discrete wavelet transform. Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.
In this work, the automatic detection of vocal fold pathology is carried out by using WPT and SVD. In frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed by using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). For the MEEI database, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame of the signal is decomposed into five levels using WPT, which yields 32 (= 2^5) sub-bands. The fifth-level wavelet packet coefficients of the 32 sub-bands are arranged in a matrix A of size 32 × M for further processing, whose rows correspond to the wavelet packet sub-bands and whose columns correspond to the wavelet packet coefficients (M per sub-band):

$$A = \left[ C_{5}^{1}(M) \;\; C_{5}^{2}(M) \;\; \cdots \;\; C_{5}^{32}(M) \right]^{T} \qquad (4)$$
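To make the structure of the matrix in Equation (4) concrete, the sketch below builds a full five-level wavelet packet tree and returns its 32 leaf sub-bands as rows. It is not the authors' implementation: for brevity it uses the two-tap Haar filters as a stand-in for the ‘db4’ wavelet used in the paper, and it zero-pads each frame to a multiple of 32 samples, which is an assumption of this sketch. The function names are hypothetical.

```python
import math


def haar_split(x):
    """One analysis step with Haar filters (a stand-in for 'db4'):
    return the low-pass (approximation) and high-pass (detail) halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[2 * i] + x[2 * i + 1]) * s for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) * s for i in range(len(x) // 2)]
    return approx, detail


def pad_to_multiple(x, multiple):
    """Zero-pad x so that its length is a multiple of `multiple`."""
    extra = (-len(x)) % multiple
    return list(x) + [0.0] * extra


def wavelet_packet(x, levels=5):
    """Full wavelet packet tree: every node is split at every level.

    Returns the 2**levels leaf sub-bands in natural (tree) order, one row
    per sub-band, i.e. a matrix shaped like A in Equation (4).
    len(x) must be divisible by 2**levels.
    """
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes
```

A 1000-sample frame padded to 1024 samples gives a 32 × 32 matrix here; with ‘db4’ and a different boundary treatment, the number of coefficients M per sub-band would differ.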
Figure 2. Overall diagram of the proposed system.
To derive a simple and effective feature vector, singular value decomposition is applied to this matrix. In file-based analysis, each voice sample is decomposed into five levels using WPT and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the ‘db4’ wavelet is used in our study. The order was chosen to be low in order to model the transients and rapid variations in a signal efficiently.
3.1 Singular value decomposition
The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix A (an n × p matrix) of wavelet packet coefficients into a much smaller, invertible and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that a given n × p matrix A can be decomposed as

$$A = U D V^{T} \qquad (5)$$

The columns of U are orthonormal eigenvectors of $AA^{T}$ and are called left-singular vectors. The columns of V are orthonormal eigenvectors of $A^{T}A$ and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A. In this work, the 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
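As a small worked illustration of Equation (5) (a toy example chosen here, not data from the paper), consider a 2 × 3 matrix. Its singular values are the square roots of the eigenvalues of $AA^{T}$, and they form the feature vector in the same way that the 32 singular values summarise the 32 × M coefficient matrix.

```latex
A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix},
\qquad
A A^{T} = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
\;\Rightarrow\;
\lambda_1 = 2,\ \lambda_2 = 1,
\qquad
\sigma_1 = \sqrt{2} \approx 1.414,\ \sigma_2 = 1 .

U = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\qquad
D = \begin{bmatrix} \sqrt{2} & 0 \\ 0 & 1 \end{bmatrix},
\qquad
V^{T} = \begin{bmatrix} \tfrac{1}{\sqrt{2}} & 0 & \tfrac{1}{\sqrt{2}} \\ 0 & 1 & 0 \end{bmatrix},
\qquad
U D V^{T} = A .
```

Here the economy-size factorisation is shown, so D is the 2 × 2 square matrix of singular values and the feature vector would be $(\sqrt{2},\, 1)$.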
3.2 k-Means clustering based feature weighting
In many practical machine learning applications, the performance of the learned models degrades due to irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method: they transform the extracted non-linearly separable features into linearly separable features and map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are useful not only for studying the similarity or dissimilarity of the features but also for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as the feature weighting method since it is simple and widely implemented in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering partitions n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set $x_j$, $j = 1, 2, 3, \ldots, n$, the k-means clustering algorithm finds the cluster centres $c_i$ and the membership matrix U iteratively using the following four steps:
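Step 1: Randomly initialise the cluster centres $c_i$, $i = 1, 2, 3, \ldots, c$, where $c$ is the number of clusters (the k of k-means).
Step 2: Determine the membership matrix using Equation (6):

$$u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^2 \le \|x_j - c_k\|^2 \text{ for each } k \ne i \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^2 \qquad (7)$$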
Figure 3. Working of k-means clustering based feature weighting method.
Stop if either this value or its improvement over the previous iteration is below a certain threshold.
Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \qquad (8)$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$ is the number of data points assigned to cluster $i$.
After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) show the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.
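The four k-means steps and the weighting rule of this section can be sketched as below. This is one possible reading, not the authors' code: the weight of each feature is taken to be the mean of that feature over the data divided by the mean of the cluster centres along that feature, which is an interpretation of "the ratios of means of features to their centres". The function names are hypothetical, and the initial centres are passed in explicitly so that the example is reproducible; Step 1 of the text draws them at random (for instance with `random.sample(points, k)`).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, init_centres, tol=1e-9, max_iter=100):
    """Hard k-means following Steps 2 to 4 of the text."""
    centres = [list(c) for c in init_centres]
    prev_cost = float("inf")
    for _ in range(max_iter):
        # Step 2: each point joins its nearest centre (Equation (6)).
        labels = [min(range(len(centres)), key=lambda i: sq_dist(p, centres[i]))
                  for p in points]
        # Step 3: cost J of Equation (7); stop when it no longer improves.
        cost = sum(sq_dist(p, centres[l]) for p, l in zip(points, labels))
        if prev_cost - cost < tol:
            break
        prev_cost = cost
        # Step 4: move each centre to the mean of its members (Equation (8)).
        for i in range(len(centres)):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:  # an empty cluster keeps its previous centre
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    return centres


def feature_weights(points, centres):
    """Per-feature ratio: mean of the feature / mean of the centres.

    Assumes the mean of the centres is non-zero for every feature.
    """
    weights = []
    for j in range(len(points[0])):
        feature_mean = sum(p[j] for p in points) / len(points)
        centre_mean = sum(c[j] for c in centres) / len(centres)
        weights.append(feature_mean / centre_mean)
    return weights


def apply_weights(points, weights):
    """Multiply every feature by its weight to obtain the weighted features."""
    return [[x * w for x, w in zip(p, weights)] for p in points]
```

In practice the clustering would be run on the 32-dimensional SVD feature vectors, and the resulting weights applied to both training and test features.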
4 Classifiers
In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.
4.1 k-NN classifier
In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork
Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).
Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Comparison of results (frame-based, MEEI database, raw features).
2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is thus determined by the classes of its k nearest neighbours.
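The majority-vote rule can be written compactly. The following is a minimal sketch, assuming Euclidean distance and a default of k = 3; neither choice is stated in this section, and the function name is hypothetical.

```python
from collections import Counter


def knn_predict(x, train_X, train_y, k=3):
    """Classify x by a majority vote among its k nearest training vectors.

    Euclidean distance and k=3 are assumptions of this sketch.
    """
    # Pair each squared distance with the label of the training vector.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, xi)), yi)
        for xi, yi in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

For a two-class problem such as normal versus pathological, an odd k avoids tied votes.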
4.2 Probabilistic neural network
Specht (1990) has proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes the distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to each training input. The summation unit sums these contributions for each class of inputs and produces a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
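The pattern, summation and output units can be sketched as follows. This is an illustrative reading rather than the authors' implementation: a Gaussian kernel exp(−d²/(2σ²)) is assumed for the exponential activation, the class score is taken as the average kernel response per class, and the function names are hypothetical. The σ grid follows the range given in the text.

```python
import math


def pnn_predict(x, train_X, train_y, sigma):
    """Probabilistic neural network decision for one input vector x."""
    sums, counts = {}, {}
    for xi, yi in zip(train_X, train_y):
        # Pattern unit: exponential activation of the squared distance.
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    # Summation unit: average response per class.
    # Output unit: the class with the largest value wins.
    return max(sums, key=lambda c: sums[c] / counts[c])


def sigma_grid(start=0.01, step=0.005, n_values=10):
    """Spread factors 0.01, 0.015, ..., 0.055 as described in the text."""
    return [round(start + step * i, 3) for i in range(n_values)]
```

A typical use would loop over `sigma_grid()` and keep the σ that gives the best validation accuracy; for very small σ and distant inputs all activations can underflow to zero, so features are usually scaled first.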
4.3 General regression neural network
GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than the multilayer perceptron (MLP), is more accurate than the MLP and is relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) has proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer,
Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | (data) | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| Frame-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| Frame-based | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | (data) | 173 pathological + 53 normal | 24 pathological + 24 normal |
| File-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| File-based | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |
Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
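A GRNN output can be sketched as a kernel-weighted average of the training targets. The code below is an illustration under assumptions, not the authors' implementation: a Gaussian kernel is used, the classes are encoded as 0 (normal) and 1 (pathological), the continuous output is thresholded at 0.5, and the fallback to the mean target when all kernel weights underflow is a choice of this sketch. The function names are hypothetical.

```python
import math


def grnn_predict(x, train_X, train_y, sigma):
    """Kernel-weighted average of the training targets for input x."""
    numerator = 0.0
    denominator = 0.0
    for xi, yi in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        weight = math.exp(-d2 / (2.0 * sigma ** 2))   # pattern layer
        numerator += weight * yi                      # summation layer
        denominator += weight
    if denominator == 0.0:
        # All weights underflowed: fall back to the mean target.
        return sum(train_y) / len(train_y)
    return numerator / denominator                    # output layer


def grnn_classify(x, train_X, train_y, sigma, threshold=0.5):
    """Turn the continuous GRNN output into a 0/1 class label."""
    return 1 if grnn_predict(x, train_X, train_y, sigma) >= threshold else 0
```

Because the target is continuous, the threshold (or an equivalent rounding rule) is what converts the regression output into a normal or pathological decision.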
4.4 Support vector machine
In this work, SVM is used as a classifier; it is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples within two classes. It maps training samples of two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training point. While testing, a query point is categorised according to the distance between the point and the hyperplane.
SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.
In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to perform classification of normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
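To make the roles of gam and sig2 concrete, the following is a minimal NumPy sketch of a least squares SVM classifier with an RBF kernel, following the standard LS-SVM linear system. It is not the LS-SVMlab code: the function names, the toy data and the kernel scaling exp(-||a - b||² / sig2) are assumptions made for illustration, and the toolbox documentation should be consulted for its exact conventions.

```python
import numpy as np

def rbf_kernel(A, B, sig2):
    """K(a, b) = exp(-||a - b||^2 / sig2); this scaling convention is an assumption."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sig2)

def lssvm_train(X, y, gam, sig2):
    """Solve the LS-SVM linear system for labels y in {-1, +1}."""
    n = len(y)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = y
    system[1:, 0] = y
    # gam enters as a ridge term 1/gam on the diagonal of the kernel block
    system[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sig2) + np.eye(n) / gam
    rhs = np.concatenate(([0.0], np.ones(n)))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]  # bias b, support values alpha

def lssvm_predict(X_train, y_train, b, alpha, X_query, sig2):
    """Classify by the sign of the decision function."""
    return np.sign(rbf_kernel(X_query, X_train, sig2) @ (alpha * y_train) + b)

# Hypothetical toy data: class -1 near (0, 0), class +1 near (1, 1)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(1.0, 0.1, (20, 2))])
y = np.array([-1.0] * 20 + [1.0] * 20)

b, alpha = lssvm_train(X, y, gam=10.0, sig2=1.0)
pred = lssvm_predict(X, y, b, alpha, np.array([[0.0, 0.1], [0.9, 1.0]]), sig2=1.0)
```

A larger gam penalises training errors more heavily, while a smaller sig2 makes the decision boundary more flexible, which is why both are tuned together.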
5 Results and discussions
Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means
Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
clustering-based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to assess the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.
To test the classifier performance, three measures are considered: sensitivity, specificity and overall accuracy. These measures are calculated from the counts of true positives (TP: the classifier outputs pathological when the sample is pathological), true negatives (TN: the classifier outputs normal when the sample is normal), false positives (FP: the classifier outputs pathological when the sample is normal) and false negatives (FN: the classifier outputs normal when the sample is pathological). They are given in Equations (9), (10) and (11), respectively.
$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
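Equations (9) to (11) translate directly into code. The sketch below uses hypothetical confusion counts chosen only to illustrate the arithmetic; they are not results from the paper.

```python
def classification_measures(tp, tn, fp, fn):
    """Return (sensitivity, specificity, overall accuracy) as fractions."""
    se = tp / (tp + fn)                    # Equation (9)
    sp = tn / (tn + fp)                    # Equation (10)
    acc = (tp + tn) / (tp + tn + fp + fn)  # Equation (11)
    return se, sp, acc

# Hypothetical counts: 50 pathological and 20 normal test samples
se, sp, acc = classification_measures(tp=45, tn=18, fp=2, fn=5)
print(f"SE = {100 * se:.2f}%, SP = {100 * sp:.2f}%, ACC = {100 * acc:.2f}%")
```

Because the pathological class is treated as positive, sensitivity measures how many pathological samples are caught and specificity how many normal samples are correctly cleared.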
In the k-NN classifier, different values of ‘k’ between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% using the weighted features.
Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features); values are mean ± standard deviation (%).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| | | | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| | k-value 1–10 | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| | | | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | 10, 0.0001 | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| | | | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| | 10, 0.1 | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| | | | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| | | | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| | 0.01–0.055 | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| | | | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| | | | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| | 0.01–0.055 | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| | | | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |
Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features); values are mean ± standard deviation (%).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| | | | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| | k-value 1–10 | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| | | | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | 10, 0.01 | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| | | | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| | 10, 1 | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| | | | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| | | | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| | 0.01–0.055 | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| | | | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| | | | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| | 0.01–0.055 | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| | | | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |
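Table 4. Computation time (in seconds) for feature extraction and classification.

Classification

| Classifier | Parameter | Type of experiment | Type of validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features |
|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| | | | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| | k-value 1–10 | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| | | | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | 10, 0.01 | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| | | | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| | 10, 1 | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| | | | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| | | | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| | 0.01–0.055 | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| | | | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| | | | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| | 0.01–0.055 | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| | | | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |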
To assess the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that, during the frame-based analysis, a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.
Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed under the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under the two experiments (frame-based and file-based). Direct comparison of our work with previous works cannot be performed, since most works in the literature have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering-based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
6 Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers (k-NN, LS-SVM, PNN and GRNN) are employed. The k-means
clustering-based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering-based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres, and by using different kernel functions. The proposed method could also be extended to detect specific types of disorders and to develop an online diagnosing system.
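The sensitivity to initial cluster positions can be handled by restarting k-means from several random initialisations and keeping the best run. The sketch below is a plain Lloyd's algorithm in NumPy under that idea. The selection criterion (lowest within-cluster sum of squares), the function names and the toy data are assumptions for illustration; the paper does not state how the best run was chosen.

```python
import numpy as np

def kmeans_once(X, k, rng, n_iter=100):
    """One k-means run from a random initialisation; returns (centres, cost)."""
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each centre; keep the old one if a cluster becomes empty
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return centres, d2.min(axis=1).sum()

def kmeans_restarts(X, k, n_restarts=10, seed=0):
    """Run k-means several times and keep the run with the lowest cost."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[1])

# Hypothetical toy data: two well-separated groups of feature vectors
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 0.3, (30, 2)), data_rng.normal(5.0, 0.3, (30, 2))])
best_centres, best_cost = kmeans_restarts(X, k=2)
```

Smarter seeding schemes would reduce the number of restarts needed, which is in line with the improvement suggested for future work.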
Acknowledgements
This work is done in part with data transferred from the database MAPACI (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.
Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Soft Computing, Journal of Medical Systems, and Scientific Research and Essays, and so on. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).
Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab.
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
Figure 1 Plots of pathological and normal voice samples
wavelet function ψ(t) shifted in time by a translation parameter and dilated by a scale parameter. The general definition of the wavelet transform is given as (Burrus, Gopinath, Guo, Odegard, and Selesnick 1997; Rao and Bopardikar 2000)
$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{1}$$
In the tree, each subspace is indexed by its depth i and the number of subspaces p. The two wavelet packet orthogonal bases at a parent node (i, p) are given by the following forms:
ψ2pi+1(k) =
infinsumn =minusinfin
l[n] ψpi (k minus 2in) (2)
where l[n] is a low-pass (scaling) filter
ψ2p+1i+1 (k) =
infinsumn =minusinfin
h[n] ψpi (k minus 2in) (3)
where h[n] is the high-pass (wavelet) filter. The wavelet packet transform (WPT) is an extension of the wavelet transform. In the decomposition procedure, a signal is split into two frequency bands: a lower frequency band (approximation coefficients) and a higher frequency band (detail coefficients). The general discrete wavelet transform continues by splitting only the lower band at each level, whereas wavelet packet decomposition partitions both the lower and the higher frequency bands into smaller bands. Hence, WPT gives a balanced binary tree structure. Figure 2 shows the block diagram of the proposed system.
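As a rough illustration of the balanced tree described above, the following Python sketch splits every band at every level, so that five levels give 32 subbands. It is not the authors' MATLAB implementation: the two-tap Haar filters are swapped in for 'db4' only to keep the code short, and the names `split`, `wavelet_packet` and the test signal are hypothetical.

```python
import math

# Haar analysis filters (assumption: chosen because they are two taps long;
# the paper itself uses the Daubechies 'db4' wavelet).
LOW = (1 / math.sqrt(2), 1 / math.sqrt(2))    # l[n], low-pass (scaling) filter
HIGH = (1 / math.sqrt(2), -1 / math.sqrt(2))  # h[n], high-pass (wavelet) filter


def split(band):
    """Filter one band with l[n] and h[n] and keep every second output."""
    pairs = list(zip(band[0::2], band[1::2]))
    approx = [LOW[0] * a + LOW[1] * b for a, b in pairs]
    detail = [HIGH[0] * a + HIGH[1] * b for a, b in pairs]
    return approx, detail


def wavelet_packet(signal, levels):
    """Return the 2**levels subbands at the bottom of the balanced tree."""
    bands = [list(signal)]
    for _ in range(levels):
        next_bands = []
        for band in bands:  # unlike the DWT, every band is split again
            next_bands.extend(split(band))
        bands = next_bands
    return bands  # natural (tree) order, not sorted by frequency


# Hypothetical 64-sample test frame
frame = [math.sin(0.3 * n) + 0.5 * math.sin(2.1 * n) for n in range(64)]
subbands = wavelet_packet(frame, levels=5)
print(len(subbands), len(subbands[0]))
```

With a 64-sample input, each of the 32 subbands holds two coefficients. Because the Haar pair is orthonormal, the total energy of the coefficients equals that of the input frame.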
In this work, the automatic detection of vocal fold pathology is carried out by using WPT and SVD. In the frame-based analysis, the voice samples are first segmented into short-time frames of 40 ms with an overlap of 50% between adjacent frames and windowed using a Hamming window (Godino-Llorente and Gomez-Vilda 2004; Godino-Llorente et al. 2006). In the frame-based analysis, 1557 frames of pathological voices and 1537 frames of normal voices are used. Each frame is decomposed into five levels using WPT, which yields 32 subbands. The fifth-level wavelet packet coefficients of the 32 subbands are arranged in a matrix of size 32 × M for further processing, whose rows pertain to the wavelet packet subbands and whose columns pertain to the wavelet packet coefficients:
$$A = \left[\, C_{5}^{1}(M) \;\; C_{5}^{2}(M) \;\; \cdots \;\; C_{5}^{32}(M) \,\right]^{T} \tag{4}$$
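The frame-based segmentation described above (40 ms frames, 50% overlap, Hamming window) can be sketched in Python as follows. This is an illustrative sketch only: the sampling rate, the synthetic test tone and the function names are assumptions, and dropping a trailing partial frame is one possible choice among several.

```python
import math


def hamming(length):
    """Hamming window: 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(samples, sample_rate, frame_ms=40, overlap=0.5):
    """Cut a signal into windowed frames; a trailing partial frame is dropped."""
    frame_len = int(round(sample_rate * frame_ms / 1000))
    hop = int(round(frame_len * (1 - overlap)))
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames


SAMPLE_RATE = 25000  # Hz, hypothetical value chosen for illustration
one_second = [math.sin(2 * math.pi * 150 * n / SAMPLE_RATE)
              for n in range(SAMPLE_RATE)]
frames = frame_signal(one_second, SAMPLE_RATE)
print(len(frames), len(frames[0]))
```

At the assumed rate, a 40 ms frame holds 1000 samples and consecutive frames start 500 samples apart.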
Figure 2 Overall diagram of the proposed system
To derive a simple and effective feature vector, singular value decomposition is applied. In the file-based analysis, each voice sample is decomposed into five levels using WPT and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the 'db4' wavelet is used in our study. The order was chosen to be low so that the transients and rapid variations in a signal are modelled efficiently.
3.1 Singular value decomposition
The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular matrix A (an n × p matrix) of wavelet packet coefficients into a much smaller, invertible and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that a given n × p matrix A is decomposed as

$$A = U D V^{T} \tag{5}$$

The columns of U are orthonormal eigenvectors of AA^T and are called left-singular vectors. The columns of V are orthonormal eigenvectors of A^T A and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A; they are the square roots of the eigenvalues of AA^T. In this work, the 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
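To make the link between singular values and the eigenvalues of AA^T concrete, the sketch below computes singular values in plain Python with cyclic Jacobi rotations applied to AA^T. This is an assumption-laden illustration, not the routine used in the paper: the function name and the tiny example matrix are hypothetical, and in practice a library routine such as MATLAB's `svd` or `numpy.linalg.svd` would normally be used.

```python
import math


def singular_values(a, sweeps=50, tol=1e-12):
    """Singular values of `a` (a list of rows), largest first.

    They are the square roots of the eigenvalues of A A^T, which are
    found here by cyclic Jacobi rotations on the symmetric matrix A A^T.
    """
    n = len(a)
    s = [[sum(x * y for x, y in zip(a[i], a[j])) for j in range(n)]
         for i in range(n)]
    for _ in range(sweeps):
        off = sum(s[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:  # off-diagonal part is negligible: converged
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(s[p][q]) < 1e-300:
                    continue
                theta = (s[q][q] - s[p][p]) / (2 * s[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1 / math.sqrt(t * t + 1)
                sn = t * c
                for k in range(n):  # rotate columns p and q
                    skp, skq = s[k][p], s[k][q]
                    s[k][p], s[k][q] = c * skp - sn * skq, sn * skp + c * skq
                for k in range(n):  # rotate rows p and q
                    spk, sqk = s[p][k], s[q][k]
                    s[p][k], s[q][k] = c * spk - sn * sqk, sn * spk + c * sqk
    return sorted((math.sqrt(max(s[i][i], 0.0)) for i in range(n)), reverse=True)


# Hypothetical 2 x 2 stand-in for the 32 x M coefficient matrix
coeffs = [[1.0, 1.0], [0.0, 1.0]]
features = singular_values(coeffs)
print(features)
```

For this small matrix the two values are approximately 1.618 and 0.618. For a 32 × M coefficient matrix the same call would return 32 values, one per subband row.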
3.2 k-Means clustering based feature weighting
In many practical machine learning applications, the performance of the learned models degrades due to irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method: they transform the extracted non-linearly separable features into linearly separable features and map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are useful not only for studying the similarity or dissimilarity of the features but also for compressing and reducing the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as the feature weighting method since it is simple and widely used in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering partitions n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre. For a data-set x_i, i = 1, 2, 3, ..., n, the k-means clustering algorithm finds the cluster centres c_i and the membership matrix U iteratively using the following four steps:
Step 1: Randomly initialise the cluster centres c_i, i = 1, 2, 3, ..., c.
Step 2: Determine the membership matrix U using Equation (6):

$$u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^2 \le \|x_j - c_k\|^2 \text{ for each } k \ne i \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

$$J = \sum_{i=1}^{c} \; \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^2 \tag{7}$$
Figure 3 Working of k-means clustering based feature weighting method
Stop if either this value or its improvement over the previous iteration is below a certain threshold.
Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8}$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$ is the number of points assigned to cluster G_i.
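The four steps can be sketched in Python as below. This is a minimal illustration rather than the authors' code: the initial centres are passed in explicitly so that the run is reproducible (the paper initialises them randomly), and the function names, threshold and toy points are assumptions.

```python
def sq_dist(x, c):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def k_means(points, centres, threshold=1e-9, max_iter=100):
    """Steps 2 to 4 for given initial centres (Step 1 is left to the caller)."""
    centres = [list(c) for c in centres]
    previous_cost = None
    for _ in range(max_iter):
        # Step 2, Equation (6): each point joins its nearest centre
        groups = [[] for _ in centres]
        for x in points:
            nearest = min(range(len(centres)),
                          key=lambda i: sq_dist(x, centres[i]))
            groups[nearest].append(x)
        # Step 3, Equation (7): cost J, followed by the stopping test
        cost = sum(sq_dist(x, centres[i])
                   for i, group in enumerate(groups) for x in group)
        if cost < threshold or (previous_cost is not None
                                and previous_cost - cost < threshold):
            break
        previous_cost = cost
        # Step 4, Equation (8): move each centre to the mean of its group
        for i, group in enumerate(groups):
            if group:  # an empty group keeps its old centre
                centres[i] = [sum(col) / len(group) for col in zip(*group)]
    return centres, groups, cost


# Hypothetical two-dimensional toy data with two obvious clusters
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centres, groups, cost = k_means(points, centres=[(0.0, 0.0), (6.0, 6.0)])
print(centres, cost)
```

On this toy data the centres settle near (1, 1) and (5, 5). Since the result depends on the starting centres, repeating the run from several initialisations and keeping the lowest cost J is a common safeguard.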
After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) show the class distribution of the raw and weighted SVD features for the MEEI database. The class distribution of the raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.
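The weighting step itself can be sketched as follows, under one possible reading of the description above: for each feature, the ratio is the mean of that feature over all samples divided by the mean of the cluster-centre values for that feature. This reading, the function name and the toy numbers are assumptions; the cited Polat papers should be consulted for the exact definition.

```python
def weight_features(samples, centres):
    """Scale each feature by mean(feature) / mean(its cluster-centre values)."""
    n_features = len(samples[0])
    ratios = []
    for j in range(n_features):
        feature_mean = sum(row[j] for row in samples) / len(samples)
        centre_mean = sum(c[j] for c in centres) / len(centres)
        ratios.append(feature_mean / centre_mean)
    weighted = [[value * ratios[j] for j, value in enumerate(row)]
                for row in samples]
    return weighted, ratios


# Hypothetical raw features (rows = samples) and precomputed k-means centres
raw = [[2.0, 10.0], [4.0, 30.0]]
centres = [[1.0, 10.0], [2.0, 30.0]]
weighted, ratios = weight_features(raw, centres)
print(ratios, weighted)
```

The centres are taken as given here so that the sketch stays focused on the ratio-and-multiply step; any k-means routine could supply them.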
4 Classifiers
In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as the multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and the k-nearest neighbour classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-nearest neighbour (k-NN), least squares support vector machine (LS-SVM), probabilistic neural network (PNN) and general regression neural network (GRNN).
4.1 k-NN classifier
In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork
Figure 4 (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).
Figure 5 (a) Class distribution of raw SVD features (MAPACI speech pathology database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MAPACI speech pathology database, frame-based) according to the first two features (Feature 1 and Feature 2).
2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is therefore determined by the classes of its k nearest neighbours.
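The majority vote can be sketched in a few lines of Python. The Euclidean distance, the function name and the toy feature vectors are assumptions made for illustration.

```python
from collections import Counter


def knn_classify(train_x, train_y, query, k):
    """Label `query` by majority vote among its k nearest training vectors."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors
train_x = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.0), (0.9, 1.0), (1.0, 0.8)]
train_y = ["normal", "normal", "normal", "pathological", "pathological"]
print(knn_classify(train_x, train_y, (0.15, 0.15), k=3))
print(knn_classify(train_x, train_y, (0.95, 0.90), k=3))
```

With two classes, an odd k avoids tied votes.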
4.2 Probabilistic neural network
Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes the distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to each training input. The summation unit sums these contributions for each class of inputs and produces a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
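The pattern, summation and output units described above can be sketched as follows. This is a simplified illustration, not the MATLAB network used in the paper: a Gaussian activation with spread σ is assumed, class scores are averaged per class, and the names and toy data are hypothetical.

```python
import math


def pnn_classify(train_x, train_y, query, sigma):
    """Return (predicted class, per-class scores) for one query vector."""
    totals, counts = {}, {}
    for x, label in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        # pattern unit: exponential activation of the squared distance
        activation = math.exp(-d2 / (2 * sigma ** 2))
        totals[label] = totals.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    # summation unit: average activation per class
    scores = {label: totals[label] / counts[label] for label in totals}
    # output unit: the class with the largest score wins
    return max(scores, key=scores.get), scores


# Hypothetical feature vectors; sigma lies inside the range quoted above
train_x = [(0.10, 0.10), (0.12, 0.08), (0.30, 0.30), (0.28, 0.33)]
train_y = ["normal", "normal", "pathological", "pathological"]
label, scores = pnn_classify(train_x, train_y, (0.13, 0.10), sigma=0.05)
print(label)
```

A small σ makes each training vector influence only its immediate neighbourhood, which is why the result is sensitive to the spread factor.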
4.3 General regression neural network
GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure, it learns much faster than a multilayer perceptron (MLP), and it is reported to be more accurate than the MLP and relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer,
Table 1 Details of training and test data

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| Frame-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| Frame-based | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | | 173 pathological + 53 normal | 24 pathological + 24 normal |
| File-based | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| File-based | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |
Figure 6 Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
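A GRNN output can be sketched as a kernel-weighted average of the training targets. The sketch below is an illustration under assumptions: targets are coded 0 for normal and 1 for pathological, a 0.5 threshold turns the continuous output into a class, and the function name and toy data are hypothetical.

```python
import math


def grnn_predict(train_x, train_t, query, sigma):
    """Kernel-weighted average of the training targets for one query vector."""
    numerator = denominator = 0.0
    for x, target in zip(train_x, train_t):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weight = math.exp(-d2 / (2 * sigma ** 2))  # pattern layer
        numerator += weight * target               # summation layer (weighted)
        denominator += weight                      # summation layer (plain)
    # output layer; assumes at least one weight is non-zero
    return numerator / denominator


# Hypothetical feature vectors with targets 0 (normal) and 1 (pathological)
train_x = [(0.10, 0.10), (0.12, 0.08), (0.30, 0.30), (0.28, 0.33)]
train_t = [0.0, 0.0, 1.0, 1.0]
low = grnn_predict(train_x, train_t, (0.13, 0.10), sigma=0.05)
high = grnn_predict(train_x, train_t, (0.29, 0.31), sigma=0.05)
print("pathological" if low >= 0.5 else "normal",
      "pathological" if high >= 0.5 else "normal")
```

No iterative training is involved: the stored training vectors are the model, which matches the one-pass learning described above.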
4.4 Support vector machine
In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to its signed distance from the hyperplane.
SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at a low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, since it changes the shape (flexion) of the separating hyperplane.
In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to classify normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
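To show where gam and sig2 enter, the sketch below trains a small LS-SVM classifier in plain Python by solving the standard LS-SVM linear system for the bias b and the multipliers α. It does not reproduce the LS-SVMlab toolbox or its API: the kernel scaling exp(-||x - z||² / sig2) is one common convention and is an assumption here, as are the function names, the parameter values and the toy data.

```python
import math


def rbf(x, z, sig2):
    """RBF kernel with squared bandwidth sig2 (assumed scaling convention)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sig2)


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def lssvm_train(xs, ys, gam, sig2):
    """Solve [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1]."""
    n = len(xs)
    rows = [[0.0] + [float(y) for y in ys]]
    for i in range(n):
        row = [float(ys[i])]
        for j in range(n):
            omega = ys[i] * ys[j] * rbf(xs[i], xs[j], sig2)
            row.append(omega + (1.0 / gam if i == j else 0.0))
        rows.append(row)
    solution = solve(rows, [0.0] + [1.0] * n)
    return solution[0], solution[1:]


def lssvm_predict(xs, ys, b, alpha, query, sig2):
    """Sign of the decision function; +1 and -1 are the two class labels."""
    score = sum(a * y * rbf(query, x, sig2) for a, y, x in zip(alpha, ys, xs)) + b
    return 1 if score >= 0 else -1


# Hypothetical data: -1 = normal, +1 = pathological
xs = [(0.1, 0.1), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9)]
ys = [-1, -1, 1, 1]
b, alpha = lssvm_train(xs, ys, gam=10.0, sig2=0.5)
print(lssvm_predict(xs, ys, b, alpha, (0.15, 0.12), sig2=0.5),
      lssvm_predict(xs, ys, b, alpha, (0.95, 0.95), sig2=0.5))
```

A larger gam penalises training errors more heavily, while sig2 controls how quickly the influence of a training point decays with distance.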
5 Results and discussions
Much research has already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means
Figure 7 Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes and the classification accuracy as well. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV, a 70%/30% training/testing split, and CrossV, a 10-fold scheme) are used to assess the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.
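The two validation schemes can be sketched as index splits in Python. The shuffling seed, the function names and the way the ten sets are formed are assumptions; only the 70%/30% proportion and the ten sets come from Table 1.

```python
import random


def conventional_split(n_samples, train_fraction=0.7, seed=0):
    """ConV: one random split into training and testing indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def k_fold(n_samples, folds=10, seed=0):
    """CrossV: yield (train, test) index lists, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    parts = [indices[i::folds] for i in range(folds)]
    for i, test in enumerate(parts):
        train = [idx for j, part in enumerate(parts) if j != i for idx in part]
        yield train, test


# 3094 = 1557 + 1537 frame-based feature vectors of the MEEI database
train_idx, test_idx = conventional_split(3094)
fold_pairs = list(k_fold(3094))
print(len(train_idx), len(test_idx), len(fold_pairs))
```

For 3094 feature vectors the conventional split gives 2166 training and 928 testing vectors, in line with Table 1.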
To test the classifier performance, three measures are considered: sensitivity, specificity and overall accuracy. These measures are calculated from the counts of true positives (TP: the classifier outputs pathological when a pathological sample is present), true negatives (TN: the classifier outputs normal when a normal sample is present), false positives (FP: the classifier outputs pathological when a normal sample is present) and false negatives (FN: the classifier outputs normal when a pathological sample is present). The measures are given in Equations (9), (10) and (11), respectively:
$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
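Equations (9) to (11) translate directly into code. In the sketch below the label strings, the function name and the toy predictions are illustrative assumptions.

```python
def evaluate(true_labels, predicted_labels, positive="pathological"):
    """Return sensitivity, specificity and overall accuracy as fractions."""
    tp = tn = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    return {
        "SE": tp / (tp + fn),                     # Equation (9)
        "SP": tn / (tn + fp),                     # Equation (10)
        "ACC": (tp + tn) / (tp + tn + fp + fn),   # Equation (11)
    }


# Hypothetical ground truth and classifier output
truth = ["pathological"] * 4 + ["normal"] * 5
guess = ["pathological"] * 3 + ["normal"] * 5 + ["pathological"]
print(evaluate(truth, guess))
```

Here TP = 3, FN = 1, TN = 4 and FP = 1, so SE = 0.75, SP = 0.8 and ACC = 7/9.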
In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. The sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 for the MEEI database and in Table 3 for the MAPACI speech pathology database. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% using the weighted features.
Table 2 Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features)

| Classifier | Parameter(s) | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| k-NN | k-value 1–10 | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| k-NN | k-value 1–10 | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| k-NN | k-value 1–10 | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | 10 000 01 | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| LS-SVM | 10 000 01 | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| LS-SVM | 10 01 | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| LS-SVM | 10 01 | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| PNN | 0.01–0.055 | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| PNN | 0.01–0.055 | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| PNN | 0.01–0.055 | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| GRNN | 0.01–0.055 | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| GRNN | 0.01–0.055 | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |
Table 3 Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features)

| Classifier | Parameter(s) | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| k-NN | k-value 1–10 | Frame-based | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| k-NN | k-value 1–10 | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| k-NN | k-value 1–10 | File-based | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | 100 01 | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| LS-SVM | 100 01 | Frame-based | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| LS-SVM | 10 1 | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| LS-SVM | 10 1 | File-based | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| PNN | 0.01–0.055 | Frame-based | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| PNN | 0.01–0.055 | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| PNN | 0.01–0.055 | File-based | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| GRNN | 0.01–0.055 | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| GRNN | 0.01–0.055 | File-based | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |
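Table 4 Computation time for feature extraction and classification

Classification

| Classifier | Parameter(s) | Type of experiment | Type of validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features |
|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| k-NN | k-value 1–10 | Frame-based | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| k-NN | k-value 1–10 | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| k-NN | k-value 1–10 | File-based | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | 10 001 | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| LS-SVM | 10 001 | Frame-based | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| LS-SVM | 10 1 | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| LS-SVM | 10 1 | File-based | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| PNN | 0.01–0.055 | Frame-based | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| PNN | 0.01–0.055 | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| PNN | 0.01–0.055 | File-based | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| GRNN | 0.01–0.055 | Frame-based | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| GRNN | 0.01–0.055 | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| GRNN | 0.01–0.055 | File-based | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |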
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.
Figure 6(a)–(d) show the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during the frame-based and file-based analyses. Similarly, Figure 7(a)–(d) illustrate the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB of RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works cannot be performed since most of the works in the literature have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy can be obtained using our suggested feature extraction and classification algorithms with little computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. By comparison, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
6 Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means
clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres, and hence in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could also be extended to detect specific types of disorders and to develop an online diagnosis system.
Acknowledgements
This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.
Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).
Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of the IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), 'k-Means Nearest Neighbor Classifier for Voice Pathology', in Paper Presented at the Proceedings from INDICON'04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), 'An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine', Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), 'PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets', Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), 'Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases', IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), 'Robust Hybrid Pitch Detector', IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), 'A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM', Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), 'Fuzzy Model Identification Based on Cluster Estimation', Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), 'The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices', IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN Based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
Figure 2. Overall diagram of the proposed system.
To derive a simple and effective feature vector, singular value decomposition (SVD) is applied. In file-based analysis, each voice sample is decomposed into five levels using WPT, and the same SVD is applied to derive the feature vector for the detection of vocal fold pathology. The same methodology is applied to the voice samples of the second data-set as well. Considering that pathological voice signals contain more rapid variations, and given the popularity of Daubechies wavelets in speech applications, the ‘db4’ wavelet is used in our study. The order was chosen to be low in order to model the transients and rapid variations in a signal efficiently.
3.1 Singular value decomposition
The singular value decomposition, which is a factorisation and summarisation technique, effectively reduces a rectangular n × p matrix A of wavelet packet coefficients to a much smaller, invertible and square matrix (Kalker, Haitsma, and Oostveen 2001; Ozer, Sankur, Memon, and Anarim 2005). The SVD theorem states that the given n × p matrix A is decomposed as

\[ A = U D V^{T} \tag{5} \]

The columns of U are orthonormal eigenvectors of AA^T and are called left-singular vectors. The columns of V are orthonormal eigenvectors of A^T A and are called right-singular vectors. The diagonal entries of D are called the singular values of the n × p matrix A. In this work, 32 singular values give a good summarisation of the matrix of wavelet packet coefficients of a signal.
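The link between singular values and the eigen-decomposition of A^T A can be illustrated on a toy matrix. The sketch below is only an illustration under stated assumptions: the function name, the restriction to two columns and the example matrix are hypothetical, and it is not the implementation used in this work. It relies on the standard fact that the squared singular values of A are the eigenvalues of A^T A.

```python
import math


def singular_values_two_columns(rows):
    """Singular values (largest first) of an n x 2 matrix given as a list of rows.

    The squared singular values of A are the eigenvalues of the symmetric
    2 x 2 matrix A^T A, which has a closed-form solution.
    """
    a = sum(r[0] * r[0] for r in rows)  # (A^T A)[0][0]
    b = sum(r[0] * r[1] for r in rows)  # (A^T A)[0][1] == (A^T A)[1][0]
    d = sum(r[1] * r[1] for r in rows)  # (A^T A)[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    eigenvalues = [mean + radius, max(mean - radius, 0.0)]
    return [math.sqrt(ev) for ev in eigenvalues]


# Hypothetical 3 x 2 matrix standing in for a matrix of coefficients.
A = [[2.0, 0.0], [1.0, 2.0], [0.0, 1.0]]
sigma = singular_values_two_columns(A)  # A^T A has eigenvalues 7 and 3
```

In practice a numerical library routine would be used for matrices with more than two columns; the closed form here only keeps the example self-contained.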
3.2 k-Means clustering based feature weighting
In many practical machine learning applications, the performance of the learned models degrades gracefully due to irrelevant and noisy features. To improve the robustness of the features and the classification accuracy, clustering algorithms are used as a feature weighting method, where they transform the extracted non-linearly separable
features into linearly separable features and also map the features according to their distributions (Polat and Gunes 2006; Latifoglu, Polat, Kara, and Gunes 2008; Gunes, Polat, and Yosunkaya 2010; Polat 2012; Polat and Durduran 2012). Clustering algorithms are not only used to study the similarity or dissimilarity of the features but are also useful for compression and reduction of the features (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). Several clustering algorithms have been proposed in the literature (MacQueen 1967; Bezdek 1981; Chiu 1994; Yager and Filev 1994; Xu and Wunsch 2005). In this work, k-means clustering is used as a feature weighting method since it is simple and widely applied in solving many practical problems (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012). k-means clustering is used to partition n observations into k clusters so as to minimise the mean squared distance from each data point to its nearest centre (in the steps below, the number of clusters is written c and k serves as an index). For a data-set x_i, i = 1, 2, 3, ..., n, the k-means clustering algorithm finds the cluster centres c_i and the membership matrix U iteratively using the following four steps:
Step 1: Randomly initialise the cluster centres c_i, i = 1, 2, 3, ..., c.

Step 2: Determine the membership matrix U using Equation (6):

\[ u_{ij} = \begin{cases} 1 & \text{if } \|x_j - c_i\|^2 \le \|x_j - c_k\|^2 \text{ for each } k \ne i, \\ 0 & \text{otherwise} \end{cases} \tag{6} \]

Step 3: Calculate the cost function based on the Euclidean distance using Equation (7):

\[ J = \sum_{i=1}^{c} \sum_{k,\, x_k \in G_i} \|x_k - c_i\|^2 \tag{7} \]
Figure 3. Working of the k-means clustering based feature weighting method.
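Stop if either this value or its improvement over the previous iteration is below a certain threshold.

Step 4: Update the cluster centres according to Equation (8) and go to Step 2:

\[ c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k \tag{8} \]

where G_i is the set of data points assigned to cluster i and |G_i| = \sum_{j=1}^{n} u_{ij} is its size.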
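The four steps can be sketched in plain Python as follows. This is a minimal illustration and not the authors' implementation: the function names, the seeded choice of initial centres from the data points, the threshold value and the toy data are all assumptions made for the example.

```python
import random


def squared_distance(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def k_means(points, n_clusters, threshold=1e-9, max_iter=100, seed=0):
    """Hard k-means following Steps 1-4; returns (centres, labels, cost)."""
    rng = random.Random(seed)
    # Step 1: pick distinct data points as the initial centres (assumption).
    centres = [list(p) for p in rng.sample(points, n_clusters)]
    previous_cost = float("inf")
    for _ in range(max_iter):
        # Step 2: u_ij = 1 for the nearest centre, 0 otherwise (kept as a label).
        labels = [
            min(range(n_clusters), key=lambda i: squared_distance(x, centres[i]))
            for x in points
        ]
        # Step 3: cost J is the sum of squared distances to the assigned centres.
        cost = sum(squared_distance(x, centres[l]) for x, l in zip(points, labels))
        if cost < threshold or previous_cost - cost < threshold:
            break
        previous_cost = cost
        # Step 4: move each centre to the mean of its members.
        for i in range(n_clusters):
            members = [x for x, l in zip(points, labels) if l == i]
            if members:  # keep the old centre if a cluster became empty
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels, cost


# Hypothetical two-dimensional data with two well-separated groups.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centres, labels, cost = k_means(points, n_clusters=2)
```

Because the result can depend on the initial centres, the function could be called with several `seed` values and the run with the lowest cost retained.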
After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012) as summarised in Figure 3. First, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
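One possible reading of this weighting step is sketched below. The interpretation is an assumption: for each feature, the ratio is taken as the mean of that feature over all samples divided by the mean of the corresponding coordinate of the cluster centres. The function name and the toy numbers are hypothetical, and the centres are passed in directly so that the example stands on its own.

```python
def weight_features(samples, centres):
    """Scale each feature by (mean of the feature) / (mean of its cluster centres).

    samples: list of feature vectors; centres: list of cluster-centre vectors.
    Returns (weighted samples, per-feature ratios). Assumes the centre means
    are non-zero, which holds for strictly positive features.
    """
    n_features = len(samples[0])
    ratios = []
    for j in range(n_features):
        feature_mean = sum(s[j] for s in samples) / len(samples)
        centre_mean = sum(c[j] for c in centres) / len(centres)
        ratios.append(feature_mean / centre_mean)
    weighted = [[s[j] * ratios[j] for j in range(n_features)] for s in samples]
    return weighted, ratios


# Hypothetical raw features (two samples, two features) and two cluster centres.
samples = [[1.0, 10.0], [3.0, 30.0]]
centres = [[1.0, 20.0], [4.0, 20.0]]
weighted, ratios = weight_features(samples, centres)
```

In this toy case the first feature has mean 2.0 and centre mean 2.5, so it is scaled by 0.8, while the second feature is left unchanged because both means equal 20.0.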
Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of raw and weighted SVD features for the MEEI database. The class distribution of raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.
4 Classifiers
In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.
4.1 k-NN classifier
In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork
Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).
Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MAPACI speech pathology database, frame-based) according to the first two features (Feature 1 and Feature 2).
2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is therefore determined by the classes of its k nearest neighbours.
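The majority-vote rule can be sketched as follows. The function name, the use of the Euclidean distance and the toy feature vectors are assumptions made for illustration only.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label of `query` by majority vote among its k nearest training vectors."""
    def squared_distance(x):
        return sum((a - b) ** 2 for a, b in zip(x, query))

    # Indices of the k training vectors closest to the query.
    nearest = sorted(range(len(train_x)),
                     key=lambda i: squared_distance(train_x[i]))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15], [0.9, 0.8], [0.8, 0.9]]
train_y = ["normal", "normal", "normal", "pathological", "pathological"]
label = knn_predict(train_x, train_y, [0.2, 0.2], k=3)
```

An odd k avoids tied votes in a two-class problem such as normal versus pathological.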
4.2 Probabilistic neural network
Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to a training input. The summation unit sums these contributions for each class of inputs and produces a net output which is a vector of probabilities. Using a compete transfer function, the output units pick the maximum of these probabilities and produce a 1 for that class and a 0 for the other classes (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
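The flow through pattern, summation and output units can be sketched as below. This is a simplified illustration, not the toolbox routine used in this work: the Gaussian form exp(-d²/(2σ²)), the per-class averaging, the function name and the toy data are assumptions.

```python
import math


def pnn_predict(train_x, train_y, query, sigma=0.05):
    """Return (predicted label, class scores normalised to sum to 1)."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        # Pattern unit: exponential of the squared distance to a training vector.
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        sums[y] = sums.get(y, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[y] = counts.get(y, 0) + 1
    # Summation unit: average the pattern-unit outputs of each class.
    density = {y: sums[y] / counts[y] for y in sums}
    total = sum(density.values())
    if total == 0.0:
        raise ValueError("query is too far from every training vector for this sigma")
    scores = {y: d / total for y, d in density.items()}
    # Output unit: winner-takes-all over the class scores.
    return max(scores, key=scores.get), scores


# Hypothetical training vectors on a small numeric scale.
train_x = [[0.10, 0.10], [0.12, 0.08], [0.30, 0.30], [0.28, 0.33]]
train_y = ["normal", "normal", "pathological", "pathological"]
label, scores = pnn_predict(train_x, train_y, [0.11, 0.09], sigma=0.05)
```

A small σ makes each pattern unit respond only to very close training vectors, which is why the spread factor has such a strong effect on the result.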
4.3 General regression neural network
GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. Because this network does not require an iterative training procedure, it learns much faster than a multilayer perceptron (MLP); it is also reported to be more accurate than MLP and relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer,
Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | | (1557 pathological + 1537 normal) | (1560 pathological + 1560 normal) |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | | (173 pathological + 53 normal) | (24 pathological + 24 normal) |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |
Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
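The one-pass character of the GRNN can be seen in a short sketch: nothing is fitted, and the output is a kernel-weighted average of the stored training targets. The Gaussian weight exp(-d²/(2σ²)), the 0/1 target coding, the 0.5 decision threshold mentioned afterwards and the toy data are assumptions for illustration, not the settings of this work.

```python
import math


def grnn_predict(train_x, train_targets, query, sigma=0.05):
    """Kernel-weighted average of the training targets for one query vector."""
    numerator = 0.0
    denominator = 0.0
    for x, target in zip(train_x, train_targets):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weight = math.exp(-d2 / (2.0 * sigma ** 2))  # pattern layer
        numerator += weight * target                 # summation layer (weighted)
        denominator += weight                        # summation layer (plain)
    if denominator == 0.0:
        raise ValueError("query is too far from every training vector for this sigma")
    return numerator / denominator                   # output layer


# Hypothetical data: target 0.0 codes a normal sample, 1.0 a pathological one.
train_x = [[0.10, 0.10], [0.12, 0.08], [0.30, 0.30], [0.28, 0.33]]
train_targets = [0.0, 0.0, 1.0, 1.0]
estimate = grnn_predict(train_x, train_targets, [0.11, 0.09], sigma=0.05)
```

Since the output is continuous, a two-class decision could be obtained by thresholding the estimate, for example at 0.5.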
4.4 Support vector machine
In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples within two classes. It maps training samples of two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training point. During testing, a query point is categorised according to its position relative to the hyperplane, that is, according to the sign of its distance from the hyperplane.
SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.
In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to perform the classification of normal and pathological speech samples. Two parameters have to be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
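The role of σ² can be illustrated by evaluating an RBF kernel on a few points. The sketch below is not toolbox code: the function names and sample vectors are hypothetical, and the scaling exp(-‖x - z‖²/σ²) is one common convention that is assumed here (other texts divide by 2σ² instead).

```python
import math


def rbf_kernel(x, z, sig2):
    """RBF kernel value between two vectors; sig2 is the squared bandwidth."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / sig2)


def gram_matrix(samples, sig2):
    """Matrix of pairwise kernel values, as used when training a kernel classifier."""
    return [[rbf_kernel(a, b, sig2) for b in samples] for a in samples]


# Hypothetical feature vectors.
samples = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
narrow = gram_matrix(samples, sig2=0.1)   # small sig2: points look dissimilar
wide = gram_matrix(samples, sig2=10.0)    # large sig2: points look similar
```

A small sig2 drives the off-diagonal kernel values towards zero and yields a very flexible decision boundary, whereas a large sig2 pushes them towards one and yields a smoother boundary.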
5 Results and discussions
Many research works have already been carried out in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify the voice samples into normal or pathological. A k-means
Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to demonstrate the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.
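The two validation schemes can be sketched as index splits. The sketch assumes that ConV denotes the 70%/30% training/testing split and CrossV the random division into 10 sets listed in Table 1; the function names and the seeded shuffling are hypothetical.

```python
import random


def conventional_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train indices, test indices) once for each of the n_folds sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, folds[i]


# 3094 frame-based MEEI feature vectors (1557 pathological + 1537 normal).
train_idx, test_idx = conventional_split(3094)
sizes = (len(train_idx), len(test_idx))  # (2166, 928), as in Table 1
fold_list = list(ten_fold_splits(3094))
```

In the cross-validation scheme every feature vector is used for testing exactly once, which is why it complements the single hold-out split.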
To test the classifier performance, three measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from the following counts: true positive (TP, the classifier reports pathological when pathological samples are present), true negative (TN, the classifier reports normal when normal samples are present), false positive (FP, the classifier reports pathological when normal samples are present) and false negative (FN, the classifier reports normal when pathological samples are present). The measures are given in Equations (9), (10) and (11), respectively.
\[ \text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9} \]

\[ \text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10} \]

\[ \text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11} \]
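Equations (9) to (11) translate directly into code. The function name and the confusion counts below are hypothetical and serve only as a worked example.

```python
def classification_measures(tp, tn, fp, fn):
    """Sensitivity, specificity and overall accuracy from confusion counts."""
    return {
        "SE": tp / (tp + fn),                    # Equation (9)
        "SP": tn / (tn + fp),                    # Equation (10)
        "ACC": (tp + tn) / (tp + tn + fp + fn),  # Equation (11)
    }


# Hypothetical counts: 50 pathological and 50 normal test samples.
measures = classification_measures(tp=45, tn=40, fp=10, fn=5)
```

With these counts, 45 of the 50 pathological samples and 40 of the 50 normal samples are recognised, giving SE = 0.90, SP = 0.80 and ACC = 0.85.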
In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of frame-based and file-based analysis using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of file-based analysis is generally better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% using the weighted features.
Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are percentages (mean ± standard deviation).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 |
| | | | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 |
| | k-value 1–10 | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 |
| | | | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| LS-SVM | 10 000 01 | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 |
| | | | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 |
| | 10 01 | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 |
| | | | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 |
| | | | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 |
| | 0.01–0.055 | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 |
| | | | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 |
| | | | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 |
| | 0.01–0.055 | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 |
| | | | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 |
Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are percentages (mean ± standard deviation).

| Classifier | Parameter | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23 |
| | | | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| | k-value 1–10 | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64 |
| | | | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16 |
| LS-SVM | 100 01 | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06 |
| | | | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41 |
| | 101 | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00 |
| | | | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04 |
| | | | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03 |
| | 0.01–0.055 | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63 |
| | | | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23 |
| | | | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24 |
| | 0.01–0.055 | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27 |
| | | | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91 |
Table 4. Computation time (in seconds) for feature extraction and classification.

Classification

| Classifier | Parameter | Type of experiment | Type of validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features |
|---|---|---|---|---|---|---|---|
| k-NN | k-value 1–10 | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346 |
| | | | ConV | 3.806 | 3.821 | 6.931 | 6.955 |
| | k-value 1–10 | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036 |
| | | | ConV | 1.759 | 1.754 | 3.001 | 3.206 |
| LS-SVM | 10 001 | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214 |
| | | | ConV | 0.718 | 0.698 | 0.705 | 0.723 |
| | 10 1 | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002 |
| | | | ConV | 0.006 | 0.006 | 0.004 | 0.004 |
| PNN | 0.01–0.055 | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421 |
| | | | ConV | 10.340 | 10.356 | 11.180 | 11.202 |
| | 0.01–0.055 | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062 |
| | | | ConV | 4.250 | 4.266 | 4.072 | 4.047 |
| GRNN | 0.01–0.055 | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582 |
| | | | ConV | 10.726 | 10.648 | 11.634 | 11.584 |
| | 0.01–0.055 | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062 |
| | | | ConV | 4.302 | 4.260 | 4.049 | 4.066 |

Feature extraction

| Type of experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological |
|---|---|---|---|---|
| File-based | 0.549 | 0.034 | 0.109 | 0.130 |
| Frame-based | 0.591 | 0.181 | 1.734 | 1.737 |
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.
Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The weighted features perform better than the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms were developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under two different experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of those works have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with low computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend heavily on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy as well, for both the MEEI database and the MAPACI speech pathology database.
6. Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology, combining a feature extraction method (WPT–SVD features) with a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. k-means
clustering based feature weighting approach can increase the distinguishing performance of raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters, and hence in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.
Acknowledgements
This work is done in part with data transferred from the database MAPACI, http://www.mapaci.com. The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.
Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and an MSc degree in 2004. He took his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).
Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab.
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
Figure 3. Working of the k-means clustering based feature weighting method.
Stop if either this value or its improvement over the previous iteration is below a certain threshold.
Step 4: Update the cluster centres according to Equation (8) and go to Step 2.
$$c_i = \frac{1}{|G_i|} \sum_{k,\; x_k \in G_i} x_k \qquad (8)$$

where $|G_i| = \sum_{j=1}^{n} u_{ij}$.
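The centre update of Equation (8) and the stopping rule can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the authors' MATLAB implementation: the function names (`assign_clusters`, `update_centres`, `kmeans`), the tolerance `tol`, the choice to leave an empty cluster's centre unchanged, and the toy data are all introduced here for the example.

```python
import math

def assign_clusters(points, centres):
    """Return, for each point, the index of its nearest centre (Euclidean)."""
    return [min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
            for p in points]

def update_centres(points, labels, centres):
    """Equation (8): each centre becomes the mean of the points assigned to it."""
    new_centres = []
    for i, old in enumerate(centres):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if not members:                      # assumption: keep an empty cluster's centre
            new_centres.append(list(old))
            continue
        dim = len(members[0])
        new_centres.append([sum(m[d] for m in members) / len(members)
                            for d in range(dim)])
    return new_centres

def kmeans(points, centres, tol=1e-9, max_iter=100):
    """Alternate assignment and centre update until the cost stops improving."""
    prev_cost = float("inf")
    labels = assign_clusters(points, centres)
    for _ in range(max_iter):
        labels = assign_clusters(points, centres)
        cost = sum(math.dist(p, centres[lab]) ** 2 for p, lab in zip(points, labels))
        # Stop if the cost, or its improvement over the previous iteration, is tiny.
        if cost < tol or prev_cost - cost < tol:
            break
        prev_cost = cost
        centres = update_centres(points, labels, centres)
    return centres, labels

toy = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]   # toy data, two clear groups
centres, labels = kmeans(toy, centres=[[1.0, 1.0], [9.0, 1.0]])
```

Because the result depends on the initial centres, a practical run would repeat `kmeans` from several starting points and keep the solution with the lowest cost.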
After finding the cluster centres, the extracted features are preprocessed (Polat and Gunes 2006; Latifoglu et al. 2008; Gunes et al. 2010; Polat 2012; Polat and Durduran 2012), as summarised in Figure 3. Firstly, the cluster centres are calculated using the k-means clustering method. Next, the ratios of the means of the features to their centres are calculated. Finally, these ratios are multiplied with each respective feature.
Raw features are obtained through feature extraction. Through k-means clustering based feature weighting, the raw features are preprocessed, which gives the weighted features. Figure 4(a) and (b) shows the class distribution of raw and weighted SVD features for the MEEI database. The class distribution of raw and weighted SVD features for the MAPACI speech pathology database is shown in Figure 5(a) and (b). From the figures, it is observed that the discrimination ability of the raw SVD features has been increased by applying the k-means clustering based feature weighting method.
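The three weighting steps (cluster centres, ratio of feature mean to centre, multiplication) can be sketched as follows. This is an illustrative reading, not a confirmed reproduction of the authors' procedure: the text does not say how several cluster centres are combined into one centre value per feature, so the sketch assumes the average of that feature's coordinate over all centres. The function names and the sample values are hypothetical.

```python
def kmeans_feature_weights(samples, centres):
    """Weight per feature = mean of the feature / centre value of the feature.

    Assumption: with several cluster centres, the "centre" of a feature is the
    average of that feature's coordinate over all centres.
    """
    n_features = len(samples[0])
    weights = []
    for d in range(n_features):
        feature_mean = sum(s[d] for s in samples) / len(samples)
        centre_value = sum(c[d] for c in centres) / len(centres)
        weights.append(feature_mean / centre_value)   # undefined if centre_value is 0
    return weights

def apply_weights(samples, weights):
    """Multiply every feature value by the ratio computed for its column."""
    return [[v * w for v, w in zip(s, weights)] for s in samples]

raw = [[1.0, 10.0], [3.0, 30.0], [8.0, 20.0]]     # hypothetical raw feature vectors
centres = [[2.0, 20.0], [8.0, 20.0]]              # hypothetical k-means output
weights = kmeans_feature_weights(raw, centres)    # approximately [0.8, 1.0]
weighted = apply_weights(raw, weights)
```

In this toy case the first feature is scaled by about 0.8 and the second is left unchanged, since its mean coincides with its centre value.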
4. Classifiers
In the area of automatic detection of vocal fold pathologies, various classifiers have been proposed, such as multilayer perceptron (Ritchings, McGillion, Conroy, and Moore 1999; Ritchings, McGillion, and Moore 1999), hidden Markov models (Wester 1998), linear discriminant analysis (Hariharan, Paulraj, and Yaacob 2009; Umapathy, Krishnan, Parsa, and Jamieson 2005), SVM (Khadivi Heris et al. 2009; Arjmandi and Pooyan 2011; Erfanian Saeedi et al. 2011), Gaussian mixture models (Godino-Llorente et al. 2006), radial basis neural networks (Hariharan and Sazali 2010; Hariharan, Paulraj, and Yaacob 2011) and k-nearest neighbourhood classifier (Ananthakrishna, Shama, and Niranjan 2004; Shama and Cholayya 2007). In this work, four different classifiers are employed: k-NN, LS-SVM, PNN and GRNN.
4.1. k-NN classifier
In pattern recognition, the k-NN algorithm is a method for classifying objects based on the closest training examples in the feature space (Fukunaga 1990; Duda, Hart, and Stork
Figure 4. (a) Class distribution of raw SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MEEI database, frame-based) according to the first two features (Feature 1 and Feature 2).
Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Class distribution of weighted SVD features (MAPACI speech pathology, frame-based) according to the first two features (Feature 1 and Feature 2).
2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is determined by the classes of its k nearest neighbours.
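The majority-vote rule can be illustrated with a short sketch. It assumes Euclidean distance, which the text does not specify, and the function name `knn_predict` and the toy vectors are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    # Rank training vectors by (assumed) Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.7, 0.7]]
train_y = ["normal", "normal", "pathological", "pathological", "pathological"]
pred = knn_predict(train_x, train_y, [0.15, 0.15], k=3)
```

With two classes, an odd k avoids tied votes; sweeping k from 1 to 10 and keeping the best-performing value is then a simple loop over `knn_predict`.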
4.2. Probabilistic neural network
Specht (1990) proposed the PNN, based on Bayesian classification and classical estimators for the probability density function (PDF). The PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to a training input. The summation unit sums these contributions for each class of inputs and produces a net output, which is a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using the compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends heavily upon the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
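The pattern, summation and output stages described above can be sketched as below. This is a simplified illustration that assumes a Gaussian pattern-unit activation of the form exp(-d²/(2σ²)); the exact scaling used by the authors' MATLAB routines is not stated in the text. The function name `pnn_predict` and the toy data are hypothetical.

```python
import math

def pnn_predict(train_x, train_y, query, sigma):
    """Classify `query` with a Gaussian-kernel probabilistic neural network."""
    scores = {}
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        # Pattern units: exponential activation of the distance to each training vector.
        acts = [math.exp(-math.dist(x, query) ** 2 / (2.0 * sigma ** 2)) for x in members]
        # Summation unit: average activation per class (a Parzen-style density estimate).
        scores[label] = sum(acts) / len(members)
    # Output unit: winner-takes-all over the class scores.
    return max(scores, key=scores.get)

# Spread values as described in the text: 0.01 to 0.055 in steps of 0.005.
spreads = [round(0.01 + 0.005 * i, 3) for i in range(10)]

train_x = [[0.10, 0.10], [0.12, 0.11], [0.50, 0.50], [0.52, 0.49]]
train_y = ["normal", "normal", "pathological", "pathological"]
predictions = [pnn_predict(train_x, train_y, [0.11, 0.10], s) for s in spreads]
```

With spreads this small, the activations of distant training vectors underflow to zero, which is one reason the classifier is sensitive to the choice of σ and to the scale of the features.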
4.3. General regression neural network
GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than a multilayer perceptron (MLP), is more accurate than the MLP, and is relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer,
Table 1. Details of training and test data.

| Type of experiment | Type of validation | MEEI database | MAPACI speech pathology database |
|---|---|---|---|
| Frame-based | | 1557 pathological + 1537 normal | 1560 pathological + 1560 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 2166 (70%), testing = 928 (30%) | Training = 2184 (70%), testing = 936 (30%) |
| File-based | | 173 pathological + 53 normal | 24 pathological + 24 normal |
| | CrossV | Feature vectors are divided randomly into 10 sets and training is repeated 10 times | Feature vectors are divided randomly into 10 sets and training is repeated 10 times |
| | ConV | Training = 158 (70%), testing = 68 (30%) | Training = 34 (70%), testing = 14 (30%) |
Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
summation layer and output layer. The performance of the GRNN classifier depends heavily upon the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
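The four GRNN layers amount to a kernel-weighted average of the training targets. The sketch below assumes the standard Gaussian-kernel form of Specht's estimator and a 0.5 threshold on a 0/1 target to obtain a class label; the thresholding rule, the function name `grnn_predict` and the toy data are assumptions made for illustration.

```python
import math

def grnn_predict(train_x, train_y, query, sigma):
    """General regression neural network output for one query vector."""
    sq_dists = [math.dist(x, query) ** 2 for x in train_x]        # pattern layer
    d_min = min(sq_dists)
    # Shifting by the smallest distance leaves the ratio unchanged but avoids
    # a 0/0 division when sigma is very small.
    w = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    numerator = sum(wi * yi for wi, yi in zip(w, train_y))        # summation layer
    denominator = sum(w)
    return numerator / denominator                                 # output layer

train_x = [[0.10, 0.10], [0.12, 0.11], [0.50, 0.50], [0.52, 0.49]]
train_y = [0.0, 0.0, 1.0, 1.0]           # 0 = normal, 1 = pathological (toy targets)
score = grnn_predict(train_x, train_y, [0.51, 0.50], sigma=0.05)
label = "pathological" if score >= 0.5 else "normal"
```

No iterative training takes place: the training vectors are simply stored, which is what one-pass learning means here.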
4.4. Support vector machine
In this work, SVM is used as a classifier; it is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. While testing, a query point is categorised according to the distance between the point and the hyperplane.
SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used, since it gives excellent generalisation at a low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.
In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to perform the classification of normal and pathological speech samples. Two parameters must be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
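To make the roles of γ and σ² concrete, the following sketch trains a least squares SVM classifier by solving its linear system directly. It is a from-scratch illustration and does not call or reproduce LS-SVMlab: the kernel scaling exp(-d²/(2·sig2)) is an assumed convention that may differ from the toolbox by a constant factor, and the function names, toy data and parameter values are hypothetical.

```python
import math

def rbf(x, z, sig2):
    """RBF kernel; the 2*sig2 scaling is an assumed convention."""
    return math.exp(-math.dist(x, z) ** 2 / (2.0 * sig2))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lssvm_train(xs, ys, gam, sig2):
    """Solve the LS-SVM classifier system for the bias b and multipliers alpha."""
    n = len(xs)
    a = [[0.0] + list(ys)]                      # first row encodes sum(alpha_i * y_i) = 0
    for i in range(n):
        row = [ys[i]]
        for j in range(n):
            omega = ys[i] * ys[j] * rbf(xs[i], xs[j], sig2)
            row.append(omega + (1.0 / gam if i == j else 0.0))   # regularisation on diagonal
        a.append(row)
    sol = solve(a, [0.0] + [1.0] * n)
    return sol[0], sol[1:]

def lssvm_predict(xs, ys, b, alpha, query, sig2):
    """Sign of the decision function: +1 or -1."""
    s = sum(al * y * rbf(x, query, sig2) for al, y, x in zip(alpha, ys, xs)) + b
    return 1 if s >= 0 else -1

xs = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]]
ys = [-1.0, -1.0, 1.0, 1.0]             # -1 = normal, +1 = pathological (toy labels)
b, alpha = lssvm_train(xs, ys, gam=10.0, sig2=1.0)
```

A larger γ penalises training errors more strongly, while a smaller σ² makes the decision boundary follow individual training points more closely; in practice both would be tuned, for example by a grid search over a validation set.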
5. Results and discussions
Many research works have already been carried out in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify the voice samples as normal or pathological. A k-means
Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
clustering based feature weighting method is proposed for preprocessing the features, which increases their ability to discriminate between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to demonstrate the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.
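The two validation schemes can be sketched as index-splitting routines. The sketch assumes that ConV is a single random 70/30 split and CrossV is 10-fold cross-validation, as the set sizes in Table 1 suggest; the function names, the fixed seeds and the rounding rule are assumptions introduced here.

```python
import random

def conventional_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(train_fraction * n_samples)
    return idx[:n_train], idx[n_train:]

def cross_validation_folds(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, folds[k]

# 3094 = 1557 pathological + 1537 normal frames (MEEI, frame-based, Table 1).
train_idx, test_idx = conventional_split(3094)
```

For the MEEI frame-based set this split gives 2166 training and 928 testing vectors, matching the counts listed in Table 1.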
To test the classifier performance, several measures, namely sensitivity, specificity and overall accuracy, are considered. These measures are calculated from the counts of true positives (TP: the classifier outputs pathological when a pathological sample is presented), true negatives (TN: the classifier outputs normal when a normal sample is presented), false positives (FP: the classifier outputs pathological when a normal sample is presented) and false negatives (FN: the classifier outputs normal when a pathological sample is presented). These measures are given in Equations (9), (10) and (11), respectively.
$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \qquad (9)$$

$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \qquad (10)$$

$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (11)$$
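Equations (9) to (11) translate directly into code. The confusion counts below are hypothetical and chosen only to illustrate the arithmetic; the function name `classification_measures` is likewise introduced for the example.

```python
def classification_measures(tp, tn, fp, fn):
    """Return sensitivity, specificity and overall accuracy as fractions."""
    sensitivity = tp / (tp + fn)                    # Equation (9)
    specificity = tn / (tn + fp)                    # Equation (10)
    accuracy = (tp + tn) / (tp + tn + fp + fn)      # Equation (11)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 100 pathological and 50 normal test samples.
se, sp, acc = classification_measures(tp=90, tn=45, fp=5, fn=10)
```

Multiplying each value by 100 gives the percentages reported in the results tables.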
In the k-NN classifier, different values of k between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 and Table 3 for the MEEI database and the MAPACI speech pathology database, respectively. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features.
Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

| Classifier | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC | Parameter |
|---|---|---|---|---|---|---|---|---|---|
| k-NN | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19 | k-value 1–10 |
| k-NN | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15 | |
| k-NN | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | k-value 1–10 |
| k-NN | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | |
| LS-SVM | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04 | 10 000 01 |
| LS-SVM | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26 | |
| LS-SVM | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28 | 10 01 |
| LS-SVM | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 | |
| PNN | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07 | 0.01–0.055 |
| PNN | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11 | |
| PNN | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95 | 0.01–0.055 |
| PNN | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71 | |
| GRNN | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89 | 0.01–0.055 |
| GRNN | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91 | |
| GRNN | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11 | 0.01–0.055 |
| GRNN | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15 | |
Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%).

Classifier  Experiment   Parameter       Validation  Raw features                                 Weighted features
                                                     SE             SP             ACC            SE             SP             ACC
k-NN        Frame-based  k = 1–10        CrossV      94.49 ± 0.36   96.58 ± 0.39   95.51 ± 0.21   98.67 ± 0.23   98.76 ± 0.25   98.71 ± 0.23
k-NN        Frame-based  k = 1–10        ConV        94.20 ± 0.40   96.29 ± 0.33   95.21 ± 0.32   100 ± 0.00     100 ± 0.00     100 ± 0.00
k-NN        File-based   k = 1–10        CrossV      77.10 ± 7.84   66.92 ± 3.30   70.42 ± 3.65   100 ± 0.00     100 ± 0.00     98.33 ± 1.64
k-NN        File-based   k = 1–10        ConV        77.56 ± 7.11   67.51 ± 2.48   69.71 ± 2.70   99.88 ± 0.40   99.88 ± 0.40   97.50 ± 2.16
LS-SVM      Frame-based  100 01          CrossV      95.50 ± 0.18   97.28 ± 0.24   96.37 ± 0.16   99.04 ± 0.08   99.04 ± 0.08   99.24 ± 0.06
LS-SVM      Frame-based  100 01          ConV        96.86 ± 0.81   95.24 ± 0.67   96.03 ± 0.40   98.71 ± 0.68   98.71 ± 0.68   99.02 ± 0.41
LS-SVM      File-based   101             CrossV      79.31 ± 3.33   68.87 ± 2.40   72.92 ± 2.60   100 ± 0.00     100 ± 0.00     95.83 ± 0.00
LS-SVM      File-based   101             ConV        74.53 ± 7.82   84.00 ± 13.01  77.14 ± 8.78   100 ± 0.00     100 ± 0.00     96.43 ± 3.76
PNN         Frame-based  σ = 0.01–0.055  CrossV      94.88 ± 0.20   96.23 ± 0.26   95.55 ± 0.21   99.82 ± 0.07   99.82 ± 0.07   99.91 ± 0.04
PNN         Frame-based  σ = 0.01–0.055  ConV        94.52 ± 0.31   96.03 ± 0.35   95.26 ± 0.25   99.76 ± 0.07   99.76 ± 0.07   99.88 ± 0.03
PNN         File-based   σ = 0.01–0.055  CrossV      72.30 ± 3.96   70.53 ± 3.93   70.83 ± 1.70   98.57 ± 4.52   98.57 ± 4.52   99.17 ± 2.63
PNN         File-based   σ = 0.01–0.055  ConV        69.02 ± 5.33   69.18 ± 4.10   67.93 ± 3.86   98.60 ± 2.73   98.60 ± 2.73   99.00 ± 1.72
GRNN        Frame-based  σ = 0.01–0.055  CrossV      90.41 ± 3.78   91.79 ± 3.35   90.98 ± 3.45   99.59 ± 0.29   99.59 ± 0.29   99.71 ± 0.23
GRNN        Frame-based  σ = 0.01–0.055  ConV        90.02 ± 3.61   91.78 ± 3.16   90.72 ± 3.24   99.53 ± 0.30   99.53 ± 0.30   99.66 ± 0.24
GRNN        File-based   σ = 0.01–0.055  CrossV      69.43 ± 8.51   70.75 ± 4.86   66.67 ± 7.73   97.13 ± 6.40   97.13 ± 6.40   93.75 ± 9.27
GRNN        File-based   σ = 0.01–0.055  ConV        62.05 ± 9.40   64.51 ± 3.83   59.57 ± 8.50   97.38 ± 5.68   97.38 ± 5.68   93.07 ± 9.91
Table 4. Computation time for feature extraction and classification (average time in seconds).

Classification

Classifier  Experiment   Parameter       Validation  MEEI database                    MAPACI database
                                                     Weighted         Raw             Weighted         Raw
k-NN        Frame-based  k = 1–10        CrossV      0.338            0.341           0.342            0.346
k-NN        Frame-based  k = 1–10        ConV        3.806            3.821           6.931            6.955
k-NN        File-based   k = 1–10        CrossV      0.036            0.038           0.036            0.036
k-NN        File-based   k = 1–10        ConV        1.759            1.754           3.001            3.206
LS-SVM      Frame-based  10 001          CrossV      1.139            1.175           1.191            1.214
LS-SVM      Frame-based  10 001          ConV        0.718            0.698           0.705            0.723
LS-SVM      File-based   10 1            CrossV      0.005            0.005           0.002            0.002
LS-SVM      File-based   10 1            ConV        0.006            0.006           0.004            0.004
PNN         Frame-based  σ = 0.01–0.055  CrossV      2.458            3.084           2.445            2.421
PNN         Frame-based  σ = 0.01–0.055  ConV        10.340           10.356          11.180           11.202
PNN         File-based   σ = 0.01–0.055  CrossV      1.078            1.124           1.093            1.062
PNN         File-based   σ = 0.01–0.055  ConV        4.250            4.266           4.072            4.047
GRNN        Frame-based  σ = 0.01–0.055  CrossV      2.587            2.571           2.676            2.582
GRNN        Frame-based  σ = 0.01–0.055  ConV        10.726           10.648          11.634           11.584
GRNN        File-based   σ = 0.01–0.055  CrossV      1.098            1.056           1.065            1.062
GRNN        File-based   σ = 0.01–0.055  ConV        4.302            4.260           4.049            4.066

Feature extraction

Experiment   MEEI                             MAPACI
             Normal          Pathological     Normal          Pathological
File-based   0.549           0.034            0.109           0.130
Frame-based  0.591           0.181            1.734           1.737
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy of above 95% is obtained using the raw features and 100% is obtained using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.
Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database for the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed under the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under two different experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them do not report the computation time of the feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The reported accuracy varied from approximately 85% to 100% under different experiments. These results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and also increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
6 Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.
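The practice of running k-means several times to reduce the dependence on the initial cluster positions can be sketched as follows. This is a minimal illustration in plain Python, not the authors' MATLAB implementation: the function names `kmeans` and `kmeans_restarts`, the random initialisation from the data points, and the use of the sum of squared errors (SSE) to pick the best run are all assumptions.

```python
import random

def kmeans(points, k, rng, iters=100):
    """Plain Lloyd's algorithm from one random initialisation; returns (centres, sse)."""
    centres = rng.sample(points, k)  # assumed initialisation: k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centre (squared Euclidean distance)
            j = min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            clusters[j].append(p)
        # Recompute each centre as the mean of its cluster; keep the old centre if empty
        new = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centres[j]
            for j, cl in enumerate(clusters)
        ]
        if new == centres:
            break
        centres = new
    sse = sum(min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres) for p in points)
    return centres, sse

def kmeans_restarts(points, k, restarts=10, seed=0):
    """Run k-means several times and keep the run with the lowest SSE."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(restarts)), key=lambda r: r[1])

# Hypothetical two-feature data, not taken from either database
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centres, sse = kmeans_restarts(points, k=2)
print(sorted(centres), sse)
```

Keeping the lowest-SSE run is one simple selection rule; a seeding scheme for the initial centres would be an alternative to repeated random restarts.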
Acknowledgements
This work is done in part with data transferred from the database MAPACI (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of the IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab.
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
Figure 5. (a) Class distribution of raw SVD features (MAPACI speech pathology database, frame-based) according to the first two features (Feature 1 and Feature 2). (b) Comparison of results (frame-based, MEEI database, raw features).
2001). An object is classified by a majority vote of its neighbours, with the object being assigned to the class most common amongst its k nearest neighbours, where k is a positive integer. In the k-NN algorithm, the classification of a new test feature vector is therefore determined by the classes of its k nearest neighbours.
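The majority-vote rule can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the implementation used in the paper: the function name `knn_predict`, the Euclidean distance and the toy data are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training vectors."""
    # Sort training samples by Euclidean distance to x (assumed distance measure)
    dists = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    # On a tied vote, Counter keeps insertion order, so the nearer neighbour's class wins
    return votes.most_common(1)[0][0]

# Hypothetical two-feature vectors, not taken from either database
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["normal", "normal", "normal", "pathological", "pathological", "pathological"]
print(knn_predict(train_X, train_y, (0.2, 0.2), k=3))
```

In the experiments reported here, k is varied between 1 and 10; the same function could be called once per value of k.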
4.2 Probabilistic neural network

Specht (1990) proposed the PNN based on Bayesian classification and classical estimators for the probability density function (PDF). A PNN comprises four types of units: input units, pattern units, summation units and output units. All the units are fully interconnected, and the pattern units are activated by an exponential function instead of a sigmoidal activation function. When an input is presented, the pattern unit computes the distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to each training input. The summation unit sums these contributions for each class of inputs and produces as its net output a vector of probabilities. From the maximum of these probabilities, the output units produce a 1 for that class and a 0 for the other classes using a compete transfer function (Hariharan, Yaacob, and Awang 2011). The performance of the PNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
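The four kinds of units described above can be sketched as follows. This is a minimal plain-Python illustration, not the MATLAB implementation used in the paper: the function name `pnn_predict`, the Gaussian form exp(−d²/(2σ²)) of the pattern-unit activation and the per-class averaging are assumptions.

```python
import math

def pnn_predict(train_X, train_y, x, sigma=0.05):
    """Return (predicted_label, class_scores) for the input vector x."""
    sums, counts = {}, {}
    for xi, yi in zip(train_X, train_y):
        # Pattern unit: exponential activation of the squared distance to a training vector
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        act = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation unit: accumulate the activations of each class
        sums[yi] = sums.get(yi, 0.0) + act
        counts[yi] = counts.get(yi, 0) + 1
    scores = {c: sums[c] / counts[c] for c in sums}
    total = sum(scores.values())
    if total > 0:  # normalise so the scores behave like class probabilities
        scores = {c: s / total for c, s in scores.items()}
    # Output unit ("compete"): the class with the largest score wins
    return max(scores, key=scores.get), scores

# Spread factors from 0.01 to 0.055 in steps of 0.005, as stated in the text
sigmas = [round(0.01 + 0.005 * i, 3) for i in range(10)]
```

Evaluating the classifier once per entry of `sigmas` and averaging the accuracies would give mean and standard deviation figures of the kind tabulated in the results.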
4.3 General regression neural network

GRNN is a kind of radial basis network, and its training is conducted using one-pass learning. This network does not require an iterative training procedure; it learns much faster than a multilayer perceptron (MLP), is more accurate than an MLP and is relatively insensitive to outliers. The target variable for the GRNN is continuous (Eskidere, Ertas, and Hanilci 2011; Wu and Tsai 2011; Hariharan, Sindhu, and Yaacob 2011; Hariharan, Saraswathy, Sindhu, Khairunizam, and Yaacob 2012; Hoglund 2012). Specht (1991) proposed the GRNN model to perform general (linear or non-linear) regressions. GRNN is based on the theory of probability regression analysis. It usually uses Parzen window estimates to set up the PDF from the observed data samples. The GRNN has four different layers: input layer, pattern layer, summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.

Table 1. Details of training and test data.

Experiment   Validation  MEEI database                               MAPACI speech pathology database
Frame-based  Samples     1557 pathological + 1537 normal             1560 pathological + 1560 normal
Frame-based  CrossV      10 random sets (see note)                   10 random sets (see note)
Frame-based  ConV        Training = 2166 (70%), testing = 928 (30%)  Training = 2184 (70%), testing = 936 (30%)
File-based   Samples     173 pathological + 53 normal                24 pathological + 24 normal
File-based   CrossV      10 random sets (see note)                   10 random sets (see note)
File-based   ConV        Training = 158 (70%), testing = 68 (30%)    Training = 34 (70%), testing = 14 (30%)

Note: For CrossV, the feature vectors are divided randomly into 10 sets and training is repeated 10 times.

Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
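The one-pass character of the GRNN described in this section can be illustrated with a short plain-Python sketch: there is no iterative training, and a prediction is a kernel-weighted average of the stored training targets. The function name `grnn_predict`, the Gaussian weight exp(−d²/(2σ²)) and the 0/1 target coding are assumptions, not details taken from the paper.

```python
import math

def grnn_predict(train_X, train_y, x, sigma=0.05):
    """Kernel-weighted average of the training targets (Nadaraya-Watson form)."""
    num = den = 0.0
    for xi, yi in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))  # pattern layer: squared distance
        w = math.exp(-d2 / (2.0 * sigma ** 2))         # Parzen-window style weight
        num += w * yi                                  # summation layer, numerator
        den += w                                       # summation layer, denominator
    # Output layer: ratio of the two sums; fall back to the mean target if all weights underflow
    return num / den if den > 0 else sum(train_y) / len(train_y)

# Hypothetical data with targets coded 0 (normal) and 1 (pathological)
X = [(0.0, 0.0), (0.02, 0.01), (1.0, 1.0), (0.98, 1.01)]
y = [0.0, 0.0, 1.0, 1.0]
print(grnn_predict(X, y, (0.99, 1.0)))
```

Because the output is continuous, a two-class decision needs a threshold; 0.5 is a natural choice under the assumed 0/1 coding.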
4.4 Support vector machine

In this work, SVM is used as a classifier; it is a promising method for solving non-linear classification, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to its distance from the hyperplane.

SVM models are built around a kernel function that transforms the input data into an n-dimensional space where a hyperplane can be constructed to partition the data. Three kinds of kernel functions, namely the linear kernel, the multilayer kernel and the radial basis function (RBF) kernel, are normally used by researchers (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail et al. 2011; Yu et al. 2011; Martis et al. 2012). In this work, the RBF kernel function is used since it gives excellent generalisation at low computational cost. In the RBF kernel, σ² (sig2) is the important parameter, and it changes the shape (flexion) of the hyperplane.

In this work, the LS-SVMlab toolbox (Suykens, Van Gestel, De Brabanter, De Moor, and Vandewalle 2003; De Brabanter et al. 2010) is used to perform the classification of normal and pathological speech samples. Two parameters must be chosen optimally to obtain better accuracy: the regularisation parameter (γ, gam) and σ² (sig2, the squared bandwidth of the RBF kernel).
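The roles of γ (gam) and σ² (sig2) can be made concrete with a small least-squares SVM classifier written in plain Python. This sketch does not reproduce the LS-SVMlab toolbox or its API: the function names `rbf`, `solve`, `lssvm_train` and `lssvm_predict` are hypothetical, the kernel scaling exp(−‖x − z‖²/(2σ²)) is one common convention assumed here, and the linear system is the standard LS-SVM classifier formulation with labels coded as −1 and +1.

```python
import math

def rbf(x, z, sig2):
    # RBF kernel; the 1/(2*sig2) scaling is an assumed convention
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sig2))

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def lssvm_train(X, y, gam=10.0, sig2=1.0):
    """Solve [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1] for b and alpha."""
    n = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            # Omega_ij = y_i * y_j * K(x_i, x_j), with 1/gam added on the diagonal
            row.append(y[i] * y[j] * rbf(X[i], X[j], sig2) + (1.0 / gam if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]

def lssvm_predict(X, y, b, alpha, x, sig2=1.0):
    # Sign of the decision function sum_i alpha_i * y_i * K(x, x_i) + b
    s = sum(a * yi * rbf(x, xi, sig2) for a, yi, xi in zip(alpha, y, X)) + b
    return 1 if s >= 0 else -1
```

A larger gam penalises training errors more heavily, while sig2 controls how quickly the influence of a training sample decays with distance, which is why both need to be tuned together.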
5 Results and discussions
Many research works have already been done in the area of automatic detection of voice pathology. In this work, a new feature vector using WPT and SVD is proposed to classify voice samples as normal or pathological. A k-means clustering based feature weighting method is proposed for preprocessing the features, which increases their discrimination ability between classes as well as the classification accuracy. Four classifiers are employed to investigate the efficacy of the raw and weighted features, and two validation schemes (ConV and CrossV) are used to establish the reliability of the classification results. Details of the training and testing sets used in this work are tabulated in Table 1.

Figure 7. Comparison of results: (a) frame-based, MEEI database, weighted features; (b) file-based, MEEI database, weighted features; (c) frame-based, MAPACI speech pathology database, weighted features; and (d) file-based, MAPACI speech pathology database, weighted features.
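The two validation schemes (ConV and CrossV) can be sketched as index-splitting helpers in plain Python. The function names `conv_split` and `crossv_folds`, the fixed random seed and the round-half-up rule for the 70% boundary are assumptions; only the 70/30 proportion and the 10 random sets come from Table 1.

```python
import random

def conv_split(n, train_percent=70, seed=0):
    """Conventional validation: one random split into training and testing indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = (train_percent * n + 50) // 100  # round half up using integer arithmetic
    return idx[:n_train], idx[n_train:]

def crossv_folds(n, folds=10, seed=0):
    """Cross-validation: yield (train_idx, test_idx) once for each of the random sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    parts = [idx[i::folds] for i in range(folds)]
    for i in range(folds):
        train = [j for k, part in enumerate(parts) if k != i for j in part]
        yield train, parts[i]

# Frame-based MEEI set: 1557 pathological + 1537 normal = 3094 feature vectors
train_idx, test_idx = conv_split(3094)
print(len(train_idx), len(test_idx))
```

With this rounding rule the split sizes agree with the training and testing counts listed in Table 1 for all four data sets.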
To test the classifier performance, three measures, namely sensitivity, specificity and overall accuracy, are considered. They are calculated from the counts of true positives (TP: the classifier outputs pathological when the sample is pathological), true negatives (TN: the classifier outputs normal when the sample is normal), false positives (FP: the classifier outputs pathological when the sample is normal) and false negatives (FN: the classifier outputs normal when the sample is pathological). The three measures are given in Equations (9), (10) and (11), respectively.
$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$
$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$
$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
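Equations (9) to (11) can be checked with a short Python sketch. The function name `classification_metrics` and the confusion counts are hypothetical and chosen only for illustration; they are not results from this work.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) as percentages."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathological and one normal sample")
    sensitivity = 100.0 * tp / (tp + fn)                # Equation (9)
    specificity = 100.0 * tn / (tn + fp)                # Equation (10)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)  # Equation (11)
    return sensitivity, specificity, accuracy

# Hypothetical counts for illustration only (not taken from the paper):
# 50 pathological samples (45 detected) and 50 normal samples (48 detected).
se, sp, acc = classification_metrics(tp=45, tn=48, fp=2, fn=5)
print(f"SE = {se:.2f}%, SP = {sp:.2f}%, ACC = {acc:.2f}%")
# prints: SE = 90.00%, SP = 96.00%, ACC = 93.00%
```

For a fixed test set, the overall accuracy is a weighted average of sensitivity and specificity, with weights given by the share of pathological and normal samples.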
In the k-NN classifier, different values of ‘k’ between 1 and 10 are used. PNN and GRNN are trained with different spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter (γ, gam) and σ² (sig2) for the SVM classifier are chosen optimally during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 for the MEEI database and in Table 3 for the MAPACI speech pathology database. Average classification accuracies with standard deviations are tabulated. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained during the file-based analysis using all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. During the frame-based analysis, a maximum classification accuracy of above 95% is obtained using the raw features and 100% using the weighted features.
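The parameter ranges and the "mean ± standard deviation" summaries described above can be sketched in Python as follows. The 0.005 step is the one stated for the GRNN spread factor; applying it to PNN as well is an assumption. The accuracy values in `runs` are hypothetical, and the use of the sample (rather than population) standard deviation is also an assumption.

```python
from statistics import mean, stdev

# k-NN: every integer k from 1 to 10.
k_values = list(range(1, 11))

# PNN/GRNN: spread factors 0.01, 0.015, ..., 0.055 (step 0.005).
# round() removes binary floating-point noise from the repeated additions.
spread_factors = [round(0.01 + 0.005 * i, 3) for i in range(10)]

def summarise(accuracies):
    """Format accuracies (in %) as 'mean ± sample standard deviation'."""
    return f"{mean(accuracies):.2f} ± {stdev(accuracies):.2f}"

# Hypothetical accuracies from five runs, for illustration only.
runs = [94.1, 95.0, 93.6, 94.8, 94.4]
print(k_values)
print(spread_factors)
print(summarise(runs))
```

Each range contains ten settings, so every table entry for k-NN, PNN or GRNN would summarise results over ten parameter values per validation scheme under these assumptions.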
Table 2. Comparison of results for the MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%). Parameter: k range for k-NN, parameter entry for LS-SVM, spread-factor range for PNN and GRNN.
Classifier  Parameter   Experiment   Validation  Raw SE         Raw SP         Raw ACC        Weighted SE    Weighted SP    Weighted ACC
k-NN        k = 1–10    Frame-based  CrossV      96.66 ± 0.55   91.69 ± 1.09   94.05 ± 0.83   99.81 ± 0.06   98.27 ± 0.34   99.03 ± 0.19
k-NN        k = 1–10    Frame-based  ConV        95.96 ± 0.71   90.87 ± 1.54   93.27 ± 1.16   99.81 ± 0.06   97.98 ± 0.33   98.88 ± 0.15
k-NN        k = 1–10    File-based   CrossV      99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19   99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19
k-NN        k = 1–10    File-based   ConV        99.74 ± 0.25   100 ± 0.00     99.79 ± 0.20   100 ± 0.00     100 ± 0.00     100 ± 0.00
LS-SVM      10 000 01   Frame-based  CrossV      95.10 ± 0.21   95.41 ± 0.20   95.25 ± 0.12   99.75 ± 0.03   99.45 ± 0.07   99.60 ± 0.04
LS-SVM      10 000 01   Frame-based  ConV        94.92 ± 0.90   94.49 ± 1.05   94.70 ± 0.81   99.34 ± 0.29   99.46 ± 0.34   99.40 ± 0.26
LS-SVM      10 01       File-based   CrossV      98.92 ± 0.62   99.81 ± 0.60   99.12 ± 0.47   99.32 ± 0.36   100 ± 0.00     99.47 ± 0.28
LS-SVM      10 01       File-based   ConV        100 ± 0.00     98.87 ± 1.30   99.12 ± 1.03   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
PNN         0.01–0.055  Frame-based  CrossV      97.43 ± 0.26   93.01 ± 0.17   95.12 ± 0.17   99.88 ± 0.05   98.69 ± 0.16   99.28 ± 0.07
PNN         0.01–0.055  Frame-based  ConV        96.79 ± 0.24   92.44 ± 0.33   94.51 ± 0.24   99.81 ± 0.06   98.41 ± 0.21   99.10 ± 0.11
PNN         0.01–0.055  File-based   CrossV      97.44 ± 5.39   100 ± 0.00     97.74 ± 4.97   97.50 ± 5.37   100 ± 0.00     97.79 ± 4.95
PNN         0.01–0.055  File-based   ConV        97.09 ± 6.10   100 ± 0.00     97.37 ± 5.79   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
GRNN        0.01–0.055  Frame-based  CrossV      93.01 ± 3.89   88.08 ± 4.98   90.15 ± 4.31   99.50 ± 0.41   97.53 ± 1.30   98.27 ± 0.89
GRNN        0.01–0.055  Frame-based  ConV        92.52 ± 3.78   87.07 ± 5.11   89.34 ± 4.35   99.39 ± 0.55   97.35 ± 1.16   98.10 ± 0.91
GRNN        0.01–0.055  File-based   CrossV      97.27 ± 5.99   100 ± 0.00     96.33 ± 8.00   98.07 ± 4.82   100 ± 0.00     97.92 ± 5.11
GRNN        0.01–0.055  File-based   ConV        96.69 ± 6.75   100 ± 0.00     95.81 ± 9.05   96.77 ± 6.23   100 ± 0.00     95.81 ± 8.15
Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Values are mean ± standard deviation (%). Parameter: k range for k-NN, parameter entry for LS-SVM, spread-factor range for PNN and GRNN.
Classifier  Parameter   Experiment   Validation  Raw SE         Raw SP         Raw ACC        Weighted SE    Weighted SP    Weighted ACC
k-NN        k = 1–10    Frame-based  CrossV      94.49 ± 0.36   96.58 ± 0.39   95.51 ± 0.21   98.67 ± 0.23   98.76 ± 0.25   98.71 ± 0.23
k-NN        k = 1–10    Frame-based  ConV        94.20 ± 0.40   96.29 ± 0.33   95.21 ± 0.32   100 ± 0.00     100 ± 0.00     100 ± 0.00
k-NN        k = 1–10    File-based   CrossV      77.10 ± 7.84   66.92 ± 3.30   70.42 ± 3.65   100 ± 0.00     100 ± 0.00     98.33 ± 1.64
k-NN        k = 1–10    File-based   ConV        77.56 ± 7.11   67.51 ± 2.48   69.71 ± 2.70   99.88 ± 0.40   99.88 ± 0.40   97.50 ± 2.16
LS-SVM      100 01      Frame-based  CrossV      95.50 ± 0.18   97.28 ± 0.24   96.37 ± 0.16   99.04 ± 0.08   99.04 ± 0.08   99.24 ± 0.06
LS-SVM      100 01      Frame-based  ConV        96.86 ± 0.81   95.24 ± 0.67   96.03 ± 0.40   98.71 ± 0.68   98.71 ± 0.68   99.02 ± 0.41
LS-SVM      10 1        File-based   CrossV      79.31 ± 3.33   68.87 ± 2.40   72.92 ± 2.60   100 ± 0.00     100 ± 0.00     95.83 ± 0.00
LS-SVM      10 1        File-based   ConV        74.53 ± 7.82   84.00 ± 13.01  77.14 ± 8.78   100 ± 0.00     100 ± 0.00     96.43 ± 3.76
PNN         0.01–0.055  Frame-based  CrossV      94.88 ± 0.20   96.23 ± 0.26   95.55 ± 0.21   99.82 ± 0.07   99.82 ± 0.07   99.91 ± 0.04
PNN         0.01–0.055  Frame-based  ConV        94.52 ± 0.31   96.03 ± 0.35   95.26 ± 0.25   99.76 ± 0.07   99.76 ± 0.07   99.88 ± 0.03
PNN         0.01–0.055  File-based   CrossV      72.30 ± 3.96   70.53 ± 3.93   70.83 ± 1.70   98.57 ± 4.52   98.57 ± 4.52   99.17 ± 2.63
PNN         0.01–0.055  File-based   ConV        69.02 ± 5.33   69.18 ± 4.10   67.93 ± 3.86   98.60 ± 2.73   98.60 ± 2.73   99.00 ± 1.72
GRNN        0.01–0.055  Frame-based  CrossV      90.41 ± 3.78   91.79 ± 3.35   90.98 ± 3.45   99.59 ± 0.29   99.59 ± 0.29   99.71 ± 0.23
GRNN        0.01–0.055  Frame-based  ConV        90.02 ± 3.61   91.78 ± 3.16   90.72 ± 3.24   99.53 ± 0.30   99.53 ± 0.30   99.66 ± 0.24
GRNN        0.01–0.055  File-based   CrossV      69.43 ± 8.51   70.75 ± 4.86   66.67 ± 7.73   97.13 ± 6.40   97.13 ± 6.40   93.75 ± 9.27
GRNN        0.01–0.055  File-based   ConV        62.05 ± 9.40   64.51 ± 3.83   59.57 ± 8.50   97.38 ± 5.68   97.38 ± 5.68   93.07 ± 9.91
Table 4. Computation time (in seconds) for feature extraction and classification.
Classification
Classifier  Parameter   Experiment   Validation  MEEI weighted  MEEI raw  MAPACI weighted  MAPACI raw
k-NN        k = 1–10    Frame-based  CrossV      0.338          0.341     0.342            0.346
k-NN        k = 1–10    Frame-based  ConV        3.806          3.821     6.931            6.955
k-NN        k = 1–10    File-based   CrossV      0.036          0.038     0.036            0.036
k-NN        k = 1–10    File-based   ConV        1.759          1.754     3.001            3.206
LS-SVM      10 001      Frame-based  CrossV      1.139          1.175     1.191            1.214
LS-SVM      10 001      Frame-based  ConV        0.718          0.698     0.705            0.723
LS-SVM      10 1        File-based   CrossV      0.005          0.005     0.002            0.002
LS-SVM      10 1        File-based   ConV        0.006          0.006     0.004            0.004
PNN         0.01–0.055  Frame-based  CrossV      2.458          3.084     2.445            2.421
PNN         0.01–0.055  Frame-based  ConV        10.340         10.356    11.180           11.202
PNN         0.01–0.055  File-based   CrossV      1.078          1.124     1.093            1.062
PNN         0.01–0.055  File-based   ConV        4.250          4.266     4.072            4.047
GRNN        0.01–0.055  Frame-based  CrossV      2.587          2.571     2.676            2.582
GRNN        0.01–0.055  Frame-based  ConV        10.726         10.648    11.634           11.584
GRNN        0.01–0.055  File-based   CrossV      1.098          1.056     1.065            1.062
GRNN        0.01–0.055  File-based   ConV        4.302          4.260     4.049            4.066
Feature extraction
Experiment   MEEI normal  MEEI pathological  MAPACI normal  MAPACI pathological
File-based   0.549        0.034              0.109          0.130
Frame-based  0.591        0.181              1.734          1.737
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that, during the frame-based analysis, a maximum accuracy of above 95% is obtained using the raw features and 100% using the weighted features. During the file-based analysis, a maximum classification accuracy of 92% is obtained using the raw features and 100% using the weighted features.
Figures 6(a)–(d) show the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figures 7(a)–(d) illustrate the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the two databases using the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend highly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
6. Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and by using different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.
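The strategy of running k-means several times and keeping the best cluster centres can be sketched in plain Python as follows. This is an illustrative sketch, not the MATLAB implementation used in this work. The function names, the toy one-dimensional data and the use of the within-cluster sum of squared distances as the selection criterion are all assumptions.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_once(points, k, rng, n_iter=50):
    """One Lloyd's-algorithm run from a random initialisation."""
    centres = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centres[j]))
            clusters[nearest].append(p)
        # Recompute each centre as the mean of its cluster;
        # an empty cluster keeps its previous centre.
        new_centres = [
            [sum(col) / len(c) for col in zip(*c)] if c else centres[j]
            for j, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    sse = sum(min(sq_dist(p, c) for c in centres) for p in points)
    return centres, sse

def kmeans_restarts(points, k, n_restarts=10, seed=0):
    """Run k-means n_restarts times and keep the run with the lowest SSE."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(n_restarts)),
               key=lambda result: result[1])

# Hypothetical one-dimensional feature values forming two groups.
points = [[0.1], [0.2], [0.15], [0.9], [1.0], [0.95]]
centres, sse = kmeans_restarts(points, k=2)
print(sorted(c[0] for c in centres), sse)
```

Keeping the run with the lowest sum of squared distances is one common way to reduce the sensitivity to initialisation; seeding schemes that spread the initial centres apart are an alternative to plain restarts.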
Acknowledgements
This work is done in part with data transferred from the database MAPACI, http://www.mapaci.com. The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.
Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Soft Computing, Journal of Medical Systems, Scientific Research and Essays, and so on. He is a member of the editorial board of Journal of Neural Computing and Applications (SCI expanded).
Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has also been a member of IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04: The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab.
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, in Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co, pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University of California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
Figure 6. Comparison of results: (a) frame-based, MEEI database, raw features; (b) file-based, MEEI database, raw features; (c) frame-based, MAPACI speech pathology database, raw features; and (d) file-based, MAPACI speech pathology database, raw features.
summation layer and output layer. The performance of the GRNN classifier depends strongly on the smoothing parameter, or spread factor (σ). Based on the experimental investigations, the σ value is varied between 0.01 and 0.055 in steps of 0.005.
4.4. Support vector machine
In this work, SVM is used as a classifier. It is a promising method for solving non-linear classification problems, function estimation, density estimation and pattern recognition tasks (Azadi and Almasganj 2011; Calisir and Dogantekin 2011; Ismail, Samsudin, and Shabri 2011; Yu, Yao, Wang, and Lai 2011; Martis, Acharya, Mandana, Ray, and Chakraborty 2012). It was originally proposed to classify samples into two classes. It maps the training samples of the two classes into a higher dimensional space through a kernel function. SVM seeks an optimal separating hyperplane in this new space that maximises its distance from the closest training points. During testing, a query point is categorised according to the distance between the point and the hyperplane.
SVM models are built around a kernel function thattransforms the input data into an n-dimensional space where
a hyperplane can be constructed to partition the data Threekinds of kernel functions such as linear kernel multilayerkernel and radial basis function (RBF) kernel are normallyused by researchers (Azadi and Almasganj 2011 Calisirand Dogantekin 2011 Ismail et al 2011 Yu et al 2011Martis et al 2012) In this work the RBF kernel functionis used since it gives an excellent generalisation and lowcomputational cost In the RBF kernel σ 2 (sig2) is theimportant parameter and it causes the changes in shapeflexion of the hyperplane
In this work LS-SVMlab toolbox (Suykens VanGestel De Brabanter De Moor and Vandewalle2003 De Brabanter 2010) is used to perform classificationof normal and pathological speech samples There are twoparameters which are to be chosen optimally such as reg-ularisation parameter (γ gam) and σ 2 (squared bandwidthof the RBF kernel) to obtain better accuracy
5 Results and discussions
Many research works have already been done in the areaof automatic detection of voice pathology In this work anew feature vector using WPT and SVD is proposed to clas-sify the voice samples into normal or pathological k-means
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
International Journal of Systems Science 1629
Figure 7 Comparison of results (a) Frame-based MEEI database weighted features (b) file-based MEEI database weighted features(c) Frame-based MAPACI speech pathology database weighted features and (d) file-based MAPACI speech pathology database weightedfeatures
clustering based feature weighting method is proposed forpreprocessing the features which increases its discrimina-tion ability between classes and the classification accuracyas well Four classifiers are employed to investigate the ef-ficacy of the raw and weighted features and two validationschemes (ConV and CrossV) are used to prove the reliabil-ity of the classification results Details of the training andtesting sets used in this work are tabulated in (Table 1)
To test the classifier performance several measuresnamely sensitivity specificity and the overall accuracy areconsidered These measures are calculated from the mea-sures true positive (TP the classifier classified as pathologywhen pathological samples are present) true negative (TNthe classifier classified as normal when normal samples arepresent) false positive (FP the classifier classified as patho-logical when normal samples are present) and false negative(FN the classifier classified as normal when pathologicalsamples are present) These measures are given in Equa-tions (9) (10) and (11) respectively
$$\text{Sensitivity (SE)} = \frac{TP}{TP + FN} \tag{9}$$
$$\text{Specificity (SP)} = \frac{TN}{TN + FP} \tag{10}$$
$$\text{Overall accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
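Equations (9) to (11) translate directly into code. The following is a small sketch only; the function names and the label convention (1 = pathological, 0 = normal) are assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN, treating `positive` as the pathological class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """Equation (9): fraction of pathological samples detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Equation (10): fraction of normal samples recognised as normal."""
    return tn / (tn + fp)

def overall_accuracy(tp, tn, fp, fn):
    """Equation (11): fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print(sensitivity(tp, fn), specificity(tn, fp), overall_accuracy(tp, tn, fp, fn))
```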
For the k-NN classifier, values of ‘k’ between 1 and 10 are used. PNN and GRNN are trained with spread factors between 0.01 and 0.055. Suitable values of the regularisation parameter γ (gam) and σ² (sig2) for the LS-SVM classifier are chosen during training and testing to obtain better accuracy. Sensitivity, specificity and overall accuracy of the frame-based and file-based analyses using the raw and weighted features are presented in Table 2 for the MEEI database and in Table 3 for the MAPACI speech pathology database; average classification accuracies are tabulated with their standard deviations. From Table 2, it is observed that the accuracy of the file-based analysis is always better than that of the frame-based analysis. A maximum classification accuracy of 100% is obtained in the file-based analysis with all the classifiers (k-NN, LS-SVM, PNN and GRNN) for the proposed raw and weighted features. In the frame-based analysis, a maximum classification accuracy above 95% is obtained using the raw features and 100% using the weighted features.
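The way a parameter sweep yields "mean ± standard deviation" entries can be sketched as follows for k-NN. This is an illustration, not the authors' MATLAB code: the function names, the 10-fold split and the synthetic data are assumptions, and the paper's CrossV scheme may be configured differently.

```python
import random
import statistics

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (ties go to the smallest label)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    votes = [train_y[i] for i in nearest]
    return max(sorted(set(votes)), key=votes.count)

def cross_validated_accuracy(X, y, k, folds=10, seed=0):
    """Return (mean, standard deviation) of per-fold accuracy in percent.

    Assumes len(X) >= folds >= 2 so that every fold holds at least one sample.
    """
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    scores = []
    for f in range(folds):
        test_idx = order[f::folds]
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        scores.append(100.0 * hits / len(test_idx))
    return statistics.mean(scores), statistics.stdev(scores)

def sweep_k(X, y, k_values=range(1, 11)):
    """Evaluate k = 1..10 and return {k: (mean_acc, std_acc)}."""
    return {k: cross_validated_accuracy(X, y, k) for k in k_values}

if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic two-class data standing in for feature vectors.
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
    X += [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(20)]
    y = [0] * 20 + [1] * 20
    for k, (mean_acc, std_acc) in sweep_k(X, y).items():
        print(k, round(mean_acc, 2), round(std_acc, 2))
```

The same loop structure applies to the spread factor of PNN and GRNN; only the classifier call changes.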
Table 2. Comparison of results for the MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Entries are mean ± standard deviation, in %.
Classifier | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC
k-NN (k-value 1–10) | Frame-based | CrossV | 96.66 ± 0.55 | 91.69 ± 1.09 | 94.05 ± 0.83 | 99.81 ± 0.06 | 98.27 ± 0.34 | 99.03 ± 0.19
k-NN (k-value 1–10) | Frame-based | ConV | 95.96 ± 0.71 | 90.87 ± 1.54 | 93.27 ± 1.16 | 99.81 ± 0.06 | 97.98 ± 0.33 | 98.88 ± 0.15
k-NN (k-value 1–10) | File-based | CrossV | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19 | 99.54 ± 0.24 | 100 ± 0.00 | 99.65 ± 0.19
k-NN (k-value 1–10) | File-based | ConV | 99.74 ± 0.25 | 100 ± 0.00 | 99.79 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00
LS-SVM | Frame-based | CrossV | 95.10 ± 0.21 | 95.41 ± 0.20 | 95.25 ± 0.12 | 99.75 ± 0.03 | 99.45 ± 0.07 | 99.60 ± 0.04
LS-SVM | Frame-based | ConV | 94.92 ± 0.90 | 94.49 ± 1.05 | 94.70 ± 0.81 | 99.34 ± 0.29 | 99.46 ± 0.34 | 99.40 ± 0.26
LS-SVM | File-based | CrossV | 98.92 ± 0.62 | 99.81 ± 0.60 | 99.12 ± 0.47 | 99.32 ± 0.36 | 100 ± 0.00 | 99.47 ± 0.28
LS-SVM | File-based | ConV | 100 ± 0.00 | 98.87 ± 1.30 | 99.12 ± 1.03 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71
PNN (0.01–0.055) | Frame-based | CrossV | 97.43 ± 0.26 | 93.01 ± 0.17 | 95.12 ± 0.17 | 99.88 ± 0.05 | 98.69 ± 0.16 | 99.28 ± 0.07
PNN (0.01–0.055) | Frame-based | ConV | 96.79 ± 0.24 | 92.44 ± 0.33 | 94.51 ± 0.24 | 99.81 ± 0.06 | 98.41 ± 0.21 | 99.10 ± 0.11
PNN (0.01–0.055) | File-based | CrossV | 97.44 ± 5.39 | 100 ± 0.00 | 97.74 ± 4.97 | 97.50 ± 5.37 | 100 ± 0.00 | 97.79 ± 4.95
PNN (0.01–0.055) | File-based | ConV | 97.09 ± 6.10 | 100 ± 0.00 | 97.37 ± 5.79 | 99.62 ± 0.80 | 99.41 ± 1.86 | 99.56 ± 0.71
GRNN (0.01–0.055) | Frame-based | CrossV | 93.01 ± 3.89 | 88.08 ± 4.98 | 90.15 ± 4.31 | 99.50 ± 0.41 | 97.53 ± 1.30 | 98.27 ± 0.89
GRNN (0.01–0.055) | Frame-based | ConV | 92.52 ± 3.78 | 87.07 ± 5.11 | 89.34 ± 4.35 | 99.39 ± 0.55 | 97.35 ± 1.16 | 98.10 ± 0.91
GRNN (0.01–0.055) | File-based | CrossV | 97.27 ± 5.99 | 100 ± 0.00 | 96.33 ± 8.00 | 98.07 ± 4.82 | 100 ± 0.00 | 97.92 ± 5.11
GRNN (0.01–0.055) | File-based | ConV | 96.69 ± 6.75 | 100 ± 0.00 | 95.81 ± 9.05 | 96.77 ± 6.23 | 100 ± 0.00 | 95.81 ± 8.15
The LS-SVM rows carry a parameter label whose decimal points are not recoverable in this copy; the digits read ‘10 00001’ (frame-based) and ‘10 01’ (file-based).
Table 3. Comparison of results for the MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features). Entries are mean ± standard deviation, in %.
Classifier | Experiment | Validation | Raw SE | Raw SP | Raw ACC | Weighted SE | Weighted SP | Weighted ACC
k-NN (k-value 1–10) | Frame-based | CrossV | 94.49 ± 0.36 | 96.58 ± 0.39 | 95.51 ± 0.21 | 98.67 ± 0.23 | 98.76 ± 0.25 | 98.71 ± 0.23
k-NN (k-value 1–10) | Frame-based | ConV | 94.20 ± 0.40 | 96.29 ± 0.33 | 95.21 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00
k-NN (k-value 1–10) | File-based | CrossV | 77.10 ± 7.84 | 66.92 ± 3.30 | 70.42 ± 3.65 | 100 ± 0.00 | 100 ± 0.00 | 98.33 ± 1.64
k-NN (k-value 1–10) | File-based | ConV | 77.56 ± 7.11 | 67.51 ± 2.48 | 69.71 ± 2.70 | 99.88 ± 0.40 | 99.88 ± 0.40 | 97.50 ± 2.16
LS-SVM | Frame-based | CrossV | 95.50 ± 0.18 | 97.28 ± 0.24 | 96.37 ± 0.16 | 99.04 ± 0.08 | 99.04 ± 0.08 | 99.24 ± 0.06
LS-SVM | Frame-based | ConV | 96.86 ± 0.81 | 95.24 ± 0.67 | 96.03 ± 0.40 | 98.71 ± 0.68 | 98.71 ± 0.68 | 99.02 ± 0.41
LS-SVM | File-based | CrossV | 79.31 ± 3.33 | 68.87 ± 2.40 | 72.92 ± 2.60 | 100 ± 0.00 | 100 ± 0.00 | 95.83 ± 0.00
LS-SVM | File-based | ConV | 74.53 ± 7.82 | 84.00 ± 13.01 | 77.14 ± 8.78 | 100 ± 0.00 | 100 ± 0.00 | 96.43 ± 3.76
PNN (0.01–0.055) | Frame-based | CrossV | 94.88 ± 0.20 | 96.23 ± 0.26 | 95.55 ± 0.21 | 99.82 ± 0.07 | 99.82 ± 0.07 | 99.91 ± 0.04
PNN (0.01–0.055) | Frame-based | ConV | 94.52 ± 0.31 | 96.03 ± 0.35 | 95.26 ± 0.25 | 99.76 ± 0.07 | 99.76 ± 0.07 | 99.88 ± 0.03
PNN (0.01–0.055) | File-based | CrossV | 72.30 ± 3.96 | 70.53 ± 3.93 | 70.83 ± 1.70 | 98.57 ± 4.52 | 98.57 ± 4.52 | 99.17 ± 2.63
PNN (0.01–0.055) | File-based | ConV | 69.02 ± 5.33 | 69.18 ± 4.10 | 67.93 ± 3.86 | 98.60 ± 2.73 | 98.60 ± 2.73 | 99.00 ± 1.72
GRNN (0.01–0.055) | Frame-based | CrossV | 90.41 ± 3.78 | 91.79 ± 3.35 | 90.98 ± 3.45 | 99.59 ± 0.29 | 99.59 ± 0.29 | 99.71 ± 0.23
GRNN (0.01–0.055) | Frame-based | ConV | 90.02 ± 3.61 | 91.78 ± 3.16 | 90.72 ± 3.24 | 99.53 ± 0.30 | 99.53 ± 0.30 | 99.66 ± 0.24
GRNN (0.01–0.055) | File-based | CrossV | 69.43 ± 8.51 | 70.75 ± 4.86 | 66.67 ± 7.73 | 97.13 ± 6.40 | 97.13 ± 6.40 | 93.75 ± 9.27
GRNN (0.01–0.055) | File-based | ConV | 62.05 ± 9.40 | 64.51 ± 3.83 | 59.57 ± 8.50 | 97.38 ± 5.68 | 97.38 ± 5.68 | 93.07 ± 9.91
The LS-SVM rows carry a parameter label whose decimal points are not recoverable in this copy; the digits read ‘10 001’ (frame-based) and ‘10 1’ (file-based).
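Table 4. Computation time (in seconds) for feature extraction and classification.
Classification
Classifier | Experiment | Validation | MEEI weighted features | MEEI raw features | MAPACI weighted features | MAPACI raw features
k-NN (k-value 1–10) | Frame-based | CrossV | 0.338 | 0.341 | 0.342 | 0.346
k-NN (k-value 1–10) | Frame-based | ConV | 3.806 | 3.821 | 6.931 | 6.955
k-NN (k-value 1–10) | File-based | CrossV | 0.036 | 0.038 | 0.036 | 0.036
k-NN (k-value 1–10) | File-based | ConV | 1.759 | 1.754 | 3.001 | 3.206
LS-SVM (10, 0.01) | Frame-based | CrossV | 1.139 | 1.175 | 1.191 | 1.214
LS-SVM (10, 0.01) | Frame-based | ConV | 0.718 | 0.698 | 0.705 | 0.723
LS-SVM (10, 1) | File-based | CrossV | 0.005 | 0.005 | 0.002 | 0.002
LS-SVM (10, 1) | File-based | ConV | 0.006 | 0.006 | 0.004 | 0.004
PNN (0.01–0.055) | Frame-based | CrossV | 2.458 | 3.084 | 2.445 | 2.421
PNN (0.01–0.055) | Frame-based | ConV | 10.340 | 10.356 | 11.180 | 11.202
PNN (0.01–0.055) | File-based | CrossV | 1.078 | 1.124 | 1.093 | 1.062
PNN (0.01–0.055) | File-based | ConV | 4.250 | 4.266 | 4.072 | 4.047
GRNN (0.01–0.055) | Frame-based | CrossV | 2.587 | 2.571 | 2.676 | 2.582
GRNN (0.01–0.055) | Frame-based | ConV | 10.726 | 10.648 | 11.634 | 11.584
GRNN (0.01–0.055) | File-based | CrossV | 1.098 | 1.056 | 1.065 | 1.062
GRNN (0.01–0.055) | File-based | ConV | 4.302 | 4.260 | 4.049 | 4.066
Feature extraction
Experiment | MEEI normal | MEEI pathological | MAPACI normal | MAPACI pathological
File-based | 0.549 | 0.034 | 0.109 | 0.130
Frame-based | 0.591 | 0.181 | 1.734 | 1.737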
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is investigated in addition to the MEEI database. From Table 3, it is inferred that, in the frame-based analysis, a maximum accuracy above 95% is obtained using the raw features and 100% using the weighted features. In the file-based analysis, a maximum classification accuracy of 92% is obtained using the raw features and 100% using the weighted features.
Figure 6(a)–(d) shows the comparison of results (maximum accuracy and the corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features in the frame-based and file-based analyses. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and the corresponding sensitivity and specificity) for the two databases using the proposed weighted features. The weighted features always perform better than the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed in MATLAB and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB of RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification in the two experiments (frame-based and file-based). A direct comparison with previous works cannot be performed, since most of them do not report the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy can be obtained using the suggested feature extraction and classification algorithms with low computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated in many experimental studies, with reported accuracies varying from approximately 85% to 100% under different experiments. Those results depend strongly on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. The present work, however, gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method improves the robustness of the WPT–SVD features and increases the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
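The idea of k-means clustering based feature weighting can be sketched as follows. The exact procedure is defined earlier in the paper and may differ in detail; this sketch assumes the ratio scheme described by Gunes, Polat and Yosunkaya (2010), in which each feature is multiplied by the ratio of its mean to the mean of its cluster centres. The function names, the number of clusters and the toy data are hypothetical.

```python
import random

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the list of cluster centres."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(X, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in X:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])),
            )
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep empty clusters in place.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centres[c]
            for c, g in enumerate(groups)
        ]
        if updated == centres:
            break
        centres = updated
    return centres

def kmeans_feature_weights(X, k=2, seed=0):
    """Weight of feature j = mean of feature j / mean of the centres' j-th coordinate."""
    centres = kmeans(X, k, seed=seed)
    weights = []
    for j in range(len(X[0])):
        feature_mean = sum(p[j] for p in X) / len(X)
        centre_mean = sum(c[j] for c in centres) / k
        # Fall back to a neutral weight if the centres average to zero.
        weights.append(feature_mean / centre_mean if centre_mean != 0 else 1.0)
    return weights

def apply_weights(X, weights):
    """Scale every feature column by its weight."""
    return [[v * w for v, w in zip(p, weights)] for p in X]

if __name__ == "__main__":
    # Hypothetical raw feature vectors (rows = samples, columns = features).
    X = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0], [5.0, 50.0]]
    w = kmeans_feature_weights(X, k=2)
    print(w, apply_weights(X, w))
```

The weighted vectors would then replace the raw ones as classifier inputs; the classifiers themselves are unchanged.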
6 Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology, combining a feature extraction method (WPT–SVD features) with a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers (k-NN, LS-SVM, PNN and GRNN) are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give very promising classification accuracy, reaching 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres; hence, in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could also be extended to detect specific types of disorder and to develop an online diagnosis system.
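The paper does not name a particular initialisation scheme. As one well-known illustration of what "better initial placement" can mean, the sketch below implements k-means++ seeding, which picks each new centre with probability proportional to its squared distance from the centres already chosen. The function name and toy data are assumptions made for this example.

```python
import random

def kmeans_pp_init(X, k, seed=0):
    """Choose k initial centres from X using k-means++ seeding."""
    rng = random.Random(seed)
    centres = [list(rng.choice(X))]
    while len(centres) < k:
        # Squared distance from every point to its nearest chosen centre.
        d2 = [
            min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres)
            for p in X
        ]
        if sum(d2) == 0:
            # Fewer distinct points than centres: fall back to a uniform pick.
            centres.append(list(rng.choice(X)))
            continue
        # Points far from all current centres are the most likely to be drawn.
        centres.append(list(rng.choices(X, weights=d2)[0]))
    return centres

if __name__ == "__main__":
    # Hypothetical data with two well-separated groups.
    X = [[0.0, 0.0], [0.0, 0.1], [10.0, 10.0], [10.0, 10.1]]
    print(kmeans_pp_init(X, 2))
```

Seeding of this kind reduces, but does not remove, the sensitivity to initialisation, so repeated runs as described above may still be useful.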
Acknowledgements
This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.
Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).
Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.
References
Ananthakrishna, T., Shama, K., and Niranjan, U. (2004), ‘k-Means Nearest Neighbor Classifier for Voice Pathology’, in Paper Presented at the Proceedings from INDICON’04, The 1st IEEE Annual India Conference, Kharagpur, India, pp. 352–354.
Arjmandi, M.K., and Pooyan, M. (2011), ‘An Optimum Algorithm in Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features, Linear Discriminant Analysis and Support Vector Machine’, Biomedical Signal Processing and Control, 7, 3–19.
Azadi, T.E., and Almasganj, F. (2011), ‘PBSVM: Partitioning and Biased Support Vector Machine for Vocal Fold Pathology Assessment Using Labeled and Unlabeled Data Sets’, Expert Systems with Applications, 38(1), 610–619.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, Norwell, MA: Kluwer Academic Publishers.
Boyanov, B., and Hadjitodorov, S. (1997), ‘Acoustic Analysis of Pathological Voices: A Voice Analysis System for the Screening of Laryngeal Diseases’, IEEE Engineering in Medicine and Biology Magazine, 16(4), 74–82.
Boyanov, B., Ivanov, T., Hadjitodorov, S., and Chollet, G. (1993), ‘Robust Hybrid Pitch Detector’, IET Electronics Letters, 29, 1924–1926.
Burrus, C., Gopinath, R., Guo, H., Odegard, J., and Selesnick, I. (1997), Introduction to Wavelets and Wavelet Transforms: A Primer, Upper Saddle River, NJ: Prentice Hall.
Calisir, D., and Dogantekin, E. (2011), ‘A New Intelligent Hepatitis Diagnosis System: PCA-LSSVM’, Expert Systems with Applications, 38(8), 10705–10708.
Chiu, S.L. (1994), ‘Fuzzy Model Identification Based on Cluster Estimation’, Journal of Intelligent and Fuzzy Systems, 2(3), 267–278.
Crovato, C., and Schuck, A. (2007), ‘The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices’, IEEE Transactions on Biomedical Engineering, 54(10), 1898–1900.
De Brabanter, K., Ojeda, P.K.F., Alzate, C., De Brabanter, J., Pelckmans, K., De Moor, B., Vandewalle, J., and Suykens, J.A.K. (2010), ‘LS-SVMlab Toolbox’, http://www.esat.kuleuven.be/sista/lssvmlab.
Deliyski, D. (1993), ‘Acoustic Model and Evaluation of Pathological Voice Production’, in Paper Presented at the Proceedings from EUROSPEECH’93, The 3rd Conference on Speech Communication and Technology, Berlin, Germany, pp. 1969–1972.
Duda, R.O., Hart, P.E., and Stork, D.G. (2001), Pattern Classification, New York: Wiley-Interscience.
Erfanian Saeedi, N., Almasganj, F., and Torabinejad, F. (2011), ‘Support Vector Wavelet Adaptation for Pathological Voice Assessment’, Computers in Biology and Medicine, 41(9), 822–828.
Eskidere, O., Ertas, F., and Hanilci, C. (2011), ‘A Comparison of Regression Methods for Remote Tracking of Parkinson’s Disease Progression’, Expert Systems with Applications, 39(5), 5523–5528.
Feijoo, S., and Hernandez, C. (1990), ‘Short-Term Stability Measures for the Evaluation of Vocal Quality’, Journal of Speech and Hearing Research, 33(2), 324–334.
Fonseca, E., Guido, R., Scalassara, P., Maciel, C., and Pereira, J. (2007), ‘Wavelet Time-Frequency Analysis and Least Squares Support Vector Machines for the Identification of Voice Disorders’, Computers in Biology and Medicine (Elsevier), 37(4), 571–578.
Fukunaga, K. (1990), Introduction to Statistical Pattern Recognition (2nd ed.), San Diego, CA: Academic Press.
Godino-Llorente, J., and Gomez-Vilda, P. (2004), ‘Automatic Detection of Voice Impairments by Means of Short-Term Cepstral Parameters and Neural Network Based Detectors’, IEEE Transactions on Biomedical Engineering, 51(2), 380–384.
Godino-Llorente, J., Gomez-Vilda, P., and Blanco-Velasco, M. (2006), ‘Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters’, IEEE Transactions on Biomedical Engineering, 53(10), 1943–1953.
Gunes, S., Polat, K., and Yosunkaya, S. (2010), ‘Efficient Sleep Stage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weighting’, Expert Systems with Applications, 37(12), 7922–7928.
Hariharan, M., Paulraj, M., and Yaacob, S. (2009), ‘Identification of Vocal Fold Pathology Based on Mel Frequency Band Energy Coefficients and Singular Value Decomposition’, in Paper Presented at the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA’09), Kuala Lumpur, Malaysia, pp. 514–517.
Hariharan, M., Paulraj, M., and Yaacob, S. (2011), ‘Detection of Vocal Fold Paralysis and Oedema Using Time-Domain Features and Probabilistic Neural Network’, International Journal of Biomedical Engineering and Technology, 6(1), 46–57.
Hariharan, M., Saraswathy, J., Sindhu, R., Khairunizam, W., and Yaacob, S. (2012), ‘Infant Cry Classification to Identify Asphyxia Using Time-Frequency Analysis and Radial Basis Neural Networks’, Expert Systems with Applications, 39(10), 9515–9523.
Hariharan, M., and Sazali, Y. (2010), ‘Time-Domain Features and Probabilistic Neural Network for the Detection of Vocal Fold Pathology’, Malaysian Journal of Computer Science, 23(1), 60–67.
Hariharan, M., Sindhu, R., and Yaacob, S. (2011), ‘Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network’, Computer Methods and Programs in Biomedicine, 108(2), 559–569.
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00, The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01, The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, in Paper Presented at the Proceedings from EUROSPEECH’93, The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html.
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php.
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00, The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03, The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99, The IEEE International Conference on Systems, Man and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99, The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98, The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
International Journal of Systems Science 1629
Figure 7 Comparison of results (a) Frame-based MEEI database weighted features (b) file-based MEEI database weighted features(c) Frame-based MAPACI speech pathology database weighted features and (d) file-based MAPACI speech pathology database weightedfeatures
clustering based feature weighting method is proposed forpreprocessing the features which increases its discrimina-tion ability between classes and the classification accuracyas well Four classifiers are employed to investigate the ef-ficacy of the raw and weighted features and two validationschemes (ConV and CrossV) are used to prove the reliabil-ity of the classification results Details of the training andtesting sets used in this work are tabulated in (Table 1)
To test the classifier performance several measuresnamely sensitivity specificity and the overall accuracy areconsidered These measures are calculated from the mea-sures true positive (TP the classifier classified as pathologywhen pathological samples are present) true negative (TNthe classifier classified as normal when normal samples arepresent) false positive (FP the classifier classified as patho-logical when normal samples are present) and false negative(FN the classifier classified as normal when pathologicalsamples are present) These measures are given in Equa-tions (9) (10) and (11) respectively
Sensitivity (SE) = TP(TP + FN) (9)
Specificity (SP) = TN(TN + FP) (10)
Overall accuracy (ACC) = (TP + TN)(TP + TN
+ FP + FN) (11)
In k-NN classifier different values of lsquokrsquo between 1 and10 are used PNN and GRNN are trained with differentspread factors between 001 and 0055 The suitable valueof the regularisation parameter (γ gam) and σ 2 (sig2) forSVM classifier are chosen optimally during training andtesting to obtain better accuracy Sensitivity specificity andoverall accuracy of frame-based and file-based analysis us-ing the raw and weighted features are presented in (Table 2)and (Table 3) for both MEEI database and MAPACIspeech pathology database respectively Average classifica-tion accuracies with standard deviations are tabulated From(Table 2) it is observed that the accuracy of file-based anal-ysis is always better than the frame-based analysis Max-imum classification accuracy of 100 is obtained duringthe file-based analysis using all the classifiers (k-NN LS-SVM PNN and GRNN) for the proposed raw and weightedfeatures Maximum classification accuracy of above 95is obtained using the raw features and 100 is obtained us-ing the weighted features during the frame-based analysis
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
1630 M Hariharan et al
Table 2. Comparison of results for MEEI database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

                                                   Raw features                                 Weighted features
Classifier  Parameter     Experiment   Validation  SE             SP             ACC            SE             SP             ACC
k-NN        k-value 1–10  Frame-based  CrossV      96.66 ± 0.55   91.69 ± 1.09   94.05 ± 0.83   99.81 ± 0.06   98.27 ± 0.34   99.03 ± 0.19
k-NN        k-value 1–10  Frame-based  ConV        95.96 ± 0.71   90.87 ± 1.54   93.27 ± 1.16   99.81 ± 0.06   97.98 ± 0.33   98.88 ± 0.15
k-NN        k-value 1–10  File-based   CrossV      99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19   99.54 ± 0.24   100 ± 0.00     99.65 ± 0.19
k-NN        k-value 1–10  File-based   ConV        99.74 ± 0.25   100 ± 0.00     99.79 ± 0.20   100 ± 0.00     100 ± 0.00     100 ± 0.00
LS-SVM      10 000 01     Frame-based  CrossV      95.10 ± 0.21   95.41 ± 0.20   95.25 ± 0.12   99.75 ± 0.03   99.45 ± 0.07   99.60 ± 0.04
LS-SVM      10 000 01     Frame-based  ConV        94.92 ± 0.90   94.49 ± 1.05   94.70 ± 0.81   99.34 ± 0.29   99.46 ± 0.34   99.40 ± 0.26
LS-SVM      10 01         File-based   CrossV      98.92 ± 0.62   99.81 ± 0.60   99.12 ± 0.47   99.32 ± 0.36   100 ± 0.00     99.47 ± 0.28
LS-SVM      10 01         File-based   ConV        100 ± 0.00     98.87 ± 1.30   99.12 ± 1.03   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
PNN         0.01–0.055    Frame-based  CrossV      97.43 ± 0.26   93.01 ± 0.17   95.12 ± 0.17   99.88 ± 0.05   98.69 ± 0.16   99.28 ± 0.07
PNN         0.01–0.055    Frame-based  ConV        96.79 ± 0.24   92.44 ± 0.33   94.51 ± 0.24   99.81 ± 0.06   98.41 ± 0.21   99.10 ± 0.11
PNN         0.01–0.055    File-based   CrossV      97.44 ± 5.39   100 ± 0.00     97.74 ± 4.97   97.50 ± 5.37   100 ± 0.00     97.79 ± 4.95
PNN         0.01–0.055    File-based   ConV        97.09 ± 6.10   100 ± 0.00     97.37 ± 5.79   99.62 ± 0.80   99.41 ± 1.86   99.56 ± 0.71
GRNN        0.01–0.055    Frame-based  CrossV      93.01 ± 3.89   88.08 ± 4.98   90.15 ± 4.31   99.50 ± 0.41   97.53 ± 1.30   98.27 ± 0.89
GRNN        0.01–0.055    Frame-based  ConV        92.52 ± 3.78   87.07 ± 5.11   89.34 ± 4.35   99.39 ± 0.55   97.35 ± 1.16   98.10 ± 0.91
GRNN        0.01–0.055    File-based   CrossV      97.27 ± 5.99   100 ± 0.00     96.33 ± 8.00   98.07 ± 4.82   100 ± 0.00     97.92 ± 5.11
GRNN        0.01–0.055    File-based   ConV        96.69 ± 6.75   100 ± 0.00     95.81 ± 9.05   96.77 ± 6.23   100 ± 0.00     95.81 ± 8.15
Table 3. Comparison of results for MAPACI speech pathology database using k-NN, LS-SVM, PNN and GRNN (raw and weighted features).

                                                   Raw features                                 Weighted features
Classifier  Parameter     Experiment   Validation  SE             SP             ACC            SE             SP             ACC
k-NN        k-value 1–10  Frame-based  CrossV      94.49 ± 0.36   96.58 ± 0.39   95.51 ± 0.21   98.67 ± 0.23   98.76 ± 0.25   98.71 ± 0.23
k-NN        k-value 1–10  Frame-based  ConV        94.20 ± 0.40   96.29 ± 0.33   95.21 ± 0.32   100 ± 0.00     100 ± 0.00     100 ± 0.00
k-NN        k-value 1–10  File-based   CrossV      77.10 ± 7.84   66.92 ± 3.30   70.42 ± 3.65   100 ± 0.00     100 ± 0.00     98.33 ± 1.64
k-NN        k-value 1–10  File-based   ConV        77.56 ± 7.11   67.51 ± 2.48   69.71 ± 2.70   99.88 ± 0.40   99.88 ± 0.40   97.50 ± 2.16
LS-SVM      10, 0.01      Frame-based  CrossV      95.50 ± 0.18   97.28 ± 0.24   96.37 ± 0.16   99.04 ± 0.08   99.04 ± 0.08   99.24 ± 0.06
LS-SVM      10, 0.01      Frame-based  ConV        96.86 ± 0.81   95.24 ± 0.67   96.03 ± 0.40   98.71 ± 0.68   98.71 ± 0.68   99.02 ± 0.41
LS-SVM      10, 1         File-based   CrossV      79.31 ± 3.33   68.87 ± 2.40   72.92 ± 2.60   100 ± 0.00     100 ± 0.00     95.83 ± 0.00
LS-SVM      10, 1         File-based   ConV        74.53 ± 7.82   84.00 ± 13.01  77.14 ± 8.78   100 ± 0.00     100 ± 0.00     96.43 ± 3.76
PNN         0.01–0.055    Frame-based  CrossV      94.88 ± 0.20   96.23 ± 0.26   95.55 ± 0.21   99.82 ± 0.07   99.82 ± 0.07   99.91 ± 0.04
PNN         0.01–0.055    Frame-based  ConV        94.52 ± 0.31   96.03 ± 0.35   95.26 ± 0.25   99.76 ± 0.07   99.76 ± 0.07   99.88 ± 0.03
PNN         0.01–0.055    File-based   CrossV      72.30 ± 3.96   70.53 ± 3.93   70.83 ± 1.70   98.57 ± 4.52   98.57 ± 4.52   99.17 ± 2.63
PNN         0.01–0.055    File-based   ConV        69.02 ± 5.33   69.18 ± 4.10   67.93 ± 3.86   98.60 ± 2.73   98.60 ± 2.73   99.00 ± 1.72
GRNN        0.01–0.055    Frame-based  CrossV      90.41 ± 3.78   91.79 ± 3.35   90.98 ± 3.45   99.59 ± 0.29   99.59 ± 0.29   99.71 ± 0.23
GRNN        0.01–0.055    Frame-based  ConV        90.02 ± 3.61   91.78 ± 3.16   90.72 ± 3.24   99.53 ± 0.30   99.53 ± 0.30   99.66 ± 0.24
GRNN        0.01–0.055    File-based   CrossV      69.43 ± 8.51   70.75 ± 4.86   66.67 ± 7.73   97.13 ± 6.40   97.13 ± 6.40   93.75 ± 9.27
GRNN        0.01–0.055    File-based   ConV        62.05 ± 9.40   64.51 ± 3.83   59.57 ± 8.50   97.38 ± 5.68   97.38 ± 5.68   93.07 ± 9.91
Table 4. Computation time for feature extraction and classification.

Classification
                                                   MEEI database       MAPACI database
Classifier  Parameter     Experiment   Validation  Weighted  Raw       Weighted  Raw
k-NN        k-value 1–10  Frame-based  CrossV      0.338     0.341     0.342     0.346
k-NN        k-value 1–10  Frame-based  ConV        3.806     3.821     6.931     6.955
k-NN        k-value 1–10  File-based   CrossV      0.036     0.038     0.036     0.036
k-NN        k-value 1–10  File-based   ConV        1.759     1.754     3.001     3.206
LS-SVM      10, 0.01      Frame-based  CrossV      1.139     1.175     1.191     1.214
LS-SVM      10, 0.01      Frame-based  ConV        0.718     0.698     0.705     0.723
LS-SVM      10, 1         File-based   CrossV      0.005     0.005     0.002     0.002
LS-SVM      10, 1         File-based   ConV        0.006     0.006     0.004     0.004
PNN         0.01–0.055    Frame-based  CrossV      2.458     3.084     2.445     2.421
PNN         0.01–0.055    Frame-based  ConV        10.340    10.356    11.180    11.202
PNN         0.01–0.055    File-based   CrossV      1.078     1.124     1.093     1.062
PNN         0.01–0.055    File-based   ConV        4.250     4.266     4.072     4.047
GRNN        0.01–0.055    Frame-based  CrossV      2.587     2.571     2.676     2.582
GRNN        0.01–0.055    Frame-based  ConV        10.726    10.648    11.634    11.584
GRNN        0.01–0.055    File-based   CrossV      1.098     1.056     1.065     1.062
GRNN        0.01–0.055    File-based   ConV        4.302     4.260     4.049     4.066

Feature extraction
                MEEI                    MAPACI
Experiment      Normal   Pathological   Normal   Pathological
File-based      0.549    0.034          0.109    0.130
Frame-based     0.591    0.181          1.734    1.737
To demonstrate the robustness and database independence of the proposed method, the MAPACI speech pathology database is used for the investigation in addition to the MEEI database. From Table 3, it is inferred that a maximum accuracy above 95% is obtained using the raw features and 100% using the weighted features during the frame-based analysis. During the file-based analysis, we obtained a maximum classification accuracy of 92% using the raw features and 100% using the weighted features.
Figure 6(a–d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a–d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed weighted features. The performance of the weighted features is always better than that of the raw features for both the MEEI database and the MAPACI speech pathology database. All the feature extraction and classification algorithms are developed on the MATLAB platform and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) and 4 GB RAM. Table 4 reports the average time taken (in seconds) during feature extraction and classification under two different experiments (frame-based and file-based). A direct comparison of our work with previous works in the literature cannot be performed, since most of them have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with less computation time.
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend heavily on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
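The weighting step can be sketched as follows. This is a minimal Python illustration and not the authors' MATLAB implementation: it assumes the formulation commonly used in k-means clustering based attribute weighting, in which each feature is multiplied by the ratio of its mean to the mean of its cluster centres. The helper names (`kmeans`, `feature_weights`, `apply_weights`), the number of clusters and the random initialisation are hypothetical choices, and the features are assumed to be positive (as singular values are) so that the ratio is well defined.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd-style k-means; initial centres are k randomly sampled points."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centres[c]))
            groups[nearest].append(p)
        new_centres = []
        for c, group in enumerate(groups):
            if group:
                new_centres.append([sum(col) / len(group) for col in zip(*group)])
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    return centres


def feature_weights(points, centres):
    """Assumed weighting rule: mean of each feature / mean of its cluster centres."""
    weights = []
    for j in range(len(points[0])):
        feature_mean = sum(p[j] for p in points) / len(points)
        centre_mean = sum(c[j] for c in centres) / len(centres)
        weights.append(feature_mean / centre_mean)
    return weights


def apply_weights(points, weights):
    """Multiply every feature value by the weight of its feature."""
    return [[x * w for x, w in zip(p, weights)] for p in points]
```

In this sketch, the weighted features would be obtained with `apply_weights(X, feature_weights(X, kmeans(X, k)))`, where `X` is a list of raw feature vectors and `k` is chosen by the user.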
6 Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres, and hence in our work it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres, and by using different kernel functions. The proposed method could be extended to detect specific types of disorders and also to develop an online diagnosing system.
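One candidate for such an initialisation is k-means++ seeding, which spreads the initial centres by drawing each new centre with probability proportional to its squared distance from the centres already chosen. The paper does not name a specific scheme, so k-means++ is used here purely as an example; the function name `kmeans_pp_init` and its parameters are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Return k initial centres chosen by k-means++ seeding."""
    rng = random.Random(seed)
    centres = [list(rng.choice(points))]  # first centre: uniform random point
    while len(centres) < k:
        # Squared distance of every point to its nearest chosen centre.
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(d2) == 0.0:
            # All points coincide with existing centres; fall back to a uniform draw.
            centres.append(list(rng.choice(points)))
        else:
            # Points far from the current centres are more likely to be picked.
            centres.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centres
```

The returned centres would replace the random starting positions of a k-means run; repeated runs with different seeds can still be compared by their within-cluster sum of squares.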
Acknowledgements
This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and his Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering; Computer Methods and Programs in Biomedicine; Artificial Intelligence in Biomedicine; International Journal of Phoniatrics, Speech Therapy and Communication Pathology; and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and from the same department with an MSc degree in 2004. He took his PhD degree in Electrical and Electronic Engineering at Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences; Pattern Recognition Letters; Digital Signal Processing; Soft Computing; Artificial Intelligence in Medicine; Applied Soft Computing; Computers in Biology and Medicine; Expert Systems; Computational Statistics & Data Analysis; IEEE Transactions on Biomedical Engineering; IEEE Transactions on Evolutionary Computation; TALANTA; Soft Computing; Journal of Medical Systems; and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.
ReferencesAnanthakrishna T Shama K and Niranjan U (2004)
lsquok-Means Nearest Neighbor Classifier for Voice Pathol-ogyrsquo in Paper Presented at the Proceedings from INDI-CONrsquo04 The 1st IEEE Annual India Conference KharagpurIndia pp 352ndash354
Arjmandi MK and Pooyan M (2011) lsquoAn Optimum Algorithmin Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features Linear Discriminant Analysis andSupport Vector Machinersquo Biomedical Signal Processing andControl 7 3ndash19
Azadi TE and Almasganj F (2011) lsquoPBSVM Partitioningand Biased Support Vector Machine for Vocal Fold PathologyAssessment Using Labeled and Unlabeled Data Setsrsquo ExpertSystems with Applications 38(1) 610ndash619
Bezdek JC (1981) Pattern Recognition with Fuzzy ObjectiveFunction Algorithms Norwell MA Kluwer Academic Pub-lishers
Boyanov B and Hadjitodorov S (1997) lsquoAcoustic Analysis ofPathological Voices A Voice Analysis System for the Screen-ing of Laryngeal Diseasesrsquo IEEE Engineering in Medicineand Biology Magazine 16(4) 74ndash82
Boyanov B Ivanov T Hadjitodorov S and Chollet G (1993)lsquoRobust Hybrid Pitch Detectorrsquo IET Electronics Letters 291924ndash1926
Burrus C Gopinath R Guo H Odegard J and Selesnick I(1997) Introduction to Wavelets and Wavelet Transforms APrimer Upper Saddle River NJ Prentice Hall
Calisir D and Dogantekin E (2011) lsquoA New Intelligent Hep-atitis Diagnosis System PCA-LSSVMrsquo Expert Systems withApplications 38(8) 10705ndash10708
Chiu SL (1994) lsquoFuzzy Model Identification Based on ClusterEstimationrsquo Journal of Intelligent and Fuzzy Systems 2(3)267ndash278
Crovato C and Schuck A (2007) lsquoThe Use of Wavelet PacketTransform and Artificial Neural Networks in Analysis andClassification of Dysphonic Voicesrsquo IEEE Transactions onBiomedical Engineering 54(10) 1898ndash1900
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
International Journal of Systems Science 1633
De Brabanter K Ojeda PKF Alzate C De Brabanter J Pel-ckmans K De Moor B Vandewalle J and Suykens JAK(2010) lsquoLS-SVMlab Toolboxrsquo httpwwwesatkuleuvenbesistalssvmlab
Deliyski D (1993) Acoustic Model and Evaluation of Patholog-ical Voice Productionrsquo in Paper Presented at the Proceed-ings from EUROSPEECHrsquo93 The 3rd Conference on SpeechCommunication and Technology Berlin Germany pp 1969ndash1972
Duda RO Hart PE and Stork DG (2001) Pattern Classifi-cation New York Wiley-Interscience
Erfanian Saeedi N Almasganj F and Torabinejad F (2011)lsquoSupport Vector Wavelet Adaptation for Pathological VoiceAssessmentrsquo Computers in Biology and Medicine 41(9)822ndash828
Eskidere O Ertas F and Hanilci C (2011) lsquoA Comparison ofRegression Methods for Remote Tracking of Parkinsonrsquos Dis-ease Progressionrsquo Expert Systems with Applications 39(5)5523ndash5528
Feijoo S and Hernandez C (1990) lsquoShort-Term Stability Mea-sures for the Evaluation of Vocal Qualityrsquo Journal of Speechand Hearing Research 33(2) 324ndash334
Fonseca E Guido R Scalassara P Maciel C and Pereira J(2007) lsquoWavelet Time-Frequency Analysis and Least SquaresSupport Vector Machines for the Identification of Voice Dis-ordersrsquo Computers in Biology and Medicine(Elsevier) 37(4)571ndash578
Fukunaga K (1990) Introduction to Statistical Pattern Recogni-tion (2nd ed) San Diego CA Academic Press
Godino-Llorente J and Gomez-Vilda P (2004) lsquoAutomaticDetection of Voice Impairments by Means of Short-TermCepstral Parameters and Neural Network Based DetectorsrsquoIEEE Transactions on Biomedical Engineering 51(2) 380ndash384
Godino-Llorente J Gomez-Vilda P and Blanco-Velasco M(2006) lsquoDimensionality Reduction of a Pathological VoiceQuality Assessment System Based on Gaussian Mixture Mod-els and Short-Term Cepstral Parametersrsquo IEEE Transactionson Biomedical Engineering 53(10) 1943ndash1953
Gunes S Polat K and Yosunkaya S (2010) lsquoEfficient SleepStage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weightingrsquo Expert Systemswith Applications 37(12) 7922ndash7928
Hariharan M Paulraj M and Yaacob S (2009) lsquoIdentifica-tion of Vocal Fold Pathology Based on Mel Frequency BandEnergy Coefficients and Singular Value Decompositionrsquo inPaper presented at the 2009 IEEE International Conferenceon Signal and Image Processing Applications (ICSIPArsquo09)Kuala Lumpur Malaysia pp 514ndash517
Hariharan M Paulraj M and Yaacob S (2011) lsquoDetection ofVocal Fold Paralysis and Oedema Using Time-Domain Fea-tures and Probabilistic Neural Networkrsquo International Jour-nal of Biomedical Engineering and Technology 6(1) 46ndash57
Hariharan M Saraswathy J Sindhu R Khairunizam W andYaacob S (2012) lsquoInfant Cry Classification to Identify As-phyxia Using Time-Frequency Analysis and Radial BasisNeural Networksrsquo Expert Systems with Applications 39(10)9515ndash9523
Hariharan M and Sazali Y (2010) lsquoTime-Domain Features andProbabilistic Neural Network for the Detection of Vocal FoldPathologyrsquo Malaysian Journal of Computer Science 23(1)60ndash67
Hariharan M Sindhu R and Yaacob S (2011) lsquoNormal andHypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Net-
workrsquo Computer Methods and Programs in Biomedicine108(2) 559ndash569
Hariharan M Yaacob S and Awang SA (2011) lsquoPathologi-cal Infant Cry Analysis Using Wavelet Packet Transform andProbabilistic Neural Networkrsquo Expert Systems with Applica-tions 38(12) 15377ndash15382
Hernandez-Espinosa C Gomez-Vilda P Godino-Llorente Jand Aguilera-Navarro S (2000) lsquoDiagnosis of Vocal andVoice Disorders by the Speech Signalrsquo in Paper Presentedat the Proceedings from IJCNNrsquo00 The IEEE-INNS-ENNSInternational Joint Conference on Neural Networks ComoItaly pp 253ndash258
Hoglund H (2012) lsquoDetecting Earnings Management with Neu-ral Networksrsquo Expert Systems with Applications 39(10)9564ndash9570
Ismail S Samsudin R and Shabri A (2011) lsquoA Hybrid Modelof Self-Organizing Maps (SOM) and Least Square SupportVector Machine (LSSVM) for Time Series Forecastingrsquo Ex-pert Systems with Applications 38(8) 10574ndash10578
Kalker T Haitsma J and Oostveen J (2001) lsquoRobust AudioHashing for Content Identificationrsquo in Paper Presented at theProceedings from EUSIPCOrsquo01 The 12th European SignalProcessing Conference Vienna Austria pp 2091ndash2094
Kasuya H Endo Y and Saliu S (1993) lsquoNovel AcousticMeasurements of Jitter and Shimmer Characteristics fromPathological Voicersquo Paper Presented at the Proceedings fromEUROSPEECHrsquo93 The 3rd European Conference on SpeechCommunication and Technology Berlin Germany pp 1973ndash1976
Kasuya H Ogawa S Mashima K and Ebihara S (1986)lsquoNormalized Noise Energy as an Acoustic Measure to Eval-uate Pathologic Voicersquo The Journal of the Acoustical Societyof America 80 1329ndash1334
Kay Elemetrics Inc (1994) lsquoVoice Disorders DatabaseVersion 103 [CD-ROM]rsquo httpwwwkaypentaxcomProduct20InfoCSL20Options43374337html
Khadivi Heris H Seyed Aghazadeh B and Nikkhah-BahramiM (2009) lsquoOptimal Feature Selection for the Assessment ofVocal Fold Disordersrsquo Computers in Biology and Medicine39(10) 860ndash868
Krom G (1993) lsquoA Cepstrum-Based Technique for Determininga Harmonics-To-Noise Ratio in Speech Signalsrsquo Journal ofSpeech and Hearing Research 36(2) 254ndash266
Kukharchik P Kheidorov I Bovbel E and Ladeev D (2008)lsquoSpeech Signal Processing Based on Wavelets and SVM forVocal Tract Pathology Detectionrsquo Lecture Notes in ComputerScience (Springer) 5099 192ndash199
Latifoglu F Polat K Kara SI and Gunes S (2008) lsquoMedicalDiagnosis of Atherosclerosis from Carotid Artery DopplerSignals Using Principal Component Analysis (PCA) k-NN based Weighting Pre-Processing and Artificial ImmuneRecognition System (AIRS)rsquo Journal of Biomedical Infor-matics 41(1) 15ndash23
Ludlow CL Bassich CJ Connor NP Coulter DC and LeeYJ (1987) lsquoThe Validity of Using Phonatory Jitter and Shim-mer to Detect Laryngeal Pathologyrsquo Laryngeal Functionin Phonation and Respiration Boston MA Brown amp Copp 492ndash508
MacQueen J (1967) lsquoSome Methods for Classification and Anal-ysis of Multivariate Observationsrsquo in Paper Presented at the5th Berkeley Symposium on Mathematical Statistics and Prob-ability Berkeley CA University California Press Vol 1pp 281ndash297
MAPACI P (2004) lsquoVoice Disorder Databasersquo httpwwwmapacicomindex-inglesphp
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
1634 M Hariharan et al
Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372
Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800
Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953
Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327
Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900
Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793
Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793
Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60
Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673
Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279
Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921
Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia
Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345
Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925
Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128
Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740
Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p
Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118
Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576
Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd
Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430
Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97
Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117
Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678
Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219
Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399
Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
1630 M Hariharan et al
Tabl
e2
Com
pari
son
ofre
sult
sfo
rM
EE
Ida
taba
seus
ing
k-N
NL
S-S
VM
PN
Nan
dG
RN
N(r
awan
dw
eigh
ted
feat
ures
)
Raw
feat
ures
Wei
ghte
dfe
atur
es
ME
EI
data
base
SE
SP
AC
CS
ES
PA
CC
k-N
NFr
ame-
base
dC
ross
V96
66
plusmn0
5591
69
plusmn1
0994
05
plusmn0
8399
81
plusmn0
0698
27
plusmn0
3499
03
plusmn0
19k-
valu
e1ndash
10C
onV
959
6plusmn
071
908
7plusmn
154
932
7plusmn
116
998
1plusmn
006
979
8plusmn
033
988
8plusmn
015
File
-bas
edC
ross
V99
54
plusmn0
2410
0plusmn
000
996
5plusmn
019
995
4plusmn
024
100
plusmn0
0099
65
plusmn0
19k-
valu
e1ndash
10C
onV
997
4plusmn
025
100
plusmn0
0099
79
plusmn0
2010
0plusmn
000
100
plusmn0
0010
0plusmn
000
LS
-SV
MFr
ame-
base
dC
ross
V95
10
plusmn0
2195
41
plusmn0
2095
25
plusmn0
1299
75
plusmn0
0399
45
plusmn0
0799
60
plusmn0
0410
000
01
Con
V94
92
plusmn0
9094
49
plusmn1
0594
70
plusmn0
8199
34
plusmn0
2999
46
plusmn0
3499
40
plusmn0
26Fi
le-b
ased
Cro
ssV
989
2plusmn
062
998
1plusmn
060
991
2plusmn
047
993
2plusmn
036
100
plusmn0
0099
47
plusmn0
2810
01
Con
V10
0plusmn
000
988
7plusmn
130
991
2plusmn
103
996
2plusmn
080
994
1plusmn
186
995
6plusmn
071
PN
NFr
ame-
base
dC
ross
V97
43
plusmn0
2693
01
plusmn0
1795
12
plusmn0
1799
88
plusmn0
0598
69
plusmn0
1699
28
plusmn0
070
01ndash0
055
Con
V96
79
plusmn0
2492
44
plusmn0
3394
51
plusmn0
2499
81
plusmn0
0698
41
plusmn0
2199
10
plusmn0
11Fi
le-b
ased
Cro
ssV
974
4plusmn
539
100
plusmn0
0097
74
plusmn4
9797
50
plusmn5
3710
0plusmn
000
977
9plusmn
495
001
ndash00
55C
onV
970
9plusmn
610
100
plusmn0
0097
37
plusmn5
7999
62
plusmn0
8099
41
plusmn1
8699
56
plusmn0
71G
RN
NFr
ame-
base
dC
ross
V93
01
plusmn3
8988
08
plusmn4
9890
15
plusmn4
3199
50
plusmn0
4197
53
plusmn1
3098
27
plusmn0
890
01ndash0
055
Con
V92
52
plusmn3
7887
07
plusmn5
1189
34
plusmn4
3599
39
plusmn0
5597
35
plusmn1
1698
10
plusmn0
91Fi
le-b
ased
Cro
ssV
972
7plusmn
599
100
plusmn0
0096
33
plusmn8
0098
07
plusmn4
8210
0plusmn
000
979
2plusmn
511
001
ndash00
55C
onV
966
9plusmn
675
100
plusmn0
0095
81
plusmn9
0596
77
plusmn6
2310
0plusmn
000
958
1plusmn
815
Tabl
e3
Com
pari
son
ofre
sult
sfo
rM
APA
CI
spee
chpa
thol
ogy
data
base
usin
gk-
NN
LS
-SV
MP
NN
and
GR
NN
(raw
and
wei
ghte
dfe
atur
es)
Raw
feat
ures
Wei
ghte
dfe
atur
es
MA
PAC
Ida
taba
seS
ES
PA
CC
SE
SP
AC
C
k-N
NFr
ame-
base
dC
ross
V94
49
plusmn0
3696
58
plusmn0
3995
51
plusmn0
2198
67
plusmn0
2398
76
plusmn0
2598
71
plusmn0
23k-
valu
e1ndash
10C
onV
942
0plusmn
040
962
9plusmn
033
952
1plusmn
032
100
plusmn0
0010
0plusmn
000
100
plusmn0
00Fi
le-b
ased
Cro
ssV
771
0plusmn
784
669
2plusmn
330
704
2plusmn
365
100
plusmn0
0010
0plusmn
000
983
3plusmn
164
k-va
lue
1ndash10
Con
V77
56
plusmn7
1167
51
plusmn2
4869
71
plusmn2
7099
88
plusmn0
4099
88
plusmn0
4097
50
plusmn2
16L
S-S
VM
Fram
e-ba
sed
Cro
ssV
955
0plusmn
018
972
8plusmn
024
963
7plusmn
016
990
4plusmn
008
990
4plusmn
008
992
4plusmn
006
100
01
Con
V96
86
plusmn0
8195
24
plusmn0
6796
03
plusmn0
4098
71
plusmn0
6898
71
plusmn0
6899
02
plusmn0
41Fi
le-b
ased
Cro
ssV
793
1plusmn
333
688
7plusmn
240
729
2plusmn
260
100
plusmn0
0010
0plusmn
000
958
3plusmn
000
101
Con
V74
53
plusmn7
8284
00
plusmn13
01
771
4plusmn
878
100
plusmn0
0010
0plusmn
000
964
3plusmn
376
PN
NFr
ame-
base
dC
ross
V94
88
plusmn0
2096
23
plusmn0
2695
55
plusmn0
2199
82
plusmn0
0799
82
plusmn0
0799
91
plusmn0
040
01ndash0
055
Con
V94
52
plusmn0
3196
03
plusmn0
3595
26
plusmn0
2599
76
plusmn0
0799
76
plusmn0
0799
88
plusmn0
03Fi
le-b
ased
Cro
ssV
723
0plusmn
396
705
3plusmn
393
708
3plusmn
170
985
7plusmn
452
985
7plusmn
452
991
7plusmn
263
001
ndash00
55C
onV
690
2plusmn
533
691
8plusmn
410
679
3plusmn
386
986
0plusmn
273
986
0plusmn
273
990
0plusmn
172
GR
NN
Fram
e-ba
sed
Cro
ssV
904
1plusmn
378
917
9plusmn
335
909
8plusmn
345
995
9plusmn
029
995
9plusmn
029
997
1plusmn
023
001
ndash00
55C
onV
900
2plusmn
361
917
8plusmn
316
907
2plusmn
324
995
3plusmn
030
995
3plusmn
030
996
6plusmn
024
File
-bas
edC
ross
V69
43
plusmn8
5170
75
plusmn4
8666
67
plusmn7
7397
13
plusmn6
4097
13
plusmn6
4093
75
plusmn9
270
01ndash0
055
Con
V62
05
plusmn9
4064
51
plusmn3
8359
57
plusmn8
5097
38
plusmn5
6897
38
plusmn5
6893
07
plusmn9
91
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
International Journal of Systems Science 1631
Table 4 Computation time for feature extraction and classification
Classification
MEEI database MAPACI database
Type of Type of Weighted Raw Weighted RawClassifiers experiments validation features features features features
k-NN Frame-based CrossV 0338 0341 0342 0346k-value 1ndash10 ConV 3806 3821 6931 6955File-based CrossV 0036 0038 0036 0036k-value 1ndash10 ConV 1759 1754 3001 3206
LS-SVM Frame-based CrossV 1139 1175 1191 121410 001 ConV 0718 0698 0705 0723
File-based CrossV 0005 0005 0002 000210 1 ConV 0006 0006 0004 0004
PNN Frame-based CrossV 2458 3084 2445 2421001ndash0055 ConV 10340 10356 11180 11202File-based CrossV 1078 1124 1093 1062001ndash0055 ConV 4250 4266 4072 4047
GRNN Frame-based CrossV 2587 2571 2676 2582001ndash0055 ConV 10726 10648 11634 11584File-based CrossV 1098 1056 1065 1062001ndash0055 ConV 4302 4260 4049 4066
Feature extractionType of experiments MEEI MAPACI
Normal Pathological Normal PathologicalFile-based 0549 0034 0109 0130Frame-based 0591 0181 1734 1737
To prove the robustness and database independency of theproposed method MAPACI speech pathology database isused for the investigation in addition to MEEI databaseFrom (Table 3) it is inferred that maximum accuracy ofabove 95 is obtained using the raw features and 100is obtained using the weighted features during the frame-based analysis During the file-based analysis we obtainedmaximum classification accuracy of 92 using the rawfeatures and 100 is obtained using the weighted features
Figure 6(a)–(d) shows the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed raw features during frame-based and file-based analysis. Similarly, Figure 7(a)–(d) illustrates the comparison of results (maximum accuracy and its corresponding sensitivity and specificity) for the MEEI database and the MAPACI speech pathology database using the proposed weighted features. The weighted features always perform better than the raw features for both the MEEI database and the MAPACI speech pathology database.

All the feature extraction and classification algorithms were developed in MATLAB and run on a laptop with an Intel Core i7-2670QM (2.2 GHz) processor and 4 GB of RAM. Table 4 reports the average time taken (in seconds) for feature extraction and classification under the two experiments (frame-based and file-based). A direct comparison of our work with previous works cannot be performed, since most of the works in the literature have not reported the computation time of their feature extraction and classification phases. From Table 4, it is observed that the best classification accuracy could be obtained using our suggested feature extraction and classification algorithms with low computation time.
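The accuracy, sensitivity and specificity compared above can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions, with a pathological sample treated as the positive class; the function name and the counts are hypothetical and are not taken from the paper's tables.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as percentages.

    Assumption: 'positive' means pathological, 'negative' means normal.
    """
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)  # pathological samples correctly detected
    specificity = 100.0 * tn / (tn + fp)  # normal samples correctly detected
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only.
acc, sens, spec = classification_metrics(tp=45, tn=50, fp=3, fn=2)
print(f"accuracy={acc:.1f}% sensitivity={sens:.1f}% specificity={spec:.1f}%")
```

Reporting the sensitivity and specificity that correspond to the maximum accuracy, as the figures do, guards against a high accuracy that comes from favouring one class.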
From the literature, it is observed that the reliability of wavelet and WPT-based features has been demonstrated by many experimental studies. The accuracy of their results varied from approximately 85% to 100% under different experiments. Their results depend heavily on the optimal selection of wavelet packet-based features using feature selection/transformation techniques and optimisation techniques. However, the presented work gives very promising classification accuracy for the databases under test. Hence, it can be concluded that the proposed k-means clustering based feature weighting method could improve the robustness of the WPT–SVD features and increase the classification accuracy for both the MEEI database and the MAPACI speech pathology database.
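As a rough illustration of k-means clustering based feature weighting, the sketch below assumes one common formulation from the attribute-weighting literature: each feature is multiplied by the ratio of its mean to the mean of its cluster centres. This formula is an assumption made for illustration, and the exact procedure used for the WPT–SVD features is the one defined in the method section. The function names and data are hypothetical, and the cluster centres are taken as given.

```python
from statistics import mean


def feature_weights(samples, centres):
    """Per-feature weight = mean of the feature / mean of its cluster centres.

    Assumption: the centre means are non-zero (true for positive features
    such as singular values).
    """
    n_features = len(samples[0])
    weights = []
    for j in range(n_features):
        feature_mean = mean(row[j] for row in samples)
        centre_mean = mean(centre[j] for centre in centres)
        weights.append(feature_mean / centre_mean)
    return weights


def apply_weights(samples, weights):
    """Scale every feature column by its weight."""
    return [[x * w for x, w in zip(row, weights)] for row in samples]


# Hypothetical raw features (rows = samples) and cluster centres.
raw = [[1.0, 10.0], [2.0, 12.0], [9.0, 30.0]]
centres = [[1.5, 11.0], [9.0, 30.0]]
weights = feature_weights(raw, centres)
weighted = apply_weights(raw, weights)
print(weights)
print(weighted)
```

Because the weights are computed per feature, the transformation rescales feature columns relative to one another without changing the number of features.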
6 Conclusions
This paper presents a new feature constituting approach for efficient detection of vocal fold pathology using a feature extraction method (WPT–SVD features) and a feature weighting method (k-means clustering based). To test the effectiveness and reliability of the proposed raw and weighted features, four different classifiers, namely k-NN, LS-SVM, PNN and GRNN, are employed. The k-means clustering based feature weighting approach can increase the distinguishing performance of the raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators for researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the clusters and hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres and different kernel functions. The proposed method could be extended to detect the specific type of disorder and also to develop an online diagnosing system.
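The sensitivity of k-means to its initial centres, and the practice of running it several times, can be sketched as follows. This is a minimal pure-Python illustration, not the MATLAB implementation used in this work: it assumes random initialisation from the data points and keeps the run with the lowest within-cluster sum of squared distances, which is one common selection criterion. All names and data are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iterations=50):
    """Plain Lloyd's k-means from one random initialisation."""
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centres[i]))
            clusters[nearest].append(p)
        new_centres = []
        for i, members in enumerate(clusters):
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(centres[i])  # keep the centre of an empty cluster
        if new_centres == centres:
            break  # assignments no longer change the centres
        centres = new_centres
    inertia = sum(min(squared_distance(p, c) for c in centres) for p in points)
    return centres, inertia


def best_of_restarts(points, k, restarts=10, seed=0):
    """Run k-means several times and keep the run with the lowest inertia."""
    rng = random.Random(seed)
    runs = (kmeans(points, k, rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[1])


# Hypothetical two-dimensional data with two well-separated groups.
data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
best_centres, best_inertia = best_of_restarts(data, k=2)
print(sorted(best_centres), best_inertia)
```

Restarts reduce the chance of keeping a poor local optimum but do not remove it, which is why better initial placement of the cluster centres is a natural direction for improvement.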
Acknowledgements
This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received his Bachelor of Engineering in Electrical and Electronics Engineering and his Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and with an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of the Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.
ReferencesAnanthakrishna T Shama K and Niranjan U (2004)
lsquok-Means Nearest Neighbor Classifier for Voice Pathol-ogyrsquo in Paper Presented at the Proceedings from INDI-CONrsquo04 The 1st IEEE Annual India Conference KharagpurIndia pp 352ndash354
Arjmandi MK and Pooyan M (2011) lsquoAn Optimum Algorithmin Pathological Voice Quality Assessment Using Wavelet-Packet-Based Features Linear Discriminant Analysis andSupport Vector Machinersquo Biomedical Signal Processing andControl 7 3ndash19
Azadi TE and Almasganj F (2011) lsquoPBSVM Partitioningand Biased Support Vector Machine for Vocal Fold PathologyAssessment Using Labeled and Unlabeled Data Setsrsquo ExpertSystems with Applications 38(1) 610ndash619
Bezdek JC (1981) Pattern Recognition with Fuzzy ObjectiveFunction Algorithms Norwell MA Kluwer Academic Pub-lishers
Boyanov B and Hadjitodorov S (1997) lsquoAcoustic Analysis ofPathological Voices A Voice Analysis System for the Screen-ing of Laryngeal Diseasesrsquo IEEE Engineering in Medicineand Biology Magazine 16(4) 74ndash82
Boyanov B Ivanov T Hadjitodorov S and Chollet G (1993)lsquoRobust Hybrid Pitch Detectorrsquo IET Electronics Letters 291924ndash1926
Burrus C Gopinath R Guo H Odegard J and Selesnick I(1997) Introduction to Wavelets and Wavelet Transforms APrimer Upper Saddle River NJ Prentice Hall
Calisir D and Dogantekin E (2011) lsquoA New Intelligent Hep-atitis Diagnosis System PCA-LSSVMrsquo Expert Systems withApplications 38(8) 10705ndash10708
Chiu SL (1994) lsquoFuzzy Model Identification Based on ClusterEstimationrsquo Journal of Intelligent and Fuzzy Systems 2(3)267ndash278
Crovato C and Schuck A (2007) lsquoThe Use of Wavelet PacketTransform and Artificial Neural Networks in Analysis andClassification of Dysphonic Voicesrsquo IEEE Transactions onBiomedical Engineering 54(10) 1898ndash1900
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
International Journal of Systems Science 1633
De Brabanter K Ojeda PKF Alzate C De Brabanter J Pel-ckmans K De Moor B Vandewalle J and Suykens JAK(2010) lsquoLS-SVMlab Toolboxrsquo httpwwwesatkuleuvenbesistalssvmlab
Deliyski D (1993) Acoustic Model and Evaluation of Patholog-ical Voice Productionrsquo in Paper Presented at the Proceed-ings from EUROSPEECHrsquo93 The 3rd Conference on SpeechCommunication and Technology Berlin Germany pp 1969ndash1972
Duda RO Hart PE and Stork DG (2001) Pattern Classifi-cation New York Wiley-Interscience
Erfanian Saeedi N Almasganj F and Torabinejad F (2011)lsquoSupport Vector Wavelet Adaptation for Pathological VoiceAssessmentrsquo Computers in Biology and Medicine 41(9)822ndash828
Eskidere O Ertas F and Hanilci C (2011) lsquoA Comparison ofRegression Methods for Remote Tracking of Parkinsonrsquos Dis-ease Progressionrsquo Expert Systems with Applications 39(5)5523ndash5528
Feijoo S and Hernandez C (1990) lsquoShort-Term Stability Mea-sures for the Evaluation of Vocal Qualityrsquo Journal of Speechand Hearing Research 33(2) 324ndash334
Fonseca E Guido R Scalassara P Maciel C and Pereira J(2007) lsquoWavelet Time-Frequency Analysis and Least SquaresSupport Vector Machines for the Identification of Voice Dis-ordersrsquo Computers in Biology and Medicine(Elsevier) 37(4)571ndash578
Fukunaga K (1990) Introduction to Statistical Pattern Recogni-tion (2nd ed) San Diego CA Academic Press
Godino-Llorente J and Gomez-Vilda P (2004) lsquoAutomaticDetection of Voice Impairments by Means of Short-TermCepstral Parameters and Neural Network Based DetectorsrsquoIEEE Transactions on Biomedical Engineering 51(2) 380ndash384
Godino-Llorente J Gomez-Vilda P and Blanco-Velasco M(2006) lsquoDimensionality Reduction of a Pathological VoiceQuality Assessment System Based on Gaussian Mixture Mod-els and Short-Term Cepstral Parametersrsquo IEEE Transactionson Biomedical Engineering 53(10) 1943ndash1953
Gunes S Polat K and Yosunkaya S (2010) lsquoEfficient SleepStage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weightingrsquo Expert Systemswith Applications 37(12) 7922ndash7928
Hariharan M Paulraj M and Yaacob S (2009) lsquoIdentifica-tion of Vocal Fold Pathology Based on Mel Frequency BandEnergy Coefficients and Singular Value Decompositionrsquo inPaper presented at the 2009 IEEE International Conferenceon Signal and Image Processing Applications (ICSIPArsquo09)Kuala Lumpur Malaysia pp 514ndash517
Hariharan M Paulraj M and Yaacob S (2011) lsquoDetection ofVocal Fold Paralysis and Oedema Using Time-Domain Fea-tures and Probabilistic Neural Networkrsquo International Jour-nal of Biomedical Engineering and Technology 6(1) 46ndash57
Hariharan M Saraswathy J Sindhu R Khairunizam W andYaacob S (2012) lsquoInfant Cry Classification to Identify As-phyxia Using Time-Frequency Analysis and Radial BasisNeural Networksrsquo Expert Systems with Applications 39(10)9515ndash9523
Hariharan M and Sazali Y (2010) lsquoTime-Domain Features andProbabilistic Neural Network for the Detection of Vocal FoldPathologyrsquo Malaysian Journal of Computer Science 23(1)60ndash67
Hariharan M Sindhu R and Yaacob S (2011) lsquoNormal andHypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Net-
workrsquo Computer Methods and Programs in Biomedicine108(2) 559ndash569
Hariharan M Yaacob S and Awang SA (2011) lsquoPathologi-cal Infant Cry Analysis Using Wavelet Packet Transform andProbabilistic Neural Networkrsquo Expert Systems with Applica-tions 38(12) 15377ndash15382
Hernandez-Espinosa C Gomez-Vilda P Godino-Llorente Jand Aguilera-Navarro S (2000) lsquoDiagnosis of Vocal andVoice Disorders by the Speech Signalrsquo in Paper Presentedat the Proceedings from IJCNNrsquo00 The IEEE-INNS-ENNSInternational Joint Conference on Neural Networks ComoItaly pp 253ndash258
Hoglund H (2012) lsquoDetecting Earnings Management with Neu-ral Networksrsquo Expert Systems with Applications 39(10)9564ndash9570
Ismail S Samsudin R and Shabri A (2011) lsquoA Hybrid Modelof Self-Organizing Maps (SOM) and Least Square SupportVector Machine (LSSVM) for Time Series Forecastingrsquo Ex-pert Systems with Applications 38(8) 10574ndash10578
Kalker T Haitsma J and Oostveen J (2001) lsquoRobust AudioHashing for Content Identificationrsquo in Paper Presented at theProceedings from EUSIPCOrsquo01 The 12th European SignalProcessing Conference Vienna Austria pp 2091ndash2094
Kasuya H Endo Y and Saliu S (1993) lsquoNovel AcousticMeasurements of Jitter and Shimmer Characteristics fromPathological Voicersquo Paper Presented at the Proceedings fromEUROSPEECHrsquo93 The 3rd European Conference on SpeechCommunication and Technology Berlin Germany pp 1973ndash1976
Kasuya H Ogawa S Mashima K and Ebihara S (1986)lsquoNormalized Noise Energy as an Acoustic Measure to Eval-uate Pathologic Voicersquo The Journal of the Acoustical Societyof America 80 1329ndash1334
Kay Elemetrics Inc (1994) lsquoVoice Disorders DatabaseVersion 103 [CD-ROM]rsquo httpwwwkaypentaxcomProduct20InfoCSL20Options43374337html
Khadivi Heris H Seyed Aghazadeh B and Nikkhah-BahramiM (2009) lsquoOptimal Feature Selection for the Assessment ofVocal Fold Disordersrsquo Computers in Biology and Medicine39(10) 860ndash868
Krom G (1993) lsquoA Cepstrum-Based Technique for Determininga Harmonics-To-Noise Ratio in Speech Signalsrsquo Journal ofSpeech and Hearing Research 36(2) 254ndash266
Kukharchik P Kheidorov I Bovbel E and Ladeev D (2008)lsquoSpeech Signal Processing Based on Wavelets and SVM forVocal Tract Pathology Detectionrsquo Lecture Notes in ComputerScience (Springer) 5099 192ndash199
Latifoglu F Polat K Kara SI and Gunes S (2008) lsquoMedicalDiagnosis of Atherosclerosis from Carotid Artery DopplerSignals Using Principal Component Analysis (PCA) k-NN based Weighting Pre-Processing and Artificial ImmuneRecognition System (AIRS)rsquo Journal of Biomedical Infor-matics 41(1) 15ndash23
Ludlow CL Bassich CJ Connor NP Coulter DC and LeeYJ (1987) lsquoThe Validity of Using Phonatory Jitter and Shim-mer to Detect Laryngeal Pathologyrsquo Laryngeal Functionin Phonation and Respiration Boston MA Brown amp Copp 492ndash508
MacQueen J (1967) lsquoSome Methods for Classification and Anal-ysis of Multivariate Observationsrsquo in Paper Presented at the5th Berkeley Symposium on Mathematical Statistics and Prob-ability Berkeley CA University California Press Vol 1pp 281ndash297
MAPACI P (2004) lsquoVoice Disorder Databasersquo httpwwwmapacicomindex-inglesphp
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
1634 M Hariharan et al
Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372
Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800
Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953
Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327
Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900
Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793
Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793
Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60
Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673
Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279
Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921
Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia
Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345
Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925
Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128
Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740
Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p
Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118
Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576
Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd
Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430
Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97
Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117
Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678
Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219
Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399
Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
Kalker T Haitsma J and Oostveen J (2001) lsquoRobust AudioHashing for Content Identificationrsquo in Paper Presented at theProceedings from EUSIPCOrsquo01 The 12th European SignalProcessing Conference Vienna Austria pp 2091ndash2094
Kasuya H Endo Y and Saliu S (1993) lsquoNovel AcousticMeasurements of Jitter and Shimmer Characteristics fromPathological Voicersquo Paper Presented at the Proceedings fromEUROSPEECHrsquo93 The 3rd European Conference on SpeechCommunication and Technology Berlin Germany pp 1973ndash1976
Kasuya H Ogawa S Mashima K and Ebihara S (1986)lsquoNormalized Noise Energy as an Acoustic Measure to Eval-uate Pathologic Voicersquo The Journal of the Acoustical Societyof America 80 1329ndash1334
Kay Elemetrics Inc (1994) lsquoVoice Disorders DatabaseVersion 103 [CD-ROM]rsquo httpwwwkaypentaxcomProduct20InfoCSL20Options43374337html
Khadivi Heris H Seyed Aghazadeh B and Nikkhah-BahramiM (2009) lsquoOptimal Feature Selection for the Assessment ofVocal Fold Disordersrsquo Computers in Biology and Medicine39(10) 860ndash868
Krom G (1993) lsquoA Cepstrum-Based Technique for Determininga Harmonics-To-Noise Ratio in Speech Signalsrsquo Journal ofSpeech and Hearing Research 36(2) 254ndash266
Kukharchik P Kheidorov I Bovbel E and Ladeev D (2008)lsquoSpeech Signal Processing Based on Wavelets and SVM forVocal Tract Pathology Detectionrsquo Lecture Notes in ComputerScience (Springer) 5099 192ndash199
Latifoglu F Polat K Kara SI and Gunes S (2008) lsquoMedicalDiagnosis of Atherosclerosis from Carotid Artery DopplerSignals Using Principal Component Analysis (PCA) k-NN based Weighting Pre-Processing and Artificial ImmuneRecognition System (AIRS)rsquo Journal of Biomedical Infor-matics 41(1) 15ndash23
Ludlow CL Bassich CJ Connor NP Coulter DC and LeeYJ (1987) lsquoThe Validity of Using Phonatory Jitter and Shim-mer to Detect Laryngeal Pathologyrsquo Laryngeal Functionin Phonation and Respiration Boston MA Brown amp Copp 492ndash508
MacQueen J (1967) lsquoSome Methods for Classification and Anal-ysis of Multivariate Observationsrsquo in Paper Presented at the5th Berkeley Symposium on Mathematical Statistics and Prob-ability Berkeley CA University California Press Vol 1pp 281ndash297
MAPACI P (2004) lsquoVoice Disorder Databasersquo httpwwwmapacicomindex-inglesphp
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
1634 M Hariharan et al
Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372
Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800
Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953
Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327
Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900
Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793
Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793
Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60
Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673
Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279
Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921
Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia
Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345
Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925
Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128
Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740
Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p
Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118
Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576
Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd
Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430
Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97
Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117
Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678
Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219
Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399
Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
1632 M Hariharan et al
clustering based feature weighting approach can increase the distinguishing performance of raw WPT–SVD features. The experimental results show that the proposed weighted features give a very promising classification accuracy of 100% for both the MEEI voice disorder database and the MAPACI speech pathology database. The proposed features can be used as additional acoustic indicators by researchers and speech pathologists to detect vocal fold pathology. The performance of the k-means clustering algorithm depends on the initial positions of the cluster centres; hence, in our work, it was run several times to obtain better cluster centres. In future work, the k-means clustering based feature weighting method could be improved by using efficient initialisation methods for better initial placement of the cluster centres, and different kernel functions could be investigated. The proposed method could be extended to detect specific types of disorders and to develop an online diagnosis system.
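The restart strategy mentioned in the conclusion can be illustrated with a minimal Python sketch. It assumes that each feature is scaled by the ratio of its mean to the mean of its k-means cluster centres; this weighting rule, the function names (`kmeans_1d`, `best_of_restarts`, `weight_features`), the number of clusters and the number of restarts are illustrative assumptions, not the authors' implementation. Among several random initialisations, the run with the lowest within-cluster sum of squares is kept.

```python
import random


def kmeans_1d(values, k, rng, iters=100):
    """Lloyd's algorithm on scalar values; returns (centres, inertia)."""
    centres = rng.sample(values, k)  # random initial centres drawn from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: (v - centres[c]) ** 2)
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centre.
        new = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
        if new == centres:  # assignments no longer change
            break
        centres = new
    inertia = sum(min((v - c) ** 2 for c in centres) for v in values)
    return centres, inertia


def best_of_restarts(values, k, restarts=10, seed=0):
    """Run k-means several times and keep the run with the lowest inertia."""
    rng = random.Random(seed)
    runs = (kmeans_1d(values, k, rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[1])


def weight_features(rows, k=2, restarts=10, seed=0):
    """Scale each column by mean(column) / mean(cluster centres of that column).

    The ratio rule is an assumption made for this sketch.
    """
    ratios = []
    for col in zip(*rows):
        centres, _ = best_of_restarts(list(col), k, restarts, seed)
        centre_mean = sum(centres) / len(centres)
        col_mean = sum(col) / len(col)
        ratios.append(col_mean / centre_mean if centre_mean else 1.0)
    weighted = [[x * r for x, r in zip(row, ratios)] for row in rows]
    return weighted, ratios
```

For example, `weight_features(rows, k=2, restarts=10)` returns the weighted feature rows together with the per-feature ratios. A more careful seeding scheme, such as k-means++, could replace the uniform random initialisation used here.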
Acknowledgements
This work was done in part with data transferred from the MAPACI database (http://www.mapaci.com). The authors would like to thank the anonymous reviewers for their valuable comments.
Notes on contributors
M. Hariharan received a Bachelor of Engineering in Electrical and Electronics Engineering and a Master of Engineering in Applied Electronics from Government College of Technology, Coimbatore, Tamil Nadu, India. He received a PhD degree in Mechatronic Engineering from Universiti Malaysia Perlis (UniMAP), Malaysia. He is currently working as a Senior Lecturer in the School of Mechatronic Engineering, UniMAP, Malaysia. He has published more than 50 papers in refereed journals and conferences. He has been a reviewer for various journals, including Computers and Electrical Engineering, Computer Methods and Programs in Biomedicine, Artificial Intelligence in Biomedicine, International Journal of Phoniatrics, Speech Therapy and Communication Pathology, and Medical Engineering and Physics. His research interests include speech signal processing, biomedical signal and image processing, and artificial intelligence. He is a member of IEEE.

Kemal Polat was born in Kırıkkale on 30 January 1981. Dr Polat graduated from the Electrical–Electronics Engineering Department of Selcuk University with a BSc degree in 2002 and an MSc degree in 2004. He received his PhD degree in Electrical and Electronic Engineering from Selcuk University in 2008. He has been working as an Assistant Professor in the Electrical and Electronic Engineering Department, Faculty of Engineering, Abant Izzet Baysal University, since September 2011. His current research interests are biomedical signal classification, statistical signal processing, digital signal processing, and pattern recognition and classification. He has 55 articles published in SCI journals and 16 international conference papers. He is a reviewer for various SCI journals, including Information Sciences, Pattern Recognition Letters, Digital Signal Processing, Soft Computing, Artificial Intelligence in Medicine, Applied Soft Computing, Computers in Biology and Medicine, Expert Systems, Computational Statistics & Data Analysis, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Evolutionary Computation, TALANTA, Journal of Medical Systems, and Scientific Research and Essays. He is a member of the editorial board of Journal of Neural Computing and Applications (SCI expanded).

Sazali Yaacob received his PhD degree in Control Engineering from the University of Sheffield, UK. He is currently working as a Professor in the School of Mechatronic Engineering, UniMAP, Malaysia. He has successfully supervised 8 PhD candidates and more than 20 MSc graduates through research mode. Currently, he has 10 PhD and 8 MSc candidates. His research interests include control, modelling and signal processing, with applications in the fields of satellite, bio-medical, applied mechanics and robotics. He has published more than 70 papers in journals and 200 papers in conference proceedings. He received his professional qualification as a Chartered Engineer from the Engineering Council, UK, in 2005 and has been a member of IET, UK, since 2003.
De Brabanter K Ojeda PKF Alzate C De Brabanter J Pel-ckmans K De Moor B Vandewalle J and Suykens JAK(2010) lsquoLS-SVMlab Toolboxrsquo httpwwwesatkuleuvenbesistalssvmlab
Deliyski D (1993) Acoustic Model and Evaluation of Patholog-ical Voice Productionrsquo in Paper Presented at the Proceed-ings from EUROSPEECHrsquo93 The 3rd Conference on SpeechCommunication and Technology Berlin Germany pp 1969ndash1972
Duda RO Hart PE and Stork DG (2001) Pattern Classifi-cation New York Wiley-Interscience
Erfanian Saeedi N Almasganj F and Torabinejad F (2011)lsquoSupport Vector Wavelet Adaptation for Pathological VoiceAssessmentrsquo Computers in Biology and Medicine 41(9)822ndash828
Eskidere O Ertas F and Hanilci C (2011) lsquoA Comparison ofRegression Methods for Remote Tracking of Parkinsonrsquos Dis-ease Progressionrsquo Expert Systems with Applications 39(5)5523ndash5528
Feijoo S and Hernandez C (1990) lsquoShort-Term Stability Mea-sures for the Evaluation of Vocal Qualityrsquo Journal of Speechand Hearing Research 33(2) 324ndash334
Fonseca E Guido R Scalassara P Maciel C and Pereira J(2007) lsquoWavelet Time-Frequency Analysis and Least SquaresSupport Vector Machines for the Identification of Voice Dis-ordersrsquo Computers in Biology and Medicine(Elsevier) 37(4)571ndash578
Fukunaga K (1990) Introduction to Statistical Pattern Recogni-tion (2nd ed) San Diego CA Academic Press
Godino-Llorente J and Gomez-Vilda P (2004) lsquoAutomaticDetection of Voice Impairments by Means of Short-TermCepstral Parameters and Neural Network Based DetectorsrsquoIEEE Transactions on Biomedical Engineering 51(2) 380ndash384
Godino-Llorente J Gomez-Vilda P and Blanco-Velasco M(2006) lsquoDimensionality Reduction of a Pathological VoiceQuality Assessment System Based on Gaussian Mixture Mod-els and Short-Term Cepstral Parametersrsquo IEEE Transactionson Biomedical Engineering 53(10) 1943ndash1953
Gunes S Polat K and Yosunkaya S (2010) lsquoEfficient SleepStage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weightingrsquo Expert Systemswith Applications 37(12) 7922ndash7928
Hariharan M Paulraj M and Yaacob S (2009) lsquoIdentifica-tion of Vocal Fold Pathology Based on Mel Frequency BandEnergy Coefficients and Singular Value Decompositionrsquo inPaper presented at the 2009 IEEE International Conferenceon Signal and Image Processing Applications (ICSIPArsquo09)Kuala Lumpur Malaysia pp 514ndash517
Hariharan M Paulraj M and Yaacob S (2011) lsquoDetection ofVocal Fold Paralysis and Oedema Using Time-Domain Fea-tures and Probabilistic Neural Networkrsquo International Jour-nal of Biomedical Engineering and Technology 6(1) 46ndash57
Hariharan M Saraswathy J Sindhu R Khairunizam W andYaacob S (2012) lsquoInfant Cry Classification to Identify As-phyxia Using Time-Frequency Analysis and Radial BasisNeural Networksrsquo Expert Systems with Applications 39(10)9515ndash9523
Hariharan M and Sazali Y (2010) lsquoTime-Domain Features andProbabilistic Neural Network for the Detection of Vocal FoldPathologyrsquo Malaysian Journal of Computer Science 23(1)60ndash67
Hariharan M Sindhu R and Yaacob S (2011) lsquoNormal andHypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Net-
workrsquo Computer Methods and Programs in Biomedicine108(2) 559ndash569
Hariharan M Yaacob S and Awang SA (2011) lsquoPathologi-cal Infant Cry Analysis Using Wavelet Packet Transform andProbabilistic Neural Networkrsquo Expert Systems with Applica-tions 38(12) 15377ndash15382
Hernandez-Espinosa C Gomez-Vilda P Godino-Llorente Jand Aguilera-Navarro S (2000) lsquoDiagnosis of Vocal andVoice Disorders by the Speech Signalrsquo in Paper Presentedat the Proceedings from IJCNNrsquo00 The IEEE-INNS-ENNSInternational Joint Conference on Neural Networks ComoItaly pp 253ndash258
Hoglund H (2012) lsquoDetecting Earnings Management with Neu-ral Networksrsquo Expert Systems with Applications 39(10)9564ndash9570
Ismail S Samsudin R and Shabri A (2011) lsquoA Hybrid Modelof Self-Organizing Maps (SOM) and Least Square SupportVector Machine (LSSVM) for Time Series Forecastingrsquo Ex-pert Systems with Applications 38(8) 10574ndash10578
Kalker T Haitsma J and Oostveen J (2001) lsquoRobust AudioHashing for Content Identificationrsquo in Paper Presented at theProceedings from EUSIPCOrsquo01 The 12th European SignalProcessing Conference Vienna Austria pp 2091ndash2094
Kasuya H Endo Y and Saliu S (1993) lsquoNovel AcousticMeasurements of Jitter and Shimmer Characteristics fromPathological Voicersquo Paper Presented at the Proceedings fromEUROSPEECHrsquo93 The 3rd European Conference on SpeechCommunication and Technology Berlin Germany pp 1973ndash1976
Kasuya H Ogawa S Mashima K and Ebihara S (1986)lsquoNormalized Noise Energy as an Acoustic Measure to Eval-uate Pathologic Voicersquo The Journal of the Acoustical Societyof America 80 1329ndash1334
Kay Elemetrics Inc (1994) lsquoVoice Disorders DatabaseVersion 103 [CD-ROM]rsquo httpwwwkaypentaxcomProduct20InfoCSL20Options43374337html
Khadivi Heris H Seyed Aghazadeh B and Nikkhah-BahramiM (2009) lsquoOptimal Feature Selection for the Assessment ofVocal Fold Disordersrsquo Computers in Biology and Medicine39(10) 860ndash868
Krom G (1993) lsquoA Cepstrum-Based Technique for Determininga Harmonics-To-Noise Ratio in Speech Signalsrsquo Journal ofSpeech and Hearing Research 36(2) 254ndash266
Kukharchik P Kheidorov I Bovbel E and Ladeev D (2008)lsquoSpeech Signal Processing Based on Wavelets and SVM forVocal Tract Pathology Detectionrsquo Lecture Notes in ComputerScience (Springer) 5099 192ndash199
Latifoglu F Polat K Kara SI and Gunes S (2008) lsquoMedicalDiagnosis of Atherosclerosis from Carotid Artery DopplerSignals Using Principal Component Analysis (PCA) k-NN based Weighting Pre-Processing and Artificial ImmuneRecognition System (AIRS)rsquo Journal of Biomedical Infor-matics 41(1) 15ndash23
Ludlow CL Bassich CJ Connor NP Coulter DC and LeeYJ (1987) lsquoThe Validity of Using Phonatory Jitter and Shim-mer to Detect Laryngeal Pathologyrsquo Laryngeal Functionin Phonation and Respiration Boston MA Brown amp Copp 492ndash508
MacQueen J (1967) lsquoSome Methods for Classification and Anal-ysis of Multivariate Observationsrsquo in Paper Presented at the5th Berkeley Symposium on Mathematical Statistics and Prob-ability Berkeley CA University California Press Vol 1pp 281ndash297
MAPACI P (2004) lsquoVoice Disorder Databasersquo httpwwwmapacicomindex-inglesphp
Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372
Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800
Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953
Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327
Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900
Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793
Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793
Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60
Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673
Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279
Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921
Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia
Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345
Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925
Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128
Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740
Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p
Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118
Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576
Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd
Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430
Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97
Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117
Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678
Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219
Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399
Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
International Journal of Systems Science 1633
De Brabanter K Ojeda PKF Alzate C De Brabanter J Pel-ckmans K De Moor B Vandewalle J and Suykens JAK(2010) lsquoLS-SVMlab Toolboxrsquo httpwwwesatkuleuvenbesistalssvmlab
Deliyski D (1993) Acoustic Model and Evaluation of Patholog-ical Voice Productionrsquo in Paper Presented at the Proceed-ings from EUROSPEECHrsquo93 The 3rd Conference on SpeechCommunication and Technology Berlin Germany pp 1969ndash1972
Duda RO Hart PE and Stork DG (2001) Pattern Classifi-cation New York Wiley-Interscience
Erfanian Saeedi N Almasganj F and Torabinejad F (2011)lsquoSupport Vector Wavelet Adaptation for Pathological VoiceAssessmentrsquo Computers in Biology and Medicine 41(9)822ndash828
Eskidere O Ertas F and Hanilci C (2011) lsquoA Comparison ofRegression Methods for Remote Tracking of Parkinsonrsquos Dis-ease Progressionrsquo Expert Systems with Applications 39(5)5523ndash5528
Feijoo S and Hernandez C (1990) lsquoShort-Term Stability Mea-sures for the Evaluation of Vocal Qualityrsquo Journal of Speechand Hearing Research 33(2) 324ndash334
Fonseca E Guido R Scalassara P Maciel C and Pereira J(2007) lsquoWavelet Time-Frequency Analysis and Least SquaresSupport Vector Machines for the Identification of Voice Dis-ordersrsquo Computers in Biology and Medicine(Elsevier) 37(4)571ndash578
Fukunaga K (1990) Introduction to Statistical Pattern Recogni-tion (2nd ed) San Diego CA Academic Press
Godino-Llorente J and Gomez-Vilda P (2004) lsquoAutomaticDetection of Voice Impairments by Means of Short-TermCepstral Parameters and Neural Network Based DetectorsrsquoIEEE Transactions on Biomedical Engineering 51(2) 380ndash384
Godino-Llorente J Gomez-Vilda P and Blanco-Velasco M(2006) lsquoDimensionality Reduction of a Pathological VoiceQuality Assessment System Based on Gaussian Mixture Mod-els and Short-Term Cepstral Parametersrsquo IEEE Transactionson Biomedical Engineering 53(10) 1943ndash1953
Gunes S Polat K and Yosunkaya S (2010) lsquoEfficient SleepStage Recognition System Based on EEG Signal Using k-Means Clustering Based Feature Weightingrsquo Expert Systemswith Applications 37(12) 7922ndash7928
Hariharan M Paulraj M and Yaacob S (2009) lsquoIdentifica-tion of Vocal Fold Pathology Based on Mel Frequency BandEnergy Coefficients and Singular Value Decompositionrsquo inPaper presented at the 2009 IEEE International Conferenceon Signal and Image Processing Applications (ICSIPArsquo09)Kuala Lumpur Malaysia pp 514ndash517
Hariharan M Paulraj M and Yaacob S (2011) lsquoDetection ofVocal Fold Paralysis and Oedema Using Time-Domain Fea-tures and Probabilistic Neural Networkrsquo International Jour-nal of Biomedical Engineering and Technology 6(1) 46ndash57
Hariharan M Saraswathy J Sindhu R Khairunizam W andYaacob S (2012) lsquoInfant Cry Classification to Identify As-phyxia Using Time-Frequency Analysis and Radial BasisNeural Networksrsquo Expert Systems with Applications 39(10)9515ndash9523
Hariharan M and Sazali Y (2010) lsquoTime-Domain Features andProbabilistic Neural Network for the Detection of Vocal FoldPathologyrsquo Malaysian Journal of Computer Science 23(1)60ndash67
Hariharan M Sindhu R and Yaacob S (2011) lsquoNormal andHypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Net-
workrsquo Computer Methods and Programs in Biomedicine108(2) 559ndash569
Hariharan, M., Yaacob, S., and Awang, S.A. (2011), ‘Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network’, Expert Systems with Applications, 38(12), 15377–15382.
Hernandez-Espinosa, C., Gomez-Vilda, P., Godino-Llorente, J., and Aguilera-Navarro, S. (2000), ‘Diagnosis of Vocal and Voice Disorders by the Speech Signal’, in Paper Presented at the Proceedings from IJCNN’00: The IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, pp. 253–258.
Hoglund, H. (2012), ‘Detecting Earnings Management with Neural Networks’, Expert Systems with Applications, 39(10), 9564–9570.
Ismail, S., Samsudin, R., and Shabri, A. (2011), ‘A Hybrid Model of Self-Organizing Maps (SOM) and Least Square Support Vector Machine (LSSVM) for Time Series Forecasting’, Expert Systems with Applications, 38(8), 10574–10578.
Kalker, T., Haitsma, J., and Oostveen, J. (2001), ‘Robust Audio Hashing for Content Identification’, in Paper Presented at the Proceedings from EUSIPCO’01: The 12th European Signal Processing Conference, Vienna, Austria, pp. 2091–2094.
Kasuya, H., Endo, Y., and Saliu, S. (1993), ‘Novel Acoustic Measurements of Jitter and Shimmer Characteristics from Pathological Voice’, Paper Presented at the Proceedings from EUROSPEECH’93: The 3rd European Conference on Speech Communication and Technology, Berlin, Germany, pp. 1973–1976.
Kasuya, H., Ogawa, S., Mashima, K., and Ebihara, S. (1986), ‘Normalized Noise Energy as an Acoustic Measure to Evaluate Pathologic Voice’, The Journal of the Acoustical Society of America, 80, 1329–1334.
Kay Elemetrics Inc. (1994), ‘Voice Disorders Database, Version 1.03 [CD-ROM]’, http://www.kaypentax.com/Product%20Info/CSL%20Options/4337/4337.html
Khadivi Heris, H., Seyed Aghazadeh, B., and Nikkhah-Bahrami, M. (2009), ‘Optimal Feature Selection for the Assessment of Vocal Fold Disorders’, Computers in Biology and Medicine, 39(10), 860–868.
Krom, G. (1993), ‘A Cepstrum-Based Technique for Determining a Harmonics-To-Noise Ratio in Speech Signals’, Journal of Speech and Hearing Research, 36(2), 254–266.
Kukharchik, P., Kheidorov, I., Bovbel, E., and Ladeev, D. (2008), ‘Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection’, Lecture Notes in Computer Science (Springer), 5099, 192–199.
Latifoglu, F., Polat, K., Kara, S.I., and Gunes, S. (2008), ‘Medical Diagnosis of Atherosclerosis from Carotid Artery Doppler Signals Using Principal Component Analysis (PCA), k-NN based Weighting Pre-Processing and Artificial Immune Recognition System (AIRS)’, Journal of Biomedical Informatics, 41(1), 15–23.
Ludlow, C.L., Bassich, C.J., Connor, N.P., Coulter, D.C., and Lee, Y.J. (1987), ‘The Validity of Using Phonatory Jitter and Shimmer to Detect Laryngeal Pathology’, Laryngeal Function in Phonation and Respiration, Boston, MA: Brown & Co., pp. 492–508.
MacQueen, J. (1967), ‘Some Methods for Classification and Analysis of Multivariate Observations’, in Paper Presented at the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA: University of California Press, Vol. 1, pp. 281–297.
MAPACI, P. (2004), ‘Voice Disorder Database’, http://www.mapaci.com/index-ingles.php
Martinez, C., and Rufiner, H. (2000), ‘Acoustic Analysis of Speech for Detection of Laryngeal Pathologies’, in Paper Presented at the Proceedings from IEEE EMBS’00: The 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, pp. 2369–2372.
Martis, R.J., Acharya, U.R., Mandana, K., Ray, A., and Chakraborty, C. (2012), ‘Application of Principal Component Analysis to ECG Signals for Automated Diagnosis of Cardiac Health’, Expert Systems with Applications, 39(14), 11792–11800.
Nayak, J., and Bhat, P. (2003), ‘Identification of Voice Disorders Using Speech Samples’, in Paper Presented at the Proceedings from TENCON’03: The Conference on Convergent Technologies for Asia-Pacific Region, pp. 951–953.
Nayak, J., Bhat, P., Acharya, R., and Aithal, U. (2005), ‘Classification and Analysis of Speech Abnormalities’, ITBM-RBM, 26(5–6), 319–327.
Nikkhah-Bahrami, M., Ahmadi-Noubari, H., Seyed Aghazadeh, B., and Khadivi Heris, H. (2009), ‘Hierarchical Diagnosis of Vocal Fold Disorders’, Communications in Computer and Information Science (Springer), 6, 897–900.
Ozer, H., Sankur, B., Memon, N., and Anarim, E. (2005), ‘Perceptual Audio Hashing Functions’, EURASIP Journal on Applied Signal Processing, 2005(1), 1780–1793.
Pandian, M., Sazali, Y., and Hariharan, M. (2008), ‘Feature Extraction Based on Mel-Scaled Wavelet Packet Transform for the Diagnosis of Voice Disorders’, in Paper Presented at the 4th Kuala Lumpur International Conference on Biomedical Engineering, Kuala Lumpur, Malaysia, pp. 790–793.
Paulraj, M.P., Sazali, Y., and Hariharan, M. (2009), ‘Diagnosis of Voices Disorders Using MEL Scaled WPT and Functional Link Neural Network’, International Journal of Biomedical Soft Computing and Human Sciences, Special Issue: Bio-sensors: Data Acquisition, Processing and Control, 14(2), 55–60.
Polat, K. (2012), ‘Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets’, Journal of Medical Systems, 36(4), 2657–2673.
Polat, K., and Durduran, S.S. (2012), ‘Automatic Determination of Traffic Accidents Based on KMC-Based Attribute Weighting’, Neural Computing & Applications, 21(6), 1271–1279.
Polat, K., and Gunes, S. (2006), ‘A Hybrid Medical Decision Making System Based on Principles Component Analysis, k-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference System’, Digital Signal Processing, 16(6), 913–921.
Rao, R.M., and Bopardikar, A.S. (2000), Wavelet Transforms: Introduction to Theory and Applications, India: Pearson Education Asia.
Ritchings, R., McGillion, M., Conroy, G., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality’, in Paper Presented at the Proceedings of IEEE SMC’99: The IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, Japan, pp. 340–345.
Ritchings, T., McGillion, M., and Moore, C. (1999), ‘Objective Assessment of Pathological Voice Quality Using Multilayer Perceptrons’, in Paper Presented at the Proceedings from IEEE BMES/EMBS’99: The 1st Joint BMES/EMBS Conference in Engineering in Medicine and Biology, Atlanta, GA, p. 925.
Saenz-Lechon, N., Godino-Llorente, J.I., Osma-Ruiz, V., and Gomez-Vilda, P. (2006), ‘Methodological Issues in the Development of Automatic Systems for Voice Pathology Detection’, Biomedical Signal Processing and Control, 1(2), 120–128.
Salhi, L., Talbi, M., and Cherif, A. (2008), ‘Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks’, The World Academy of Science, Engineering and Technology (WASET’08), 35, 2070–3740.
Shama, K., and Cholayya, N. (2007), ‘Study of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology’, EURASIP Journal on Applied Signal Processing, 2007(1), 9 p.
Specht, D. (1990), ‘Probabilistic Neural Networks’, Neural Networks, 3(1), 109–118.
Specht, D.F. (1991), ‘A General Regression Neural Network’, IEEE Transactions on Neural Networks, 2(6), 568–576.
Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2003), Least Squares Support Vector Machines, Singapore: World Scientific Publishing Co. Pte. Ltd.
Umapathy, K., Krishnan, S., Parsa, V., and Jamieson, D. (2005), ‘Discrimination of Pathological Voices Using a Time-Frequency Approach’, IEEE Transactions on Biomedical Engineering, 52(3), 421–430.
Wester, M. (1998), ‘Automatic Classification of Voice Quality: Comparing Regression Models and Hidden Markov Models’, in Paper Presented at the Proceedings from VOICEDATA’98: The Symposium on Databases in Voice Quality Research and Education, Utrecht, The Netherlands, pp. 92–97.
Wu, J.D., and Tsai, Y.J. (2011), ‘Speaker Identification System Using Empirical Mode Decomposition and an Artificial Neural Network’, Expert Systems with Applications, 38(5), 6112–6117.
Xu, R., and Wunsch, D. (2005), ‘Survey of Clustering Algorithms’, IEEE Transactions on Neural Networks, 16(3), 645–678.
Yager, R., and Filev, D. (1994), ‘Generation of Fuzzy Rules by Mountain Clustering’, Journal of Intelligent and Fuzzy Systems, 2(3), 209–219.
Yu, L., Yao, X., Wang, S., and Lai, K. (2011), ‘Credit Risk Evaluation Using a Weighted Least Squares SVM Classifier with Design of Experiment for Parameter Selection’, Expert Systems with Applications, 38(12), 15392–15399.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984), ‘Harmonics-to-Noise Ratio and Psychophysical Measurement of the Degree of Hoarseness’, Journal of Speech and Hearing Research, 27(1), 2–6.
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14
1634 M Hariharan et al
Martinez C and Rufiner H (2000) lsquoAcoustic Analysis of Speechfor Detection of Laryngeal Pathologiesrsquo in Paper Presentedat the Proceedings from IEEE EMBSrsquo00 The 22nd Annual In-ternational Conference of the IEEE Engineering in Medicineand Biology Society Chicago IL pp 2369ndash2372
Martis RJ Acharya UR Mandana K Ray A andChakraborty C (2012) lsquoApplication of Principal Compo-nent Analysis to ECG Signals for Automated Diagnosis ofCardiac Healthrsquo Expert Systems with Applications 39(14)11792ndash11800
Nayak J and Bhat P (2003) lsquoIdentification of Voice Disor-ders Using Speech Samplesrsquo in Paper Presented at the Pro-ceedings from TENCONrsquo03 The Conference on ConvergentTechnologies for Asia-Pacific Region pp 951ndash953
Nayak J Bhat P Acharya R and Aithal U (2005) lsquoClassifi-cation and Analysis of Speech Abnormalitiesrsquo ITBM-RBM26(5ndash6) 319ndash327
Nikkhah-Bahrami M Ahmadi-Noubari H Seyed AghazadehB and Khadivi Heris H (2009) lsquoHierarchical Diagnosisof Vocal Fold Disordersrsquo Communications in Computer andInformation Science(Springer) 6 897ndash900
Ozer H Sankur B Memon N and Anarim E (2005) lsquoPercep-tual Audio Hashing Functionsrsquo EUROSIP Journal of AppliedSignal Processing 2005(1) 1780ndash1793
Pandian M Sazali Y and Hariharan M (2008) lsquoFeature Ex-traction Based on Mel-Scaled Wavelet Packet Transform forthe Diagnosis of Voice Disordersrsquo in Paper Presented at the4th Kuala Lumpur International Conference on BiomedicalEngineering Kuala Lumpur Malaysia pp 790ndash793
Paulraj M P Sazali Y and Hariharan M (2009) lsquoDiagnosis ofVoices Disorders Using MEL Scaled WPT and FunctionalLink Neural Networkrsquo International Journal of Biomedi-cal Soft Computing and Human Sciences Special IssueBio-sensors Data Acquisition Processing and Control 14(2)55ndash60
Polat K (2012) lsquoApplication of Attribute Weighting MethodBased on Clustering Centers to Discrimination of LinearlyNon-Separable Medical Datasetsrsquo Journal of Medical Sys-tems 36(4) 2657ndash2673
Polat K and Durduran S S (2012) lsquoAutomatic Determinationof Traffic Accidents Based on KMC-Based Attribute Weight-ingrsquo Neural Computing amp Applications 21(6) 1271ndash1279
Polat K and Gunes S (2006) lsquoA Hybrid Medical DecisionMaking System Based on Principles Component Analysisk-NN Based Weighted Pre-Processing and Adaptive Neuro-Fuzzy Inference Systemrsquo Digital Signal Processing 16(6)913ndash921
Rao RM and Bopardikar AS (2000) Wavelet TransformsIntroduction to Theory and Applications India PearsonEducation Asia
Ritchings R McGillion M Conroy G and Moore C (1999)lsquoObjective Assessment of Pathological Voice Qualityrsquo in Pa-per Presented at the Proceedings of IEEE SMCrsquo99 The IEEEInternational Conference on Systems Man and CyberneticsTokyo Japan pp 340ndash345
Ritchings T McGillion M and Moore C (1999) lsquoObjec-tive Assessment of Pathological Voice Quality Using Multi-layer Perceptronsrsquo in Paper Presented at the Proceedingsfrom IEEE BMESEMBSrsquo99 The 1st Joint BMESEMBS Con-ference in Engineering in Medicine and Biology AtlantaGA p 925
Saenz-Lechon N Godino-Llorente JI Osma-Ruiz V andGomez-Vilda P (2006) lsquoMethodological Issues in the De-velopment of Automatic Systems for Voice Pathology Detec-tionrsquo Biomedical Signal Processing and Control 1(2) 120ndash128
Salhi L Talbi M and Cherif A (2008) lsquoVoice DisordersIdentification Using Hybrid Approach Wavelet Analysis andMultilayer Neural Networksrsquo The World Academy of Sci-ence Engineering and Technology (WASETrsquo08) 35 2070ndash3740
Shama K and Cholayya N (2007) lsquoStudy of Harmonics-To-Noise Ratio and Critical-Band Energy Spectrum of Speechas Acoustic Indicators of Laryngeal and Voice PathologyrsquoEURASIP Journal on Applied Signal Processing 2007(1)9 p
Specht D (1990) lsquoProbabilistic Neural Networksrsquo Neural Net-works 3(1) 109ndash118
Specht DF (1991) lsquoA General Regression Neural NetworkrsquoIEEE Transactions on Neural Networks 2(6) 568ndash576
Suykens JAK Van Gestel T De Brabanter J De Moor Band Vandewalle J (2003) Least Squares Support Vector Ma-chines Singapore World Scientific Publishing Co Pte Ltd
Umapathy K Krishnan S Parsa V and Jamieson D(2005) lsquoDiscrimination of Pathological Voices Using a time-Frequency Approachrsquo IEEE Transactions on Biomedical En-gineering 52(3) 421ndash430
Wester M (1998) lsquoAutomatic Classification of Voice QualityComparing Regression Models and Hidden Markov Mod-elsrsquo in Paper Presented at the Proceedings from VOICE-DATArsquo98The Symposium on Databases in Voice Quality Re-search and Education Utrecht The Netherlands pp 92ndash97
Wu JD and Tsai YJ (2011) lsquoSpeaker Identification System Us-ing Empirical Mode Decomposition and an Artificial NeuralNetworkrsquo Expert Systems with Applications 38(5) 6112ndash6117
Xu R and Wunsch D (2005) lsquoSurvey of Clustering Algo-rithmsrsquo IEEE Transactions on Neural Networks 16(3) 645ndash678
Yager R and Filev D (1994) lsquoGeneration of Fuzzy Rules byMountain Clusteringrsquo Journal of Intelligent and Fuzzy Sys-tems 2(3) 209ndash219
Yu L Yao X Wang S and Lai K (2011) lsquoCredit Risk Eval-uation Using a Weighted Least Squares SVM Classifier withDesign of Experiment for Parameter Selectionrsquo Expert Sys-tems with Applications 38(12) 15392ndash15399
Yumoto E Sasaki Y and Okamura H (1984) lsquoHarmonics-to-Noise Ratio and Psychophysical Measurement of the Degreeof Hoarsenessrsquo Journal of Speech and Hearing Research27(1) 2ndash6
Dow
nloa
ded
by [
Nor
th D
akot
a St
ate
Uni
vers
ity]
at 1
739
22
Nov
embe
r 20
14