Research Article

An Improved Fuzzy c-Means Clustering Algorithm Based on Shadowed Sets and PSO
Jian Zhang1 and Ling Shen2
1 School of Mechanical Engineering, Tongji University, Shanghai 200092, China
2 Precision Medical Device Department, University of Shanghai for Science and Technology, Shanghai 200093, China
Correspondence should be addressed to Jian Zhang; jianzh@tongji.edu.cn
Received 8 June 2014; Revised 30 September 2014; Accepted 14 October 2014; Published 12 November 2014
Academic Editor: Daoqiang Zhang
Copyright © 2014 J. Zhang and L. Shen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
To organize the wide variety of data sets automatically and acquire accurate classification, this paper presents a modified fuzzy c-means algorithm (SP-FCM) based on particle swarm optimization (PSO) and shadowed sets to perform feature clustering. SP-FCM introduces the global search property of PSO to deal with the problem of premature convergence of conventional fuzzy clustering, utilizes the vagueness balance property of shadowed sets to handle overlapping among clusters, and models uncertainty in class boundaries. The new method uses the Xie-Beni index as a cluster validity measure and automatically finds the optimal cluster number within a specific range, with cluster partitions that provide compact and well-separated clusters. Experiments show that the proposed approach significantly improves the clustering effect.
1. Introduction
Clustering is the process of assigning a homogeneous group of objects into subsets called clusters, so that objects in each cluster are more similar to each other than to objects from different clusters, based on the values of their attributes [1]. Clustering techniques have been studied extensively in data mining [2], pattern recognition [3], and machine learning [4].
Clustering algorithms can be generally grouped into two main classes, namely supervised clustering, in which the parameters of the classifier are optimized, and unsupervised clustering. Many unsupervised clustering algorithms have been developed. One such algorithm is k-means, which assigns n objects to k clusters by minimizing the sum of squared Euclidean distances between the objects in each cluster and the cluster center. The main drawback of the k-means algorithm is that the result is sensitive to the selection of the initial cluster centroids and may converge to local optima [5].
For handling such randomly distributed data sets, soft computing has been introduced into clustering [6]; it exploits the tolerance for imprecision and uncertainty in order to achieve tractability and robustness. Fuzzy sets and rough sets have been incorporated into the c-means framework to develop the fuzzy c-means (FCM) [7] and rough c-means (RCM) [8] algorithms.
Fuzzy algorithms can assign a data object partially to multiple clusters and handle overlapping partitions. The degree of membership in the fuzzy clusters depends on the closeness of the data object to the cluster centers. The most popular fuzzy clustering algorithm is FCM, introduced by Bezdek [9] and now widely used. FCM is an effective algorithm, but the random selection of center points makes the iterative process fall easily into saddle points or local optima. Furthermore, if the data sets contain severe noise or are high dimensional, as in bioinformatics [10], the alternating optimization often fails to find the global optimum. In these cases the probability of finding the global optimum can be increased by stochastic methods such as evolutionary or swarm-based methods. Bezdek and Hathaway [11] optimized the hard c-means (HCM) model with a genetic algorithm. Runkler [12] introduced an ant colony optimization algorithm which explicitly minimizes the HCM and FCM cluster models. Al-Sultan and Selim [13] proposed the simulated annealing algorithm (SA) to overcome some of these limits and obtained promising results.
Hindawi Publishing Corporation, Computational Intelligence and Neuroscience, Volume 2014, Article ID 368628, 10 pages, http://dx.doi.org/10.1155/2014/368628
PSO is a population-based optimization tool developed by Eberhart and Kennedy [14], which can be implemented and applied easily to solve various function optimization problems. Runkler and Katz [15] introduced two new methods for minimizing the reformulated objective functions of the FCM clustering model by PSO: PSO-V and PSO-U. To overcome the shortcomings of FCM, a PSO-based fuzzy clustering algorithm that uses the global search capacity of PSO was discussed [16]. For finding more appropriate cluster centers, a generalized FCM optimized by PSO [17] was proposed.
Shadowed sets are considered a conceptual and algorithmic bridge between rough sets and fuzzy sets; they incorporate the generic merits of both and have been successfully used for unsupervised learning. Shadowed sets introduce a (0, 1) interval to denote the belongingness of clustering points, and the uncertainty among patterns lying in the shadowed region is efficiently handled in terms of membership. Thus, in order to disambiguate and capture the essence of a distribution, the concept of shadowed sets has recently been introduced into clustering [18]; it can also raise the efficiency of the iteration process for the new prototypes by eliminating "bad points" that adversely influence the cluster structure [19, 20]. Compared with FCM, the capability of shadowed c-means is enhanced when dealing with outliers [21].
Although many clustering algorithms based on FCM, PSO, or shadowed sets have been proposed, most of them require the cluster number C to be estimated in advance. To obtain desirable cluster partitions in given data, C is commonly set manually, and this is a very subjective and somewhat arbitrary process. A number of approaches have been proposed to select the appropriate C. Bezdek et al. [22] suggested the rule of thumb C ≤ N^(1/2), where the upper bound must be determined based on knowledge of the data or its applications. Another approach is to use a cluster validity index as a measure criterion for the data partition, such as the Davies-Bouldin (DB) [23], Xie-Beni (XB) [24], and Dunn [25] indices. These indices often follow the principle that the distance between objects in the same cluster should be as small as possible and the distance between objects in different clusters should be as large as possible. They have also been used to acquire the optimal number of clusters C according to their maximum or minimum value.
Therefore, we wish to find the best C in some range, obtain cluster partitions by considering compactness and intercluster separation, and reduce the sensitivity to initial values. Here we propose a modified algorithm named SP-FCM, which integrates the merits of PSO and interleaves shadowed sets between stabilization iterations. It can automatically estimate the optimal cluster number, with a faster initialization than our previous approach.
The structure of the paper is as follows. Section 2 outlines all necessary prerequisites. In Section 3, a new clustering approach called SP-FCM is presented for automatically finding the optimal cluster number. Section 4 includes the results of experiments involving UCI data sets, yeast gene expression data sets, and a real data set. In Section 5, the main conclusions are covered.
2. Related Clustering Algorithms
In this section we briefly describe some basic concepts of FCM, PSO, shadowed sets, and the XB validity index, and review the PSO-based clustering method.
2.1. FCM. We define X = {x_1, ..., x_N} as the universe of a clustering data set, B = {β_1, ..., β_C} as the prototypes of C clusters, and U = [u_ij]_{N×C} as a fuzzy partition matrix, where u_ij ∈ [0, 1] is the membership of x_i in the cluster with prototype β_j; x_i, β_j ∈ R^P, where P is the data dimensionality, 1 ≤ i ≤ N, and 1 ≤ j ≤ C. The FCM algorithm is derived by minimizing the objective function [22]
$$J_{\mathrm{FCM}}(U, B; X) = \sum_{j=1}^{C} \sum_{i=1}^{N} u_{ij}^{m}\, d_{ij}^{2}(x_i, \beta_j), \qquad (1)$$
where m > 1.0 is the weighting exponent on each fuzzy membership and d_ij is the Euclidean distance from data vector x_i to cluster center β_j, subject to

$$\sum_{j=1}^{C} u_{ij} = 1, \quad \forall i = 1, 2, \ldots, N; \qquad 0 < \sum_{i=1}^{N} u_{ij} < N, \quad \forall j = 1, 2, \ldots, C; \qquad d_{ij} = \left\| x_i - \beta_j \right\|. \qquad (2)$$
This produces the following update equations:

$$u_{ij} = \left( \sum_{k=1}^{C} \left( \frac{d(x_i, \beta_j)}{d(x_i, \beta_k)} \right)^{2/(m-1)} \right)^{-1}, \qquad (3)$$

$$\beta_j = \frac{\sum_{i=1}^{N} (u_{ij})^m\, x_i}{\sum_{i=1}^{N} (u_{ij})^m}. \qquad (4)$$
After computing the memberships of all the objects, the new prototypes of the clusters are calculated. The process stops when the prototypes stabilize, that is, when the prototypes from the previous iteration are in close proximity to those generated in the current iteration, normally within an error threshold.
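The alternating application of (3) and (4) can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' code: the evenly spread exemplar initialization (in the spirit of the β_j = x_⌊(N/C)j⌋ rule used later in Section 3) and the stopping tolerance are our assumptions.

```python
import numpy as np

def fcm(X, C, m=2.0, tol=1e-5, max_iter=100):
    """Fuzzy c-means: alternate membership update (3) and prototype
    update (4) until the prototypes stabilize within a threshold."""
    N = X.shape[0]
    # Spread initial prototypes evenly over the data points.
    idx = (np.arange(1, C + 1) * N) // C - 1
    B = X[idx].astype(float)
    for _ in range(max_iter):
        # Distances d_ij from every object to every prototype.
        D = np.linalg.norm(X[:, None, :] - B[None, :, :], axis=2)
        D = np.fmax(D, 1e-12)                 # guard against zero distance
        # Membership update, eq. (3); rows of U sum to 1.
        U = 1.0 / ((D[:, :, None] / D[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)
        # Prototype update, eq. (4).
        Um = U ** m
        B_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        if np.abs(B_new - B).max() < tol:     # prototypes stabilized
            return U, B_new
        B = B_new
    return U, B
```

On two well-separated Gaussian blobs, the recovered prototypes land near the blob means and the memberships are near-crisp.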
2.2. PSO. PSO was originally introduced in terms of the social and cognitive behavior of bird flocking and fish schooling. The potential solutions are called particles, which fly through the problem space by following the current best particles. Each particle keeps track of its coordinates in the problem space, which are associated with the best solution it has achieved so far. That solution is evaluated by the fitness value, which is also stored; this value is called pbest. Another best value tracked by PSO is the best value obtained so far by any particle in the swarm; this global best is called gbest. The search for better positions follows the rule

$$V(t+1) = w V(t) + c_1 r_1 \left( p_{\mathrm{best}}(t) - P(t) \right) + c_2 r_2 \left( g_{\mathrm{best}}(t) - P(t) \right),$$
$$P(t+1) = P(t) + V(t+1), \qquad (5)$$

where P and V are the position and velocity vectors of a particle, respectively; w is the inertia weight; c_1 and c_2 are positive constants called acceleration coefficients, which control the influence of pbest and gbest in the search process; and r_1 and r_2 are random values in the range [0, 1]. The fitness value of each particle's position is determined by a fitness function, and PSO is usually executed with repeated application of (5) until a specified number of iterations has been exceeded or the velocity updates are close to zero over a number of iterations.
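Update rule (5) can be exercised on a toy objective. The sketch below uses the parameter values w = 0.72 and c1 = c2 = 1.49 that appear in Section 4; the search bounds and swarm size are arbitrary illustrative choices.

```python
import numpy as np

def pso_minimize(f, dim, L=20, T=100, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Minimize f over R^dim by repeated application of update rule (5)."""
    rng = np.random.default_rng(seed)
    P = rng.uniform(-5.0, 5.0, (L, dim))      # particle positions
    V = np.zeros((L, dim))                    # particle velocities
    pbest = P.copy()                          # best position per particle
    pcost = np.apply_along_axis(f, 1, P)      # fitness at pbest
    gbest = pbest[pcost.argmin()].copy()      # best position of the swarm
    for _ in range(T):
        r1 = rng.random((L, dim))
        r2 = rng.random((L, dim))
        # Velocity and position update, eq. (5).
        V = w * V + c1 * r1 * (pbest - P) + c2 * r2 * (gbest - P)
        P = P + V
        cost = np.apply_along_axis(f, 1, P)
        improved = cost < pcost               # update pbest where improved
        pbest[improved], pcost[improved] = P[improved], cost[improved]
        gbest = pbest[pcost.argmin()].copy()
    return gbest, float(pcost.min())
```

For example, `pso_minimize(lambda x: float((x ** 2).sum()), dim=2)` drives the swarm toward the origin of the sphere function.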
2.3. PSO-Based FCM. In this algorithm [26], each particle Part_l represents a cluster center vector, which is constructed as

$$\mathrm{Part}_l = (P_{l1}, \ldots, P_{lj}, \ldots, P_{lC}), \qquad (6)$$

where l denotes the l-th particle, l = 1, 2, ..., L; L is the number of particles, with L < N; and P_lj is the j-th cluster center of particle Part_l. Therefore a swarm represents a number of candidate cluster centers for the data vectors. Each data vector belongs to a cluster according to its membership function, and thus a fuzzy membership is assigned to each data vector. Each cluster has a cluster center per iteration and presents a solution which gives a vector of cluster centers. This method determines the position vector Part_l for every particle, updates it, and then changes the positions of the cluster centers. The fitness function for evaluating the generalized solutions is stated as

$$F(P) = \frac{1}{J_{\mathrm{FCM}}}. \qquad (7)$$

The smaller J_FCM is, the better the clustering effect and the higher the fitness function F(P).
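Fitness (7) for one particle can be sketched as follows: the memberships are derived from the particle's center vector by (3), plugged into (1), and the reciprocal is taken. The helper name `fcm_objective` is ours, not from the paper.

```python
import numpy as np

def fcm_objective(X, B, m=2.0):
    """J_FCM of eq. (1) for prototypes B, with memberships from eq. (3)."""
    D = np.fmax(np.linalg.norm(X[:, None, :] - B[None, :, :], axis=2), 1e-12)
    U = 1.0 / ((D[:, :, None] / D[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)
    return float(((U ** m) * D ** 2).sum())

def fitness(X, B, m=2.0):
    """Fitness of a particle's center vector, eq. (7): 1 / J_FCM."""
    return 1.0 / fcm_objective(X, B, m)
```

A particle whose centers sit on the true cluster means scores a higher fitness than one whose centers are bunched between the clusters.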
2.4. Shadowed Sets. Conventional uncertainty models like fuzzy sets tend to capture vagueness through membership values and associate precise numeric values of membership with vague concepts. By introducing an α-cut [19], a fuzzy set can be converted into a classical set. Shadowed sets map each element of a given fuzzy set into 0, 1, and the unit interval [0, 1], namely excluded, included, and uncertain, respectively.
For constructing a shadowed set, Mitra et al. [21] proposed an optimization based on the balance of vagueness. Elevating the membership values of some regions to 1 and, at the same time, reducing the membership values of other regions to 0 eliminates the uncertainty in those regions. To keep the balance of total uncertainty, these changes need to be compensated by the construction of uncertain regions, namely shadowed sets, that absorb the elimination of partial membership at the low and high ranges. The shadowed sets induced by a fuzzy membership function are shown in Figure 1.

[Figure 1: Shadowed sets induced by fuzzy function f(x); the thresholds α and 1 − α delimit the regions Ω1, Ω2, and Ω3.]
Here x denotes the objects and f(x) ∈ [0, 1] is the continuous membership function of the objects belonging to a cluster. The symbol Ω1 shows the reduction of membership, Ω2 depicts the elevation of membership, and Ω3 shows the formation of shadows. In order to balance the total uncertainty, the retention of balance translates into the following dependency:

$$\Omega_1 + \Omega_2 = \Omega_3. \qquad (8)$$
The integral forms are given as

$$\Omega_1 = \int_{x:\, f(x) \le \alpha} f(x)\, dx, \qquad \Omega_2 = \int_{x:\, f(x) \ge 1-\alpha} (1 - f(x))\, dx, \qquad \Omega_3 = \int_{x:\, \alpha < f(x) < 1-\alpha} dx. \qquad (9)$$

The thresholds for reducing and elevating are α and 1 − α, with α ∈ (0, 0.5). The optimal value of α is acquired by translating the balance into the minimization of the following objective function:

$$O(\alpha) = \left| \Omega_1 + \Omega_2 - \Omega_3 \right|. \qquad (10)$$
For a fuzzy set with a discrete membership function, the balance equation is modified as

$$O(\alpha_j) = \left| \sum_{u_{ij} \le \alpha_j} u_{ij} + \sum_{u_{ij} \ge u_{j\max} - \alpha_j} \left( u_{j\max} - u_{ij} \right) - \mathrm{card} \left\{ u_{ij} \mid \alpha_j < u_{ij} < u_{j\max} - \alpha_j \right\} \right|. \qquad (11)$$

In order to find the best α_j, the following optimization problem should be solved:

$$\alpha_j = \arg\min_{\alpha_j} O(\alpha_j), \qquad (12)$$
where u_ij ∈ [0, 1] is the membership of x_i in the cluster with prototype β_j; u_jmax and u_jmin denote the highest and lowest membership values of the j-th cluster; and α_j is the threshold of the j-th cluster. The range of feasible values of the threshold α_j is [u_jmin, (u_jmin + u_jmax)/2] [19].
This approach considers all membership values with respect to a fixed cluster when updating the prototype of that cluster. The main merits of shadowed sets are the optimization mechanism for choosing a separate threshold for each cluster and the reduction of the burden of plain numeric computations.
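Problem (12) over the discrete balance (11) can be approximated by a grid search on the feasible interval; the 200-point grid below is an arbitrary illustrative choice, and a finer search or analytic treatment could replace it.

```python
import numpy as np

def optimal_threshold(u):
    """Pick alpha_j for one column u of the partition matrix by
    minimizing the discrete balance objective (11) over the feasible
    range [u_min, (u_min + u_max) / 2]."""
    u = np.asarray(u, dtype=float)
    umax, umin = u.max(), u.min()
    best_alpha, best_obj = umin, np.inf
    for alpha in np.linspace(umin, (umin + umax) / 2, 200):
        reduced = u[u <= alpha].sum()                       # memberships pushed to 0
        elevated = (umax - u[u >= umax - alpha]).sum()      # memberships pushed to 1
        shadow = np.count_nonzero((u > alpha) & (u < umax - alpha))
        obj = abs(reduced + elevated - shadow)              # eq. (11)
        if obj < best_obj:
            best_alpha, best_obj = alpha, obj
    return float(best_alpha)
```

The returned threshold always lies in the feasible range stated above, so the subsequent hardening of the column is well defined.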
2.5. XB Clustering Validity Index. The clustering algorithms described above require prespecification of the number of clusters, and the partition results depend on the choice of C. Validity indices exist to evaluate the goodness of clustering for a given number of clusters; therefore these indices can be used to acquire the optimal value of C [27].

The XB index presents a fuzzy-validity criterion based on a validity function which identifies overall compact and separate fuzzy c-partitions. This function depends upon the data set, the geometric distance measure, the distance between cluster centroids, and the fuzzy partition, irrespective of the fuzzy algorithm used. For evaluating the goodness of the data partition, both cluster compactness and intercluster separation should be taken into account. For the FCM algorithm with m = 2.0, the Xie-Beni index can be shown to be

$$S_{\mathrm{XB}} = \frac{J_{\mathrm{FCM}}}{N\, d_{\min}^2}, \qquad (13)$$

where $d_{\min} = \min_{i \ne j} \| \beta_i - \beta_j \|$ is the minimum distance between cluster centroids. The more separate the clusters, the larger d_min and the smaller S_XB.
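Index (13) in code, assuming the partition matrix U and prototypes B are given (a sketch; the paper does not prescribe an implementation):

```python
import numpy as np

def xb_index(X, U, B, m=2.0):
    """Xie-Beni index, eq. (13): compactness J_FCM divided by
    N times the squared minimum distance between centroids."""
    # Compactness: the FCM objective J_FCM of eq. (1).
    D2 = ((X[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    J = ((U ** m) * D2).sum()
    # Separation: squared minimum pairwise centroid distance.
    C = B.shape[0]
    dmin2 = min(((B[i] - B[j]) ** 2).sum()
                for i in range(C) for j in range(i + 1, C))
    return float(J / (X.shape[0] * dmin2))
```

A near-crisp partition of two well-separated blobs yields a value close to zero, consistent with the remark that more separate clusters give a smaller S_XB.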
3. Shadowed Sets-Based PSO-Fuzzy Clustering: SP-FCM
FCM strives to find C compact clusters in X, where C is one of the specified parameters. But the process of selecting and adjusting C manually to obtain desirable cluster partitions in a given data set is very subjective and somewhat arbitrary. To seek the optimal cluster structure, C is always allowed to be overestimated [28], such that the distances between some clusters are not large enough, or the membership values of some objects in different clusters are adjacent and ambiguous. In this case the modification of prototypes through long iterations is meaningless.
The main subject of cluster validation is the evaluation of clustering results to find the partition that best fits the data set. Based on the foregoing algorithms, we wish to find cluster partitions that contain compact and well-separated clusters. In our algorithm, C is also overestimated and the clusters compete for data membership. We can set [C_min, C_max] as the reasonable range of cluster numbers based on knowledge of the data. This provides a more transparent and tractable process of cluster number reduction. Considering the fuzzy partition matrix U = [u_ij]_{N×C}, each column comprises the membership values of all feature vectors x_i with respect to a single cluster center. Thus an optimal threshold α_j (j = 1, 2, ..., C) for each column should be found by (12) to create a harder partition. The amount of data assigned a membership value of 1 is identified as the cardinality of the corresponding cluster. According to α_j, the cardinality of the j-th column is

$$M_j = \mathrm{card} \left\{ u_{ij} \mid u_{ij} \ge u_{j\max} - \alpha_j \right\}. \qquad (14)$$

Here the threshold is not subjectively user-defined; it is established from the balance of uncertainty and can be adjusted automatically in the clustering process. This property of shadowed sets can be used to reduce the cluster number. In order to control the convergence speed, the decision to delete clusters can be based on thresholds, and different threshold values should be set for different data sets, depending on the cluster structure and the size of the data set. Here a threshold ε and an attrition rate ρ (0 < ρ < 1) are set. The decision to delete clusters in SP-FCM is based solely on cluster cardinality and the threshold ε. If ε is too small, C is reduced more slowly and the reduction may stop prematurely before the optimal cluster number is found; on the other hand, if ε is too large, C may be reduced too drastically. In our method, clusters whose cardinalities satisfy M_j < ε are considered "candidates" for removal, and we remove up to ⌊ρ × C⌋ clusters having the lowest cardinality from the pool of candidates specified by ε. Limiting the number of clusters that can be removed at one time prevents C from being reduced too drastically when ε is set too high for a given data set. This automatically estimates the best cluster number while also providing a faster, consistent, and repeatable initialization technique. For evaluating the goodness of the data partition, both cluster compactness and intercluster separation should be taken into account; hence the XB index is adopted.
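The cardinality-based reduction step can be sketched as follows. The function returns the indices of surviving clusters; forcing the removal quota to be at least one reflects the remark after Algorithm 1 that a zero quota ⌊ρ × C⌋ is treated as 1.

```python
import numpy as np

def reduce_clusters(M, eps, rho):
    """Candidates are clusters with cardinality M_j < eps; at most
    floor(rho * C) of them, those with the lowest cardinality, are
    removed. Returns the indices of the clusters that are kept."""
    M = np.asarray(M)
    C = len(M)
    candidates = np.flatnonzero(M < eps)
    quota = max(int(np.floor(rho * C)), 1)   # remove at least one candidate
    doomed = candidates[np.argsort(M[candidates])[:quota]]
    return np.setdiff1d(np.arange(C), doomed)
```

For example, with cardinalities [50, 3, 40, 2, 60], ε = 10, and ρ = 0.3, only the single lowest-cardinality candidate (index 3) is removed, even though index 1 is also below the threshold.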
For each C in the range [C_min, C_max], a set of cluster validity indices is calculated, where C_max is the initial cluster number, set much larger than the expected cluster number. The partition matrix whose C clusters yield the best aggregate validity index is selected as the final cluster partition. The SP-FCM algorithm is summarized in Algorithm 1.
(1) Initialize C_max and C_min; let C = C_max; set the real number m, the iteration counter k = 0, the iteration counter t = 0, and the maximum iteration number T of PSO.
(2) Initialize the population size L, the initial velocity and position of the particles, c_1, c_2, w, the threshold ε, and the attrition rate ρ.
(3) DO
      Repeat
        (a) Update the partition matrix U(k) for all particles by (3)
        (b) Calculate the cluster center for each particle by (4)
        (c) Calculate the fitness value for each particle by (7)
        (d) Calculate pbest for each particle
        (e) Calculate gbest for the swarm
        (f) Update the velocity and position of each particle by (5)
        (g) t = t + 1
      Until the PSO termination condition is met (*)
      (i) Calculate the optimal threshold α_j (j = 1, 2, ..., C) for each column of the partition matrix U(k) by (12),
          and relocate u_ij (1 ≤ i ≤ N) of the j-th cluster according to α_j
      (ii) Calculate the cardinality M_j of each cluster from the number of data whose membership value equals 1, by (14), 1 ≤ j ≤ C
      (iii) Remove all clusters whose M_j < ε and whose M_j is among the ⌊ρ × C⌋ lowest cardinalities
      (iv) Update the cluster number C
      (v) Calculate the cluster validity index S_XB(k) by (13)
      (vi) Update the iteration counter k = k + 1
    While the termination condition is not met (**)
(*) The termination condition of PSO in this method is t ≥ T (the maximum number of iterations is reached) or the velocity updates are close to zero over a number of iterations.
(**) The algorithm can terminate under either of the following two conditions: (1) the prototype parameters in B stabilize within some threshold δ; (2) the number of clusters has reached the minimum limit C_min.

Algorithm 1: SP-FCM.
Here, if ⌊ρ × C⌋ equals 0, we let it be 1; this means that the cluster with the lowest cardinality may still be removed. The initial C_max cluster prototypes are initialized using exemplars from the data points, selected by β_j = x_⌊(N/C_max)j⌋. After termination, the B and U from the C ∈ [C_min, C_max] with the best cluster validity index S_XB are selected as the final cluster prototypes and partition.
4. Experimental Results

In this section the performance of the FCM, RCM, shadowed c-means (SCM) [21], shadowed rough c-means (SRCM) [19], and SP-FCM algorithms is presented on four UCI data sets, four yeast gene expression data sets, and real data. For evaluating the convergence effect, the fundamental criterion can be described as follows: the distance between objects in the same cluster should be as small as possible, and the distance between objects in different clusters should be as large as possible. Here we use the DB index and the Dunn index to evaluate the clustering effect. For a given data set and C value, the higher the similarity within clusters and the intercluster separation, the lower the DB index value; a good clustering procedure should make the DB index as low as possible. Conversely, higher values of the Dunn index indicate better clustering, in the sense that the clusters are well separated and relatively compact. The details of the experiments are given below.
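Since the paper does not state which Dunn variant it uses, the sketch below adopts the common definition for hard labels: the minimum between-cluster single-linkage distance divided by the maximum cluster diameter.

```python
import numpy as np

def dunn_index(X, labels):
    """Dunn index: smallest between-cluster distance divided by the
    largest cluster diameter; higher values indicate well-separated,
    compact clusters."""
    clusters = [X[labels == c] for c in np.unique(labels)]
    # Largest intra-cluster pairwise distance (cluster diameter).
    diam = max(np.linalg.norm(A[:, None] - A[None, :], axis=2).max()
               for A in clusters)
    # Smallest inter-cluster pairwise distance (single linkage).
    sep = min(np.linalg.norm(A[:, None] - B[None, :], axis=2).min()
              for i, A in enumerate(clusters)
              for B in clusters[i + 1:])
    return float(sep / diam)
```

Labeling two well-separated blobs correctly gives a large index, while interleaving the labels collapses it, matching the interpretation above.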
4.1. UCI Data Sets. In our experiments four UCI data sets are used: the 4-dimensional Iris, 13-dimensional Wine, 10-dimensional Glass, and 34-dimensional Ionosphere. The Iris data set has 3 clusters of 50 patterns each; Wine has 3 clusters with 50, 60, and 68 patterns; Glass has 6 clusters with 30, 35, 40, 42, 36, and 31 patterns; and Ionosphere has 2 clusters with 226 and 125 patterns. The validity indices of each method are compared in Table 1. SP-FCM can identify compact groups compared to the other algorithms when given the cluster number C. It can also be seen that SRCM and SP-FCM have more obvious advantages than FCM, RCM, and SCM. SP-FCM performs slightly better than SRCM in most cases due to its global search ability, which enables it to converge to an optimum or near-optimum solution. Moreover, the shadowed set- and rough set-based clustering methods, namely SP-FCM, SRCM, RCM, and SCM, perform better than FCM. This implies that the partition into approximation regions can reveal the nature of the data structure, and that only the lower bound and boundary region of each cluster contribute positively to the process of updating the prototypes.
Table 1: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four UCI data sets.

Index        Algorithm   Iris     Wine     Glass    Ionosphere
DB index     FCM         0.7642   0.8803   2.2971   2.0587
             RCM         0.6875   0.5692   1.9635   1.5434
             SCM         0.6862   0.5327   1.8495   1.4763
             SRCM        0.6613   0.4436   1.5804   1.3971
             SP-FCM      0.6574   0.4328   1.5237   1.4066
Dunn index   FCM         2.3106   2.5834   0.1142   0.8381
             RCM         2.7119   2.8157   0.2637   1.0233
             SCM         2.4801   2.7992   0.3150   1.0319
             SRCM        3.0874   3.1342   0.5108   1.1924
             SP-FCM      3.3254   3.1764   0.4921   1.2605
[Figure 2: XB validity index of four UCI data sets with cluster number C. (a) Iris. (b) Wine. (c) Glass. (d) Ionosphere.]
As usual, the number of clusters is implied by the nature of the problem; here, with shadowed sets involved, one can anticipate that the optimal number of clusters can be found. The fuzzification coefficient m can be optimized; however, it is common to assume a fixed value of 2.0, which is associated with the form of the membership functions of the generated clusters. For testing the SP-FCM algorithm the rule C ≤ N^(1/2) is adopted, and the ranges of the expected cluster number are set as: (1) Iris, [C_min = 2, C_max = 12]; (2) Wine, [C_min = 2, C_max = 13]; (3) Glass, [C_min = 2, C_max = 14]; (4) Ionosphere, [C_min = 2, C_max = 16]. The swarm size is set as L = 20 and the maximum iteration number of PSO is T = 50; for cluster reduction, the cluster cardinality threshold is ε = 10 and the attrition rate is ρ = 0.1. In each cycle we get the distribution of every cluster, remove some of them according to their cardinality, and calculate the XB index, with the cluster number C varying from C_max to C_min. After the loop ends, the partition with the lowest value is selected as the final result. Figure 2 presents the validity indices in the process of generating the optimal cluster number. Smaller
Table 2: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four yeast gene expression data sets.

Index        Algorithm   GDS608   GDS2003   GDS2267   GDS2712
DB index     FCM         2.0861   2.4671    1.5916    1.9526
             RCM         1.6109   2.2104    1.0274    1.2058
             SCM         1.5938   2.1346    0.8946    1.0965
             SRCM        1.3274   1.9523    0.7438    0.7326
             SP-FCM      1.2958   1.8946    0.7962    0.6843
Dunn index   FCM         0.2647   0.2976    0.4208    0.3519
             RCM         0.3789   0.3981    0.7164    0.6074
             SCM         0.3865   0.3775    0.8439    0.6207
             SRCM        0.5126   0.4953    0.9759    0.8113
             SP-FCM      0.5407   0.5026    0.9182    0.8049
values indicate more compact and well-separated clusters. The validity indices reach their minimum values at C = 3, 3, 6, and 2, respectively, which correspond to the final cluster prototypes and the best partitions. Through the shadowed sets and PSO approaches, the influence of each boundary region on the formation of the prototypes and the clusters can be properly resolved. Although more computing time is required to run SP-FCM, reasonable results can be acquired when processing overlapping and vague data patterns.
4.2. Yeast Gene Expression Data Sets. Four yeast gene expression data sets are used in the experiments: GDS608, GDS2003, GDS2267, and GDS2712, downloaded from the Gene Expression Omnibus. GDS608 has 26 classes and 6303 samples; GDS2003 has 23 classes and 5617 samples; GDS2267 has 14 classes and 9275 samples; and GDS2712 has 15 classes and 9275 samples. Table 2 presents the validity indices of the different methods for a given cluster number C. SP-FCM and SRCM obtain similar effects and perform better than the other clustering algorithms. The improvement can be attributed to the fact that the global search capacity of PSO is conducive to finding more appropriate cluster centers while escaping from local optima.
For getting the optimum C automatically, we let m = 2.0, c_1 = 1.49, c_2 = 1.49, and w = 0.72, and the rule C ≤ N^(1/2) is adopted. The swarm size is set as L = 20 and the maximum iteration number of PSO is T = 80; for cluster reduction, the range of the expected cluster number, the cluster cardinality threshold ε, and the attrition rate ρ are set as: (1) GDS608, [C_min = 20, C_max = 80], ε = 20, ρ = 0.05; (2) GDS2003, [C_min = 20, C_max = 75], ε = 20, ρ = 0.05; (3) GDS2267, [C_min = 10, C_max = 96], ε = 20, ρ = 0.08; (4) GDS2712, [C_min = 10, C_max = 96], ε = 20, ρ = 0.08. In each cycle we get the distribution of every cluster, remove some of them according to their cardinality, and calculate the XB index, with the cluster number C varying from C_max to C_min. The partition with the lowest value is selected as the final result after the loop ends. As seen in Figure 3, for GDS608 the cluster number decreases at a faster rate at the beginning: it takes 26 iterations to reduce the cluster number from C = 80 to C = 30 and 4 iterations from C = 30 to C = 26, and the XB index begins to increase when the cluster number falls below C = 26. For GDS2003 it takes 24 iterations to reduce the cluster number from C = 75 to C = 30 and 7 iterations from C = 30 to C = 23, and the XB index begins to increase when C < 23. For GDS2267 it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 6 iterations from C = 20 to C = 14, and the XB index begins to increase when C < 14. For GDS2712 it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 5 iterations from C = 20 to C = 15, and the XB index begins to increase when C < 15. Here the advantages of fuzzy sets, PSO, and shadowed sets are integrated in SP-FCM, making the algorithm applicable to overlapping partitions and to the uncertainty and vagueness arising from the boundary regions; the optimization process in the shadowed sets makes the method robust to outliers, so that the approximation regions of each cluster can be determined accurately and the obtained prototypes approach the desired locations.
4.3. Real Data. In this experiment ten different packages are tested. Each package is represented by 100 frames captured from different angles by a camera, and SIFT feature points are extracted from each frame for training a recognition system. Figure 4 shows some images with their SIFT keypoints. This data set comprises 248,150 descriptors. We let m = 2.0, c_1 = 1.49, c_2 = 1.49, w = 0.72, L = 20, ε = 30, and ρ = 0.01 for SP-FCM, and choose the reasonable range [C_min = 200, C_max = 360] according to the number of package categories and the distribution of keypoints in each image. Eighty iterations of PSO are run for each given C to produce the cluster prototypes B and partition matrix U as the starting point for the shadowed sets. Longer PSO stabilization is needed to obtain more stable cluster partitions.
8 Computational Intelligence and Neuroscience
[Figure 3: XB validity index of the four yeast gene expression data sets versus cluster number C; panels (a) GDS608 and (b) GDS2003 plot C from 30 down to 20, panels (c) GDS2267 and (d) GDS2712 plot C from 20 down to 10; y-axis: Xie-Beni validity.]
Table 3: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on the package data set.

Index      | FCM     | RCM     | SCM     | SRCM    | SP-FCM
DB index   | 18.4569 | 15.9671 | 14.3194 | 12.4038 | 10.7826
Dunn index |  9.2647 | 11.6298 | 12.5376 | 16.9422 | 16.7313
Within each cluster, the optimal α_j decides the cardinality and realizes cluster reduction, and the XB index is calculated. Each C-partition is ranked using this index, and the partition with the smallest index value, indicating the most compact and well-separated clusters, is selected as the final output. At the beginning the cluster number decreases quickly: it takes 26 iterations to reduce the cluster number from C = 360 to C = 289 and 20 iterations from C = 289 to C = 267. The XB index increases at a relatively fast rate when the cluster number C < 267. Figure 5 shows the XB index for C ∈ [267, 289]. The index reaches its minimum value at C = 276, which means that the best partition for this data set is 276 clusters. Table 3 exhibits a comparative analysis of the convergence effect. As expected, SP-FCM provides sound results for the real data; the performance is assessed by the validity indices.
[Figure 4: Ten package images with their SIFT features; feature counts: (a) 644, (b) 460, (c) 742, (d) 442, (e) 313, (f) 602, (g) 373, (h) 724, (i) 539, (j) 124.]

[Figure 5: XB validity index of the bag data set versus cluster number C, for C from 289 down to 267.]

5. Conclusions

This paper presents a modified fuzzy c-means algorithm based on particle swarm optimization and shadowed sets to perform unsupervised feature clustering. This algorithm, called SP-FCM, utilizes the global search property of PSO and the vagueness balance property of shadowed sets, such that it can estimate the optimal cluster number as it runs through its alternating optimization process. As a randomized approach, SP-FCM can alleviate the problems faced by FCM, which suffers from sensitivity to initialization and convergence to local minima. Moreover, the algorithm avoids the subjective and somewhat arbitrary trials otherwise used to estimate the appropriate cluster number, and it can find the optimal cluster number within a specific range using cluster validity measures as indicators. The use of the XB validity index allows the algorithm to find the optimum cluster number with cluster partitions that provide compact and well-separated clusters. The experiments show that the SP-FCM algorithm produces good results with reference to the DB and Dunn indices, especially in high-dimensional and large-data cases.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research was supported by the National Natural Science Foundation of China (Grant no. 61105089, NSFC).
References
[1] A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[2] H.-P. Kriegel, P. Kröger, and A. Zimek, "Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering," ACM Transactions on Knowledge Discovery from Data, vol. 3, no. 1, article 1, 2009.
[3] P. Melin and O. Castillo, "A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition," Applied Soft Computing Journal, vol. 21, pp. 568–577, 2014.
[4] M. Yuwono, S. W. Su, B. D. Moulton, and H. T. Nguyen, "Data clustering using variants of rapid centroid estimation," IEEE Transactions on Evolutionary Computation, vol. 18, no. 3, pp. 366–377, 2014.
[5] S. C. Satapathy, G. Pradhan, S. Pattnaik, J. V. R. Murthy, and P. V. G. D. P. Reddy, "Performance comparisons of PSO based clustering," InterJRI Computer Science and Networking, vol. 1, no. 1, pp. 18–23, 2009.
[6] G. Peters and R. Weber, "Dynamic clustering with soft computing," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 3, pp. 226–236, 2012.
[7] D. T. Anderson, J. C. Bezdek, M. Popescu, and J. M. Keller, "Comparing fuzzy, probabilistic, and possibilistic partitions," IEEE Transactions on Fuzzy Systems, vol. 18, no. 5, pp. 906–918, 2010.
[8] P. Maji and S. Paul, "Robust rough-fuzzy C-means algorithm: design and applications in coding and non-coding RNA expression data clustering," Fundamenta Informaticae, vol. 124, no. 1-2, pp. 153–174, 2013.
[9] J. Bezdek, Fuzzy mathematics in pattern classification [Ph.D. thesis], Cornell University, Ithaca, NY, USA, 1974.
[10] V. Olman, F. Mao, H. Wu, and Y. Xu, "Parallel clustering algorithm for large data sets with applications in bioinformatics," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 344–352, 2009.
[11] J. C. Bezdek and R. J. Hathaway, "Optimization of fuzzy clustering criteria using genetic algorithms," in Proceedings of the 1st IEEE Conference on Evolutionary Computation, pp. 589–594, June 1994.
[12] T. A. Runkler, "Ant colony optimization of clustering models," International Journal of Intelligent Systems, vol. 20, no. 12, pp. 1233–1251, 2005.
[13] K. S. Al-Sultan and S. Z. Selim, "A global algorithm for the fuzzy clustering problem," Pattern Recognition, vol. 26, no. 9, pp. 1357–1361, 1993.
[14] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, Nagoya, Japan, October 1995.
[15] T. A. Runkler and C. Katz, "Fuzzy clustering by particle swarm optimization," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 601–608, July 2006.
[16] C.-F. Juang, C.-M. Hsiao, and C.-H. Hsu, "Hierarchical cluster-based multispecies particle-swarm optimization for fuzzy-system optimization," IEEE Transactions on Fuzzy Systems, vol. 18, no. 1, pp. 14–26, 2010.
[17] H. Izakian and A. Abraham, "Fuzzy C-means and fuzzy swarm for fuzzy clustering problem," Expert Systems with Applications, vol. 38, no. 3, pp. 1835–1838, 2011.
[18] W. Pedrycz, "Shadowed sets: representing and processing fuzzy sets," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 1, pp. 103–109, 1998.
[19] J. Zhou, W. Pedrycz, and D. Miao, "Shadowed sets in the characterization of rough-fuzzy clustering," Pattern Recognition, vol. 44, no. 8, pp. 1738–1749, 2011.
[20] W. Pedrycz, "From fuzzy sets to shadowed sets: interpretation and computing," International Journal of Intelligent Systems, vol. 24, no. 1, pp. 48–61, 2009.
[21] S. Mitra, W. Pedrycz, and B. Barman, "Shadowed c-means: integrating fuzzy and rough clustering," Pattern Recognition, vol. 43, no. 4, pp. 1282–1291, 2010.
[22] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, vol. 4 of The Handbooks of Fuzzy Sets, Springer, New York, NY, USA, 1999.
[23] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224–227, 1979.
[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
[25] J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 3, pp. 301–315, 1998.
[26] S. C. Satapathy, S. K. Patnaik, C. D. P. Dash, and S. Sahoo, "Data clustering using modified fuzzy-PSO (MFPSO)," in Proceedings of the 5th Multi-Disciplinary International Workshop on Artificial Intelligence, Hyderabad, India, 2011.
[27] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[28] R. Krishnapuram and J. M. Keller, "The possibilistic C-means algorithm: insights and recommendations," IEEE Transactions on Fuzzy Systems, vol. 4, no. 3, pp. 385–393, 1996.
PSO is a population-based optimization tool developed by Eberhart and Kennedy [14], which can be implemented and applied easily to solve various function optimization problems. Runkler and Katz [15] introduced two new methods, PSO-V and PSO-U, for minimizing the reformulated objective functions of the FCM clustering model by PSO. In order to overcome the shortcomings of FCM, a PSO-based fuzzy clustering algorithm was discussed [16]; this algorithm uses the global search capacity of PSO to overcome the shortcomings of FCM. For finding more appropriate cluster centers, a generalized FCM optimized by the PSO algorithm was proposed [17].
Shadowed sets are considered a conceptual and algorithmic bridge between rough sets and fuzzy sets, thereby incorporating the generic merits of both, and they have been successfully used for unsupervised learning. Shadowed sets introduce an interval (0, 1) to denote the belongingness of the clustering points, and the uncertainty among patterns lying in the shadowed region is efficiently handled in terms of membership. Thus, in order to disambiguate and capture the essence of a distribution, the concept of shadowed sets has been introduced [18]; it can also raise the efficiency of the iteration process for the new prototypes by eliminating some "bad points" that have a bad influence on the cluster structure [19, 20]. Compared with FCM, the capability of shadowed c-means is enhanced when dealing with outliers [21].
Although many clustering algorithms based on FCM, PSO, or shadowed sets have been proposed, most of them require the preestimated cluster number C as input. To obtain desirable cluster partitions in a given data set, C is commonly set manually, and this is a very subjective and somewhat arbitrary process. A number of approaches have been proposed to select the appropriate C. Bezdek et al. [22] suggested the rule of thumb C ≤ N^{1/2}, where the upper bound must be determined based on knowledge of, or applications involving, the data. Another approach is to use a cluster validity index as a measure criterion of the data partition, such as the Davies-Bouldin (DB) [23], Xie-Beni (XB) [24], and Dunn [25] indices. These indices often follow the principle that the distance between objects in the same cluster should be as small as possible and the distance between objects in different clusters should be as large as possible. They have also been used to acquire the optimal number of clusters C according to their maximum or minimum value.
Therefore, we wish to find the best C in some range, obtain cluster partitions by considering compactness and intercluster separation, and reduce the sensitivity to initial values. Here we propose a modified algorithm, named SP-FCM, which integrates the merits of PSO and interleaves shadowed sets between stabilization iterations. It can automatically estimate the optimal cluster number, with a faster initialization than our previous approach.
The structure of the paper is as follows. Section 2 outlines all necessary prerequisites. In Section 3, a new clustering approach called SP-FCM is presented for automatically finding the optimal cluster number. Section 4 includes the results of experiments involving UCI data sets, yeast gene expression data sets, and a real data set. In Section 5, the main conclusions are covered.
2. Related Clustering Algorithms

In this section, we briefly describe some basic concepts of FCM, PSO, shadowed sets, and the XB validity index, and we review the PSO-based clustering method.
2.1. FCM. We define X = {x_1, ..., x_N} as the universe of a clustering data set, B = {β_1, ..., β_C} as the prototypes of C clusters, and U = [u_{ij}]_{N×C} as a fuzzy partition matrix, where u_{ij} ∈ [0, 1] is the membership of x_i in the cluster with prototype β_j; x_i, β_j ∈ R^P, where P is the data dimensionality, 1 ≤ i ≤ N, and 1 ≤ j ≤ C. The FCM algorithm is derived by minimizing the objective function [22]

\[ J_{\mathrm{FCM}}(U, B, X) = \sum_{j=1}^{C} \sum_{i=1}^{N} u_{ij}^{m}\, d_{ij}^{2}\left(x_i, \beta_j\right), \tag{1} \]
where m > 1.0 is the weighting exponent on each fuzzy membership and d_{ij} is the Euclidean distance from data vector x_i to cluster center β_j, subject to

\[ \sum_{j=1}^{C} u_{ij} = 1, \;\; \forall i = 1, 2, \ldots, N; \qquad 0 < \sum_{i=1}^{N} u_{ij} < N, \;\; \forall j = 1, 2, \ldots, C; \qquad d_{ij} = \left\| x_i - \beta_j \right\|. \tag{2} \]
This produces the following update equations:

\[ u_{ij} = \left( \sum_{k=1}^{C} \left( \frac{d\left(x_i, \beta_j\right)}{d\left(x_i, \beta_k\right)} \right)^{2/(m-1)} \right)^{-1}, \tag{3} \]

\[ \beta_j = \frac{\sum_{i=1}^{N} \left(u_{ij}\right)^{m} x_i}{\sum_{i=1}^{N} \left(u_{ij}\right)^{m}}. \tag{4} \]
After computing the memberships of all the objects, the new prototypes of the clusters are calculated. The process stops when the prototypes stabilize, that is, when the prototypes from the previous iteration are in close proximity to those generated in the current iteration, normally within an error threshold.
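The alternating updates (3) and (4) with a prototype-stabilization stop can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code; the function name `fcm`, the random prototype initialization, and the tolerance values are assumptions.

```python
import numpy as np

def fcm(X, C, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Minimal fuzzy c-means sketch: eq. (3) membership update, eq. (4) prototype update."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    B = X[rng.choice(N, C, replace=False)]          # initial prototypes drawn from the data
    for _ in range(max_iter):
        d = np.linalg.norm(X[:, None, :] - B[None, :, :], axis=2)   # d_ij = ||x_i - beta_j||
        d = np.fmax(d, 1e-12)                        # guard against zero distances
        # eq. (3): u_ij = ( sum_k (d_ij / d_ik)^(2/(m-1)) )^(-1)
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
        Um = U ** m
        B_new = (Um.T @ X) / Um.sum(axis=0)[:, None]  # eq. (4)
        if np.linalg.norm(B_new - B) < tol:           # prototypes stabilize
            B = B_new
            break
        B = B_new
    return U, B
```

Each row of the returned U sums to 1, in line with the constraint in (2).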
2.2. PSO. PSO was originally introduced in terms of the social and cognitive behavior of bird flocking and fish schooling. The potential solutions are called particles, which fly through the problem space by following the current best particles. Each particle keeps track of its coordinates in the problem space, which are associated with the best solution it has achieved so far. This solution is evaluated by the fitness value, which is also stored; this value is called p_best. Another
best value tracked by PSO is the best value obtained so far by any particle in the swarm; this is a global best and is called g_best. The search for better positions follows the rule
\[ V(t+1) = wV(t) + c_1 r_1 \left( p_{\mathrm{best}}(t) - P(t) \right) + c_2 r_2 \left( g_{\mathrm{best}}(t) - P(t) \right), \qquad P(t+1) = P(t) + V(t+1), \tag{5} \]

where P and V are the position and velocity vectors of a particle, respectively; w is the inertia weight; c_1 and c_2 are positive constants, called acceleration coefficients, which control the influence of p_best and g_best in the search process; and r_1 and r_2 are random values in the range [0, 1]. The fitness value of each particle's position is determined by a fitness function, and PSO is usually executed with repeated application of (5) until a specified number of iterations has been exceeded or the velocity updates are close to zero over a number of iterations.
2.3. PSO-Based FCM. In this algorithm [26], each particle Part_l represents a cluster center vector, constructed as

\[ \mathrm{Part}_l = \left( P_{l1}, \ldots, P_{lj}, \ldots, P_{lC} \right), \tag{6} \]

where l denotes the l-th particle, l = 1, 2, ..., L; L is the number of particles, L < N; and P_{lj} is the j-th cluster center of particle Part_l. Therefore a swarm represents a number of candidate cluster centers for the data vectors. Each data vector belongs to a cluster according to its membership function, and thus a fuzzy membership is assigned to each data vector. Each cluster has one cluster center per iteration and presents a solution, which gives a vector of cluster centers. This method determines the position vector Part_l for every particle, updates it, and then changes the position of the cluster centers. The fitness function for evaluating the generalized solutions is stated as

\[ F(P) = \frac{1}{J_{\mathrm{FCM}}}. \tag{7} \]

The smaller J_FCM is, the better the clustering effect and the higher the fitness value F(P).
2.4. Shadowed Sets. Conventional uncertainty models like fuzzy sets tend to capture vagueness solely through membership values and associate precise numeric membership values with vague concepts. By introducing an α-cut [19], a fuzzy set can be converted into a classical set. Shadowed sets map each element of a given fuzzy set into 0, 1, or the unit interval [0, 1], namely, excluded, included, or uncertain, respectively.
For constructing a shadowed set, Mitra et al. [21] proposed an optimization based on the balance of vagueness. By elevating the membership values of some regions to 1 and at the same time reducing the membership values of other regions to 0, the uncertainty in these regions can be eliminated. To keep the balance of the total uncertainty, these changes need to be compensated by the construction of uncertain regions, namely shadows, that absorb the previous elimination of partial membership at the low and high ranges. The shadowed sets induced by a fuzzy membership function are shown in Figure 1.

[Figure 1: Shadowed sets induced by a fuzzy membership function f(x); the thresholds α and 1 − α partition the domain into the reduction region Ω1, the elevation region Ω2, and the shadow Ω3.]
Here x denotes the objects and f(x) ∈ [0, 1] is the continuous membership function of the objects belonging to a cluster. The symbol Ω1 denotes the reduction of membership, Ω2 the elevation of membership, and Ω3 the formation of the shadow. In order to balance the total uncertainty, the retention of balance translates into the dependency

\[ \Omega_1 + \Omega_2 = \Omega_3, \tag{8} \]

whose integral forms are

\[ \Omega_1 = \int_{x : f(x) \le \alpha} f(x)\, dx, \qquad \Omega_2 = \int_{x : f(x) \ge 1 - \alpha} \left( 1 - f(x) \right) dx, \qquad \Omega_3 = \int_{x : \alpha < f(x) < 1 - \alpha} dx. \tag{9} \]

The thresholds of reducing and elevating are α and 1 − α (α ∈ (0, 0.5)). The optimal value of α is acquired by translating the balance into the minimization of the objective function

\[ O(\alpha) = \left| \Omega_1 + \Omega_2 - \Omega_3 \right|. \tag{10} \]
For a fuzzy set with a discrete membership function, the balance equation is modified as

\[ O\left(\alpha_j\right) = \left| \sum_{u_{ij} \le \alpha_j} u_{ij} + \sum_{u_{ij} \ge u_{j\max} - \alpha_j} \left( u_{j\max} - u_{ij} \right) - \operatorname{card}\left\{ u_{ij} \mid \alpha_j < u_{ij} < u_{j\max} - \alpha_j \right\} \right|. \tag{11} \]

In order to find the best α_j, the following optimization problem should be solved:

\[ \alpha_j = \arg\min_{\alpha_j} O\left(\alpha_j\right), \tag{12} \]

where u_{ij} ∈ [0, 1] is the membership of x_i in the cluster with prototype β_j; u_{jmax} and u_{jmin} denote the highest and lowest membership values of the j-th cluster; and α_j is the threshold of the j-th cluster. The range of feasible values of the threshold α_j is [u_{jmin}, (u_{jmin} + u_{jmax})/2] [19].
This approach considers all membership values with respect to a fixed cluster when updating the prototype of that cluster. The main merits of shadowed sets are the optimization mechanism for choosing a separate threshold for each cluster and the reduction of the burden of plain numeric computations.
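The per-column threshold search in (11)-(12) can be illustrated with a simple grid scan over the feasible range [u_min, (u_min + u_max)/2]. The function name `optimal_alpha` and the grid resolution are assumptions; the paper does not specify how the minimization is carried out.

```python
import numpy as np

def optimal_alpha(u, grid=1000):
    """Grid search for the threshold minimizing the discrete balance objective (11).

    u is one column of the partition matrix U (memberships of all objects
    in one cluster)."""
    u_min, u_max = u.min(), u.max()
    best_alpha, best_obj = u_min, np.inf
    for alpha in np.linspace(u_min, (u_min + u_max) / 2, grid):
        reduced  = u[u <= alpha].sum()                         # mass pushed down to 0
        elevated = (u_max - u[u >= u_max - alpha]).sum()       # mass pushed up to u_max
        shadow   = np.count_nonzero((u > alpha) & (u < u_max - alpha))
        obj = abs(reduced + elevated - shadow)                 # eq. (11)
        if obj < best_obj:
            best_alpha, best_obj = alpha, obj
    return best_alpha
```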
2.5. XB Clustering Validity Index. The clustering algorithms described above require prespecification of the number of clusters, and the partition results depend on the choice of C. Validity indices exist to evaluate the goodness of clustering for a given number of clusters; these indices can therefore be used to acquire the optimal value of C [27].

The XB index presents a fuzzy-validity criterion based on a validity function which identifies overall compact and separate fuzzy c-partitions. This function depends upon the data set, the geometric distance measure, the distance between cluster centroids, and the fuzzy partition, irrespective of the fuzzy algorithm used. For evaluating the goodness of a data partition, both cluster compactness and intercluster separation should be taken into account. For the FCM algorithm with m = 2.0, the Xie-Beni index can be shown to be

\[ S_{\mathrm{XB}} = \frac{J_{\mathrm{FCM}}}{N\, d_{\min}^{2}}, \tag{13} \]

where d_min = min_{i≠j} ||β_i − β_j|| is the minimum distance between cluster centroids. The more separated the clusters, the larger d_min and the smaller S_XB.
3. Shadowed Sets-Based PSO-Fuzzy Clustering: SP-FCM

FCM strives to find C compact clusters in X, where C is one of the specified parameters. But the process of selecting and adjusting C manually to obtain desirable cluster partitions in a given data set is very subjective and somewhat arbitrary. To seek the optimal cluster structure, C is always allowed to be overestimated [28]; in that case the distances between some clusters are not large enough, or the membership values of some objects in different clusters are adjacent and ambiguous, and the modification of prototypes through long iteration is then meaningless.
The main subject of cluster validation is the evaluation of clustering results to find the partition that best fits the data set. Based on the foregoing algorithms, we wish to find cluster partitions that contain compact and well-separated clusters. In our algorithm C is also overestimated, and the clusters compete for data membership. We can set [C_min, C_max] as the reasonable range of cluster number based on knowledge of the data. This provides a more transparent and tractable process of cluster number reduction. Considering the fuzzy partition matrix U = [u_{ij}]_{N×C}, each column comprises the membership values of all feature vectors x_i with a single cluster center. Thus an optimal threshold α_j (j = 1, 2, ..., C) for each column should be found to create a harder partition by (12). The amount of data assigned a membership value equal to 1 is identified as the cardinality of the corresponding cluster. According to α_j, the cardinality of the j-th column is

\[ M_j = \operatorname{card}\left\{ u_{ij} \mid u_{ij} \ge u_{j\max} - \alpha_j \right\}. \tag{14} \]
Here the threshold is not subjectively user-defined; it is established on the balance of uncertainty and can be adjusted automatically in the clustering process. This property of shadowed sets can be used to reduce the cluster number. In order to control the convergence speed, the decision to delete clusters can be based on thresholds, and different threshold values should be set for different data sets depending on the cluster structure and the size of the data set. Here a threshold ε and an attrition rate ρ (0 < ρ < 1) are set. The decision to delete clusters in SP-FCM is based solely on cluster cardinality and the threshold ε. If ε is too small, C is reduced more slowly and the reduction may stop prematurely before the optimal cluster number is found; on the other hand, if ε is too large, C may be reduced too drastically. In our method, clusters whose cardinality M_j < ε are considered "candidates" for removal, and we can remove up to ⌊ρ × C⌋ clusters having the lowest cardinality from the pool of candidates specified by ε. Limiting the number of clusters that can be removed at one time prevents C from being reduced too drastically when ε is set too high for a given data set. This automatically estimates the best cluster number while also using a faster, consistent, and repeatable initialization technique. For evaluating the goodness of the data partition, both cluster compactness and intercluster separation should be taken into account; hence the XB index is adopted.
For each C in the range [C_min, C_max], a set of cluster validity indices is calculated, where C_max is the initial cluster number, set to be much larger than the expected cluster number. The partition matrix with the C clusters having the best aggregate validity index is selected as the final cluster partition. The SP-FCM algorithm is summarized in Algorithm 1.
(1) Initialize C_max and C_min; let C = C_max; set the fuzzifier m, the iteration counter k = 0, the iteration counter t = 0, and the maximum iteration number T of PSO.
(2) Initialize the population size L, the initial velocities of the particles, the initial positions of the particles, c1, c2, w, the threshold ε, and the attrition rate ρ.
(3) DO
    Repeat
        (a) Update the partition matrix U(k) for all particles by (3).
        (b) Calculate the cluster centers for each particle by (4).
        (c) Calculate the fitness value of each particle by (7).
        (d) Calculate p_best for each particle.
        (e) Calculate g_best for the swarm.
        (f) Update the velocity and position of each particle by (5).
        (g) t = t + 1.
    Until the PSO termination condition is met (*).
    (i) Calculate the optimal threshold α_j (j = 1, 2, ..., C) for each column of the partition matrix U(k) by (12), and relocate u_ij (1 ≤ i ≤ N) of the j-th cluster according to α_j.
    (ii) Calculate the cardinality M_j (1 ≤ j ≤ C) of each cluster, on the basis of the number of data whose membership value equals 1, by (14).
    (iii) Remove all clusters whose M_j < ε and whose M_j is among the ⌊ρ × C⌋ lowest cardinalities.
    (iv) Update the cluster number C.
    (v) Calculate the cluster validity index S_XB(k) by (13).
    (vi) Update the iteration counter k = k + 1.
    WHILE the termination condition is not met (**).
(*) The PSO termination condition is t ≥ T (the maximum number of iterations is reached) or the velocity updates being close to zero over a number of iterations.
(**) The algorithm can terminate under either of the following two conditions: (1) the prototype parameters in B stabilize within some threshold δ; (2) the number of clusters has reached the minimum limit C_min.

Algorithm 1: SP-FCM.
Here, if ⌊ρ × C⌋ is equal to 0, we let it be 1, which means that the cluster with the lowest cardinality may be removed. The initial C_max cluster prototypes can be initialized using exemplars from the data points, selected by β_j = x_{⌊(N/C_max) j⌋}. After termination, the B and U from the C ∈ [C_min, C_max] with the best cluster validity index S_XB are selected as the final cluster prototypes and partition.
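The cluster-reduction step (iii) of Algorithm 1, including the ⌊ρ × C⌋ floor-to-1 rule just described, can be sketched as follows. The helper name `prune_clusters` is an assumption for illustration; it takes the cardinalities M_j from (14) and returns the indices of the clusters to delete.

```python
import math

def prune_clusters(cardinalities, eps=10, rho=0.1):
    """Step (iii) of SP-FCM: pick at most max(1, floor(rho * C)) clusters
    with cardinality below eps, lowest cardinality first."""
    C = len(cardinalities)
    limit = max(1, math.floor(rho * C))          # floor(rho * C), raised to 1 if it is 0
    candidates = [j for j, m_j in enumerate(cardinalities) if m_j < eps]
    candidates.sort(key=lambda j: cardinalities[j])
    return set(candidates[:limit])               # indices of clusters to remove
```

For example, with cardinalities [25, 3, 14, 7, 40] and the defaults, only the single lowest-cardinality candidate (index 1) is removed, since ⌊0.1 × 5⌋ = 0 is raised to 1.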
4. Experimental Results

In this section, the performance of the FCM, RCM, shadowed c-means (SCM) [21], shadowed rough c-means (SRCM) [19], and SP-FCM algorithms is presented on four UCI data sets, four yeast gene expression data sets, and real data. For evaluating the convergence effect, the fundamental criterion can be described as follows: the distance between objects in the same cluster should be as small as possible, and the distance between objects in different clusters should be as large as possible. Here we use the DB index and the Dunn index to evaluate the clustering effect. For a given data set and C value, the higher the similarity within the clusters and the intercluster separation, the lower the DB index value; a good clustering procedure should make the DB index as low as possible. Conversely, higher values of the Dunn index indicate better clustering, in the sense that the clusters are well separated and relatively compact. The details of the experiments are given below.

4.1. UCI Data Sets. In our experiments, four UCI data sets are used: the 4-dimensional Iris, 13-dimensional Wine, 10-dimensional Glass, and 34-dimensional Ionosphere. There are 3 clusters in the Iris data set, each of which has 50 data patterns; 3 clusters in the Wine data set, which have 50, 60, and 68 data patterns; 6 clusters in the Glass data set, which have 30, 35, 40, 42, 36, and 31 patterns, respectively; and 2 clusters in the Ionosphere data set, which have 226 and 125 data patterns. The validity indices of each method are compared in Table 1. SP-FCM can identify compact groups compared to the other algorithms when given the cluster number C. It can also be seen that SRCM and SP-FCM have more obvious advantages than FCM, RCM, and SCM. SP-FCM performs slightly better than SRCM in most cases due to its global search ability, which enables it to converge to an optimum or near-optimum solution. Moreover, the shadowed set- and rough set-based clustering methods, namely SP-FCM, SRCM, RCM, and SCM, perform better than FCM. This implies that the partition into approximation regions can reveal the nature of the data structure and that only the lower bound and boundary region of each cluster contribute positively to the process of updating the prototypes.
Table 1: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four UCI data sets.

Index      | Algorithm | Iris   | Wine   | Glass  | Ionosphere
DB index   | FCM       | 0.7642 | 0.8803 | 2.2971 | 2.0587
           | RCM       | 0.6875 | 0.5692 | 1.9635 | 1.5434
           | SCM       | 0.6862 | 0.5327 | 1.8495 | 1.4763
           | SRCM      | 0.6613 | 0.4436 | 1.5804 | 1.3971
           | SP-FCM    | 0.6574 | 0.4328 | 1.5237 | 1.4066
Dunn index | FCM       | 2.3106 | 2.5834 | 0.1142 | 0.8381
           | RCM       | 2.7119 | 2.8157 | 0.2637 | 1.0233
           | SCM       | 2.4801 | 2.7992 | 0.3150 | 1.0319
           | SRCM      | 3.0874 | 3.1342 | 0.5108 | 1.1924
           | SP-FCM    | 3.3254 | 3.1764 | 0.4921 | 1.2605
[Figure 2: XB validity index of the four UCI data sets versus cluster number C; panels (a) Iris (C from 12 down to 2), (b) Wine (13 down to 2), (c) Glass (14 down to 2), and (d) Ionosphere (16 down to 2); y-axis: Xie-Beni validity.]
As usual, the number of clusters is implied by the nature of the problem; here, with the shadowed sets involved, one can anticipate that the optimal number of clusters will be found. The fuzzification coefficient m can be optimized; however, it is common to assume a fixed value of 2.0, which is associated with the form of the membership functions of the generated clusters. For testing the SP-FCM algorithm, the rule C ≤ N^(1/2) is adopted, and the range of the expected cluster number is set as (1) Iris: [Cmin = 2, Cmax = 12]; (2) Wine: [Cmin = 2, Cmax = 13]; (3) Glass: [Cmin = 2, Cmax = 14]; (4) Ionosphere: [Cmin = 2, Cmax = 16]. The swarm size is set as L = 20, the maximum iteration number of PSO is T = 50, and for cluster reduction the cluster cardinality threshold is ε = 10 and the attrition rate is ρ = 0.1. In each cycle we obtain the distribution of every cluster, remove part of the clusters according to their cardinality, and calculate the XB index, while the cluster number C varies from Cmax to Cmin. After the loop ends, the partition with the lowest index value is selected as the final result. Figure 2 presents the validity indices in the process of generating the optimal cluster number. Smaller
Table 2: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four yeast gene expression data sets.

Index        Algorithm    GDS608    GDS2003   GDS2267   GDS2712
DB index     FCM          2.0861    2.4671    1.5916    1.9526
             RCM          1.6109    2.2104    1.0274    1.2058
             SCM          1.5938    2.1346    0.8946    1.0965
             SRCM         1.3274    1.9523    0.7438    0.7326
             SP-FCM       1.2958    1.8946    0.7962    0.6843
Dunn index   FCM          0.2647    0.2976    0.4208    0.3519
             RCM          0.3789    0.3981    0.7164    0.6074
             SCM          0.3865    0.3775    0.8439    0.6207
             SRCM         0.5126    0.4953    0.9759    0.8113
             SP-FCM       0.5407    0.5026    0.9182    0.8049
values indicate more compact and well-separated clusters. The validity indices reach their minimum values at C = 3, 3, 6, and 2, respectively, which correspond to the final cluster prototypes and the best partitions. Through the shadowed sets and PSO approaches, the influence of each boundary region on the formation of the prototypes and the clusters can be properly resolved. Although more computing time is required to run SP-FCM, reasonable results can be acquired for processing overlapping and vague data patterns.
4.2 Yeast Gene Expression Data Sets. Four yeast gene expression data sets are used in the experiments: GDS608, GDS2003, GDS2267, and GDS2712, downloaded from the Gene Expression Omnibus. GDS608 has 26 classes and 6303 samples; GDS2003 has 23 classes and 5617 samples; GDS2267 has 14 classes and 9275 samples; and GDS2712 has 15 classes and 9275 samples. Table 2 presents the validity indices of the different methods with the cluster number C given. SP-FCM and SRCM achieve a similar effect and perform better than the other clustering algorithms. The improvement can be attributed to the fact that the global search capacity of PSO is conducive to finding more appropriate cluster centers while escaping from local optima.
To obtain the optimum C automatically, we let m = 2.0, c1 = 1.49, c2 = 1.49, and w = 0.72, and the rule C ≤ N^(1/2) is adopted. The swarm size is set as L = 20, the maximum iteration number of PSO is T = 80, and for cluster reduction the range of the expected cluster number, the cluster cardinality threshold ε, and the attrition rate ρ are set as (1) GDS608: [Cmin = 20, Cmax = 80], ε = 20, ρ = 0.05; (2) GDS2003: [Cmin = 20, Cmax = 75], ε = 20, ρ = 0.05; (3) GDS2267: [Cmin = 10, Cmax = 96], ε = 20, ρ = 0.08; (4) GDS2712: [Cmin = 10, Cmax = 96], ε = 20, ρ = 0.08. In each cycle we obtain the distribution of every cluster, remove part of the clusters according to their cardinality, and calculate the XB index, while the cluster number C varies from Cmax to Cmin. The partition with the lowest value is selected as the
final result after the loop ends. As seen in Figure 3, for GDS608 the cluster number decreases at a faster rate at the beginning: it takes 26 iterations to reduce the cluster number from C = 80 to C = 30 and 4 further iterations from C = 30 to C = 26, and the XB index begins to increase when the cluster number C < 26. For GDS2003 it takes 24 iterations to reduce the cluster number from C = 75 to C = 30 and 7 iterations from C = 30 to C = 23, and the XB index begins to increase when C < 23. For GDS2267 it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 6 iterations from C = 20 to C = 14, and the XB index begins to increase when C < 14. For GDS2712 it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 5 iterations from C = 20 to C = 15, and the XB index begins to increase when C < 15. Here the advantages of fuzzy sets, PSO, and shadowed sets are integrated in SP-FCM, making the algorithm applicable to overlapping partitions and to the uncertainty and vagueness arising from the boundary regions. The optimization process in the shadowed sets makes the method robust to outliers, so that the approximation regions of each cluster can be determined accurately and the obtained prototypes approach the desired locations.
4.3 Real Data. In this experiment 10 different packages are tested. Each package is represented by 100 frames captured from different angles by a camera, and SIFT feature points are extracted from each frame for training a recognition system. Figure 4 shows some of the images with their SIFT keypoints. This data set comprises 248,150 descriptors. We let m = 2.0, c1 = 1.49, c2 = 1.49, w = 0.72, L = 20, ε = 30, and ρ = 0.01 for SP-FCM, and choose the reasonable range [Cmin = 200, Cmax = 360] according to the number of package categories and the distribution of keypoints in each image. Eighty iterations of PSO are run for each given C to produce the cluster prototypes B and the partition matrix U as the starting point for the shadowed sets; longer PSO stabilization is needed to obtain more stable cluster partitions.
Figure 3: XB validity index of the four yeast gene expression data sets with cluster number C ((a) GDS608; (b) GDS2003; (c) GDS2267; (d) GDS2712).
Table 3: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on the package data set.

Index        FCM        RCM        SCM        SRCM       SP-FCM
DB index     18.4569    15.9671    14.3194    12.4038    10.7826
Dunn index    9.2647    11.6298    12.5376    16.9422    16.7313
Within each cluster, the optimal α_j decides the cardinality and realizes cluster reduction, and the XB index is calculated. Each C-partition is ranked using this index, and the partition with the smallest index value, indicating the most compact and well-separated clusters, is selected as the final output. At the beginning the cluster number decreases at a faster speed: it takes 26 iterations to reduce the cluster number from C = 360 to C = 289 and 20 iterations from C = 289 to C = 267. The XB index increases at a relatively faster rate when the cluster number C < 267. Figure 5 shows the XB index for C ∈ [267, 289]. The index reaches its minimum value at C = 276, which means that the best partition for this data set has 276 clusters. Table 3 exhibits the comparative analysis of the convergence effect. As expected, SP-FCM provides sound results for the real data; the performance is assessed by the validity indices.
5 Conclusions
This paper presents a modified fuzzy c-means algorithm based on particle swarm optimization and shadowed sets to perform unsupervised feature clustering. The algorithm, called SP-FCM, utilizes the global search property of PSO and the vagueness balance property of shadowed sets, such that it can estimate the optimal cluster number as it runs through its alternating optimization process. As a randomized approach, SP-FCM can alleviate the problems faced by FCM, namely its sensitivity to initialization and its tendency to fall into local minima. Moreover, the algorithm avoids the subjective and somewhat arbitrary trials otherwise needed to estimate the appropriate cluster number, and it finds the optimal cluster number within a specific
Figure 4: Ten package images with SIFT features ((a) 644, (b) 460, (c) 742, (d) 442, (e) 313, (f) 602, (g) 373, (h) 724, (i) 539, and (j) 124 features).
Figure 5: XB validity index of the package data set with cluster number C (C from 267 to 289).
range, using cluster validity measures as indicators. The use of the XB validity index allows the algorithm to find the optimum cluster number, with cluster partitions that provide compact and well-separated clusters. The experiments show that the SP-FCM algorithm produces good results with respect to the DB and Dunn indices, especially in the high-dimensional and large-data cases.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research was supported by the National Natural Science Foundation of China (Grant no. 61105089, NSFC).
References
[1] A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[2] H.-P. Kriegel, P. Kröger, and A. Zimek, "Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering," ACM Transactions on Knowledge Discovery from Data, vol. 3, no. 1, article 1, 2009.
[3] P. Melin and O. Castillo, "A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition," Applied Soft Computing Journal, vol. 21, pp. 568–577, 2014.
[4] M. Yuwono, S. W. Su, B. D. Moulton, and H. T. Nguyen, "Data clustering using variants of rapid centroid estimation," IEEE Transactions on Evolutionary Computation, vol. 18, no. 3, pp. 366–377, 2014.
[5] S. C. Satapathy, G. Pradhan, S. Pattnaik, J. V. R. Murthy, and P. V. G. D. P. Reddy, "Performance comparisons of PSO based clustering," InterJRI Computer Science and Networking, vol. 1, no. 1, pp. 18–23, 2009.
[6] G. Peters and R. Weber, "Dynamic clustering with soft computing," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 3, pp. 226–236, 2012.
[7] D. T. Anderson, J. C. Bezdek, M. Popescu, and J. M. Keller, "Comparing fuzzy, probabilistic, and possibilistic partitions," IEEE Transactions on Fuzzy Systems, vol. 18, no. 5, pp. 906–918, 2010.
[8] P. Maji and S. Paul, "Robust rough-fuzzy C-means algorithm: design and applications in coding and non-coding RNA expression data clustering," Fundamenta Informaticae, vol. 124, no. 1-2, pp. 153–174, 2013.
[9] J. Bezdek, Fuzzy mathematics in pattern classification [Ph.D. thesis], Cornell University, New York, NY, USA, 1974.
[10] V. Olman, F. Mao, H. Wu, and Y. Xu, "Parallel clustering algorithm for large data sets with applications in bioinformatics," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 344–352, 2009.
[11] J. C. Bezdek and R. J. Hathaway, "Optimization of fuzzy clustering criteria using genetic algorithms," in Proceedings of the 1st IEEE Conference on Evolutionary Computation, pp. 589–594, June 1994.
[12] T. A. Runkler, "Ant colony optimization of clustering models," International Journal of Intelligent Systems, vol. 20, no. 12, pp. 1233–1251, 2005.
[13] K. S. Al-Sultan and S. Z. Selim, "A global algorithm for the fuzzy clustering problem," Pattern Recognition, vol. 26, no. 9, pp. 1357–1361, 1993.
[14] R. Eberhart and J. Kennedy, "New optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, Nagoya, Japan, October 1995.
[15] T. A. Runkler and C. Katz, "Fuzzy clustering by particle swarm optimization," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 601–608, July 2006.
[16] C.-F. Juang, C.-M. Hsiao, and C.-H. Hsu, "Hierarchical cluster-based multispecies particle-swarm optimization for fuzzy-system optimization," IEEE Transactions on Fuzzy Systems, vol. 18, no. 1, pp. 14–26, 2010.
[17] H. Izakian and A. Abraham, "Fuzzy C-means and fuzzy swarm for fuzzy clustering problem," Expert Systems with Applications, vol. 38, no. 3, pp. 1835–1838, 2011.
[18] W. Pedrycz, "Shadowed sets: representing and processing fuzzy sets," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 28, no. 1, pp. 103–109, 1998.
[19] J. Zhou, W. Pedrycz, and D. Miao, "Shadowed sets in the characterization of rough-fuzzy clustering," Pattern Recognition, vol. 44, no. 8, pp. 1738–1749, 2011.
[20] W. Pedrycz, "From fuzzy sets to shadowed sets: interpretation and computing," International Journal of Intelligent Systems, vol. 24, no. 1, pp. 48–61, 2009.
[21] S. Mitra, W. Pedrycz, and B. Barman, "Shadowed c-means: integrating fuzzy and rough clustering," Pattern Recognition, vol. 43, no. 4, pp. 1282–1291, 2010.
[22] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, vol. 4 of The Handbooks of Fuzzy Sets, Springer, New York, NY, USA, 1999.
[23] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224–227, 1978.
[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
[25] J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 3, pp. 301–315, 1998.
[26] S. C. Satapathy, S. K. Patnaik, C. D. P. Dash, and S. Sahoo, "Data clustering using modified fuzzy-PSO (MFPSO)," in Proceedings of the 5th Multi-Disciplinary International Workshop on Artificial Intelligence, Hyderabad, India, 2011.
[27] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[28] R. Krishnapuram and J. M. Keller, "The possibilistic C-means algorithm: insights and recommendations," IEEE Transactions on Fuzzy Systems, vol. 4, no. 3, pp. 385–393, 1996.
best value that is tracked by the PSO is the best value obtained so far by any particle in the swarm; this best value is a global best and is called gbest. The search for better positions follows the rule

V(t + 1) = wV(t) + c1 r1 (pbest(t) − P(t)) + c2 r2 (gbest(t) − P(t)),
P(t + 1) = P(t) + V(t + 1),   (5)

where P and V are the position and velocity vectors of a particle, respectively; w is the inertia weight; c1 and c2 are positive constants, called acceleration coefficients, which control the influence of pbest and gbest in the search process; and r1 and r2 are random values in the range [0, 1]. The fitness value of each particle's position is determined by a fitness function, and PSO is usually executed with repeated application of (5) until a specified number of iterations has been exceeded or the velocity updates are close to zero over a number of iterations.
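One update step of (5) can be sketched as follows (a minimal NumPy sketch with our own function and argument names; the source does not prescribe an implementation, and the default w, c1, c2 are the values used later in the experiments):

```python
import numpy as np

def pso_step(P, V, pbest, gbest, w=0.72, c1=1.49, c2=1.49, rng=None):
    """One velocity/position update of (5) for a whole swarm.

    P, V, pbest have shape (L, D); gbest has shape (D,).
    Returns the new positions and velocities."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(P.shape)   # fresh random factors in [0, 1]
    r2 = rng.random(P.shape)
    V_new = w * V + c1 * r1 * (pbest - P) + c2 * r2 * (gbest - P)
    return P + V_new, V_new
```

Note that a particle sitting exactly at both its personal best and the global best with zero velocity stays put, as (5) requires.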
2.3 PSO-Based FCM. In this algorithm [26] each particle Part_l represents a cluster center vector, constructed as

Part_l = (P_l1, ..., P_lj, ..., P_lC),   (6)

where l = 1, 2, ..., L indexes the particles, L is the number of particles (L < N), and P_lj is the jth cluster center of particle Part_l. Therefore a swarm represents a number of candidate cluster centers for the data vectors. Each data vector belongs to a cluster according to its membership function, and thus a fuzzy membership is assigned to each data vector. Each cluster has a cluster center per iteration and presents a solution, which gives a vector of cluster centers. This method determines the position vector Part_l for every particle, updates it, and then changes the position of the cluster center. The fitness function for evaluating the generalized solutions is stated as

F(P) = 1 / J_FCM.   (7)

The smaller J_FCM is, the better the clustering effect and the higher the fitness function F(P).
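The fitness evaluation (7) can be sketched as follows. This is a minimal sketch under the assumption that (3) is the standard FCM membership update u_ij = 1 / Σ_k (d_ij/d_ik)^(2/(m−1)), which is not reproduced in this excerpt; the function names are our own:

```python
import numpy as np

def fcm_objective(X, centers, m=2.0):
    """J_FCM = sum_ij u_ij^m ||x_i - beta_j||^2, using the standard FCM
    membership update for u_ij (assumed to be what eq. (3) prescribes)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (N, C)
    d2 = np.fmax(d2, 1e-12)          # guard against coincident points
    inv = d2 ** (-1.0 / (m - 1.0))
    u = inv / inv.sum(axis=1, keepdims=True)   # rows sum to 1
    return (u ** m * d2).sum(), u

def fitness(X, centers, m=2.0):
    """Particle fitness per (7): the reciprocal of J_FCM."""
    J, _ = fcm_objective(X, centers, m)
    return 1.0 / J
```

In the swarm, each particle's center vector would be scored this way, and pbest/gbest are the positions with the highest fitness seen so far.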
2.4 Shadowed Sets. Conventional uncertainty models like fuzzy sets tend to capture vagueness through membership values and associate precise numeric values of membership with vague concepts. By introducing an α-cut [19], a fuzzy set can be converted into a classical set. Shadowed sets map each element of a given fuzzy set into 0, 1, or the unit interval [0, 1], namely excluded, included, and uncertain, respectively.

For constructing a shadowed set, Mitra et al. [21] proposed an optimization based on the balance of vagueness. By elevating the membership values of some regions to 1 and at the same time reducing the membership values of other regions to 0, the uncertainty in these regions can be eliminated. To keep the balance of the total uncertainty, these changes need to be compensated by the construction of uncertain regions, namely shadows, that absorb the previous elimination of partial membership at the low and high ranges. Shadowed sets induced by a fuzzy membership function are shown in Figure 1.

Figure 1: Shadowed sets induced by a fuzzy membership function f(x); the thresholds α and 1 − α delimit the regions Ω1, Ω2, and Ω3.
Here x denotes the objects and f(x) ∈ [0, 1] is the continuous membership function of the objects belonging to a cluster. The symbol Ω1 denotes the reduction of membership, Ω2 the elevation of membership, and Ω3 the formation of shadows. In order to balance the total uncertainty, the retention of balance translates into the following dependency:

Ω1 + Ω2 = Ω3.   (8)

The integral forms are given as

Ω1 = ∫_{x : f(x) ≤ α} f(x) dx,
Ω2 = ∫_{x : f(x) ≥ 1−α} (1 − f(x)) dx,
Ω3 = ∫_{x : α < f(x) < 1−α} dx.   (9)

The thresholds for reducing and elevating are α and 1 − α, with α ∈ (0, 0.5). The optimal value of α is acquired by translating the balance requirement into the minimization of the following objective function:

O(α) = |Ω1 + Ω2 − Ω3|.   (10)
For a fuzzy set with a discrete membership function, the balance equation is modified as

O(α_j) = | Σ_{u_ij ≤ α_j} u_ij + Σ_{u_ij ≥ u_jmax − α_j} (u_jmax − u_ij) − card{u_ij | α_j < u_ij < u_jmax − α_j} |.   (11)

In order to find the best α_j, the following optimization problem should be solved:

α_j = arg min_{α_j} O(α_j),   (12)

where u_ij ∈ [0, 1] is the membership of x_i in the cluster with prototype β_j, u_jmax and u_jmin denote the highest and lowest membership values of the jth cluster, and α_j is the threshold of the jth cluster. The range of feasible values of the threshold α_j is [u_jmin, (u_jmin + u_jmax)/2] [19].
This approach considers all membership values with respect to a fixed cluster when updating the prototype of that cluster. The main merits of shadowed sets are the optimization mechanism for choosing a separate threshold for each cluster and the reduction of the burden of plain numeric computations.
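A direct way to solve (11)-(12) is a grid search over the feasible threshold range. The following sketch (our own minimal implementation; the step count is a hypothetical choice, and the source does not specify a solver) handles one column of the partition matrix:

```python
import numpy as np

def optimal_threshold(u_col, steps=200):
    """Find alpha_j minimizing the discrete balance objective (11) for one
    column u_col of the partition matrix (memberships of all objects in
    cluster j), searching the feasible range [u_min, (u_min + u_max)/2]."""
    u_min, u_max = u_col.min(), u_col.max()
    best_alpha, best_obj = u_min, np.inf
    for alpha in np.linspace(u_min, (u_min + u_max) / 2, steps):
        reduced = u_col[u_col <= alpha].sum()                    # lowered to 0
        elevated = (u_max - u_col[u_col >= u_max - alpha]).sum() # raised to u_max
        shadow = np.count_nonzero((u_col > alpha) & (u_col < u_max - alpha))
        obj = abs(reduced + elevated - shadow)
        if obj < best_obj:
            best_alpha, best_obj = alpha, obj
    return best_alpha
```

Applying this per column yields the C thresholds α_1, ..., α_C used below for cluster cardinalities.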
2.5 XB Clustering Validity Index. The clustering algorithms described above require prespecification of the number of clusters, and the partition results depend on the choice of C. Validity indices exist to evaluate the goodness of clustering for a given number of clusters; therefore these indices can be used to acquire the optimal value of C [27].

The XB index presents a fuzzy-validity criterion based on a validity function which identifies overall compact and separate fuzzy c-partitions. This function depends upon the data set, the geometric distance measure, the distance between cluster centroids, and the fuzzy partition, irrespective of the fuzzy algorithm used. For evaluating the goodness of a data partition, both cluster compactness and intercluster separation should be taken into account. For the FCM algorithm with m = 2.0, the Xie-Beni index can be written as

S_XB = J_FCM / (N d²_min),   (13)

where d_min = min_{i≠j} ||β_i − β_j|| is the minimum distance between cluster centroids. The more separate the clusters, the larger d_min and the smaller S_XB.
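Equation (13) can be computed directly once J_FCM is available; a minimal sketch (recomputing the standard FCM memberships inline, an assumption as in the fitness discussion above):

```python
import numpy as np

def xb_index(X, centers, m=2.0):
    """Xie-Beni index S_XB = J_FCM / (N * d_min^2); lower is better."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (N, C)
    d2 = np.fmax(d2, 1e-12)
    inv = d2 ** (-1.0 / (m - 1.0))
    u = inv / inv.sum(axis=1, keepdims=True)   # fuzzy memberships
    J = (u ** m * d2).sum()                    # J_FCM
    C = len(centers)
    dmin2 = min(((centers[i] - centers[j]) ** 2).sum()
                for i in range(C) for j in range(C) if i != j)
    return J / (len(X) * dmin2)
```

In SP-FCM this value is recorded for every candidate C, and the partition with the smallest index is kept.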
3 Shadowed Sets-Based PSO-Fuzzy Clustering: SP-FCM
FCM strives to find C compact clusters in X, where C is one of the specified parameters. But the process of selecting and adjusting C manually to obtain desirable cluster partitions in a given data set is very subjective and somewhat arbitrary. To seek the optimal cluster structure, C is always allowed to be overestimated [28], such that the distances between some clusters are not big enough, or the membership values of some objects in different clusters are adjacent and ambiguous in the given data set; in this case the modification of prototypes through long iteration is meaningless.
The main subject of cluster validation is the evaluation of clustering results to find the partitioning that best fits the data set. Based on the foregoing algorithms, we wish to find cluster partitions that contain compact and well-separated clusters. In our algorithm C is also overestimated and the clusters compete for data membership. We can set [Cmin, Cmax] as the reasonable range of the cluster number based on knowledge of the data. This provides a more transparent and tractable process of cluster number reduction. Considering the fuzzy partition matrix U = [u_ij]_{N×C}, each column comprises the membership values of all feature vectors x_i with a single cluster center. Thus an optimal threshold α_j (j = 1, 2, ..., C) for each column should be found by (12) to create a harder partition. The amount of data assigned a membership value equal to 1 is identified as the cardinality of the corresponding cluster. According to α_j, the cardinality of the jth column is

M_j = card{u_ij | u_ij ≥ u_jmax − α_j}.   (14)
Here the threshold is not subjectively user-defined; it is established on the balance of uncertainty and can be adjusted automatically in the clustering process. This property of shadowed sets can be used to reduce the cluster number. In order to control the convergence speed, the decision to delete clusters can be based on some thresholds, and different threshold values should be set for different data sets depending on the cluster structure and the size of the data set. Here a threshold ε and an attrition rate ρ (0 < ρ < 1) are set. The decision to delete clusters in SP-FCM is based solely on cluster cardinality and the threshold ε. If ε is too small, C is reduced more slowly and the reduction may stop prematurely before the optimal cluster number is found; on the other hand, if ε is too large, C may be reduced too drastically. In our method, clusters whose cardinalities satisfy M_j < ε are considered as "candidates" for removal, and we remove up to ⌊ρ × C⌋ clusters having the lowest cardinality from the pool of candidates specified by ε. Limiting the number of clusters that can be removed at one time prevents C from being reduced too drastically when ε is set too high for a given data set. This automatically estimates the best cluster number while also utilizing a faster, consistent, and repeatable initialization technique. For evaluating the goodness of the data partition, both cluster compactness and intercluster separation should be taken into account; hence the XB index is adopted.
For each C in the range [Cmin, Cmax], a set of cluster validity indices is calculated, where Cmax is the initial cluster number, set to be much larger than the expected one. The partition matrix with C clusters achieving the best aggregate validity index is selected as the final cluster partition. The SP-FCM algorithm is summarized in Algorithm 1.
(1) Initialize Cmax and Cmin; let C = Cmax. Set the real number m, the iteration counters k = 0 and t = 0, and the maximum iteration number T of PSO.
(2) Initialize the population size L, the initial velocity and position of each particle, c1, c2, w, the threshold ε, and the attrition rate ρ.
(3) DO
      Repeat
        (a) Update the partition matrix U(k) for all particles by (3).
        (b) Calculate the cluster center for each particle by (4).
        (c) Calculate the fitness value of each particle by (7).
        (d) Update pbest for each particle.
        (e) Update gbest for the swarm.
        (f) Update the velocity and position of each particle by (5).
        (g) t = t + 1.
      Until the PSO termination condition is met (*)
      (i) Calculate the optimal threshold α_j (j = 1, 2, ..., C) for each column of the partition matrix U(k) by (12), and relocate u_ij (1 ≤ i ≤ N) of the jth cluster according to α_j.
      (ii) Calculate the cardinality M_j (1 ≤ j ≤ C) of each cluster by (14), on the basis of the number of data whose membership value equals 1.
      (iii) Remove all clusters for which M_j < ε and M_j is among the ⌊ρ × C⌋ lowest cardinalities.
      (iv) Update the cluster number C.
      (v) Calculate the cluster validity index S_XB(k) by (13).
      (vi) Update the iteration counter k = k + 1.
    WHILE the termination condition is not met (**)
(*) The termination condition of PSO in this method is t ≥ T (the maximum number of iterations is reached) or the velocity updates being close to zero over a number of iterations.
(**) The algorithm can terminate under either of the following two conditions: (1) the prototype parameters in B stabilize within some threshold δ; (2) the number of clusters has reached the minimum limit Cmin.

Algorithm 1: SP-FCM.
Here, if ⌊ρ × C⌋ equals 0, we let it be 1, which means that the cluster with the lowest cardinality may still be removed. The initial Cmax cluster prototypes can be initialized using exemplars from the data points, selected as β_j = x_⌊(N/Cmax)j⌋. After termination, the B and U from the C ∈ [Cmin, Cmax] with the best cluster validity index S_XB are selected as the final cluster prototypes and partition.
4 Experimental Results
In this section the performance of FCM RCM shadowed 119888-means (SCM) [21] shadowed rough 119888-means (SRCM) [19]and SP-FCM algorithms is presented on four UCI datasetsfour yeast gene expression datasets and real data Forevaluating the convergence effect the fundamental criterioncan be described as follows the distance between differentobjects in the same cluster should be as close as possible thedistance between different objects in different cluster shouldbe as far as possible Here we use DB index and Dunn indexto evaluate the clustering effect For a given data set and 119862value the higher the similarity values within the clusters andthe intercluster separation the lower the DB index value Agood clustering procedure shouldmake the value ofDB indexas low as possible Reversely higher values of the Dunn indexindicate better clustering in the sense that the clusters are well
separated and relatively compact The details of experimentsare mentioned below
41 UCI Data Set In our experiments totally four UCI datasets are used including 4-dimensional Iris 13-dimensionalWine 10-dimensional Glass and 34-dimensional Iono-sphere There are 3 clusters in data set of Iris each of whichhas 50 data patterns 3 clusters in data set of Wine whichhave 50 60 and 68 data patterns 6 clusters in data set ofGlass which have 30 35 40 42 36 and 31 separately and2 clusters in data set of Ionosphere which have 226 and125 data patterns The validity indices of each method arecompared in Table 1 SP-FCM can identify compact groupscompared to other algorithmswhen given the cluster number119862 It can also be seen that SRCM and SP-FCM have moreobvious advantages than FCM RCM and SCM SP-FCMperforms slightly better than SRCM in most cases due tothe global search ability which enables it to converge to anoptimum or near optimum solutions Moreover shadowedset- and rough set-based clustering methods namely SP-FCM SRCM RCM and SCM perform better than FCM Itimplies that the partition of approximation regions can revealthe nature of data structure and only the lower bound andboundary region of each cluster have positive contribution inthe process of updating the prototypes
6 Computational Intelligence and Neuroscience
Table 1: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four UCI data sets.

DB index
  Algorithm   Iris     Wine     Glass    Ionosphere
  FCM         0.7642   0.8803   2.2971   2.0587
  RCM         0.6875   0.5692   1.9635   1.5434
  SCM         0.6862   0.5327   1.8495   1.4763
  SRCM        0.6613   0.4436   1.5804   1.3971
  SP-FCM      0.6574   0.4328   1.5237   1.4066

Dunn index
  Algorithm   Iris     Wine     Glass    Ionosphere
  FCM         2.3106   2.5834   0.1142   0.8381
  RCM         2.7119   2.8157   0.2637   1.0233
  SCM         2.4801   2.7992   0.3150   1.0319
  SRCM        3.0874   3.1342   0.5108   1.1924
  SP-FCM      3.3254   3.1764   0.4921   1.2605
[Figure 2: XB validity index of the four UCI data sets with cluster number C. Panels: (a) Iris, (b) Wine, (c) Glass, (d) Ionosphere; each panel plots the Xie-Beni validity against the number of clusters as it decreases from C_max to 2.]
As usual, the number of clusters is implied by the nature of the problem; here, with shadowed sets involved, one can anticipate that the optimal number of clusters will be found. The fuzzification coefficient m can be optimized; however, it is common to assume the fixed value 2.0, which is associated with the form of the membership functions of the generated clusters. For testing the SP-FCM algorithm the rule C <= N^(1/2) is adopted, and the range of the expected cluster number is set as: (1) Iris, [C_min = 2, C_max = 12]; (2) Wine, [C_min = 2, C_max = 13]; (3) Glass, [C_min = 2, C_max = 14]; (4) Ionosphere, [C_min = 2, C_max = 16]. The swarm size is set as L = 20, the maximum iteration number of PSO is T = 50, and for cluster reduction the cluster cardinality threshold is ε = 10 and the attrition rate is ρ = 0.1. In each cycle we obtain the distribution of every cluster, remove part of them according to their cardinality, and calculate the XB index, while the cluster number C varies from C_max to C_min. After the loop ends, the partition with the lowest index value is selected as the final result. Figure 2 presents the validity indices in the process of generating the optimal cluster number. Smaller
Table 2: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four yeast expression data sets.

DB index
  Algorithm   GDS608   GDS2003   GDS2267   GDS2712
  FCM         2.0861   2.4671    1.5916    1.9526
  RCM         1.6109   2.2104    1.0274    1.2058
  SCM         1.5938   2.1346    0.8946    1.0965
  SRCM        1.3274   1.9523    0.7438    0.7326
  SP-FCM      1.2958   1.8946    0.7962    0.6843

Dunn index
  Algorithm   GDS608   GDS2003   GDS2267   GDS2712
  FCM         0.2647   0.2976    0.4208    0.3519
  RCM         0.3789   0.3981    0.7164    0.6074
  SCM         0.3865   0.3775    0.8439    0.6207
  SRCM        0.5126   0.4953    0.9759    0.8113
  SP-FCM      0.5407   0.5026    0.9182    0.8049
values indicate more compact and well-separated clusters. The validity indices reach their minimum values at C = 3, 3, 6, and 2, respectively, which correspond to the final cluster prototypes and the best partitions. Through the shadowed sets and PSO approaches, the influence of each boundary region on the formation of the prototypes and the clusters can be properly resolved. Although more computing time is required to run SP-FCM, reasonable results can be acquired when processing overlapping and vague data patterns.
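The overall reduce-and-select loop used in these experiments can be sketched as follows. This is a deliberately simplified illustration: plain FCM alternation stands in for the paper's PSO-refined updates, hard nearest-prototype counts stand in for the shadowed-set cardinalities, and all function names are ours:

```python
import numpy as np

def fcm_step(X, V, m=2.0):
    """One fuzzy c-means alternation: update memberships U, then prototypes V."""
    d2 = np.maximum(np.sum((X[:, None, :] - V[None, :, :]) ** 2, axis=2), 1e-12)
    U = 1.0 / np.sum((d2[:, :, None] / d2[:, None, :]) ** (1.0 / (m - 1)), axis=2)
    V = (U.T ** m @ X) / np.sum(U.T ** m, axis=1, keepdims=True)
    return U, V

def select_partition(X, c_min, c_max, iters=100, rho=0.1, m=2.0):
    """Shrink C from c_max toward c_min, score each partition with the
    Xie-Beni index, and return the best (index, U, V) triple."""
    rng = np.random.default_rng(0)
    V = X[rng.choice(len(X), c_max, replace=False)]
    best = (np.inf, None, None)
    while len(V) >= max(c_min, 2):
        for _ in range(iters):
            U, V = fcm_step(X, V, m)
        d2 = np.sum((X[:, None, :] - V[None, :, :]) ** 2, axis=2)
        pd2 = np.sum((V[:, None, :] - V[None, :, :]) ** 2, axis=2)
        xb = np.sum(U ** m * d2) / (len(X) * pd2[~np.eye(len(V), dtype=bool)].min())
        if xb < best[0]:
            best = (xb, U, V)
        # crude cardinality: hard counts by nearest prototype (SP-FCM uses
        # shadowed-set thresholds here); drop the emptiest clusters
        counts = np.bincount(np.argmin(d2, axis=1), minlength=len(V))
        drop = np.argsort(counts)[: max(int(rho * len(V)), 1)]
        V = np.delete(V, drop, axis=0)
    return best
```

The partition with the lowest recorded index is kept, mirroring the selection rule described above.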
4.2 Yeast Gene Expression Data Set

Four yeast gene expression data sets are used in the experiments: GDS608, GDS2003, GDS2267, and GDS2712, downloaded from the Gene Expression Omnibus. GDS608 has 26 classes and 6303 samples; GDS2003 has 23 classes and 5617 samples; GDS2267 has 14 classes and 9275 samples; and GDS2712 has 15 classes and 9275 samples. Table 2 presents the validity indices of the different methods once the cluster number C is given. SP-FCM and SRCM obtain the same effect and perform better than the other clustering algorithms. The improvement can be attributed to the fact that the global search capacity of PSO is conducive to finding more appropriate cluster centers while escaping from local optima.
To obtain the optimum C automatically, we let m = 2.0, c1 = 1.49, c2 = 1.49, and w = 0.72, and the rule C <= N^(1/2) is adopted. The swarm size is set as L = 20, the maximum iteration number of PSO is T = 80, and for cluster reduction the range of the expected cluster number, the cluster cardinality threshold ε, and the attrition rate ρ are set as: (1) GDS608, [C_min = 20, C_max = 80], ε = 20, ρ = 0.05; (2) GDS2003, [C_min = 20, C_max = 75], ε = 20, ρ = 0.05; (3) GDS2267, [C_min = 10, C_max = 96], ε = 20, ρ = 0.08; (4) GDS2712, [C_min = 10, C_max = 96], ε = 20, ρ = 0.08. In each cycle we obtain the distribution of every cluster, remove part of them according to their cardinality, and calculate the XB index, while the cluster number C varies from C_max to C_min. The partition with the lowest value is selected as the final result after the loop ends. As seen in Figure 3, for GDS608 the cluster number decreases at a faster rate at the beginning: it takes 26 iterations to reduce the cluster number from C = 80 to C = 30 and 4 iterations from C = 30 to C = 26, and the XB index begins to increase when the cluster number C < 26. For GDS2003 it takes 24 iterations to reduce the cluster number from C = 75 to C = 30 and 7 iterations from C = 30 to C = 23, and the XB index begins to increase when C < 23. For GDS2267 it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 6 iterations from C = 20 to C = 14, and the XB index begins to increase when C < 14. For GDS2712 it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 5 iterations from C = 20 to C = 15, and the XB index begins to increase when C < 15. Here the advantages of fuzzy sets, PSO, and shadowed sets are integrated in SP-FCM, making the algorithm applicable to overlapping partitions and to the uncertainty and vagueness arising from the boundary regions; moreover, the optimization process in the shadowed sets makes the method robust to outliers, so that the approximation regions of each cluster can be determined accurately and the obtained prototypes approach the desired locations.
4.3 Real Data

In this experiment 10 different packages are tested. Each package is represented by 100 frames captured from different angles by a camera, and SIFT feature points are extracted from each frame for training a recognition system. Figure 4 shows some images with their SIFT keypoints. This data set comprises 248150 descriptors. We let m = 2.0, c1 = 1.49, c2 = 1.49, w = 0.72, L = 20, ε = 30, and ρ = 0.01 for the SP-FCM and choose the reasonable range [C_min = 200, C_max = 360] according to the number of package categories and the distribution of keypoints in each image. Eighty iterations of PSO are run for each given C to produce the cluster prototypes B and partition matrix U as the starting point for the shadowed sets. Longer PSO stabilization is needed to obtain more stable cluster partitions.
[Figure 3: XB validity index of the four yeast gene expression data sets with cluster number C. Panels: (a) GDS608, (b) GDS2003, (c) GDS2267, (d) GDS2712; each panel plots the Xie-Beni validity against the number of clusters.]
Table 3: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on the package data set.

  Index        FCM       RCM       SCM       SRCM      SP-FCM
  DB index     18.4569   15.9671   14.3194   12.4038   10.7826
  Dunn index   9.2647    11.6298   12.5376   16.9422   16.7313
Within each cluster the optimal α_j decides the cardinality and realizes cluster reduction, and the XB index is calculated. Each C-partition is ranked using this index, and the partition with the smallest index value, indicating the most compact and well-separated clusters, is selected as the final output. At the beginning the cluster number decreases at a faster speed: it takes 26 iterations to reduce the cluster number from C = 360 to C = 289 and 20 iterations from C = 289 to C = 267. The XB index increases at a relatively faster rate when the cluster number C < 267. Figure 5 shows the XB index for C in [267, 289]. The index reaches its minimum value at C = 276, which means the best partition for this data set has 276 clusters. Table 3 gives a comparative analysis of the convergence effect. As expected, SP-FCM provides sound results for the real data; the performance is assessed by the validity indices.
5 Conclusions

This paper presents a modified fuzzy c-means algorithm based on particle swarm optimization and shadowed sets to perform unsupervised feature clustering. This algorithm, called SP-FCM, utilizes the global search property of PSO and the vagueness balance property of shadowed sets, such that it can estimate the optimal cluster number as it runs through its alternating optimization process. As a randomized approach, SP-FCM has the capability to alleviate the problems faced by FCM, which suffers from sensitivity to initialization and from falling into local minima. Moreover, the algorithm avoids the subjective and somewhat arbitrary trials used to estimate the appropriate cluster number, and it enhances the capability to find the optimal cluster number within a specific
[Figure 4: Ten package images with SIFT features: (a) 644, (b) 460, (c) 742, (d) 442, (e) 313, (f) 602, (g) 373, (h) 724, (i) 539, and (j) 124 features.]
[Figure 5: XB validity index of the bag data set with cluster number C, for C from 289 down to 267.]
range, using cluster validity measures as indicators. The use of the XB validity index allows the algorithm to find the optimum cluster number with cluster partitions that provide compact and well-separated clusters. The experiments show that the SP-FCM algorithm produces good results with reference to the DB and Dunn indices, especially for high-dimensional and large data cases.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research was supported by the National Natural Science Foundation of China (NSFC, Grant no. 61105089).
References
[1] A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[2] H.-P. Kriegel, P. Kröger, and A. Zimek, "Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering," ACM Transactions on Knowledge Discovery from Data, vol. 3, no. 1, article 1, 2009.
[3] P. Melin and O. Castillo, "A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition," Applied Soft Computing Journal, vol. 21, pp. 568–577, 2014.
[4] M. Yuwono, S. W. Su, B. D. Moulton, and H. T. Nguyen, "Data clustering using variants of rapid centroid estimation," IEEE Transactions on Evolutionary Computation, vol. 18, no. 3, pp. 366–377, 2014.
[5] S. C. Satapathy, G. Pradhan, S. Pattnaik, J. V. R. Murthy, and P. V. G. D. P. Reddy, "Performance comparisons of PSO based clustering," InterJRI Computer Science and Networking, vol. 1, no. 1, pp. 18–23, 2009.
[6] G. Peters and R. Weber, "Dynamic clustering with soft computing," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 3, pp. 226–236, 2012.
[7] D. T. Anderson, J. C. Bezdek, M. Popescu, and J. M. Keller, "Comparing fuzzy, probabilistic, and possibilistic partitions," IEEE Transactions on Fuzzy Systems, vol. 18, no. 5, pp. 906–918, 2010.
[8] P. Maji and S. Paul, "Robust rough-fuzzy C-means algorithm: design and applications in coding and non-coding RNA expression data clustering," Fundamenta Informaticae, vol. 124, no. 1-2, pp. 153–174, 2013.
[9] J. Bezdek, Fuzzy Mathematics in Pattern Classification [Ph.D. thesis], Cornell University, New York, NY, USA, 1974.
[10] V. Olman, F. Mao, H. Wu, and Y. Xu, "Parallel clustering algorithm for large data sets with applications in bioinformatics," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 344–352, 2009.
[11] J. C. Bezdek and R. J. Hathaway, "Optimization of fuzzy clustering criteria using genetic algorithms," in Proceedings of the 1st IEEE Conference on Evolutionary Computation, pp. 589–594, June 1994.
[12] T. A. Runkler, "Ant colony optimization of clustering models," International Journal of Intelligent Systems, vol. 20, no. 12, pp. 1233–1251, 2005.
[13] K. S. Al-Sultan and S. Z. Selim, "A global algorithm for the fuzzy clustering problem," Pattern Recognition, vol. 26, no. 9, pp. 1357–1361, 1993.
[14] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, Nagoya, Japan, October 1995.
[15] T. A. Runkler and C. Katz, "Fuzzy clustering by particle swarm optimization," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 601–608, Canada, July 2006.
[16] C.-F. Juang, C.-M. Hsiao, and C.-H. Hsu, "Hierarchical cluster-based multispecies particle-swarm optimization for fuzzy-system optimization," IEEE Transactions on Fuzzy Systems, vol. 18, no. 1, pp. 14–26, 2010.
[17] H. Izakian and A. Abraham, "Fuzzy C-means and fuzzy swarm for fuzzy clustering problem," Expert Systems with Applications, vol. 38, no. 3, pp. 1835–1838, 2011.
[18] W. Pedrycz, "Shadowed sets: representing and processing fuzzy sets," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 1, pp. 103–109, 1998.
[19] J. Zhou, W. Pedrycz, and D. Miao, "Shadowed sets in the characterization of rough-fuzzy clustering," Pattern Recognition, vol. 44, no. 8, pp. 1738–1749, 2011.
[20] W. Pedrycz, "From fuzzy sets to shadowed sets: interpretation and computing," International Journal of Intelligent Systems, vol. 24, no. 1, pp. 48–61, 2009.
[21] S. Mitra, W. Pedrycz, and B. Barman, "Shadowed c-means: integrating fuzzy and rough clustering," Pattern Recognition, vol. 43, no. 4, pp. 1282–1291, 2010.
[22] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, vol. 4 of The Handbooks of Fuzzy Sets, Springer, New York, NY, USA, 1999.
[23] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224–227, 1978.
[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
[25] J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 3, pp. 301–315, 1998.
[26] S. C. Satapathy, S. K. Patnaik, C. D. P. Dash, and S. Sahoo, "Data clustering using modified fuzzy-PSO (MFPSO)," in Proceedings of the 5th Multi-Disciplinary International Workshop on Artificial Intelligence, Hyderabad, India, 2011.
[27] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[28] R. Krishnapuram and J. M. Keller, "The possibilistic C-means algorithm: insights and recommendations," IEEE Transactions on Fuzzy Systems, vol. 4, no. 3, pp. 385–393, 1996.
For a fuzzy set with a discrete membership function, the balance equation is modified as

O(α_j) = | Σ_{u_ij ≤ α_j} u_ij + Σ_{u_ij ≥ u_jmax − α_j} (u_jmax − u_ij) − card{u_ij | α_j < u_ij < u_jmax − α_j} |.  (11)

In order to find the best α_j, the following optimization problem should be solved:

α_j = arg min_{α_j} O(α_j),  (12)

where u_ij ∈ [0, 1] is the membership of x_i in the cluster with prototype β_j, u_jmax and u_jmin denote the highest and lowest membership values of the jth cluster, and α_j is the threshold of the jth cluster. The range of feasible values of the threshold α_j is [u_jmin, (u_jmin + u_jmax)/2] [19].

This approach considers all membership values with respect to a fixed cluster when updating the prototype of that cluster. The main merits of shadowed sets are the optimization mechanism for choosing a separate threshold for each cluster and the reduction of the burden of plain numeric computations.
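A direct way to solve (12) is a grid search over the feasible range of α_j. The sketch below is our own illustration of that idea (the function name, the step size, and the grid-search strategy are assumptions, not part of the paper):

```python
import numpy as np

def optimal_alpha(u, step=1e-3):
    """Balance threshold for one cluster's membership column u, per (11)-(12):
    search alpha in the feasible range [u_min, (u_min + u_max) / 2] for the
    value minimizing the imbalance O(alpha)."""
    u = np.asarray(u, dtype=float)
    u_min, u_max = u.min(), u.max()
    best_alpha, best_obj = u_min, np.inf
    for alpha in np.arange(u_min, (u_min + u_max) / 2 + step, step):
        reduced = u[u <= alpha].sum()                     # mass driven down to 0
        elevated = (u_max - u[u >= u_max - alpha]).sum()  # mass lifted to u_jmax
        shadow = np.count_nonzero((u > alpha) & (u < u_max - alpha))
        obj = abs(reduced + elevated - shadow)
        if obj < best_obj:
            best_alpha, best_obj = alpha, obj
    return float(best_alpha)
```

The balance seeks a threshold where the eliminated and elevated membership mass compensates the size of the shadow region.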
2.5 XB Clustering Validity Index

The clustering algorithms described above require prespecification of the number of clusters, and the partition results depend on the choice of C. Validity indices exist to evaluate the goodness of a clustering for a given number of clusters; therefore these indices can be used to acquire the optimal value of C [27].

The XB index presents a fuzzy validity criterion based on a validity function which identifies overall compact and separate fuzzy c-partitions. This function depends upon the data set, the geometric distance measure, the distances between cluster centroids, and the fuzzy partition, irrespective of the fuzzy algorithm used. For evaluating the goodness of the data partition, both cluster compactness and intercluster separation should be taken into account. For the FCM algorithm with m = 2.0, the Xie-Beni index can be written as

S_XB = J_FCM / (N d_min^2),  (13)

where d_min = min_{i≠j} ||β_i − β_j|| is the minimum distance between cluster centroids. The more separate the clusters, the larger d_min and the smaller S_XB.
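Expression (13) can be sketched directly in NumPy; this is our own illustration, with J_FCM expanded as the usual weighted sum of squared distances (the function name and argument layout are assumptions):

```python
import numpy as np

def xie_beni(X, U, V, m=2.0):
    """Xie-Beni index S_XB = J_FCM / (N * d_min^2); smaller is better.
    X: (N, d) data, U: (N, C) fuzzy partition matrix, V: (C, d) prototypes."""
    N = X.shape[0]
    dist2 = np.sum((X[:, None, :] - V[None, :, :]) ** 2, axis=2)   # (N, C)
    J = np.sum((U ** m) * dist2)                                   # FCM objective
    pd2 = np.sum((V[:, None, :] - V[None, :, :]) ** 2, axis=2)     # prototype gaps
    d2_min = pd2[~np.eye(len(V), dtype=bool)].min()                # d_min^2
    return J / (N * d2_min)
```

Compact clusters shrink the numerator while well-separated prototypes grow the denominator, so both criteria pull the index down.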
3 Shadowed Sets-Based PSO-Fuzzy Clustering: SP-FCM

FCM strives to find C compact clusters in X, where C is one of the specified parameters. But the process of selecting and adjusting C manually to obtain desirable cluster partitions in a given data set is very subjective and somewhat arbitrary. To seek the optimal cluster structure, C is usually allowed to be overestimated [28], such that the distances between some clusters are not large enough, or the membership values of some objects in different clusters are adjacent and ambiguous; in this case, modifying the prototypes through long iterations is meaningless.

The main subject of cluster validation is the evaluation of clustering results to find the partition that best fits the data set. Based on the foregoing algorithms, we wish to find cluster partitions that contain compact and well-separated clusters. In our algorithm C is also overestimated, and the clusters compete for data membership. We can set [C_min, C_max] as the reasonable range of cluster numbers based on knowledge of the data. This provides a more transparent and tractable process of cluster number reduction. Considering the fuzzy partition matrix U = [u_ij]_{N×C}, each column comprises the membership values of all feature vectors x_i with a single cluster center. Thus an optimal threshold α_j (j = 1, 2, ..., C) for each column should be found by (12) to create a harder partition. The amount of data assigned a membership value equal to 1 is identified as the cardinality of the corresponding cluster. According to α_j, the cardinality of the jth column is

M_j = card{u_ij | u_ij ≥ u_jmax − α_j}.  (14)

Here the threshold is not subjectively user-defined; it is established on the balance of uncertainty and can be adjusted automatically during clustering. This property of shadowed sets can be used to reduce the cluster number. To control the convergence speed, the decision to delete clusters can be based on thresholds, and different threshold values should be set for different data sets depending on the cluster structure and the size of the data set. Here a threshold ε and an attrition rate ρ (0 < ρ < 1) are set. The decision to delete clusters in SP-FCM is based solely on cluster cardinality and the threshold ε. If ε is too small, C is reduced more slowly and the reduction may stop prematurely before the optimal cluster number is found; on the other hand, if ε is too large, C may be reduced too drastically. In our method, clusters whose cardinalities satisfy M_j < ε are considered candidates for removal, and we remove up to ⌊ρ × C⌋ clusters having the lowest cardinality from the pool of candidates specified by ε. Limiting the number of clusters that can be removed at one time prevents C from being reduced too drastically when ε is set too high for a given data set. This automatically estimates the best cluster number while also utilizing a faster, consistent, and repeatable initialization technique. For evaluating the goodness of the data partition, both cluster compactness and intercluster separation should be taken into account; hence the XB index is adopted.

For each C in the range [C_min, C_max], a set of cluster validity indices is calculated, where C_max is the initial cluster number, set to be much larger than the expected cluster number. The partition matrix with C clusters having the best aggregate validity index is selected as the final cluster partition. The SP-FCM algorithm is summarized in Algorithm 1.
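The cardinality computation (14) and the ε/ρ-gated removal can be sketched as follows; this is our own illustration, and the function name and argument layout are assumptions:

```python
import numpy as np
from math import floor

def reduce_clusters(U, alphas, eps, rho):
    """One cardinality-based reduction step. For cluster j, (14) gives
    M_j = card{u_ij | u_ij >= u_jmax - alpha_j}; clusters with M_j < eps
    become removal candidates, and at most max(floor(rho * C), 1) of the
    lowest-cardinality candidates are dropped."""
    N, C = U.shape
    u_max = U.max(axis=0)
    M = np.array([(U[:, j] >= u_max[j] - alphas[j]).sum() for j in range(C)])
    candidates = [j for j in range(C) if M[j] < eps]
    quota = max(floor(rho * C), 1)
    to_remove = sorted(candidates, key=lambda j: M[j])[:quota]
    keep = [j for j in range(C) if j not in to_remove]
    return U[:, keep], M, to_remove
```

The quota caps how many clusters disappear per cycle, matching the attrition-rate safeguard described above.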
(1) Initialize C_max and C_min; let C = C_max; set the real number m, the iteration counters k = 0 and t = 0, and the maximum iteration number T of PSO.
(2) Initialize the population size L, the initial velocities and positions of the particles, c1, c2, w, the threshold ε, and the attrition rate ρ.
(3) DO
      Repeat
        (a) Update the partition matrix U(k) for all particles by (3).
        (b) Calculate the cluster centers for each particle by (4).
        (c) Calculate the fitness value for each particle by (7).
        (d) Calculate pbest for each particle.
        (e) Calculate gbest for the swarm.
        (f) Update the velocity and position of each particle by (5).
        (g) t = t + 1.
      Until the PSO termination condition is met (*)
      (i) Calculate the optimal threshold α_j (j = 1, 2, ..., C) for each column of the partition matrix U(k) by (12), and relocate u_ij (1 ≤ i ≤ N) of the jth cluster according to α_j.
      (ii) Calculate the cardinality M_j (1 ≤ j ≤ C) for each cluster on the basis of the number of data whose membership value equals 1, by (14).
      (iii) Remove all clusters for which M_j < ε and M_j is among the ⌊ρ × C⌋ lowest cardinalities.
      (iv) Update the cluster number C.
      (v) Calculate the cluster validity index S_XB(k) by (13).
      (vi) Update the iteration counter: k = k + 1.
    WHILE the termination condition is not met (**)

(*) The PSO termination condition is t ≥ T (the maximum number of iterations is reached) or velocity updates staying close to zero over a number of iterations.
(**) The algorithm terminates under either of the following two conditions: (1) the prototype parameters in B stabilize within some threshold δ; (2) the number of clusters has reached the minimum limit C_min.

Algorithm 1: SP-FCM.
Here, if ⌊ρ × C⌋ equals 0, we set it to 1; this means that the cluster with the lowest cardinality may still be removed. The initial C_max cluster prototypes can be initialized using exemplars from the data points, selected by β_j = x_{⌊(N/C_max)j⌋}. After termination, the B and U from the C ∈ [C_min, C_max] with the best cluster validity index S_XB are selected as the final cluster prototypes and partition.
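The exemplar-based initialization amounts to picking equally spaced data points. A small sketch of this step, assuming the paper's subscript formula uses 1-based data indices (shifted to 0-based here); the function name is ours:

```python
import numpy as np

def init_prototypes(X, c_max):
    """Initialize the C_max prototypes from equally spaced exemplars,
    beta_j = x_{floor((N / C_max) * j)} for j = 1, ..., C_max."""
    N = X.shape[0]
    idx = [int(np.floor(N / c_max * j)) - 1 for j in range(1, c_max + 1)]
    return X[idx].copy()
```

This gives a deterministic, repeatable starting point, in line with the "consistent and repeatable initialization" mentioned above.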
to 119862 = 26 and the XB index begins to increase when thecluster number 119862 lt 26 For GDS2003 it takes 24 iterationsto reduce the cluster number from 119862 = 75 to 119862 = 30 and 7iterations from 119862 = 30 to 119862 = 23 and the XB index beginsto increase when the cluster number 119862 lt 23 For GDS2267 ittakes 23 iterations to reduce the cluster number from 119862 = 96to 119862 = 20 and 6 iterations from 119862 = 20 to 119862 = 14 andthe XB index begins to increase when the cluster number119862 lt 14 For GDS2712 it takes 23 iterations to reduce thecluster number from 119862 = 96 to 119862 = 20 and 5 iterationsfrom 119862 = 20 to 119862 = 15 and the XB index begins to increasewhen the cluster number 119862 lt 15 Here the advantagesof fuzzy sets PSO and shadowed sets are integrated in theSP-FCM and make this algorithm applicable to deal withoverlapping partitions the uncertainty and vagueness arisingfrom the boundary regions and the optimization processin the shadowed sets makes this method robust to outliersso that the approximation regions of each cluster can bedetermined accurately and the obtained prototypes approachthe desired locations
43 Real Data In this experiment totally 10 different pack-ages are tested Each package is represented by 100 framescaptured from different angles by camera and each frame isextracted SIFT feature points which are used for training arecognition system Figure 4 shows some images with theirSIFT keypoints And this data set is comprised of 248150descriptors We let 119898 = 20 119888
1= 149 119888
2= 149 119908 = 072
119871 = 20 120576 = 30 and 120588 = 001 for the SP-FCM and choose thereasonable range [119862min = 200 119862max = 360] according to thecategory amount of packages and distribution of keypoints ineach image Eighty iterations of PSO are run on each given119862 to produce the cluster prototype 119861 and partition matrix119880 as the starting point for the shadowed sets Longer PSOstabilization is needed to obtainmore stable cluster partitions
8 Computational Intelligence and Neuroscience
0
05
1
15
2
25
3
35
30 29 28 27 26 25 24 23 22 21 20Number of clusters
Xie-
Beni
val
idity
(a) GDS608
0
1
2
3
4
5
6
Xie-
Beni
val
idity
30 29 28 27 26 25 24 23 22 21 20Number of clusters
(b) GDS2003
0
1
2
3
4
5
6
7
20 19 18 17 16 15 14 13 12 11 10Number of clusters
Xie-
Beni
val
idity
(c) GDS2267
0
1
2
3
4
5
6
20 19 18 17 16 15 14 13 12 11 10Number of clusters
Xie-
Beni
val
idity
(d) GDS2712
Figure 3 XB validity index of four yeast gene expression data sets with cluster number 119862
Table 3 Performance of FCM RCM SCM SRCM and SP-FCM on package datasets
Different indices AlgorithmsFCM RCM SCM SRCM SP-FCM
DB index 184569 159671 143194 124038 107826Dunn index 92647 116298 125376 169422 167313
Within each cluster the optimal 120572119895decides the cardinality
and realizes cluster reduction and XB index is calculatedEach 119862-partition is ranked using this index and selected asthe final output by the smallest index value that indicates thebest compact and well-separated clusters At the beginningthe cluster number decreases at a faster speed it takes 26iterations to reduce the cluster number from 119862 = 360 to 119862 =289 and 20 iterations from119862 = 289 to119862 = 267 The XB indexincreases at a relatively faster rate when the cluster number119862 lt 267 Figure 5 shows the XB index for 119862 isin [267 289]The index reaches its minimum value at 119862 = 276 that meansthe best partition for this data set is 276 clusters Table 3exhibits the comparative analysis of convergence effect Asexpected SP-FCMcan provide sound results for the real datathe performance is assessed by those validity indices
5 Conclusions
This paper presents a modified fuzzy 119888-means algorithmbased on the particle swarm optimization and shadowed setsto perform unsupervised feature clustering This algorithmcalled SP-FCMutilizes the global search property of PSO andvagueness balance property of shadowed sets such that it canestimate the optimal cluster number as it runs through itsalternating optimization process SP-FCM as a randomizedbased approach has the capability to alleviate the problemsfaced by FCM which has some demerits of initializationand falling in local minima Moreover this algorithm avoidsthe subjective and somewhat arbitrary trials to estimate theappropriate value of cluster number and it enhances thiscapability to find the optimal cluster number within a specific
Computational Intelligence and Neuroscience 9
(a) 644 features (b) 460 features (c) 742 features
(d) 442 features (e) 313 features (f) 602 features
(g) 373 features (h) 724 features (i) 539 features (j) 124 features
Figure 4 Ten package images with SIFT features
0100200300400500600700800900
1000
289
285
282
280
278
276
274
272
270
268
Number of clusters
Xie-
Beni
val
idity
287
283
281
279
277
275
273
271
269
267
Figure 5 XB validity index of bag data set with cluster number 119862
range using cluster validity measures as indicatorsThe use ofXB validity index allows the algorithm to find the optimumcluster number with cluster partitions that provide compactand well-separated clusters From the experiments we haveshown that the SP-FCMalgorithmproduces good resultswithreference to DB and Dunn indices especially to the highdimension and large data cases
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
10 Computational Intelligence and Neuroscience
Acknowledgment
This research was supported by the National Natural ScienceFoundation of China (Grant no 61105089 NSFC)
References
[1] A K Jain ldquoData clustering 50 years beyond K-meansrdquo PatternRecognition Letters vol 31 no 8 pp 651ndash666 2010
[2] H-P Kriegel P Kroger and A Zimek ldquoClustering high-dimensional data a survey on subspace clustering pattern-based clustering and correlation clusteringrdquoACMTransactionson Knowledge Discovery from Data vol 3 no 1 article 1 2009
[3] P Melin and O Castillo ldquoA review on type-2 fuzzy logic appli-cations in clustering classification and pattern recognitionrdquoApplied Soft Computing Journal vol 21 pp 568ndash577 2014
[4] M Yuwono S W Su B D Moulton and H T Nguyen ldquoDataclustering using variants of rapid centroid estimationrdquo IEEETransactions on Evolutionary Computation vol 18 no 3 pp366ndash377 2014
[5] S C Satapathy G Pradhan S Pattnaik J V R Murthy andP V G D P Reddy ldquoPerformance comparisons of PSO basedclusteringrdquo InterJRI Computer Science and Networking vol 1no 1 pp 18ndash23 2009
[6] G Peters and RWeber ldquoDynamic clustering with soft comput-ingrdquo Wiley Interdisciplinary Reviews Data Mining and Knowl-edge Discovery vol 2 no 3 pp 226ndash236 2012
[7] D T Anderson J C Bezdek M Popescu and J M KellerldquoComparing fuzzy probabilistic and possibilistic partitionsrdquoIEEE Transactions on Fuzzy Systems vol 18 no 5 pp 906ndash9182010
[8] P Maji and S Paul ldquoRobust rough-fuzzy C-means algorithmdesign and applications in coding and non-coding RNA expres-sion data clusteringrdquo Fundamenta Informaticae vol 124 no 1-2pp 153ndash174 2013
[9] J Bezdek Fuzzy mathematics in pattern classification [PhDthesis] Cornell University New york NY USA 1974
[10] V Olman F Mao H Wu and Y Xu ldquoParallel clusteringalgorithm for large data sets with applications in bioinformat-icsrdquo IEEEACM Transactions on Computational Biology andBioinformatics vol 6 no 2 pp 344ndash352 2009
[11] J C Bezdek and R J Hathaway ldquoOptimization of fuzzyclustering criteria using genetic algorithmsrdquo in Proceedings ofthe 1st IEEE Conference on Evolutionary Computation pp 589ndash594 June 1994
[12] T A Runkler ldquoAnt colony optimization of clustering modelsrdquoInternational Journal of Intelligent Systems vol 20 no 12 pp1233ndash1251 2005
[13] K S Al-Sultan and S Z Selim ldquoA global algorithm for theFuzzy Clustering problemrdquo Pattern Recognition vol 26 no 9pp 1357ndash1361 1993
[14] R Eberhart and J Kennedy ldquoNew optimizer using particleswarm theoryrdquo in Proceedings of the 6th International Sympo-sium onMicroMachine and Human Science pp 39ndash43 NagoyaJapan October 1995
[15] T A Runkler and C Katz ldquoFuzzy clustering by particleswarm optimizationrdquo in Proceedings of the IEEE InternationalConference on Fuzzy Systems pp 601ndash608 can July 2006
[16] C-F Juang C-M Hsiao and C-H Hsu ldquoHierarchical cluster-based multispecies particle-swarm optimization for fuzzy-system optimizationrdquo IEEE Transactions on Fuzzy Systems vol18 no 1 pp 14ndash26 2010
[17] H Izakian and A Abraham ldquoFuzzy C-means and fuzzy swarmfor fuzzy clustering problemrdquo Expert Systems with Applicationsvol 38 no 3 pp 1835ndash1838 2011
[18] W Pedrycz ldquoShadowed sets representing and processing fuzzysetsrdquo IEEE Transactions on Systems Man and Cybernetics BCybernetics vol 28 no 1 pp 103ndash109 1998
[19] J Zhou W Pedrycz and D Miao ldquoShadowed sets in the char-acterization of rough-fuzzy clusteringrdquoPattern Recognition vol44 no 8 pp 1738ndash1749 2011
[20] W Pedrycz ldquoFrom fuzzy sets to shadowed sets interpretationand computingrdquo International Journal of Intelligent Systems vol24 no 1 pp 48ndash61 2009
[21] S Mitra W Pedrycz and B Barman ldquoShadowed c-meansintegrating fuzzy and rough clusteringrdquo Pattern Recognitionvol 43 no 4 pp 1282ndash1291 2010
[22] J C Bezdek J M Keller R Krishnapuram and N R PalFuzzy Models and Algorithms for Pattern Recognition and ImageProcessing vol 4 ofTheHandbooks of Fuzzy Sets Springer NewYork NY USA 1999
[23] D L Davies and DW Bouldin ldquoA cluster separation measurerdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 1 no 2 pp 224ndash227 1978
[24] X L Xie and G Beni ldquoA validity measure for fuzzy clusteringrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 13 no 8 pp 841ndash847 1991
[25] J C Bezdek and N R Pal ldquoSome new indexes of clustervalidityrdquo IEEE Transactions on Systems Man and CyberneticsPart B Cybernetics vol 28 no 3 pp 301ndash315 1998
[26] S C Satapathy S K Patnaik C D P Dash and S Sahoo ldquoDataclustering using modified fuzzy-PSO (MFPSO)rdquo in Proceedingsof the 5th Multi-Disciplinary Int Workshop on Artificial Intelli-gence Hyderabad India 2011
[27] M K Pakhira S Bandyopadhyay and U Maulik ldquoValidityindex for crisp and fuzzy clustersrdquo Pattern Recognition vol 37no 3 pp 487ndash501 2004
[28] R Krishnapuram and J M Keller ldquoThe possibilistic C-meansalgorithm insights and recommendationsrdquo IEEE Transactionson Fuzzy Systems vol 4 no 3 pp 385ndash393 1996
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience 5
(1) Initialize C_max and C_min; let C = C_max. Set the fuzzifier m, the iteration counters k = 0 and t = 0, and the maximum iteration number T of PSO.
(2) Initialize the population size L, the initial velocity and position of the particles, c1, c2, w, the threshold ε, and the attrition rate ρ.
(3) DO
      Repeat
        (a) Update the partition matrix U(k) for all particles by (3)
        (b) Calculate the cluster center for each particle by (4)
        (c) Calculate the fitness value for each particle by (7)
        (d) Calculate pbest for each particle
        (e) Calculate gbest for the swarm
        (f) Update the velocity and position of each particle by (5)
        (g) t = t + 1
      Until the PSO termination condition is met (*)
      (i) Calculate the optimal threshold α_j (j = 1, 2, ..., C) for each column of the partition matrix U(k) by (12), and relocate the memberships u_ij (1 ≤ i ≤ N) of the jth cluster according to α_j
      (ii) Calculate the cardinality M_j (1 ≤ j ≤ C) of each cluster by (14), counting the data whose membership value equals 1
      (iii) Remove every cluster whose M_j < ε and whose M_j is among the ⌊ρ × C⌋ lowest cardinalities
      (iv) Update the cluster number C
      (v) Calculate the cluster validity index S_XB(k) by (13)
      (vi) Update the iteration counter: k = k + 1
    WHILE the termination condition is not met (**)

(*) The PSO termination condition in this method is t ≥ T (the maximum number of iterations is reached) or the velocity updates staying close to zero over a number of iterations.
(**) The algorithm can terminate under either of two conditions: (1) the prototype parameters in B stabilize within some threshold δ; (2) the number of clusters has reached the minimum limit C_min.

Algorithm 1: SP-FCM.
Here, if ⌊ρ × C⌋ equals 0, we set it to 1, which means that the cluster with the lowest cardinality may be removed. The initial C_max cluster prototypes can be initialized using exemplars selected from the data points by β_j = x_⌊(N/C_max)·j⌋. After termination, the B and U of the C ∈ [C_min, C_max] with the best cluster validity index S_XB are selected as the final cluster prototype and partition.
4. Experimental Results

In this section, the performance of the FCM, RCM, shadowed c-means (SCM) [21], shadowed rough c-means (SRCM) [19], and SP-FCM algorithms is evaluated on four UCI data sets, four yeast gene expression data sets, and a real data set. The fundamental criterion for evaluating the convergence effect is that objects in the same cluster should be as close to each other as possible, while objects in different clusters should be as far apart as possible. We use the DB index and the Dunn index to quantify this. For a given data set and C value, the higher the within-cluster similarity and the intercluster separation, the lower the DB index, so a good clustering procedure should make the DB index as low as possible. Conversely, higher Dunn index values indicate better clustering, in the sense that the clusters are well separated and relatively compact. The details of the experiments are given below.
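For crisp partitions, the two criteria can be sketched as follows; this is a minimal illustration of the standard DB and Dunn definitions, not the authors' evaluation code.

```python
import numpy as np

def db_index(X, labels, centers):
    """Davies-Bouldin: average, over clusters, of the worst ratio of
    within-cluster scatter sums to between-center distance. Lower is better."""
    C = len(centers)
    S = np.array([np.mean(np.linalg.norm(X[labels == i] - centers[i], axis=1))
                  for i in range(C)])
    total = 0.0
    for i in range(C):
        total += max((S[i] + S[j]) / np.linalg.norm(centers[i] - centers[j])
                     for j in range(C) if j != i)
    return total / C

def dunn_index(X, labels):
    """Dunn: smallest inter-cluster distance over largest cluster diameter.
    Higher is better."""
    clusters = [X[labels == i] for i in np.unique(labels)]
    diam = max(np.max(np.linalg.norm(c[:, None] - c[None, :], axis=-1))
               for c in clusters)
    sep = min(np.min(np.linalg.norm(a[:, None] - b[None, :], axis=-1))
              for i, a in enumerate(clusters) for b in clusters[i + 1:])
    return sep / diam
```

On two tight, well-separated groups the DB index is small and the Dunn index large, matching the interpretation above.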
4.1. UCI Data Sets. Four UCI data sets are used in our experiments: the 4-dimensional Iris, 13-dimensional Wine, 10-dimensional Glass, and 34-dimensional Ionosphere. Iris contains 3 clusters of 50 patterns each; Wine contains 3 clusters with 50, 60, and 68 patterns; Glass contains 6 clusters with 30, 35, 40, 42, 36, and 31 patterns; and Ionosphere contains 2 clusters with 226 and 125 patterns. The validity indices of each method are compared in Table 1. Given the cluster number C, SP-FCM identifies more compact groups than the other algorithms. SRCM and SP-FCM show a clear advantage over FCM, RCM, and SCM, and SP-FCM performs slightly better than SRCM in most cases owing to its global search ability, which enables it to converge to an optimum or near-optimum solution. Moreover, the shadowed set- and rough set-based clustering methods, namely SP-FCM, SRCM, RCM, and SCM, all perform better than FCM. This implies that the partition into approximation regions can reveal the nature of the data structure, and that only the lower bound and boundary region of each cluster contribute positively to updating the prototypes.
Table 1: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four UCI data sets.

DB index
  Algorithm   Iris     Wine     Glass    Ionosphere
  FCM         0.7642   0.8803   2.2971   2.0587
  RCM         0.6875   0.5692   1.9635   1.5434
  SCM         0.6862   0.5327   1.8495   1.4763
  SRCM        0.6613   0.4436   1.5804   1.3971
  SP-FCM      0.6574   0.4328   1.5237   1.4066

Dunn index
  Algorithm   Iris     Wine     Glass    Ionosphere
  FCM         2.3106   2.5834   0.1142   0.8381
  RCM         2.7119   2.8157   0.2637   1.0233
  SCM         2.4801   2.7992   0.3150   1.0319
  SRCM        3.0874   3.1342   0.5108   1.1924
  SP-FCM      3.3254   3.1764   0.4921   1.2605
Figure 2: XB validity index of the four UCI data sets versus the number of clusters. Panels: (a) Iris, (b) Wine, (c) Glass, (d) Ionosphere; y-axis: Xie-Beni validity.
As usual, the number of clusters is implied by the nature of the problem; here, with shadowed sets involved, one can anticipate that the optimal number of clusters will be found. The fuzzification coefficient m could also be optimized, but it is common to assume the fixed value 2.0, which is associated with the form of the membership functions of the generated clusters. For testing the SP-FCM algorithm, the rule C ≤ N^(1/2) is adopted, and the ranges of the expected cluster number are set as (1) Iris: [C_min = 2, C_max = 12]; (2) Wine: [C_min = 2, C_max = 13]; (3) Glass: [C_min = 2, C_max = 14]; (4) Ionosphere: [C_min = 2, C_max = 16]. The swarm size is set to L = 20, the maximum iteration number of PSO to T = 50, and, for cluster reduction, the cluster cardinality threshold to ε = 10 and the attrition rate to ρ = 0.1. In each cycle we obtain the distribution of every cluster, remove part of them according to their cardinality, and calculate the XB index, while the cluster number C varies from C_max to C_min. After the loop ends, the partition with the lowest index value is selected as the final result. Figure 2 presents the validity indices in the process of generating the optimal cluster number; smaller values indicate more compact and well-separated clusters. The validity indices reach their minima at C = 3, 3, 6, and 2, respectively, which correspond to the final cluster prototypes and the best partitions. Through the shadowed sets and PSO approaches, the influence of each boundary region on the formation of the prototypes and the clusters can be properly resolved. Although more computing time is required to run SP-FCM, reasonable results can be acquired when processing overlapping and vague data patterns.

Table 2: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four yeast gene expression data sets.

DB index
  Algorithm   GDS608   GDS2003  GDS2267  GDS2712
  FCM         2.0861   2.4671   1.5916   1.9526
  RCM         1.6109   2.2104   1.0274   1.2058
  SCM         1.5938   2.1346   0.8946   1.0965
  SRCM        1.3274   1.9523   0.7438   0.7326
  SP-FCM      1.2958   1.8946   0.7962   0.6843

Dunn index
  Algorithm   GDS608   GDS2003  GDS2267  GDS2712
  FCM         0.2647   0.2976   0.4208   0.3519
  RCM         0.3789   0.3981   0.7164   0.6074
  SCM         0.3865   0.3775   0.8439   0.6207
  SRCM        0.5126   0.4953   0.9759   0.8113
  SP-FCM      0.5407   0.5026   0.9182   0.8049
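The Xie-Beni index used to rank the C-partitions throughout these experiments measures compactness over separation (smaller is better). A minimal sketch, with illustrative names, follows; it is not the authors' implementation of (13).

```python
import numpy as np

def xie_beni(X, U, V, m=2.0):
    """Xie-Beni validity for a fuzzy partition.

    X: (N, d) data; U: (C, N) fuzzy partition matrix; V: (C, d) prototypes.
    Returns sum_j sum_i u_ij^m ||x_i - v_j||^2 / (N * min_{p!=q} ||v_p - v_q||^2).
    """
    d2 = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=-1) ** 2  # (C, N)
    compactness = np.sum((U ** m) * d2)
    vd2 = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=-1) ** 2
    np.fill_diagonal(vd2, np.inf)          # ignore self-distances
    return compactness / (X.shape[0] * vd2.min())
```

The partition whose value of this index is lowest over C ∈ [C_min, C_max] is kept as the final output.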
4.2. Yeast Gene Expression Data Sets. Four yeast gene expression data sets, GDS608, GDS2003, GDS2267, and GDS2712, downloaded from the Gene Expression Omnibus, are used in the experiments. GDS608 has 26 classes and 6303 samples, GDS2003 has 23 classes and 5617 samples, GDS2267 has 14 classes and 9275 samples, and GDS2712 has 15 classes and 9275 samples. Table 2 presents the validity indices of the different methods once the cluster number C is given. SP-FCM and SRCM obtain comparable results and perform better than the other clustering algorithms. The improvement can be attributed to the global search capacity of PSO, which is conducive to finding more appropriate cluster centers while escaping from local optima.
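The PSO update behind this global search is the standard velocity/position rule with the constants used in these experiments (w = 0.72, c1 = c2 = 1.49). The sketch below assumes each particle encodes a full set of C cluster prototypes as a (C × d) matrix; that encoding and the function name are our illustrative assumptions.

```python
import numpy as np

def pso_step(positions, velocities, pbest, gbest,
             w=0.72, c1=1.49, c2=1.49, rng=None):
    """One PSO velocity/position update for a swarm of prototype sets.

    positions, velocities, pbest: (L, C, d) arrays for L particles;
    gbest: (C, d) best prototype set found by the whole swarm.
    Returns the updated (positions, velocities)."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(positions.shape)   # per-dimension random factors
    r2 = rng.random(positions.shape)
    velocities = (w * velocities
                  + c1 * r1 * (pbest - positions)     # cognitive pull
                  + c2 * r2 * (gbest - positions))    # social pull
    return positions + velocities, velocities
```

When a particle already sits at both its personal best and the global best with zero velocity, the update leaves it in place, which is the usual stagnation condition checked by the PSO termination test in Algorithm 1.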
To obtain the optimum C automatically, we let m = 2.0, c1 = 1.49, c2 = 1.49, and w = 0.72, and the rule C ≤ N^(1/2) is adopted. The swarm size is set to L = 20 and the maximum iteration number of PSO to T = 80; for cluster reduction, the range of the expected cluster number, the cluster cardinality threshold ε, and the attrition rate ρ are set as (1) GDS608: [C_min = 20, C_max = 80], ε = 20, ρ = 0.05; (2) GDS2003: [C_min = 20, C_max = 75], ε = 20, ρ = 0.05; (3) GDS2267: [C_min = 10, C_max = 96], ε = 20, ρ = 0.08; (4) GDS2712: [C_min = 10, C_max = 96], ε = 20, ρ = 0.08. In each cycle we obtain the distribution of every cluster, remove part of them according to their cardinality, and calculate the XB index, while the cluster number C varies from C_max to C_min. The partition with the lowest value is selected as the final result after the loop ends. As seen in Figure 3, for GDS608 the cluster number at first decreases quickly: it takes 26 iterations to reduce it from C = 80 to C = 30 and 4 more iterations from C = 30 to C = 26, and the XB index begins to increase once C < 26. For GDS2003, it takes 24 iterations to reduce the cluster number from C = 75 to C = 30 and 7 iterations from C = 30 to C = 23, and the XB index begins to increase once C < 23. For GDS2267, it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 6 iterations from C = 20 to C = 14, and the XB index begins to increase once C < 14. For GDS2712, it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 5 iterations from C = 20 to C = 15, and the XB index begins to increase once C < 15. Here the advantages of fuzzy sets, PSO, and shadowed sets are integrated in SP-FCM, making the algorithm applicable to overlapping partitions and to the uncertainty and vagueness arising from boundary regions; the optimization process in the shadowed sets also makes the method robust to outliers, so that the approximation regions of each cluster can be determined accurately and the obtained prototypes approach the desired locations.
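The per-cycle cluster-reduction step, step (iii) of Algorithm 1, removes at most ⌊ρ × C⌋ clusters, and only those whose cardinality falls below ε. A minimal sketch, assuming the partition matrix has already been relocated so that core memberships equal 1 (the function name and matrix layout are illustrative):

```python
import numpy as np

def reduce_clusters(U, eps=20, rho=0.05):
    """Drop the lowest-cardinality clusters from a (C, N) partition matrix.

    Cardinality M_j counts the data points whose (relocated) membership in
    cluster j equals 1. At most max(1, floor(rho * C)) clusters are removed
    per cycle, and only if their cardinality is below eps."""
    C = U.shape[0]
    card = np.sum(U == 1.0, axis=1)            # M_j for each cluster
    k = max(1, int(np.floor(rho * C)))          # remove at least one candidate
    weakest = np.argsort(card)[:k]              # k lowest-cardinality clusters
    drop = {j for j in weakest if card[j] < eps}
    keep = [j for j in range(C) if j not in drop]
    return U[keep]
```

Repeating this step drives C from C_max down toward C_min, which is the shrinking schedule visible in Figure 3.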
4.3. Real Data. In this experiment, 10 different packages are tested. Each package is represented by 100 frames captured by a camera from different angles, and SIFT feature points, used for training a recognition system, are extracted from each frame. Figure 4 shows some of the images with their SIFT keypoints; the data set comprises 248,150 descriptors. We let m = 2.0, c1 = 1.49, c2 = 1.49, w = 0.72, L = 20, ε = 30, and ρ = 0.01 for the SP-FCM, and choose the range [C_min = 200, C_max = 360] according to the number of package categories and the distribution of keypoints in each image. Eighty iterations of PSO are run for each given C to produce the cluster prototype B and partition matrix U as the starting point for the shadowed sets; longer PSO stabilization is needed to obtain more stable cluster partitions.
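Once the final prototypes are fixed, each image can be described by quantizing its SIFT descriptors against them in the usual bag-of-features manner. The sketch below is our illustration of that downstream use, not the authors' recognition pipeline.

```python
import numpy as np

def bof_histogram(descriptors, prototypes):
    """Assign each local descriptor to its nearest cluster prototype and
    return a normalized bag-of-features histogram for one image.

    descriptors: (n, d) SIFT descriptors; prototypes: (C, d) cluster centers."""
    dists = np.linalg.norm(descriptors[:, None, :] - prototypes[None, :, :],
                           axis=-1)            # (n, C) distance table
    words = dists.argmin(axis=1)               # nearest "visual word" per point
    hist = np.bincount(words, minlength=len(prototypes)).astype(float)
    return hist / hist.sum()
```

Such histograms give every frame a fixed-length representation suitable for training a classifier over the 10 package categories.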
Figure 3: XB validity index of the four yeast gene expression data sets versus the number of clusters. Panels: (a) GDS608, (b) GDS2003, (c) GDS2267, (d) GDS2712; y-axis: Xie-Beni validity.
Table 3: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on the package data set.

  Index       FCM       RCM       SCM       SRCM      SP-FCM
  DB index    1.84569   1.59671   1.43194   1.24038   1.07826
  Dunn index  0.92647   1.16298   1.25376   1.69422   1.67313
Within each cluster, the optimal α_j determines the cardinality and realizes cluster reduction, and the XB index is calculated. Each C-partition is ranked by this index, and the one with the smallest value, indicating the most compact and well-separated clusters, is selected as the final output. At the beginning the cluster number decreases quickly: it takes 26 iterations to reduce it from C = 360 to C = 289 and 20 iterations from C = 289 to C = 267, and the XB index increases at a relatively fast rate once C < 267. Figure 5 shows the XB index for C ∈ [267, 289]; it reaches its minimum at C = 276, meaning that the best partition for this data set has 276 clusters. Table 3 gives a comparative analysis of the convergence effect. As expected, SP-FCM provides sound results on the real data, as assessed by these validity indices.
Figure 4: Ten package images with their SIFT features: (a) 644 features, (b) 460, (c) 742, (d) 442, (e) 313, (f) 602, (g) 373, (h) 724, (i) 539, (j) 124.

Figure 5: XB validity index of the bag data set versus the cluster number C, for C ∈ [267, 289]; y-axis: Xie-Beni validity.

5. Conclusions

This paper presents a modified fuzzy c-means algorithm, SP-FCM, based on particle swarm optimization and shadowed sets, to perform unsupervised feature clustering. SP-FCM utilizes the global search property of PSO and the vagueness balance property of shadowed sets, so that it can estimate the optimal cluster number as it runs through its alternating optimization process. As a randomized approach, SP-FCM alleviates the problems faced by FCM, namely its sensitivity to initialization and its tendency to fall into local minima. Moreover, the algorithm avoids subjective and somewhat arbitrary trial-and-error estimation of the cluster number and instead finds the optimal cluster number within a specific range using cluster validity measures as indicators. The use of the XB validity index allows the algorithm to find the optimum cluster number with cluster partitions that provide compact and well-separated clusters. The experiments show that the SP-FCM algorithm produces good results with respect to the DB and Dunn indices, especially for high-dimensional and large data sets.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research was supported by the National Natural Science Foundation of China (Grant no. 61105089, NSFC).
References
[1] A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[2] H.-P. Kriegel, P. Kröger, and A. Zimek, "Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering," ACM Transactions on Knowledge Discovery from Data, vol. 3, no. 1, article 1, 2009.
[3] P. Melin and O. Castillo, "A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition," Applied Soft Computing Journal, vol. 21, pp. 568–577, 2014.
[4] M. Yuwono, S. W. Su, B. D. Moulton, and H. T. Nguyen, "Data clustering using variants of rapid centroid estimation," IEEE Transactions on Evolutionary Computation, vol. 18, no. 3, pp. 366–377, 2014.
[5] S. C. Satapathy, G. Pradhan, S. Pattnaik, J. V. R. Murthy, and P. V. G. D. P. Reddy, "Performance comparisons of PSO based clustering," InterJRI Computer Science and Networking, vol. 1, no. 1, pp. 18–23, 2009.
[6] G. Peters and R. Weber, "Dynamic clustering with soft computing," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 3, pp. 226–236, 2012.
[7] D. T. Anderson, J. C. Bezdek, M. Popescu, and J. M. Keller, "Comparing fuzzy, probabilistic, and possibilistic partitions," IEEE Transactions on Fuzzy Systems, vol. 18, no. 5, pp. 906–918, 2010.
[8] P. Maji and S. Paul, "Robust rough-fuzzy C-means algorithm: design and applications in coding and non-coding RNA expression data clustering," Fundamenta Informaticae, vol. 124, no. 1-2, pp. 153–174, 2013.
[9] J. Bezdek, Fuzzy mathematics in pattern classification [Ph.D. thesis], Cornell University, Ithaca, NY, USA, 1973.
[10] V. Olman, F. Mao, H. Wu, and Y. Xu, "Parallel clustering algorithm for large data sets with applications in bioinformatics," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 344–352, 2009.
[11] J. C. Bezdek and R. J. Hathaway, "Optimization of fuzzy clustering criteria using genetic algorithms," in Proceedings of the 1st IEEE Conference on Evolutionary Computation, pp. 589–594, June 1994.
[12] T. A. Runkler, "Ant colony optimization of clustering models," International Journal of Intelligent Systems, vol. 20, no. 12, pp. 1233–1251, 2005.
[13] K. S. Al-Sultan and S. Z. Selim, "A global algorithm for the fuzzy clustering problem," Pattern Recognition, vol. 26, no. 9, pp. 1357–1361, 1993.
[14] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, Nagoya, Japan, October 1995.
[15] T. A. Runkler and C. Katz, "Fuzzy clustering by particle swarm optimization," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 601–608, July 2006.
[16] C.-F. Juang, C.-M. Hsiao, and C.-H. Hsu, "Hierarchical cluster-based multispecies particle-swarm optimization for fuzzy-system optimization," IEEE Transactions on Fuzzy Systems, vol. 18, no. 1, pp. 14–26, 2010.
[17] H. Izakian and A. Abraham, "Fuzzy C-means and fuzzy swarm for fuzzy clustering problem," Expert Systems with Applications, vol. 38, no. 3, pp. 1835–1838, 2011.
[18] W. Pedrycz, "Shadowed sets: representing and processing fuzzy sets," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 1, pp. 103–109, 1998.
[19] J. Zhou, W. Pedrycz, and D. Miao, "Shadowed sets in the characterization of rough-fuzzy clustering," Pattern Recognition, vol. 44, no. 8, pp. 1738–1749, 2011.
[20] W. Pedrycz, "From fuzzy sets to shadowed sets: interpretation and computing," International Journal of Intelligent Systems, vol. 24, no. 1, pp. 48–61, 2009.
[21] S. Mitra, W. Pedrycz, and B. Barman, "Shadowed c-means: integrating fuzzy and rough clustering," Pattern Recognition, vol. 43, no. 4, pp. 1282–1291, 2010.
[22] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, vol. 4 of The Handbooks of Fuzzy Sets, Springer, New York, NY, USA, 1999.
[23] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224–227, 1979.
[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
[25] J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 3, pp. 301–315, 1998.
[26] S. C. Satapathy, S. K. Patnaik, C. D. P. Dash, and S. Sahoo, "Data clustering using modified fuzzy-PSO (MFPSO)," in Proceedings of the 5th Multi-Disciplinary International Workshop on Artificial Intelligence, Hyderabad, India, 2011.
[27] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[28] R. Krishnapuram and J. M. Keller, "The possibilistic C-means algorithm: insights and recommendations," IEEE Transactions on Fuzzy Systems, vol. 4, no. 3, pp. 385–393, 1996.
6 Computational Intelligence and Neuroscience
Table 1: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four UCI data sets.

Index        Algorithm   Iris     Wine     Glass    Ionosphere
DB index     FCM         0.7642   0.8803   2.2971   2.0587
             RCM         0.6875   0.5692   1.9635   1.5434
             SCM         0.6862   0.5327   1.8495   1.4763
             SRCM        0.6613   0.4436   1.5804   1.3971
             SP-FCM      0.6574   0.4328   1.5237   1.4066
Dunn index   FCM         2.3106   2.5834   0.1142   0.8381
             RCM         2.7119   2.8157   0.2637   1.0233
             SCM         2.4801   2.7992   0.3150   1.0319
             SRCM        3.0874   3.1342   0.5108   1.1924
             SP-FCM      3.3254   3.1764   0.4921   1.2605
[Figure 2: XB validity index of four UCI data sets versus cluster number C; panels: (a) Iris (C = 2 to 12), (b) Wine (C = 2 to 13), (c) Glass (C = 2 to 14), (d) Ionosphere (C = 2 to 16); y-axis: Xie-Beni validity.]
As usual, the number of clusters is implied by the nature of the problem. Here, with shadowed sets involved, one can anticipate that the optimal number of clusters will be found. The fuzzification coefficient m can be optimized; however, it is common to assume a fixed value of 2.0, which is associated with the form of the membership functions of the generated clusters. For testing the SP-FCM algorithm, the rule C ≤ N^(1/2) is adopted, and the range of the expected cluster number is set as (1) Iris: [Cmin = 2, Cmax = 12]; (2) Wine: [Cmin = 2, Cmax = 13]; (3) Glass: [Cmin = 2, Cmax = 14]; (4) Ionosphere: [Cmin = 2, Cmax = 16]. The swarm size is set as L = 20, the maximum iteration number of PSO is T = 50, and for cluster reduction the cluster cardinality threshold is ε = 10 and the attrition rate is ρ = 0.1. In each cycle we obtain the distribution of every cluster, remove part of the clusters according to their cardinality, and calculate the XB index; the cluster number C varies from Cmax to Cmin. After the loop ends, the partition with the lowest index value is selected as the final result. Figure 2 presents the validity indices in the process of generating the optimal cluster number. Smaller
Table 2: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on four yeast expression data sets.

Index        Algorithm   GDS608   GDS2003   GDS2267   GDS2712
DB index     FCM         2.0861   2.4671    1.5916    1.9526
             RCM         1.6109   2.2104    1.0274    1.2058
             SCM         1.5938   2.1346    0.8946    1.0965
             SRCM        1.3274   1.9523    0.7438    0.7326
             SP-FCM      1.2958   1.8946    0.7962    0.6843
Dunn index   FCM         0.2647   0.2976    0.4208    0.3519
             RCM         0.3789   0.3981    0.7164    0.6074
             SCM         0.3865   0.3775    0.8439    0.6207
             SRCM        0.5126   0.4953    0.9759    0.8113
             SP-FCM      0.5407   0.5026    0.9182    0.8049
values indicate more compact and well-separated clusters. The validity indices reach their minimum values at C = 3, 3, 6, and 2, respectively, which correspond to the final cluster prototypes and the best partitions. Through the shadowed set and PSO approaches, the influence of each boundary region on the formation of the prototypes and the clusters can be properly resolved. Although more computing time is required to run SP-FCM, reasonable results can be acquired when processing overlapping and vague data patterns.
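The Xie-Beni (XB) index used as the selection criterion above is the ratio of fuzzy compactness to prototype separation. The following is a minimal sketch in Python with NumPy; the function name and the array conventions (U stored cluster-by-row) are our own, not the authors' implementation:

```python
import numpy as np

def xie_beni(X, V, U, m=2.0):
    """Xie-Beni validity index: fuzzy compactness / prototype separation.

    X : (n, d) data matrix
    V : (C, d) cluster prototypes
    U : (C, n) fuzzy partition matrix
    m : fuzzification coefficient
    Smaller values indicate more compact, better-separated clusters.
    """
    n = X.shape[0]
    # Compactness: membership-weighted sum of squared distances to prototypes.
    d2 = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) ** 2  # (C, n)
    compactness = np.sum((U ** m) * d2)
    # Separation: minimum squared distance between distinct prototypes.
    vd2 = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2) ** 2
    np.fill_diagonal(vd2, np.inf)
    separation = vd2.min()
    return compactness / (n * separation)
```

A partition with tight clusters around well-spread prototypes yields a small numerator and a large denominator, which is why the minimum over C is taken as the best partition.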
4.2 Yeast Gene Expression Data Set. Four yeast gene expression data sets are used in the experiments, GDS608, GDS2003, GDS2267, and GDS2712, downloaded from the Gene Expression Omnibus. GDS608 contains 26 classes and 6303 samples; GDS2003 contains 23 classes and 5617 samples; GDS2267 contains 14 classes and 9275 samples; and GDS2712 contains 15 classes and 9275 samples. Table 2 presents the validity indices of the different methods after the cluster number C is given. SP-FCM and SRCM obtain comparable results and perform better than the other clustering algorithms. The improvement can be attributed to the fact that the global search capacity of PSO is conducive to finding more appropriate cluster centers while escaping from local optima.
To obtain the optimum C automatically, we let m = 2.0, c1 = 1.49, c2 = 1.49, and w = 0.72, and the rule C ≤ N^(1/2) is adopted. The swarm size is set as L = 20 and the maximum iteration number of PSO is T = 80. For cluster reduction, the range of the expected cluster number, the cluster cardinality threshold ε, and the attrition rate ρ are set as (1) GDS608: [Cmin = 20, Cmax = 80], ε = 20, ρ = 0.05; (2) GDS2003: [Cmin = 20, Cmax = 75], ε = 20, ρ = 0.05; (3) GDS2267: [Cmin = 10, Cmax = 96], ε = 20, ρ = 0.08; (4) GDS2712: [Cmin = 10, Cmax = 96], ε = 20, ρ = 0.08. In each cycle we obtain the distribution of every cluster, remove part of the clusters according to their cardinality, and calculate the XB index; the cluster number C varies from Cmax to Cmin. The partition with the lowest value is selected as the
final result after the loop ends. As seen in Figure 3, for GDS608 the cluster number decreases at a faster rate at the beginning: it takes 26 iterations to reduce the cluster number from C = 80 to C = 30 and 4 iterations from C = 30 to C = 26, and the XB index begins to increase when the cluster number C < 26. For GDS2003, it takes 24 iterations to reduce the cluster number from C = 75 to C = 30 and 7 iterations from C = 30 to C = 23, and the XB index begins to increase when C < 23. For GDS2267, it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 6 iterations from C = 20 to C = 14, and the XB index begins to increase when C < 14. For GDS2712, it takes 23 iterations to reduce the cluster number from C = 96 to C = 20 and 5 iterations from C = 20 to C = 15, and the XB index begins to increase when C < 15. The advantages of fuzzy sets, PSO, and shadowed sets are integrated in SP-FCM, making the algorithm applicable to overlapping partitions and to the uncertainty and vagueness arising from boundary regions. The optimization process in the shadowed sets also makes the method robust to outliers, so that the approximation regions of each cluster can be determined accurately and the obtained prototypes approach the desired locations.
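The reduction cycle applied to all four data sets — run the PSO-refined FCM at the current C, drop low-cardinality clusters using the threshold ε and attrition rate ρ, score each partition with the XB index, and keep the minimum — can be sketched as follows. This is an illustrative Python skeleton under our own conventions, not the authors' code; `fit` and `validity` are caller-supplied placeholders for the PSO-FCM step and the validity index:

```python
import numpy as np

def reduce_clusters(X, c_min, c_max, eps, rho, fit, validity):
    """Sketch of the SP-FCM cluster-reduction cycle.

    fit(X, C)         -> (V, U): prototypes (C, d) and memberships (C, n),
                         standing in for the PSO-refined FCM step
    validity(X, V, U) -> float : cluster validity to minimize (e.g. XB)
    eps               : cluster cardinality threshold
    rho               : attrition rate (fraction of clusters dropped per cycle)
    """
    best_val, best_part = np.inf, None
    C = c_max
    while C >= c_min:
        V, U = fit(X, C)
        val = validity(X, V, U)
        if val < best_val:                      # keep the best partition seen
            best_val, best_part = val, (V, U)
        # Fuzzy cardinality of each cluster: sum of its membership row.
        card = U.sum(axis=1)
        n_weak = int(np.count_nonzero(card < eps))
        # Drop at least the attrition quota, but never go below c_min.
        n_drop = max(n_weak, int(np.ceil(rho * C)))
        n_drop = min(n_drop, C - c_min)
        C -= max(n_drop, 1)                     # always make progress
    return best_val, best_part
```

The loop terminates because C strictly decreases each cycle, matching the paper's sweep from Cmax down to Cmin.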
4.3 Real Data. In this experiment, 10 different packages are tested in total. Each package is represented by 100 frames captured by camera from different angles, and SIFT feature points are extracted from each frame and used for training a recognition system. Figure 4 shows some images with their SIFT keypoints. This data set comprises 248,150 descriptors. We let m = 2.0, c1 = 1.49, c2 = 1.49, w = 0.72, L = 20, ε = 30, and ρ = 0.01 for SP-FCM and choose the reasonable range [Cmin = 200, Cmax = 360] according to the number of package categories and the distribution of keypoints in each image. Eighty iterations of PSO are run on each given C to produce the cluster prototype B and partition matrix U as the starting point for the shadowed sets. Longer PSO stabilization is needed to obtain more stable cluster partitions.
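The PSO settings used throughout (w = 0.72, c1 = c2 = 1.49, swarm size L = 20) correspond to the canonical global-best velocity and position update. A self-contained sketch follows; the particle encoding and function names are our own illustration, not the paper's implementation:

```python
import numpy as np

def pso_step(pos, vel, pbest, pbest_f, fitness, w=0.72, c1=1.49, c2=1.49, rng=None):
    """One global-best PSO update with the coefficients used in the experiments.

    pos, vel : (L, D) particle positions and velocities
    pbest    : (L, D) personal-best positions; pbest_f their fitness values
    fitness  : maps a (D,) position to a scalar to minimize
    """
    if rng is None:
        rng = np.random.default_rng()
    L, D = pos.shape
    gbest = pbest[pbest_f.argmin()]             # current global best
    r1, r2 = rng.random((L, D)), rng.random((L, D))
    # Inertia term plus cognitive (pbest) and social (gbest) attraction.
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([fitness(p) for p in pos])
    improved = f < pbest_f                      # refresh personal bests
    pbest = np.where(improved[:, None], pos, pbest)
    pbest_f = np.where(improved, f, pbest_f)
    return pos, vel, pbest, pbest_f
```

In SP-FCM each particle would encode a candidate set of cluster prototypes (flattened to a vector) and the fitness would be the fuzzy clustering objective; here the update itself is the point being illustrated.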
[Figure 3: XB validity index of four yeast gene expression data sets versus cluster number C; panels: (a) GDS608 and (b) GDS2003 (C = 20 to 30), (c) GDS2267 and (d) GDS2712 (C = 10 to 20); y-axis: Xie-Beni validity.]
Table 3: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on the package data set.

Index        FCM       RCM       SCM       SRCM      SP-FCM
DB index     18.4569   15.9671   14.3194   12.4038   10.7826
Dunn index    9.2647   11.6298   12.5376   16.9422   16.7313
Within each cluster, the optimal α_j decides the cardinality and realizes cluster reduction, and the XB index is calculated. Each C-partition is ranked using this index, and the partition with the smallest index value, indicating the most compact and well-separated clusters, is selected as the final output. At the beginning the cluster number decreases at a faster speed: it takes 26 iterations to reduce the cluster number from C = 360 to C = 289 and 20 iterations from C = 289 to C = 267. The XB index increases at a relatively faster rate when the cluster number C < 267. Figure 5 shows the XB index for C ∈ [267, 289]. The index reaches its minimum value at C = 276, which means that the best partition for this data set has 276 clusters. Table 3 exhibits the comparative analysis of convergence effect. As expected, SP-FCM provides sound results for the real data; the performance is assessed by the validity indices.
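The per-cluster threshold α_j above follows Pedrycz's shadowed-set balance criterion: choose α so that the membership mass reduced to 0 plus the mass elevated to 1 balances the size of the shadow (the uncertain region) left in between. A small illustrative sketch using a simple grid search, not the authors' optimizer:

```python
import numpy as np

def optimal_alpha(u, grid=200):
    """Pedrycz's shadowed-set threshold for one cluster.

    u : 1-D array of fuzzy memberships in [0, 1]
    Searches alpha in (0, 0.5) minimizing
    |reduced mass + elevated mass - shadow cardinality|.
    """
    u = np.asarray(u, dtype=float)
    best_a, best_v = 0.0, np.inf
    for a in np.linspace(1e-6, 0.5 - 1e-6, grid):
        reduced = u[u <= a].sum()                    # mass pushed down to 0
        elevated = (1.0 - u[u >= 1.0 - a]).sum()     # mass pushed up to 1
        shadow = np.count_nonzero((u > a) & (u < 1.0 - a))
        v = abs(reduced + elevated - shadow)
        if v < best_v:
            best_a, best_v = a, v
    return best_a
```

Memberships below α are treated as exclusion, those above 1 − α as full inclusion, and the shadow in between carries the boundary-region uncertainty that drives the cardinality-based reduction.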
5 Conclusions
This paper presents a modified fuzzy c-means algorithm based on particle swarm optimization and shadowed sets to perform unsupervised feature clustering. The algorithm, called SP-FCM, utilizes the global search property of PSO and the vagueness balance property of shadowed sets so that it can estimate the optimal cluster number as it runs through its alternating optimization process. SP-FCM, as a randomized approach, can alleviate the problems faced by FCM, which suffers from sensitivity to initialization and from falling into local minima. Moreover, the algorithm avoids subjective and somewhat arbitrary trials to estimate the appropriate cluster number, and it enhances the capability to find the optimal cluster number within a specific
[Figure 4: Ten package images with SIFT features; detected keypoints per image: (a) 644, (b) 460, (c) 742, (d) 442, (e) 313, (f) 602, (g) 373, (h) 724, (i) 539, (j) 124.]
[Figure 5: XB validity index of the bag data set versus cluster number C (C = 267 to 289); y-axis: Xie-Beni validity.]
range using cluster validity measures as indicators. The use of the XB validity index allows the algorithm to find the optimum cluster number with cluster partitions that provide compact and well-separated clusters. The experiments show that the SP-FCM algorithm produces good results with reference to the DB and Dunn indices, especially for high-dimensional and large data cases.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research was supported by the National Natural Science Foundation of China (Grant no. 61105089, NSFC).
References
[1] A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[2] H.-P. Kriegel, P. Kröger, and A. Zimek, "Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering," ACM Transactions on Knowledge Discovery from Data, vol. 3, no. 1, article 1, 2009.
[3] P. Melin and O. Castillo, "A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition," Applied Soft Computing Journal, vol. 21, pp. 568–577, 2014.
[4] M. Yuwono, S. W. Su, B. D. Moulton, and H. T. Nguyen, "Data clustering using variants of rapid centroid estimation," IEEE Transactions on Evolutionary Computation, vol. 18, no. 3, pp. 366–377, 2014.
[5] S. C. Satapathy, G. Pradhan, S. Pattnaik, J. V. R. Murthy, and P. V. G. D. P. Reddy, "Performance comparisons of PSO based clustering," InterJRI Computer Science and Networking, vol. 1, no. 1, pp. 18–23, 2009.
[6] G. Peters and R. Weber, "Dynamic clustering with soft computing," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 3, pp. 226–236, 2012.
[7] D. T. Anderson, J. C. Bezdek, M. Popescu, and J. M. Keller, "Comparing fuzzy, probabilistic, and possibilistic partitions," IEEE Transactions on Fuzzy Systems, vol. 18, no. 5, pp. 906–918, 2010.
[8] P. Maji and S. Paul, "Robust rough-fuzzy C-means algorithm: design and applications in coding and non-coding RNA expression data clustering," Fundamenta Informaticae, vol. 124, no. 1-2, pp. 153–174, 2013.
[9] J. Bezdek, Fuzzy mathematics in pattern classification [Ph.D. thesis], Cornell University, New York, NY, USA, 1974.
[10] V. Olman, F. Mao, H. Wu, and Y. Xu, "Parallel clustering algorithm for large data sets with applications in bioinformatics," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 344–352, 2009.
[11] J. C. Bezdek and R. J. Hathaway, "Optimization of fuzzy clustering criteria using genetic algorithms," in Proceedings of the 1st IEEE Conference on Evolutionary Computation, pp. 589–594, June 1994.
[12] T. A. Runkler, "Ant colony optimization of clustering models," International Journal of Intelligent Systems, vol. 20, no. 12, pp. 1233–1251, 2005.
[13] K. S. Al-Sultan and S. Z. Selim, "A global algorithm for the fuzzy clustering problem," Pattern Recognition, vol. 26, no. 9, pp. 1357–1361, 1993.
[14] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, Nagoya, Japan, October 1995.
[15] T. A. Runkler and C. Katz, "Fuzzy clustering by particle swarm optimization," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 601–608, July 2006.
[16] C.-F. Juang, C.-M. Hsiao, and C.-H. Hsu, "Hierarchical cluster-based multispecies particle-swarm optimization for fuzzy-system optimization," IEEE Transactions on Fuzzy Systems, vol. 18, no. 1, pp. 14–26, 2010.
[17] H. Izakian and A. Abraham, "Fuzzy C-means and fuzzy swarm for fuzzy clustering problem," Expert Systems with Applications, vol. 38, no. 3, pp. 1835–1838, 2011.
[18] W. Pedrycz, "Shadowed sets: representing and processing fuzzy sets," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 28, no. 1, pp. 103–109, 1998.
[19] J. Zhou, W. Pedrycz, and D. Miao, "Shadowed sets in the characterization of rough-fuzzy clustering," Pattern Recognition, vol. 44, no. 8, pp. 1738–1749, 2011.
[20] W. Pedrycz, "From fuzzy sets to shadowed sets: interpretation and computing," International Journal of Intelligent Systems, vol. 24, no. 1, pp. 48–61, 2009.
[21] S. Mitra, W. Pedrycz, and B. Barman, "Shadowed c-means: integrating fuzzy and rough clustering," Pattern Recognition, vol. 43, no. 4, pp. 1282–1291, 2010.
[22] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, vol. 4 of The Handbooks of Fuzzy Sets, Springer, New York, NY, USA, 1999.
[23] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224–227, 1978.
[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
[25] J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 3, pp. 301–315, 1998.
[26] S. C. Satapathy, S. K. Patnaik, C. D. P. Dash, and S. Sahoo, "Data clustering using modified fuzzy-PSO (MFPSO)," in Proceedings of the 5th Multi-Disciplinary International Workshop on Artificial Intelligence, Hyderabad, India, 2011.
[27] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[28] R. Krishnapuram and J. M. Keller, "The possibilistic C-means algorithm: insights and recommendations," IEEE Transactions on Fuzzy Systems, vol. 4, no. 3, pp. 385–393, 1996.
[25] J C Bezdek and N R Pal ldquoSome new indexes of clustervalidityrdquo IEEE Transactions on Systems Man and CyberneticsPart B Cybernetics vol 28 no 3 pp 301ndash315 1998
[26] S C Satapathy S K Patnaik C D P Dash and S Sahoo ldquoDataclustering using modified fuzzy-PSO (MFPSO)rdquo in Proceedingsof the 5th Multi-Disciplinary Int Workshop on Artificial Intelli-gence Hyderabad India 2011
[27] M K Pakhira S Bandyopadhyay and U Maulik ldquoValidityindex for crisp and fuzzy clustersrdquo Pattern Recognition vol 37no 3 pp 487ndash501 2004
[28] R Krishnapuram and J M Keller ldquoThe possibilistic C-meansalgorithm insights and recommendationsrdquo IEEE Transactionson Fuzzy Systems vol 4 no 3 pp 385ndash393 1996
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience

[Figure 3: XB validity index of four yeast gene expression data sets with cluster number C: (a) GDS608, (b) GDS2003, (c) GDS2267, (d) GDS2712. Panels (a) and (b) span C = 20 to 30; panels (c) and (d) span C = 10 to 20.]
Table 3: Performance of FCM, RCM, SCM, SRCM, and SP-FCM on package data sets.

Index        FCM       RCM       SCM       SRCM      SP-FCM
DB index     1.84569   1.59671   1.43194   1.24038   1.07826
Dunn index   0.92647   1.16298   1.25376   1.69422   1.67313
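For reference, the two validity indices reported in Table 3 can be computed as in the following sketch (a minimal NumPy implementation; the function names are ours, and plain Euclidean distance is assumed since the paper does not state otherwise):

```python
import numpy as np

def davies_bouldin(X, labels, centers):
    """DB index: average, over clusters, of the worst ratio of summed
    within-cluster scatters to between-center distance; lower is better."""
    k = len(centers)
    scatter = np.array([
        np.mean(np.linalg.norm(X[labels == j] - centers[j], axis=1))
        for j in range(k)
    ])
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(k) if j != i
        )
    return total / k

def dunn(X, labels):
    """Dunn index: smallest inter-cluster distance divided by the largest
    cluster diameter; higher is better."""
    ks = np.unique(labels)
    pair = lambda A, B: np.linalg.norm(A[:, None] - B[None], axis=2)
    min_sep = min(pair(X[labels == a], X[labels == b]).min()
                  for a in ks for b in ks if a < b)
    max_diam = max(pair(X[labels == a], X[labels == a]).max() for a in ks)
    return min_sep / max_diam
```

Under this reading, a lower DB value and a higher Dunn value in Table 3 both favor SRCM and SP-FCM over plain FCM.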
Within each cluster, the optimal α_j decides the cardinality and realizes cluster reduction, and the XB index is calculated. Each C-partition is ranked using this index, and the one with the smallest index value, indicating the most compact and well-separated clusters, is selected as the final output. At the beginning the cluster number decreases quickly: it takes 26 iterations to reduce the cluster number from C = 360 to C = 289, and 20 iterations from C = 289 to C = 267. The XB index increases at a relatively faster rate when the cluster number C < 267. Figure 5 shows the XB index for C ∈ [267, 289]. The index reaches its minimum value at C = 276, which means the best partition for this data set has 276 clusters. Table 3 exhibits the comparative analysis of the convergence effect. As expected, SP-FCM provides sound results for the real data; the performance is assessed by these validity indices.
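The Xie-Beni (XB) index used above as the validity criterion is the ratio of fuzzy compactness to separation [24]. A minimal sketch (our own function name; U is assumed to be an n × C membership matrix and m the fuzzifier):

```python
import numpy as np

def xie_beni(X, U, V, m=2.0):
    """Xie-Beni index: fuzzy within-cluster compactness divided by
    n times the minimal squared distance between cluster centers;
    the partition with the smallest value is taken as the best one."""
    n = X.shape[0]
    # squared distances from every sample to every center, shape (n, C)
    d2 = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) ** 2
    compactness = np.sum((U ** m) * d2)
    min_sep = min(np.sum((V[i] - V[k]) ** 2)
                  for i in range(len(V)) for k in range(len(V)) if i != k)
    return compactness / (n * min_sep)
```

Evaluating this index for each candidate C and keeping the minimizer is what selects C = 276 for the bag data set above.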
5 Conclusions

This paper presents a modified fuzzy c-means algorithm based on particle swarm optimization and shadowed sets to perform unsupervised feature clustering. The algorithm, called SP-FCM, utilizes the global search property of PSO and the vagueness balance property of shadowed sets, so that it can estimate the optimal cluster number as it runs through its alternating optimization process. As a randomized approach, SP-FCM can alleviate the problems faced by FCM, namely its sensitivity to initialization and its tendency to fall into local minima. Moreover, the algorithm avoids the subjective and somewhat arbitrary trials otherwise needed to estimate an appropriate cluster number, and instead finds the optimal cluster number within a specific
[Figure 4: Ten package images with SIFT features: (a) 644, (b) 460, (c) 742, (d) 442, (e) 313, (f) 602, (g) 373, (h) 724, (i) 539, and (j) 124 features.]
[Figure 5: XB validity index of the bag data set with cluster number C, for C decreasing from 289 to 267.]
range, using cluster validity measures as indicators. The use of the XB validity index allows the algorithm to find the optimal cluster number, with cluster partitions that provide compact and well-separated clusters. The experiments show that the SP-FCM algorithm produces good results with reference to the DB and Dunn indices, especially in the high-dimensional and large-data cases.
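The "vagueness balance" of shadowed sets invoked above refers to Pedrycz's threshold criterion [18, 20]: for each cluster, a threshold α is chosen so that the membership mass reduced to 0 plus the mass elevated to 1 balances the cardinality of the shadow (the undecided region). A minimal sketch under that reading (the grid search and all names are ours; the paper does not specify how the optimization is carried out):

```python
import numpy as np

def shadowed_threshold(u, grid=None):
    """Pick alpha in (0, 0.5) balancing reduced and elevated membership
    mass against the size of the shadow, per Pedrycz's criterion."""
    u = np.asarray(u, dtype=float)
    if grid is None:
        grid = np.linspace(0.01, 0.49, 97)

    def objective(a):
        reduced = u[u <= a].sum()                      # mass pushed down to 0
        elevated = (1.0 - u[u >= 1.0 - a]).sum()       # mass pushed up to 1
        shadow = np.count_nonzero((u > a) & (u < 1.0 - a))
        return abs(reduced + elevated - shadow)

    return min(grid, key=objective)
```

In SP-FCM each cluster's α_j would be obtained this way from that cluster's membership column, which is what drives the cluster-reduction step described in Section 4.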
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research was supported by the National Natural Science Foundation of China (Grant no. 61105089, NSFC).
References
[1] A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[2] H.-P. Kriegel, P. Kröger, and A. Zimek, "Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering," ACM Transactions on Knowledge Discovery from Data, vol. 3, no. 1, article 1, 2009.
[3] P. Melin and O. Castillo, "A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition," Applied Soft Computing Journal, vol. 21, pp. 568–577, 2014.
[4] M. Yuwono, S. W. Su, B. D. Moulton, and H. T. Nguyen, "Data clustering using variants of rapid centroid estimation," IEEE Transactions on Evolutionary Computation, vol. 18, no. 3, pp. 366–377, 2014.
[5] S. C. Satapathy, G. Pradhan, S. Pattnaik, J. V. R. Murthy, and P. V. G. D. P. Reddy, "Performance comparisons of PSO based clustering," InterJRI Computer Science and Networking, vol. 1, no. 1, pp. 18–23, 2009.
[6] G. Peters and R. Weber, "Dynamic clustering with soft computing," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 3, pp. 226–236, 2012.
[7] D. T. Anderson, J. C. Bezdek, M. Popescu, and J. M. Keller, "Comparing fuzzy, probabilistic, and possibilistic partitions," IEEE Transactions on Fuzzy Systems, vol. 18, no. 5, pp. 906–918, 2010.
[8] P. Maji and S. Paul, "Robust rough-fuzzy C-means algorithm: design and applications in coding and non-coding RNA expression data clustering," Fundamenta Informaticae, vol. 124, no. 1-2, pp. 153–174, 2013.
[9] J. Bezdek, Fuzzy mathematics in pattern classification [Ph.D. thesis], Cornell University, New York, NY, USA, 1974.
[10] V. Olman, F. Mao, H. Wu, and Y. Xu, "Parallel clustering algorithm for large data sets with applications in bioinformatics," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 344–352, 2009.
[11] J. C. Bezdek and R. J. Hathaway, "Optimization of fuzzy clustering criteria using genetic algorithms," in Proceedings of the 1st IEEE Conference on Evolutionary Computation, pp. 589–594, June 1994.
[12] T. A. Runkler, "Ant colony optimization of clustering models," International Journal of Intelligent Systems, vol. 20, no. 12, pp. 1233–1251, 2005.
[13] K. S. Al-Sultan and S. Z. Selim, "A global algorithm for the fuzzy clustering problem," Pattern Recognition, vol. 26, no. 9, pp. 1357–1361, 1993.
[14] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, Nagoya, Japan, October 1995.
[15] T. A. Runkler and C. Katz, "Fuzzy clustering by particle swarm optimization," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 601–608, July 2006.
[16] C.-F. Juang, C.-M. Hsiao, and C.-H. Hsu, "Hierarchical cluster-based multispecies particle-swarm optimization for fuzzy-system optimization," IEEE Transactions on Fuzzy Systems, vol. 18, no. 1, pp. 14–26, 2010.
[17] H. Izakian and A. Abraham, "Fuzzy C-means and fuzzy swarm for fuzzy clustering problem," Expert Systems with Applications, vol. 38, no. 3, pp. 1835–1838, 2011.
[18] W. Pedrycz, "Shadowed sets: representing and processing fuzzy sets," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 1, pp. 103–109, 1998.
[19] J. Zhou, W. Pedrycz, and D. Miao, "Shadowed sets in the characterization of rough-fuzzy clustering," Pattern Recognition, vol. 44, no. 8, pp. 1738–1749, 2011.
[20] W. Pedrycz, "From fuzzy sets to shadowed sets: interpretation and computing," International Journal of Intelligent Systems, vol. 24, no. 1, pp. 48–61, 2009.
[21] S. Mitra, W. Pedrycz, and B. Barman, "Shadowed c-means: integrating fuzzy and rough clustering," Pattern Recognition, vol. 43, no. 4, pp. 1282–1291, 2010.
[22] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, vol. 4 of The Handbooks of Fuzzy Sets, Springer, New York, NY, USA, 1999.
[23] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224–227, 1979.
[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
[25] J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 28, no. 3, pp. 301–315, 1998.
[26] S. C. Satapathy, S. K. Patnaik, C. D. P. Dash, and S. Sahoo, "Data clustering using modified fuzzy-PSO (MFPSO)," in Proceedings of the 5th Multi-Disciplinary International Workshop on Artificial Intelligence, Hyderabad, India, 2011.
[27] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[28] R. Krishnapuram and J. M. Keller, "The possibilistic C-means algorithm: insights and recommendations," IEEE Transactions on Fuzzy Systems, vol. 4, no. 3, pp. 385–393, 1996.