International Advances in Engineering and Technology (IAET)
ISSN: 2305-8285 Vol.22 2014
www.scholarism.net International Scientific Researchers (ISR)
Presenting a method for Clustering Using Cuckoo Optimization
Algorithm
Samane Asadi1, Vahid Rafe
1- Department of Computer Engineering, Arak Branch, Islamic Azad University, Arak,
Iran
Abstract-With the growth of data and of computers' computing and storage power,
many researchers have turned their attention to the patterns and relations within these data.
Data mining of a large data collection can be done through different techniques, among
which clustering is one of the most important. Clustering is an unsupervised learning process
whose main objective is to organize data into clusters such that the similarity between the
data within a cluster is maximized and the similarity between the data of different clusters
is minimized. As the fields in which clustering is applied expand, the need for an efficient
clustering method keeps growing. Many algorithms have been presented to this end. These
algorithms face problems, and they require the number of clusters to be determined before
they are applied. The Cuckoo Optimization Algorithm, one of the newest evolutionary
algorithms, with a high level of accuracy in solving different problems and reaching the global
optimum, is used in the present study. Samples from UCI databases were used to validate the
suggested method, and the results of its implementation were compared with those of
well-known algorithms such as K-means. The accuracy and convergence speed of the
suggested algorithm compared favorably with the results of the other algorithms.
Keywords: Data mining, K-means, Cuckoo Optimization Algorithm
1. Introduction
Today, most organizations collect and store data rapidly, yet despite the
massive volume of data they lack the knowledge needed for decision making.
Therefore, data mining, i.e. the automated analysis of data to overcome
deficiencies in decision making and to extract the information and knowledge
hidden in the data, is an obvious necessity [1].
Clustering is a data-mining technique and an unsupervised learning process that
is applicable to many fields of study, including medical data analysis, biology,
detection of abnormal cases, marketing, etc. [2-10]. Generally, clustering
algorithms are divided into two types: hierarchical and partitional. Based on
how the structure is generated, hierarchical clustering approaches are
categorized as divisive or agglomerative. In the divisive approach, also known
as the top-down approach, all the data are initially placed in one cluster; then,
through an iterative process, the data with the least similarity are split into
separate clusters, and this continues until a specific number of clusters is
attained. The agglomerative, or bottom-up, approach works in the opposite way:
each piece of data is initially considered a separate cluster, which is combined
with the most similar clusters through an iterative process until one cluster or
a certain number of clusters is formed [11-13].
Partitional clustering algorithms divide the data into clusters in a single
level, which lets them work with a wide range of data, and they optimize a
predetermined cost function. Most partitional clustering algorithms are
center-based. One of the most important and most widely used partitional
center-based algorithms is K-means. Despite the efforts made to remedy the
deficiencies of the K-means algorithm, the optimal solution is not necessarily
achieved [11]. So far, many algorithms have been presented to improve K-means
and clustering; they are reviewed in section 2.
To overcome the deficiencies of K-means, the present study uses a hybrid
algorithm called COA-KM, which is based on the Cuckoo Optimization Algorithm
(COA) and solves the problem by determining the cuckoos' egg laying and
habitats [14]. To take advantage of the convergence speed of K-means, the
suggested algorithm uses K-means to determine a cuckoo's egg laying site.
COA-KM has proved to overcome the deficiencies of other evolutionary
algorithms and to reach a proper level of accuracy. Section 2 of this study
reviews the past literature on clustering approaches. The clustering problem is
discussed in section 3 and the basic concepts of the Cuckoo Optimization
Algorithm in section 4. The suggested algorithm is explained in section 5.
Finally, the result of the
comparison between COA-KM and other algorithms is presented along with the
conclusion.
2. Literature Review
In [15], a Genetic Algorithm with a new hybrid operator was used, which performs
clustering by exchanging neighboring centers for K-means. In [16], the Genetic
Algorithm was combined with K-means to introduce an algorithm called GAK.
In [17], clustering was done based on the Tabu Search algorithm. In [18],
neighborhood functions and operators that improve the Tabu Search algorithm
were used to present a different clustering approach. In [19], the Simulated
Annealing algorithm was used, and in [20], Simulated Annealing was combined
with K-means to present a new hybrid algorithm, SAKM; both works addressed
clustering deficiencies and verified the efficiency of the suggested method on
the available databases. In [21], Ant Colony Optimization (ACO) was applied to
clustering. In [22], the social life of honey bees was simulated and clustering
was done according to Honey-Bees Mating Optimization (HBMO). Particle Swarm
Optimization (PSO) was applied in [23], and the hybrid algorithm GSA-KM, based
on the Gravitational Search Algorithm, was used for clustering in [24].
3. Clustering Problem
Clustering is an NP-complete problem that aims to find clusters in such a way
that a defined dissimilarity (distance) measure is minimized. Therefore,
clustering is a common optimization problem, and as a result it requires a mathematical
expression like other optimization problems. Let S = {X_1, X_2, …, X_n} contain
all the points that should be clustered, let K be the number of clusters, and let
{C_1, C_2, …, C_K} denote the clusters, each represented by its center. The
following conditions must then hold (Al-Sultan, 1995) [17]:
Each cluster should contain at least one data point:
$C_i \neq \emptyset, \quad i = 1, \dots, K$ (1)
Different clusters should not have any data in common:
$C_i \cap C_j = \emptyset, \quad i, j = 1, \dots, K, \; i \neq j$ (2)
Each data point should be assigned to one cluster, i.e. the union of all
clusters should be equal to the initial data set after the assignment:
$\bigcup_{i=1}^{K} C_i = S$ (3)
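The three conditions above can be checked mechanically. The following is a minimal sketch, assuming each cluster is given as a set of data indices; the function name `is_valid_partition` is hypothetical and not part of the original paper.

```python
def is_valid_partition(clusters, data):
    """Check conditions (1)-(3) for a list of clusters over hashable data items."""
    # (1) every cluster contains at least one item
    if any(len(c) == 0 for c in clusters):
        return False
    # (2) clusters are pairwise disjoint: the sizes add up to the size of the union
    union = set().union(*clusters)
    if sum(len(set(c)) for c in clusters) != len(union):
        return False
    # (3) the union of all clusters equals the full data set
    return union == set(data)
```

For example, `is_valid_partition([{0, 1}, {2}], range(3))` holds, while overlapping or empty clusters are rejected.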
The main objective of this paper is to perform partitional center-based
clustering. In this kind of algorithm, the clusters are produced so as to
optimize a predetermined cost function. The most widely used cost function for
these techniques is the standard squared error, which performs very well with
dense and isolated clusters. This criterion is defined in equation 4 [11,19].
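$J = \sum_{j=1}^{K} \sum_{i=1}^{n_j} \left\| X_i^{(j)} - C_j \right\|^2$ (4)
In Equation 4, $\|\cdot\|$ stands for the distance measure, $X_i^{(j)}$ is the
i-th of the $n_j$ data points assigned to the j-th cluster, and $C_j$ is the
j-th cluster center.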
K-means uses a simple procedure to assign a set of data to a pre-specified
number of clusters, say k, in order to perform partitional center-based
clustering. The main idea is to define k centers, one for each cluster. These
centers must be chosen carefully because the results depend on them; the more
distant the centers are from each other, the better. The next step is to assign
each data point to the closest center. Once all the data have been assigned,
the first step is complete and an initial clustering is done; k new centers are
then computed from the clusters of the previous step. The data are assigned to
the closest centers again after the k new centers are defined, and the process
repeats until the centers no longer change [11].
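The K-means procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: it assumes Euclidean distance, picks the initial centers as random data points, and the names `kmeans`, `iters`, and `seed` are hypothetical.

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: initial centers are random data points
    labels = []
    for _ in range(iters):
        # assignment step: each point goes to its closest center
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # update step: the new center is the mean of the cluster members
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # an empty cluster keeps its old center
        if new_centers == centers:
            break  # centers stopped moving, so the assignment is stable
        centers = new_centers
    return centers, labels
```

Because the result depends on the initial centers, different seeds can end in different local optima, which is the deficiency the hybrid approach is meant to address.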
Note that the main objective of this paper is to perform center-based
clustering, i.e. to find the cluster centers such that the objective function
is minimized.
4. Cuckoo Optimization Algorithm
A brief description of Cuckoo Optimization Algorithm is presented in this
section.
Cuckoo Optimization Algorithm (COA) is an evolutionary technique that was
introduced in [14]. This algorithm is inspired by the lifestyle of a bird called
the cuckoo. This bird does not make a nest for itself; it uses the nests of
other birds for laying eggs. The cuckoo has a strong ability to produce eggs
that resemble those of the host bird. If the host bird discovers eggs that are
not its own, it either throws them away or leaves the nest and builds a nest
elsewhere. Cuckoo eggs are larger than the host bird's eggs, and the cuckoo
chick hatches sooner. The cuckoo chick then throws the host bird's eggs out of
the nest, or demands so much food that the other chicks die of hunger. When the
cuckoo chick grows and becomes a mature bird, it instinctively continues its
mother's way of life.
4.1 Generating initial cuckoo habitat
In order to solve an optimization problem, it's necessary that the values of
the problem variables be formed as an array. In GA terminology this array is
called a "chromosome", but in COA it is called a "habitat" [14]. To start the
optimization algorithm, a candidate habitat matrix is generated. Then a
randomly produced number of eggs is assumed for each of these initial cuckoo
habitats. In nature, each cuckoo lays between 5 and 20 eggs. These values are
used as the lower and upper bounds of the number of eggs assigned to each
cuckoo at different iterations. Another habit of real cuckoos is that they lay
eggs within a maximum distance from their habitat. This maximum range is called
the "Egg Laying Radius (ELR)". Each cuckoo has an egg laying radius (ELR) that
depends on the total number of eggs, the number of the current cuckoo's eggs,
and the variable limits var_hi and var_low [14]. So ELR is defined as:
$ELR = \alpha \times \frac{\text{Number of current cuckoo's eggs}}{\text{Total number of eggs}} \times (\mathrm{var}_{hi} - \mathrm{var}_{low})$ (5)
where $\alpha$ is an integer, supposed to handle the maximum value of ELR.
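As a small worked illustration of the ELR definition, the sketch below computes the radius for one cuckoo. The function and parameter names are hypothetical; `alpha` stands for the integer factor that bounds the maximum ELR.

```python
def egg_laying_radius(alpha, eggs_of_cuckoo, total_eggs, var_hi, var_low):
    """ELR = alpha * (eggs of this cuckoo / total eggs) * (var_hi - var_low)."""
    if total_eggs <= 0:
        raise ValueError("total_eggs must be positive")
    return alpha * eggs_of_cuckoo / total_eggs * (var_hi - var_low)
```

With `alpha = 1`, a cuckoo holding 5 of 20 eggs in a variable range of 0 to 10 gets a radius of 2.5, so cuckoos with a larger share of the eggs search a wider neighborhood.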
4.2 Immigration of cuckoos
When young cuckoos grow and become mature, they live in their own area and
form a society for some time. But when the egg laying season approaches, they
move to new habitats where the host eggs are most similar to theirs and where
there is more food for the new young birds. After the cuckoo groups are formed
in different areas, the society with the highest fitness value is selected as
the goal point, and the other cuckoos move toward that point.
When mature cuckoos live all over the environment, it is difficult to identify
which group each cuckoo belongs to. Once the cuckoo groups are identified, their
mean benefit value is calculated. The group with the maximum mean benefit is
the goal group, and consequently that group's best habitat is the new
destination habitat for the moving cuckoos.
When moving toward the goal point, the cuckoos do not fly all the way to the
destination habitat. They fly only part of the way and also deviate from the
direct path.
Figure 1 shows the pseudocode of the Cuckoo Optimization Algorithm [14].
1. Initialize cuckoo habitats with random points
2. Define ELR for each cuckoo
3. Let cuckoos lay eggs inside their corresponding ELR
4. Kill those eggs that are identified by host birds
5. Eggs hatch and chicks grow
6. Evaluate the habitat of each newly grown cuckoo
7. Limit cuckoos' maximum number in environment and kill those that live in
worst habitats
8. Cuckoos find best group and select goal habitat
9. Let new cuckoo population move toward goal habitat
10. If stop condition is satisfied end, if not go to 2
Fig. 1: Pseudocode of the Cuckoo Optimization Algorithm
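The partial flight toward the goal habitat described in Section 4.2 can be sketched as below. This is a simplified illustration under stated assumptions: the fraction of the distance flown is drawn uniformly from [0, 1), and the deviation is modeled as a small per-coordinate perturbation scaled by the hypothetical parameter `max_deviation`. The text does not specify how the deviation is generated.

```python
import random

def migrate(habitat, goal, max_deviation=0.1, rng=random):
    """Move a habitat part of the way toward the goal, with a random deviation."""
    step = rng.random()  # fraction of the distance actually flown, in [0, 1)
    return [
        # partial move toward the goal plus an assumed coordinate-wise deviation
        h + step * (g - h) + rng.uniform(-max_deviation, max_deviation) * (g - h)
        for h, g in zip(habitat, goal)
    ]
```

With `max_deviation=0.0` the new habitat lies on the straight segment between the old habitat and the goal.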
5. The Suggested COA-KM Algorithm
The Cuckoo Optimization Algorithm plays the main role in the COA-KM algorithm.
The K-means algorithm is applied for a better and faster search of the solution
space.
Step 1: generating the cuckoos' initial population
The initial habitats and the number of eggs per cuckoo are randomly initialized
based on Eq. (6) and Eq. (7).
$\mathrm{Habitat} = [H_1, H_2, \dots, H_{N_{pop}}]^{T}$
$H_i = \mathrm{Habitat}_i = [\mathrm{Center}_1, \mathrm{Center}_2, \dots, \mathrm{Center}_k], \quad i = 1, 2, \dots, N_{pop}$
$\mathrm{Center}_j = [x_1, x_2, \dots, x_d], \quad j = 1, 2, \dots, k$ (6)
$x_j = \mathrm{Rand}(\cdot) \times (x_j^{\max} - x_j^{\min}) + x_j^{\min}$
In Eq. (6), $H_i$ stands for the i-th cuckoo's habitat and $\mathrm{Center}_j$ is the j-th
cluster center of the i-th cuckoo. $N_{pop}$, d, and k are respectively the
population size, the dimension of the problem, and the number of clusters.
$x_j^{\max}$ and $x_j^{\min}$ are the upper and lower bounds of the points
belonging to the j-th cluster.
$\mathrm{Number\ of\ Eggs} = [NE_1, NE_2, \dots, NE_{N_{pop}}]^{T}$ (7)
$NE_i = \lceil \mathrm{Rand}(\cdot) \times (E_{\max} - E_{\min}) \rceil + E_{\min}, \quad E_{\min} \le NE_i \le E_{\max}, \quad i = 1, 2, \dots, N_{pop}$
$E_{\max}$ and $E_{\min}$ indicate the maximum and minimum numbers of each
cuckoo's eggs. $H_1$ is equal to the cluster centers generated in step 1.
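The random initialization of Step 1 can be sketched as follows. This is a minimal illustration, assuming the bounds are supplied per feature as the lists `lower` and `upper`; all function and parameter names are hypothetical.

```python
import math
import random

def init_population(n_pop, k, lower, upper, e_min, e_max, rng=random):
    """Random habitats (k centers each) and a random egg count per cuckoo."""
    habitats = [
        # each habitat is a list of k centers; each center is a list of d coordinates
        [[rng.random() * (hi - lo) + lo for lo, hi in zip(lower, upper)] for _ in range(k)]
        for _ in range(n_pop)
    ]
    # egg count: ceil(rand * (e_max - e_min)) + e_min, which stays within [e_min, e_max]
    eggs = [math.ceil(rng.random() * (e_max - e_min)) + e_min for _ in range(n_pop)]
    return habitats, eggs
```

For instance, `init_population(5, 3, lower, upper, 2, 12)` would mirror the settings of 5 initial cuckoos and 2 to 12 eggs reported in the experiments for a three-cluster dataset.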
Step 2: evaluation of cuckoos' cost function
Suppose that we have N sample feature vectors. The cost function is
evaluated for each habitat as follows:
Step 2-1: m=1 and Objec=0.
Step 2-2: select the m-th sample.
Step 2-3: calculate the distances between the m-th sample and Center_i
(i=1,…,K).
Step 2-4: add the minimum distance calculated in Step 2-3 to the value of Objec:
$\mathrm{Objec} = \mathrm{Objec} + \min_{i=1,\dots,K} \left| \mathrm{Center}_i - \mathrm{Sample}_m \right|$ (8)
where $\mathrm{Sample}_m$ denotes the m-th sample.
Step 2-5: if all samples have been selected, go to the next step; otherwise
set m=m+1 and return to Step 2-2.
Step 2-6: Cost(X) = Objec.
The cost function is calculated mathematically as below:
$\mathrm{Cost}(X) = \sum_{m=1}^{N} \min_{i=1,\dots,K} \left| \mathrm{Center}_i - \mathrm{Sample}_m \right|$, N = number of input data (9)
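The cost evaluation of Step 2 amounts to summing, over all samples, the distance to the nearest center. A minimal sketch follows, assuming Euclidean distance (the text only says "distance"); the function name `cost` is hypothetical.

```python
import math

def cost(centers, samples):
    """Sum over all samples of the distance to the nearest cluster center."""
    objec = 0.0
    for sample in samples:
        # Steps 2-3 and 2-4: distance to every center, keep the minimum
        objec += min(math.dist(sample, c) for c in centers)
    return objec
```

For example, with centers at (0, 0) and (10, 0) and samples at (1, 0), (9, 0), and (0, 0), the cost is 1 + 1 + 0 = 2.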
Step 3: initialization of part of the cuckoo population using the K-means
algorithm
The cuckoo with the maximum number of eggs (NE) is identified, and its eggs are
initialized using K-means. The cost function of the cuckoo and of the eggs is
computed. If the cost of an egg is lower than that of the cuckoo, the egg
replaces the initial cuckoo, which is considered an egg from then on.
Step 4: laying eggs in host birds' nests
The egg laying radius of each cuckoo is computed based on equation 5. Eggs are
laid randomly within a circular area of that radius. Then the objective
function of each egg is calculated; the 10% of the eggs with the worst cost
function values are identified and destroyed by the host birds.
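Step 4 can be sketched as below. This is a simplified illustration under stated assumptions: a habitat is treated as a flat list of coordinates, each coordinate is perturbed independently by at most the ELR (a box rather than an exact circle), and the function names are hypothetical.

```python
import math
import random

def lay_eggs(habitat, n_eggs, elr, rng=random):
    """Create n_eggs candidate habitats, each coordinate moved by at most elr."""
    return [[x + rng.uniform(-elr, elr) for x in habitat] for _ in range(n_eggs)]

def discard_worst(eggs, cost_fn, fraction=0.10):
    """Drop the given fraction of eggs with the highest cost (lower cost is better)."""
    n_drop = math.floor(len(eggs) * fraction)
    ranked = sorted(eggs, key=cost_fn)
    return ranked[: len(ranked) - n_drop]
```

Here `cost_fn` would be the clustering cost of Step 2 applied to the centers encoded in each egg.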
Step 5: cuckoo immigration
After the eggs hatch and the chicks turn into adult cuckoos, the best cuckoo,
Gbest, is identified. The other cuckoos start migrating toward this cuckoo as
explained in section 4. If α is greater than 0.01, its value is reduced.
Step 6: elimination of the cuckoos in the worst habitats
If the total number of cuckoos exceeds the maximum permitted number, the
cuckoos in the worst habitats, i.e. those with the worst cost function values,
are eliminated.
Step 7: if the stop condition is satisfied, the algorithm stops.
Otherwise, the egg laying radius is recomputed according to Eq. (5) and the
algorithm continues from Step 4.
6. Experimental Results
Three real datasets are used to validate our proposed algorithm: Iris, Breast
Cancer Wisconsin (Cancer), and Contraceptive Method Choice (CMC). Each dataset
has a different number of clusters, data objects, and features [21,24]. These
datasets have been used in the literature to compare and evaluate the
performance of clustering algorithms and are described as follows:
Iris dataset (n = 150, d = 4, k = 3): This dataset contains three classes of 50
objects each, where each class refers to a type of iris flower. There are 150
random samples of iris flowers with four numeric attributes in this dataset.
These attributes are sepal length and width in cm, and petal length and width
in cm. There are no missing values for attributes [21,24].
Breast Cancer Wisconsin (n = 683, d = 9, k = 2): This dataset contains 683
objects characterized by nine features: clump thickness, cell size uniformity,
cell shape uniformity, marginal adhesion, single epithelial cell size, bare
nuclei, bland chromatin, normal nucleoli, and mitoses. There are two clusters
in the data: benign (444 objects) and malignant (239 objects) [21,24].
Contraceptive Method Choice, also denoted as CMC (n = 1473, d = 10, k = 3): This
dataset is a subset of the 1987 National Indonesia Contraceptive Prevalence
Survey. The samples are married women who either were not pregnant or did
not know if they were at the time of interview. The problem is to predict the
choice of current contraceptive method (no use has 629 objects, long-term
methods have 334 objects, and short-term methods have 510 objects) of a
woman based on her demographic and socioeconomic characteristics [21,24].
To implement the COA-KM algorithm, the initial number of cuckoos was set to 5
and the maximum permitted number to 15, with a maximum of 12 and a minimum of
2 eggs per cuckoo. The algorithm was stopped after 500 iterations. The
algorithms were implemented in MATLAB on a computer with 2 GB of RAM. The
results of algorithms such as K-means, ACO, and HBMO are compared in Table 1,
Table 2, and Table 3. According to the results of the implementation, the
suggested algorithm shows a better level of clustering accuracy and of
convergence to the optimal solution than the other algorithms. For instance,
the solution of COA-KM on the Iris data was always 96.6554, with a standard
deviation of zero, while the best values obtained by K-means, HBMO, and ACO
are respectively 97.333, 96.752, and 97.100. The results of the different
algorithms on the Iris data are shown in Table 1. The results on the Cancer
data, shown in Table 2, also indicate the superiority of the suggested
algorithm over the other algorithms. The difference between the best and worst
solutions of the suggested algorithm is small; the worst solution is equal to
2965.88, which is still better than the best solution of most of the other
algorithms. The results of COA-KM on the CMC data are shown in Table 3. They
indicate the superiority of COA-KM over many evolutionary algorithms.
Table 1: Results obtained by the algorithms on Iris data (Best, Average, and Worst are cost function values)
Method     Best        Average     Worst       Standard Deviation
COA-KM     96.6554     96.6554     96.6554     0
COA        96.6562     96.6583     96.6636     0.00165
k-means    97.33       106.05      120.45      14.6311
GSA        96.698      96.723      76.764      0.0123
GSA-KM     96.679      96.689      96.705      0.0076
SA         97.4573     99.957      102.01      2.018
TS         97.3659     97.8680     98.5694     0.53
GA         113.9865    125.1970    139.7782    14.563
ACO        97.1007     97.1715     97.8084     0.367
HBMO       96.7520     96.9531     97.7576     0.531
Table 2: Results obtained by the algorithms on Cancer data (Best, Average, and Worst are cost function values)
Method     Best        Average     Worst       Standard Deviation
COA-KM     2964.51     2964.88     2965.88     0.4575
COA        2965.27     2966.88     2967.45     1.6785
k-means    2999.19     3251.21     3521.59     251.14
GSA        2967.96     2973.58     2990.83     8.1731
GSA-KM     2965.14     2965.21     2965.30     0.0670
SA         2993.45     3239.17     3421.95     230.192
TS         2982.84     3251.37     3434.16     232.217
GA         2999.32     3249.46     3427.43     229.734
ACO        2970.49     3046.06     3242.01     90.500
HBMO       2989.94     3112.42     3210.78     103.471
Table 3: Results obtained by the algorithms on CMC data (Best, Average, and Worst are cost function values)
Method     Best         Average      Worst        Standard Deviation
COA-KM     5694.1230    5694.9270    5695.7309    0.70586
COA        5696.936     5699.87      5704.5       3.0007
k-means    5842.43      5893.60      5934.50      47.16
GSA        5698.15      5699.84      5702.09      1.724
GSA-KM     5697.03      5697.36      5697.87      0.2717
SA         5894.0380    5893.4823    5966.9470    50.867200
TS         5885.0621    5993.5942    5999.8053    40.84568
GA         5705.6301    5756.5984    5812.6480    50.3694
ACO        5701.9230    5819.1347    5912.4300    45.634700
HBMO       5699.2670    5713.9800    5725.3500    12.6900
7. Conclusion
Comparing the results of COA-KM with those of the original algorithms
indicates that COA-KM resolves the problems and deficiencies of the original
algorithms. The K-means algorithm, for example, converges quickly, but it
converges to a local optimum. To take advantage of the high convergence speed
of K-means, part of the population of the suggested hybrid algorithm was
initialized using K-means. As a result, COA-KM starts the search of the
solution space from a more suitable region, the convergence speed improves, and
the standard deviation is reduced. Furthermore, the cuckoos' egg laying radius
is reduced as the algorithm iterates, so fewer random changes are made in the
solution space of the suggested algorithm. Applying these changes to the Cuckoo
algorithm and using K-means give COA-KM a proper level of clustering accuracy.
References
[1] Sh.H. Liao, P.H. Chu, P.Y. Hsiao, Data Mining Techniques And Applications – A Decade
Review From 2000 To 2011, Expert Systems with Applications. 39, (2012) 11303–
11311.
[2] Y. Xia, D. Feng, T. Wang, R. Zhao, Y. Zhang, Image Segmentation by Clustering of
Spatial Patterns, Pattern Recognition Letters. 28, (2007) 1548–1555.
[3] S. Bandyopadhyay, U. Maulik, Genetic Clustering For Automatic Evolution Of Clusters
And Application to Image Classification, Pattern Recognition. 35, (2002) 1197–
1208.
[4] L. Liao, T. Lin, B. Li, MRI Brain Image Segmentation and Bias Field Correction Based
on Fast Spatially Constrained Kernel Clustering Approach, Pattern Recognition
Letters. 29, (2008) 1580–1588.
[5] H. Tang, T. Li, T. Qiu, Y. Park, Segmentation of Heart Sounds Based on Dynamic
Clustering, Biomedical Signal Processing And Control. 7, (2012) 509–516.
[6] M. Ceccarelli, A. Maratea, Improving Fuzzy Clustering of Biological Data by Metric
Learning With Side Information, International Journal of Approximate Reasoning.
47, (2008) 45–57.
[7] N. Wu, J. Zhang, Factor-Analysis Based Anomaly Detection and Clustering, Decision
Support Systems. 42, (2006) 375–389.
[8] S. Hyun Oh, W. Suk Lee, An Anomaly Intrusion Detection Method by Clustering Normal
User Behavior, Computers & Security. 22, (2003) 596–612.
[9] I. Mahdavi, N. Cho, B. Shirazi, N. Sahebjamnia, Designing Evolving User Profile In E-
CRM With Dynamic Clustering of Web Documents, Data & Knowledge
Engineering. 65, (2008) 355–372.
[10] D. Vicari, M. Alfó, Model Based Clustering of Customer Choice Data, Computational
Statistics & Data Analysis. 71, (2014) 3–13.
[11] A. Jain, Data Clustering: 50 Years Beyond K-Means, Pattern Recognition
Letters. 31, (2010) 651–666.
[12] S. Paterlini, Th. Krink, Differential Evolution and Particle Swarm Optimisation In
Partitional Clustering, Computational Statistics & Data Analysis. 50, (2006) 1220 –
1247.
[13] J. Wu, H. Xiong, J. Chen, Towards Understanding Hierarchical Clustering: A Data
Distribution Perspective, Neuro Computing. 72, (2009) 2319–2330.
[14] R. Rajabioun, Cuckoo Optimization Algorithm, Applied Soft Computing. 11, (2011)
5508–5518.
[15] M. Laszlo, S. Mukherjee, A Genetic Algorithm That Exchanges Neighboring Centers
For K-Means Clustering, Pattern Recognition Letters. 28, (2007) 2359–2366.
[16] K. Krishna, M. Murty, Genetic K-Means Algorithm, IEEE Transactions on Systems,
Man, And Cybernetics B, Cybernetics. 29, (1999).
[17] Kh. Al-Sultan, A Tabu Search Approach to the Clustering Problem, Pattern
Recognition. 28, (1995) 1443–1451.
[18] Y. Liu, Zh. Yi, H. Wu, M. Ye, K. Chen, A Tabu Search Approach For the Minimum
Sum-Of-Squares Clustering Problem, Information Sciences. 178, (2008) 2680–2704.
[19] SH. Selim, K. Alsultan, A Simulated Annealing Algorithm For the Clustering Problem,
Pattern Recognition. 24, (1991) 1003-1008.
[20] S. Bandyopadhyay, U. Maulik, M.K. Pakhira, Clustering Using Simulated Annealing
With Probabilistic Redistribution, International Journal of Pattern Recognition And
Artificial Intelligence. 15, (2001) 269-285.
[21] P.S. Shelokar, V.K. Jayaraman, B.D. Kulkarni, An Ant Colony Approach For
Clustering, Analytica Chimica Acta. 509, (2004) 187–195.
[22] D. Karaboga, C. Ozturk, A Novel Clustering Approach: Artificial Bee Colony
(ABC) Algorithm, Applied Soft Computing. 11, (2011) 652–657.
[23] Y.T. Kao, E. Zahara, I.W. Kao, A Hybridized Approach to Data Clustering, Expert
Systems with Applications. 34, (2008) 1754–1762.
[24] A. Hatamlou, A. Salwani, H. Nezamabadi-pour, A combined approach for clustering
based on K-means and gravitational search algorithms, Swarm and Evolutionary
Computation. 6, (2011) 47-52.