An Evolutionary Classifier Based on Adaptive Resonance Theory Network II and Genetic Algorithm
I-En Liao1, Shu-Ling Shieh1,2,*, Hui-Ching Chen1
1 Department of Computer Science and Engineering, National Chung-Hsing University, Taichung, Taiwan.
2 Department of Information Management, Ling-Tung University, Taichung, Taiwan.
* [email protected]
Abstract
Adaptive Resonance Theory Network II (ART2) is a neural network for unsupervised learning. It has been shown that ART2 is suitable for clustering problems that require on-line learning of large-scale and evolving databases. However, when applied to classification problems, ART2 suffers from deficiencies in the interpretation of class labels and in its sensitivity to the order of the input data. This study proposes a novel evolutionary classifier based on Adaptive Resonance Theory Network II and a genetic algorithm. In the proposed classifier, ART2 is first used to generate the weights between attributes and clusters. In the second stage, a genetic algorithm is employed to generate class labels for the input data. The performance of the proposed algorithm is evaluated using the Hayes dataset from the UCI Machine Learning Repository. The experimental results show that the proposed classifier is as good as the well-known C5.0 classifier in terms of accuracy.
1. Introduction
Several unsupervised neural-network architectures based on the Adaptive Resonance Theory (ART) network have been proposed [1][2][4][5][6]. ART is well suited to on-line learning and self-organization. ART1 is an ART model that can only handle binary input data of zeroes and ones. ART2 is an improvement of ART1 that supports the analysis of non-binary input data.
ART2 has been shown to perform well on clustering problems. Several researchers [7][8][9] have proposed supervised ART networks for solving classification problems and have demonstrated good results.
This research proposes an evolutionary classifier
based on ART2 and genetic algorithms. The proposed
classifier does not have the drawbacks of using ART2
alone as a classifier. ART2 is used first for generating
the weights between attributes and clusters. Then a
genetic algorithm is employed to generate class labels
of input data. The performance of the proposed algorithm is evaluated using the Hayes dataset from the UCI Machine Learning Repository. The experimental results show that the proposed classifier is as good as the well-known C5.0 classifier in terms of accuracy.
The remaining sections of this paper are organized
as follows: the next section briefly presents related
work concerning Adaptive Resonance Theory
Networks and genetic algorithm; in the third section,
we present a detailed description of our algorithm;
experimental results of the dataset derived from the
Hayes data sets and UCI KDD Archive are provided in
the fourth section. The conclusions are given in the last
section.
2. Related Work
Carpenter and Grossberg[4] proposed the Adaptive
Resonance Theory Network (ART1 Network), which
accepts binary input data, for cluster analysis. For non-
binary input data, Grossberg and Carpenter[5]
developed the Adaptive Resonance Theory Network II
(ART2 Network) in 1987. Hussain[8] proposed
ARTSTAR as a classifier based on ART2. In ARTSTAR, an extra Instar layer is added to ART2 to learn the class labels of training data for supervised learning. As a result, ARTSTAR performs better than ART2 at solving classification problems.
Eighth International Conference on Intelligent Systems Design and Applications
978-0-7695-3382-7/08 $25.00 © 2008 IEEE
DOI 10.1109/ISDA.2008.75
Carpenter et al. [7]
proposed the ARTMAP network architecture, and Weenink [13] proposed Category ART, a variation of the ART network that improves on weaknesses of ARTMAP. Liu et al. [12] presented a new supervised ART structure, IFART (Impulse Force Based ART), whose training network is Category ART; its main purpose is to improve ART categorization schemes.
Genetic Algorithms (GA) use evolution and chromosome elimination to find good solutions by searching large, complex problem spaces from multiple points through repeated reproduction, crossover, and mutation. The selection, crossover, and mutation processes are repeated until a termination condition is satisfied. GA helps avoid getting trapped in a local optimum and thus has a good chance of obtaining the global optimum.
Combining multiple classifiers is often more accurate than using a single classifier, and many applications use GA to improve classification accuracy. Kim et al. [10] used GA as an optimization tool to predict consumer purchase behavior; its most important advantage is that it can combine multiple classifiers to support decision making. Kuo and Liao [11] combined the Genetic K-Means Algorithm with ART2 and improved classification efficiency. Behrouz and William [3] used GA for feature selection and combined multiple classifiers, leading to a significant performance improvement.
3. System Structure and Algorithm
In this paper we propose a method that uses the clustering result obtained from the learning process of ART2, together with the z input vectors of the sample data encoded as chromosomes, as the basis of classification. We propose an evolutionary classifier, as shown in Fig 1.
The system structure of this research is divided into
three phases: clustering, genetic evolution, and testing.
In the first phase, the clustering stage, we input the training data, which accounts for 90% of the sample, and run the ART2 procedure to obtain the clustering result together with the weights bij. In the second phase, we use a genetic algorithm to evolve chromosomes so that bij and the training samples correspond to the real categories. To obtain sufficient classification accuracy, we apply continuous genetic evolution. In the final phase, once the classifier completes training, we measure the accuracy on the remaining 10% of the data, the testing data, in order to evaluate the evolutionary algorithm.
(1) The coding of chromosomes
The length of every chromosome is n*m, where n is the number of ART2 clusters and m is the number of attributes. A chromosome is C = (c0,…,cn*m-1), where ci is the gene at position i of the chromosome, and ci is a floating-point number in [0,10). The initial value of every ci is generated randomly from the interval [0,10). Each ci represents the influence factor between a training attribute and the classification. For example, if there are 3 clusters (n=3) and 4 attributes (m=4), the chromosome length is 12 genes; the coding is as shown in Fig 2.
C0      C1      C2      C3  …  C10  C11
6.63444 3.74802 7.75940 …          2.86918
Fig 2. The coding of chromosomes
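As an illustrative sketch, the chromosome initialization described above can be written as follows (Python; the function name `init_chromosome` is ours, not from the paper):

```python
import random

def init_chromosome(n, m):
    """Create one chromosome of length n*m, where n is the number of
    ART2 clusters and m the number of attributes.  Each gene is a
    floating-point value drawn uniformly from [0, 10)."""
    return [random.uniform(0.0, 10.0) for _ in range(n * m)]

# Example matching the paper: n = 3 clusters, m = 4 attributes -> 12 genes
chrom = init_chromosome(3, 4)
print(len(chrom))  # 12
```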
(2) Genetic Reproduction and Selection
Fig 1. System Structure
The larger the fitness value of a chromosome, the higher the accuracy rate of the evolutionary classifier it defines. This research uses roulette wheel selection: the wheel is divided into sections, and the area of each section is directly proportional to the corresponding chromosome's fitness value. Thus the probability of a chromosome being chosen as a member of the next generation is directly proportional to its fitness value.
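A minimal sketch of roulette wheel selection as described above (Python; `roulette_select` is an illustrative name, not from the paper):

```python
import random

def roulette_select(population, fitnesses):
    """Pick one chromosome; the probability of selection is directly
    proportional to its fitness value (roulette wheel selection)."""
    total = sum(fitnesses)
    pick = random.uniform(0.0, total)   # spin the wheel
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit                  # section width = fitness
        if pick <= running:
            return chrom
    return population[-1]               # guard against float round-off
```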
(3) Genetic Crossover operator
The crossover operator takes two chromosomes chosen at random from the population and exchanges genetic information between them to produce two offspring. This research uses single-point crossover: a crossover point is chosen at random, and the gene segments on the two sides of the crossover point are exchanged.
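Single-point crossover as described above can be sketched as follows (assuming chromosomes are Python lists of genes; the function name is ours):

```python
import random

def single_point_crossover(parent1, parent2):
    """Exchange the gene segments on either side of a random cut point,
    producing two offspring chromosomes."""
    point = random.randint(1, len(parent1) - 1)   # cut strictly inside
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2
```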
(4) Genetic Mutation operator
The mutation operator randomly selects a gene from a chromosome and makes a slight change to its value, generating a new chromosome to complete the mutation. Fig 3 shows an example: gene C2 is selected at random and its original value is changed slightly. If the random factor is 0.5, the gene value mutates from 7.75941 to 7.75941 * 0.5 = 3.879705, producing the post-mutation chromosome.
        C0      C1      C2       C3
Before  6.63443 3.74802 7.75941  1.18871
After   6.63443 3.74802 3.879705 1.18871
Fig 3. Example of a mutated chromosome
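One possible reading of the mutation step, assuming each selected gene is scaled by a random factor in [0, 1) as in the 7.75941 * 0.5 example above (the function name and the per-gene mutation probability are our assumptions):

```python
import random

def mutate(chromosome, mutation_rate=0.1):
    """With probability mutation_rate per gene, scale the gene by a
    random factor in [0, 1).  Scaling down keeps genes inside [0, 10)."""
    mutated = list(chromosome)
    for i in range(len(mutated)):
        if random.random() < mutation_rate:
            mutated[i] *= random.random()   # e.g. 7.75941 * 0.5 = 3.879705
    return mutated
```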
(5) Fitness Function
The fitness function we designed counts the number of training samples that a chromosome classifies correctly. In formula (1), z represents the total number of training samples, and every sample has m attributes; j is the index of the cluster to which ART2 assigns the sample. C is a chromosome, where ci+j*m is the gene at location i+j*m. S signifies the input sample, and Si is the ith attribute of the sample. The parameter bij is a weight obtained from ART2 after learning. Finally, classk signifies the true category of the kth training sample. H compares the class label produced by the hash function h with the true class: if they match exactly, it returns 1, otherwise it returns 0. Hence, whenever the class computed from a chromosome matches the true category, the count of correct classifications is increased by 1; otherwise there is no increment.
fitness = \sum_{k=1}^{z} H\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij}, \; class_k \right) ……(1)

Where:

H\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij}, \; class_k \right) =
\begin{cases} 1 & \text{if } h\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij} \right) = class_k \\ 0 & \text{otherwise} \end{cases} ……(2)

h\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij} \right) = Int\left( \left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij} \right) / 100 \right) ……(3)
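Eqs. (1)-(3) can be transcribed almost directly into code; the following is a sketch in which the function names and the b[i][j] weight layout are our assumptions:

```python
def predicted_class(c, S, b, j, m):
    """Eq. (3): the hash function h maps the weighted sum to a class label.
    c : chromosome genes, S : attribute vector of one sample,
    b : ART2 weights b[i][j], j : cluster index from ART2, m : #attributes."""
    weighted = sum(c[i + j * m] * S[i] * b[i][j] for i in range(m))
    return int(weighted / 100)          # Int( . / 100 )

def fitness(c, samples, clusters, classes, b, m):
    """Eq. (1): count the training samples whose predicted class matches
    the true class; each match contributes 1 via Eq. (2)."""
    correct = 0
    for S, j, true_class in zip(samples, clusters, classes):
        if predicted_class(c, S, b, j, m) == true_class:
            correct += 1
    return correct
```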
4. Experimental result
The main purpose of the experiments is to evaluate the feasibility of the system. We use the Hayes dataset to train and test the accuracy rate of the genetic evolutionary classifier. The original Hayes dataset is in the University of California at Irvine Machine Learning Repository (http://www.ics.uci.edu/~mlearn/MLRepository.html). There are 160 data records, and each record has 4 columns.
We perform two sets of experiments: the first
experiment compares the proposed genetic
evolutionary classifier and the software of decision
tree C5.0. The purpose of the second experiment is to
observe the influence of the genetic evolutionary
generation on the accuracy rate.
(1) Analysis of the accuracy rate
In the first experiment, we select random samples from the Hayes dataset for testing; the test set consists of 10% of the records. The Hayes dataset has 160 data records, and we randomly choose 144 records as the training samples; the other 16 records serve as the testing samples. The experiment is repeated ten times, and we then calculate three values: the average, the maximum, and the minimum accuracy.
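The repeated 90%/10% split described above can be sketched as follows (the helper name `holdout_split` and the explicit seeding are ours):

```python
import random

def holdout_split(records, test_fraction=0.1, seed=None):
    """Randomly hold out test_fraction of the records for testing
    (16 of 160 for the Hayes dataset) and train on the rest (144)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

train, test = holdout_split(list(range(160)), 0.1, seed=42)
print(len(train), len(test))  # 144 16
```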
The experimental results are shown in Table 1. The best accuracy achieved by the proposed evolutionary classifier is 87.5%, higher than the 84.4% obtained by C5.0. The accuracy rate in the worst case, 68.75%, is lower than that of C5.0. However, the average accuracy of our evolutionary classifier is 79.37%, higher than the 78.76% of C5.0.
Table 1. Comparison of accuracy rates between C5.0 and the proposed evolutionary classifier.
Times     1      2      3      4      5      6      7      8      9      10     avg    max    min
C5.0      78.1   75     84.4   84.4   75     84.4   71.9   75     75     84.4   78.76  84.4   71.9
Proposed  81.25  81.25  81.25  81.25  81.25  68.75  68.75  81.25  81.25  87.5   79.37  87.5   68.75
(2) The influence of evolutionary generations on the accuracy rate
In this experiment, we observe how the number of generations produced by genetic evolution affects the accuracy rate of the proposed evolutionary classifier. The population size is set to 200 chromosomes and the mutation rate to 0.1; we then record the accuracy at the 50th, 100th, 150th, and 200th generations of the simulation experiment, as shown in Table 2. Continued evolution gradually increases the accuracy rate. By the 200th generation, the best accuracy rate obtained reached as high as 67.5% (comparable to that of C5.0). After the 200th generation, however, the accuracy rate did not increase further. The curves of Fig 4 show the increasing accuracy rate.
Table 2. The influence of evolutionary generations on accuracy rates for the Hayes dataset.
Generation    50     100    150    200    250
Accuracy (%)  59.38  62.5   63.13  67.5   67.5
Fig 4. The influence of the evolutionary generations on the accuracy rate.
5. Conclusion
In this paper, we propose a new evolutionary classifier. The method uses the clustering results and the input data obtained from the training process of ART2; these results are fed into the genetic chromosomes as the basis of classification. The experiments show that the 200th evolutionary generation produces the best classification accuracy rate. To demonstrate the practicability of our method and to compare our experimental results with the C5.0 decision tree software, we test them separately using random selection and 10-fold cross-validation. The experimental results show that the proposed classifier is as good as the well-known C5.0 classifier in terms of accuracy.
References
[1] A.B. Baruah and R.C. Welti, "Adaptive resonance theory and the classical leader algorithm", Proceedings of the IEEE International Joint Conference on Neural Networks, Seattle, 1991, pp. A-913.
[2] L.I. Burke, "Clustering characteristics of adaptive resonance", Neural Networks, Vol. 4, 1991, pp. 485-491.
[3] M.B. Behrouz and F.P. William, "Using Genetic Algorithms for Data Mining Optimization in an Educational Web-based System", Proceedings of the Genetic and Evolutionary Computation Conference, 2003, pp. 2252-2263.
[4] G.A. Carpenter and S. Grossberg, "A massively parallel architecture for a self-organizing neural pattern recognition machine", Computer Vision, Graphics, and Image Processing, Vol. 37, 1987, pp. 54-115.
[5] G.A. Carpenter and S. Grossberg, "ART 2: Self-organisation of stable category recognition codes for analog input patterns", Applied Optics, Vol. 26, 1987, pp. 4919-4930.
[6] G.A. Carpenter and S. Grossberg, "ART 3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures", Neural Networks, Vol. 3, 1990, pp. 129-152.
[7] G.A. Carpenter, S. Grossberg, and J.H. Reynolds, "ARTMAP: Supervised real-time learning and classification of non-stationary data by a self-organizing neural network", Neural Networks, Vol. 4, 1991, pp. 565-588.
[8] T.S. Hussain, "ARTSTAR: A Supervised Modular Adaptive Resonance Network Classifier", M.Sc. Thesis, Queen's University at Kingston, 1993.
[9] T.S. Hussain and R.A. Browse, "ARTSTAR: A Supervised Adaptive Resonance Classifier", AI'94: Tenth Canadian Conference on Artificial Intelligence, 1994, pp. 121-130.
[10] E. Kim, W. Kim, and Y. Lee, "Combination of Multiple Classifiers for the Customer's Purchase Behavior Prediction", Decision Support Systems, Vol. 34, No. 2, 2003, pp. 167-175.
[11] R.J. Kuo and C.L. Liao, "Integration of Adaptive Resonance Theory II Neural Network and Genetic K-Means Algorithm for Data Mining", Journal of the Chinese Institute of Industrial Engineers, Vol. 19, No. 4, 2002, pp. 64-70.
[12] H. Liu, Y. Liu, J. Liu, B. Zhang, and G. Wu, "Impulse Force Based ART Network with GA Optimization", IEEE Int. Conf. Neural Networks & Signal Processing, Nanjing, China, December 14-17, 2003.
[13] D. Weenink, "Category ART: A Variation on Adaptive Resonance Theory Neural Net", IFA Proceedings 21, 1997, pp. 117-129.