An Evolutionary Classifier Based on Adaptive Resonance Theory Network II and Genetic Algorithm
I-En Liao1, Shu-Ling Shieh1,2,*, Hui-Ching Chen1
1 Department of Computer Science and Engineering, National Chung-Hsing University, Taichung, Taiwan.
2 Department of Information Management, Ling-Tung University, Taichung, Taiwan.
* [email protected]
Abstract
Adaptive Resonance Theory Network II (ART2) is a neural network for unsupervised learning. It has been shown that ART2 is suitable for clustering problems that require on-line learning of large-scale and evolving databases. However, when applied to classification problems, ART2 suffers from deficiencies in the interpretation of class labels and in its sensitivity to the order of the input data. This study proposes a novel evolutionary classifier based on Adaptive Resonance Theory Network II and a genetic algorithm. In the proposed classifier, ART2 is first used to generate the weights between attributes and clusters. In the second stage, a genetic algorithm is employed to generate class labels for the input data. The performance of the proposed algorithm is evaluated using the Hayes dataset from the UCI Machine Learning Repository. The experimental results show that the proposed classifier is as good as the well-known C5.0 classifier in terms of accuracy.
1. Introduction
Several unsupervised neural-network architectures based on the Adaptive Resonance Theory (ART) network have been proposed [1][2][4][5][6]. ART is well suited to on-line learning and self-organization. ART1 is an ART model that can only handle binary input data of zeroes and ones. ART2 is an improvement of ART1 that supports the analysis of non-binary input data.
ART2 has been shown to perform well on clustering problems. Several researchers [7][8][9] have proposed supervised ART networks for solving classification problems and have demonstrated good results.
This research proposes an evolutionary classifier
based on ART2 and genetic algorithms. The proposed
classifier does not have the drawbacks of using ART2
alone as a classifier. ART2 is used first for generating
the weights between attributes and clusters. Then a
genetic algorithm is employed to generate class labels
of input data. The performance of the proposed algorithm is evaluated using the Hayes dataset from the UCI Machine Learning Repository. The experimental results show that the proposed classifier is as good as the well-known C5.0 classifier in terms of accuracy.
The remaining sections of this paper are organized
as follows: the next section briefly presents related
work concerning Adaptive Resonance Theory
Networks and genetic algorithm; in the third section,
we present a detailed description of our algorithm;
experimental results of the dataset derived from the
Hayes data sets and UCI KDD Archive are provided in
the fourth section. The conclusions are given in the last
section.
2. Related Work
Carpenter and Grossberg[4] proposed the Adaptive
Resonance Theory Network (ART1 Network), which
accepts binary input data, for cluster analysis. For non-
binary input data, Grossberg and Carpenter[5]
developed the Adaptive Resonance Theory Network II
(ART2 Network) in 1987. Hussain[8] proposed
ARTSTAR as a classifier based on ART2. In ARTSTAR, an extra Instar layer is added to ART2 to learn the class labels of training data for supervised learning. As a result, ARTSTAR performs better than ART2 at solving classification problems.
Eighth International Conference on Intelligent Systems Design and Applications
978-0-7695-3382-7/08 $25.00 © 2008 IEEE
DOI 10.1109/ISDA.2008.75
Carpenter et al. [7]
proposed the ARTMAP network architecture, and Weenink [13] proposed Category ART, a variation of the ART network that improves on weaknesses of ARTMAP. Liu et al. [12] presented a new supervised ART structure, IFART (Impulse Force Based ART), whose training network is Category ART; its main purpose is to improve ART categorization schemes.
Genetic Algorithms (GA) use evolution and chromosome elimination to find good solutions by searching large, complex problem spaces from multiple points through repeated reproduction, crossover, and mutation. The selection, crossover, and mutation processes are repeated until a termination condition is satisfied. GA helps avoid getting trapped in a local optimum and thus has a good chance of obtaining the global optimum.
Combining multiple classifiers is often more accurate than using a single classifier, and many applications use GA to improve classification accuracy. Kim et al. [10] used GA as an optimization tool to predict consumer purchase behavior; its most important advantage is that it can combine multiple classifiers to support decision making. Kuo and Liao [11] combined the Genetic K-Means Algorithm with ART2 and improved classification efficiency. Behrouz and William [3] used GA for feature selection and combined multiple classifiers, leading to a significant performance improvement.
3. System Structure and Algorithm
In this paper we propose a method that uses the clustering result obtained from the learning process of ART2, together with the z input vectors of the sample data encoded as chromosomes, as the basis of classification. We propose an evolutionary classifier, as shown in Fig 1.
The system structure of this research is divided into
three phases: clustering, genetic evolution, and testing.
In the first phase, the clustering stage, we input the training data, which accounts for 90% of the sample, and run the ART2 procedure to obtain the clustering result together with the weights bij. In the second phase, we use a genetic algorithm to evolve chromosomes so that bij and the training samples correspond to the real categories. To obtain sufficient classification accuracy, we apply continuous genetic evolution. In the final phase, once the classifier completes training, we measure the accuracy on the remaining 10% of the data, the testing data, in order to evaluate the evolutionary algorithm.
(1) The coding of chromosomes
The length of every chromosome is n*m, where n is the number of ART2 clusters and m is the number of attributes. A chromosome is C = (c0,…,cn*m-1), where ci is the gene at position i of the chromosome, and ci is a floating-point number in [0,10). The initial value of every ci is generated randomly from the interval [0,10). Each ci represents the influence factor between a training attribute and the classification. For example, if there are 3 clusters (n=3) and 4 attributes (m=4), the chromosome length is 12 genes; the coding is as shown in Fig 2.
C0      C1      C2      C3  …  C10  C11
6.63444 3.74802 7.75940 …          2.86918
Fig 2. The coding of chromosomes
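As an illustrative sketch, the chromosome initialization described above can be written as follows (Python; the function name `init_chromosome` is ours, not from the paper):

```python
import random

def init_chromosome(n, m):
    """Create one chromosome of length n*m, where n is the number of
    ART2 clusters and m the number of attributes.  Each gene is a
    floating-point value drawn uniformly from [0, 10)."""
    return [random.uniform(0.0, 10.0) for _ in range(n * m)]

# Example matching the paper: n = 3 clusters, m = 4 attributes -> 12 genes
chrom = init_chromosome(3, 4)
print(len(chrom))  # 12
```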
(2) Genetic Reproduction and Selection
Fig 1. System Structure
The larger the fitness value of a chromosome, the higher the accuracy rate of the evolutionary classifier it defines. This research uses roulette wheel selection: the wheel is divided into sections, and the area of each section is directly proportional to the corresponding chromosome's fitness value. Thus the probability of a chromosome being chosen as a member of the next generation is directly proportional to its fitness value.
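A minimal sketch of roulette wheel selection as described above (Python; `roulette_select` is an illustrative name, not from the paper):

```python
import random

def roulette_select(population, fitnesses):
    """Pick one chromosome; the probability of selection is directly
    proportional to its fitness value (roulette wheel selection)."""
    total = sum(fitnesses)
    pick = random.uniform(0.0, total)   # spin the wheel
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit                  # section width = fitness
        if pick <= running:
            return chrom
    return population[-1]               # guard against float round-off
```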
(3) Genetic Crossover operator
The crossover operator takes two chromosomes chosen at random from the population and exchanges genetic information between them to produce two offspring. This research uses single-point crossover: a crossover point is chosen at random, and the gene segments on the two sides of the crossover point are exchanged.
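Single-point crossover as described above can be sketched as follows (assuming chromosomes are Python lists of genes; the function name is ours):

```python
import random

def single_point_crossover(parent1, parent2):
    """Exchange the gene segments on either side of a random cut point,
    producing two offspring chromosomes."""
    point = random.randint(1, len(parent1) - 1)   # cut strictly inside
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2
```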
(4) Genetic Mutation operator
The mutation operator randomly selects a gene from a chromosome and makes a slight change to its value, generating a new chromosome to complete the mutation. Fig 3 shows an example: gene C2 is selected at random and its original value is changed slightly. If the random factor is 0.5, the gene value mutates from 7.75941 to 7.75941 * 0.5 = 3.879705, producing the post-mutation chromosome.
        C0      C1      C2       C3
Before  6.63443 3.74802 7.75941  1.18871
After   6.63443 3.74802 3.879705 1.18871
Fig 3. Example of a mutated chromosome
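One possible reading of the mutation step, assuming each selected gene is scaled by a random factor in [0, 1) as in the 7.75941 * 0.5 example above (the function name and the per-gene mutation probability are our assumptions):

```python
import random

def mutate(chromosome, mutation_rate=0.1):
    """With probability mutation_rate per gene, scale the gene by a
    random factor in [0, 1).  Scaling down keeps genes inside [0, 10)."""
    mutated = list(chromosome)
    for i in range(len(mutated)):
        if random.random() < mutation_rate:
            mutated[i] *= random.random()   # e.g. 7.75941 * 0.5 = 3.879705
    return mutated
```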
(5) Fitness Function
The fitness function we designed counts the number of training samples that a chromosome classifies correctly. In formula (1), z represents the total number of training samples, and every sample has m attributes; j is the index of the cluster to which ART2 assigns the sample. C is a chromosome, where ci+j*m is the gene at location i+j*m. S signifies the input sample, and Si is the ith attribute of the sample. The parameter bij is a weight obtained from ART2 after learning. Finally, classk signifies the true category of the kth training sample. H compares the class label produced by the hash function h with the true class: if they match exactly, it returns 1, otherwise it returns 0. Hence, whenever the class computed from a chromosome matches the true category, the count of correct classifications is increased by 1; otherwise there is no increment.
fitness = \sum_{k=1}^{z} H\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij}, \; class_k \right) ……(1)

Where:

H\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij}, \; class_k \right) =
\begin{cases} 1 & \text{if } h\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij} \right) = class_k \\ 0 & \text{otherwise} \end{cases} ……(2)

h\left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij} \right) = Int\left( \left( \sum_{i=0}^{m-1} c_{i+j*m} S_i b_{ij} \right) / 100 \right) ……(3)
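Eqs. (1)-(3) can be transcribed almost directly into code; the following is a sketch in which the function names and the b[i][j] weight layout are our assumptions:

```python
def predicted_class(c, S, b, j, m):
    """Eq. (3): the hash function h maps the weighted sum to a class label.
    c : chromosome genes, S : attribute vector of one sample,
    b : ART2 weights b[i][j], j : cluster index from ART2, m : #attributes."""
    weighted = sum(c[i + j * m] * S[i] * b[i][j] for i in range(m))
    return int(weighted / 100)          # Int( . / 100 )

def fitness(c, samples, clusters, classes, b, m):
    """Eq. (1): count the training samples whose predicted class matches
    the true class; each match contributes 1 via Eq. (2)."""
    correct = 0
    for S, j, true_class in zip(samples, clusters, classes):
        if predicted_class(c, S, b, j, m) == true_class:
            correct += 1
    return correct
```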
4. Experimental result
The main purpose of the experiments is to evaluate the feasibility of the system. We use the Hayes dataset to train and test the accuracy rate of the genetic evolutionary classifier. The original Hayes dataset is in the University of California at Irvine Machine Learning Repository (http://www.ics.uci.edu/~mlearn/MLRepository.html). There are 160 data records, and each record has 4 columns.
We perform two sets of experiments: the first
experiment compares the proposed genetic
evolutionary classifier and the software of decision
tree C5.0. The purpose of the second experiment is to
observe the influence of the genetic evolutionary
generation on the accuracy rate.
(1) Analysis of the accuracy rate
In the first experiment, we select random samples from the Hayes dataset for testing; the test set consists of 10% of the records. The Hayes dataset has 160 data records, and we randomly choose 144 records as the training samples; the other 16 records serve as the testing samples. The experiment is repeated ten times, and we then calculate three values: the average, the maximum, and the minimum accuracy.
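The repeated 90%/10% split described above can be sketched as follows (the helper name `holdout_split` and the explicit seeding are ours):

```python
import random

def holdout_split(records, test_fraction=0.1, seed=None):
    """Randomly hold out test_fraction of the records for testing
    (16 of 160 for the Hayes dataset) and train on the rest (144)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

train, test = holdout_split(list(range(160)), 0.1, seed=42)
print(len(train), len(test))  # 144 16
```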
The experimental results are shown in Table 1. The best accuracy achieved by the proposed evolutionary classifier is 87.5%, higher than the 84.4% obtained by C5.0. The accuracy rate in the worst case, 68.75%, is lower than that of C5.0. However, the average accuracy of our evolutionary classifier is 79.37%, higher than the 78.76% of C5.0.
Table 1. Comparison of accuracy rates between C5.0 and the proposed evolutionary classifier.
Times     1      2      3      4      5      6      7      8      9      10     avg    max    min
C5.0      78.1   75     84.4   84.4   75     84.4   71.9   75     75     84.4   78.76  84.4   71.9
Proposed  81.25  81.25  81.25  81.25  81.25  68.75  68.75  81.25  81.25  87.5   79.37  87.5   68.75
(2) The influence of evolutionary generations on the accuracy rate
In this experiment, we observe how the number of generations produced by genetic evolution affects the accuracy rate of the proposed evolutionary classifier. The population size is set to 200 chromosomes and the mutation rate to 0.1; we then record the accuracy at the 50th, 100th, 150th, and 200th generations of the simulation experiment, as shown in Table 2. Continued evolution gradually increases the accuracy rate. By the 200th generation, the best accuracy rate obtained reached as high as 67.5% (comparable to that of C5.0). After the 200th generation, however, the accuracy rate did not increase further. The curves of Fig 4 show the increasing accuracy rate.
Table 2. The influence of evolutionary generations on accuracy rates for the Hayes dataset.
Generation    50     100    150    200    250
Accuracy (%)  59.38  62.5   63.13  67.5   67.5
Fig 4. The influence of the evolutionary generations on the accuracy rate.
5. Conclusion
In this paper, we propose a new evolutionary classifier. The method uses the clustering results and the input data obtained from the training process of ART2; these results are fed into the genetic chromosomes as the basis of classification. The experiments show that the 200th evolutionary generation produces the best classification accuracy rate. To demonstrate the practicability of our method and to compare our experimental results with the C5.0 decision tree software, we test them separately using random selection and 10-fold cross-validation. The experimental results show that the proposed classifier is as good as the well-known C5.0 classifier in terms of accuracy.
References
[1] A.B. Baruah and R.C. Welti, "Adaptive resonance theory and the classical leader algorithm", Proceedings of the IEEE International Joint Conference on Neural Networks, Seattle, 1991, pp. A-913.
[2] L.I. Burke, "Clustering characteristics of adaptive resonance", Neural Networks, Vol. 4, 1991, pp. 485-491.
[3] M.B. Behrouz and F.P. William, "Using Genetic Algorithms for Data Mining Optimization in an Educational Web-based System", Proceedings of the Genetic and Evolutionary Computation Conference, 2003, pp. 2252-2263.
[4] G.A. Carpenter and S. Grossberg, "A massively parallel architecture for a self-organizing neural pattern recognition machine", Computer Vision, Graphics, and Image Processing, Vol. 37, 1987, pp. 54-115.
[5] G.A. Carpenter and S. Grossberg, "ART 2: Self-organisation of stable category recognition codes for analog input patterns", Applied Optics, Vol. 26, 1987, pp. 4919-4930.
[6] G.A. Carpenter and S. Grossberg, "ART 3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures", Neural Networks, Vol. 3, 1990, pp. 129-152.
[7] G.A. Carpenter, S. Grossberg, and J.H. Reynolds, "ARTMAP: Supervised real-time learning and classification of non-stationary data by a self-organizing neural network", Neural Networks, Vol. 4, 1991, pp. 565-588.
[8] T.S. Hussain, "ARTSTAR: A Supervised Modular Adaptive Resonance Network Classifier", M.Sc. Thesis, Queen's University at Kingston, 1993.
[9] T.S. Hussain and R.A. Browse, "ARTSTAR: A Supervised Adaptive Resonance Classifier", AI'94: Tenth Canadian Conference on Artificial Intelligence, 1994, pp. 121-130.
[10] E. Kim, W. Kim, and Y. Lee, "Combination of Multiple Classifiers for the Customer's Purchase Behavior Prediction", Decision Support Systems, Vol. 34, No. 2, 2003, pp. 167-175.
[11] R.J. Kuo and C.L. Liao, "Integration of Adaptive Resonance Theory II Neural Network and Genetic K-Means Algorithm for Data Mining", Journal of the Chinese Institute of Industrial Engineers, Vol. 19, No. 4, 2002, pp. 64-70.
[12] H. Liu, Y. Liu, J. Liu, B. Zhang, and G. Wu, "Impulse Force Based ART Network with GA Optimization", IEEE Int. Conf. Neural Networks & Signal Processing, Nanjing, China, December 14-17, 2003.
[13] D. Weenink, "Category ART: A Variation on Adaptive Resonance Theory Neural Net", IFA Proceedings 21, 1997, pp. 117-129.