
P.M.A. Sloot, B. Chopard, and A.G. Hoekstra (Eds.): ACRI 2004, LNCS 3305, pp. 823-830, 2004. © Springer-Verlag Berlin Heidelberg 2004

Building Classifier Cellular Automata

Peter Kokol, Petra Povalej, Mitja Lenič, and Gregor Štiglic

University of Maribor, FERI, Laboratory for System Design, Smetanova ulica 17, 2000 Maribor, Slovenia

{kokol, petra.povalej, mitja.lenic, gregor.stiglic}@uni-mb.si

Abstract. Ensembles of classifiers can boost classification accuracy compared to single classifiers and are a commonly used method in the field of machine learning. However, in some cases ensemble construction algorithms do not improve the classification accuracy. Ensembles are mostly constructed using a specific machine learning method or a combination of methods; the drawback is that the combination of methods, or the selection of the appropriate method for a specific problem, must be made by the user. To overcome this problem we invented a novel approach in which an ensemble of classifiers is constructed by a self-organizing system based on cellular automata (CA). First results are promising and show that in the iterative process of combining classifiers in the CA, a combination of methods can occur that leads to superior accuracy.

1 Introduction

A large variety of problems arising in engineering, physics, chemistry and economics can be formulated as classification problems [1, 2, 8]. Although classification represents only a small part of the machine learning field, many methods are available. Due to the so-called "no free lunch" theorem [6], it is unfortunately impossible to develop a general method that at the same time exposes different aspects of a problem and produces an "optimal" solution. A standard local optimization technique is therefore to use multi-starts: trying different starting points, running the classification processes independently of each other, and selecting the best result from all trials. Moreover, ensembles [4] of classifiers are often used, since they can boost classification accuracy compared to single classifiers. However, single classifiers, just like ensembles of classifiers, often find solutions near local optima and are not as successful in the global exploration of the search space. Therefore different popular methods for global exploration of the search space are used, such as simulated annealing, genetic algorithms and, lately, the theory of self-organizing systems, which can provide the diversity that leads to possibly better solutions.

During the last few years, the scientific area of self-organizing systems has become very popular. The field seeks general rules about the growth and evolution of systemic structure, the forms it might take, and methods that predict the future organization that will result from changes made to the underlying components. A lot of work in this field has been done using cellular automata (CA), Boolean networks, Alife, genetic algorithms, neural networks and similar techniques. In this paper we propose a new method for nonlinear ensembles of classifiers that exploits the self-organization ability of CA in the search for an optimal solution. The aim is to combine different single classifiers in an ensemble represented by a CA. We show how improved accuracy can be obtained through the interaction of neighboring cells containing different kinds of classifiers.

The paper is organized as follows. In Sec. 2 and 3 the process of classification and the basics of CA are shortly introduced. In Sec. 4 we discuss how to design a CA as a classifier. Sec. 5 presents the experiment setup and in Sec. 6 the first results are discussed.

2 Classifier Systems and Classification

As mentioned before, in this research we focus on solving classification tasks. Classification is a process in which each sample is assigned a predefined class. Classifier systems were introduced in 1978 by Holland as a machine learning methodology that can be thought of as a set of syntactically simple rules.

For learning a classifier, a training set consisting of training samples is used. Every training sample is completely described by a set of attributes (sample properties) and a class (decision). After learning, a testing set is used to evaluate the classifier's quality. The testing set consists of testing samples described by the same attributes as the training samples, except that the testing samples are not included in the training set. In general, the accuracy of classifying unseen testing samples can be used to predict the efficiency of applying the classifier in practice.
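The train/test protocol described above can be sketched as follows; this is a minimal illustration in Python, where the dict-based sample representation and the trivial majority-class classifier are our own assumptions, not the paper's method.

```python
# Sketch of the train/test evaluation protocol (illustrative only).

def train_majority_classifier(training_set):
    """Learn the most frequent class among the training samples."""
    counts = {}
    for sample in training_set:
        counts[sample["class"]] = counts.get(sample["class"], 0) + 1
    return max(counts, key=counts.get)

def accuracy(predicted_class, testing_set):
    """Fraction of testing samples whose class matches the prediction."""
    correct = sum(1 for s in testing_set if s["class"] == predicted_class)
    return correct / len(testing_set)

train = [{"class": "A"}, {"class": "A"}, {"class": "B"}]
test = [{"class": "A"}, {"class": "B"}]   # unseen samples, same attributes
pred = train_majority_classifier(train)   # -> "A"
print(accuracy(pred, test))               # -> 0.5
```

The accuracy on the held-out testing set is the quality estimate used throughout the paper.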

3 Cellular Automata

CA are massively parallel systems [3] obtained by composing myriads of simple agents that interact locally, i.e. with their closest neighbors. In spite of their simplicity, the dynamics of CA are potentially very rich, ranging from attracting stable configurations to spatio-temporal chaotic features and pseudo-random generation abilities. These abilities provide the diversity that can possibly overcome local optima when solving engineering problems. Moreover, from the computational viewpoint they are universal; one could say as powerful as Turing machines and, thus, classical von Neumann architectures. These structural and dynamical features make them very powerful: fast CA-based algorithms have been developed to solve engineering problems in cryptography and microelectronics, for instance, and theoretical CA-based models have been built in ecology, biology, physics and image processing. On the other hand, these powerful features make CA difficult to analyze - almost all long-term behavioral properties of dynamical systems, and of cellular automata in particular, are unpredictable. However, in this paper the aim was not to analyze the process of CA but to use it for superior classification.
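The local-interaction idea can be made concrete with a minimal automaton; the binary grid and the majority rule below are our own illustrative choices (the paper does not specify a particular rule here), using a von Neumann neighborhood of the four closest cells plus the cell itself.

```python
# A minimal cellular automaton step: each cell adopts the majority state
# of itself and its four closest neighbors (N, S, W, E), wrapping at edges.

def step(grid):
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = (grid[r][c]
                     + grid[(r - 1) % rows][c] + grid[(r + 1) % rows][c]
                     + grid[r][(c - 1) % cols] + grid[r][(c + 1) % cols])
            new[r][c] = 1 if votes >= 3 else 0   # majority of the 5 cells
    return new

grid = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
print(step(grid))   # -> [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
```

Iterating `step` is what produces the rich global dynamics from purely local rules.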


4 Cellular Automata as a Classifier

Our novel idea is to exploit the self-organization abilities of cellular automata for classification and the possible extraction of new knowledge. Our first task was to define the basic elements (the content of the cells) and the transaction rules that would give the cellular automaton a learning ability and represent the gained classification knowledge in its structure. The most obvious choice is to define a cell in the classification cellular automaton (CCA) as a classifier. In a simplified view we can regard the cellular automaton as an ensemble of classifiers. In each cell we have to build a classifier that has different and eventually better abilities than the ones already produced. If, on the contrary, all cells contained the same classifier, there would be no diversity and it would be quite unreasonable to apply any transaction rules on such an automaton, because all cells would classify in the same way. Thus we must ensure the greatest possible diversity of the classifiers used in the automaton's cells to benefit from the idea of the CCA.

4.1 Initializing Automata

The most desirable initial situation in a CCA is having a unique classifier in each of its cells. We can ensure that by using different methods, but the problem is (especially when the CCA has a lot of cells) that the number of different methods is limited. Nevertheless, even if only one method is available we can still use so-called fine-tuning. Namely, almost all machine learning methods used for classification have multiple parameters that affect classification, so different classifiers can be obtained by changing those parameters. Parameters can be selected randomly or determined, for example, by evolutionary algorithms. Another possibility is to change the expected probability distribution of the input samples, which may also result in different classifiers even when the same machine learning method with the same parameters is used. Still another approach is feature reduction/selection, a technique recommended when a lot of features are present.
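The fine-tuning idea - one method, many classifiers via randomized parameters - can be sketched as below. The single attribute/threshold rule and all field names are our assumptions for illustration; any parameterized learner would do.

```python
import random

# Fine-tuning sketch: fill the automaton with classifiers produced by the
# same method under randomized parameters (here a one-rule classifier).

def random_rule(training_set, n_attributes):
    """Build one if-then rule classifier with randomly chosen parameters."""
    attr = random.randrange(n_attributes)
    threshold = random.choice([s["attrs"][attr] for s in training_set])
    # Decision = majority class among training samples satisfying the rule.
    matching = [s["class"] for s in training_set if s["attrs"][attr] <= threshold]
    decision = max(set(matching), key=matching.count) if matching else None
    return {"attr": attr, "threshold": threshold, "decision": decision}

def init_automaton(training_set, n_attributes, rows, cols):
    """Fill every cell of the grid with a (hopefully distinct) classifier."""
    return [[random_rule(training_set, n_attributes) for _ in range(cols)]
            for _ in range(rows)]

data = [{"attrs": [1.0, 2.0], "class": "A"},
        {"attrs": [3.0, 1.0], "class": "B"},
        {"attrs": [2.0, 2.5], "class": "A"}]
grid = init_automaton(data, 2, 3, 3)   # a 3x3 automaton of rule classifiers
```

Randomizing the attribute and threshold is only one way to get diversity; perturbing sample distributions or feature subsets, as the text notes, works the same way.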

4.2 Transaction Rules

Transaction rules must be defined in such a way as to enforce a learning process that leads to the generalization abilities of the CCA. Successful cells should therefore be rewarded, but cells with not so good classifiers shouldn't be punished too much, in order to preserve the diversity of the CCA. The classifier in a cell can either classify a single learning sample or generate an "unknown" tag to signal that it cannot assign any class to the current sample. From the transaction rules' point of view a classification can thus have three outcomes: the same class as the learning sample, a different class, or "cannot classify". The "cannot classify" tag plays an important role when combining partial classifiers that are not defined on the whole learning set; in that case the CCA becomes a classifier system, e.g. when using if-then rules in the CCA's cells. Therefore a cell with an unknown classification for the current learning sample should be treated differently from a misclassification. Besides a cell's classification ability, the neighborhood also plays a very important role in the self-organization ability of a CCA. Transaction rules depend on the specific neighborhood state to calculate the new cell state. In general we want to group classifiers that support a similar hypothesis and therefore classify the learning samples similarly. Even if a sample is wrongly classified, the neighborhood can support a cell's classifier by preventing its elimination from the automaton. With this transaction rule we encourage the creation of decision centers for a specific class and can in this way overcome the problem of noisy learning samples - another advantage of CCAs.

4.3 Learning Algorithm

Input: training set with N training samples
Number of iterations: t = 1, 2, ..., T
For t = 1, 2, ..., T:
  - choose a learning sample I
  - fill the automaton
  - each cell in the automaton classifies the learning sample I
  - change the cells' energy according to the transaction rules
  - a cell with energy below zero does not survive
Output: CCA with the highest classification accuracy on the training set

Once the CCA's classifier diversity is ensured, transaction rules are continuously applied. Besides its classifier information, each cell also contains statistical information about its successfulness in the form of the cell's energy. Transaction rules can increase or decrease the energy level depending on the success of classification and on the cell's neighbors. If the energy drops below zero the cell is terminated. A new cell can be created, depending on the learning algorithm's parameters, with its initial energy state and a classifier either taken from the pool of classifiers or newly generated based on the neighborhood's classification abilities, by modifying the expected probability distribution and/or the used features. With the second approach we create a local boosting algorithm to ensure that misclassified samples are correctly classified by the new cell. Of course, if a cell is too different from its neighborhood it will eventually die out and its classifier will be returned to the pool. The learning of a CCA is done incrementally by supplying samples from the learning set. Transaction rules are executed on the whole CCA first with a single sample and then with the next, until the whole problem is learned using all samples - a technique similar to the one used in neural networks [5, 7]. Transaction rules do not directly imply learning, but the iteration of those rules creates the self-organizing ability. Of course, this ability depends on the classifiers used in the CCA's cells and on its geometry. The stopping criterion can be a fixed number of iterations or can be determined by monitoring the accuracy.
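One possible reading of the learning loop, with energy bookkeeping and replacement of dead cells from the pool, is sketched below. The concrete energy amounts, the initial energy of new cells, and the single-rule cell representation are all our assumptions; the paper leaves them as parameters.

```python
import random

# Sketch of the CCA learning loop (energy values are assumed, not the paper's).

def classify(cell, sample):
    """Single if-then rule cell: a class if the condition holds, else None."""
    if sample["attrs"][cell["attr"]] <= cell["threshold"]:
        return cell["decision"]
    return None

def learn(grid, training_set, iterations, pool, initial_energy=5):
    """Iterate the transaction rules over the automaton."""
    for _ in range(iterations):
        sample = random.choice(training_set)      # choose a learning sample
        for row in grid:
            for i, cell in enumerate(row):
                predicted = classify(cell, sample)
                if predicted == sample["class"]:
                    cell["energy"] += 2           # reward a correct cell
                elif predicted is not None:
                    cell["energy"] -= 2           # punish a misclassification
                cell["energy"] -= 1               # one point of energy to live
                if cell["energy"] < 0:            # the cell does not survive:
                    # return its classifier to the pool, draw a fresh cell
                    pool.append({k: cell[k] for k in ("attr", "threshold", "decision")})
                    row[i] = dict(random.choice(pool), energy=initial_energy)
    return grid

pool = [{"attr": 0, "threshold": 2.0, "decision": "A"}]
grid = [[{"attr": 0, "threshold": 2.0, "decision": "A", "energy": 3},
         {"attr": 0, "threshold": 1.0, "decision": "B", "energy": 3}]]
out = learn(grid, [{"attrs": [1.5], "class": "A"}], 4, pool)
```

Note that for simplicity this sketch rewards and punishes cells individually; the neighborhood-dependent rules of Sec. 4.2 would additionally consult the surrounding cells' votes.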

4.4 Inference Algorithm

Input: a sample for classification
Number of iterations: t = 1, 2, ..., V
For t = 1, 2, ..., V:
  - each cell in the automaton classifies the sample
  - change the cells' energy according to the transaction rules
  - each cell with energy below zero does not survive
Classify the sample according to the weighted voting of the surviving cells
Output: class of the input sample

The inference algorithm differs from the learning algorithm because it does not use self-organization. The simplest way to produce a single classification would be a majority vote of the cells. On the other hand, some votes can be very weak from the transaction rules' point of view. If the transaction rules are applied using only the neighborhood's majority vote as the sample class, those weak cells can be eliminated. After several iterations of the transaction rules only cells with strong structural support survive, and majority voting can be executed on them. A weighted voting algorithm can be used by exploiting the energy state of the classifier in a cell.
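The final weighted vote can be sketched as follows; weighting each cell's vote by its current energy is the scheme suggested above, while the cell representation and the `classify` callback are our assumptions.

```python
# Energy-weighted voting over the surviving cells of the automaton.

def weighted_vote(grid, sample, classify):
    """Return the class with the largest energy-weighted vote, or None."""
    votes = {}
    for row in grid:
        for cell in row:
            predicted = classify(cell, sample)
            if predicted is not None:             # skip "cannot classify" cells
                votes[predicted] = votes.get(predicted, 0) + cell["energy"]
    return max(votes, key=votes.get) if votes else None

# Usage with trivial cells that always vote their stored decision:
grid = [[{"decision": "A", "energy": 3}, {"decision": "B", "energy": 1}],
        [{"decision": "B", "energy": 1}, {"decision": None, "energy": 9}]]
print(weighted_vote(grid, {}, lambda cell, s: cell["decision"]))  # -> A
```

Here class A wins with weight 3 against B's combined weight 2, even though more cells voted B; plain majority voting is the special case where every energy is 1.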

5 Experiment Setup and Results

To test the presented idea of the CCA we implemented a simple cellular automaton and used simple if-then rules in the cells. The rule form is as follows:

if (attribute ≤ value) then decision;

where decision is the class of the current sample.
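Written out as code, one such rule cell looks like this; the field names and the example values are our own, and returning `None` stands for the "cannot classify" tag of Sec. 4.2.

```python
# The single-rule cell classifier, as used in the CA Simple experiment.

def rule_classify(cell, sample):
    """Return the rule's decision if attribute <= value holds, else None."""
    if sample["attrs"][cell["attr"]] <= cell["value"]:
        return cell["decision"]
    return None

cell = {"attr": 0, "value": 5.0, "decision": "benign"}   # hypothetical rule
print(rule_classify(cell, {"attrs": [3.2]}))  # -> benign
print(rule_classify(cell, {"attrs": [7.8]}))  # -> None (cannot classify)
```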

The following transaction rules were used:

- If a cell's classification is the same as the learning sample's class:

• if the neighbourhood's majority vote is the same as the learning sample's class, then increase the cell's energy;

• if the neighbourhood's majority vote differs, then dramatically increase the cell's energy.

- If a cell's classification differs from the learning sample's class:

• if the neighbourhood's majority vote is the same as the learning sample's class, then decrease the cell's energy;

• if the neighbourhood's majority class is the same as the cell's class, then slightly decrease the cell's energy.

- If a cell cannot classify the learning sample, leave the cell's energy state unchanged.

In every iteration each cell uses one point of energy (to live). If the energy level of a cell drops below zero, the cell is terminated.
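These rules can be written out directly; the paper only says "increase", "dramatically increase", "decrease" and "slightly decrease", so the concrete amounts below are our assumptions.

```python
# The transaction rules above as one energy update (amounts are assumed).

def update_energy(cell_class, neigh_majority, sample_class, energy):
    if cell_class is None:                 # cannot classify: leave unchanged
        pass
    elif cell_class == sample_class:       # correct classification
        if neigh_majority == sample_class:
            energy += 1                    # neighbourhood agrees: increase
        else:
            energy += 3                    # correct against the neighbourhood:
                                           # dramatically increase
    else:                                  # misclassification
        if neigh_majority == sample_class:
            energy -= 2                    # neighbourhood is right: decrease
        elif neigh_majority == cell_class:
            energy -= 1                    # neighbourhood shares the error:
                                           # slightly decrease
    return energy - 1                      # every cell pays one point to live

print(update_energy("A", "B", "A", 5))   # -> 7 (correct, alone: +3, -1 to live)
print(update_energy("B", "A", "A", 5))   # -> 2 (wrong, neighbours right: -2, -1)
```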

The presented CA Simple approach was tested on five databases from the UCI repository. We compared the results with the following commonly used methods for decision tree induction, which are able to use more complex conditions than our CA Simple:

- greedy decision tree induction based on different purity measures (Quinlan's ID3, C4.5, C5 and Decrain),

- genetically induced decision trees - a hybrid between genetic algorithms and decision trees (Vedec), and

- boosting of decision trees - a method for generating an ensemble of classifiers by successively re-weighting the training instances; the AdaBoost algorithm introduced by Freund and Schapire was used.

The predictive power of the induced classifiers was evaluated on the basis of average accuracy (Eq. 1) and average class accuracy (Eq. 2).

\[
\mathit{accuracy} = \frac{\text{num. of correctly classified objects}}{\text{num. of all objects}}
\tag{1}
\]

\[
\mathit{accuracy}_c = \frac{\text{num. of correctly classified objects in class } c}{\text{num. of all objects in class } c},
\qquad
\text{average class accuracy} = \frac{\sum_c \mathit{accuracy}_c}{\text{num. of classes}}
\tag{2}
\]
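The two measures (Eq. 1 and Eq. 2) are easy to compute from a list of (true class, predicted class) pairs; that pair representation is our own choice for the sketch.

```python
# The two evaluation measures as code.

def accuracy(pairs):
    """Eq. 1: correctly classified objects over all objects."""
    return sum(1 for t, p in pairs if t == p) / len(pairs)

def average_class_accuracy(pairs):
    """Eq. 2: mean of the per-class accuracies."""
    classes = {t for t, _ in pairs}
    per_class = []
    for c in classes:
        in_c = [(t, p) for t, p in pairs if t == c]
        per_class.append(sum(1 for t, p in in_c if t == p) / len(in_c))
    return sum(per_class) / len(classes)

pairs = [("A", "A"), ("A", "A"), ("A", "B"), ("B", "B")]
print(accuracy(pairs))                # -> 0.75
print(average_class_accuracy(pairs))  # -> (2/3 + 1) / 2 = 0.8333...
```

Average class accuracy weights every class equally, so it penalizes classifiers that ignore minority classes - visible in the Hepatitis rows of Tables 1 and 2.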

Table 1. Accuracy of the induced classifiers on the test sets.

Dataset       | C4.5 | C5   | C5 Boost | ID3  | Decrain ID3 | Vedec | CA Simple
Breast Cancer | 96.4 | 95.3 | 96.6     | 92.7 | 95.7        | 96.1  | 97.4
Glass Type    | 71.8 | 83.6 | 92.7     | 69.1 | 83.6        | 87.3  | 81.8
Hepatitis     | 80.8 | 82.7 | 82.7     | 78.8 | 82.7        | 78.8  | 86.5
Iris          | 97.0 | 94.0 | 94.0     | 94.0 | 94.0        | 96.0  | 96.0
Pima          | 76.2 | 75.8 | 81.3     | 70.7 | 76.6        | 77.3  | 75.7


Table 1 shows the results in terms of the CCA's total accuracy on unseen cases compared to some very popular decision tree approaches. We can see that even this very simple CCA can outperform more prominent methods. The average class accuracy is presented in Table 2, which once again shows that the simple CCA performed best on two databases. In all other cases the results are comparable with those obtained with the other approaches, in terms of both total accuracy and average class accuracy. This shows that the self-organization ability is very promising and that more elaborate CCAs can produce even better results.

Table 2. Average class accuracy of the induced classifiers on the test sets.

Dataset       | C4.5 | C5   | C5 Boost | ID3  | Decrain ID3 | Vedec | CA Simple
Breast Cancer | 94.4 | 95.0 | 96.7     | 90.2 | 96.1        | 96.8  | 98.1
Glass Type    | 80.1 | 82.3 | 93.1     | 69.2 | 83.5        | 86.0  | 78.8
Hepatitis     | 58.7 | 59.8 | 59.8     | 63.7 | 59.8        | 75.7  | 50.0
Iris          | 94.2 | 94.2 | 94.2     | 94.2 | 94.2        | 96.1  | 96.1
Pima          | 73.0 | 70.5 | 79.2     | 66.1 | 70.4        | 76.0  | 72.0

6 Concluding Remarks and Future Work

The aim of this paper was to present the use of the self-organization ability of cellular automata in classification. In this way we combined classifiers into an ensemble that is eventually better than the individual classifiers or than traditional ensembles generated by boosting or bagging methods. We empirically showed that even a very simple CCA can give results superior to more complex classifiers or ensembles of such classifiers. This is very promising, and taking into account the simplicity of the tested CCA we foresee a lot of room for improvement. The advantages of the resulting self-organizing structure of cells in the CCA are problem independence, robustness to noise and no need for user input. Important research directions for the future are analyzing the resulting self-organized structure, the impact of the transaction rules on classification accuracy, the introduction of other social aspects of cell survival, and enlarging the classifier diversity through more complex cell classifiers.

References

1. Coello, C.A.C.: Self-Adaptive Penalties for GA-based Optimization. In: 1999 Congress on Evolutionary Computation, Washington D.C., Vol. 1, IEEE Service Center (1999) 573-580.

2. Dietterich, T.G.: Ensemble Methods in Machine Learning. In: Kittler, J., Roli, F. (eds.): First International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, Springer Verlag, New York (2000) 1-15.


3. Flocchini, P., Geurts, F., Santoro, N.: Compositional Experimental Analysis of Cellular Automata: Attraction Properties and Logic Disjunction. Technical Report TR-96-31, School of Computer Science, Carleton University (1996)

4. Freund, Y., Schapire, R.E.: Experiments with a New Boosting Algorithm. In: Proceedings of the Thirteenth International Conference on Machine Learning, Morgan Kaufmann, San Francisco (1996) 148-156.

5. Lenič, M., Kokol, P.: Combining Classifiers with Multimethod Approach. In: Proceedings of the Second International Conference on Hybrid Intelligent Systems, Santiago de Chile, December 1-4, 2002. IOS Press (2002) 374-383.

6. Kirkpatrick, S., Gelatt, C.D., Jr., Vecchi, M.P.: Optimization by Simulated Annealing. Science 220 (1983) 671-680.

7. Towell, G., Shavlik, J.: The Extraction of Refined Rules from Knowledge-Based Neural Networks. Machine Learning 13 (1993) 71-101.

8. Zorman, M., Podgorelec, V., Kokol, P., Peterson, M., Sprogar, M., Ojstersek, M.: Finding the Right Decision Tree's Induction Strategy for a Hard Real World Problem. International Journal of Medical Informatics 63, Issues 1-2 (2001) 109-121.