

Engineering Applications of Artificial Intelligence 17 (2004) 313–325

Real/binary-like coded versus binary coded genetic algorithms to automatically generate fuzzy knowledge bases: a comparative study

Sofiane Achiche*, Luc Baron, Marek Balazinski

Department of Mechanical Engineering, École Polytechnique de Montréal, P.O. 6079, Station Centre-Ville, Montréal, Québec, Canada H3C 3A7

*Corresponding author. E-mail addresses: [email protected] (S. Achiche), luc.baron@polymtl.ca (L. Baron), [email protected] (M. Balazinski).

0952-1976/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.engappai.2004.04.006

Abstract

Nowadays fuzzy logic is increasingly used in decision-aided systems since it offers several advantages over other traditional decision-making techniques. The fuzzy decision support systems can easily deal with incomplete and/or imprecise knowledge applied to either linear or nonlinear problems. This paper presents the implementation of a combination of a Real/Binary-Like coded Genetic Algorithm (RBLGA) and a Binary coded Genetic Algorithm (BGA) to automatically generate Fuzzy Knowledge Bases (FKB) from a set of numerical data. Both algorithms allow one to reconcile the contradictory objectives of FKB precision and simplicity (high precision generally translates into a higher level of complexity), starting from a randomly generated population of potential FKBs. The RBLGA is divided into two principal coding methods: (1) a real coded genetic algorithm that maps the fuzzy sets repartition and number (which drives the number of fuzzy rules) into a set of real numbers and (2) a binary-like coded genetic algorithm that deals with the fuzzy rule base relationships (a set of integers). The BGA deals with the entire FKB using a single bit string, which is called a genotype. The RBLGA uses three reproduction mechanisms: a BLX-α crossover, a simple crossover and a fuzzy set reducer, while the BGA uses a simple crossover, a fuzzy set displacement mechanism and a rule reducer. Both GAs are tested on theoretical surfaces, and a comparison of their performances is discussed, along with the influence of some evolution criteria.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: Artificial intelligence; Fuzzy logic systems; Fuzzy knowledge bases; Automatic learning; Real coded genetic algorithms; Binary coded genetic algorithms

1. Introduction

Nowadays fuzzy logic is increasingly used in decision-aided systems since it offers several advantages over other traditional decision-making techniques. The fuzzy decision support systems (FDSS) can easily deal with incomplete and/or imprecise knowledge applied to either linear or nonlinear problems. In most cases decision-making systems are used when there is:

* an expert available to manually construct the FKB;
* an important nonlinearity of the modeled process.

Hence, it is difficult (in some cases impossible) to build a sufficiently good mathematical model that emulates the behavior of the problem to be solved. The construction of FKBs requires the evaluation of each potential solution (the generated FKBs), which makes it possible to establish the accuracy level by comparing their behaviors (outputs) to those in the learning data. The manual construction of an FKB requires the evaluation of each proposition made by the expert to measure its performance. If the result is not satisfactory, the expert determines the modifications either to the fuzzy sets repartition, to the fuzzy rule base or to both, in the hope of improving the accuracy. This process is extremely long and tedious since it is completely dependent on the expert's intuition, experience and understanding of the problem's behavior. Therefore, a multi-criteria optimization tool is highly desirable. Genetic algorithms (GAs) are powerful stochastic optimization methods (Goldberg, 1989) and are considered here as an optimization tool for the automatic generation of FKBs. The GA allows one to improve FKB performance in terms of accuracy and simplicity with respect to one or several performance criteria. This can be done automatically, i.e., without the need of expert intuition. This paper presents an overview of the FDSS software Fuzzy-Flou, developed at École Polytechnique (Canada) and the University of Silesia (Poland) (Balazinski et al., 1993), which was used for the validation tests. Then, it presents a brief description of the BGA and an extended one of the RBLGA used in this work, explaining the specifications of the reproduction and mutation mechanisms of each. Finally, validation results are presented along with a comparative study of their performances, inspected through different evolution parameters and performance paradigms.

2. Fuzzy decision support system

In this section we present a rule-based approach to decision making using fuzzy logic techniques, based on the compositional rule of inference (CRI). This approach is used to handle uncertain knowledge and was developed in the 1960s by Zadeh (1973). Such knowledge can be collected and delivered by a human expert (e.g. decision-maker, designer, process planner, machine operator, etc.). The CRI may be written in the form:

$$U = (C \times \cdots \times B \times A) \circ R, \tag{1}$$

where $R$ represents the global relation that aggregates all the fuzzy rules, $(A, B, \ldots, C)$ represents the inputs (observations) and $U$ represents the output (conclusion). The symbol $\circ$ represents the CRI operator. Three defuzzification methods are usually available, i.e., center of gravity (COG), average of maximums (AOM) and the modified center of gravity (MCOG) (Balazinski et al., 1993). The FKB consists of two components: the linguistic term base (database) and the fuzzy production rule base. The database is divided into two parts: fuzzy premises and fuzzy conclusions. Fig. 1 shows a screen printout of the premises and a conclusion (on the right), the fuzzy rules and settings (on the left) of the FDSS Fuzzy-Flou software. FKBs can be divided into two main categories:

* multiple inputs single output systems (MISO);
* multiple inputs multiple outputs systems (MIMO).

In this paper only MISOs are considered. The defuzzification mechanism used is the COG. The fuzzy rules are a finite number of heuristic fuzzy rules of the if-then type.
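As an illustration of COG defuzzification, a minimal Python sketch is given below. It assumes a triangular membership function sampled on a regular grid and a discrete (sum-based) center of gravity; the names `triangle` and `center_of_gravity`, the clipping level and the grid are hypothetical and do not describe the Fuzzy-Flou implementation.

```python
def triangle(x, a, m, b):
    """Membership degree of x in a triangular fuzzy set.

    a and b are the feet of the triangle, m is its summit (height 1).
    """
    if x == m:
        return 1.0
    if a < x < m:
        return (x - a) / (m - a)
    if m < x < b:
        return (b - x) / (b - m)
    return 0.0


def center_of_gravity(xs, mus):
    """Discrete center of gravity of a sampled output fuzzy set."""
    total = sum(mus)
    if total == 0.0:
        raise ValueError("the output fuzzy set is empty")
    return sum(x * mu for x, mu in zip(xs, mus)) / total


# Hypothetical example: a conclusion set with summit 0.5, clipped at 0.6
# (the clipping level stands for the firing strength of a rule).
xs = [i / 100 for i in range(101)]
mus = [min(0.6, triangle(x, 0.0, 0.5, 1.0)) for x in xs]
crisp = center_of_gravity(xs, mus)
```

Because the clipped set is symmetrical about its summit, the crisp output in this example lies at the summit position.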

3. Automatic generation of fuzzy knowledge bases using GAs

The automatic generation of fuzzy knowledge bases is performed using a GA. A GA is based on an analogy with the mechanics of natural genetics, and imitates the Darwinian survival-of-the-fittest approach (Baron, 1998). The GA uses four basic operations: crossover, mutation, evaluation and natural selection. Crossover and mutation are used respectively to generate and modify an FKB genotype, and natural selection sorts the different FKBs according to the performance criteria.

Several studies have used GAs for the automatic generation of FKBs (Cordón et al., 2000; Diederich and Renaud, 1999; Surmann and Selenschtschikow, 2002; Valenzuela-Rendon, 1991). As shown in Fig. 2, each individual of a population is a potential FKB. The method uses iterative improvement of individuals at each generation to converge toward multiple optima simultaneously.

Fig. 1. Screen shot of the FDSS Fuzzy-Flou.

Fig. 2. The GA learning process of an FDSS Fuzzy-Flou knowledge base.

Fig. 3. Fuzzy set.

When the number of unknown parameters increases, the GA exhibits only a polynomial increase in complexity (Deb et al., 2000; Lobo et al., 2000), while other optimization techniques show an exponential increase. Fig. 2 presents the encoding/decoding scheme as well as the four basic operations of the developed GA learning software (Baron et al., 2002).

3.1. Binary coded genetic algorithm

The binary representation of the genotype (BGA) has dominated most of the work using GAs, since the efficiency of binary coded GAs was proved theoretically (Goldberg, 1989). The other obvious reason is the rather simple numerical implementation of the BGA. This evolutionary process operates directly on the genotype of individuals (i.e., the physical characteristics coded into a bit string) rather than on their phenotype (i.e., the physical characteristics themselves).

3.1.1. Coding

The genotype $G$ of an FKB is the coding of the fuzzy sets and the rule base into a bit-string:

$$G \equiv \{G_{sets}, G_{rules}\}, \tag{2}$$

where $G_{sets}$ and $G_{rules}$ are respectively the genotypes of the fuzzy sets and the fuzzy rules.

3.1.1.1. Input/output premises. The FDSS Fuzzy-Flou allows the use of trapezoidal membership functions (see Fig. 3). For the sake of coding simplicity, we consider only non-symmetrical triangular fuzzy sets on the premises and symmetrical triangular fuzzy sets on the conclusion. Therefore the position of each fuzzy set is given as $m_1 = m_2 =$ position of the summit, $a_m$ and $b_m$ are set to reach the positions of the previous and next summit, while $h_m = 1$, as shown in Fig. 3. The size of $G_{sets}$ depends on the number of premises $N$, the number of fuzzy sets $N_i$ on each premise $i$ and the number of bits $b_r$ allocated to specify the resolution on the position. For example, if $b_r = 4$, the genotype of the fuzzy sets of premise $i$ is given as

$$G_{X_i} \equiv \{ \underbrace{1001}_{\text{summit } 1}\ \underbrace{1110}_{\text{summit } 2}\ \cdots\ \underbrace{0111}_{\text{summit } K_i} \}, \tag{3}$$

where $K_i$ is the number of fuzzy sets on premise $X_i$ excluding the two summits located at the extreme values of each premise. The total size of $G_{sets}$ is given as

$$\mathrm{size}(G_{sets}) = \left( \sum_{i=1}^{N} K_i b_r \right) + K_y b_r, \tag{4}$$

where $K_y$ is the number of fuzzy sets on the conclusion. On the conclusion, however, the number of fuzzy sets is equal to the number of coded summits, since the limits are also coded.
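The bit-string coding of the fuzzy sets and Eq. (4) can be sketched in Python as follows. The linear mapping of each bit field onto the range of a premise is an assumption made for illustration (the text only states that $b_r$ bits set the resolution on the position), and the function names are hypothetical.

```python
def gsets_size(k_premises, k_conclusion, b_r):
    """Eq. (4): number of bits in G_sets.

    k_premises   -- coded summits K_i on each premise (limits excluded)
    k_conclusion -- coded summits K_y on the conclusion
    b_r          -- bits per summit position
    """
    return sum(k * b_r for k in k_premises) + k_conclusion * b_r


def decode_summits(bits, b_r, lower, upper):
    """Split a bit string into b_r-bit fields and map each to a position.

    Assumption: field value 0 maps to `lower` and the largest field
    value (2**b_r - 1) maps to `upper`, linearly in between.
    """
    if len(bits) % b_r != 0:
        raise ValueError("bit string length must be a multiple of b_r")
    levels = 2 ** b_r - 1
    return [
        lower + int(bits[i:i + b_r], 2) / levels * (upper - lower)
        for i in range(0, len(bits), b_r)
    ]


# Two premises with four coded summits each, eight conclusion summits,
# four bits per summit.
total_bits = gsets_size([4, 4], 8, 4)
positions = decode_summits("10011110", 4, 0.0, 15.0)
```

With these settings the fuzzy-set part of the genotype occupies 64 bits, and the fields `1001` and `1110` of Eq. (3) decode to resolution steps 9 and 14 of 15.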

3.1.1.2. Fuzzy rules. The genotype of the fuzzy rules must contain information about all of the possible combinations connecting one fuzzy set on each premise to a fuzzy set on the conclusion. For $N$ input premises and $K_i$ fuzzy sets on each premise $i$ (including the limits), the maximum number of fuzzy rules $K$ is computed as

$$K = (K_1) \times (K_2) \times \cdots \times (K_N). \tag{5}$$

The fuzzy rules are coded as an ordered list of combinations of the premises, each having an enable/disable bit, denoted $e$ (0 for disable, 1 for enable), together with a conclusion fuzzy set number. Each rule is coded into a 4-bit string, i.e.,

$$G_{rules} \equiv \{ \underbrace{e101}_{\text{rule } 1}\ \underbrace{e111}_{\text{rule } 2}\ \cdots\ \underbrace{e011}_{\text{rule } K} \}. \tag{6}$$
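Eqs. (5) and (6) can be illustrated with a short Python sketch. It assumes that the three bits following the enable bit are the binary index of the conclusion fuzzy set (consistent with at most eight conclusion sets); the function names are hypothetical.

```python
from math import prod


def max_rules(sets_per_premise):
    """Eq. (5): maximum number of rules, one per combination of premise sets.

    sets_per_premise -- number of fuzzy sets K_i on each premise,
                        limits included.
    """
    return prod(sets_per_premise)


def decode_rules(bits):
    """Decode G_rules into (enabled, conclusion_index) pairs.

    Assumption: each rule uses 4 bits, the enable bit e followed by a
    3-bit conclusion fuzzy set number.
    """
    if len(bits) % 4 != 0:
        raise ValueError("rule genotype length must be a multiple of 4")
    rules = []
    for i in range(0, len(bits), 4):
        field = bits[i:i + 4]
        rules.append((field[0] == "1", int(field[1:], 2)))
    return rules


k = max_rules([6, 6])             # two premises with six fuzzy sets each
rules = decode_rules("11010111")  # rule 1 enabled, rule 2 disabled
```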

3.1.2. BGA reproduction mechanisms

Fig. 4. Simple crossover.

Fig. 5. Fuzzy-sets displacement.

Fig. 6. Fuzzy-rules reduction.

Fig. 7. Mutation of a genotype.

The evolution of the population is achieved by reproduction of the best individuals based on their ability to survive natural selection. This reproduction is performed with a combination of the following four operators:

1. Simple crossover: The main reproduction mechanism is performed by crossing the genotypes of the parents in order to obtain the genotypes of two children. The crossover technique used in the BGA is a simple crossover (Syswerda, 1989), as shown in Fig. 4. This part of the reproduction mechanism is governed by an initiating probability $p_1$.

2. Fuzzy sets displacement: Displacement of the fuzzy sets is performed (with an initiating probability $p_2$) by randomly selecting a fuzzy set on a premise. The summit of the selected fuzzy set is then moved by one step of resolution toward the left or the right, with an equal probability (see Fig. 5). This reproduction operator has the virtue of trying different fuzzy set repartitions, while decreasing the number of fuzzy sets by superimposing two or more of them.

3. Fuzzy-rules reduction: The reduction of the number of fuzzy rules is performed with a probability $p_3$ given by

$$p_3 = 1 - p_1 - p_2. \tag{7}$$

One of the $K$ fuzzy rules is randomly selected and deactivated (the bit $e$ is set to disable), as shown in Fig. 6. Obviously, this reproduction operator does not always generate a reduction in the number of fuzzy rules (the case when $e$ is already set to 0), but gradually works in that direction. The bias toward the reduction of the number of rules tends to produce less complex FKBs.

4. Mutation: Mutation is a random inversion of a bit in the genotype of a new member of the population, as shown in Fig. 7. Mutation makes it possible to try completely different solutions (Deb and Agrawal, 1998). The probability of mutation $p_4$ should be kept very small in order to give the other reproduction operators precedence for improving the population. This way of seeking completely different solutions allows the algorithm to jump out of a local optimum, and potentially fall into more promising regions.

Fig. 8. Coding of a fuzzy rules set.
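Three of the BGA operators described above (simple crossover, fuzzy-rules reduction and mutation) can be sketched on bit strings as follows. This is a minimal illustration under stated assumptions: genotypes are Python strings of `0`/`1`, each rule occupies 4 bits with the enable bit first, and the function names are hypothetical. The fuzzy sets displacement is omitted because it depends on how summit positions are decoded.

```python
import random


def simple_crossover(father, mother, rng):
    """One-point crossover: the parents exchange their end parts."""
    if len(father) != len(mother):
        raise ValueError("parents must have the same length")
    site = rng.randrange(1, len(father))  # crossover site
    boy = father[:site] + mother[site:]
    girl = mother[:site] + father[site:]
    return boy, girl


def reduce_rules(g_rules, rng, bits_per_rule=4):
    """Deactivate one randomly selected rule by clearing its enable bit."""
    rule = rng.randrange(len(g_rules) // bits_per_rule)
    pos = rule * bits_per_rule
    return g_rules[:pos] + "0" + g_rules[pos + 1:]


def mutate(genotype, rng):
    """Invert one randomly selected bit of the genotype."""
    site = rng.randrange(len(genotype))
    flipped = "1" if genotype[site] == "0" else "0"
    return genotype[:site] + flipped + genotype[site + 1:]


rng = random.Random(0)  # seeded only to make the example repeatable
boy, girl = simple_crossover("11111111", "00000000", rng)
reduced = reduce_rules("11011111", rng)
mutant = mutate("10100000", rng)
```

As noted in the text, `reduce_rules` leaves the genotype unchanged when the selected rule is already disabled.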

3.2. Real/binary like coded genetic algorithm

The success of GAs lies mainly in their evolution paradigm rather than in the way they are coded, hence the emergence of real coded genetic algorithms (RCGA) in recent years, where they have mostly been used in numerical optimization (Herrera et al., 1995; Ono et al., 2003).

The use of RCGAs overcomes one of the BGA's weaknesses, which is the low resolution of the solutions (Baron et al., 2002; Achiche et al., 2002). Moreover, most optimization work is done in a continuous mathematical space (real space). However, the FKBs contain two cooperative but distinct parts:

* the premises and the conclusion, along with the fuzzy sets on them;
* the fuzzy rule base.

They are distinct in that the first one deals with real numbers while the second one uses integer numbers, since the fuzzy rules are simple pointers to the index of a fuzzy set on the conclusion. For this reason, we used a combination of an RCGA and a BGA in which the binary part is mapped into a string of integers; the algorithm is called the Real/Binary-Like coded Genetic Algorithm (RBLGA).

3.3. Coding

The genotype $RG$ corresponds to several independent sets of reals and a set of integers:

$$RG \equiv \{RG_{sets}, RG_{rules}\}, \tag{8}$$

where $RG_{sets}$ and $RG_{rules}$ are respectively the genotypes of the fuzzy sets and the fuzzy rules. The genotype can be described as follows:

Input/output premises: There are as many real number sets as there are premises in the problem, plus one set for the conclusion. Each set contains a predefined maximum number of real numbers representing the location of the summit of each fuzzy set on each premise and the conclusion. The two summits located at the minimum and maximum limits of each premise and the conclusion are not coded, since they are constant throughout the evolution (similar to the BGA coding).

As in the BGA, we consider non-symmetrical overlapping triangular fuzzy sets for the premises and symmetrical triangular fuzzy sets for the conclusion. The genotype of the fuzzy sets of premise $i$ is given as

$$RG_{X_i} \equiv \{ \underbrace{x_1}_{\text{summit } 1}\ \underbrace{x_2}_{\text{summit } 2}\ \cdots\ \underbrace{x_{K_i}}_{\text{summit } K_i} \}, \tag{9}$$

where $K_i$ is the number of fuzzy sets on the premise $i$ (or the conclusion). The limits of the premises are not included in the sets. $RG_{sets}$ is then given as

$$RG_{sets} \equiv \{ \underbrace{RG_{X_1}}_{\text{premise } 1}, \underbrace{RG_{X_2}}_{\text{premise } 2}, \ldots, \underbrace{RG_{X_i}}_{\text{premise } i}, \ldots, \underbrace{RG_{X_c}}_{\text{conclusion}} \}. \tag{10}$$

Fuzzy rules: The fuzzy rules are coded as a set of integers representing an ordered list of the combinations of the premises. Each integer in the set represents a conclusion fuzzy set summit (see Fig. 8). The genotype of the fuzzy rules is given as

$$RG_{rules} \equiv \{ \underbrace{r_1}_{\text{rule } 1}, \underbrace{r_2}_{\text{rule } 2}, \ldots, \underbrace{r_K}_{\text{rule } K} \}. \tag{11}$$

The number of fuzzy rules $K$ is computed using Eq. (5).

The initial population of FKBs is composed of $P$ randomly generated FKBs. The genotype of each new solution contains all the sets mentioned above. However, as explained below, the size of the sets can decrease.
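One possible in-memory layout for the real/binary-like genotype of Eqs. (8)-(11) is sketched below in Python. The class and function names are hypothetical, and several details are assumptions made for illustration: summits are drawn uniformly inside the range of each premise and kept sorted, and the two uncoded limit sets are counted on every axis when sizing the rule list with Eq. (5).

```python
import random
from dataclasses import dataclass
from math import prod


@dataclass
class RealGenotype:
    sets: list   # RG_sets: one list of summits per premise, conclusion last
    rules: list  # RG_rules: one conclusion fuzzy set index per rule


def random_genotype(premise_bounds, conclusion_bounds, coded_summits, rng):
    """Build a random FKB genotype.

    premise_bounds    -- (lower, upper) limits of each premise
    conclusion_bounds -- (lower, upper) limits of the conclusion
    coded_summits     -- coded summits per premise, then for the conclusion
    """
    *premise_counts, conclusion_count = coded_summits
    all_bounds = list(premise_bounds) + [conclusion_bounds]
    sets = [
        sorted(rng.uniform(lo, hi) for _ in range(count))
        for (lo, hi), count in zip(all_bounds, coded_summits)
    ]
    # Assumption: the two limit summits are added back on each axis.
    n_rules = prod(count + 2 for count in premise_counts)
    n_conclusion_sets = conclusion_count + 2
    rules = [rng.randrange(n_conclusion_sets) for _ in range(n_rules)]
    return RealGenotype(sets, rules)


rng = random.Random(0)
genotype = random_genotype([(0.0, 1.6), (0.0, 1.4)], (0.0, 1.0), [4, 4, 6], rng)
```

An initial population is then a list of $P$ such genotypes.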

3.3.1. RBLGA reproduction mechanisms

Reproduction is performed by crossover of the parents' genotypes to obtain the genotype of one offspring (or of two offspring). The reproduction of the FKBs in the RBLGA is performed through three reproduction mechanisms, each with a specific purpose, as explained below.

3.3.1.1. Multi crossover. The multi-crossover mechanism is a combination of two crossovers applied to different parts of the genotype.

Premises/conclusion crossover: The mechanism used is called blending crossover α (BLX-α) (Eshelman and Schaffer, 1993), where $\alpha$ determines the exploitation/exploration level of the offspring obtained from the selected parents. Exploitation indicates the usage of the interval between the values of the two parents; exploration uses an interval outside of these two limits, hence trying new solutions (see Fig. 9).

Fig. 9. Blended crossover α (BLX-α).

Fig. 10. BLGA simple crossover.

If $A = \{x_1, x_2, \ldots, x_i, \ldots, x_n\}$ and $B = \{y_1, y_2, \ldots, y_i, \ldots, y_n\}$ represent the two selected parents and $C$ the offspring obtained by the crossover of $A$ and $B$, then $C = \{z_1, z_2, \ldots, z_i, \ldots, z_n\}$, where each $z_i$ is randomly selected in the interval $[\min_i - I\alpha,\ \max_i + I\alpha]$, where:

* $\max_i = \max\{x_i, y_i\}$;
* $\min_i = \min\{x_i, y_i\}$;
* $I = \max_i - \min_i$.

Fig. 9 shows the above-described mechanism. The parameter $\alpha$ controls the exploitation/exploration level, knowing that:

* in the exploitation zone, the offspring inherits behaviors close to those of its parents;
* in the exploration zone, the offspring is the result of an exploration, therefore its attributes will be distant from its parents' average.

In order to avoid any bias in either direction (exploitation or exploration), the value of $\alpha$ is set to 0.5, which provides offspring in the zone named the relaxed exploitation zone (Herrera and Lozano, 2000) (see Fig. 9).
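The BLX-α interval defined above translates directly into a few lines of Python. The sketch below assumes that parents are lists of real summit positions of equal length and that each gene is drawn uniformly in its interval; the function name is hypothetical, and any clamping of offspring to the premise limits is left out.

```python
import random


def blx_alpha(parent_a, parent_b, alpha=0.5, rng=random):
    """Blend crossover: draw each gene in [min_i - I*alpha, max_i + I*alpha]."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    child = []
    for x, y in zip(parent_a, parent_b):
        lo, hi = min(x, y), max(x, y)
        spread = hi - lo  # I = max_i - min_i
        child.append(rng.uniform(lo - spread * alpha, hi + spread * alpha))
    return child


rng = random.Random(1)  # seeded only to make the example repeatable
child = blx_alpha([0.0, 2.0], [1.0, 2.0], alpha=0.5, rng=rng)
```

With $\alpha = 0.5$ and parent genes 0.0 and 1.0, the first gene of the offspring falls in $[-0.5, 1.5]$; when both parents share the same value, as for the second gene, the interval collapses and the offspring inherits that value.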

Fuzzy rules crossover: Since the part of the genotype representing the fuzzy rule base is composed of integer numbers, the crossover on this part of the genotype is done by a simple crossover. The use of a BLX crossover is not suitable in this case because of the integer nature of the values (the rules will tend to be the value zero most of the time). The operation is performed by swapping the end parts of the parents' sets at a randomly selected crossover site, as shown in Fig. 10. These two mechanisms are governed by an initiating probability $p_{r1}$.

3.3.1.2. Fuzzy set reducer. This mechanism aims to increase the simplicity level of the FKBs by randomly selecting a fuzzy set on a premise and erasing it together with the corresponding fuzzy rules. It makes it possible to obtain different and simpler solutions (i.e., FKBs containing less information). This mechanism is governed by the initiating probability $p_{r2}$.

3.3.1.3. Mutation. Mutation is the creation of an individual by altering a gene of an existing one. The probability $p_{r3}$ governs the occurrence of this mechanism. The mutation used in the RBLGA is a random (uniform) mutation (Michalewicz, 1992), applied to one randomly selected individual, as follows:

* an individual $C$ is randomly selected, $C = \{z_1, z_2, \ldots, z_i, \ldots, z_n\}$;
* a mutation site $i$ is randomly set in the interval $[1, n]$;
* the selected value $z_i$ is replaced by $z'_i = \mathrm{random}(a_i, b_i)$, where $a_i$ and $b_i$ are the limits enclosing $z_i$.
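The mutation steps listed above can be sketched as follows. The sketch assumes that the limits $(a_i, b_i)$ of every gene are supplied by the caller as a list of pairs; the function name and this argument are hypothetical.

```python
import random


def random_mutation(individual, limits, rng):
    """Uniform mutation of one randomly selected gene.

    individual -- list of real genes z_1 ... z_n
    limits     -- list of (a_i, b_i) pairs enclosing each gene
    Returns the mutated copy and the (0-based) mutation site.
    """
    site = rng.randrange(len(individual))
    a, b = limits[site]
    mutated = list(individual)  # the original individual is left untouched
    mutated[site] = rng.uniform(a, b)
    return mutated, site


rng = random.Random(2)  # seeded only to make the example repeatable
original = [0.2, 0.5, 0.8]
mutated, site = random_mutation(original, [(0.0, 1.0)] * 3, rng)
```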

4. Learning process

The learning process is formulated as an optimization problem applied to the numerical data, using the BGA and the RBLGA in order to produce near-optimal FKBs.

An FKB contains the following entities/information:

1. the number of premises (inputs) and the number of conclusions (outputs);
2. the number of fuzzy sets and their distribution on the premises and the conclusions; and
3. the fuzzy rules (fuzzy rule base).

Item 1 is a part of the problem's input data and all the features in items 2 and 3 are a part of the learning process. The maximal complexity on each premise (i.e., maximal number of fuzzy sets) is fixed at the beginning of the optimization and therefore these entities are not a part of the learning process (the maximal complexity can differ from premise to premise). After a few executions, the maximal complexity can be readjusted to a higher number if required.

The goal of the learning process is to generate FKBs while maximizing the performance criteria in terms of accuracy ($\phi_{rms}$) and simplicity level ($\phi_{si}$). Criteria $\phi_{rms}$ and $\phi_{si}$ are defined in Section 4.1. The optimization problem can be defined as follows:

For the BGA:

$$\max f(\phi_{rms}, \phi_{si}) \quad \text{with } G: \text{binary genotype}. \tag{12}$$

For the RBLGA:

$$\max \phi_{rms} \quad \text{with } RG: \text{real genotype}. \tag{13}$$

4.1. Performance criteria

The performance criteria allow one to compute the ratings of each FKB. Those performance ratings are used by the RBLGA and the BGA in order to perform natural selection. The principal performance criterion is the accuracy level of the FKB (approximation error) in reproducing the outputs of the learning data. The approximation error of the FKB is measured using the root-mean-square error

$$\Delta_{rms} = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \left( \mathrm{GAoutput}_i - \mathrm{dataoutput}_i \right)^2}, \tag{14}$$

where $M$ represents the number of points in the sampled data. The fitness value is evaluated as a percentage of the conclusion length base ($L$), as follows:

$$\phi_{rms} = \frac{L - \Delta_{rms}}{L} \times 100, \tag{15}$$

with $(100 - \phi_{rms})$ being the error percentage.

The RBLGA uses the value of $\phi_{rms}$ as the only evaluation criterion to perform the natural selection, whereas the BGA uses a second performance criterion that rates the simplicity of the FKBs ($\phi_{si}$). The use of this additional criterion is due to the particularities of the reproduction mechanisms (fuzzy set displacement and fuzzy rules reduction), which are less straightforward than the ones used by the RBLGA when it comes to reducing the size of the FKB. The fitness value, denoted $\phi_{si}$, is defined as

$$\phi_{si} \equiv \frac{K - n_a}{K}, \tag{16}$$

where $K$ is the maximum number of fuzzy rules (using the initial complexity) and $n_a$ is the number of active rules of the FKB under evaluation. In order to choose between these two contradictory objectives, the BGA uses a weighted sum of the two objective functions, i.e.:

$$\phi_{BGA} = \omega \phi_{rms} + (1 - \omega) \phi_{si}, \tag{17}$$

where $\omega$ sets the bias between $\phi_{rms}$ and $\phi_{si}$.
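Eqs. (14)-(17) can be checked numerically with a short Python sketch. The function names are hypothetical, and the sample values below are invented purely to exercise the formulas.

```python
from math import sqrt


def delta_rms(ga_outputs, data_outputs):
    """Eq. (14): root-mean-square error over the M sampled points."""
    m = len(data_outputs)
    if m == 0 or m != len(ga_outputs):
        raise ValueError("both sequences must hold the same M > 0 points")
    return sqrt(sum((g - d) ** 2 for g, d in zip(ga_outputs, data_outputs)) / m)


def phi_rms(ga_outputs, data_outputs, length):
    """Eq. (15): accuracy as a percentage of the conclusion length L."""
    return (length - delta_rms(ga_outputs, data_outputs)) / length * 100


def phi_si(k_max, n_active):
    """Eq. (16): share of the K possible rules that are not active."""
    return (k_max - n_active) / k_max


def phi_bga(accuracy, simplicity, omega):
    """Eq. (17): weighted sum used by the BGA."""
    return omega * accuracy + (1 - omega) * simplicity


# Invented sample: two points, each off by 1, conclusion length L = 10.
accuracy = phi_rms([2.0, 2.0], [1.0, 3.0], length=10.0)
simplicity = phi_si(k_max=36, n_active=12)
fitness = phi_bga(accuracy, simplicity, omega=1.0)
```

Here $\Delta_{rms} = 1$, so the accuracy is 90% of the conclusion length, and with $\omega = 1$ the BGA fitness reduces to $\phi_{rms}$.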

4.2. Natural selection

Natural selection is performed on the population by keeping the most promising individuals based on their fitness value. This is equivalent to using solutions that are closest to the optimum. For convenience, we keep the size of the population constant.

4.2.1. Natural selection in the BGA

The first generation starts with $P$ FKBs and an additional $P$ FKBs are generated by reproduction and mutation. These brand new FKBs are then evaluated. Natural selection is applied on the $2 \times P$ FKBs by ranking them based on $\phi_{BGA}$ and $\phi_{rms}$. We keep the first $P/2$ non-identical FKBs of each of the two lists; identical FKBs are skipped to maintain diversity in the population.

4.2.2. Natural selection in the RBLGA

The first generation begins with $P$ FKBs, and the same number of additional FKBs are generated by reproduction and mutation. In the RBLGA, natural selection is applied on the $2 \times P$ FKBs by ordering them according to the principal performance criterion $\phi_{rms}$ and keeping the first $P$ FKBs.

The size $P$ has to be selected depending upon the performance of the computer in use. A high value of $P$ generally ensures a better diversity in the population, which helps to avoid premature convergence. However, it increases the learning time.
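The two selection schemes can be sketched as follows. The function names are hypothetical, and the handling of duplicates in the BGA case (skipping any FKB already kept, then taking $P/2$ from each ranking) is one possible reading of the description above rather than a confirmed detail.

```python
def select_rblga(population, phi_rms, p):
    """Keep the P best of the 2P FKBs, ranked on phi_rms only."""
    return sorted(population, key=phi_rms, reverse=True)[:p]


def select_bga(population, phi_bga, phi_rms, p):
    """Keep the first P/2 non-identical FKBs of each of the two rankings."""
    survivors = []
    for criterion in (phi_bga, phi_rms):
        taken = 0
        for fkb in sorted(population, key=criterion, reverse=True):
            if taken == p // 2:
                break
            if fkb not in survivors:  # skip FKBs that were already kept
                survivors.append(fkb)
                taken += 1
    return survivors


# Toy example: integers stand in for FKBs and for their fitness values.
best_rblga = select_rblga([3, 1, 4, 1, 5, 9, 2, 6], lambda fkb: fkb, p=4)
best_bga = select_bga(
    [1, 2, 3, 4, 5, 6, 7, 8],
    phi_bga=lambda fkb: fkb,
    phi_rms=lambda fkb: -fkb,
    p=4,
)
```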

5. Validation results

The BGA and RBLGA learning performances are now investigated using three examples of known behavior, in the form of 3D surfaces of type $z = f(x, y)$, where the nodes are the learning set of sampled data. We have used three surfaces of different complexities to get a better idea of the generality of the results (rather than a result specific to one surface). The evolution and selection criteria used for both algorithms are set to the following values:

BGA's evolution/selection criteria:

* $p_1 = 85.0\%$;
* $p_2 = 13.5\%$;
* $p_3 = 1.5\%$;
* $p_4 = 5.0\%$;
* $\omega = 1.0$; and
* maximal complexity: six fuzzy sets (including the limits) on each premise and 8 on the conclusion (8 being the maximum allowed by the coding resolution).

From this set of parameters we can say that the dominant evolution of the BGA is performed using the simple crossover with a probability $p_1$. The emphasis on the fuzzy-rules reduction mechanism is set at a low level to let the FKBs evolve toward a lower complexity level naturally rather than in a forced way. With $\omega = 1$ the selection mechanism relies on accuracy alone, i.e., $\phi_{BGA} = \phi_{rms}$. These values have been chosen with respect to the conclusions drawn from the work presented in Balazinski et al. (1993).

Fig. 12. Theoretical exponential surface.

RBLGA's evolution/selection criteria:

* $p_{r1} = 85.0\%$;
* $p_{r2} = 15.0\%$;
* $p_{r3} = 5.0\%$; and
* maximal complexity: six fuzzy sets (including the limits) on each premise and 8 on the conclusion (no limits on the number, since it can match the number of fuzzy rules).

The evolution is mostly governed by the multi-crossover reproduction mechanism; however, the weight of the fuzzy set reducer is not negligible, which will tend to produce simpler FKBs.

Note 1: In the RBLGA the value of $p_{r2}$ is equal to the sum of the probabilities $p_2$ and $p_3$ of the BGA. This equality sets a good basis for comparison, since the mechanisms driven by these probabilities perform similar changes in the FKBs, i.e., reducing the size of the fuzzy rule base.

Note 2: The maximal complexity is set in order to fix the maximum size of the genotypes; it can be changed if the early results are not satisfactory. However, the number of variables taking part in the learning process can decrease through the generations and, moreover, in the same population both the BGA and the RBLGA can deal with individuals of different sizes below the fixed maximum.

5.1. Theoretical surfaces

The theoretical surfaces used for the learning process are as follows:

1. Sinusoid surface. The theoretical sinusoid surface (Fig. 11) is defined as

$$z = \sin(xy) \quad \text{with } 0 \le x \le 1.6,\ 0 \le y \le 1.4. \tag{18}$$

Fig. 11. Theoretical sinusoid surface.

2. Exponential surface. The theoretical concave surface (Fig. 12) is defined as

$$z = \exp(x^2 + y^2) \quad \text{with } -1.5 \le x \le 1.5,\ -2 \le y \le 2. \tag{19}$$

3. Hyper-tangent surface. The theoretical hyper-tangent surface (Fig. 13) is defined as

$$z = \tanh(x(x^2 + y^2)) \quad \text{with } 0.2 \le x \le 1.4,\ 0.2 \le y \le 1.4. \tag{20}$$

Fig. 13. Theoretical hyper-tangent surface.
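A learning set of nodes can be sampled from these surfaces as sketched below. The sketch takes Eqs. (18)-(20) as printed, with the ranges of Eq. (19) assumed symmetric about zero; the regular grid, the number of nodes per axis and all names are assumptions made for illustration, since the sampling density is not stated here.

```python
from math import exp, sin, tanh

# (function, x range, y range) for each theoretical surface.
SURFACES = {
    "sinusoid": (lambda x, y: sin(x * y), (0.0, 1.6), (0.0, 1.4)),
    "exponential": (lambda x, y: exp(x**2 + y**2), (-1.5, 1.5), (-2.0, 2.0)),
    "hyper_tangent": (
        lambda x, y: tanh(x * (x**2 + y**2)),
        (0.2, 1.4),
        (0.2, 1.4),
    ),
}


def sample_surface(name, nodes_per_axis):
    """Return (x, y, z) learning nodes on a regular grid."""
    if nodes_per_axis < 2:
        raise ValueError("at least two nodes per axis are required")
    f, (x_lo, x_hi), (y_lo, y_hi) = SURFACES[name]
    step_x = (x_hi - x_lo) / (nodes_per_axis - 1)
    step_y = (y_hi - y_lo) / (nodes_per_axis - 1)
    nodes = []
    for i in range(nodes_per_axis):
        for j in range(nodes_per_axis):
            x = x_lo + i * step_x
            y = y_lo + j * step_y
            nodes.append((x, y, f(x, y)))
    return nodes


learning_set = sample_surface("sinusoid", nodes_per_axis=5)
```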

5.1.1. FKBs accuracy level

To measure the accuracy levels (fitness levels) of both the BGA and the RBLGA in generating FKBs, and for the sake of comparison, several runs have been made on the three surfaces cited above. The population size $P$ was set to 100. Runs were performed for 10, 50, 100, 500, 1000 and 2500 generations. The fitness level of the best individual at the last generation was recorded. The average of the three results, one per theoretical surface, was computed, as shown in Fig. 14 and compiled in Table 1.

Fig. 14. Fitness values: RBLGA versus BGA.

Table 1
Average $\phi_{rms}$ percentage versus the number of generations

| Population size | Generation number | Fitness BGA (%) | Fitness RBLGA (%) |
|---|---|---|---|
| 100 | 10 | 68.15 | 79.93 |
| 100 | 50 | 77.73 | 93.38 |
| 100 | 100 | 80.52 | 95.08 |
| 100 | 500 | 86.49 | 95.72 |
| 100 | 1000 | 89.99 | 95.78 |
| 100 | 2500 | 91.65 | 95.94 |

Table 2
Size of the rule base versus the number of generations

| Population size | Generation number | Rule base size BGA | Rule base size RBLGA |
|---|---|---|---|
| 100 | 10 | 14.33 | 23.67 |
| 100 | 50 | 12 | 19 |
| 100 | 100 | 12.33 | 19 |
| 100 | 500 | 11 | 19 |
| 100 | 1000 | 11.33 | 19 |
| 100 | 2500 | 12.33 | 19 |

Fig. 14 shows the superiority of the RBLGA when it comes to accuracy for the same number of evaluated solutions (number of generations × population size). For instance, the highest level of accuracy obtained by the BGA after the exploration of 250,000 FKBs (i.e., 91.65%) is below the one achieved by the RBLGA after exploring 5000 FKBs, meaning that the RBLGA is able to create a more versatile population of FKBs faster than the BGA. The BGA was behind in the accuracy race even when 50 times more solutions were explored (250,000 versus 5000). This leads to the conclusion that when dealing with a complex model to map into an FKB, the RBLGA will most probably produce a more accurate FKB.
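The comparison in terms of evaluated solutions can be checked with a few lines of Python. The sketch below copies the averaged fitness values of Table 1 and looks for the first RBLGA run that exceeds the best BGA result; the variable and function names are illustrative only.

```python
POPULATION_SIZE = 100

# (generations, BGA fitness %, RBLGA fitness %), copied from Table 1.
TABLE_1 = [
    (10, 68.15, 79.93),
    (50, 77.73, 93.38),
    (100, 80.52, 95.08),
    (500, 86.49, 95.72),
    (1000, 89.99, 95.78),
    (2500, 91.65, 95.94),
]


def evaluated_solutions(generations, population_size=POPULATION_SIZE):
    """Number of FKBs explored: generations times population size."""
    return generations * population_size


def first_run_exceeding(target, column):
    """Evaluation count of the first run whose fitness in `column` exceeds target."""
    for row in TABLE_1:
        if row[column] > target:
            return evaluated_solutions(row[0])
    return None


best_bga = max(row[1] for row in TABLE_1)                # 91.65
bga_budget = evaluated_solutions(2500)                   # 250000
rblga_budget = first_run_exceeding(best_bga, column=2)   # 5000
budget_ratio = bga_budget / rblga_budget                 # 50.0
```

With these figures the RBLGA passes the best BGA accuracy after 5000 evaluated FKBs, against 250,000 for the BGA, i.e., a factor of 50 in explored solutions.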

Fig. 14 also shows that the RBLGA tends to reach an accuracy plateau faster than the BGA, which raises the problem of premature convergence, very often a drawback of real coded GAs.

5.1.2. The simplicity level of the FKBs

In this section, the simplicity level of the FKBs is studied. The simplicity level of an FKB is inversely proportional to the number of fuzzy rules in the fuzzy rule base. The two main advantages of constructing simpler FKBs are:

1. a simple FKB is more amenable to manual tuning by a human expert;

2. a simple FKB tends to be a more general model of the existing problem, as reported by the authors in Balazinski et al. (2000).

Table 2 shows the averages of the rule base size obtained for the three different surfaces.

Fig. 15. Number of fuzzy rules: RBLGA versus BGA.

Fig. 15 shows that the BGA is more successful in creating simple FKBs, and that the RBLGA reached a simplicity plateau with only 19 fuzzy rules. Taking into account that the initial maximum number of fuzzy rules was 36, 19 fuzzy rules still represents approximately a 50% simplification rate. The difference in the simplicity level between the BGA and the RBLGA is quite predictable from the differences in the reproduction mechanisms. The BGA simplifies the FKBs in three different ways:

1. the first randomly generated population of individuals is already created with inactive fuzzy rules;

2. the Fuzzy Sets Displacement reproduction mechanism can reduce a set of fuzzy rules if a summit is separated by a step of resolution from its neighbor;

3. the Fuzzy Rules Reduction reproduction mechanism randomly reduces the rules.

In contrast, the only way the RBLGA reduces rules is by reducing the number of fuzzy sets on the premises (by erasing rather than displacing). The lowest number of fuzzy rules the BGA proposed is around 12 (a simplification of ≈67%), which is better than the RBLGA.
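The simplification rates quoted above follow from a simple ratio, sketched below. The assumption made here is that the maximum of 36 rules corresponds to a full grid of six fuzzy sets on each of the two premises (x and y); the function names are illustrative.

```python
def max_rules(sets_per_premise, n_premises=2):
    """Size of a full rule grid (assumed: one rule per combination of premise sets)."""
    return sets_per_premise ** n_premises


def simplification_rate(active_rules, maximum_rules):
    """Fraction of the maximum rule base that has been removed."""
    return 1.0 - active_rules / maximum_rules


full_grid = max_rules(6)                          # 36 rules
rblga_rate = simplification_rate(19, full_grid)   # about 0.47
bga_rate = simplification_rate(12, full_grid)     # about 0.67
```

The RBLGA plateau of 19 rules thus corresponds to roughly 47% of the rules removed (rounded to 50% in the text), and the BGA's 12 rules to roughly 67%.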

5.1.3. Learning time

We studied the accuracy and the complexity/simplicity level of the FKBs. However, an important aspect of the automatic generation of FKBs is the learning time of the GAs. In this section, we explore the learning time (LT) of the BGA versus the LT of the RBLGA. The comparison is based on the average LT obtained for the same three surfaces, using the same evolution/selection criteria. Fig. 16 illustrates the results that were compiled in Table 3. The BGA clearly outperforms the RBLGA in this respect, since for the same number of explored solutions the LT of the BGA is much lower. Also, the slope of the curve representing the evolution of the RBLGA is higher than that of the BGA, which means that the LT increases faster for the RBLGA. However, even if the LT of the BGA is very advantageous, its accuracy remains considerably lower (see Table 1).

The first question is: for approximately the same LT, which GA performs better (i.e., reaches a better fitness)?

From Table 4, for the same LT (≈0.20 min) the RBLGA reaches a fitness level of 93.38%, while the BGA reaches 86.49%; hence, concerning this aspect, the RBLGA outclasses the BGA.

The second question is: what LT does the BGA need to reach an accuracy level near the best ones proposed by the RBLGA?

To determine the LT for approximately the same level of fitness for both GAs, we pushed the evolution of the BGA up to 10,000 generations. Table 5 shows the average of the results obtained for the three surfaces. The BGA reached an accuracy level of 93.41% after 9.62 min (and 10,000 generations), while the RBLGA proposed approximately the same value in around 0.2 min (see Table 4). From these different results, we can conclude that the RBLGA is more efficient relative to the LT.
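Both questions can be answered mechanically from the tabulated runs. The sketch below combines the learning times of Table 3, the fitness values of Table 1 and the 10,000-generation BGA run of Table 5; the helper names are illustrative. Note that it uses the Table 3 learning times, which differ by 0.01 min from those listed in Table 4.

```python
# (generations, learning time in minutes, fitness %) for each algorithm.
BGA_RUNS = [
    (10, 0.02, 68.15), (50, 0.03, 77.73), (100, 0.05, 80.52),
    (500, 0.20, 86.49), (1000, 0.41, 89.99), (2500, 1.18, 91.65),
    (10000, 9.62, 93.41),  # extended run reported in Table 5
]
RBLGA_RUNS = [
    (10, 0.04, 79.93), (50, 0.21, 93.38), (100, 0.41, 95.08),
    (500, 2.18, 95.72), (1000, 4.84, 95.78), (2500, 16.63, 95.94),
]


def best_fitness_within(runs, time_budget):
    """First question: best fitness among runs that fit in the time budget."""
    fits = [fit for _, lt, fit in runs if lt <= time_budget]
    return max(fits) if fits else None


def time_to_reach(runs, target_fitness):
    """Second question: shortest learning time of a run reaching the target."""
    times = [lt for _, lt, fit in runs if fit >= target_fitness]
    return min(times) if times else None


same_lt = (best_fitness_within(BGA_RUNS, 0.21),
           best_fitness_within(RBLGA_RUNS, 0.21))   # (86.49, 93.38)
same_fitness = (time_to_reach(BGA_RUNS, 93.0),
                time_to_reach(RBLGA_RUNS, 93.0))    # (9.62, 0.21)
```

On these data the BGA needs roughly 46 times longer than the RBLGA to reach a fitness of about 93%, which is the basis for the efficiency conclusion above.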

5.1.4. Influence of the population size

Table 3. Learning time: RBLGA versus BGA

| Population size | Generation number | LT BGA (min) | LT RBLGA (min) |
|---|---|---|---|
| 100 | 10 | 0.02 | 0.04 |
| 100 | 50 | 0.03 | 0.21 |
| 100 | 100 | 0.05 | 0.41 |
| 100 | 500 | 0.20 | 2.18 |
| 100 | 1000 | 0.41 | 4.84 |
| 100 | 2500 | 1.18 | 16.63 |

Table 4. Fitness BGA versus RBLGA: same LT

| GA | Population size | Generation number | LT (min) | Fitness |
|---|---|---|---|---|
| BGA | 100 | 500 | 0.21 | 86.49% |
| RBLGA | 100 | 50 | 0.20 | 93.38% |

Table 5. LT and fitness of the BGA after 10,000 generations

| GA | Population size | Generation number | LT (min) | Fitness |
|---|---|---|---|---|
| BGA | 100 | 10,000 | 9.62 | 93.41% |

Fig. 16. Learning time: RBLGA versus BGA.

In this section we explore the influence of the population size on the outcome of the learning process. For the sake of comparison, the number of generations is fixed at 100 for both algorithms. Table 6 highlights the influence of the population size.

Table 6. Influence of the population size

| GA | Generation number | Population size | Fitness (%) | Size of the rule base | Learning time (min) |
|---|---|---|---|---|---|
| BGA | 100 | 25 | 76.89 | 12.33 | 0.02 |
| BGA | 100 | 50 | 80.27 | 12.00 | 0.03 |
| BGA | 100 | 100 | 80.52 | 12.33 | 0.05 |
| BGA | 100 | 200 | 80.60 | 13.67 | 0.09 |
| RBLGA | 100 | 25 | 93.12 | 11.00 | 0.09 |
| RBLGA | 100 | 50 | 93.92 | 20.33 | 0.21 |
| RBLGA | 100 | 100 | 95.08 | 19.00 | 0.41 |
| RBLGA | 100 | 200 | 96.42 | 23.33 | 0.89 |

Increasing the population size improves the performance of both algorithms when it comes to accuracy. However, since accuracy and the rule base size are linked, the size of the rule base increases for the RBLGA, while it remains quite stable for the BGA. The reason behind the stability of the BGA is its special reproduction mechanisms for the fuzzy rules (see Section 5.1.2). The LT increases along with the population size, which is predictable, since the number of evaluated solutions increases. For both the BGA and the RBLGA, the optimal population size seems to be around 100 individuals, since when increasing from 100 to 200 individuals the accuracy improves by approximately 1.00% for the RBLGA and less than 0.20% for the BGA, while the LT doubles.
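The trade-off behind the choice of 100 individuals can be made explicit by computing, for each doubling of the population, the fitness gain and the growth factor of the learning time. The sketch below does this for the values of Table 6; the names are illustrative.

```python
# (population size, fitness %, rule base size, learning time in minutes), from Table 6.
TABLE_6 = {
    "BGA": [(25, 76.89, 12.33, 0.02), (50, 80.27, 12.00, 0.03),
            (100, 80.52, 12.33, 0.05), (200, 80.60, 13.67, 0.09)],
    "RBLGA": [(25, 93.12, 11.00, 0.09), (50, 93.92, 20.33, 0.21),
              (100, 95.08, 19.00, 0.41), (200, 96.42, 23.33, 0.89)],
}


def doubling_effects(rows):
    """For each consecutive pair of rows, return
    (smaller population, larger population, fitness gain in points, LT growth factor)."""
    effects = []
    for (p0, f0, _, t0), (p1, f1, _, t1) in zip(rows, rows[1:]):
        effects.append((p0, p1, round(f1 - f0, 2), round(t1 / t0, 2)))
    return effects


bga_last = doubling_effects(TABLE_6["BGA"])[-1]      # (100, 200, 0.08, 1.8)
rblga_last = doubling_effects(TABLE_6["RBLGA"])[-1]  # (100, 200, 1.34, 2.17)
```

Going from 100 to 200 individuals roughly doubles the learning time for both algorithms while adding about one point of fitness for the RBLGA and less than a tenth of a point for the BGA.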

5.1.5. Comparison and discussion

The BGA and the RBLGA reacted differently throughout the different tests which were performed in this paper. Table 7 summarizes the comparison of both GAs; a check mark (✓) gives an edge to one over the other. The RBLGA completely outperformed the BGA regarding fitness level, since the RBLGA was able to produce satisfactory solutions (FKBs) from a restricted set of evaluated individuals, giving an edge to the RBLGA over the BGA.

Table 7. Performances: RBLGA versus BGA

Fitness Simplification Learning time
BGA ✓
RBLGA ✓ ✓ ✓

For the simplicity level of the FKBs, the BGA gave a lower number of fuzzy rules, but the difference from what the RBLGA proposed was not sharp; the GAs are almost identical on this point, giving a slight edge to the BGA.

Considering the LT, for the same number of evaluated solutions, the BGA is faster. However, as seen in Section 5.1.3, in order to obtain a comparable accuracy level the BGA has to run longer, and even then the BGA did not achieve the results obtained by the RBLGA. Both GAs are efficient, but an edge is given to the RBLGA since we consider the accuracy of the FKBs as the most important criterion to achieve (as can be seen from the weight put on the accuracy in the evolution parameters). From these remarks we can conclude that even if the BGA remains quite efficient, the RBLGA was more convincing in the search for near-optimal solutions (FKBs).

6. Conclusion

From the different tests, we can attest that generally both the binary coded genetic algorithm (BGA) and the real/binary-like coded genetic algorithm (RBLGA) are efficient when dealing with the artificial data. However, in most cases the RBLGA was more satisfactory.

Some principal conclusions can be stated:

* when dealing with a complex process to map into a fuzzy knowledge base, the RBLGA will be more effective than the BGA;

* for a simple process to model, the BGA can be a more appropriate choice since it runs faster while providing satisfactory results;

* a population size of 100 is a good compromise between the accuracy level and learning time, for both GAs;

* if fast accurate responses are needed, the use of the RBLGA is more appropriate, since it reaches high accuracy levels faster than the BGA;

* if a simple fuzzy knowledge base is needed, the BGA should be used, due to the efficiency of its rule reduction mechanisms;

* the evolution parameters can be changed for both GAs if a new optimization paradigm has to be set.

Better accuracy can be achieved by increasing the initiating probability that governs the crossover mechanisms.

Acknowledgements

Financial support from the Natural Sciences and Engineering Research Council of Canada under grants RGPIN-203618, RGPIN-105518 and STPGP-269579 is gratefully acknowledged.

References

Achiche, S., Balazinski, M., Baron, L., Jemielniak, K., 2002. Tool wear monitoring using genetically-generated fuzzy knowledge bases. Engineering Applications of Artificial Intelligence 15, 303–314.

Balazinski, M., Bellerose, M., Czogala, E., 1993. Application of fuzzy logic techniques to the selection of cutting parameters in machining processes. International Journal for Fuzzy Sets and Systems 61, 307–317.

Balazinski, M., Achiche, S., Baron, L., 2000. Influence of optimization and selection criteria on genetically-generated fuzzy knowledge bases. International Conference on Advanced Manufacturing Technology (ICAMT2000), Johor Bahru, Malaysia, pp. 159–164.

Baron, L., 1998. Genetic algorithm for line extraction. Technical Report EPM/RT-98/06, École Polytechnique de Montréal, 19pp.

Baron, L., Achiche, S., Balazinski, M., 2002. Fuzzy decisions system knowledge base generation using a genetic algorithm. International Journal of Approximate Reasoning 28, 125–148.

Cordón, O., Herrera, F., Villar, P., 2000. Analysis and guidelines to obtain a good uniform fuzzy partition granularity for fuzzy-rule based systems using simulated annealing. International Journal of Approximate Reasoning 187–216.

Deb, K., Agrawal, S., 1998. Understanding interactions among genetic algorithm parameters. In: Banzhaf, W., Reeves, C. (Eds.), Foundations of Genetic Algorithms, vol. 5. Morgan Kaufmann, San Mateo, CA, pp. 265–286.

Deb, K., Pratap, A., Agarwal, S., Meyarivan, T., 2000. A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6, 182–200.

Diederich, J., Renaud, F., 1999. A Fuzzy classifier using genetic algorithms for biological data. Proceedings of the Eighth International Conference of the North American Fuzzy Information Processing Society, NAFIPS, New York, BEE, pp. 680–684.

Eshelman, L.J., Schaffer, J.D., 1993. Real-Coded Genetic Algorithms and Interval-Schemata. Foundations of Genetic Algorithms 2. Morgan Kaufmann Publishers, San Mateo, pp. 187–202.

Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, MA, 412pp.

Herrera, F., Lozano, M., 2000. Gradual distributed real-coded genetic algorithms. IEEE Transactions on Evolutionary Computation 4, 43–63.

Herrera, F., Lozano, M., Verdegay, J.L., 1995. Tuning fuzzy logic controllers by genetic algorithms. International Journal of Approximate Reasoning 12, 299–315.

Lobo, F.G., Goldberg, D.E., Pelikan, M., 2000. Time complexity of genetic algorithms on exponentially scaled problems. GECCO-2000: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 151–158.


Michalewicz, Z., 1992. Genetic Algorithms + Data Structures = Evolution Programs. Springer, New York.

Ono, I., Kita, H., Kobayashi, S., 2003. A real-coded genetic algorithm using the unimodal normal distribution crossover. Advances in Evolutionary Computing: Theory and Applications, Natural Computing Series. Springer-Verlag, New York, pp. 213–237.

Surmann, H., Selenschtschikow, A., 2002. Automatic generation of fuzzy logic rule bases: Examples I. Proceeding of the First International ICSC Conference on Neuro-Fuzzy Technologies, pp. 75–81.

Syswerda, G., 1989. Uniform crossover in genetic algorithms. Proceedings of the International Conference on Genetic Algorithms, pp. 2–9.

Valenzuela-Rendon, M., 1991. The fuzzy classifier system: a classifier system for continuously varying variables. Proceedings of the Fourth International Conference on Genetic Algorithms, pp. 346–353.

Zadeh, L.A., 1973. Outline of new approach to the analysis of complex systems and decisions processes. IEEE Transactions of Systems, Man and Cybernetics 3, 28–44.