
IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 14, NO. 1, FEBRUARY 2006

Genetically Optimized Fuzzy Polynomial Neural Networks

Sung-Kwun Oh, Member, IEEE, Witold Pedrycz, Fellow, IEEE, and Ho-Sung Park

Abstract—In this paper, we introduce a new topology of fuzzy polynomial neural networks (FPNNs) that is based on a genetically optimized multilayer perceptron with fuzzy polynomial neurons (FPNs). The study offers a comprehensive design methodology involving mechanisms of genetic optimization, especially those exploiting genetic algorithms (GAs). Let us recall that the design of the "conventional" FPNNs uses an extended group method of data handling (GMDH) and a fixed scheme of fuzzy inference (such as simplified, linear, and regression polynomial fuzzy inference) in each FPN of the network. It also considers a fixed number of input nodes (selected in advance by a network designer) at the FPNs (or nodes) located in each layer. However, such a design process does not guarantee that the resulting FPNs will always yield an optimal network architecture. Here, the development of the FPNN gives rise to a structurally optimized topology and comes with a substantial level of flexibility which becomes apparent when contrasted with the one we encounter in the conventional FPNNs. The design of each layer of the FPNN deals with its structural optimization, involving a selection of preferred nodes (or FPNs) with specific local characteristics (such as the number of input variables, the order of the polynomial forming the consequent part of fuzzy rules, and a collection of the specific subset of input variables), and addresses detailed aspects of parametric optimization. Along this line, two general optimization mechanisms are explored. The structural optimization is realized via GAs. In the case of the parametric optimization, we proceed with standard least-squares-method-based learning. Through the consecutive process of such structural and parametric optimization, an optimized and flexible fuzzy neural network is generated in a dynamic fashion. To evaluate the performance of the genetically optimized FPNN (gFPNN), we experimented with two time series data (gas furnace and chaotic time series) as well as some synthetic data. A comparative analysis reveals that the proposed FPNN exhibits higher accuracy and superb predictive capability in comparison to some previous models available in the literature.

Index Terms—Design procedure, fuzzy polynomial neuron (FPN), genetic algorithms (GAs), genetically optimized fuzzy polynomial neural networks (gFPNN), group method of data handling (GMDH), multilayer perceptron (MLP).

Manuscript received November 20, 2003; revised December 2, 2004 and May 3, 2005. This work was supported by Korea Research Foundation Grants KRF-2003-002-D00297 and KRF-2004-002-D00257.

S.-K. Oh is with the Department of Electrical Engineering, The University of Suwon, Gyeonggi-do 445-743, South Korea.

W. Pedrycz is with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2G6, Canada, and also with the Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland (e-mail: [email protected]).

H.-S. Park is with the Department of Electrical, Electronic, and Information Engineering, Wonkwang University, 570-749, South Korea.

Digital Object Identifier 10.1109/TFUZZ.2005.861620

I. INTRODUCTION

RECENTLY, a lot of attention has been directed toward advanced techniques of modeling complex systems. The challenging quest of how to construct models that come with significant approximation and generalization abilities, and are also easy to comprehend (interpret), has been with the community for decades. While neural networks, fuzzy sets, and evolutionary computing, as the fundamental technologies of computational intelligence (CI), have expanded and enriched the field of modeling quite immensely, they have also given rise to a number of new methodological issues and increased our awareness about the tradeoffs one has to make in system modeling [1]–[4]. The most promising and successful approaches to endowing fuzzy systems with learning and adaptation mechanisms have been realized in the realm of CI. In particular, neural fuzzy systems and genetic fuzzy systems hybridize the approximate inference method of fuzzy systems by augmenting them with the learning capabilities of neural networks and evolutionary algorithms [5]. When the dimensionality of the model goes up (say, the number of variables increases), so do the difficulties. Fuzzy sets emphasize the aspect of transparency of the models and the role of a model designer whose prior knowledge about the system may be very helpful in facilitating all identification pursuits. On the other hand, to build models of substantial approximation capabilities, there is a need for advanced tools. The art of modeling is to reconcile these two tendencies and find a workable and efficient synergistic environment. Moreover, it is also worth stressing that in many cases the nonlinear form of the model acts as a double-edged sword: while we gain flexibility to cope with experimental data, we are provided with an abundance of nonlinear dependencies that need to be exploited in a systematic manner. In particular, when dealing with high-order nonlinear and multivariable equations of the model, we require a vast amount of data to estimate all its parameters [1], [2].

To help alleviate the problems we have already alluded to, one of the first approaches along the line of a systematic design of nonlinear relationships between a system's inputs and outputs comes under the name of the group method of data handling (GMDH). GMDH was developed in the late 1960s by Ivakhnenko [6]–[9] as a vehicle for identifying nonlinear relations between input and output variables. While it undoubtedly provides a systematic design procedure, GMDH comes with some drawbacks. First, it tends to generate quite complex polynomials even for relatively simple systems (experimental data). Second, owing to its limited generic structure (based on quadratic two-variable polynomials), GMDH also tends to produce an overly complex network (model) when it comes to highly nonlinear systems.


Fig. 1. General topology of the generic FPN module (F: fuzzy set-based processing part; P: the polynomial form of mapping).

TABLE I. FAMILY OF THE TOPOLOGIES OF THE CONVENTIONAL FPNN

Third, if there are fewer than three input variables, the GMDH algorithm does not generate a highly versatile structure.
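For illustration only (this sketch is ours, not part of the original GMDH formulation), the quadratic two-variable partial description underlying classical GMDH can be fitted by least squares as follows, assuming NumPy is available:

```python
import numpy as np

def fit_gmdh_pd(x1, x2, y):
    """Fit one classical GMDH partial description (PD):
    y_hat = c0 + c1*x1 + c2*x2 + c3*x1^2 + c4*x2^2 + c5*x1*x2,
    with coefficients estimated by least squares."""
    X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def eval_gmdh_pd(coeffs, x1, x2):
    """Evaluate the fitted PD on new data."""
    X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
    return X @ coeffs
```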

To alleviate the problems associated with the GMDH, a family of fuzzy polynomial neuron (FPN)-based SOPNNs (called "FPNNs," a new category of neuro-fuzzy networks) [10] has been introduced. In a nutshell, these networks come with a high level of flexibility associated with each node: a processing element forming a partial description (PD) (or FPN) can have a different number of input variables as well as exploit a different order of the polynomial (say, linear, quadratic, cubic, etc.). In comparison to well-known neural networks or neuro-fuzzy networks, whose topologies are commonly selected and kept unchanged throughout all detailed (parametric) learning, the FPNN architecture is not fixed in advance but becomes fully optimized (both structurally and parametrically). As a consequence, FPNNs exhibit a superb performance in comparison to the previously presented models. Although the FPNN has a flexible architecture whose potential can be fully utilized through a systematic design, it is difficult to obtain a structurally and parametrically optimized network because of the limited design of the FPNs residing within each layer of the FPNN. In other words, when we construct the FPNs of each layer in the conventional FPNN, the parameters such as the number of input variables (nodes), the order of the polynomial, and the input variables available within an FPN are fixed (selected) in advance by the designer. Accordingly, the FPNN algorithm exhibits some tendency to produce overly complex networks. The development also comes with some unnecessary computing overhead resulting from the use of the trial-and-error method and/or the repetitive adjustment of numeric values of the parameters (as encountered in the case of the original GMDH algorithm). Notably, the conventional FPNN-related papers studied previously by Oh and Pedrycz [10], [33], [43] distinguish between two main categories of FPNN

TABLE II. POLYNOMIAL TYPE ACCORDING TO THE NUMBER OF INPUT VARIABLES OCCURRING IN THE CONCLUSION PART OF FUZZY RULES

TABLE III. DIFFERENT FORMS OF THE REGRESSION POLYNOMIALS OCCURRING IN THE CONSEQUENCE PART OF THE FUZZY RULES

architectures, which differ from each other with respect to their topologies and learning methods. The one proposed in [43] is based upon a combination of FNNs and PNNs, regarded as a certain hybrid architecture. The other one [10], [33] concerns an FPN-based SOPNN (called "FPN-based SONN" or "FPNN") whose topology consists of neurons with a fuzzy inference system in each layer of the network. In both cases, in order to generate a structurally and parametrically optimized network, such parameters need to be optimized.

In this study, in addressing the above problems existing in the conventional SOPNNs (especially the FPN-based SOPNNs referred to as "FPNNs" [10], [33]) as well as in the GMDH algorithm, we introduce a new genetic design. As a consequence, we will be referring to these networks as GA-based FPNNs ("gFPNN" for short). The determination of the optimal values of the parameters available within an individual FPN (viz., the number of input variables, the order of the polynomial, and the input variables) leads to a structurally and parametrically optimized network. As a result, this network becomes more flexible and exhibits a simpler topology in comparison to the conventional FPNN discussed in the previous research. In the development of the network, we introduce an aggregate objective function (performance index) that deals with training and testing data, and elaborate on its optimization in order to strike a meaningful balance between the approximation and generalization abilities of the network. Let us reiterate that the objective of this study is to develop a general design methodology of gFPNN modeling, come up with a logic-based structure of such a model, and propose a comprehensive evolutionary development environment in which the optimization of the models can be efficiently carried out both at the structural as well as the parametric level [11].

This paper is organized in the following manner. First, Section II introduces the architecture and development of the FPNNs. The genetic design of the FPNN comes with an overall description of a detailed design methodology of the FPNN based on the genetically optimized multilayer perceptron architecture in Section III. In Section IV, we report on a comprehensive set of experiments. Finally, concluding remarks are offered in Section V. To evaluate the performance of the proposed model, we exploit two well-known time series data [12], [13], [17]–[41] as well as a nonlinear function [20], [23]–[25], [42], [43].


Fig. 2. Taxonomy of the conventional FPNN architecture.

Fig. 3. Overall genetically driven structural optimization of FPNN.

Furthermore, the network is directly contrasted with several existing neurofuzzy models reported in the literature.

II. ARCHITECTURE AND DEVELOPMENT OF FPNN

In this section, we elaborate on the architecture and the development process of the FPNN. This network emerges from the genetically optimized multilayer perceptron architecture based on GAs and the extended GMDH method. First, we briefly discuss fuzzy polynomial neurons, which form the basic computing node of the network. Next, the structural and parametric optimization of FPNNs is described, with a special emphasis placed on the growth process of the network.

A. FPNN Based on FPNs

In what follows, we start with a basic topology of the FPN. This neuron, regarded as a generic type of processing unit, dwells on the concepts of fuzzy sets and neural networks. Fuzzy sets realize a linguistic interface by linking the external world (numeric data) with the processing unit. Neurocomputing manifests itself in the form of a local polynomial unit realizing nonlinear processing. The FPN encapsulates a family of nonlinear "IF–THEN" rules. When organized together, FPNs give rise to a self-organizing FPNN. As visualized in Fig. 1, the FPN consists of two basic functional modules. The first one, labeled here by F, is a collection of fuzzy sets (here $\{A_l\}$ and $\{B_k\}$) that form an interface between the input numeric variables and the processing part realized by the neuron. Here, $x_p$ and $x_q$ denote input variables. The second module (denoted here by P) is focused on the function-based nonlinear (polynomial) processing. This nonlinear processing involves some input variables ($x_p$ and $x_q$). Quite commonly, we will be using a polynomial form of the nonlinearity, hence the name of the fuzzy polynomial processing unit. The use of polynomials is motivated by their generality and flexibility. In particular, they include constant and linear mappings as their special cases (which are used quite frequently in rule-based systems).

In other words, the FPN realizes a family of multiple-input–single-output rules. Each rule, refer again to Fig. 1, reads in the form

$$\text{if } x_p \text{ is } A_l \text{ and } x_q \text{ is } B_k \text{ then } z \text{ is } P_{lk}(x_p, x_q; \mathbf{a}_{lk}) \quad (1)$$

where $\mathbf{a}_{lk}$ is a vector of the parameters of the conclusion part of the rule while $P_{lk}(x_p, x_q; \mathbf{a}_{lk})$ denotes the regression polynomial forming the consequence part of the fuzzy rule, which uses several types of high-order polynomials (linear, quadratic, and modified quadratic) besides the constant function forming the simplest version of the consequence; refer to Table III. Alluding to the input variables of the FPN, especially the way in which they interact with the two functional blocks shown there, we use the notation FPN($x_p$, $x_q$) to explicitly point at the corresponding variables.


Fig. 4. General flowchart of the genetic optimization of the FPNN architecture.

The processing of the FPN is in line with the rule-based computing existing in the literature [12], [13].

The activation of the rule "$K$" is computed as an and-combination of the activations of the fuzzy sets occurring in the rule. This combination of the subconditions is realized through any t-norm. In particular, we consider the minimum and product operations as the two widely used models of the logic connectives. Subsequently, denote the resulting activation level of the rule by $\mu_K$. The activation levels of the rules contribute to the output of the FPN, which is computed as a weighted average of the individual conclusion parts (functional transformations).


Fig. 5. FPN design used in the FPNN architecture—structural considerations and mapping the structure on a chromosome.

[Note that the index of the rule, namely "$K$," is a shorthand notation for the two indexes of fuzzy sets used in rule (1), that is, $K = (l, k)$.]

$$z = \frac{\sum_{K=1}^{\text{all rules}} \mu_K P_K(x_p, x_q; \mathbf{a}_K)}{\sum_{K=1}^{\text{all rules}} \mu_K} = \sum_{K=1}^{\text{all rules}} \tilde{\mu}_K P_K(x_p, x_q; \mathbf{a}_K) \quad (2)$$

In the previous expression, we use an abbreviated notation to describe the activation level of the "$K$"th rule in the form

$$\tilde{\mu}_K = \frac{\mu_K}{\sum_{L=1}^{\text{all rules}} \mu_L} \quad (3)$$
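As an illustration of (1)–(3), the following minimal sketch (ours, with hypothetical helper names) computes the output of a single FPN using Gaussian-like membership functions, the product t-norm, and linear (Type 2) consequent polynomials:

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Gaussian-like membership function."""
    return np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def fpn_output(xp, xq, rules):
    """Compute the FPN output as the weighted average (2)-(3).
    `rules` is a list of dicts, each holding the (center, sigma)
    parameters of the fuzzy sets A and B and a linear consequent."""
    num, den = 0.0, 0.0
    for r in rules:
        # rule activation: and-combination via the product t-norm
        mu = gaussian_mf(xp, *r["A"]) * gaussian_mf(xq, *r["B"])
        # consequent polynomial P_K(xp, xq; a) = a0 + a1*xp + a2*xq
        a0, a1, a2 = r["a"]
        num += mu * (a0 + a1 * xp + a2 * xq)
        den += mu
    return num / den  # normalization by the sum of activations, eq. (3)
```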

B. Review of the Conventional FPNN Architecture

Proceeding with the conventional FPNN architecture as presented in [10], [33], essential design decisions have to be made with regard to the number of input variables and the order of the polynomial occurring in the conclusion part of the rule.

Fig. 6. Formation of each FPN in FPNN architecture.

Following these criteria, we distinguish between two fundamental types of the FPNN architectures. Moreover, for each type of topology, we identify two cases.

a) Basic FPNN architecture: The number of the input variables of the fuzzy rules in the FPN node is kept the same in each layer of the network.

Case 1) The order of the polynomial in the conclusion part of the fuzzy rules is the same in all the nodes of each layer.

Case 2) The order of such polynomial in the nodes of the 2nd layer or higher is different from the one occurring in the rules located in the 1st layer.


Fig. 7. Genetic optimization used in the generation of the optimal nodes in the given layer of the network.

TABLE IV. SYSTEM'S INPUT VECTOR FORMATS FOR THE DESIGN OF THE FPNN

TABLE V. COMPUTATIONAL ASPECTS OF THE GENETIC OPTIMIZATION OF FPNN

b) Modified FPNN architecture: The number of the input variables of the fuzzy rules in the FPN differs across the layers of the network.

Case 1) The order of the polynomial in the conclusion part of the fuzzy rules is the same in all the nodes of each layer.

Case 2) The order of such polynomial in the nodes of the 2nd layer or higher is different from the one occurring in the rules located in the 1st layer.

For each case (Case 1 or 2), we consider two kinds of input vector formats in the conclusion part of the fuzzy rules of the first layer, namely Scheme 1 (selected inputs) and Scheme 2 (the entire collection of inputs).

i) Scheme 1: The input variables of the consequence part of the fuzzy rules are the same as the input variables of the premise part.

ii) Scheme 2: The input variables of the consequence part of the fuzzy rules in a node of the 1st layer are the same as the entire set of system input variables, and the input variables of the consequence part of the fuzzy rules in a node of the 2nd layer or higher are the same as the input variables of the premise part.

The previous taxonomy is also summarized in Tables I and II. In Table I, small letters (p, q) denote the number of selected input variables coming to a node in each layer of the network. Capital letters (P, Q) stand for the type of fuzzy inference as related to the order of the polynomial standing in the consequence part of fuzzy rules; refer to Table III. In Table II, for the two types of input variables being used in the consequence part of fuzzy rules, we identify two cases (denoted here as Type P and Type P*); we use an asterisk to indicate that the inputs coming to each node of the corresponding layer use the entire set of system input variables as the input variables of the consequent part of fuzzy rules; refer to the consequence input type of Table V.


Fig. 8. Example of the FPN design guided by some chromosome.


In the sequel, we use the following notation.

• A: vector of the selected input variables ($x_p, x_q, \ldots$).

• B: vector of the entire collection of system input variables ($x_1, x_2, \ldots, x_m$).

• Type P: $f(A)$, the type of polynomial function standing in the consequence part of the fuzzy rules.

• Type P*: $f(B)$, the type of polynomial function occurring in the consequence part of the fuzzy rules.

As illustrated in Table II, A and B describe the vector of the selected input variables and the entire set of input variables of the system, respectively. Proceeding with each layer of the FPNN, the design alternatives available within a single FPN can be made with regard to all input variables or a subset of selected variables to be included in the consequence part of fuzzy rules. Following these criteria, we distinguish between the two fundamental types of the rules (denoted here by Type P and Type P*, respectively).

• Type P: The input variables in the conclusion part of fuzzy rules are kept as those in the conditional part of the fuzzy rules.

• Type P*: All system input variables are kept as input variables in the conclusion part of the fuzzy rules.

If there are fewer than three input variables, the generic FPNN algorithm does not generate a highly versatile structure. To alleviate this shortcoming, an advanced type of the architecture is taken into consideration, which can be treated as the modified version of the generic topology of the network. Accordingly, we identify two types.

i) Advanced type: In case the number of system input variables is less than three, the advanced type is used.

ii) Generic type: In case the number of system input variables is four or higher, the generic type is used.

The overall selection process of the conventional FPNN architecture is visualized in Fig. 2.

The FPNN comes with a highly versatile architecture articulated both in terms of the flexibility of the individual nodes (which are essentially banks of nonlinear "IF–THEN" rules) as well as the interconnectivity between the nodes and the overall organization of the layers. The learning of FPNNs dwells on the extended GMDH algorithm. When developing the FPN, we choose the input variables of a node from the $N$ input variables $x_1, x_2, \ldots, x_N$. The total number of the nodes in the current layer depends upon the number of the selected input variables that come from the nodes situated in the preceding layer. This results in $\binom{N}{r} = N!/((N-r)!\,r!)$ nodes, where $r$ is the number of the selected nodes (input variables). Based on the number of the selected nodes (input variables) and the order of the polynomial (refer to Table III), we form various topologies of the FPNNs.


TABLE VI. PERFORMANCE INDEX OF THE NETWORK OF EACH LAYER VERSUS THE INCREASE OF THE MAXIMAL NUMBER OF INPUTS TO BE SELECTED FOR TYPE II

The commonly encountered types of the polynomial are:

• Bilinear: $c_0 + c_1 x_1 + c_2 x_2$;
• Biquadratic-1: Bilinear $+\ c_3 x_1^2 + c_4 x_2^2 + c_5 x_1 x_2$;
• Biquadratic-2: Bilinear $+\ c_3 x_1 x_2$;
• Trilinear: $c_0 + c_1 x_1 + c_2 x_2 + c_3 x_3$;
• Triquadratic-1: Trilinear $+\ c_4 x_1^2 + c_5 x_2^2 + c_6 x_3^2 + c_7 x_1 x_2 + c_8 x_1 x_3 + c_9 x_2 x_3$;
• Triquadratic-2: Trilinear $+\ c_4 x_1 x_2 + c_5 x_1 x_3 + c_6 x_2 x_3$.
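A minimal sketch (ours; the function name and the "quadratic"/"modified" labels are our shorthand for the -1/-2 variants above) of how the regressor vector of each consequent polynomial could be assembled:

```python
import numpy as np

def regressors(x, poly_type):
    """Build the regressor vector of the consequent polynomial.
    x: 1-D array of the node's inputs (two or three variables)."""
    terms = [1.0, *x]                          # bilinear / trilinear part
    if poly_type == "quadratic":               # Biquadratic-1 / Triquadratic-1
        terms += [xi * xi for xi in x]
        terms += [x[i] * x[j]
                  for i in range(len(x)) for j in range(i + 1, len(x))]
    elif poly_type == "modified":              # Biquadratic-2 / Triquadratic-2
        terms += [x[i] * x[j]
                  for i in range(len(x)) for j in range(i + 1, len(x))]
    return np.array(terms)
```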

III. ALGORITHMS AND DESIGN PROCEDURE OF GENETICALLY OPTIMIZED FPNN (gFPNN)

As mentioned in the previous section, even though the conventional FPNN exhibits a flexible architecture whose potential can be fully utilized through a systematic design, it is difficult to obtain a structurally and parametrically optimized network because of the limited design of the FPNs residing within each layer of the FPNN. In other words, when we construct the FPNs of each layer in the conventional FPNN, such parameters as the number of input variables (nodes), the order of the polynomial, and the input variables available within an FPN are fixed (selected) in advance by the designer. Accordingly, the determination of the optimal values of the parameters available within an individual FPN (viz., the number of input variables, the order of the polynomial, and the input variables) leads to a structurally and parametrically optimized network.

A. Genetic Optimization of FPNN

The task of optimizing any complex model involves two main phases. First, a class of optimization algorithms has to be chosen so that it becomes fully pertinent to the requirements implied by the problem at hand. Second, various parameters of the optimization algorithm need to be tuned in order to achieve its best performance.

GAs are optimization techniques based on the principles of natural evolution. In essence, they are search algorithms that use operations found in natural genetics to guide a comprehensive search over the parameter space. GAs have been theoretically and empirically demonstrated to provide robust search capabilities in complex spaces, thus offering a valid solution strategy for problems requiring efficient search over quite complex search spaces. In contrast to gradient-based optimization and direct search, which are often based on techniques of mathematical programming [14], genetic algorithms are aimed at stochastic global search and involve a structured information exchange [15]. It is instructive to highlight the main features that tell GAs apart from some other optimization methods: 1) a GA operates on codes of the variables, not the variables themselves; 2) a GA searches for optimal points starting from a group (population) of points in the search space (potential solutions), rather than from a single point; and 3) the GA's search is directed only by some fitness function whose form could be quite complex; we do not require it to be differentiable.

In this study, for the optimization of the FPNN model, the GA uses binary-coded chromosomes (the serial method of binary type), roulette-wheel selection, one-point crossover, and a binary inversion (complementation) operation as the mutation operator. To retain the best individual and carry it over to the next generation, we use a standard elitist strategy [16]. The overall genetically driven structural optimization of the FPNN is shown in Fig. 3.
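The following sketch (ours, not the authors' implementation) illustrates such a binary-coded GA with roulette-wheel selection, one-point crossover, bit-flip (inversion) mutation, and an elitist strategy:

```python
import random

def run_ga(fitness, n_bits, pop_size=60, generations=100,
           p_crossover=0.65, p_mutation=0.1):
    """Binary-coded GA: roulette-wheel selection, one-point crossover,
    bit-flip mutation, elitism. `fitness` must return positive values."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scores = [fitness(c) for c in pop]
        total = sum(scores)

        def roulette():
            pick, acc = random.uniform(0, total), 0.0
            for c, s in zip(pop, scores):
                acc += s
                if acc >= pick:
                    return c
            return pop[-1]

        children = []
        while len(children) < pop_size:
            p1, p2 = roulette(), roulette()
            if random.random() < p_crossover:      # one-point crossover
                cut = random.randrange(1, n_bits)
                p1 = p1[:cut] + p2[cut:]
            # bit-flip (binary inversion) mutation
            child = [b ^ 1 if random.random() < p_mutation else b for b in p1]
            children.append(child)
        pop = children
        pop[0] = best[:]                           # elitism: keep the best
        best = max(pop + [best], key=fitness)
    return best
```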

As already mentioned, when we construct the FPNs at each layer in the conventional FPNN, such parameters as the number of input variables (nodes), the order of the polynomial, and the input variables available within an FPN are fixed (selected) in advance by the designer. This has frequently contributed to the difficulties in the design of the optimal network and reduced the quality of the overall network. To overcome this apparent drawback, we resort to genetic optimization. A general flowchart of the genetic optimization of the FPNN architecture is shown in Fig. 4 and displays a more detailed flow of the overall development activities.

B. Design Procedure of the Genetically Optimized FPNN

The gFPNN comes with a highly versatile architecture, both in the flexibility of the individual nodes as well as in the interconnectivity between the nodes and the organization of the layers. Evidently, these features contribute to the significant flexibility of the networks, yet they require a prudent design methodology and well-thought-out learning mechanisms. Let us stress that there are several important differences that make this architecture distinct from the panoply of well-known neurofuzzy architectures existing in the literature. The most important is that the learning


Fig. 9. Performance index of gFPNN for Type II according to the increase of number of layers.

of the gFPNN dwells on the extended GMDH algorithm, which is crucial to the structural development of the network. As a consequence of the use of GMDH, the architecture of the network becomes dynamic, as we do not confine ourselves to a certain depth (number of layers) of the network; this depth may change over the training of the network.

The framework of the design procedure of the FPNN based on the genetically optimized multilayer perceptron architecture comprises the following steps.

[Step 1] Determine the system's input variables

Define the system input variables $x_i$ ($i = 1, 2, \ldots, n$) related to the output variable $y$. If required, normalization of the input data is carried out as well.

[Step 2] Form training and testing data

The input–output data set $(\mathbf{x}_i, y_i) = (x_{1i}, x_{2i}, \ldots, x_{ni}, y_i)$, $i = 1, 2, \ldots, N$ (with $N$ being the total number of data points), is divided into two parts, that is, a training and a testing dataset. Denote their sizes by $N_{tr}$ and $N_{te}$, respectively; obviously, we have $N = N_{tr} + N_{te}$. The training data set is used to construct the FPNN. Next, the testing data set is used to evaluate the quality of the network.

[Step 3] Decide the initial information for constructing the FPNN structure

We decide upon the design parameters of the FPNN structure; those include the following.

a) Specification of the rules:

— Membership function (MF) type: the two types of triangular and Gaussian-like MFs are the typical choices.

— Number of MFs per each input of a node (or FPN): 2.

— Two types of structure of the consequence part of fuzzy rules: selected inputs or the entire set of system input variables at each node of the first layer of the network; refer to Table II.

b) Considering the stopping criterion, two termination methods are of interest:

— comparison of the minimal identification error of the current layer with that occurring at the previous layer of the network;

— the maximal number of layers (predetermined by the designer), with an intent to achieve a sound balance between the model's accuracy and its complexity; the depth of the network does not exceed five layers. We have found that this limit leads to a meaningful balance between the requirements of accuracy and complexity.


Fig. 10. Performance index of gFPNN with respect to the increase of number of system inputs.

c) The maximal number of input variables entering each node in the corresponding layer: $1 \le l \le \text{Max}$, where $l$ stands for the number of inputs to be selected and Max denotes the maximal number of inputs to be selected.

d) The total number of nodes to be retained (selected) at the next generation of the FPNN algorithm, $W$: the selected population size is equal to 30.

It is worth stressing that the decisions made with respect to c)–d) help us avoid building excessively large networks (which could be quite detrimental in terms of their predictive abilities).
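As a compact summary of these design decisions, one might collect them in a configuration object; the following sketch is ours, with illustrative default values taken from the text:

```python
from dataclasses import dataclass

@dataclass
class FPNNDesign:
    """Design parameters fixed before the genetic optimization (Step 3)."""
    mf_type: str = "triangular"   # or "gaussian"
    mfs_per_input: int = 2        # number of MFs per input of an FPN
    consequent_input: str = "A"   # "A": selected inputs, "B": all system inputs
    max_inputs: int = 4           # Max: confined to 2-5 inputs per node
    max_layers: int = 5           # depth limit of the network
    retained_nodes: int = 30      # W: nodes carried over to the next layer
```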

[Step 4] Decide FPN structure using genetic design

This concerns the selection of the number of input variables, the polynomial order, and the input variables to be assigned to each node of the corresponding layer. These important decisions are carried out through an extensive genetic optimization.

When it comes to the organization of the chromosome representing (mapping) the structure of the FPNN, we partition the chromosome into three sub-chromosomes (segments), as visualized in Fig. 5. The first sub-chromosome contains the number of input variables, the second sub-chromosome involves the order of the polynomial of the node, and the third sub-chromosome (the remaining bits) contains the input variables coming to the corresponding node (FPN). All these elements are optimized when running the GA.

In the nodes (FPNs) of each layer of the FPNN, we adhere to the notation shown in Fig. 6. "FPNn" denotes the $n$th FPN (node) of the corresponding layer, "N" denotes the number of nodes (inputs or FPNs) coming to the corresponding node, and "T" denotes the polynomial order of the conclusion part of fuzzy rules used in the corresponding node.

Each sub-step of the genetic design of the three types of parameters available within the FPN is structured as follows.

[Step 4.1] Selection of the number of input variables (first sub-chromosome)

[Step 4.1.1] The first three bits of the given chromosome are assigned to the binary bits used for the selection of the number of input variables. The size of this bit structure depends on the number of input variables. For example, in the case of three bits, the maximum number of input variables is limited to 7.

[Step 4.1.2] The three bits, randomly selected, are decoded into a decimal value $d$ according to

$$d = \text{bit}_1 \times 2^2 + \text{bit}_2 \times 2^1 + \text{bit}_3 \times 2^0 \quad (4)$$

Here, $\text{bit}_1$, $\text{bit}_2$, and $\text{bit}_3$ denote the locations of these three bits and take the values "0" or "1," respectively.

[Step 4.1.3] The above decimal value is normalized and rounded off:

$$n^* = \text{round}\left(\frac{d}{d_{\max}} \times (\text{Max} - 1) + 1\right) \quad (5)$$

where Max denotes the maximal number of input variables entering the corresponding node (FPN) while $d_{\max}$ is the decoded decimal value corresponding to the situation when all bits of the first sub-chromosome are set to 1s.

[Step 4.1.4] The normalized integer value $n^*$ is then treated as the number of input variables (or input nodes) entering the corresponding node.
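A minimal sketch (ours; the function name is hypothetical) of the decoding prescribed by (4) and (5):

```python
def decode_subchromosome(bits, max_value):
    """Decode a binary sub-chromosome into an integer in [1, max_value],
    following (4) and (5)."""
    decimal = int("".join(str(b) for b in bits), 2)        # eq. (4)
    all_ones = 2 ** len(bits) - 1                          # all bits set to 1
    return round(decimal / all_ones * (max_value - 1) + 1) # eq. (5)

# e.g., decode_subchromosome([1, 0, 1], max_value=4) -> 3 (polynomial Type 3)
```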

Evidently, the maximal number (Max) of input variables is equal to or less than the number $n$ of all system input variables coming to the first layer, that is, $\text{Max} \le n$.

[Step 4.2] Selection of the polynomial order of the consequent part of fuzzy rules (second sub-chromosome)

[Step 4.2.1] The three bits of the second sub-chromosome are assigned to the binary bits used for the selection of the order of the polynomial.

[Step 4.2.2] The three bits, randomly selected using (4), are decoded into a decimal format.

[Step 4.2.3] The decimal value obtained by means of (5) is normalized and rounded off. The value of Max is replaced with 4 (in essence, we end up with an integer value in the range between 1 and 4; refer also to Table III). This mapping is straightforward.

a) If the normalized integer value is 1, the polynomial order of the consequent part of fuzzy rules is 0 (constant: Type 1).

b) If the normalized integer value is 2, the polynomial order of the consequent part of fuzzy rules is 1 (linear function: Type 2).

c) If the normalized integer value is 3, the polynomial order of the consequent part of fuzzy rules is 2 (quadratic function: Type 3).

d) If the normalized integer value is 4, the polynomial order of the consequent part of fuzzy rules is 2 (modified quadratic function: Type 4).

[Step 4.2.4] The normalized integer value is taken as the selected polynomial order when constructing each node of the corresponding layer.

[Step 4.3] Selection of input variables (third sub-chromosome)

[Step 4.3.1] The remaining bits are assigned to the binary bits used for the selection of the input variables.

[Step 4.3.2] The remaining bits are divided by the value obtained in [Step 4.1]. If the remaining bits are not evenly divisible, we apply the following rule. For example, if 22 bits remain and the number of input variables obtained in [Step 4.1] has been set to 4, the first, second, and third bit structures (spaces) for the selection of input variables are assigned 6 bits each, while the last (fourth) bit structure used for the selection of the input variables is assigned 4 bits.

[Step 4.3.3] Each bit structure is decoded into a decimal value (through relationship (4)).

[Step 4.3.4] Each decimal value obtained in [Step 4.3.3] is then normalized following (5); moreover, we round off the values obtained from this expression. We replace Max with the total number of inputs (viz., input variables or input nodes), $n$ (or $W$), in the corresponding layer. Note that the total number of input variables denotes the number of the overall system inputs, $n$, in the first layer, and the number of the selected nodes, $W$, serving as the output nodes of the preceding layer, in the second layer or higher.

[Step 4.3.5] The normalized integer values are then taken as the selected input variables when constructing each node of the corresponding layer. Here, if the selected input variables are multiply duplicated, the duplicated input variables (viz., the same input numbers) are treated as a single input variable while the remaining ones are discarded.
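Continuing the sketch (ours, reusing decode_subchromosome from above), the third sub-chromosome can be split and decoded as follows; the bit-splitting rule follows the 22-bit example given in [Step 4.3.2]:

```python
import math

# decode_subchromosome as defined in the previous sketch

def select_input_variables(bits, n_inputs, total_inputs):
    """Split the remaining bits into n_inputs groups (Step 4.3.2) and
    decode each group into an input index (Steps 4.3.3-4.3.5)."""
    size = math.ceil(len(bits) / n_inputs)  # e.g. 22 bits, 4 inputs -> 6,6,6,4
    groups = [bits[i:i + size] for i in range(0, len(bits), size)]
    selected = [decode_subchromosome(g, total_inputs) for g in groups]
    # duplicated indices are treated as a single input variable (Step 4.3.5)
    return sorted(set(selected))
```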

[Step 5] Carry out fuzzy inference and coefficient parameter estimation for fuzzy identification in the selected node (FPN)

The fuzzy inference method (simplified inference or regression polynomial inference) is determined not in advance by the designer, as in the conventional FPNN, but through the selection of the polynomial order of the consequence part of fuzzy rules, as shown in [Step 4.2.3].

At this step, the regression polynomial inference is considered. This inference method deals with regression polynomial functions viewed as the consequents of the rules. Regression polynomials (a polynomial or, in the very specific case, a constant value) standing in the conclusion part of fuzzy rules are given as the different types Type 1, 2, 3, or 4; see Table III. In the fuzzy inference, we consider two types of membership functions, namely triangular and Gaussian-like membership functions. The regression fuzzy inference (reasoning scheme) is envisioned as follows.

The consequence part can be expressed by a linear, quadratic, or modified quadratic polynomial equation, as shown in Table III. The use of the regression polynomial inference method gives rise to the expression

$$R^j: \text{If } x_1 \text{ is } A_{j1} \text{ and } \cdots \text{ and } x_k \text{ is } A_{jk}, \text{ then } y_j = f_j(x_1, x_2, \ldots, x_k) \quad (6)$$

where $R^j$ is the $j$th fuzzy rule, $x_i$ is an input variable, $A_{ji}$ is a membership function of fuzzy sets, $k$ denotes the number of the input variables, and $f_j(\cdot)$ is a regression polynomial function of the input variables.

The computing of the numeric output of the model follows the well-known form

$$\hat{y} = \frac{\sum_{j=1}^{n} w_j y_j}{\sum_{j=1}^{n} w_j} = \sum_{j=1}^{n} \hat{w}_j y_j \quad (7)$$

where $n$ is the number of the fuzzy rules, $\hat{y}$ is the inferred value, $w_j$ is the premise fitness of rule $R^j$, and $\hat{w}_j$ is the normalized premise fitness of $w_j$.

Here, we consider a linear regression polynomial function of the input variables. The consequence parameters are produced by the standard least squares method, that is,

$$\mathbf{a} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y} \quad (8)$$

where $\mathbf{a}$ is the vector of the consequence parameters of all fuzzy rules, $\mathbf{Y}$ is the vector of the output data, and $\mathbf{X}$ is the matrix whose $i$th row collects, for each rule, the normalized premise fitness $\hat{w}_{ji}$ together with the input variables weighted by it, $\hat{w}_{ji} x_{1i}, \ldots, \hat{w}_{ji} x_{ki}$. Here, $m$ denotes the total number of data points, $k$ stands for the number of the input variables, $i$ is the data index, and $n$ is the number of the fuzzy rules.
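A minimal sketch (ours) of the least-squares estimation (8) for linear consequents, assuming NumPy and a design matrix that stacks, per rule, the normalized activation and the activation-weighted inputs:

```python
import numpy as np

def estimate_consequents(W_hat, X, y):
    """Estimate the linear consequent parameters of all rules at once
    by standard least squares, as in (8).
    W_hat: (m, n) normalized premise fitness of n rules for m data points
    X:     (m, k) input variables
    y:     (m,)   target outputs"""
    m = X.shape[0]
    ones = np.ones((m, 1))
    # per rule j, columns [w_hat_j, w_hat_j*x_1, ..., w_hat_j*x_k]
    blocks = [W_hat[:, j:j + 1] * np.hstack([ones, X])
              for j in range(W_hat.shape[1])]
    A = np.hstack(blocks)
    a, *_ = np.linalg.lstsq(A, y, rcond=None)
    return a  # stacked coefficient vectors [a_j0, a_j1, ..., a_jk] per rule
```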


Fig. 11. Genetically optimized FPNN (gFPNN) architecture.

Fig. 12. Conventional optimized FPNN architecture.

Fig. 13. Optimization process of each performance index by the genetic algorithms.


The procedure described previously is implemented iteratively for all nodes of the layer and also for all layers of the FPNN; we start from the input layer and move toward the output layer.

[Step 6] Select the nodes (FPNs) with the best predictive capability and construct their corresponding layer

Fig. 7 depicts the genetic optimization for the generation of the optimal nodes (FPNs) at the corresponding layer. All nodes of this layer of the FPNN are constructed genetically.

The generation process can be organized as the following sequence of steps.

[Step 6.1] We set up the initial genetic information necessary for the generation of the FPNN architecture. This concerns the number of generations and populations, the mutation rate, the crossover rate, and the length of the chromosome.

[Step 6.2] The nodes (FPNs) are generated through the genetic design. Here, a single individual of the population assumes the same role as a node (FPN) in the FPNN architecture, and its underlying processing is visualized in Fig. 5. The optimal parameters of the corresponding polynomial are computed by the standard least squares method.

[Step 6.3] To evaluate the performance of the FPNs (nodes) constructed using the training dataset, the testing dataset is used. Based on this performance index, we calculate the fitness function.

This fitness function reads as

$$\text{fitness function} = \frac{1}{1 + \text{EPI}} \quad (9)$$

where EPI denotes the performance index computed for the testing data (or validation data) of the FPNN model constructed from the training data.


TABLE VII. COMPARATIVE ANALYSIS OF THE PERFORMANCE OF THE NETWORK; CONSIDERED ARE MODELS REPORTED IN THE LITERATURE

[Step 6.4] To move on to the next generation, we carry out selection, crossover, and mutation operations using the initial genetic information and the fitness values obtained via [Step 6.3].

[Step 6.5] The nodes (FPNs) obtained on the basis of the calculated fitness values are rearranged in descending order. We unify the nodes with duplicated fitness values (viz., cases where one node has the same fitness value as other nodes) among the rearranged nodes. We choose several FPNs characterized by the best fitness values. Here, we use a predefined number $W$ of FPNs with better predictive capability that need to be preserved to assure an optimal operation at the next iteration of the FPNN algorithm. The outputs of the retained nodes (FPNs) serve as inputs to the next layer of the network. There are two cases as to the number of the retained FPNs, that is, as follows.

i) If $z < W$, then the number of the nodes (FPNs) retained for the next layer is equal to $z$. Here, $z$ denotes the number of nodes remaining in each layer after the nodes with duplicated fitness values have been removed.

ii) If $z \ge W$, then the number of the retained nodes (FPNs) for the next layer is equal to $W$.

The above design flow is carried out for the successive layers of the network. For the construction of the nodes in the corresponding layer of the original FPNN structure, the nodes obtained through i) or ii) are rearranged in ascending order on the basis of the initial population number. This step is needed to construct the final network architecture. We also trace the location of the original nodes of each layer generated by the genetic optimization.

[Step 6.6] For the elitist strategy, we select the node that has the highest fitness value among the $W$ selected nodes.

[Step 6.7] We generate new populations of the next generation using the GA operators obtained from [Step 6.4]. We use the elitist strategy. This sub-step is carried out by repeating [Step 6.2] to [Step 6.6]. In particular, in [Step 6.5], we replace the node that has the lowest fitness value in the current generation with the node that reached the highest fitness value in the previous generation, obtained from [Step 6.6].

[Step 6.8] We combine the nodes ($W$ populations) obtained in the previous generation with the nodes ($W$ populations) obtained in the current generation. In the sequel, the $W$ nodes that have the higher fitness values among them are selected. That is, this sub-step is completed by repeating [Step 6.5].

[Step 6.9] This sub-step is carried out by repeating [Step 6.7] and [Step 6.8] until the last generation.

The iterative process generates the optimal nodes of the given layer of the FPNN.
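A minimal sketch (ours) of the retention rule of [Step 6.5]: sort by fitness, unify duplicated fitness values, and keep at most W nodes:

```python
def retain_nodes(nodes, fitness_values, W):
    """Rearrange nodes in descending order of fitness, unify nodes with
    duplicated fitness values, and keep at most W of them (Step 6.5)."""
    ranked = sorted(zip(fitness_values, nodes),
                    key=lambda p: p[0], reverse=True)
    unique, seen = [], set()
    for f, node in ranked:
        if f not in seen:        # duplicated fitness -> single representative
            seen.add(f)
            unique.append((f, node))
    z = len(unique)              # case i) z < W keeps z; case ii) keeps W
    return [node for _, node in unique[:min(z, W)]]
```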

[Step 7] Check the termination criterion

The termination condition that controls the growth of the model consists of two components, that is, the performance index and the size of the network (expressed in terms of the maximal number of layers).


TABLE VIII. PERFORMANCE INDEX OF THE NETWORK OF EACH LAYER VERSUS THE INCREASE OF THE MAXIMAL NUMBER OF INPUTS TO BE SELECTED

As far as the performance index is concerned (it reflects the numeric accuracy of the layers), termination is straightforward and comes in the form of the inequality

$$F_1 \le F^* \quad (10)$$

where $F_1$ denotes the maximal fitness value occurring at the current layer whereas $F^*$ stands for the maximal fitness value that occurred at the previous layer. As far as the depth of the network is concerned, the generation process is stopped at a depth of less than five layers. This size of the network has been experimentally found to build a sound compromise between the high accuracy of the resulting model and its complexity as well as its generalization abilities.

In this study, we use two measures (performance indexes), that is, the mean squared error (MSE) and the root mean squared error (RMSE).

i) MSE:

$$E = \frac{1}{N}\sum_{p=1}^{N}(y_p - \hat{y}_p)^2 \quad (11)$$

ii) RMSE:

$$E = \sqrt{\frac{1}{N}\sum_{p=1}^{N}(y_p - \hat{y}_p)^2} \quad (12)$$

where $y_p$ is the $p$th target output data and $\hat{y}_p$ stands for the $p$th actual output of the model for this specific data point, $N$ is the number of training or testing input–output data pairs, and $E$ is an overall (global) performance index defined as a sum of the errors over the $N$ data points.

[Step 8] Determine new input variables for the next layer

If (10) has not been met, the model is expanded. The outputs of the preserved nodes ($z_{1i}, z_{2i}, \ldots, z_{Wi}$) serve as new inputs to the next layer ($x_{1i}, x_{2i}, \ldots, x_{Wi}$), $i = 1, 2, \ldots, N$. This is captured by the expression

$$x_{1i} = z_{1i},\quad x_{2i} = z_{2i},\quad \ldots,\quad x_{Wi} = z_{Wi} \quad (13)$$

The FPNN algorithm is carried out by repeating Steps 4–8.

IV. EXPERIMENTAL STUDIES

The performance of the GA-based FPNN is illustrated with the aid of some well-known and widely used datasets. The first one is the gas furnace time series (Box–Jenkins data), which was studied previously in a number of studies [12], [13], [17]–[33]. The second one deals with the chaotic Mackey–Glass time series [34]–[41]. The third dataset is concerned with a nonlinear static system already exploited in fuzzy modeling [20], [23]–[25], [42], [43].

A. Gas Furnace Process

We illustrate the performance of the network and elaborate on its development by experimenting with data coming from the gas furnace process. The time series data (296 input–output pairs) resulting from the gas furnace process has been intensively studied in the previous literature [12], [13], [17]–[33]. The delayed terms of the methane gas flow rate, $u(t)$, and the carbon dioxide density, $y(t)$, are used as system input variables, namely $u(t-3)$, $u(t-2)$, $u(t-1)$, $y(t-3)$, $y(t-2)$, and $y(t-1)$. The output variable is $y(t)$. The first part of the dataset (consisting of 148 pairs) was used for training. The remaining part of the series serves as a testing set. We consider the MSE given by (11) to be a pertinent performance index.
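A minimal sketch (ours) of how the six delayed inputs and the output can be assembled from the raw series u and y, assuming NumPy:

```python
import numpy as np

def gas_furnace_io(u, y):
    """Build the six delayed inputs and the output for the Box-Jenkins
    gas furnace data: inputs u(t-3..t-1), y(t-3..t-1); output y(t)."""
    t = np.arange(3, len(y))
    X = np.column_stack([u[t - 3], u[t - 2], u[t - 1],
                         y[t - 3], y[t - 2], y[t - 1]])
    return X, y[t]

# first 148 pairs for training, the remaining pairs for testing, e.g.:
# X_tr, y_tr = X[:148], Y[:148]; X_te, y_te = X[148:], Y[148:]
```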

We choose the input variables of the nodes in the 1st layer of the FPNN architecture from these input variables. We use three types of system input variables of the FPNN architecture with vector formats given as Types I–III, as shown in Table IV. Types I–III utilize two, four, and six system input variables, respectively.

Table V summarizes the list of parameters used in the genetic optimization of the network. In the optimization of each layer, we use 100 generations, 60 populations, a string of 36 bits, a crossover rate equal to 0.65, and a probability of mutation set to 0.1. A chromosome used in the genetic optimization consists of a string including three sub-chromosomes: the first contains the number of input variables, the second contains the order of the polynomial, and the third contains the input variables. The numbers of bits allocated to the sub-chromosomes are equal to 3, 3, and 30, respectively. The population size selected from the total population size (60) is equal to 30. The development process is realized as follows.


Fig. 14. Genetically optimized FPNN (gFPNN) architecture.

Fig. 15. Optimization process of each performance index by the genetic algorithms.

Sixty nodes (FPNs) are generated in each layer of the network. The parameters of all nodes generated in each layer are estimated, and the network is evaluated using both the training and testing data sets. Then we compare these values and choose the 30 FPNs that produce the best (lowest) value of the performance index. The maximal number (Max) of inputs to be selected is confined to the range of two to five (2–5) entries. The polynomial order of the consequent part of fuzzy rules is chosen from four types, that is, Types 1–4, as shown in Table III. As is usual in fuzzy systems, we may exploit a variety of membership functions in the condition part of the rules, and this is another factor contributing to the flexibility of the network. Overall, triangular and Gaussian fuzzy sets are of general interest. The first class, of triangular membership functions, provides a very simple implementation. The second class becomes useful because of the infinite support of its fuzzy sets.

In generating the overall optimized network, the computation time depends on the GA-related parameters (such as the maximal number of generations and the total and selected population sizes), the maximal number of inputs to be selected at each node, as well as the maximal number of layers, as shown in Table V.

Fig. 8 shows an example of the FPN design that is driven by some specific chromosome; refer to the case whose performance values PI and EPI are reported for the 1st layer when using Gaussian-like MFs and Max = 4 in Table VI(b). In the FPN of each layer, two Gaussian-like membership functions are used for each input variable. Here, the number of entire input variables (here, the entire set of system input variables) considered in the 1st layer is given as 4 (Type II). The selected polynomial order is given as Type 2. Furthermore, all system input variables are considered for the regression polynomial function of the consequent part of fuzzy rules occurring in the first layer.


TABLE IX. ANALYSIS OF THE PERFORMANCE OF THE NETWORK; CONSIDERED ARE MODELS REPORTED IN THE LITERATURE

That is, "Type P*" is used as the consequent input type in the 1st layer, and in the 2nd layer or higher, the input variables in the conclusion part of fuzzy rules are kept as those in the conditional part of the fuzzy rules. In particular, in the 2nd layer or higher, the number of entire input variables is given as $W$, that is, the number of the nodes selected in the current layer as the output nodes of the preceding layer; refer to Step 4.3.4 of the introduced design process. As mentioned previously, the maximal number (Max) of input variables for the selection is confined to 4, and two variables were selected among them. The parameters of the conclusion part (polynomial) of the fuzzy rules can be determined by the standard least-squares method.

Table VI summarizes the results when using Type II. According to the maximal number of inputs to be selected (Max = 2 to 5), the selected node numbers, the selected polynomial type (Type T), and the corresponding performance indexes (PI and EPI) are shown as the genetic optimization for each layer was carried out. "Node" denotes the nodes for which the fitness value is maximal in each layer. For example, in the case of Table VI(b), the fitness value in layer 5 is maximal when nodes 3 and 28 occurring in the previous layer are selected as the node inputs in the present layer. Only two inputs of Type 2 (linear function) were selected as the result of the genetic optimization. Here, node "0" indicates that it has not been selected by the genetic operation. Therefore, the width (the number of nodes) of the layer can be lower in comparison to the conventional FPNN (which greatly contributes to the compactness of the resulting network). In that case, the minimal value of the performance index at that node is obtained.

Fig. 9 shows the values of the performance index vis-à-vis the number of layers of the gFPNN with respect to the maximal number of inputs to be selected for the optimal architectures of each layer of the network included in Table VI, while Fig. 10 summarizes the values of the performance index represented vis-à-vis the increasing number of layers with regard to the number of entire system inputs being used in the gFPNN. In Fig. 9, the annotated node numbers denote the optimal nodes at each layer of the network, namely those with the best predictive performance. Here, the node numbers of the 1st layer represent system input numbers, and the node numbers of each layer from the second layer onward represent the output node numbers of the preceding layer, that is, the optimal nodes with the best output performance in the current layer.

Fig. 11 illustrates the detailed optimal topology of the network with two or three layers, which is compared to the conventional optimized network shown in Fig. 12. The performance of the conventional (optimized) FPNN was quantified by the values of PI equal to 0.016 and EPI given as 0.128, whereas, under the condition of similar performance values, the two types of gFPNN architectures depicted in Fig. 11(a) and (b) were obtained. As shown in Fig. 11, the genetic design procedure at each stage (layer) of the FPNN leads to the selection of the preferred nodes (or FPNs) with optimal local characteristics (such as the number of input variables, the order of the polynomial, and the input variables themselves). In the sequel, the best results for the network in the second layer are obtained when using the triangular MF, and this happens with Type 1 (constant) polynomials and four nodes at the inputs (nodes numbered 7, 9, 28, 30); the performance values of this network are reported in Fig. 11(a). In addition, the most preferred results for the network in the third layer have been reported when using the Gaussian-like MF with the polynomial of Type 3 and two nodes at the inputs (node numbers 14, 28); see Fig. 11(b). Therefore, the width (the number of nodes) of the layer as well as the depth (the number of layers) of the network can be lower in comparison to the conventional FPNN, which immensely contributes to the compactness of the resulting network.

Fig. 13 illustrates the optimization process by visualizing the performance index in successive generations of the genetic optimization. It also shows the optimized network architecture in each layer of the network when using Type II and the triangular MF.



Fig. 16. Performance index of the network shown with respect to the increase of the maximal number of inputs to be selected.

Noticeably, the variation ratio (slope) of the performance of the network changes radically around the 2nd and 4th layers. Therefore, to effectively reduce a large number of nodes and avoid a substantial amount of time-consuming iterations over the FPNN layers, some stopping criterion can be taken into consideration. Referring to Fig. 11(a), it becomes obvious that we can stop optimizing the network at the second layer (or, at most, at the fourth layer).
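One plausible form of such a criterion, sketched under the assumption that the testing performance index (EPI) is tracked layer by layer, is a simple relative-improvement test; the tolerance value below is an arbitrary illustration, not a value from the paper.

```python
# Illustrative layer-wise stopping rule driven by the flattening slope noted
# above: stop growing layers once the testing performance index no longer
# improves by more than a chosen relative tolerance.
def should_stop(epi_per_layer, tol=1e-3):
    if len(epi_per_layer) < 2:
        return False
    prev, last = epi_per_layer[-2], epi_per_layer[-1]
    return (prev - last) < tol * max(prev, 1e-12)

print(should_stop([0.270, 0.265, 0.2649]))  # True: the gain has flattened out
```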

Table VII contrasts the performance of the genetically developed network with other fuzzy and neuro-fuzzy models studied in the literature. The experimental results clearly reveal that the proposed approach and the resulting model outperform the existing networks both in terms of better approximation capabilities (lower values of the performance index on the training data, PI) and superb generalization abilities (expressed by the performance index on the testing data, EPI). PI and EPI are defined as the mean squared errors (MSE) between the experimental data and the respective outputs of the model (network). Moreover, when compared with the conventional optimized FPNN (or SONN), the width (the number of nodes; see the last column of Table VII) of the layer as well as the depth (the number of layers) of the network can be smaller (again, note that the result is a more compact network), yet we obtain performance values similar to those of the previous larger network.

B. Chaotic Mackey–Glass Time Series

In this section, we demonstrate how the GA-based FPNN can be utilized to predict future values of a chaotic Mackey–Glass time series. The performance of the network is also contrasted with some other models existing in the literature [34]–[41]. The time series is generated by the chaotic Mackey–Glass differential delay equation [41], which comes in the form

\[ \frac{dx(t)}{dt} = \frac{0.2\,x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\,x(t) \tag{14} \]

The prediction of future values of this series arises as a benchmark problem that has been used and reported by a number of researchers. To obtain the time series value at each integer point, we applied the fourth-order Runge–Kutta method to find the numerical solution to (14).

From the Mackey–Glass time series x(t), we extracted 1000 input–output data pairs in the following format:

\[ [\,x(t-30),\, x(t-24),\, x(t-18),\, x(t-12),\, x(t-6),\, x(t);\; x(t+6)\,] \]

where t = 118 to 1117. The first 500 pairs were used as the training data set while the remaining 500 pairs formed the testing data set. To come up with a quantitative evaluation of the network, we use the standard RMSE performance index as given by (12).
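To make the setup concrete, the following minimal Python sketch reproduces this data-preparation pipeline under the usual benchmark conventions (τ = 17, x(0) = 1.2, integration step h = 0.1, x(t) = 0 for t < 0); these constants, and the practice of freezing the delayed term within each Runge–Kutta step, are assumptions of the sketch rather than details quoted from the text.

```python
import numpy as np

# Sketch of generating the Mackey-Glass series with fourth-order Runge-Kutta.
def mackey_glass(n_points, tau=17.0, h=0.1, x0=1.2):
    per_unit = int(round(1.0 / h))
    delay = int(round(tau / h))
    x = np.zeros(n_points * per_unit + 1)
    x[0] = x0

    def f(xt, xd):                                 # right-hand side of (14)
        return 0.2 * xd / (1.0 + xd ** 10) - 0.1 * xt

    for i in range(len(x) - 1):
        xd = x[i - delay] if i >= delay else 0.0   # delayed term, frozen per step
        k1 = f(x[i], xd)
        k2 = f(x[i] + 0.5 * h * k1, xd)
        k3 = f(x[i] + 0.5 * h * k2, xd)
        k4 = f(x[i] + h * k3, xd)
        x[i + 1] = x[i] + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x[::per_unit]                           # values at integer time points

# Lagged input-output pairs in the format given above, plus the RMSE of (12):
def make_pairs(x, lags=(30, 24, 18, 12, 6, 0), horizon=6, t0=118, n=1000):
    ts = np.arange(t0, t0 + n)
    X = np.stack([x[ts - lag] for lag in lags], axis=1)
    return X, x[ts + horizon]

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

series = mackey_glass(1124)                        # covers t = 118 ... 1117 + 6
X, y = make_pairs(series)
print(X.shape, y.shape)                            # -> (1000, 6) (1000,)
```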

The GA-based design procedure is carried out in the same manner as in the previous experiments.

Table VIII summarizes the performance of the network when changing the maximal number of inputs to be selected; here, Max was set to 2 through 5. The consequent input type used in this process is the same as that in the case of the gas furnace process.

Fig. 14(a) and (b) illustrates the detailed optimal topologies of the gFPNN for one layer and two layers, respectively, when using the triangular MF; the performance values (PI and EPI) of the first-layer and second-layer networks are reported in the figure. Fig. 14(c) illustrates the detailed optimal topology of the gFPNN for one layer when using the Gaussian-like MF, together with the corresponding values of the performance index. As shown in Fig. 14, the proposed approach yields a structurally more optimized and simpler network than the "conventional" FPNN.

Fig. 15 illustrates the optimization process by visualizing the values of the performance index obtained in successive generations of the GA. It also shows the optimized architecture of the network when using the Gaussian-like MF (the maximal number (Max) of inputs to be selected is set to 4, with the structure composed of five layers). As shown in Fig. 15, the variation ratio (slope) of the performance of the network remains almost the same from the third through the fifth layer; therefore, in this case, the stopping criterion can be invoked after at most two layers in order to effectively reduce the number of nodes as well as layers of the network (that is, the width and depth of the gFPNN, yielding a compact network).



TABLE X
PERFORMANCE INDEX OF THE NETWORK OF EACH LAYER VERSUS THE INCREASE OF MAXIMAL NUMBER OF INPUTS TO BE SELECTED

TABLE XI
COMPARATIVE ANALYSIS OF THE PERFORMANCE OF THE NETWORK; CONSIDERED ARE MODELS REPORTED IN THE LITERATURE

Table IX gives a comparative summary of the network vis-à-vis other models. The experimental results clearly reveal that it outperforms the existing models both in terms of better approximation capabilities (lower values of the performance index on the training data, PI) and superb generalization abilities (expressed by the performance index on the testing data, EPI). Here, PI and EPI are defined as the RMSE computed between the experimental data and the respective outputs of the network.

C. Two-Input Nonlinear Static System

In this section, we consider a nonlinear static system [20], [23]–[25], [42], [43] with two inputs, x1 and x2, and a single output that assumes the following form:

\[ y = \left(1 + x_1^{-2} + x_2^{-1.5}\right)^{2}, \qquad 1 \le x_1,\, x_2 \le 5 \tag{15} \]

This nonlinear static equation is widely used to evaluate the performance of fuzzy modeling methods. Using (15), 50 input–output data pairs are generated: the inputs are generated randomly and the corresponding output is then computed through the above relationship. We consider the MSE (11) to serve as a performance index.
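A minimal sketch of this data-generation step is given below; the input range [1, 5] is the one customarily associated with this benchmark and is an assumption here, as are all names.

```python
import numpy as np

# Sketch of the data-generation step for the static system (15): 50 random
# input-output pairs with inputs drawn uniformly from [1, 5], together with
# the MSE performance index of (11).
rng = np.random.default_rng(42)

def static_system(x1, x2):
    return (1.0 + x1 ** -2.0 + x2 ** -1.5) ** 2

x1, x2 = rng.uniform(1.0, 5.0, size=(2, 50))
y = static_system(x1, x2)

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

print(y.shape, mse(y, y.mean() * np.ones_like(y)))  # trivial baseline check
```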

Most of the optimized parameters are the same as those used in the previous experiments. Unlike in the previous experiments, the consequent input type used in this process is fixed; that is, the input variables of the conclusion part of the fuzzy rules are kept the same as those in the condition part of the fuzzy rules in each layer of the network.

Table X summarizes the detailed results while Fig. 16 pertains to the values of the performance index.

Fig. 17. Genetically optimized FPNN architecture in the case of using Max = 3 and Gaussian-like MF.

As shown in Table X, the performance in the second layer of the genetically optimized network improves radically in both cases (triangular and Gaussian MFs). In the case of the triangular MF, the result for the network in the 2nd layer is obtained with Type 3 polynomials (quadratic functions) and four nodes at the input (node numbers 1, 6, 7, 9); this network comes with a value of PI equal to 7.9e-18. When using the Gaussian MF, the result for the network in the second layer has been reported for Max = 3, again with second-order polynomials and three nodes at the input (node inputs: 2, 6, 9); the related optimal network is visualized in Fig. 17.

Fig. 18 illustrates the optimization process by visualizing the performance index obtained in successive generations of this optimization. It also shows the optimized network architecture (consisting here of five layers). The variation ratio (slope) of the performance of the network changes radically in the second layer.



Fig. 18. Optimization process of the performance index realized by the genetic algorithm.

Therefore, the depth (the number of layers) of the genetically optimized FPNN can be lower in comparison to the conventional FPNN.

Table XI covers a comparative analysis including several previous fuzzy and neuro-fuzzy models. Compared with these models, the gFPNN emerges as a model of high accuracy and improved prediction capabilities.

V. CONCLUDING REMARKS

In this study, the GA-based design procedure of the FPNN, along with its architectural considerations, has been investigated. In contrast to the "standard (conventional)" FPNN structures and their learning, the proposed model comes with a diversity of local characteristics of FPNs that are extremely useful when coping with various nonlinear characteristics of the system under consideration. The GA-based design procedure applied at each stage (layer) of the FPNN leads to the selection of optimal nodes (or FPNs) with local characteristics (such as the number of input variables, the order of the consequent polynomial of the fuzzy rules, and the input variables themselves) available within the FPNN. These options contribute to the flexibility of the resulting architecture of the network. The design methodology comes as a hybrid of structural optimization and parametric learning, viewed as two fundamental phases of the design process. The GMDH method now comprises both a structural phase, realized as a self-organizing and evolutionary algorithm (rooted in the natural law of survival of the fittest), and an ensuing parametric phase of least-squares estimation (LSE)-based learning.

The comprehensive experimental studies involving well-known datasets quantify a superb performance of the network in comparison to the existing fuzzy and neuro-fuzzy models. Most importantly, through the proposed framework of genetic optimization we can efficiently search for the optimal network architecture (a structurally and parametrically optimized network), and this becomes crucial in improving the performance of the resulting model.

REFERENCES

[1] V. Cherkassky, D. Gehring, and F. Mulier, "Comparison of adaptive methods for function estimation from samples," IEEE Trans. Neural Networks, vol. 7, no. 4, pp. 969–984, Jul. 1996.

[2] J. A. Dickerson and B. Kosko, "Fuzzy function approximation with ellipsoidal rules," IEEE Trans. Syst., Man, Cybern. B, vol. 26, no. 4, pp. 542–560, Aug. 1996.

[3] V. Sommer, P. Tobias, D. Kohl, H. Sundgren, and L. Lundstrom, "Neural networks and abductive networks for chemical sensor signals: A case comparison," Sens. Actuators B, vol. 28, pp. 217–222, 1995.

[4] S. Kleinsteuber and N. Sepehri, "A polynomial network modeling approach to a class of large-scale hydraulic systems," Comput. Elect. Eng., vol. 22, pp. 151–168, 1996.

[5] O. Cordon et al., "Ten years of genetic fuzzy systems: Current framework and new trends," Fuzzy Sets Syst., vol. 141, pp. 5–31, 2004.

[6] A. G. Ivakhnenko, "Polynomial theory of complex systems," IEEE Trans. Syst., Man, Cybern., vol. SMC-1, no. 4, pp. 364–378, Oct. 1971.

[7] A. G. Ivakhnenko and H. R. Madala, Inductive Learning Algorithms for Complex Systems Modeling. Boca Raton, FL: CRC Press, 1994.

[8] A. G. Ivakhnenko and G. A. Ivakhnenko, "The review of problems solvable by algorithms of the group method of data handling (GMDH)," Pattern Recogn. Image Anal., vol. 5, no. 4, pp. 527–535, 1995.

[9] A. G. Ivakhnenko, G. A. Ivakhnenko, and J. A. Muller, "Self-organization of neural networks with active neurons," Pattern Recogn. Image Anal., vol. 4, no. 2, pp. 185–196, 1994.

[10] S.-K. Oh and W. Pedrycz, "Self-organizing polynomial neural networks based on polynomial and fuzzy polynomial neurons: Analysis and design," Fuzzy Sets Syst., vol. 142, no. 2, pp. 163–198, Mar. 2004.

[11] W. Pedrycz and M. Reformat, "Evolutionary optimization of fuzzy models in fuzzy logic: A framework for the new millennium," in Studies in Fuzziness and Soft Computing, V. Dimitrov and V. Korotkich, Eds. Berlin, Germany: Physica-Verlag, 1996, vol. 8, pp. 51–67.

[12] W. Pedrycz, "An identification algorithm in fuzzy relational systems," Fuzzy Sets Syst., vol. 13, pp. 153–167, 1984.

[13] S.-K. Oh and W. Pedrycz, "Identification of fuzzy systems by means of an auto-tuning algorithm and its application to nonlinear systems," Fuzzy Sets Syst., vol. 115, no. 2, pp. 205–230, 2000.

[14] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs. Berlin, Germany: Springer-Verlag, 1996.

[15] J. H. Holland, Adaptation in Natural and Artificial Systems. Ann Arbor, MI: Univ. of Michigan Press, 1975.

[16] K. A. De Jong, "Are genetic algorithms function optimizers?," in Parallel Problem Solving From Nature 2, R. Manner and B. Manderick, Eds. Amsterdam, The Netherlands: North-Holland, 1992.

[17] G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control. San Francisco, CA: Holden-Day, 1976.

[18] R. M. Tong, "The evaluation of fuzzy models derived from experimental data," Fuzzy Sets Syst., vol. 13, pp. 1–12, 1980.

[19] M. Sugeno and T. Yasukawa, "Linguistic modeling based on numerical data," IFSA'91, Brussels, Computer, Management and Systems Science, pp. 264–267, 1991.

[20] ——, "A fuzzy-logic-based approach to qualitative modeling," IEEE Trans. Fuzzy Syst., vol. 1, no. 1, pp. 7–31, Feb. 1993.

[21] C. W. Xu and Y. Zailu, "Fuzzy model identification and self-learning for dynamic systems," IEEE Trans. Syst., Man, Cybern., vol. SMC-17, no. 4, pp. 683–689, Jul. 1987.

[22] J. Q. Chen, Y. G. Xi, and Z. J. Zhang, "A clustering algorithm for fuzzy model identification," Fuzzy Sets Syst., vol. 98, pp. 319–329, 1998.

[23] A. F. Gomez-Skarmeta, M. Delgado, and M. A. Vila, "About the use of fuzzy clustering techniques for fuzzy model identification," Fuzzy Sets Syst., vol. 106, pp. 179–188, 1999.

[24] E.-T. Kim et al., "A new approach to fuzzy modeling," IEEE Trans. Fuzzy Syst., vol. 5, no. 3, pp. 328–337, Jun. 1997.

[25] E.-T. Kim et al., "A simple identified Sugeno-type fuzzy model via double clustering," Inform. Sci., vol. 110, pp. 25–39, 1998.

[26] J. Leski and E. Czogala, "A new artificial neural networks based fuzzy inference system with moving consequents in if-then rules and selected applications," Fuzzy Sets Syst., vol. 108, pp. 289–297, 1999.

[27] Y. Lin and G. A. Cunningham III, "A new approach to fuzzy-neural modeling," IEEE Trans. Fuzzy Syst., vol. 3, no. 2, pp. 190–197, Apr. 1995.

[28] Y. Wang and G. Rong, "A self-organizing neural-network-based fuzzy system," Fuzzy Sets Syst., vol. 103, pp. 1–11, 1999.



[29] H.-S. Park, S.-K. Oh, and Y.-W. Yoon, "A new modeling approach to fuzzy-neural networks architecture" (in Korean), J. Control, Automat., Syst. Eng., vol. 7, no. 8, pp. 664–674, Aug. 2001.

[30] S.-K. Oh, D.-W. Kim, and B.-J. Park, "A study on the optimal design of polynomial neural networks structure," Trans. Korean Inst. Elect. Eng., vol. 49D, no. 3, pp. 145–156, Mar. 2000.

[31] S.-K. Oh, W. Pedrycz, and D.-W. Kim, "Hybrid fuzzy polynomial neural networks," Int. J. Uncertainty, Fuzziness, Knowledge-Based Syst., vol. 10, no. 3, pp. 257–280, Jun. 2002.

[32] S.-K. Oh and W. Pedrycz, "The design of self-organizing polynomial neural networks," Inform. Sci., vol. 141, pp. 237–258, 2002.

[33] ——, "Fuzzy polynomial neuron-based self-organizing neural networks," Int. J. Gen. Syst., vol. 32, no. 3, pp. 237–250, May 2003.

[34] L. X. Wang and J. M. Mendel, "Generating fuzzy rules from numerical data with applications," IEEE Trans. Syst., Man, Cybern., vol. 22, no. 6, pp. 1414–1427, Dec. 1992.

[35] R. S. Crowder III, "Predicting the Mackey–Glass time series with cascade-correlation learning," in Proc. 1990 Connectionist Models Summer School, D. Touretzky, G. Hinton, and T. Sejnowski, Eds. Pittsburgh, PA: Carnegie Mellon University, 1990, pp. 117–123.

[36] J. S. R. Jang, "ANFIS: Adaptive-network-based fuzzy inference system," IEEE Trans. Syst., Man, Cybern., vol. 23, no. 3, pp. 665–685, Jun. 1993.

[37] L. P. Maguire, B. Roche, T. M. McGinnity, and L. J. McDaid, "Predicting a chaotic time series using a fuzzy neural network," Inform. Sci., vol. 112, pp. 125–136, 1998.

[38] C. J. Li and T.-Y. Huang, "Automatic structure and parameter training methods for modeling of mechanical systems by recurrent neural networks," Appl. Math. Model., vol. 23, pp. 933–944, 1999.

[39] S.-K. Oh, W. Pedrycz, and T.-C. Ahn, "Self-organizing neural networks with fuzzy polynomial neurons," Appl. Soft Comput., vol. 2, no. 1, pp. 1–10, Aug. 2002.

[40] A. S. Lapedes and R. Farber, "Non-linear signal processing using neural networks: Prediction and system modeling," Los Alamos National Laboratory, Los Alamos, NM, Tech. Rep. LA-UR-87-2662, 1987.

[41] M. C. Mackey and L. Glass, "Oscillation and chaos in physiological control systems," Science, vol. 197, pp. 287–289, 1977.

[42] S.-K. Oh, D.-W. Kim, B.-J. Park, and H.-S. Hwang, "Advanced polynomial neural networks architecture with new adaptive nodes," Trans. Control, Automat., Syst. Eng., vol. 3, no. 1, pp. 43–50, Mar. 2001.

[43] B.-J. Park, W. Pedrycz, and S.-K. Oh, "Fuzzy polynomial neural networks: Hybrid architectures of fuzzy modeling," IEEE Trans. Fuzzy Syst., vol. 10, no. 5, pp. 607–621, Oct. 2002.

Sung-Kwun Oh (M'98) received the B.Sc., M.Sc., and Ph.D. degrees in electrical engineering from Yonsei University, Seoul, Korea, in 1981, 1983, and 1993, respectively.

During 1983–1989, he was a Senior Researcher in the R&D Laboratory of Lucky-Goldstar Industrial Systems Co., Ltd., Anyang, Korea. From 1996 to 1997, he was a Postdoctoral Fellow in the Department of Electrical and Computer Engineering, University of Manitoba, Canada. He is currently a Professor in the Department of Electrical Engineering, University of Suwon, South Korea. His research interests include fuzzy systems, fuzzy-neural networks, automation systems, advanced computational intelligence, and intelligent control.

Dr. Oh currently serves as an Associate Editor of the KIEE Transactions on Systems and Control and the Journal of Control, Automation, and Systems Engineering of the ICASE, South Korea.

Witold Pedrycz (M'88–SM'90–F'99) received the M.Sc., Ph.D., and D.Sci. degrees, all from the Silesian University of Technology, Gliwice, Poland.

He is a Professor and Canada Research Chair (CRC) in the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada. He is also with the Systems Research Institute of the Polish Academy of Sciences. He is actively pursuing research in computational intelligence, fuzzy modeling, knowledge discovery and data mining, fuzzy control including fuzzy controllers, pattern recognition, knowledge-based neural networks, relational computation, bioinformatics, and software engineering. He has published numerous papers in these areas and is an author of nine research monographs covering various aspects of computational intelligence and software engineering.

Dr. Pedrycz has been a member of numerous program committees of conferences in the area of fuzzy sets and neurocomputing. He currently serves as an Associate Editor of the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, the IEEE TRANSACTIONS ON NEURAL NETWORKS, and the IEEE TRANSACTIONS ON FUZZY SYSTEMS. He is Editor-in-Chief of Information Sciences, President of IFSA, and President of NAFIPS.

Ho-Sung Park received the B.S., M.S., and Ph.D. degrees in control and instrumentation engineering from Wonkwang University, Korea, in 1999, 2001, and 2005, respectively.

He is currently a full-time Lecturer in the School of Electrical, Electronic, and Information Engineering, Wonkwang University, South Korea. His research interests include fuzzy inference systems, neural networks and hybrid systems, neuro-fuzzy models, genetic algorithms, and computational intelligence.

Dr. Park is a member of KIEE and ICASE.