[ieee 2010 third international workshop on advanced computational intelligence (iwaci) - suzhou,...

Third International Workshop on Advanced Computational Intelligence, August 25-27, 2010 - Suzhou, Jiangsu, China

An Online Self-organizing Neuro-Fuzzy System from Training Data

Ning Wang, Dan Wang and Zhiliang Wu

Abstract- In this paper, we design a novel Online Self-constructing Neuro-Fuzzy System (OSNFS) based on the proposed generalized ellipsoidal basis functions (GEBF). Due to the flexibility and dissymmetry of the GEBF, the partitioning made by GEBFs in the input space is more flexible and more economical, and therefore results in a parsimonious neuro-fuzzy system (NFS) with high performance under the online learning algorithm. The geometric growing criteria and the error reduction ratio (ERR) method are used as growing and pruning strategies, respectively, to realize a structure learning algorithm which implements an optimal and compact network structure. The proposed OSNFS starts with no fuzzy rules and does not need to partition the input space a priori. In addition, all the free parameters in premises and consequents are adjusted online based on the $\varepsilon$-completeness of fuzzy rules and the linear least square (LLS) approach, respectively. The performance of the proposed OSNFS is compared with other well-known algorithms on a benchmark problem in nonlinear dynamic system identification. Simulation results demonstrate that the proposed OSNFS approach can facilitate a compact and economical NFS with better approximation performance.

I. INTRODUCTION

It has been revealed that fuzzy systems and neural networks can approximate any function to any desired accuracy provided that sufficient fuzzy rules or hidden neurons are available [1], [2]. The innovative merger of the two paradigms results in a popular and powerful field, neuro-fuzzy systems (NFS), which are designed to realize a fuzzy inference system through the topology of neural networks, and therefore incorporate the generic advantages of neural networks, such as massive parallelism, robustness, and learning ability, into fuzzy inference systems [3]. It is worth noting that the problem of determining the number of hidden nodes in the NFS can be viewed as the choice of the number of fuzzy rules. However, how to extract a suitable collection of fuzzy rules from the available data set is still an open issue. To cope with this difficulty, increasing attention has been focused on using the NFS to identify the structure and parameters of fuzzy systems based on the learning ability of neural networks.

Ning Wang, Dan Wang and Zhiliang Wu are with the Marine Engineering College, Dalian Maritime University, Dalian 116026, China (email: [email protected]).

The work described in this paper was partially supported by the Fundamental Research Funds for the Central Universities (under Grant 2009QN025), the National Natural Science Foundation of China (under Grant 60674037) and the Program for Liaoning Excellent Talents in Universities (under Grant 2009R06).

978-1-4244-6337-4/10/$26.00 ©2010 IEEE

The main idea behind existing approaches can be described as two stages: 1) structure identification, which concerns partitioning the input space and determining the number of fuzzy rules for a specific performance, and 2) parameter estimation, which involves identifying the parameters of premises and consequents. As in the well-known ANFIS [4], the traditional design of the NFS assumes that particular membership functions have been defined in advance and that the number of fuzzy rules is determined a priori according to either expert knowledge or trial and error; the parameters are then modified by the hybrid or BP learning algorithm, which is known to be slow and easily trapped in local minima. A significant contribution was made by Platt [5] through the development of an algorithm named resource-allocating network (RAN) that adds hidden units to the network based on the novelty of the new data in the sequential learning process. Based on this learning scheme, Yingwei et al. [6] proposed improved RAN-based neural networks by using pruning methods whereby inactive hidden neurons can be detected and removed during the learning process. Hence, a compact neural network can be implemented. Other improvements of the RAN developed in [7], [8] take into consideration the pseudo-Gaussian (PG) function and orthogonal techniques, including QR factorization and singular value decomposition (SVD), and have been applied to time-series analysis. As variants of the geometric growing criteria [5], [9], some typical self-constructing paradigms have been proposed [10]. Chen et al. [11] proposed an orthogonal least squares (OLS) learning algorithm whereby both structure and parameter identification are conducted. Recently, a Dynamic Fuzzy Neural Network (DFNN) based on RBF neural networks has been developed in [12], in which not only can the parameters be adjusted by the linear least square (LLS) method but the structure can also be self-adapted via growing and pruning criteria. In [13], the DFNN is extended to a generalized DFNN (GDFNN) by introducing the ellipsoidal basis function (EBF).
Similar to the GDFNN, a self-organizing fuzzy neural network (SOFNN) [14] with a pruning strategy using the optimal brain surgeon (OBS) approach has been proposed to extract fuzzy rules online. The SOFNN based on Genetic Algorithms (SOFNNGA) [15] is designed to implement a Takagi-Sugeno (T-S) type fuzzy model by using the GA as a pruning method, which makes this approach unsuitable for the online learning paradigm. Lately, a fast and accurate online self-organizing scheme for parsimonious fuzzy neural networks (FAOS-PFNN) [16], [17] has been proposed to accelerate the learning speed and increase the approximation accuracy by incorporating a pruning strategy into new growth criteria. In the context of membership functions of the input variables, the asymmetric Gaussian function (AGF) [18], [19] has been presented to upgrade the learning ability and flexibility of the NFS, besides the popularly used standard and simplified Gaussian functions.

Unlike the foregoing NFSs, a dissymmetrical Gaussian function (DGF) is presented in this paper to extend symmetric Gaussian functions by permitting the input signal to be modeled by the DGF. We design a novel online learning scheme for realizing a TSK fuzzy system from available data. The system starts with no hidden units, recruits newly generated neurons when the criteria of rule generation are satisfied, and deletes insignificant hidden nodes according to the criteria of pruning rules, while the parameters in premises are updated simultaneously. The weights in consequents of the resulting structure are estimated online by the linear least square (LLS) method. To demonstrate the effectiveness and superiority of the proposed NFS, simulation studies are conducted on a benchmark problem in nonlinear dynamic system identification. Simulation results indicate that the proposed paradigm can build a more compact NFS online with better performance. Comprehensive comparisons with other popular approaches show that the overall performance of the proposed approach is superior to the others.

The rest of this paper is organized as follows. Section II briefly describes the architecture of the proposed NFS, and the learning algorithm is presented in Section III in detail. Section IV shows simulation studies of the proposed paradigm, including quantitative and qualitative performance comparisons with other typical algorithms. Section V summarizes the conclusion.

II. ARCHITECTURE OF THE PROPOSED NEURO-FUZZY SYSTEM

In this section, we present the structure of the proposed online self-organizing neuro-fuzzy system (OSNFS) shown in Fig. 1, which is a four-layered network. It should be noted that a novel concept of the generalized ellipsoidal basis function (GEBF) is presented and incorporated into the topology of the OSNFS, which realizes a TSK fuzzy inference system. The GEBF eliminates the symmetry restriction of the standard Gaussian membership function in each dimension and increases the flexibility of the widths for clusters in the input space. The overall fuzzy neural network can be described in the form of fuzzy rules given by

Rule $j$: IF $x_1$ is $A_{1j}(c_{1j}, \sigma^L_{1j}, \sigma^R_{1j})$ $\ldots$ $x_r$ is $A_{rj}(c_{rj}, \sigma^L_{rj}, \sigma^R_{rj})$ THEN $y = w_j(x_1, \ldots, x_r)$, $j = 1, 2, \ldots, u$. (1)

where $A_{ij}(c_{ij}, \sigma^L_{ij}, \sigma^R_{ij})$, $i = 1, 2, \ldots, r$, $j = 1, 2, \ldots, u$, is the fuzzy set of the ith input variable $x_i$ in the jth fuzzy rule; $c_{ij}$, $\sigma^L_{ij}$ and $\sigma^R_{ij}$ are the center, left width and right width of the corresponding fuzzy set, respectively; and $r$ and $u$ are the numbers of input variables and fuzzy rules, respectively.

Layer 1: Input Layer. No computation is performed on the input data in this layer.

Fig. 1. Architecture of the proposed OSNFS

Layer 2: Membership Function Layer. Let $\mu_{ij}$ be the membership function of the fuzzy set $A_{ij}$ in layer 2. The dissymmetrical Gaussian function (DGF) is used to define the membership function as follows:

Layer 3: Rule Node Layer. Each node in this layer represents a possible IF-part of fuzzy rules. If multiplication is selected as T-norm to calculate each rule's firing strength, the output of the jth rule Rj(j = 1, 2, ··· , u) can be calculated by the above mentioned GEBF given by

(3)

(4)
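For concreteness, one form of the DGF membership function, the GEBF rule output and the side-dependent width that is consistent with the distance in (9)-(10) and with the thresholds in (13) is sketched below. This is an assumption made for illustration, not necessarily the authors' exact expressions; in particular, assigning the boundary case $x_i = c_{ij}$ to the left width follows the convention of (36).

```latex
\mu_{ij}(x_i) = \exp\!\left(-\frac{(x_i - c_{ij})^2}{\sigma_{ij}^2(x_i)}\right),
\qquad
\sigma_{ij}(x_i) =
\begin{cases}
\sigma^L_{ij}, & x_i \le c_{ij} \\
\sigma^R_{ij}, & x_i > c_{ij}
\end{cases}

\varphi_j(x_1, \ldots, x_r) = \prod_{i=1}^{r} \mu_{ij}(x_i)
= \exp\!\left(-\sum_{i=1}^{r} \frac{(x_i - c_{ij})^2}{\sigma_{ij}^2(x_i)}\right)
```

Under this form, a firing strength of $\varepsilon$ corresponds to a distance of $\sqrt{\ln(1/\varepsilon)}$ from the center, which matches the way $d_{\max}$ and $d_{\min}$ are defined in (13).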

Layer 4: Output Layer. This layer has a single output node for multi-input and single-output (MISO) systems. However, the results can be readily applied to multi-input and multi-output (MIMO) systems, since a MIMO system can be decomposed into several MISO systems. The output is the weighted summation of incoming signals given by

$$y(x_1, \ldots, x_r) = \sum_{j=1}^{u} w_j \varphi_j \qquad (5)$$

where $w_j$ is the THEN-part of the jth rule and is a polynomial of the input variables given by

$$w_j = \alpha_{0j} + \alpha_{1j} x_1 + \cdots + \alpha_{rj} x_r, \quad j = 1, 2, \ldots, u \qquad (6)$$

and $\alpha_{0j}, \alpha_{1j}, \ldots, \alpha_{rj}$, $j = 1, 2, \ldots, u$, are the weights of input variables in the jth rule.
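The four layers can be sketched as a short forward pass. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian DGF of the form exp(-(x - c)^2 / sigma^2) with the left width used when x <= c, and the function names and array layouts (rules stored as columns) are hypothetical.

```python
import numpy as np

def gebf_firing(x, centers, sigma_left, sigma_right):
    """Layers 2-3: firing strength of each rule for one input vector.

    x: shape (r,); centers, sigma_left, sigma_right: shape (r, u).
    Assumed DGF: exp(-(x - c)^2 / sigma^2), left width when x <= c.
    """
    sigma = np.where(x[:, None] <= centers, sigma_left, sigma_right)
    z = (x[:, None] - centers) / sigma
    # Product T-norm over dimensions equals exp of the summed exponents.
    return np.exp(-np.sum(z ** 2, axis=0))

def osnfs_output(x, centers, sigma_left, sigma_right, alpha):
    """Layer 4: weighted sum (5) with TSK consequents (6).

    alpha: shape (r + 1, u); row 0 holds alpha_0j, rows 1..r hold alpha_ij.
    """
    phi = gebf_firing(x, centers, sigma_left, sigma_right)
    w = alpha[0] + x @ alpha[1:]  # w_j = alpha_0j + sum_i alpha_ij * x_i
    return float(w @ phi)
```

For a single rule centered at the origin with right width 2 in the second dimension, the input [0, 1] gives a firing strength of exp(-0.25), so the asymmetry of the widths directly shapes the rule's receptive field.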

III. LEARNING ALGORITHM OF THE PROPOSED OSNFS

In this section, the main idea behind the proposed OSNFS is presented. Consider the observations $(X_k, t_k)$, $k = 1, 2, \ldots, n$, where $n$ is the total number of training data pairs, and $X_k \in R^r$ and $t_k \in R$ are the kth input vector and the desired output, respectively. The output $y_k$ of the existing structure can be obtained by (1)-(6). In the learning process, suppose that the fuzzy neural network has generated $u$ GEBF hidden neurons in layer 3 for the kth observation.

A. Growing Criteria

1) System Errors: When the kth data pair $(X_k, t_k)$ arrives, the system error can be defined as follows:

(7)

If

(8)

a new GEBF membership function (MF) neuron should be created or some widths of the existing GEBF MF neurons should be modified for high performance. Otherwise, no new fuzzy rules will be recruited and only the weights in consequents of the existing fuzzy rules will be updated. Here, $k_e$ is a predefined threshold that decays during the learning process, where $e_{\max}$ is the maximum error chosen, $e_{\min}$ is the desired accuracy and $\beta \in (0, 1)$ is the convergence constant.
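The error-based growing test can be sketched as follows. The sketch assumes the absolute error |t_k - y_k| as the system error and a geometric decay schedule k_e = max(e_max * beta^(k-1), e_min); both are assumptions in the spirit of the description above, not forms confirmed by the text. The defaults e_max = 0.5 and e_min = 0.03 are the values used in Section IV, whereas beta = 0.97 is purely illustrative.

```python
def decaying_threshold(k, start, floor, rate):
    """Threshold for the k-th sample (k = 1, 2, ...).

    Decays geometrically from `start` and never drops below `floor`
    (assumed schedule).
    """
    return max(start * rate ** (k - 1), floor)

def error_criterion(t_k, y_k, k, e_max=0.5, e_min=0.03, beta=0.97):
    """Return True when the system error exceeds the current threshold k_e."""
    e_k = abs(t_k - y_k)  # assumed form of the system error
    return e_k > decaying_threshold(k, e_max, e_min, beta)
```

A decaying threshold of this kind tolerates coarse errors early in learning and demands finer accuracy as more samples arrive.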

2) Input Partitioning: A GEBF MF for a fuzzy rule is a local representation over a region defined in the input space. According to the $\varepsilon$-completeness, if the kth observation $(X_k, t_k)$ satisfies the criterion, the system does not create any new rules but accommodates the new learning data by tuning the parameters of the nearest GEBF unit. In order to realize this criterion, the distance $dist_{jk}$ between the newly arrived sample $X_k = [x_{1k}, x_{2k}, \ldots, x_{rk}]^T$ and the center $C_j = [c_{1j}, c_{2j}, \ldots, c_{rj}]^T$ of the jth GEBF unit can be obtained as follows:

$$dist_{jk}(X_k) = \sqrt{(X_k - C_j)^T \Sigma_j^{-1}(X_k) (X_k - C_j)} \qquad (9)$$

$$\Sigma_j(X_k) = \begin{pmatrix} \sigma_{1j}^2(x_{1k}) & 0 & \cdots & 0 \\ 0 & \sigma_{2j}^2(x_{2k}) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{rj}^2(x_{rk}) \end{pmatrix} \qquad (10)$$

where $\sigma_{ij}(x_{ik})$ is the width of the ith dimension in the jth GEBF unit and can be derived from (4).

For the kth observation, find the nearest GEBF unit given by

$$J = \arg\min_{j = 1, 2, \ldots, u} dist_{jk}(X_k) \qquad (11)$$

If

(12)

the existing GEBF MFs cannot partition the input space well and the $\varepsilon$-completeness is not yet satisfied. A new GEBF MF should be considered or the nearest GEBF unit should be modified for fine partitioning of the input space. Otherwise, no new fuzzy rules will be recruited and only the weights in consequents of existing fuzzy rules will be updated. Here, $k_d$ is a predefined threshold that decays during the learning process, where $d_{\max}$ and $d_{\min}$ are the maximum and minimum distance, respectively. According to the $\varepsilon$-completeness, they can be defined as follows:

$$d_{\max} = \sqrt{\ln(1/\varepsilon_{\min})}, \quad d_{\min} = \sqrt{\ln(1/\varepsilon_{\max})} \qquad (13)$$

where $\varepsilon_{\max}$ and $\varepsilon_{\min}$ are the maximum and minimum values chosen for the firing strength $\varepsilon$, respectively. In this paper, we choose $\varepsilon_{\max} = 0.8$ and $\varepsilon_{\min} = 0.5$. Moreover, $\gamma \in (0, 1)$ is the convergence constant.
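The distance-based criterion can be illustrated with a short sketch. The bounds follow (13) and the distance follows (9)-(10); the decay schedule k_d = max(d_max * gamma^(k-1), d_min), the value gamma = 0.95, the left-width convention for x <= c, and all function names are assumptions made for illustration.

```python
import numpy as np

def completeness_bounds(eps_max=0.8, eps_min=0.5):
    """Distance bounds of (13) derived from the firing-strength limits."""
    d_max = np.sqrt(np.log(1.0 / eps_min))
    d_min = np.sqrt(np.log(1.0 / eps_max))
    return d_max, d_min

def gebf_distances(x, centers, sigma_left, sigma_right):
    """Distance (9) from x, shape (r,), to every unit; parameters are (r, u)."""
    sigma = np.where(x[:, None] <= centers, sigma_left, sigma_right)
    return np.sqrt(np.sum(((x[:, None] - centers) / sigma) ** 2, axis=0))

def distance_criterion(x, centers, sigma_left, sigma_right, k, gamma=0.95):
    """Return (index of nearest unit, whether the distance test asks to grow)."""
    d_max, d_min = completeness_bounds()
    dist = gebf_distances(x, centers, sigma_left, sigma_right)
    nearest = int(np.argmin(dist))
    k_d = max(d_max * gamma ** (k - 1), d_min)  # assumed decay schedule
    return nearest, bool(dist[nearest] > k_d)
```

With $\varepsilon_{\max} = 0.8$ and $\varepsilon_{\min} = 0.5$, the bounds evaluate to roughly $d_{\max} \approx 0.833$ and $d_{\min} \approx 0.472$.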

B. Pruning Criteria

Consider (5) as a special case of the linear regression model, which can be described in the following compact form:

$$T = \Psi A + E \qquad (14)$$

$$\Psi = [\psi_1, \psi_2, \ldots, \psi_n]^T \qquad (15)$$

$$A = [a_0, a_1, \ldots, a_r]^T \qquad (16)$$

$$\psi_i = [\phi_i, \phi_i x_{1i}, \ldots, \phi_i x_{ri}]^T, \quad \phi_i = [\varphi_{1i}, \varphi_{2i}, \ldots, \varphi_{ui}] \qquad (17)$$

$$a_i = [\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{iu}], \quad i = 0, 1, \ldots, r \qquad (18)$$

where $T = [t_1, t_2, \ldots, t_n]^T \in R^n$ is the desired output vector, $A = [a_0, a_1, \ldots, a_r]^T \in R^v$ is the vector of weights with $v = u \times (r + 1)$, $\Psi \in R^{n \times v}$ is the matrix of the regressors, and $E = [e_1, e_2, \ldots, e_n]^T \in R^n$ is the error vector, which is assumed to be uncorrelated with the regressors.

For the matrix $\Psi$, if its row number is larger than the column number, we can transform it into a set of orthogonal basis vectors by QR decomposition,

$$\Psi = PQ \qquad (19)$$

where the matrix $P = [p_1, p_2, \ldots, p_v] \in R^{n \times v}$ has the same dimension as the matrix $\Psi$ with orthogonal columns and $Q \in R^{v \times v}$ is an upper triangular matrix. The orthogonality makes it feasible to compute the individual contribution of each rule to the desired output energy from each vector. Substituting (19) into (14) yields

$$T = PQA + E = PG + E \qquad (20)$$

where $G = [g_1, g_2, \ldots, g_v]^T = (P^T P)^{-1} P^T T \in R^v$ can be obtained by the linear least square (LLS) method. An ERR due to $p_i$ as defined in [2] is given by

$$err_i = \frac{(p_i^T T)^2}{p_i^T p_i \, T^T T}, \quad i = 1, 2, \ldots, v \qquad (21)$$

The equation of the ERR offers a simple and effective approach to evaluating the significance of each regressor (column) of the matrix $\Psi$. In order to define the significance of each fuzzy rule and the sensitivity of each variable for a fuzzy rule, another matrix $ERR = [\rho_1, \rho_2, \ldots, \rho_u] \in R^{(r+1) \times u}$ is introduced as follows:

$$ERR = [Err_1, Err_2, \ldots, Err_{r+1}]^T \qquad (22)$$

$$Err_j = [err_{(j-1)u+1}, err_{(j-1)u+2}, \ldots, err_{(j-1)u+u}]^T \qquad (23)$$

According to the matrix ERR, we define the significance $sig_j$ of the jth fuzzy rule given by

$$sig_j = \sqrt{\frac{\rho_j^T \rho_j}{r + 1}}, \quad j = 1, 2, \ldots, u \qquad (24)$$

A pruning criterion of existing fuzzy rules can then be obtained. If

$$sig_j < k_s, \quad j = 1, 2, \ldots, u \qquad (25)$$

the jth fuzzy rule is considered insignificant and will be removed from the system. Otherwise, no fuzzy rules will be deleted. Here, $k_s$ is a predefined threshold.
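The pruning computation in (14)-(25) can be sketched with NumPy. This is an illustrative sketch under stated assumptions: the regressor columns are ordered block by block (all rules for the bias term, then all rules for each input) to match (22)-(23), `numpy.linalg.qr` is used in its default reduced mode (so its orthonormal factor plays the role of P and its triangular factor the role of Q), and the function names are hypothetical.

```python
import numpy as np

def regressor_matrix(Phi, X):
    """Build the (n, u*(r+1)) regressor matrix from firing strengths and inputs.

    Phi: (n, u) firing strengths; X: (n, r) inputs.
    Block 0 holds Phi itself, block i holds Phi scaled by the i-th input.
    """
    n = X.shape[0]
    ext = np.hstack([np.ones((n, 1)), X])  # (n, r+1): constant term, then inputs
    return np.hstack([Phi * ext[:, [j]] for j in range(ext.shape[1])])

def error_reduction_ratios(Psi, T):
    """ERR (21) of every orthogonalized regressor; requires n >= v."""
    P, _ = np.linalg.qr(Psi)  # reduced QR: P is (n, v) with orthonormal columns
    return (P.T @ T) ** 2 / (np.sum(P * P, axis=0) * (T @ T))

def rule_significance(Psi, T, u):
    """Significance (24): column-wise RMS of the (r+1, u) ERR matrix."""
    ERR = error_reduction_ratios(Psi, T).reshape(-1, u)
    return np.sqrt(np.sum(ERR ** 2, axis=0) / ERR.shape[0])

def rules_to_prune(Psi, T, u, k_s=0.003):
    """Indices of rules whose significance falls below the threshold k_s (25)."""
    return np.flatnonzero(rule_significance(Psi, T, u) < k_s)
```

Because the ERR in (21) is invariant to the scaling of each $p_i$, using orthonormal rather than merely orthogonal columns does not change the result. Note that the ratios depend on the column order in which the orthogonalization proceeds.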

C. Parameter Adjustment

For an incoming sample $X_k = [x_{1k}, x_{2k}, \ldots, x_{rk}]^T$, suppose that some criteria of adding rules are satisfied, so that the center and the widths of the newly generated GEBF unit must be allocated in each dimension. Let $B_i = [x_{i,\min}, c_{i1}, c_{i2}, \ldots, c_{iu}, x_{i,\max}]^T$, $W_i^L = [\sigma_{i0}, \sigma^L_{i1}, \sigma^L_{i2}, \ldots, \sigma^L_{iu}, \sigma_{i0}]^T$, and $W_i^R = [\sigma_{i0}, \sigma^R_{i1}, \sigma^R_{i2}, \ldots, \sigma^R_{iu}, \sigma_{i0}]^T$ be the boundary vector, left width vector and right width vector of the input variable $x_i$, where $\Sigma_0 = [\sigma_{10}, \sigma_{20}, \ldots, \sigma_{r0}]^T$ is the initial width vector, which can be easily chosen by $\sigma_{i0} = (x_{i,\max} - x_{i,\min})/2$. The distance $d_{ik}$ between the input variable $x_{ik}$ and the boundaries contained in $B_i$ is calculated as follows:

$$d_{ik}(j) = |x_{ik} - B_i(j)|, \quad j = 1, 2, \ldots, u + 2 \qquad (26)$$

$$J_{ik} = \arg\min_{j = 1, 2, \ldots, u+2} d_{ik}(j) \qquad (27)$$

where $u$ is the number of the existing fuzzy rules. If

(28)

$x_{ik}$ can be represented well by the nearest DGF. Hence, the center $c_{i(u+1)}$, the left width $\sigma^L_{i(u+1)}$, and the right width $\sigma^R_{i(u+1)}$ in the ith dimension of the newly generated GEBF unit are chosen as follows:

$$c_{i(u+1)} = B_i(J_{ik}) \qquad (29)$$

$$\sigma^L_{i(u+1)} = W_i^L(J_{ik}), \quad \sigma^R_{i(u+1)} = W_i^R(J_{ik}) \qquad (30)$$

Otherwise, a new DGF should be created in the ith dimension. Assume that the input $x_{ik}$ falls into the interval $\mathbb{B}_i = [B_i(J_1), B_i(J_2)]$, which is the smallest interval containing the input. The premise assignments can be described by

(31)

$$\sigma^L_{i(u+1)} = \kappa\, d_{ik}(J_1), \quad \sigma^R_{i(u+1)} = \kappa\, d_{ik}(J_2) \qquad (32)$$

where $\kappa$ can be easily defined as follows:

$$\kappa = \frac{1}{\sqrt{\ln(1/\varepsilon)}} \qquad (33)$$
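The per-dimension allocation of a new unit can be sketched as below. Several pieces are assumptions made only for illustration: the reuse test is taken to be "distance to the nearest boundary is at most a constant k_m", the new center is taken to be the input itself, and the choice eps = 0.5 is arbitrary; only the width rule (32)-(33) and the reuse assignments (29)-(30) come from the text. The function name is hypothetical, and the input is assumed to lie within [x_min, x_max].

```python
import numpy as np

def allocate_dimension(x, bounds, w_left, w_right, k_m=0.3, eps=0.5):
    """Allocate (center, left width, right width) of a new unit in one dimension.

    bounds: B_i = [x_min, c_i1, ..., c_iu, x_max];
    w_left, w_right: W_i^L and W_i^R, aligned element by element with bounds.
    """
    if not bounds.min() <= x <= bounds.max():
        raise ValueError("x is assumed to lie within [x_min, x_max]")
    d = np.abs(x - bounds)
    j = int(np.argmin(d))  # nearest boundary, as in (27)
    if d[j] <= k_m:  # assumed form of the reuse test
        # Reuse the nearest existing DGF, as in (29)-(30).
        return float(bounds[j]), float(w_left[j]), float(w_right[j])
    kappa = 1.0 / np.sqrt(np.log(1.0 / eps))  # (33)
    lower = bounds[bounds <= x].max()  # left end of the smallest enclosing interval
    upper = bounds[bounds >= x].min()  # right end of the smallest enclosing interval
    # Assumed center: the input itself; widths follow (32).
    return float(x), float(kappa * (x - lower)), float(kappa * (upper - x))
```

Scaling each width by $\kappa$ times the distance to the neighboring boundary means the new membership function decays to the level $\varepsilon$ exactly at that neighbor, which is how the widths tie back to the completeness requirement.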

Let $C_J = [c_{1J}, c_{2J}, \ldots, c_{rJ}]^T$ be the center vector of the GEBF unit nearest to the incoming sample $X_k = [x_{1k}, x_{2k}, \ldots, x_{rk}]^T$, and let the corresponding left width vector and right width vector of the nearest GEBF unit be $\Sigma_J^L = [\sigma^L_{1J}, \sigma^L_{2J}, \ldots, \sigma^L_{rJ}]^T$ and $\Sigma_J^R = [\sigma^R_{1J}, \sigma^R_{2J}, \ldots, \sigma^R_{rJ}]^T$, respectively. In order to determine which dimensions of the Jth GEBF unit will be selected, the sensitivity $sen_J(i)$ of the ith input variable in the Jth fuzzy rule based on the ERR method is defined as follows:

$$sen_J(i) = \frac{\rho_J(i + 1)}{\sum_{k=2}^{r+1} \rho_J(k)} \qquad (34)$$

where $\rho_j$, $j = 1, 2, \ldots, u$, are the columns of the matrix $ERR = [\rho_1, \rho_2, \ldots, \rho_u] \in R^{(r+1) \times u}$ and $r$ is the number of input variables. If

(35)

the sensitivity of the ith input variable is less than the average level of all terms related to input variables in the Jth fuzzy rule, and the corresponding widths in the ith dimension should be decreased to enhance the local interpretation. Unlike existing width adjustments, the left width and right width of the DGF do not need to be adjusted simultaneously; only the side on which the incoming data lies is modified as follows:

$$\begin{cases} \sigma^L_{iJ} = k_w \sigma^L_{iJ}, & \text{if } x_{ik} \le c_{iJ} \\ \sigma^R_{iJ} = k_w \sigma^R_{iJ}, & \text{if } x_{ik} > c_{iJ} \end{cases} \qquad (36)$$

where $k_w \in (0, 1)$ is a predefined threshold. In addition, the LLS method provides a simple and efficient approach to determining the weights.
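The one-sided width adjustment can be sketched as follows. The sensitivity follows (34) and the update follows (36); the selection test "sensitivity below 1/r" is an assumed reading of "less than the average level", and the function name is hypothetical.

```python
import numpy as np

def adjust_widths(x, center, sigma_left, sigma_right, rho, k_w=0.9):
    """Shrink the widths of the nearest unit J on the side where the sample lies.

    x, center, sigma_left, sigma_right: shape (r,), parameters of unit J.
    rho: shape (r + 1,), column rho_J of the ERR matrix (entry 0 is the bias term).
    """
    r = x.shape[0]
    sens = rho[1:] / np.sum(rho[1:])  # sensitivity (34) of each input variable
    shrink = sens < 1.0 / r           # assumed test: below the average level
    on_left = x <= center             # side convention of (36)
    new_left = np.where(shrink & on_left, k_w * sigma_left, sigma_left)
    new_right = np.where(shrink & ~on_left, k_w * sigma_right, sigma_right)
    return new_left, new_right
```

Only one of the two widths changes in any selected dimension, so a unit can tighten toward the side that is receiving data while keeping its coverage on the other side.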

IV. SIMUL ATION STUDIES

As the simulation example, a nonlinear dynamic plant to be identified as given in [10], [11], [12], [13], [16] is described as follows:

$$y(k+1) = \frac{y(k)\, y(k-1)\, [y(k) + 2.5]}{1 + y^2(k) + y^2(k-1)} + u(k). \qquad (37)$$

In order to identify the above-mentioned nonlinear dynamic plant, a series-parallel identification model governed by the following equation is given by

y(k+1) = f(y(k),y(k-1))+u(k) , (38)

where the function f is realized by the proposed OSNFS with three inputs and one output. The input signal is generated by

$$u(k) = \sin(2\pi k / 25). \qquad (39)$$
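Training data for this benchmark can be generated directly from (37) and (39). The sketch below assumes zero initial conditions y(0) = y(1) = 0 and 200 training samples; both are assumptions for illustration, and the function name is hypothetical.

```python
import numpy as np

def simulate_plant(n_samples=200):
    """Generate (input, target) pairs from the plant (37) driven by (39).

    Each input row is [y(k), y(k-1), u(k)] and the target is y(k+1).
    """
    y = np.zeros(n_samples + 2)  # assumed initial conditions y(0) = y(1) = 0
    u = np.sin(2.0 * np.pi * np.arange(n_samples + 2) / 25.0)
    for k in range(1, n_samples + 1):
        y[k + 1] = (y[k] * y[k - 1] * (y[k] + 2.5)
                    / (1.0 + y[k] ** 2 + y[k - 1] ** 2) + u[k])
    X = np.column_stack([y[1:-1], y[:-2], u[1:-1]])  # rows for k = 1..n_samples
    T = y[2:]
    return X, T
```

Feeding the rows of `X` one at a time, together with the matching entry of `T`, reproduces the sequential setting in which rules are grown and pruned online.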

The parameters are set as follows: $e_{\max} = 0.5$, $e_{\min} = 0.03$, $\kappa = 1.75$, $k_m = 0.3$, $k_w = 0.9$ and $k_s = 0.003$. Simulation results are plotted in Figs. 2-5. From Fig. 2, we can see that the fuzzy rule nodes grow as the learning data arrive, and that only 7 fuzzy rules are finally involved in our OSNFS. Fig. 3 shows that the root mean squared error (RMSE) during online training converges to a very small value close to zero. The identification performance of the OSNFS is shown in Fig. 4, which indicates that the performance is satisfactory. Of the membership functions of the input variables y(t), y(t - 1) and u(t), only the DGF membership functions of u(t) are plotted, in order to save space. Furthermore, the performance comparisons of different approaches are listed in Table I, from which it can be concluded that the presented OSNFS provides the best performance with a more compact neuro-fuzzy system, although the FAOS-PFNN obtains the most parsimonious structure.

Fig. 2. Growth of rule nodes

Fig. 3. Root mean squared error (RMSE) during training

Fig. 4. Identification result

Fig. 5. Membership functions of the input variable u(t)

TABLE I
COMPARISONS OF THE PROPOSED OSNFS WITH OTHER ALGORITHMS

Algorithms        Rule Numbers   Parameter Numbers   RMSE
RBF-AFS [10]      35             280                 0.1384
OLS [11]          65             326                 0.0288
DFNN [12]         6              48                  0.0283
GDFNN [13]        8              56                  0.0108
FAOS-PFNN [16]    5              25                  0.0252
Our OSNFS         7              72                  0.0105

V. CONCLUSIONS

In this paper, we present a novel online self-organizing neuro-fuzzy system (OSNFS) which implements a TSK fuzzy inference system. In the online learning process, criteria of rule generation and the pruning strategy are presented to identify the structure of a neuro-fuzzy system by adding and deleting generalized ellipsoidal basis functions (GEBF) when sequential training data pairs arrive, and a novel online parameter allocation mechanism based on $\varepsilon$-completeness is developed to allocate the widths for each dimension of input variables. It should be noted that the premise adjustment is more flexible and simpler due to the characteristics of the dissymmetrical Gaussian function (DGF). The parameter estimation of consequents is performed on the resulting structure by using the linear least square (LLS) method. The effectiveness and superiority of the proposed OSNFS are demonstrated in nonlinear dynamic system identification. Simulation results show that our approach provides high performance and a comparably compact neuro-fuzzy system. Comprehensive comparisons with other popular approaches indicate that the overall performance of the proposed OSNFS is superior to the others in terms of parsimonious structure and high capability of approximation.

REFERENCES

[1] L.X. Wang, Adaptive Fuzzy Systems and Control: Design and Stability Analysis. Prentice-Hall, Englewood Cliffs, NJ, 1994.

[2] J.S.R. Jang, C.T. Sun and E. Mizutani, Neuro-Fuzzy and Soft Computing. Prentice-Hall, Englewood Cliffs, NJ, 1997.

[3] S. Mitra and Y. Hayashi, "Neuro-fuzzy rule generation: survey in soft computing framework," IEEE Trans. Neural Networks, vol. 11, pp. 748-768, 2000.

[4] J.S.R. Jang, "ANFIS: Adaptive-network-based fuzzy inference system," IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 665-684, 1993.

[5] J. Platt, "A resource-allocating network for function interpolation," Neural Comput., vol. 3, pp. 213-225, 1991.

[6] L. Yingwei, N. Sundararajan and P. Saratchandran, "A sequential learning scheme for function approximation using minimal radial basis function (RBF) neural networks," Neural Comput., vol. 9, pp. 461-478, 1997.

[7] I. Rojas, H. Pomares, J.L. Bernier, et al., "Time series analysis using normalized PG-RBF network with regression weights," Neurocomputing, vol. 42, pp. 267-285, 2002.

[8] M. Salmeron, J. Ortega, C.G. Puntonet and A. Prieto, "Improved RAN sequential prediction using orthogonal techniques," Neurocomputing, vol. 41, pp. 153-172, 2001.

[9] V. Kadirkamanathan and M. Niranjan, "A function estimation approach to sequential learning with neural networks," Neural Comput., vol. 5, pp. 954-975, 1993.

[10] K.B. Cho and B.H. Wang, "Radial basis function based adaptive fuzzy systems and their applications to system identification and prediction," Fuzzy Sets Syst., vol. 83, pp. 325-339, 1996.

[11] S. Chen, C.F.N. Cowan and P.M. Grant, "Orthogonal least squares learning algorithm for radial basis function network," IEEE Trans. Neural Networks, vol. 2, no. 2, pp. 302-309, 1991.

[12] S.Q. Wu and M.J. Er, "Dynamic fuzzy neural networks - a novel approach to function approximation," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 30, no. 2, pp. 358-364, 2000.

[13] S.Q. Wu, M.J. Er and Y. Gao, "A fast approach for automatic generation of fuzzy rules by generalized dynamic fuzzy neural networks," IEEE Trans. Fuzzy Syst., vol. 9, no. 4, pp. 578-594, 2001.

[14] G. Leng, T.M. McGinnity and G. Prasad, "An approach for on-line extraction of fuzzy rules using a self-organising fuzzy neural network," Fuzzy Sets Syst., vol. 150, pp. 211-243, 2005.

[15] G. Leng, T.M. McGinnity and G. Prasad, "Design for self-organizing fuzzy neural networks based on genetic algorithms," IEEE Trans. Fuzzy Systems, vol. 14, pp. 755-766, 2006.

[16] N. Wang, M.J. Er and X.Y. Meng, "A fast and accurate online self-organizing scheme for parsimonious fuzzy neural networks," Neurocomputing, vol. 72, pp. 3818-3829, 2009.

[17] N. Wang, X.Y. Meng and Q.Y. Xu, "A fast and parsimonious fuzzy neural network (FPFNN) for function approximation," in Proc. 48th IEEE Conf. Decision Contr. (CDC'09), Shanghai, China, Dec. 2009, pp. 4174-4179.

[18] C.S. Velayutham and S. Kumar, "Asymmetric subsethood-product fuzzy neural inference system (ASuPFuNIS)," IEEE Trans. Neural Networks, vol. 16, no. 1, pp. 160-174, 2005.

[19] C.P. Hsu, P.Z. Lin, T.T. Lee and C.H. Wang, "Adaptive asymmetric fuzzy neural network controller design via network structuring adaptation," Fuzzy Sets Syst., vol. 159, pp. 2627-2649, 2008.
