nonlinear system identification using two-dimensional wavelet-based state-dependent parameter models
International Journal of Systems Science
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tsys20

Nonlinear system identification using two-dimensional wavelet-based state-dependent parameter models
Nguyen-Vu Truong (a) & Liuping Wang (a)
(a) School of Electrical and Computer Engineering, RMIT University, Melbourne, VIC 3001, Australia
Published online: 28 Oct 2009.

To cite this article: Nguyen-Vu Truong & Liuping Wang (2009) Nonlinear system identification using two-dimensional wavelet-based state-dependent parameter models, International Journal of Systems Science, 40:11, 1161-1180, DOI: 10.1080/00207720902985419

To link to this article: http://dx.doi.org/10.1080/00207720902985419
International Journal of Systems Science, Vol. 40, No. 11, November 2009, 1161–1180

Nonlinear system identification using two-dimensional wavelet-based state-dependent parameter models
Nguyen-Vu Truong and Liuping Wang*
School of Electrical and Computer Engineering, RMIT University, Melbourne, VIC 3001, Australia
(Received 9 August 2008; final version received 8 April 2009)
This article presents a nonlinear system identification approach that uses a two-dimensional (2-D) wavelet-based state-dependent parameter (SDP) model. In this method, unlike our previous approach, the SDP is a function of two different state variables, which is realised through a 2-D wavelet series expansion. An optimised model structure is selected using a PRESS-based procedure in conjunction with orthogonal decomposition (OD) to avoid ill-conditioning problems associated with the parameter estimation. Two simulation examples are provided to demonstrate the merits of the proposed approach.

Keywords: nonlinear systems; identification for control; nonlinear system identification; PRESS statistics; multivariable nonlinearities
1. Introduction
The state-dependent parameter (SDP) model structure is a well-known and natural way to express nonlinear systems (Young 1993, 1998, 2001; Young, McKenna and Bruun 2001; Truong, Wang and Young 2006, 2007b; Truong, Wang and Huang 2007a; Truong and Wang 2008, 2009). This model structure is written as a linear regression in specified state variables (i.e. derivatives or lagged values of the input and output variables), multiplied by associated SDPs, which are functions of the respective state variables and characterise the nonlinearities.

Previous works on SDP estimation and modelling (Young 1993, 1998, 2001; Young et al. 2001; Truong et al. 2006, 2007a, 2007b; Truong and Wang 2008, 2009) consider only a specific SDP model structure that relies on a single state dependency. In the presence of significant interactions between the system's various input/output terms, a model of this type has limited applicability since it cannot represent the multivariable dependence of the system's nonlinear dynamics. Hence, it is valuable to extend the original SDP model from single state dependency to multivariable state dependency to capture such interacting multivariable nonlinearities.
The focus of this work is on the construction of an effective nonlinear system identification technique via the so-called two-dimensional (2-D) SDP (2-DSDP) model, including a systematic approach to the selection of a set of candidate model structures and the final determination of the optimal model itself. This particular model structure refers to a type of SDP model in which the SDP is a function of two different state variables. This, in turn, makes the SDP relationship a surface rather than a curve as in the single state dependency (1-D) case. The system identification task is then to solve the approximation problem for these 2-D functions within the structure of a dynamic 2-DSDP model.

Traditionally, a number of approaches to this problem are available in the open literature, employing various types of functions, such as polynomial, spline, kernel and other basis functions (Chen, Billings and Luo 1989; Savakis, Stoughton and Kanetkar 1989; Baudat and Anouar 2001; Gonzalez et al. 2003). In recent years, wavelets have been widely used due to their excellent localization properties in both time and frequency (Chui 1992; Meyer 1992; Mertin 1999). With these properties, in association with wavelet multiresolution decomposition, an arbitrary function can be well approximated at any level of regularity and to a desired accuracy by a small number of wavelet basis functions. This makes wavelet series expansion outperform many other approximation schemes (Chui 1992), especially in approximating complex functions or functions with sharp discontinuities. Thus, it has become an effective new tool for functional approximation.

The use of 2-D wavelets for nonlinear system identification has been studied (Liu, Billings and Kadirkamanathan 1998; Billings and Wei 2005); however, their application in the context of the 2-DSDP model is new. In the recent papers by Truong et al. (2006, 2007a, 2007b) and Truong and Wang (2008, 2009), a 1-D wavelet-based SDP nonlinear system model was proposed, in which 1-D wavelets are used for the parameterisation of the associated SDPs. This article extends the 1-DSDP model structure to a compact mathematical formulation in a 2-D context, in which the so-called 2-D wavelet series expansion is used for the approximation of the respective 2-DSDP relationships to form a class of nonlinear system models called 2-D wavelet-based SDP (2-DWSDP) models. This is conceptually straightforward: in essence, the proposed approach converts a complicated 2-DSDP estimation problem into a much simpler and computationally efficient implementation using wavelets. Models obtained in this manner are very compact and can thus be used in a much wider range of applications.

*Corresponding author. Email: [email protected]
ISSN 0020–7721 print/ISSN 1464–5319 online
© 2009 Taylor & Francis
DOI: 10.1080/00207720902985419
http://www.informaworld.com
Unlike the estimation of a linear time-invariant model, which has a limited range of candidate model structures, model structure determination in nonlinear system identification is a challenging task in its own right. First, the set of candidate model structures must be determined before the estimation. Here a novel approach is proposed based on the characteristics of wavelet functions. The selection of the scaling factors (finest and coarsest) for a wavelet series expansion is crucial. It determines the amount of information (i.e. the regressor matrix) to be included for the functional approximation, which in essence is related to the set of candidate models and to both the approximation performance and the computational efficiency of the model structure selection algorithm. In the 1-D case (Truong et al. 2006, 2007a, 2007b; Truong and Wang 2008), this information was obtained from the non-parametrically estimated SDP relationships, whereas in the 2-DWSDP model situation this information is not available. In this article, new results on the selection of those scaling parameters in the context of the 2-D wavelet series expansion and the 2-DWSDP model setting are developed to enhance the procedure of selecting candidate model structures. Second, based on this selected set of candidate model structures, the optimal structure of a 2-DWSDP model, along with its parameters, is chosen using the PRESS (Prediction Error Sum of Squares) criterion as described in our previous papers (Truong et al. 2006, 2007a, 2007b; Truong and Wang 2008, 2009). Furthermore, since orthogonal decomposition (OD) is used in the PRESS computation, it enables the algorithm to eliminate any numerically ill-conditioned terms within a given candidate model structure (Hong, Harris, Chen and Sharkey 2003a; Hong, Sharkey and Warwick 2003b, 2003c; Billings and Wei 2008). This further enhances the performance and efficiency of the model structure selection algorithm.

The structure of this article is as follows. Section 2 introduces and discusses the 2-D wavelet functional approximation as well as the associated 2-DWSDP model. The selection of candidate structures is discussed in Section 3. Section 4 describes the nonlinear model structure selection procedure using the PRESS criterion and forward regression, and summarises the identification procedure using the proposed approach. Section 5 presents two simulation examples to illustrate the efficiency of the proposed technique. Finally, Section 6 concludes this article.
2. 2-DWSDP model
It is assumed that a nonlinear system can be represented by the following 2-DSDP model:

$$y(k) = \sum_{q=1}^{n_y} f_q(x_{m_q,n_q})\, y(k-q) + \sum_{q=0}^{n_u} g_q(x_{l_q,p_q})\, u(k-q) + e(k) \tag{1}$$

where $u(k)$ and $y(k)$ are, respectively, the sampled input and output sequences, while $\{n_u, n_y\}$ refer to the maximum number of lagged inputs and outputs. The functions $f_q$, $g_q$ are dependent on $x_{m_q,n_q} = (x_{m_q}, x_{n_q} \mid m_q \neq n_q \in x)$ and $x_{l_q,p_q} = (x_{l_q}, x_{p_q} \mid l_q \neq p_q \in x)$, in which $x = \{y(k-1), \ldots, y(k-n_y), u(k), \ldots, u(k-n_u)\}$. As a result, they are regarded as 2-DSDPs. Finally, $e(k)$ refers to the noise variable, assumed initially to be a zero-mean, white noise process that is uncorrelated with the input $u(k)$ and its past values.

For example, a first-order 2-DSDP model representation of a nonlinear system can take the following form:

$$y(k) = f_1[y(k-1), u(k)]\, y(k-1) + g_0[u(k), u(k-1)]\, u(k) \tag{2}$$

Let $x = \{x_1, x_2, x_3\} = \{y(k-1), u(k), u(k-1)\}$. In this case, the 2-DSDPs $f_1$ and $g_0$ are dependent on $x_{1,2} = \{x_1, x_2\} = \{y(k-1), u(k)\}$ and $x_{2,3} = \{x_2, x_3\} = \{u(k), u(k-1)\}$, respectively.
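To make the structure of (2) concrete, the following minimal sketch simulates a noise-free first-order 2-DSDP model. The particular functions `f1` and `g0`, the input signal and the initial condition are hypothetical choices made only for illustration; they are not taken from the article.

```python
import numpy as np

def f1(y_prev, u_now):
    # Hypothetical 2-D state-dependent parameter f1[y(k-1), u(k)] (illustrative only)
    return 0.5 * np.exp(-y_prev**2) * np.cos(u_now)

def g0(u_now, u_prev):
    # Hypothetical 2-D state-dependent parameter g0[u(k), u(k-1)] (illustrative only)
    return 1.0 / (1.0 + u_now**2 + u_prev**2)

def simulate(u, y0=0.0):
    """Noise-free simulation of y(k) = f1[y(k-1), u(k)] y(k-1) + g0[u(k), u(k-1)] u(k)."""
    y = np.zeros(len(u), dtype=float)
    y[0] = y0
    for k in range(1, len(u)):
        y[k] = f1(y[k - 1], u[k]) * y[k - 1] + g0(u[k], u[k - 1]) * u[k]
    return y

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=200)   # assumed excitation signal
y = simulate(u)
```

Each coefficient multiplying a lagged variable is itself a surface over two state variables, which is what distinguishes this structure from the single state dependency (1-D) case.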
2.1. 2-D wavelet series expansion
In the case of 2-D wavelet series expansion, the wavelet basis function is no longer one-dimensional but varies with respect to two different variables, i.e. $x_1$ and $x_2$. To formulate a 2-D wavelet basis function $\psi^{[2]}(x_1, x_2)$, a natural approach is based on the tensor product of two 1-D wavelet functions $\psi(x_1)$ and $\psi(x_2)$ (Liu et al. 1998; Billings and Wei 2005) as follows:

$$\psi^{[2]}(x_1, x_2) = \psi(x_1)\,\psi(x_2) \tag{3}$$

For example, let $\psi(x)$ be a 1-D Mexican hat wavelet as described in the following equation:

$$\psi(x) = \begin{cases} (1 - x^2)\, e^{-0.5x^2} & \text{if } x \in (-4, 4) \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

Then its 2-D version (shown in Figure 1) takes the following form:

$$\psi^{[2]}(x_1, x_2) = \begin{cases} (1 - x_1^2)(1 - x_2^2)\, e^{-0.5(x_1^2 + x_2^2)} & \text{if } x_1, x_2 \in (-4, 4) \\ 0 & \text{otherwise} \end{cases} \tag{5}$$
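Equations (3)–(5) translate directly into code. The sketch below is one possible implementation of the truncated Mexican hat wavelet and its tensor-product 2-D version; the function names are illustrative and not part of the article.

```python
import numpy as np

S1, S2 = -4.0, 4.0  # support (s1, s2) of the truncated Mexican hat wavelet in (4)

def mexican_hat(x):
    """1-D Mexican hat wavelet of (4), zero outside the open interval (-4, 4)."""
    x = np.asarray(x, dtype=float)
    inside = (x > S1) & (x < S2)
    return np.where(inside, (1.0 - x**2) * np.exp(-0.5 * x**2), 0.0)

def mexican_hat_2d(x1, x2):
    """2-D wavelet of (5), built as the tensor product (3) of two 1-D wavelets."""
    return mexican_hat(x1) * mexican_hat(x2)
```

Because the 2-D function is a plain product, it is zero whenever either argument leaves the support, which is the property the translation libraries rely on.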
Let $f^{[2]}$ be the 2-DSDP relationship to be approximated with respect to two different state variables $(x_1, x_2)$, represented by a 2-D wavelet series expansion of the following form:

$$f^{[2]}(x_1, x_2) = \sum_{i=i_{\min}}^{i_{\max}} \sum_{j_1 \in L_{ix_1}} \sum_{j_2 \in L_{ix_2}} a_{i,j_1,j_2}\, \psi^{[2]}_{i,j_1,j_2}(x_1, x_2) \tag{6}$$

and

$$\psi^{[2]}_{i,j_1,j_2}(x_1, x_2) = \psi^{[2]}(2^{-i}x_1 - j_1,\; 2^{-i}x_2 - j_2) \tag{7}$$

Here, $\{a_{i,j_1,j_2}\}$ is the set of coefficients of the expansion; $i_{\min}$ and $i_{\max}$ correspond to the minimum and maximum scales used for the approximation of $f^{[2]}(x_1, x_2)$. $L_{ix_1}$, $L_{ix_2}$ (determined as in (8) and (9)) are the translation libraries with respect to $\psi(x_1)$, $\psi(x_2)$ at scale $i$, respectively. They are derived by using the compact support condition of the mother wavelet (see Section 3.1 for details):

$$L_{ix_1} = \{ j \in (2^{-i}x_{1\min} - s_2,\; 2^{-i}x_{1\max} - s_1),\; j \in \mathbb{Z} \} \tag{8}$$

$$L_{ix_2} = \{ j \in (2^{-i}x_{2\min} - s_2,\; 2^{-i}x_{2\max} - s_1),\; j \in \mathbb{Z} \} \tag{9}$$

where $(s_1, s_2)$ is the supporting range of the mother wavelet.¹ For example, for the Mexican hat wavelet as in (5), $s_1 = -4$ and $s_2 = 4$.
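The translation libraries in (8) and (9) are simply the integers lying strictly inside an interval fixed by the data range, the scale and the wavelet support. A minimal sketch follows; the function name and the example range $[-2, 3]$ are assumptions for illustration.

```python
import math

def translation_library(i, x_min, x_max, s1=-4.0, s2=4.0):
    """Integers j in the open interval (2^-i x_min - s2, 2^-i x_max - s1), as in (8)-(9)."""
    lo = 2.0**(-i) * x_min - s2
    hi = 2.0**(-i) * x_max - s1
    j_lo = math.floor(lo) + 1   # smallest integer strictly greater than lo
    j_hi = math.ceil(hi) - 1    # largest integer strictly less than hi
    return list(range(j_lo, j_hi + 1))

# Assumed data range x in [-2, 3] with the Mexican hat support (-4, 4)
lib_scale0 = translation_library(0, -2.0, 3.0)   # open interval (-6, 7)
lib_scale3 = translation_library(3, -2.0, 3.0)   # open interval (-4.25, 4.375)
```

For a 2-D term, the same function would be called once per state variable, giving $L_{ix_1}$ and $L_{ix_2}$ at each scale. The library shrinks as the scale $i$ grows because the dilated data range $2^{-i}x$ becomes narrower.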
Since $i_{\min}$ and $i_{\max}$ determine the set of terms used for the approximation, the next question to address is how to select these parameters in a 2-D wavelet-based context. This is discussed and illustrated in Section 3.
2.2. 2-DWSDP model formulation
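Based on this formulation, the model structure of (1) is parameterised using a 2-D wavelet series expansion as in (6), where the 2-DSDPs $f_q(x_{m_q,n_q})$ and $g_q(x_{l_q,p_q})$ can be approximated as

$$f_q(x_{m_q,n_q}) = \sum_{i=i_{\min}}^{i_{\max}} \sum_{j_1 \in L_{ix_{m_q}}} \sum_{j_2 \in L_{ix_{n_q}}} a_{f_q,i,j_1,j_2}\, \psi^{[2]}_{i,j_1,j_2}(x_{m_q,n_q}) \tag{10}$$

$$g_q(x_{l_q,p_q}) = \sum_{i=i_{\min}}^{i_{\max}} \sum_{j_1 \in L_{ix_{l_q}}} \sum_{j_2 \in L_{ix_{p_q}}} b_{g_q,i,j_1,j_2}\, \psi^{[2]}_{i,j_1,j_2}(x_{l_q,p_q}) \tag{11}$$

in which $\{L_{ix_{m_q}}, L_{ix_{n_q}}\}$ and $\{L_{ix_{l_q}}, L_{ix_{p_q}}\}$ correspond to the translation libraries with respect to $\{\psi(x_{m_q}), \psi(x_{n_q})\}$ and $\{\psi(x_{l_q}), \psi(x_{p_q})\}$ at scale $i$; $a_{f_q,i,j_1,j_2}$ and $b_{g_q,i,j_1,j_2}$ are the coefficients.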
Figure 1. 2-D Mexican hat wavelet function.
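Substituting (10) and (11) into (1), we obtain a 2-DWSDP model as follows:

$$\begin{aligned} y(k) ={}& \sum_{q=1}^{n_y} \left[ \sum_{i=i_{\min}}^{i_{\max}} \sum_{j_1 \in L_{ix_{m_q}}} \sum_{j_2 \in L_{ix_{n_q}}} a_{f_q,i,j_1,j_2}\, \psi^{[2]}_{i,j_1,j_2}(x_{m_q,n_q}) \right] y(k-q) \\ &+ \sum_{q=0}^{n_u} \left[ \sum_{i=i_{\min}}^{i_{\max}} \sum_{j_1 \in L_{ix_{l_q}}} \sum_{j_2 \in L_{ix_{p_q}}} b_{g_q,i,j_1,j_2}\, \psi^{[2]}_{i,j_1,j_2}(x_{l_q,p_q}) \right] u(k-q) + e(k) \end{aligned} \tag{12}$$

In this 2-DWSDP model, the parameters are the coefficients of the respective 2-D wavelets, i.e. $a_{f_q,i,j_1,j_2}$ and $b_{g_q,i,j_1,j_2}$. Given the wavelet basis functions, i.e. $\psi^{[2]}(x_1, x_2)$, and $\{n_y, n_u\}$ as well as the scaling parameters $\{i_{\min}, i_{\max}\}$, the next task is to formulate (12) as a linear-in-the-parameter regression equation, starting from the innermost summation ($j_2$) to the outermost summation ($q$).

Let the innermost coefficients and wavelet basis functions be represented in vector form as follows:

$$\left\{ \begin{aligned} \alpha_{f_q,j_1} &= \left[ a_{f_q,i,j_1,j_{2\min}}, \ldots, a_{f_q,i,j_1,j_{2\max}} \right]^T_{j_2 \in L_{ix_{n_q}}} \\ \Psi_{f_q,j_1}(k) &= \left[ \psi^{[2]}_{i,j_1,j_{2\min}}[x_{m_q,n_q}(k)], \ldots, \psi^{[2]}_{i,j_1,j_{2\max}}[x_{m_q,n_q}(k)] \right]_{j_2 \in L_{ix_{n_q}}} \end{aligned} \right\} \tag{13}$$

and

$$\left\{ \begin{aligned} \beta_{g_q,j_1} &= \left[ b_{g_q,i,j_1,j_{2\min}}, \ldots, b_{g_q,i,j_1,j_{2\max}} \right]^T_{j_2 \in L_{ix_{p_q}}} \\ \Psi_{g_q,j_1}(k) &= \left[ \psi^{[2]}_{i,j_1,j_{2\min}}[x_{l_q,p_q}(k)], \ldots, \psi^{[2]}_{i,j_1,j_{2\max}}[x_{l_q,p_q}(k)] \right]_{j_2 \in L_{ix_{p_q}}} \end{aligned} \right\} \tag{14}$$

Note that as defined in (13) and (14), $\alpha_{f_q,j_1}$ and $\beta_{g_q,j_1}$ are the parameter vectors, which are functions of $j_1$, while $\Psi_{f_q,j_1}(k)$ and $\Psi_{g_q,j_1}(k)$ are functions of $\{j_1, x_{m_q,n_q}(k)\}$ and $\{j_1, x_{l_q,p_q}(k)\}$, respectively. Then,

$$\sum_{j_2 \in L_{ix_{n_q}}} a_{f_q,i,j_1,j_2}\, \psi^{[2]}_{i,j_1,j_2}(x_{m_q,n_q}) = \Psi_{f_q,j_1}(k)\, \alpha_{f_q,j_1} \tag{15}$$

$$\sum_{j_2 \in L_{ix_{p_q}}} b_{g_q,i,j_1,j_2}\, \psi^{[2]}_{i,j_1,j_2}(x_{l_q,p_q}) = \Psi_{g_q,j_1}(k)\, \beta_{g_q,j_1} \tag{16}$$

As a result, (12) can be simplified into:

$$y(k) = \sum_{q=1}^{n_y} \left[ \sum_{i=i_{\min}}^{i_{\max}} \sum_{j_1 \in L_{ix_{m_q}}} \Psi_{f_q,j_1}(k)\, \alpha_{f_q,j_1} \right] y(k-q) + \sum_{q=0}^{n_u} \left[ \sum_{i=i_{\min}}^{i_{\max}} \sum_{j_1 \in L_{ix_{l_q}}} \Psi_{g_q,j_1}(k)\, \beta_{g_q,j_1} \right] u(k-q) + e(k) \tag{17}$$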
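Similarly, let

$$\left\{ \begin{aligned} A_{f_q,L_i} &= \left[ \alpha^T_{f_q,j_{1\min}}, \ldots, \alpha^T_{f_q,j_{1\max}} \right]^T_{j_1 \in L_{ix_{m_q}}} \\ Z_{f_q,L_i}(k) &= \left[ \Psi_{f_q,j_{1\min}}(k), \ldots, \Psi_{f_q,j_{1\max}}(k) \right]_{j_1 \in L_{ix_{m_q}}}\, y(k-q) \end{aligned} \right\} \tag{18}$$

and

$$\left\{ \begin{aligned} B_{g_q,L_i} &= \left[ \beta^T_{g_q,j_{1\min}}, \ldots, \beta^T_{g_q,j_{1\max}} \right]^T_{j_1 \in L_{ix_{l_q}}} \\ Z_{g_q,L_i}(k) &= \left[ \Psi_{g_q,j_{1\min}}(k), \ldots, \Psi_{g_q,j_{1\max}}(k) \right]_{j_1 \in L_{ix_{l_q}}}\, u(k-q) \end{aligned} \right\} \tag{19}$$

in which $L_i$ refers to the whole translation library at scale $i$. As defined in (18) and (19), $Z_{f_q,L_i}(k)$ and $Z_{g_q,L_i}(k)$ are functions of $\{x_{m_q,n_q}(k), y(k-q)\}$ and $\{x_{l_q,p_q}(k), u(k-q)\}$, respectively.

Substituting (18) and (19) into (17), $y(k)$ is expressed as

$$y(k) = \sum_{q=1}^{n_y} \left[ \sum_{i=i_{\min}}^{i_{\max}} Z_{f_q,L_i}(k)\, A_{f_q,L_i} \right] + \sum_{q=0}^{n_u} \left[ \sum_{i=i_{\min}}^{i_{\max}} Z_{g_q,L_i}(k)\, B_{g_q,L_i} \right] + e(k) \tag{20}$$

Now let

$$\left\{ \begin{aligned} A_q &= \left[ A^T_{f_q,L_{i_{\min}}}, \ldots, A^T_{f_q,L_{i_{\max}}} \right]^T \\ Z_{f_q}(k) &= \left[ Z_{f_q,L_{i_{\min}}}(k), \ldots, Z_{f_q,L_{i_{\max}}}(k) \right] \\ B_q &= \left[ B^T_{g_q,L_{i_{\min}}}, \ldots, B^T_{g_q,L_{i_{\max}}} \right]^T \\ Z_{g_q}(k) &= \left[ Z_{g_q,L_{i_{\min}}}(k), \ldots, Z_{g_q,L_{i_{\max}}}(k) \right] \end{aligned} \right\} \tag{21}$$

leading to the linear-in-the-parameter regression equation:

$$y(k) = \sum_{q=1}^{n_y} \left[ Z_{f_q}(k)\, A_q \right] + \sum_{q=0}^{n_u} \left[ Z_{g_q}(k)\, B_q \right] + e(k) \tag{22}$$

In Equation (22), the parameter matrices $A_q$ and $B_q$ are to be estimated from the experimental data, and $Z_{f_q}(k)$, $Z_{g_q}(k)$, which are the wavelet terms, are constructed from experimental input–output data.

To integrate (22) with measured input and output data, we assume that $y(0), y(1), \ldots, y(N-1)$ and $u(0), u(1), \ldots, u(N-1)$ are available. Define the data vectors

$$Y = [y(0), \ldots, y(N-1)]^T \tag{23}$$

$$U = [u(0), \ldots, u(N-1)]^T \tag{24}$$

$$\Xi = [e(0), \ldots, e(N-1)]^T \tag{25}$$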
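Equation (22) can then be written in matrix form as

$$Y = \sum_{q=1}^{n_y} Z_{f_q} A_q + \sum_{q=0}^{n_u} Z_{g_q} B_q + \Xi \tag{26}$$

where

$$\left\{ \begin{aligned} Z_{f_q} &= \left[ Z^T_{f_q}(0), \ldots, Z^T_{f_q}(N-1) \right]^T \\ Z_{g_q} &= \left[ Z^T_{g_q}(0), \ldots, Z^T_{g_q}(N-1) \right]^T \end{aligned} \right\} \tag{27}$$

Let us define

$$\left\{ \begin{aligned} A &= \left[ A_1^T, \ldots, A_{n_y}^T \right]^T \\ Z_f &= [Z_{f_1}, \ldots, Z_{f_{n_y}}] \\ B &= \left[ B_0^T, \ldots, B_{n_u}^T \right]^T \\ Z_g &= [Z_{g_0}, \ldots, Z_{g_{n_u}}] \end{aligned} \right\} \tag{28}$$

Substituting (28) into (26), we obtain

$$Y = Z_f A + Z_g B + \Xi \tag{29}$$

As a result, (12) is written in the matrix form as

$$Y = P\theta + \Xi \tag{30}$$

where $P$ is the data matrix and $\theta$ is the parameter vector to be estimated, and

$$\left\{ \begin{aligned} P &= [Z_f, Z_g] \\ \theta &= [A^T, B^T]^T \end{aligned} \right\} \tag{31}$$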
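The construction of the data matrix $P$ in (31) can be sketched for the first-order model (2). The code below is a minimal illustration, not the authors' implementation: the helper names, the scale range `i_min=1, i_max=2`, and the random placeholder records `u` and `y` are all assumptions, chosen only to show how each column of $P$ is a 2-D wavelet evaluated on a pair of state variables and multiplied by the associated lagged variable.

```python
import math
import numpy as np

S1, S2 = -4.0, 4.0  # Mexican hat support (s1, s2), as in (4)

def psi(x):
    x = np.asarray(x, dtype=float)
    return np.where((x > S1) & (x < S2), (1.0 - x**2) * np.exp(-0.5 * x**2), 0.0)

def library(i, x_min, x_max):
    # Integers strictly inside (2^-i x_min - s2, 2^-i x_max - s1), as in (8)-(9)
    lo, hi = 2.0**(-i) * x_min - S2, 2.0**(-i) * x_max - S1
    return range(math.floor(lo) + 1, math.ceil(hi))

def sdp_block(x1, x2, regressor, i_min, i_max):
    """Columns psi2_{i,j1,j2}(x1(k), x2(k)) * regressor(k) for one 2-DSDP term."""
    cols, labels = [], []
    for i in range(i_min, i_max + 1):
        for j1 in library(i, x1.min(), x1.max()):
            p1 = psi(2.0**(-i) * x1 - j1)
            for j2 in library(i, x2.min(), x2.max()):
                cols.append(p1 * psi(2.0**(-i) * x2 - j2) * regressor)
                labels.append((i, j1, j2))
    return np.column_stack(cols), labels

rng = np.random.default_rng(1)
u = rng.uniform(-1.0, 1.0, 300)   # placeholder input record (assumed)
y = rng.uniform(-1.0, 1.0, 300)   # placeholder output record, used only to show shapes

k = np.arange(1, len(u))          # rows start at k = 1 because of the one-step lag
Zf, lab_f = sdp_block(y[k - 1], u[k], y[k - 1], i_min=1, i_max=2)  # f1[y(k-1), u(k)] y(k-1)
Zg, lab_g = sdp_block(u[k], u[k - 1], u[k], i_min=1, i_max=2)      # g0[u(k), u(k-1)] u(k)
P = np.hstack([Zf, Zg])           # P = [Zf, Zg], as in (31)
```

Even for this small case the matrix has several hundred columns, which illustrates why the model is overparameterised before structure selection.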
3. Selection of candidate structures
One of the keys in nonlinear system identification is the effective selection of candidate structures. This is among the most challenging tasks because of the infinite number of possible combinations of nonlinear regression terms. Therefore, it is critical, as a first step, to reduce the set of candidate structures to a manageable size based on some known characteristics of the system under study. This reduces the computational load and improves the efficiency of the optimised model structure selection algorithm.

In the situation of 2-DSDP models as considered in this article, the finest and coarsest scaling parameters $i_{\min}$, $i_{\max}$ determine the set of terms, as well as their associated characteristics,² used for the approximation of the respective 2-DSDP relationship (e.g. $f_1(x_1, x_2)$) via a 2-D wavelet series expansion as described in Section 2.1. As a result, they play an important role in the selection of candidate model structures for the nonlinear system identification. If $i_{\min}$ and $i_{\max}$ are properly selected and a compactly supported mother wavelet is chosen, the set of candidate structures is limited and deterministic.

The aim of this section is to derive criteria to guide the selection of these parameters based on the available information obtained from the input–output data as well as the wavelet basis functions.

Based on the formulation of 2-D wavelets (3) as well as the 2-D wavelet series expansion (6), a 2-DSDP $f_1(x_1, x_2)$ can be represented as the following tensor product:

$$f_1(x_1, x_2) = h_1(x_1)\, h_2(x_2)$$

Furthermore, by approximating $h_1(x_1)$ and $h_2(x_2)$ via the 1-D wavelet series expansions

$$h_1(x_1) = \sum_{i=i_{x_1\min}}^{i_{x_1\max}} \sum_{j \in L_i} c_{h_1,i,j}\, \psi_{i,j}(x_1) \tag{32}$$

$$h_2(x_2) = \sum_{i=i_{x_2\min}}^{i_{x_2\max}} \sum_{j \in L_i} c_{h_2,i,j}\, \psi_{i,j}(x_2) \tag{33}$$

the problem is separated into two sub-problems in which we independently examine the selection of the scaling parameters, $[i_{x_1\min}, i_{x_1\max}]$ and $[i_{x_2\min}, i_{x_2\max}]$, for the wavelet series expansion of two unknown 1-D functions, $h_1(x_1)$ and $h_2(x_2)$. In this manner, the scaling parameters $i_{\min}$ and $i_{\max}$ used for the 2-D wavelet series expansion of $f_1(x_1, x_2)$ can be selected as:

$$i_{\min} = \mathrm{Min}(i_{x_1\min}, i_{x_2\min}) \tag{34}$$

$$i_{\max} = \mathrm{Max}(i_{x_1\max}, i_{x_2\max}) \tag{35}$$

Therefore, the question to be addressed here is how to select the associated finest and coarsest scaling parameters, $i_{x\min}$ and $i_{x\max}$, for the wavelet series expansion of an unknown 1-D function $f(x)$ based on the known characteristics of the state variable $x$ and the wavelet basis function $\psi(x)$.
3.1. On the selection of scaling parameters
A 1-D function $f(x)$ is represented by the following 1-D wavelet series expansion

$$f(x) = \sum_{i \in \mathbb{Z}} \sum_{j \in \mathbb{Z}} d_{i,j}\, \psi_{i,j}(x) \tag{36}$$

in which

$$\psi_{i,j}(x) = \psi(2^{-i}x - j) \tag{37}$$

By limiting the scaling factor $i$ to the range $(i_{x\min}, i_{x\max})$, Equation (36) is approximated by the following equation:

$$f(x) = \sum_{i=i_{x\min}}^{i_{x\max}} \sum_{j \in \mathbb{Z}} c_{i,j}\, \psi_{i,j}(x) \tag{38}$$

Furthermore, if $\psi(x)$ is compactly supported within $(s_1, s_2)$, the translation parameter $j$ is bounded by the inequality:

$$s_1 < 2^{-i}x - j < s_2 \tag{39}$$

This inequality is regarded as the compact support condition of the mother wavelet $\psi(x)$, given by

$$\begin{cases} \psi_{i,j}(x) \neq 0 & \text{if } s_1 < 2^{-i}x - j < s_2 \\ \psi_{i,j}(x) = 0 & \text{otherwise} \end{cases} \tag{40}$$

From (39), we have

$$2^{-i}x_{\min} - s_2 < 2^{-i}x - s_2 < j < 2^{-i}x - s_1 < 2^{-i}x_{\max} - s_1 \tag{41}$$

As a result,

$$2^{-i}x_{\min} - s_2 < j < 2^{-i}x_{\max} - s_1 \tag{42}$$

Define

$$L_i = \{ j \in (2^{-i}x_{\min} - s_2,\; 2^{-i}x_{\max} - s_1),\; j \in \mathbb{Z} \} \tag{43}$$

which is regarded as the translation library at scale $i$. As a result, we obtain

$$\begin{cases} \psi_{i,j}(x) \neq 0 & \text{if } j \in L_i \\ \psi_{i,j}(x) = 0 & \text{if } j \notin L_i \end{cases} \tag{44}$$

Using (44), Equation (38) is equivalent to the following equation:

$$f(x) = \sum_{i=i_{x\min}}^{i_{x\max}} \left[ \sum_{j \in L_i} c_{i,j}\, \psi_{i,j}(x) + \sum_{j \notin L_i} c_{i,j}\, \psi_{i,j}(x) \right] = \sum_{i=i_{x\min}}^{i_{x\max}} \sum_{j \in L_i} c_{i,j}\, \psi_{i,j}(x) \tag{45}$$

since

$$\psi_{i,j}(x) = 0 \quad \text{when } j \notin L_i \tag{46}$$
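The step from (38) to (45) can be checked numerically: over the data range, every dilated and translated wavelet whose index $j$ lies outside $L_i$ is identically zero, so dropping those terms does not change the expansion. The sketch below assumes the Mexican hat wavelet of (4), an illustrative data range $[-2, 3]$, and hypothetical helper names.

```python
import math
import numpy as np

S1, S2 = -4.0, 4.0  # support (s1, s2) of the Mexican hat wavelet in (4)

def psi(x):
    x = np.asarray(x, dtype=float)
    return np.where((x > S1) & (x < S2), (1.0 - x**2) * np.exp(-0.5 * x**2), 0.0)

def library(i, x_min, x_max):
    """Translation library L_i of (43): integers strictly inside the interval."""
    lo = 2.0**(-i) * x_min - S2
    hi = 2.0**(-i) * x_max - S1
    return list(range(math.floor(lo) + 1, math.ceil(hi)))

def expansion(x, coeffs, x_min, x_max):
    """Evaluate the truncated series (45); coeffs maps (i, j) to c_{i,j}."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for (i, j), c in coeffs.items():
        if j in library(i, x_min, x_max):   # terms with j outside L_i vanish on the data range
            total += c * psi(2.0**(-i) * x - j)
    return total

x_min, x_max = -2.0, 3.0                  # assumed data range for illustration
x = np.linspace(x_min, x_max, 501)
L1 = library(1, x_min, x_max)             # scale i = 1
outside = [psi(2.0**(-1) * x - j) for j in (-6, -5, 6, 7)]   # indices outside L_1
```

Here every array in `outside` is all zeros, while each index in `L1` gives a wavelet that is nonzero somewhere on the grid.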
Now the next question to be addressed is how to select the scaling parameters $[i_{x\min}, i_{x\max}]$.

Let us define the wavelet function library $L_W$ used for the functional approximation of $f(x)$ as

$$L_W = \{ \psi_{i,j}(x),\; (i_{x\min} \le i \le i_{x\max},\; i \in \mathbb{Z}) \text{ and } (j \in L_i) \} \tag{47}$$

$$L_{i_{x\max}} \subseteq L_{i_{x\max}-1} \subseteq \cdots \subseteq L_{i_{x\min}} \tag{48}$$

As shown earlier (see (43) and (47)), the wavelet function library $L_W$ is determined by the values of $\{x_{\min}, x_{\max}, s_1, s_2, i_{x\min}, i_{x\max}\}$, in which the interval $(s_1, s_2)$ is the supporting range of the mother wavelet $\psi(x)$ (Figure 2c), and the interval $(x_{\min}, x_{\max})$ is the range of the state variable $x$ (Figure 2a).

The selection of $[i_{x\min}, i_{x\max}]$ determines the amount (number of terms) and characteristics of the information included in the wavelet function library $L_W$ for the functional approximation. This is crucial as it is directly related to both the approximation performance and the computational efficiency of the model structure selection algorithm. In the following, we discuss and derive the criteria to guide the selection of $i_{x\min}$ and $i_{x\max}$ based on $\{x_{\min}, x_{\max}\}$ and $\{s_1, s_2\}$, which are given information and can be determined directly from the data and the mother wavelet $\psi(x)$.

Let $\mathrm{Int}(x)$ denote the integer part of a scalar variable $x$. We assume that $\psi(x)$ is chosen so that

$$\begin{cases} s_1 < 0 \text{ and } s_2 > 0 \\ |s_1 - \mathrm{Int}(s_1)| < 2^{-1} \\ |s_2 - \mathrm{Int}(s_2)| < 2^{-1} \end{cases}$$

and $s_1 \le x_{\min} < 0$ and $0 < x_{\max} \le s_2$.

Under these assumptions, criteria to guide the selection of $i_{x\min}$ and $i_{x\max}$ are developed as described in the following Lemma.

Lemma 1: Under the earlier assumptions, the following results hold:

$$\begin{cases} i_{x\min}, i_{x\max} \in \mathbb{Z}:\; i_{x\min} \le i_{x\max} \\ i_{x\min} > \mathrm{Max}\left( \dfrac{\log(x_{\max}/s_2)}{\log 2},\; \dfrac{\log(x_{\min}/s_1)}{\log 2} \right) \\ i_c \ge i_{x\max} > \mathrm{Max}\left( \dfrac{\log(2x_{\max})}{\log 2},\; \dfrac{\log|2x_{\min}|}{\log 2} \right) \end{cases}$$

where $i_c$ refers to the scaling parameter such that, for all $i \ge i_c$, $\psi(2^{-i}x)$ is assumed to be constant.
Proof: From (39) and (42), we have

$$s_1 < 2^{-i}x - j < s_2 \tag{49}$$

$$2^{-i}x_{\min} - s_2 < j < 2^{-i}x_{\max} - s_1 \tag{50}$$

where $j$ is the translation index, the interval $(s_1, s_2)$ is the wavelet's supporting range and the interval $(x_{\min}, x_{\max})$ is the range of the state variable $x$.

For this 2-D problem, we fix one dimension ($j$) to obtain the criterion for selecting $i_{x\min}$ and $i_{x\max}$. The translation indices $j$ are then obtained automatically from the respective selection of $i_{x\min}$ and $i_{x\max}$ as in (42).

From (50), $j$ is determined by

$$j \in L_i = \{ (2^{-i}x_{\min} - s_2,\; 2^{-i}x_{\max} - s_1),\; j \in \mathbb{Z} \} \supseteq \{0\}$$

Because (49) is true for all $j \in L_i$, when $j = 0$

$$s_1 < 2^{-i}x < s_2$$

This indicates

$$\mathrm{Max}\{2^{-i}x\} = 2^{-i}x_{\max} < s_2$$

$$\mathrm{Min}\{2^{-i}x\} = 2^{-i}x_{\min} > s_1$$

Because $\{x_{\max} > 0, s_2 > 0\}$ and $\{x_{\min} < 0, s_1 < 0\}$, then

$$i > \frac{\log(x_{\max}/s_2)}{\log 2} \quad \text{and} \quad i > \frac{\log(x_{\min}/s_1)}{\log 2}$$

Thus,

$$i > \mathrm{Max}\left( \frac{\log(x_{\max}/s_2)}{\log 2},\; \frac{\log(x_{\min}/s_1)}{\log 2} \right)$$

As a result,

$$i_{x\min} > \mathrm{Max}\left( \frac{\log(x_{\max}/s_2)}{\log 2},\; \frac{\log(x_{\min}/s_1)}{\log 2} \right) \tag{51}$$

Also, from (50), it is observed that

$$L_{i_M} = L_{i_M+1} = \cdots = L_{i_M+\infty}$$

if

$$2^{-i_M}x_{\max} < 2^{-1} \quad \text{and} \quad |2^{-i_M}x_{\min}| < 2^{-1}$$

or

$$i_M > \frac{\log(2x_{\max})}{\log 2} \quad \text{and} \quad i_M > \frac{\log|2x_{\min}|}{\log 2}$$

that is,

$$i_M > \mathrm{Max}\left( \frac{\log(2x_{\max})}{\log 2},\; \frac{\log|2x_{\min}|}{\log 2} \right)$$

Hence, $i_{x\max}$ can be chosen so that

$$i_{x\max} > \mathrm{Max}\left( \frac{\log(2x_{\max})}{\log 2},\; \frac{\log|2x_{\min}|}{\log 2} \right) \tag{52}$$

Figure 2. On the selection of scaling parameters.

Furthermore, when $i$ increases, $\psi(2^{-i}x)$ is widely stretched. Therefore, there exists $i_c \in \mathbb{Z}$ so that for all $i \ge i_c$, $\psi(2^{-i}x)$ is assumed to be constant. Since there is no benefit in continuing to add constant terms to the approximation function library, $i_c$ is chosen to be the upper bound for $i_{x\max}$. Consequently, criteria for selecting $i_{x\min}$ and $i_{x\max}$ are derived as

$$\begin{cases} i_{x\min}, i_{x\max} \in \mathbb{Z}:\; i_{x\min} \le i_{x\max} \\ i_{x\min} > \mathrm{Max}\left( \dfrac{\log(x_{\max}/s_2)}{\log 2},\; \dfrac{\log(x_{\min}/s_1)}{\log 2} \right) \\ i_c \ge i_{x\max} > \mathrm{Max}\left( \dfrac{\log(2x_{\max})}{\log 2},\; \dfrac{\log|2x_{\min}|}{\log 2} \right) \end{cases} \tag{53}$$

□
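As a brief numerical illustration of (53), consider an assumed state range $x \in [-2, 3]$ (not taken from the article) with the Mexican hat wavelet, so that $s_1 = -4$, $s_2 = 4$ and the assumptions $s_1 \le x_{\min} < 0 < x_{\max} \le s_2$ hold. The lower bound on $i_{x\min}$ is

$$\mathrm{Max}\left( \frac{\log(3/4)}{\log 2},\; \frac{\log(2/4)}{\log 2} \right) = \mathrm{Max}(-0.415,\; -1) = -0.415 \quad \Rightarrow \quad i_{x\min} \ge 0$$

and the lower bound on $i_{x\max}$ is

$$\mathrm{Max}\left( \frac{\log 6}{\log 2},\; \frac{\log 4}{\log 2} \right) = \mathrm{Max}(2.585,\; 2) = 2.585 \quad \Rightarrow \quad i_{x\max} \ge 3$$

Choosing the smallest admissible integers gives $[i_{x\min}, i_{x\max}] = [0, 3]$. At the coarsest of these scales, (43) gives $L_3 = \{ j \in (-4.25,\; 4.375),\; j \in \mathbb{Z} \} = \{-4, \ldots, 4\}$.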
The interpretation and application of the earlier results in the context of nonlinear system identification via 2-DWSDP models will be illustrated in the simulation examples (Section 5), particularly through Example 1.

Remark 2: The selection of $i_{x\min}$ and $i_{x\max}$ based on (53) automatically determines the translation indices $j$ beforehand through (42). This means that a fixed wavelet function library is deterministically established.

Remark 3: If $x_{\min} = 0$ or $x_{\max} = 0$, the log function in (53) is undefined. In such cases, without loss of generality, we replace $x_{\min}$ by a small negative value $-\varepsilon$ (e.g. $\varepsilon = 10^{-2}$), or replace $x_{\max}$ by a small positive value $\varepsilon$ (e.g. $\varepsilon = 10^{-2}$), to satisfy the assumptions.

Remark 4: If $x_{\min} \notin [s_1, 0]$ or $x_{\max} \notin [0, s_2]$, we can always rescale $x$ into $(s_1, s_2)$ to satisfy the assumptions.

Remark 5: Note that for a Mexican hat wavelet as in (4), $i_c$ is determined to be 5.
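As a result, the criteria to guide the selection of the scaling parameters $i_{\min}$ and $i_{\max}$ in the context of a 2-D wavelet series expansion of $f_1(x_1, x_2)$ can be derived as follows:

$$i_{\min} = \mathrm{Min}(i_{x_1\min}, i_{x_2\min}) \tag{54}$$

$$i_{\max} = \mathrm{Max}(i_{x_1\max}, i_{x_2\max}) \tag{55}$$

Applying (53), we obtain:

$$\begin{cases} i_{x_1\min}, i_{x_1\max} \in \mathbb{Z}:\; i_{x_1\min} \le i_{x_1\max} \\ i_{x_1\min} > \mathrm{Max}\left( \dfrac{\log(x_{1\max}/s_2)}{\log 2},\; \dfrac{\log(x_{1\min}/s_1)}{\log 2} \right) \\ i_c \ge i_{x_1\max} > \mathrm{Max}\left( \dfrac{\log(2x_{1\max})}{\log 2},\; \dfrac{\log|2x_{1\min}|}{\log 2} \right) \end{cases} \tag{56}$$

$$\begin{cases} i_{x_2\min}, i_{x_2\max} \in \mathbb{Z}:\; i_{x_2\min} \le i_{x_2\max} \\ i_{x_2\min} > \mathrm{Max}\left( \dfrac{\log(x_{2\max}/s_2)}{\log 2},\; \dfrac{\log(x_{2\min}/s_1)}{\log 2} \right) \\ i_c \ge i_{x_2\max} > \mathrm{Max}\left( \dfrac{\log(2x_{2\max})}{\log 2},\; \dfrac{\log|2x_{2\min}|}{\log 2} \right) \end{cases} \tag{57}$$

As a consequence,

$$\begin{cases} i_{\min}, i_{\max} \in \mathbb{Z}:\; i_{\min} \le i_{\max} \\ i_{\min} > \mathrm{Min}\left[ \mathrm{Max}\left( \dfrac{\log(x_{1\max}/s_2)}{\log 2},\; \dfrac{\log(x_{1\min}/s_1)}{\log 2} \right),\; \mathrm{Max}\left( \dfrac{\log(x_{2\max}/s_2)}{\log 2},\; \dfrac{\log(x_{2\min}/s_1)}{\log 2} \right) \right] \\ i_c \ge i_{\max} > \mathrm{Max}\left( \dfrac{\log(2x_{1\max})}{\log 2},\; \dfrac{\log|2x_{1\min}|}{\log 2},\; \dfrac{\log(2x_{2\max})}{\log 2},\; \dfrac{\log|2x_{2\min}|}{\log 2} \right) \end{cases} \tag{58}$$

This will be illustrated in the simulation examples (Section 5).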
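The criteria (53)–(58) are straightforward to evaluate from the data ranges. The sketch below picks the smallest integers that satisfy the strict inequalities, which is an assumption made for illustration (the criteria only give bounds), and uses hypothetical function names and example ranges. It applies the replacement of Remark 3 when an endpoint is exactly zero.

```python
import math

def scale_bounds_1d(x_min, x_max, s1=-4.0, s2=4.0, i_c=5, eps=1e-2):
    """Smallest integers (i_xmin, i_xmax) satisfying (53), with i_xmax capped at i_c."""
    # Remark 3: replace a zero endpoint so that the logarithms are defined
    x_min = -eps if x_min == 0 else x_min
    x_max = eps if x_max == 0 else x_max
    if not (s1 <= x_min < 0 < x_max <= s2):
        raise ValueError("assumptions s1 <= x_min < 0 < x_max <= s2 are violated")
    bound_min = max(math.log2(x_max / s2), math.log2(x_min / s1))
    bound_max = max(math.log2(2 * x_max), math.log2(abs(2 * x_min)))
    i_xmin = math.floor(bound_min) + 1            # smallest integer strictly above the bound
    i_xmax = max(math.floor(bound_max) + 1, i_xmin)
    return i_xmin, min(i_xmax, i_c)

def scale_bounds_2d(range_x1, range_x2, **kwargs):
    """Combine the two 1-D results as in (54) and (55)."""
    i1_min, i1_max = scale_bounds_1d(*range_x1, **kwargs)
    i2_min, i2_max = scale_bounds_1d(*range_x2, **kwargs)
    return min(i1_min, i2_min), max(i1_max, i2_max)

# Assumed example ranges: x1 in [-2, 3], x2 in [-0.5, 1]
i_min, i_max = scale_bounds_2d((-2.0, 3.0), (-0.5, 1.0))
```

With these example ranges the 1-D results are $[0, 3]$ for $x_1$ and $[-1, 2]$ for $x_2$, so the 2-D expansion would use scales from $-1$ to $3$.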
4. Model structure determination using PRESS
The 2-DWSDP model as derived in (30) is often overparameterised, as it may contain significant redundancies in the model representation. With these redundancies, the data matrix is often numerically ill-conditioned, leading to a number of disadvantages in both computation and efficiency associated with the parameter estimation.

The principle of a model structure determination algorithm lies in the selection of a final model structure which is simple but adequate to explain the essentials of the underlying system dynamics. The key here is to assess the significance of each term within the original overparameterised model based on a criterion, and to determine which terms need to be included in the final model.
An efficient model structure determination approach based on the PRESS criterion and forward regression has been studied in previous works (Truong et al. 2006, 2007a, 2007b; Truong and Wang 2008, 2009). This approach uses the incremental value of PRESS³ (ΔPRESS) as the criterion to detect the significance of each term within the overparameterised model: the maximum ΔPRESS signifies the most significant term, while the minimum reflects the least significant term. Based on this, the algorithm begins with the initial subset being the most significant term. It then grows to include the subsequent significant terms in a forward regression manner until a specified performance is achieved. Furthermore, since OD is used in the PRESS computation (Hong et al. 2003a, 2003b, 2003c; Billings and Wei 2008), it enables the algorithm to eliminate any numerical ill-conditioning associated with the parameter estimation. This further enhances the performance and efficiency of the model structure selection algorithm.

Upon determining the optimised nonlinear model structure for the overparameterised representation as in (30), the final identified model structure is generally found to be

$$y(k) = \sum_{q=1}^{n_y} \left[ \sum_{j=1}^{n_{f_q}} a_{q,j}\, \varphi^{[2]}_{q,j}(x_{m_q,n_q}) \right] y(k-q) + \sum_{q=0}^{n_u} \left[ \sum_{j=1}^{n_{g_q}} b_{q,j}\, \psi^{[2]}_{q,j}(x_{l_q,p_q}) \right] u(k-q) + e(k) \tag{59}$$
Similarly, we can write (59) in the following matrix form

$$Y = L\theta + \Xi \tag{60}$$

where

$$\begin{aligned} Y &= [y(0), y(1), \ldots, y(N-1)]^T \\ U &= [u(0), u(1), \ldots, u(N-1)]^T \\ \Xi &= [e(0), e(1), \ldots, e(N-1)]^T \\ A_q &= [a_{q,1}, a_{q,2}, \ldots, a_{q,n_{f_q}}] \\ B_q &= [b_{q,1}, b_{q,2}, \ldots, b_{q,n_{g_q}}] \\ \theta &= [A_1, A_2, \ldots, A_{n_y}, B_0, B_1, \ldots, B_{n_u}]^T \end{aligned} \tag{61}$$

$$\begin{aligned} L_{f_q}(k) &= \left[ \varphi^{[2]}_{q,1}[x_{m_q,n_q}(k)], \ldots, \varphi^{[2]}_{q,n_{f_q}}[x_{m_q,n_q}(k)] \right] y(k-q) \\ L_{g_q}(k) &= \left[ \psi^{[2]}_{q,1}[x_{l_q,p_q}(k)], \ldots, \psi^{[2]}_{q,n_{g_q}}[x_{l_q,p_q}(k)] \right] u(k-q) \\ L_k &= [L_{f_1}(k), \ldots, L_{f_{n_y}}(k), L_{g_0}(k), \ldots, L_{g_{n_u}}(k)]^T \\ L &= [L_0, \ldots, L_{N-1}]^T \end{aligned}$$

Now define the cost function

$$J = [Y - L\theta]^T [Y - L\theta] \tag{62}$$

and solve for the parameter vector $\theta$ that minimises $J$,

$$\theta = (L^T L)^{-1} L^T Y \tag{63}$$

This estimation is based on the least squares approach, which has optimal statistical properties if $e(k)$ is a zero-mean, normally distributed, white noise process independent of the input signal $u(k)$. The consistency of the parameter estimates will be numerically investigated and discussed through the simulation examples. However, depending upon the nature of the data, this assumption might not be applicable. In such a case, other estimation approaches might be necessary, such as an instrumental variable (IV) approach, which can be used for the parameter estimation in this model setting (Truong and Wang 2008).
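The least squares estimate (63) can be computed either literally through the normal equations or with a dedicated solver. The sketch below uses a random stand-in regressor matrix and hypothetical true parameters, purely to show the two routes agree; in practice `L` would hold the selected wavelet terms.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_terms = 200, 4
L = rng.normal(size=(N, n_terms))                 # stand-in regressor matrix (hypothetical data)
theta_true = np.array([0.8, -0.3, 0.5, 0.1])      # hypothetical parameters
Y = L @ theta_true + 0.01 * rng.normal(size=N)    # white, zero-mean noise as assumed for (63)

theta_ne = np.linalg.solve(L.T @ L, L.T @ Y)      # literal form of (63), without an explicit inverse
theta_ls, *_ = np.linalg.lstsq(L, Y, rcond=None)  # solver-based route, usually better conditioned
```

Solving the linear system rather than forming $(L^T L)^{-1}$ explicitly is a common numerical choice; the two results coincide when $L$ is well conditioned.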
4.1. Identification procedure
The overall nonlinear system identification using the proposed approach can be summarised in the following steps.

(1) Determining the 2-DSDP model's initial conditions. This includes the following:
  (a) Select the initial values, which normally start with lower values of $n_y$ and $n_u$.
  (b) Based on the available a priori knowledge, select the significant variables from all the candidate lagged output and input terms ($y(k-1), \ldots, y(k-n_y), u(k), \ldots, u(k-n_u)$) and the significant 2-D state dependencies ($f_q(x_{m_q,n_q})$, $g_q(x_{l_q,p_q})$) formulated by the selected significant variables. Note that the a priori knowledge can be some known structural characteristics, or be based on some hypothesis and assumption made about the system under study.
  (c) Otherwise, if there is no a priori knowledge available, all the possible variables as well as their associated possible 2-D dependencies for the selected model order ($n_y$ and $n_u$) need to be considered. For example, if $n_y = 1$ and $n_u = 1$, the possible variables are $y(k-1)$, $u(k)$ and $u(k-1)$, leading to the possible 2-D dependencies between: $\{y(k-1), u(k)\}$, $\{y(k-1), u(k-1)\}$ and $\{u(k), u(k-1)\}$.
(2) 2-DWSDP's optimised model structure selection. This involves the following steps:
  (a) Based on the features of the considered data and the selected wavelet basis function, determine the associated scaling parameters $[i_{\min}, i_{\max}]$ to be used for the 2-DSDP parameterisation using (58).
  (b) Formulate an overparameterised 2-DWSDP model by expanding all the 2-DSDPs (i.e. $f_q(x_{m_q,n_q})$, $g_q(x_{l_q,p_q})$) via 2-D wavelet series expansion using the selected scaling parameters $[i_{\min}, i_{\max}]$.
  (c) Using the PRESS-based selection algorithm, determine an optimised model structure from the candidate model terms.
(3) Final parametric optimisation.
  • Using the measured data, estimate the associated parameters via a least squares algorithm.
(4) Model validation.
  • If the values of $n_y$ and $n_u$ selected in Step 1 provide a satisfactory performance over the considered data, terminate the procedure.
  • Otherwise, increase the model's order, i.e. $n_y = n_y + 1$ and/or $n_u = n_u + 1$, and repeat Steps 1b, 2–4.
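Step 2(c) relies on the PRESS statistic. As a rough illustration only, the sketch below computes the ordinary leave-one-out PRESS of a linear-in-the-parameters model from its leverages, and the increase in PRESS caused by dropping each term. It is not the authors' forward-regression algorithm with orthogonal decomposition; the function names `press` and `delta_press` and the demonstration data are assumptions made for this example.

```python
import numpy as np

def press(L, Y):
    """Leave-one-out prediction error sum of squares for the model Y = L theta + e."""
    Q, _ = np.linalg.qr(L)                 # thin QR; assumes L has full column rank
    leverage = np.sum(Q**2, axis=1)        # diagonal of the hat matrix L (L^T L)^{-1} L^T
    residual = Y - Q @ (Q.T @ Y)           # ordinary least squares residuals
    return float(np.sum((residual / (1.0 - leverage))**2))

def delta_press(L, Y):
    """PRESS increase obtained by excluding each column (term) in turn."""
    full = press(L, Y)
    return np.array([press(np.delete(L, j, axis=1), Y) - full
                     for j in range(L.shape[1])])

# Hypothetical data: term 0 matters most, term 1 a little, term 2 not at all.
rng = np.random.default_rng(0)
L_demo = rng.normal(size=(300, 3))
Y_demo = 3.0 * L_demo[:, 0] + 0.5 * L_demo[:, 1] + 0.1 * rng.normal(size=300)

delta = delta_press(L_demo, Y_demo)
print(delta)   # the largest value marks the most significant term
```

The leverage formula avoids refitting the model N times, which is what makes PRESS practical as a selection criterion for large candidate libraries.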
5. Examples
To demonstrate the merits of the proposed approach, two examples are provided in this section. For simplicity, throughout this section, a form of 2-D Mexican hat wavelet function, as defined in (5), is used; it is very easy to calculate and has a very small computational load.

To facilitate the direct comparison between the estimated and actual nonlinear functions, we start with a simulation example. The second example studies the identification of a Continuous Stirred Tank Reactor (CSTR), which is among the most common types of chemical and petrochemical reactor. In these examples, the proposed technique is also compared with a polynomial-based approach.
5.1. Example 1
Consider a nonlinear system described by the following equation:

$$
y(k) = -y(k-1)^2 u(k)\, e^{-0.5[y(k-1)^2 + u(k)^2]} + u(k)\, u(k-1)^3\, e^{-0.5[u(k)^2 + u(k-1)^2]} + e(k) \tag{64}
$$

in which the input signal is $u(k) = \sin(k/50)$ and e(k) is a white noise sequence, uniformly distributed within [−0.045, 0.045].

With zero initial conditions, (64) is simulated to generate 1000 data samples for system identification, as shown in Figure 3.
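The data generation described above can be reproduced approximately with the sketch below. The function name `simulate_example1`, the choice of NumPy's random generator and the seed are assumptions; the noise realisation will therefore differ from the one used by the authors.

```python
import numpy as np

def simulate_example1(n_samples=1000, noise_bound=0.045, seed=0):
    """Simulate (64) from zero initial conditions and return (u, y) for k = 1..n_samples."""
    rng = np.random.default_rng(seed)
    k = np.arange(n_samples + 1)           # index 0 holds the zero initial condition
    u = np.sin(k / 50.0)                   # u(0) = 0
    e = rng.uniform(-noise_bound, noise_bound, size=n_samples + 1)
    y = np.zeros(n_samples + 1)            # y(0) = 0
    for i in range(1, n_samples + 1):
        y1, u0, u1 = y[i - 1], u[i], u[i - 1]
        y[i] = (-y1**2 * u0 * np.exp(-0.5 * (y1**2 + u0**2))
                + u0 * u1**3 * np.exp(-0.5 * (u0**2 + u1**2))
                + e[i])
    return u[1:], y[1:]

u_data, y_data = simulate_example1()
print(y_data.min(), y_data.max())
```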
With the assumption that there is no a priori knowledge available, a first-order 2-DSDP model ($n_y = 1$, $n_u = 1$) is used for the identification of the system. In this situation, the possible variables are $y(k-1)$, $u(k)$ and $u(k-1)$. This leads to the possible 2-D dependencies between: $\{y(k-1), u(k)\}$, $\{y(k-1), u(k-1)\}$ and $\{u(k), u(k-1)\}$. Consequently, the 2-DWSDP model structure used for the identification of this system is in the following form:

$$
y(k) = f_1[y(k-1), u(k)]\, y(k-1) + g_0[y(k-1), u(k-1)]\, u(k) + g_1[u(k), u(k-1)]\, u(k-1) \tag{65}
$$
5.1.1. Selection of scaling parameters
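Since (65) consists of three 2-DSDPs, $f_1[y(k-1), u(k)]$, $g_0[y(k-1), u(k-1)]$ and $g_1[u(k), u(k-1)]$, the scaling parameters for each respective 2-DSDP need to be determined.

First, the selection of the scaling parameters used for the 2-D wavelet series expansion of $f_1[y(k-1), u(k)]$ is considered. From (53), the following set of inequalities is obtained:

$$
\left\{
\begin{array}{l}
i_{y\min}, i_{y\max} \in \mathbb{Z}:\ i_{y\min} \le i_{y\max} \\
i_{y\min} > \operatorname{Max}\left(\dfrac{\log(y_{\max}/s_2)}{\log 2},\ \dfrac{\log(y_{\min}/s_1)}{\log 2}\right) \\
i_c \ge i_{y\max} > \operatorname{Max}\left(\dfrac{\log(2 y_{\max})}{\log 2},\ \dfrac{\log|2 y_{\min}|}{\log 2}\right)
\end{array}
\right\}
\tag{66}
$$

With $s_2 = 4$, $s_1 = -4$, $y_{\max} = \max[y(k-1)] = 0.5029$, $y_{\min} = \min[y(k-1)] = -\epsilon = -0.01$ and particularly $i_c = 5$, we obtain

$$
i_{y\min} > \operatorname{Max}\left(\frac{\log(0.5029/4)}{\log 2},\ \frac{\log(-10^{-2}/-4)}{\log 2}\right) = -2.99 \tag{67}
$$

$$
i_c \ge i_{y\max} > \operatorname{Max}\left(\frac{\log(2 \times 0.5029)}{\log 2},\ \frac{\log|-2 \times 10^{-2}|}{\log 2}\right) = 0.48 \tag{68}
$$

As a result,

$$
\left\{
\begin{array}{l}
i_{y\min}, i_{y\max} \in \mathbb{Z}:\ i_{y\min} \le i_{y\max} \\
i_{y\min} > -2.99 \\
5 = i_c \ge i_{y\max} > 0.48
\end{array}
\right\}
\tag{69}
$$

Similarly, with $u_{\max} = 1$ and $u_{\min} = -1$,

$$
\left\{
\begin{array}{l}
i_{u\min}, i_{u\max} \in \mathbb{Z}:\ i_{u\min} \le i_{u\max} \\
i_{u\min} > \operatorname{Max}\left(\dfrac{\log(1/4)}{\log 2},\ \dfrac{\log(1/4)}{\log 2}\right) = -2 \\
5 \ge i_{u\max} > \operatorname{Max}\left(\dfrac{\log 2}{\log 2},\ \dfrac{\log 2}{\log 2}\right) = 1
\end{array}
\right\}
\tag{70}
$$

Inequalities (69) and (70) lead to

$$
\left\{
\begin{array}{l}
-2 \le i_{y\min} \le i_{y\max} \\
1 \le i_{y\max} \le 5
\end{array}
\right\}
\tag{71}
$$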
and

$$
\left\{
\begin{array}{l}
-1 \le i_{u\min} \le i_{u\max} \\
2 \le i_{u\max} \le 5
\end{array}
\right\}
\tag{72}
$$
Inequalities (71) and (72) define the bounds for $i_{f_1\min} = \operatorname{Min}(i_{y\min}, i_{u\min})$ and $i_{f_1\max} = \operatorname{Max}(i_{y\max}, i_{u\max})$. The selection of $i_{f_1\min}$ and $i_{f_1\max}$ needs to be strictly within these bounds, for the following reasons:

• If $i_{f_1\max}$ is greater than its upper bound, a large number of unnecessary constant terms are added to the function library for the approximation of $f_1[y(k-1), u(k)]$.
• If $i_{f_1\min}$ is smaller than its lower bound, a larger number of unnecessary high-frequency wavelet terms are added to the function library. For example, choosing $i_{y\min} = i_{u\min} = -3$ in this example results in an extra 231 unnecessary 2-D wavelet terms being added to the function library for the approximation of $f_1[y(k-1), u(k)]$. As the model structure here consists of three 2-DSDPs, about 693 unnecessary extra candidate terms are added, leading to a significant growth in the overparameterised model. This directly affects the accuracy and efficiency of the model structure selection algorithm.

Using (71) and (72), let us choose $i_{y\min} = i_{u\min} = -1$ and $i_{y\max} = i_{u\max} = 2$. The finest and coarsest scaling factors used for the 2-D wavelet series expansion of $f_1[y(k-1), u(k)]$ are then $i_{f_1\min} = \operatorname{Min}(i_{y\min}, i_{u\min}) = -1$ and $i_{f_1\max} = \operatorname{Max}(i_{y\max}, i_{u\max}) = 2$.

Similarly, for the expansion of $g_0[y(k-1), u(k-1)]$ and $g_1[u(k), u(k-1)]$, $[i_{g_0\min}, i_{g_0\max}] = [i_{g_1\min}, i_{g_1\max}]$ is selected to be $[-1, 2]$. As a result, the overall finest and coarsest scaling factors used for the identification of this system are selected to be $[i_{\min}, i_{\max}] = [-1, 2]$. Using this information, the expansion of all the 2-DSDPs results in a total of 432 candidate model terms.
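The integer bounds in (71) and (72) can be checked with a few lines of arithmetic. The sketch below assumes the form of the inequalities in (66), with the support limits $s_1 = -4$, $s_2 = 4$ and $i_c = 5$ used in this example; the function name `scale_bounds` is hypothetical. Evaluated this way, the real-valued threshold for $i_{y\max}$ comes out near 0.008 rather than the 0.48 printed in (68), but the smallest admissible integer is 1 in both cases.

```python
import math

def scale_bounds(x_min, x_max, s1=-4.0, s2=4.0):
    """Return the real thresholds of (66) and the smallest integers strictly above them."""
    lower = max(math.log2(x_max / s2), math.log2(x_min / s1))
    upper = max(math.log2(2 * x_max), math.log2(abs(2 * x_min)))
    i_min_smallest = math.floor(lower) + 1   # smallest integer with i_min > lower
    i_max_smallest = math.floor(upper) + 1   # smallest integer with i_max > upper
    return lower, upper, i_min_smallest, i_max_smallest

# Output variable y(k-1): y_max = 0.5029, y_min = -0.01
y_bounds = scale_bounds(-0.01, 0.5029)
# Input variable u(k): u_max = 1, u_min = -1
u_bounds = scale_bounds(-1.0, 1.0)

print(y_bounds)   # integer bounds -2 and 1, matching (71)
print(u_bounds)   # integer bounds -1 and 2, matching (72)
```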
5.1.2. Identification results
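Using the PRESS-based selection algorithm to choose the significant model terms, the final identified model is found to be

$$
\begin{aligned}
y(k) ={}& 0.3727 \left[\psi^{[2]}_{0,1,-1}(x_1, x_2)\right]_{\{y(k-1),\,u(k)\}} y(k-1) \\
&+ \left[
\begin{array}{l}
1.0097\,\psi^{[2]}_{1,1,0}(x_1, x_2) \\
+\,0.2585\,\psi^{[2]}_{0,1,-1}(x_1, x_2) \\
+\,0.4769\,\psi^{[2]}_{0,-1,0}(x_1, x_2) \\
-\,0.0076\,\psi^{[2]}_{1,0,0}(x_1, x_2)
\end{array}
\right]_{\{u(k),\,u(k-1)\}} u(k-1)
\end{aligned}
\tag{73}
$$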
Figure 3. Example 1 data: (a) output and (b) input, plotted against the sampling index.
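in which

$$
\left[\psi^{[2]}_{i,j_1,j_2}(x_1, x_2)\right]_{\{c(k),\,d(k)\}} = \psi^{[2]}_{i,j_1,j_2}[c(k), d(k)] \tag{74}
$$

$$
\psi^{[2]}_{i,j_1,j_2}(x_1, x_2) = \psi^{[2]}(2^{-i}x_1 - j_1,\ 2^{-i}x_2 - j_2) \tag{75}
$$

$$
\psi^{[2]}(x_1, x_2) = (1 - x_1^2)(1 - x_2^2)\, e^{-0.5(x_1^2 + x_2^2)} \tag{76}
$$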
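To make the notation in (74)–(76) concrete, the sketch below evaluates the 2-D Mexican hat wavelet, its dilated and translated versions, and the two state-dependent parameters of the identified model (73). The function names are hypothetical, and the second bracket is multiplied by u(k−1), following the model structure (65) and the terms listed in Table 1.

```python
import numpy as np

def psi2(x1, x2):
    """2-D Mexican hat mother wavelet of (76)."""
    return (1 - x1**2) * (1 - x2**2) * np.exp(-0.5 * (x1**2 + x2**2))

def psi2_ij(i, j1, j2, x1, x2):
    """Dilated and translated wavelet of (75)."""
    return psi2(2.0**(-i) * x1 - j1, 2.0**(-i) * x2 - j2)

def f1_hat(y_prev, u_now):
    """Estimated 2-DSDP on {y(k-1), u(k)} from (73)."""
    return 0.3727 * psi2_ij(0, 1, -1, y_prev, u_now)

def g1_hat(u_now, u_prev):
    """Estimated 2-DSDP on {u(k), u(k-1)} from (73)."""
    return (1.0097 * psi2_ij(1, 1, 0, u_now, u_prev)
            + 0.2585 * psi2_ij(0, 1, -1, u_now, u_prev)
            + 0.4769 * psi2_ij(0, -1, 0, u_now, u_prev)
            - 0.0076 * psi2_ij(1, 0, 0, u_now, u_prev))

def model_output(y_prev, u_now, u_prev):
    """One-step output of the identified model (73)."""
    return f1_hat(y_prev, u_now) * y_prev + g1_hat(u_now, u_prev) * u_prev

print(model_output(0.2, 0.5, 0.48))
```

Feeding the model's own previous output back in as `y_prev` gives the iterative (simulated) output used for validation, whereas using the measured output gives the one-step prediction.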
Table 1 shows the incremental values of $\Delta\mathrm{PRESS}_k$ that result from excluding the associated terms from the model. As discussed earlier, this value reflects the significance of each term towards the model's parameterisation. The most significant term corresponds to the maximum $\Delta\mathrm{PRESS}_k$ (5.7681), and this is ranked 1 as shown in Table 1. The least significant term is reflected by the minimum $\Delta\mathrm{PRESS}_k$ (0.022), and this is ranked 5 in Table 1.

To validate the identified model (73), we generate a new data set, k = 1001–2000. Figure 4 shows the comparison between the model's iterative (simulated) output (see Note 4) and the actual noise-free output over the validation data set, as well as their associated residual. They are almost identical. Figure 5 compares the estimated 2-DSDPs ($\hat{f}_1(x_1, x_2)$ and $\hat{g}_1(x_1, x_2)$) to the actual functions ($f_1(x_1, x_2)$ and $g_1(x_1, x_2)$), which are very well matched to each other. These results, in turn, imply that the identified 5-term model (Equation (73)) characterises this system very well, in the sense that the actual system's dynamics are efficiently captured.

To further investigate the consistency property of the proposed approach in this particular example, a Monte Carlo simulation, which consists of 100 independent tests, has been implemented. In this test, the realisation of the noise is varied by changing the 'seed' element of the random noise generator from 0 to 99. In each independent test, a set of input–output data is generated by simulating (64), but with a different noise sequence. The results are tabulated in Table 2, which demonstrates that the parameter estimates
Figure 4. Example 1: (a) comparison between the actual output (solid) and model iterative output of (73) (dot-dot) over the validation set and (b) their associated residual, plotted against the sampling index.
Table 1. Example 1: $\Delta\mathrm{PRESS}$ table.

Term index k    Model's term                                      ΔPRESS_k    Rank
1               $\psi^{[2]}_{0,1,-1}[y(k-1), u(k)]\,y(k-1)$       0.1155      4
2               $\psi^{[2]}_{1,1,0}[u(k), u(k-1)]\,u(k-1)$        5.7681      1
3               $\psi^{[2]}_{0,1,-1}[u(k), u(k-1)]\,u(k-1)$       0.1383      3
4               $\psi^{[2]}_{0,-1,0}[u(k), u(k-1)]\,u(k-1)$       0.8027      2
5               $\psi^{[2]}_{1,0,0}[u(k), u(k-1)]\,u(k-1)$        0.022       5
obtained in this example are quite consistent and very
close to the noise free estimates.
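A Monte Carlo study of this kind can be sketched as follows. The sketch fixes the model structure to the five terms of Table 1, re-simulates (64) for seeds 0 to 99 and re-estimates the coefficients by least squares. The helper names, the NumPy random generator and the choice of mean and standard deviation as summary statistics are assumptions made for illustration; the noise realisations differ from those of the authors, so the numbers are not expected to reproduce Table 2 exactly.

```python
import numpy as np

def psi2_ij(i, j1, j2, x1, x2):
    """Dilated/translated 2-D Mexican hat wavelet, as in (75) and (76)."""
    a = 2.0**(-i) * x1 - j1
    b = 2.0**(-i) * x2 - j2
    return (1 - a**2) * (1 - b**2) * np.exp(-0.5 * (a**2 + b**2))

def simulate(seed, n=1000, bound=0.045):
    """Simulate (64) with zero initial conditions; index 0 is the initial condition."""
    rng = np.random.default_rng(seed)
    u = np.sin(np.arange(n + 1) / 50.0)
    e = rng.uniform(-bound, bound, size=n + 1)
    y = np.zeros(n + 1)
    for k in range(1, n + 1):
        y[k] = (-y[k-1]**2 * u[k] * np.exp(-0.5 * (y[k-1]**2 + u[k]**2))
                + u[k] * u[k-1]**3 * np.exp(-0.5 * (u[k]**2 + u[k-1]**2))
                + e[k])
    return u, y

def regressors(u, y):
    """Build the five regressors of Table 1 and the matching output vector."""
    y1, u0, u1 = y[:-1], u[1:], u[:-1]     # y(k-1), u(k), u(k-1)
    cols = [psi2_ij(0, 1, -1, y1, u0) * y1,
            psi2_ij(1, 1, 0, u0, u1) * u1,
            psi2_ij(0, 1, -1, u0, u1) * u1,
            psi2_ij(0, -1, 0, u0, u1) * u1,
            psi2_ij(1, 0, 0, u0, u1) * u1]
    return np.column_stack(cols), y[1:]

estimates = np.array([
    np.linalg.lstsq(*regressors(*simulate(seed)), rcond=None)[0]
    for seed in range(100)
])
print(estimates.mean(axis=0))
print(estimates.std(axis=0))
```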
5.1.3. In comparison to a polynomial-based approach
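For comparison, a polynomial-based approach is used to parameterise the respective 2-DSDP relationship.

Using this approach, the above system is identified as

$$
\begin{aligned}
y(k) ={}& \left[
\begin{array}{l}
0.6578\,x_1 x_2^4 + 0.0781\,x_2^2 \\
-\,0.3156\,x_2^4 + 0.6541
\end{array}
\right]_{\{y(k-1),\,u(k)\}} y(k-1) \\
&+ \left[
\begin{array}{l}
0.9113\,x_1^4 x_2^3 + 0.2526\,x_1^3 \\
-\,1.0414\,x_1^3 x_2^4 - 0.0084
\end{array}
\right]_{\{u(k),\,u(k-1)\}} u(k-1)
\end{aligned}
\tag{77}
$$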
To facilitate the comparison between the proposed method and this polynomial-based approach, a measurement index, the absolute error, is used:

$$
E_k = \left| y(k) - \hat{y}(k) \right| \tag{78}
$$

in which $y(k)$ and $\hat{y}(k)$ correspond to the actual noise-free data and the model's simulated output over the testing data set.

Let us denote by $E_k^{\mathrm{wavelet}}$ and $E_k^{\mathrm{poly}}$ the absolute errors resulting from using (73) and (77), respectively. They are compared in Figure 6, in which $E_k^{\mathrm{wavelet}}$ (solid line) is much smaller than $E_k^{\mathrm{poly}}$ (dot-dot line). This implies that, in this example, the proposed approach is more advantageous than the considered polynomial-based approach.

Figure 7 compares the functions estimated using the polynomial approach, $\hat{f}_1^{\mathrm{poly}}(x_1, x_2)$ and $\hat{g}_1^{\mathrm{poly}}(x_1, x_2)$, with the actual functions $f_1(x_1, x_2)$ and $g_1(x_1, x_2)$. The gap between $\hat{f}_1^{\mathrm{poly}}(x_1, x_2)$ and $f_1(x_1, x_2)$ shown in Figure 7(a) indicates significant bias in the parameter estimates for the polynomial-based model (77) under the considered noise level.

Another disadvantage of this polynomial approach is demonstrated in Figure 7(b), in which $\hat{g}_1^{\mathrm{poly}}(x_1, x_2)$ exhibits significant oscillatory and overshoot behaviour. This is a limitation of high-order polynomials in approximating complicated functions like the ones considered in this example. By contrast, the proposed wavelet-based approach has provided very well localised solutions which closely approximate the actual 2-D dependencies (Figure 5). That is due to the excellent
Figure 5. Example 1: (a) $f_1(x_1, x_2)$ (solid) vs. $\hat{f}_1(x_1, x_2)$ (dot-dash) and (b) $g_1(x_1, x_2)$ (solid) vs. $\hat{g}_1(x_1, x_2)$ (dot-dash).
Table 2. Example 1: Monte Carlo test.

Term index k    Noise-free estimate    Noise-disturbed estimate
1               0.3667                 0.3657     0.0237
2               1.0115                 1.0116     0.0103
3               0.2663                 0.2665     0.0165
4               0.4843                 0.4830     0.0132
5               −0.0041                −0.0042    0.0062
localisation properties of wavelet basis functions.
Additionally, the bounded characteristics of wavelet
basis functions can be very useful for the stability
analysis of the identified models using the proposed
approach.
5.2. Example 2
The three most commonly used reactors are batch or semi-batch reactors (BR), CSTRs, and tubular or plug flow reactors (PR). In this example, the identification of a CSTR is studied.
Figure 6. Example 1: $E_k^{\mathrm{wavelet}}$ (solid) vs. $E_k^{\mathrm{poly}}$ (dot-dot), plotted against the sampling index.
Figure 7. Example 1: (a) $f_1(x_1, x_2)$ (solid) vs. $\hat{f}_1^{\mathrm{poly}}(x_1, x_2)$ (dot-dash) and (b) $g_1(x_1, x_2)$ (solid) vs. $\hat{g}_1^{\mathrm{poly}}(x_1, x_2)$ (dot-dash).
The CSTR is the most commonly used type of reactor in chemical and petrochemical plants. It consists of a tank, a stirring mechanism and feed pumps (Figure 8). Within a CSTR, two chemicals are mixed and reacted to produce a product compound at a concentration of $C_a(t)$ and a mixture temperature of $T(t)$. This reaction is irreversible and exothermic, occurring in a constant volume reactor that is cooled by a single coolant stream at a flow rate of $q_c(t)$. The coolant flow rate varies the heat produced from the reaction, and thus influences the product concentration. This process is highly nonlinear and its mathematical model is given as a set of differential equations:

$$
\begin{aligned}
\dot{C}_a(t) &= \frac{q}{v}\left[C_{a0} - C_a(t)\right] - k_0 C_a(t)\, e^{-E/RT(t)} \\
\dot{T}(t) &= \frac{q}{v}\left[T_0 - T(t)\right] + k_1 C_a(t)\, e^{-E/RT(t)} + k_2 q_c(t)\left[1 - e^{-k_3/q_c(t)}\right]\left[T_{c0} - T(t)\right]
\end{aligned}
\tag{79}
$$

in which $C_{a0}$ is the inlet feed concentration, $q$ denotes the process flow rate, and $T_0$ and $T_{c0}$ represent the inlet feed and coolant temperatures, respectively. These parameters are assumed to be constant at their nominal values. $E/R$, $v$, $k_1 = -\Delta H k_0/\rho C_p$, $k_2 = \rho_c C_{pc}/\rho C_p v$ and $k_3 = h_a/\rho_c C_{pc}$ are thermodynamic and chemical constants relating to this particular problem. The nominal values for this plant are given in Table 3.

This system set-up has been studied in Lightbody and Irwin (1997), where a neural network was used to model the plant as a third-order nonlinear model, i.e.

$$
C_a(k) = f[C_a(k-1), C_a(k-2), C_a(k-3), q_c(k-1), q_c(k-2), q_c(k-3)] \tag{80}
$$

In this example, using the same system set-up, we demonstrate that the plant dynamics can be captured well and represented in a compact manner using a simpler first-order 2-DWSDP model. This illustrates the effectiveness and advantages of the developed approach.
5.2.1. Identification results
In this study, the identification data is obtained by simulating (79) using the nominal values tabulated in Table 3. With the input $q_c(k)$ varied between $q_{c\min} = 90$ L/min and $q_{c\max} = 111$ L/min (Figure 9b), and a sampling interval of $\Delta t = 0.1$ min, 750 min worth of simulated data (7500 samples) is obtained as shown in Figure 9 (which can also be obtained from De Moor (2007)). For ease of system identification, these input and output signals $\{q_c(k), C_a(k)\}$ are then standardised and still designated as $\{q_c(k), C_a(k)\}$ (i.e. $q_c = \frac{q_c - \mathrm{Mean}(q_c)}{\mathrm{Std}(q_c)}$ and $C_a = \frac{C_a - \mathrm{Mean}(C_a)}{\mathrm{Std}(C_a)}$).

The 7500 data points were divided into two sets: the estimation set, consisting of the first 6000 data points, and the validation set, consisting of the remaining 1500 data points for model testing.

Using a first-order 2-DWSDP model for this system, with the finest and coarsest scaling parameters chosen to be −1 and 3, the final identified model is found to be:

$$
\begin{aligned}
C_a(k) ={}& \left[
\begin{array}{l}
0.8685\,\psi^{[2]}_{3,0,0}(x_1, x_2) \\
+\,0.5903\,\psi^{[2]}_{2,1,1}(x_1, x_2) \\
+\,0.0117\,\psi^{[2]}_{0,4,2}(x_1, x_2) \\
+\,0.2622\,\psi^{[2]}_{1,1,-1}(x_1, x_2) \\
+\,0.0568\,\psi^{[2]}_{-1,9,5}(x_1, x_2)
\end{array}
\right]_{\{C_a(k-1),\,q_c(k-1)\}} C_a(k-1) \\
&+ \left[
\begin{array}{l}
0.1482\,\psi^{[2]}_{3,0,0}(x_1, x_2) \\
+\,0.2241\,\psi^{[2]}_{2,0,-1}(x_1, x_2) \\
+\,0.1092\,\psi^{[2]}_{1,0,1}(x_1, x_2) \\
+\,0.0488\,\psi^{[2]}_{0,2,1}(x_1, x_2) \\
+\,0.0931\,\psi^{[2]}_{1,1,1}(x_1, x_2)
\end{array}
\right]_{\{C_a(k-1),\,q_c(k-1)\}} q_c(k-1)
\end{aligned}
\tag{81}
$$
Figure 8. A CSTR’s schematic representation.
Table 3. CSTR parameters.

Parameter           Description                   Nominal value
$q$                 Process flow rate             100 L/min
$v$                 Reactor volume                100 L
$k_0$               Reaction rate constant        7.2 × 10^10 min^−1
$E/R$               Activation energy             1 × 10^4 K
$T_0$               Feed temperature              350 K
$T_{c0}$            Inlet coolant temperature     350 K
$\Delta H$          Heat of reaction              −2 × 10^5 cal/mol
$C_p$, $C_{pc}$     Specific heats                1 cal/g/K
$\rho$, $\rho_c$    Liquid densities              1 × 10^3 g/L
$h_a$               Heat transfer coefficient     7 × 10^5 cal/min/K
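where

$$
\left[\psi^{[2]}_{i,j_1,j_2}(x_1, x_2)\right]_{\{c(k),\,d(k)\}} = \psi^{[2]}_{i,j_1,j_2}[c(k), d(k)] \tag{82}
$$

$$
\psi^{[2]}_{i,j_1,j_2}(x_1, x_2) = \psi^{[2]}(2^{-i}x_1 - j_1,\ 2^{-i}x_2 - j_2) \tag{83}
$$

$$
\psi^{[2]}(x_1, x_2) = (1 - x_1^2)(1 - x_2^2)\, e^{-0.5(x_1^2 + x_2^2)} \tag{84}
$$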
Figure 10(a) compares the predicted output of the model (Equation (81)), which is recovered to its original amplitude by de-standardisation, with the actual output over the estimation set; their associated residual is shown in Figure 10(b). Figure 11 compares the model's iterative (simulated) output to the actual output signal over the whole data set. This demonstrates that the identified first-order 2-DWSDP model (10 terms) characterises the dynamic behaviour of this CSTR very well. The simplicity and effectiveness of this model make it attractive for future applications, such as the design of a nonlinear control system.
5.2.2. In comparison to a polynomial-based approach
As in the previous example, we provide a comparison between the proposed approach and a polynomial-based approach in which a polynomial function is used to parameterise the respective 2-DSDP relationship (Figure 12). Using this polynomial-based approach, the above-mentioned CSTR system is identified as

$$
\begin{aligned}
C_a(k) ={}& \left[
\begin{array}{l}
0.0001\,x_1^4 x_2^4 - 0.0007\,x_1^4 x_2^2 \\
+\,0.0025\,x_1^4 + 0.8325
\end{array}
\right]_{\{C_a(k-1),\,q_c(k-1)\}} C_a(k-1) \\
&+ \left[
\begin{array}{l}
-0.0001\,x_1^4 x_2^4 + 0.0004\,x_1^2 x_2^4 \\
+\,0.0001\,x_1^3 x_2^4 - 0.0005\,x_1^4 x_2^2 \\
-\,0.0007\,x_2^4 + 0.1674
\end{array}
\right]_{\{C_a(k-1),\,q_c(k-1)\}} q_c(k-1)
\end{aligned}
\tag{85}
$$
To facilitate the comparison, a measurement index, the mean-squared-error (MSE), is used to measure the performance of the identified models. This index is defined as

$$
\mathrm{MSE} = \left[\frac{\sum_{k=1}^{N_{\mathrm{test}}} \left| y(k) - \hat{y}(k) \right|^2}{\sum_{k=1}^{N_{\mathrm{test}}} \left| y(k) - \bar{y} \right|^2}\right] \tag{86}
$$

in which $y$ and $\hat{y}$ correspond to the actual measurement and the model's simulated output on the testing set, and $\bar{y} = (1/N_{\mathrm{test}}) \sum_{k=1}^{N_{\mathrm{test}}} y(k)$.

The MSE of (85) calculated over the validation set (points from 6001 to 7500) is 0.0473. This value is larger than the MSE value calculated for (81) with respect to the same testing data set, which is 0.0325.
Figure 9. CSTR data: (a) output $C_a(k)$ and (b) input $q_c(k)$, plotted against the sampling index.
Figure 11. CSTR: (a) comparison between the actual output (solid) and model iterative output of (81) (dash-dash) over the whole data set, (b) their associated residual and (c) a zoom-in view over the validation set (the last 1500 samples), plotted against the sampling index.
Figure 10. CSTR: (a) comparison between the actual output (solid) and model (81) prediction (dot-dot) over the estimation set, and (b) their associated residual, plotted against the sampling index.
This implies that for this example, the proposed approach may be advantageous over the polynomial-based approach.
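The index in (86) is a mean-squared error normalised by the variation of the measured output about its mean, so a model that only predicts the mean scores 1 and a perfect model scores 0. A minimal sketch is given below; the function name `normalised_mse` and the toy data are assumptions for illustration.

```python
import numpy as np

def normalised_mse(y_true, y_model):
    """Normalised mean-squared error of (86) over a test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    numerator = np.sum((y_true - y_model) ** 2)
    denominator = np.sum((y_true - y_true.mean()) ** 2)
    return float(numerator / denominator)

# Toy illustration with hypothetical values.
y_meas = np.array([1.0, 2.0, 3.0, 4.0])
print(normalised_mse(y_meas, y_meas))                      # perfect model
print(normalised_mse(y_meas, np.full(4, y_meas.mean())))   # mean-only model
```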
6. Conclusions
A new class of SDP models, called 2-DWSDP models, has been presented in this article for nonlinear system identification. Using this approach, multi-dimensional state dependency has been developed, providing an alternative extension to the existing SDP modelling approach, which is based on single state dependency. In addition, the associated nonlinear model structure selection problem is systematically solved by first choosing a set of candidate model structures based on the characteristics of the wavelet, then exploiting the PRESS criterion and forward regression in conjunction with OD to yield a more parsimonious nonlinear system model. The parameter estimation procedure also automatically eliminates the terms associated with the ill-conditioning problem in the algorithm.

The contribution of this article can be summarised as follows:

(1) To the best of our knowledge, the reported results are among the first on a systematic development of 2-DSDP models for nonlinear system identification. The advantage of this model structure over the existing SDP model structure is that it takes into account the interactions between various output and input terms of the model. Together with its relative simplicity, this makes the proposed approach more practical and very useful for a wide range of engineering applications.
(2) The proposed 2-DWSDP model structure is inherently stochastic. Therefore, the uncertainty associated with the parameter estimates is taken into account in the identification methodology. This is often very useful for practical applications. As demonstrated in the simulation examples, the proposed approach works very well in the presence of a substantial amount of noise, despite the bias in the parameter estimates. This bias is dependent upon the signal-to-noise ratio of the system. Nevertheless, the extension of this work to obtain unbiased parameter estimation using IV, for consistent parameter estimates in the presence of a high level of noise, is underway and is to be reported in the future.

Through the simulation examples, the merits of the proposed approach have been illustrated. Particularly, a relative comparison to a polynomial-based approach
Figure 12. Example 2: 2-DSDP plots: (a) $\hat{f}_1(x_1, x_2)$ and (b) $\hat{g}_1(x_1, x_2)$.
was also provided, demonstrating the advantages of the developed technique.
Notes
1. The mother wavelet has nonzero values within this range. Outside this range, it has zero or insignificant values which are assumed to be zero.
2. A small value of $i_{\min}$ results in a large number of wavelet elements with higher frequency characteristics being contained in the function's library. Conversely, with a large value of $i_{\max}$, the function's library will consist of a large number of wavelet elements with lower frequency features.
3. The difference between the overparameterised (original) model's PRESS value and the one calculated by excluding a term from the original model.
4. That is, the output obtained by generating the deterministic model output from the model input alone, without any reference to the output measurements.
Notes on contributors
Liuping Wang received her PhD in 1989 from the University of Sheffield, UK; subsequently, she was an adjunct associate professor in the Department of Chemical Engineering at the University of Toronto, Canada. From 1998 to 2002 she was a senior lecturer and research coordinator in the Center for Integrated Dynamics and Control, University of Newcastle, Australia, before joining RMIT University, where she is a professor and Head of Discipline of Electrical Engineering. She is the author of two books, joint editor of one book and has published over 130 papers. L. Wang has been actively engaged in industry-oriented research and development since the completion of her PhD studies. While working at the University of Toronto, Canada, she was the co-founder of an Industry Consortium for identification of chemical processes. Since her arrival in Australia in 1998, she has been working with Australian government organisations and companies in the areas of food manufacturing, mining, automotive and power services. She leads the Control Systems program in the Australian Advanced Manufacturing Cooperative Research Center (AMCRC), which develops next generation technology platforms for the manufacturing industry. She is on the board of directors of the Australian Power Academy, which promotes power engineering education and raises scholarships from the power industry to support undergraduate students.

Nguyen-Vu Truong received the BEng (honours) degree and the PhD degree, both in control engineering, from Petronas University of Technology (Malaysia) and RMIT University (Australia), respectively. Since 2008, he has been working as a research fellow at the School of Electrical and Computer Engineering, RMIT University. His research interests are nonlinear system identification, wavelet theory and its applications.
References
Baudat, G., and Anouar, F. (2001), ‘Kernel-based Methods and Function Approximation’, in Proceedings of the 2001 International Joint Conference on Neural Networks, Washington, DC, USA, pp. 1244–1249.
Billings, S.A., Chen, S., and Korenberg, M.J. (1989), ‘Identification of Nonlinear MIMO Systems Using Forward Regression Orthogonal Estimator’, International Journal of Control, 49, 2157–2189.
Billings, S.A., and Wei, H.L. (2005), ‘A New Class of Wavelet Networks for Nonlinear System Identification’, IEEE Transactions on Neural Networks, 16(4), 862–874.
Billings, S.A., and Wei, H.L. (2008), ‘An Adaptive Orthogonal Search Algorithm for Model Subset Selection and Non-linear System Identification’, International Journal of Control, 81(5), 714–724.
Chen, S., Billings, S.A., and Luo, W. (1989), ‘Orthogonal Least Squares Methods and Their Applications in Nonlinear System Identification’, International Journal of Control, 50, 1873–1896.
Chui, K.C. (1992), An Introduction to Wavelets, New York: Academics.
De Moor, B.L.R. (ed.) (2007), DaIsy: Database for Identification of Systems, Department of Electrical Engineering, ESAT/SISTA, K.U.Leuven, Belgium. http://homes.esat.kuleuven.be/~smc/daisy/ [Continuous stirred tank reactor, Process Industry Systems, 98-002].
Gonzalez, J., Rojas, I., Ortega, J., Pomares, H., Fernandez, F.J., and Diaz, A.F. (2003), ‘Multiobjective Evolutionary Optimization of the Size, Shape and Position Parameters of Radial Basis Function Networks for Functional Approximation’, IEEE Transactions on Neural Networks, 14(6), 1478–1495.
Hong, X., Harris, C.J., Chen, S., and Sharkey, P.M. (2003a), ‘Robust Nonlinear System Identification Methods Using Forward Regression’, IEEE Transactions on Systems, Man and Cybernetics-Part A: Systems and Humans, 33(4), 514–523.
Hong, X., Sharkey, P.M., and Warwick, K. (2003b), ‘Automatic Nonlinear Predictive Model Construction Algorithm Using Forward Regression and the PRESS Statistic’, IEE Proceedings: Control Theory and Applications, 150(3), 245–254.
Hong, X., Sharkey, P.M., and Warwick, K. (2003c), ‘A Robust Nonlinear Identification Algorithm Using PRESS Statistic and Forward Regression’, IEEE Transactions on Neural Networks, 14(2), 454–458.
Lightbody, G., and Irwin, G.W. (1997), ‘Nonlinear Control Structures Based on Embedded Neural System Models’, IEEE Transactions on Neural Networks, 8(3), 553–567.
Liu, G.P., Billings, S.A., and Kadirkamanathan, V. (1998), ‘Nonlinear System Identification Using Wavelet Network’, in Proceedings of the UKACC International Conference on Control, pp. 1248–1253.
Mertin, A. (1999), Signal Analysis: Wavelet, Filter Banks, Time-Frequency Transforms and Applications, London: John Wiley & Sons.
Meyer, Y. (1992), Wavelet and Operator, Cambridge: Cambridge University Press.
Savakis, A.E., Stoughton, J.W., and Kanetkar, S.V. (1989), ‘Spline Function Approximation for Velocimeter Doppler Frequency Measurement’, IEEE Transactions on Instrumentation and Measurement, 8(4), 892–897.
Truong, N.V., and Wang, L. (2008), ‘Nonlinear System Identification in a Noisy Environment Using Wavelet Based SDP Models’, in Proceedings of the 17th IFAC World Congress, Seoul, S. Korea, pp. 7439–7444.
Truong, N.V., and Wang, L. (2009), ‘Benchmark Nonlinear System Using Wavelet Based SDP Models’, in Proceedings of the 17th IFAC Symposium on System Identification (SYSID 2009), Saint-Malo, France.
Truong, N.V., Wang, L., and Huang, J.M. (2007a), ‘Nonlinear Modeling of a Magnetic Bearing Using SDP Model and Linear Wavelet Parameterization’, in Proceedings of the 2007 American Control Conference, New York, USA, pp. 2254–2259.
Truong, N.V., Wang, L., and Young, P.C. (2006), ‘Nonlinear System Modeling Based on Nonparametric Identification and Linear Wavelet Estimation of SDP Models’, in Proceedings of the 45th IEEE Conference on Decision and Control, San Diego, USA, pp. 2523–2528.
Truong, N.V., Wang, L., and Young, P.C. (2007b), ‘Nonlinear System Modeling Based on Nonparametric Identification and Linear Wavelet Estimation of SDP Models’, International Journal of Control, 80(5), 774–788.
Young, P.C. (1993), ‘Time Variable and State Dependent Modelling of Nonstationary and Nonlinear Time Series’, in Developments in Time Series, Volume in Honour of Maurice Priestley, ed. T.S. Rao, London: Chapman and Hall, pp. 374–413.
Young, P.C. (1998), ‘Data-based Mechanistic Modelling of Engineering Systems’, Journal in Vibration and Control, 4, 5–28.
Young, P.C. (2001), ‘The Identification and Estimation of Nonlinear Stochastic Systems’, in Nonlinear Dynamics and Statistics, ed. A.I. Mees, Boston: Birkhauser, pp. 127–166.
Young, P.C., McKenna, P., and Bruun, J. (2001), ‘Identification of Nonlinear Stochastic Systems by State Dependent Parameter Estimation’, International Journal of Control, 74, 1837–1857.