robust output nash strategies based on sliding mode observation...

14
Journal of the Franklin Institute ] (]]]]) ]]]]]] Robust output Nash strategies based on sliding mode observation in a two-player differential game Alejandra Ferreira de Loza a, , Manuel Jimenez-Lizarraga b , Leonid Fridman c,1 a UVHC, LAMIH, F-59313 Valenciennes, France b Faculty of Physical and Mathematical Sciences, Autonomous University of Nuevo Le ´ on, San Nicol as de los Garza, N.L., M exico c Departamento de Control Autom atico, CINVESTAV, A.P. 14-740, CP 07000, M exico D.F., M exico Received 24 November 2010; received in revised form 29 March 2011; accepted 4 July 2011 Abstract This paper tackles the problem of a two-player differential game affected by matched uncertainties with only the output measurement available for each player. We suggest a state estimation based on the so-called algebraic hierarchical observer for each player in order to design the Nash equilibrium strategies based on such estimation. At the same time, the use of an output integral sliding mode term (also based on the estimation processes) for the Nash strategies robustification for both players ensures the compensation of the matched uncertainties. A simulation example shows the feasibility of this approach in a magnetic levitator problem. & 2011 The Franklin Institute. Published by Elsevier Ltd. All rights reserved. 1. Introduction Preliminaries: Differential Game Theory deals with the dynamic optimization behavior of multiple decision makers when none of them can control the decisions made by others and the outcome for each participant is affected by the consequences of these decisions. Such approach has been profitably applied in diverse areas [1–4]. However, realistic systems are inevitably www.elsevier.com/locate/jfranklin 0016-0032/$32.00 & 2011 The Franklin Institute. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.jfranklin.2011.07.003 Corresponding author. E-mail address: [email protected] (A. Ferreira de Loza). 1 On leave of, Universidad Nacional Aut´onoma de M exico (UNAM), Department of Control, Engineering Faculty, C.P. 04510. M exico D.F., M exico. Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Upload: others

Post on 19-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

Journal of the Franklin Institute ] (]]]]) ]]]–]]]

0016-0032/$3

doi:10.1016/j

�CorrespoE-mail ad

1On leave

Faculty, C.P

Please cite

observation

www.elsevier.com/locate/jfranklin

Robust output Nash strategies based on sliding modeobservation in a two-player differential game

Alejandra Ferreira de Lozaa,�, Manuel Jimenez-Lizarragab,Leonid Fridmanc,1

aUVHC, LAMIH, F-59313 Valenciennes, FrancebFaculty of Physical and Mathematical Sciences, Autonomous University of Nuevo Leon,

San Nicol �as de los Garza, N.L., M�exicocDepartamento de Control Autom �atico, CINVESTAV, A.P. 14-740, CP 07000, M�exico D.F., M�exico

Received 24 November 2010; received in revised form 29 March 2011; accepted 4 July 2011

Abstract

This paper tackles the problem of a two-player differential game affected by matched uncertainties

with only the output measurement available for each player. We suggest a state estimation based on

the so-called algebraic hierarchical observer for each player in order to design the Nash equilibrium

strategies based on such estimation. At the same time, the use of an output integral sliding mode term

(also based on the estimation processes) for the Nash strategies robustification for both players

ensures the compensation of the matched uncertainties. A simulation example shows the feasibility of

this approach in a magnetic levitator problem.

& 2011 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.

1. Introduction

Preliminaries: Differential Game Theory deals with the dynamic optimization behavior ofmultiple decision makers when none of them can control the decisions made by others and theoutcome for each participant is affected by the consequences of these decisions. Such approachhas been profitably applied in diverse areas [1–4]. However, realistic systems are inevitably

2.00 & 2011 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.

.jfranklin.2011.07.003

nding author.

dress: [email protected] (A. Ferreira de Loza).

of, Universidad Nacional Autonoma de M�exico (UNAM), Department of Control, Engineering

. 04510. M�exico D.F., M�exico.

this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 2: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]]2

subject to uncertainties. Sometimes the uncertainties are caused by the nature of the systemsthemselves. For example, a system may have an uncertain structure; or even if it has adeterministic structure it may have some uncertain parameters. Sometimes although a systemhas a deterministic structure and parameters, the uncertainties are introduced by externalunknown inputs, such as disturbances, measurement noise, etc. In this situation a naturalquestion arises: How should the players act in a game with uncertainties?Since the players do not exactly know the uncertainties in the game, a possible solution

is the adoption of robust strategies which maintain the performance indexes below someperformance bounds. Furthermore, during the last decades, the application of modernconcepts in differential games has significantly increased. This is specially seen in the kindof games affected by uncertainties (see [5–10]). A common focus in recent publications hasbeen the analysis of the uncertainty effects in players behavior. In [6,5,8] LQ games withuncertainty scenarios have been considered using the H1 approach leading to the min–max formulation. Two different papers, [6] as well as [5], deal with a two-person uncertainLQ differential game with uncertainties which may ‘‘play against’’ the players. In [7], theauthors propose a type of Robust Nash equilibrium concept where the game uncertainty isrepresented by a malevolent input, which is subjected to a cost penalty or a direct bound.Then, H1 theory is used once again to design robust strategies for all players.A second key point presented in most applications deals with inaccurate systems, where only

a part, or a combination, of the state space coordinates is known. Games in which the playershave access only to output measurements have been of interest since the late 1960s in theworks of [11–14]. In [11], several problems of inaccurate state information, with white noisecorrupting the output, in differential games are presented using quadratic cost functionals.However, it is still not clear how the estimation errors affect the functionals for each player. In[13], a partially observed system is considered, but the disturbances are assumed to bequadratically integrable on an infinite horizon, which implies that they tend to zero. In [14] atwo-player stochastic discrete-time game with constrained state estimation is studied, both ofthe players have access only to a noise corrupted measured output, and one of the players hasaccess to one step delayed control input in one of the studied cases.One of the most prospering control strategies for exact compensation of uncertainties is

sliding mode control (see, e.g., [15,16]). Sliding mode control has been applied in a widevariety of problems (see [17,18]). Another advantage of conventional sliding mode controlis the reduction in the order of the state equation. In [19] the system is decomposed, andthe stability conditions are given for the reduced order system, subsequently a sliding modecontroller is designed to ensure that the state trajectory reaches the sliding motion infinite time.When only the output of a system is available, a possibility for apply sliding mode

control is to design an observer. Thus, using the obtained estimates in a control law insteadof the real states of the system. Recently, robust observers based on sliding modes havebeen successfully developed in [20–23]. This kind of observers is widely used because oftheir insensitivity, which is a stronger characteristic than that of robustness, with respect tounknown inputs.Another special sliding mode technique, namely integral sliding mode (ISM) [24], has

also been widely used in processes that require compensation of arising uncertainty effects.The main properties of ISM are: (1) the ISM does not have a reaching phase; (2) it ensuresinsensitivity of the desired trajectory with respect to matched uncertainties starting fromthe initial moment.

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 3: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]] 3

ISM methods are successfully used for robustification of different optimal control problems(see, for example [25–27]). Such useful tools have been scarcely used in differential games(see [28,29]).

Hypothesis: A two-person nonzero-sum differential game affected by the presence ofunknown inputs (or external matched non-vanishing perturbations) is considered. Theseuncertainties influence player dynamics and they are not available (measurable), neither a

priori nor on-line. The only way to obtain information regarding the state is through anestimation process.

Contribution: This paper exploits the properties of integral sliding mode control as wellas sliding mode observation techniques in order to provide a set of robust Nash strategiesfor a two-player uncertain game with incomplete state information.

The proposed controller ensures the insensitivity of the Nash strategy with respect tomatched uncertainties.

Additionally, it is shown that the observer convergence time and the controllerrealization error tend to zero together with the sampling step and execution timerespectively.

The precision of the proposed controllers in terms of the sampling step and executiontime is calculated.

Methodology: To solve this problem in this paper the next steps are taken:

P

o

The uncertainties of the players are compensated using output integral sliding modecontrol.

� A hierarchical sliding mode observer is constructed to estimate the states. � Nash strategies are designed based on the estimated state.

Paper structure: In Section 2, the model is presented and the control challenge isformulated. The Nash strategy design for the nominal game is summarized here. In Section4, an output integral sliding mode controller rejecting the matched uncertainty is proposed.The hierarchical observer is described in Section 5. The robust Nash controller and anestimation of the closed loop error during implementation are presented in Section 6.Performance issues of the robust output Nash controller are illustrated in a magneticbearing simulation study in Section 7.

2. Game model description and basic assumptions

Let us consider an uncertain LQ differential game (LQDG) where the players’ dynamicis represented by linear ordinary differential:

_xðtÞ ¼AxðtÞ þX2i ¼ 1

BiuiðtÞ þX2i ¼ 1

ziðtÞ,

y1ðtÞ ¼C1xðtÞ, y2ðtÞ ¼C2xðtÞ,

xð0Þ ¼ x0, t 2 ½0,t1�, ð1Þ

Here, i denotes the number of players ði¼ 1,2Þ, A 2 Rn�n and Bi 2 Rn�mi are knownconstant system matrices and zi

ðtÞ 2 R is an unknown input. In addition, xðtÞ 2 Rn is the gamestate vector, with uiðtÞ 2 Rmi being the control strategies of each i-player and yiðtÞ 2 Rpi is

lease cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

bservation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 4: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]]4

the output of the game for each player which can be measured at each time. Finally, Ci 2 Rpi�n

is the output matrix for player i.The following assumptions will be considered in this paper.

A.1.

Plea

obse

The pairs ðA,BiÞ are controllable and ðA,CiÞ are observable.

A.2. The uncertainties zi

ðtÞ (i¼1,2) are two smooth unknown functions which satisfy thenext matching condition:

Gi :¼ fziðtÞ9ziðtÞ ¼BigiðtÞ,JgiðtÞJrgþ, gþ40g ð2Þ

A.3.

rank ðCiBiÞ ¼mi,pi4mi. A.4. The initial condition x0 is bounded, i.e., there exists an w0, such that Jx0J

2rw0.

A.5. span B1aspan B2.

The first equality in Eq. (2) means that zi2 span Bi, i.e., the i-player is able to exert a

force on the perturbation.Control challenge: For the robust Nash control design we propose the following two part

strategy:

uiðtÞ ¼ ui0ðtÞ þ ui

1ðtÞ, i¼ 1,2 ð3Þ

where the control ui0ðtÞ is the Nash feedback strategy designed for the nominal game (i.e., zi

¼ 0Þand control ui

1ðtÞ is an integral sliding mode compensator for the unknown inputs zi. Below, theNash strategy for the nominal game, i.e., when no unknown inputs are present, is summarized.

2.1. Nash control strategy for the nominal system

Consider the nominal game

_x0ðtÞ ¼Ax0ðtÞ þ B1u10ðtÞ þ B2u2

0ðtÞ, x0ð0Þ ¼ x0, ð4Þ

with a quadratic cost functional as an individual aim performance

Jiðui0,u

ı0Þ ¼

Z 10

xT QiðtÞxþX2j ¼ 1

ujT Rijuj

!dt, jai: ð5Þ

The performance index JiT ðu

i0,u

ı0Þ (5) of each i-player for infinite time horizon nominal

game is given in the standard form, where ui0 is the strategy for i-player and uı

0 are thestrategies for the rest of the players ðı is the player counteracting to the player with index i).We will assume also that

QiðtÞ ¼QiTðtÞZ0, RjiðtÞ ¼RjiTðtÞ40,

RijðtÞ ¼RijTðtÞZ0 ðjaiÞ: ð6Þ

The game solutions are understood in the Filippov sense (see [30]) in order to providethe possibility of discontinuous signals in the observer design. Note that Filippov solutionscoincide with the usual solutions, when the right-hand side is continuous.In the case when there are no unknown inputs in the game and the complete state

information is available, from the limiting solution of the finite time problem [31], the next

se cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

rvation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 5: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]] 5

coupled algebraic equations appear [32]:

�ðA�S2P2ÞTP1�P1ðA�S2P2Þ þ P1S1P1�Q1�P2S21P2 ¼ 0, ð7Þ

�ðA�S1P1ÞTP2�P2ðA�S1P1Þ þ P2S2P2�Q2�P1S12P1 ¼ 0, ð8Þ

with

Si ¼ BiðRjiÞ�1BiT,

Sij ¼ BiðRjiÞ�1RjiðRjiÞ

�1BiT for jai

The following result is well established (see [33]): for a two-player LQDG described byEq. (1) with Eq. (5); let Pi (i¼1,2) be a symmetric stabilizing solution of equations (7)–(8).

Note that it is know that Eqs. (7) and (8) in general may not be unique [34], therefore weconsider only a couple of stabilizing strategies.

Taking F in :¼ ðRjiÞ�1BiTPi for i¼1,2, then ðF 1n,F2nÞ is a feedback Nash equilibrium.

The limiting stationary (Nash) strategies are

uin0 ðtÞ ¼�Rji�1BiTPixðtÞ: ð9Þ

Before presenting the robust Nash strategies, let us introduce the integral sliding modescompensator and the hierarchical observer. First we make a definition. For any matrixM 2 Rr�q having rank(M) ¼q, the matrix M? 2 Rr�q�r is a matrix such that M?M ¼ 0.

Remark 1. Sliding mode techniques offer powerful tools to solve a variety of controlproblems. Such approach had scarcely been applied in dynamic games problems. To thebest of the authors knowledge, the first works in that direction were the papers of [35,36]and more recently in [28]. Therefore, the main contribution of this paper is to provide a setof robust Nash strategies for a two-player uncertain game (matched uncertainty) withincomplete state information. The proposed hierarchical observer converges quickly to thereal states and based on such estimations both players are able to effectively compensatethe matched uncertainties and thus, the desired equilibrium is attained. Comparing withother works dealing with games and incomplete information our scheme provides asolution for both problems, in [14] for example, the players implement a standardLuenberger observer but the convergence issues are not addressed.

3. Output integral sliding modes compensator

Define for each player the next output-based sliding function

siðyiÞ ¼Giyi þ si, ð10Þ

where Gi ¼DiðCiBı Þ?. Let us remark that, with assignation of the matrix Gi, the term

ðCiBı Þ? will cancel all terms related with the opposite player. The matrix Di 2 Rmi�ðpi�m ı Þ is

assumed such that satisfy the following condition detðGiCiBiÞa0.Calculating the time derivative

_siðyiÞ ¼DiðCiBı Þ?|fflfflfflfflfflfflffl{zfflfflfflfflfflfflffl}

Gi

Ci½Axþ Biui0ðtÞ þ Biui

1ðtÞ þ BigiðtÞ� þ _si: ð11Þ

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 6: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]]6

Now, let _si be defined as

_si ¼GiCiAx�GiCiBiui0ðtÞ, sið0Þ ¼ �Giyið0Þ,

where vector x 2 Rn is the observer state vector which will be introduced in the nextsection. Substituting _si in Eq. (11) yields

_siðyiÞ ¼ �GiCiAðx�xÞ þ GiCiBiui1ðtÞ þ GiCiBigiðtÞ: ð12Þ

We propose the control

ui1ðtÞ ¼ �f ðtÞðLiÞ

�1 siðtÞ

JsiðtÞJ,

Li :¼ GiCiBi, ð13Þ

the function f ðtÞ will be defined below. For the Lyapunov function V iðsiÞ ¼ JsiJ2=2

_ViðsiÞ ¼ ðsi, _siÞ ¼ si,GiCiAðx�xÞ�f ðtÞ

siðtÞ

JsiðtÞJþ LigiðtÞ

� �r�JsiJðf ðtÞ�JGiCiAJJx�xJ�gþJLiJÞo0, ð14Þ

where f ðtÞ4JGiCiAJJx�xJþ gþJLiJ.Therefore, from Eq. (11), we have

12JsiJ2 ¼ViðsiðxðtÞ,tÞÞrV iðsiðxð0Þ,0ÞÞr1

2Jsiðxð0Þ,0ÞJ2 ¼ 0,

which implies that for all tZ0, _siðtÞ ¼ 0, which leads to siðtÞ ¼ 0. This means that from thebeginning of the game, the ISM strategy for each player completely compensates thematched uncertainty. The equivalent control which maintains the trajectories on the slidingsurface is

ui1eqðtÞ ¼�ðG

iCiBiÞ�1GiCiAðx�xÞ�giðtÞ:

Substitution of the equivalent control in Eq. (1) yields the sliding mode equivalentdynamic:

_xðtÞ ¼AxðtÞ þX2i ¼ 1

BiððLiÞ�1GiCiAxðtÞ þ ui

0ðtÞÞ,

y1ðtÞ ¼C1xðtÞ, y2ðtÞ ¼C2xðtÞ, ð15Þ

where A :¼ A�P2

i ¼ 1ðLiÞ�1GiCiA.

Remark 2. In [21], it has been proven that when the number of outputs is less than or equalto the number of inputs, the matrix A in Eq. (15) always belongs to the null space of thematrix Ci and, consequently, the pair ðA,CiÞ is not observable. This means that in the casewhen pirmi, the ISM control using only output information should not be realized.

4. Observer design

Now, we design a hierarchical observer for the equivalent system (16). The principal ideain the design of the hierarchical observer is the recovery of the elements CxðtÞ, CAxðtÞ and

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 7: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]] 7

so on, until we get CiAkx, with k¼ 1,‘�1. Constructing the Hx(t) vector with

H ¼ ½Ci CiA . . . CiA‘�1�T , H 2 Rp‘�n,

where ‘ is the observability index, i.e., the least positive integer such that rank H¼n.Before designing the observer, it is necessary to find an error bound. Design the

following dynamic system:

~x�

ðtÞ ¼A ~xðtÞ þX2i ¼ 1

BiððLiÞ�1GiCiAx þ ui

0ðtÞÞ þX2i ¼ 1

Kiðyi�Ci ~xÞ, ð16Þ

where ~x 2 Rn, Ki 2 Rn�pi defining the error rðtÞ ¼ x� ~x, from Eqs. (15) and (16) we have

_rðtÞ ¼ ðA�KiCiÞrðtÞ ¼ ArðtÞ,

with A Hurwitz, r(t) is bounded. Now, let us recover the CiAkx vectors with k¼ 1,‘�1. To

recover CiAx vector, let us design the next auxiliary system:

_xð1Þa ðtÞ ¼A ~x þX2i ¼ 1

BiððLiÞ�1GiCiAx þ ui

0ðtÞÞ þ TiðCiTiÞ�1vð1ÞðtÞ,

where xa 2 Rn, T is a matrix of the corresponding dimensions such that detðCiTiÞa0 andxð1Þa ð0Þ satisfies Cixð1Þa ð0Þ ¼ yið0Þ. For the variable

sð1ÞðyiðtÞ,xð1Þa ðtÞÞ ¼CixðtÞ�Cixð1Þa ðtÞ, ð17Þ

the time derivative is

_sð1ÞðtÞ ¼CiAðxðtÞ� ~xðtÞÞ�vð1ÞðtÞ, ð18Þ

where

vð1ÞðtÞ ¼Mi1

sð1ÞðtÞ

Jsð1ÞðtÞJ:

Here, the scalar gain Mi1 should satisfy the condition JCi ~AJJx�xJoMi

1 to reach thesliding mode regime. Then, we get _sð1ÞðtÞ ¼ sð1ÞðtÞ ¼ 0 for all tZ0. Thus, CixðtÞ ¼Cixð1Þa ðtÞ

and the equivalent output injection is

vð1Þeq ðtÞ ¼Ci ~AxðtÞ�Ci ~A ~xðtÞ 8tZ0,

finally the recovery of Ci ~AxðtÞ is made by

Ci ~AxðtÞ ¼Ci ~A ~xðtÞ þ vð1Þeq ðtÞ 8tZ0, ð19Þ

Now, to recover Ci ~A2xðtÞ the next auxiliary system is designed:

_xð2Þa ðtÞ ¼~A2~x þ ~A

X2i ¼ 1

BiððLiÞ�1GiCiAx þ ui

0ðtÞÞ þ TðCiTÞ�1vð2ÞðtÞ,

for the variable

sð2Þðvð1Þeq ðtÞ,xð2Þa ðtÞÞ ¼Ci ~AxðtÞ�Cixð2Þa ðtÞ, ð20Þ

the time derivative is

_sð2ÞðtÞ ¼Ci ~A2ðxðtÞ� ~xðtÞÞ�vð2ÞðtÞ:

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 8: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]]8

The output injection for the second level is

vð2ÞðtÞ ¼Mi2

sð2ÞðtÞ

Jsð2ÞðtÞJ,

with JCi ~A2JJx�xJoMi

2, to obtain the sliding mode regime _sð2ÞðtÞ ¼ sð2ÞðtÞ ¼ 0 for all tZ0.The equivalent control is

vð2Þeq ðtÞ ¼Ci ~A2xðtÞ�Ci ~A

2~xðtÞ 8tZ0

and

Ci ~A2xðtÞ ¼Ci ~A

2~xðtÞ þ vð2Þeq ðtÞ 8tZ0: ð21Þ

We can repeat this procedure to recover the Ci ~AkxðtÞ vectors for k¼ 1,‘�1. The general

formula for the auxiliary dynamics is

_xðkÞa ðtÞ ¼~A

k~x þ ~A

k�1X2i ¼ 1

BiððLiÞ�1GiCiAx þ ui

0ðtÞÞ þ TðCiTÞ�1vðkÞðtÞ

and

vðkÞðtÞ ¼Mik

sðkÞðtÞ

JsðkÞðtÞJ,

with JCi ~AkJJx�xJoMi

k. The general sliding surface

sðkÞðvðk�1Þeq ðtÞ,xðkÞa ðtÞÞ ¼yiðtÞ�Cixð1Þa ðtÞ for k¼ 1,

vðk�1Þeq ðtÞ þ Ci ~Ak�1

~xðtÞ�CixðkÞa ðtÞ for k41,

8<:

and sðkÞð0Þ should satisfy

sðkÞð0Þ ¼Ciyið0Þ�Cixð1Þa ð0Þ for k¼ 1,

vðk�1Þeq ð0Þ þ Ci ~Ak�1

~xð0Þ�CixðkÞa ð0Þ for k41,

8<:

Eqs. (19) and (21) can be rewritten in matrix form

Ci

CiA

^

CiAk

266664

377775

|fflfflfflfflfflffl{zfflfflfflfflfflffl}H

xðtÞ ¼

Ci

CiA

^

CiAk

266664

377775

|fflfflfflfflfflffl{zfflfflfflfflfflffl}H

~xðtÞ þ

Cixð1Þa �Cx

vð1Þeq

^

vðkÞeq

266664

377775

|fflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflffl}veq

ð22Þ

As mentioned earlier, rank H ¼n; therefore the state xðtÞ can be recovered as

xðtÞ :¼ ~xðtÞ þHþveq, ð23Þ

where Hþ ¼ ðHT HÞ�1HT .Observer realization: To carry out the observer in the form (23), the surface s(k) must be

realizable. To guarantee this, the equivalent output injection vðkÞ must be available.However, the non-idealities in the implementation of vðkÞ cause the so-called chatteringeffect. Therefore, vðkÞ cannot be directly measured. However, we can apply a first-order

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 9: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]] 9

filter

t_vðkÞav þ vðkÞav ¼ vðkÞeq , vavð0Þ ¼ 0: ð24Þ

For a very small t40, the filter output approaches to the equivalent control vðkÞeq , i.e.,

lim t-0d=t-0

vðkÞav ¼ vðkÞeq (see [15]), where d is the sampling time. We can select t¼ dZ, where

0oZo1. Finally, to realize the observer, select a very small sampling interval d and

substitute vðkÞeq by vðkÞav :

xðtÞ :¼ ~xðtÞ þHþvav, ð25Þ

vav ¼ ½Cixð1Þa �Ci ~x vð1Þav � � � vðkÞav �

T : ð26Þ

Remark 3. Under assumptions A1–A4 and supposing the ideal output integral slidingmode exists, xðtÞ � xðtÞ 8t40.

Next section summarizes the robust Nash control law for the uncertain LQ differentialgame presented in Eq. (2).

5. Robust output Nash strategy

Before presenting the new robust Nash strategies we have the next proposition:

Proposition 1. Due to the inaccurate state information, it seems natural to use a current state

estimate xðtÞ (if it is available) instead of xðtÞ in the feedback equilibrium control laws (9),that is,

uin0 ðxÞ ¼�Rji�1BiTPix, ð27Þ

where uin denotes the control action based on estimations x.

Thus, the proposed control law in Eq. (3) yields

uiðx,tÞ ¼ �Rji�1BiTPix�f ðtÞðLiÞ�1 siðtÞ

JsiðtÞJ, i¼ 1,2: ð28Þ

5.1. Error estimation during implementation of the closed loop control

The filter causes some errors in the state vector reconstruction. Evidently those errorsdirectly affect the controller. Hence, we will estimate the error which appears during therealization of the closed loop control, that is, the error due to the actuators plus the errordue to the observation process.

The control error is OðmÞ, where m is a control execution constant which generallydepends on the actuators time constants. Now, let us estimate the error order due to theobservation process. As we saw, the observer design is based on the recursive use of filtersof the form (24). Firstly, let us recall the following lemma regarding the error induced bythis type of filters.

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 10: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]]10

Lemma 2. If in the differential equation [15]

t_z þ z¼ wðtÞ þW ðtÞ_s, ð29Þ

where t is a constant and z, w and s are m-dimensional vector functions,

(1)

Pl

ob

the functions w(t) and W(t), and their first-order derivatives are bounded in magnitude by

a certain number M, and

(2)

JsðtÞJrx ðx is a constant positive value),

then for any pair of positive numbers Dt and u there exists a number dðu,Dt,zð0ÞÞ such that

JzðtÞ�wðtÞJru with 0otrd, x=trd and tZDt.

Indeed, JzðtÞ�wðtÞJ satisfies the following inequality

JzðtÞ�wðtÞJrJzð0Þ�wð0ÞJexpð�t=tÞ þMðtþ xÞ þ 3M xt

� �

In our case, expression Eq. (29) can be related with t1 _vð1Þav þ vð1Þav ¼ vð1Þeq �_sð1Þ, which is obtained

from Eqs. (24) and (18). Thus, in this case wðtÞ refers to the equivalent output injection.Furthermore, the sliding mode control error directly affects the performance of the first slidingmode in the observation process. It is also known that the sampling step d induces an error oforder OðdÞ in the variable _sð1Þ during the sliding motion. Hence, it is reasonable to accept thatthe error in the sliding variable _sð1Þ is of order OðmÞ þOðdÞ. By defining D :¼ mþ d, we havethat the constant x in Lemma (2) x¼OðDÞ ¼OðmÞ þOðdÞ. Therefore, choosing t1 ¼OðD1=2Þ,the error in the first step of the observation scheme is of order OðD1=2Þ, that isvð1Þav �vð1Þeq ¼OðD1=2Þ. As it was mentioned before, we must substitute vð1Þeq by vð1Þav into thevariable sð2Þ in Eq. (20). Thus, we can consider that during the sliding motion, sð2Þ will bebounded by a constant of order OðD1=2Þ, and consequently, using a filter constant t2 ¼OðD1=4Þ,the error induced for the second filter will be vð2Þav�vð2Þeq ¼OðD1=4Þ. Following a similar analysis,we obtain an error of order OðD1=2k

Þ in the kth step for the observer reconstruction. Thus, itturns out to be that the observation error is of order OðD1=2‘ Þ, recalling that ‘ is the smallestinteger such that the H matrix has rank n. Thus, we can say that during the realization of thecontrol process, the closed loop control total error ec is

ec ¼OðmÞ þOðD1=2‘ Þ:

Fig. 1. Top view of a planar rotor disk magnetic bearing system [2].

ease cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

servation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 11: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]] 11

6. Magnetic levitator example

Consider the magnetic bearing system depicted in Fig. 1, which is composed of a planarrotor disk and two sets of stator electromagnets: one acting in the y-direction and the otheracting in the x-direction. This system may be decoupled into two subsystems, one for eachdirection, with similar equations (see [2] for details). Here, only the linearized subsystem inthe y-direction is considered:

_x ¼

0 1 0 0

8L0I20

mk20

2L0I0

mk2�2L0I0

mk2

0 �2I0

k�

kR1

L00

02I0

k0 �

kR2

L0

26666666664

37777777775

|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}A

0

0k

L0

0

266664

377775

|fflfflffl{zfflfflffl}B1

ðu1 þ z1Þ þ

0

0

0k

L0

2666664

3777775

|fflfflffl{zfflfflffl}B2

ðu2 þ z2Þ,

where k¼ 2g0 þ a, g0 is the air gap when the rotor is in the position y¼0; a is a positiveconstant introduced to model the fact that the permeability of electromagnets is finite;L040 is a constant which depends on the system construction; Io is the premagnetizationconstant, m is the mass of the rotor and R1, R2 are the resistances in the first set of statorelectromagnets. The state variables x¼ ½y _y i1�I0 i2�I0�

T and the control inputsu1 ¼ e1�I0R1 and u2 ¼ e2�I0R2.

Considering m¼2 kg, L0¼0.3 mH, I0¼60 mA, R1...4 ¼ 1 O and k¼0.002 m. With

C1 ¼

1 0 0 0

0 0 1 0

0 0 0 1

264

375, C2 ¼

1 0 0 0

0 0 0 1

� �,

and the controller parameters R11 ¼ diagð½1 1�Þ; R22 ¼ diagð½1 1�Þ;Q1 ¼Q2 ¼ 50I ,R12 ¼R21 ¼ 1. It can be verified that for this system the triplet ðA,Bi,CiÞ does not haveinvariant zeros.

The initial condition is xð0Þ ¼ ½0:0005 0 0:06 0:06�T . The pair ðA,C1Þ is observable with

:A ¼

0 1 0 0

530 0 0:2 �0:2

0 0 0 0

0 0 0 0

26664

37775, K ¼

25 0 0

686 0:2 �0:2

0 10:2 �0:4

0 �0:4 10:8

26664

37775:

The gain K guarantees that the A ¼A�KC1 matrix is Hurwitz. Applying the Lyapunoviterations algorithm [37] we find F1 ¼ ð20949 901 10 3Þ and F2 ¼ ð�20949 �901 �3 10Þ.The uncertainties are z1ðtÞ ¼ 2 sinð4tÞ þ 2 cosð2tÞ þ 1 and z2ðtÞ ¼ 3 cosð5tÞ. The output ISMgains are G1 ¼ ½�1 1 0�, G2 ¼ ½�1 0 1�, M1

1 ¼�10, M21 ¼�10. The simulation integration

time was 10 ms, i.e., d¼ m¼ 10 ms; D¼ 20 ms, and the filter constant was chosen as t¼ D1=2.Fig. 2 shows the feasibility of the robust Nash methodology, the effects of the

perturbations are clearly compensated. The players’ performance indexes are showed inFig. 3 and listed in Table 1.

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 12: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

0 1 2 3−4

−2

0

2

4

6x 10−4

Pos

ition

[m]

Time [s]

Fig. 2. Position of rotor for the perturbed system without compensation (dotted-line) and using robust Nash

strategy (solid-line).

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

10

20

30

40

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

10

20

30

40

Time (sec)

UncompensatedCompensatedJ2(u0,u0)2 1

J1(u0,u0)2 1

Fig. 3. Individual performance index for each player. Perturbed system without compensation (solid-line) and

with compensation (dash-line).

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]]12

7. Conclusions

A two-player differential game affected by matched uncertainties and with only thepartial state measurable by all players was presented. We proposed an algebraichierarchical observer to design the Nash equilibrium strategies based on such estimation.At the same time, the use of an output integral sliding mode term (also based on theestimation process) for the robustification of the Nash strategies was proposed, ensuring

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 13: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

Table 1

Players’ performance with and without compensation.

t (s) Nash strategy Robust Nash strategy

J1ðu10,u20Þ J2ðu20,u

10Þ J1ðu10,u

20Þ J2ðu2

0,u10Þ

0 0 0 0 0

0.5 22.2 22.7 8.1662 7.7529

1 32.5 31.7 8.1698 7.7565

1.5 48.4 56.4 8.1734 7.7601

2 60.4 65.3 8.177 7.7638

2.5 74.1 81.0 8.1806 7.7673

3 85.8 98.3 8.1842 7.7709

3.5 96.5 107.3 8.1878 7.7745

4 115.4 132.7 8.1914 7.7781

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]] 13

the compensation of matched uncertainties. The error provoked by the controllerrealization was calculated in terms of the sampling step and execution time. A simulationexample showed the feasibility of this approach.

References

[1] I. Sahin, M. Simaan, A flow and routing control policy for communication networks with multiple

competitive users, Journal of the Franklin Institute 343 (2) (2006) 168–180.

[2] M. Jungers, A. Franco, E. De Pieri, H. Abou-Kandil, Nash strategy applied to active magnetic bearing

control, in: Proceedings of IFAC World Congress, 2005.

[3] C. Douligeris, R. Mazumdar, A game theoretic perspective to flow control in telecommunication networks,

Journal of the Franklin Institute 329 (2) (1992) 383–402.

[4] P. Nie, Dynamic Stackelberg games under open-loop complete information, Journal of the Franklin Institute

342 (7) (2005) 737–748.

[5] F. Amato, M. Mattei, A. Pironti, Guaranteeing cost strategies for linear quadratic differential games under

uncertain dynamics, Automatica 38 (2002) 507–515.

[6] F. Amato, M. Mattei, A. Pironti, Robust strategies for Nash linear quadratic games under uncertain dynamics,

in: Proceedings of the 37th IEEE Conference on Decision and Control, vol. 2, 1998, pp. 1869–1870.

[7] W. van den Broek, J. Engwerda, J. Schumacher, Robust equilibria in indefinite linear-quadratic differential

games, Journal of Optimization Theory and Applications 119 (3) (2003) 565–595.

[8] T. Basar, P. Bernhard, H1-Optimal Control and Related Minimax Design Problems, Birkhauser, Boston,

1995.

[9] M. Jimenez-Lizarraga, A. Poznyak, Robust Nash equilibrium in multi-model LQ differential games: analysis

and extraproximal numerical procedure, Optimal Control Applications and Methods 8 (2) (2007) 117–141.

[10] B.J. Gardner, J.J. Cruz, Well-posedness of singularly pertubed Nash games, Journal of the Franklin Institute

306 (5) (1978) 355–374.

[11] I. Rhodes, D. Luenberger, Differential games with imperfect state information, IEEE Transactions on

Automatic Control AC 14 (1) (1969) 29–38.

[12] I. Rhodes, D. Luenberger, Stochastic differential games with constrained state estimators, IEEE

Transactions on Automatic Control AC 14 (5) (1969) 476–481.

[13] M. James, J. Baras, Partially Observed Differential Games, Infinite Dimensional HJI Equation, and

Nonlinear H1 Control, Technical Report, Institute for Systems Research, Technical Research Report T.R.

94-49, 2000.

[14] A. Kian, J. Cruz, M. Simaan, Stochastic discrete time nash games with constrained state estimators, Journal

of Optimization Theory and Applications 114 (1) (2002) 171–188.

[15] V. Utkin, Sliding Modes in Control and Optimization, Springer Verlag, Germany, 1992.

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003

Page 14: Robust output Nash strategies based on sliding mode observation …verona.fi-p.unam.mx/~lfridman/publicaciones/papers/jfi3.pdf · 2011-08-09 · Preliminaries: Differential Game Theory

A. Ferreira de Loza et al. / Journal of the Franklin Institute ] (]]]]) ]]]–]]]14

[16] C. Edwards, S. Spurgeon, Sliding Mode Control: Theory and Applications, Taylor and Francis, London,

1998.

[17] J. Fei, F. Chowdhury, Robust adaptive sliding mode control for triaxial gyroscope, International Journal of

Innovative Computing, Information and Control 6 (6) (2010) 2439–2448.

[18] L. Tsung-Chin, Stable indirect adaptive type-2 fuzzy sliding mode control using Lyapunov approach,

International Journal of Innovative Computing, Information and Control 6 (12) (2010) 5725–5748.

[19] H. Xing, D. Li, X. Zhong, Sliding mode control for a class of parabolic uncertain distributed parameter

systems with time-varying delays, International Journal of Innovative Computing, Information and Control

5 (9) (2009) 2689–2702.

[20] J. Barbot, M. Djemai, T. Boukhobza, Sliding modes observers, in: W. Perruquetti, J.P. Barbot (Eds.),

Control Engineering, 2002, pp. 103–130.

[21] F. Bejarano, L. Fridman, A. Poznyak, Output integral sliding mode control based on algebraic hierarchical

observer, International Journal of Control 80 (2007) 443–453.

[22] Y. Orolov, Sliding mode observer-based synthesis of state derivative-free model reference adaptive control of

distributed parameter systems, ASME Journal of Dynamic Systems, Measurement and Control 160 (2000)

726–731.

[23] T. Ahmed-Ali, F. Lamnabhi-Lagarrigue, Sliding observer controller design for uncertain triangular

nonlinear systems, IEEE Transactions on Automatic Control 44 (1999) 1244–1249.

[24] V. Utkin, J. Guldner, J. Shi, Sliding Mode Control in Electromechanical Systems, Taylor and Francis, 1999.

[25] F. Bejarano, L. Fridman, A. Poznyak, Output integral sliding mode for min–max optimization of multi-plant

linear uncertain systems, IEEE Transactions on Automatic Control 54 (11) (2009) 2611–2620.

[26] M. Basin, J. Rodriquez, L. Fridman, Optimal and robust control for linear state-delay systems, Journal of

the Franklin Institute 344 (6) (2007) 801–928.

[27] T. Wu, J. Wang, Y. Juang, Decoupled integral variable structure control for mimo systems, Journal of the

Franklin Institute 344 (2007) 1006–1020.

[28] M. Jimenez-Lizarraga, A. Poznyak, Quasi-equilibrium in LQ differential games with bounded uncertain

disturbances: robust and adaptive strategies with pre-identification via sliding mode technique, International

Journal of Systems Science 38 (7) (2007) 585–599.

[29] M. Jimenez-Lizarraga, L. Fridman, Robust Nash strategies based on integral sliding mode control for a two

players uncertain linear affine-quadratic game, International Journal of Innovative Computing, Information

and Control 5 (2) (2009) 241–251.

[30] A. Filippov, Differential Equations with Discontinuous Right Hand-Sides, Mathematics and Its Applications,

Kluwer Academic Publisher, 1988.

[31] T. Basar, G. Olsder, Dynamic Noncooperative Game Theory, SIAM, Philadelphia, 1999.

[32] H. Abou-Kandil, G. Freiling, V. Ionescu, G. Jank, Matrix Riccati Equations in Control and System Theory,

Birkhauser, 2003.

[33] J. Engwerda, LQ Dynamic Optimization and Differential Games, John Wiley and Sons, West Sussex,

England, 2005.

[34] A. Weeren, J. Schumacher, J. Engwerda, Asymptotic analysis of linear feedback nash equilibria in nonzero-

sum linear-quadratic differential games, Journal of Optimization Theory and Applications 101 (1999)

693–723.

[35] J.J. Cruz, S. Drakunov, M. Sikora, Leader-follower strategy via sliding mode approach, Journal of

Optimization Theory and Applications 88 (1996) 267–295.

[36] M.A. Sikora, J.J. Cruz, in: IEEE Proceedings of Conference on Decislon and Control, 1997, pp. 54–59.

[37] T.-Y. Li, Z. Gajic, Lyapunov Iterations for Solving Coupled Algebraic Riccati Equations of Nash

Differential Games and Algebraic Riccati Equations of Zero-Sum Games, New Trends in Dynamic Games

and Applications, Birkhauser, Boston, 1995, pp. 334–351.

Please cite this article as: A. Ferreira de Loza, et al., Robust output Nash strategies based on sliding mode

observation in a two-player differential game, J. Franklin Inst. (2011), doi:10.1016/j.jfranklin.2011.07.003