INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
Int. J. Robust Nonlinear Control 2004; 14:379–393 (DOI: 10.1002/rnc.888)
On optimal control of constrained linear systems with imperfect state information and stochastic disturbances
Tristan Perez*,†, Hernan Haimovich and Graham C. Goodwin
Centre for Integrated Dynamics and Control, School of Electrical Engineering and Computer Science,
The University of Newcastle, University Drive, Callaghan, NSW 2308, Australia
SUMMARY
We address the problem of finite horizon optimal control of discrete-time linear systems with input constraints and uncertainty. The uncertainty for the problem analysed is related to incomplete state information (output feedback) and stochastic disturbances. We analyse the complexities associated with finding optimal solutions. We also consider two suboptimal strategies that could be employed for larger optimization horizons. Copyright © 2004 John Wiley & Sons, Ltd.
KEY WORDS: stochastic optimal control; constrained control; output feedback
1. INTRODUCTION
Control problems where variables need to satisfy constraints and where the control action also has to minimize a cost arise naturally in many different areas of Science, especially Control Engineering. One particular technique that addresses such problems is model predictive control (MPC). This technique offers a framework to deal with constrained control problems for which the classical off-line computation of control laws is rendered difficult or impossible [1]. Because of this, MPC has become a widely adopted technique to address advanced control problems in industrial applications [2].
When controlling physical systems, the constrained control problem often becomes more involved because of the presence of uncertainty. In this context, uncertainty can be classified according to three sources:
* Incomplete state information to implement the control (output feedback),
* Unmodelled dynamics,
* Disturbances.
In control problems for uncertain systems, there are predominantly two approaches to model uncertainty [3]: via a stochastic description and via a set-membership description. The stochastic approach describes uncertainty in terms of probability distributions associated with the uncertain quantities (disturbances, initial conditions, parameters, etc.). The alternative set-membership approach describes uncertainty only in terms of some information regarding the sets in which the uncertain quantities (disturbances, parameters, etc.) take values.

*Correspondence to: Tristan Perez, Centre for Integrated Dynamics and Control, School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia
†E-mail: [email protected]
Within the framework of MPC, the set-membership description of uncertainty has been advocated in the literature [4, 5, 6]. For example, in Reference [5] a strategy based on set-valued observers is proposed. This strategy uses an observer that assumes unknown but bounded disturbances and generates a set of possible states based on past output and input information. Then, to each estimated state a set of control values that guarantees the satisfaction of the constraints is associated. The control is chosen so as to belong to the intersection of all the control value sets. In Reference [6], an extension of the dual-mode paradigm [7] is presented by using invariant sets of estimation errors for the case of unknown but bounded measurement noise and disturbances.
On the other hand, the problem of constrained control of uncertain systems with a stochastic uncertainty description is still not fully resolved within the framework of MPC [1]. In Reference [8], output feedback predictive control of nonlinear systems with uncertain parameters is addressed. In this work, there are no constraints associated with the control and, due to the difficulties of the optimization for a general nonlinear system, only suboptimal solutions are considered. In Reference [9], state feedback control of unconstrained linear systems with uncertain parameters is addressed. In Reference [10], the case of state feedback, input constraints and scalar disturbances is considered. A randomized algorithm (Monte Carlo sampling) is used to approximate the optimal solution. Examples for an optimization horizon of only one time instant are presented. In [11], the same authors analyse a similar case, but where state constraints are also present. They utilize a short optimization horizon (five time instants) and implement the solution over a longer horizon in a receding horizon manner.
When there is incomplete state information, an observer-based strategy that seems natural for control in the presence of stochastic disturbances is the one that uses the so-called Certainty Equivalence (CE) principle. Specifically, CE consists of estimating the state and then using these estimates as if they were the true state in the control law that results if the problem were formulated as a deterministic problem (no uncertainty). This strategy is motivated by the unconstrained problem with quadratic cost, for which the obtained strategy is indeed optimal [12, 13]. The use of CE in MPC leads to CE-MPC and, due to its simplicity, this strategy has been advocated in the literature [14] and reported in a number of applications [15, 16, 17]. Notwithstanding the widespread use of CE-MPC in applications, it must be stressed that CE-MPC generally results in a suboptimal control strategy. Two factors can be highlighted that render CE-MPC suboptimal: (1) the state estimate is assumed to be the true current state, and (2) the stochastic behaviour of the system is neglected over the prediction horizon.
Our principal aim in the current work is to analyse the extent to which truly optimal solutions can be found for single-input linear systems with input constraints and uncertainty related to output feedback and stochastic disturbances.
The rest of the paper is organized as follows. In Section 2, we state the optimal control problem for the case considered. In Section 3.1, we find the optimal solution for the case when the prediction horizon consists of only one time instant. In Section 3.2, we point to the complications that appear when the prediction horizon is increased to two time instants. Here, we also discuss the implications of practical implementation even for cases where the estimation part can be solved explicitly. This motivates the study of different suboptimal strategies in Section 4. An example is presented in Section 5 and conclusions are drawn in Section 6.
2. PROBLEM STATEMENT
We consider the following time-invariant discrete-time linear system with disturbances:
$$x_{k+1} = Ax_k + Bu_k + w_k,$$
$$y_k = Cx_k + v_k, \quad (1)$$
where $x_k, w_k \in \mathbb{R}^n$ and $u_k, y_k, v_k \in \mathbb{R}$. The control $u_k$ is constrained to take values in the following set:
$$\mathbb{U} = \{u : -\Delta \le u \le \Delta\} \subset \mathbb{R} \quad (2)$$
for a given constant value $\Delta > 0$. The disturbances $w_k$ and $v_k$ are sequences of independent, identically distributed random vectors, with probability density functions $p_w(\cdot)$ and $p_v(\cdot)$, respectively. The initial state, $x_0$, is characterized by a given probability density function (pdf) $p_{x_0}(\cdot)$. We assume that the pair $(A, B)$ is reachable and that the pair $(A, C)$ is observable.
We further assume that, at the time instant $k$, the value of the state $x_k$ is not available to the controller. Instead, the following sets of past inputs and outputs are available to the controller at the time instant $k$:
$$I^k = \begin{cases} \{y_0\} & \text{if } k = 0, \\ \{y_0, y_1, u_0\} & \text{if } k = 1, \\ \{y_0, y_1, y_2, u_0, u_1\} & \text{if } k = 2, \\ \quad\vdots & \\ \{y_0, y_1, \dots, y_{N-1}, u_0, u_1, \dots, u_{N-2}\} & \text{if } k = N-1. \end{cases} \quad (3)$$
Then, $I^k \in \mathbb{R}^{2k+1}$, and also $I^{k+1} = \{I^k, y_{k+1}, u_k\}$, where $I^k \subset I^{k+1}$. It is clear that $I^k$ contains all the information available to the controller at the time instant $k$, and hence it is called the information vector.
For the system defined above, together with the assumptions made, we seek to minimize the cost‡
$$E\left\{ g_T(x_N) + \sum_{k=0}^{N-1} g(x_k, u_k) \right\}, \quad (4)$$
where the terminal cost $g_T(\cdot)$ and the cost per stage $g(\cdot, \cdot)$ are given by
$$g_T(x_N) = x_N^T Sx_N,$$
$$g(x_k, u_k) = x_k^T Qx_k + Ru_k^2, \quad (5)$$
and N is called the optimization horizon.
‡Due to the stochastic nature of the system, the expression $g_T(x_N) + \sum_{k=0}^{N-1} g(x_k, u_k)$ becomes a random variable. Hence, it is only meaningful to formulate the minimization problem in terms of its statistics. From a practical viewpoint, it is of interest to minimize the expected value of this expression.
The result of this minimization will be a sequence of functions $\{\pi_0^{opt}(\cdot), \pi_1^{opt}(\cdot), \dots, \pi_{N-1}^{opt}(\cdot)\}$ that enable the controller to calculate the desired optimal control action depending on the information available to the controller at each time instant $k$, i.e. $u_k^{opt} = \pi_k^{opt}(I^k)$. These functions must also ensure that the constraints are always satisfied. This motivates the following definition.
Definition 1 (Admissible policies for incomplete state information)
A policy $\Pi_N$ is a finite sequence of functions $\pi_k(\cdot): \mathbb{R}^{2k+1} \to \mathbb{R}$ for $k = 0, 1, \dots, N-1$, i.e.
$$\Pi_N = \{\pi_0(\cdot), \pi_1(\cdot), \dots, \pi_{N-1}(\cdot)\}.$$
A policy $\Pi_N$ is called an "admissible control policy" if and only if
$$\pi_k(I^k) \in \mathbb{U} \quad \forall I^k \in \mathbb{R}^{2k+1}, \ \text{for } k = 0, \dots, N-1.$$
Further, the class of all admissible control policies will be denoted by
$$\bar{\Pi}_N = \{\Pi_N : \Pi_N \text{ admissible}\}. \qquad \square$$
The optimal control problem we address can then be stated as follows.
Definition 2 (Stochastic finite-horizon optimal control problem)
Given the pdfs $p_{x_0}(\cdot)$, $p_w(\cdot)$ and $p_v(\cdot)$ of the initial state $x_0$ and the disturbances $w_k$ and $v_k$, respectively, the problem considered is that of finding the control policy $\Pi_N^{opt}$, called the optimal control policy, belonging to the class of all admissible control policies $\bar{\Pi}_N$, that minimizes the cost
$$V_N(\Pi_N) = E_{\substack{x_0, w_k, v_k \\ k=0,1,\dots,N-1}}\left\{ g_T(x_N) + \sum_{k=0}^{N-1} g(x_k, \pi_k(I^k)) \right\} \quad (6)$$
subject to the constraints
$$x_{k+1} = Ax_k + B\pi_k(I^k) + w_k, \quad (7)$$
$$y_k = Cx_k + v_k, \quad (8)$$
$$I^{k+1} = \{I^k, y_{k+1}, u_k\}, \quad (9)$$
for $k = 0, \dots, N-1$, where the terminal cost $g_T(\cdot)$ and the cost per stage $g(\cdot, \cdot)$ are given by
$$g_T(x_N) = x_N^T Sx_N,$$
$$g(x_k, \pi_k(I^k)) = x_k^T Qx_k + R\pi_k^2(I^k), \quad (10)$$
for $S, R > 0$ and $Q \ge 0$. We can thus express the optimal control policy as
$$\Pi_N^{opt} = \arg\inf_{\Pi_N \in \bar{\Pi}_N} V_N(\Pi_N), \quad (11)$$
with the following resulting optimal cost:
$$V_N^{opt} = \inf_{\Pi_N \in \bar{\Pi}_N} V_N(\Pi_N). \quad (12) \qquad \square$$
Remark 2.1
We would like to emphasize that the optimization problem thus stated takes into account the fact that new information will be available to the controller at future time instants. This is called closed-loop optimization and can be distinguished from open-loop optimization, where the control values $\{u_0, u_1, \dots, u_{N-1}\}$ are selected all at once, at stage 0 [13]. For deterministic systems, in which there is no uncertainty, it is not necessary to make this distinction, and minimizing the cost over all sequences of controls or over all control policies yields the same result. $\square$
Throughout this work, the matrix $S$ will be adopted as a solution of the following algebraic Riccati equation [18]:
$$S = A^TSA + Q - K^T\bar{R}K, \quad (13)$$
where
$$K \triangleq \bar{R}^{-1}B^TSA, \qquad \bar{R} \triangleq R + B^TSB. \quad (14)$$
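For reference, a solution of (13)-(14) can be approximated numerically by simply iterating the Riccati recursion to a fixed point. The following sketch is our own illustrative code (the function name, initialization and iteration count are arbitrary choices, not from the paper):

```python
import numpy as np

def riccati_gain(A, B, Q, R, iters=1000):
    """Fixed-point iteration of the Riccati equation (13),
    S = A^T S A + Q - K^T R_bar K, with K and R_bar as in (14)."""
    S = np.eye(A.shape[0])
    for _ in range(iters):
        R_bar = R + B.T @ S @ B                  # (1, 1) for a single input
        K = np.linalg.solve(R_bar, B.T @ S @ A)  # K = R_bar^{-1} B^T S A
        S = A.T @ S @ A + Q - K.T @ R_bar @ K
    return S, K, R_bar
```

With the example system of (59) and illustrative weights $Q = I$, $R = 1$, the returned $S$ satisfies (13) to machine precision.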
3. OPTIMAL SOLUTIONS
A key aspect of the problem described in the previous section is that control actions cannot be calculated without taking into account the future consequences of the current decision, since the action taken at a particular stage affects all the subsequent stages. The formulated problem falls into the class of so-called sequential decision problems under uncertainty [3, 13], and the only general approach to tackle such problems is dynamic programming (DP).
DP is an algorithm based on Bellman's principle of optimality [19] that proceeds sequentially backwards in time by solving optimization subproblems. We next briefly show how this algorithm is used to solve the problem stated previously. For details, the reader is referred to [3, 13, 19].
We start by defining the functions
$$\tilde{g}_{N-1}(I^{N-1}, \pi_{N-1}(I^{N-1})) \triangleq E\{g_T(x_N) + g(x_{N-1}, \pi_{N-1}(I^{N-1})) \mid I^{N-1}, \pi_{N-1}(I^{N-1})\},$$
$$\tilde{g}_k(I^k, \pi_k(I^k)) \triangleq E\{g(x_k, \pi_k(I^k)) \mid I^k, \pi_k(I^k)\}. \quad (15)$$
The DP algorithm for the case of incomplete state information can be expressed via the following sequential optimization subproblems:
$$SOP_{N-1}: \quad J_{N-1}(I^{N-1}) = \inf_{u_{N-1}\in\mathbb{U}} \tilde{g}_{N-1}(I^{N-1}, u_{N-1})$$
$$\text{subject to } x_N = Ax_{N-1} + Bu_{N-1} + w_{N-1} \quad (16)$$
and
$$SOP_k: \quad J_k(I^k) = \inf_{u_k\in\mathbb{U}} \left[\tilde{g}_k(I^k, u_k) + E\{J_{k+1}(I^{k+1}) \mid I^k, u_k\}\right]$$
$$\text{subject to } x_{k+1} = Ax_k + Bu_k + w_k \ \text{ and } \ I^{k+1} = \{I^k, y_{k+1}, u_k\} \quad (17)$$
for $k = N-2, N-3, \dots, 0$. The optimization process starts at stage $N-1$ by solving $SOP_{N-1}$ for all possible values of $I^{N-1}$. In this way, the law $\pi_{N-1}^{opt}(\cdot)$ is found, in the sense that, given the value of $I^{N-1}$, the corresponding optimal control is the value $u_{N-1} = \pi_{N-1}^{opt}(I^{N-1})$, the minimizer of $SOP_{N-1}$. The procedure then continues in order to find $\pi_{N-2}^{opt}(\cdot), \dots, \pi_0^{opt}(\cdot)$ by solving $SOP_{N-2}, \dots, SOP_0$. After the last optimization subproblem is solved, the optimal control policy $\Pi_N^{opt}$ is obtained and the optimal cost (cf. (12)) is
$$V_N^{opt} = V_N(\Pi_N^{opt}) = E\{J_0(I^0)\} = E\{J_0(y_0)\}. \quad (18)$$
3.1. Optimal solution for N = 1

We next apply the DP algorithm to obtain the optimal solution for the case where the prediction horizon is equal to only one time instant.
Proposition 1
For $N = 1$, the solution of the optimal control problem stated in Definition 2 is of the form $\Pi_1^{opt} = \{\pi_0^{opt}(\cdot)\}$, with
$$u_0^{opt} = \pi_0^{opt}(I^0) = -\mathrm{sat}_\Delta(KE\{x_0 \mid I^0\}), \quad \forall I^0 \in \mathbb{R}, \quad (19)$$
where $K$ was defined in (14) and $\mathrm{sat}_\Delta: \mathbb{R} \to \mathbb{R}$ is defined as
$$\mathrm{sat}_\Delta(z) = \begin{cases} \Delta & \text{if } z > \Delta, \\ z & \text{if } |z| \le \Delta, \\ -\Delta & \text{if } z < -\Delta. \end{cases} \quad (20)$$
Moreover, the last step in the DP algorithm has the form
$$J_0(I^0) = E\{x_0^T Sx_0 \mid I^0\} + \bar{R}F_\Delta(KE\{x_0 \mid I^0\}) + \mathrm{tr}(K^TK\,\mathrm{cov}\{x_0 \mid I^0\}) + E\{w_0^T Sw_0\}, \quad (21)$$
where $F_\Delta: \mathbb{R} \to \mathbb{R}$ is given by
$$F_\Delta(z) = [z - \mathrm{sat}_\Delta(z)]^2. \quad (22)$$
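The saturation function (20), the function $F_\Delta$ of (22), and the $N = 1$ law (19) are straightforward to code. A minimal sketch (our own; the names and the constraint level $\Delta = 1$ are illustrative choices, and the conditional mean $E\{x_0 \mid I^0\}$ is assumed supplied by an estimator):

```python
import numpy as np

DELTA = 1.0  # constraint level Delta; illustrative value

def sat(z, delta=DELTA):
    # Saturation function of Eq. (20).
    return np.clip(z, -delta, delta)

def F(z, delta=DELTA):
    # F_Delta(z) = [z - sat_Delta(z)]^2, Eq. (22): the excess cost incurred
    # when the unconstrained minimizer lies outside the constraint set.
    return (z - sat(z, delta)) ** 2

def u0_opt(K, x0_mean, delta=DELTA):
    # Optimal N = 1 law, Eq. (19): saturated linear feedback on the
    # conditional mean E{x0 | I0}.
    return -sat((K @ x0_mean).item(), delta)
```

Note that $F_\Delta$ vanishes whenever the unconstrained minimizer already satisfies the constraint, which is why Certainty Equivalence holds for $N = 1$.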
Proof
The only optimization subproblem we have to solve is $SOP_0$ (cf. (16)) because $N = 1$:
$$J_0(I^0) = \inf_{u_0\in\mathbb{U}} E\{g_T(x_1) + g(x_0, u_0) \mid I^0, u_0\}$$
$$= \inf_{u_0\in\mathbb{U}} E\{(Ax_0 + Bu_0 + w_0)^T S(Ax_0 + Bu_0 + w_0) + x_0^T Qx_0 + Ru_0^2 \mid I^0, u_0\}. \quad (23)$$
By distributing and grouping terms, and also using the fact that $E\{w_0 \mid I^0, u_0\} = E\{w_0\} = 0$ and that $w_0$ is correlated with neither the state $x_0$ nor the control $u_0$, the latter can be expressed as
$$J_0(I^0) = \inf_{u_0\in\mathbb{U}} E\{x_0^T(A^TSA + Q)x_0 + 2u_0B^TSAx_0 + (B^TSB + R)u_0^2 \mid I^0, u_0\} + E\{w_0^T Sw_0\}. \quad (24)$$
Further, using expressions (13) and (14), we can write
$$J_0(I^0) = \inf_{u_0\in\mathbb{U}} \left[E\{x_0^T Sx_0 + \bar{R}(x_0^TK^TKx_0 + 2u_0Kx_0 + u_0^2) \mid I^0, u_0\}\right] + E\{w_0^T Sw_0\}$$
$$= \bar{R}\inf_{u_0\in\mathbb{U}} \left[E\{(u_0 + Kx_0)^2 \mid I^0, u_0\}\right] + E\{x_0^T Sx_0 \mid I^0\} + E\{w_0^T Sw_0\}, \quad (25)$$
where we have used the fact that the conditional pdf of $x_0$ given $I^0$ and $u_0$ is equal to the one where only $I^0$ is given. Finally, using properties of the expected value of quadratic forms (see, for example, [12]), the optimization to solve becomes
$$J_0(I^0) = \bar{R}\inf_{u_0\in\mathbb{U}} \left[(u_0 + KE\{x_0 \mid I^0\})^2\right] + \mathrm{tr}(K^TK\,\mathrm{cov}\{x_0 \mid I^0\}) + E\{x_0^T Sx_0 \mid I^0\} + E\{w_0^T Sw_0\}. \quad (26)$$
From (26), it follows that the unconstrained minimum is attained at $u_0^{uc} = -KE\{x_0 \mid I^0\}$. For the constrained case, the result of Equation (19) follows from the convexity of the quadratic function. The final step of DP is obtained by substituting (19) back into (26), i.e.
$$J_0(I^0) = \bar{R}F_\Delta(KE\{x_0 \mid I^0\}) + \mathrm{tr}(K^TK\,\mathrm{cov}\{x_0 \mid I^0\}) + E\{x_0^T Sx_0 \mid I^0\} + E\{w_0^T Sw_0\}, \quad (27)$$
with the function $F_\Delta(\cdot)$ defined in (22). $\square$
Remark 3.1
It is worth noting that when $N = 1$ the optimal control law $\pi_0^{opt}$ depends on the information $I^0$ only through the conditional expectation $E\{x_0 \mid I^0\}$. Therefore, this conditional expectation is a sufficient statistic for the problem considered, i.e. it provides all the necessary information to implement the control. $\square$
Remark 3.2
The control law given in (19) is the same law that would be optimal if we considered the case in which the state is measured (complete state information) and the disturbance $w_k$ is still acting on the system. Further, the same law would also be the optimal control law if, apart from measuring the state, $w_k$ were set equal to a fixed value or to its mean value (see Reference [20] for the solution in the deterministic and complete-information case). Therefore, Certainty Equivalence holds for the optimization horizon $N = 1$, i.e. the optimal control law is the same law that would result if the optimal control problem were considered with some or all uncertain quantities set to a fixed value. $\square$
3.2. Optimal solution for N = 2

We now consider the case where the optimization horizon is increased to two time instants, i.e. $N = 2$.
Proposition 2
For $N = 2$, the solution of the optimal control problem stated in Definition 2 is of the form $\Pi_2^{opt} = \{\pi_0^{opt}(\cdot), \pi_1^{opt}(\cdot)\}$, with
$$u_1^{opt} = \pi_1^{opt}(I^1) = -\mathrm{sat}_\Delta(KE\{x_1 \mid I^1\}), \quad \forall I^1 \in \mathbb{R}^3, \quad (28)$$
$$u_0^{opt} = \pi_0^{opt}(I^0) = \arg\inf_{u_0\in\mathbb{U}} \left[(u_0 + KE\{x_0 \mid I^0\})^2 + \bar{R}E\{F_\Delta(KE\{x_1 \mid I^1\}) \mid I^0, u_0\}\right], \quad \forall I^0 \in \mathbb{R}, \quad (29)$$
where $F_\Delta(\cdot)$ is given in (22).
Proof
The first step of the DP algorithm (cf. (16)) gives
$$J_1(I^1) = \inf_{u_1\in\mathbb{U}} E\{g_T(Ax_1 + Bu_1 + w_1) + g(x_1, u_1) \mid I^1, u_1\}. \quad (30)$$
We see that $J_1(I^1)$ is similar to $J_0(I^0)$ in (27). By comparison, we readily obtain
$$\pi_1^{opt}(I^1) = -\mathrm{sat}_\Delta(KE\{x_1 \mid I^1\}),$$
$$J_1(I^1) = \bar{R}F_\Delta(KE\{x_1 \mid I^1\}) + \mathrm{tr}(K^TK\,\mathrm{cov}\{x_1 \mid I^1\}) + E\{x_1^T Sx_1 \mid I^1\} + E\{w_1^T Sw_1\}. \quad (31)$$
The second step of the DP algorithm proceeds as follows:
$$J_0(I^0) = \inf_{u_0\in\mathbb{U}} \left[E\{g(x_0, u_0) \mid I^0, u_0\} + E\{J_1(I^1) \mid I^0, u_0\}\right]$$
$$\text{subject to } x_1 = Ax_0 + Bu_0 + w_0, \quad I^1 = \{I^0, y_1, u_0\}. \quad (32)$$
The latter can be written as
$$J_0(I^0) = \inf_{u_0\in\mathbb{U}} \big[E\{x_0^T Qx_0 + Ru_0^2 \mid I^0, u_0\} + E\{E\{x_1^T Sx_1 \mid I^1\} \mid I^0, u_0\}$$
$$+\ \bar{R}E\{F_\Delta(KE\{x_1 \mid I^1\}) \mid I^0, u_0\} + \mathrm{tr}(K^TK\,\mathrm{cov}\{x_1 \mid I^1\}) + E\{w_1^T Sw_1\}\big]. \quad (33)$$
Since $\{I^0, u_0\} \subset I^1$, using the properties of successive conditioning [21], we can express the second term of (33) as
$$E\{E\{x_1^T Sx_1 \mid I^1\} \mid I^0, u_0\} = E\{x_1^T Sx_1 \mid I^0, u_0\}$$
$$= E\{(Ax_0 + Bu_0 + w_0)^T S(Ax_0 + Bu_0 + w_0) \mid I^0, u_0\}. \quad (34)$$
Using this, expression (33) becomes
$$J_0(I^0) = \inf_{u_0\in\mathbb{U}} \big[E\{x_0^T Qx_0 + Ru_0^2 + (Ax_0 + Bu_0 + w_0)^T S(Ax_0 + Bu_0 + w_0) \mid I^0, u_0\}$$
$$+\ \bar{R}E\{F_\Delta(KE\{x_1 \mid I^1\}) \mid I^0, u_0\} + \mathrm{tr}(K^TK\,\mathrm{cov}\{x_1 \mid I^1\}) + E\{w_1^T Sw_1\}\big]. \quad (35)$$
Note that the first part of (35) is identical to (23). Therefore, using (26), expression (35) can be written as
$$J_0(I^0) = \inf_{u_0\in\mathbb{U}} \left[(u_0 + KE\{x_0 \mid I^0\})^2 + \bar{R}E\{F_\Delta(KE\{x_1 \mid I^1\}) \mid I^0, u_0\}\right]$$
$$+\ E\{x_0^T Sx_0 \mid I^0\} + \sum_{j=0}^{1} \left[\mathrm{tr}(K^TK\,\mathrm{cov}\{x_j \mid I^j\}) + E\{w_j^T Sw_j\}\right], \quad (36)$$
where $\mathrm{tr}(K^TK\,\mathrm{cov}\{x_1 \mid I^1\})$ has been left out of the minimization because it is not affected by $u_0$ due to the linearity of the system equations [3, 22]. By considering only the terms that are affected by $u_0$, we find the result given in Equation (29). $\square$
In order to proceed with the optimization and to obtain an explicit expression for $\pi_0^{opt}$, we need to know how to express $E\{x_1 \mid I^1\} = E\{x_1 \mid I^0, y_1, u_0\}$ explicitly as a function of $I^0$, $u_0$ and $y_1$. The optimal law $\pi_0^{opt}(\cdot)$ depends on $I^0$ not only through $E\{x_0 \mid I^0\}$, as was the case for $N = 1$. It can be shown that, even for the case of Gaussian disturbances, the optimal control law $\pi_0^{opt}(\cdot)$ depends on $\mathrm{cov}\{x_0 \mid I^0\}$ as well, when input constraints are present [23].
To calculate $E\{x_1 \mid I^1\}$, the conditional pdf $p_{x_1}(\cdot \mid I^1)$ must be found. At any time instant $k$, the conditional pdfs $p_{x_k}(\cdot \mid I^k)$ satisfy the following recursion.

Time update (Chapman–Kolmogorov equation):
$$p_{x_k}(x_k \mid I^{k-1}, u_{k-1}) = \int p_{x_k}(x_k \mid x_{k-1}, u_{k-1})\, p_{x_{k-1}}(x_{k-1} \mid I^{k-1}, u_{k-1})\, dx_{k-1}. \quad (37)$$

Observation update:
$$p_{x_k}(x_k \mid I^k) = p_{x_k}(x_k \mid I^{k-1}, y_k, u_{k-1}) = \frac{p_{y_k}(y_k \mid x_k)\, p_{x_k}(x_k \mid I^{k-1}, u_{k-1})}{p_{y_k}(y_k \mid I^{k-1}, u_{k-1})}, \quad (38)$$

where
$$p_{y_k}(y_k \mid I^{k-1}, u_{k-1}) = \int p_{y_k}(y_k \mid x_k)\, p_{x_k}(x_k \mid I^{k-1}, u_{k-1})\, dx_k. \quad (39)$$
Remark 3.3
Finding the conditional pdfs that satisfy the recursion given by Equations (37) and (38) in explicit form may be very difficult or even impossible, depending on the pdfs of the initial state and the disturbances. For the case in which they are all Gaussian, however, all the conditional densities that satisfy (37) and (38) are also Gaussian. In this particular case, (37) and (38) lead to a recursive algorithm in terms of the (conditional) expectation and covariance (which completely define any Gaussian pdf) known as the Kalman Filter algorithm [24, 25]. $\square$
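In the Gaussian case mentioned in the remark, one pass of the recursion (37)-(38) reduces to the familiar Kalman filter time/observation update for model (1). A minimal sketch under that Gaussian assumption (function and variable names are ours; `Qw` and `Rv` denote the process and measurement noise covariances):

```python
import numpy as np

def kf_step(m, P, u, y, A, B, C, Qw, Rv):
    """One Kalman filter step for model (1): the Gaussian special case
    of the recursion (37)-(38), propagating the conditional mean m and
    covariance P of the state."""
    # Time update (Chapman-Kolmogorov, Eq. (37)).
    m_pred = A @ m + B.flatten() * u
    P_pred = A @ P @ A.T + Qw
    # Observation update (Bayes' rule, Eq. (38)), scalar output assumed.
    Sy = (C @ P_pred @ C.T + Rv).item()   # innovation variance
    L = P_pred @ C.T / Sy                 # Kalman gain, shape (n, 1)
    m_new = m_pred + L.flatten() * (y - (C @ m_pred).item())
    P_new = P_pred - np.outer(L, C @ P_pred)
    return m_new, P_new
```

The updated covariance stays symmetric and positive definite, consistent with its role as $\mathrm{cov}\{x_k \mid I^k\}$.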
Due to the way the information enters the conditional pdfs, it is very difficult to obtain an explicit form for the optimal control. On the other hand, the computational implementation of such optimal control may also be rather involved and computationally demanding, even in the case where the recursion given by (37) and (38) can be found in explicit form. To demonstrate this point, let us illustrate a possible way of implementing the optimal controller in such a case.
We start by discretizing the set $\mathbb{U}$ of admissible control values such that the discretized set, $\mathbb{U}_d = \{u_{0i} \in \mathbb{U} : i = 1, 2, \dots, r\}$, consists of only a finite number $r$ of quantities. The discretization allows the optimal control to be approximated as
$$u_0^{opt} \approx \arg\inf_{u_0\in\mathbb{U}_d} \left[(u_0 + KE\{x_0 \mid I^0\})^2 + \bar{R}E\{F_\Delta(KE\{x_1 \mid I^1\}) \mid I^0, u_0\}\right]. \quad (40)$$
Then, in order to solve the minimization, only a finite number of values of $u_0$ have to be considered for a given $I^0 = y_0$. For every value $u_{0i}$ of $u_0$, the expression between square brackets in (40) must be evaluated. To accomplish this, the following steps are envisaged:
1. We need to evaluate the function $W_2(\cdot)$ given by
$$W_2(y_0) = E\{x_0 \mid y_0\} = \int x_0\, p_{x_0}(x_0 \mid y_0)\, dx_0, \quad (41)$$
for the given value of $y_0$.
2. Given $y_0$, $p_v(\cdot)$, $p_w(\cdot)$ and $p_{x_0}(\cdot)$, we need to obtain $p_{y_0}(\cdot \mid x_0)$ and $p_{x_1}(\cdot \mid y_0, y_1, u_0)$ in explicit form (as assumed) using recursions (37) and (38) together with the system and measurement equations.
3. Using $p_{x_1}(\cdot \mid y_0, y_1, u_0)$, the expectation $E\{x_1 \mid y_0, y_1, u_0\}$ can be expressed as
$$h(y_0, y_1, u_0) = E\{x_1 \mid y_0, y_1, u_0\} = \int x_1\, p_{x_1}(x_1 \mid y_0, y_1, u_0)\, dx_1, \quad (42)$$
which may not be expressible in explicit form even if (37) and (38) are. However, it may be evaluated numerically if $y_0$, $y_1$ and $u_0$ were given.
4. Using $p_{x_0}(\cdot \mid y_0)$ and $p_v(\cdot)$, together with the measurement equation, we can obtain $p_{y_1}(\cdot \mid y_0, u_0)$. We can now express the expectation $E\{F_\Delta(Kh(y_0, y_1, u_0)) \mid I^0, u_0\}$ as
$$W_1(y_0, u_0) = E\{F_\Delta(Kh(y_0, y_1, u_0)) \mid I^0, u_0\} = \int F_\Delta(Kh(y_0, y_1, u_0))\, p_{y_1}(y_1 \mid y_0, u_0)\, dy_1. \quad (43)$$
Note that, in order to calculate this integral numerically, the function $h(y_0, y_1, u_0)$ has to be calculated for different values of $y_1$, even if $u_0$ and $y_0$ were given.
To find the cost incurred for one of the $r$ values $u_{0i}$, expressions (41)-(43) may need further discretizations (for $x_0$, $x_1$ and $y_1$, respectively). Additionally, the fact that $x_0, x_1 \in \mathbb{R}^n$ may complicate the matter further if $n > 1$.
From the previous comments, it is evident that the approximation of the optimal solution can be very computationally demanding, depending on the discretizations performed. Note that in the above steps all the pdfs are assumed known in explicit form so that the integrals can be evaluated. As already mentioned in Remark 3.3, this may not always be possible.
However, an alternative approach to brute-force discretization is the use of Markov Chain Monte Carlo (MCMC) methods [26], which are based on approximating continuous pdfs by discrete ones, by drawing samples from the pdfs in question or from other approximations. These methods could be applied in this case, but it must be stressed that the exponential growth in the number of computations (as the optimization horizon is increased) seems to be unavoidable save for some very particular cases. The application of these methods to the recursion given by (37) and (38) gives rise to a special case of the so-called particle filters [27].
Due to the apparent impossibility of analytically proceeding with the optimization for horizons greater than two, and also due to the intricacies of the implementation of the optimal law even for $N = 2$, one often has to resort to the use of suboptimal solutions. In the next section, we analyse two alternative suboptimal strategies.
4. SUBOPTIMAL STRATEGIES
4.1. Certainty equivalent control (CEC)
As mentioned in the introduction, CEC is a control strategy in which, at each stage, the control applied is the optimal control of an associated deterministic problem derived from the original problem by removing all uncertainty. Specifically, the associated problem is derived by setting the disturbance $w_k$ to a fixed typical value, e.g. $\bar{w} = E\{w_k\}$, and by also assuming perfect state information. After finding the solution to the associated deterministic problem, the control to apply is implemented using some estimate of the state $\hat{x}(I^k)$ in place of the true state (which was assumed to be known for solving the associated problem). That is, we obtain the optimal policy for the deterministic problem (DET)
$$\Pi_N^{DET} = \{\pi_0^{DET}(\cdot), \dots, \pi_{N-1}^{DET}(\cdot)\}, \quad (44)$$
where $\pi_k^{DET}: \mathbb{R}^n \to \mathbb{R}$ for $k = 0, 1, \dots, N-1$. Then, the CEC evaluates the deterministic laws at the estimate of the state, i.e.
$$u_k^{CE} = \pi_k^{DET}(\hat{x}(I^k)). \quad (45)$$
The associated deterministic problem for linear systems with a quadratic cost is an example of a case where the control policy can be obtained explicitly for any finite optimization horizon [28]. The following example illustrates this for an optimization horizon $N = 2$; for a procedure to obtain the pre-computable laws for larger optimization horizons, see [28, 29].
Example 1 (Closed-loop CEC)
For $N = 2$, the deterministic policy $\Pi_2^{DET} = \{\pi_0^{DET}(\cdot), \pi_1^{DET}(\cdot)\}$ is given by [30]
$$\pi_1^{DET}(x) = -\mathrm{sat}_\Delta(Kx) \quad \forall x \in \mathbb{R}^n, \quad (46)$$
$$\pi_0^{DET}(x) = \begin{cases} -\mathrm{sat}_\Delta(Gx + H) & \text{if } x \in \mathcal{X}_1, \\ -\mathrm{sat}_\Delta(Kx) & \text{if } x \in \mathcal{X}_2, \\ -\mathrm{sat}_\Delta(Gx - H) & \text{if } x \in \mathcal{X}_3, \end{cases} \quad \forall x \in \mathbb{R}^n, \quad (47)$$
where $K$ is given by (14) and
$$G = \frac{K + KB\,KA}{1 + (KB)^2}, \qquad H = \frac{KB}{1 + (KB)^2}\,\Delta.$$
The sets $\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3$ form a partition of $\mathbb{R}^n$ and are given by
$$\mathcal{X}_1 = \{x \mid KA_Kx < -\Delta\}, \quad \mathcal{X}_2 = \{x \mid |KA_Kx| \le \Delta\}, \quad \mathcal{X}_3 = \{x \mid KA_Kx > \Delta\},$$
with
$$A_K = A - BK.$$
Therefore, a closed-loop CEC consists in using the controls
$$u_0^{CE} = \pi_0^{DET}(\hat{x}(I^0)), \qquad u_1^{CE} = \pi_1^{DET}(\hat{x}(I^1)). \quad (48)$$
4.2. Partially stochastic CEC (PS-CEC)
This variant of CEC treats the problem as one of perfect state information but takes stochastic disturbances into account. To solve the optimization problem, it is assumed that the state is, and will be, known to the controller. In reality, and as with CEC, the value of the state is provided by an estimator that receives the available information as input and generates $\hat{x}_k(I^k)$. A PS-CEC admissible policy
$$\Lambda_N = \{\lambda_0(\cdot), \dots, \lambda_{N-1}(\cdot)\} \quad (49)$$
is a sequence of admissible control laws $\lambda_k(\cdot): \mathbb{R}^n \to \mathbb{U}$ that map the (estimates of the) states into admissible control actions. The PS-CEC solves the following perfect state information problem.
Definition 3 (PS-CEC optimal control problem)
Assuming the state $\hat{x}_k$ will be available to the controller at time instant $k$ to calculate the control, and given the pdf $p_w(\cdot)$ of the disturbances $w_k$, find the admissible control policy $\Lambda_N^{opt}$ that minimizes the cost
$$\hat{V}_N(\Lambda_N) = E_{\substack{w_k \\ k=0,1,\dots,N-1}}\left\{ g_T(\hat{x}_N) + \sum_{k=0}^{N-1} g(\hat{x}_k, \lambda_k(\hat{x}_k)) \right\} \quad (50)$$
subject to $\hat{x}_{k+1} = A\hat{x}_k + Bu_k + w_k$ for $k = 0, 1, \dots, N-1$.
The optimal control policy for perfect state information thus found will be used, as in CEC, to calculate the control action based on the estimate $\hat{x}_k$ provided by the estimator, i.e.
$$u_k = \lambda_k(\hat{x}_k). \quad (51) \qquad \square$$
Next, we apply this suboptimal strategy to the problem of interest.
4.2.1. PS-CEC for $N = 1$. Using the DP algorithm, we have
$$\hat{J}_0(\hat{x}_0) = \inf_{u_0\in\mathbb{U}} E\{\hat{x}_1^T S\hat{x}_1 + \hat{x}_0^T Q\hat{x}_0 + Ru_0^2 \mid \hat{x}_0, u_0\}. \quad (52)$$
As with the true optimal solution for $N = 1$, the PS-CEC optimal control has the form
$$\hat{u}_0^{opt} = \lambda_0^{opt}(\hat{x}_0) = -\mathrm{sat}_\Delta(K\hat{x}_0), \quad (53)$$
$$\hat{J}_0(\hat{x}_0) = \hat{x}_0^T S\hat{x}_0 + \bar{R}F_\Delta(K\hat{x}_0) + E\{w_0^T Sw_0\}. \quad (54)$$
We can see that if $\hat{x}_0 = E\{x_0 \mid I^0\}$ then the PS-CEC for $N = 1$ coincides with the optimal solution.
4.2.2. PS-CEC for $N = 2$. The first step of the DP algorithm yields
$$\hat{u}_1^{opt} = \lambda_1^{opt}(\hat{x}_1) = -\mathrm{sat}_\Delta(K\hat{x}_1), \quad (55)$$
$$\hat{J}_1(\hat{x}_1) = \hat{x}_1^T S\hat{x}_1 + \bar{R}F_\Delta(K\hat{x}_1) + E\{w_1^T Sw_1\}. \quad (56)$$
For the second step, we have, after some algebra, that
$$\hat{J}_0(\hat{x}_0) = \inf_{u_0\in\mathbb{U}} \left[E\{g(\hat{x}_0, u_0) + \hat{J}_1(\hat{x}_1) \mid \hat{x}_0, u_0\}\right] \quad \text{subject to } \hat{x}_1 = A\hat{x}_0 + Bu_0 + w_0, \quad (57)$$
$$\hat{u}_0^{opt} = \arg\inf_{u_0\in\mathbb{U}} \left[(u_0 + K\hat{x}_0)^2 + \bar{R}E\{F_\Delta[K(A\hat{x}_0 + Bu_0 + w_0)] \mid \hat{x}_0, u_0\}\right]. \quad (58)$$
Comparing $\hat{u}_0^{opt}$ to expression (29) for the optimal control, we can appreciate that, given $\hat{x}_0$, even if $E\{F_\Delta[K(A\hat{x}_0 + Bu_0 + w_0)] \mid \hat{x}_0, u_0\}$ cannot be found in explicit form as a function of $u_0$, numerically computing this suboptimal control action is much less computationally demanding than its optimal counterpart.
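Equation (58) suggests a simple numerical implementation: grid the constraint set $\mathbb{U}$ and replace the expectation over $w_0$ by a sample average, much as done in the simulations of the next section. A sketch (names and defaults are our choices; `w_samples` are draws from $p_w$ supplied by the caller):

```python
import numpy as np

def ps_cec_u0(x0_hat, A, B, K, R_bar, delta, w_samples, n_u=201):
    """PS-CEC first control of Eq. (58): grid U and approximate the
    expectation over w0 by averaging over w_samples, an (m, n) array."""
    sat = lambda z: np.clip(z, -delta, delta)
    best_u, best_cost = None, np.inf
    for u0 in np.linspace(-delta, delta, n_u):
        x1 = A @ x0_hat + B.flatten() * u0 + w_samples   # (m, n) by broadcasting
        Kx1 = x1 @ K.flatten()
        cost = (u0 + (K @ x0_hat).item()) ** 2 \
            + R_bar * np.mean((Kx1 - sat(Kx1)) ** 2)
        if cost < best_cost:
            best_u, best_cost = u0, cost
    return best_u
```

Unlike the optimal law of (29), no conditional-pdf recursion is needed here: only forward simulation of $\hat{x}_1$ through the known model, which is what makes PS-CEC cheap.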
5. SIMULATION EXAMPLES
In this section we compare CEC and PS-CEC by means of simulation examples. The examples are aimed at assessing how close the different strategies are in performance, as evidenced by the cost function.

In order to assess the merits (or otherwise) of these strategies, the corresponding costs have to be evaluated. As previously mentioned, any function of the state and the control is a random variable, and the cost is defined as the expected value of one particular such function (quadratic, in this case). Hence, a comparison between the costs incurred by different policies is only meaningful in terms of these expected values. To compute values of the cost numerically for a given control policy, different realizations of the initial state and of the process and measurement disturbances have to be generated, and a corresponding realization of the cost evaluated for each. The expected value can then be approximated by averaging over these realizations.
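The averaging procedure just described can be sketched as follows. This is an illustrative fragment only (not the authors' code); `simulate_realization` is a hypothetical callable standing for one closed-loop run that returns a cost realization:

```python
import random

def average_cost(simulate_realization, num_runs, seed=0):
    """Approximate the expected cost of a policy by averaging cost
    realizations, each generated from its own draw of the initial
    state and the process/measurement disturbances."""
    rng = random.Random(seed)  # one stream drives all realizations
    total = 0.0
    for _ in range(num_runs):
        total += simulate_realization(rng)
    return total / num_runs
```

Using the same random stream (and seed) for the policies being compared reduces the variance of the estimated cost difference.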
The examples are simulated for the system

$$A = \begin{bmatrix} 0.9713 & 0.2189 \\ -0.2189 & 0.7524 \end{bmatrix}, \quad B = \begin{bmatrix} 0.0287 \\ 0.2189 \end{bmatrix}, \quad C = \begin{bmatrix} 0.3700 & 0.0600 \end{bmatrix}. \quad (59)$$
The disturbances $w_k$ are adopted to have a uniform distribution with support on $[-0.5, 0.5] \times [-1, 1]$, and likewise for $v_k$ with support on $[-0.1, 0.1]$. The initial state $x_0$ is normally distributed with zero mean and covariance $\mathrm{diag}(300^{-1}, 300^{-1})$. A Kalman filter was implemented to provide the state estimates needed. Although this estimator is not the optimal one in this case, because the disturbances are not normally distributed, it yields the best linear unbiased estimate of the state. The parameters for the Kalman filter were adopted as the true mean and covariance of the corresponding variables in the system. The saturation limit of the control, $\Delta$, was adopted to be equal to 1. The optimization horizon is $N = 2$ in both cases.
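For concreteness, the "true covariances" handed to the Kalman filter can be computed from the uniform supports above, using the fact that a uniform distribution on $[-a, a]$ has zero mean and variance $a^2/3$. A small sketch (the variable names are our own, not the paper's):

```python
def uniform_var(a):
    """Variance of a uniform distribution on [-a, a]: a^2 / 3."""
    return a * a / 3.0

# Noise parameters for the Kalman filter, matching the supports
# [-0.5, 0.5] x [-1, 1] for w_k and [-0.1, 0.1] for v_k above.
Q_w = [[uniform_var(0.5), 0.0],
       [0.0, uniform_var(1.0)]]   # process-noise covariance (diagonal)
R_v = uniform_var(0.1)            # measurement-noise variance
```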
For PS-CEC, we discretize the set $\mathcal{U}$ so that only 500 values are considered, and the expected value in (58) is approximated by taking 300 samples of the pdf $p_w(\cdot)$ for every possible value of $u_0$ in the discretized set. For CEC, we implement the policy given in Example 1.
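The sampled minimization of (58) can be sketched as below. This is illustrative only: the helper names are assumptions, and the gain $K$, weight $\bar{R}$ and function $F_\Delta$ come from the paper's earlier derivations.

```python
import random

def ps_cec_u0(x_hat0, A, B, K, R_bar, F_Delta, sample_w, u_grid,
              num_samples, seed=0):
    """Approximate the minimizer of (58):
    (u0 + K x_hat0)^2 + R_bar * E{ F_Delta[K (A x_hat0 + B u0 + w0)] },
    with the expectation replaced by an average over num_samples draws
    of w0 and the infimum over the discretized set u_grid."""
    rng = random.Random(seed)
    # Draw the disturbance samples once and reuse them for every
    # candidate u0 (common random numbers keep the comparison fair).
    w_samples = [sample_w(rng) for _ in range(num_samples)]

    def Kdot(x):  # scalar product of row vector K with state x
        return sum(k * xi for k, xi in zip(K, x))

    Kx0 = Kdot(x_hat0)
    best_u, best_cost = None, float("inf")
    for u0 in u_grid:
        avg = 0.0
        for w in w_samples:
            # one-step prediction x1 = A x_hat0 + B u0 + w0
            x1 = [sum(a * xi for a, xi in zip(row, x_hat0)) + b * u0 + wi
                  for row, b, wi in zip(A, B, w)]
            avg += F_Delta(Kdot(x1))
        cost = (u0 + Kx0) ** 2 + R_bar * avg / num_samples
        if cost < best_cost:
            best_u, best_cost = u0, cost
    return best_u
```

With 500 grid points and 300 samples per point, one control evaluation costs 150,000 one-step predictions, which is tractable, whereas the optimal solution would require nested expectations over the information state.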
We simulated the closed-loop system over only two time instants, but repeated the simulation a large number of times (between 2000 and 8000). A different realization of the disturbances and the initial state was used for each simulation run, and a realization of the cost was calculated under each of the two control policies (PS-CEC and CEC). The sample average of the cost for each policy was then computed; the difference between the two averages was always found to be less than 0.1%.
Remark 4.1
The examples simulated are of illustrative value only. However, the comparison between the costs of the two control policies suggests that the trade-off between performance and computational complexity favours the CEC implementation over PS-CEC. □
6. CONCLUDING REMARKS
We have addressed the optimal control problem for discrete-time input-constrained linear time-invariant systems with output feedback and stochastic disturbances. We have extended some of our previous work in the area [31], where we only considered Gaussian disturbances. The framework presented in this paper allows for a general stochastic description of the disturbances. However, the intricacies and complexities of the optimal solution in such a case prevent the evaluation of explicit solutions even for optimization horizons as short as two time instants.
It would be of interest, from a practical standpoint, to extend the optimization horizon beyond $N = 2$. However, as we explained in a previous section and observed in the examples, this becomes very difficult for computational reasons. One is thus led to conclude that CEC may be, at this point, the only way forward.
Of course, the ultimate test for the suboptimal strategies would be to contrast them with the optimal one. The authors believe that, in this case, an appreciable difference in cost may be observed, since the optimal strategy takes the process and measurement disturbances into account in a unified manner, whereas the above-mentioned suboptimal strategies use the estimates provided by an estimator as if they were the true state.
ACKNOWLEDGEMENT
The authors would like to acknowledge comments and suggestions made by Prof. David Q. Mayne fromthe Department of Electrical and Electronic Engineering, Imperial College of Science, London, UK, thatmotivated the current work.
REFERENCES
1. Mayne DQ, Rawlings JB, Rao CV, Scokaert POM. Constrained model predictive control: stability and optimality. Automatica 2000; 36:789–814.
2. Maciejowski JM. Predictive Control with Constraints. Prentice-Hall: Englewood Cliffs, NJ, 2002.
3. Bertsekas D. Dynamic Programming and Optimal Control, Optimization and Computation Series, vols. 1, 2. Athena Scientific: Belmont, MA, 2000.
4. Lee JH, Yu Z. Worst-case formulations of model predictive control for systems with bounded parameters. Automatica 1997; 33(5):763–781.
5. Shamma JS, Tu KY. Output feedback for systems with constraints and saturations: scalar control case. Systems and Control Letters 1998; 35:265–293.
6. Lee YI, Kouvaritakis B. Receding horizon output feedback control for linear systems with input saturation. IEE Proceedings, Control Theory and Applications 2001; 148(2):109–115.
7. Mayne DQ, Michalska H. Robust receding horizon control of constrained nonlinear systems. IEEE Transactions on Automatic Control 1993; 38(11):1623–1633.
8. Filatov NM, Unbehauen H. Adaptive predictive control policy for nonlinear stochastic systems. IEEE Transactions on Automatic Control 1995; 40(11):1943–1949.
9. Lee JH, Cooley BL. Optimal feedback control strategies for state-space systems with stochastic parameters. IEEE Transactions on Automatic Control 1998; 43(10):1469–1475.
10. Batina I, Stoorvogel AA, Weiland S. Stochastic disturbance rejection in model predictive control by randomized algorithms. In Proceedings of the American Control Conference, Arlington, VA, U.S.A., 2001.
11. Batina I, Stoorvogel AA, Weiland S. Optimal control of linear, stochastic systems with state and input constraints. In Proceedings of the 41st IEEE Conference on Decision and Control, Las Vegas, NV, U.S.A., 2002.
12. Åström KJ. Introduction to Stochastic Control Theory, Mathematics in Science and Engineering, vol. 70. Academic Press: New York, 1970.
13. Bertsekas DP. Dynamic Programming and Stochastic Control, Mathematics in Science and Engineering, vol. 125. Academic Press: New York, 1976.
14. Muske KR, Rawlings JB. Model predictive control with linear models. AIChE Journal 1993; 39(2):262–287.
15. Angeli D, Mosca E, Casavola A. Output predictive control of linear plants with saturated and rate limited actuators. In Proceedings of the American Control Conference, ACC, Chicago, U.S.A., 2000; 272–276.
16. Marquis P, Broustail J. SMOC, a bridge between state space and model predictive controllers: application to the automation of a hydrotreating unit. In Proceedings of the IFAC Workshop on Model Based Process Control. Pergamon Press: Oxford, 1988; 37–43.
17. Pérez T, Goodwin GC, Tzeng CY. Model predictive rudder roll stabilization control for ships. In Fifth IFAC Conference on Manoeuvring and Control of Marine Craft (MCMC2000), Aalborg, Denmark, 2000.
18. Anderson BDO, Moore JB. Optimal Control: Linear Quadratic Methods. Prentice-Hall: Englewood Cliffs, NJ, 1971.
19. Bellman RE. Dynamic Programming. Princeton University Press: Princeton, NJ, 1957.
20. De Doná JA, Goodwin GC. Elucidation of state-space regions wherein model predictive control and anti-windup strategies achieve identical control policies. In Proceedings of the American Control Conference, Chicago, IL, U.S.A., 2000.
21. Ash RB, Doléans-Dade C. Probability and Measure Theory. Harcourt Academic Press: San Diego, 2000.
22. Bertsekas DP. Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall: Englewood Cliffs, NJ, 1987.
23. Haimovich H, Perez T, Goodwin GC. On optimality and certainty equivalence in output feedback control of constrained uncertain linear systems. In European Control Conference, University of Cambridge, U.K., 2003.
24. Anderson BDO, Moore JB. Optimal Filtering. Prentice-Hall: Englewood Cliffs, NJ, 1979.
25. Kalman RE. A new approach to linear filtering and prediction problems. Transactions of the ASME, Series D, Journal of Basic Engineering 1960; 82:35–45.
26. Robert CP, Casella G. Monte Carlo Statistical Methods. Springer: Berlin, 1999.
27. Doucet A, de Freitas N, Gordon N. Sequential Monte Carlo Methods in Practice. Springer: Berlin, 2001.
28. Bemporad A, Morari M, Dua V, Pistikopoulos E. The explicit linear quadratic regulator for constrained systems. Technical Report AUT99-16, Automatic Control Laboratory, ETH-Swiss Federal Institute of Technology, 1999.
29. Serón M, De Doná JA, Goodwin GC. Global analytical model predictive control with input constraints. In Proceedings of the 39th Conference on Decision and Control, Sydney, Australia, 2000.
30. De Doná JA. Input constrained linear control. Ph.D. Thesis, The University of Newcastle, NSW, Australia, 2000.
31. Pérez T, Goodwin GC. Stochastic output feedback model predictive control. In Proceedings of the American Control Conference, ACC, Arlington, VA, U.S.A., 2001.