

Topic #22

16.31 Feedback Control

Deterministic LQR

• Optimal control and the Riccati equation

• Lagrange multipliers

• The Hamiltonian matrix and the symmetric root locus

Factoids: for symmetric R,

∂(uᵀRu)/∂u = 2uᵀR,    ∂(Ru)/∂u = R

Copyright 2001 by Jonathan How.



Fall 2001 16.31 22—1

Linear Quadratic Regulator (LQR)

• We have seen the solutions to the LQR problem using the symmetric root locus which defines the location of the closed-loop poles.

— Linear full-state feedback control.

— Would like to demonstrate from first principles that this is the optimal form of the control.

• Deterministic Linear Quadratic Regulator

Plant:

ẋ(t) = A(t)x(t) + Bu(t)u(t),    x(t0) = x0

z(t) = Cz(t)x(t)

Cost:

JLQR = ∫_{t0}^{tf} [ zᵀ(t)Rzz(t)z(t) + uᵀ(t)Ruu(t)u(t) ] dt + xᵀ(tf) Ptf x(tf)

— Where Ptf ≥ 0, Rzz(t) > 0 and Ruu(t) > 0

— Define Rxx = Czᵀ Rzz Cz ≥ 0

— A(t) is a continuous function of time.

— Bu(t), Cz (t), Rzz(t), Ruu(t) are piecewise continuous functions of time, and all are bounded.

• Problem Statement: Find the input u(t) ∀t ∈ [t0, tf] to minimize JLQR.


Fall 2001 16.31 22—2

• Note that this is the most general form of the LQR problem — we rarely need this level of generality and often suppress the time dependence of the matrices.

— Aircraft landing problem.

• To optimize the cost, we follow the procedure of augmenting the constraints in the problem (the system dynamics) to the cost (integrand) to form the Hamiltonian:

H = ½ [ xᵀ(t)Rxx x(t) + uᵀ(t)Ruu u(t) ] + λᵀ(t)( Ax(t) + Bu u(t) )

— λ(t) ∈ Rn×1 is called the Adjoint variable or Costate

— It is the Lagrange multiplier in the problem.

• From Stengel (p. 427), the necessary and sufficient conditions for optimality are that:

1. λ̇(t) = −∂H/∂x = −Rxx x(t) − Aᵀ λ(t)

2. λ(tf) = Ptf x(tf)

3. ∂H/∂u = 0 ⇒ Ruu u + Buᵀ λ(t) = 0, so uopt = −Ruu⁻¹ Buᵀ λ(t)

4. ∂²H/∂u² ≥ 0 (need to check that Ruu ≥ 0)
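• A small symbolic sketch in MATLAB (Symbolic Math Toolbox) can be used to verify the gradients behind conditions 1 and 3; the sizes n = 2, m = 1 below are arbitrary illustrative choices, not values from the notes.

% Symbolic check of the gradients used in conditions 1 and 3.
n = 2;  m = 1;
x   = sym('x', [n 1]);    u  = sym('u', [m 1]);    lam = sym('lam', [n 1]);
A   = sym('A', [n n]);    Bu = sym('B', [n m]);
Rxx = sym('Q', [n n]);    Rxx = (Rxx + Rxx.')/2;   % symmetric state weight
Ruu = sym('R', [m m]);    Ruu = (Ruu + Ruu.')/2;   % symmetric control weight

H = (1/2)*(x.'*Rxx*x + u.'*Ruu*u) + lam.'*(A*x + Bu*u);

simplify(jacobian(H, x).' - (Rxx*x + A.'*lam))   % zero vector: dH/dx = Rxx x + A' lam
simplify(jacobian(H, u).' - (Ruu*u + Bu.'*lam))  % zero vector: dH/du = Ruu u + Bu' lam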


Fall 2001 16.31 Optimization-1

• This control design problem is a constrained optimization, with the constraints being the dynamics of the system.

• The standard way of handling the constraints in an optimization is to add them to the cost using a Lagrange multiplier

— Results in an unconstrained optimization.

• Example: min f(x, y) = x² + y² subject to the constraint that c(x, y) = x + y + 2 = 0

Figure 1: Optimization results (axes: x and y).

• Clearly the unconstrained minimum is at x = y = 0


Fall 2001 16.31 Optimization-2

• To find the constrained minimum, form the augmented cost function

L ≜ f(x, y) + λ c(x, y) = x² + y² + λ(x + y + 2)

— Where λ is the Lagrange multiplier

• Note that if the constraint is satisfied, then L ≡ f

• The solution approach without constraints is to find the stationary point of f (x, y) (∂f/∂x = ∂f/∂y = 0)

— With constraints we find the stationary points of L:

∂L/∂x = ∂L/∂y = ∂L/∂λ = 0

which gives

∂L/∂x = 2x + λ = 0

∂L/∂y = 2y + λ = 0

∂L/∂λ = x + y + 2 = 0

• This gives 3 equations in 3 unknowns; solve to find x* = y* = −1 (checked numerically in the sketch at the end of this page).

• The key point here is that due to the constraint, the selection of x and y during the minimization is not independent

— The Lagrange multiplier captures this dependency.

• The LQR optimization follows the same path as this, but it is complicated by the fact that the cost involves an integration over time.
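• For a quick numerical confirmation of the stationary point above, the three stationarity conditions can be solved as a linear system; a minimal MATLAB sketch:

% Solve dL/dx = dL/dy = dL/dlambda = 0 as a 3x3 linear system in (x, y, lambda).
M   = [2 0 1;     % 2x      + lambda = 0
       0 2 1;     % 2y      + lambda = 0
       1 1 0];    % x + y            = -2
rhs = [0; 0; -2];
sol = M \ rhs     % returns x = y = -1, lambda = 2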


Fall 2001 16.31 22—3

• Note that we now have:

ẋ(t) = Ax(t) + Bu uopt(t) = Ax(t) − Bu Ruu⁻¹ Buᵀ λ(t)

with x(t0) = x0

• So combine with equation for the adjoint variable

λ̇(t) = −Rxx x(t) − Aᵀ λ(t) = −Czᵀ Rzz Cz x(t) − Aᵀ λ(t)

to get:

[ ẋ(t) ]   [ A             −Bu Ruu⁻¹ Buᵀ ] [ x(t) ]
[ λ̇(t) ] = [ −Czᵀ Rzz Cz   −Aᵀ           ] [ λ(t) ]

which of course is the Hamiltonian matrix again.

• Note that the dynamics of x(t) and λ(t) are coupled, but x(t) is known initially and λ(t) is known at the terminal time, since λ(tf ) = Ptf x(tf )

— This is a two point boundary value problem that is very hard to solve in general.

• However, in this case, we can introduce a new matrix variable P (t) and show that:

1. λ(t) = P(t)x(t)

2. It is relatively easy to find P(t).


Fall 2001 16.31 22—4

• How to proceed?

1. For the 2n-state system

[ ẋ(t) ]   [ A             −Bu Ruu⁻¹ Buᵀ ] [ x(t) ]
[ λ̇(t) ] = [ −Czᵀ Rzz Cz   −Aᵀ           ] [ λ(t) ]

define a transition matrix

F(t1, t0) = [ F11(t1, t0)   F12(t1, t0) ]
            [ F21(t1, t0)   F22(t1, t0) ]

and use this to relate x(t) and λ(t) to x(tf) and λ(tf):

[ x(t) ]   [ F11(t, tf)   F12(t, tf) ] [ x(tf) ]
[ λ(t) ] = [ F21(t, tf)   F22(t, tf) ] [ λ(tf) ]

so

x(t) = F11(t, tf)x(tf) + F12(t, tf)λ(tf) = [ F11(t, tf) + F12(t, tf)Ptf ] x(tf)

2. Now find λ(t) in terms of x(tf):

λ(t) = [ F21(t, tf) + F22(t, tf)Ptf ] x(tf)

3. Eliminate x(tf) to get:

λ(t) = [ F21(t, tf) + F22(t, tf)Ptf ] [ F11(t, tf) + F12(t, tf)Ptf ]⁻¹ x(t) ≜ P(t)x(t)


Fall 2001 16.31 22—5

4. Now, since λ(t) = P (t)x(t), then

λ̇(t) = Ṗ(t)x(t) + P(t)ẋ(t)

⇒ −Czᵀ Rzz Cz x(t) − Aᵀ λ(t) = Ṗ(t)x(t) + P(t)ẋ(t), so

−Ṗ(t)x(t) = Czᵀ Rzz Cz x(t) + Aᵀ λ(t) + P(t)ẋ(t)

          = Czᵀ Rzz Cz x(t) + Aᵀ λ(t) + P(t)( Ax(t) − Bu Ruu⁻¹ Buᵀ λ(t) )

          = ( Czᵀ Rzz Cz + P(t)A )x(t) + ( Aᵀ − P(t)Bu Ruu⁻¹ Buᵀ )λ(t)

          = ( Czᵀ Rzz Cz + P(t)A )x(t) + ( Aᵀ − P(t)Bu Ruu⁻¹ Buᵀ )P(t)x(t)

          = [ AᵀP(t) + P(t)A + Czᵀ Rzz Cz − P(t)Bu Ruu⁻¹ Buᵀ P(t) ] x(t)

• This must be true for arbitrary x(t), so P (t) must satisfy

−Ṗ(t) = AᵀP(t) + P(t)A + Czᵀ Rzz Cz − P(t)Bu Ruu⁻¹ Buᵀ P(t)

— Which is a matrix differential Riccati Equation.

• The optimal value of P(t) is found by solving this equation backwards in time from tf with P(tf) = Ptf.
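• A minimal MATLAB sketch of this backwards solution (integrating forwards in τ = tf − t), cross-checked against the transition-matrix construction of the previous page. The plant and weights below are taken from the example that appears later in these notes, with Cz = [1 0] assumed so that Czᵀ Rzz Cz matches its [q 0; 0 0] state weighting:

% Finite-horizon P(t): integrate -Pdot = A'P + PA + Rxx - P Bu Ruu^-1 Bu' P
% backwards from P(tf) = Ptf, by running forwards in tau = tf - t.
A   = [0 1; 0 -1];   Bu = [0; 1];   Cz = [1 0];
Rzz = 1;  Ruu = 1;   Rxx = Cz'*Rzz*Cz;
Ptf = [5 0; 0 0];    tf  = 10;      n = 2;

dPdtau = @(tau, p) reshape(A'*reshape(p,n,n) + reshape(p,n,n)*A + Rxx ...
                   - reshape(p,n,n)*Bu*(Ruu\Bu')*reshape(p,n,n), [], 1);
[tau, Pv] = ode45(dPdtau, [0 tf], Ptf(:));
P0 = reshape(Pv(end,:), n, n)            % P(t) at t = 0

% Cross-check with the transition-matrix formula (time-invariant case):
H  = [A, -Bu*(Ruu\Bu'); -Rxx, -A'];      % Hamiltonian matrix
F  = expm(H*(0 - tf));                   % F(0, tf)
F11 = F(1:n,1:n);      F12 = F(1:n,n+1:end);
F21 = F(n+1:end,1:n);  F22 = F(n+1:end,n+1:end);
P0_check = (F21 + F22*Ptf) / (F11 + F12*Ptf)   % agrees with P0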


Fall 2001 16.31 22—6

• The control gains are then

uopt = −Ruu⁻¹ Buᵀ λ(t) = −Ruu⁻¹ Buᵀ P(t)x(t) = −K(t)x(t)

— Where K(t) ≜ Ruu⁻¹ Buᵀ P(t)

• Note that x(t) and λ(t) together define the closed-loop dynamics for the system (and its adjoint), but we can eliminate λ(t) from the solution by introducing P (t) which solves a Riccati Equation.

• The optimal control inputs are in fact a linear full-state feedback control

• Note that normally we are interested in problems with t0 = 0 and tf = ∞, in which case we can just use the steady-state value of P that solves (assuming that (A, Bu) is stabilizable)

AᵀP + PA + Czᵀ Rzz Cz − P Bu Ruu⁻¹ Buᵀ P = 0

which is the Algebraic Riccati Equation.

— If we use the steady-state value of P , then K is constant.
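• A minimal MATLAB sketch of the steady-state computation, using the example from the next page with the assumption Cz = [1 0], Rzz = q = 1, Ruu = r = 1; the gain reproduces the Klqr = [1 0.73] quoted in Figure 2:

% Steady-state LQR: P solves the algebraic Riccati equation, K = Ruu^-1 Bu' P.
A   = [0 1; 0 -1];   Bu = [0; 1];   Cz = [1 0];
Rzz = 1;  Ruu = 1;
Rxx = Cz'*Rzz*Cz;                    % state weighting Cz' Rzz Cz

[K, P] = lqr(A, Bu, Rxx, Ruu);       % K ~ [1 0.73], P solves the ARE
residual = A'*P + P*A + Rxx - P*Bu*(Ruu\Bu')*P   % ~ zero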


Fall 2001 16.31 22—7

• Example: simple system with t0 = 0 and tf = 10 sec.

ẋ = [0 1; 0 −1] x + [0; 1] u

J = xᵀ(10) [h 0; 0 0] x(10) + ∫_0^10 ( xᵀ(t) [q 0; 0 0] x(t) + r u²(t) ) dt

• Compute gains using both time-varying P (t) and steady-state value.

• Find the state solution with x(0) = [1 1]ᵀ using both sets of gains, with q = 1, r = 1, h = 5

Figure 2: Set q = 1, r = 1, h = 10, Klqr = [1 0.73]. Gains K1(t), K2(t) versus the constant K1, K2, and the states x1, x2 under the dynamic and static gains, plotted against time (sec).
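• One way to reproduce this comparison in MATLAB (a sketch, with h = 5 as in the problem statement above):

% Time-varying gains K(t) from the finite-horizon Riccati solution versus the
% constant steady-state gains, for the example above (q = 1, r = 1, h = 5).
A = [0 1; 0 -1];   B = [0; 1];   n = 2;
q = 1;   r = 1;    h = 5;
Q = [q 0; 0 0];    R = r;        Ptf = [h 0; 0 0];    tf = 10;

% 1) Backwards Riccati integration (forwards in tau = tf - t).
dPdtau = @(tau, p) reshape(A'*reshape(p,n,n) + reshape(p,n,n)*A + Q ...
                   - reshape(p,n,n)*B*(R\B')*reshape(p,n,n), [], 1);
[tau, Pv] = ode45(dPdtau, [0 tf], Ptf(:));
ts = flipud(tf - tau);   Ps = flipud(Pv);          % P(:)' on an increasing time grid
Kt = @(t) (R\B') * reshape(interp1(ts, Ps, t, 'linear', 'extrap')', n, n);

% 2) Constant steady-state gain.
Kss = lqr(A, B, Q, R);                             % ~ [1 0.73]

% 3) Simulate both closed loops from x(0) = [1 1]'.
x0 = [1; 1];
[t1, x1] = ode45(@(t,x) (A - B*Kt(t))*x, [0 tf], x0);
[t2, x2] = ode45(@(t,x) (A - B*Kss)*x,   [0 tf], x0);

subplot(121); plot(t1, x1); title('Dynamic Gains'); xlabel('Time (sec)'); ylabel('States');
subplot(122); plot(t2, x2); title('Static Gains');  xlabel('Time (sec)'); ylabel('States');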


Fall 2001 16.31 22—8

• As noted, the closed-loop dynamics couple x(t) and λ(t) and are given by

[ ẋ(t) ]   [ A             −Bu Ruu⁻¹ Buᵀ ] [ x(t) ]
[ λ̇(t) ] = [ −Czᵀ Rzz Cz   −Aᵀ           ] [ λ(t) ]

with the appropriate boundary conditions.

• OK, so where are the closed-loop poles of the system?

— They must be the eigenvalues of

H ≜ [ A             −Bu Ruu⁻¹ Buᵀ ]
    [ −Czᵀ Rzz Cz   −Aᵀ           ]

• When we analyzed this before for a SISO system, we found that the closed-loop poles could be related to a SRL for the transfer function

Gzu(s) = Cz(sI − A)⁻¹Bu = b(s)/a(s)

and, in fact, the closed-loop poles were given by the LHP roots of

a(s)a(−s) + (Rzz/Ruu) b(s)b(−s) = 0

where we previously had Rzz/Ruu ≡ 1/r

• We now know enough to show that this is true.


Fall 2001 16.31 22—9

Derivation of the SRL

• The closed-loop poles are given by the eigenvalues of

H ≜ [ A             −Bu Ruu⁻¹ Buᵀ ]
    [ −Czᵀ Rzz Cz   −Aᵀ           ]

so solve det(sI − H) = 0.

• For a block matrix, det [ A B; C D ] = det(A) det(D − C A⁻¹ B) if A is invertible, so

det(sI − H) = det(sI − A) det[ (sI + Aᵀ) − Czᵀ Rzz Cz (sI − A)⁻¹ Bu Ruu⁻¹ Buᵀ ]

            = det(sI − A) det(sI + Aᵀ) det[ I − Czᵀ Rzz Cz (sI − A)⁻¹ Bu Ruu⁻¹ Buᵀ (sI + Aᵀ)⁻¹ ]

• Note that det(I + ABC) = det(I + CAB), and if a(s) = det(sI − A), then a(−s) = det(−sI − AT ) = (−1)n det(sI + AT )

det(sI − H) = (−1)ⁿ a(s)a(−s) det[ I + Ruu⁻¹ Buᵀ (−sI − Aᵀ)⁻¹ Czᵀ Rzz Cz (sI − A)⁻¹ Bu ]

• If Gzu(s) = Cz(sI − A)⁻¹Bu, then Gzuᵀ(−s) = Buᵀ(−sI − Aᵀ)⁻¹Czᵀ,

so for SISO systems

det(sI − H) = (−1)ⁿ a(s)a(−s) det[ I + Ruu⁻¹ Gzuᵀ(−s) Rzz Gzu(s) ]

            = (−1)ⁿ a(s)a(−s) [ 1 + (Rzz/Ruu) Gzu(−s)Gzu(s) ]

            = (−1)ⁿ [ a(s)a(−s) + (Rzz/Ruu) b(s)b(−s) ] = 0
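• A quick numerical confirmation, using the earlier example with Cz = [1 0] assumed, so that Gzu(s) = 1/(s(s + 1)), i.e. a(s) = s² + s and b(s) = 1:

% Check that eig(H) equals the roots of a(s)a(-s) + (Rzz/Ruu) b(s)b(-s) = 0.
A = [0 1; 0 -1];   Bu = [0; 1];   Cz = [1 0];
Rzz = 1;  Ruu = 1;

H = [A, -Bu*(Ruu\Bu'); -Cz'*Rzz*Cz, -A'];
eig(H)                               % symmetric about the imaginary axis

a    = poly(A);                      % a(s)  = s^2 + s
a_ms = a .* [1 -1 1];                % a(-s) = s^2 - s (flip odd-power signs)
b    = 1;                            % b(s)  = 1 for this Cz, Bu
srl  = conv(a, a_ms);                % a(s)a(-s)
srl(end) = srl(end) + (Rzz/Ruu)*b^2; % + (Rzz/Ruu) b(s)b(-s)
roots(srl)                           % same values as eig(H)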


Fall 2001 16.31 22—10

• Simple example from before: A scalar system with

ẋ = ax + bu

with cost (Rxx > 0 and Ruu > 0)

J = ∫_0^∞ ( Rxx x²(t) + Ruu u²(t) ) dt

• Then the steady-state P solves

2aP + Rxx − P²b²/Ruu = 0

which gives that P = ( a + √(a² + b²Rxx/Ruu) ) / ( Ruu⁻¹ b² ) > 0

• Then u(t) = −Kx(t) where

K = Ruu⁻¹ b P = ( a + √(a² + b²Rxx/Ruu) ) / b

• The closed-loop dynamics are

ẋ = (a − bK)x = ( a − (b/b)( a + √(a² + b²Rxx/Ruu) ) ) x

  = −√(a² + b²Rxx/Ruu) x = Acl x(t)

• Note that as Rxx/Ruu → ∞, Acl ≈ −|b|√(Rxx/Ruu)

• And as Rxx/Ruu → 0, K ≈ (a + |a|)/b

— If a < 0 (open-loop stable), K ≈ 0 and Acl = a − bK ≈ a

— If a > 0 (OL unstable), K ≈ 2a/b and Acl = a − bK ≈ −a
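• These formulas are easy to check against lqr; a minimal sketch with arbitrary test values for a, b, Rxx, Ruu:

% Check the scalar steady-state gain and closed-loop pole against lqr.
a = -0.5;   b = 2;   Rxx = 3;   Ruu = 1;      % arbitrary test values

[K, P] = lqr(a, b, Rxx, Ruu);
K_formula   = (a + sqrt(a^2 + b^2*Rxx/Ruu)) / b     % equals K
Acl         = a - b*K                               % closed-loop pole
Acl_formula = -sqrt(a^2 + b^2*Rxx/Ruu)              % equals Acl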


Fall 2001 16.31 22—11

Summary

• Can find the optimal feedback gains u = −Kx using the MATLAB command

K = lqr(A, B, Rxx, Ruu)

• Similar derivation for the optimal estimation problem (Linear Quadratic Estimator)

— Full treatment requires detailed coverage of advanced topics (e.g., stochastic processes and Itô calculus); better left to a second course.

— But, by duality, can compute optimal Kalman filter gains from

Ke = lqr(Aᵀ, Cyᵀ, Bw Rw Bwᵀ, Rv),    L = Keᵀ
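• A minimal sketch of that dual computation; Cy, Bw, Rw, Rv are not defined in this lecture, so the values below are placeholders for illustration only:

% Dual LQR computation of the estimator gain L, per the formula above.
A  = [0 1; 0 -1];   Cy = [1 0];        % measurement y = Cy x (assumed)
Bw = [0; 1];        Rw = 1;  Rv = 0.1; % process/sensor noise intensities (assumed)

Ke = lqr(A', Cy', Bw*Rw*Bw', Rv);
L  = Ke'                               % use in  xhat_dot = A xhat + Bu u + L(y - Cy xhat)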

MATLAB is a trademark of The MathWorks, Inc.