adaptive model predictive control: robustness and parameter estimation · 2021. 5. 28. · state...

Adaptive Model Predictive Control:

Robustness and Parameter Estimation

Mark Cannon

Joint work with:

Matthias Lorenzen Stuttgart University

Xiaonan Lu Oxford University

Sebastian East Oxford/NNAISENSE

1

Motivation

Robust MPC paradigm:

ControlledSystem

Prediction &Optimization

Model

u y

d

Uncertain model & disturbances a↵ect performance

Large e↵ort (time & money) spent on model identification o✏ine

2

Motivation

Adaptive MPC paradigm:

ParameterEstimation

u y

d

Prediction &Optimization

ControlledSystem

Model

Identify (or learn) model (or cost or constraints) online

Require: robust constraint satisfaction

closed loop stability & performance guarantees

parameter convergence

2

Applications

is updated in parallel and sufficient threads are available. Thedense matrix inversions associated with the � and ⇣ updatescan be computed offline as they do not include any decisionvariables, so only dense matrix-vector multiplications arerequired, and the computational complexity of each iterationis therefore O(N2). Additionally, the updates for � and ⇣

consist of multiplications by Toeplitz matrices that can beimplemented as (stable) linear filtering operations with astorage requirement of O(N).

IV. NUMERICAL STUDIES

The performance of the ADMM algorithm was investi-gated through simulation in comparison with both a CDCSstrategy and an approximately optimal DP implementation(the algorithm of [16] was not included as the lack of hardlimits on state of charge mean that it cannot be compared inany meaningful way). For both optimization-based strategiesit was assumed that the future driver behaviour was knownwith complete precision. A simple CDCS strategy was as-sumed where the engine was switched off and all power wasdelivered from the motor until the lower state constraint wasviolated, after which the engine was permanently switchedon, and used to provide all of the positive demand powerwhenever the state of charge was below its lower constraint.The DP algorithm was modified from that presented in [14]to include the engine switching control variable and engineswitching cost, with the state of charge discretised to 0.1%intervals and the battery power control input discretised to1% intervals. The values ⇢1 = 8.86 � 10�9 and ⇢2 =2.34 � 10�4 were taken from [14], and we set ⇢3 = ⇢2.Also, ⇢4 was set at 2�103 after using a parameter sweepsimilar to that detailed in [14]. The termination threshold, ✏,was set at 7 � 104 for both the initial convex phase and thesubsequent nonconvex phase of the ADMM algorithm, andkd was set arbitrarily at 104.

The velocity and road gradient data used to generate thepower demand profiles is shown in Figure 3. This is realdrive-test data taken from 49 instances of a single �13kmroute driven by four different drivers. The method detailed insection II was used to obtain the demand power for this data,where it was assumed that the mechanical brake providednone of the braking power. The vehicle was modelled asa 1800kg passenger vehicle with a 100kW petrol internalcombustion engine, a 50kW electric motor, and a 21.5Ahlithium-ion battery with a 350V and 0.1� open circuitvoltage and resistance. The battery was initialised at 60%and constrained to between 40% and 70% to ensure that thestate constraints were strongly active at the solution. Thesimulations were programmed in Matlab on a 2.6GHz IntelCore i7-6700HQ CPU.

A. Results

Figure 4 shows the cumulative fuel consumption, state ofcharge trajectory, and engine switching control inputs for asingle instance of the journey. It can be seen that althoughthe state of charge (SOC) trajectories for DP and ADMMare qualitatively similar, the trajectory for CDCS has a large

Fig. 3. Velocity and gradient data against distance.

Fig. 4. Cumulative fuel consumption, SOC trajectory, and engine state fora single journey using CDCS, DP and ADMM.

deviation for the first 430s, and this is reflected in the sub-optimal fuel consumption. After 430s the SOC trajectoriesare almost identical because the car is descending for themajority of the second half of the journey (see the gradientplots in Figure 3), and is therefore in a regenerative modefor all three methods. During this period, the optimization-based methods consume no fuel as the engine has beenswitched off, but the CDCS strategy continues to consumefuel due to the rotation of the engine. It can also be seenfrom the engine controls that the ADMM algorithm largelyobtains the periods where it is optimal to turn the engine onand off (e.g determines that the engine should be off whiledescending), but introduces additional switching decisions,particularly around 100s and 400s.

The first plot in Figure 5 shows the total fuel consumedacross all journeys using CDCS, DP, and ADMM. To prop-erly compare the fuel consumption using each method, theequivalent fuel consumption as a result of battery use isnormally also considered within the total energy consump-tion, but in this case the second plot shows that the terminalSOC was within 1.7% in all cases for all three methods,so this effect was ignored. It can be seen that DP reducesfuel consumption w.r.t CDCS by �40% in all cases, and that

CONFIDENTIAL. Limited circulation. For review onlyIEEE L-CSS submission 19-0063.2 (Submission for L-CSS and CDC)

.Preprint Received May 2, 2019 10:04:22 PST

Uncertain parameters, uncertain demand

Networks of interacting locally controlled systems

3

Overview

An idea with a long history: e.g. self-tuning control, DMC, GPC . . .[Clarke, Tu↵s, Mohtadi, 1987]

Revisited with new tools:

Set membership estimation[Bai, Cho, Tempo, 1998]

Robust tube MPC[Langson, Chryssochoos, Rakovic, Mayne, 2004]

Dual adaptive/predictive control[Lee & Lee, 2009]

4

Overview

Recent work on MPC with model adaptation

Online learning & identification:

– Persistency of Excitation constraints

[Marafioti, Bitmead, Hovd, 2014]

– RLS parameter estimation with covariance matrix in cost

[Heirung, Ydstie, Foss, 2017]

– Gaussian process regression, particle filtering

[Klenske, Zeilinger, Scholkopf, Hennig, 2016]

[Bayard & Schumitzky, 2010]

Robust constraint satisfaction and performance:

– Constraints based on prior uncertainty set, online update of cost only

[Aswani, Gonzalez, Sastry, Tomlin, 2013]

– Set-based identification, stable FIR plant model

[Tanaskovic, Fagiano, Smith, Morari, 2014]

5

Overview

This talk:

1 Set membership parameter estimation

2 Polytopic tube robust MPC

3 Convex constraints for persistent excitation

4 Time varying model parameters

5 Di↵erentiable MPC

6

Parameter set estimate

Plant model with unknown parameter vector ✓? and disturbance w:

xk+1 = A(✓?)xk + B(✓?)uk + wk

Assumption 1: model is a�ne in unknown parameters

xk+1 = Dk✓? + dk + wk

⇢Dk = D(xk, uk)

dk = A0xk + B0uk

Assumption 2: stochastic disturbance wk 2 WW 3 0 is compact and convex

Unfalsified set: If xk, xk�1, uk�1 are known, then ✓? 2 �k

�k = {✓ : xk = Dk�1✓ + dk�1 + w, w 2 W}

7

Minimal parameter set estimate

Minimal parameter set update:

⇥k+1 = ⇥k \ �k+1

⇥k+1

�k+1

⇥k

Assumption 3: W is a ’tight’ bound: for all w0 2 @W and ✏ > 0

Pr�kwk � w

0k < ✏

� pw(✏)

where pw(✏) > 0 8✏ > 0

Assumption 4: persistent excitation: 9 ↵, � > 0, N such that

kDkk ↵ andk+N�1X

j=k

D>j Dj ⌫ �I for all k

8


Unfalsified set: �k+1 = {✓ : xk+1 � Dk✓ � dk 2 W}= {✓ : Dk(✓⇤ � ✓) + wk 2 W}

For any given ✓0 2 ⇥k:

pick w0 2 @W so that

Dk(✓⇤ � ✓0) is normal to @W

at w0

let ✏ = kwk � w0k

then ✓0 /2 �k+1 if ✏ < kDk(✓⇤ � ✓0)k

W

w0

normal at w0

9


Unfalsified set: �k+1 = {✓ : xk+1 � Dk✓ � dk 2 W}= {✓ : Dk(✓⇤ � ✓) + wk 2 W}

For any given ✓0 2 ⇥k:

pick w0 2 @W so that

Dk(✓⇤ � ✓0) is normal to @W

at w0

let ✏ = kwk � w0k

then ✓0 /2 �k+1 if ✏ < kDk(✓⇤ � ✓0)k

W

w0

wk

Dk(✓⇤ � ✓0)

normal at w0

radius ✏

9


If Assumptions 1-4 hold, then ⇥k ! {✓⇤} as k ! 1 w.p. 1

This follows from:

A For any ✓0 2 ⇥k, if k✓⇤ � ✓0k � ✏, then

Pr{✓0 62 �j} � pw

�✏

p�/N

�

for all k, all ✏ > 0, and some j 2 {k + 1, . . . , k + N}

B For any ✓0 2 ⇥0 such that k✓0 � ✓⇤k � ✏,

Pr{✓0 2 ⇥k} 1 � pw

�✏

p�/N

��bk/Nc

for all k and all ✏ > 0, so1X

k=0

Pr{✓0 2 ⇥k} = 0Borel-Cantelli

=)Lemma

Pr�✓0 2

1\

k=0

⇥k

= 0

10


The complexity of ⇥k is unbounded in general

e.g. Minimal parameter set ⇥k for k = 1, . . . , 6 with polytopic W and ⇥0

is a direct application of Theorem 2 in Raimondo et al.(2009). The origin is practically stable for the closed loopsystem with the ultimate bound depending on the size ofthe disturbance set W.

Proposition 11. Suppose Assumptions 1,2,6,10 hold. Thenthe origin is practically stable with region of attraction XN

and limk�1 kxkk� = 0 for the closed-loop system (1) withMPC control law (10).

Without additive disturbance the constants in Lemma 9vanish, c1 = c2 = 0, leading to the following corollary.

Corollary 12. Suppose Assumptions 1,2,6,10 hold andW = {0}. Then the origin is asymptotically stable withregion of attraction XN for the closed-loop system (1) withMPC control law (10).

4. NUMERICAL EXAMPLE

In this section, two examples are presented to illustrate theadvantages of the proposed Adaptive MPC scheme. Wefirst demonstrate the online identification and constraintsatisfaction in a setup where stabilization of the origin isconsidered and thereafter the adaptive scheme is comparedwith a non-adaptive, Robust MPC in an ad-hoc trackingimplementation for constant reference signals.

Example 1 Consider the second-order discrete-time linearsystem of the form (1) with

A0 =

0.5 0.2

�0.1 0.6

�, B0 =

0

0.5

�,

A1 =

0.042 00.072 0.03

�, A2 =

0.015 0.0190.009 0.035

�, A3 = 02�2,

{Bi}i=1,2 = 02�1, B3 =

0.03970.0539

�

⇥ = {✓ | k✓k1 1} and W = {w 2 R2 | kwk1 0.1}.

The MPC parameters were horizon length N = 9, costweights Q = diag(1, 1), R = 0.001, and prestabilizingfeedback gain K = [0.017 � 0.41]. Separate state andinput constraints [xk]2 � �0.3, |uk| 1 were applied tothe system which should be satisfied robustly.

Starting from an initial condition x0 = [0, 10]>, Figure 1shows a typical closed loop trajectory, the predicted statetube trajectory at time step k = 3 and the state constraint[xk]2 � �0.3. Under the proposed Adaptive MPC scheme,the state constraint is robustly satisfied for all possiblepredicted states and the state converges to a neighborhoodof the origin. The input constraints were satisfied robustlywith the input saturated at uk = �1 in the first four steps.

Figure 2 shows the parameter set from time step k = 0to k = 5. Given the realized state and input trajectory,falsified parameters are removed and the uncertainty setis non-increasing. We remark that the parameter adaptiondepends on the initial condition and disturbance realiza-tion since the cost does not reflect the advantage of futureparameter learning.

The simulation was performed with Matlab and MOSEK.The average optimization time without further tuning orwarm-start was 1.5s with a maximum of 1.8s on an IntelCore i7 with 3.4GHz.

0 0.5 1 1.5 2

0

1

2

3

[x]1

[x] 2

Fig. 1. Realized closed loop trajectory, predicted state tube trajec-tory and constraint [xk]1 � �0.3.

�10

1

�10

1

�1

0

1

0 0.51

0

1

�1

0

1

0.51

0

1

�1

0

1

0.51

0

1

�1

0

1

0.51

0

1

�1

0

1

0.51

�0.50

0.5

�1

0

1

Fig. 2. Evolution of the parameter set from time k = 0 to k = 5.Note the di�erent axis limits.

Example 2 To highlight the increased performance of theproposed Adaptive MPC scheme we compare it with anon-adaptive robust Tube MPC. Consider a simple massspring damper system with dynamics described by

mx = �cx � kx + u + w

and nominal parameters mass m = 1, damping constantc = 0.2, and spring constant k = 1. The parameteruncertainty was considered to be ±20%, |w(t)| 0.5, andinput u and state x constrained to [�5, 5] and [�0.1, 1.1]respectively. To apply the MPC algorithm, a first-orderdiscretization with sampling time Ts = 0.1 was used. Theprediction horizon was set to N = 14 and cost weightsQ = diag(10, 0.001), R = 0.001.

Figure 3 shows the closed loop response under the pro-posed Adaptive MPC and robust Tube MPC algorithm.After time steps k = 20, 40, 60 the desired set point wasswitched between 0 and 1. Due to the model uncertaintythe desired steady state xss = 1 is only stabilized with ano�set which, due to the parameter adaption, is decreasedin the Adaptive MPC scheme but not in the Robust MPC.Similarly, while each transient between the steady statesis similar in the Robust MPC, the Adaptive MPC shows afaster, improved convergence in each transient. The closed

11

Fixed complexity polytopic parameter set estimate

Define ⇥k = {✓ : H⇥✓ hk} for a fixed matrix H⇥

Update ⇥k+1 by solving, for each row i:

[hk+1]i = maxw02W,..., wN�12W

✓2⇥k

[H⇥]i✓

subject to

xk�N+2 = Dk�N+1✓ + dk�N+1 + w0

...

xk+1 = Dk✓ + dk + wN�1

⇥k+1

Tkj=k�N+1 �j+1

⇥k

Then ⇥k+1 ✓ ⇥k ✓ · · · ✓ ⇥0,

and ⇥k+1 is the minimum volume set such that

⇥k+1 ◆ ⇥k \k\

j=k�N+1

�j+1

12

Fixed complexity polytopic parameter set estimate

If Assumptions 1-4 hold, then ⇥k ! {✓⇤} as k ! 1 w.p. 1

This follows from:

A If [hk]i � [H⇥]i✓⇤ � ✏, then

Pr

⇢{✓ : [H⇥]i✓ = [hk]i} \

k\

j=k�N+1

�j+1 = ;�

�pw

⇣✏�

↵N

⌘�N

for all i, k, and all ✏ > 0

B For any ✓0 such that [H⇥]i(✓0 � ✓⇤) � ✏ for some row i,

Pr{✓0 2 ⇥k} ⇢

1 �pw

⇣✏�

N↵

⌘�N�bk/Nc

for all k and all ✏ > 0, so1X

k=0

Pr{✓0 2 ⇥k} = 0Borel-Cantelli

=)Lemma

Pr�✓0 2

1\

k=0

⇥k

= 0

13

Example: fixed complexity parameter set estimate

Figure: Parameter set ⇥k at time steps

k 2 {0, 1, 2; 10, 25, 50; 100, 500, 5000}

⇥ set Volume Cost*

(%)

⇥0 100 62.22

⇥1 26.1 61.13

⇥2 18.3 61.03

⇥10 12.7 60.96

⇥25 8.3 60.93

⇥50 6.3 60.77

⇥100 3.4 59.45

⇥500 0.7 57.94

⇥5000 0.0089 53.95

✓? - 52.70

Table: Volume of ⇥k as ⇥k/⇥0 ⇥ 100%;

Cost* with same initial x0 and constraints

14

Inexact disturbance bounds

What if W is not exactly known?

Suppose wk 2 cW for all k, for known cW

cW��✓

W +⇢B

Assumption 5: cW is compact and convex, and W ✓ cW ✓ W � ⇢Bfor some ⇢ � 0, and B = {x : kxk 1}

Replace W with cW in the fixed complexity polytopic parameter set update

then ✓? 2 b�k+1 = {✓ : xk+1 = Dk✓ + dk + w, w 2 cW}, and

if Assumptions 1-5 hold, then ⇥k ! {✓⇤} � ⇢

pN/� B as k ! 1 w.p. 1

15

Noisy measurements

Let yk = xk + sk be an estimate of xk

Assumption 6: i.i.d. noise sk 2 S for all k

where S 3 0 is a compact, convex polytope

Assumption 7: the noise bound is tight, i.e. for all s0 2 @S and ✏ > 0

Pr�ksk � s

0k < ✏

� ps(✏)

where ps(✏) > 0 for all ✏ > 0

Then S = co{s(j)

, . . . , s(h)} implies ✓? 2 co{b�(1)

k+1, . . . ,b�(h)

k+1}, where

b�(j)k+1 =

n✓ : yk+1 � D(yk � s

(j), uk)✓ � d(yk � s

(j)k , uk) 2 cW � S

o

If Assumptions 1-7 hold, then ⇥k ! {✓⇤} � ⇢

pN/� B as k ! 1 w.p. 1

16

Parameter point estimate

Define a point estimate ✓k of ✓?

✓k: defines a nominal predicted performance index

⇥k: enforces constraints robustly

Given a parameter estimate ✓k:

Least mean squares (LMS) filter estimate update is

✓k+1 = ✓k + µD>(xk, uk)(xk+1 � x1|k)

✓k+1 = ⇧⇥k+1(✓k+1)

where

I x1|k = D(xk, uk)(✓k)

I µ > 0 satisfies 1/µ > sup(x,u)2Z kD(x, u)k2

I ⇧⇥(✓) = arg min✓2⇥ k✓ � ✓k projects onto ⇥

For µ = 0 the update is ✓k+1 = ⇧⇥k+1(✓k)

17

Parameter point estimate

The LMS filter (µ > 0) ensures the l2 gain bound:

If supk2N kxkk < 1 and supk2N kukk < 1, then ✓k 2 ⇥k for all k and

supT2N,wk2W,✓02⇥0

PTk=0 kx1|kk2

1µk✓0 � ✓?k2 +

PTk=0 kwkk2

1

where x1|k = A(✓?)xk + B(✓?)uk � x1|k is the 1-step prediction error

18

Control Problem

Consider robust regulation of the system

xk+1 = A(✓)xk + B(✓)uk + wk

with ✓ 2 ⇥k, wk 2 W, subject to the state and control constraints

Fxk + Guk 1 = [1 · · · 1]>

Assumption (Robust stabilizability):There exists a set X = {x : V x 1} and feedback gain K such that X is�-contractive for some � 2 [0, 1), i.e.

V �(✓)x �1, for all x 2 X , ✓ 2 ⇥0.

where �(✓) = A(✓) + B(✓)K.

19

Tube MPC

A sequence of sets (a “tube”) is constructed to bound the predicted statexi|k, with ith cross section, Xi|k:

Xi|k = {x : V x ↵i|k}

where V is determined o✏ine and ↵i|k are online decision variables

(A) For robust satisfaction of xi|k 2 Xi|k, we require

V �(✓)x + V B(✓)vi|k + w ↵i+1|k for all x 2 Xi|k, ✓ 2 ⇥k

where [w]i = maxw2W [V ]iw

(B) For robust satisfaction of Fxi|k + Gui|k 1, we require

(F + GK)x + Gvi|k 1 for all x 2 Xi|k

Condition (A) is bilinear in x and ✓, but can be expressed in terms of linearinequalities using a vertex representation of either Xi|k or ⇥k

21

Tube MPC

We generate the vertex representation:

Xi|k = co{x1i|k, . . . x

mi|k}

using the property that {x : [V ]rx [↵i|k]r} is a supporting hyperplane ofXi|k for each r:

X

x1

x2

Xi|k

x1i|k = x

2i|k

HHY

Hence each vertex xji|k is given by the intersection of hyperplanes

corresponding to a fixed set of rows of V , and

xji|k = U

j↵i|k

for some Uj , determined o✏ine from the vertices of X = {x : V x 1}

22

Tube MPC

Using the hyperplane and vertex descriptions of Xi|k, the robust tubeconstraints become

A V �(✓)U j↵i|k + V B(✓)vi|k + w ↵i+1|k for all ✓ 2 ⇥k, j = 1, . . . , m

B (F + GK)U j↵i|k + Gvi|k 1, j = 1, . . . , m

Now condition (B) is linear and (A) can be equivalently written as linearconstraints using

Polyhedral set inclusion lemmaLet Pi = {x : Fix fi} ⇢ Rn for i = 1, 2. Then P1 ✓ P2 i↵

9⇤ � 0 such that ⇤F1 = F2 and ⇤f1 f2

23

Robust adaptive MPC algorithm

O✏ine: Choose ⇥0, X , feedback gain K, and compute P

Online, at each time k = 1, 2, . . .:

1 Given xk, update the set (⇥k) and point (✓k) parameter estimates

2 Compute the solution (v⇤k,↵⇤

k,⇤⇤k) of the QP:

minvk,↵k,⇤k

J(xk, ✓k,vk)

subject to (vk,↵k,⇤k) 2 F(xk, ⇥k)

3 Apply the control law u⇤k = Kxk + v

⇤0|k

25


The MPC algorithm has the following closed loop properties:

If ✓? 2 ⇥0 and F(x0, ⇥0) 6= ;, then for all k > 0:1 ✓

? 2 ⇥k

2 F(xk, ⇥k) 6= ;3 Fxk + Guk 1

If µ > 0, then5 the closed loop system is finite-gain l

2-stable, i.e.

TX

k=0

kxkk2 c0kx0k2 + c1k✓0 � ✓?k2 + c2

TX

k=0

kwkk2

for some constants c0, c1, c2 > 0, for all T

26


The MPC algorithm has the following closed loop properties:


? 2 ⇥k

2 F(xk, ⇥k) 6= ;3 Fxk + Guk 1

If µ = 0, then5 the closed loop system is input-to-state stable, i.e.

kxT k ⌘(kxkk, T � k) + ( maxi2{k,...,T�1}

kwjk) + ⇣(k✓k � ✓⇤k)

for some KL-function ⌘, some K-functions , ⇣ and all k, T .

26

Regulation example

2nd order linear system with

(A(✓), B(✓)) = (A0, B0) +3X

i=1

(Ai, Bi)✓i

A0 =

0.5 0.2

�0.1 0.6

�A1 =

0.042 00.072 0.03

�A2 =

0.015 0.0190.009 0.035

�A3 =

0 0

�0 0

�

B0 =

0

0.5

�B1 =

00

�B2 =

00

�B3 =

0.03970.059

�

B true parameter ✓? = [0.8 0.2 �0.5]>, initial set ⇥0 = {✓ : k✓k1 1}.

B disturbance uniformly distributed on W = {w 2 R2 : kwk1 0.1}, wk

B state and input constraints: [x]2 � �0.3 and uk 1.

27

Regulation example: constraint satisfaction

Figure: Realized closed-loop trajectory from initial condition x0 = [3 6]> (red line),

predicted state tube at time k = 0 (tube cross-sections: blue, terminal set: pink)

28

Regulation example: constraint satisfaction

0 1 2 3 4 5 6 7 8time step k

-1

0

1

2

3

4

5

contr

ol i

nput u

k

Figure: Realized closed-loop trajectory from initial condition x0 = [3 6]> (red line),

predicted control tube at time k = 0 (tube cross-sections: blue)

28

Persistent excitation

PE condition evaluated over a future horizon is nonconvex in ui|k, xi|k:

(PE):N�1X

i=0

D>(xi|k, ui|k) D(xi|k, ui|k) ⌫ �I

Linearise:

let ui|k = ui|k + ui|k and xi|k = xi|k + xi|k, where x0|k = xk and

ui|k = Kxi|k + v⇤i+1|k�1

xi+1|k = A(✓k)xi|k + B(✓k)ui|k

then Di|k = Di|k + Di|k, where Di|k = D(xi|k, ui|k), Di|k = D(xi|k, ui|k)

D>i|kDi|k = D

>i|kDi|k + D

>i|kDi|k + D

>i|kDi|k + D

>i|kDi|k

⌫ D>i|kDi|k + D

>i|kDi|k + D

>i|kDi|k

so D>i|kDi|k + D

>i|kDi|k + D

>i|kDi|k ⌫ �I =) D

>i|kDi|k ⌫ �I

29

Persistent excitation

A su�cient condition forN�1X

i=0

D>i|kDi|k ⌫ �I

is the LMI in xi|k, ui|k,�:

(PE-LMI):N�1X

i=0

�D

>i|kDi|k + D

>i|kDi|k + D

>i|kDi|k

�⌫ �I

This can be expressed in terms of

xi|k 2 Xi|k � {xi|k}ui|k 2 K(Xi|k � {xi|k}) + {vi|k} � {v

⇤i+1|k�1}

using

Di|k 2 con

D�U

j↵i|k � xi|k, K(U j

↵i|k � xi|k) + vi|k � v⇤i+1|k�1

�o

Hence (PE-LMI) is equivalent to an LMI in optimization variables vk,↵k,�

30

Robust adaptive MPC algorithm with PE condition

O✏ine: Choose ⇥0, X , �, feedback gain K, and compute P

Online, at each time k = 1, 2, . . .:

1 Given xk, update set (⇥k) and point (✓k) parameter estimates, andcompute xi|k, ui|k, i = 0, . . . , N � 1

2 Compute the solution (v⇤k,↵⇤

k,⇤⇤k) of the semidefinite program

minvk,↵k,⇤k,�

J(xk, ✓k,vk) � ��

subject to (vk,↵k,⇤k) 2 F(xk, ⇥k) and (PE-LMI)

3 Apply the control law u⇤k = Kxk + v

⇤0|k

31

PE example

0 200 400 600 800 1000

Time step, t

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Pa

ram

ete

r se

t vo

lum

e,

vol(

t)

= 103

= 102

= 1 = 0

Figure: Parameter set volume vol(⇥t) vs cost weight �

32

PE example

10-2 100 102

Weighting,

-10 -2

-10 -3

-10 -4

-10 -5

Optim

al

in P

E c

onst

rain

t

10-2 100 102

Weighting,

10-7

10-6

PE

coeffic

ient

1

Figure: Minimum eigenvalue of information matrix vs cost weight �

33

PE example

10-1 100

Ensemble average 1

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

Siz

e o

f para

mete

r se

t 9

average side lengthinner radiusbest fit lineouter radius

10-1 100

Ensemble average 1

0

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045

Volu

me o

f para

mete

r se

t est

imate

9

Figure: Size of parameter set after 10 time steps vs minimum eigenvalue of

information matrix

34

Time-varying parameters

Assumption (time-varying parameters)

There exists a constant r✓ such that the parameter vector ✓?k satisfies

✓?k 2 ⇥0 for all k and k✓?

k+1 � ✓?kk r✓

Define the dilation operator:

Rj(⇥) = {✓ : H⇥✓ h + jr✓1}

Then the minimal parameter set at k + 1 is

⇥k+1 = R1(⇥k \ �k+1) \ ⇥0

and ⇥k is replaced in the tube MPC constraints by

⇥i|k = Ri(⇥k) \ ⇥0

35

Robust adaptive MPC algorithm with time-varying

parameters

Parameter estimate bounds and recursive feasibility properties are unchanged:

Theorem (Closed loop properties)


? 2 ⇥k

2 F(xk, ⇥k) 6= ;3 Fxk + Guk 1

But the LMS filter has an additional tracking error, which invalidates thel2-stability properties, i.e. “certainty equivalence” no longer applies

36

Time-varying parameters example

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

Figure 4. Parameter membership set for the sys-tem with time-varying parameters at time stepsk 2 {0, 100, 200, 300, 400, 500}.

creases conservatism and can increase performance. Yet,due to the number of equality constraints in the MPCoptimization program, a significant increase in compu-tation time was observed with increasing complexityof X0. Furthermore, note that the scalar input allowsthe decomposition of the PE input constraint into twolinear constraints, leading to two convex QP problemsto be solved and compared in each MPC iteration [24].

6 Conclusions

A computationally tractable model predictive control al-gorithm with recursive parameter update has been pre-sented that provides guarantees for closed-loop stabilityand robust constraint satisfaction. The requirements forstability and constraint satisfaction are considered sep-arately. This leads to a set-membership parameter es-timation scheme being employed to derive bounds onthe state and input predictions whereas a Least MeanSquares filter is used to achieve a finite gain from thedisturbance to the state. The online optimization to besolved is a linearly constrained quadratic program andproven to be recursively feasible. Two numerical exam-ples are provided to demonstrate the e�ectiveness of theproposed algorithm.

Extensions for time-varying parameters and for PE re-gressors are discussed explicitly. As the MPC schemeis formulated in a modern state-space framework, theproposed setup provides a solid framework for adaptiveMPC algorithms and can be easily combined with fur-ther results tailored to specific control objectives, e.g.,tracking or output feedback MPC.

Compared to the classical adaptive control literature,the assumptions made are necessarily more restrictivein order to allow a robust MPC formulation. The use

of less restrictive assumptions in combination with softconstraints or chance constraints are currently under in-vestigation. In particular for the time-varying case, itwould furthermore be of interest to derive bounds onthe estimation error, which could then be used to relaxAssumptions 8 and 11 to a parameter dependent presta-bilizing feedback and terminal constraint. Finally, ques-tions on optimal excitation of the system as discussedin [12] remain open for future research.

A Appendix

A.1 Computation of the terminal region

As shown in [8] and [29], a terminal set Xf satisfying As-sumption 11 can be computed recursively by the follow-ing algorithm. With X0 and f as given above, i.e., X0 ={x 2 Rn | Hxx 1} and [f ]i = maxx2X0 [F +GK]ix, let

X0f = {(z, ↵) 2 Rn � R�0 | (F + GK)z + ↵f 1}

and define

Xi+1f =

��

��(z, ↵)

��

9(z+, ↵

+) 2 Xif s. t.

Acl(✓)({z} � ↵X0) � W✓ {z

+} � ↵+X0 8✓ 2 ⇥

��

��\ X0

f .

(A.1)The sets Xi

f , i 2 N are non-increasing with i and theterminal set is given by the limit for i ! 1. Under thegiven assumptions, the sequence converges in finite timesuch that Xf = Xi

f for some i 2 N satisfying Xif = Xi+1

f .

With {✓k}k2Nvp

1being the vertices of the set ⇥, [hk

x]i =

maxx2X0 [Hx]iAcl(✓k)x, and

Xi+1f =

��

��

(z, ↵, z+, ↵

+) |Hx

�Acl(✓

k)z � z+�+ h

kx↵ + 1↵

+ �w

8k 2 Nvp

1

��

��,

the recursion (A.1) can be computed by Xi+1f =

Projn+1(Xi+1f ) where Projn+1 is the projection onto the

first n + 1 coordinates.

As the projection of polytopes can be computationallydemanding, the recursion can be simplified through set-ting z = z

+ = 0 and determining only a suitable ↵ sat-isfying Assumption 11.

A.2 Proof of Lemma 24

PROOF. [Lemma 24] Let xk be the solution of (1) withwk � 0. By [14, Corollary 2.4] {uk}k being PE implies

13

Figure: Parameter set ⇥k at times k 2 {0, 100, 200, 300, 400, 500} for the

time-varying system with r✓ = 0.01

37

Time-varying parameters example

for all possible predicted states and the state convergesto a neighborhood of the origin. Similarly, the input con-straints (not plotted) are satisfied for all k 2 N.

�0.5 0 0.5 1 1.5 2

0

1

2

3

[xk]1

[xk] 2

Figure 1. Realized closed-loop trajectory from initial condi-tion x0 = [2 3]>, predicted state tube at time k = 0, andconstraint [xk]2 � �0.3.

To highlight the parameter estimation, the PE condi-tion as described in Section 4.2 has been implemented,following [24], via an additional constraint on the input

P�1X

l=0

uk�lu>k�l ⌫ ↵I (29)

with P = n + 1 and ↵ = 2. Starting from an initial con-dition x0 = [0 0]>, the closed loop exhibits a persistentlyexciting regressor, with the typical cyclic state and in-put (Figure 2). Due to the state constraint, the centerof the trajectory path is shifted to the positive orthant,such that the closed-loop state trajectory does not vio-late the constraint [xk]2 � �0.3. As predicted by Propo-sition 23, the parameter membership set converges to asingleton (Figure 3). Given the realized state and inputtrajectory, falsified parameters are removed and the un-certainty set is non-increasing.

Finally, to demonstrate the capability of handling time-varying systems, in the following, the problem setup hasbeen changed to a time-varying parameter ✓

⇤k with ✓

⇤0 =

✓⇤ and a bound on the variation of k✓

⇤k+1 � ✓

⇤kk 0.01.

In the simulation, the parameter has been taken to be aperiodic deterministic function in time. Each parameteris increased/decreased linearly by 0.01�

3, i.e.

[✓⇤k+1]i = [✓⇤

k]i ± 0.01�3

,

where the sign is changed upon hitting the boundary of⇥. As above, the simulation has been initialized with

�0.2 0 0.2 0.4

0

0.5

[xk]1

[xk] 2

0 20 40 60 80 100�1

�0.50

0.51

time step k

input

uk

Figure 2. Closed-loop state and input trajectory with en-forced PE input (solid line), state and input constraints(dashed line).

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

�10

1

�10

1

�1

0

1

Figure 3. Parameter membership set at time stepsk 2 {0, 5, 25, 70, 120, 500}.

x0 = [0 0]> and the additional PE constraint (29). Fig-ure 4 shows the estimated parameter set at samplingtimes k = 1, 100, 200, 300, 400 and 500. Instead ofconvergence to a singleton as in Figure 3, the parameterset varies in position, shape, and size.

The simulations were performed in Matlab with Yalmipfor setting up the optimization program, which wassolved using MOSEK. The median solver time (with PEconstraint) reported by Yalmip was 0.068s (0.10s) with amaximum of 0.095s (0.19s) and minimum of 0.05s on anIntel Core i7 with 3.4GHz. Choosing X0, i.e. the shapeof the tube cross sections, to be the minimal robustlyforward invariant set under the local control law de-

12

Figure: Parameter set ⇥k at times k 2 {0, 5, 25, 70, 120, 500} for the

non-time-varying case for comparison

37

Di↵erentiable MPC

MPC law: uN (xk, ✓k, ⇥k) is the solution of a multiparametricprogramming problem

Di↵erentiable MPC uses the gradient r✓uN (·) to train a neural network

(NN) with weights ✓k via back-propagation

Update ✓k as MPC optimization parameters embedded in a NN layer;retain parameter set estimate ⇥k for safe constraint handling

38

Di↵erentiable MPC: learning model parameters

Linearly parameterised system model:xk+1 = f(xk, uk, ✓

⇤) + wk

f(xk, uk, ✓) = Dk✓ + dk

⇢Dk = D(xk, uk)

dk = d(xk, uk)

Parameter set estimate:⇥k+1 ◆ ⇥k \ �k+1

�k+1 = {✓ : xk+1 � Dk✓ � dk 2 W}

Imitation learning problem: identify ✓⇤ by observing an expert controller

Train ✓k to minimize a loss function

1

T

kX

t=k�T+1

⇣kut � uN (xt, ✓k, ⇥k)k2 + �kwtk2

⌘

whereut = {ut, . . . , ut+N�1} = observed expert control sequenceuN (xt, ✓k, ⇥k) = {u0|t, . . . , uN�1|t} = MPC law for an initial state xt

wt = xt+1 � f(xk, uk, ✓k) = 1-step ahead error

39

Di↵erentiable MPC: learning the MPC performance index

Platoon problem:

y1 y2 yn

regulate y1, . . . yn a so that yi+1 � yi ! 0 subject to yi+1 � yi � y

a yi b

Prior assumptions:

System model (yi = ui) is known

y, a, b are known

Unknown MPC cost to be learnt from observations of an expert controller

42

Conclusions

Stable adaptive robust MPC is computationally tractable

Set-membership parameter estimation and point estimates define MPCcost functions and robust constraints

Nonconvex PE conditions can be relaxed to convex su�cient conditions

Future work

How to ensure recursive feasibility while enforcing PE constraints?

Can we relax the assumption of bounded disturbances?

How to combine general adaptive parameter estimation withset-membership bounds?

References:– M. Lorenzen, M. Cannon, F. Allgower, Automatica, 2019

– X. Lu, M. Cannon, American Control Conference, 2019

– S. East, M. Gallieri, J.Masci, J. Koutnik, M. Cannon,

Neural Information Processing Systems (NeurIPS), 2019

– X. Lu, M. Cannon, D. Koksal-Rivet, arxiv.org:1911.00865, 201945

adaptive model predictive control: robustness and parameter estimation · 2021. 5. 28. · state...

Documents