Stochastic control theory
External system description
Niels Kjølstad Poulsen, Department of Informatics and Mathematical Modelling
The Technical University of Denmark
Version: 15 January 2009 (A4)
Abstract
These notes are intended for use in connection to the course in Stochastic Adaptive Control (02421) given at the Department of Mathematical Modelling, The Technical University of Denmark.
This report is devoted to control of stochastic systems described in discrete time. We are concerned with external descriptions or transfer function models, where we have a dynamic model for the input-output relation only (i.e. no direct internal information). The methods are based on LTI systems and quadratic costs.
We will start with the basic minimal variance problem. This control strategy is based on a one step criterion and is known in many cases to require a very high control effort. We will then move on to more advanced, but still one step, strategies, such as Pole-Zero control, Generalized Stochastic Pole placement control and Generalized Minimum Variance control, all aiming at reducing the control power to a reasonable level. These methods can be regarded as extensions of the basic minimal variance strategy and all have a close relation to prediction. Consequently, a section on that topic can be found in the appendix.
The next step in the development is the multi step strategies, where the control action is determined with due respect to the performance over a future period of time. The Generalized Predictive Control (GPC) methodology is a special case of Model based Predictive Control (MPC), which aims at optimizing the performance over a finite future period of time.
The Linear Quadratic Gaussian controller can be regarded as a limit of the GPC controller, since it aims at optimizing a quadratic performance in steady state, i.e. it considers the problem over an infinite horizon.
Contents
1 Introduction
2 Minimal Variance Control
3 MV0 control
4 MV1 control
5 MV1a control
6 Pole-Zero (PZ) control
7 Generalized Stochastic Pole Placement (GSP) Control
8 Generalized Minimum Variance (GMV) Control
8.1 MV1 Control
8.2 MV3 Control
9 Generalized Predictive (GPC) Control
10 Linear Quadratic Gaussian (LQG) Control
A Polynomials, transfer functions and LTI systems
B Prediction
B.1 Prediction in the ARMA structure
B.2 Simple prediction in the ARMAX structure
B.3 Prediction in the ARMAX structure
C The Diophantine Equation
C.1 The Sylvester method
C.2 Impulse response method
D Closed loop properties
1 Introduction
It is assumed that the system to be controlled is a linear, time invariant (LTI), SISO (single input, single output) system. SISO systems are also denoted scalar systems and have one control signal (input signal), ut, and one output signal, yt.
yt = q−k (B(q−1)/A(q−1)) ut + vt    (1)
The signal, vt, models the total effect of the disturbances.
In general the time delay satisfies k ≥ 0 due to causality. If for example ut is a measured input signal, then k = 0 might be the case. In a control application, where the sampling of the output is carried out before the determination and effectuation of the control action, the time delay is larger than zero (i.e. k ≥ 1). If the underlying continuous time system does not contain any natural time delay, then the discrete time system will have k = 1 (i.e. only a time delay due to the sampling procedure).
If the total effect (vt) of the disturbances is a weakly stationary process and has a rational spectrum, then we can model the effect as:

vt = (C(q−1)/D(q−1)) et

where et ∈ F(0, σ²) is a white noise sequence that is uncorrelated with past output signals (yt−i, i = 1, 2, ...).
In this presentation we will assume the system is given by an ARMAX structure (autoregressive moving average model with external input), also denoted CARMA (controlled autoregressive moving average model), which can be written as
A(q−1)yt = q−kB(q−1)ut + C(q−1)et (2)
or as
yt = q−k (B(q−1)/A(q−1)) ut + (C(q−1)/A(q−1)) et
Figure 1. Stochastic system in the ARMAX form
The driving noise sequence, et ∈ F(0, σ²), is a white noise sequence and is uncorrelated with past output signals (yt−i, i = 1, 2, ...). The 3 polynomials
A(q−1) = 1 + a1q−1 + ... + ana q−na
B(q−1) = b0 + b1q−1 + ... + bnb q−nb
C(q−1) = 1 + c1q−1 + ... + cnc q−nc
are assumed to have the orders na, nb and nc, respectively. The two polynomials, A and C, are assumed to be monic, i.e. A(0) = 1 and C(0) = 1. Furthermore the reciprocal polynomial C⋆(z) = znc C(z−1) has no roots outside the unit circle. This latter assumption is justified by the spectral representation theorem.
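To make the model concrete, the ARMAX recursion above can be simulated directly. The sketch below is ours (the function name and the example coefficients, taken from Example 2.2 later in the text, are our choices); it generates yt from coefficient lists for A, B and C:

```python
import numpy as np

def simulate_armax(a, b, c, k, e, u):
    """Simulate A(q^-1) y_t = q^-k B(q^-1) u_t + C(q^-1) e_t.

    a = [1, a1, ..., a_na], b = [b0, ..., b_nb], c = [1, c1, ..., c_nc]
    are ascending coefficient lists in q^-1; k is the time delay
    (k >= 1 in a sampled control loop); e and u are the noise and
    input sequences."""
    y = np.zeros(len(e))
    for t in range(len(e)):
        # moving-average part: C(q^-1) e_t
        acc = sum(ci * e[t - i] for i, ci in enumerate(c) if t - i >= 0)
        # delayed input part: q^-k B(q^-1) u_t
        acc += sum(bi * u[t - k - i] for i, bi in enumerate(b) if t - k - i >= 0)
        # autoregressive part: -(a1 y_{t-1} + ... + a_na y_{t-na})
        acc -= sum(a[i] * y[t - i] for i in range(1, len(a)) if t - i >= 0)
        y[t] = acc
    return y

# open loop response (u = 0) of the system used later in Example 2.2
rng = np.random.default_rng(0)
e = np.sqrt(0.1) * rng.standard_normal(500)
y = simulate_armax([1, -1.7, 0.7], [1, 0.5], [1, 1.5, 0.9], 1, e, np.zeros(500))
```

With u = 0 the output is simply the ARMA process C/A driven by et, which is the uncontrolled benchmark the following sections try to improve on.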
Remark: 1 The ARMAX (above) and the BJ structure:

yt = q−k (B(q−1)/F(q−1)) ut + (C(q−1)/D(q−1)) et

can be regarded as extreme versions of the more general L-structure:

A(q−1)yt = q−k (B(q−1)/F(q−1)) ut + (C(q−1)/D(q−1)) et

This structure is often obtained if the model is the result of system identification. If we are willing to accept common factors, we can always transform a description from one structure to another.
✷
2 Minimal Variance Control
Minimal variance control has been described in a huge part of the literature. One of the most well-known references is (Astrom 1970).
Example: 2.1 The following example is a modified (and reduced) version of Example 3 in (Astrom 1970). Consider the problem of producing paper with a certain thickness. In this process paper pulp is transformed into a continuous line of paper. Due to variation in e.g. raw materials the thickness of the paper varies. A controller is installed to reduce the variation, and the set point (the average thickness of the paper) is adjusted such that the probability of having a paper thickness less than a certain limit is at a specified level. This is illustrated in Figure 2 (with an exaggerated probability).
Figure 2. Density functions of the paper thickness for two controllers (minimal variance control); x-axis: thickness.
In Figure 2 the situation is illustrated for two controllers when the lower limit for the paper thickness is 10 (in scaled units). It is quite clear that for a controller resulting in a low variance of the paper thickness the set point can be lower while still producing paper of the same quality. ✷
In Appendix B we have investigated methods for optimal prediction. This facilitates the ability to evaluate the effect of a given control sequence. In this section we will solve the inverse problem, which consists in finding the control sequence that in an optimal way brings the system to a desired state.
We will start with the basic minimal variance controller, which aims (in stationarity) to minimize the following cost function:
J = E{y²t+k}    (3)
We assume the system and the disturbance are given by the ARMAX model in (2). Furthermore, we assume that the polynomials B and C have all their roots inside the unit disk. In that situation we have the following theorem.
Theorem: 2.1: Assume the system is given by (2). The solution to the basic minimal variance control problem is given by the controller:
B(q−1)G(q−1)ut = −S(q−1)yt (4)
where G and S are polynomials with orders:

ord(G) = k − 1    ord(S) = max(na − 1, nc − k)
and are the solution to the Diophantine equation:
C(q−1) = A(q−1)G(q−1) + q−kS(q−1) (5)
In stationarity the closed loop is characterized by:

yt = G(q−1)et    ut = −(S/B) et
Notice the control error (yt) is a MA(k)-process. ✷
Proof: Consider the situation at an instant of time t. Since the time delay through the system is k, the control action, ut, can only affect the situation at the instant t + k and onwards. According to Theorem B.2 we have the following:
yt+k = (1/C)[BGut + Syt] + Get+k
and consequently:

Jt+k = E{y²t+k} = E{[(1/C)(BGut + Syt)]²} + E{[Get+k]²}

since Get+k = et+k + · · · + gk−1et+1 is independent of Yt. In particular, the last term is independent of ut. The optimum of J occurs if the first term is canceled (equal to zero). This is valid for the given controller if the polynomial C has all its zeroes inside the unit disk. If the first term is zero, then the output is in closed loop (and under stationary conditions)
yt = Get
The closed loop expression for the control comes directly from this and the control law (4). ✷
Remark: 2 Notice, this control is equivalent to ensuring (by a proper choice of ut) that the (k-step ahead) prediction of yt+k is zero. ✷
Remark: 3 Notice, the poles of the closed loop are the roots of:

ABG + q−kBS = B(AG + q−kS) = BC

That means that the basic minimal variance controller is only able to stabilize systems with a stable inverse (discrete time minimum phase systems), i.e. systems whose zeroes (of the B polynomial) are well inside the unit disk. Furthermore the C polynomial must have the same properties. ✷
Example: 2.2 Assume that the analysis of a dynamic system and its disturbances has resulted in a model as in (2) with:
A = 1− 1.7q−1 + 0.7q−2
B = 1 + 0.5q−1 k = 1
C = 1 + 1.5q−1 + 0.9q−2 et ∈ F(0, σ2)
Figure 3. Basic minimal variance control and an ARMAX system.
Firstly, we will investigate the situation for k = 1. In the design we have the Diophantine equation (5), which in this case is:

(1 + 1.5q−1 + 0.9q−2) = (1 − 1.7q−1 + 0.7q−2) · 1 + q−1(s0 + s1q−1)

The solution can be found in different ways. The most straightforward is to identify the coefficients of q−i, which results in:
q0 : 1 = 1 (6)
q−1 : 1.5 = −1.7 + s0 (7)
q−2 : 0.9 = 0.7 + s1 (8)
or in s0 = 3.2 and s1 = 0.2. In other words:
G = 1 S = 3.2 + 0.2q−1
The minimal variance controller is therefore given by:
ut = −(S/(BG)) yt = −((3.2 + 0.2q−1)/(1 + 0.5q−1)) yt
or by: ut = −0.5ut−1 − 3.2yt − 0.2yt−1
With this strategy we will have in closed loop (and in stationarity):

yt = et    ut = −((3.2 + 0.2q−1)/(1 + 0.5q−1)) et
From this we can easily find that var(y) = 0.1 and var(u) = 0.1518.
We will now focus on how much the performance of the controller deteriorates if the time delay is increased, e.g. to k = 2. In this situation the Diophantine equation becomes:
(1 + 1.5q−1 + 0.9q−2) = (1 − 1.7q−1 + 0.7q−2)(1 + g1q−1) + q−2(s0 + s1q−1)
and the solution to the equation system:

q−i  LHS  RHS
0:   1    1
1:   1.5  −1.7 + g1
2:   0.9  0.7 − 1.7g1 + s0
3:   0    0.7g1 + s1

is: g1 = 3.2, s0 = 5.64, s1 = −2.24
The minimal variance controller is in this situation:

ut = −(S/(BG)) yt = −((5.64 − 2.24q−1)/(1 + 3.7q−1 + 1.6q−2)) yt
or: ut = −5.64yt + 2.24yt−1 − 3.7ut−1 − 1.6ut−2
The stationary error is: yt = et + 3.2et−1
which has a variance equal to: Var{yt} = (1 + 3.2²)σ² = 11.24σ²
In this example the variance of the error will increase if the time delay is increased. ✷
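The coefficient matching used in this example can be automated. The sketch below is our implementation (a long-division variant of the methods treated in Appendix C); it solves C = AG + q−kS with the orders stated in Theorem 2.1 and reproduces both the k = 1 and k = 2 solutions above:

```python
import numpy as np

def diophantine(a, c, k):
    """Solve C(q^-1) = A(q^-1) G(q^-1) + q^-k S(q^-1) by matching
    coefficients of q^-i (a[0] = c[0] = 1).  Returns (g, s) with
    ord(G) = k - 1 and ord(S) = max(na - 1, nc - k)."""
    na, nc = len(a) - 1, len(c) - 1
    ng, ns = k - 1, max(na - 1, nc - k)
    g, s = np.zeros(ng + 1), np.zeros(ns + 1)
    for i in range(max(nc, na + ng, k + ns) + 1):
        acc = c[i] if i <= nc else 0.0
        # subtract the already known contributions a_j * g_{i-j}
        for j in range(1, min(i, na) + 1):
            if i - j <= ng:
                acc -= a[j] * g[i - j]
        if i <= ng:
            g[i] = acc          # uses a[0] = 1 (A is monic)
        elif i - k <= ns:
            s[i - k] = acc
    return g, s

a, c = [1, -1.7, 0.7], [1, 1.5, 0.9]
g1, s1 = diophantine(a, c, 1)   # g1 = [1],      s1 = [3.2, 0.2]
g2, s2 = diophantine(a, c, 2)   # g2 = [1, 3.2], s2 = [5.64, -2.24]
```

The first call reproduces G = 1, S = 3.2 + 0.2q−1 and the second G = 1 + 3.2q−1, S = 5.64 − 2.24q−1, matching the hand calculations above.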
Example: 2.3 Consider a system given in the ARMAX form:
A = 1− 1.5q−1 + 0.95q−2 B = 1 + 0.5q−1 k = 1
C = 1− 0.95q−1 σ2 = (0.1)2
For this system the basic minimal variance controller is given by:
R = BG = 1 + 0.5q−1 S = 0.55 − 0.95q−1
or as: ut = −0.5ut−1 − 0.55yt + 0.95yt−1
The output signal and the control are shown in the stationary situation in Figure 4. The transient phase (after cut-in) can be studied in Figure 5. Notice the reduction in variance just after cut-in. ✷
Figure 4. The output signal and the control in Example 2.3 (panels: output and reference; control signal).
Figure 5. The output signal and the control in Example 2.3.
Example: 2.4 In this example we will study the effect of the time delay, k. Assume that the system is the same as in Example 2.3. For k = 1 the controller is as discussed in Example 2.3. For k = 2 the control polynomials are:

R = 1 + 1.05q−1 + 0.275q−2    S = −0.125 − 0.53q−1

The output signal and the control are depicted under stationarity for k = 1, 2 in Figure 6. Notice the small increment in the variance of the output signal due to the increased time delay. Also notice the reduction in control effort.
Figure 6. The output signal and the control from Example 2.4 (output and reference signal, and control signal, for k = 1 and k = 2).
For k = 1 and k = 2 the G polynomial is:

G1 = 1    G2 = 1 + 0.55q−1

That means an increased variance; the ratio equals 1.3025 = 1 + (0.55)² when k is increased from 1 to 2.
In the table below, the empirical variance ratio, the theoretical variance ratio and the ratio in control variance are listed for 12 experiments. All numbers are in %.

empirical ratio   theo. ratio   ratio in control variance
149.1580          130.2500      16.9856
144.9701          130.2500      15.5137
148.4734          130.2500      14.9431
130.1090          130.2500      13.7075
142.1038          130.2500      12.7749
134.7121          130.2500      12.9202
133.3890          130.2500      15.1116
123.7364          130.2500      11.5495
140.2522          130.2500      14.0588
114.9559          130.2500      12.6139
129.8356          130.2500      12.5119
123.1263          130.2500      11.3916
✷
Example: 2.5 Let us continue Example 2.4 for k = 2, but with
B = 0.5 + 0.25q−1
In this case: R = 0.5 + 0.525q−1 + 0.1375q−2, S = −0.125 − 0.53q−1
That means a controller given by:

ut = −((−0.125 − 0.53q−1)/(0.5 + 0.525q−1 + 0.1375q−2)) yt
With this controller in action the closed loop is given by:

yt = (1 + 0.55q−1)et    ut = −((−0.125 − 0.53q−1)/(0.5 + 0.25q−1)) et
From these expressions, we can determine various statistical properties such as variances, (auto and cross) spectral densities and (auto and cross) correlation functions. Probably the most important are the variances

σ²y = 0.0130    σ²u = 0.0119
These values can be compared with experimental (from simulations) values

σ²y = (1/N) Σ_{i=1}^{N} y²i    σ²u = (1/N) Σ_{i=1}^{N} u²i
Figure 7. Histogram for the empirical variance for the output (y).
Figure 8. Histogram for the empirical variance for the control input (u).
Figure 9. Pareto plot (var(u) vs. var(y)) from Example 2.5. The theoretical values are σ²y = 0.0130 and σ²u = 0.0119.
In Figures 7-8 the histograms for a large number of runs (10^5) are plotted. Each run has a run length equal to 500. The Pareto plot (σ²y vs. σ²u) is shown in Figure 9.
✷
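A small Monte Carlo sketch in the same spirit (our code; far fewer runs than the 10^5 above, run length 500 and σ² = 0.01 as in the example) simulates the stationary closed loop yt = (1 + 0.55q−1)et directly and collects the empirical variances:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2, N, runs = 0.01, 500, 200

var_y = np.empty(runs)
for r in range(runs):
    e = np.sqrt(sigma2) * rng.standard_normal(N)
    # stationary closed loop of Example 2.5: y_t = e_t + 0.55 e_{t-1}
    y = e + 0.55 * np.concatenate(([0.0], e[:-1]))
    var_y[r] = np.mean(y ** 2)

# the run-to-run scatter around the theoretical 1.3025 * sigma2 = 0.0130
# is what the histograms above display
print(round(var_y.mean(), 4))
```

The spread of `var_y` around 0.0130 illustrates why a single run of length 500 gives only a rough estimate of the closed loop variance.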
Example: 2.6 Let us now focus on a system as in Example 2.3, just with:
B = 1 + 0.95q−1
where the system zero (in −0.95) is close to the stability limit. For this system the minimal variance controller is:

R = BG = 1 + 0.95q−1    S = 0.55 − 0.95q−1

i.e. ut = −0.95ut−1 − 0.55yt + 0.95yt−1
The output and control signals under stationary conditions are depicted in Figure 10. Notice the oscillations in the control signal. ✷
Figure 10. Output and control signal in Example 2.6.
3 MV0 control
In the previous section we have dealt with the regulation problem without a reference signal (or with the reference equal to zero). In this section we will extend the results in order to cope with a (non zero) reference signal or a set point. Consequently, let us focus on a control in which the cost function
J = E{(yt+k − wt)²}    (9)
is minimized.
Theorem: 3.1: Assume the system is given by (2). The MV0 controller, which minimizes (9), is given by the control law
B(q−1)G(q−1)ut = C(q−1)wt − S(q−1)yt (10)
where G and S are solutions to the Diophantine equation
C(q−1) = A(q−1)G(q−1) + q−kS(q−1) (11)
with orders: ord(G) = k − 1, ord(S) = max(na − 1, nc − k)
In stationarity the control error is

yt − wt−k = G(q−1)et

which is a MA(k) process. ✷
Proof: From (64) we have

yt+k = (1/C){BGut + Syt} + Get+k

and furthermore that

yt+k − wt = (1/C)[BGut + Syt − Cwt] + Get+k
Now

J = E{(yt+k − wt)²} = [(1/C)(BGut + Syt − Cwt)]² + Var{Get+k}
which takes its minimum for the control law given in the Theorem. ✷
Theorem: 3.2: Let the assumptions in Theorem 3.1 be valid and let the situation be stationary. Then for the system in (2) the MV0 controller will give

yt = q−kwt + Get

and

ut = (A/B)wt − (S/B)et
in closed loop. ✷
Proof: The closed loop expression for the output comes directly from Theorem 3.1. If this is introduced in the control law, then

BGut = Cwt − Syt
     = Cwt − S(q−kwt + Get)
     = AGwt − SGet

or as stated in the theorem. Notice, we have used the Diophantine equation (11) in the middle equality. ✷
4 MV1 control
In the previous section we saw that the basic minimum variance controllers (MV and MV0) indeed require a lot of control action. Let us then focus on a control in which the cost function has a term related to the control action, i.e. a control in which
J = E{(yt+k − wt)² + ρu²t}    (12)
is minimized.
Theorem: 4.1: Assume the system is given by (2). The MV1 controller, which minimizes (12), is given by the control law
(BG + αC)ut = Cwt − Syt,    α = ρ/b0    (13)
where G and S are solutions to the Diophantine equation
C = AG+ q−kS (14)
with orders: ord(G) = k − 1, ord(S) = max(na − 1, nc − k)
✷
Proof: As in Theorem 3.1 we have from (64) that

yt+k = (1/C){BGut + Syt} + Get+k

and furthermore that

yt+k − wt = (1/C)[BGut + Syt − Cwt] + Get+k
Now

J = E{(yt+k − wt)² + ρu²t} = [(1/C)(BGut + Syt − Cwt)]² + ρu²t + Var{Get+k}
which takes its minimum for

(2b0/C)[BGut + Syt − Cwt] + 2ρut = 0
or as given in the theorem. ✷
Theorem: 4.2: Let the assumptions in Theorem 4.1 be valid and let the situation be stationary. Then for the system in (2) the MV1 controller will give

yt = q−k (B/(B + αA)) wt + ((BG + αC)/(B + αA)) et

and

ut = (A/(B + αA)) wt − (S/(B + αA)) et
in closed loop. ✷
Proof: Firstly, focus on the output yt. If the control law, (13), is introduced in the system description (2) then

yt = q−k (B/A)[(C/(BG + αC))wt − (S/(BG + αC))yt] + (C/A)et

or (when multiplying with A[BG + αC])
A[BG+ αC]yt = q−kBCwt − q−kBSyt + C(BG+ αC)et
or (after collecting terms involving yt)
(ABG + q−kBS + αAC)yt = q−kBCwt + C(BG + αC)et
If we apply the Diophantine equation (14) we have that
(BC + αAC)yt = q−kBCwt + C(BG + αC)et
or (after canceling C, which has all roots inside the stability area) the closed loop is as stated in the Theorem.
For the control actions we have

(BG + αC)ut = Cwt − S[q−k(B/A)ut + (C/A)et]

or (after multiplying with A)

[ABG + q−kSB + αCA]ut = ACwt − SCet
or (after applying the Diophantine equation (14))
[BC + αCA]ut = ACwt − SCet
or (after canceling C, which has all roots inside the stability area) the closed loop is as stated in the Theorem. ✷
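Note that the MV1 law (13) only changes the denominator polynomial of the MV0 controller from BG to BG + αC. A sketch of assembling the controller polynomials (our helper names; polynomial products via `numpy.convolve`, with right-padding since our coefficient lists are ascending in q−1) for the system of Example 2.2:

```python
import numpy as np

def padd(x, y):
    """Add two ascending q^-1 coefficient lists of possibly different length."""
    out = np.zeros(max(len(x), len(y)))
    out[:len(x)] += x
    out[:len(y)] += y
    return out

def mv1_polynomials(b, c, g, s, rho):
    """Return (R, C, S) for the MV1 law R u_t = C w_t - S y_t, where
    R = B G + alpha C and alpha = rho / b0 (equation (13))."""
    alpha = rho / b[0]
    r = padd(np.convolve(b, g), alpha * np.asarray(c, float))
    return r, np.asarray(c, float), np.asarray(s, float)

# Example 2.2 system, k = 1: G = 1 and S = 3.2 + 0.2 q^-1 solve the
# Diophantine equation; rho = 0.1 is an arbitrary illustrative penalty
r, cpol, spol = mv1_polynomials([1, 0.5], [1, 1.5, 0.9], [1.0], [3.2, 0.2], 0.1)
# r = BG + alpha*C = [1.1, 0.65, 0.09]; rho = 0 gives back R = BG
```

Increasing ρ inflates the denominator polynomial R and thereby damps the control action, at the price of a larger output variance, in line with the closed loop expressions of Theorem 4.2.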
5 MV1a control
The minimum variance controllers (MV and MV0) can in some applications require a high control activity. In order to reduce the variance of the control action the MV1 controller can be applied. Unless the system contains an integration (and then A(1) = 0) the MV1 controller will, for a non zero set point, give a stationary error. The standard workaround is to let the cost include the control move (vt) rather than the control action (ut) itself.
Consequently, let us now focus on a control in which the cost function
J = E{(yt+k − wt)² + ρv²t},    vt = ut − ut−1    (15)
is minimized. Let us introduce the ∆ operator as
∆ = 1− q−1
then the cost in (15) can be written as
J = E{(yt+k − wt)² + ρ(∆ut)²}
Theorem: 5.1: Assume the system is given by (2). The MV1a controller, which minimizes (15), is given by the control law

(BG + αC∆)ut = Cwt − Syt,    α = ρ/b0    (16)
where G and S are solutions to the Diophantine equation
C = AG+ q−kS (17)
with orders: ord(G) = k − 1, ord(S) = max(na − 1, nc − k)
✷
Proof: As in Theorem 3.1 we have from (64) that

yt+k = (1/C){BGut + Syt} + Get+k

and furthermore that

yt+k − wt = (1/C)[BGut + Syt − Cwt] + Get+k
Now

J = E{(yt+k − wt)² + ρv²t} = [(1/C)(BGut + Syt − Cwt)]² + ρ(∆ut)² + Var{Get+k}
which takes its minimum as given in the theorem. ✷
Theorem: 5.2: Let the assumptions in Theorem 5.1 be valid and let the situation be stationary. Then for the system in (2) the MV1a controller will give

yt = q−k (B/(B + α∆A)) wt + ((BG + α∆C)/(B + α∆A)) et

and

ut = (A/(B + α∆A)) wt − (S/(B + α∆A)) et
in closed loop. ✷
Proof: If the control law, (16), is introduced in the system description (2) then

yt = q−k (B/A)[(C/(BG + α∆C))wt − (S/(BG + α∆C))yt] + (C/A)et

or (when multiplying with A(BG + α∆C))

A(BG + α∆C)yt = q−kBCwt − q−kBSyt + C(BG + α∆C)et

After collecting terms and applying the Diophantine equation (17) we have that

(BC + α∆AC)yt = q−kBCwt + C(BG + α∆C)et
or (after canceling C, which has all roots inside the stability area) the closed loop is as stated in the Theorem. ✷
6 Pole-Zero (PZ) control
In the previous sections we saw that the basic minimum variance controllers (MV and MV0) indeed require a lot of control action. One way to reduce the control effort is to introduce a term in the cost function which takes the control effort into consideration. Another method is to relax the requirements on the control error. Rather than requiring that the output follows the reference closely,
yt = q−kwt
(as in the MV0 case) we could require that the output follows the reference in the following way

yt = q−k (Bm/Am) wt
Here the reference model, (Bm, Am), is normally chosen to be faster than the open loop system (the plant), but sufficiently slow to reduce the control action.
Let us then focus on a control in which the cost function

J = E{(Amyt+k − Bmwt)²}    (18)
is minimized.
Theorem: 6.1: Assume the system is given by (2). The PZ-controller, which minimizes (18), is given by the control law
BGut = CBmwt − Syt (19)
where G and S are solutions to the Diophantine equation
AmC = AG+ q−kS (20)
with orders: ord(G) = k − 1, ord(S) = max(na − 1, nc + nm − k)
✷
Proof: As in Theorem 3.1 we have from (64) that

yt+k = (1/C){BGut + Syt} + Get+k

and furthermore that

Amyt+k − Bmwt = (1/C)[BGut + Syt − CBmwt] + Get+k
Now

J = E{(Amyt+k − Bmwt)²} = [(1/C)(BGut + Syt − CBmwt)]² + Var{Get+k}

which takes its minimum for the control law given in the theorem. ✷
Notice that the error Amyt − q−kBmwt asymptotically will approach the stationary error Get. The rate of approach is determined by the roots of the C polynomial.
This controller is a pole placement controller with full zero cancellation. If the observer polynomial Ao is chosen to be Ao = C, then we have a stochastic version with relations to e.g. the MV0 controller.

This controller requires (as does the MV0 controller, which is a special case) perfect knowledge of the time delay through the system.
Figure 11. The structure in a PZ controller
Theorem: 6.2: Let the assumptions in Theorem 6.1 be valid and let the situation be stationary. Then for the system in (2) the PZ-controller will give

yt = q−k (Bm/Am) wt + (G/Am) et

and

ut = (ABm/(BAm)) wt − (S/(BAm)) et
in closed loop. ✷
Proof: Firstly, focus on the output yt. From the proof of Theorem 6.1 we have

Amyt − Bmwt−k = Get

or as stated in Theorem 6.2. For the control actions we have

BGut = BmCwt − Syt
     = BmCwt − S(q−k(Bm/Am)wt + (G/Am)et)
     = (Bm/Am)AGwt − S(G/Am)et

where we in the last line have used the Diophantine equation (20). From this we get the result stated in the theorem.
✷
7 Generalized Stochastic Pole Placement (GSP) Control
As mentioned earlier the PZ controller has a problem if the system has an unstable inverse. This is due to the cancellation of the system zeroes by the PZ controller. It might also be the case that the system has zeroes that are stable, but badly damped. This observation can be the platform for designing a controller that is applicable when the system zeroes are badly damped or even unstable. We just have to accept that the unstable zeroes remain zeroes in the closed loop transfer function (from reference to output). This control strategy is in this presentation denoted GSP-control.
The goal is still to have an output (yt) which is close to some reference model output

ym(t) = q−k (Bm/Am) wt
This is in a stochastic setting related to (but not identical to) minimizing the cost function:

Jt = E{(Amyt+k − Bmwt)²}    (21)
Now, due to the problem with the system zeroes we have to factorize the system numerator polynomial

B = B+B−

into a part suitable for cancellation (B+) and one (B−) which has to be kept in the resulting transfer function from reference to output. In order to comply with the cost function in (21), B− has to be a part of Bm, i.e.

Bm = B−Bm1

where Bm1 contains any extra factors and zeroes. These extra factors can be used for ensuring a closed loop DC gain equal to one.
The design of the GSP controller is summarized in Theorem 7.1.
Theorem: 7.1: Assume the system is given by (1). The Generalized Stochastic Pole placement controller (GSP) is given as:

B+Gut = Bm1Aowt − Syt − (G/B−)d    (22)

where the polynomials, G and S, are solutions to the Diophantine equation:

AoAm = AG + q−kB−S    (23)

Here G(0) = 1, ord(G) = k + nb− − 1 and

ord(S) = max(na − 1, nao + nam − k − nb−)    (24)
The observer polynomial, Ao, is a stable polynomial (i.e. it has only roots inside the stability area). ✷
Notice that often the choice Ao = C is used. This choice makes a closer relation to the MV0 and PZ controllers.

Notice that the PZ controller is a special case (for Ao = C) if

B− = 1    B+ = B
(The choice B− = const will also result in the PZ controller). On the other hand, if
B+ = 1 B− = B
then all the system zeroes are cancelled.
Proof: An external controller can be written as

Rut = Qwt − Syt + γ

and the transfer operator from reference to output can be written as

Hy,w = q−k QB/(AR + q−kBS)    (25)

which according to the design objective has to equal

q−k Bm1B−/Am    (26)

The system zeroes which we wish to maintain in the closed loop transfer operator must be a part of the feed forward term, i.e. Q = Bm1Ao where Ao is a stable polynomial. Since only a part (B+) of B can be cancelled we must have that

R = B+G

where G is a polynomial of suitable order. In order to meet the design objective (i.e. to have the correct transfer from reference to output) the two polynomials G and S must satisfy the Diophantine equation:

AoAm = AG + q−kB−S

where G(0) = 1 and ord(G) = k + nb− − 1. Furthermore,
Ao(Amyt+k − Bmwt) = AGyt+k + B−Syt − AoBmwt    (27)
                  = G(But + Cet+k + d) + B−Syt − AoBmwt    (28)

and then

Amyt+k − Bmwt = (B−/Ao){B+Gut + Syt + (G/B−)d − AoBm1wt} + (C/Ao)Get+k

Consequently, the controller in (22) is a suboptimal solution to the design objective. ✷
Figure 12. Structure in the GSP controller
We can summarize the design of the GSP controller in the following steps:

1. Factorize B = B+B−

2. Choose Am, Bm1 and Ao such that

DC[Bm1B−/Am] = 1

3. Solve AoAm = AG + q−kB−S for S and G

4. Use the controller:

B+Gut = Bm1Aowt − Syt − (G/B−)d
Theorem: 7.2: If the system in (2) is controlled by a stochastic pole placement controller, then the closed loop is given by:

yt = q−k (Bm1B−/Am) wt + (G/Am)(C/Ao) et    (29)

ut = (ABm1/(AmB+)) wt − (S/(AmB+))(C/Ao) et − (1/B)d    (30)
✷
Proof: The proof is just a trivial but technical manipulation of transfer functions. If the controller (22) is multiplied with B− we have

BGut = BmAowt − SB−yt − Gd    (31)

A multiplication of the system in (2) with G gives

AGyt = q−kBGut + CGet + Gd

or by using (31)

AGyt = q−k[BmAowt − SB−yt − Gd] + CGet + Gd

Furthermore (since d is constant, q−kGd = Gd) we have

[AG + q−kB−S]yt = q−kBmAowt + CGet

and

Amyt = q−kBmwt + (GC/Ao) et    (32)

which is identical to (29).
The closed loop expression for the control is obtained in a similar way. If the controller in (22) is multiplied with Am we have:

B+GAmut = Bm1AoAmwt − SAmyt − (GAm/B−)d

If the expression in (32) for Amyt is applied we have furthermore that:

B+GAmut = Bm1AoAmwt − S[q−kBmwt + (GC/Ao)et] − (GAm/B−)d    (33)
        = Bm1AGwt − SG(C/Ao)et − (GAm/B−)d    (34)

Here the Diophantine equation (23) has been used. Hereby (30) simply emerges. ✷
Notice that Bm1 is used in order to ensure a DC-gain equal to one in the closed loop transfer function from reference to output.
8 Generalized Minimum Variance (GMV) Control
This type of control strategy is originally described in the following papers: (Clarke & Gawthrop 1975), (Clarke & Gawthrop 1981), (Gawthrop 1977).
In the previous sections we have dealt with a control strategy which aims at minimizing the cost function:
J = E{[yt+k − wt]2} (35)
That leads to the MV0-controller, which is well known to require a very large control effort. This is because it simply minimizes the variance of the error between output and reference signal. One way to reduce this control effort is to take only a part of the frequency region of the error into account. Another way to reduce the control effort is simply to include the variance of the control signal
in the cost function. In a similar way we could also include only a filtered version of the control signal in the control design. In other words, we can introduce frequency weights. The generalized minimal variance controller is designed such that the cost

J = E{[ȳt+k − w̄t]² + ρū²t}    (36)
is minimized. Here the signals

ȳt = Hy(q)yt    w̄t = Hw(q)wt    ūt = Hu(q)ut    (37)
are filtered or frequency weighted signals. The quantities Hy(q), Hu(q) and Hw(q) are transfer functions and are rational in q. In order to introduce more freedom in the design, we will use both Hy(q) and Hw(q). If these two filters are identical, then the variance of a filtered version of the error between output and reference is minimized. The transfer function Hu(q) is used to reduce the variance of the control action in certain frequency regions. Assume we have the following transfer functions:
Hy(q) = By(q−1)/Ay(q−1)    Hu(q) = Bu(q−1)/Au(q−1)    Hw(q) = Bw(q−1)/Aw(q−1)
where Ay(0) = Au(0) = Aw(0) = Bu(0) = 1 (the weight on the control signal is introduced via ρ).
Theorem: 8.1: Assume the system is given by (2). The Generalized Minimal Variance controller (GMV) is then given by

[AuBG + αCBu]ut = Au[(CBw/Aw)wt − (S/Ay)yt − Gd]    (38)

where

α = ρ/b0
and the polynomials G and S are solutions to the Diophantine equation:
ByC = AyAG+ q−kS (39)
The orders are: ord(G) = k − 1, ord(S) = max(na + nay − 1, nby + nc − k),

where G is not necessarily monic; more precisely G(0) = By(0). ✷
Proof: Since the minimization is based on ut being a function of the available information (i.e. Yt) we have

J⋆ = min_{ut} E{[ȳt+k − w̄t]² + ρū²t} = E[ min_{ut} E{[ȳt+k − w̄t]² + ρū²t | Yt} ]
The control is consequently given by:

ut = arg min_{ut} [ (ˆȳt+k|t − w̄t)² + ρū²t ]
We have (39) and the system description (2) for determining ˆȳt+k|t. More specifically we have:

ByCyt+k = AyAGyt+k + Syt = AyG[But + Cet+k + d] + Syt

or, since ByCyt+k = AyCȳt+k:

ȳt+k = (1/C)[BGut + (S/Ay)yt + Gd] + Get+k    (40)
From this we easily see that:

dȳt+k/dut = b0
The control can be determined as the solution to:

b0(ˆȳt+k|t − w̄t) + ρūt = 0    (41)

or

BGut + (S/Ay)yt − Cw̄t + αCūt + Gd = 0

or as given in (38). ✷
If we define the signal

ζt = ȳt + q−k[αūt − w̄t]    (42)

then the GMV control is equivalent to cancelling the k-step ahead prediction of ζt, i.e.

ˆζt+k|t = ˆȳt+k|t − w̄t + αūt = 0
Let the polynomial R be given as
R = AuBG+ αCBu (43)
then the GMV control can be written as
Rut = C[(AuBw/Aw)wt] − S[(Au/Ay)yt] − AuGd
Here the quantities in [...] are independent of the system (i.e. they only depend on the criterion). We can then write ζt (defined by (42) and by applying (40)) as
ζt+k = (1/C)( R[(1/Au)ut] + S[(1/Ay)yt] − C[(Bw/Aw)wt] + Gd ) + Get+k    (44)
     = ˆζt+k|t + Get+k    (46)
We can denote ζt as a generalized error.
The GMV controller can be interpreted as a simple minimum variance controller applied on ζt.
Figure 13. The GMV controller can be interpreted as a simple minimum variance controller applied on $\zeta_t$.
In steady state we have that

$$\zeta_t = G e_t$$
When a GMV controller is applied, the transient behavior will approach the steady state situation in a way determined by the roots of the $C$-polynomial.
It is possible to regard the GMV controller as being an inner controller and three filters. In other words, the controller is wrapped in filters which only depend on the cost function, as indicated in Figure 14.

It is possible to analyze the steady state situation of the closed loop system. This is summarized in the following theorem.
Figure 14. GMV controller and system
Theorem 8.2: Assume that the system in (2) is controlled by the GMV controller in (38). In steady state the closed loop is given by:

$$\left[ B A_u B_y + \alpha A B_u A_y \right] y_t = q^{-k} \frac{B_w}{A_w} B A_u A_y w_t + R A_y e_t + \delta_y$$

$$\left[ B A_u B_y + \alpha A B_u A_y \right] u_t = \frac{B_w}{A_w} A A_u A_y w_t - S A_u e_t + \delta_u$$

The DC components, $\delta_y$ and $\delta_u$, are given by

$$\delta_y = \alpha B_u A_y d \qquad \delta_u = -A_u B_y d$$

✷
Proof: The proof is technical and consists only of manipulations of transfer functions. If a system given by (2) is controlled by

$$R u_t = \frac{Q}{P} w_t - \frac{\mathcal{S}}{L} y_t + \gamma$$

then the closed loop is given by:

$$\mathcal{C} y_t = q^{-k} \frac{L B Q}{P} w_t + L R C e_t + L \left( B \gamma + R d \right)$$

$$\mathcal{C} u_t = \frac{L A Q}{P} w_t - \mathcal{S} C e_t + \left( A L \gamma - \mathcal{S} d \right)$$

where the (closed loop) characteristic polynomial is:

$$\mathcal{C} = L A R + q^{-k} \mathcal{S} B$$

The rest of the theorem emerges if $R$ from (43) and

$$Q = A_u C B_w \qquad P = A_w \qquad \mathcal{S} = S A_u \qquad L = A_y \qquad \gamma = -A_u G d$$

are inserted. For example,

$$\mathcal{C} = L A R + q^{-k} \mathcal{S} B$$
$$= A_y A \left( A_u B G + \alpha C B_u \right) + q^{-k} S A_u B$$
$$= B A_u \left( A_y A G + q^{-k} S \right) + A_y A \alpha C B_u$$
$$= B A_u B_y C + A_y A \alpha C B_u$$
$$= C \left( B A_u B_y + \alpha A B_u A_y \right)$$

✷
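The polynomial identities used in the proof are easy to check numerically. Below is a minimal sketch (Python/NumPy; the toy polynomials and identity design filters are assumptions chosen only for illustration). Coefficient vectors are ascending in $q^{-1}$ and a delay $q^{-k}$ is represented by a prefix of $k$ zeros. The script solves (39) by truncating the impulse response of $B_y C / (A_y A)$ and verifies that $L A R + q^{-k}(S A_u) B = C (B A_u B_y + \alpha A B_u A_y)$:

```python
import numpy as np

def conv(*polys):
    """Multiply polynomials given as ascending coefficient vectors."""
    out = np.array([1.0])
    for p in polys:
        out = np.convolve(out, np.asarray(p, float))
    return out

def padd(a, b):
    """Add two coefficient vectors of possibly different lengths."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n = max(len(a), len(b))
    return np.pad(a, (0, n - len(a))) + np.pad(b, (0, n - len(b)))

def delay(p, k):
    """Multiply by q^{-k}: prefix with k zeros."""
    return np.concatenate([np.zeros(k), np.asarray(p, float)])

def series(num, den, k):
    """First k power-series coefficients of num/den in q^{-1}."""
    g = np.zeros(k)
    for i in range(k):
        acc = num[i] if i < len(num) else 0.0
        for j in range(1, min(i, len(den) - 1) + 1):
            acc -= den[j] * g[i - j]
        g[i] = acc / den[0]
    return g

# assumed toy system A y = q^{-k} B u + C e + d and identity filters
A, B, C, k, rho = [1.0, -0.9], [1.0, 0.5], [1.0, 0.7], 2, 0.1
Ay = By = Au = Bu = [1.0]
alpha = rho / B[0]

# solve By*C = Ay*A*G + q^{-k} S  (eq. (39))
lhs, den = conv(By, C), conv(Ay, A)
G = series(lhs, den, k)
S = padd(lhs, -conv(den, G))[k:]          # remainder after the first k terms

# characteristic polynomial computed both ways, as in the proof
R = padd(conv(Au, B, G), alpha * conv(C, Bu))       # eq. (43)
Scl, L = conv(S, Au), np.asarray(Ay, float)
left = padd(conv(L, A, R), delay(conv(Scl, B), k))
right = conv(C, padd(conv(B, Au, By), alpha * conv(A, Bu, Ay)))
print(np.allclose(padd(left, -right), 0))           # True
```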
It is important to notice that if $\rho = 0$ then the system zeros will be cancelled. If they are not well damped, the performance of the controller will be unsatisfactory.
It is seen from the closed loop transfer function that $B_u$ must have a zero at one (i.e. $B_u(1) = 0$) in order to avoid stationary errors (due to non-zero reference and load) in the output.
In some cases it might be an advantage to use an alternative expression for the closed loop characteristic polynomial:

$$B A_u B_y + \alpha A B_u A_y = A_y A_u \left( B H_y + \alpha A H_u \right)$$
This quite general controller has of course several special cases. If the cost function is considered, then (for $A_y = A_w = A_u = 1$) the following special cases emerge:

Type   $B_y$   $B_w$   $\rho$
MV0    1       1       0
PZ     $A_m$   $B_m$   0

Besides the special cases mentioned in the table, the MV1 and MV3 controllers will be described in the following.
8.1 MV1 Control
The MV1 controller emerges if we use the following cost function

$$J = E\left\{ (y_{t+k} - w_t)^2 + \rho (u_t - u_{t-1})^2 \right\}$$

That means

$$H_y = 1 \qquad H_w = 1 \qquad H_u = 1 - q^{-1}$$
In relation to the MV0 controller we have here introduced a cost on the higher frequencies of the control signal. The DC component of the control signal does not enter into the cost function. According to Theorem 8.1 the controller is given by

$$\left[ B G + \alpha C (1 - q^{-1}) \right] u_t = C w_t - S y_t - G d$$

where $G$ and $S$ are solutions to

$$C = A G + q^{-k} S$$

and where $G(0) = 1$, $\mathrm{ord}(G) = k - 1$ and $\mathrm{ord}(S) = \max(n_a - 1, n_c - k)$. The difference to the MV0 controller is that the $R$ polynomial is extended from $B G$ to $B G + \alpha (1 - q^{-1}) C$.
As can be seen from the closed loop equations below, this makes it possible to control systems with an unstable inverse. We have in closed loop that:

$$y_t = q^{-k} \frac{B}{B + \alpha (1 - q^{-1}) A} w_t + \frac{B G + \alpha (1 - q^{-1}) C}{B + \alpha (1 - q^{-1}) A} e_t$$

$$u_t = \frac{A}{B + \alpha (1 - q^{-1}) A} w_t - \frac{S}{B + \alpha (1 - q^{-1}) A} e_t - \frac{1}{B} d$$
It might be noticed that as $\alpha$ varies from $0$ to $\infty$, the closed loop poles will vary from the system zeros to the system poles and $1$. This might cause problems if the poles of the system are located outside the stability area.
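This pole migration is easy to illustrate numerically. The sketch below (Python/NumPy; the toy system is an assumption, chosen so that $B$ has an unstable zero at $z = 2$) tracks the spectral radius of the closed loop characteristic polynomial $B + \alpha(1 - q^{-1})A$ as $\alpha$ grows:

```python
import numpy as np

# assumed toy system with an unstable inverse (zero at z = 2)
A = np.array([1.0, -1.5, 0.7])      # stable poles
B = np.array([1.0, -2.0])           # non minimum phase zero

def padd(a, b):
    """Add two ascending coefficient vectors of different lengths."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n = max(len(a), len(b))
    return np.pad(a, (0, n - len(a))) + np.pad(b, (0, n - len(b)))

for alpha in [0.0, 0.1, 1.0, 10.0, 100.0]:
    # closed loop characteristic polynomial B + alpha (1 - q^{-1}) A
    p = padd(B, alpha * np.convolve([1.0, -1.0], A))
    radius = max(abs(np.roots(p)))
    print(f"alpha = {alpha:6.1f}  spectral radius = {radius:.3f}")
```

For $\alpha = 0$ the spectral radius is $2$ (the unstable system zero); for large $\alpha$ the closed loop poles approach the system poles and $1$.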
8.2 MV3 Control
This type of controller can also be denoted as a model follower, due to the fact that it is designed such that the closed loop transfer function from reference to output equals a chosen transfer function. The same is the case for the transfer function from the noise to the output. Let the transfer functions in the cost function be:
$$H_y = \frac{A_e}{B_e} \qquad H_w = \frac{A_e B_m}{B_e A_m} \qquad H_u = 1 \qquad \rho = 0$$
where all polynomials have zeros inside the stability area. Since $\rho = 0$ the system zeros will be cancelled, and the MV3 controller cannot be applied to systems with an unstable inverse. By applying Theorem 8.2 we have as a special case that the closed loop is given by:
$$y_t = q^{-k} \frac{B_m}{A_m} w_t + \frac{B_e}{A_e} G e_t$$

$$u_t = \frac{A B_m}{B A_m} w_t - \frac{S B_e}{B A_e} e_t - \frac{1}{B} d$$
As can be seen from these transfer functions, we have exactly the freedom of having one specific transfer function from reference to output and another transfer function from noise to output. In the PZ controller the transfer functions from reference and from noise to output had the same poles.
9 Generalized Predictive (GPC) Control
From the early stages ((Richalet, Rault, Testud & Padon 1978), (Cutler & Ramaker 1980)) and especially after its later development ((Clarke, Mohtadi & Tuffs 1987b), (Clarke, Mohtadi & Tuffs 1987a)) predictive control has attracted a great deal of interest in both industry and academia. GPC normally refers to (Clarke et al. 1987b), (Clarke et al. 1987a) and (Clarke & Mohtadi 1989).
In the previous section we have dealt with minimal variance control and extensions hereof. One of the problems was the lack of ability to control systems with an unstable inverse. This is due to the fact that these control strategies are based on a single step strategy. That means the control is aiming at reducing the variance of the error $k$ steps ahead, while simultaneously making the problem worse in the next step.
The previous control strategies have been based on a one step criterion. In this section we will extend this to strategies where several steps are included in the cost function.
$$J = E\left\{ \sum_{i=1}^{N} \left( y_{t+i} - w_{t+i} \right)^2 + \lambda^2 u_{t+i-1}^2 \right\}$$
This cost function can be rewritten into the form

$$J_t = E\left\{ (Y_{t:N} - W_{t:N})^\top (Y_{t:N} - W_{t:N}) + \lambda^2 U^\top U \right\} \qquad (47)$$

where

$$Y_{t:N} = \begin{bmatrix} y_{t+1} \\ \vdots \\ y_{t+N} \end{bmatrix} \qquad W_{t:N} = \begin{bmatrix} w_{t+1} \\ \vdots \\ w_{t+N} \end{bmatrix}$$
The cost function involves future outputs. For this reason we will apply Theorem B.3 (page 35) and rewrite the system equation (2) (on page 4) into

$$y_{t+i} = \frac{S_i}{C} y_t + \frac{F_{i+1}}{C} u_{t-1} + H_{i+1} u_{t+i} + G_i e_{t+i}$$
for i = 1, ..., N . Here the polynomials Gm and Sm are solutions to the Diophantine equation:
C(q−1) = A(q−1)Gm(q−1) + q−mSm(q−1)
with:ord(Gm) = m− 1 and ord(Sm) = Max(na − 1, nc −m)
Furthermore are Hm+1 and Fm+1 solutions to the Diophantine equation
q−kBGm = CHm+1 + q−m−1Fm+1
withord(Hm+1) = m ord(Fm+1) = max(nc − 1, nb + k − 1)
Since Gm and Hm+1 are truncated impulse response of C/A and q−kB/A we know that Gm ismonic and the first k coefficient in Hm+1 are zero. We have more specifically
Gm(0) = 1 Hm+1 = hkq−k + ... + hmq−m
The first two terms are denoted as the free response because they depend on past control actions and output values. The term $H_{i+1}(q^{-1}) u_{t+i}$ is denoted as the forced response and depends on present and future control actions. This term is a part of the optimization. The last term ($G_i(q^{-1}) e_{t+i}$) is the noise term.
Let us arrange the free response in a vector according to

$$Y_t = \begin{bmatrix} \vdots \\ y_{t+i}^0 \\ \vdots \end{bmatrix} \qquad y_{t+i}^0 = \frac{S_i(q^{-1})}{C(q^{-1})} y_t + \frac{F_{i+1}(q^{-1})}{C(q^{-1})} u_{t-1}$$

and notice that there exist several methods for determining the (predicted) free response.
Let us in the notation neglect that the first $k$ coefficients in $H_{m+1}$ are zero. Then we can define:

$$H_\tau = \begin{bmatrix}
h_0 & 0 & 0 & \cdots & 0 \\
h_1 & h_0 & 0 & \cdots & 0 \\
h_2 & h_1 & h_0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
h_\tau & h_{\tau-1} & h_{\tau-2} & \cdots & h_0
\end{bmatrix} \qquad (48)$$

In a similar way we can define:

$$G_\tau = \begin{bmatrix}
g_0 & 0 & 0 & \cdots & 0 \\
g_1 & g_0 & 0 & \cdots & 0 \\
g_2 & g_1 & g_0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
g_\tau & g_{\tau-1} & g_{\tau-2} & \cdots & g_0
\end{bmatrix} \qquad (49)$$
Consequently, we can write

$$Y_{t:N} = Y_t + H_N U_{t:N} + G_N E_{t:N} \qquad (50)$$

where, as previously mentioned, $Y_t$ contains the free response (i.e. $y_i$ for $U_{t:N} = 0$).

We have the following theorem.
Theorem 9.1: Assume a dynamic system given by (2). The control strategy which aims at minimizing (47) given the information in $\mathcal{Y}_t$ is given by:

$$U_{t:N} = \left[ H_N^\top H_N + \lambda^2 I \right]^{-1} H_N^\top \left( W - Y_t \right)$$

where $W = W_{t:N}$ and $Y_t$ are the minimal variance predictions of the reference signal and the free response, respectively. ✷
Proof: If the prediction in (50)

$$Y_{t:N} = Y_t + H_N U_{t:N} + G_N E_{t:N}$$

is inserted in the cost function (47)

$$J_t = E\left\{ (Y_{t:N} - W_{t:N})^\top (Y_{t:N} - W_{t:N}) + \lambda^2 U^\top U \right\}$$

the optimization is (cf. (Astrom 1970) p. 261) equivalent to minimizing:

$$J = \left( Y_t + H_N U - W \right)^\top \left( Y_t + H_N U - W \right) + \lambda^2 U^\top U$$
$$= U^\top \left[ H_N^\top H_N + \lambda^2 I \right] U + 2 \left( Y_t - W \right)^\top H_N U + \left( Y_t - W \right)^\top \left( Y_t - W \right)$$

The minimum appears as stated in the theorem. ✷
Remark 4: If we apply the more general cost function

$$J_t = E\left\{ (Y_{t:N} - W_{t:N})^\top Q_y (Y_{t:N} - W_{t:N}) + U^\top Q_u U \right\}$$

the optimum appears for

$$U_{t:N} = \left[ H_N^\top Q_y H_N + Q_u \right]^{-1} H_N^\top Q_y \left( W - Y_t \right)$$

✷
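The control law of Theorem 9.1 is a small exercise in linear algebra. Below is a sketch (Python/NumPy; the impulse response, free response and horizon are assumed toy values with $k = 1$) that builds the lower triangular matrix $H_N$ of (48) and computes the control sequence; in a receding horizon implementation only the first element of $U_{t:N}$ is applied:

```python
import numpy as np

# assumed data: first N Markov coefficients of q^{-k} B/A (k = 1, so h[0] = 0)
# and a predicted free response over the horizon
N, lam = 4, 0.5
h = np.array([0.0, 1.0, 0.5, 0.25])
y_free = np.array([0.2, 0.3, 0.35, 0.37])
W = np.ones(N)                            # constant reference over the horizon

# lower triangular Toeplitz matrix H_N from eq. (48)
H = np.array([[h[i - j] if i >= j else 0.0 for j in range(N)]
              for i in range(N)])

# U = [H'H + lambda^2 I]^{-1} H'(W - Y_t)  (Theorem 9.1)
U = np.linalg.solve(H.T @ H + lam**2 * np.eye(N), H.T @ (W - y_free))
print(U[0])                               # receding horizon: apply the first move
```

Note that the delay makes the last column of $H_N$ zero, so the last control move in $U_{t:N}$ comes out as zero: it cannot affect the output inside the horizon.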
10 Linear Quadratic Gaussian (LQG) Control
We have now seen various methods for detuning the minimum variance controller. In Section 4 the method consists of introducing a weight on the control action, and in Section 9 we abandoned the one step strategies and introduced a (finite) horizon. In this section we will extend the horizon and find a controller that in stationarity minimizes the variance of the output and (a weighted version of) the variance of the control action. Let us first focus on a controller in which the cost
$$J_t = \lim_{N \to \infty} E\left\{ \frac{1}{N} \sum_{i=t}^{N} y_i^2 + \rho u_i^2 \right\} \qquad (51)$$
is minimized.
Theorem 10.1: Assume the system is given by (2). The LQG controller, which minimizes (51), is given by the control law

$$R(q^{-1}) u_t = -S(q^{-1}) y_t \qquad (52)$$

where $R$ and $S$ are solutions to the Diophantine equation

$$P(q^{-1}) C(q^{-1}) = A(q^{-1}) R(q^{-1}) + q^{-k} B(q^{-1}) S(q^{-1}) \qquad (53)$$

with orders

$$\mathrm{ord}(R) = n + k - 1 \qquad \mathrm{ord}(S) = n - 1$$

The $P$-polynomial is the stable solution to

$$P(q^{-1}) P(q) = B(q^{-1}) B(q) + \rho A(q^{-1}) A(q) \qquad (54)$$

✷
Proof: Omitted ✷
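The spectral factorization (54) can be carried out numerically by root finding: the two-sided polynomial $B(q^{-1})B(q) + \rho A(q^{-1})A(q)$ has its roots in reciprocal pairs, and the stable half is collected in $P$. A sketch (Python/NumPy; the system polynomials are assumed toy values, and the normalization chosen is one of several possible conventions):

```python
import numpy as np

def spectral_factor(B, A, rho):
    """Stable P with P(q^{-1})P(q) = B(q^{-1})B(q) + rho A(q^{-1})A(q), eq. (54).
    Coefficient vectors are ascending in q^{-1}."""
    B, A = np.asarray(B, float), np.asarray(A, float)
    c = np.convolve(B, B[::-1]) + rho * np.convolve(A, A[::-1])  # two-sided, symmetric
    roots = np.roots(c)                     # roots come in reciprocal pairs (z, 1/z)
    inside = roots[np.abs(roots) < 1.0]     # keep the stable half
    p = np.real(np.poly(inside))            # monic candidate, ascending in q^{-1}
    mid = len(c) // 2                       # central (zero-lag) coefficient
    scale = np.sqrt(c[mid] / np.convolve(p, p[::-1])[mid])
    return scale * p

# assumed toy system
A, B, rho = [1.0, -0.9], [1.0, 0.5], 0.5
P = spectral_factor(B, A, rho)
c = np.convolve(B, B[::-1]) + rho * np.convolve(A, A[::-1])
print(np.allclose(np.convolve(P, P[::-1]), c))   # True
```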
Theorem 10.2: Let the assumptions in Theorem 10.1 (page 27) be valid and let the situation be stationary. Then for the system in (2) the LQG controller will give

$$y_t = \frac{R(q^{-1})}{P(q^{-1})} e_t$$

and

$$u_t = -\frac{S(q^{-1})}{P(q^{-1})} e_t$$

in closed loop. ✷
Proof: If the controller in (52) is introduced in the system description (2), then

$$A y_t = -q^{-k} B \frac{S}{R} y_t + C e_t$$

or

$$(A R + q^{-k} B S) y_t = R C e_t$$

If furthermore the Diophantine equation is applied, we have the closed loop (for $y_t$) as given in the theorem. The closed loop description of $u_t$ comes directly by introducing the closed loop expression for $y_t$ into the controller. ✷
The controller just stated will solve the regulation problem without a setpoint. Now consider the cost function

$$J_t = \lim_{N \to \infty} E\left\{ \frac{1}{N} \sum_{i=t}^{N} (y_i - w_t)^2 + \rho \left( u_i - \bar{u} \right)^2 \right\} \qquad (55)$$

where $\bar{u}$ denotes the stationary value of the control signal. For a constant setpoint $w_t$ we have the following theorems.
Theorem 10.3: Assume the system is given by (2). The LQG controller, which minimizes (55), is given by the control law

$$R(q^{-1}) u_t = \eta C(q^{-1}) w_t - S(q^{-1}) y_t \qquad (56)$$

where $R$ and $S$ are solutions to the Diophantine equation

$$P(q^{-1}) C(q^{-1}) = A(q^{-1}) R(q^{-1}) + q^{-k} B(q^{-1}) S(q^{-1}) \qquad \text{and} \qquad \eta = \frac{P(1)}{B(1)} \qquad (57)$$

with orders

$$\mathrm{ord}(R) = n + k - 1 \qquad \mathrm{ord}(S) = n - 1$$

The $P$-polynomial is the stable solution to

$$P(q^{-1}) P(q) = B(q^{-1}) B(q) + \rho A(q^{-1}) A(q) \qquad (58)$$

✷
Proof: Omitted ✷
Theorem 10.4: Let the assumptions in Theorem 10.3 (page 27) be valid and let the situation be stationary. Then for the system in (2) the LQG controller will give

$$y_t = \eta \frac{B(q^{-1})}{P(q^{-1})} w_t + \frac{R(q^{-1})}{P(q^{-1})} e_t$$

and

$$u_t = \eta \frac{A(q^{-1})}{P(q^{-1})} w_t - \frac{S(q^{-1})}{P(q^{-1})} e_t$$

in closed loop. ✷
Proof: If the controller in (56) is introduced in the system description (2), then

$$A y_t = q^{-k} B \left[ \frac{\eta C}{R} w_t - \frac{S}{R} y_t \right] + C e_t$$

or

$$(A R + q^{-k} B S) y_t = q^{-k} B \eta C w_t + R C e_t$$

If furthermore the Diophantine equation is applied, we have the closed loop (for $y_t$) as given in the theorem. The closed loop description of $u_t$ comes directly by introducing the closed loop expression for $y_t$ into the controller. ✷
References
Astrom, K. J. (1970). Introduction To Stochastic Control Theory, Academic Press.
Clarke, D. & Gawthrop, P. (1975). Self-tuning controller, IEE Proceedings D, Control Theory and Applications 122(9): 929–934.

Clarke, D. & Gawthrop, P. (1981). Implementation and application of microprocessor-based self-tuners, Automatica 17(1): 233–244.

Clarke, D. & Mohtadi, C. (1989). Properties of generalized predictive control, Automatica 25(6): 859–875.

Clarke, D., Mohtadi, C. & Tuffs, P. (1987a). Generalized predictive control - part II: Extensions and interpretations, Automatica 23(2): 149–160.

Clarke, D., Mohtadi, C. & Tuffs, P. S. (1987b). Generalized predictive control - part I: The basic algorithm, Automatica 23(2): 137–148.

Cutler, C. & Ramaker, B. (1980). Dynamic matrix control - a computer control algorithm, Proc. Joint Automatic Control Conf.

Gawthrop, P. (1977). Some interpretations of the self-tuning controller, IEE Proceedings D, Control Theory and Applications 1(124): 889–894.

Kailath, T. (1980). Linear Systems, Prentice Hall.

Kucera, V. (1979). Linear Control Systems, Wiley-Interscience.

Richalet, J., Rault, A., Testud, J. & Padon, J. (1978). Model predictive heuristic control: Application to industrial processes, Automatica 14: 413–428.
Appendix
A Polynomials, transfer functions and LTI systems
In this section we will consider polynomials

$$B(q^{-1}) = b_0 + b_1 q^{-1} + \ldots + b_n q^{-n}$$

in the shift operator $q^{-1}$, or polynomials

$$B(z^{-1}) = b_0 + b_1 z^{-1} + \ldots + b_n z^{-n}$$

in the complex (Z-transform) variable $z$. The coefficients are assumed to be real (i.e. $b_i \in \mathbb{R}$). The polynomial is denoted as monic if $b_0 = 1$, and $n$ is the order of the polynomial (for $b_n \neq 0$). The roots of $B$ are the complex values $z_i$ that satisfy

$$B(z_i^{-1}) = 0$$
Furthermore consider a transfer operator

$$G(q) = \frac{B(q^{-1})}{A(q^{-1})} \qquad y_t = G(q) u_t$$

or in full

$$G(q) = \frac{b_0 + b_1 q^{-1} + \ldots + b_n q^{-n}}{1 + a_1 q^{-1} + \ldots + a_n q^{-n}}$$
The dynamic relation between input and output is unchanged if numerator and denominator are multiplied with the same (real valued) factor. For that reason (and by convention) the transfer operator is normalized such that the denominator ($A$) is monic. Furthermore, if the transient behavior is disregarded, the dynamic relationship between input and output will be unchanged if numerator and denominator are multiplied with the same dynamic factor. For example,

$$\frac{B(q^{-1})}{A(q^{-1})} \qquad \text{and} \qquad \frac{C(q^{-1}) B(q^{-1})}{C(q^{-1}) A(q^{-1})}$$
have the same dynamic relationship between input and output.
It is quite easy to check that

$$\frac{b_0 + b_1 q^{-1} + \ldots + b_n q^{-n}}{1 + a_1 q^{-1} + \ldots + a_n q^{-n}} = b_0 + \frac{q^{-1} \left[ (b_1 - b_0 a_1) + (b_2 - b_0 a_2) q^{-1} + \ldots + (b_n - b_0 a_n) q^{1-n} \right]}{1 + a_1 q^{-1} + \ldots + a_n q^{-n}}$$

or stated otherwise that

$$\frac{B(q^{-1})}{A(q^{-1})} = g_0 + q^{-1} \frac{S_1(q^{-1})}{A(q^{-1})}$$

where

$$S_1(q^{-1}) = s_0 + s_1 q^{-1} + \ldots + s_{n_s} q^{-n_s} \qquad s_i = b_{i+1} - b_0 a_{i+1} \qquad g_0 = b_0$$

and the order of $S_1$ is $n - 1$ (or less).
If this observation is applied recursively (first on $B/A$, then on $S_1/A$, $S_2/A$ and so on), then we can see that

$$\frac{B(q^{-1})}{A(q^{-1})} = g_0 + g_1 q^{-1} + \ldots + g_{m-1} q^{1-m} + q^{-m} \frac{S_m(q^{-1})}{A(q^{-1})}$$

or

$$\frac{B(q^{-1})}{A(q^{-1})} = G_m(q^{-1}) + q^{-m} \frac{S_m(q^{-1})}{A(q^{-1})} \qquad (59)$$

where

$$G_m(q^{-1}) = g_0 + g_1 q^{-1} + \ldots + g_{m-1} q^{1-m}$$

and the order of $S_m$ is $n - 1$ (or less). It is clear that the $g_i$ coincide with the coefficients in the impulse response of the system $B/A$. For that reason $G_m$ is also denoted as the truncated impulse response, i.e.

$$G_m(q^{-1}) = \left[ G(q) \right]_m = g_0 + g_1 q^{-1} + \ldots + g_{m-1} q^{1-m}$$

Equation (59) is simply equivalent to (a simple version of) the Diophantine equation

$$B(q^{-1}) = A(q^{-1}) G_m(q^{-1}) + q^{-m} S_m(q^{-1}) \qquad (60)$$

The coefficients in $G_m$ and $S_m$ (i.e. the solution to (59) and (60)) are easily obtained from the impulse response of the transfer function (see Section C). They can also be obtained by recursively applying the observation mentioned above. The following lines represent the core part of a Matlab procedure for determining $G_m$ and $S_m$.
% A and B are row vectors of equal length (pad with zeros if needed); A is monic.
G=[]; S=B;                        % invariant: B = A*G + q^(-length(G))*S
for i=1:m,
  G=[G S(1)];                     % next impulse response coefficient
  S=[S(2:end)-S(1)*A(2:end) 0];   % remainder: S := q*(S - S(1)*A), zero padded
end;
S=S(1:end-1);                     % drop the superfluous trailing coefficient
B Prediction
In this appendix it is assumed that the system is a scalar time-invariant stochastic system. The system is assumed to be in the ARMAX form. If the system is given in BJ form or the L structure, then it can be transformed to the ARMAX form.
B.1 Prediction in the ARMA structure
Before we give the predictor for the ARMAX structure, we will handle the problem of prediction when the system is in the ARMA form. This problem is simpler than prediction in the ARMAX structure due to the fact that no control (or other known input) is present. In other words, we are only considering the stochastic part of the process.
Theorem B.1: Let $y_t$ be a (weakly) stationary process given by the ARMA model:

$$A(q^{-1}) y_t = C(q^{-1}) e_t \qquad (61)$$

where $\{e_t\}$ is a white noise sequence of $F(0, \sigma^2)$ distributed stochastic variables (which are independent of $y_{t-i}$, $i = 1, 2, \ldots$). Furthermore, assume that $A$ and $C$ have all their roots inside the stability area and are monic ($C(0) = A(0) = 1$). The minimal variance prediction is then given by:

$$\hat{y}_{t+m|t} = \frac{S_m(q^{-1})}{C(q^{-1})} y_t$$

with the error:

$$\tilde{y}_{t+m|t} = G_m(q^{-1}) e_{t+m}$$

The polynomials, $G_m$ and $S_m$, obey the Diophantine equation:

$$C(q^{-1}) = A(q^{-1}) G_m(q^{-1}) + q^{-m} S_m(q^{-1}) \qquad (62)$$

with:

$$G_m(0) = 1 \qquad \mathrm{ord}(G_m) = m - 1 \qquad \mathrm{ord}(S_m) = \max\{n_a - 1, n_c - m\}$$

✷
Proof: Let $\mathcal{Y}_t$ denote the information embedded in $y_{t-i}$, $i = 0, 1, \ldots$ (and $u_{t-i}$, $i = 0, 1, \ldots$). The optimal solution to the prediction problem when the available data is $\mathcal{Y}_t$ is given by the conditional expectation, i.e.:

$$\hat{y}_{t+m} = E\{ y_{t+m} \mid \mathcal{Y}_t \}$$

Consequently, we will look at:

$$y_{t+m} = \frac{C(q^{-1})}{A(q^{-1})} e_{t+m} = g_t \star e_{t+m} = \sum_{i=0}^{\infty} g_i e_{t+m-i}$$

The future noise inputs $e_{t+m}, \ldots, e_{t+1}$ are independent of $\mathcal{Y}_t$, while $e_t, e_{t-1}, \ldots$ on the contrary are dependent on $\mathcal{Y}_t$. Consequently, we will split $y_{t+m}$ into two contributions. If we apply (59) (page 31) on $C/A$ we have

$$y_{t+m} = G_m(q^{-1}) e_{t+m} + \frac{S_m(q^{-1})}{A(q^{-1})} e_t$$

where $G_m$ and $S_m$ are solutions to the Diophantine equation

$$C(q^{-1}) = A(q^{-1}) G_m(q^{-1}) + q^{-m} S_m(q^{-1})$$

with:

$$G_m(q^{-1}) = 1 + g_1 q^{-1} + \ldots + g_{m-1} q^{1-m}$$

and

$$\mathrm{ord}(G_m) = m - 1 \qquad G_m(0) = 1$$

It should be noted again that:

$$G_m(q^{-1}) e_{t+m} = g_0 e_{t+m} + g_1 e_{t+m-1} + \ldots + g_{m-1} e_{t+1} = \sum_{i=0}^{m-1} g_i e_{t+m-i}$$

is independent of $\mathcal{Y}_t$. With the process equation we have

$$e_t = \frac{A(q^{-1})}{C(q^{-1})} y_t$$

and then

$$y_{t+m} = G_m(q^{-1}) e_{t+m} + \frac{S_m(q^{-1})}{C(q^{-1})} y_t \qquad (63)$$

The optimal prediction is then given by:

$$\hat{y}_{t+m|t} = E\{ y_{t+m} \mid \mathcal{Y}_t \} = \frac{S_m(q^{-1})}{C(q^{-1})} y_t$$

with the error:

$$\tilde{y}_{t+m|t} = G_m(q^{-1}) e_{t+m}$$

This proof can be significantly shortened if we accept the Diophantine equation (62). Then

$$y_{t+m} = \frac{1}{C} \left[ A G_m y_{t+m} + S_m y_t \right] = \frac{1}{C} \left[ G_m C e_{t+m} + S_m y_t \right]$$

or as stated in (63). ✷
Remark 5: Notice, the assumption behind estimating $e_t$ from the observations of $y_t$ is that we have observed $y_t$ from $t_0 \to -\infty$ and that $C$ is stable. When estimating $e_t$ from $y_t$ we are filtering $y_t$ through the inverse transfer function. The estimation error will tend to zero in a way determined by the roots of $C$. ✷
Remark 6: Notice, the error is a MA process. The variance of such a process is given by:

$$\mathrm{Var}\{\tilde{y}_{t+m|t}\} = \sigma^2 \left[ 1 + g_1^2 + \ldots + g_{m-1}^2 \right]$$

and the (auto)covariance function (and the correlation function) is zero for lags larger than $m$. ✷
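The variance formula can be checked by simulation. Below is a sketch (Python/NumPy; the ARMA model, horizon and sample size are assumed toy values) that solves (62) by truncating the impulse response, runs the predictor $S_m/C$ of Theorem B.1 and compares the empirical prediction error variance with $\sigma^2[1 + g_1^2 + \ldots + g_{m-1}^2]$:

```python
import numpy as np

def series(num, den, k):
    """First k impulse response coefficients of num/den in q^{-1} (den monic)."""
    g = np.zeros(k)
    for i in range(k):
        acc = num[i] if i < len(num) else 0.0
        for j in range(1, min(i, len(den) - 1) + 1):
            acc -= den[j] * g[i - j]
        g[i] = acc
    return g

def filt(b, a, x):
    """y such that a(q^{-1}) y = b(q^{-1}) x, zero initial conditions."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        acc = sum(b[i] * x[t - i] for i in range(len(b)) if t - i >= 0)
        acc -= sum(a[i] * y[t - i] for i in range(1, len(a)) if t - i >= 0)
        y[t] = acc
    return y

# assumed ARMA model and prediction horizon
A, C, m = [1.0, -0.8], [1.0, 0.4], 3

# solve C = A G_m + q^{-m} S_m  (eq. (62))
G = series(C, A, m)                                    # [1, 1.2, 0.96]
Sm = (np.pad(C, (0, len(A) + m - 1 - len(C))) - np.convolve(A, G))[m:]

rng = np.random.default_rng(0)
e = rng.standard_normal(50_000)                        # sigma^2 = 1
y = filt(C, A, e)                                      # simulate the ARMA process
yhat = filt(Sm, C, y)                                  # yhat[t] predicts y[t+m]
err = y[m:] - yhat[:-m]

theory = 1.0 + np.sum(G[1:] ** 2)                      # 1 + g_1^2 + ... + g_{m-1}^2
print(theory, err.var())                               # the two should be close
```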
B.2 Simple prediction in the ARMAX structure
Let us now return to the original problem and assume the system is given by the ARMAX model in (2). We will first consider the case where the prediction horizon exactly matches the time delay through the system. Later on we will consider the more general case.

Assume that we know (have measured) the output up to the present time $t$. The present control action $u_t$ will affect the output at $t + k$ and onwards. Let us then first determine the prediction of $y_{t+k}$. That in fact means a prediction horizon which equals the time delay through the system.
Theorem B.2: Let the system be given by (2) with a time delay equal to $k$. Then the $k$ step ahead prediction of $y_t$ is given by:

$$\hat{y}_{t+k|t} = \frac{1}{C(q^{-1})} \left[ B(q^{-1}) G_k(q^{-1}) u_t + S_k(q^{-1}) y_t \right] \qquad (64)$$

and the error is

$$\tilde{y}_{t+k|t} = G_k(q^{-1}) e_{t+k}$$

where the polynomials $G_k$ and $S_k$ are solutions to

$$C(q^{-1}) = A(q^{-1}) G_k(q^{-1}) + q^{-k} S_k(q^{-1}) \qquad (65)$$

with:

$$G_k(0) = 1 \qquad \mathrm{ord}(G_k) = k - 1 \qquad \mathrm{ord}(S_k) = \max(n_a - 1, n_c - k)$$

✷
Proof: Let us first rewrite the system equation (2) into

$$y_{t+k} = \frac{B(q^{-1})}{A(q^{-1})} u_t + \frac{C(q^{-1})}{A(q^{-1})} e_{t+k}$$

In this simple case only the stochastic part contributes to the prediction (since the prediction horizon exactly matches the time delay through the system). Let us focus on this part. If we truncate the transfer function to its $k$ first terms, i.e.

$$\frac{C(q^{-1})}{A(q^{-1})} e_{t+k} = G_k(q^{-1}) e_{t+k} + \frac{S_k(q^{-1})}{A(q^{-1})} e_t$$

where

$$C(q^{-1}) = A(q^{-1}) G_k(q^{-1}) + q^{-k} S_k(q^{-1})$$

and

$$G_k(0) = 1 \qquad \mathrm{ord}(G_k) = k - 1 \qquad \mathrm{ord}(S_k) = \max\{n_a - 1, n_c - k\}$$

Notice that

$$G_k e_{t+k} = e_{t+k} + \ldots + g_{k-1} e_{t+1}$$

Also notice that the coefficients in $G_k$ simply are the Markov coefficients (or the impulse response) of the transfer function from $e_t$ to $y_t$, i.e.

$$G_k(q^{-1}) = \left[ \frac{C(q^{-1})}{A(q^{-1})} \right]_k = g_0 + g_1 q^{-1} + \ldots + g_{k-1} q^{1-k}$$

where $[\ldots]_k$ denotes a truncation of the impulse response (of $C/A$) to its $k$ first coefficients. Then we can write

$$y_{t+k} = \frac{B(q^{-1})}{A(q^{-1})} u_t + G_k(q^{-1}) e_{t+k} + \frac{S_k(q^{-1})}{A(q^{-1})} e_t$$

where the last term is known (or can be estimated from data) at time $t$ since

$$e_t = \frac{A(q^{-1}) y_t - q^{-k} B(q^{-1}) u_t}{C(q^{-1})}$$

As a result we have (omitting the $(q^{-1})$ argument):

$$y_{t+k} = \frac{B}{A} u_t + \frac{S_k A}{A C} y_t - q^{-k} \frac{S_k B}{A C} u_t + G_k e_{t+k}$$
$$= \frac{S_k}{C} y_t + \frac{B}{A} \left[ 1 - q^{-k} \frac{S_k}{C} \right] u_t + G_k e_{t+k}$$
$$= \frac{S_k}{C} y_t + \frac{B G_k}{C} u_t + G_k e_{t+k}$$

If we are willing to accept a trick we can use a slightly shorter version of the proof. If we apply the Diophantine equation

$$C = A G_k + q^{-k} S_k$$

we can for $y_{t+k}$ write

$$y_{t+k} = \frac{1}{C} \left\{ A G_k + q^{-k} S_k \right\} y_{t+k} = \frac{1}{C} \left\{ G_k A y_{t+k} + S_k y_t \right\} \qquad (66)$$

If we furthermore use the system equation (2), we have

$$y_{t+k} = \frac{1}{C} \left\{ G_k \left[ B u_t + C e_{t+k} \right] + S_k y_t \right\} = \frac{1}{C} \left\{ B G_k u_t + S_k y_t \right\} + G_k e_{t+k} \qquad (67)$$

✷
It could be noticed that for $B \equiv 0$ this predictor is equivalent to the one in Section B.1.
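The predictor of Theorem B.2 can be verified numerically: with zero initial conditions the identity (67) holds exactly, sample by sample. A sketch (Python/NumPy; system polynomials, delay and input signals are assumed toy choices):

```python
import numpy as np

def series(num, den, k):
    """First k impulse response coefficients of num/den in q^{-1} (den monic)."""
    g = np.zeros(k)
    for i in range(k):
        acc = num[i] if i < len(num) else 0.0
        for j in range(1, min(i, len(den) - 1) + 1):
            acc -= den[j] * g[i - j]
        g[i] = acc
    return g

def filt(b, a, x):
    """y such that a(q^{-1}) y = b(q^{-1}) x, zero initial conditions."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        acc = sum(b[i] * x[t - i] for i in range(len(b)) if t - i >= 0)
        acc -= sum(a[i] * y[t - i] for i in range(1, len(a)) if t - i >= 0)
        y[t] = acc
    return y

# assumed ARMAX system A y = q^{-k} B u + C e with k = 2
A, B, C, k = [1.0, -0.7], [1.0, 0.2], [1.0, 0.5], 2

# solve C = A G_k + q^{-k} S_k  (eq. (65))
Gk = series(C, A, k)                                   # [1, 1.2]
Sk = (np.pad(C, (0, len(A) + k - 1 - len(C))) - np.convolve(A, Gk))[k:]

rng = np.random.default_rng(1)
n = 500
u, e = rng.standard_normal(n), rng.standard_normal(n)
u_del = np.concatenate([np.zeros(k), u])[:n]
y = filt(B, A, u_del) + filt(C, A, e)                  # simulate the system

# predictor (64): yhat[t] is the prediction of y[t+k]
yhat = filt(np.convolve(B, Gk), C, u) + filt(Sk, C, y)
err = y[k:] - yhat[:-k]
print(np.allclose(err, filt(Gk, [1.0], e)[k:]))        # True: the error is G_k e
```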
B.3 Prediction in the ARMAX structure
Let us now focus on the slightly more general problem of prediction $m$ steps ahead. The core of the problem is the influence of the control signal and especially the future control actions. In a control context the present control action $u_t$ and the future control actions are part of the optimization. In a prediction context, where the present input is part of the known information, the situation is a bit different.
Theorem B.3: Let the system be given by (2) with a time delay equal to $k$, and assume the present control action $u_t$ is not known (or has to be determined). Then the $m$ step ahead prediction of $y_t$ is given by:

$$\hat{y}_{t+m|t} = \frac{S_m}{C} y_t + \frac{F_{m+1}}{C} u_{t-1} + H_{m+1} u_{t+m} \qquad (68)$$

and the error is

$$\tilde{y}_{t+m|t} = G_m(q^{-1}) e_{t+m}$$

where the polynomials $G_m$ and $S_m$ are solutions to

$$C(q^{-1}) = A(q^{-1}) G_m(q^{-1}) + q^{-m} S_m(q^{-1}) \qquad (69)$$

with:

$$G_m(0) = 1 \qquad \mathrm{ord}(G_m) = m - 1 \qquad \mathrm{ord}(S_m) = \max(n_a - 1, n_c - m)$$

Furthermore, the polynomials $H_{m+1}$ and $F_{m+1}$ are solutions to the Diophantine equation:

$$q^{-k} B G_m = C H_{m+1} + q^{-m-1} F_{m+1} \qquad (70)$$

where

$$\mathrm{ord}(H_{m+1}) = m \qquad \mathrm{ord}(F_{m+1}) = \max(n_c - 1, n_b + k - 1)$$

Quite rarely, and only for $k = 0$ and $B(0) = 1$, will $H_{m+1}$ be monic. ✷
Proof: Let us now rewrite the system equation (2) into

$$y_{t+m} = q^{-k} \frac{B(q^{-1})}{A(q^{-1})} u_{t+m} + \frac{C(q^{-1})}{A(q^{-1})} e_{t+m}$$

and let us first focus on the stochastic part of the output. If we truncate the transfer function to its $m$ first terms, i.e.

$$\frac{C(q^{-1})}{A(q^{-1})} e_{t+m} = G_m(q^{-1}) e_{t+m} + \frac{S_m(q^{-1})}{A(q^{-1})} e_t$$

where

$$C(q^{-1}) = A(q^{-1}) G_m(q^{-1}) + q^{-m} S_m(q^{-1}) \qquad (71)$$

and

$$G_m(0) = 1 \qquad \mathrm{ord}(G_m) = m - 1 \qquad \mathrm{ord}(S_m) = \max\{n_a - 1, n_c - m\}$$

Here

$$G_m(q^{-1}) e_{t+m} = e_{t+m} + \ldots + g_{m-1} e_{t+1}$$

and the coefficients in $G_m$ simply are the Markov coefficients (or the impulse response) of the transfer function from $e_t$ to $y_t$, i.e.

$$G_m(q^{-1}) = \left[ \frac{C(q^{-1})}{A(q^{-1})} \right]_m = g_0 + g_1 q^{-1} + \ldots + g_{m-1} q^{1-m}$$

We can then write

$$y_{t+m} = q^{-k} \frac{B(q^{-1})}{A(q^{-1})} u_{t+m} + G_m(q^{-1}) e_{t+m} + \frac{S_m(q^{-1})}{A(q^{-1})} e_t$$

where the last term is known (or can be estimated from data) at time $t$ since

$$e_t = \frac{A(q^{-1}) y_t - q^{-k} B(q^{-1}) u_t}{C(q^{-1})}$$

As a result we have (if we omit the $(q^{-1})$ argument):

$$y_{t+m} = q^{-k} \frac{B}{A} u_{t+m} + G_m e_{t+m} + \frac{S_m A}{A C} y_t - q^{-k} \frac{S_m B}{A C} u_t$$
$$= \frac{S_m}{C} y_t + q^{-k} \frac{B}{A} \left[ u_{t+m} - \frac{S_m}{C} u_t \right] + G_m e_{t+m}$$
$$= \frac{S_m}{C} y_t + q^{-k} \frac{B}{A} \, \frac{C - q^{-m} S_m}{C} u_{t+m} + G_m e_{t+m}$$
$$= \frac{S_m}{C} y_t + q^{-k} \frac{B G_m}{C} u_{t+m} + G_m e_{t+m}$$

Here we have used the Diophantine equation in (71).

Let us now focus on the second term

$$q^{-k} \frac{B G_m}{C} u_{t+m}$$

It is clear that for $m \leq k$ the problem is easy. If $m > k$ then this term depends on previous control actions ($u_{t-1}$, $u_{t-2}$, ...) and on present and future control actions ($u_t$, $u_{t+1}$, ...). (In this control context we regard $u_t$ as a decision variable and as a future control action. In other applications it might be regarded as a previous input variable.) Consequently, we can split this term into two terms, one related to present and future control actions and one related to previous control actions. We have:

$$q^{-k} \frac{B G_m}{C} u_{t+m} = H_{m+1} u_{t+m} + \frac{F_{m+1}}{C} u_{t-1}$$

where especially:

$$H_{m+1}(q^{-1}) = \left[ q^{-k} \frac{B(q^{-1})}{A(q^{-1})} \right]_{m+1} = h_0 + h_1 q^{-1} + \ldots + h_m q^{-m}$$

and

$$H_{m+1} u_{t+m} = h_0 u_{t+m} + \ldots + h_m u_t$$

It is clear that $h_i = 0$ for $i < k$. Also note that, in contrast to $G_m$, the polynomial $H_{m+1}$ is not necessarily monic.

In general the polynomials $H_{m+1}$ and $F_{m+1}$ can be found as solutions to the Diophantine equation:

$$q^{-k} B G_m = C H_{m+1} + q^{-m-1} F_{m+1}$$

where

$$\mathrm{ord}(H_{m+1}) = m \qquad \mathrm{ord}(F_{m+1}) = \max(n_c - 1, n_b + k - 1)$$

In summary we have

$$y_{t+m} = \frac{S_m}{C} y_t + \frac{F_{m+1}}{C} u_{t-1} + H_{m+1} u_{t+m} + G_m e_{t+m}$$

The first two terms are denoted as the free response because they depend on past control actions and output values. The term $H_{m+1}(q^{-1}) u_{t+m}$ is denoted as the forced response and depends on present and future control actions. The last term ($G_m(q^{-1}) e_{t+m}$) is the noise term.

Also in this case we have a shorter and more streamlined proof; we just have to accept a trick. If we apply (69) we have quite shortly

$$y_{t+m} = \frac{1}{C} \left[ A G_m y_{t+m} + S_m y_t \right] = \frac{1}{C} \left[ G_m \left( q^{-k} B u_{t+m} + C e_{t+m} \right) + S_m y_t \right]$$

resulting in:

$$y_{t+m} = \frac{1}{C} \left[ q^{-k} B G_m u_{t+m} + S_m y_t \right] + G_m e_{t+m}$$

The next step involves (70) and gives

$$y_{t+m} = \frac{1}{C} \left[ C H_{m+1} u_{t+m} + F_{m+1} u_{t-1} + S_m y_t \right] + G_m e_{t+m}$$

From this (68) simply emerges. ✷
Let us, for the sake of generality, focus on the prognosis situation, where the present input is assumed to be known (or measured).
Theorem B.4: Let the system be given by (2) with a time delay equal to $k$ and where the present input $u_t$ is known. Then the $m$ step ahead prediction of $y_t$ is given by:

$$\hat{y}_{t+m|t} = \frac{S_m}{C} y_t + \frac{F_m}{C} u_t + H_m u_{t+m}$$

and the error is

$$\tilde{y}_{t+m|t} = G_m(q^{-1}) e_{t+m}$$

where the polynomials $G_m$ and $S_m$ are solutions to

$$C(q^{-1}) = A(q^{-1}) G_m(q^{-1}) + q^{-m} S_m(q^{-1}) \qquad (72)$$

with:

$$G_m(0) = 1 \qquad \mathrm{ord}(G_m) = m - 1 \qquad \mathrm{ord}(S_m) = \max(n_a - 1, n_c - m)$$

Furthermore, the polynomials $H_m$ and $F_m$ are solutions to the Diophantine equation:

$$q^{-k} B G_m = C H_m + q^{-m} F_m \qquad (73)$$

where

$$\mathrm{ord}(H_m) = m - 1 \qquad \mathrm{ord}(F_m) = \max(n_c - 1, n_b + k - 1)$$

Quite rarely, and only for $k = 0$ and $B(0) = 1$, will $H_m$ be monic. ✷
C The Diophantine Equation
The Diophantine equation plays a very important role in connection to stochastic control. It is part of many design algorithms for controllers and predictors. In this appendix we will investigate the properties of this equation in detail.

The name comes from the fact that Diophantus of Alexandria wrote a book in the third century A.D. about the problem of finding integer solutions to the equation $C = AX + BY$.
Wikipedia: Diophantus of Alexandria (Greek, born between 200 and 214, died between 284 and 298 AD), sometimes called the father of algebra, was an Alexandrian mathematician. He is the author of a series of classical mathematical books called Arithmetica and worked with equations which we now call Diophantine equations; the method to solve those problems is now called Diophantine analysis. The study of Diophantine equations is one of the central areas of number theory. The findings and works of Diophantus have influenced mathematics greatly and caused many other questions to arise. The most famous of these is Fermat's Last Theorem. Diophantus also made advances in mathematical notation and was the first Greek mathematician who frankly recognized fractions as numbers.
Assume that we, for 3 given polynomials $A$, $B$ and $C$,

$$C(q^{-1}) = c_0 + c_1 q^{-1} + \ldots + c_{n_c} q^{-n_c}$$
$$B(q^{-1}) = b_1 q^{-1} + \ldots + b_{n_b} q^{-n_b}$$
$$A(q^{-1}) = 1 + a_1 q^{-1} + \ldots + a_{n_a} q^{-n_a}$$

have to determine the polynomials $R$ and $S$, such that:

$$C(q^{-1}) = A(q^{-1}) R(q^{-1}) + B(q^{-1}) S(q^{-1}) \qquad (74)$$

Notice this set of equations is determined by $A$, $B$ and $C$. Also notice that these polynomials are general and only in some special cases coincide with the system polynomials. It is important to notice that $B$ obeys

$$B(0) = 0$$

i.e. the leading coefficient in $B$ (that is $b_0$) is zero.
We introduce the following basic theorem:

Theorem C.1: The Diophantine equation (74) has a solution if and only if every common factor of $A$ and $B$ is also a factor of $C$. ✷

Proof: See (Kucera 1979) ✷
It can also be noticed that solutions to the Diophantine equation are in general not unique. Let $R^0$ and $S^0$ be a set of solutions to the Diophantine equation (74). Then

$$R(q^{-1}) = R^0(q^{-1}) + B(q^{-1}) F(q^{-1})$$
$$S(q^{-1}) = S^0(q^{-1}) - A(q^{-1}) F(q^{-1})$$

is also a set of solutions. Here $F$ is an arbitrary polynomial.
In our applications the solution can be fixed in terms of constraints on the orders of the polynomials. In order to obtain the best noise reduction we often choose

$$\mathrm{ord}(R) = n_r = n_b - 1$$

where $n_b$ is the order of the $B$ polynomial. The order of $S$ has to be such that a solution exists. That means that

$$n_b + \mathrm{ord}(S) = \max\{n_c, n_a + n_r\}$$

or that

$$n_s = \mathrm{ord}(S) = \max\{n_a - 1, n_c - n_b\}$$
The Diophantine equation can be solved in various ways; two of them are the Sylvester method and the Euclidian algorithm.
C.1 The Sylvester method
If the coefficients of the polynomials (i.e. the coefficients of $q^{-i}$) are identified, then the Diophantine equation is just a linear set of equations. The resulting set of equations in the coefficients of $R$ and $S$ can be found to be:

$$\begin{bmatrix}
1 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
a_1 & 1 & \ddots & \vdots & b_1 & 0 & \cdots & 0 \\
a_2 & a_1 & \ddots & 0 & b_2 & b_1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & 1 & \vdots & b_2 & \ddots & 0 \\
a_{n_a} & a_{n_a - 1} & \cdots & a_1 & b_{n_b} & \vdots & \ddots & b_1 \\
0 & a_{n_a} & \ddots & \vdots & 0 & b_{n_b} & & b_2 \\
\vdots & & \ddots & a_{n_a - 1} & \vdots & & \ddots & \vdots \\
0 & 0 & \cdots & a_{n_a} & 0 & 0 & \cdots & b_{n_b}
\end{bmatrix}
\begin{bmatrix} r_0 \\ r_1 \\ \vdots \\ r_{n_r} \\ s_0 \\ s_1 \\ \vdots \\ s_{n_s} \end{bmatrix}
=
\begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n_c} \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

where:

$$R(q^{-1}) = r_0 + r_1 q^{-1} + \ldots + r_{n_r} q^{-n_r}$$
$$S(q^{-1}) = s_0 + s_1 q^{-1} + \ldots + s_{n_s} q^{-n_s}$$
The set of equations can be expressed in a condensed form as:

$$\mathcal{S} x = z$$

where

$$x = \begin{bmatrix} r_0 \\ r_1 \\ \vdots \\ r_{n_r} \\ s_0 \\ s_1 \\ \vdots \\ s_{n_s} \end{bmatrix} \qquad \text{and} \qquad z = \begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n_c} \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
and where the Sylvester matrix, $\mathcal{S}$, is the coefficient matrix displayed above: its first $n_r + 1$ columns contain down-shifted copies of the coefficients of $A$, and its last $n_s + 1$ columns contain down-shifted copies of the coefficients of $B$.

This matrix has several interesting properties in connection to system theory. It can be shown that $A$ and $B$ are coprime if and only if $\mathcal{S}$ is non-singular (see e.g. (Kailath 1980)).
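A direct implementation of the Sylvester method is short: stack shifted copies of the coefficients of $A$ and $B$ as columns and solve the resulting linear system. Below is a sketch (Python/NumPy; the example polynomials are assumptions, chosen with $B(0) = 0$ as required in this appendix, and the orders follow the choices above):

```python
import numpy as np

def sylvester_solve(C, A, B, nr, ns):
    """Solve C = A R + B S by coefficient matching; returns (R, S).
    Coefficient vectors are ascending in q^{-1}; the system is square."""
    rows = nr + ns + 2
    M = np.zeros((rows, rows))
    for j in range(nr + 1):                 # columns with shifted copies of A
        M[j:j + len(A), j] = A
    for j in range(ns + 1):                 # columns with shifted copies of B
        M[j:j + len(B), nr + 1 + j] = B
    z = np.zeros(rows)
    z[:len(C)] = C
    x = np.linalg.solve(M, z)
    return x[:nr + 1], x[nr + 1:]

# assumed example (note B(0) = 0)
A = [1.0, -0.9, 0.2]
B = [0.0, 1.0, 0.5]
C = [1.0, 0.3, 0.1]
nr, ns = 1, 1                               # nr = nb - 1, ns = max(na - 1, nc - nb)
R, S = sylvester_solve(C, A, B, nr, ns)

# verify: A R + B S = C (padded with zeros)
lhs = np.convolve(A, R) + np.convolve(B, S)
print(np.allclose(lhs[:len(C)], C), np.allclose(lhs[len(C):], 0))   # True True
```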
C.2 Impulse response method
In certain cases the Diophantine equation degenerates to the following:

$$C(q^{-1}) = A(q^{-1}) G(q^{-1}) + q^{-k} S(q^{-1}) \qquad (75)$$

Here

$$n_g = k - 1 \qquad n_s = \max(n_a - 1, n_c - k)$$

In this case the solution is a simple division of polynomials:
$$\frac{C(q^{-1})}{A(q^{-1})} = G(q^{-1}) + q^{-k} \frac{S(q^{-1})}{A(q^{-1})}$$

and we can interpret the coefficients in $G$ as the truncated impulse response, i.e.

$$G(q^{-1}) = \sum_{i=0}^{k-1} g_i q^{-i} \qquad \text{where} \qquad \frac{C(q^{-1})}{A(q^{-1})} = \sum_{i=0}^{\infty} g_i q^{-i}$$

Let

$$\frac{C(q^{-1})}{A(q^{-1})} = \sum_{i=0}^{\infty} g_i q^{-i} = \sum_{i=0}^{k-1} g_i q^{-i} + \sum_{i=k}^{\infty} g_i q^{-i}$$

or

$$\frac{C(q^{-1})}{A(q^{-1})} = G(q^{-1}) + q^{-k} \sum_{i=0}^{\infty} g_{i+k} q^{-i}$$

Here $S$ is the remainder in the division.
The algorithm can be summarized as follows:

1. Determine the first $k$ coefficients of the impulse response, i.e.

$$G(q^{-1}) = \left[ \frac{C(q^{-1})}{A(q^{-1})} \right]_k$$

2. Determine $S$ from:

$$S(q^{-1}) = q^k \left( C(q^{-1}) - A(q^{-1}) G(q^{-1}) \right)$$
It should be noticed that this method is only applicable when the Diophantine equation takes the simple form in (75). The method can in certain situations (in connection to predictive control design) be implemented as a recursive method (in $k$).
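The two steps translate directly into code. Below is a Python sketch mirroring the Matlab fragment in Appendix A (the example polynomials are assumed toy values; coefficient vectors are ascending in $q^{-1}$ and $A$ is monic):

```python
import numpy as np

def impulse_dioph(C, A, k):
    """Solve C = A G + q^{-k} S by polynomial division (eq. (75))."""
    # step 1: the first k impulse response coefficients of C/A
    g = np.zeros(k)
    for i in range(k):
        acc = C[i] if i < len(C) else 0.0
        for j in range(1, min(i, len(A) - 1) + 1):
            acc -= A[j] * g[i - j]
        g[i] = acc
    # step 2: S = q^k (C - A G), i.e. the remainder shifted k steps
    n = max(len(C), len(A) + k - 1)
    rem = np.zeros(n)
    rem[:len(C)] = C
    rem -= np.pad(np.convolve(A, g), (0, n - (len(A) + k - 1)))
    return g, rem[k:]

G, S = impulse_dioph([1.0, 0.4], [1.0, -0.8], 3)
print(G, S)   # G = [1, 1.2, 0.96], S = [0.768]
```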
D Closed loop properties
Consider the dynamic system given by:

$$A y = B u + C e$$

and assume it is controlled by the controller:

$$R u = Q w - S y$$
The closed loop properties are quite easily obtained. If the system description is multiplied with $R$ we obtain:

$$A R y = B R u + C R e$$

In a similar way, if the controller is multiplied with $B$ we obtain:

$$B R u = B Q w - B S y$$

Consequently we have

$$(A R + B S) y = B Q w + R C e$$

If we multiply the system description with $S$ we have

$$A S y = B S u + C S e$$

If the controller is multiplied with $A$ we have:

$$A R u = A Q w - A S y$$

In total we obtain the description:

$$(A R + B S) u = A Q w - S C e$$