system modeling and identification power spectrum analysis


Lecture Note #2 (Chap.4 – Chap.5)

CBE 702, Korea University

Prof. Dae Ryook Yang

System Modeling and Identification


Chap.4 Spectrum Analysis

• The discrete Fourier transform (DFT)

$$X_k = X_N(i\omega_k) = h \sum_{m=0}^{N-1} x_m \exp(-i\omega_k m h) = h X_z(e^{i\omega_k h}), \quad \text{where } \omega_k = \frac{2\pi}{Nh} k \text{ for } k = 0, 1, \ldots, N-1$$

– The DFT thus evaluates the z-transform on the unit circle.

– The fast Fourier transform (FFT) algorithm is an important implementation of the DFT, although often with the restriction that the number of measurements should be a power of two.
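As a quick check of the DFT definition above, the following sketch evaluates the sum directly and compares it with NumPy's FFT scaled by the sampling period. The test signal, the value of h and the helper name `dft_direct` are illustrative assumptions, not part of the lecture note.

```python
import numpy as np

def dft_direct(x, h):
    """Direct evaluation of X_N(i w_k) = h * sum_m x_m exp(-i w_k m h)."""
    N = len(x)
    m = np.arange(N)
    omega = 2 * np.pi * np.arange(N) / (N * h)   # w_k = 2*pi*k/(N*h)
    return h * np.array([np.sum(x * np.exp(-1j * w * m * h)) for w in omega])

rng = np.random.default_rng(0)
h = 0.1                        # assumed sampling period
x = rng.standard_normal(64)    # assumed test signal

X_direct = dft_direct(x, h)
X_fft = h * np.fft.fft(x)      # the FFT computes the same sum without the factor h
max_diff = np.max(np.abs(X_direct - X_fft))
```

The two results should agree to rounding error; the FFT only changes the cost of the computation, not the values.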

Power Spectrum Analysis

• The periodogram (sample spectrum)

$$\hat{S}_{xx}(i\omega_k) = \frac{1}{Nh} \left| X_N(i\omega_k) \right|^2, \quad \text{for } \omega_k = \frac{2\pi}{Nh} k$$

– Shows the signal contents
– Sampled version of the spectrum convolved with a rectangular pulse of duration T

• The correlogram (indirect approach)

$$\hat{C}_{xx}(hk) = \frac{1}{N-k} \sum_{t=k}^{N-1} x_t x^*_{t-k}$$

$$\hat{S}_{xx}(i\omega_k) = h \sum_{m=0}^{N-1} \hat{C}_{xx}(mh) e^{-i\omega_k m h}$$
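A minimal sketch of the periodogram, assuming a pure 5 Hz sinusoid sampled at 100 Hz for 10 s (all values chosen for illustration). With these choices the sinusoid falls exactly on a frequency bin, so the peak location and the total power can be checked by hand.

```python
import numpy as np

h = 0.01                      # assumed sampling period [s]
N = 1000                      # assumed number of samples (record length N*h = 10 s)
t = np.arange(N) * h
f0 = 5.0                      # assumed signal frequency [Hz], exactly on bin 50
x = np.sin(2 * np.pi * f0 * t)

X = h * np.fft.fft(x)                         # X_N(i w_k)
S = np.abs(X) ** 2 / (N * h)                  # periodogram
omega = 2 * np.pi * np.arange(N) / (N * h)    # w_k = 2*pi*k/(N*h)

k_peak = np.argmax(S[: N // 2])               # search the positive-frequency half
f_peak = omega[k_peak] / (2 * np.pi)

# Parseval: summing the periodogram over all bins (bin width 1/(N*h) in Hz)
# recovers the mean power of the signal, which is 0.5 for a unit sinusoid.
power = np.sum(S) / (N * h)
```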

Spectral Leakage and Windowing

• Spectral leakage
– All methods based on the Fourier transform assume that an infinite measurement sequence is available.
– In reality, only a finite number (N) of measurements is available.
– This causes spectral leakage and distorts the calculated spectra.
– For an infinite number of measurements,

$$Y_\Delta(i\omega_k) = \mathcal{F}\{y(t) W(t)\} = \mathcal{F}\{y(t)\} * \mathcal{F}\{W(t)\}$$

– For N measurements,

$$Y_N(i\omega_k) = \mathcal{F}\{y(t) W(t) N_T(t)\} = \mathcal{F}\{y(t)\} * \mathcal{F}\{W(t)\} * \mathcal{F}\{N_T(t)\}$$

– The last term causes spectral leakage, and the signal can be distorted.

– To reduce the effects of the bias, use windowing.

• Window carpentry
– Window function: a weight function applied to the data to reduce the spectral leakage associated with the finite observation interval
– Basic property:

$$w(\tau) = \begin{cases} 1, & \tau = 0 \\ 0, & \tau \text{ large} \end{cases} \qquad w'(\tau) \approx \varepsilon \text{ for large } \tau$$

– Commonly used window functions and application of windowing

$$s(\omega_0) = \int_0^T y(t) w(t) \sin \omega_0 t \, dt, \quad \text{where } \omega_0 T = \pi, 3\pi, 5\pi, \ldots$$
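The effect of a window on leakage can be illustrated with a sinusoid that falls halfway between two DFT bins. The sketch below compares the rectangular window (no weighting) with a Hann window, which is one commonly used choice. The signal, the record length and the leakage measure `far_leakage` (fraction of energy more than 5 bins from the true frequency) are assumptions made for this illustration.

```python
import numpy as np

N = 256
n = np.arange(N)
cycles = 10.5                              # assumed: halfway between bins 10 and 11
x = np.sin(2 * np.pi * cycles * n / N)

def far_leakage(xw):
    """Fraction of spectral energy more than 5 bins away from the true frequency."""
    S = np.abs(np.fft.rfft(xw)) ** 2
    k = np.arange(len(S))
    far = np.abs(k - cycles) > 5
    return S[far].sum() / S.sum()

leak_rect = far_leakage(x)                  # rectangular window
leak_hann = far_leakage(x * np.hanning(N))  # Hann window tapers both ends to zero
```

The Hann window has a wider main lobe, so it trades some resolution for much lower leakage far from the peak.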

Transfer Function Estimation

• Based on the DFTs of input and output

$$\hat{H}_1(e^{i\omega h}) = \frac{Y_N(i\omega)}{U_N(i\omega)}$$

• Based on the cross spectrum and the autospectrum of the input

$$\hat{H}_2(e^{i\omega h}) = \frac{\hat{S}_{yu}(i\omega)}{\hat{S}_{uu}(i\omega)}$$

If the input and the disturbance are uncorrelated, the disturbance contribution to $Y_N(i\omega)$ is considerable while that to $\hat{S}_{yu}(i\omega)$ is not. Thus, $\hat{H}_2$ can yield better results when the input and the disturbance are uncorrelated.

• Parametric model (e.g., transfer function model)
– The gain plot of the Bode diagram: estimation of the low-frequency gain by fitting asymptotes to the gain plot, resonances
– The phase plot of the Bode diagram: contribution of poles and zeros to the phase lag/lead, dead time

• Statistical properties of the transfer function estimate
– Using the coherence spectrum, check the linear dependence and the disturbance levels.
– For a periodic input, $\hat{H}(e^{i\omega h})$ is defined and unbiased at a fixed number of frequencies, and its variance decreases as N grows.
– For a stochastic input, the estimated $\hat{H}(e^{i\omega h})$ is asymptotically unbiased as the number of observations N increases.
– In that case the variance of the TF estimate at a given frequency point does not decrease as N grows. Instead, the signal-to-noise ratio determines the accuracy at each frequency.
– To reduce the variance, split the data into blocks and average the block estimates.
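The variance reduction from block segmentation can be sketched as follows. The first-order system, the noise level and the segment count are assumptions chosen for illustration; the code compares a single-block ratio of DFTs with the ratio of cross spectrum to input autospectrum averaged over all blocks.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 0.9, 0.1               # assumed system: y_k = a*y_{k-1} + b*u_{k-1}
N, M = 4096, 16               # total samples and number of blocks (assumed)
L = N // M                    # block length

u = rng.standard_normal(N)    # white-noise input
y = np.zeros(N)
for k in range(1, N):
    y[k] = a * y[k - 1] + b * u[k - 1]
y = y + 0.1 * rng.standard_normal(N)   # additive output disturbance

U = np.fft.rfft(u.reshape(M, L), axis=1)   # one DFT per block
Y = np.fft.rfft(y.reshape(M, L), axis=1)

H1_single = Y[0] / U[0]                    # ratio of DFTs, first block only
S_yu = np.mean(Y * np.conj(U), axis=0)     # averaged cross spectrum
S_uu = np.mean(np.abs(U) ** 2, axis=0)     # averaged input autospectrum
H2 = S_yu / S_uu

wh = 2 * np.pi * np.arange(L // 2 + 1) / L                    # omega*h on the block grid
H_true = b * np.exp(-1j * wh) / (1 - a * np.exp(-1j * wh))
err_single = np.mean(np.abs(H1_single - H_true) ** 2)
err_avg = np.mean(np.abs(H2 - H_true) ** 2)
```

Averaging costs frequency resolution (the grid has L/2 + 1 points instead of N/2 + 1), which is the trade-off described in the bullets above.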

Smoothing Spectra

• Windowing:
– It offers a trade-off between spectral resolution and smoothness; the trade-off depends on the choice of window.
– The computation of $\hat{S}_{yu}(i\omega)$ from $\hat{C}_{yu}(\tau)$ has poor statistical and numerical properties. The remedy can be the use of windowing.

• Block segmentation:
– Split the data into segments, compute the periodogram for each segment, and average the periodograms over all the segments.

• Zero padding:
– Make a DFT of a sequence of N data points extended with a sequence of kN zeros.
– Zero padding is useful for smoothing the appearance of the spectrum estimate and for resolving potential ambiguities, but it does not improve the fundamental frequency resolution.
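A small check of the zero-padding statement, assuming a random test sequence and k = 3: the padded DFT has (k+1)N points, and every (k+1)-th point coincides with the original DFT, so the padding only interpolates between the original frequency points.

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 32, 3                           # assumed data length and padding factor
x = rng.standard_normal(N)

X = np.fft.fft(x)                      # N-point DFT
X_pad = np.fft.fft(x, n=(k + 1) * N)   # np.fft.fft zero-pads x to length n

# Every (k+1)-th sample of the padded DFT is an original DFT sample.
same_on_grid = np.allclose(X_pad[:: k + 1], X)
```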

Covariance Estimate and Correlation Analysis

• Obtaining the impulse response
– The input is white noise or a PRBS.

$$y(k) = \sum_{l=0}^{\infty} h(l) u(k-l) + e(k)$$

$$C_{yu}(kh) = \sum_{l=0}^{\infty} h(l) C_{uu}((k-l)h)$$

– For N measurements,

$$\hat{C}_{yu}(kh) = \frac{1}{N-k} \sum_{l=k+1}^{N} y(l) u(l-k)$$

$$\hat{C}_{uu}(kh) = \frac{1}{N-k} \sum_{l=k+1}^{N} u(l) u(l-k)$$

– To get h(k), an infinite-dimensional equation must be solved.
– But for the special case of a white noise input,

$$C_{uu}(kh) = \begin{cases} \sigma_u^2, & \text{if } k = 0 \\ 0, & \text{if } k \neq 0 \end{cases} \qquad \hat{h}(k) = \frac{1}{\sigma_u^2} \hat{C}_{yu}(kh)$$

– The estimation of h(k) from $\hat{C}_{yu}(kh)$ also has very poor numerical properties, especially for large k.
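The white-noise special case can be sketched numerically. The impulse response `h_true`, the noise level and the record length below are assumptions for illustration; the helper `cross_cov` implements the finite-sample cross-covariance estimate with zero-based array indexing.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20000
h_true = np.array([0.0, 0.5, 0.3, 0.2, 0.1])   # assumed impulse response
u = rng.standard_normal(N)                      # white-noise input
e = 0.1 * rng.standard_normal(N)                # disturbance
y = np.convolve(u, h_true)[:N] + e              # y(k) = sum_l h(l) u(k-l) + e(k)

def cross_cov(y, u, k):
    """(1/(N-k)) * sum over l of y(l) * u(l-k)."""
    return np.dot(y[k:], u[: len(u) - k]) / (len(y) - k)

sigma_u2 = np.var(u)
h_hat = np.array([cross_cov(y, u, k) for k in range(len(h_true))]) / sigma_u2
max_err = np.max(np.abs(h_hat - h_true))
```

Each coefficient is estimated independently, so the error does not shrink for large k even though the true h(k) does, which is one way to see the poor behaviour for large k.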

Chap.5 Linear Regression

• A general black-box model

$$y = f(\phi_1, \phi_2, \ldots, \phi_p; \theta) + v$$

– y: output (response, dependent variable)
– φ: regressors (explanatory, independent variables)
– θ: parameters
– v: disturbances

• A linear model (linear in parameters):

$$y(t) = \phi^T(t)\theta + e(t)$$

1. $y(t) = \theta_0 + \theta_1 u + \theta_2 u^2 = [1 \;\; u \;\; u^2][\theta_0 \;\; \theta_1 \;\; \theta_2]^T$

2. $PV^\gamma = c \Rightarrow \log P = -\gamma \log V + \log c \Rightarrow y = [\log V \;\; 1][-\gamma \;\; \log c]^T$

3. $y(k) = a y(k-1) + b u(k-1) = [y(k-1) \;\; u(k-1)][a \;\; b]^T$

Least Square Estimation

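• N data points

$$Y_N = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}, \quad \Phi_N = \begin{bmatrix} \phi_1^T \\ \vdots \\ \phi_N^T \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ \vdots \\ e_N \end{bmatrix}, \quad Y_N = \Phi_N \theta + e$$

• Assuming

$$E\{e_i\} = 0, \quad E\{e_i e_j\} = \sigma_e^2 \delta_{ij} \quad \forall i, j$$

• Prediction error:

$$\varepsilon(\hat\theta) = Y_N - \Phi_N \hat\theta$$

• Loss function:

$$V(\theta) = \frac{1}{2} \sum_{k=1}^{N} \varepsilon_k^2 = \frac{1}{2} \varepsilon^T \varepsilon = \frac{1}{2} (Y_N - \Phi_N\theta)^T (Y_N - \Phi_N\theta)$$

• Least square estimate:

$$\hat\theta = \arg\min_\theta V(\theta) = (\Phi_N^T \Phi_N)^{-1} \Phi_N^T Y_N$$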
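As an illustration of the least square estimate, the sketch below fits the gas-law example $PV^\gamma = c$ after taking logarithms. The true values, the volume grid and the noise level are assumptions; the normal-equation solution is compared with `numpy.linalg.lstsq`, which solves the same problem with an SVD-based LAPACK routine.

```python
import numpy as np

rng = np.random.default_rng(3)
gamma_true, c_true = 1.4, 2.0          # assumed true parameters
V = np.linspace(1.0, 5.0, 50)          # assumed volume measurements
# Pressure with small multiplicative noise, so the noise is additive in log P.
P = c_true * V ** (-gamma_true) * np.exp(0.01 * rng.standard_normal(V.size))

Y = np.log(P)                                         # y = log P
Phi = np.column_stack([np.log(V), np.ones_like(V)])   # regressors [log V, 1]

theta_ne = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)    # (Phi^T Phi)^-1 Phi^T Y
theta_ls, *_ = np.linalg.lstsq(Phi, Y, rcond=None)    # same estimate via lstsq

gamma_hat = -theta_ne[0]          # theta = [-gamma, log c]
c_hat = np.exp(theta_ne[1])
```

For well-conditioned problems both routes agree; `lstsq` is usually preferred numerically because it avoids forming $\Phi^T\Phi$ explicitly.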

• Solution for the least square estimate

$$\frac{\partial V(\theta)}{\partial \theta} = -Y_N^T \Phi_N + \theta^T (\Phi_N^T \Phi_N) = 0 \quad \text{(necessary condition for optimality)}$$

– Sufficient condition for optimality: V(θ) is positive semidefinite

$$2V(\theta) = Y_N^T \left( I - \Phi_N (\Phi_N^T\Phi_N)^{-1} \Phi_N^T \right) Y_N + \left( \theta - (\Phi_N^T\Phi_N)^{-1}\Phi_N^T Y_N \right)^T (\Phi_N^T\Phi_N) \left( \theta - (\Phi_N^T\Phi_N)^{-1}\Phi_N^T Y_N \right) \geq 0$$

– The existence of the inverse of $(\Phi_N^T\Phi_N)$
• The size of the regressor matrix $\Phi_N$ is (N×p), where p is the number of parameters.
• If $\mathrm{rank}(\Phi_N^T\Phi_N) < p$, the inverse does not exist.
• If $\mathrm{rank}(\Phi_N^T\Phi_N) = p$, the inverse exists.

• Statistical properties of the least square estimate (white noise)
– An unbiased estimate of $\sigma_e^2$ is

$$\hat\sigma_e^2 = \frac{2}{N-p} V(\hat\theta)$$

– The covariance matrix of the estimate is

$$\sigma_e^2 (\Phi_N^T\Phi_N)^{-1}$$

• Example 5.4 (zero mean disturbance)

$$S: \; y_k = a y_{k-1} + b u_{k-1} + e_k$$

$$M: \; \hat{y}_k = \hat{a} y_{k-1} + \hat{b} u_{k-1} = [y_{k-1} \;\; u_{k-1}][\hat{a} \;\; \hat{b}]^T = \phi_k^T \hat\theta$$

– The system parameters: a = 0.9, b = 0.1 (unknown)
– The u and e are generated as random with $\sigma_u^2 = \sigma_e^2 = 1$ (E{e} = 0).
– The least square estimate

$$\hat\theta = [\hat{a} \;\; \hat{b}]^T = [0.8992 \;\; 0.0899]^T$$

– The loss function

$$V(\hat\theta) = 499.94$$

– The variance estimate

$$\hat\sigma_e^2 = 1.0019$$

– The estimated covariance

$$\hat\sigma_e^2 (\Phi_N^T\Phi_N)^{-1} = 10^{-3} \cdot \begin{bmatrix} 0.249 & -0.09 \\ -0.09 & 1.023 \end{bmatrix}$$

– Results are good even with a large disturbance level.
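Example 5.4 can be reproduced in spirit with a short simulation. The record length N = 1000 is an assumption (it is consistent with the reported loss and variance estimate, since 2 · 499.94 / 1.0019 ≈ 998 = N − p), and the random seed is arbitrary, so the numbers will differ from those on the slide.

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, N = 0.9, 0.1, 1000          # true parameters; N is an assumed record length
u = rng.standard_normal(N + 1)    # input with unit variance
e = rng.standard_normal(N + 1)    # zero-mean disturbance with unit variance
y = np.zeros(N + 1)
for k in range(1, N + 1):
    y[k] = a * y[k - 1] + b * u[k - 1] + e[k]

Phi = np.column_stack([y[:-1], u[:-1]])    # rows phi_k^T = [y_{k-1}, u_{k-1}]
Y = y[1:]

theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)   # least square estimate
eps = Y - Phi @ theta_hat                             # prediction errors
V = 0.5 * eps @ eps                                   # loss function
p = Phi.shape[1]
sigma2_hat = 2 * V / (N - p)                          # unbiased variance estimate
cov_hat = sigma2_hat * np.linalg.inv(Phi.T @ Phi)     # estimated covariance
```

The estimate of a is much more precise than that of b because the regressor $y_{k-1}$ has far larger variance than $u_{k-1}$ for this system.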

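• Example 5.5 (nonzero mean disturbance)

$$S: \; y_k = a y_{k-1} + b u_{k-1} + e_k$$

$$M: \; \hat{y}_k = \hat{a} y_{k-1} + \hat{b} u_{k-1} = [y_{k-1} \;\; u_{k-1}][\hat{a} \;\; \hat{b}]^T = \phi_k^T \hat\theta$$

– The system parameters: a = 0.9, b = 0.1 (unknown)
– The u and e are generated as random with $\sigma_u^2 = \sigma_e^2 = 1$ (E{e} = 1).
– The least square estimate

$$\hat\theta = [\hat{a} \;\; \hat{b}]^T = [0.9829 \;\; 0.1550]^T$$

– The loss function

$$V(\hat\theta) = 531.04$$

– The variance estimate

$$\hat\sigma_e^2 = 1.0642$$

– The estimated covariance

$$\hat\sigma_e^2 (\Phi_N^T\Phi_N)^{-1} = 10^{-3} \cdot \begin{bmatrix} 0.0249 & -0.2734 \\ -0.2734 & 3.568 \end{bmatrix}, \quad \det = 1.4096 \cdot 10^{-8}$$

– This covariance matrix is very close to singular. Hence, the least-squares solution is sensitive to colored noise.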

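• Example 5.6 (step response)

$$S: \; y_k = a y_{k-1} + b u_{k-1} + e_k$$

$$M: \; \hat{y}_k = \hat{a} y_{k-1} + \hat{b} u_{k-1} = [y_{k-1} \;\; u_{k-1}][\hat{a} \;\; \hat{b}]^T = \phi_k^T \hat\theta$$

– The system parameters: a = 0.9, b = 0.1 (unknown), step input
– The e is generated as random with $\sigma_e^2 = 1$ (E{e} = 0).
– The least square estimate

$$\hat\theta = [\hat{a} \;\; \hat{b}]^T = [0.8836 \;\; 0.1056]^T$$

– The loss function/variance estimate

$$V(\hat\theta) = 484.1, \quad \hat\sigma_e^2 = 0.970$$

– The estimated covariance and its eigenvalues and eigenvectors

$$\hat\sigma_e^2 (\Phi_N^T\Phi_N)^{-1} = 10^{-3} \cdot \begin{bmatrix} 0.226 & -0.219 \\ -0.219 & 1.291 \end{bmatrix}$$

$$\lambda = 10^{-3} \cdot \begin{bmatrix} 0.183 \\ 1.335 \end{bmatrix}, \quad V = \begin{bmatrix} 0.981 & 0.194 \\ -0.194 & 0.981 \end{bmatrix}$$

– The large and small eigenvalues give the variance of the combination of the estimated parameters associated with the corresponding eigenvector.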

• Example 5.7 (sensitivity to outliers)
– A single big outlier may result in a very distorted identification.

• Comments on least square estimation
– Colored noise, as opposed to white noise, will affect the efficiency of the identification results.
• Outliers
• Filtered white noise
– Remedies
• Choose a high-SNR case if possible.
• Improve the excitation so that the rank of the $\Phi_N^T\Phi_N$ matrix is at least the same as the number of parameters to be fitted (persistent excitation).
• Remove the outliers by suitable filters, such as a spike filter, or by visual inspection.
• The identification results should be verified thoroughly. If the result is no good, try different methods or design the experiments carefully.

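Optimal Linear Unbiased Estimators

• The assumption

$$E\{e_i\} = 0, \quad E\{e_i e_j\} = \sigma_e^2 \delta_{ij} \quad \forall i, j$$

is too restrictive.

• Class of all linear estimates

$$\hat\theta = T^T Y$$

$$\tilde\theta = \hat\theta - \theta = T^T Y - \theta = (T^T\Phi - I)\theta + T^T e$$

– For unbiased estimation, $T^T\Phi = I$ and $E\{T^T e\} = 0$.
– Best possible covariance

$$\mathrm{Cov}(\hat\theta) = E\{(\hat\theta - \theta)(\hat\theta - \theta)^T\} = E\{(T^T Y - \theta)(T^T Y - \theta)^T\} = T^T R T, \quad R = E\{ee^T\}$$

– Lagrange function for the optimal estimate with constraints

$$L(T, \Lambda) = \tilde\theta^T T^T R T \tilde\theta + \mathrm{tr}\, \Lambda (T^T\Phi - I)$$

$$\frac{\partial L(T, \Lambda)}{\partial T} = 2 R T \tilde\theta \tilde\theta^T + \Phi\Lambda = 0, \qquad \frac{\partial L(T, \Lambda)}{\partial \Lambda} = T^T\Phi - I = 0$$

– The optimal estimator

$$\Lambda = -2 (\Phi^T R^{-1} \Phi)^{-1} \tilde\theta \tilde\theta^T, \qquad T = R^{-1} \Phi (\Phi^T R^{-1} \Phi)^{-1}$$

$$\hat\theta = T^T Y = (\Phi^T R^{-1} \Phi)^{-1} \Phi^T R^{-1} Y \quad \text{(Markov estimate)}$$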

– The covariance matrix

$$\mathrm{Cov}(\hat\theta) = (\Phi^T R^{-1} \Phi)^{-1}$$

– This estimator is also called the Best Linear Unbiased Estimator (BLUE).
– The BLUE can also be derived by minimizing the loss function

$$V(\hat\theta) = \frac{1}{2} (Y - \Phi\hat\theta)^T R^{-1} (Y - \Phi\hat\theta)$$

– If $R = \sigma_e^2 I$, the BLUE is the same as the ordinary least square estimate.
– If $R = \mathrm{diag}(\sigma_{e1}^2, \sigma_{e2}^2, \ldots, \sigma_{eN}^2)$, the BLUE is the same as the weighted least square (WLS) estimate.
– If R is a full matrix, the BLUE is the same as the generalized least square (GLS) estimate.
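The relation between the Markov estimate and the ordinary and weighted least square estimates can be checked numerically. In the sketch below the straight-line model, the two noise levels and the helper name `markov_estimate` are assumptions made for illustration.

```python
import numpy as np

def markov_estimate(Phi, Y, R):
    """BLUE: theta = (Phi^T R^-1 Phi)^-1 Phi^T R^-1 Y, cov = (Phi^T R^-1 Phi)^-1."""
    Rinv_Phi = np.linalg.solve(R, Phi)     # R^-1 Phi without forming R^-1
    Rinv_Y = np.linalg.solve(R, Y)         # R^-1 Y
    M = Phi.T @ Rinv_Phi
    theta = np.linalg.solve(M, Phi.T @ Rinv_Y)
    return theta, np.linalg.inv(M)

rng = np.random.default_rng(5)
N = 200
Phi = np.column_stack([np.ones(N), np.linspace(0.0, 1.0, N)])   # assumed regressors
theta_true = np.array([1.0, 2.0])
sig = np.where(np.arange(N) < N // 2, 0.1, 1.0)    # assumed: second half is noisier
Y = Phi @ theta_true + sig * rng.standard_normal(N)

# R = sigma^2 * I: the BLUE reduces to ordinary least squares.
theta_ols, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
theta_iso, _ = markov_estimate(Phi, Y, 0.3 ** 2 * np.eye(N))

# Diagonal R: the BLUE is weighted least squares, i.e. OLS on rows scaled by 1/sigma_i.
theta_wls, cov_wls = markov_estimate(Phi, Y, np.diag(sig ** 2))
theta_scaled, *_ = np.linalg.lstsq(Phi / sig[:, None], Y / sig, rcond=None)
```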

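Linear Regression in Frequency Domain

• For the frequency data $G(i\omega_k)$,

$$\hat{G}(i\omega) = \frac{\hat{B}(i\omega)}{\hat{A}(i\omega)} = \frac{\hat{b}_1 (i\omega)^{n-1} + \cdots + \hat{b}_n}{(i\omega)^n + \hat{a}_1 (i\omega)^{n-1} + \cdots + \hat{a}_n}$$

$$\min_{a,b} \sum_k \left| G(i\omega_k) - \frac{\hat{B}(i\omega_k)}{\hat{A}(i\omega_k)} \right|^2$$

• Rearranging since it is nonlinear

$$\min_{a,b} \sum_k \left| \hat{A}(i\omega_k) G(i\omega_k) - \hat{B}(i\omega_k) \right|^2$$

• Then let

$$Y = \begin{bmatrix} (i\omega_1)^n G(i\omega_1) \\ (i\omega_2)^n G(i\omega_2) \\ \vdots \\ (i\omega_N)^n G(i\omega_N) \end{bmatrix}$$

$$\Phi = \begin{bmatrix} -(i\omega_1)^{n-1} G(i\omega_1) & \cdots & -G(i\omega_1) & (i\omega_1)^{n-1} & \cdots & 1 \\ -(i\omega_2)^{n-1} G(i\omega_2) & \cdots & -G(i\omega_2) & (i\omega_2)^{n-1} & \cdots & 1 \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ -(i\omega_N)^{n-1} G(i\omega_N) & \cdots & -G(i\omega_N) & (i\omega_N)^{n-1} & \cdots & 1 \end{bmatrix}$$

$$\theta = [a_1 \; \cdots \; a_n \;\; b_1 \; \cdots \; b_n]^T, \qquad \hat\theta = (\Phi^* \Phi)^{-1} \Phi^* Y$$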
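A minimal sketch of the frequency-domain regression for n = 2, using noise-free frequency response data from an assumed second-order system. The parameter values and the frequency grid are illustrative; `numpy.linalg.lstsq` accepts complex matrices and returns the same solution as $(\Phi^*\Phi)^{-1}\Phi^* Y$ when $\Phi$ has full column rank.

```python
import numpy as np

# Assumed true system: G(s) = (b1*s + b2) / (s^2 + a1*s + a2)
a1, a2, b1, b2 = 0.8, 4.0, 1.0, 2.0
w = np.logspace(-1, 1, 40)            # assumed frequency grid [rad/s]
s = 1j * w
G = (b1 * s + b2) / (s ** 2 + a1 * s + a2)

# A(iw) G(iw) - B(iw) = 0  rearranged as  (iw)^2 G = -a1 (iw) G - a2 G + b1 (iw) + b2
Y = s ** 2 * G
Phi = np.column_stack([-s * G, -G, s, np.ones_like(s)])

theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)   # complex least squares
a1_hat, a2_hat, b1_hat, b2_hat = theta.real
```

With noisy data the estimate would in general have a small imaginary part, and the high-frequency rows, which are scaled by $(i\omega)^n$, would dominate the fit.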

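– This method is capable of fitting complicated frequency responses.
– But if the frequency range is wide, it gives heavy weighting to the high-frequency data.

• Least square properties of the DFT

$$y_k = y(kh) = \sum_{m=0}^{N-1} \theta_m e^{i\omega_k m h} = \theta_0 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_{N-1} z^{N-1}, \quad z = e^{i\omega_k h}$$

$$\Phi = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ e^{i\omega_0 h} & e^{i\omega_1 h} & \cdots & e^{i\omega_{N-1} h} \\ \vdots & \vdots & \ddots & \vdots \\ e^{i\omega_0 (N-1)h} & e^{i\omega_1 (N-1)h} & \cdots & e^{i\omega_{N-1} (N-1)h} \end{bmatrix}^T$$

$$Y = [y_0 \;\; y_1 \; \cdots \; y_{N-1}]^T, \qquad \theta = [\theta_0 \;\; \theta_1 \; \cdots \; \theta_{N-1}]^T$$

$$\hat\theta = (\Phi^* \Phi)^{-1} \Phi^* Y$$

– A simplification: $(\Phi^*\Phi)^{-1} = I/N$, so that

$$\hat\theta = \frac{1}{N} \Phi^* Y, \qquad \hat\theta_m = \frac{1}{N} \sum_{k=0}^{N-1} y_k e^{-i\omega_m k h}$$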
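The simplification $(\Phi^*\Phi)^{-1} = I/N$ can be verified numerically. The sketch assumes N = 8 and a random data vector; since $\omega_k m h = 2\pi k m / N$, the sampling period drops out of the regressor matrix.

```python
import numpy as np

N = 8                                   # assumed number of samples
k = np.arange(N)
# Phi[k, m] = exp(i * w_k * m * h) = exp(2*pi*i*k*m/N)
Phi = np.exp(2j * np.pi * np.outer(k, k) / N)

gram = Phi.conj().T @ Phi               # should equal N * I

rng = np.random.default_rng(6)
y = rng.standard_normal(N)
theta_hat = Phi.conj().T @ y / N        # (1/N) Phi^* Y
theta_fft = np.fft.fft(y) / N           # the same coefficients from the FFT
y_back = Phi @ theta_hat                # the model reproduces the data exactly
```

Because the columns of $\Phi$ are orthogonal, no matrix inversion is needed, and the least square estimate is the DFT of the data divided by N.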

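Least Square Estimation with Linear Constraints

• Problem statement

$$\min_{\hat\theta} V(\hat\theta) = \frac{1}{2} (Y - \Phi\hat\theta)^T R^{-1} (Y - \Phi\hat\theta) \quad \text{subject to } F\hat\theta - G = 0$$

• Lagrangian formulation

$$\min_{\hat\theta, \lambda} L(\hat\theta, \lambda) = \frac{1}{2} (Y - \Phi\hat\theta)^T R^{-1} (Y - \Phi\hat\theta) + \lambda^T (F\hat\theta - G)$$

$$\frac{\partial L(\hat\theta, \lambda)}{\partial \hat\theta} = \Phi^T R^{-1} \Phi \hat\theta - \Phi^T R^{-1} Y + F^T \lambda = 0$$

$$\hat\theta = (\Phi^T R^{-1} \Phi)^{-1} (\Phi^T R^{-1} Y - F^T \lambda)$$

$$\lambda = \left( F (\Phi^T R^{-1} \Phi)^{-1} F^T \right)^{-1} \left( F (\Phi^T R^{-1} \Phi)^{-1} \Phi^T R^{-1} Y - G \right)$$

– For example, the fixed static gain case

$$\frac{\hat{b}}{1 - \hat{a}} = 1 \Rightarrow \hat{a} + \hat{b} = 1 \quad (F = [1 \;\; 1], \; G = 1)$$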
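The constrained estimate can be sketched for the fixed static gain case with F = [1 1] and G = 1. The simulated first-order data, the noise level, R = I and the helper name `constrained_ls` are assumptions for illustration.

```python
import numpy as np

def constrained_ls(Phi, Y, R, F, G):
    """Least squares with the linear constraint F theta = G (R assumed symmetric)."""
    Rinv_Phi = np.linalg.solve(R, Phi)
    M = Phi.T @ Rinv_Phi                    # Phi^T R^-1 Phi
    rhs = Rinv_Phi.T @ Y                    # Phi^T R^-1 Y
    theta_u = np.linalg.solve(M, rhs)       # unconstrained estimate
    Minv_FT = np.linalg.solve(M, F.T)       # (Phi^T R^-1 Phi)^-1 F^T
    lam = np.linalg.solve(F @ Minv_FT, F @ theta_u - G)   # Lagrange multiplier
    theta = theta_u - Minv_FT @ lam         # constrained estimate
    return theta, lam

rng = np.random.default_rng(7)
a, b, N = 0.9, 0.1, 500                     # assumed system with static gain b/(1-a) = 1
u = rng.standard_normal(N + 1)
y = np.zeros(N + 1)
for k in range(1, N + 1):
    y[k] = a * y[k - 1] + b * u[k - 1] + 0.5 * rng.standard_normal()

Phi = np.column_stack([y[:-1], u[:-1]])
Y = y[1:]
F = np.array([[1.0, 1.0]])                  # a + b = 1
G = np.array([1.0])
theta_c, lam = constrained_ls(Phi, Y, np.eye(N), F, G)
```

The constraint is satisfied exactly by construction, at the price of a loss that is at least as large as the unconstrained one.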

A Geometrical Interpretation

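• The least square estimate can be regarded as the projection of the data on the surface spanned by the columns of $\Phi_N$.

• Projection matrix

$$P_N = \Phi_N (\Phi_N^T \Phi_N)^{-1} \Phi_N^T$$

– $P_N$ is symmetric
– $P_N$ is normalized

• Projection

$$\hat{Y}_N = P_N Y_N = \Phi_N (\Phi_N^T \Phi_N)^{-1} \Phi_N^T Y_N$$

– $\Phi_N^T \varepsilon = 0$

• Variance

$$V(\hat\theta) = \frac{1}{2} \varepsilon^T \varepsilon = \frac{1}{2} (Y_N^T Y_N - \hat{Y}_N^T \hat{Y}_N)$$

• Augmented system method
– $\Phi_N^T \varepsilon = 0$ and $Y_N = \Phi_N \hat\theta + \varepsilon$

$$\begin{bmatrix} I_{N \times N} & \Phi_N \\ \Phi_N^T & 0_{p \times p} \end{bmatrix} \begin{bmatrix} \varepsilon \\ \hat\theta \end{bmatrix} = \begin{bmatrix} Y_N \\ 0_{p \times 1} \end{bmatrix}$$

The block matrix on the left is the augmented system matrix.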
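The geometric statements can be checked on random data. The dimensions and the data below are assumptions; the check of $P_N^2 = P_N$ is the usual idempotence property of a projection, which is presumably what "normalized" refers to.

```python
import numpy as np

rng = np.random.default_rng(8)
N, p = 30, 3                                # assumed dimensions
Phi = rng.standard_normal((N, p))
Y = rng.standard_normal(N)

P = Phi @ np.linalg.solve(Phi.T @ Phi, Phi.T)   # projection matrix P_N
Y_hat = P @ Y                                   # projection of the data
eps = Y - Y_hat                                 # residual, orthogonal to the columns

# Augmented system: [[I, Phi], [Phi^T, 0]] [eps; theta] = [Y; 0]
K = np.block([[np.eye(N), Phi], [Phi.T, np.zeros((p, p))]])
sol = np.linalg.solve(K, np.concatenate([Y, np.zeros(p)]))
eps_aug, theta_aug = sol[:N], sol[N:]

theta_ne = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)   # normal-equation estimate
```

The augmented system yields the residual and the estimate in one solve without forming $\Phi_N^T\Phi_N$.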

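Multivariable System Identification

• Consider a MIMO system (p inputs and m outputs)

$$S: \; A(z^{-1}) Y(z) = B(z^{-1}) U(z), \quad \det A(z^{-1}) \neq 0$$

$$A(z^{-1}) = I_{m \times m} + A_1 z^{-1} + \cdots + A_n z^{-n}, \quad A_1, \ldots, A_n \in R^{m \times m}$$

$$B(z^{-1}) = B_1 z^{-1} + \cdots + B_n z^{-n}, \quad B_1, \ldots, B_n \in R^{m \times p}$$

$$y_k = -A_1 y_{k-1} - \cdots - A_n y_{k-n} + B_1 u_{k-1} + \cdots + B_n u_{k-n}, \quad y_k \in R^m$$

$$\phi_k = [-y_{k-1}^T \; \cdots \; -y_{k-n}^T \;\; u_{k-1}^T \; \cdots \; u_{k-n}^T]^T, \quad \phi_k \in R^{n(m+p)}$$

$$\theta = [A_1 \; \cdots \; A_n \;\; B_1 \; \cdots \; B_n]^T, \quad \theta \in R^{n(m+p) \times m}$$

$$Y_N = \begin{bmatrix} y_1^T \\ \vdots \\ y_N^T \end{bmatrix}, \quad \Phi_N = \begin{bmatrix} \phi_1^T \\ \vdots \\ \phi_N^T \end{bmatrix}, \quad e = \begin{bmatrix} e_1^T \\ \vdots \\ e_N^T \end{bmatrix}, \quad Y_N = \Phi_N \theta + e$$

$$\hat\theta = (\Phi_N^T \Phi_N)^{-1} \Phi_N^T Y_N$$

– Different (A, B) will give the same transfer function: $H = A^{-1}B = (QA)^{-1}(QB)$.
– Thus, choose the parameter set with the smallest 2-norm.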

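• Example 5.9

$$S: \; y_k = \begin{bmatrix} 0.5 & 0.4 \\ 0.4 & 0.5 \end{bmatrix} y_{k-1} + \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} u_{k-1}, \quad u_k, y_k \in R^2$$

– For the model of order n=1

$$y_k = -A_1 y_{k-1} + B_1 u_{k-1}$$

$$\hat\theta_N = \begin{bmatrix} \hat{A}_1^T \\ \hat{B}_1^T \end{bmatrix} = \begin{bmatrix} -0.5 & -0.4 \\ -0.4 & -0.5 \\ 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad \|\hat\theta_N\|_2 = 1.676, \quad \|\hat\theta_N\|_F = 2.195$$

– For the model of order n=2

$$y_k = -A_1 y_{k-1} - A_2 y_{k-2} + B_1 u_{k-1} + B_2 u_{k-2}$$

$$\begin{bmatrix} \hat{A}_1^T \\ \hat{A}_2^T \end{bmatrix} = \begin{bmatrix} -0.5 & -0.4 \\ -0.2827 & -0.3534 \\ -0.0469 & -0.0587 \\ -0.0587 & -0.0733 \end{bmatrix}, \quad \begin{bmatrix} \hat{B}_1^T \\ \hat{B}_2^T \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 0.1173 & 0.1466 \\ -0.1173 & -0.1466 \end{bmatrix}, \quad \|\hat\theta_N\|_2 = 1.641, \quad \|\hat\theta_N\|_F = 2.168$$

– The second-order model is the wrong model, but its parameter norm is smaller.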
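The non-uniqueness in this example can be sketched with simulated noise-free data. The system matrices follow the example as read here (the sign pattern of B is an assumption), the input is assumed to be white noise, and the cutoff `rcond=1e-8` is an arbitrary choice. `numpy.linalg.lstsq` returns the minimum-norm solution of a rank-deficient problem, which need not coincide with the numbers on the slide.

```python
import numpy as np

rng = np.random.default_rng(9)
A0 = np.array([[0.5, 0.4], [0.4, 0.5]])     # y_k = A0 y_{k-1} + B0 u_{k-1}
B0 = np.array([[1.0, 1.0], [1.0, -1.0]])    # sign pattern assumed
N = 200
u = rng.standard_normal((N, 2))
y = np.zeros((N, 2))
for k in range(1, N):
    y[k] = A0 @ y[k - 1] + B0 @ u[k - 1]

Y = y[2:]                                   # rows y_k^T for k = 2, ..., N-1

# Order n = 1: phi_k = [-y_{k-1}; u_{k-1}], theta = [A_1 B_1]^T
Phi1 = np.hstack([-y[1:-1], u[1:-1]])
theta1, _, rank1, _ = np.linalg.lstsq(Phi1, Y, rcond=1e-8)

# Order n = 2: phi_k = [-y_{k-1}; -y_{k-2}; u_{k-1}; u_{k-2}]
Phi2 = np.hstack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
theta2, _, rank2, _ = np.linalg.lstsq(Phi2, Y, rcond=1e-8)

norm1 = np.linalg.norm(theta1)              # Frobenius norms of the parameter matrices
norm2 = np.linalg.norm(theta2)
```

For n = 2 the regressor matrix loses rank because $y_{k-1}$ is an exact linear combination of $y_{k-2}$ and $u_{k-2}$, so many parameter sets fit the data equally well and the minimum-norm one is smaller than the true first-order parameter set padded with zeros.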
