
1-1

Lecture Note #5 (Chap. 9)

CBE 702, Korea University

Prof. Dae Ryook Yang

System Modeling and Identification

1-2

Chap.9 Model Validation

• Residual: the misfit between data and model
• Residual analysis comprises tests of
  – Independence of residuals
  – Normal distribution of residuals
  – Zero crossings (changes of sign) of the residual sequence
  – Correlations between residuals and input
• Other checking points
  – Does the variance decrease as N increases?
    • If not, it may indicate that the estimate is not statistically efficient.
  – Is the parameter accuracy sufficient for the purpose of the model?
  – Does stochastic simulation verify that the estimated model behaves qualitatively as expected?
  – Does cross validation with data that have not been previously used verify that the estimated model is able to predict the behavior?

1-3

Method Prerequisites

• Experimental condition check
  – Persistent excitation of input
  – Prior knowledge about the presence of feedback
• Signal conditioning
  – Outliers, aliasing, lost signal
  – Interference from discrete-time control
  – Trend
  – Nonzero initial conditions with low damping
• Linearity check
  – Gains for different input amplitudes
  – Symmetry of the responses
  – Coherence spectrum is very valuable

1-4

• Coherence spectrum and test of linearity
  – A measure of dependence between two signals
  – A test of SNR and linearity between input and output
  – The (quadratic) coherence spectrum

    \Gamma_{yu}(i\omega) = \frac{S_{yu}(i\omega)\, S_{uu}^{-1}(i\omega)\, S_{uy}(i\omega)}{S_{yy}(i\omega)}

    where S_{yy}(i\omega) and S_{uu}(i\omega) are autospectra and S_{yu}(i\omega) is the cross-spectrum.
  – For the linear system Y(s) = G_0(s)U(s) + V(s), with the disturbance V uncorrelated with U,

    S_{yy}(i\omega) = G_0(i\omega)\, S_{uu}(i\omega)\, G_0^*(i\omega) + S_{vv}(i\omega), \qquad S_{yu}(i\omega) = G_0(i\omega)\, S_{uu}(i\omega)

    so that

    \Gamma_{yu}(i\omega) = \frac{G_0(i\omega)\, S_{uu}(i\omega)\, G_0^*(i\omega)}{G_0(i\omega)\, S_{uu}(i\omega)\, G_0^*(i\omega) + S_{vv}(i\omega)}

  – When \Gamma_{yu}(i\omega) is not equal to one:
    • Disturbance affecting y
    • Input not represented by u
    • Nonlinearity (no transfer function between u and y)
    • Nonzero initial value with low damping
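As a sketch of the linearity/SNR check above, the coherence spectrum can be estimated by Welch-style averaging (scipy assumed available; the first-order system and noise level here are made up for illustration):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
N = 4096
u = rng.standard_normal(N)                     # persistently exciting input
b, a = [0.5, 0.3], [1.0, -0.6]                 # hypothetical stable first-order system
y = signal.lfilter(b, a, u) + 0.02 * rng.standard_normal(N)  # small disturbance v

# quadratic coherence Gamma_yu(i*omega), estimated by segment averaging
f, coh = signal.coherence(u, y, fs=1.0, nperseg=256)

# high SNR + linear u->y relation: coherence stays near 1 at all frequencies
print(coh.mean() > 0.9, coh.max() <= 1.0)
```

A larger disturbance, or a nonlinearity between u and y, would pull the estimated coherence visibly below one in the affected frequency range.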

1-5

• Example 9.1 Identification of ARMAX model
  – True system S: A(z)Y(z) = B(z)U(z) + C(z)V(z) with

    A(z) = z^3 - 0.986 z^2 + 0.888 z - 0.368
    B(z) = 0.848 z^2 - 0.629 z + 0.313
    C(z) = z^3 - 1.8 z^2 + 0.97 z

  – Modeling results

    Model order | Estimated variance | Akaike FPE
    n = 1       | 2.417              | 2.431
    n = 2       | 1.478              | 1.495
    n = 3       | 1.026              | 1.046
    n = 4       | 1.004              | 1.029
    n = 5       | 1.004              | 1.035
    n = 6       | 1.004              | 1.041

  – Estimated model (order 3):

    \hat{A}(z) = z^3 - 1.064 z^2 + 0.923 z - 0.4
    \hat{B}(z) = 0.8102 z^2 - 0.5814 z + 0.2372
    \hat{C}(z) = z^3 - 1.884 z^2 + 1.127 z - 0.09273

  – Data checks (figure annotations): no outliers, signal loss, aliasing or trend; good coherence; noise range visible; sufficient excitation.

1-6

Model Order Determination

• Model order is usually not known a priori.
  – Try different orders until satisfied.
  – Adding more parameters reduces the loss function.
  – If adding more parameters yields only a small improvement, the extra parameters may be of little value.

• F-test of system order n0
  – Nested models M1 and M2: orders n1 < n2, numbers of parameters p1 < p2, loss functions V1, V2
  – Hypothesis test
    • Null hypothesis H0: M1 is correct, and M1 is a special case of M2 (accept M1)
    • Alt. hypothesis HA: M1 is not sufficient, but M2 includes the true model (reject M1)
  – Cochran theorem of statistics
    • 2(V1 − V2)/σ² is χ²(p2 − p1)-distributed under the model M1
    • 2V2/σ² is χ²(N − p2)-distributed under the model M2
    • V2 and (V1 − V2) are independent under M2
  – The ratio of two independent χ²-distributed variables, each divided by its degrees of freedom, is F-distributed.

1-7

– Test statistic for verification

  \tau_F = \frac{2(V_1 - V_2)/\sigma^2}{2 V_2/\sigma^2} \cdot \frac{N - p_2}{p_2 - p_1} = \frac{(V_1 - V_2)(N - p_2)}{V_2\,(p_2 - p_1)} \in F(N - p_2,\, p_2 - p_1)

– Reject the null hypothesis H0 if \tau_F \ge F_\alpha(N - p_2, p_2 - p_1), where F_\alpha is the \alpha-percentile of the F-distribution.

• Example 9.2 (F-test for model order)
  – ARMAX model of order n with N = 1000 (p = 3n) from Example 9.1
  – n_1 = 1, \ldots, 5 and n_2 = n_1 + 1

    n_1 = 1: \quad \tau_F = \frac{(2417 - 1478)(1000 - 6)}{1478\,(6 - 3)} \approx 210
    n_1 = 2: \quad \tau_F = \frac{(1478 - 1026)(1000 - 9)}{1026\,(9 - 6)} \approx 146

    n1 | n2 | τF  | vs. F0.05 ≈ 8.53 | vs. F0.01 ≈ 26.1
    1  | 2  | 210 | reject (HA)      | reject (HA)
    2  | 3  | 146 | reject (HA)      | reject (HA)
    3  | 4  | 7.2 | accept H0        | accept H0
    4  | 5  | 0   | accept H0        | accept H0
    5  | 6  | 0   | accept H0        | accept H0

    (thresholds F_\alpha(N - p_2, p_2 - p_1) \approx F_\alpha(\infty, 3) since N - p_2 is large)

  – Choose model order 3
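A minimal sketch of this F-test, reproducing the τF values of Example 9.2 (scipy assumed available; the degrees-of-freedom ordering follows the slide's F(N − p2, p2 − p1) notation):

```python
from scipy import stats

def f_test_order(V1, V2, p1, p2, N, alpha=0.05):
    """Compare nested models M1 (p1 params, loss V1) and M2 (p2 params, loss V2).
    Returns (tau_F, reject_H0); reject means M1 is not sufficient."""
    tau = (V1 - V2) * (N - p2) / (V2 * (p2 - p1))
    threshold = stats.f.ppf(1 - alpha, N - p2, p2 - p1)  # F_alpha percentile
    return tau, tau >= threshold

# Example 9.2 losses from the table, N = 1000, p = 3n
tau12, rej12 = f_test_order(2417, 1478, 3, 6, 1000)   # n=1 vs n=2: tau ~ 210, reject
tau34, rej34 = f_test_order(1026, 1004, 9, 12, 1000)  # n=3 vs n=4: tau ~ 7.2, accept
print(tau12, rej12, tau34, rej34)
```

With α = 0.05 the threshold is about 8.53, matching the slide's table, so order 2 is rejected while order 3 is accepted against order 4.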

1-8

• χ²-distribution
  – Distribution of a sum of squares of k independent, standard normal random variables {x_i}, where k is called the number of degrees of freedom:

    \chi^2 = x_1^2 + x_2^2 + \cdots + x_k^2

  – Mean of the χ²-distribution is μ = k
  – Variance is σ² = 2k
  – The probability density function

    f(\chi^2) = \frac{(\chi^2)^{(k/2)-1}\, e^{-\chi^2/2}}{2^{k/2}\, \Gamma(k/2)}, \qquad 0 \le \chi^2 < \infty

• F-distribution
  – Distribution of the ratio of two χ²-distributed variables, each divided by its degrees of freedom:

    F(n_1, n_2) = \frac{\chi^2_{n_1}/n_1}{\chi^2_{n_2}/n_2}

  – Mean of the F-distribution is E{F} = n_2/(n_2 − 2) (n_2 > 2)
  – The probability density function

    f(F) = \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\, \Gamma(n_2/2)} \cdot \frac{(n_1/n_2)^{n_1/2}\, F^{n_1/2 - 1}}{((n_1/n_2) F + 1)^{(n_1 + n_2)/2}}, \qquad 0 < F < \infty
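The stated moments can be checked numerically; this quick Monte Carlo sanity check (numpy assumed) is not part of the original notes:

```python
import numpy as np

rng = np.random.default_rng(1)
k, n2, M = 5, 10, 200_000

x = rng.standard_normal((M, k))
chi2 = (x ** 2).sum(axis=1)        # sum of k squared standard normals
print(chi2.mean(), chi2.var())     # near mu = k and sigma^2 = 2k

# F(k, n2) as a ratio of scaled chi-squared variables
F = (chi2 / k) / (rng.chisquare(n2, size=M) / n2)
print(F.mean())                    # near n2/(n2 - 2) = 1.25
```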

1-9

• The Akaike information criterion (AIC)
  – From a least-squares model of order n with p estimated parameters \hat{\theta}_N, fitted with data from N samples,

    \hat{\sigma}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \varepsilon_i^2(\hat{\theta}_N) = \frac{2}{N} V_N(\hat{\theta}_N), \qquad \hat{\theta}_N \in R^p

  – The objective in identifying the model order is to find an order as small as possible (penalizing high order).
  – AIC:

    AIC(p) = \log \hat{\sigma}_N^2(\hat{\theta}_N) + 2p/N, \qquad \hat{\theta}_N \in R^p

    • AIC is statistically inconsistent
    • AIC gives an overestimated model order
  – Minimum description length (MDL) by Rissanen:

    MDL(p) = \log \hat{\sigma}_N^2(\hat{\theta}_N) + (p/N)(\log N + \log \bar{M})

    where \bar{M} is related to the Fisher information matrix.
    • The MDL is statistically consistent as N → ∞
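Applied to the estimated variances of Example 9.1, the criteria can be evaluated directly (a sketch; the MDL penalty below uses the common (p/N)·log N form, since the slide's extra Fisher-information term is not reproducible here):

```python
import numpy as np

N = 1000
n = np.arange(1, 7)
p = 3 * n                                        # ARMAX order n: p = 3n parameters
var_hat = np.array([2.417, 1.478, 1.026, 1.004, 1.004, 1.004])  # from Ex. 9.1

aic = np.log(var_hat) + 2 * p / N                # AIC(p)
mdl = np.log(var_hat) + p * np.log(N) / N        # simplified MDL penalty
print(n[np.argmin(aic)], n[np.argmin(mdl)])      # order minimizing each criterion
```

For this table both criteria land on order 4, one above the order 3 chosen by the F-test in Example 9.2, consistent with the note that AIC tends to overestimate the order.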

1-10

• The Akaike final prediction error (FPE)
  – An overparametrized model might poorly predict the behavior of a new data set.
  – The expected prediction error of the p-parameter estimate \hat{\theta} based on N data, fitted to the linear model y_k = \varphi_k^T \theta + w_k:

    E\{\varepsilon_k^2(\hat{\theta})\} = E\{(y_k - \varphi_k^T \hat{\theta})^2\} = E\{(\varphi_k^T(\theta - \hat{\theta}) + w_k)^2\}
    = E\{w_k^2\} + \mathrm{tr}\left( E\{\varphi_k \varphi_k^T\}\, E\{\tilde{\theta} \tilde{\theta}^T\} \right) \approx \sigma^2 + \sigma^2 (p/N), \qquad \tilde{\theta} = \theta - \hat{\theta}

  – Final prediction error (FPE):

    FPE(p) = \hat{\sigma}_N^2(\hat{\theta}_N)\, \frac{1 + p/N}{1 - p/N} = \frac{2(N + p)}{N(N - p)}\, V_N(\hat{\theta}_N)

  – Order estimate:

    \hat{p} = \arg\min_p FPE(p)

    • Sometimes FPE tends to underestimate the correct order of a system.
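The FPE column of the table in Example 9.1 can be reproduced from the estimated variances alone (numpy assumed; a sketch, not the original computation):

```python
import numpy as np

N = 1000
n = np.arange(1, 7)
p = 3 * n
var_hat = np.array([2.417, 1.478, 1.026, 1.004, 1.004, 1.004])

fpe = var_hat * (1 + p / N) / (1 - p / N)   # FPE(p) = sigma^2 (1+p/N)/(1-p/N)
print(np.round(fpe, 3))                     # close to the table: 2.431 ... 1.041
print(n[np.argmin(fpe)])                    # FPE's own minimum is at n = 4
```

The values agree with the table to about 0.001 (the table was presumably computed from unrounded variances), and FPE's minimizer sits one order above the F-test's choice of 3.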

1-11

Residual Tests

• The residuals based on the transfer-function model:

    \varepsilon(z) = \hat{H}_w^{-1}(z)\left( Y(z) - \hat{H}_u(z) U(z) \right) \qquad \text{for } Y(z) = \hat{H}_u(z) U(z) + \hat{H}_w(z) W(z)

  – The residuals represent the disturbance or innovation of the mismatch between the data and the model.
  – If the sequence of residuals still exhibits some structure, the modeling is incomplete.
  – If the identification is satisfactory, the residuals should be uncorrelated with any other variable, including inputs and outputs.
• For successful identification
  – {ε_k} constitute a white-noise process with zero mean
  – {ε_k} are normally distributed
  – {ε_k} are symmetrically distributed
  – {ε_k} are independent of previous inputs
  – {ε_k} are independent of all inputs if there is no feedback
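As a sketch of computing these residuals for the ARMAX structure of Example 9.1 (scipy assumed; polynomials written as coefficients in q⁻¹, with B carrying the one-step input delay implied by its lower degree):

```python
import numpy as np
from scipy import signal

# Estimated polynomials from Example 9.1, as coefficients in q^-1
A = [1.0, -1.064, 0.923, -0.4]
B = [0.0, 0.8102, -0.5814, 0.2372]   # leading 0 = one-step input delay
C = [1.0, -1.884, 1.127, -0.09273]

def armax_residuals(y, u):
    # A(q)y = B(q)u + C(q)e  =>  e = (A y - B u) filtered through 1/C
    v = signal.lfilter(A, [1.0], y) - signal.lfilter(B, [1.0], u)
    return signal.lfilter([1.0], C, v)

# sanity check: residuals of data simulated with the same model recover e
rng = np.random.default_rng(3)
u = rng.standard_normal(500)
e = rng.standard_normal(500)
y = signal.lfilter(B, A, u) + signal.lfilter(C, A, e)
print(np.allclose(armax_residuals(y, u), e))
```

The inversion through 1/C is well behaved here because the estimated C polynomial has all roots inside the unit circle.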

1-12

• Autocorrelation test
  – Autocovariance function of the residuals:

    \hat{C}_{\varepsilon\varepsilon}(\tau) = \sum_{k=\tau+1}^{N} \varepsilon_k \varepsilon_{k-\tau} / (N - \tau)

  – Vector of residual autocorrelations for some number m:

    r_{\varepsilon\varepsilon} = [\hat{C}_{\varepsilon\varepsilon}(1) \cdots \hat{C}_{\varepsilon\varepsilon}(m)]^T / \hat{C}_{\varepsilon\varepsilon}(0)

  – The autocorrelation test statistic:

    \tau_{\varepsilon\varepsilon} = N\, r_{\varepsilon\varepsilon}^T r_{\varepsilon\varepsilon} \xrightarrow{dist} \chi^2(m)

• Example 9.3 (Validation of model in Ex. 9.1)
  – m = 50, N = 1000; compare with \chi^2_{0.95}(50) = 67.5 (individual autocorrelations within \pm 1.96/\sqrt{N} for a 95% confidence level)

    n  | τεε
    1  | 802.2
    2  | 70.9
    3  | 40.9
    4  | 45.5
    5  | 44.4
    6  | 46.8
    7  | 42.3
    8  | 41.8
    9  | 36.6
    10 | 44.2

  – Acceptable from n = 3 upward
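A sketch of the autocorrelation test (numpy assumed; the normalization follows the slide's 1/(N − τ) convention):

```python
import numpy as np

def autocorr_test(eps, m=50):
    """tau = N r^T r with r the first m autocorrelations of the residuals;
    approximately chi^2(m)-distributed when {eps_k} is white."""
    N = len(eps)
    c = lambda tau: np.dot(eps[tau:], eps[:N - tau]) / (N - tau)
    r = np.array([c(tau) for tau in range(1, m + 1)]) / c(0)
    return N * np.dot(r, r)

rng = np.random.default_rng(4)
white = rng.standard_normal(1000)
tau_w = autocorr_test(white)          # should fall below chi2_0.95(50) = 67.5
colored = np.empty(1000)              # AR(1) residuals: clearly not white
colored[0] = 0.0
for k in range(1, 1000):
    colored[k] = 0.9 * colored[k - 1] + white[k]
print(tau_w, autocorr_test(colored))  # the colored statistic is far above 67.5
```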

1-13

• Cross-correlation test
  – Cross-covariance function:

    \hat{C}_{\varepsilon u}(\tau) = \sum_{k=\tau+1}^{N} \varepsilon_k u_{k-\tau} / (N - \tau)

  – Vector of residual cross-correlations for time interval m:

    r_{\varepsilon u}(\tau) = [\hat{C}_{\varepsilon u}(\tau+1) \cdots \hat{C}_{\varepsilon u}(\tau+m)]^T / \hat{C}_{\varepsilon\varepsilon}(0)

  – The cross-correlation test statistic:

    \tau_{\varepsilon u} = N\, r_{\varepsilon u}^T \hat{R}_u^{-1}(m)\, r_{\varepsilon u} \xrightarrow{dist} \chi^2(m)

    where \hat{R}_u(m) = \sum_{k=m}^{N} [u_{k-1} \cdots u_{k-m}]^T [u_{k-1} \cdots u_{k-m}] / (N - m)

• Example 9.4
  – m = 50; compare with \chi^2_{0.95}(50) = 67.5

    n  | τεu
    1  | 127.7
    2  | 164.4
    3  | 65.1
    4  | 63.1
    5  | 60.6
    6  | 56.9
    7  | 59.9
    8  | 56.4
    9  | 57.5
    10 | 59.1

  – Acceptable from n = 3 upward
  – Also used for checking whether there is feedback in the system
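A sketch of the cross-correlation test for τ = 0 (numpy/scipy assumed; \hat{R}_u is built here as a Toeplitz matrix from the input autocovariances, an assumed simplification of the slide's outer-product sum):

```python
import numpy as np
from scipy.linalg import toeplitz

def cross_corr_test(eps, u, m=50):
    """tau = N c^T (R_u C_ee(0))^{-1} c, approximately chi^2(m)-distributed
    when the residuals are independent of past inputs."""
    N = len(eps)
    c0 = np.dot(eps, eps) / N                                   # C_ee(0)
    c = np.array([np.dot(eps[t:], u[:N - t]) / (N - t) for t in range(1, m + 1)])
    r_u = np.array([np.dot(u[t:], u[:N - t]) / (N - t) for t in range(m)])
    R_u = toeplitz(r_u)                                         # input covariance
    return N * c @ np.linalg.solve(R_u * c0, c)

rng = np.random.default_rng(5)
u = rng.standard_normal(2000)
eps_ok = rng.standard_normal(2000)            # independent of the input
eps_bad = np.roll(u, 1) * 0.8 + 0.1 * eps_ok  # residuals still contain u
print(cross_corr_test(eps_ok, u), cross_corr_test(eps_bad, u))
```

Independent residuals give a statistic near the χ²(50) mean of 50, while residuals that still contain the input blow far past the 67.5 threshold.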

1-14

• Test of normality
  – For large m, check the probability distribution of the residuals against the normal distribution.
• Kolmogorov–Smirnov test
  – Test for normality:

    \tau_{KS} = \sup_x \left| \hat{F}_\varepsilon(x) - F_\varepsilon(x) \right|

    \hat{F}_\varepsilon(x) = \begin{cases} 0, & x < \varepsilon_{(1)} \\ k/N, & \varepsilon_{(k)} \le x < \varepsilon_{(k+1)}, \quad k = 1, 2, \ldots, N-1 \end{cases}

  – {ε_(k)} is a permutation of the residuals {ε_k} obtained by sorting the components of the residual sequence in ascending order of magnitude.
  – For significance level 0.05: \tau_{KS} \le 1.36/\sqrt{N}
• Zero crossings
  – Number of zero crossings of the residuals:

    \tau_x = \sum_{k=1}^{N-1} x_k, \qquad x_k = \begin{cases} 1, & \text{if } \varepsilon_k \varepsilon_{k+1} < 0 \\ 0, & \text{if } \varepsilon_k \varepsilon_{k+1} \ge 0 \end{cases}

  – For significance level 0.05: N/2 - 1.96\sqrt{N/4} < \tau_x < N/2 + 1.96\sqrt{N/4}
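The zero-crossing test as a sketch (numpy assumed): under white, symmetric residuals the number of sign changes is approximately Binomial(N−1, 1/2), giving the interval above.

```python
import numpy as np

def zero_crossing_test(eps, z=1.96):
    N = len(eps)
    n_cross = int(np.sum(eps[:-1] * eps[1:] < 0))   # count sign changes
    lo = N / 2 - z * np.sqrt(N / 4)                 # 95% acceptance interval
    hi = N / 2 + z * np.sqrt(N / 4)
    return n_cross, lo < n_cross < hi

rng = np.random.default_rng(6)
n_cross, ok = zero_crossing_test(rng.standard_normal(1000))
print(n_cross, ok)   # crossing count near N/2 = 500
```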

1-15

Model and Parameter Accuracy

• Is the chosen model order sufficiently accurate?
• Check the step and impulse responses, Bode diagram, and pole-zero diagram.
• Stochastic simulation
  – Use both the deterministic input and the residuals of the identification as inputs.
  – If it does not reproduce the data, the identification may have failed.
• Deterministic simulation
  – Use only the deterministic input.
  – Compare the output magnitude and delay.
  – If it fails, it indicates that the input magnitude is inadequate or the model is too simple.
• Cross-validation simulation
  – Use input data that were not previously used in the identification.
  – If it fails, it indicates that the system is not time-invariant.
• If the estimated parameter variance increases as the number of data is increased, it indicates that the parameter estimates are statistically inconsistent.

1-16

Classification with the Fisher Linear Discriminant

• For time-varying systems or systems with wide operating conditions, the parameters can be quite different depending on the conditions, and they can be grouped into several classes.
• Classification of a parameter estimate that might belong to one of several classes can be important.
• For two classes A and B
  – Class statistics:

    E\{\theta_i\} = m_i, \qquad E\{(\theta_i - m_i)(\theta_i - m_i)^T\} = R_i \qquad (i = A \text{ or } B)

  – Fisher linear discriminant:

    \lambda = R^{-1}(m_B - m_A), \qquad R = (R_A + R_B)/2

  – Classification rule:

    \hat{\theta} \in \begin{cases} B, & \text{if } \lambda^T \hat{\theta} > \upsilon + \delta \\ ?, & \text{if } \upsilon - \delta \le \lambda^T \hat{\theta} \le \upsilon + \delta \quad \text{(region of uncertainty)} \\ A, & \text{if } \lambda^T \hat{\theta} < \upsilon - \delta \end{cases}

  – Threshold:

    \upsilon = \frac{\lambda^T R_A \lambda \cdot \lambda^T m_A + \lambda^T R_B \lambda \cdot \lambda^T m_B}{\lambda^T R_A \lambda + \lambda^T R_B \lambda}

[Figure: projected class means m_A and m_B on the discriminant direction λ, with threshold υ and a region of uncertainty of half-width δ]
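A sketch of this classification rule with made-up class statistics (numpy assumed; mA, mB, RA, RB are hypothetical, and δ is set to zero so every estimate is classified):

```python
import numpy as np

rng = np.random.default_rng(7)
mA, mB = np.array([0.0, 0.0]), np.array([2.0, 1.0])   # hypothetical class means
RA = RB = 0.2 * np.eye(2)                             # hypothetical covariances

lam = np.linalg.solve((RA + RB) / 2, mB - mA)         # lambda = R^{-1}(mB - mA)
sA, sB = lam @ RA @ lam, lam @ RB @ lam
ups = (sA * (lam @ mA) + sB * (lam @ mB)) / (sA + sB)  # threshold upsilon

theta_B = rng.multivariate_normal(mB, RB, size=500)   # estimates drawn from class B
frac = np.mean(theta_B @ lam > ups)                   # fraction classified as B
print(frac)
```

With these well-separated classes the projected means lie several standard deviations apart, so nearly all class-B estimates fall on the B side of the threshold.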

1-17

The Concept of Identifiability

• Identification fails when
  – Badly designed experiment (excitation, SNR, …)
  – Insufficient model complexity
• A priori identifiability:
  – Given a model structure and an experimental protocol (ideal data), do the data contain enough information to estimate the unknown parameters of the model?
• A posteriori identifiability:
  – Given a model structure and an experimental protocol (data), can the parameters of the model be estimated with acceptable precision?
  – This relates to parameter estimation and tests for goodness of fit.