MAE 546, Lecture 16
Least-Squares Estimation
Robert Stengel
Optimal Control and Estimation, MAE 546, Princeton University, 2013
• Estimating unknown constants from redundant measurements
  – Least-squares
  – Weighted least-squares
• Recursive weighted least-squares estimator
Copyright 2013 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE546.html
http://www.princeton.edu/~stengel/OptConEst.html
Perfect Measurement of a Constant Vector

• Given
  – Measurements, y, of a constant vector, x
• Estimate x
• Assume that the output, y, is a perfect measurement and that H is invertible

y = H x

  – y: (n × 1) output vector
  – H: (n × n) output matrix
  – x: (n × 1) vector to be estimated

• The estimate is based on the inverse transformation:

\hat{x} = H^{-1} y
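A minimal numerical sketch of this case (assuming NumPy; the matrix and vector values are invented for illustration):

```python
import numpy as np

# Made-up square, invertible output matrix and constant vector
H = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x_true = np.array([1.0, -2.0])
y = H @ x_true                  # perfect (noise-free) measurement

# With k = n and H invertible, invert the output relation directly
x_hat = np.linalg.solve(H, y)
print(x_hat)                    # recovers x_true exactly
```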
Imperfect Measurement of a Constant Vector

• Given
  – "Noisy" measurements, z, of a constant vector, x
• Effects of error can be reduced if the measurement is redundant
• Noise-free output, y:

y = H x

• Measurement of output with error, z:

z = y + n = H x + n

  – z: (k × 1) measurement vector
  – n: (k × 1) error vector
  – y: (k × 1) output vector
  – H: (k × n) output matrix, k > n
  – x: (n × 1) vector to be estimated
Cost Function for Least-Squares Estimate

• Measurement-error residual:

\varepsilon = z - H\hat{x} = z - \hat{y}, \qquad \dim(\varepsilon) = (k \times 1)

• Squared measurement error = cost function, J (a quadratic norm):

J = \frac{1}{2}\varepsilon^T\varepsilon = \frac{1}{2}(z - H\hat{x})^T (z - H\hat{x})
  = \frac{1}{2}\left(z^T z - \hat{x}^T H^T z - z^T H\hat{x} + \hat{x}^T H^T H\hat{x}\right)

• What is the control parameter? The estimate of x:

\hat{x}, \qquad \dim(\hat{x}) = (n \times 1)
Static Minimization Provides Least-Squares Estimate

Error cost function:

J = \frac{1}{2}\left(z^T z - \hat{x}^T H^T z - z^T H\hat{x} + \hat{x}^T H^T H\hat{x}\right)

Necessary condition:

\frac{\partial J}{\partial \hat{x}} = 0 = \frac{1}{2}\left[0 - (H^T z)^T - z^T H + (H^T H\hat{x})^T + \hat{x}^T H^T H\right]

which reduces to

\hat{x}^T H^T H = z^T H
Static Minimization Provides Least-Squares Estimate

\hat{x}^T (H^T H)(H^T H)^{-1} = \hat{x}^T = z^T H (H^T H)^{-1} \quad \text{(row)}

or

\hat{x} = (H^T H)^{-1} H^T z \quad \text{(column)}

• The estimate is obtained using the left pseudoinverse matrix
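A minimal sketch of this batch estimate (assuming NumPy; the output matrix, true vector, and noise level are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 20, 2                                # redundant measurements: k > n
H = rng.normal(size=(k, n))                 # made-up (k x n) output matrix
x_true = np.array([1.0, -2.0])
z = H @ x_true + 0.1 * rng.normal(size=k)   # noisy measurement, z = Hx + n

# Left pseudoinverse: x_hat = (H^T H)^{-1} H^T z
x_hat = np.linalg.solve(H.T @ H, H.T @ z)

# np.linalg.lstsq solves the same problem with better numerical conditioning
x_hat_lstsq, *_ = np.linalg.lstsq(H, z, rcond=None)
print(x_hat, x_hat_lstsq)
```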
Example: Average Weight of a Pail of Jelly Beans

• Measurements are equally uncertain
• Express measurements as:

z_i = x + n_i, \quad i = 1 \text{ to } k
z = H x + n

• Output matrix:

H = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T

• Optimal estimate:

\hat{x} = (H^T H)^{-1} H^T z
Average Weight of the Jelly Beans

Optimal estimate:

\hat{x} = \left( \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T \right)^{-1} \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} z_1 & z_2 & \cdots & z_k \end{bmatrix}^T = (k)^{-1}(z_1 + z_2 + \cdots + z_k)

\hat{x} = \frac{1}{k}\sum_{i=1}^{k} z_i \quad [\text{sample mean value}]

• The least-squares estimate is the simple average
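A quick sketch confirming that least squares with a ones-column output matrix returns the sample mean (NumPy assumed; the true weight and noise are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
x_true = 10.0                           # made-up true weight
k = 50
z = x_true + 0.5 * rng.normal(size=k)   # k equally uncertain measurements

H = np.ones((k, 1))                     # output matrix: column of ones
x_hat = np.linalg.solve(H.T @ H, H.T @ z)[0]
print(x_hat, z.mean())                  # identical: LS estimate = sample mean
```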
Least-Squares Applications

• More generally, least-squares estimation is used for:
  – Higher-degree curve-fitting
  – Multivariate estimation
Least-Squares Linear Fit to Noisy Data

• Find the trend line in noisy data:

y = a_0 + a_1 x
z = (a_0 + a_1 x) + n

• Measurement vector:

z_i = (a_0 + a_1 x_i) + n_i

\begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = \begin{bmatrix} a_0 + a_1 x_1 \\ a_0 + a_1 x_2 \\ \vdots \\ a_0 + a_1 x_n \end{bmatrix} + \begin{bmatrix} n_1 \\ n_2 \\ \vdots \\ n_n \end{bmatrix}

z = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix} + n \triangleq H a + n

• Error "cost" function:

J = \frac{1}{2}(z - H\hat{a})^T (z - H\hat{a})

• Least-squares estimate of the trend line:

\hat{a} = \begin{bmatrix} \hat{a}_0 \\ \hat{a}_1 \end{bmatrix} = (H^T H)^{-1} H^T z, \qquad \hat{y} = \hat{a}_0 + \hat{a}_1 x

• The estimate ignores the statistics of the error
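A minimal trend-line fit under these equations (NumPy assumed; the intercept, slope, and noise level are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
a_true = np.array([1.0, 0.5])              # made-up intercept a0 and slope a1
x = np.linspace(0.0, 10.0, 30)
z = a_true[0] + a_true[1] * x + 0.3 * rng.normal(size=x.size)

# H pairs a column of ones (intercept) with the sample points (slope)
H = np.column_stack([np.ones_like(x), x])
a_hat = np.linalg.solve(H.T @ H, H.T @ z)  # (H^T H)^{-1} H^T z
print(a_hat)                               # estimated trend line [a0, a1]
```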
Least-Squares Image Processing

[Figures: original "Lenna" image; degraded image; restored image (adaptively regularized constrained total least squares).]

Chen, Chen, and Zhou, IEEE Trans. Image Proc., 2000
Measurements of Differing Quality

• Suppose some elements of the measurement, z, are more uncertain than others:

z = H x + n

• Give the more uncertain measurements less weight in arriving at the minimum-cost estimate
• Let S = measure of uncertainty; then express the error cost in terms of S^{-1}:

J = \frac{1}{2}\varepsilon^T S^{-1} \varepsilon
Error Cost and Necessary Condition for a Minimum

Error cost function, J:

J = \frac{1}{2}\varepsilon^T S^{-1}\varepsilon = \frac{1}{2}(z - H\hat{x})^T S^{-1} (z - H\hat{x})
  = \frac{1}{2}\left(z^T S^{-1} z - \hat{x}^T H^T S^{-1} z - z^T S^{-1} H\hat{x} + \hat{x}^T H^T S^{-1} H\hat{x}\right)

Necessary condition for a minimum:

\frac{\partial J}{\partial \hat{x}} = 0 = \frac{1}{2}\left[0 - (H^T S^{-1} z)^T - z^T S^{-1} H + (H^T S^{-1} H\hat{x})^T + \hat{x}^T H^T S^{-1} H\right]
Weighted Least-Squares Estimate of a Constant Vector

Necessary condition for a minimum:

\left[\hat{x}^T H^T S^{-1} H - z^T S^{-1} H\right] = 0
\hat{x}^T H^T S^{-1} H = z^T S^{-1} H

The weighted left pseudoinverse provides the solution:

\hat{x} = (H^T S^{-1} H)^{-1} H^T S^{-1} z
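A minimal sketch of the weighted estimate (NumPy assumed; the per-measurement uncertainties and all other values are invented, with S taken as a diagonal covariance):

```python
import numpy as np

rng = np.random.default_rng(3)
k, n = 20, 2
H = rng.normal(size=(k, n))               # made-up (k x n) output matrix
x_true = np.array([1.0, -2.0])
sigma = rng.uniform(0.1, 1.0, size=k)     # differing measurement uncertainties
z = H @ x_true + sigma * rng.normal(size=k)

S_inv = np.diag(1.0 / sigma**2)           # inverse of the uncertainty measure S

# Weighted left pseudoinverse: (H^T S^-1 H)^-1 H^T S^-1 z
x_hat = np.linalg.solve(H.T @ S_inv @ H, H.T @ S_inv @ z)
print(x_hat)
```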
The Return of the Jelly Beans

• Error-weighting matrix:

S^{-1} \triangleq A = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{kk} \end{bmatrix}

• Optimal estimate of average jelly bean weight:

\hat{x} = (H^T S^{-1} H)^{-1} H^T S^{-1} z
 = \left( \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} A \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}^T \right)^{-1} \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} A \begin{bmatrix} z_1 & z_2 & \cdots & z_k \end{bmatrix}^T

\hat{x} = \frac{\sum_{i=1}^{k} a_{ii} z_i}{\sum_{i=1}^{k} a_{ii}} \quad [\text{weighted average}]
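The same result as a short sketch (NumPy assumed; the true weight and uncertainties are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
x_true = 10.0                             # made-up true weight
sigma = rng.uniform(0.2, 1.0, size=50)    # per-measurement uncertainties
z = x_true + sigma * rng.normal(size=sigma.size)

a = 1.0 / sigma**2                        # diagonal elements of A = S^{-1}
x_hat = np.sum(a * z) / np.sum(a)         # weighted average
print(x_hat, z.mean())                    # weighted estimate vs. simple mean
```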
How to Choose the Error Weighting Matrix

a) Normalize the cost function according to the expected measurement error, S_A:

J = \frac{1}{2}\varepsilon^T S_A^{-1}\varepsilon = \frac{1}{2}(z - y)^T S_A^{-1}(z - y) = \frac{1}{2}(z - H\hat{x})^T S_A^{-1}(z - H\hat{x})

b) Normalize the cost function according to the expected measurement residual, S_B:

J = \frac{1}{2}\varepsilon^T S_B^{-1}\varepsilon = \frac{1}{2}(z - H\hat{x})^T S_B^{-1}(z - H\hat{x})
Measurement Error Covariance, S_A

Expected value of the outer product of the measurement error vector:

S_A = E\left[(z - y)(z - y)^T\right] = E\left[(z - Hx)(z - Hx)^T\right] = E\left[n n^T\right] \triangleq R
Measurement Residual Covariance, S_B

Expected value of the outer product of the measurement residual vector, \varepsilon = (z - H\hat{x}):

S_B = E\left[\varepsilon \varepsilon^T\right] = E\left[(z - H\hat{x})(z - H\hat{x})^T\right] = E\left[(H\tilde{x} + n)(H\tilde{x} + n)^T\right], \qquad \tilde{x} \triangleq x - \hat{x}
    = H E\left[\tilde{x}\tilde{x}^T\right] H^T + H E\left[\tilde{x} n^T\right] + E\left[n \tilde{x}^T\right] H^T + E\left[n n^T\right]
    \triangleq H P H^T + H M + M^T H^T + R

where

P = E\left[(x - \hat{x})(x - \hat{x})^T\right], \quad M = E\left[(x - \hat{x}) n^T\right], \quad R = E\left[n n^T\right]

Requires iteration ("adaptation") of the estimate to find S_B.
Recursive Least-Squares Estimation

• The prior unweighted and weighted least-squares estimators use a "batch-processing" approach:
  – All information is gathered prior to processing
  – All information is processed at once
• Recursive approach:
  – An optimal estimate has been made from the prior measurement set
  – A new measurement set is obtained
  – The optimal estimate is improved by an incremental change (or correction) to the prior optimal estimate
Prior Optimal Estimate

Initial measurement set and state estimate, with S = S_A = R:

z_1 = H_1 x + n_1
\hat{x}_1 = (H_1^T R_1^{-1} H_1)^{-1} H_1^T R_1^{-1} z_1

\dim(z_1) = \dim(n_1) = k_1 \times 1; \quad \dim(H_1) = k_1 \times n; \quad \dim(R_1) = k_1 \times k_1

The state estimate minimizes:

J_1 = \frac{1}{2}\varepsilon_1^T R_1^{-1}\varepsilon_1 = \frac{1}{2}(z_1 - H_1\hat{x}_1)^T R_1^{-1}(z_1 - H_1\hat{x}_1)
New Measurement Set

New measurement:

z_2 = H_2 x + n_2

R_2: second measurement error covariance

\dim(z_2) = \dim(n_2) = k_2 \times 1; \quad \dim(H_2) = k_2 \times n; \quad \dim(R_2) = k_2 \times k_2

Concatenation of old and new measurements:

z \triangleq \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
Cost of Estimation Based on Both Measurement Sets

J_2 = \begin{bmatrix} (z_1 - H_1\hat{x}_2)^T & (z_2 - H_2\hat{x}_2)^T \end{bmatrix} \begin{pmatrix} R_1^{-1} & 0 \\ 0 & R_2^{-1} \end{pmatrix} \begin{bmatrix} (z_1 - H_1\hat{x}_2) \\ (z_2 - H_2\hat{x}_2) \end{bmatrix}
 = (z_1 - H_1\hat{x}_2)^T R_1^{-1}(z_1 - H_1\hat{x}_2) + (z_2 - H_2\hat{x}_2)^T R_2^{-1}(z_2 - H_2\hat{x}_2)

• The cost function incorporates the estimate made after incorporating z_2
• Both residuals refer to \hat{x}_2
Optimal Estimate Based on Both Measurement Sets

\hat{x}_2 = \left\{ \begin{bmatrix} H_1^T & H_2^T \end{bmatrix} \begin{bmatrix} R_1^{-1} & 0 \\ 0 & R_2^{-1} \end{bmatrix} \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \right\}^{-1} \left\{ \begin{bmatrix} H_1^T & H_2^T \end{bmatrix} \begin{bmatrix} R_1^{-1} & 0 \\ 0 & R_2^{-1} \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \right\}
 = \left( H_1^T R_1^{-1} H_1 + H_2^T R_2^{-1} H_2 \right)^{-1} \left( H_1^T R_1^{-1} z_1 + H_2^T R_2^{-1} z_2 \right)

Simplification occurs because the weighting matrix is block diagonal.
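A numerical sketch of the combined-set estimate (NumPy assumed; the dimensions, matrices, and noise levels are all made up):

```python
import numpy as np

rng = np.random.default_rng(7)
n, k1, k2 = 2, 4, 3
x = np.array([1.0, -2.0])                      # made-up constant vector
H1, H2 = rng.normal(size=(k1, n)), rng.normal(size=(k2, n))
R1, R2 = 0.04 * np.eye(k1), 0.09 * np.eye(k2)  # measurement error covariances
z1 = H1 @ x + 0.2 * rng.normal(size=k1)
z2 = H2 @ x + 0.3 * rng.normal(size=k2)

# Block-diagonal weighting collapses to a sum of per-set information terms
info = H1.T @ np.linalg.inv(R1) @ H1 + H2.T @ np.linalg.inv(R2) @ H2
x2 = np.linalg.solve(info, H1.T @ np.linalg.inv(R1) @ z1
                         + H2.T @ np.linalg.inv(R2) @ z2)
print(x2)
```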
Apply Matrix Inversion Lemma

Define:

P_1^{-1} \triangleq H_1^T R_1^{-1} H_1

Matrix inversion lemma:

\left( H_1^T R_1^{-1} H_1 + H_2^T R_2^{-1} H_2 \right)^{-1} = \left( P_1^{-1} + H_2^T R_2^{-1} H_2 \right)^{-1} = P_1 - P_1 H_2^T \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 P_1
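A quick numerical check of the lemma (NumPy assumed; the positive-definite matrices are randomly generated, made-up values):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k2 = 3, 2
A = rng.normal(size=(n, n))
P1 = A @ A.T + n * np.eye(n)                   # positive-definite P1
B = rng.normal(size=(k2, k2))
R2 = B @ B.T + k2 * np.eye(k2)                 # positive-definite R2
H2 = rng.normal(size=(k2, n))

lhs = np.linalg.inv(np.linalg.inv(P1) + H2.T @ np.linalg.inv(R2) @ H2)
rhs = P1 - P1 @ H2.T @ np.linalg.inv(H2 @ P1 @ H2.T + R2) @ H2 @ P1
print(np.allclose(lhs, rhs))                   # True: both sides agree
```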
Improved Estimate Incorporating New Measurement Set

The new estimate is a correction to the old. With \hat{x}_1 = P_1 H_1^T R_1^{-1} z_1:

\hat{x}_2 = \hat{x}_1 - P_1 H_2^T \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 \hat{x}_1 + P_1 H_2^T \left[ I - \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 P_1 H_2^T \right] R_2^{-1} z_2
 = \left[ I_n - P_1 H_2^T \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 \right] \hat{x}_1 + P_1 H_2^T \left[ I - \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 P_1 H_2^T \right] R_2^{-1} z_2
Simplify Optimal Estimate Incorporating New Measurement Set

Using I = A^{-1}A = AA^{-1}, with A \triangleq H_2 P_1 H_2^T + R_2, the bracket multiplying z_2 collapses to A^{-1} R_2, so:

\hat{x}_2 = \hat{x}_1 + P_1 H_2^T \left( H_2 P_1 H_2^T + R_2 \right)^{-1} \left( z_2 - H_2 \hat{x}_1 \right) \triangleq \hat{x}_1 + K \left( z_2 - H_2 \hat{x}_1 \right)

K: estimator gain matrix
Recursive Optimal Estimate

• The prior estimate may be based on a prior incremental estimate, and so on
• Generalize to a recursive form, with sequential index i (see the sketch after the equations):

\hat{x}_i = \hat{x}_{i-1} + P_{i-1} H_i^T \left( H_i P_{i-1} H_i^T + R_i \right)^{-1} \left( z_i - H_i \hat{x}_{i-1} \right) \triangleq \hat{x}_{i-1} + K_i \left( z_i - H_i \hat{x}_{i-1} \right)

with

P_i = \left( P_{i-1}^{-1} + H_i^T R_i^{-1} H_i \right)^{-1}

\dim(\hat{x}) = n \times 1; \quad \dim(P) = n \times n
\dim(z) = r \times 1; \quad \dim(R) = r \times r
\dim(H) = r \times n; \quad \dim(K) = n \times r
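A minimal recursive implementation of these update equations (NumPy assumed; the helper name rls_update, the output map, covariances, and the large-P initialization are all illustrative choices, not from the lecture):

```python
import numpy as np

def rls_update(x_hat, P, z_i, H_i, R_i):
    """One recursive least-squares step: correct x_hat with measurement z_i."""
    K = P @ H_i.T @ np.linalg.inv(H_i @ P @ H_i.T + R_i)   # gain K_i
    x_hat = x_hat + K @ (z_i - H_i @ x_hat)                # incremental correction
    P = np.linalg.inv(np.linalg.inv(P) + H_i.T @ np.linalg.inv(R_i) @ H_i)
    return x_hat, P

rng = np.random.default_rng(6)
x_true = np.array([1.0, -2.0])
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # made-up (r x n) output map
R = 0.01 * np.eye(3)                                 # measurement error covariance

x_hat, P = np.zeros(2), 100.0 * np.eye(2)            # large initial uncertainty
for _ in range(20):
    z = H @ x_true + 0.1 * rng.normal(size=3)
    x_hat, P = rls_update(x_hat, P, z, H, R)
print(x_hat)                                         # approaches x_true
```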
Example of Recursive Optimal Estimate

z = x + n; \quad H = 1; \quad R = 1

\hat{x}_i = \hat{x}_{i-1} + p_{i-1}\left( p_{i-1} + 1 \right)^{-1} \left( z_i - \hat{x}_{i-1} \right)

k_i = p_{i-1}\left( p_{i-1} + 1 \right)^{-1} = \frac{p_{i-1}}{p_{i-1} + 1}; \qquad p_i = \left( p_{i-1}^{-1} + 1 \right)^{-1} = \frac{1}{p_{i-1}^{-1} + 1}

\hat{x}_0 = z_0
\hat{x}_1 = 0.5\,\hat{x}_0 + 0.5\,z_1
\hat{x}_2 = 0.667\,\hat{x}_1 + 0.333\,z_2
\hat{x}_3 = 0.75\,\hat{x}_2 + 0.25\,z_3
\hat{x}_4 = 0.8\,\hat{x}_3 + 0.2\,z_4

  i    p_i      k_i
  0    1        -
  1    0.5      0.5
  2    0.333    0.333
  3    0.25     0.25
  4    0.2      0.2
  5    0.167    0.167
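A few lines of plain Python reproduce the table above (only the initial p_0 = 1 is taken from the slide):

```python
# Scalar recursive estimate with H = 1, R = 1: gains and covariances
p = 1.0                                   # initial error covariance, p_0
for i in range(1, 6):
    k = p / (p + 1.0)                     # gain k_i = p_{i-1} / (p_{i-1} + 1)
    p = 1.0 / (1.0 / p + 1.0)             # covariance p_i = (p_{i-1}^{-1} + 1)^{-1}
    print(i, round(p, 3), round(k, 3))    # 1 0.5 0.5, 2 0.333 0.333, ...
```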
Optimal Gain and Estimate-Error Covariance

P_i = \left( P_{i-1}^{-1} + H_i^T R_i^{-1} H_i \right)^{-1}
K_i = P_{i-1} H_i^T \left( H_i P_{i-1} H_i^T + R_i \right)^{-1}

• With constant measurement-error covariance, R:
  – The error covariance decreases at each step
  – The estimator gain matrix, K, invariably goes to zero as the number of samples increases
• Why? Each new sample has a smaller effect on the average than the sample before

Next Time: Propagation of Uncertainty in Dynamic Systems
Supplemental Material
Weighted Least-Squares ("Kriging") Estimates (Interpolation)

[Figures: Delaware sampling sites; Delaware average concentrations; DE-NJ-PA PM2.5 estimates.]

Delaware Dept. of Natural Resources and Environmental Control, 2008
http://en.wikipedia.org/wiki/Kriging

Weighted Least-Squares Estimates of Particulate Concentration (PM2.5)

• Can be used with arbitrary interpolating functions