Topic 8-Mean Square Estimation-Wiener and Kalman Filtering
TRANSCRIPT
-
Topic 8-Mean Square Estimation: Wiener and Kalman Filtering
Papoulis, Chapter 13 (2 weeks)
Minimum Mean-Square Estimation (MMSE)
Optimum Linear MSE estimation ---the orthogonality principle
Wiener Filtering
Kalman Filtering
Adaptive Filtering (not in Papoulis)
-
Review: Parameter Estimation (Predicting the Value of Y) [Review from Topic 4]
Suppose Y is a RV with known PMF or PDF.
Problem: Predict (or estimate) what value of Y will be observed on the next trial.
Questions:
What value should we predict?
What is a good prediction? We need to specify some criterion that determines what is a good/reasonable estimate.
Note that for continuous random variables, it doesn't make sense to predict the exact value of Y, since any particular value occurs with zero probability.
A common estimate is the mean-square estimate.
-
The Mean Square Error (MSE) Estimate
We will let \hat{Y} denote the mean-square estimate of the observable random variable Y, with E[Y] = \eta. The mean-squared error (MSE) is defined as

e = E[(Y - \hat{Y})^2]   (8-1)

We proceed by completing the square:

E[(Y - \hat{Y})^2] = E[(Y - \eta + \eta - \hat{Y})^2]
= E[(Y - \eta)^2 + 2(Y - \eta)(\eta - \hat{Y}) + (\eta - \hat{Y})^2]
= var(Y) + 2(\eta - \hat{Y})\,E[Y - \eta] + (\eta - \hat{Y})^2
= var(Y) + (\eta - \hat{Y})^2 > var(Y) \quad \text{if } \hat{Y} \neq \eta   (8-2)

Clearly the MSE is minimized when

\hat{Y} = \eta   (8-3)

\hat{Y} = \eta is called the minimum- (or least-) mean-square error (MMSE or LMSE) estimate.
The minimum mean-square error is var(Y).
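As a quick numerical check of (8-2) and (8-3), here is a minimal MATLAB sketch (the distribution and the grid of candidate constants are illustrative choices, not from the slides):

% Monte Carlo check that the constant c = E[Y] minimizes E[(Y - c)^2]
N = 1e6;
Y = 3 + 2*randn(N,1);              % example RV: eta = 3, var(Y) = 4
c = linspace(1,5,101);             % candidate constant estimates
mse = zeros(size(c));
for i = 1:length(c)
    mse(i) = mean((Y - c(i)).^2);  % empirical MSE for each candidate
end
[emin,i] = min(mse);
fprintf('best c = %.3f (mean(Y) = %.3f)\n', c(i), mean(Y));
fprintf('min MSE = %.3f (var(Y) = %.3f)\n', emin, var(Y));

The minimizing constant lands on the sample mean, and the minimum MSE matches var(Y), as (8-3) predicts.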
-
The MSE of a RV Based Upon Observing Another RV
Let X and Y denote random variables with known joint distribution.
Suppose that we observe the value of X (that is, X is the observed signal/data). How can we find the MMSE estimate of Y, denoted by \hat{Y}, that is a function of the observed data X?
Can the MMSE estimate \hat{Y}, which is a function of X, do better than ignoring X and estimating the value of Y as \hat{Y} = \eta_Y = E[Y]? Yes! Denoting the MMSE estimate \hat{Y} by c(X), the MSE is given by

e = E_{XY}\{[Y - c(X)]^2\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [y - c(x)]^2\, f_{X,Y}(x,y)\,dx\,dy
  = \int_{-\infty}^{\infty} f_X(x) \int_{-\infty}^{\infty} [y - c(x)]^2\, f_{Y|X}(y|x)\,dy\,dx   (8-4)

Note that the above integrals are positive, so that e will be minimized if the inner integral is a minimum for all values of x.
Note that for a fixed value of x, c(x) is a variable [not a function].
-
The MSE of a RV Based Upon Observing Another RV-2
Since for a fixed value of x, c(x) is a variable [not a function], we can minimize the MSE by setting the derivative of the inner integral, with respect to c, to zero:

\frac{d}{dc}\int_{-\infty}^{\infty} [y - c(x)]^2\, f_{Y|X}(y|x)\,dy = -2\int_{-\infty}^{\infty} (y - c)\, f_{Y|X}(y|x)\,dy = 0   (8-5)

Solving for c after noting that

\int_{-\infty}^{\infty} c(x)\, f_{Y|X}(y|x)\,dy = c(x)\int_{-\infty}^{\infty} f_{Y|X}(y|x)\,dy = c(x), \quad \text{where the integral is one,}

gives

\hat{Y} = c(X) = \int_{-\infty}^{\infty} y\, f_{Y|X}(y|x)\,dy = E[Y\,|\,X]   (8-6)

Thus the MMSE estimate \hat{Y} is the conditional mean of Y given the observation (or data) X.
The MMSE estimate is, in general, a nonlinear function of X.
-
MMSE Example
[Figure: the random point (X, Y) uniform on the upper unit semicircle; at X = x the conditional range of Y has height (1 - x^2)^{1/2}]
Let the random point (X, Y) be uniformly distributed on a semicircle.
The joint PDF has value 2/\pi on the semicircle.
The conditional PDF of Y given that X = x is a uniform density on [0, (1 - x^2)^{1/2}].
So \hat{Y} = E[Y | X = x] = (1/2)(1 - x^2)^{1/2}, and this estimate achieves the least possible MSE of var(Y | X = x) = (1 - x^2)/12.
Intuitively reasonable since:
If |x| is nearly 1, the MSE is small (since the range of Y is small).
If |x| is nearly 0, the MSE is large (since the range of Y is large).
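A minimal MATLAB sketch of this example (the sample size, slice point, and slice width are arbitrary choices): rejection-sample points uniform on the upper half-disk and compare the empirical conditional mean with (1/2)(1 - x^2)^{1/2}.

% Uniform points on the unit upper half-disk via rejection sampling
N = 2e6;
X = 2*rand(N,1) - 1;  Y = rand(N,1);   % uniform on the box [-1,1] x [0,1]
keep = (X.^2 + Y.^2) <= 1;             % keep points inside the semicircle
X = X(keep);  Y = Y(keep);
x0 = 0.5;  sel = abs(X - x0) < 0.01;   % thin slice near X = x0
fprintf('empirical E[Y | X ~ %.1f] = %.4f\n', x0, mean(Y(sel)));
fprintf('theory (1/2)sqrt(1 - x0^2) = %.4f\n', 0.5*sqrt(1 - x0^2));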
-
The Regression Curve of Y on X
[Figure: the semicircle with the half-ellipse \hat{y} = (1/2)(1 - x^2)^{1/2} plotted as the lower curve]
\hat{Y} = E[Y | X = x] as a function of x is a curve called the regression curve of Y on X (plotted as the lower curve above).
The graph of (1/2)(1 - x^2)^{1/2} is a half-ellipse.
Given the X value, the MMSE estimate of Y can be read off from the regression curve.
-
Example: As an example, suppose Y = X^3 is the unknown. Then the best MMSE estimator is given by

\hat{Y} = E\{Y | X\} = E\{X^3 | X\} = X^3   (8-7)

Clearly, in this case Y = X^3 is the best estimator for Y. Thus the best estimator can be nonlinear.
-
Example: Let

f_{X,Y}(x, y) = k\,y, \quad 0 < x < y < 1,

where k > 0 is a suitable normalization constant. To determine the best estimate for Y in terms of X, we need the conditional density, which works out to

f_{Y|X}(y\,|\,x) = \frac{2y}{1 - x^2}, \quad x < y < 1.

So, the best MMSE estimator is given by
-
\hat{Y}(x) = E\{Y\,|\,X = x\} = \int_x^1 y\, f_{Y|X}(y\,|\,x)\,dy = \int_x^1 \frac{2y^2}{1 - x^2}\,dy = \frac{2}{3}\,\frac{1 - x^3}{1 - x^2} = \frac{2}{3}\,\frac{x^2 + x + 1}{x + 1}

so that

\hat{Y} = E\{Y\,|\,X\} = \frac{2}{3}\,\frac{X^2 + X + 1}{X + 1}   (8-8)

Once again the best estimator is nonlinear. In general the best estimator is difficult to evaluate, and hence next we will examine the special subclass of best linear estimators.
-
Linear MMSE Estimation I
Suppose that we wish to estimate Y using a linear function of the observation X.
The linear MMSE estimate of Y is \hat{Y} = aX + b, where a and b are chosen to minimize the mean-square error E[(Y - aX - b)^2].
Let Z = Y - \hat{Y} = Y - aX - b be the error; then we will show that the minimum occurs when

a = \rho_{XY}\,\frac{\sigma_Y}{\sigma_X}, \quad b = \eta_Y - a\,\eta_X   (8-9)

and the minimum MSE (with a linear estimate) is

e_{min} = \sigma_Y^2\,(1 - \rho_{XY}^2)   (8-10)
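A hedged MATLAB sketch of (8-9) and (8-10) (the joint model for X and Y below is invented purely for illustration): compute a and b from sample moments and compare the achieved MSE with \sigma_Y^2 (1 - \rho_{XY}^2).

N = 1e6;
X = randn(N,1);
Y = 2*X + 1 + 0.5*randn(N,1);     % example linear-plus-noise model
C = cov(X,Y);                     % 2x2 sample covariance matrix
a = C(1,2)/C(1,1);                % a = rho_XY*sigma_Y/sigma_X = cov(X,Y)/var(X)
b = mean(Y) - a*mean(X);          % b = eta_Y - a*eta_X   (8-9)
e = mean((Y - a*X - b).^2);       % achieved mean-square error
rho2 = C(1,2)^2/(C(1,1)*C(2,2));  % squared correlation coefficient
fprintf('e = %.4f, sigma_Y^2*(1 - rho^2) = %.4f\n', e, C(2,2)*(1 - rho2));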
-
Optimum Linear MSE Estimate: Proof
Suppose that a is fixed; then the problem is to estimate the quantity Y - aX by the constant b.
But we know from the previous example [see (8-3)] that, under those circumstances,

b = E[Y - aX] = \eta_Y - a\,\eta_X   (8-11)

With b determined as above, the mean-square error becomes

E[(Y - aX - b)^2] = E\{[(Y - \eta_Y) - a(X - \eta_X)]^2\} = \sigma_Y^2 - 2a\,\rho_{XY}\,\sigma_X\,\sigma_Y + a^2\,\sigma_X^2   (8-12)

Minimization of (8-12) is accomplished by simply differentiating this expression with respect to a, giving

a = \rho_{XY}\,\sigma_Y/\sigma_X   (8-13)

Substituting these values of a and b into the MSE gives

e_{min} = \sigma_Y^2\,(1 - \rho_{XY}^2)   (8-14)
-
Linear MMSE Estimation: The Orthogonality Principle
As before, let Z = Y - aX - b be the estimation error; then the MSE is

e = E[(Y - aX - b)^2] = E[Z^2]   (8-15)

Setting the derivative of the MSE with respect to a to zero gives

\frac{\partial e}{\partial a} = E[2Z(-X)] = -2\,E[(Y - \hat{Y})X] = 0   (8-16)

This says that the estimation error Z = Y - \hat{Y} is orthogonal to (that is, uncorrelated with) the received data X.
This is referred to as the orthogonality principle of linear estimation.
The estimation error is uncorrelated with the observed data X; intuitively, the estimate has extracted all the correlated information from the data.
-
Orthogonality Condition: A Geometric View (from D. Snider text, Section 6.3)
Recall the properties of the dot product in 3-dimensional vector analysis:

\vec{v}\cdot\vec{u} = |\vec{v}|\,|\vec{u}|\cos\theta, \quad \vec{v}\cdot\vec{v} = |\vec{v}|^2, \quad \vec{u}\cdot\vec{u} = |\vec{u}|^2   (8-16b)

We can use the dot product to express the orthogonal projection of one vector onto another, as in the figure below.
[Figure: the projection \vec{v}_{proj} of a vector \vec{v} onto a vector \vec{u}]
-
The length of \vec{v}_{proj} is |\vec{v}|\cos\theta; its direction is that of the unit vector \vec{u}/|\vec{u}|; thus

\vec{v}_{proj} = |\vec{v}|\cos\theta\,\frac{\vec{u}}{|\vec{u}|} = \frac{\vec{v}\cdot\vec{u}}{\vec{u}\cdot\vec{u}}\,\vec{u}   (8-16c)

Now compare these identities with the expressions for the second moments for zero-mean random variables:

E\{XY\} = \sigma_X\,\rho\,\sigma_Y, \quad E\{X^2\} = \sigma_X^2, \quad E\{Y^2\} = \sigma_Y^2   (8-16d)

The dot products are perfectly analogous to the second moments if we regard \sigma_X and \sigma_Y as the "lengths" of X and Y, and the correlation coefficient \rho as the cosine of the "angle between X and Y." After all, \rho lies between -1 and +1, just like the cosine. In this vein, we say two random variables X and Y are "orthogonal" if E\{XY\} = 0 (so the angle is 90 degrees). Note that this nomenclature is only consistent with these analogies when the variables have mean zero.
The vector analogy is useful in remembering the least-mean-squared-error formula. Furthermore, note that \vec{v} - \vec{v}_{proj} is orthogonal to \vec{u} in the figure. By analogy, it is reasonable that the prediction error is orthogonal (in the statistical sense) to the LMSE predictor.
-
Gaussian MMSE = Linear MMSE [From Topic 4]
In general, the linear MMSE estimate has a higher MSE than the (usually nonlinear) MMSE estimate E[Y|X].
If X and Y are jointly Gaussian RVs, it can be shown that the conditional PDF of Y given X = x is a Gaussian PDF with mean

\eta_Y + (\rho\,\sigma_Y/\sigma_X)(x - \eta_X)   (8-17)

and variance

\sigma_Y^2\,(1 - \rho^2)   (8-18)

Hence

E[Y\,|\,X = x] = \eta_Y + (\rho\,\sigma_Y/\sigma_X)(x - \eta_X)   (8-19)

which is the same as the linear MMSE.
For jointly Gaussian RVs, MMSE estimate = linear MMSE estimate.
Another special property of the Gaussian RV.
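A small MATLAB check of this fact (the correlation value and the conditioning slice are illustrative): for zero-mean, unit-variance jointly Gaussian X and Y, (8-19) reduces to E[Y | X = x] = \rho x, and the binned conditional mean should track it.

N = 2e6;  rho = 0.8;
X = randn(N,1);
Y = rho*X + sqrt(1 - rho^2)*randn(N,1);  % jointly Gaussian, correlation rho
x0 = 1.0;  sel = abs(X - x0) < 0.02;     % condition on X near x0
fprintf('empirical E[Y | X ~ %.1f] = %.3f\n', x0, mean(Y(sel)));
fprintf('linear rule rho*x0 = %.3f\n', rho*x0);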
-
Minimum Mean-Square Error (MMSE) Linear Estimate of a Random Process
From Topic 4, we know that the optimum mean-square (generally nonlinear) estimate of a random process S(t) is the conditional mean:

\hat{S}(t) = E[S(t)\,|\,X(\alpha),\ a \le \alpha \le b], \quad a \le t \le b   (8-20)

A linear estimate takes the form

\hat{S}(t) = \int_a^b h(\alpha)\,X(\alpha)\,d\alpha, \quad a \le t \le b   (8-21)

The objective is to find h(\alpha) so as to minimize the MS error

E\{[S(t) - \hat{S}(t)]^2\} = E\{[S(t) - \int_a^b h(\alpha)\,X(\alpha)\,d\alpha]^2\}   (8-22)
-
Minimum Mean-Square Error (MMSE) Linear Estimate-2
From the orthogonality condition, the mean-square error will be a minimum if the observed data is orthogonal to the estimation error over the observation interval:

E_{S,X}\{[S(t) - \int_a^b h(\alpha)\,X(\alpha)\,d\alpha]\,X(\beta)\} = 0, \quad a \le \beta \le b   (8-23)

so that the optimal estimator h(\alpha) can be found as the solution of the integral equation

R_{SX}(t, \beta) = \int_a^b h(\alpha)\,R_{XX}(\alpha, \beta)\,d\alpha, \quad a \le \beta \le b   (8-24)

In general the above equation can only be solved numerically.
-
Examples of Linear MMSE (assume all random processes are WSS)
Prediction: We want to estimate the future value S(t + \lambda) based on the present value S(t). The optimum linear estimate is given by

\hat{S}(t + \lambda) = E[S(t + \lambda)\,|\,S(t)] = a\,S(t)   (8-25a)

The optimum linear estimate satisfies the orthogonality condition

E\{[S(t + \lambda) - a\,S(t)]\,S(t)\} = 0   (8-25b)

and we can solve for a as

a = R_S(\lambda)/R_S(0)   (8-26)
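For a concrete illustration (a discrete-time stand-in, not from the slides), take a first-order autoregressive process, whose autocorrelation is R_S[m] = \rho^{|m|}; the one-step predictor coefficient is then a = R_S(1)/R_S(0) = \rho, and the prediction MSE should be R_S(0) - a\,R_S(1) = 1 - \rho^2.

rho = 0.9;  N = 1e5;
S = filter(1, [1 -rho], sqrt(1 - rho^2)*randn(N,1));  % unit-variance AR(1)
a = rho;                          % a = R_S(1)/R_S(0) for this model
e = S(2:end) - a*S(1:end-1);      % one-step prediction error
fprintf('prediction MSE = %.4f, theory 1 - rho^2 = %.4f\n', mean(e.^2), 1 - rho^2);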
-
Examples of Linear MMSE
Filtering: We want to estimate the present value of S(t) based on the present value of another process X(t). The optimum linear estimate is given by

\hat{S}(t) = E[S(t)\,|\,X(t)] = a\,X(t)   (8-27)

The optimum linear estimate satisfies the orthogonality condition

E\{[S(t) - a\,X(t)]\,X(t)\} = 0   (8-28)

and we can solve for a as

a = R_{SX}(0)/R_{XX}(0)   (8-29a)

Applying (8-14), we see that the minimum MSE (MMSE) is

e_{min} = \sigma_S^2\,(1 - \rho_{SX}^2) = R_{SS}(0) - a\,R_{SX}(0)   (8-29b)
-
Examples of Linear MMSE
Interpolation: We want to estimate the value of a process S(t) at a point t + \lambda in the interval (t, t + T), based on 2N+1 samples S(t + kT) that are within the time interval (see Fig. 13-1 in the text).
The optimum linear (interpolation) estimate is

\hat{S}(t + \lambda) = \sum_{k=-N}^{N} a_k\,S(t + kT), \quad 0 \le \lambda \le T   (8-30)

The optimum linear estimate satisfies the orthogonality condition

E\{[S(t + \lambda) - \sum_{k=-N}^{N} a_k\,S(t + kT)]\,S(t + nT)\} = 0, \quad |n| \le N, \quad 0 \le \lambda \le T   (8-31)
-
From which it follows that

\sum_{k=-N}^{N} a_k\,R_S(kT - nT) = R_S(\lambda - nT), \quad -N \le n \le N, \quad 0 \le \lambda \le T   (8-32)

This is a system of 2N+1 linear equations that can be solved to yield the 2N+1 unknowns a_k, as in the sketch below.
-
Examples of Linear MMSE
Smoothing: We want to estimate the present value of S(t) based on the values of another process X(t), which is the sum of the signal S(t) and a noise signal \nu(t):

X(t) = S(t) + \nu(t)

The optimal estimate can be written as the conditional mean

\hat{S}(t) = E[S(t)\,|\,X(\xi),\ -\infty < \xi < \infty]

and the linear estimate is

\hat{S}(t) = \int_{-\infty}^{\infty} h(\alpha)\,X(t - \alpha)\,d\alpha

Note that the estimate \hat{S}(t) is the output of a linear filter with impulse response h(\alpha) and with input X(t).
The orthogonality condition gives

E\{[S(t) - \hat{S}(t)]\,X(t - \tau)\} = 0, \quad -\infty < \tau < \infty
-
The previous equation is equivalent to

E\{[S(t) - \int_{-\infty}^{\infty} h(\alpha)\,X(t - \alpha)\,d\alpha]\,X(t - \tau)\} = 0, \quad -\infty < \tau < \infty   (8-37)

which becomes

R_{SX}(\tau) = \int_{-\infty}^{\infty} h(\alpha)\,R_{XX}(\tau - \alpha)\,d\alpha \quad \text{for all } \tau   (8-38)

To determine h(t) we need to solve the above integral equation, which is easy to do since it is a convolution of h(\tau) with R_{XX}(\tau) that holds for all values of \tau.
Taking Fourier transforms of both sides, we obtain S_{SX}(\omega) = H(\omega)\,S_{XX}(\omega), or

H(\omega) = \frac{S_{SX}(\omega)}{S_{XX}(\omega)}   (8-39)

which is known as the non-causal Wiener filter.
Why is this a non-causal solution?
-
[Figure: block diagram showing that the estimate \hat{S}(t) is the output of a linear filter H(\omega) with input X(t)]
With the signal and noise independent,

S_{SX}(\omega) = S_{SS}(\omega) \quad \text{and} \quad S_{XX}(\omega) = S_{SS}(\omega) + S_{\nu\nu}(\omega)   (8-40)

and (8-39) simplifies to

H(\omega) = \frac{S_{SS}(\omega)}{S_{SS}(\omega) + S_{\nu\nu}(\omega)}   (8-41)

Is this an intuitively reasonable solution? What happens when the noise gets very small?
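A minimal sketch of (8-41) with made-up spectra (both choices are illustrative, not from the slides): H(\omega) approaches 1 where the signal spectrum dominates and 0 where the noise dominates, and as the noise vanishes H approaches 1 across the band.

w = linspace(-10,10,1001);     % frequency grid (rad/s)
Sss = 4 ./ (1 + w.^2);         % example lowpass signal spectrum
Snn = 0.25*ones(size(w));      % example flat (white) noise spectrum
H = Sss ./ (Sss + Snn);        % non-causal Wiener filter (8-41)
plot(w, H), xlabel('\omega'), ylabel('H(\omega)')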
-
If the spectra S_{SS}(\omega) and S_{\nu\nu}(\omega) do not overlap, then H(\omega) = 1 in the band of the signal and H(\omega) = 0 in the band of the noise, and the MMSE is zero.
This can be seen by extending (8-14):

e_{min} = \sigma_S^2\,(1 - \rho_{SX}^2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} [S_{SS}(\omega) - H^*(\omega)\,S_{SX}(\omega)]\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{S_{SS}(\omega)\,S_{\nu\nu}(\omega)}{S_{SS}(\omega) + S_{\nu\nu}(\omega)}\,d\omega   (8-42)
-
Nonlinear Orthogonality Rule
Interestingly, a general form of the orthogonality principle also holds in the case of nonlinear estimators.
Nonlinear Orthogonality Rule: Let h(X) represent any functional form of the data X, and E\{Y\,|\,X\} the best estimator for Y given X. With e = Y - E\{Y\,|\,X\}, we shall show that

E\{e\,h(X)\} = 0   (8-43)

implying that the error e = Y - E\{Y\,|\,X\} is orthogonal to h(X).
This follows since

E\{e\,h(X)\} = E_X\{(Y - E_{Y|X}[Y\,|\,X])\,h(X)\}
= E_X\{Y\,h(X)\} - E_X\{E_{Y|X}[Y\,|\,X]\,h(X)\}
= E_X\{Y\,h(X)\} - E_X\{E_{Y|X}[Y\,h(X)\,|\,X]\}
= E\{Y\,h(X)\} - E\{Y\,h(X)\} = 0   (8-44)
-
Discrete Time Processes
The non-causal estimate \hat{S}[n] of a discrete-time process in terms of the observed data

X[n] = S[n] + \nu[n]   (8-45)

is

\hat{S}[n] = \sum_{k=-\infty}^{\infty} h[k]\,X[n - k]   (8-46)

which is the output of a linear time-invariant, non-causal system with input X[n] and impulse response h[n]. By the orthogonality principle we have

E\{(S[n] - \sum_{k=-\infty}^{\infty} h[k]\,X[n - k])\,X[n - m]\} = 0, \quad \text{for all } m   (8-47)

so that

R_{SX}[m] = \sum_{k=-\infty}^{\infty} h[k]\,R_{XX}[m - k], \quad \text{for all } m   (8-48)

Taking the z-transform of both sides of (8-48) gives

H(z) = \frac{S_{SX}(z)}{S_{XX}(z)}   (8-49)
-
The Wiener-Hopf equations cannot be directly solved with z-transforms, since the two sides are not equal for every value of m.
There is a (mathematically) fairly complicated spectral theory, described in the text, that factors the transfer function of the impulse response h[n] into causal and anti-causal sequences. We will not discuss this approach.
Instead we will consider the more practical case of a predictor that uses a finite number of past samples.
The solutions involve (straightforward) matrix inversion operations.
-
Causal Prediction Using L Past Samples
Consider the estimation of a process S[n] in terms of its past L samples S[n-k], k >= 1:

\hat{S}[n] = E[S[n]\,|\,S[n - k],\ 1 \le k \le L] = \sum_{k=1}^{L} h[k]\,S[n - k]   (8-52)

The objective is to find the L filter constants h[k] so as to minimize the MSE. From the orthogonality principle, the error S[n] - \hat{S}[n] must be orthogonal to the data S[n - m], giving

E\{(S[n] - \sum_{k=1}^{L} h[k]\,S[n - k])\,S[n - m]\} = 0, \quad 1 \le m \le L   (8-53)

which gives the Wiener-Hopf (discrete) equation

R_S[m] = \sum_{k=1}^{L} h[k]\,R_S[m - k], \quad 1 \le m \le L   (8-54)

Equation (8-54) is a system of L equations expressing the unknowns h[k] in terms of the autocorrelation R_S[m].
-
By rewriting (8-54) as

\sum_{k=1}^{L} R_S[m - k]\,h[k] = R_S[m], \quad 1 \le m \le L   [8-55]

we recognize that [8-55] can be written as a matrix-vector equation

R\,h = r   [8-56]

where the matrix R is an L x L Toeplitz matrix with km-th element equal to R_{m-k}, h is an L x 1 column vector with k-th element equal to h[k], and r is an L x 1 column vector with k-th element equal to R_k.
A Toeplitz matrix is a matrix in which each descending diagonal from left to right is constant. For example, if the Toeplitz matrix A has ij-th element A_{i,j}, then A_{i+1,j+1} = A_{i,j}.
There are many computationally efficient algorithms for inverting a Toeplitz matrix.
-
It is easy to see that the matrix R is Toeplitz by displaying the elements of the matrix-vector equation:

\begin{bmatrix} R_0 & R_1 & R_2 & \cdots & R_{L-1} \\ R_1 & R_0 & R_1 & \cdots & R_{L-2} \\ R_2 & R_1 & R_0 & \cdots & R_{L-3} \\ \vdots & & & & \vdots \\ R_{L-1} & R_{L-2} & R_{L-3} & \cdots & R_0 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \\ h_3 \\ \vdots \\ h_L \end{bmatrix} = \begin{bmatrix} R_1 \\ R_2 \\ R_3 \\ \vdots \\ R_L \end{bmatrix}   [8-57]

which can be solved by standard matrix inversion techniques to give

h = R^{-1}\,r   [8-58]
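A minimal MATLAB sketch of [8-57]/[8-58] (the autocorrelation sequence R_m = 0.8^m is an assumed model, not from the slides): build the Toeplitz system with the built-in toeplitz function and solve for h with the backslash operator.

L = 4;
R = 0.8.^(0:L);            % assumed autocorrelation values R_0 ... R_L
Rmat = toeplitz(R(1:L));   % L x L Toeplitz matrix of R_{m-k}
r = R(2:L+1)';             % right-hand side [R_1 ... R_L]'
h = Rmat \ r;              % h = R^{-1} r   [8-58]
disp(h')                   % for this model only h(1) = 0.8 is nonzero

The result makes sense: for an exponential (first-order Markov) correlation, the most recent sample carries all the usable information, so the remaining taps come out (numerically) zero.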
-
Least Mean Square Filtering: Generic Problem
[Figure: linear filter H(z) with input x(n) and filter output y(n); the output is subtracted from the desired response d(n) to form the estimation error e(n)]
The basic concept behind Wiener filter theory is to minimize the difference between the filter output, y(n), and some desired output, d(n). Noise could be present in the filter output. This minimization either performs a matrix inversion such as in [8-58] to find the Wiener filter when the model is known, or, when the model has some unknown parameters, uses the least mean square (LMS) approach, which adaptively adjusts the filter coefficients to reduce the square of the difference between the desired and actual waveform after filtering. As before, we will assume that H(z) is a feedforward finite impulse response (FIR) filter with coefficients h(k) = h_k, k = 1, 2, ..., L.
The system is described by the following equation:

e(n) = d(n) - y(n) = d(n) - \sum_{k=1}^{L} h(k)\,x(n - k)   [8-59]
-
Least Mean Square Filtering --- System Identification
[Figure: the linear filter H(z) placed in parallel with the unknown system, both driven by the input x(n); the difference between the unknown system's output and the estimated output y(n) forms the estimation error e(n)]
The LMS approach has a number of other applications in addition to standard filtering, including system identification, interference canceling, and inverse modeling or de-convolution. For system identification, the filter is placed in parallel with the unknown system and the parameters can be adapted (i.e., changed) to minimize the estimation error.
The desired output is the output of the unknown system, and the filter coefficients are either (1) computed (Wiener) or (2) adjusted (LMS adapted) so that the filter output best matches that of the unknown system in the MMSE sense.
-
Wiener Filter ~ MATLAB Implementation
If the system parameters are known, the Wiener-Hopf equation can be solved using MATLAB's matrix inversion operator (\) as shown in the following example.
The MATLAB toeplitz function is useful in setting up the correlation matrix. The function call is:

Rxx = toeplitz(rxx);

where rxx is the input row vector. This constructs a symmetrical matrix from a single row vector and can be used to generate the correlation matrix in the Wiener-Hopf equation from the autocorrelation function rxx.
-
Example (continued): The solution uses the routine wiener_hopf to calculate the optimum filter coefficients.
This program computes the correlation matrix from the autocorrelation function and the toeplitz routine, and also computes the cross-correlation function.

function b = wiener_hopf(x,y,maxlags)
% Function to compute the optimum FIR filter using the Wiener-Hopf equations
% Inputs:  x = input
%          y = desired signal
%          maxlags = filter length
% Outputs: b = FIR filter coefficients
%
rxx = xcorr(x,maxlags,'coeff');   % Compute the autocorrelation vector
rxx = rxx(maxlags+1:end)';        % Use only positive half of symm. vector
rxy = xcorr(x,y,maxlags);         % Compute the crosscorrelation vector
rxy = rxy(maxlags+1:end)';        % Use only positive half
%
rxx_matrix = toeplitz(rxx);       % Construct correlation matrix
b = rxx_matrix\rxy;               % Calculate FIR coefficients using matrix inversion
-
Example: Results
[Figure, three panels: "SNR -8 db; 10 Hz sine" (noisy input vs. time), "After Optimal Filtering" (filtered waveform vs. time), and "Optimal Filter Frequency Plot" (filter magnitude vs. frequency in Hz)]
The original data (upper plot) are considerably less noisy after filtering (middle plot).
The filter computed by the Wiener-Hopf algorithm has the shape of a bandpass filter with a peak frequency at the signal frequency of 10 Hz.
-
LMS Adaptive Filters
[Figure: adaptive filter H(z) with input x(n) and response y(n); the error e(n) = d(n) - y(n) is fed back to adapt the filter]
Classical filters (FIR and IIR) and optimal Wiener filters have fixed frequency characteristics and cannot respond to changes that might occur during the course of the signal. Adaptive filters can modify their properties based on selected features of the signal being analyzed, or can work when the frequency or statistical characteristics are not known a priori.
The LMS algorithm consists of two basic processes:
Filtering process: calculate the output of the FIR filter by convolving the input and taps; calculate the estimation error by comparing the output to the desired signal.
Adaptation process: adjust the tap weights based on the estimation error.
A typical adaptive filter paradigm is shown above, where the arrow denotes a quantity that is being adapted. Typically the filter is an FIR filter (which is what we will assume) with impulse response h(k).
-
Stochastic Gradient Approach
Most commonly used adaptive filtering algorithm.
Define the cost function as the mean-squared difference between the filter output and the desired response (MSE).
If the parameters are known, use the method of steepest descent to invert the Wiener matrix:
Move toward the minimum on the error surface (the MSE has a single minimum).
Requires the gradient of the error surface to be known.
When the parameters are not known, the most popular adaptation algorithm is the LMS algorithm:
Derived from steepest descent.
Does not require the gradient to be known: it is estimated at every iteration.

[update value of tap-weight vector] = [old value of tap-weight vector] + [learning-rate parameter] x [tap-input vector] x [error signal]

[Figure: the mean-square error (MSE) as a (convex) function of the tap weights h1, h2, with the estimated gradient pointing down the error surface toward the minimum]
-
Least Mean Squared (LMS) Approach to Adaptive Filtering
If the MSE, e, were available, then the algorithm would use the MSE to compute the optimum filter coefficients. But in most practical situations the MSE is unknown or changing, while the instantaneous error e_n is often available.
Note that the MSE is defined as E[(e_n)^2], and, ideally, we are interested in the gradient, or derivative, of the MSE with respect to the adjustable parameters h(k).
Taking the derivative of the MSE with respect to h(k) gives

\frac{\partial E[(e_n)^2]}{\partial h(k)} = E\left[\frac{\partial (e_n)^2}{\partial h(k)}\right]   [8-60]

where we have used the property that the differentiation and expectation operations are interchangeable.
So, since we don't have access to the average MSE, we will drop the E operation and use the fact that

\frac{\partial (e_n)^2}{\partial h(k)}   [8-61]

is an unbiased estimate of the gradient [8-60] to approximate the gradient.
-
Least Mean Squared (LMS) Approach to Adaptive Filtering
The LMS algorithm uses the estimated gradient [8-61] to adjust the filter parameters.
The LMS algorithm adjusts the filter coefficients so that the sum of the squared errors, which approximates (estimates) the MSE, converges toward this minimum. The LMS algorithm uses a recursive gradient method known as the steepest-descent method for finding the filter coefficients that produce the minimum sum of squared errors. A modified steepest-descent algorithm updates the adjustable parameters to move in the direction of the negative gradient. The symbol h_n(k) denotes the impulse response coefficient h(k) at the n-th iteration of the LMS algorithm.
Filter coefficients are modified using an estimate of the negative gradient of the error function with respect to a given h_n(k). This estimate is given by the partial derivative of the (instantaneous) squared error e_n with respect to the coefficients h_n(k); using the chain rule for differentiation and [8-59], we have

\frac{\partial e_n^2}{\partial h_n(k)} = 2e(n)\,\frac{\partial [d(n) - y(n)]}{\partial h_n(k)} = -2e(n)\,x(n - k)   [8-62]
-
LMS Algorithm (continued)
Using this estimated gradient to construct an error signal, the LMS algorithm updates the filter parameters in the direction of the negative gradient; if the filter parameter h(k) at the n-th iteration of the LMS algorithm is denoted by h_n(k), the LMS algorithm computes h_{n+1}(k) as

h_{n+1}(k) = h_n(k) + \Delta\,e(n)\,x(n - k), \quad k = 1, 2, ..., N; \quad n = 1, 2, ...   [8-63]

where \Delta is a constant learning-rate parameter that controls the rate of descent and convergence to the filter coefficients.
Equation [8-63] can be written as a vector iterative equation:

h_{n+1} = h_n + \Delta\,e(n)\,x_n, \quad n = 1, 2, ...   [8-64]

where x_n is an N x 1 column vector whose m-th entry is x(n - m).
-
Example 2: Applying the LMS algorithm to a system identification task. The unknown system will be an all-zero linear process with a digital transfer function of

H(z) = 0.5 + 0.75z^{-1} + 1.2z^{-2}

Confirm the match by plotting the magnitude of the transfer function for both the unknown and matching systems.

b_unknown = [.5 .75 1.2];      % Define unknown process
xn = randn(1,N);
xd = conv(b_unknown,xn);       % Generate unknown system output
xd = xd(3:N+2);                % Truncate extra points (symmetrically)
%
% Apply Wiener filter
b = wiener_hopf(xn,xd,L);      % Compute matching filter coefficients
b = b/N;                       % Scale filter coefficients
..... Calculate and plot frequency characteristics .....
-
Example Results
[Figure, two panels of |H(z)| vs. frequency (Hz): "Unknown Process" and "Matching Process"]
Original coefficients: 0.5, 0.75, 1.2
Identified coefficients: 0.44, 0.67, 1.1
The identified transfer function and coefficients closely matched those of the unknown system.
In this example, the unknown system is an all-zero system, so the match by an FIR filter was quite close. A system containing both poles and zeros would be more difficult to match.
-
Adaptive Noise Cancellation
[Figure: the signal channel carries x(n) + N(n); the reference channel carries a noise-correlated signal into the adaptive filter, whose output N*(n) is subtracted to give the error/output e(n) = x(n) + N(n) - N*(n)]
Adaptive noise cancellation requires a reference signal that contains components of the noise, but not the signal. The reference channel carries a signal N'(n) that is correlated with the noise N(n), but not with the signal of interest, x(n). The adaptive filter will produce an output N*(n) that minimizes the overall output. Since the adaptive filter has no access to the signal x(n), it can only reduce the overall output by minimizing the noise in this output.
-
Adaptive Line Enhancement (ALE)
[Figure: the input B(n) + Nb(n) passes through a decorrelation delay D into an adaptive FIR filter; the error e(n) = B(n) + Nb(n) - Nb*(n) is the broadband output (interference suppression), while the filter output Nb*(n) is the narrowband output (adaptive line enhancement)]
A reference signal is not necessary to separate narrowband from broadband signals. In adaptive line enhancement, broadband and narrowband signals are separated by a delay: only narrowband signals will be correlated with delayed versions of themselves. The error signal contains both broadband and narrowband signals, but the filter can reduce only the narrowband signals. Hence the adaptive filter output contains the filtered narrowband signal. The decorrelation delay must be chosen with care.
-
Example 3: Given the same sinusoidal signal in noise as used in Example 1, design an adaptive filter to remove the noise. Just as in Example 1, assume that you have a copy of the desired signal.

% Same initial lines as in Example 8-1 .....
% xn is the input signal containing noise
% x is the desired signal (as in Ex. 8-1, a noise-free version of the signal)
%
% Calculate convergence parameter
PX = (1/(N+1))* sum(xn.^2);   % Calculate approx. power in xn
delta = a * (1/(10*L*PX));    % Calculate delta
%
[b,y] = lms(xn,x,delta,L);    % Apply LMS algorithm (see below)
%
% Plotting identical to Example 8-1 .....

The adaptive filter coefficients are determined by the LMS algorithm.
-
LMS Algorithm

function [b,y,e] = lms(x,d,delta,L)
% Simple function to adjust filter coefficients using the LMS algorithm
% Adjusts filter coefficients, b, to provide the best match between
% the input, x(n), and a desired waveform, d(n)
% Both waveforms must be the same length
% Uses a standard FIR filter
%
M = length(x);
b = zeros(1,L); y = zeros(1,M);   % Initialize outputs
e = zeros(1,M);                   % Initialize error (added for completeness)
for n = L:M
   x1 = x(n:-1:n-L+1);            % Select input segment for convolution
   y(n) = b * x1';                % Convolve (multiply) weights with input
   e(n) = d(n) - y(n);            % Calculate error
   b = b + delta*e(n)*x1;         % Adjust weights
end

The LMS algorithm is implemented in the function lms. The input is x, the desired signal is d, delta is the convergence factor, and L is the filter length.
-
Example: Results
[Figure, three panels: "SNR -8 db; 10 Hz sine" (x(t) vs. time), "After Adaptive Filtering" (y(t) vs. time), and "Adaptive Filter Frequency Plot" (|H(f)| vs. frequency in Hz)]
Application of an adaptive filter using the LMS recursive algorithm to data containing a single sinusoid (10 Hz) in noise (SNR = -8 db). The filter requires the first 0.4 to 0.5 seconds to adapt (400-500 points), and the frequency characteristics after adaptation are those of a bandpass filter with a peak frequency of 10 Hz.
-
Adaptive Line Enhancement (ALE)
In the next example an ALE filter is constructed using the LMS algorithm. The desired waveform is just the signal delayed. The best delay was found empirically to be 5 samples.

delay = 5;                           % Decorrelation delay
a = .075;                            % Convergence gain
%
% Generate data: two sequential sinusoids, 10 & 20 Hz in noise (SNR = -6)
x = [sig_noise(10,-6,N/2) sig_noise(20,-6,N/2)];
..... Plot original signal .....
%
PX = (1/(N+1))* sum(x.^2);           % Calculate waveform power for delta
delta = (1/(10*L*PX)) * a;           % Use 10% of the max. range of delta
%
xd = [x(delay:N) zeros(1,delay-1)];  % Delay signal to decorrelate noise
[b,y] = lms(xd,x,delta,L);           % Apply LMS algorithm
..... Plot filtered signal .....
-
Example 4: Results. Unlike a fixed Wiener filter, an adaptive filter can track changes in a waveform, as shown in this example where two sequential sinusoids having different frequencies (10 & 20 Hz) are adaptively filtered.
[Figure, two panels vs. time: "10 & 20 Hz SNR -6 db" (x(t)) and "After Adaptive Filtering" (y(t))]
-
Example 5: Adaptive Noise Cancellation (ANC). The LMS algorithm is used with a reference signal to cancel a narrowband interference signal.
[Figure, three panels vs. time: "Original Signal" (x(t)), "Signal + interference" (x(t)+n(t)), and "After Adaptive Noise Cancellation" (y(t))]
In this application, approximately 1000 samples (2.0 sec) are required for the filter to adapt correctly.
-
Phase Sensitive Detection
Phase sensitive detection, also known as synchronous or coherent detection, is a technique for demodulating amplitude-modulated (AM) signals that is also very effective in reducing noise.
From a frequency-domain point of view, the effect of amplitude modulation is to shift the signal frequencies to another portion of the spectrum, on either side of the modulating, or carrier, frequency.
Amplitude modulation can be very effective in reducing noise because it can shift signal frequencies to spectral regions where noise is minimal.
The application of a narrowband filter centered about the new frequency range (i.e., the carrier frequency) can then be used to remove the noise.
A phase sensitive detector functions as a narrowband filter that tracks the carrier frequency. The bandwidth can be quite small.
-
Phase Sensitive Detection (continued)
[Figure: magnitude response vs. frequency, showing a bandpass characteristic of bandwidth BW centered at the carrier frequency fc]
Frequency characteristics of a phase sensitive detector. The frequency response of the lowpass filter is effectively reflected about the carrier frequency, producing a bandpass filter that tracks the carrier frequency. By making the cutoff frequency small, the bandwidth of the virtual bandpass filter can be made very narrow.
-
Example 6: Using phase sensitive detection to demodulate a signal amplitude-modulated with a 5 Hz sawtooth wave. The AM signal is buried in -10 db noise. The filter is chosen as a second-order Butterworth lowpass filter with a cutoff frequency set for best noise rejection while still providing reasonable fidelity to the sawtooth waveform.

wn = .02;                            % Lowpass filter cutoff frequency
[b,a] = butter(2,wn);                % Design lowpass filter
%
% Phase sensitive detection
ishift = fix(.125 * fs/fc);          % Shift carrier by 1/4 period
vc = [vc(ishift:N) vc(1:ishift-1)];  % using periodic shift
v1 = vc .* vm;                       % Multiplier
vout = filter(b,a,v1);               % Apply lowpass filter
-
Kalman Filter
The Kalman filter is a recursive (iterative) data-processing algorithm in the time domain that solves the same problems as the Wiener filter. The Kalman filter can also be made adaptive, but we will not cover this topic (I do in my Digital Communications course).
Generates the optimal estimate of desired quantities given the set of measurements (estimation, prediction, interpolation, smoothing).
Optimal filtering for linear systems and white Gaussian errors: the Kalman filter is the best estimate based on all previous measurements.
Recursive/iterative: does not need to store all previous measurements and reprocess all data at each time step.
The Kalman algorithmic approach can be viewed as two steps: (1) prediction and then (2) correction.
-
Kalman Algorithm System Model
[Figure: block diagram of the black-box system model. The input and the system-model noise drive the system dynamics, producing the system state; the output device, corrupted by measurement noise, produces the observed output, which the Kalman filter/estimator processes to give the optimal estimate of the system state]
-
Kalman Filter Overview---Discrete Time
The system state process y_k that is to be estimated is modeled by the following difference equations:

y_k = A\,y_{k-1} + B\,u_k + w_{k-1}   [8-65a]
z_k = H\,y_k + v_k   [8-66a]

where the system state process is denoted by y_k, with filter parameters A and B and output filter H that are known. The model noise is w_k. The process z_k is the observable system output (filtered signal + noise) and the process u_k is the system input. The model noise w_k has covariance Q, the measurement noise v_k has covariance R, and P denotes the prediction-error covariance matrix.
The Kalman filter algorithm is a two-step process: prediction and correction.
1. Prediction: \hat{y}^-_k is an estimate based on measurements at previous time steps that follows the above system dynamics:

\hat{y}^-_k = A\,\hat{y}_{k-1} + B\,u_k   [8-67a]
P^-_k = A\,P_{k-1}\,A^T + Q   [8-67b]

2. Correction: \hat{y}_k has additional information, the measurement at time k:

\hat{y}_k = \hat{y}^-_k + K_k\,(z_k - H\,\hat{y}^-_k)   [8-68a]
P_k = (I - K_k H)\,P^-_k, \quad \text{where } K_k = P^-_k H^T (H P^-_k H^T + R)^{-1}   [8-68b]
-
Blending Factor
If we are sure about the measurements:
The measurement-error covariance R of the output noise decreases to zero.
The Kalman gain K_k increases, weighting the residual more heavily than the prediction.
If we are sure about the prediction:
The prediction-error covariance P^-_k decreases to zero.
The Kalman gain K_k decreases, weighting the prediction more heavily than the residual.
-
Kalman Filter Summary
Prediction (time update):
(1) Project the state ahead: \hat{y}^-_k = A\,\hat{y}_{k-1} + B\,u_k
(2) Project the error covariance ahead: P^-_k = A\,P_{k-1}\,A^T + Q
Correction (measurement update):
(1) Compute the Kalman gain: K_k = P^-_k H^T (H P^-_k H^T + R)^{-1}
(2) Update the estimate with measurement z_k: \hat{y}_k = \hat{y}^-_k + K_k\,(z_k - H\,\hat{y}^-_k)
(3) Update the error covariance: P_k = (I - K_k H)\,P^-_k
-
Example: Constant System Model
Prediction: the system model has no input and no model noise:

\hat{y}^-_k = \hat{y}_{k-1}
P^-_k = P_{k-1}

Correction (H = 1):

K_k = P^-_k\,(P^-_k + R)^{-1}
\hat{y}_k = \hat{y}^-_k + K_k\,(z_k - \hat{y}^-_k)
P_k = (I - K_k)\,P^-_k
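A minimal MATLAB sketch of this constant model (the true value, R, the initial conditions, and the run length are illustrative choices, in the spirit of the plots that follow):

ytrue = -0.377;                   % assumed constant to be estimated
Nk = 100;  R = 0.1;               % measurement-noise variance (assumed)
z = ytrue + sqrt(R)*randn(1,Nk);  % noisy measurements z_k
yhat = 0;  P = 1;                 % initial estimate and error covariance
for k = 1:Nk
    yminus = yhat;  Pminus = P;   % prediction: constant model carries over
    K = Pminus/(Pminus + R);      % Kalman gain
    yhat = yminus + K*(z(k) - yminus);  % correction with measurement z_k
    P = (1 - K)*Pminus;           % update error covariance
end
fprintf('final estimate %.4f (true %.4f), P = %.4g\n', yhat, ytrue, P);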
-
Example: Constant System Model
[Figure: the estimate \hat{y}_k vs. iteration, converging toward the constant value over 100 iterations]
-
Example: Constant Model
[Figure: "Convergence of Error Covariance - Pk": P_k vs. iteration, decreasing from 1 toward zero over 100 iterations]
-
Example: Constant Model
[Figure: the estimate \hat{y}_k vs. iteration for a larger R, converging more slowly]
A larger value of R, the measurement-error covariance (indicating poorer quality of measurements), makes the filter slower to believe the measurements, giving slower convergence.
-
Comparing the Least-Squares (Kalman) and the Least Mean Square Error (Wiener) Approaches
The least mean square [LMS] criterion is statistical:
The error criterion is not an explicit function of the data, but depends only on statistics.
The least-squares (Kalman) error criterion is an explicit function of the signal samples.
To track variations in the channel, the weighting factor w in (8-70a) de-emphasizes old data.
-
Least-Squares/Kalman Algorithm: For an FIR Filter

LSE_N = \sum_{n=0}^{N} w^{N-n}\,e_n^2 = \sum_{n=0}^{N} w^{N-n}\,(z_n - d_n)^2 = \sum_{n=0}^{N} w^{N-n}\,(r_n'\,c_n - d_n)^2   (8-70a)

Solving the above gives:

c_n = A_n^{-1}\,y_n   (8-70b)

where

A_n = \sum_{k=0}^{n} w^{n-k}\,r_k\,r_k' + \delta I \quad [\delta \text{ is the noise power}]   (8-70c)
    = w\,A_{n-1} + r_n\,r_n'   (8-70d)

and

y_n = \sum_{k=0}^{n} w^{n-k}\,r_k\,d_k = w\,y_{n-1} + r_n\,d_n   (8-70e)

Challenge: given c_n, how do we find c_{n+1}?
-
Least-Squares/Kalman Algorithm: For a Tapped Delay Line Equalizer---continued
The key result that we use to derive the Kalman algorithm is the matrix inversion lemma, which determines A_n^{-1} from A_{n-1}^{-1}:

A_n^{-1} = w^{-1}\left[ A_{n-1}^{-1} - \frac{A_{n-1}^{-1}\,r_n\,r_n'\,A_{n-1}^{-1}}{w + r_n'\,A_{n-1}^{-1}\,r_n} \right]   (8-71a)

Letting

D_n = A_n^{-1}   (8-71b)
k_n = \frac{1}{w + \mu_n}\,D_{n-1}\,r_n \quad [\text{the Kalman gain}]   (8-71c)
\mu_n = r_n'\,D_{n-1}\,r_n   (8-71d)

it can be shown that

c_{n+1} = c_n + k_n\,e_n   (8-71e)
D_n = w^{-1}\,[D_{n-1} - k_n\,r_n'\,D_{n-1}]   (8-71f)

This algorithm takes "big" steps in the direction of the Kalman gain to iteratively realize the optimum tap setting at each time instant [based upon the received samples up to n]. The algorithm is effectively using the Gram-Schmidt orthogonalization technique to realize c_opt from the successive input vectors {r_n}.
The Kalman algorithm converges in ~N iterations!
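A hedged MATLAB sketch of the recursions (8-71b)-(8-71f) applied to an FIR identification problem (the unknown taps, forgetting factor w, noise level, and the initialization D_0 = I/\delta are illustrative assumptions):

L = 3;  N = 500;  w = 0.99;  deltaI = 0.01;
c_true = [0.5; 0.75; 1.2];           % unknown taps (used only to generate data)
x = randn(N,1);
c = zeros(L,1);  D = eye(L)/deltaI;  % D_n = A_n^{-1}, initialized large
for n = L:N
    r = x(n:-1:n-L+1);               % tap-input vector r_n
    d = c_true'*r + 0.01*randn;      % desired output (noisy)
    e = d - c'*r;                    % a priori error e_n
    mu = r'*D*r;                     % (8-71d)
    k = (D*r)/(w + mu);              % Kalman gain (8-71c)
    c = c + k*e;                     % tap update (8-71e)
    D = (D - k*(r'*D))/w;            % (8-71f)
end
disp(c')                             % should be close to [0.5 0.75 1.2]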