8/11/2019 Kalman filter and State Space Models
State Space Models, Kalman Filter and Smoothing
The idea that any dynamic system can be expressed in a particular representation called the state space representation was proposed by Kalman. He presented an algorithm, a set of rules, to sequentially forecast and update a set of projections of the unknown state vector.
State space representation of a dynamic system: the general case
State space models were originally developed by control engineers to represent a dynamic system, or dynamic linear models. Interest normally centers on an (m × 1) vector of variables, called the state vector, that may be signals from a satellite or the actual position of a missile or a rocket. The state vector represents the dynamics of the process. More precisely, it retains all the memory in the process: all the dependence between past and future must funnel through the state vector. The elements of the state vector may not have any specific economic meaning, but the state space approach is popular in economic applications involving modelling unobserved or latent variables, like permanent income, NAIRU (Non-Accelerating Inflation Rate of Unemployment), expected inflation, the state of the economy in business cycle analysis, etc. In most cases such signals are not observable directly, but such a vector of variables is related to an (n × 1) vector z_t of variables that are actually observed, through an equation called the measurement equation or the observation equation, given by

z_t = A_t x_t + Y_t α_t + N_t     (1)

where Y_t and A_t are parameter matrices of order (n × m) and (n × k) respectively, x_t is a (k × 1) vector of exogenous or pre-determined variables, and N_t is an (n × 1) vector of disturbances which has zero mean and covariance matrix H_t.
Although the state vector α_t is not directly observable, its movements are assumed to be governed by a well defined process, called the transition equation or state equation, given by

α_t = T_t α_{t−1} + R_t η_t,   t = 1, . . . , T,     (2)

where T_t and R_t are matrices of order (m × m) and (m × g) respectively, and η_t is a (g × 1) vector of disturbances with mean zero and covariance matrix Q_t.
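Equations (1) and (2) can be simulated directly. The following is a minimal sketch with time-invariant matrices; every numerical value and dimension here is illustrative, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n = 1 observed series, m = 2 states, g = 1 state shock.
T_mat = np.array([[0.7, 1.0], [0.0, 0.0]])   # transition matrix T (m x m)
R = np.array([[1.0], [0.5]])                 # R (m x g)
Y = np.array([[1.0, 0.0]])                   # measurement matrix Y (n x m)
Q = np.array([[1.0]])                        # state disturbance covariance
H = np.array([[0.2]])                        # measurement noise covariance

alpha = np.zeros(2)                          # state vector alpha_t
z = []
for t in range(100):
    eta = rng.multivariate_normal(np.zeros(1), Q)
    alpha = T_mat @ alpha + R @ eta          # transition equation (2)
    N = rng.multivariate_normal(np.zeros(1), H)
    z.append(Y @ alpha + N)                  # measurement equation (1), no exogenous x_t
z = np.array(z).ravel()
print(z.shape)  # (100,)
```

The exogenous term A_t x_t of equation (1) is omitted here, anticipating the simplification made later in the notes.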
Remarks:
1. Note that in the measurement equation we have an added disturbance term N_t. We need it if we assume that what we have observed is contaminated by additional noise; otherwise we simply have

z_t = A_t x_t + Y_t α_t.     (3)
large enough so that the dynamics of the system can be captured by the simple first order Markov structure of the state equation. From a technical point of view, the aim of the state space form is to set up α_t so that it has as small a number of elements as possible. Such a state space set up is called a minimal realization, and it is a basic criterion for a good state space form.
6. In many cases of interest only one observation is available in each time period, that is, z_t is now a scalar in the measurement equation. Also, the transition matrix is much simpler than given before, in the sense that the parameters in most cases, including the variance, are assumed to be time invariant. Thus the transition equation now becomes

α_t = T α_{t−1} + R η_t,   t = 1, . . . , T,     (12)

and

η_t ~ WN(0, σ²Q).     (13)
7. For many applications using the Kalman filter, the vector of exogenous variables is simply not necessary. One may also assume that the variance of the noise term is time invariant, so that the general system now boils down to:

z_t = y_t α_t + N_t,   t = 1, . . . , T     (14)
α_t = T α_{t−1} + R η_t,   t = 1, . . . , T.     (15)

z_t now is a scalar, N_t ~ (0, σ²h) and y_t is a (1 × m) vector. In some of the state space applications, especially those that use ARMA models, the measurement error in the observation equation, i.e. N_t, is assumed to be zero. This means that N_t in such applications will be absent.
8. There are many ways to write a given system in state-space form. But written in any way, if our primary interest is forecasting, we would get identical forecasts no matter which form we use. Note also that we can write any state space form as an ARMA model; in this way, there is an equivalence between the two forms.
Examples of state space representation:

Example 1: First let us consider the general ARMA(p, q) model and see how it can be cast in a state space form. An ARMA(p, q) model can be written, by defining m = max(p, q + 1), in the form:

z_t = φ_1 z_{t−1} + φ_2 z_{t−2} + · · · + φ_m z_{t−m} + θ_1 e_{t−1} + θ_2 e_{t−2} + · · · + θ_{m−1} e_{t−m+1} + e_t

where we interpret φ_j = 0 for j > p and θ_j = 0 for j > q.
Then we can write the state and observation equations as follows:

State equation:

α_t = T α_{t−1} + R e_t,

where the (m × m) transition matrix T has (φ_1, φ_2, . . . , φ_m)′ as its first column, the identity matrix I_{m−1} as its upper right block and zeros in the remainder of its last row, and R = (1, θ_1, . . . , θ_{m−1})′.

Observation equation:

z_t = (1  0  . . .  0) α_t.

The original model can be easily recovered by repeated substitution, starting at the bottom row of the state equation. We can easily note that the first element of the state vector is identically equal to the given model for z_t.
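The construction in Example 1 can be sketched as a small function; the function name and interface here are illustrative, not from the text:

```python
import numpy as np

def arma_state_space(phi, theta):
    """Build Example 1's state-space matrices for an ARMA(p, q) model.
    phi and theta are lists of AR and MA coefficients; m = max(p, q + 1)."""
    p, q = len(phi), len(theta)
    m = max(p, q + 1)
    phi_full = np.zeros(m)
    phi_full[:p] = phi                       # phi_j = 0 for j > p
    theta_full = np.zeros(m - 1)
    theta_full[:q] = theta                   # theta_j = 0 for j > q
    T = np.zeros((m, m))
    T[:, 0] = phi_full                       # first column: phi_1, ..., phi_m
    T[:m - 1, 1:] = np.eye(m - 1)            # identity block I_{m-1}
    R = np.concatenate(([1.0], theta_full))  # R = (1, theta_1, ..., theta_{m-1})'
    y = np.zeros(m)
    y[0] = 1.0                               # observation: z_t = (1 0 ... 0) alpha_t
    return T, R, y
```

For an ARMA(1, 1) with illustrative coefficients φ_1 = 0.5 and θ_1 = 0.3 this returns T = [[0.5, 1], [0, 0]] and R = (1, 0.3)′, the matrices of Example 3 below.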
Example 2: Let us consider next a univariate AR(p) process:

z_t = φ_1 z_{t−1} + φ_2 z_{t−2} + · · · + φ_p z_{t−p} + e_t

where φ(B) = 1 − φ_1B − φ_2B² − · · · − φ_pB^p is the AR operator and e_t is white noise. This could be defined in state space form by writing the (m × 1) state vector α_t, where m = p for the present case, as follows:

State equation:

α_t = T α_{t−1} + R e_t,

where T is the same companion matrix as in Example 1, with (φ_1, . . . , φ_m)′ as its first column and I_{m−1} as its upper right block, and R = (1, 0, . . . , 0)′.
Observation equation:

z_t = (1  0  . . .  0) α_t.

Defining α_t = (α_{1t}  α_{2t}  . . .  α_{mt})′, and substituting from the bottom row, we get the original AR model.
Example 3: Let us consider the following ARMA(1, 1) model. For this model m = 2.

z_t = φ_1 z_{t−1} + θ_1 e_{t−1} + e_t.

For this model the state and the measurement equations are given below:

State equation:

α_t = ( φ_1  1 ) α_{t−1} + ( 1   ) e_t
      (  0   0 )           ( θ_1 )

Observation equation:

z_t = (1  0) α_t.

If we define α_t = (α_{1t}  α_{2t})′, then

α_{2t} = θ_1 e_t
α_{1t} = φ_1 α_{1,t−1} + α_{2,t−1} + e_t
       = φ_1 z_{t−1} + θ_1 e_{t−1} + e_t,

and this is precisely the original model.
Example 4: As a final example, we shall consider the first order moving average model, assuming that the model has zero mean:

z_t = e_t + θ_1 e_{t−1}.

Here m = 2, so that the state and the measurement equations are given as follows:

State equation:

α_t = ( 0  1 ) α_{t−1} + ( 1   ) e_t
      ( 0  0 )           ( θ_1 )

Observation equation:

z_t = (1  0) α_t.

If we define α_t = (α_{1t}  α_{2t})′, then α_{2t} = θ_1 e_t and α_{1t} = α_{2,t−1} + e_t = e_t + θ_1 e_{t−1}, and this is precisely the original model.
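One can check numerically that Example 4's state space form reproduces the MA(1) model exactly; a small sketch, with an illustrative value of θ_1:

```python
import numpy as np

rng = np.random.default_rng(1)
theta1 = 0.6                                 # illustrative MA coefficient
T = np.array([[0.0, 1.0], [0.0, 0.0]])
R = np.array([1.0, theta1])

e = rng.standard_normal(50)
alpha = np.zeros(2)                          # alpha_0 = 0, i.e. e_0 taken as 0
z_state = []
for t in range(50):
    alpha = T @ alpha + R * e[t]             # state equation
    z_state.append(alpha[0])                 # observation: z_t = alpha_{1t}
z_state = np.array(z_state)

# Direct MA(1): z_t = e_t + theta1 * e_{t-1}, with e_0 = 0
e_lag = np.concatenate(([0.0], e[:-1]))
z_direct = e + theta1 * e_lag
print(np.allclose(z_state, z_direct))  # True
```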
We have seen before that there are many ways of writing a given system in state space form. We
shall here give an example of writing the AR(p) process in a different way.
Example 5: As before let m = p. The state equation is given as:

State equation:

( z_t       )   ( φ_1  φ_2  . . .  φ_{p−1}  φ_p ) ( z_{t−1} )   ( 1 )
( z_{t−1}   )   (  1    0   . . .     0      0  ) ( z_{t−2} )   ( 0 )
(    ⋮      ) = (  ⋮                  ⋮      ⋮  ) (    ⋮    ) + ( ⋮ ) e_t
( z_{t−p+1} )   (  0    0   . . .     1      0  ) ( z_{t−p} )   ( 0 )
    α_t                        T                     α_{t−1}      R

Observation equation:

z_t = (1  0  . . .  0) α_t
           y_t

In this case, by carrying out the matrix multiplication on the RHS of the state equation, we can notice that the first row gives the original AR model and the rest are trivial identities, including the observation equation.
Example 6: Let us take the ARMA(p, q) model that we have seen before:

z_t = φ_1 z_{t−1} + φ_2 z_{t−2} + · · · + φ_m z_{t−m} + θ_1 e_{t−1} + θ_2 e_{t−2} + · · · + θ_{m−1} e_{t−m+1} + e_t

where we interpret φ_j = 0 for j > p and θ_j = 0 for j > q. We shall re-write it in a way different from what we saw in Example 1. Let m = max(p, q + 1). Then we can write the state equation and observation equation as follows:

State equation:

          ( φ_1  φ_2  . . .  φ_{m−1}  φ_m )       ( e_{t+1} )
          (  1    0   . . .     0      0  )       (    0    )
α_{t+1} = (  ⋮                  ⋮      ⋮  ) α_t + (    ⋮    )
          (  0    0   . . .     1      0  )       (    0    )
Observation equation:

z_t = μ + (1  θ_1  . . .  θ_{m−1}) α_t.

We shall take the ARMA(1, 1) model and see how to write the state space form as given in Example 6 and retrieve the original model. For ARMA(1, 1), m = 2. So the state and the observation equations are:

State equation:

α_{t+1} = ( φ_1  0 ) α_t + ( 1 ) e_{t+1},
          (  1   0 )       ( 0 )

Observation equation:

z_t = μ + (1  θ_1) α_t.
Starting from the second row of the state equation, we have

α_{2,t+1} = α_{1,t}.

The first row of the state equation implies that

α_{1,t+1} = φ_1 α_{1,t} + e_{t+1}

or

(1 − φ_1B) α_{1,t+1} = e_{t+1}.     (1)

The observation equation states that

z_t = μ + α_{1,t} + θ_1 α_{2,t}
    = μ + α_{1,t} + θ_1 α_{1,t−1}
    = μ + (1 + θ_1B) α_{1,t}.     (2)

Multiply (2) by (1 − φ_1B) to give:

(1 − φ_1B) z_t = (1 − φ_1)μ + (1 − φ_1B)(1 + θ_1B) α_{1,t}
               = (1 − φ_1)μ + (1 + θ_1B) e_t   [from (1)]

which is the given model.
Example 7: Let us take an example of a state space formulation for an economic problem. Fama and Gibbons (Journal of Monetary Economics, 1982, 9, pp. 297-323) use the state space idea
to study the behaviour of the ex-ante real interest rate (defined as the nominal interest rate, i_t, minus the expected inflation rate, π^e_t). This is unobservable because we do not have data on the anticipated rate of inflation. Thus, the state variable is:

α_t = i_t − π^e_t − μ,

where μ is the average of the ex-ante real interest rate. Fama and Gibbons assume that the ex-ante real interest rate follows the AR(1) process:

α_{t+1} = φ α_t + ε_{t+1}.

But an econometrician has data on the ex-post real interest rate (that is, the nominal interest rate, i_t, minus the actual rate of inflation, π_t). That is,

i_t − π_t = (i_t − π^e_t) + (π^e_t − π_t) = μ + α_t + ν_t,

where ν_t = π^e_t − π_t is the error agents made in forecasting inflation. If people forecast optimally, then ν_t should be free of autocorrelation and should be uncorrelated with the ex-ante real interest rate.
Kalman Filter: An Overview

Consider the system given by the following equations:

z_t = y_t α_t + N_t,   t = 1, . . . , T
α_t = T α_{t−1} + R η_t,   t = 1, . . . , T.

Given this, our objectives could be either to obtain the values of unknown parameters or, given the parameter vectors, we may be aiming to obtain the linear least squares forecasts of the state vector on the basis of observed data. The Kalman filter (KF hereafter) has many uses. We are utilising it as an algorithm to evaluate the components of the likelihood function. Kalman filtering follows a two-step procedure. In the first step, the optimal predictor for the next observation is formed, based on all the information currently available. This is done by the prediction equation. In the second step, the moment a new observation becomes available, it is incorporated into the estimator of the state vector using the updating equation. These two equations collectively form the Kalman filter equations. Applied recursively, the KF provides an optimal solution to the twin problems of prediction and updating. Assuming that the observations are normally distributed, and also assuming that the current estimator of the state vector is the best available, the prediction and the
updating estimators are the best. By best, we mean the estimators have the minimum mean squared error (MMSE). It is very evident that the process of predicting the next observation and updating it as soon as the actual value becomes available has an interesting by-product: the prediction error. And we have seen, in the chapter on estimation, how a set of dependent observations can be decomposed in terms of the prediction errors. The KF gives us a natural mechanism to carry out this decomposition.
Kalman filter recursions: Main equations

We shall use a_t to denote the MMSE estimator of α_t based on all information up to and including the current observation z_t. Similarly, we have a_{t|t−1} as the MMSE estimator of α_t at time t − 1. That is, a_{t|t−1} = E(α_t | I_{t−1}).
Prediction:

At time t − 1, all available information, including z_{t−1}, is incorporated in a_{t−1}, which is the MMSE estimator of α_{t−1}. The estimation error has a covariance matrix of σ²P_{t−1}. More precisely,

σ²P_{t−1} = E[(α_{t−1} − a_{t−1})(α_{t−1} − a_{t−1})′].

From

α_t = T α_{t−1} + R η_t,

we get that at time t − 1, the MMSE estimator of α_t is given by

a_{t|t−1} = T a_{t−1},

so that the estimation error or the sampling error is given by

α_t − a_{t|t−1} = T(α_{t−1} − a_{t−1}) + R η_t.

The right hand side of this estimation error has zero expectation. We have to note here that an estimator is unconditionally unbiased (u-unbiased) if its estimation error has zero expectation. And when an estimator is u-unbiased, its MSE matrix, E[(α_t − a_{t|t−1})(α_t − a_{t|t−1})′], is identical to the covariance matrix of the estimation error. And hence we can write the covariance of the estimation error as:

E[(α_t − a_{t|t−1})(α_t − a_{t|t−1})′]
  = E{[T(α_{t−1} − a_{t−1}) + Rη_t][T(α_{t−1} − a_{t−1}) + Rη_t]′}
  = T E[(α_{t−1} − a_{t−1})(α_{t−1} − a_{t−1})′] T′ + T E[(α_{t−1} − a_{t−1})η_t′] R′
    + R E[η_t(α_{t−1} − a_{t−1})′] T′ + R E[η_tη_t′] R′
  = σ²TP_{t−1}T′ + σ²RQR′.
Thus,

(α_t − a_{t|t−1}) ~ WS(0, σ²P_{t|t−1})

where

P_{t|t−1} = TP_{t−1}T′ + RQR′,

and where WS stands for wide sense. (Weak stationarity is sometimes referred to as wide sense stationarity.)

Now, given that a_{t|t−1} is the MMSE of α_t at time t − 1, the MMSE of z_t at time t − 1 clearly is

z_{t|t−1} = y_t a_{t|t−1}.

The associated prediction error is

ν_t = z_t − z_{t|t−1} = y_t(α_t − a_{t|t−1}) + N_t,

the expectation of which is zero. Hence,

var(ν_t) = E(ν_t²) = y_t E[(α_t − a_{t|t−1})(α_t − a_{t|t−1})′] y_t′ + E(N_t²)
         [since the cross product terms have zero expectations]
         = σ²y_tP_{t|t−1}y_t′ + σ²h = σ²f_t.

Deriving the state updating equations is involved, and hence the important steps are relegated to the appendix; we state only the main equations below:

Updating equations:

a_t = a_{t|t−1} + P_{t|t−1}y_t′(z_t − y_ta_{t|t−1})/f_t,

and the estimation error satisfies

(α_t − a_t) ~ WS(0, σ²P_t)

where

P_t = P_{t|t−1} − P_{t|t−1}y_t′y_tP_{t|t−1}/f_t,   with f_t = y_tP_{t|t−1}y_t′ + h.
We have to highlight the following points.
1. Note the role played by the prediction error, ν_t = z_t − y_ta_{t|t−1}, and the variance associated with it, σ²f_t.

2. Note also the (m × 1) vector P_{t|t−1}y_t′/f_t, which is called the Kalman gain.

3. In the discussion so far, we have assumed the presence of an additional noise in the measurement equation; that is, h > 0. But we also have to note that, in our examples of state space representation of ARMA models, we have assumed that the measurement equation has no additional error. That is, N_t is assumed to be zero, implying h, the variance of the measurement error term, will be zero. However, this should not matter, since through these adjustments note that we have isolated h as an additive scalar, which, when it becomes zero, does not affect our calculations. (Note the expression for f_t.)
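The prediction and updating equations can be collected into a single recursion step. A minimal sketch, assuming a scalar observation and time-invariant system matrices; the function name and argument layout are illustrative:

```python
import numpy as np

def kf_step(a_prev, P_prev, z, T, R, Q, y, h):
    """One Kalman recursion: predict with (T, R, Q), then update with (y, h).
    a_prev and P_prev are a_{t-1} and P_{t-1}; z is the new scalar observation."""
    # Prediction equations
    a_pred = T @ a_prev                      # a_{t|t-1} = T a_{t-1}
    P_pred = T @ P_prev @ T.T + R @ Q @ R.T  # P_{t|t-1} = T P_{t-1} T' + R Q R'
    # Prediction error and its scaled variance
    nu = z - y @ a_pred                      # nu_t = z_t - y_t a_{t|t-1}
    f = y @ P_pred @ y + h                   # f_t = y_t P_{t|t-1} y_t' + h
    # Updating equations; P_pred @ y / f is the Kalman gain
    a = a_pred + P_pred @ y * nu / f
    P = P_pred - np.outer(P_pred @ y, y @ P_pred) / f
    return a, P, nu, f

# Illustrative scalar check (values arbitrary: a_0 = 4, P_0 = 12, q = 4, h = 1)
a1, P1, nu1, f1 = kf_step(np.array([4.0]), np.array([[12.0]]), 4.4,
                          np.eye(1), np.eye(1), np.array([[4.0]]),
                          np.array([1.0]), 1.0)
print(round(float(a1[0]), 3), round(float(P1[0, 0]), 3))  # 4.376 0.941
```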
ML Estimation of ARMA models

The literature has many algorithms aimed at simplifying the computation of the components of the likelihood. One approach is to use the Kalman filter recursions. Other useful algorithms are by Newbold (Biometrika, 1974, Vol. 61, 423-26) and the innovations algorithm, suggested by Ansley (Biometrika, 1979, Vol. 66, 59-65).
KF recursions are useful for a number of purposes. But our emphasis will be on understanding how these recursions (1) can be used to construct linear least squares forecasts of the state vector on the basis of data observed through time t, and (2) use the resulting prediction error and its variance to build the components of the likelihood function. In our derivation so far, we have motivated the discussion of the Kalman filter in terms of linear projection of the state vector α_t on the observed time series z_t. These are linear forecasts, and are optimal among any functions if we assume that the state vector and the disturbances are multivariate Gaussian. Our main aim is to see how the KF recursions calculate these forecasts recursively, generating a_{1|0}, a_{2|1}, . . . , a_{T|T−1} and P_{1|0}, P_{2|1}, . . . , P_{T|T−1} in succession.
How do we start the recursions?

To start the recursions, we need to get a_{1|0}. This means we should get the first period forecast of α based on an information set. Since we don't have information on the zeroth period, we take the unconditional expectation as

a_{1|0} = E(α_1),

where the associated estimation error has zero mean and covariance matrix σ²P_{1|0}.
Let us explain this with the help of an example.

Example 8: Let us take the simplest MA(1) model:

z_t = e_t + θ_1 e_{t−1}.

We have shown before that the state vector is simply

α_t = ( z_t     )
      ( θ_1 e_t )

and hence

a_{1|0} = E( z_1     ) = ( 0 )
           ( θ_1 e_1 )   ( 0 )

And the associated covariance matrix of the estimation error, σ²P_0 or σ²P_{1|0}, is simply E(α_1α_1′), so that we have

P_{1|0} = σ^{−2}E(α_1α_1′)
        = σ^{−2}E[( z_1     ) ( z_1   θ_1e_1 )]
                  ( θ_1 e_1 )
        = ( 1 + θ_1²   θ_1  )
          ( θ_1        θ_1² )
While one can work out by hand the covariance matrix for the initial state vector for pure MA models, this turns out to be too tedious for higher order mixed models. So, we need a closed form solution to calculate this matrix. We get such a solution by generalising this. Generalisation is easy if we can make prior assumptions about the distribution of the state vector.

Two categories of state vector can be distinguished, depending on whether or not the state vector is covariance stationary. If it is, then the distribution of the state vector is readily available, and with that the problem of starting values can be easily resolved. With the assumption that the state vector is covariance stationary, one can easily check from the state equation that the unconditional mean of the state vector is zero. That is, from the state equation, one can easily see that

E(α_t) = 0,

and the unconditional variance of α_t is easily seen to be

E(α_tα_t′) = E[(Tα_{t−1} + Rη_t)(Tα_{t−1} + Rη_t)′].
Let us denote the LHS of the above expression as Σ. Noting that the state vector depends on shocks only up to t − 1, we get

Σ = TΣT′ + RQR′.

Though this can be solved in many ways, a direct closed form solution is given by the following matrix lemma. We use the vec operator and the following result.

Proposition: Let A, B and C be matrices such that the product ABC exists. Then

vec(ABC) = (C′ ⊗ A) vec(B).

Thus, we vectorize both sides of the expression for Σ and rearrange to get a closed form solution as

vec(Σ) = [I_{m²} − (T ⊗ T)]^{−1} vec(RQR′).

What this implies is that, provided the process is covariance stationary, the Kalman filter recursions can be started with a_{1|0} = 0, and the (m × m) matrix P_{1|0}, whose elements can be expressed as a column vector, is obtained from:

vec(P_{1|0}) = [I_{m²} − (T ⊗ T)]^{−1} vec(RQR′).
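The closed form can be checked against the hand-derived MA(1) matrix of Example 8; a small sketch, with an illustrative value of θ_1:

```python
import numpy as np

theta1 = 0.6                                 # illustrative MA(1) coefficient
T = np.array([[0.0, 1.0], [0.0, 0.0]])
R = np.array([[1.0], [theta1]])
Q = np.array([[1.0]])
m = T.shape[0]

# vec(P_{1|0}) = [I_{m^2} - (T kron T)]^{-1} vec(R Q R')
RQR = R @ Q @ R.T
vecP = np.linalg.solve(np.eye(m * m) - np.kron(T, T),
                       RQR.reshape(-1, order="F"))
P10 = vecP.reshape(m, m, order="F")

# Example 8's hand-derived answer for the MA(1) case
P_hand = np.array([[1 + theta1**2, theta1], [theta1, theta1**2]])
print(np.allclose(P10, P_hand))  # True
```

Note the column-major (`order="F"`) reshape, matching the column-stacking definition of the vec operator used in the proposition.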
The best way to get a grasp of the Kalman recursions is to try them out on a simple model. Let us try them on the simple MA(1) model.

Example 9: Assume for convenience that the process has zero mean. So, the MA(1) model can be written as

z_t = e_t + θ_1 e_{t−1}.

Here m = 2. So from Example 4, we have the state and the measurement equations given as follows:

State equation:

α_t = ( 0  1 ) α_{t−1} + ( 1   ) e_t
      ( 0  0 )           ( θ_1 )

Observation equation:

z_t = (1  0) α_t.

Note that the observation equation has no error. How do we start the recursions? Recall from the prediction equation that we have to first get a_{t|t−1}. That is, for the first period, we need to get
a_{1|0}, the initial state vector. From our discussion about the covariance stationarity properties of the state vector, it is clear that

a_{1|0} = T a_0 = 0.

Next we have to calculate the covariance matrix of the estimation error, i.e. σ²P_{1|0} or σ²P_0. Though we have a formula to calculate such matrices, for the present problem one can find it directly:

P_{1|0} = P_0 = σ^{−2}E(α_1α_1′)
        = σ^{−2}E[( z_1     ) ( z_1   θ_1e_1 )]
                  ( θ_1 e_1 )
        = ( 1 + θ_1²   θ_1  )
          ( θ_1        θ_1² )
Let us calculate the prediction error for z_1. One can easily see that z_{1|0} = 0, and hence the associated prediction error is ν_1 = z_1 itself, and the prediction error variance is given as:

var(ν_1) = σ²(1  0) P_{1|0} (1  0)′
         = σ²(1  0) ( 1 + θ_1²   θ_1  ) ( 1 )
                    ( θ_1        θ_1² ) ( 0 )
         = σ²(1 + θ_1²),   with f_1 = 1 + θ_1².
Application of the updating formula:

a_1 = ( 1 + θ_1²   θ_1  ) ( 1 ) z_1 / (1 + θ_1²)
      ( θ_1        θ_1² ) ( 0 )

    = ( (1 + θ_1²) z_1 ) / (1 + θ_1²)
      ( θ_1 z_1        )

    = ( z_1                  )
      ( θ_1 z_1 / (1 + θ_1²) )
Similarly,

P_1 = ( 1 + θ_1²   θ_1  ) − ( 1 + θ_1²   θ_1  ) ( 1 ) (1  0) ( 1 + θ_1²   θ_1  ) / (1 + θ_1²)
      ( θ_1        θ_1² )   ( θ_1        θ_1² ) ( 0 )        ( θ_1        θ_1² )

    = ( 0   0                 )
      ( 0   θ_1⁴/(1 + θ_1²)   )
Prediction equation for α_2:

a_{2|1} = T a_1
        = ( 0  1 ) ( z_1                  )
          ( 0  0 ) ( θ_1 z_1 / (1 + θ_1²) )
        = ( θ_1 z_1 / (1 + θ_1²) )
          ( 0                    )
And,

P_{2|1} = ( 0  1 ) ( 0   0                 ) ( 0  0 ) + ( 1     θ_1  )
          ( 0  0 ) ( 0   θ_1⁴/(1 + θ_1²)   ) ( 1  0 )   ( θ_1   θ_1² )

        = ( θ_1⁴/(1 + θ_1²)   0 ) + ( 1     θ_1  )
          ( 0                 0 )   ( θ_1   θ_1² )

        = ( (1 + θ_1² + θ_1⁴)/(1 + θ_1²)   θ_1  )
          ( θ_1                            θ_1² )
Predicting z_2:

z_{2|1} = (1  0) ( θ_1 z_1/(1 + θ_1²) ) = θ_1 z_1 / (1 + θ_1²)
                 ( 0                  )

Prediction error ν_2:

ν_2 = z_2 − θ_1 z_1 / (1 + θ_1²),
and

f_2 = (1  0) ( (1 + θ_1² + θ_1⁴)/(1 + θ_1²)   θ_1  ) ( 1 )
             ( θ_1                            θ_1² ) ( 0 )
    = (1 + θ_1² + θ_1⁴)/(1 + θ_1²).

These steps show that, for the MA(1) model, one can calculate the prediction error and its variance using the following recursions:

ν_t = z_t − θ_1 ν_{t−1}/f_{t−1},   t = 1, 2, . . . , T,   where ν_0 = 0, and

f_t = 1 + θ_1^{2t} / (1 + θ_1² + · · · + θ_1^{2(t−1)}).

Note here that the expressions for the prediction error ν_t and the prediction error variance f_t are exactly the same as those obtained using triangular factorization for the MA(1) model.
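The scalar recursions for the MA(1) model can be sketched directly; the function name is illustrative and the value of θ_1 arbitrary:

```python
import numpy as np

def ma1_prediction_errors(z, theta1):
    """Prediction errors nu_t and scaled variances f_t for an MA(1) model,
    using the scalar recursions above (nu_0 = 0, f_0 = 1)."""
    nu, f = 0.0, 1.0
    denom = 1.0                              # 1 + theta^2 + ... + theta^(2(t-1))
    nus, fs = [], []
    for t, zt in enumerate(z, start=1):
        nu = zt - theta1 * nu / f            # nu_t = z_t - theta_1 nu_{t-1}/f_{t-1}
        f = 1.0 + theta1 ** (2 * t) / denom  # f_t = 1 + theta^(2t)/denominator
        denom += theta1 ** (2 * t)
        nus.append(nu)
        fs.append(f)
    return np.array(nus), np.array(fs)

theta1 = 0.5                                 # illustrative value
nus, fs = ma1_prediction_errors([1.2, -0.4, 0.7], theta1)
print(np.isclose(fs[0], 1 + theta1**2))                                  # True
print(np.isclose(fs[1], (1 + theta1**2 + theta1**4) / (1 + theta1**2)))  # True
```

The two printed checks confirm the derived values f_1 = 1 + θ_1² and f_2 = (1 + θ_1² + θ_1⁴)/(1 + θ_1²).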
-
As a final step towards finalising the likelihood function, we shall note the following further simplification. Recall that we had decomposed the likelihood for a set of dependent observations into a likelihood for the independent errors, using the concept of prediction error decomposition, as:

log L(z) = −(T/2) log 2π − (T/2) log σ² − (1/2) Σ_{t=1}^{T} log f_t − (1/2σ²) Σ_{t=1}^{T} ν_t²/f_t.

From our derivation, we can see that ν_t and f_t do not depend on σ², and hence we can concentrate σ² out. This means we have to differentiate the log-likelihood with respect to σ² and get an estimator for σ², say σ̂². So we get

σ̂² = (1/T) Σ_{t=1}^{T} ν_t²/f_t.

Evaluating the log-likelihood at σ² = σ̂² and simplifying, we get

log L(z)_c = −(T/2)(log 2π + 1) − (1/2) Σ_{t=1}^{T} log f_t − (T/2) log σ̂².

We either maximize this log-likelihood or, equivalently, minimize

Σ_{t=1}^{T} log f_t + T log σ̂².
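The concentrated criterion is a one-liner once ν_t and f_t are in hand; a minimal sketch (the helper name is illustrative):

```python
import numpy as np

def concentrated_criterion(nus, fs):
    """Sum(log f_t) + T log(sigma_hat^2), the quantity to be minimized,
    with sigma_hat^2 = (1/T) sum(nu_t^2 / f_t) concentrating sigma^2 out."""
    nus = np.asarray(nus, dtype=float)
    fs = np.asarray(fs, dtype=float)
    T = nus.size
    sigma2_hat = np.mean(nus ** 2 / fs)      # the concentrated-out variance
    return np.sum(np.log(fs)) + T * np.log(sigma2_hat)
```

In practice one would evaluate this criterion over candidate parameter values (for the MA(1), over θ_1, with the invertibility restriction |θ_1| < 1 in mind) and pick the minimizer.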
One can make an initial guess about the underlying parameters and either apply numerical estimation procedures to calculate the derivatives or analytically calculate the derivatives by differentiating the Kalman recursions. In either case one has to keep in mind the restrictions to be imposed on the parameters, especially the MA parameters, to take care of the identification problem. Also, it has been proved in the literature that using Kalman recursions to estimate pure AR models is really not necessary.
-
Kalman Smoothing

We have motivated the discussion of the Kalman filter so far as an algorithm for predicting the state vector, obtaining exact finite sample forecasts as a linear function of past observations. We have also shown how the resulting prediction error and the prediction error variance can be used to evaluate the log-likelihood.

This is sub-optimal if we are interested in estimating the sequence of states. In many cases, the Kalman filter is used to obtain an estimate of the state vector itself. For example, in their model of the business cycle, Stock and Watson show how one may be interested in knowing the state of the economy, or the phase of the business cycle the economy is in, which is unobservable at any given historical point. Stock and Watson suggest that comovements in many macro aggregates have a common element, which may be called the state of the economy, and this is unobservable. They motivate the use of the Kalman filter to obtain an estimate of this unobserved state of the economy.

Sometimes elements of the state vector are even interpreted as estimates of missing observations, which could be higher frequency data points from an observable lower frequency series or simply an estimate of a missing data point. For example, if we have data on a macro aggregate from 1955 through 2014, we may be interested in obtaining an estimate for 1970, which may be missing. Or, we may be interested in extracting monthly data from quarterly data.

Such estimates of the unobserved state of the economy or missing observations can be obtained from smoothed estimates of the state vector, α_t.

Each step of the Kalman recursions gives an estimate of the state vector α_t given all current and past observations. But an econometrician should use all available information to estimate the sequence of states. The Kalman smoother provides these estimates. The only smoothed estimator which utilises all the sample observations is given by
a_{t|T} = E(α_t | I_T)

and the MSE of this smoothed estimate is denoted

P_{t|T} = E[(α_t − a_{t|T})(α_t − a_{t|T})′].

The smoothing equations start from a_{T|T} and P_{T|T} and work backwards.

The expressions for a_{t|T} and P_{t|T}, which may be called the smoothing algorithm, are given below without proof:

a_{t|T} = a_t + P_t*(a_{t+1|T} − T_{t+1}a_t)
P_{t|T} = P_t + P_t*(P_{t+1|T} − P_{t+1|t})P_t*′

where

P_t* = P_tT_{t+1}′P_{t+1|t}^{−1},   t = T − 1, . . . , 1,

with a_{T|T} = a_T and P_{T|T} = P_T.

A set of direct residuals can also be obtained from the smoothed estimators:

e_t = z_t − y_ta_{t|T},   t = 1, . . . , T.

These are not to be confused with the prediction residuals, ν_t, defined earlier.
-
We shall explain the smoothing algorithm with an example. Consider the simple model

z_t = α_t + N_t,   N_t ~ WN(0, σ²)
α_t = α_{t−1} + η_t,   η_t ~ WN(0, σ²q)

where the state, α_t, and the observation, z_t, are scalars. The state, which follows a random walk process, cannot be observed directly as it is contaminated by noise. This is the simple signal plus noise model. We assume that q is known. Also note that in this example we have allowed the observation z_t to be measured with error, N_t. For this example, note that T = 1, R = 1 and y_t = 1.
The prediction equations for this example are

a_{t|t−1} = a_{t−1},   P_{t|t−1} = P_{t−1} + q,

and the updating equations are

a_t = a_{t|t−1} + P_{t|t−1}(z_t − a_{t|t−1})/(P_{t|t−1} + 1)

and

P_t = P_{t|t−1} − P²_{t|t−1}/(P_{t|t−1} + 1).

We shall demonstrate how to predict, update and smooth with 4 observations: z_1 = 4.4, z_2 = 4, z_3 = 3.5 and z_4 = 4.6. The initial state vector has the property α_0 ~ N(a_0, σ²P_0), and we have been given that a_0 = 4, P_0 = 12 and q = 4, so that RQR′ = 4 and h = 1.

From the prediction equations we have a_{1|0} = 4 and P_{1|0} = 16, so that from the updating equations we have

a_1 = 4 + (12 + 4)(4.4 − 4)/(12 + 4 + 1) = 4.376

and

P_1 = 16 − 16²/17 = 0.941.

Since y_t = 1 in the measurement equation for all t, the MMSE of z_t is always a_{t|t−1}. So, z_{2|1} = a_{2|1} = a_1 = 4.376.
Repeating the calculations for t = 2, 3 and 4, we get the following results:

Smoothed estimators and residuals

  t          1        2        3        4
  z_t        4.4      4.0      3.5      4.6
  a_t        4.376    4.063    3.597    4.428
  P_t        0.941    0.832    0.829    0.828
  ν_t        0.400   -0.376   -0.563    1.003
  a_{t|T}    4.306    4.007    3.739    4.428
  P_{t|T}    0.785    0.710    0.711    0.828
  e_t        0.094    0.007   -0.239    0.172
From the above table we also have: a_{2|1} = 4.376, P_{2|1} = 4.941, a_{3|2} = 4.063, P_{3|2} = 4.832, a_{4|3} = 3.597 and P_{4|3} = 4.829.

From the table, the final estimates are seen to be a_4 = 4.428 and P_4 = 0.828.

These values can now be used in the smoothing algorithm. The algorithm, for the current example, reduces to

a_{t|T} = a_t + (P_t/P_{t+1|t})(a_{t+1|T} − a_t)
P_{t|T} = P_t + (P_t/P_{t+1|t})²(P_{t+1|T} − P_{t+1|t}),   t = T − 1, . . . , 1.

Since a_{4|4} = a_4 and P_{4|4} = P_4, we can apply the smoothing algorithm to obtain the smoothed estimates a_{3|4} and P_{3|4} and work backwards. So we have

a_{3|4} = 3.597 + (0.829/4.829)(4.428 − 3.597) = 3.739
P_{3|4} = 0.829 + (0.829/4.829)²(0.828 − 4.829) = 0.711.

The rest of the smoothed estimates have been displayed in the table above. The smoothed estimates of the unobserved state vector are displayed in the row a_{t|T} of the table.

Both the direct and the prediction error residuals have been calculated using the formulae e_t = z_t − a_{t|T} and ν_t = z_t − a_{t−1} respectively.
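The worked example can be reproduced in a few lines; a minimal sketch of the forward filter and the backward smoothing pass for this local-level model (variable names are illustrative):

```python
import numpy as np

z = [4.4, 4.0, 3.5, 4.6]
q, a, P = 4.0, 4.0, 12.0                     # q, a_0, P_0 from the example
a_filt, P_filt, a_pred, P_pred = [], [], [], []
for zt in z:
    ap, Pp = a, P + q                        # prediction: a_{t|t-1}, P_{t|t-1}
    a_pred.append(ap)
    P_pred.append(Pp)
    a = ap + Pp * (zt - ap) / (Pp + 1.0)     # updating equations
    P = Pp - Pp**2 / (Pp + 1.0)
    a_filt.append(a)
    P_filt.append(P)

# Backward smoothing pass: a_{t|T}, P_{t|T}
a_sm, P_sm = a_filt[:], P_filt[:]
for t in range(len(z) - 2, -1, -1):
    Pstar = P_filt[t] / P_pred[t + 1]        # P_t* = P_t / P_{t+1|t} (T = 1 here)
    a_sm[t] = a_filt[t] + Pstar * (a_sm[t + 1] - a_filt[t])
    P_sm[t] = P_filt[t] + Pstar**2 * (P_sm[t + 1] - P_pred[t + 1])

print([round(v, 3) for v in a_filt])  # [4.376, 4.063, 3.597, 4.428]
print([round(v, 3) for v in a_sm])
```

The filtered values reproduce the table's a_t row exactly; the smoothed values agree with the a_{t|T} row up to the rounding used in the table's intermediate steps.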
Appendix

Derivation of the updating equations

In this Appendix we shall derive the important steps leading to the updating equation and the associated covariance matrix of the estimation error. Before discussing the steps involved, we shall digress a bit to delve into the following important material.

1. Consider the model:

Z = Yβ + N,   N ~ (0, σ²Φ),

where Z is (T × 1), Y is (T × m), β is (m × 1) and N is (T × 1). We shall call this model the sample information.

(a) Case 1: If β is fixed in the above model, we have the usual GLS estimator given as (Y′Φ^{−1}Y)^{−1}Y′Φ^{−1}Z, and this would be BLUE.

(b) Case 2: Suppose the vector β is either partially or fully random or stochastic. The question now is, is the GLS estimator still BLUE? The answer is that it still is, according to the extended Gauss-Markov theorem, enunciated by Duncan and Horn (JASA, 1972, pp. 815-21). They proved that the GLS estimator now satisfies a condition called the best, linear, unconditionally unbiased (or u-unbiased) estimator. [An estimator is u-unbiased if its estimation error has expectation zero.]
(c) Case 3: Suppose that β is still fully or partially random. Additionally, suppose that we have some prior information about it. How can we use it to update the estimator of β already obtained? This becomes a special case of the mixed estimation procedure developed by Theil and Goldberger (see Theil, Principles of Econometrics, pp. 347-52), where we incorporate such prior information with the sample information. Suppose in our case the prior information is given in the form below:

(β_0 − β) ~ (0, σ²P_0),

where β_0 is a known vector and P_0 is a known positive definite matrix. Then, to get an updated estimator that combines this prior information with the sample information, we first construct the augmented model:

( β_0 ) = ( I ) β + ( β_0 − β )
( Z   )   ( Y )     ( N       )
More concisely,

Z̃ = Ỹβ + Ñ,   where E(Ñ) = 0 and

E(ÑÑ′) = σ²V = σ² ( P_0   0 )
                  ( 0     Φ )

Using the extended Gauss-Markov theorem, we have the estimator of β given as:

β̂ = (Ỹ′V^{−1}Ỹ)^{−1}Ỹ′V^{−1}Z̃.

Using the original notation, this can be re-written as:

β̂ = P*(P_0^{−1}β_0 + Y′Φ^{−1}Z),   where P* = (P_0^{−1} + Y′Φ^{−1}Y)^{−1}.

β̂ is now the updated MMSE of β, with

(β̂ − β) ~ (0, σ²P*).

We are going to use this principle of combining sample information and prior information in deriving the updating equation of the KF recursion.
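The mixed estimator and the GLS estimator on the augmented system are the same thing, which can be checked numerically; a sketch with illustrative dimensions and prior values (nothing here is from the text):

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative sample information: Z = Y beta + N, N ~ (0, sigma^2 Phi)
Y = rng.standard_normal((20, 2))
beta_true = np.array([1.0, -0.5])
Phi = np.eye(20)
Z = Y @ beta_true + rng.standard_normal(20)

# Illustrative prior information: (beta0 - beta) ~ (0, sigma^2 P0)
beta0 = np.array([0.8, -0.3])
P0 = np.eye(2)

# Mixed estimator: beta_hat = P*(P0^{-1} beta0 + Y' Phi^{-1} Z)
Pinv = np.linalg.inv(P0) + Y.T @ np.linalg.inv(Phi) @ Y
Pstar = np.linalg.inv(Pinv)
beta_hat = Pstar @ (np.linalg.inv(P0) @ beta0 + Y.T @ np.linalg.inv(Phi) @ Z)

# Same estimate from GLS on the augmented system [beta0; Z] = [I; Y] beta + noise
Y_aug = np.vstack([np.eye(2), Y])
Z_aug = np.concatenate([beta0, Z])
V = np.block([[P0, np.zeros((2, 20))], [np.zeros((20, 2)), Phi]])
beta_gls = np.linalg.solve(Y_aug.T @ np.linalg.inv(V) @ Y_aug,
                           Y_aug.T @ np.linalg.inv(V) @ Z_aug)
print(np.allclose(beta_hat, beta_gls))  # True
```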
Updating the state vector

The role of the updating equation is to incorporate the new information in z_t, the moment we are at time t, with the information already available in the estimator a_{t|t−1}. This problem is directly analogous to the one that we discussed under the extended Gauss-Markov theorem and Theil's mixed estimation procedure, where prior information was combined with the sample information. For our case, the prior information is in

(a_{t|t−1} − α_t) ~ (0, σ²P_{t|t−1}),

while the sample information is derived from the measurement equation. Thus the augmented model is:

a_{t|t−1} = α_t + (a_{t|t−1} − α_t)
z_t = y_tα_t + N_t.

In matrix notation,

( a_{t|t−1} ) = ( I   ) α_t + ( a_{t|t−1} − α_t )
( z_t       )   ( y_t )       ( N_t             )

The disturbance term has zero expectation and covariance matrix

E[( a_{t|t−1} − α_t ) ( (a_{t|t−1} − α_t)′   N_t )] = σ² ( P_{t|t−1}   0 )
  ( N_t             )                                    ( 0           h )
More precisely,

Z̃_t = ỹ_tα_t + ẽ_t,

where E(ẽ_t) = 0 and E(ẽ_tẽ_t′) = σ²V, where

V = ( P_{t|t−1}   0 )
    ( 0           h )

Now, using the extended Gauss-Markov theorem, we can write

a_t = (ỹ_t′V^{−1}ỹ_t)^{−1}ỹ_t′V^{−1}Z̃_t.

Using the original notation, we can re-write the expression for a_t as follows:

a_t = P_t(P_{t|t−1}^{−1}a_{t|t−1} + y_t′z_t/h),

where

P_t = (P_{t|t−1}^{−1} + y_t′y_t/h)^{−1}.

Thus (a_t − α_t) ~ (0, σ²P_t).
The updating formula can be put in a different way using a matrix inversion lemma. The advantage of such an adjustment is that we don't have to invert any matrix in the updating equations.

Lemma: For any (n × n) matrix D defined by

D = (A + BCB′)^{−1},

where A and C are non-singular matrices of order n and m respectively and B is (n × m), we have:

D = A^{−1} − A^{−1}B(C^{−1} + B′A^{−1}B)^{−1}B′A^{−1}.

We can use this lemma on the expression for P_t by noting that P_t = D, P_{t|t−1}^{−1} = A, y_t′ = B and C = h^{−1}, and it follows that

P_t = P_{t|t−1} − P_{t|t−1}y_t′y_tP_{t|t−1}/f_t,   where f_t = y_tP_{t|t−1}y_t′ + h.
One can make it even more compact by writing

a_t = [P_{t|t−1} − P_{t|t−1}y_t′y_tP_{t|t−1}/f_t](P_{t|t−1}^{−1}a_{t|t−1} + y_t′z_t/h)
    = a_{t|t−1} + P_{t|t−1}y_t′z_t/h − P_{t|t−1}y_t′y_ta_{t|t−1}/f_t − P_{t|t−1}y_t′y_tP_{t|t−1}y_t′z_t/(f_th)
    = a_{t|t−1} + f_t^{−1}P_{t|t−1}y_t′[z_tf_t/h − y_ta_{t|t−1} − y_tP_{t|t−1}y_t′z_t/h].

Substituting f_t − y_tP_{t|t−1}y_t′ = h in the above term and re-arranging, we get

a_t = a_{t|t−1} + P_{t|t−1}y_t′(z_t − y_ta_{t|t−1})/f_t.

Note that the expressions for a_t and P_t in this appendix are exactly the ones we have used as the updating equation and the covariance matrix of the estimation error, respectively, in the main text.
Note also that in the discussion so far we have assumed the presence of an additional noise in the measurement equation; that is, h > 0. If we don't, then note that V would become singular. But we also have to note that in our examples of state space representation of ARMA models, we have assumed that the measurement equation has no additional error. However, this should not matter, since through these adjustments, note that we have isolated the variance component as an additive scalar, which, when it becomes zero, does not affect our calculations.