
A Unified Filter for Simultaneous Input and State Estimation

of Linear Discrete-time Stochastic Systems

Sze Zheng Yong a Minghui Zhu b Emilio Frazzoli a

a Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA (e-mail: [email protected], [email protected]).

b Department of Electrical Engineering, Pennsylvania State University, 201 Old Main, University Park, PA 16802, USA (e-mail: [email protected]).

Abstract

In this paper, we present a unified optimal and exponentially stable filter for linear discrete-time stochastic systems that simultaneously estimates the states and unknown inputs in an unbiased minimum-variance sense, without making any assumptions on the direct feedthrough matrix. We also derive input and state observability/detectability conditions, and analyze their connection to the convergence and stability of the estimator. We discuss two variations of the filter and their optimality and stability properties, and show that filters in the literature, including the Kalman filter, are special cases of the filter derived in this paper. Finally, illustrative examples are given to demonstrate the performance of the unified unbiased minimum-variance filter.

1 Introduction

The term filter or estimator is commonly used to refer to systems that extract information about a quantity of interest from measured data corrupted by noise. Kalman filtering provides the tool needed for obtaining a reliable estimate when the system is linear and when the disturbance inputs or the unknown parameters are well modeled by zero-mean, Gaussian white noise. However, in many instances, the exogenous input cannot be modeled as a Gaussian stochastic process, rendering the estimates unreliable.

For example, consider the problem of estimating the state and inferring the intent of another vehicle at an intersection, for instance, for ensuring the safety of autonomous or semi-autonomous vehicles [1]. In this case, the input of the other vehicle is inaccessible/unmeasurable, and is not well modeled by a zero-mean Gaussian white noise process. Thus, the standard Kalman filter does not yield an optimal estimate. Nonetheless, we want to be able to estimate the states and inputs of the other vehicle based on noisy measurements for purposes of collision avoidance, route planning, etc.

Similar problems can be found across a wide range of disciplines, from the real-time estimation of mean areal precipitation during a storm [2] to fault detection and diagnosis [3] to input estimation in physiological systems [4]. Thus, this filtering problem in the presence of unknown inputs has steadily made it to the forefront in recent decades.

Literature review. Much of the research focus has been on state estimation of systems with unknown inputs without actually estimating the inputs. An optimal filter that computes a minimum-variance unbiased (MVU) state estimate for a system with unknown inputs was first developed for linear systems without direct feedthrough in [2]. This design was extended to a more general parameterized solution by [5], and eventually to state estimation of systems with direct feedthrough in [6, 7, 8]. Similarly, while H∞ filters (e.g., [9, 10, 11, 12]) can deal with non-Gaussian disturbance inputs by minimizing the worst-case state estimation error, the unknown input is not estimated. However, the problem of estimating the unknown input itself is often as important as the state information, and should also be considered.

Palanthandalam-Madapusi and Bernstein [13] proposed an approach to reconstruct the unknown inputs, in a process that is decoupled from state estimation, with an emphasis on unbiasedness but neglecting the optimality of the estimate. On the other hand, Hsieh [14] and Gillijns and De Moor [15] developed simultaneous input and state filters that are optimal in the minimum-variance unbiased sense for systems without direct feedthrough.


Extensions to systems with a full rank direct feedthrough matrix were proposed by Gillijns and De Moor [16], Fang et al. [17] and Yong et al. [18]. In an attempt to deal with systems with a rank deficient direct feedthrough matrix, Hsieh [19] allowed the input estimate to be biased. Thus, the problem of finding a simultaneous state and input filter for systems with a rank deficient direct feedthrough matrix that is both unbiased and has minimum variance remains open. Moreover, a unified MVU filter that works for all cases remains elusive.

Another set of relevant literature pertains to the stability of state and input filters, since optimality does not imply stability and vice versa. However, to the best of our knowledge, the literature on this subject is limited to linear time-invariant systems [8, 17, 20]. Yet another related body of literature concerns state and input observability and detectability conditions, also known as strong or perfect observability and detectability, which will be shown to be related to the stability of the filter dynamics for both linear time-varying and time-invariant systems with unknown inputs. Some conditions for state and input observability were derived in [13, 21, 22].

Contributions. We introduce a unified filter for simultaneously estimating both the state and the unknown input such that the estimates are unbiased and have minimum variance, with no restrictions on the direct feedthrough matrix of the linear discrete-time stochastic system. Within this framework, we propose two variants of the MVU state and input estimator, which are generalizations of the estimators in the literature, specifically of [15, 16, 18], and of the Kalman filter. Furthermore, we derive sufficient conditions for filter stability for linear time-varying systems with unknown inputs, an important problem that has been previously unexplored; while for linear time-invariant systems, necessary and sufficient conditions for convergence of the filter gains to a steady-state solution are provided. The key insight we gained is that the exponential stability of the filter is directly related to the strong detectability of the time-varying system, without which unbiased state and input estimates cannot be obtained even in the absence of stochastic noise. We shall also show that one of the filter variants we propose is globally optimal (i.e., optimal over the class of all linear state and input estimators as in [23]).

In connection to the existing literature, this paper presents a combination of several ideas from [8, 15, 16] and our recent work [18] into a unified filter, in a manner that provably preserves and extends the nice properties of these filters. However, there are a number of distinctions between our filter and the above referenced filters. In particular, we show that the state-only filter in [8] implicitly estimates the unknown inputs in a suboptimal manner, and so does the approach for input estimation in [16] (employed in one of the two variants of our filter). In contrast, our optimal filter variant uses the approaches of our previous work in [18] and of generalized least squares estimation, which lead to the desired optimality of the input estimates. In addition, we give sufficient conditions for filter stability for linear time-varying systems, which clearly cannot be carried over from the existing literature (including [8, 15, 16]) for linear time-invariant systems.

Notation. We first summarize the notation used throughout the paper. $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space, $\mathbb{C}$ the field of complex numbers and $\mathbb{N}$ the nonnegative integers. For a vector of random variables, $v \in \mathbb{R}^n$, the expectation is denoted by $\mathbb{E}[v]$. Given a matrix $M \in \mathbb{R}^{p\times q}$, its transpose, inverse, Moore-Penrose pseudoinverse, range, trace and rank are denoted by $M^\top$, $M^{-1}$, $M^\dagger$, $\mathrm{Ra}(M)$, $\mathrm{tr}(M)$ and $\mathrm{rk}(M)$. For a symmetric matrix $S$, $S \succ 0$ and $S \succeq 0$ indicate that $S$ is positive definite and positive semidefinite, respectively.

2 Problem Statement

Consider the linear time-varying discrete-time system

$$\begin{aligned}
x_{k+1} &= A_k x_k + B_k u_k + G_k d_k + w_k\\
y_k &= C_k x_k + D_k u_k + H_k d_k + v_k
\end{aligned}\tag{1}$$

where $x_k \in \mathbb{R}^n$ is the state vector at time $k$, $u_k \in \mathbb{R}^m$ is a known input vector, $d_k \in \mathbb{R}^p$ is an unknown input vector, and $y_k \in \mathbb{R}^l$ is the measurement vector. The process noise $w_k \in \mathbb{R}^n$ and the measurement noise $v_k \in \mathbb{R}^l$ are assumed to be mutually uncorrelated, zero-mean, white random signals with known covariance matrices, $Q_k = \mathbb{E}[w_kw_k^\top] \succeq 0$ and $R_k = \mathbb{E}[v_kv_k^\top] \succ 0$, respectively. Without loss of generality, we assume throughout the paper that $n \ge l \ge 1$, $l \ge p \ge 0$ and $m \ge 0$, and that the current time variable $r$ is strictly nonnegative. $x_0$ is also assumed to be independent of $v_k$ and $w_k$ for all $k$.

The matrices $A_k$, $B_k$, $C_k$, $D_k$, $G_k$ and $H_k$ are known and bounded. Note that no assumption is made on $H_k$: it may be the zero matrix (no direct feedthrough), and it need not have full column rank when there is direct feedthrough. Without loss of generality, we assume $\max_k \mathrm{rk}([G_k^\top\ H_k^\top]) = p$. (Otherwise, we can retain the linearly independent columns and the "remaining" inputs still affect the system in the same way.)

The estimator design problem addressed in this paper can be stated as follows: Given a linear discrete-time stochastic system with unknown inputs (1), design a globally optimal and stable filter that simultaneously estimates the system states and unknown inputs in an unbiased minimum-variance manner.
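To fix ideas, the following is a minimal numpy sketch that simulates system (1) driven by a known input and an unknown input. All matrices, the ramp-like unknown input and the variable names are illustrative placeholders of our own choosing, not values from the paper.

```python
import numpy as np

# Minimal simulation of system (1) with a known input u_k and an unknown input d_k.
rng = np.random.default_rng(0)
n, m, p, l, N = 2, 1, 1, 2, 50

A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.5], [1.0]])
G = np.array([[1.0], [0.3]])
C = np.eye(l, n)
D = np.zeros((l, m))
H = np.array([[0.0], [1.0]])          # direct feedthrough, not necessarily full column rank
Q = 0.01 * np.eye(n)                  # process noise covariance
R = 0.04 * np.eye(l)                  # measurement noise covariance

x = np.zeros(n)
xs, ys, ds = [], [], []
for k in range(N):
    u = np.array([np.sin(0.1 * k)])              # known input
    d = np.array([1.0 if k >= 20 else 0.0])      # unknown input (e.g., a fault or bias)
    w = rng.multivariate_normal(np.zeros(n), Q)
    v = rng.multivariate_normal(np.zeros(l), R)
    y = C @ x + D @ u + H @ d + v
    xs.append(x); ys.append(y); ds.append(d)
    x = A @ x + B @ u + G @ d + w
```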

3 Preliminary Material

3.1 System Transformation

We first carry out a transformation of the system to decouple the output equation into two components, one with a full rank direct feedthrough matrix and the other without direct feedthrough. In this form, the filter can be designed by leveraging existing approaches for both cases (e.g., [15, 18]).

Let $p_{H_k} := \mathrm{rk}(H_k)$. Using the singular value decomposition, we rewrite the direct feedthrough matrix $H_k$ as
$$H_k = \begin{bmatrix} U_{1,k} & U_{2,k}\end{bmatrix}\begin{bmatrix}\Sigma_k & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} V_{1,k}^\top\\ V_{2,k}^\top\end{bmatrix}\tag{2}$$
where $\Sigma_k \in \mathbb{R}^{p_{H_k}\times p_{H_k}}$ is a diagonal matrix of full rank, $U_{1,k} \in \mathbb{R}^{l\times p_{H_k}}$, $U_{2,k} \in \mathbb{R}^{l\times(l-p_{H_k})}$, $V_{1,k} \in \mathbb{R}^{p\times p_{H_k}}$, $V_{2,k} \in \mathbb{R}^{p\times(p-p_{H_k})}$, and $U_k := [U_{1,k}\ U_{2,k}]$ and $V_k := [V_{1,k}\ V_{2,k}]$ are unitary matrices. Note that in the case with no direct feedthrough, $\Sigma_k$, $U_{1,k}$ and $V_{1,k}$ are empty matrices¹, and $U_{2,k}$ and $V_{2,k}$ are arbitrary unitary matrices.

¹ We adopt the convention that the inverse of an empty matrix is also an empty matrix and assume that operations with empty matrices are possible. These features are readily available in many simulation software products such as MATLAB, LabVIEW and GNU Octave. Otherwise, a conditional statement can be included to bypass this case.
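The footnote above mentions MATLAB, LabVIEW and GNU Octave; as a small sketch of our own (not from the paper), numpy likewise supports zero-width matrices, so the $p_{H_k} = 0$ case needs no special branch:

```python
import numpy as np

# When H_k = 0, the SVD-based split yields empty factors: U_1 has shape (l, 0),
# V_1 has shape (p, 0) and Sigma is 0-by-0. numpy handles these without branching.
l, p = 3, 2
U1 = np.zeros((l, 0))
Sigma = np.zeros((0, 0))
V1 = np.zeros((p, 0))

H_reconstructed = U1 @ Sigma @ V1.T      # -> (l, p) matrix of zeros
d = np.array([1.0, -2.0])
d1 = V1.T @ d                            # -> empty vector, shape (0,)
print(H_reconstructed.shape, d1.shape)   # (3, 2) (0,)
```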

Then, as suggested in [8], we define two orthogonal components of the unknown input given by
$$d_{1,k} = V_{1,k}^\top d_k, \qquad d_{2,k} = V_{2,k}^\top d_k.\tag{3}$$
Since $V_k$ is unitary, $d_k = V_{1,k}d_{1,k} + V_{2,k}d_{2,k}$ and the system (1) can be rewritten as

$$\begin{aligned}
x_{k+1} &= A_kx_k + B_ku_k + G_kV_{1,k}d_{1,k} + G_kV_{2,k}d_{2,k} + w_k\\
&= A_kx_k + B_ku_k + G_{1,k}d_{1,k} + G_{2,k}d_{2,k} + w_k
\end{aligned}\tag{4}$$
$$\begin{aligned}
y_k &= C_kx_k + D_ku_k + H_kV_{1,k}d_{1,k} + H_kV_{2,k}d_{2,k} + v_k\\
&= C_kx_k + D_ku_k + H_{1,k}d_{1,k} + v_k,
\end{aligned}\tag{5}$$

where $G_{1,k} := G_kV_{1,k}$, $G_{2,k} := G_kV_{2,k}$ and $H_{1,k} := H_kV_{1,k} = U_{1,k}\Sigma_k$. Next, as aforesaid, we decouple the output $y_k$ using a nonsingular transformation
$$T_k = \begin{bmatrix} T_{1,k}\\ T_{2,k}\end{bmatrix} = \begin{bmatrix} I_{p_{H_k}} & -U_{1,k}^\top R_kU_{2,k}(U_{2,k}^\top R_kU_{2,k})^{-1}\\ 0 & I_{(l-p_{H_k})}\end{bmatrix}\begin{bmatrix} U_{1,k}^\top\\ U_{2,k}^\top\end{bmatrix}\tag{6}$$
to obtain $z_{1,k} \in \mathbb{R}^{p_{H_k}}$ and $z_{2,k} \in \mathbb{R}^{l-p_{H_k}}$ given by
$$\begin{aligned}
z_{1,k} &= T_{1,k}y_k = C_{1,k}x_k + D_{1,k}u_k + \Sigma_kd_{1,k} + v_{1,k}\\
z_{2,k} &= T_{2,k}y_k = C_{2,k}x_k + D_{2,k}u_k + v_{2,k}
\end{aligned}\tag{7}$$
where $C_{1,k} := T_{1,k}C_k$, $C_{2,k} := T_{2,k}C_k = U_{2,k}^\top C_k$, $D_{1,k} := T_{1,k}D_k$, $D_{2,k} := T_{2,k}D_k = U_{2,k}^\top D_k$, $v_{1,k} := T_{1,k}v_k$ and $v_{2,k} := T_{2,k}v_k = U_{2,k}^\top v_k$. This transform is also chosen such that the measurement noise terms for the decoupled outputs are uncorrelated.

$$\begin{gathered}
Z_r := \begin{bmatrix} Z_{1,r}\\ Z_{2,r}\end{bmatrix},\quad Z_{q,r} := \begin{bmatrix} z_{q,0}^\top & z_{q,1}^\top & \cdots & z_{q,r}^\top\end{bmatrix}^\top\ \forall\, q \in \{1,2\},\quad Z_{1,r} \in \mathbb{R}^{\sum_{k=0}^{r}p_{H_k}},\quad Z_{2,r} \in \mathbb{R}^{(r+1)l-\sum_{k=0}^{r}p_{H_k}},\\[2pt]
\mathcal{D}_r := \begin{bmatrix} D_{1,r}\\ D_{2,r}\end{bmatrix},\quad D_{1,r} := \begin{bmatrix} d_{1,0}^\top & d_{1,1}^\top & \cdots & d_{1,r}^\top\end{bmatrix}^\top \in \mathbb{R}^{\sum_{k=0}^{r}p_{H_k}},\quad D_{2,r} := \begin{bmatrix} d_{2,0}^\top & d_{2,1}^\top & \cdots & d_{2,r-1}^\top\end{bmatrix}^\top \in \mathbb{R}^{rp-\sum_{k=0}^{r-1}p_{H_k}},\\[2pt]
p_{H_k} = \mathrm{rk}(H_k)\ \forall\, 0 \le k \le r,\quad \Phi_{(i,i)} := \hat A_i := A_i - G_{1,i}\Sigma_i^{-1}C_{1,i},\quad \Phi_{(i,\,j>i)} := \hat A_j\cdots\hat A_i,\quad\text{and }\forall\, q \in \{1,2\},\ s \in \{1,2\},\\[4pt]
\mathcal{O}_{q,r} := \begin{bmatrix} C_{q,0}\\ C_{q,1}\hat A_0\\ C_{q,2}\Phi_{(0,1)}\\ \vdots\\ C_{q,r-1}\Phi_{(0,r-2)}\\ C_{q,r}\Phi_{(0,r-1)}\end{bmatrix},\qquad
\mathcal{I}_{(q,s),r} := \begin{bmatrix} 0 & 0 & \cdots & 0 & 0\\ C_{q,1}G_{s,0} & 0 & \cdots & 0 & 0\\ C_{q,2}\hat A_1G_{s,0} & C_{q,2}G_{s,1} & \cdots & 0 & 0\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ C_{q,r-1}\Phi_{(1,r-2)}G_{s,0} & C_{q,r-1}\Phi_{(2,r-2)}G_{s,1} & \cdots & C_{q,r-1}G_{s,r-2} & 0\\ C_{q,r}\Phi_{(1,r-1)}G_{s,0} & C_{q,r}\Phi_{(2,r-1)}G_{s,1} & \cdots & C_{q,r}\hat A_{r-1}G_{s,r-2} & C_{q,r}G_{s,r-1}\end{bmatrix}
\end{gathered}\tag{$\star$}$$

The covariances of $v_{1,k}$ and $v_{2,k}$ can then be found as follows:
$$\begin{aligned}
R_{1,k} &:= \mathbb{E}[v_{1,k}v_{1,k}^\top] = T_{1,k}R_kT_{1,k}^\top \succ 0\\
R_{2,k} &:= \mathbb{E}[v_{2,k}v_{2,k}^\top] = T_{2,k}R_kT_{2,k}^\top = U_{2,k}^\top R_kU_{2,k} \succ 0\\
R_{12,(k,i)} &:= \mathbb{E}[v_{1,k}v_{2,i}^\top] = T_{1,k}R_kT_{2,k}^\top = U_{1,k}^\top R_kU_{2,k} - U_{1,k}^\top R_kU_{2,k}(U_{2,k}^\top R_kU_{2,k})^{-1}U_{2,k}^\top R_kU_{2,k} = 0,\quad \forall k, i \in \mathbb{N}.
\end{aligned}\tag{8}$$

Since the initial state, process noise and measurement noise are assumed to be uncorrelated, the covariances of $v_{1,k}$ and $v_{2,k}$ with the initial state and process noise are
$$\begin{aligned}
\mathbb{E}[v_{1,k}w_i^\top] &= T_{1,k}\mathbb{E}[v_kw_i^\top] = 0\\
\mathbb{E}[v_{2,k}w_i^\top] &= T_{2,k}\mathbb{E}[v_kw_i^\top] = 0\\
\mathbb{E}[v_{1,k}v_{1,i}^\top] &= T_{1,k}\mathbb{E}[v_kv_i^\top]T_{1,i}^\top = 0,\quad \forall k \ne i\\
\mathbb{E}[v_{2,k}v_{2,i}^\top] &= T_{2,k}\mathbb{E}[v_kv_i^\top]T_{2,i}^\top = 0,\quad \forall k \ne i\\
\mathbb{E}[v_{1,k}x_0^\top] &= T_{1,k}\mathbb{E}[v_kx_0^\top] = 0\\
\mathbb{E}[v_{2,k}x_0^\top] &= T_{2,k}\mathbb{E}[v_kx_0^\top] = 0.
\end{aligned}\tag{9}$$

3.2 Input and State Observability and Detectability

Similar to the analysis of the convergence of the Kalman filter, we will show in Section 5 that the convergence of the unified filter is directly related to the notion of input and state observability and detectability (with $w_k = v_k = 0$, and, without loss of generality, assuming that $B_k = D_k = 0$), also known as strong or perfect observability and detectability (e.g., see [21, 22, 24]), defined as follows:

Definition 1 (Strong observability) The linear system (1) is strongly observable, or equivalently state and input observable or perfectly observable, if the initial condition $x_0$ and the unknown input sequence up to time $r-1$, $\{d_i\}_{i=0}^{r-1}$, and specifically $\mathcal{D}_r \in \mathbb{R}^{rp+p_{H_r}}$, can be uniquely determined from the measured output sequence $\{y_i\}_{i=0}^{r}$, or equivalently $Z_r \in \mathbb{R}^{(r+1)l}$, for a large enough number of observations, i.e., $r \ge r_0$ for some $r_0 \in \mathbb{N}$, where $\mathcal{D}_r$ and $Z_r$ are given in ($\star$).

Next, we present the conditions for strong observability for the time-varying and time-invariant cases.

Theorem 2 (Strong observability (time-varying)) A linear time-varying discrete-time system is input and state observable if and only if
$$\mathrm{rk}\bigl(\begin{bmatrix}\mathcal{O}_{2,r} & \mathcal{I}_{(2,2),r}\end{bmatrix}\bigr) = n + rp - \textstyle\sum_{k=0}^{r-1}p_{H_k}\tag{10}$$
where $p_{H_k}$, as well as the observability and invertibility matrices $\mathcal{O}_{2,r} \in \mathbb{R}^{((r+1)l-\sum_{k=0}^{r}p_{H_k})\times n}$ and $\mathcal{I}_{(2,2),r} \in \mathbb{R}^{((r+1)l-\sum_{k=0}^{r}p_{H_k})\times(rp-\sum_{k=0}^{r-1}p_{H_k})}$, are given in ($\star$).

Necessary conditions for (10) to hold are

(I) $r \ge r_{0,r}$ and $l \ge p+1$, or $l = p = n$ and $p_{H_r} = 0$,
(II) (a) $\mathrm{rk}(\mathcal{O}_{2,r}) = n$,
(b) $\mathrm{rk}(\mathcal{I}^k_{(2,2),r}) = p - p_{H_{k-1}}$, $\forall\, 1 \le k \le r$,

where $r_{0,r} := \lceil\frac{n-l-p_{H_r}}{l-p}\rceil$, $\lceil a\rceil$ is the smallest integer not less than $a$ and $\mathcal{I}^k_{(2,2),r}$ is the $k$-th block column of $\mathcal{I}_{(2,2),r}$.

Proof. The system transformation given by (6) transforms the output equations such that the $d_{1,k}$ component can be determined from only the current output measurement and the previous state and input estimates. Specifically, from (4) and (7), and ignoring the known input and noise terms, we find
$$D_{1,r} = -\bar\Sigma\begin{bmatrix}\mathcal{O}_{1,r} & \mathcal{I}_{(1,2),r}\end{bmatrix}\begin{bmatrix}x_0\\ D_{2,r}\end{bmatrix} + \bigl(I_{\sum_{k=0}^{r}p_{H_k}} - \mathcal{I}_{(1,1),r}\bigr)\bar\Sigma Z_{1,r},$$
where $\bar\Sigma := \mathrm{diag}(\Sigma_0^{-1}, \dots, \Sigma_r^{-1})$. Substituting this into the output equation $z_{2,k}$ in (4), we observe that the initial state $x_0$ and the unknown input $D_{2,r}$ (and consequently $D_{1,r}$ from the previous equation) can be obtained from
$$\begin{bmatrix}\mathcal{O}_{2,r} & \mathcal{I}_{(2,2),r}\end{bmatrix}\begin{bmatrix}x_0\\ D_{2,r}\end{bmatrix} = Z_{2,r} - \mathcal{I}_{(2,1),r}\,\mathrm{diag}(G_{1,0}\Sigma_0^{-1}, \dots, G_{1,r-1}\Sigma_{r-1}^{-1})\,Z_{1,r-1}.$$
Thus, the linear system has a unique solution if and only if (10) holds:

(I) The linear system is not underdetermined, i.e., $(r+1)l - \sum_{k=0}^{r}p_{H_k} \ge n + rp - \sum_{k=0}^{r-1}p_{H_k} \Rightarrow (r+1)l \ge n + rp + p_{H_r}$. Thus, (I) holds.
(II) The matrix $[\mathcal{O}_{2,r}\ \ \mathcal{I}_{(2,2),r}]$ has full column rank. For this to hold, the following are necessary:
(a) $\mathcal{O}_{2,r}$ has full column rank.
(b) $\mathcal{I}^k_{(2,2),r}$ has full column rank, $\forall\, 1 \le k \le r$.

Theorem 3 (Strong observability (time-invariant)) A linear time-invariant discrete-time system is input and state observable if and only if
$$\mathrm{rk}\bigl(\begin{bmatrix}\mathcal{O}_{2,\bar n} & \mathcal{I}_{(2,2),\bar n}\end{bmatrix}\bigr) = n + \bar n(p - p_H)\tag{11}$$
for some $0 \le \bar n \le n$, where $p_H = \mathrm{rk}(H)$. Moreover, if $l \ne p$, then $r_0 \le \bar n \le n$, where $r_0 := \lceil\frac{n-l-p_H}{l-p}\rceil$; otherwise, $l = p = n$ and $p_H = 0$ must hold. Necessary conditions for (11) to hold when $\bar n = n$ are

(I) $r \ge r_0$ and $l \ge p+1$, or $l = p = n$ and $p_H = 0$,
(II) (a) $\mathrm{rk}(\mathcal{O}_{2,n-1}) = n$; thus, $(A, C)$ is observable,
(b) $\mathrm{rk}(C_2G_2) = p - p_H$; thus, $\mathrm{rk}(CG) \ge p - p_H$,

where $\hat A = A - G_1\Sigma^{-1}C_1$.

Proof. By applying the Cayley-Hamilton theorem, we can show that the observable subspace spanned by $\mathcal{O}_{2,n-1}$ is $\hat A$-invariant (i.e., $\mathrm{Ra}(\mathcal{O}_{2,n-1}\hat A) \subset \mathrm{Ra}(\mathcal{O}_{2,n-1})$), which implies that $\mathrm{rk}(\mathcal{O}_{2,r}) = \mathrm{rk}(\mathcal{O}_{2,n-1})$ for all $r \ge n$. Then, to prove the conditions given in the theorem, we will show that (i) if $[\mathcal{O}_{2,n}\ \ \mathcal{I}_{(2,2),n}]$ is rank deficient, then $[\mathcal{O}_{2,r}\ \ \mathcal{I}_{(2,2),r}]$ for all $r > n$ is also rank deficient, and (ii) if $[\mathcal{O}_{2,n}\ \ \mathcal{I}_{(2,2),n}]$ has full rank, then $[\mathcal{O}_{2,r}\ \ \mathcal{I}_{(2,2),r}]$ for all $r > n$ also has full rank.

(i) Suppose $[\mathcal{O}_{2,n}\ \ \mathcal{I}_{(2,2),n}]$ is rank deficient. This implies one of three cases. In the first, $\mathcal{O}_{2,n}$ is rank deficient. This then implies that $\mathcal{O}_{2,r}$ for all $r > n$ is also rank deficient, since $\mathrm{rk}(\mathcal{O}_{2,r}) = \mathrm{rk}(\mathcal{O}_{2,n})$. In the second case, one of the matrices $\{\mathcal{I}^d_{(2,2),n}\}_{d=1}^{n}$ (the $d$-th block column of $\mathcal{I}_{(2,2),n}$, each of dimensions $r(l-p_H)\times(p-p_H)$) is rank deficient, which implies that $\mathcal{I}^{d+r-n}_{(2,2),r} = \bigl[0^\top_{(l-p_H)(r-n)\times(p-p_H)}\ \ \mathcal{I}^{d\,\top}_{(2,2),n}\bigr]^\top$ is rank deficient for all $r > n$. And in the third case, some columns of some matrix pair between $\mathcal{O}_{2,n}$ and $\{\mathcal{I}^d_{(2,2),n}\}_{d=1}^{n}$ are linearly dependent, which, by virtue of the lower triangular structure of $[\mathcal{O}_{2,n}\ \ \mathcal{I}_{(2,2),n}]$, is only possible if some columns of either $C_2$ or $C_2G_2$ are zero vectors. However, this implies that $C_2G_2$, and hence $\mathcal{I}^{r}_{(2,2),r} = \bigl[0^\top_{r(l-p_H)\times(p-p_H)}\ \ (C_2G_2)^\top\bigr]^\top$, is rank deficient for all $r > n$. Therefore, in all cases, $[\mathcal{O}_{2,r}\ \ \mathcal{I}_{(2,2),r}]$ for all $r > n$ is rank deficient.

(ii) Suppose now that $[\mathcal{O}_{2,n}\ \ \mathcal{I}_{(2,2),n}]$ has full rank. This implies that $\mathcal{O}_{2,n}$ and $\{\mathcal{I}^d_{(2,2),n}\}_{d=1}^{n}$ have full rank, which in turn implies that for all $r > n$, $\mathcal{O}_{2,r}$ has full rank since $\mathrm{rk}(\mathcal{O}_{2,r}) = \mathrm{rk}(\mathcal{O}_{2,n})$, and $C_2G_2$ is also full rank, which can be inferred from $\mathcal{I}^{n}_{(2,2),n}$ being full rank. Hence, since the matrices $\{\mathcal{I}^d_{(2,2),r}\}_{d=1}^{r}$ have the form $\bigl[0\ \ (C_2G_2)^\top\ \ (*)\bigr]^\top$ with $0$ and $(*)$ of appropriate entries and dimensions, each of these matrices has full rank. Finally, since the assumption also implies that $C_2$ and $C_2G_2$ cannot have zero columns and the matrix $[\mathcal{O}_{2,r}\ \ \mathcal{I}_{(2,2),r}]$ has a lower triangular structure, this matrix must also have full rank.

Note that an alternative proof can be found in [24].

Furthermore, since $\mathcal{O}_{2,n-1} = U_2^\top\mathcal{O}$, where $\mathcal{O} = \bigl[C^\top\ (CA)^\top\ \cdots\ (CA^{n-1})^\top\bigr]^\top$, and $C_2G_2 = U_2^\top CGV_2$, then $\mathrm{rk}(\mathcal{O}_{2,n}) = \mathrm{rk}(\mathcal{O}_{2,n-1}) \le \min(\mathrm{rk}(U_2^\top), \mathrm{rk}(\mathcal{O}))$ and $\mathrm{rk}(C_2G_2) \le \min(\mathrm{rk}(U_2^\top), \mathrm{rk}(CG), \mathrm{rk}(V_2))$. Thus, it follows that $\mathrm{rk}(\mathcal{O}) = n$ (i.e., $(A, C)$ is observable) and $\mathrm{rk}(CG) \ge p - p_H$ are necessary.

Corollary 4 For the time-invariant case, the following statements are equivalent:

(i) $\mathrm{rk}\bigl(\begin{bmatrix}\mathcal{O}_{2,\bar n} & \mathcal{I}_{(2,2),\bar n}\end{bmatrix}\bigr) = n + \bar n(p - p_H)$,
(ii) $\mathrm{rk}\begin{bmatrix} zI - A & -G\\ C & H\end{bmatrix} = n + p$ for all $z \in \mathbb{C}$,
(iii) $\mathrm{rk}\begin{bmatrix} zI - \hat A & -G_2\\ C_2 & 0\end{bmatrix} = n + p - p_H$ for all $z \in \mathbb{C}$.

Moreover, the observability of $(A, C)$ is a necessary condition.

Proof. The proof of the equivalence of (i) and (ii) is fairly involved, and the reader is referred to [22, 24] for details. To relate (ii) and (iii), we use the following rank chain:
$$\begin{aligned}
n + p &= \mathrm{rk}\begin{bmatrix} zI - A & -G\\ C & H\end{bmatrix} = \mathrm{rk}\begin{bmatrix} zI - A & -G\\ C & U\begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix}V^\top\end{bmatrix}\\
&= \mathrm{rk}\left(\begin{bmatrix} I & 0\\ 0 & T\end{bmatrix}\begin{bmatrix} zI - A & -G\\ C & U\begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix}V^\top\end{bmatrix}\begin{bmatrix} I & 0\\ 0 & V\end{bmatrix}\right)
= \mathrm{rk}\begin{bmatrix} zI - A & -GV\\ TC & TU\begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix}\end{bmatrix}
= \mathrm{rk}\begin{bmatrix} zI - A & -G_1 & -G_2\\ C_1 & \Sigma & 0\\ C_2 & 0 & 0\end{bmatrix}\\
&= \mathrm{rk}\left(\begin{bmatrix} I & G_1\Sigma^{-1} & 0\\ 0 & I & 0\\ 0 & 0 & I\end{bmatrix}\begin{bmatrix} zI - A & -G_1 & -G_2\\ C_1 & \Sigma & 0\\ C_2 & 0 & 0\end{bmatrix}\right)
= \mathrm{rk}\begin{bmatrix} zI - \hat A & 0 & -G_2\\ C_1 & \Sigma & 0\\ C_2 & 0 & 0\end{bmatrix}
= \mathrm{rk}\begin{bmatrix} zI - \hat A & -G_2\\ C_2 & 0\end{bmatrix} + p_H
\end{aligned}$$
for all $z \in \mathbb{C}$, where the final equality holds because $\Sigma$ is square and has full rank $p_H$. The necessity of observability of the pair $(A, C)$ follows directly from (ii).

Remark 5 Note that if $\mathrm{rk}(H_r) = p$, then $d_{2,r}$ is empty and $\mathcal{D}_r$ contains unknown inputs up to time $r$.

A weaker condition than strong observability is given in the following definition and theorem.

Definition 6 (Strong detectability) The linear system (1) is strongly detectable if $y_k = 0\ \forall\, k \ge 0$ implies $x_k \to 0$ as $k \to \infty$ for all input sequences and initial states.

Theorem 7 (Strong detectability (time-invariant)) A linear time-invariant discrete-time system is strongly detectable if and only if either of the following holds:

(i) $\mathrm{rk}\begin{bmatrix} zI - A & -G\\ C & H\end{bmatrix} = n + p$, $\forall z \in \mathbb{C}$, $|z| \ge 1$,
(ii) $\mathrm{rk}\begin{bmatrix} zI - \hat A & -G_2\\ C_2 & 0\end{bmatrix} = n + p - p_H$, $\forall z \in \mathbb{C}$, $|z| \ge 1$.

The above conditions are equivalent to the property that the system is minimum-phase (i.e., the invariant zeros of the system matrices in Corollary 4-(ii),(iii) are stable).

Proof. This theorem is a simple generalization of Corollary 4 to the case where $P(z) := \begin{bmatrix} zI - A & -G\\ C & H\end{bmatrix}$ is rank deficient for some $z \in \mathcal{Z}_0 \subset \mathbb{C}$ with $|z| < 1$. For each such $z$, there exists $\bigl[-x_z^\top\ u_z^\top\bigr]^\top$ in the null space of $P(z)$. It can be verified that the input sequence $u_k = z^ku_z$ and the initial state $x_z$ lead to the output $y_k = 0$ for all $k \ge 0$ but $x_k = z^kx_z$, where, with a slight abuse of notation, $z^k$ represents the product of any permutation of $k$ numbers from $\mathcal{Z}_0$. Since $|z| < 1$ by assumption, $x_k \to 0$ as $k \to \infty$, which coincides with Definition 6.

4 Algorithms for Minimum-variance Unbiased Filter for Simultaneous Input and State Estimation

For the filter design, we consider a recursive three-step filter² as proposed in [16, 18], composed of an unknown input estimation step, which uses the current measurement and state estimate to estimate the unknown inputs in the best linear unbiased sense; a time update step, which propagates the state estimate based on the system dynamics; and a measurement update step, which updates the state estimate using the current measurement. Since this presents various options in terms of the order of execution of each step, and the simulations in [18] appear to indicate the existence of two possible optimal structures, we propose two variants of a recursive three-step filter for the system described by (4), (5), (7) to study both of these structures:

(I) Updated Linear Input & State Estimator (ULISE), which predicts $d_{1,k}$ using the updated state estimate, denoted by $\hat x_{k|k}$, in (12a) as in [18],
(II) Propagated Linear Input & State Estimator (PLISE), which uses the propagated state estimate, denoted by $\hat x^\star_{k|k}$, to predict $d_{1,k}$ in (12b) as in [16].

² Note that the restriction to a recursive filter will be relaxed and shown not to lead to suboptimality in Theorem 9 for one of the filter variants.

Given measurements up to time $k-1$, the three-step recursive filter³ can be summarized as follows:

Unknown Input Estimation:
$$\hat d^{\,\mathrm{I}}_{1,k} = M_{1,k}(z_{1,k} - C_{1,k}\hat x_{k|k} - D_{1,k}u_k)\tag{12a}$$
$$\hat d^{\,\mathrm{II}}_{1,k} = M_{1,k}(z_{1,k} - C_{1,k}\hat x^\star_{k|k} - D_{1,k}u_k)\tag{12b}$$
$$\hat d_{2,k-1} = M_{2,k}(z_{2,k} - C_{2,k}\hat x_{k|k-1} - D_{2,k}u_k)\tag{13}$$
$$\hat d_{k-1} = V_{1,k-1}\hat d_{1,k-1} + V_{2,k-1}\hat d_{2,k-1}\tag{14}$$

Time Update:
$$\hat x_{k|k-1} = A_{k-1}\hat x_{k-1|k-1} + B_{k-1}u_{k-1} + G_{1,k-1}\hat d_{1,k-1}\tag{15}$$
$$\hat x^\star_{k|k} = \hat x_{k|k-1} + G_{2,k-1}\hat d_{2,k-1}\tag{16}$$

Measurement Update:
$$\begin{aligned}
\hat x_{k|k} &= \hat x^\star_{k|k} + L_k(y_k - C_k\hat x^\star_{k|k} - D_ku_k)\\
&= \hat x^\star_{k|k} + \tilde L_k(z_{2,k} - C_{2,k}\hat x^\star_{k|k} - D_{2,k}u_k)
\end{aligned}\tag{17}$$
where $\hat x_{k-1|k-1}$, $\hat d_{1,k-1}$, $\hat d_{2,k-1}$ and $\hat d_{k-1}$ denote the optimal estimates of $x_{k-1}$, $d_{1,k-1}$, $d_{2,k-1}$ and $d_{k-1}$; and $L_k \in \mathbb{R}^{n\times l}$, $\tilde L_k := L_kU_{2,k} \in \mathbb{R}^{n\times(l-p_{H_k})}$, $M_{1,k} \in \mathbb{R}^{p_{H_k}\times p_{H_k}}$ and $M_{2,k} \in \mathbb{R}^{(p-p_{H_k})\times(l-p_{H_k})}$ are filter gain matrices that are chosen to minimize the state and input error covariances. Note that we applied $L_k = L_kU_{2,k}U_{2,k}^\top$ in (17), which we will justify in Lemma 14.

The above recursive three-step filter represents a unified filter for simultaneously estimating the unknown input and the state for systems with an arbitrary direct feedthrough matrix, thus relaxing the assumptions on the direct feedthrough matrix in [15, 16, 18]. By the system transformation given in (6), the unknown input is decomposed into two components, $d_{1,k}$ and $d_{2,k}$; and similarly, the output equation is decomposed into two orthogonal projections, $z_{1,k}$ and $z_{2,k}$, one with no direct feedthrough and the other with a full-rank feedthrough matrix.

³ To initialize the filter, arbitrary initial values of $\hat x_{0|0}$, $P^x_0$ and $\hat d_{1,0}$ can be used, since we will show that the ULISE filter is exponentially stable in Theorems 10 and 11, while the stability of the PLISE filter is shown in Theorem 12. If $y_0$ and $u_0$ are available, we can find the minimum-variance unbiased initial estimates given in the initialization of Algorithm 1 using the linear minimum-variance-unbiased estimator [25].

Algorithm 1 ULISE algorithm
1: Initialize: $\hat x_{0|0} = \mathbb{E}[x_0]$; $P^x_{0|0} = P^x_0$; $\hat A_0 = A_0 - G_{1,0}\Sigma_0^{-1}C_{1,0}$; $\hat Q_0 = G_{1,0}\Sigma_0^{-1}R_{1,0}\Sigma_0^{-1}G_{1,0}^\top + Q_0$; $\hat d_{1,0} = \Sigma_0^{-1}(z_{1,0} - C_{1,0}\hat x_{0|0} - D_{1,0}u_0)$; $P^d_{1,0} = \Sigma_0^{-1}(C_{1,0}P^x_{0|0}C_{1,0}^\top + R_{1,0})\Sigma_0^{-1}$;
2: for $k = 1$ to $N$ do
 ⊳ Estimation of $d_{2,k-1}$ and $d_{k-1}$
3: $\hat P_k = \hat A_{k-1}P^x_{k-1|k-1}\hat A_{k-1}^\top + \hat Q_{k-1}$;
4: $\tilde R_{2,k} = C_{2,k}\hat P_kC_{2,k}^\top + R_{2,k}$;
5: $P^d_{2,k-1} = (G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}C_{2,k}G_{2,k-1})^{-1}$;
6: $M_{2,k} = P^d_{2,k-1}G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}$;
7: $\hat x_{k|k-1} = A_{k-1}\hat x_{k-1|k-1} + B_{k-1}u_{k-1} + G_{1,k-1}\hat d_{1,k-1}$;
8: $\hat d_{2,k-1} = M_{2,k}(z_{2,k} - C_{2,k}\hat x_{k|k-1} - D_{2,k}u_k)$;
9: $\hat d_{k-1} = V_{1,k-1}\hat d_{1,k-1} + V_{2,k-1}\hat d_{2,k-1}$;
10: $P^d_{12,k-1} = M_{1,k-1}C_{1,k-1}P^x_{k-1|k-1}\hat A_{k-1}^\top C_{2,k}^\top M_{2,k}^\top - P^d_{1,k-1}G_{1,k-1}^\top C_{2,k}^\top M_{2,k}^\top$;
11: $P^d_{k-1} = V_{k-1}\begin{bmatrix}P^d_{1,k-1} & P^d_{12,k-1}\\ P^{d\,\top}_{12,k-1} & P^d_{2,k-1}\end{bmatrix}V_{k-1}^\top$;
 ⊳ Time update
12: $\hat x^\star_{k|k} = \hat x_{k|k-1} + G_{2,k-1}\hat d_{2,k-1}$;
13: $P^{\star x}_{k|k} = G_{2,k-1}M_{2,k}\tilde R_{2,k}M_{2,k}^\top G_{2,k-1}^\top + (I - G_{2,k-1}M_{2,k}C_{2,k})\hat P_k(I - G_{2,k-1}M_{2,k}C_{2,k})^\top$;
14: $\tilde R^\star_k = C_kP^{\star x}_{k|k}C_k^\top + R_k - C_kG_{2,k-1}M_{2,k}U_{2,k}^\top R_k - R_kU_{2,k}M_{2,k}^\top G_{2,k-1}^\top C_k^\top$;
 ⊳ Measurement update
15: $\tilde K_k = P^{\star x}_{k|k}C_k^\top - G_{2,k-1}M_{2,k}U_{2,k}^\top R_k$;
16: $M^\star_{1,k} := \Sigma_k^{-1}(U_{1,k}^\top\tilde R_k^{\star\dagger}U_{1,k})^{-1}U_{1,k}^\top\tilde R_k^{\star\dagger}$;
17: $L_k = \tilde K_k(I_l - U_{1,k}\Sigma_kM^\star_{1,k})^\top\tilde R_k^{\star\dagger}$;
18: $\hat x_{k|k} = \hat x^\star_{k|k} + L_k(y_k - C_k\hat x^\star_{k|k} - D_ku_k)$;
19: $P^x_{k|k} = (I - L_kC_k)G_{2,k-1}M_{2,k}U_{2,k}^\top R_kL_k^\top + L_kR_kU_{2,k}M_{2,k}^\top G_{2,k-1}^\top(I - L_kC_k)^\top + (I - L_kC_k)P^{\star x}_{k|k}(I - L_kC_k)^\top + L_kR_kL_k^\top$;
 ⊳ Estimation of $d_{1,k}$
20: $\tilde R_{1,k} = C_{1,k}P^x_{k|k}C_{1,k}^\top + R_{1,k}$;
21: $M_{1,k} = \Sigma_k^{-1}$;
22: $P^d_{1,k} = M_{1,k}\tilde R_{1,k}M_{1,k}$;
23: $\hat d_{1,k} = M_{1,k}(z_{1,k} - C_{1,k}\hat x_{k|k} - D_{1,k}u_k)$;
24: $\hat A_k = A_k - G_{1,k}M_{1,k}C_{1,k}$;
25: $\hat Q_k = G_{1,k}M_{1,k}R_{1,k}M_{1,k}^\top G_{1,k}^\top + Q_k$;
26: end for
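The following numpy sketch implements one ULISE recursion (Algorithm 1, steps 3-25). It is written directly from the listing above for illustration; the function name, the `sys` dictionary of precomputed transformed matrices and the omission of the bookkeeping steps 9-11 are our own choices, not the authors' reference code.

```python
import numpy as np

def ulise_step(x, Px, d1, z1, z2, u, y, sys):
    """One ULISE recursion (Algorithm 1, steps 3-25) for a time-invariant system.
    `sys` bundles A, B, C, D, G1, G2, C1, C2, D1, D2, U1, U2, Sigma, R, R1, R2, Q."""
    A, B, C, D = sys['A'], sys['B'], sys['C'], sys['D']
    G1, G2, C1, C2 = sys['G1'], sys['G2'], sys['C1'], sys['C2']
    D1, D2, U1, U2 = sys['D1'], sys['D2'], sys['U1'], sys['U2']
    Sigma, R, R1, R2, Q = sys['Sigma'], sys['R'], sys['R1'], sys['R2'], sys['Q']
    I = np.eye(A.shape[0])
    M1 = np.linalg.inv(Sigma)
    Ahat = A - G1 @ M1 @ C1
    Qhat = G1 @ M1 @ R1 @ M1.T @ G1.T + Q

    # Estimation of d2 at time k-1 (steps 3-8; steps 9-11 are omitted here)
    Phat = Ahat @ Px @ Ahat.T + Qhat
    R2t = C2 @ Phat @ C2.T + R2
    R2t_inv = np.linalg.inv(R2t)
    Pd2 = np.linalg.inv(G2.T @ C2.T @ R2t_inv @ C2 @ G2)
    M2 = Pd2 @ G2.T @ C2.T @ R2t_inv
    x_pred = A @ x + B @ u + G1 @ d1
    d2 = M2 @ (z2 - C2 @ x_pred - D2 @ u)

    # Time update (steps 12-14)
    x_star = x_pred + G2 @ d2
    W = I - G2 @ M2 @ C2
    Px_star = G2 @ M2 @ R2t @ M2.T @ G2.T + W @ Phat @ W.T
    Rstar = (C @ Px_star @ C.T + R
             - C @ G2 @ M2 @ U2.T @ R - R @ U2 @ M2.T @ G2.T @ C.T)

    # Measurement update (steps 15-19)
    K = Px_star @ C.T - G2 @ M2 @ U2.T @ R
    Rstar_pinv = np.linalg.pinv(Rstar)
    M1s = M1 @ np.linalg.inv(U1.T @ Rstar_pinv @ U1) @ U1.T @ Rstar_pinv
    L = K @ (np.eye(C.shape[0]) - U1 @ Sigma @ M1s).T @ Rstar_pinv
    x_new = x_star + L @ (y - C @ x_star - D @ u)
    ILC = I - L @ C
    Px_new = (ILC @ Px_star @ ILC.T + L @ R @ L.T
              + ILC @ G2 @ M2 @ U2.T @ R @ L.T
              + L @ R @ U2 @ M2.T @ G2.T @ ILC.T)

    # Estimation of d1 at time k (steps 20-23)
    R1t = C1 @ Px_new @ C1.T + R1
    d1_new = M1 @ (z1 - C1 @ x_new - D1 @ u)
    Pd1_new = M1 @ R1t @ M1.T
    return x_new, Px_new, d1_new, Pd1_new, d2, Pd2
```

In a driver loop, $z_{1,k}$ and $z_{2,k}$ would be obtained from $y_k$ via the transform $T_k$ of Section 3.1, and step 1 of Algorithm 1 would supply the initial `x`, `Px` and `d1`.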

Hence, in a nutshell, the $d_{1,k}$ component of the unknown input can be estimated in the best linear unbiased sense by choosing $M_{1,k}$ as in [18, 16], and the $d_{2,k}$ component by choosing $M_{2,k}$ as in [15]. On the other hand, the gain matrix $L_k$ is chosen to minimize the state estimate error covariance in an update similar to the Kalman filter. In fact, the proposed filter can be shown to be a generalization of the Kalman filter to systems with unknown inputs (see Section 6.3).

Note that the three steps are not given in the order of execution. In ULISE (see Algorithm 1), the estimation of $\hat d_{2,k-1}$ is carried out before the time update, followed by the measurement update and finally the estimation of $\hat d_{1,k}$; while PLISE (see Algorithm 2) first computes $\hat d_{2,k-1}$, followed by the time update, the estimation of $\hat d_{1,k}$ and the measurement update.

Algorithm 2 PLISE algorithm
1: Initialize: $\hat x_{0|0} = \mathbb{E}[x_0]$; $\hat x^\star_{0|0} = \mathbb{E}[x_0]$; $P^x_{0|0} = P^x_0$; $P^{\star x}_{0|0} = P^x_0$; $\hat A_0 = A_0 - G_{1,0}\Sigma_0^{-1}C_{1,0}$; $\hat Q_0 = G_{1,0}\Sigma_0^{-1}R_{1,0}\Sigma_0^{-1}G_{1,0}^\top + Q_0$; $\hat d_{1,0} = \Sigma_0^{-1}(z_{1,0} - C_{1,0}\hat x^\star_{0|0} - D_{1,0}u_0)$; $P^d_{1,0} = \Sigma_0^{-1}(C_{1,0}P^{\star x}_{0|0}C_{1,0}^\top + R_{1,0})\Sigma_0^{-1}$; $P^{xd}_{1,0} = -P^{\star x}_{0|0}C_{1,0}^\top\Sigma_0^{-1}$;
2: for $k = 1$ to $N$ do
 ⊳ Estimation of $d_{2,k-1}$ and $d_{k-1}$
3: $\hat P_k = \hat A_{k-1}P^x_{k-1|k-1}\hat A_{k-1}^\top + \hat Q_{k-1}$;
4: $\tilde R_{2,k} = C_{2,k}\hat P_kC_{2,k}^\top + R_{2,k}$;
5: $P^d_{2,k-1} = (G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}C_{2,k}G_{2,k-1})^{-1}$;
6: $M_{2,k} = P^d_{2,k-1}G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}$;
7: $\hat x_{k|k-1} = A_{k-1}\hat x_{k-1|k-1} + B_{k-1}u_{k-1} + G_{1,k-1}\hat d_{1,k-1}$;
8: $\hat d_{2,k-1} = M_{2,k}(z_{2,k} - C_{2,k}\hat x_{k|k-1} - D_{2,k}u_k)$;
9: $P^{xd}_{2,k-1} = -P^x_{k-1|k-1}\hat A_{k-1}^\top C_{2,k}^\top M_{2,k}^\top - P^{xd}_{1,k-1}G_{1,k-1}^\top C_{2,k}^\top M_{2,k}^\top$;
10: $P^d_{12,k-1} = -P^{xd\,\top}_{1,k-1}\hat A_{k-1}^\top C_{2,k}^\top M_{2,k}^\top - P^d_{1,k-1}G_{1,k-1}^\top C_{2,k}^\top M_{2,k}^\top$;
11: $\hat d_{k-1} = V_{1,k-1}\hat d_{1,k-1} + V_{2,k-1}\hat d_{2,k-1}$;
12: $P^d_{k-1} = V_{k-1}\begin{bmatrix}P^d_{1,k-1} & P^d_{12,k-1}\\ P^{d\,\top}_{12,k-1} & P^d_{2,k-1}\end{bmatrix}V_{k-1}^\top$;
 ⊳ Time update
13: $\hat x^\star_{k|k} = \hat x_{k|k-1} + G_{2,k-1}\hat d_{2,k-1}$;
14: $P^{\star x}_{k|k} = \begin{bmatrix}A_{k-1} & G_{1,k-1} & G_{2,k-1}\end{bmatrix}\begin{bmatrix}P^x_{k-1|k-1} & P^{xd}_{1,k-1} & P^{xd}_{2,k-1}\\ P^{xd\,\top}_{1,k-1} & P^d_{1,k-1} & P^d_{12,k-1}\\ P^{xd\,\top}_{2,k-1} & P^{d\,\top}_{12,k-1} & P^d_{2,k-1}\end{bmatrix}\begin{bmatrix}A_{k-1}^\top\\ G_{1,k-1}^\top\\ G_{2,k-1}^\top\end{bmatrix} + Q_{k-1} - G_{2,k-1}M_{2,k}C_{2,k}Q_{k-1} - Q_{k-1}C_{2,k}^\top M_{2,k}^\top G_{2,k-1}^\top$;
15: $\tilde R^\star_k = C_kP^{\star x}_{k|k}C_k^\top + R_k - C_kG_{2,k-1}M_{2,k}U_{2,k}^\top R_k - R_kU_{2,k}M_{2,k}^\top G_{2,k-1}^\top C_k^\top$;
 ⊳ Estimation of $d_{1,k}$
16: $\tilde R_{1,k} = C_{1,k}P^{\star x}_{k|k}C_{1,k}^\top + R_{1,k}$;
17: $M_{1,k} = \Sigma_k^{-1}$;
18: $P^d_{1,k} = M_{1,k}\tilde R_{1,k}M_{1,k}$;
19: $\hat d_{1,k} = M_{1,k}(z_{1,k} - C_{1,k}\hat x^\star_{k|k} - D_{1,k}u_k)$;
20: $\hat A_k = A_k - G_{1,k}M_{1,k}C_{1,k}$;
21: $\hat Q_k = G_{1,k}M_{1,k}R_{1,k}M_{1,k}^\top G_{1,k}^\top + Q_k$;
 ⊳ Measurement update
22: $\tilde K_k = P^{\star x}_{k|k}C_k^\top - G_{2,k-1}M_{2,k}U_{2,k}^\top R_k$;
23: $M^\star_{1,k} := \Sigma_k^{-1}(U_{1,k}^\top\tilde R_k^{\star\dagger}U_{1,k})^{-1}U_{1,k}^\top\tilde R_k^{\star\dagger}$;
24: $L_k = \tilde K_k(I_l - U_{1,k}\Sigma_kM^\star_{1,k})^\top\tilde R_k^{\star\dagger}$;
25: $\hat x_{k|k} = \hat x^\star_{k|k} + L_k(y_k - C_k\hat x^\star_{k|k} - D_ku_k)$;
26: $P^x_{k|k} = L_kR_kL_k^\top + (I - L_kC_k)G_{2,k-1}M_{2,k}U_{2,k}^\top R_kL_k^\top + L_kR_kU_{2,k}M_{2,k}^\top G_{2,k-1}^\top(I - L_kC_k)^\top + (I - L_kC_k)P^{\star x}_{k|k}(I - L_kC_k)^\top$;
27: $P^{xd}_{1,k} = -(I - L_kC_k)P^{\star x}_{k|k}C_{1,k}^\top M_{1,k}^\top - L_kR_kT_{2,k}^\top M_{2,k}^\top G_{2,k-1}^\top C_{1,k}^\top M_{1,k}^\top$;
28: end for

Note also that Algorithms 1 and 2 for ULISE and PLISE are given with significant simplifications and a particular choice of $\Gamma_k$ that will be further expounded in Section 5.

For both structures of the three-step filter variants, Algorithms 1 and 2 provide the 'best' estimates of the states and unknown inputs in the minimum squared error sense, as stated in the following lemma and proven in Section 5.

Lemma 8 Let the initial state estimate $\hat x_{0|0}$ be unbiased. If $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$, then the ULISE and PLISE algorithms given in Algorithms 1 and 2 provide the unbiased, best linear estimate (BLUE) of the unknown input and the minimum-variance unbiased estimate of the system states.

In particular, we can show that ULISE is globally optimal over the class of linear state and input estimators. In other words, the structure of ULISE is optimal. Moreover, the initial biases in the state and input estimates of ULISE decay exponentially if some conditions of uniform stabilizability and detectability are satisfied. Specifically, for time-invariant systems, conditions for the convergence of the error covariance matrix $P^x_{k|k}$, as well as the filter gains $L_k$, $M_{1,k}$ and $M_{2,k}$, to steady state are provided. To state these claims, which will be proven in Sections 5.4 and 5.5, we first define: $\check M_{2,k} := (C_{2,k}G_{2,k-1})^\dagger$, $\hat Q_k = Q_k + G_{1,k}\Sigma_k^{-1}R_{1,k}\Sigma_k^{-\top}G_{1,k}^\top$, $\hat A_k = A_k - G_{1,k}M_{1,k}C_{1,k}$, $\check A_k := (I - G_{2,k-1}\check M_{2,k}C_{2,k})\hat A_k + G_{2,k-1}\check M_{2,k}C_{2,k}$ and $\check Q_k = (I - G_{2,k-1}\check M_{2,k}C_{2,k})\hat Q_{k-1}(I - G_{2,k-1}\check M_{2,k}C_{2,k})^\top$.

Theorem 9 (Global Optimality of ULISE) Let $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$ and the initial state estimate $\hat x_{0|0}$ be unbiased. Then, the ULISE algorithm is globally optimal (over the class of all linear state and input estimators).

Theorem 10 (Stability of ULISE) Suppose that $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$. Then, uniform detectability⁴ of $(\check A_k, C_{2,k})$ is sufficient for the boundedness of the error covariance of the ULISE algorithm. Furthermore, if $(\check A_k, \check Q_k^{\frac12})$ is uniformly stabilizable⁴, ULISE is exponentially stable (i.e., its expected estimate errors decay exponentially).

Theorem 11 (Convergence of ULISE to Steady State) Let $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$. Then, in the time-invariant case with $P^x_{0|0} \succ 0$, the filter gains of ULISE (exponentially) converge to a unique stationary solution if and only if

(i) the linear time-invariant discrete-time system is strongly detectable, i.e., Theorem 7 holds, and
(ii) $\mathrm{rk}\begin{bmatrix}\hat A - e^{j\omega}I & G_2 & \hat Q^{\frac12} & 0\\ e^{j\omega}C_2 & 0 & 0 & R_2^{\frac12}\end{bmatrix} = n + l - p_H$, $\forall\omega \in [0, 2\pi]$,

where $\hat Q := G_1M_1R_1M_1^\top G_1^\top + Q$.

⁴ The notions of uniform detectability and stabilizability are standard (see, e.g., [26, Section 2]). A spectral test for these properties can be found in [27].

On the other hand, although the structure of PLISE is suboptimal (as evidenced by the simulation examples in Section 7), PLISE also possesses stability guarantees in the time-invariant case, as stated in the following theorem and proven in Section 5.6.

Theorem 12 (Stability of PLISE (time-invariant)) Let $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$. Then, in the time-invariant case with $P^x_{0|0} \succ 0$, the estimate errors and error covariances of PLISE remain bounded if

(i) the linear time-invariant discrete-time system is strongly detectable, i.e., Theorem 7 holds, and
(ii) $\mathrm{rk}\begin{bmatrix} e^{j\omega}I - F^s & Q^{s\,\frac12}\end{bmatrix} = n$, $\forall\omega \in [0, 2\pi]$,

where $F^s := N\hat A - S\Theta^{-1}C_2$, $Q^s := G_2\check M_2R_2\check M_2^\top G_2^\top + N\hat QN^\top - S\Theta^{-1}S^\top$, $N := I - G_2\check M_2C_2$, $\check M_2 := (C_2G_2)^\dagger$, $S := -N\hat AG_2\check M_2R_2$, and assuming that $\Theta := R_2 - C_2G_2\check M_2R_2 - R_2\check M_2^\top G_2^\top C_2^\top$ is invertible.

Furthermore, these ULISE and PLISE algorithms reduce to filters in the existing literature, as shown in Section 6.

Remark 13 The stability (and convergence to steady state in the time-invariant case) of both variants of the unified state and input estimator is closely related to the strong detectability of the system. In the time-varying case, the sufficient condition of uniform detectability for ULISE implies strong detectability (cf. Definition 6 and [26, Lemma 2.2]), whereas in the time-invariant case, the strong detectability condition appears explicitly in the stability conditions of both ULISE and PLISE. On the other hand, the uniform stabilizability condition of Theorem 10 parallels the sufficient condition for the Kalman filter, and Condition (ii) of Theorems 11 and 12 corresponds to the controllability of the filter dynamics on the unit circle, akin to the system controllability on the unit circle for the Kalman filter. Conversely, if the system is not strongly detectable, then it is not possible to obtain unbiased estimates of the states and unknown inputs even in the case with no noise.
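Condition (ii) of Theorem 11 can be probed numerically by evaluating the rank on a frequency grid, as in the sketch below. This is our own grid-based sanity check (it can only falsify the condition, not certify it), with the function name and grid size chosen for illustration.

```python
import numpy as np
from scipy.linalg import sqrtm

def thm11_condition_ii(Ahat, C2, G2, Qhat, R2, n_grid=720, tol=1e-9):
    """Grid-based sanity check of condition (ii) of Theorem 11:
    rank [Ahat - e^{jw} I, G2, Qhat^{1/2}, 0; e^{jw} C2, 0, 0, R2^{1/2}] = n + l - p_H
    at sampled w in [0, 2*pi).  Returns False if a sampled frequency violates it."""
    n = Ahat.shape[0]
    q = C2.shape[0]                       # q = l - p_H
    Qh = np.real(sqrtm(Qhat))
    Rh = np.real(sqrtm(R2))
    target = n + q
    for w in np.linspace(0.0, 2 * np.pi, n_grid, endpoint=False):
        z = np.exp(1j * w)
        M = np.block([
            [Ahat - z * np.eye(n), G2, Qh, np.zeros((n, q))],
            [z * C2, np.zeros((q, G2.shape[1])), np.zeros((q, n)), Rh],
        ])
        if np.linalg.matrix_rank(M, tol=tol) < target:
            return False
    return True
```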

5 Filter Description and Analysis

For the analysis of the proposed filter, let $\tilde x_{k|k} := x_k - \hat x_{k|k}$, $\tilde x^\star_{k|k} := x_k - \hat x^\star_{k|k}$, $\tilde d_{1,k} := d_{1,k} - \hat d_{1,k}$, $\tilde d_{2,k} := d_{2,k} - \hat d_{2,k}$, $\tilde d_k := d_k - \hat d_k$, $P^x_{k|k} := \mathbb{E}[\tilde x_{k|k}\tilde x_{k|k}^\top]$, $P^{\star x}_{k|k} := \mathbb{E}[\tilde x^\star_{k|k}\tilde x^{\star\top}_{k|k}]$, $P^d_{1,k} := \mathbb{E}[\tilde d_{1,k}\tilde d_{1,k}^\top]$, $P^d_{2,k} := \mathbb{E}[\tilde d_{2,k}\tilde d_{2,k}^\top]$, $P^d_{12,k} = (P^d_{21,k})^\top := \mathbb{E}[\tilde d_{1,k}\tilde d_{2,k}^\top]$, $P^{xd}_{1,k} = (P^{dx}_{1,k})^\top := \mathbb{E}[\tilde x_{k|k}\tilde d_{1,k}^\top]$, $P^{xd}_{2,k} = (P^{dx}_{2,k})^\top := \mathbb{E}[\tilde x_{k|k}\tilde d_{2,k}^\top]$ and $P^d_k := \mathbb{E}[\tilde d_k\tilde d_k^\top]$. We initially assume that the initial state estimate is unbiased, i.e., $\mathbb{E}[\hat x_{0|0}] = \mathbb{E}[\hat x^\star_{0|0}] = \mathbb{E}[x_0]$, and present a lemma that summarizes the unbiasedness of the state and unknown input estimates for all time steps, which is one piece of the claim in Lemma 8.

Lemma 14 Let $\hat x_{0|0} = \hat x^\star_{0|0}$ be unbiased. Then the input and state estimates, $\hat d_{k-1}$, $\hat x^\star_{k|k}$ and $\hat x_{k|k}$, are unbiased for all $k$ if and only if $M_{1,k}\Sigma_k = I$, $M_{2,k}C_{2,k}G_{2,k-1} = I$ and $L_kU_{1,k} = 0$. Consequently, $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$ and $L_k = L_kU_{2,k}U_{2,k}^\top$.

Proof. We observe from (7), (12a), (12b) and (13) that
$$\hat d^{\,\mathrm{I}}_{1,k} = M_{1,k}(C_{1,k}\tilde x_{k|k} + \Sigma_kd_{1,k} + v_{1,k})\tag{18a}$$
$$\hat d^{\,\mathrm{II}}_{1,k} = M_{1,k}(C_{1,k}\tilde x^\star_{k|k} + \Sigma_kd_{1,k} + v_{1,k})\tag{18b}$$
$$\hat d_{2,k-1} = M_{2,k}\bigl(C_{2,k}(\hat A_{k-1}\tilde x_{k-1|k-1} + G_{1,k-1}\tilde d_{1,k-1} + w_{k-1}) + v_{2,k} + C_{2,k}G_{2,k-1}d_{2,k-1}\bigr).\tag{19}$$
On the other hand, from (15) and (16), the error in the propagated state estimate can be obtained as
$$\tilde x^\star_{k|k} = A_{k-1}\tilde x_{k-1|k-1} + G_{1,k-1}\tilde d_{1,k-1} + G_{2,k-1}\tilde d_{2,k-1} + w_{k-1}.\tag{20}$$
Moreover, from (5) and (17), the updated state estimate error is
$$\tilde x_{k|k} = (I - L_kC_k)\tilde x^\star_{k|k} - L_kU_{1,k}\Sigma_kd_{1,k} - L_kv_k.\tag{21}$$
We show by induction that the estimates $\hat d_k$, $\hat x_{k|k}$ and $\hat x^\star_{k|k}$ are unbiased. For the base case, since $\hat x_{0|0}$ and $\hat x^\star_{0|0}$ are unbiased and the process and measurement noise are assumed to have zero mean, $\mathbb{E}[w_0] = 0$, $\mathbb{E}[v_0] = 0$, from (18) and (19), $\mathbb{E}[\hat d_{1,0}] = d_{1,0}$ and $\mathbb{E}[\hat d_{2,0}] = d_{2,0}$, i.e., $\hat d_{1,0}$ and $\hat d_{2,0}$ are unbiased, if and only if $M_{1,0}\Sigma_0 = I$ and $M_{2,1}C_{2,1}G_{2,0} = I$. Hence, $\hat d_0$ is unbiased. In the inductive step, we assume that $\mathbb{E}[\tilde x_{k-1|k-1}] = \mathbb{E}[\tilde x^\star_{k-1|k-1}] = 0$. Then, the input estimates are unbiased, i.e., $\mathbb{E}[\tilde d_{k-1}] = \mathbb{E}[\tilde d_{1,k-1}] = \mathbb{E}[\tilde d_{2,k-1}] = 0$, if and only if $M_{1,k-1}\Sigma_{k-1} = I$ and $M_{2,k}C_{2,k}G_{2,k-1} = I$. Since the process noise has zero mean, by (20), $\mathbb{E}[\tilde x^\star_{k|k}] = 0$. Similarly, from (21) with zero-mean measurement noise, we impose the constraint $L_kU_{1,k} = 0$ so that we obtain $\mathbb{E}[\tilde x_{k|k}] = 0$. Therefore, by induction, $\mathbb{E}[\tilde x^\star_{k|k}] = 0$ and $\mathbb{E}[\tilde x_{k|k}] = 0$ for all $k$. Since we require $M_{2,k}C_{2,k}G_{2,k-1} = I$ for all $k$ for the existence of an unbiased input estimate, it follows that $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$ is a necessary and sufficient condition. Furthermore, $L_k = L_kU_kU_k^\top = L_kU_{2,k}U_{2,k}^\top$ since $L_kU_{1,k} = 0$.

Remark 15 The assumption of an unbiased initial state is common in existing filters, including the Kalman filter, although this is not critical because the resulting state error dynamics is a stable linear system and the effect of an initial state error decays exponentially.

Next, we continue the proof of Lemma 8 in three subsections, one for each step of the three-step recursive filter. The subsequent subsections then present the proofs of Theorems 9, 11 and 12.

5.1 Unknown Input Estimation

To obtain an optimal estimate of $d_{k-1}$ using (14), we estimate both components of the unknown input in the best linear unbiased sense (BLUE). This means that the expected input estimate is unbiased, i.e., $\mathbb{E}[\hat d_{1,k}] = d_{1,k}$, $\mathbb{E}[\hat d_{2,k}] = d_{2,k}$ and $\mathbb{E}[\hat d_k] = d_k$, as was shown in Lemma 14, and that the mean squared error of the estimate is the lowest possible, as shown next in Theorem 16.

Theorem 16 Suppose $\hat x_{0|0} = \hat x^\star_{0|0}$ are unbiased. Then (12a), (12b) and (13) provide the best linear unbiased input estimate (BLUE) with $M_{1,k}$ and $M_{2,k}$ given by
$$M_{1,k} = \Sigma_k^{-1}\tag{22}$$
$$M_{2,k} = (G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}C_{2,k}G_{2,k-1})^{-1}G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}\tag{23}$$
while the covariance matrices of the optimal input estimate errors are
$$P^d_{1,k} = \Sigma_k^{-1}\tilde R_{1,k}\Sigma_k^{-1}\tag{24}$$
$$P^d_{2,k-1} = (G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}C_{2,k}G_{2,k-1})^{-1}\tag{25}$$
with
$$\tilde R^{\,\mathrm{I}}_{1,k} := \mathbb{E}[e^{\mathrm{I}}_{1,k}e^{\mathrm{I}\top}_{1,k}] = C_{1,k}P^x_{k|k}C_{1,k}^\top + R_{1,k} - C_{1,k}L_kR_kT_{1,k}^\top - T_{1,k}R_kL_k^\top C_{1,k}^\top\tag{26a}$$
$$\tilde R^{\,\mathrm{II}}_{1,k} := \mathbb{E}[e^{\mathrm{II}}_{1,k}e^{\mathrm{II}\top}_{1,k}] = C_{1,k}P^{\star x}_{k|k}C_{1,k}^\top + R_{1,k}\tag{26b}$$
$$\tilde R_{2,k} := \mathbb{E}[e_{2,k}e_{2,k}^\top] = C_{2,k}\hat P_kC_{2,k}^\top + R_{2,k}\tag{27}$$
where $\hat P_k := \hat A_{k-1}P^x_{k-1|k-1}\hat A_{k-1}^\top + \hat Q_{k-1}$, $\hat A_k := A_k - G_{1,k}M_{1,k}C_{1,k}$ and $\hat Q_k := Q_k + G_{1,k}M_{1,k}R_{1,k}M_{1,k}^\top G_{1,k}^\top$.

Proof. Let $\tilde z^{\,\mathrm{I}}_{1,k} := z_{1,k} - C_{1,k}\hat x_{k|k} - D_{1,k}u_k$, $\tilde z^{\,\mathrm{II}}_{1,k} := z_{1,k} - C_{1,k}\hat x^\star_{k|k} - D_{1,k}u_k$ and $\tilde z_{2,k} := z_{2,k} - C_{2,k}\hat x_{k|k-1} - D_{2,k}u_k$. Then, we have
$$\tilde z^{\,\mathrm{I}}_{1,k} = \Sigma_kd_{1,k} + e^{\mathrm{I}}_{1,k},\qquad \tilde z^{\,\mathrm{II}}_{1,k} = \Sigma_kd_{1,k} + e^{\mathrm{II}}_{1,k},\tag{28a, 28b}$$
$$\tilde z_{2,k} = C_{2,k}G_{2,k-1}d_{2,k-1} + e_{2,k},\tag{29}$$
where $e^{\mathrm{I}}_{1,k} := C_{1,k}\tilde x_{k|k} + v_{1,k}$, $e^{\mathrm{II}}_{1,k} := C_{1,k}\tilde x^\star_{k|k} + v_{1,k}$ and $e_{2,k} := C_{2,k}(\hat A_{k-1}\tilde x_{k-1|k-1} + G_{1,k-1}\tilde d_{1,k-1} + w_{k-1}) + v_{2,k}$. From the unbiasedness of the state and input estimates (Lemma 14), $\mathbb{E}[e^{\mathrm{I}}_{1,k}] = 0$, $\mathbb{E}[e^{\mathrm{II}}_{1,k}] = 0$ and $\mathbb{E}[e_{2,k}] = 0$. Their covariance matrices are given by
$$\tilde R^{\,\mathrm{I}}_{1,k} := \mathbb{E}[e^{\mathrm{I}}_{1,k}e^{\mathrm{I}\top}_{1,k}] = C_{1,k}P^x_{k|k}C_{1,k}^\top + R_{1,k}\tag{30a}$$
$$\tilde R^{\,\mathrm{II}}_{1,k} := \mathbb{E}[e^{\mathrm{II}}_{1,k}e^{\mathrm{II}\top}_{1,k}] = C_{1,k}P^{\star x}_{k|k}C_{1,k}^\top + R_{1,k}\tag{30b}$$
$$\tilde R_{2,k} := \mathbb{E}[e_{2,k}e_{2,k}^\top] = C_{2,k}\hat P_kC_{2,k}^\top + R_{2,k}\tag{31}$$
where the simplified expressions above are obtained by applying $\mathbb{E}[\tilde x_{k|k}v_{1,k}^\top] = (\mathbb{E}[v_{1,k}\tilde x_{k|k}^\top])^\top = 0$, $\mathbb{E}[\tilde x_{k|k}w_k^\top] = 0$, $\mathbb{E}[\tilde d_{1,k}w_k^\top] = 0$, $\mathbb{E}[\tilde x_{k-1|k-1}v_{2,k}^\top] = 0$, $\mathbb{E}[\tilde d_{1,k-1}v_{2,k}^\top] = 0$ and $\mathbb{E}[w_{k-1}v_{2,k}^\top] = 0$, as well as $L_k = L_kU_{2,k}U_{2,k}^\top$ from Lemma 14 and (8), to obtain $L_kR_kT_{1,k}^\top = L_kU_{2,k}U_{2,k}^\top R_kT_{1,k}^\top = L_kU_{2,k}R_{21,k} = 0$. Next, we obtain the estimates of $d_{1,k}$ and $d_{2,k}$ given by (12a), (13), (22) and (23) by applying the well-known generalized least squares (GLS) estimate (see, e.g., [25, Theorem 3.1.1]), which is the linear minimum-variance unbiased estimate, a.k.a. the best linear unbiased estimate (BLUE). Note that since $\Sigma_k$ is invertible, there is one unique unbiased estimate of $d_{1,k}$. Since $M_{1,k}\Sigma_k = I$ and $M_{2,k}C_{2,k}G_{2,k-1} = I$, the input estimate errors and their covariance matrices are as follows:
$$\tilde d^{\,\mathrm{I}}_{1,k} = -M_{1,k}e^{\mathrm{I}}_{1,k},\qquad \tilde d^{\,\mathrm{II}}_{1,k} = -M_{1,k}e^{\mathrm{II}}_{1,k},\qquad \tilde d_{2,k-1} = -M_{2,k}e_{2,k},$$
$$P^d_{1,k} = \mathbb{E}[\tilde d_{1,k}\tilde d_{1,k}^\top] = M_{1,k}\mathbb{E}[e_{1,k}e_{1,k}^\top]M_{1,k}^\top = \Sigma_k^{-1}\tilde R_{1,k}\Sigma_k^{-1}\tag{32}$$
$$P^d_{2,k-1} = \mathbb{E}[\tilde d_{2,k-1}\tilde d_{2,k-1}^\top] = M_{2,k}\mathbb{E}[e_{2,k}e_{2,k}^\top]M_{2,k}^\top = (G_{2,k-1}^\top C_{2,k}^\top\tilde R_{2,k}^{-1}C_{2,k}G_{2,k-1})^{-1}.$$
Finally, we note the following equality:
$$\mathrm{tr}(\mathbb{E}[\tilde d_k\tilde d_k^\top]) = \mathrm{tr}\Bigl(\mathbb{E}\Bigl[V_k\begin{bmatrix}\tilde d_{1,k}\\ \tilde d_{2,k}\end{bmatrix}\begin{bmatrix}\tilde d_{1,k}^\top & \tilde d_{2,k}^\top\end{bmatrix}V_k^\top\Bigr]\Bigr) = \mathrm{tr}\Bigl(V_k^\top V_k\,\mathbb{E}\Bigl[\begin{bmatrix}\tilde d_{1,k}\\ \tilde d_{2,k}\end{bmatrix}\begin{bmatrix}\tilde d_{1,k}^\top & \tilde d_{2,k}^\top\end{bmatrix}\Bigr]\Bigr) = \mathrm{tr}(P^d_{1,k}) + \mathrm{tr}(P^d_{2,k}).\tag{33}$$
Since the unbiased estimate of $d_{1,k}$ is unique, we have $\min\mathrm{tr}(\mathbb{E}[\tilde d_k\tilde d_k^\top]) = \mathrm{tr}(\mathbb{E}[\tilde d_{1,k}\tilde d_{1,k}^\top]) + \min\mathrm{tr}(\mathbb{E}[\tilde d_{2,k}\tilde d_{2,k}^\top])$, from which it can be observed that the unbiased estimate $\hat d_k$ has minimum variance when $\hat d_{1,k}$ and $\hat d_{2,k}$ have minimum variances.

Remark 17 Moreover, if $w_{i,k}$ and $v_{i,k}$ for $i \in \{1,2\}$ are white Gaussian noises, which lead to $e_{i,k}$ being white and Gaussian, then (12a), (12b), (13) and (14) also provide the minimum-variance unbiased (MVU) input estimate.
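Equations (13), (23) and (25) are exactly the generalized least squares (BLUE) estimate of $d_{2,k-1}$ in the linear model (29). A compact sketch of this step, with our own function and variable names:

```python
import numpy as np

def gls_input_estimate(z2_tilde, C2, G2, R2_tilde):
    """BLUE/GLS estimate of d2 in the model z2_tilde = C2 G2 d2 + e2, cov(e2) = R2_tilde,
    i.e., equations (23) and (25): M2 = (F' W F)^{-1} F' W with F = C2 G2, W = R2_tilde^{-1}."""
    F = C2 @ G2
    W = np.linalg.inv(R2_tilde)
    Pd2 = np.linalg.inv(F.T @ W @ F)     # error covariance, cf. (25)
    M2 = Pd2 @ F.T @ W                   # gain, cf. (23)
    return M2 @ z2_tilde, Pd2
```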

5.2 Time Update

The time update is given by (15) and (16); the error in the propagated state estimate is given by (20), and its covariance matrix by
$$P^{\star x}_{k|k} = \begin{bmatrix}A_{k-1} & G_{1,k-1} & G_{2,k-1}\end{bmatrix}\begin{bmatrix}P^x_{k-1|k-1} & P^{xd}_{1,k-1} & P^{xd}_{2,k-1}\\ P^{xd\,\top}_{1,k-1} & P^d_{1,k-1} & P^d_{12,k-1}\\ P^{xd\,\top}_{2,k-1} & P^{d\,\top}_{12,k-1} & P^d_{2,k-1}\end{bmatrix}\begin{bmatrix}A_{k-1}^\top\\ G_{1,k-1}^\top\\ G_{2,k-1}^\top\end{bmatrix} + Q_{k-1} - G_{2,k-1}M_{2,k}C_{2,k}Q_{k-1} - Q_{k-1}C_{2,k}^\top M_{2,k}^\top G_{2,k-1}^\top.\tag{34}$$
Using (32), (22) and (23), (34) can be rewritten as
$$P^{\star x}_{k|k} = (I - G_{2,k-1}M_{2,k}C_{2,k})\hat P_k(I - G_{2,k-1}M_{2,k}C_{2,k})^\top + G_{2,k-1}M_{2,k}\tilde R_{2,k}M_{2,k}^\top G_{2,k-1}^\top\tag{35}$$
where we applied $L_k = L_kU_{2,k}U_{2,k}^\top$ from Lemma 14 and $T_{1,k}R_kT_{2,k}^\top = 0$ from (8), and defined $\bar A_k := \hat A_k - \hat A_kL_kC_k$, $\bar G_k := G_{1,k}M_{1,k}T_{1,k} + \hat A_kL_k$, and $\hat P_k$ as follows:
$$\hat P^{\,\mathrm{I}}_k := \hat P_k,$$
$$\begin{aligned}
\hat P^{\,\mathrm{II}}_k := {}& \bar A_{k-1}P^{\star x}_{k-1|k-1}\bar A_{k-1}^\top + Q_{k-1} + \bar G_{k-1}R_{k-1}\bar G_{k-1}^\top\\
&+ \bar A_{k-1}G_{2,k-2}M_{2,k-1}U_{2,k-1}^\top R_{k-1}L_{k-1}^\top\bar A_{k-1}^\top + \bar A_{k-1}L_{k-1}R_{k-1}U_{2,k-1}M_{2,k-1}^\top G_{2,k-2}^\top\bar A_{k-1}^\top.
\end{aligned}\tag{36}$$

5.3 Measurement Update

In the measurement update step, the measurement $y_k$ is used to update the propagated estimate $\hat x^\star_{k|k}$ and $P^{\star x}_{k|k}$. From (5) and (17), the updated state estimate error is given by (21), where the constraint $L_kU_{1,k} = 0$ (Lemma 14) must be imposed for all $k$ such that the state estimate is unbiased ($\mathbb{E}[\tilde x_{k|k}] = 0$) for all possible $d_{1,k}$, since $\Sigma_k$ has full rank. Note that the residual/innovations term in the measurement update (17) appears not to contain an $H_kd_k$ term, as would be expected. This term is actually present, but has been nullified by the unbiasedness constraint (Lemma 14), since $L_kH_k = L_kU_{1,k}\Sigma_kV_{1,k}^\top = 0$. This is also in line with the practical consideration that the unknown input estimate is not yet available. Next, the covariance matrix of the state error is computed as
$$\begin{aligned}
P^x_{k|k} &= (I - L_kC_k)P^{\star x}_{k|k}(I - L_kC_k)^\top + L_kR_kL_k^\top + (I - L_kC_k)G_{2,k-1}M_{2,k}U_{2,k}^\top R_kL_k^\top + L_kR_kU_{2,k}M_{2,k}^\top G_{2,k-1}^\top(I - L_kC_k)^\top\\
&:= P^{\star x}_{k|k} + L_k\tilde R^\star_kL_k^\top - L_kS_k^\top - S_kL_k^\top
\end{aligned}\tag{37}$$
where $\mathbb{E}[\tilde x^\star_{k|k}v_k^\top] = -G_{2,k-1}M_{2,k}U_{2,k}^\top R_k$, and we defined $\tilde R^\star_k := C_kP^{\star x}_{k|k}C_k^\top + R_k - C_kG_{2,k-1}M_{2,k}U_{2,k}^\top R_k - R_kU_{2,k}M_{2,k}^\top G_{2,k-1}^\top C_k^\top$ and $S_k := -G_{2,k-1}M_{2,k}U_{2,k}^\top R_k + P^{\star x}_{k|k}C_k^\top$. Using (35), we can rewrite $\tilde R^\star_k = N_k\tilde R_kN_k^\top$, where $\tilde R_k := C_k\hat P_kC_k^\top + R_k$, $N_k := I - C_kG_{2,k-1}M_{2,k}U_{2,k}^\top$ and $\hat P_k$ is as defined in (36).

To obtain an unbiased minimum-variance estimator, we then proceed to derive the optimal gain matrix $L_k$ by minimizing the trace of (37), since the trace represents the sum of the estimation error variances of the states, subject to the constraint $L_kU_{1,k} = 0$. However, the next lemma shows that $\tilde R^\star_k = N_k\tilde R_kN_k^\top$ is singular, because $N_k$ is rank deficient except when $p = p_H$, i.e., when $H_k$ has full rank.

Lemma 18 Consider $M_{2,k}$ that satisfies (23); then $N_k$ has rank $p_R = l - p + p_{H_{k-1}}$ and $p_{H_{k-1}} \le p_R \le l$.

Proof. Since $M_{2,k}$ satisfies (23), $N_k$ is an idempotent matrix, i.e., $N_kN_k = N_k$. From [28, Fact 3.12.9 and Proposition 2.6.3] and $\mathrm{rk}(C_{2,k}G_{2,k-1}) = p - p_{H_{k-1}}$, we obtain $p_R := \mathrm{rk}(I_l - C_kG_{2,k-1}M_{2,k}U_{2,k}^\top) = l - \mathrm{rk}(C_kG_{2,k-1}M_{2,k}U_{2,k}^\top) = l - p + p_{H_{k-1}} \le l$. Since we assumed $l \ge p$, we have $p_{H_{k-1}} \le p_R \le l$.

Hence, the optimal gain matrix $L_k$ is in general not unique. Similar to [15], we propose a gain matrix $L_k$ of the form $L_k = \bar L_k\Gamma_k$, where $\Gamma_k \in \mathbb{R}^{p_R\times l}$ is an arbitrary matrix which has to be chosen such that $\Gamma_k\tilde R^\star_k\Gamma_k^\top$ has full rank. With this, we compute the optimal gain $\bar L_k$, and thus $L_k$, in the following theorem.

Theorem 19 Suppose $\hat x_{0|0} = \hat x^\star_{0|0}$ are unbiased, and let $\Gamma_k \in \mathbb{R}^{p_R\times l}$ be chosen such that $\Gamma_k\tilde R^\star_k\Gamma_k^\top$ has full rank. Then, the minimum-variance unbiased state estimator is obtained with the gain matrix $L_k$ given by
$$L_k = \tilde K_k\breve R_k(I_l - H_{1,k}M^\star_{1,k}) = \tilde K_k(I_l - H_{1,k}M^\star_{1,k})^\top\breve R_k\tag{38}$$
where $H_{1,k} = U_{1,k}\Sigma_k$, $M^\star_{1,k} := \Sigma_k^{-1}(U_{1,k}^\top\breve R_kU_{1,k})^{-1}U_{1,k}^\top\breve R_k$, $\breve R_k := \Gamma_k^\top(\Gamma_k\tilde R^\star_k\Gamma_k^\top)^{-1}\Gamma_k$, and
$$\tilde K_k := P^{\star x}_{k|k}C_k^\top - G_{2,k-1}M_{2,k}U_{2,k}^\top R_k = (\hat P_kC_k^\top - G_{2,k-1}M_{2,k}U_{2,k}^\top R_k)N_k^\top,$$
with $M_{2,k}$ and $\hat P_k$ as defined in Theorem 16, and $\tilde R_k$ and $\tilde R^\star_k$ as defined in the text following (37).

Proof. By Lemma 14, the state estimates are unbiased. Next, we employ the optimization approach with Lagrange multipliers ($\Lambda_k \in \mathbb{R}^{n\times p_H}$) of [2, 16, 18] to find the particular gain $L_k$ that minimizes the trace of the covariance matrix $P^x_{k|k}$, subject to the constraint $L_kU_{1,k} = 0$, which is a necessary condition for obtaining an unbiased estimate. This constrained optimization problem can be solved using differential calculus with the Lagrangian given by
$$\mathcal{L}(\bar L_k, \Lambda_k) := \mathrm{tr}(P^x_{k|k}) - 2\,\mathrm{tr}(\bar L_k\Gamma_kU_{1,k}\Lambda_k^\top)$$
with a filter gain of the form $L_k = \bar L_k\Gamma_k$. Differentiating the Lagrangian with respect to $\bar L_k$ and $\Lambda_k$, and setting the derivatives to zero, we obtain
$$\frac{\partial\mathcal{L}}{\partial\bar L_k} = 2\bigl(\Gamma_k\tilde R^\star_k\Gamma_k^\top\bar L_k^\top - \Gamma_kU_{1,k}\Sigma_k\Lambda_k^\top - \Gamma_k(C_kP^{\star x}_{k|k} - R_kU_{2,k}M_{2,k}^\top G_{2,k-1}^\top)\bigr) = 0,$$
$$\frac{\partial\mathcal{L}}{\partial\Lambda_k} = -2\bar L_k\Gamma_kU_{1,k} = 0.$$
Solving the above linear system of equations and simplifying, we obtain the optimal gain matrix (38).

One choice of $\Gamma_k$ (first proposed in [5] using the singular value decomposition $\tilde R_k^{-\frac12}C_kG_{2,k-1} = \bar U_k\bar\Sigma_k\bar V_k^\top$) such that $\Gamma_k\tilde R^\star_k\Gamma_k^\top$ has full rank is given by
$$\Gamma_k = \begin{bmatrix}0 & I_{p_R}\end{bmatrix}\bar U_k^\top\tilde R_k^{-\frac12},\tag{39}$$
where $\tilde R_k$ and $\tilde R^\star_k$ are defined in the text following (37) and $p_R = l - p + p_{H_{k-1}}$. With this choice of $\Gamma_k$, we obtain $\Gamma_k\tilde R^\star_k\Gamma_k^\top = I_{p_R}$, which is invertible. Following the procedure in [5, Appendix], it can be shown that (38) reduces to
$$L_k = \tilde K_k(I_l - H_{1,k}M^\star_{1,k})^\top\tilde R_k^{-1}\tag{40}$$
with $M^\star_{1,k} := \Sigma_k^{-1}(U_{1,k}^\top\tilde R_k^{-1}N_kU_{1,k})^{-1}U_{1,k}^\top\tilde R_k^{-1}N_k$, which is independent of $\bar U_k$; as such, the "expensive" singular value decomposition step can be bypassed. Another choice would be to use the Moore-Penrose pseudoinverse ($\dagger$), i.e., $\breve R_k = (\tilde R^\star_k)^\dagger$. Equivalently, we have $L_k = L_k\begin{bmatrix}U_{1,k} & U_{2,k}\end{bmatrix}\begin{bmatrix}U_{1,k}^\top\\ U_{2,k}^\top\end{bmatrix} = \tilde L_kU_{2,k}^\top$, where we defined
$$\tilde L_k := L_kU_{2,k} = \tilde K_k(I_l - H_{1,k}M^\star_{1,k})^\top\tilde R_k^{-1}U_{2,k}.\tag{41}$$

In addition, we can compute the (cross-)covariances as
$$P^{xd\,\mathrm{I}}_{1,k} = (P^{dx\,\mathrm{I}}_{1,k})^\top = -P^x_{k|k}C_{1,k}^\top M_{1,k}^\top + L_kR_kT_{1,k}^\top M_{1,k}^\top = -P^x_{k|k}C_{1,k}^\top M_{1,k}^\top\tag{42a}$$
$$P^{xd\,\mathrm{II}}_{1,k} = (P^{dx\,\mathrm{II}}_{1,k})^\top = L_kR_kT_{1,k}^\top M_{1,k}^\top - L_kR_kT_{2,k}^\top M_{2,k}^\top G_{2,k-1}^\top C_{1,k}^\top M_{1,k}^\top - (I - L_kC_k)P^{\star x}_{k|k}C_{1,k}^\top M_{1,k}^\top\tag{42b}$$
$$P^{xd}_{2,k-1} = (P^{dx}_{2,k-1})^\top = -P^x_{k-1|k-1}\hat A_{k-1}^\top C_{2,k}^\top M_{2,k}^\top - P^{xd}_{1,k-1}G_{1,k-1}^\top C_{2,k}^\top M_{2,k}^\top\tag{43}$$
$$P^d_{12,k-1} = (P^d_{21,k-1})^\top = -P^{dx}_{1,k-1}\hat A_{k-1}^\top C_{2,k}^\top M_{2,k}^\top - P^d_{1,k-1}G_{1,k-1}^\top C_{2,k}^\top M_{2,k}^\top\tag{44}$$
$$P^d_k := \begin{bmatrix}V_{1,k} & V_{2,k}\end{bmatrix}\begin{bmatrix}P^d_{1,k} & P^d_{12,k}\\ P^d_{21,k} & P^d_{2,k}\end{bmatrix}\begin{bmatrix}V_{1,k}^\top\\ V_{2,k}^\top\end{bmatrix}\tag{45}$$
where we can apply $L_k = L_kU_{2,k}U_{2,k}^\top$ from Lemma 14 and (8) such that $L_kR_kT_{1,k}^\top M_{1,k}^\top = 0$, which results in the simplification of (42) shown above.

5.4 Global optimality of ULISE

In the following, we relax the recursivity assumption of ULISE for both the state and input estimates and consider $\hat x_{k|k}$ and $\hat d_k$ to be the most general linear combinations of the unbiased initial state estimate $\hat x_{0|0}$ and $Z_k$ given in ($\star$). We first prove that the state update of ULISE has the same optimal form as the filter proposed in [8, Remark 3], through which the claim of global optimality of the state estimate over the class of all linear estimators follows from [23]. Then, we prove that the input estimate is also globally optimal, which completes the proof of Theorem 9.

Proof of Theorem 9. To this end, we rearrange the latter form of (17) of the state estimation for ULISE, with the unknown inputs estimated via (12a) and (13), to obtain
$$\begin{aligned}
\hat x_{k|k} ={}& \hat A_{k-1}\hat x_{k-1|k-1} + B_{k-1}u_{k-1} + G_{1,k-1}M_{1,k-1}z_{1,k-1}\\
&+ K_k\bigl(z_{2,k} - C_{2,k}(\hat A_{k-1}\hat x_{k-1|k-1} + B_{k-1}u_{k-1} + G_{1,k-1}M_{1,k-1}z_{1,k-1})\bigr)
\end{aligned}\tag{46}$$
$$K_k = G_{2,k-1}M_{2,k} + \tilde L_k(I - C_{2,k}G_{2,k-1}M_{2,k})\tag{47}$$
where $\hat A_{k-1} = A_{k-1} - G_{1,k-1}M_{1,k-1}C_{1,k-1}$, as previously defined. Repeating the procedure in Section 5.3,
$$\tilde L_k = (\hat P_kC_{2,k}^\top - G_{2,k-1}M_{2,k}\tilde R_{2,k})N_k^\top(N_k\tilde R_{2,k}N_k^\top)^{-1}\Gamma_k$$
and
$$P^x_{k|k} = (I - K_kC_{2,k})\hat P_k(I - K_kC_{2,k})^\top + K_kR_{2,k}K_k^\top\tag{48}$$
where $N_k := \Gamma_k(I - C_{2,k}G_{2,k-1}M_{2,k})$, $\tilde R_{2,k} := C_{2,k}\hat P_kC_{2,k}^\top + R_{2,k}$ and $\Gamma_k$ is an arbitrary matrix such that $N_k\tilde R_{2,k}N_k^\top$ has full rank. Thus, ULISE's state and state covariance updates are almost identical to the ones considered in [8], in which only state estimation is considered. The only difference is in the choice of $M_{2,k}$, where $M_{2,k}$ is replaced by $\check M_{2,k} := (C_{2,k}G_{2,k-1})^\dagger$ in [8]. More importantly, the state update law is of the optimal form [8, Remark 3], from which the global optimality of the state estimate over the class of linear estimators follows according to [23].

To show that the input estimate is also globally optimal, we consider the input estimate $\hat d^{\,g}_{k-1}$ to be the most general linear combination of the unbiased initial state estimate $\hat x_{0|0}$, as well as $Z_{1,k}$ and $Z_{2,k}$ given in ($\star$). Since $\tilde z_{1,i}$ and $\tilde z_{2,i}$, as defined for (28) and (29), are linear combinations of $\hat x_{0|0}$, $Z_{1,i}$ and $Z_{2,i}$, and of $\hat x_{0|0}$, $Z_{1,i-1}$ and $Z_{2,i}$, respectively, $\hat d^{\,g}_{k-1}$ can be expressed as
$$\hat d^{\,g}_{k-1} = \chi_0(k)\hat x_{0|0} + \sum_{i=1}^{k}\chi_{1,i}(k)\tilde z_{1,i} + \sum_{i=1}^{k}\chi_{2,i}(k)\tilde z_{2,i}.\tag{49}$$
Clearly, if $\chi_{1,k-1}(k) = V_{1,k-1}M_{1,k-1}$ and $\chi_{2,k}(k) = V_{2,k-1}M_{2,k}$, where $M_{1,k-1}$ and $M_{2,k}$ are as in (22) and (23), and if $\chi_0(k)$, $\chi_{1,k}(k)$, $\{\chi_{1,i}(k)\}_{i=0}^{k-2}$ and $\{\chi_{2,i}(k)\}_{i=0}^{k-1}$ are zero, then $\hat d^{\,g}_{k-1}$ is unbiased. To show the converse, we suppose that $\hat d^{\,g}_{k-1}$ is unbiased, i.e., $\mathbb{E}[\hat d^{\,g}_{k-1}] = V_{1,k-1}d_{1,k-1} + V_{2,k-1}d_{2,k-1}$. Since $d_k$ can take on any arbitrary value and $\tilde z_{1,k}$ is a function of $d_{1,k}$, $\chi_{1,k}(k) = 0$ is required for $\hat d^{\,g}_{k-1}$ to remain unbiased. Moreover, since the first measurements containing $d_{1,k-1}$ and $d_{2,k-1}$ are $\tilde z_{1,k-1}$ and $\tilde z_{2,k}$, we need $\mathbb{E}[\chi_{1,k-1}(k)\tilde z_{1,k-1}] = V_{1,k-1}d_{1,k-1}$ and $\mathbb{E}[\chi_{2,k}(k)\tilde z_{2,k}] = V_{2,k-1}d_{2,k-1}$. Consequently, $\chi_{1,k-1}(k) = V_{1,k-1}M_{1,k-1}$ and $\chi_{2,k}(k) = V_{2,k-1}M_{2,k}$. Moreover, for $\hat d^{\,g}_{k-1}$ to be unbiased, $\chi_0(k) = 0$, $\{\chi_{1,i}(k)\}_{i=0}^{k-2} = 0$ and $\{\chi_{2,i}(k)C_{2,i}G_{2,i-1}\}_{i=0}^{k-1} = 0$ must hold. Finally, we prove that the mean squared error $\mathbb{E}[\|d_{k-1} - \hat d^{\,g}_{k-1}\|_2^2]$ is minimized when $\{\chi_{2,i}(k)\}_{i=0}^{k-1} = 0$. From the unbiasedness conditions on $\hat d^{\,g}_{k-1}$ and from (49), we have $d_{k-1} - \hat d^{\,g}_{k-1} = \tilde d_{k-1} - \sum_{i=0}^{k-1}\chi_{2,i}(k)\tilde z_{2,i}$, where $\tilde d_k$ is as defined above Lemma 14. Since it is straightforward to verify (as in [23, Lemmas 1 and 2]) that $\mathbb{E}[\tilde d_k(\chi_{2,i}(k)\tilde z_{2,i})^\top] = 0$ for all $i \le k$, it follows that
$$\begin{aligned}
\mathbb{E}[\|d_{k-1} - \hat d^{\,g}_{k-1}\|_2^2] &= \mathrm{tr}\Bigl\{\mathbb{E}\Bigl[\Bigl(\tilde d_{k-1} - \sum_{i=0}^{k-1}\chi_{2,i}(k)\tilde z_{2,i}\Bigr)\Bigl(\tilde d_{k-1} - \sum_{i=0}^{k-1}\chi_{2,i}(k)\tilde z_{2,i}\Bigr)^\top\Bigr]\Bigr\}\\
&= \mathrm{tr}\{\mathbb{E}[\tilde d_{k-1}\tilde d_{k-1}^\top]\} + \mathbb{E}\Bigl[\Bigl\|\sum_{i=0}^{k-1}\chi_{2,i}(k)\tilde z_{2,i}\Bigr\|_2^2\Bigr]
\end{aligned}$$
where the first term is minimized by ULISE, as shown in (33) and Theorem 16, while the latter term is minimized when $\sum_{i=0}^{k-1}\chi_{2,i}(k)\tilde z_{2,i} = 0$, which occurs when $\{\chi_{2,i}(k)\}_{i=0}^{k-1} = 0$, as desired. Thus, Theorem 9 holds.

Remark 20 We also conclude that the state estimator in [8] implicitly estimates the unknown input, i.e., with (12a) and (13), although the replacement of $M_{2,k}$ by $\overline{M}_{2,k}$ in (13) is tantamount to using an ordinary least squares (OLS) estimate instead of the generalized least squares (GLS) estimate; the resulting estimate has the same expected value but does not have minimum variance (see the discussion in [29, pp. 223-224]). Furthermore, ULISE provides a family of optimal state estimators parameterized by $\Gamma_k$, whereas the filter in [8] provides a specific solution by choosing $N_k$ as the left null matrix of $C_{2,k}G_{2,k-1}$, i.e., $N_k = \mathrm{Null}((C_{2,k}G_{2,k-1})^\top)^\top$. More importantly, we have shown that the decorrelation constraint assumed in [8], by which only $z_{2,k}$ can be used in the state update to avoid obtaining a suboptimal estimator, is justified as a direct consequence of the unbiasedness constraint in Lemma 14, i.e., $L_kU_{1,k} = 0$. By extension, ULISE is also less restrictive than the filter in [7]. In addition, the unknown input estimates are BLUE; thus, ULISE is globally optimal over the class of all linear unbiased state and input estimators for systems with unknown inputs. However, the same cannot be said of PLISE, as can be seen in the examples of Section 7.
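To make the OLS-versus-GLS distinction in Remark 20 concrete, the following minimal Monte Carlo sketch (not taken from the paper; it only assumes a generic linear measurement $z = Fd + e$ with noise covariance $W$, mirroring the roles of $\overline{M}_{2,k}$ and $M_{2,k}$) illustrates that both estimators are unbiased, while the GLS estimate has smaller variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical measurement model z = F d + e, Cov(e) = W (illustrative values only).
F = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
W = np.diag([0.01, 1.0, 0.04])          # strongly unequal noise levels
d_true = np.array([1.0, -2.0])

M_ols = np.linalg.pinv(F)                                   # ordinary least squares, as with (C2 G2)^+
Winv = np.linalg.inv(W)
M_gls = np.linalg.inv(F.T @ Winv @ F) @ F.T @ Winv          # generalized (weighted) least squares

d_ols, d_gls = [], []
for _ in range(20000):
    z = F @ d_true + rng.multivariate_normal(np.zeros(3), W)
    d_ols.append(M_ols @ z)
    d_gls.append(M_gls @ z)
d_ols, d_gls = np.array(d_ols), np.array(d_gls)

print("mean (OLS):", d_ols.mean(axis=0), " mean (GLS):", d_gls.mean(axis=0))   # both close to d_true
print("var  (OLS):", d_ols.var(axis=0), "  var  (GLS):", d_gls.var(axis=0))    # GLS variances are smaller
```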

5.5 Stability of ULISE

In this section, we prove the stability of the ULISE filter by first reducing the linear time-varying system with unknown inputs to an equivalent system without unknown inputs. Then, we use existing results on the stability of the Kalman filter [26, Section 5] to obtain sufficient conditions for the stability of the original system.

Proof of Theorem 10. We begin by reducing the system with unknown inputs to one without unknown inputs. From (17) and (7), we obtain $\tilde{x}_{k|k} = \tilde{x}^\star_{k|k} - L_k(C_{2,k}\tilde{x}^\star_{k|k} + v_{2,k})$. Then, substituting (32) into (20) and the above equation, and rearranging, we obtain

$$\tilde{x}_{k|k} = \tilde{A}_{k-1}\tilde{x}_{k-1|k-1} + \tilde{w}_{k-1} - L_k(C_{2,k}\tilde{A}_{k-1}\tilde{x}_{k-1|k-1} + C_{2,k}\tilde{w}_k + v_{2,k}), \qquad (50)$$

where $\tilde{A}_{k-1} = (I - G_{2,k-1}M_{2,k}C_{2,k})\hat{A}_{k-1}$ and $\tilde{w}_k = -(I - G_{2,k-1}M_{2,k}C_{2,k})(G_{1,k-1}M_{1,k-1}v_{1,k-1} - w_{k-1}) - G_{2,k-1}M_{2,k}v_{2,k}$. As it turns out, the state estimation error dynamics above are the same as those of a Kalman filter [30] for a linear system without unknown inputs: $x^e_{k+1} = \tilde{A}_kx^e_k + \tilde{w}_k$; $y^e_k = C_{2,k}x^e_k + v_{2,k}$. Since the objective for both systems is the same, i.e., to obtain an unbiased minimum-variance filter, they are equivalent systems from the perspective of optimal filtering. However, the noise terms of this equivalent system are correlated, i.e., $E[\tilde{w}_kv_{2,k}^\top] = -G_{2,k-1}M_{2,k}R_{2,k}$. To transform the system further into one without correlated noise, we employ the common trick of adding a zero term, since $y^e_k - C_{2,k}x^e_k - v_{2,k} = 0$, to obtain

$$x^e_{k+1} = \tilde{A}_kx^e_k + \tilde{w}_k - G_{2,k-1}M_{2,k}(y^e_k - C_{2,k}x^e_k - v_{2,k}) = \bar{A}_kx^e_k + \bar{u}_k + \bar{w}_k, \qquad y^e_k = C_{2,k}x^e_k + v_{2,k},$$

where $\bar{A}_k = \tilde{A}_k + G_{2,k-1}M_{2,k}C_{2,k}$, $\bar{u}_k = -G_{2,k-1}M_{2,k}y^e_k$ is a known input and $\bar{w}_k = \tilde{w}_k + G_{2,k-1}M_{2,k}v_{2,k}$. The noise terms $\bar{w}_k$ and $v_{2,k}$ are uncorrelated, with covariances $\bar{Q}_k := E[\bar{w}_k\bar{w}_k^\top] = (I - G_{2,k-1}M_{2,k}C_{2,k})\tilde{Q}_{k-1}(I - G_{2,k-1}M_{2,k}C_{2,k})^\top$ and $R_{2,k}$, and $E[\bar{w}_kv_{2,k}^\top] = 0$, where $M_{2,k}$ and $\tilde{Q}_{k-1}$ are as defined in Theorem 16.
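The decorrelation step above is an instance of a standard transformation for removing cross-correlation between process and measurement noise. The sketch below (illustrative only; generic numpy arrays, not code from the paper) implements the generic identity; with the cross-covariance $S = -G_{2,k-1}M_{2,k}R_{2,k}$ it recovers $\bar{A}_k = \tilde{A}_k + G_{2,k-1}M_{2,k}C_{2,k}$ and the known input $\bar{u}_k = -G_{2,k-1}M_{2,k}y^e_k$ used above.

```python
import numpy as np

def decorrelate(A, C, Q, R, S):
    """Rewrite x+ = A x + w, y = C x + v with E[w v^T] = S as an
    equivalent system whose process noise is uncorrelated with v.

    Adding the zero term S R^{-1} (y - C x - v) to the state equation gives
        x+ = (A - S R^{-1} C) x + (S R^{-1}) y + (w - S R^{-1} v),
    where the new process noise w - S R^{-1} v is uncorrelated with v and has
    covariance Q - S R^{-1} S^T.
    """
    K = S @ np.linalg.inv(R)
    A_bar = A - K @ C          # new state transition matrix
    B_bar = K                  # gain on the (now known) measurement input y
    Q_bar = Q - K @ S.T        # covariance of the decorrelated process noise
    return A_bar, B_bar, Q_bar
```

With $S = -G_{2,k-1}M_{2,k}R_{2,k}$, the gain is $K = -G_{2,k-1}M_{2,k}$, so `A_bar` and the measurement-dependent term match $\bar{A}_k$ and $\bar{u}_k$ above.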

Ideally, if we could compute $\bar{A}_k$ and $\bar{Q}_k$ prior to applying the ULISE algorithm, then the uniform detectability and stabilizability conditions of [26, Section 5] could be applied directly to obtain the desired stability property. However, this is not the case, as these matrices depend on $P^x_{k-1|k-1}$, which is not available a priori. Thus, we substitute $M_{2,k}$ in (13) with $\overline{M}_{2,k} := (C_{2,k}G_{2,k-1})^\dagger$ to obtain $\check{A}_k := (I - G_{2,k-1}\overline{M}_{2,k}C_{2,k})\hat{A}_{k-1} + G_{2,k-1}\overline{M}_{2,k}C_{2,k}$ and $\check{Q}_k := (I - G_{2,k-1}\overline{M}_{2,k}C_{2,k})\tilde{Q}_{k-1}(I - G_{2,k-1}\overline{M}_{2,k}C_{2,k})^\top$. This removes the dependence on $P^x_{k-1|k-1}$ from the uniform detectability and stabilizability tests in Theorem 10.

From [26, Lemma 5.1 & Corollary 5.2], if $(\check{A}_k, C_{2,k})$ is uniformly detectable, then the corresponding filter error covariance $P^{x,\mathrm{sub}}_{k|k}$ is bounded. By the optimality of the ULISE algorithm, it follows that the ULISE error covariance $P^x_{k|k}$ and the gain $L_k$ are bounded. Next, by [26, Theorems 4.3 & 5.3], the uniform stabilizability of $(\check{A}_k, \check{Q}_k^{1/2})$ and the boundedness of $L_k$ imply that the filter (with the gain $L_k$ but with $\overline{M}_{2,k}$ in the input estimate) is exponentially stable. Finally, using the fact that the ordinary and generalized least squares input estimates have the same expected value (see, e.g., [29, pp. 223-224]), it can be verified from (50) that $E[\tilde{x}_{k|k}] = (I - L_kC_{2,k})\tilde{A}_{k-1}E[\tilde{x}_{k-1|k-1}]$, where $\tilde{A}_{k-1}$ can equivalently be computed with $\overline{M}_{2,k}$ in place of $M_{2,k}$; from this it follows that the uniform stabilizability of $(\check{A}_k, \check{Q}_k^{1/2})$ and the boundedness of $L_k$ also imply that ULISE is exponentially stable.

Next, we consider the time-invariant case, for which uniform detectability and uniform stabilizability reduce to the standard definitions of detectability and stabilizability [27]. Thus, the sufficient conditions of Theorem 11 follow directly. In addition, necessary and sufficient conditions can be obtained for the time-invariant case. Noting the similarity of ULISE to the state estimator in [8], and that the conditions given in [5] are independent of the choice of $M_{2,k}$ or $\overline{M}_{2,k}$, it can be shown that the convergence and stability conditions are as given in Theorem 11.
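For the time-invariant case, the detectability and unit-circle reachability conditions invoked above can be spot-checked numerically with PBH-type rank tests; the sketch below is illustrative only (generic numpy matrices, not code from the paper).

```python
import numpy as np

def pbh_detectable(A, C, tol=1e-9):
    """PBH test: (A, C) is detectable iff rank([[z*I - A], [C]]) = n
    for every eigenvalue z of A with |z| >= 1."""
    n = A.shape[0]
    for z in np.linalg.eigvals(A):
        if abs(z) >= 1 - tol:
            M = np.vstack([z * np.eye(n) - A, C])
            if np.linalg.matrix_rank(M, tol) < n:
                return False
    return True

def no_unreachable_unit_circle_modes(A, Qsqrt, tol=1e-9):
    """Checks that (A, Q^{1/2}) has no unreachable modes on the unit circle:
    rank([z*I - A, Q^{1/2}]) = n for every eigenvalue z of A with |z| = 1."""
    n = A.shape[0]
    for z in np.linalg.eigvals(A):
        if abs(abs(z) - 1) <= tol:
            M = np.hstack([z * np.eye(n) - A, Qsqrt])
            if np.linalg.matrix_rank(M, tol) < n:
                return False
    return True
```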

5.6 Stability of PLISE

Unfortunately, the more complex structure of PLISE renders the proof approach used in the previous section for the stability of ULISE inapplicable in the time-varying case. Instead of tackling this problem head-on, we choose to consider the stability of the PLISE variant only for linear time-invariant systems, as stated in Theorem 12, which is proven next.

Proof of Theorem 12. To prove the sufficiency of the conditions in Theorem 12 for the PLISE variant of the unified filter, we consider a suboptimal version of PLISE that utilizes a non-BLUE $\hat{d}_2$ obtained by assuming that $\tilde{R}_2 = I$, so that $M_2$ becomes $\overline{M}_2$ (similar to the assumption of [8]). Then, we rewrite (35) to obtain the associated algebraic Riccati equation

$$P^{\star x} = (F^s - K^sC_2)P^{\star x}(F^s - K^sC_2)^\top + K^s\Theta K^{s\top} + Q^s$$

where $K^s = NALU_2 - S\Theta^{-1}$, while $F^s$, $Q^s$, $N$, $S$ and $\Theta$ are as defined in Theorem 12. Using the results in [31, 32], the error covariance matrix converges exponentially to a unique stabilizing solution of the algebraic Riccati equation if and only if $(F^s, C_2)$ is detectable and $(F^s, Q^{s\,1/2})$ has no unreachable modes on the unit circle (Condition (ii)). To obtain Condition (i) from the detectability of $(F^s, C_2)$, we use the following identities:

$$\mathrm{rk}\begin{bmatrix} zI - F^s \\ C_2 \end{bmatrix} = \mathrm{rk}\left(\begin{bmatrix} I & -S\Theta^{-1} \\ 0 & I \end{bmatrix}\begin{bmatrix} zI - F^s \\ C_2 \end{bmatrix}\right) = \mathrm{rk}\begin{bmatrix} zI - NA \\ C_2 \end{bmatrix} = n, \quad \forall z \in \mathbb{C},\ |z| \geq 1,$$

$$\mathrm{rk}\begin{bmatrix} zI - A & -G_2 \\ C_2 & 0 \end{bmatrix} = \mathrm{rk}\left(\begin{bmatrix} zI - A & -G_2 \\ C_2 & 0 \end{bmatrix}\begin{bmatrix} I & 0 \\ -\overline{M}_2C_2A & I \end{bmatrix}\right) = \mathrm{rk}\begin{bmatrix} zI - NA & -G_2 \\ C_2 & 0 \end{bmatrix} = \mathrm{rk}\begin{bmatrix} zI - NA \\ C_2 \end{bmatrix} + p - p_H = n + p - p_H, \quad \forall z \in \mathbb{C},\ |z| \geq 1,$$

the latter of which is equivalent to strong detectability of the system by Theorem 7. Since the suboptimal version of PLISE admits a bounded steady-state solution, the error covariance, and hence the estimation errors, of PLISE remain bounded, because by the optimality of PLISE,

$$P^x \leq (I - LC)P^{\star x}(I - LC)^\top + LRL^\top + (I - LC)G_2\overline{M}_2U_2^\top RL^\top + LRU_2\overline{M}_2^\top G_2^\top(I - LC)^\top$$

where $L = (P^{\star x}C^\top - G_2\overline{M}_2U_2^\top R)\tilde{R}(I_l - H_1M_1^\star)$, $H_1 = U_1\Sigma$, $M_1^\star = \Sigma^{-1}(U_1^\top\tilde{R}U_1)^{-1}U_1^\top\tilde{R}$, $\tilde{R} := \Gamma^\top(\Gamma R^\star\Gamma^\top)^{-1}\Gamma$, $R^\star = CP^{\star x}C^\top + R - CG_2\overline{M}_2U_2^\top R - RU_2\overline{M}_2^\top G_2^\top C^\top$ and $\Gamma$ is such that $\Gamma R^\star\Gamma^\top$ has full rank.
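The strong-detectability rank condition used above can be spot-checked numerically for a given system. The sketch below is illustrative only: it takes the transformed matrices $A$, $G_2$, $C_2$ as generic numpy arrays (their construction follows the paper's decomposition, under which $G_2$ has $p - p_H$ columns) and tests the Rosenbrock-type matrix at a finite set of points with $|z| \geq 1$, rather than proving the condition for all such $z$.

```python
import numpy as np

def rosenbrock_rank_ok(A, G2, C2, z_points, tol=1e-9):
    """Checks rk([[z*I - A, -G2], [C2, 0]]) = n + m at each test point z,
    i.e., full column rank, where m is the number of columns of G2
    (this equals the value n + p - p_H in the identity above)."""
    n = A.shape[0]
    m = G2.shape[1]
    p_rows = C2.shape[0]
    for z in z_points:
        M = np.block([[z * np.eye(n) - A, -G2],
                      [C2, np.zeros((p_rows, m))]])
        if np.linalg.matrix_rank(M, tol) < n + m:
            return False, z
    return True, None

# Example test points: eigenvalues of A with |z| >= 1 plus samples on the unit circle.
# eigs = np.linalg.eigvals(A)
# z_points = [z for z in eigs if abs(z) >= 1] + list(np.exp(1j * np.linspace(0, 2 * np.pi, 50)))
```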

6 Connection to existing literature

In this section, we show that ULISE and PLISE reduce to estimators that are closely related to estimators in the existing literature in the following special cases.

6.1 Special Case 1: Hk has full rank

In this special case, $\mathrm{rk}(H_k) = p$ and the singular value decomposition of $H_k$ is

$$H_k = \begin{bmatrix} U_{1,k} & U_{2,k} \end{bmatrix}\begin{bmatrix} \Sigma_k \\ 0 \end{bmatrix}V_{1,k}^\top = U_{1,k}\Sigma_kV_{1,k}^\top.$$

Thus, $V_{2,k}$ is an empty matrix and, correspondingly, $G_{2,k}$, $d_{2,k}$, $M_{2,k}$ and $P^d_{2,k}$ are also empty matrices. From (16), (34) and (37), we have $\hat{x}^\star_{k|k} = \hat{x}_{k|k-1}$,

$$P^x_{k|k} = (I - L_kC_k)P^x_{k|k-1}(I - L_kC_k)^\top + L_kR_kL_k^\top \qquad (51)$$
$$R^\star_k = \tilde{R}_k := C_kP^x_{k|k-1}C_k^\top + R_k \qquad (52)$$
$$P^{\star x}_{k|k} = P^x_{k|k-1} := E[(x_k - \hat{x}_{k|k-1})(x_k - \hat{x}_{k|k-1})^\top] = A_{k-1}P^x_{k-1|k-1}A_{k-1}^\top + G_{k-1}P^d_kG_{k-1}^\top + A_{k-1}P^{xd}_{k-1}G_{k-1}^\top + G_{k-1}P^{xd\,\top}_{k-1}A_{k-1}^\top + Q_{k-1} \qquad (53)$$
$$P^d_k = V_{1,k}P^d_{1,k}V_{1,k}^\top = (H_k^\top U_{1,k}(T_{1,k}\tilde{R}_kT_{1,k}^\top)^{-1}U_{1,k}^\top H_k)^{-1} \qquad (54)$$
$$P^{xd,I}_k = P^{xd,I}_{1,k}V_{1,k}^\top = L_k\tilde{R}_kM_k^\top - P^x_{k|k}C_k^\top M_k^\top \qquad (55a)$$
$$P^{xd,II}_k = P^{xd,II}_{1,k}V_{1,k}^\top = L_k\tilde{R}_kM_k^\top - P^x_{k|k-1}C_k^\top M_k^\top \qquad (55b)$$

where we have defined

$$M_k := V_{1,k}M_{1,k}T_{1,k} = (H_k^\top U_{1,k}(T_{1,k}\tilde{R}_kT_{1,k}^\top)^{-1}U_{1,k}^\top H_k)^{-1}H_k^\top U_{1,k}(T_{1,k}\tilde{R}_kT_{1,k}^\top)^{-1}T_{1,k}. \qquad (56)$$

Since $\tilde{R}_k$ has full rank, $\Gamma_k$ can be chosen as the identity matrix, and the state update and input estimates are

$$\hat{x}_{k|k-1} = A_{k-1}\hat{x}_{k-1|k-1} + B_{k-1}u_{k-1} + G_{k-1}\hat{d}_{k-1} \qquad (57)$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + L_k(y_k - C_k\hat{x}_{k|k-1} - D_ku_k) \qquad (58)$$
$$\hat{d}^{\,I}_k = M_k(y_k - C_k\hat{x}_{k|k} - D_ku_k) \qquad (59a)$$
$$\hat{d}^{\,II}_k = M_k(y_k - C_k\hat{x}_{k|k-1} - D_ku_k) \qquad (59b)$$

with $L_k = P^x_{k|k-1}C_k^\top\tilde{R}_k^{-1}(I - H_k(H_k^\top\tilde{R}_k^{-1}H_k)^{-1}H_k^\top\tilde{R}_k^{-1})$.


Comparing the above equations with the filters in [16, 18], we note that the ULISE variant is closely related to the filter proposed in [18], with the main differences in (54) and (56), which would be equivalent if $T_{1,k} = U_{1,k}^\top$ and $U_{1,k}(T_{1,k}\tilde{R}_kT_{1,k}^\top)^{-1}U_{1,k}^\top = \tilde{R}_k^{-1}$; this is only true when $U_{2,k}$ is an empty matrix, i.e., when $H_k$ has full row rank. On the other hand, the PLISE variant is closely related to the filter in [16]. Similarly, the only differences lie in (54), (55b) and (56), and the filters are equivalent when $H_k$ has full row rank, which also leads to $L_k\tilde{R}_kM_k^\top = 0$.
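As an illustration of this special case, the following sketch (illustrative only; generic numpy arrays, not code from the paper) implements the measurement update (51), (58) and a (59b)-type input estimate using the gain $L_k$ given above; the input gain is written in the weighted least-squares form that (56) reduces to under the full-row-rank conditions just discussed.

```python
import numpy as np

def full_rank_H_update(x_pred, P_pred, y, u, C, D, H, R):
    """One measurement update for the special case rk(H_k) = p."""
    n = P_pred.shape[0]
    l = C.shape[0]
    R_tilde = C @ P_pred @ C.T + R                               # innovation covariance (52)
    Ri = np.linalg.inv(R_tilde)
    # State gain L_k = P C^T Rt^{-1} (I - H (H^T Rt^{-1} H)^{-1} H^T Rt^{-1})
    W = H @ np.linalg.inv(H.T @ Ri @ H) @ H.T @ Ri
    L = P_pred @ C.T @ Ri @ (np.eye(l) - W)
    innov = y - C @ x_pred - D @ u
    x_upd = x_pred + L @ innov                                   # (58)
    P_upd = (np.eye(n) - L @ C) @ P_pred @ (np.eye(n) - L @ C).T + L @ R @ L.T   # (51)
    M = np.linalg.inv(H.T @ Ri @ H) @ H.T @ Ri                   # WLS input gain (full-row-rank form)
    d_hat = M @ innov                                            # (59b)-type input estimate
    return x_upd, P_upd, d_hat
```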

6.2 Special Case 2: Hk = 0

In this case, no transformation of the output equations and no decomposition of the unknown input vector are necessary. $U_{1,k}$ and $V_{1,k}$ are empty matrices, while $U_{2,k}$ and $V_{2,k}$ are identity matrices. Thus, ULISE and PLISE reduce to the same state and covariance update equations, given by

$$\hat{x}_{k|k-1} = A_{k-1}\hat{x}_{k-1|k-1} + B_{k-1}u_{k-1} \qquad (60)$$
$$\hat{x}^\star_{k|k} = \hat{x}_{k|k-1} + G_{k-1}\hat{d}_{k-1} \qquad (61)$$
$$\hat{x}_{k|k} = \hat{x}^\star_{k|k} + L_k(y_k - C_k\hat{x}^\star_{k|k} - D_ku_k) \qquad (62)$$
$$\hat{d}_{k-1} = M_k(y_k - C_k\hat{x}_{k|k-1} - D_ku_k) \qquad (63)$$
$$P^x_{k|k-1} = A_{k-1}P^x_{k-1|k-1}A_{k-1}^\top + Q_{k-1} \qquad (64)$$
$$P^{\star x}_{k|k} = (I - G_{k-1}M_kC_k)P^x_{k|k-1}(I - G_{k-1}M_kC_k)^\top + G_{k-1}M_kR_kM_k^\top G_{k-1}^\top \qquad (65)$$
$$P^d_k = (G_{k-1}^\top C_k^\top\tilde{R}_k^{-1}C_kG_{k-1})^{-1} \qquad (66)$$
$$P^{xd}_k = -P^x_{k-1|k-1}A_{k-1}^\top C_k^\top M_k^\top \qquad (67)$$
$$P^x_{k|k} = P^{\star x}_{k|k} + L_kR^\star_kL_k^\top - L_kS_k^\top - S_kL_k^\top \qquad (68)$$

where $\tilde{R}_k = C_kP^x_{k|k-1}C_k^\top + R_k$, $R^\star_k = C_kP^{\star x}_{k|k}C_k^\top + R_k - C_kG_{k-1}M_kR_k - R_kM_k^\top G_{k-1}^\top C_k^\top$, $S_k = -G_{k-1}M_kR_k + P^{\star x}_{k|k}C_k^\top$, $M_k = (G_{k-1}^\top C_k^\top\tilde{R}_k^{-1}C_kG_{k-1})^{-1}G_{k-1}^\top C_k^\top\tilde{R}_k^{-1}$ and $L_k = (P^{\star x}_{k|k}C_k^\top - G_{k-1}M_kR_k)R^{\star\,-1}_k$. The above equations are identical to the filter derived in [15] for systems without direct feedthrough; therefore, ULISE and PLISE are generalizations of the filter in [15] to systems with direct feedthrough and, by extension, of the filters in [2, 5].
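A compact sketch of the reduced recursion (60)-(68) is given below; it is illustrative only (generic numpy arrays, with no claim to match the authors' implementation).

```python
import numpy as np

def step_no_feedthrough(x, Px, y, u, A, B, C, D, G, Q, R):
    """One recursion of the reduced filter (60)-(68) for the case H_k = 0."""
    I = np.eye(A.shape[0])
    x_pred = A @ x + B @ u                                   # (60)
    Px_pred = A @ Px @ A.T + Q                               # (64)
    R_tilde = C @ Px_pred @ C.T + R
    Ri = np.linalg.inv(R_tilde)
    M = np.linalg.inv(G.T @ C.T @ Ri @ C @ G) @ G.T @ C.T @ Ri
    d_hat = M @ (y - C @ x_pred - D @ u)                     # (63)
    Pd = np.linalg.inv(G.T @ C.T @ Ri @ C @ G)               # (66)
    x_star = x_pred + G @ d_hat                              # (61)
    P_star = (I - G @ M @ C) @ Px_pred @ (I - G @ M @ C).T + G @ M @ R @ M.T @ G.T   # (65)
    R_star = C @ P_star @ C.T + R - C @ G @ M @ R - R @ M.T @ G.T @ C.T
    S = -G @ M @ R + P_star @ C.T
    L = S @ np.linalg.inv(R_star)                            # gain L_k = S_k R*_k^{-1}
    x_upd = x_star + L @ (y - C @ x_star - D @ u)            # (62)
    Px_upd = P_star + L @ R_star @ L.T - L @ S.T - S @ L.T   # (68)
    return x_upd, Px_upd, d_hat, Pd
```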

6.3 Special Case 3: Gk = 0 and Hk = 0

When $G_k = 0$ and $H_k = 0$, the filter gain $L_k$ reduces to the Kalman filter gain $L_k = P^x_{k|k-1}C_k^\top\tilde{R}_k^{-1}$, where $\tilde{R}_k = C_kP^x_{k|k-1}C_k^\top + R_k$, while the state and covariance updates reduce to the Kalman filter equations:

$$\hat{x}_{k|k} = A_{k-1}\hat{x}_{k-1|k-1} + L_k(y_k - C_kA_{k-1}\hat{x}_{k-1|k-1}) \qquad (69)$$
$$P^x_{k|k-1} = A_{k-1}P^x_{k-1|k-1}A_{k-1}^\top + Q_{k-1} \qquad (70)$$
$$P^x_{k|k} = (I - L_kC_k)P^x_{k|k-1}(I - L_kC_k)^\top + L_kR_kL_k^\top \qquad (71)$$
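For completeness, a minimal numpy sketch of the recursion (69)-(71), with the Joseph-form covariance update, is shown below (illustrative only).

```python
import numpy as np

def kalman_step(x, Px, y, A, C, Q, R):
    """One combined predict/update step corresponding to (69)-(71)."""
    n = A.shape[0]
    x_pred = A @ x
    Px_pred = A @ Px @ A.T + Q                               # (70)
    R_tilde = C @ Px_pred @ C.T + R
    L = Px_pred @ C.T @ np.linalg.inv(R_tilde)               # Kalman gain
    x_upd = x_pred + L @ (y - C @ x_pred)                    # (69)
    Px_upd = (np.eye(n) - L @ C) @ Px_pred @ (np.eye(n) - L @ C).T + L @ R @ L.T   # (71), Joseph form
    return x_upd, Px_upd
```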

7 Illustrative Examples

7.1 Fault Identification

In this example, we consider the state estimation and fault identification problem for a system whose dynamics are affected by faults $d_k$, which can influence either the system dynamics through the input matrix $G_k$ or the outputs through the feedthrough matrix $H_k$, in addition to zero-mean Gaussian white noise. The objective is to estimate the states of the system for the sake of continued operation in spite of the faults, and to identify the faults that the system is experiencing for self-repair or maintenance purposes. Specifically, the linear discrete-time problems we consider are based on the system given in [8], which is similar to the failure detection problem first considered in [33], with six different $H$ matrices to illustrate the effect of parameter changes on filter performance:

$$A = \begin{bmatrix} 0.5 & 2 & 0 & 0 & 0 \\ 0 & 0.2 & 1 & 0 & 1 \\ 0 & 0 & 0.3 & 0 & 1 \\ 0 & 0 & 0 & 0.7 & 1 \\ 0 & 0 & 0 & 0 & 0.1 \end{bmatrix}; \quad B = 0_{5\times 1}; \quad C = I_5; \quad D = 0_{5\times 1}; \quad G = \begin{bmatrix} 1 & 0 & -0.3 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix};$$

$$Q = 10^{-4}\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0.5 & 0 & 0 \\ 0 & 0.5 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}; \quad R = 10^{-2}\begin{bmatrix} 1 & 0 & 0 & 0.5 & 0 \\ 0 & 1 & 0 & 0 & 0.3 \\ 0 & 0 & 1 & 0 & 0 \\ 0.5 & 0 & 0 & 1 & 0 \\ 0 & 0.3 & 0 & 0 & 1 \end{bmatrix};$$

$$H_1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad H_2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}; \quad H_3 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}; \quad H_4 = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad H_5 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}; \quad H_6 = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.$$

With the above $H$ matrices, the invariant zeros of the matrix pencil $\begin{bmatrix} zI - A & -G_2 \\ C_2 & 0 \end{bmatrix}$ are, respectively, $\{0.3, 0.8\}$, $\{0.1, 0.3, 0.5, 0.7, 0.8\}$, $\emptyset$, $\{0.3, -0.8\}$, $\emptyset$ and $\{0.1, 0.7, 0.3, -0.8, 0.35\}$. Thus, all six systems are strongly detectable. Moreover, the direct feedthrough matrices of the second and sixth systems, $H_2$ and $H_6$, have full rank.


Fig. 1. Actual states $x_1, x_2, x_3, x_4, x_5$ and their estimates, as well as unknown inputs $d_1$, $d_2$ and $d_3$ and their estimates.

Fig. 2. Trace of the state estimate error covariance, $\mathrm{tr}(P^x)$, and of the unknown input estimate error covariance, $\mathrm{tr}(P^d)$.

The unknown inputs used in this example are

$$d_{k,1} = \begin{cases} 1, & 500 \leq k \leq 700 \\ 0, & \text{otherwise} \end{cases} \qquad d_{k,2} = \begin{cases} \frac{1}{700}(k - 100), & 100 \leq k \leq 800 \\ 0, & \text{otherwise} \end{cases}$$

$$d_{k,3} = \begin{cases} 3, & 500 \leq k \leq 549,\; 600 \leq k \leq 649,\; 700 \leq k \leq 749 \\ -3, & 550 \leq k \leq 599,\; 650 \leq k \leq 699,\; 750 \leq k \leq 799 \\ 0, & \text{otherwise.} \end{cases}$$
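For reproducibility, the fault signals above can be generated with a few lines of numpy (illustrative sketch; the 1000-step horizon matches the figures).

```python
import numpy as np

k = np.arange(1000)

d1 = np.where((k >= 500) & (k <= 700), 1.0, 0.0)
d2 = np.where((k >= 100) & (k <= 800), (k - 100) / 700.0, 0.0)

d3 = np.zeros_like(k, dtype=float)
for start in (500, 600, 700):
    d3[(k >= start) & (k <= start + 49)] = 3.0       # positive pulses
for start in (550, 650, 750):
    d3[(k >= start) & (k <= start + 49)] = -3.0      # negative pulses

d = np.column_stack([d1, d2, d3])                    # unknown input sequence d_k
```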

To illustrate the performance of the unified simultaneous input and state estimators, measured by the steady-state trace of the error covariance matrices, we compare the performance of the following filters: (i) the Cheng et al. filter [8], augmented to estimate the unknown input in the BLUE sense, i.e., with (12a) and (13) (CYWZ), (ii) ULISE from Section 4, and (iii) PLISE from Section 4, as well as the filters for systems with a full-rank $H$ matrix: (iv) the Gillijns and De Moor filter (GDM) [16], (v) the Fang et al. filter (FSY) [17] and (vi) the Yong et al. filter (YZF) [18]. The simulations were implemented in MATLAB on a 2.2 GHz Intel Core i7 CPU.

Table 1
Steady-state performance of CYWZ, ULISE, PLISE, GDM, FSY and YZF.

            P^x_11   P^x_22   P^x_33   P^x_44   P^x_55   P^d_11   P^d_22   P^d_33
H1  CYWZ    0.1843   0.0091   0.0002   0.0004   0.0001   0.0099   0.0102   0.1923
    ULISE   0.1843   0.0091   0.0002   0.0004   0.0001   0.0099   0.0102   0.1923
    PLISE   0.1843   0.0091   0.0002   0.0004   0.0001   0.0099   0.0102   0.1923
    GDM     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    FSY     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    YZF     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
H2  CYWZ    0.1494   0.0052   0.0002   0.0004   0.0001   0.0097   0.0102   0.1574
    ULISE   0.1494   0.0052   0.0002   0.0004   0.0001   0.0097   0.0102   0.1574
    PLISE   0.1614   0.0053   0.0002   0.0004   0.0001   0.0102   0.0102   0.1889
    GDM     0.1494   0.0052   0.0002   0.0004   0.0001   0.0097   0.0102   0.1574
    FSY     0.1724   0.0108   0.0002   0.0004   0.0001   0.0097   0.0102   0.1648
    YZF     0.1494   0.0052   0.0002   0.0004   0.0001   0.0097   0.0102   0.1574
H3  CYWZ    0.0076   0.0052   0.0002   0.0004   0.0001   0.0097   0.0102   0.3906
    ULISE   0.0076   0.0052   0.0002   0.0004   0.0001   0.0097   0.0102   0.3906
    PLISE   0.0076   0.0053   0.0002   0.0004   0.0001   0.0102   0.0102   0.3961
    GDM     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    FSY     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    YZF     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
H4  CYWZ    0.0076   0.0257   0.0002   0.0004   0.0001   0.0348   0.0102   0.4925
    ULISE   0.0076   0.0257   0.0002   0.0004   0.0001   0.0348   0.0102   0.4925
    PLISE   0.0076   0.0258   0.0002   0.0004   0.0001   0.0349   0.0102   0.4925
    GDM     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    FSY     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    YZF     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
H5  CYWZ    0.0079   0.0074   0.0002   0.0004   0.0001   0.0089   0.0102   0.0099
    ULISE   0.0079   0.0074   0.0002   0.0004   0.0001   0.0089   0.0102   0.0099
    PLISE   0.0079   0.0074   0.0002   0.0004   0.0001   0.0089   0.0102   0.0150
    GDM     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    FSY     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
    YZF     N/A      N/A      N/A      N/A      N/A      N/A      N/A      N/A
H6  CYWZ    0.0076   0.0218   0.0002   0.0004   0.0001   0.0309   0.0102   0.0097
    ULISE   0.0076   0.0218   0.0002   0.0004   0.0001   0.0309   0.0102   0.0097
    PLISE   0.0078   0.0257   0.0002   0.0004   0.0001   0.0368   0.0102   0.0165
    GDM     0.0076   0.0218   0.0002   0.0004   0.0001   0.0309   0.0102   0.0097
    FSY     0.0315   0.0232   0.0002   0.0004   0.0001   0.0310   0.0102   0.0100
    YZF     0.0076   0.0218   0.0002   0.0004   0.0001   0.0309   0.0102   0.0097

Figure 1 shows a comparison of the input and state estimates of the first three MVU estimators for the first system, with $H_1$. In this case, these estimators successfully estimate the states as well as the unknown inputs. From Figure 2, all three estimators appear to produce the same steady-state error covariances. However, if we consider the results for all six systems in Table 1, we observe that PLISE is outperformed by CYWZ and ULISE. Note also that ULISE is consistently the best filter, which agrees with the claim in Section 5.4 that it is globally optimal over the class of all linear unbiased state and input estimators for systems with unknown inputs, while CYWZ performs just as well, which shows that, in this particular example, replacing the generalized least squares estimate of $d_{2,k}$ with the ordinary least squares estimate has little impact on filter performance.

On the other hand, when the direct feedthrough matrix has full rank, as with $H_2$ and $H_6$, GDM and YZF performed just as well as CYWZ and ULISE, which is consistent with the claim of global optimality of GDM in [34]. In both cases, the intentionally suboptimal FSY filter performs better than PLISE at estimating the unknown inputs, but worse than PLISE at estimating the system states.

7.2 Multi-vehicle Tracking

In this second example, we consider the problem of position and velocity tracking of multiple vehicles, e.g., at an intersection, with partial information about the decisions of the vehicles as well as faulty sensor readings. This can be particularly useful for the design of intelligent transportation systems. To simplify the problem, we consider a scenario with two vehicles, in which each vehicle only has access to its own control input; thus, the input of the other vehicle is unknown. Furthermore, the velocity measurement of the controlled vehicle is corrupted by a time-varying bias, which is also unknown. Thus, we model the linear continuous-time dynamics of the coupled system as:

$$\begin{bmatrix} \dot{p} \\ \ddot{p} \\ \dot{q} \\ \ddot{q} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -0.1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & -0.1 \end{bmatrix}\begin{bmatrix} p \\ \dot{p} \\ q \\ \dot{q} \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}u + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} d_1 \\ d_2 \end{bmatrix} + \begin{bmatrix} 0 \\ w_1 \\ 0 \\ w_2 \end{bmatrix}$$

$$y = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} p \\ \dot{p} \\ q \\ \dot{q} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} d_1 \\ d_2 \end{bmatrix} + v$$

where $p$ and $\dot{p}$, and $q$ and $\dot{q}$, are the displacements and velocities of the uncontrolled and controlled vehicle, respectively; $d_1$ is the unknown input of the uncontrolled vehicle, while $d_2$ represents the unknown time-varying bias. The intensities of the zero-mean, white Gaussian noises $w = \begin{bmatrix} 0 & w_1 & 0 & w_2 \end{bmatrix}^\top$ and $v$ are given by

$$Q^c = 10^{-4}\,\mathrm{diag}(0,\, 1.6,\, 0,\, 0.9); \qquad R^c = 10^{-4}\,\mathrm{diag}(1,\, 0.16,\, 0.9,\, 2.5).$$

Since the proposed filter is for discrete-time systems, we first convert the continuous-time dynamics to an equivalent discrete-time model with sample time $\Delta t = 0.01\,\mathrm{s}$, assuming a zero-order hold for the known and unknown inputs $u$ and $d$:

$$x_{k+1} = A_dx_k + B_du_k + G_dd_k + w_{d,k}$$
$$y_k = C_dx_k + H_dd_k + v_{d,k}$$

where $x = \begin{bmatrix} p & \dot{p} & q & \dot{q} \end{bmatrix}^\top$, $k = 0, 1, 2, \ldots$ and $t = k\Delta t$.

Fig. 3. Actual states $x_1, x_2, x_3, x_4$ and their estimates, as well as unknown inputs $d_1$, $d_2$ and their estimates.

The system matrices as well as the noise covariances can be computed, e.g., using conversion algorithms involving matrix exponentials as in [35, 36], to obtain:

$$A_d = \begin{bmatrix} 1 & 0.01 & 0 & 0 \\ 0 & 0.999 & 0 & 0 \\ 0 & 0 & 1 & 0.01 \\ 0 & 0 & 0 & 0.999 \end{bmatrix}; \quad B_d = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0.01 \end{bmatrix}; \quad C_d = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}; \quad G_d = \begin{bmatrix} 0 & 0 \\ 0.01 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}; \quad H_d = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix};$$

$$Q_d = 10^{-5}\begin{bmatrix} 0.0000 & 0.0008 & 0 & 0 \\ 0.0008 & 0.1598 & 0 & 0 \\ 0 & 0 & 0.0000 & 0.0004 \\ 0 & 0 & 0.0004 & 0.0899 \end{bmatrix}; \quad R_d = R^c,$$

with d1,k and d2,k as shown in Figure 3 (where t = k∆t).
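One standard way to carry out this conversion, following the matrix-exponential constructions in [35, 36], is sketched below (illustrative only; it assumes scipy and generic continuous-time matrices Ac, Bc, Gc, Qc and a sampling time dt). Applied to the continuous-time model above with dt = 0.01, it should reproduce, up to rounding, the $A_d$, $B_d$, $G_d$ and $Q_d$ listed above; $C_d$ and $H_d$ are unchanged by sampling.

```python
import numpy as np
from scipy.linalg import expm

def zoh_discretize(Ac, Bc, Gc, Qc, dt):
    """Zero-order-hold discretization of dx/dt = Ac x + Bc u + Gc d + w,
    with process noise intensity Qc, using matrix exponentials (cf. [35, 36])."""
    n = Ac.shape[0]
    # State and input matrices: expm([[Ac, [Bc Gc]], [0, 0]] * dt) = [[Ad, [Bd Gd]], [0, I]].
    BG = np.hstack([Bc, Gc])
    m = BG.shape[1]
    M = np.zeros((n + m, n + m))
    M[:n, :n] = Ac
    M[:n, n:] = BG
    F = expm(M * dt)
    Ad = F[:n, :n]
    Bd = F[:n, n:n + Bc.shape[1]]
    Gd = F[:n, n + Bc.shape[1]:]
    # Process noise covariance via Van Loan's construction [36]:
    # expm([[-Ac, Qc], [0, Ac^T]] * dt) = [[*, V12], [0, V22]],  Ad = V22^T,  Qd = Ad @ V12.
    V = np.zeros((2 * n, 2 * n))
    V[:n, :n] = -Ac
    V[:n, n:] = Qc
    V[n:, n:] = Ac.T
    Fv = expm(V * dt)
    Qd = Fv[n:, n:].T @ Fv[:n, n:]
    return Ad, Bd, Gd, Qd
```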

From Figure 3, we observe that both variants of the filter proposed in this paper successfully estimate the system states and the unknown inputs, which consist of the input of the uncontrolled vehicle and the time-varying measurement bias. The slight difference between the two variants can be seen in Figure 4, where the trace of the unknown input estimate error covariance of the PLISE variant converges slightly more slowly.

8 Conclusion

This paper presented a unified filter for simultaneously estimating the states and unknown inputs in an unbiased minimum-variance sense for linear discrete-time stochastic systems, without any restriction on the direct feedthrough matrix of the system.


Fig. 4. Trace of the state estimate error covariance, $\mathrm{tr}(P^x)$, and of the unknown input estimate error covariance, $\mathrm{tr}(P^d)$, over the first 0.25 s.

Two variants of the filter are proposed: one uses the propagated state estimate for unknown input estimation (PLISE), while the other uses the updated state estimate (ULISE). We proved that ULISE is globally optimal over the class of all linear unbiased state and input estimators for systems with unknown inputs, and we provided stability conditions for the filter, which are shown to be closely related to the strong detectability of the system. Simulation results show that ULISE was the best estimator in all the test trials, whereas PLISE, though not globally optimal, performed reasonably well.

A possible future direction is the extension of the current unified filter to linear continuous-time systems, switched systems and nonlinear systems.

Acknowledgments

The work presented in this paper was supported in partby the National Science Foundation, grant #1239182.

References

[1] R. Verma and D. Del Vecchio. Semiautonomous multivehicle safety. IEEE Robotics & Automation Magazine, 18(3):44–54, Sept. 2011.
[2] P. K. Kitanidis. Unbiased minimum-variance linear state estimation. Automatica, 23(6):775–778, November 1987.
[3] R. Patton, R. Clark, and P. M. Frank. Fault Diagnosis in Dynamic Systems: Theory and Applications. Prentice-Hall International Series in Systems and Control Engineering. Prentice Hall, 1989.
[4] G. De Nicolao, G. Sparacino, and C. Cobelli. Nonparametric input estimation in physiological systems: Problems, methods, and case studies. Automatica, 33(5):851–870, 1997.
[5] M. Darouach and M. Zasadzinski. Unbiased minimum variance estimation for systems with unknown exogenous inputs. Automatica, 33(4):717–719, 1997.
[6] M. Hou and R. J. Patton. Optimal filtering for systems with unknown inputs. IEEE Transactions on Automatic Control, 43(3):445–449, 1998.
[7] M. Darouach, M. Zasadzinski, and M. Boutayeb. Extension of minimum variance estimation for systems with unknown inputs. Automatica, 39(5):867–876, 2003.
[8] Y. Cheng, H. Ye, Y. Wang, and D. Zhou. Unbiased minimum-variance state estimation for linear systems with unknown input. Automatica, 45(2):485–491, 2009.
[9] D. Simon. Optimal State Estimation: Kalman, H∞, and Nonlinear Approaches. John Wiley & Sons, 2006.
[10] Z. Wang, Y. Liu, and X. Liu. H∞ filtering for uncertain stochastic time-delay systems with sector-bounded nonlinearities. Automatica, 44(5):1268–1277, 2008.
[11] Z. Wang, H. Dong, B. Shen, and H. Gao. Finite-horizon H∞ filtering with missing measurements and quantization effects. IEEE Transactions on Automatic Control, 58(7):1707–1718, July 2013.
[12] X. Li and H. Gao. Robust finite frequency filtering for uncertain 2-D systems: The FM model case. Automatica, 49(8):2446–2452, 2013.
[13] H. J. Palanthandalam-Madapusi and D. S. Bernstein. Unbiased minimum-variance filtering for input reconstruction. In American Control Conference, pages 5712–5717, 2007.
[14] C. Hsieh. Robust two-stage Kalman filters for systems with unknown inputs. IEEE Transactions on Automatic Control, 45(12):2374–2378, December 2000.
[15] S. Gillijns and B. De Moor. Unbiased minimum-variance input and state estimation for linear discrete-time systems. Automatica, 43(1):111–116, January 2007.
[16] S. Gillijns and B. De Moor. Unbiased minimum-variance input and state estimation for linear discrete-time systems with direct feedthrough. Automatica, 43(5):934–937, March 2007.
[17] H. Fang, Y. Shi, and J. Yi. On stable simultaneous input and state estimation for discrete-time linear systems. International Journal of Adaptive Control and Signal Processing, 25(8):671–686, 2011.
[18] S. Z. Yong, M. Zhu, and E. Frazzoli. Simultaneous input and state estimation for linear discrete-time stochastic systems with direct feedthrough. In IEEE Conference on Decision and Control, pages 7034–7039, December 2013.
[19] C. Hsieh. Extension of unbiased minimum-variance input and state estimation for systems with unknown inputs. Automatica, 45(9):2149–2153, 2009.
[20] H. Fang, Y. Shi, and J. Yi. On stable simultaneous input and state estimation for discrete-time linear systems. International Journal of Adaptive Control and Signal Processing, 25(8):671–686, 2011.
[21] M. L. J. Hautus. Strong detectability and observers. Linear Algebra and its Applications, 50:353–368, 1983.
[22] N. Suda and E. Mutsuyoshi. Invariant zeros and input-output structure of linear, time-invariant systems. International Journal of Control, 28(4):525–535, 1978.
[23] W. S. Kerwin and J. L. Prince. On the optimality of recursive unbiased state estimation with unknown inputs. Automatica, 36(9):1381–1383, 2000.
[24] L. M. Silverman. Discrete Riccati equations: Alternative algorithms, asymptotic properties, and system theory interpretations. In Control and Dynamic Systems, volume 12. Academic Press, 1976.
[25] A. H. Sayed. Fundamentals of Adaptive Filtering. Wiley, 2003.
[26] B. D. O. Anderson and J. B. Moore. Detectability and stabilizability of time-varying discrete-time linear systems. SIAM Journal on Control and Optimization, 19(1):20–32, 1981.
[27] M. A. Peters and P. A. Iglesias. A spectral test for observability and reachability of time-varying systems. SIAM Journal on Control and Optimization, 37(5):1330–1345, August 1999.
[28] D. S. Bernstein. Matrix Mathematics: Theory, Facts, and Formulas (Second Edition). Princeton University Press, 2009.
[29] N. R. Draper and H. Smith. Applied Regression Analysis (Wiley Series in Probability and Statistics). Wiley-Interscience, third edition, April 1998.
[30] R. E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME–Journal of Basic Engineering, 82(Series D):35–45, 1960.
[31] S. Chan, G. Goodwin, and K. Sin. Convergence properties of the Riccati difference equation in optimal filtering of nonstabilizable systems. IEEE Transactions on Automatic Control, 29(2):110–118, February 1984.
[32] C. de Souza, M. Gevers, and G. Goodwin. Riccati equations in optimal filtering of nonstabilizable systems having singular state transition matrices. IEEE Transactions on Automatic Control, 31(9):831–838, September 1986.
[33] J. Y. Keller, L. Summerer, M. Boutayeb, and M. Darouach. Generalized likelihood ratio approach for fault detection in linear dynamic stochastic systems with unknown inputs. International Journal of Systems Science, 27(12):1231–1241, 1996.
[34] C. Hsieh. On the global optimality of unbiased minimum-variance state estimation for systems with unknown inputs. Automatica, 46(4):708–715, 2010.
[35] R. A. DeCarlo. Linear Systems: A State Variable Approach with Numerical Implementation. Prentice-Hall, Upper Saddle River, NJ, USA, 1989.
[36] C. Van Loan. Computing integrals involving the matrix exponential. IEEE Transactions on Automatic Control, 23(3):395–404, 1978.
