kernels for dynamic textures - purdue universityvishy/talks/dynamic.pdf · 2009. 8. 22. · dynamic...

S.V.N. Vishwanathan: Kernels for Dynamic Textures, Page 1

Kernels for Dynamic Textures

S.V.N. Vishwanathan

SVN.Vishwanathan@nicta.com.au

http://web.anu.edu.au/~vishy

National ICT Australia and Australian National University

Joint work with Alex Smola and René Vidal

Roadmap

Introduction to Kernel Methods

Why kernels?

Kernels on Dynamical Systems

Trajectories, Noise Models

Computation

Dynamical Textures

ARMA Models

Approximate Solutions

Kernel Computation

Experiments

Outlook and Conclusion

Classification

Data:

Pairs of observations $(x_i, y_i)$

Underlying distribution $P(x, y)$

Examples (blood status, cancer), (transactions, fraud)

Task:

Find a function $f(x)$ which predicts $y$ given $x$

The function $f(x)$ must generalize well

Optimal Separating Hyperplane

Minimize $\frac{1}{2}\|w\|^2$ subject to $y_i(\langle w, x_i \rangle + b) \ge 1$ for all $i$

Kernels and Nonlinearity

Problem: Linear functions are often too simple to provide good estimators

Idea 1: Map to a higher dimensional feature space via $\Phi : x \to \Phi(x)$ and solve the problem there. Replace every $\langle x, x' \rangle$ by $\langle \Phi(x), \Phi(x') \rangle$

Idea 2: Instead of computing $\Phi(x)$ explicitly use a kernel function $k(x, x') := \langle \Phi(x), \Phi(x') \rangle$. A large class of functions are admissible as kernels

Non-vectorial data can be handled if we can compute meaningful $k(x, x')$
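Idea 2 can be illustrated with a small sketch. The homogeneous quadratic kernel and its explicit feature map below are illustrative choices that do not appear on the slides; the point is only that evaluating $k(x, x')$ directly gives the same number as mapping both inputs through $\Phi$ and taking the dot product.

```python
import math

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative choice)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(u, v):
    """Plain Euclidean dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, x_prime):
    """Homogeneous quadratic kernel k(x, x') = <x, x'>^2."""
    return dot(x, x_prime) ** 2

x, x_prime = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(x_prime))   # computed in feature space
implicit = poly_kernel(x, x_prime)     # computed in input space only
```

Both values agree, while `poly_kernel` never builds the three-dimensional feature vectors.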

The Basic Idea

Key Observation:

Trajectories are easily observable

Similar trajectories ⇒ similar systems

Restrict attention to interesting cases

Average over noise models

Kernels Using Dynamical Systems:

Simulate system for both inputs

Similar time evolution ⇒ similar inputs

Kernels on Dynamical Systems:

Restrict to interesting initial conditions

Simulate both the systems

Similar time evolution ⇒ similar systems

Notation

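$\mathcal{X}$ - state space (Hilbert space)

$\mathcal{A}$ - time evolution operators

$\mathcal{T}$ - time of measurement

$\mu$ - nice probability measure on $\mathcal{T}$

Discounting Factors:

For some $\lambda > 0$

$$\mu(t) = \lambda^{-1} e^{-\lambda t} \quad \text{for } \mathcal{T} = \mathbb{R}_0^+$$

$$\mu(t) = \frac{e^{-\lambda t}}{1 - e^{-\lambda}} \quad \text{for } \mathcal{T} = \mathbb{N}_0$$

Time Evolution:

We study

$$x_A(t) := A(t)\,x \quad \text{for } A \in \mathcal{A}$$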

Trajectories and Kernels

Comparing Trajectories:

Using the dot product on $\mathcal{X}$ we define a dot product on $\mathcal{X}^{\mathcal{T}}$

$$\langle \theta, \theta' \rangle := \mathbf{E}_\mu\left[\langle \theta(t), \theta'(t) \rangle\right] \quad \text{for } \theta, \theta' \in \mathcal{X}^{\mathcal{T}}$$

Extending to Dynamical Systems:

Identify a dynamical system with its trajectory and define

$$k((x, A), (\tilde{x}, \tilde{A})) := \mathbf{E}_\mu\left[\langle A(t)\,x, \tilde{A}(t)\,\tilde{x} \rangle\right]$$

Other Ideas:

A nicely decaying measure required for convergence

Modify the dot product in $\mathcal{X}$

Covariance matrices?

Rational kernels and transducers

Special Cases

Kernels on Dynamical Systems:

Restrict attention to $x = \tilde{x}$

Compare trajectory for identical initial conditions

Take expectation if interested in a range of $x$

$$k(A, \tilde{A}) := \mathbf{E}_x\left[k((x, A), (x, \tilde{A}))\right]$$

More generally

$$k(A, \tilde{A}) := \mathbf{E}_A \mathbf{E}_{\tilde{A}} \mathbf{E}_x\left[k((x, A), (x, \tilde{A}))\right]$$

Kernels Using Dynamical Systems:

Restrict attention to a particular dynamical system

As before we can take expectations over $A$

$$k(x, \tilde{x}) := \mathbf{E}_x \mathbf{E}_{\tilde{x}} \mathbf{E}_A\left[k((x, A), (\tilde{x}, A))\right]$$

Discrete Linear Systems

Linear Systems:

We assume time propagation occurs as

$$x_A(t + 1) = A\,x_A(t) + a_t + \xi_t$$

In closed form

$$x_A(t) = A^t x_0 + \sum_{i=0}^{t} \left( A^{t-i} \xi_i + A^{t-i} a_i \right)$$

To avoid messy math assume $a_t = 0$ and hence

$$x_A(t) = A^t x_0 + \sum_{i=0}^{t} A^{t-i} \xi_i$$

Contribution to kernel due to $A$ as well as noise

Continuous Linear Systems

Linear Systems:

System dynamics here are described by

$$\frac{d}{dt} x_A(t) = A\,x_A(t) + a(t) + \xi(t)$$

Here $\xi(t)$ with $\mathbf{E}[\xi(t)] = 0$ is a stochastic process and

$$x_A(t) = \exp(A t)\,x_0 + \int_0^t \exp(A(t - \tau))\,(a(\tau) + \xi(\tau))\,d\tau$$

As before we assume $a(t) = 0$

We even assume $\xi(\tau) = 0$ (avoids messy math again!)

$$x_A(t) = \exp(A t)\,x_0$$

Kernel contribution only due to $A$

Convergence Criterion

Discrete Case:

Let $A$ and $B$ and $W$ be linear operators

The matrix norms obey $0 \le \|A\|, \|B\| \le \Lambda$

For suitable $\lambda$ with $e^{\lambda} > \Lambda^2$ and $W \succeq 0$

$$M := \sum_{t=0}^{\infty} e^{-\lambda t} A^t W B^t$$

Sylvester equation $e^{-\lambda} A M B + W = M$

Continuous Case:

We define

$$M := \int_0^{\infty} e^{-\lambda t} \exp(A t)^\top W \exp(B t)\,dt$$

Sylvester equation $\left(A^\top + \frac{\lambda}{2}\mathbf{1}\right) M + M \left(B + \frac{\lambda}{2}\mathbf{1}\right) = -W$
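The discrete-case statement can be checked numerically. The sketch below is not taken from the talk: the function names, the test matrices and the Kronecker-product solution method are assumptions made for illustration. It solves $e^{-\lambda} A M B + W = M$ by vectorisation and compares the result with a truncated version of the defining series.

```python
import numpy as np

def solve_discounted_sylvester(A, B, W, lam):
    """Solve exp(-lam) * A @ M @ B + W = M for M via vectorisation."""
    n, m = W.shape
    # With column-major vec: vec(A M B) = (B^T kron A) vec(M).
    K = np.eye(n * m) - np.exp(-lam) * np.kron(B.T, A)
    vec_M = np.linalg.solve(K, W.reshape(-1, order="F"))
    return vec_M.reshape((n, m), order="F")

def truncated_series(A, B, W, lam, steps=200):
    """Approximate sum_t exp(-lam t) A^t W B^t by its first `steps` terms."""
    M = np.zeros_like(W, dtype=float)
    At, Bt = np.eye(A.shape[0]), np.eye(B.shape[0])
    for t in range(steps):
        M += np.exp(-lam * t) * At @ W @ Bt
        At, Bt = At @ A, Bt @ B
    return M

# Small illustrative operators with norms below 1, so the series converges.
A = np.array([[0.5, 0.2], [0.0, 0.3]])
B = np.array([[0.4, 0.0], [0.1, 0.6]])
W = np.eye(2)
lam = 0.5

M_solved = solve_discounted_sylvester(A, B, W, lam)
M_series = truncated_series(A, B, W, lam)
```

The vectorised solve is the simplest thing to write down but costs far more than cubic time in the state dimension; it is used here only because it needs nothing beyond NumPy.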

Gory Details

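Contribution due to $A$:

$$p \sum_{t=0}^{\infty} e^{-\lambda t} \langle A^t x, \tilde{A}^t \tilde{x} \rangle := p \cdot x^\top \left[ \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^\top W \tilde{A}^t \right] \tilde{x} = p \cdot x^\top M \tilde{x}$$

Contribution due to noise:

$$p \sum_{t=0}^{\infty} \sum_{j,j'=0}^{t} e^{-\lambda t} \langle A^{t-j} \xi_j, \tilde{A}^{t-j'} \xi_{j'} \rangle = p \operatorname{tr}\left( C_\xi \left[ \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^\top M \tilde{A}^t \right] \right) := p \operatorname{tr}(C_\xi \widetilde{M})$$

In the above equations $p$ is a normalizing term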

Delving Deeper

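More on $M$ and $\widetilde{M}$:

The matrices $M$ and $\widetilde{M}$ look like

$$M := \left[ \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^\top W \tilde{A}^t \right] \quad \text{and} \quad \widetilde{M} := \left[ \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^\top M \tilde{A}^t \right]$$

Sylvester Equation:

Both $M$ and $\widetilde{M}$ satisfy the Sylvester equation

$$e^{-\lambda} A^\top M \tilde{A} + W = M \quad \text{and} \quad e^{-\lambda} A^\top \widetilde{M} \tilde{A} + M = \widetilde{M}$$

Can be solved for in cubic time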
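The step from the series to the Sylvester equation is short enough to spell out. The derivation below is a sketch that assumes the series converges and writes $\tilde{A}$ for the second system's matrix; it splits off the $t = 0$ term and re-indexes the rest with $s = t - 1$.

```latex
\begin{aligned}
M &= \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^\top W \tilde{A}^t \\
  &= W + \sum_{t=1}^{\infty} e^{-\lambda t} A^\top (A^{t-1})^\top W \tilde{A}^{t-1} \tilde{A} \\
  &= W + e^{-\lambda} A^\top \left[ \sum_{s=0}^{\infty} e^{-\lambda s} (A^s)^\top W \tilde{A}^s \right] \tilde{A} \\
  &= W + e^{-\lambda} A^\top M \tilde{A}
\end{aligned}
```

Replacing $W$ by $M$ in the same argument gives the second equation, $e^{-\lambda} A^\top \widetilde{M} \tilde{A} + M = \widetilde{M}$.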

Discrete Kernel

Discrete Case:

Putting it all together

$$k((A, x), (\tilde{A}, \tilde{x})) = p \left[ x^\top M \tilde{x} + \operatorname{tr}(C_\xi \widetilde{M}) \right]$$

Note that $C_\xi$ is the covariance matrix of $\xi_t$

Can assume different noise models per time step

Initial Conditions:

$C$ be the covariance matrix of the initial conditions

If we set $x = \tilde{x}$ then

$$k((A, x), (\tilde{A}, x)) = p \left[ \operatorname{tr}(C M) + \operatorname{tr}(C_\xi \widetilde{M}) \right]$$

Continuous Kernel

Contribution due to $A$:

Since we assumed $a(t) = \xi(t) = 0$ we get

$$k((x, A), (\tilde{x}, \tilde{A})) = \lambda^{-1} \int_0^{\infty} e^{-\lambda t} \langle \exp(A t)\,x, \exp(\tilde{A} t)\,\tilde{x} \rangle\,dt$$

The Final Form:

The kernel can be expressed as

$$k((x, A), (\tilde{x}, \tilde{A})) = \lambda^{-1} x^\top M \tilde{x}$$

where

$$\left(A^\top + \frac{\lambda}{2}\mathbf{1}\right) M + M \left(\tilde{A} + \frac{\lambda}{2}\mathbf{1}\right) = -W$$

Solution in cubic time by solving Sylvester equation

Special Cases

Snapshot:

If we consider only the snapshot at time instance $T$

$$k((x, A), (\tilde{x}, \tilde{A})) = \lambda^{-1} x \exp(A t)\,W \exp(\tilde{A} t)^\top \tilde{x}^\top$$

Initial Conditions:

Fix $A = \tilde{A}$

Now we just solve

$$M = -\frac{1}{2} \left(A + \frac{\lambda}{2}\mathbf{1}\right)^{-1} W$$

Dynamical Systems:

Fix $x = \tilde{x}$ to get $k(A, \tilde{A}) = \lambda^{-1} \operatorname{tr}(M C)$

Here $C$ is the covariance matrix of initial conditions

Graph Kernels

Graph Laplacian:

Let $E$ be the adjacency matrix and $D := \operatorname{diag}(E\,\mathbf{1})$

$L := E - D$ and $\tilde{L} := D^{-\frac{1}{2}} L D^{-\frac{1}{2}}$

Diffusion Process:

We can define a diffusion process by

$$\frac{d}{dt} x(t) = L\,x(t)$$

Diffusion Kernel (Kondor and Lafferty, 2002):

If we measure overlap at time instance $T$ we get

$$K = \exp(L T)^\top \exp(L T)$$

$K_{ij}$ is the probability that a state $l$ is reached from both $i$ and $j$

Graph Kernels

Undirected Graphs (Kondor and Lafferty, 2002):

Here $L$ is symmetric and hence yields

$$K = \exp(2 L T)$$

Labeled Graphs (Gärtner, 2002):

If $W$ acts as an indicator for node labels

Say $W_{ij} = 1$ if two nodes have same label

For other fancy weights see (Kashima et al, 2003)

Averaged Graph Laplacian:

If we average over a range of $T$ values

$$K = \frac{1}{2} \left( L + \frac{\lambda}{2}\mathbf{1} \right)^{-1}$$
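The undirected-graph identity $K = \exp(LT)^\top \exp(LT) = \exp(2LT)$ can be verified on a toy graph. The 4-node path graph, the value of $T$ and the helper `expm_symmetric` below are illustrative assumptions, not part of the talk; the eigendecomposition route only works because $L = E - D$ is symmetric here.

```python
import numpy as np

def expm_symmetric(S):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(S)
    return (eigvecs * np.exp(eigvals)) @ eigvecs.T

# Adjacency matrix of a 4-node path graph (illustrative example).
E = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(E @ np.ones(4))   # degree matrix D = diag(E 1)
L = E - D                     # sign convention used on the slides
T = 0.7                       # measurement time (arbitrary choice)

P = expm_symmetric(L * T)               # solution operator of dx/dt = L x
K_overlap = P.T @ P                     # overlap measured at time T
K_closed = expm_symmetric(2.0 * L * T)  # closed form for symmetric L
```

Because $L\,\mathbf{1} = 0$, every row of `P` sums to one, which matches the reading of the diffusion as a process that conserves total mass.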

ARMA Models

ARMA Model:

An auto-regressive moving average model is

$$x(t + 1) = A\,x(t) + B\,v(t)$$

$$y(t) = \phi(x(t)) + w(t)$$

$x(t)$ is a hidden variable

$v(t)$ and $w(t)$ are IID random noise

Linear Gaussian Model:

If $\phi$ is linear and the noise is white Gaussian:

$$x(t + 1) = A\,x(t) + v(t), \quad v(t) \sim \mathcal{N}(0, Q)$$

$$y(t) = C\,x(t) + w(t), \quad w(t) \sim \mathcal{N}(0, R)$$

Fix scaling by demanding that $C^\top C = \mathbf{1}$

Dynamic Textures

Image Model:

$y(t) \in \mathbb{R}^m$ are the observed noisy images

$x(t) \in \mathbb{R}^n$ ($n < m$) are hidden variables

Modeling:

A sequence of images $\{y(1), \ldots, y(\tau)\}$ is observed

Ideally we want to solve

$$A(\tau), C(\tau), Q(\tau), R(\tau) = \arg\max_{A, C, Q, R}\; p(y(1), \ldots, y(\tau))$$

Exact Solution:

n4sid in MATLAB solves above problem

Does not scale well if $m$ is large

Impractical for images where $m \sim 10^5$

Approximate Solution

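Problem To Solve:

For any variable $z(t)$ define $Z_i^\tau := [z(i), \ldots, z(\tau)]$

We are solving

$$Y_1^\tau = C X_1^\tau + W_1^\tau \quad \text{with } C^\top C = \mathbf{1}$$

Solving By SVD:

Solving for $\arg\min_{C, X_1^\tau} \|W\|$ yields

$$C(\tau) = U \quad \text{and} \quad X(\tau) = \Sigma V^\top \quad \text{where } Y_1^\tau = U \Sigma V^\top$$

Solving for $\arg\min_A \|X_2^\tau - A X_1^{\tau-1}\|$ yields

$$A(\tau) = \Sigma V^\top D_1 V (V^\top D_2 V)^{-1} \Sigma^{-1}$$

Here

$$D_1 = \begin{bmatrix} 0 & 0 \\ \mathbf{1}_{(\tau-1)} & 0 \end{bmatrix} \quad \text{and} \quad D_2 = \begin{bmatrix} \mathbf{1}_{(\tau-1)} & 0 \\ 0 & 0 \end{bmatrix}$$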
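The SVD-based estimate can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the function name, the synthetic noise-free data and the use of a pseudo-inverse for the least-squares step are choices made here, standing in for the closed-form expression with $D_1$ and $D_2$ rather than reproducing it.

```python
import numpy as np

def fit_dynamic_texture(Y, n):
    """Estimate C, X, A from an m x tau data matrix Y using a rank-n SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n]                     # orthonormal columns, so C^T C = 1
    X = np.diag(s[:n]) @ Vt[:n, :]   # hidden states, one column per frame
    # Least-squares fit of X[:, 1:] ~ A @ X[:, :-1] (shifted state sequences).
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return C, X, A

# Synthetic noise-free sequence generated by a known 2-state system.
A_true = np.array([[0.9, 0.1], [0.0, 0.5]])
C_true, _ = np.linalg.qr(np.arange(10.0).reshape(5, 2) + np.eye(5, 2))
x = np.array([1.0, 1.0])
frames = []
for _ in range(20):
    frames.append(C_true @ x)   # observed frame y(t) = C x(t)
    x = A_true @ x              # state update x(t+1) = A x(t)
Y = np.stack(frames, axis=1)    # shape (m, tau) = (5, 20)

C_hat, X_hat, A_hat = fit_dynamic_texture(Y, n=2)
```

The recovered `A_hat` equals `A_true` only up to a change of basis in the hidden state, so the meaningful comparison is between eigenvalues rather than matrix entries.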

Dynamic Texture Kernel

Kernel Definition:

Estimate model and compute kernels between models

If we average out the noise then for some $W \succeq 0$

$$k((x_0, A, C), (x'_0, A', C')) := \mathbf{E}_{v,w}\left[ \sum_{t=1}^{\infty} e^{-\lambda t}\, y_t^\top W y'_t \right]$$

Kernel Computation:

The kernel can be computed as

$$k = x_0^\top M x'_0 + \left(e^{\lambda} - 1\right)^{-1} \operatorname{tr}\left[ Q \widetilde{M} + W R \right]$$

The matrices $M$ and $\widetilde{M}$ satisfy

$$M = e^{-\lambda} A^\top C^\top W C' A' + e^{-\lambda} A^\top M A'$$

$$\widetilde{M} = C^\top W C' + e^{-\lambda} A^\top \widetilde{M} A'$$
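A sketch of the kernel computation follows. All names, the test systems and the Kronecker-product solver are assumptions for illustration. The noise term is implemented exactly as the formula states it and is not independently verified here; only the noise-free part $x_0^\top M x'_0$ is checked, against a brute-force sum over simulated outputs $y_t = C A^t x_0$.

```python
import numpy as np

def solve_stein(A, A_p, S, lam):
    """Solve M = S + exp(-lam) * A.T @ M @ A_p by vectorisation."""
    n, m = S.shape
    # Column-major vec: vec(A^T M A') = (A'^T kron A^T) vec(M).
    K = np.eye(n * m) - np.exp(-lam) * np.kron(A_p.T, A.T)
    vec_M = np.linalg.solve(K, S.reshape(-1, order="F"))
    return vec_M.reshape((n, m), order="F")

def dynamic_texture_kernel(x0, A, C, x0_p, A_p, C_p, W, Q, R, lam):
    """Evaluate k = x0^T M x0' + (e^lam - 1)^-1 (tr(Q M~) + tr(W R))."""
    G = C.T @ W @ C_p
    M_tilde = solve_stein(A, A_p, G, lam)
    M = solve_stein(A, A_p, np.exp(-lam) * A.T @ G @ A_p, lam)
    noise = (np.trace(Q @ M_tilde) + np.trace(W @ R)) / (np.exp(lam) - 1.0)
    return x0 @ M @ x0_p + noise

def brute_force_noise_free(x0, A, C, x0_p, A_p, C_p, W, lam, steps=400):
    """Truncated sum over t >= 1 of exp(-lam t) y_t^T W y'_t without noise."""
    total, x, x_p = 0.0, x0.copy(), x0_p.copy()
    for t in range(1, steps + 1):
        x, x_p = A @ x, A_p @ x_p
        total += np.exp(-lam * t) * (C @ x) @ W @ (C_p @ x_p)
    return total

# Two small illustrative systems with 2 hidden states and 3 observed pixels.
A = np.array([[0.8, 0.1], [0.0, 0.6]])
A_p = np.array([[0.7, 0.0], [0.2, 0.5]])
C, _ = np.linalg.qr(np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]))
C_p, _ = np.linalg.qr(np.array([[2.0, 1.0], [0.0, 1.0], [1.0, 0.0]]))
W, lam = np.eye(3), 0.1
x0, x0_p = np.array([1.0, -0.5]), np.array([0.3, 1.0])

k_closed = dynamic_texture_kernel(x0, A, C, x0_p, A_p, C_p, W,
                                  np.zeros((2, 2)), np.zeros((3, 3)), lam)
k_brute = brute_force_noise_free(x0, A, C, x0_p, A_p, C_p, W, lam)
```

Note that the trace is taken term by term, since $Q \widetilde{M}$ lives in the hidden-state dimension and $W R$ in the image dimension.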

Experimental Setup

Typical Textures:

Some sample textures

A long clip was cut to shorter clips of 120 frames each

Freak Textures:

We also collected some freak textures

Results

Kernel Induced Metric:

Clips closer on an axis are from the same master clip

We plot the kernel induced metric for $\lambda = 0.9$ and $0.1$

Results fairly independent of the choice of $\lambda$

Notice the block diagonal structure of the metric matrix
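The slides do not spell out how the metric is obtained from the kernel. A common choice, assumed here, is the distance in feature space, $d(i, j) = \sqrt{k_{ii} + k_{jj} - 2 k_{ij}}$. The sketch below applies it to a toy Gram matrix built from made-up feature vectors, not to the texture data.

```python
import numpy as np

def kernel_induced_metric(K):
    """Pairwise feature-space distances from a Gram matrix K."""
    diag = np.diag(K)
    squared = diag[:, None] + diag[None, :] - 2.0 * K
    # Clip tiny negative values caused by round-off before the square root.
    return np.sqrt(np.maximum(squared, 0.0))

# Toy data: two groups of two nearby points each (hypothetical values).
features = np.array([[1.0, 0.0],
                     [0.9, 0.1],
                     [0.0, 1.0],
                     [0.1, 0.9]])
K = features @ features.T          # linear-kernel Gram matrix
Dist = kernel_induced_metric(K)
```

With items ordered by group, small distances cluster in blocks along the diagonal of `Dist`, which is the kind of structure the slide points to.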

Conclusion

A new method to embed dynamical systems

Analytical solutions for linear systems

Many graph kernels are special cases

Analytical solutions require cubic time

Are better solutions possible for special cases?

Extensions to nonlinear systems?

Application to dynamical textures

Works with approximate model parameters

Picks out clips from the same master clip

Close relations to rational kernels of Cortes et al.

More information at http://mlg.anu.edu.au/~vishy

Questions?
