
Page 1: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Computational Motor Control Summer School 04: Optimal estimation in noisy world.

Hirokazu Tanaka

School of Information Science

Japan Advanced Institute of Science and Technology

Page 2: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

In this lecture, we will learn:

• Maximum-likelihood (ML) estimation

• Cramer-Rao lower bound

• Bayesian estimation

• Wiener filter

• Kalman filter

• Kalman smoother

• Causal inference

• Information integration

• Postdiction

Page 3: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Classical estimation (point) and Bayesian estimation (density).

[Figure: classical estimation maps an observation $x$ in the sample space $\mathcal{X}$ to a point estimate $\hat{\theta}(x)$ in the parameter space $\Theta$; Bayesian estimation maps it to a posterior density $P(\theta \mid x)$ over $\Theta$.]

Page 4: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Maximum-likelihood (ML) estimation.

The ML estimator maps an observation $x$ in the sample space $\mathcal{X}$ to the parameter value in $\Theta$ that maximizes the likelihood:

$$\hat{\theta}_{\mathrm{ML}}(x) = \arg\max_{\theta} P(x \mid \theta) = \arg\max_{\theta} \log P(x \mid \theta)$$

The ML estimator is
• Asymptotically unbiased (i.e., it approaches the true parameter value).
• Asymptotically efficient (i.e., it achieves the minimum variance given by the Cramer-Rao lower bound, CRLB).
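As a concrete illustration of the arg-max definition above, here is a minimal NumPy/SciPy sketch (not from the slides) that estimates the mean and standard deviation of a Gaussian sample by numerically maximizing the log-likelihood and compares the result with the closed-form ML estimates; all data and parameter values are invented.

```python
# Minimal sketch: ML estimation of a Gaussian mean and standard deviation by
# numerically maximizing the log-likelihood, compared with the closed-form
# ML estimates (sample mean, 1/N sample standard deviation).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=500)    # observed sample (synthetic)

def neg_log_likelihood(params, data):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)                    # keep sigma positive
    return 0.5 * np.sum(((data - mu) / sigma) ** 2) + data.size * log_sigma

res = minimize(neg_log_likelihood, x0=[0.0, 0.0], args=(x,))
mu_ml, sigma_ml = res.x[0], np.exp(res.x[1])

print("numerical ML :", mu_ml, sigma_ml)
print("closed form  :", x.mean(), x.std())       # np.std uses the ML (1/N) variance
```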

Page 5: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)


Maximum-likelihood (ML) estimation.

[Figure: different samples $x_1, x_2, x_3$ yield different estimates $\hat{\theta}(x_1), \hat{\theta}(x_2), \hat{\theta}(x_3)$; the ML estimator is itself a random variable distributed around the true parameter.]

Page 6: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kullback-Leibler divergence: a distance between two density functions.

KL divergence for a discrete variable:
$$D_{\mathrm{KL}}(p \,\|\, q) = \sum_i p_i \log\frac{p_i}{q_i} \ge 0$$

KL divergence for a continuous variable:
$$D_{\mathrm{KL}}(p \,\|\, q) = \int p(x) \log\frac{p(x)}{q(x)}\,dx \ge 0$$
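A minimal NumPy sketch of the discrete-variable definition; the distributions p and q are made-up examples.

```python
# Minimal sketch: KL divergence between two discrete distributions p and q,
# D_KL(p || q) = sum_i p_i log(p_i / q_i).
import numpy as np

def kl_divergence(p, q):
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                       # terms with p_i = 0 contribute nothing
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = np.array([0.6, 0.3, 0.1])
q = np.array([0.4, 0.4, 0.2])
print(kl_divergence(p, q))             # >= 0
print(kl_divergence(q, p))             # not symmetric in general
print(kl_divergence(p, p))             # 0.0 only when p == q
```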

Page 7: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

ML as a minimization of KL divergence.

KL divergence between the true density $p(x)$ and the parametrized density $q(x\mid\theta)$:
$$D_{\mathrm{KL}}\!\left(p(x)\,\|\,q(x\mid\theta)\right) = \int p(x)\log\frac{p(x)}{q(x\mid\theta)}\,dx = \mathrm{E}_p\!\left[\log p(x)\right] - \mathrm{E}_p\!\left[\log q(x\mid\theta)\right]$$

Minimization of the KL divergence $D_{\mathrm{KL}}(p(x)\,\|\,q(x\mid\theta))$ over $\theta$ is therefore equivalent to maximization of $\mathrm{E}_p[\log q(x\mid\theta)]$.

Sampling approximation:
$$\mathrm{E}_p\!\left[\log q(x\mid\theta)\right] \approx \frac{1}{N}\sum_{i=1}^{N}\log q(x_i\mid\theta),$$
which is $1/N$ times the log-likelihood, so maximizing the likelihood minimizes the sampled KL divergence.

Page 8: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

ML as a minimization of KL divergence.

Expanding the sampling approximation around the true parameter $\theta_0$ (second-order Taylor expansion):
$$\frac{1}{N}\sum_{i=1}^{N}\log q(x_i\mid\theta) \approx \mathrm{E}\!\left[\log q(x\mid\theta_0)\right] - \frac{1}{2}\,(\theta-\theta_0)^{\mathrm T}\, I(\theta_0)\,(\theta-\theta_0)$$

Fisher information:
$$I(\theta) = -\,\mathrm{E}\!\left[\frac{\partial^2 \log q(x\mid\theta)}{\partial\theta\,\partial\theta^{\mathrm T}}\right] = \mathrm{E}\!\left[\frac{\partial \log q(x\mid\theta)}{\partial\theta}\,\frac{\partial \log q(x\mid\theta)}{\partial\theta^{\mathrm T}}\right]$$

Maximizing the sampled log-likelihood therefore gives, asymptotically,
$$\hat{\theta} \sim \mathcal{N}\!\left(\theta_0,\ \frac{1}{N}\,I(\theta_0)^{-1}\right).$$

The ML estimator is
• Asymptotically unbiased (i.e., it approaches the true parameter value).
• Asymptotically efficient (i.e., it achieves the minimum variance given by the CRLB).

In the limit of large samples ($N \to \infty$), the ML estimator is unbiased and efficient.

Page 9: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Cramer-Rao lower bound: scalar parameter case.

Suppose a random variable $X \sim p(x;\theta)$, where $\theta$ is fixed but unknown. Assume that $p(x;\theta)$ satisfies the regularity condition
$$\mathrm{E}\!\left[\frac{\partial \log p(x;\theta)}{\partial\theta}\right] = 0,$$
where the expectation is taken with respect to $p(x;\theta)$. Then the variance of any unbiased estimator $\hat\theta$ satisfies
$$\mathrm{Var}(\hat\theta) \ge \frac{1}{I(\theta)},$$
where the Fisher information is
$$I(\theta) = -\,\mathrm{E}\!\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta^2}\right] = \mathrm{E}\!\left[\left(\frac{\partial \log p(x;\theta)}{\partial\theta}\right)^{2}\right].$$
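A minimal Monte Carlo sketch (assuming a Gaussian with known variance, so the Fisher information per sample is $1/\sigma^2$) checking that the variance of the sample-mean estimator approaches the bound $\sigma^2/N$; all numbers are illustrative.

```python
# Minimal sketch: Monte Carlo check of the Cramer-Rao lower bound for the mean
# of a Gaussian with known variance.  Fisher information per sample is 1/sigma^2,
# so for N i.i.d. samples the variance of any unbiased estimator is >= sigma^2/N.
import numpy as np

rng = np.random.default_rng(1)
theta, sigma, N, trials = 3.0, 2.0, 50, 20000

# ML estimator of the mean = sample mean, computed over many repeated experiments.
estimates = rng.normal(theta, sigma, size=(trials, N)).mean(axis=1)

print("empirical variance of ML estimator:", estimates.var())
print("Cramer-Rao lower bound (sigma^2/N):", sigma**2 / N)
```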

Page 10: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Cramer-Rao lower bound: vector case.

Suppose a random variable $X \sim p(x\mid\boldsymbol\theta)$, where the vector parameter $\boldsymbol\theta$ is fixed but unknown. Assume that $p(x\mid\boldsymbol\theta)$ satisfies the regularity condition
$$\mathrm{E}\!\left[\frac{\partial \log p(x\mid\boldsymbol\theta)}{\partial\boldsymbol\theta}\right] = \mathbf{0},$$
where the expectation is taken with respect to $p(x\mid\boldsymbol\theta)$. Then the covariance of any unbiased estimator $\hat{\boldsymbol\theta}$ satisfies
$$\mathrm{Cov}(\hat{\boldsymbol\theta}) \succeq \mathbf{I}(\boldsymbol\theta)^{-1},$$
where the Fisher information matrix is
$$\left[\mathbf{I}(\boldsymbol\theta)\right]_{ij} = -\,\mathrm{E}\!\left[\frac{\partial^2 \log p(x\mid\boldsymbol\theta)}{\partial\theta_i\,\partial\theta_j}\right] = \mathrm{E}\!\left[\frac{\partial \log p(x\mid\boldsymbol\theta)}{\partial\theta_i}\,\frac{\partial \log p(x\mid\boldsymbol\theta)}{\partial\theta_j}\right].$$

Page 11: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Width estimation: optimal integration of vision and haptics.

ML estimation of object width from vision and haptics ($x_{\mathrm V}$: visually perceived width, $x_{\mathrm H}$: haptically perceived width, $\sigma_{\mathrm V}$: visual uncertainty, $\sigma_{\mathrm H}$: haptic uncertainty):
$$p(x_{\mathrm V}, x_{\mathrm H}\mid x) = p(x_{\mathrm V}\mid x)\,p(x_{\mathrm H}\mid x) \propto \exp\!\left(-\frac{(x_{\mathrm V}-x)^2}{2\sigma_{\mathrm V}^2}\right)\exp\!\left(-\frac{(x_{\mathrm H}-x)^2}{2\sigma_{\mathrm H}^2}\right)$$

Maximizing this likelihood with respect to $x$ gives the reliability-weighted average
$$\hat{x} = \frac{\sigma_{\mathrm H}^2}{\sigma_{\mathrm V}^2+\sigma_{\mathrm H}^2}\,x_{\mathrm V} + \frac{\sigma_{\mathrm V}^2}{\sigma_{\mathrm V}^2+\sigma_{\mathrm H}^2}\,x_{\mathrm H}, \qquad \frac{1}{\sigma^2} = \frac{1}{\sigma_{\mathrm V}^2}+\frac{1}{\sigma_{\mathrm H}^2}.$$

Ernst & Banks (2002) Nature
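A minimal NumPy sketch of this fusion rule; the widths and uncertainties are invented values, not data from Ernst & Banks (2002).

```python
# Minimal sketch: reliability-weighted fusion of a visual and a haptic width
# estimate, following the ML formula above.
import numpy as np

def fuse(x_v, sigma_v, x_h, sigma_h):
    w_v = sigma_h**2 / (sigma_v**2 + sigma_h**2)    # weight on vision
    w_h = sigma_v**2 / (sigma_v**2 + sigma_h**2)    # weight on haptics
    x_hat = w_v * x_v + w_h * x_h
    sigma_hat = np.sqrt(1.0 / (1.0 / sigma_v**2 + 1.0 / sigma_h**2))
    return x_hat, sigma_hat

# Reliable vision dominates; adding visual noise shifts the weight toward haptics.
print(fuse(x_v=55.0, sigma_v=1.0, x_h=50.0, sigma_h=3.0))
print(fuse(x_v=55.0, sigma_v=5.0, x_h=50.0, sigma_h=3.0))
```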

Page 12: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Width estimation: optimal integration of vision and haptics.

Ernst & Banks (2002) Nature

Density functions (sensory uncertainty)

Distribution functions (human response)

By examining human forced-choice responses, the sensory density functions can be estimated.

Page 13: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Width estimation: optimal integration of vision and haptics.

Ernst & Banks (2002) Nature

mean (xV) and variance (σV2) of vision

mean (xH) and variance (σH2) of haptics

mean (x̂) and variance (σ2) of multimodal integration

Human performance in visual-haptic discrimination (RIGHT) is predicted by maximum-likelihood integration of those in visual and haptic discrimination measured separately (LEFT).

Page 14: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Width estimation: optimal integration of vision and haptics.

Ernst & Banks (2002) Nature

$$\hat{x} = \frac{\sigma_{\mathrm H}^2}{\sigma_{\mathrm V}^2+\sigma_{\mathrm H}^2}\,x_{\mathrm V} + \frac{\sigma_{\mathrm V}^2}{\sigma_{\mathrm V}^2+\sigma_{\mathrm H}^2}\,x_{\mathrm H}, \qquad \frac{1}{\sigma^2} = \frac{1}{\sigma_{\mathrm V}^2}+\frac{1}{\sigma_{\mathrm H}^2}$$

Page 15: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Bayes’ theorem: mapping observation to probability density.

Bayes' theorem maps an observation $x$ in the sample space $\mathcal{X}$ to a probability density over the parameter space $\Theta$:
$$P(\theta\mid x) = \frac{P(x\mid\theta)\,P(\theta)}{P(x)}$$

• $P(\theta\mid x)$: posterior probability
• $P(x\mid\theta)$: likelihood
• $P(\theta)$: prior probability
• $P(x)$: marginal probability

Page 16: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Bayesian estimation: Maximum A Posteriori (MAP).

$$\hat{\theta}_{\mathrm{MAP}}(x) = \arg\max_{\theta} P(\theta\mid x) = \arg\max_{\theta} \frac{P(x\mid\theta)\,P(\theta)}{P(x)} = \arg\max_{\theta} P(x\mid\theta)\,P(\theta)$$

Page 17: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Bayesian estimation: Mean minimum-squared error estimator.

The minimum mean-squared error (MMSE) estimator minimizes the expected squared error under the posterior:
$$\hat{\theta}_{\mathrm{MMSE}}(x) = \arg\min_{\hat\theta}\,\mathrm{E}\!\left[(\hat\theta-\theta)^2\right] = \arg\min_{\hat\theta}\int (\hat\theta-\theta)^2\, p(\theta\mid x)\,d\theta$$

The solution is the posterior mean:
$$\hat{\theta}_{\mathrm{MMSE}}(x) = \int \theta\, p(\theta\mid x)\,d\theta = \mathrm{E}[\theta\mid x]$$

[Figure: posterior density with the MAP estimate (mode) and the MMSE estimate (mean) marked.]
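A minimal sketch contrasting the two Bayesian estimators on a gridded posterior; the gamma-shaped posterior is an arbitrary stand-in chosen so that the mode (MAP) and mean (MMSE) visibly differ.

```python
# Minimal sketch: MAP (posterior mode) and MMSE (posterior mean) estimates
# computed on a grid for a skewed posterior density.
import numpy as np
from scipy.stats import gamma

theta = np.linspace(1e-3, 20.0, 2001)
posterior = gamma.pdf(theta, a=2.0, scale=2.0)    # stand-in posterior p(theta | x)
posterior /= np.trapz(posterior, theta)           # normalize on the grid

theta_map = theta[np.argmax(posterior)]           # arg max of the posterior
theta_mmse = np.trapz(theta * posterior, theta)   # posterior mean

print("MAP :", theta_map)     # mode of the skewed posterior
print("MMSE:", theta_mmse)    # mean, pulled toward the long tail
```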

Page 18: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Motion illusion as optimal percepts.

http://www.cs.huji.ac.il/~yweiss/Rhombus/rhombus.html

Page 19: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Motion illusion as optimal percepts.

Weiss et al. (2002) Nature Neurosci

Lucas-Kanade image model: a local image patch translates with velocity $(v_x, v_y)$ between frames,
$$I(x + v_x\Delta t,\ y + v_y\Delta t,\ t + \Delta t) = I(x, y, t).$$

Likelihood: probability of the observed image given a velocity $(v_x, v_y)$, with $I_x, I_y, I_t$ the spatial and temporal image derivatives at pixel $i$,
$$P(I \mid v_x, v_y) \propto \exp\!\left(-\frac{1}{2\sigma^2}\sum_i\left[I_x(x_i, y_i, t)\,v_x + I_y(x_i, y_i, t)\,v_y + I_t(x_i, y_i, t)\right]^2\right).$$

Prior density: most objects have small velocity (are not moving),
$$P(v_x, v_y) \propto \exp\!\left(-\frac{v_x^2 + v_y^2}{2\sigma_v^2}\right).$$

Posterior density:
$$P(v_x, v_y \mid I) \propto \exp\!\left(-\frac{1}{2\sigma^2}\sum_i\left[I_x v_x + I_y v_y + I_t\right]^2 - \frac{v_x^2 + v_y^2}{2\sigma_v^2}\right).$$
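Because the posterior above is Gaussian in $(v_x, v_y)$, its maximum is a ridge-regression solution. The sketch below (synthetic gradients and invented noise levels, not the stimuli of Weiss et al.) shows how lowering image contrast shrinks the MAP velocity toward zero, the signature of the slow-motion prior.

```python
# Minimal sketch: MAP velocity under a Lucas-Kanade Gaussian likelihood and a
# zero-mean "slow motion" prior, solved as ridge regression on image gradients.
import numpy as np

def map_velocity(Ix, Iy, It, sigma_noise, sigma_prior):
    G = np.column_stack([Ix, Iy])                      # spatial gradients per pixel
    A = G.T @ G / sigma_noise**2 + np.eye(2) / sigma_prior**2
    b = -G.T @ It / sigma_noise**2
    return np.linalg.solve(A, b)                       # (v_x, v_y)

rng = np.random.default_rng(2)
v_true = np.array([1.0, 0.5])
Ix, Iy = rng.normal(size=200), rng.normal(size=200)    # synthetic gradients
It = -(Ix * v_true[0] + Iy * v_true[1]) + 0.1 * rng.normal(size=200)

print(map_velocity(Ix, Iy, It, sigma_noise=0.1, sigma_prior=10.0))  # near v_true
print(map_velocity(0.05 * Ix, 0.05 * Iy, 0.05 * It, 0.1, 10.0))     # low contrast: biased toward 0
```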

Page 20: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Motion illusion as optimal percepts.

Weiss et al. (2002) Nature Neurosci

Page 21: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Causal inference of sensory information.

Kording et al. (2007) PLoS One; Kayser & Shams (2015) PLoS Biol

Page 22: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Causal inference of sensory information.

Kording et al. (2007) PLoS One; Kayser & Shams (2015) PLoS Biol

Given visual (xV) and auditory (xA) observations…

Q1: Do they belong to a single object (C=1) or come from two different objects (C=2)?

Q2: What are the most likely positions for the visual and the auditory objects?

[Figure: visual ($x_{\mathrm V}$) and auditory ($x_{\mathrm A}$) observations generated either by a single common source ($C=1$) or by two independent sources ($C=2$).]

Page 23: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Causal inference of sensory information.

Kording et al. (2007) PLoS One; Kayser & Shams (2015) PLoS Biol

Bayes' theorem:
$$P(C=1\mid x_{\mathrm A}, x_{\mathrm V}) = \frac{P(x_{\mathrm A}, x_{\mathrm V}\mid C=1)\,P(C=1)}{P(x_{\mathrm A}, x_{\mathrm V}\mid C=1)\,P(C=1) + P(x_{\mathrm A}, x_{\mathrm V}\mid C=2)\,P(C=2)}$$

$$P(C=2\mid x_{\mathrm A}, x_{\mathrm V}) = \frac{P(x_{\mathrm A}, x_{\mathrm V}\mid C=2)\,P(C=2)}{P(x_{\mathrm A}, x_{\mathrm V}\mid C=1)\,P(C=1) + P(x_{\mathrm A}, x_{\mathrm V}\mid C=2)\,P(C=2)}$$

Page 24: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Causal inference of sensory information.

Kording et al. (2007) PLoS One; Kayser & Shams (2015) PLoS Biol

Model averaging: the final estimates are the posterior-probability-weighted averages of the estimates under the two causal structures,
$$\hat{s}_{\mathrm A} = \hat{s}_{\mathrm A,\,C=1}\,P(C=1\mid x_{\mathrm A}, x_{\mathrm V}) + \hat{s}_{\mathrm A,\,C=2}\,P(C=2\mid x_{\mathrm A}, x_{\mathrm V}),$$
$$\hat{s}_{\mathrm V} = \hat{s}_{\mathrm V,\,C=1}\,P(C=1\mid x_{\mathrm A}, x_{\mathrm V}) + \hat{s}_{\mathrm V,\,C=2}\,P(C=2\mid x_{\mathrm A}, x_{\mathrm V}).$$
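A minimal sketch in the spirit of the Kording et al. (2007) model: Gaussian likelihoods, a zero-mean Gaussian prior over positions, and invented parameter values; it returns $P(C=1\mid x_{\mathrm V}, x_{\mathrm A})$ and the model-averaged position estimates.

```python
# Minimal sketch: causal inference with Gaussian likelihoods and a zero-mean
# Gaussian position prior, plus model averaging of the position estimates.
import numpy as np

def causal_inference(x_v, x_a, sig_v=2.0, sig_a=8.0, sig_p=15.0, p_c1=0.5):
    vv, va, vp = sig_v**2, sig_a**2, sig_p**2

    # P(x_V, x_A | C=1): a single source s, integrated out against the prior N(0, vp).
    denom1 = vv * va + vv * vp + va * vp
    like1 = np.exp(-0.5 * ((x_v - x_a)**2 * vp + x_v**2 * va + x_a**2 * vv) / denom1) \
            / (2 * np.pi * np.sqrt(denom1))

    # P(x_V, x_A | C=2): two independent sources, each integrated out.
    like2 = np.exp(-0.5 * (x_v**2 / (vv + vp) + x_a**2 / (va + vp))) \
            / (2 * np.pi * np.sqrt((vv + vp) * (va + vp)))

    post_c1 = like1 * p_c1 / (like1 * p_c1 + like2 * (1 - p_c1))

    # Position estimates (posterior means) under each causal structure.
    s_c1 = (x_v / vv + x_a / va) / (1 / vv + 1 / va + 1 / vp)
    s_v_c2 = (x_v / vv) / (1 / vv + 1 / vp)
    s_a_c2 = (x_a / va) / (1 / va + 1 / vp)

    # Model averaging: weight the two structures by their posterior probabilities.
    s_v = post_c1 * s_c1 + (1 - post_c1) * s_v_c2
    s_a = post_c1 * s_c1 + (1 - post_c1) * s_a_c2
    return post_c1, s_v, s_a

print(causal_inference(x_v=1.0, x_a=3.0))    # small conflict: likely one cause
print(causal_inference(x_v=1.0, x_a=20.0))   # large conflict: likely two causes
```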

Page 25: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Prediction, filtering and smoothing of a dynamic system.

Given measurements along the time axis $0 \le k \le T$:

• Prediction: estimate $p(\mathbf{x}_k \mid \mathbf{z}_{1:k_0})$, $\mathbf{z}_{1:k_0} = \{\mathbf{z}_1, \ldots, \mathbf{z}_{k_0}\}$ with $k_0 < k$ (a future state from past measurements).
• Filtering: estimate $p(\mathbf{x}_k \mid \mathbf{z}_{1:k})$, $\mathbf{z}_{1:k} = \{\mathbf{z}_1, \ldots, \mathbf{z}_k\}$ (the current state from measurements up to now).
• Smoothing: estimate $p(\mathbf{x}_k \mid \mathbf{z}_{1:T})$, $\mathbf{z}_{1:T} = \{\mathbf{z}_1, \ldots, \mathbf{z}_T\}$ with $k < T$ (a past state from the whole measurement sequence).

Page 26: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Estimation of dynamic systems: Kalman filter.

Stochastic state-space model:
$$\mathbf{x}_{k+1} = \mathbf{A}\mathbf{x}_k + \mathbf{w}_k, \qquad \mathbf{z}_k = \mathbf{C}\mathbf{x}_k + \mathbf{v}_k$$

Problem: given a sequence of measurements up to step $k$,
$$\mathbf{z}_{1:k} = \{\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_k\},$$
estimate the conditional mean of the state vector at step $k$,
$$\hat{\mathbf{x}}_{k|k} = \mathrm{E}\!\left[\mathbf{x}_k \mid \mathbf{z}_{1:k}\right],$$
and its conditional covariance,
$$\boldsymbol\Sigma_{k|k} = \mathrm{Cov}\!\left[\mathbf{x}_k \mid \mathbf{z}_{1:k}\right] = \mathrm{E}\!\left[(\mathbf{x}_k - \hat{\mathbf{x}}_{k|k})(\mathbf{x}_k - \hat{\mathbf{x}}_{k|k})^{\mathrm T} \mid \mathbf{z}_{1:k}\right].$$

Page 27: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman filter optimally integrates prediction and measurement.

Prediction:
$$\hat{\mathbf{x}}_{k|k-1} = \mathbf{A}\hat{\mathbf{x}}_{k-1|k-1}, \qquad \boldsymbol\Sigma_{k|k-1} = \mathbf{A}\boldsymbol\Sigma_{k-1|k-1}\mathbf{A}^{\mathrm T} + \mathbf{Q}$$

Filtering:
$$\hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k\left(\mathbf{z}_k - \mathbf{C}\hat{\mathbf{x}}_{k|k-1}\right), \qquad \boldsymbol\Sigma_{k|k} = \left(\mathbf{I} - \mathbf{K}_k\mathbf{C}\right)\boldsymbol\Sigma_{k|k-1}$$

Kalman gain:
$$\mathbf{K}_k = \boldsymbol\Sigma_{k|k-1}\mathbf{C}^{\mathrm T}\left(\mathbf{C}\boldsymbol\Sigma_{k|k-1}\mathbf{C}^{\mathrm T} + \mathbf{R}\right)^{-1}$$

Each time step alternates prediction and filtering:
$$(\hat{\mathbf{x}}_{k-1|k-1}, \boldsymbol\Sigma_{k-1|k-1}) \;\xrightarrow{\text{prediction}}\; (\hat{\mathbf{x}}_{k|k-1}, \boldsymbol\Sigma_{k|k-1}) \;\xrightarrow{\text{filtering with } \mathbf{z}_k}\; (\hat{\mathbf{x}}_{k|k}, \boldsymbol\Sigma_{k|k})$$
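A minimal NumPy sketch of the prediction/filtering recursion above; the constant-velocity tracking example and all noise parameters are invented for illustration.

```python
# Minimal sketch: Kalman filter (prediction followed by measurement update)
# for the linear-Gaussian state-space model x_{k+1} = A x_k + w, z_k = C x_k + v.
import numpy as np

def kalman_filter(z, A, C, Q, R, x0, P0):
    """Return filtered means x_hat[k] = E[x_k | z_1..k] and covariances P[k]."""
    n = A.shape[0]
    x_hat = np.zeros((len(z), n))
    P = np.zeros((len(z), n, n))
    x, Pk = x0, P0
    for k, zk in enumerate(z):
        # Prediction
        x = A @ x
        Pk = A @ Pk @ A.T + Q
        # Filtering (measurement update)
        K = Pk @ C.T @ np.linalg.inv(C @ Pk @ C.T + R)    # Kalman gain
        x = x + K @ (np.atleast_1d(zk) - C @ x)
        Pk = (np.eye(n) - K @ C) @ Pk
        x_hat[k], P[k] = x, Pk
    return x_hat, P

# Constant-velocity model: state = [position, velocity], measure position only.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.5]])

rng = np.random.default_rng(3)
true_pos = np.cumsum(np.full(100, 1.0 * dt))              # target moving at 1 unit/s
z = true_pos + rng.normal(0, np.sqrt(R[0, 0]), size=100)  # noisy position measurements

x_hat, P = kalman_filter(z, A, C, Q, R, x0=np.zeros(2), P0=np.eye(2))
print(x_hat[-1])   # filtered [position, velocity] at the last step
```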

Page 28: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman filter optimally integrates prediction and measurement.

Each prediction-filtering cycle is a Bayesian update:
• prior: $p(\mathbf{x}_k \mid \mathbf{z}_{1:k-1})$, given by the prediction $(\hat{\mathbf{x}}_{k|k-1}, \boldsymbol\Sigma_{k|k-1})$;
• likelihood: $p(\mathbf{z}_k \mid \mathbf{x}_k)$, given by the new measurement;
• posterior: $p(\mathbf{x}_k \mid \mathbf{z}_{1:k})$, given by the filtered estimate $(\hat{\mathbf{x}}_{k|k}, \boldsymbol\Sigma_{k|k})$.

Page 29: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman filter: Derivation.

Prediction (marginalizing over the previous state):
$$p(\mathbf{x}_k\mid\mathbf{z}_{1:k-1}) = \int p(\mathbf{x}_k\mid\mathbf{x}_{k-1})\,p(\mathbf{x}_{k-1}\mid\mathbf{z}_{1:k-1})\,d\mathbf{x}_{k-1} = \int \mathcal{N}\!\left(\mathbf{x}_k;\,\mathbf{A}\mathbf{x}_{k-1},\,\mathbf{Q}\right)\,\mathcal{N}\!\left(\mathbf{x}_{k-1};\,\hat{\mathbf{x}}_{k-1|k-1},\,\boldsymbol\Sigma_{k-1|k-1}\right)d\mathbf{x}_{k-1} = \mathcal{N}\!\left(\mathbf{x}_k;\,\hat{\mathbf{x}}_{k|k-1},\,\boldsymbol\Sigma_{k|k-1}\right)$$

Filtering (Bayes' theorem with the new measurement $\mathbf{z}_k$):
$$p(\mathbf{x}_k\mid\mathbf{z}_{1:k}) = \frac{p(\mathbf{z}_k\mid\mathbf{x}_k)\,p(\mathbf{x}_k\mid\mathbf{z}_{1:k-1})}{p(\mathbf{z}_k\mid\mathbf{z}_{1:k-1})} = \mathcal{N}\!\left(\mathbf{x}_k;\,\hat{\mathbf{x}}_{k|k},\,\boldsymbol\Sigma_{k|k}\right)$$

Here $\mathcal{N}(\mathbf{x}_{k-1};\hat{\mathbf{x}}_{k-1|k-1},\boldsymbol\Sigma_{k-1|k-1})$, $\mathcal{N}(\mathbf{x}_k;\hat{\mathbf{x}}_{k|k-1},\boldsymbol\Sigma_{k|k-1})$ and $\mathcal{N}(\mathbf{x}_k;\hat{\mathbf{x}}_{k|k},\boldsymbol\Sigma_{k|k})$ denote the filtered, predicted and updated Gaussian densities, respectively.

Page 30: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman filter: Derivation.

Lemma: when the joint density of two variables $\mathbf{x}$ and $\mathbf{y}$ is Gaussian,
$$\begin{pmatrix}\mathbf{x}\\ \mathbf{y}\end{pmatrix} \sim \mathcal{N}\!\left(\begin{pmatrix}\boldsymbol\mu_x\\ \boldsymbol\mu_y\end{pmatrix},\ \begin{pmatrix}\boldsymbol\Sigma_{xx} & \boldsymbol\Sigma_{xy}\\ \boldsymbol\Sigma_{yx} & \boldsymbol\Sigma_{yy}\end{pmatrix}\right),$$
then the conditional probability densities are also Gaussian:
$$\mathbf{x}\mid\mathbf{y} \sim \mathcal{N}\!\left(\boldsymbol\mu_x + \boldsymbol\Sigma_{xy}\boldsymbol\Sigma_{yy}^{-1}(\mathbf{y}-\boldsymbol\mu_y),\ \boldsymbol\Sigma_{xx} - \boldsymbol\Sigma_{xy}\boldsymbol\Sigma_{yy}^{-1}\boldsymbol\Sigma_{yx}\right),$$
$$\mathbf{y}\mid\mathbf{x} \sim \mathcal{N}\!\left(\boldsymbol\mu_y + \boldsymbol\Sigma_{yx}\boldsymbol\Sigma_{xx}^{-1}(\mathbf{x}-\boldsymbol\mu_x),\ \boldsymbol\Sigma_{yy} - \boldsymbol\Sigma_{yx}\boldsymbol\Sigma_{xx}^{-1}\boldsymbol\Sigma_{xy}\right).$$
Applying this lemma to the joint density of the state $\mathbf{x}_k$ and the measurement $\mathbf{z}_k$ yields the filtering update and the Kalman gain.

Page 31: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman filter explains adaptation rate in human reaching.

Burge et al. (2008) J Vision

$$\hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k\left(\mathbf{z}_k - \mathbf{C}\hat{\mathbf{x}}_{k|k-1}\right)$$

Noisy sensory feedback: the Kalman filter predicts slower adaptation.

Accurate sensory feedback: the Kalman filter predicts faster adaptation.

Page 32: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman filter explains smooth eye pursuit movements.

Orban de Xivry et al. (2013) J Neurosci

Kalman filter #1 for visual processing of retinal slip. Kalman filter #2 for remembered target dynamics.

Page 33: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman filter explains smooth eye pursuit movements.

Sinusoidal target motion Anticipatory smooth pursuit

Orban de Xivry et al. (2013) J Neurosci

Page 34: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Smoothing of dynamic systems: Kalman smoother. Smoothing consists of forward and backward paths.

Filtering uses the measurements up to step $k$, $\mathbf{z}_{1:k} = \{\mathbf{z}_1, \ldots, \mathbf{z}_k\}$, to obtain $p(\mathbf{x}_k\mid\mathbf{z}_{1:k})$.

Smoothing uses the whole measurement sequence, $\mathbf{z}_{1:T} = \{\mathbf{z}_1, \ldots, \mathbf{z}_T\}$, to obtain $p(\mathbf{x}_k\mid\mathbf{z}_{1:T})$.

Forward path: process $\mathbf{z}_1, \ldots, \mathbf{z}_k$ with the standard Kalman filter. Backward path: then incorporate $\mathbf{z}_{k+1}, \ldots, \mathbf{z}_T$, moving backward in time from $T$ to $0$.

Page 35: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Kalman smoother algorithm: RTS smoother.

Forward path: using the standard Kalman filter algorithm, compute the predicted mean and covariance,
$$\hat{\mathbf{x}}_{k|k-1},\ \boldsymbol\Sigma_{k|k-1}, \qquad k = 0, 1, \ldots, T,$$
and the filtered mean and covariance,
$$\hat{\mathbf{x}}_{k|k},\ \boldsymbol\Sigma_{k|k}, \qquad k = 0, 1, \ldots, T.$$

Backward path: the smoothed mean and covariance,
$$\hat{\mathbf{x}}_{k|T},\ \boldsymbol\Sigma_{k|T}, \qquad k = 0, 1, \ldots, T,$$
are then solved backward in time as follows:
$$\hat{\mathbf{x}}_{k|T} = \hat{\mathbf{x}}_{k|k} + \mathbf{G}_k\left(\hat{\mathbf{x}}_{k+1|T} - \hat{\mathbf{x}}_{k+1|k}\right),$$
$$\boldsymbol\Sigma_{k|T} = \boldsymbol\Sigma_{k|k} + \mathbf{G}_k\left(\boldsymbol\Sigma_{k+1|T} - \boldsymbol\Sigma_{k+1|k}\right)\mathbf{G}_k^{\mathrm T},$$
where
$$\mathbf{G}_k = \boldsymbol\Sigma_{k|k}\mathbf{A}^{\mathrm T}\boldsymbol\Sigma_{k+1|k}^{-1}.$$
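A minimal NumPy sketch of the RTS smoother: a forward Kalman-filter pass that stores the predicted and filtered moments, followed by the backward recursion with the smoother gain $\mathbf{G}_k$; the scalar random-walk example is invented.

```python
# Minimal sketch: Rauch-Tung-Striebel (RTS) smoother built on a Kalman filter
# forward pass.  The smoothed estimates use all measurements z_1..z_T.
import numpy as np

def rts_smoother(z, A, C, Q, R, x0, P0):
    T, n = len(z), A.shape[0]
    xp, Pp = np.zeros((T, n)), np.zeros((T, n, n))   # predicted x_{k|k-1}, Sigma_{k|k-1}
    xf, Pf = np.zeros((T, n)), np.zeros((T, n, n))   # filtered  x_{k|k},   Sigma_{k|k}
    x, P = x0, P0
    for k in range(T):                               # forward (filter) path
        x, P = A @ x, A @ P @ A.T + Q
        xp[k], Pp[k] = x, P
        K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
        x = x + K @ (np.atleast_1d(z[k]) - C @ x)
        P = (np.eye(n) - K @ C) @ P
        xf[k], Pf[k] = x, P
    xs, Ps = xf.copy(), Pf.copy()                    # smoothed x_{k|T}, Sigma_{k|T}
    for k in range(T - 2, -1, -1):                   # backward (smoothing) path
        G = Pf[k] @ A.T @ np.linalg.inv(Pp[k + 1])   # smoother gain G_k
        xs[k] = xf[k] + G @ (xs[k + 1] - xp[k + 1])
        Ps[k] = Pf[k] + G @ (Ps[k + 1] - Pp[k + 1]) @ G.T
    return xf, xs

# Scalar random-walk example: the smoother should track the state better than the filter.
A, C, Q, R = np.eye(1), np.eye(1), np.array([[0.01]]), np.array([[1.0]])
rng = np.random.default_rng(4)
x_true = np.cumsum(rng.normal(0, 0.1, size=100))
z = x_true + rng.normal(0, 1.0, size=100)
xf, xs = rts_smoother(z, A, C, Q, R, x0=np.zeros(1), P0=np.eye(1))
print("filter MSE  :", np.mean((xf[:, 0] - x_true)**2))
print("smoother MSE:", np.mean((xs[:, 0] - x_true)**2))
```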

Page 36: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Flash-lag effect explained by Kalman smoother.

http://visionlab.harvard.edu/Members/Alumni/David/flash-lag.htm

[Figure: the flash-lag effect (vertical axis: time). Panels: psychophysics; extrapolation model (Nijhawan, 1994); latency-difference model (Whitney & Murakami, 1998); optimal smoothing model (Rao et al., 2001).]

Rao et al. (2001) Neural Comput

Page 37: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Attractor neural networks for ML estimation.

Dynamics of the recurrent neural network: starting from the noisy input activities $o_i(0) = a_i$, each unit first pools its neighbors' activities through the recurrent weights (local averaging),
$$u_i(t+1) = \sum_j w_{ij}\, o_j(t),$$
and the pooled activities are then squared and divisively normalized ("winner-takes-all"),
$$o_i(t+1) = \frac{u_i(t+1)^2}{1 + \mu\sum_j u_j(t+1)^2}.$$
Iterating these dynamics, the network activity converges to a smooth hill, $o_i(\infty) = \lim_{t\to\infty} o_i(t)$, whose peak provides a near-ML estimate of the encoded stimulus.

Deneve et al. (1999) Nature Neurosci

Page 38: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Attractor neural networks for ML estimation.

Deneve et al. (1999) Nature Neurosci

Page 39: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Neural network algorithm for maximum likelihood.

Probabilistic neural responses to stimulus $s$: the spike count of neuron $i$ is Poisson-distributed around its tuning curve $f_i(s)$,
$$n_i \sim \mathrm{Poisson}\!\left(f_i(s)\right), \qquad P(n_i\mid s) = \frac{e^{-f_i(s)}\,f_i(s)^{n_i}}{n_i!}.$$

Likelihood function (assuming independent neurons):
$$P(\mathbf{n}\mid s) = \prod_{i=1}^{N} P(n_i\mid s) = \prod_{i=1}^{N} \frac{e^{-f_i(s)}\,f_i(s)^{n_i}}{n_i!}.$$

Log likelihood function:
$$\log P(\mathbf{n}\mid s) = \sum_{i=1}^{N} n_i \log f_i(s) - \sum_{i=1}^{N} f_i(s) - \sum_{i=1}^{N} \log n_i!\,.$$

Jazayeri & Movshon (2006) Nature Neurosci

Page 40: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Neural network algorithm for maximum likelihood.

Jazayeri & Movshon (2006) Nature Neurosci

$$\log P(\mathbf{r}\mid s) = \sum_{i=1}^{N} r_i \log f_i(s) - \sum_{i=1}^{N} f_i(s) - \sum_{i=1}^{N} \log r_i!\,.$$

Under the constraint $\sum_{i=1}^{N} f_i(s) = \text{constant}$ (the summed tuning-curve activity does not depend on $s$), maximizing the log likelihood over $s$ reduces to maximizing the weighted sum
$$\sum_{i=1}^{N} r_i \log f_i(s),$$
i.e., a linear combination of the observed responses $r_i$ with weights $\log f_i(s)$, which a downstream population can compute directly (see the sketch below).
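A minimal NumPy sketch of this decoding scheme, assuming Gaussian tuning curves whose preferred stimuli tile the range so that $\sum_i f_i(s)$ is approximately constant; all tuning parameters are invented.

```python
# Minimal sketch: ML decoding of a stimulus from Poisson population responses by
# maximizing the weighted sum  sum_i r_i * log f_i(s)  over candidate stimuli s.
import numpy as np

rng = np.random.default_rng(5)

s_grid = np.linspace(-40, 40, 801)                   # candidate stimulus values
s_pref = np.linspace(-60, 60, 61)                    # preferred stimuli of N neurons
sigma0, gain = 10.0, 5.0

def tuning(s):
    """f_i(s) for all neurons i at stimulus s (Gaussian tuning curves)."""
    return gain * np.exp(-(s - s_pref)**2 / (2 * sigma0**2))

s_true = 12.0
r = rng.poisson(tuning(s_true))                      # observed spike counts

# Log likelihood up to the (approximately) s-independent terms.
log_like = np.array([r @ np.log(tuning(s)) for s in s_grid])
s_ml = s_grid[np.argmax(log_like)]

print("true s:", s_true, " ML estimate:", s_ml)
print("population vector (sum r_i s_i / sum r_i):", (r @ s_pref) / r.sum())
```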

Page 41: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Neural implementation of multimodal integration.

Ma et al. (2006) Nature Neurosci

If the tuning function is Gaussian with preferred stimulus $s_i$ and width $\sigma_0$,
$$f_i(s) \propto \exp\!\left(-\frac{(s - s_i)^2}{2\sigma_0^2}\right),$$
then the log likelihood is quadratic in $s$,
$$\log P(\mathbf{n}\mid s) = \sum_{i=1}^{N} n_i \log f_i(s) + \text{const.} = -\frac{\sum_i n_i}{2\sigma_0^2}\,s^2 + \frac{\sum_i n_i s_i}{\sigma_0^2}\,s + \text{const.},$$
so the likelihood function is also Gaussian,
$$P(\mathbf{n}\mid s) \propto \mathcal{N}\!\left(s;\ \mu,\ \sigma^2\right), \qquad \mu = \frac{\sum_i n_i s_i}{\sum_i n_i}, \qquad \sigma^2 = \frac{\sigma_0^2}{\sum_i n_i}.$$

[Figure: population activity and the corresponding likelihood (two example populations); higher total population activity yields a narrower likelihood.]

Page 42: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Neural implementation of multimodal integration.

Ma et al. (2006) Nature Neurosci

If two population activities $\mathbf{n}_1$ and $\mathbf{n}_2$ encode the same stimulus $s$ through two modalities, then summing them, $\mathbf{n}_3 = \mathbf{n}_1 + \mathbf{n}_2$, produces a population whose likelihood is the product of the single-modality likelihoods,
$$P(\mathbf{n}_3\mid s) \propto P(\mathbf{n}_1\mid s)\,P(\mathbf{n}_2\mid s),$$
so the combined likelihood has mean and variance
$$x_3 = \frac{\sigma_2^2}{\sigma_1^2+\sigma_2^2}\,x_1 + \frac{\sigma_1^2}{\sigma_1^2+\sigma_2^2}\,x_2, \qquad \frac{1}{\sigma_3^2} = \frac{1}{\sigma_1^2}+\frac{1}{\sigma_2^2},$$
i.e., simple addition of population activities implements optimal (reliability-weighted) multimodal integration.

Page 43: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Summary

• Inference from noisy measurements is formulated as a statistical inference problem.

• Statistical inference is categorized into classical point estimation and Bayesian probability estimation.

• Human performance in psychophysical experiments matches the predictions of optimal estimation.

• The computation of optimal inference can be implemented by a few neural mechanisms.

Page 44: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

References

• Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429-433.

• Weiss, Y., Simoncelli, E. P., & Adelson, E. H. (2002). Motion illusions as optimal percepts. Nature neuroscience, 5(6), 598-604.

• Kording, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427(6971), 244-247.

• Kording, K. P., Beierholm, U., Ma, W. J., Quartz, S., Tenenbaum, J. B., & Shams, L. (2007). Causal inference in multisensory perception. PLoS ONE, 2(9), e943.

• Kayser, C., & Shams, L. (2015). Multisensory causal inference in the brain. PLoS biology, 13(2).

• Burge, J., Ernst, M. O., & Banks, M. S. (2008). The statistical determinants of adaptation rate in human reaching. Journal of Vision, 8(4), 20.

• de Xivry, J. J. O., Coppe, S., Blohm, G., & Lefevre, P. (2013). Kalman filtering naturally accounts for visually guided and predictive smooth pursuit dynamics. The Journal of Neuroscience, 33(44), 17301-17313.

• Rao, R. P., Eagleman, D. M., & Sejnowski, T. J. (2001). Optimal smoothing in visual motion perception. Neural Computation, 13(6), 1243-1253.

• Deneve, S., Latham, P. E., & Pouget, A. (1999). Reading population codes: a neural implementation of ideal observers. Nature neuroscience, 2(8), 740-745.

• Jazayeri, M., & Movshon, J. A. (2006). Optimal representation of sensory information by neural populations. Nature neuroscience, 9(5), 690-696.

• Ma, W. J., Beck, J. M., Latham, P. E., & Pouget, A. (2006). Bayesian inference with probabilistic population codes. Nature neuroscience, 9(11), 1432-1438.

Page 45: Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer course)

Exercise

• Write MATLAB code for the Kalman filter and the Kalman smoother, and examine the difference between their estimates.

• Psychophysics: Smooth eye pursuit experiment using GazeParser (http://gazeparser.sourceforge.net/).