Bayesian Estimation & Information Theory Jonathan Pillow Mathematical Tools for Neuroscience (NEU 314) Spring, 2016 lecture 18


Page 1: Bayesian Estimation & Information Theory (pillowlab.princeton.edu/teaching/mathtools16/slides/lec18_BayesianEstim.pdf)

Bayesian Estimation & Information Theory

Jonathan Pillow

Mathematical Tools for Neuroscience (NEU 314), Spring 2016

lecture 18

Page 2:

Bayesian Estimation

three basic ingredients:

1. Likelihood

2. Prior

3. Loss function

• The likelihood and prior jointly determine the posterior.

• The loss function L(θ̂, θ) gives the "cost" of making an estimate θ̂ if the true value is θ.

• Together, the three fully specify how to generate an estimate from the data.

The Bayesian estimator is defined as the minimizer of the expected posterior loss (the "Bayes' risk"):

θ̂(m) = argmin_θ̂ ∫ L(θ̂, θ) p(θ|m) dθ
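This definition can be checked numerically: discretize θ, and for each candidate estimate pick the grid point minimizing the expected loss under the posterior. A minimal sketch, where the Gaussian-shaped posterior and the grid limits are illustrative assumptions, not from the slides:

```python
import numpy as np

def bayes_estimate(theta_grid, posterior, loss):
    """Estimate minimizing the expected posterior loss (Bayes' risk) on a grid."""
    w = posterior / posterior.sum()                    # normalized posterior weights
    risk = np.array([(loss(th, theta_grid) * w).sum()  # E[ L(th, theta) | m ]
                     for th in theta_grid])
    return theta_grid[np.argmin(risk)]

# Hypothetical posterior: unnormalized Gaussian centered at 3
theta = np.linspace(-10.0, 10.0, 2001)
post = np.exp(-0.5 * (theta - 3.0) ** 2)

sq_loss = lambda th_hat, th: (th_hat - th) ** 2        # squared-error loss
est = bayes_estimate(theta, post, sq_loss)
print(est)                                             # ≈ 3.0, the posterior mean
```

Swapping in a different `loss` changes which feature of the posterior the estimator picks out, which is the theme of the next slides.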

Page 3:

Typical loss functions and Bayesian estimators

1. Squared-error loss:

L(θ̂, θ) = (θ̂ − θ)²

Need to find the θ̂ minimizing the expected loss:

∫ (θ̂ − θ)² p(θ|m) dθ

Differentiate with respect to θ̂ and set to zero:

2 ∫ (θ̂ − θ) p(θ|m) dθ = 0   ⇒   θ̂ = ∫ θ p(θ|m) dθ

This is the "posterior mean", also known as the Bayes' least squares (BLS) estimator.
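The calculus above can be sanity-checked with samples: sweep candidate estimates and confirm that the expected squared loss bottoms out at the sample mean. The gamma-distributed "posterior samples" here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in samples from some skewed posterior (hypothetical choice)
samples = rng.gamma(shape=2.0, scale=1.5, size=100_000)

candidates = np.linspace(0.0, 10.0, 1001)
# Monte Carlo estimate of the expected squared loss for each candidate
risk = np.array([np.mean((c - samples) ** 2) for c in candidates])
best = candidates[np.argmin(risk)]

print(best, samples.mean())   # minimizer of squared loss sits at the sample mean
```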

Page 4:

Typical loss functions and Bayesian estimators

2. "Zero-one" loss (cost 1 unless θ̂ = θ):

L(θ̂, θ) = 1 − δ(θ̂ − θ)

expected loss:

∫ (1 − δ(θ̂ − θ)) p(θ|m) dθ = 1 − p(θ̂|m)

which is minimized by the posterior maximum (or "mode"):

θ̂ = argmax_θ p(θ|m)

• known as the maximum a posteriori (MAP) estimate.

Page 5:

MAP vs. posterior mean estimate:

[Figure: a gamma pdf over 0–10, with its mode to the left of its mean]

Note: the posterior maximum and posterior mean are not always the same!
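The gamma example can be reproduced numerically: evaluate a gamma pdf on a grid and read off both estimates. The shape and scale values are illustrative assumptions, not taken from the slide:

```python
import numpy as np
from math import gamma as gamma_fn

a, s = 2.0, 1.5                      # shape and scale (illustrative values)
theta = np.linspace(1e-6, 20.0, 200_001)
# Gamma pdf: theta^(a-1) exp(-theta/s) / (Gamma(a) s^a)
pdf = theta ** (a - 1) * np.exp(-theta / s) / (gamma_fn(a) * s ** a)

dt = theta[1] - theta[0]
mean = (theta * pdf).sum() * dt      # posterior mean (BLS estimate): a*s = 3.0
mode = theta[np.argmax(pdf)]         # posterior maximum (MAP): (a-1)*s = 1.5

print(mean, mode)                    # not the same!
```

For any skewed posterior like this one, the MAP and BLS estimates disagree; they coincide for symmetric, unimodal posteriors such as the Gaussian case coming up.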

Page 6:

Typical Loss functions and Bayesian estimators

3. "L1" loss:

L(θ̂, θ) = |θ̂ − θ|

expected loss:

∫ |θ̂ − θ| p(θ|m) dθ

HW problem: what is the Bayesian estimator for this loss function?

Page 7:

Simple Example: Gaussian noise & prior

1. Likelihood: additive Gaussian noise, m = θ + n with n ~ N(0, σ²)

2. Prior: zero-mean Gaussian, θ ~ N(0, A)

3. Loss function: doesn't matter (all estimators agree here, since the posterior is Gaussian and symmetric)

posterior distribution: p(θ|m) ∝ p(m|θ) p(θ), which is Gaussian with

MAP estimate (= posterior mean): θ̂ = (A / (A + σ²)) m

variance: Aσ² / (A + σ²)
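For the Gaussian model (measurement m = θ + noise with variance σ², zero-mean prior with variance A), the closed-form posterior mean (A/(A+σ²))·m and variance Aσ²/(A+σ²) can be verified against a brute-force grid posterior. The particular numbers are illustrative:

```python
import numpy as np

sigma2, A = 2.0, 4.0     # noise variance and prior variance (illustrative values)
m = 3.0                  # observed measurement

theta = np.linspace(-20.0, 20.0, 400_001)
dt = theta[1] - theta[0]
# posterior ∝ likelihood x prior
post = np.exp(-0.5 * (m - theta) ** 2 / sigma2) * np.exp(-0.5 * theta ** 2 / A)
post /= post.sum() * dt                             # normalize

grid_mean = (theta * post).sum() * dt
grid_var = ((theta - grid_mean) ** 2 * post).sum() * dt

print(grid_mean, A / (A + sigma2) * m)              # both 2.0: shrunk toward 0
print(grid_var, A * sigma2 / (A + sigma2))          # both ≈ 1.333
```

Note the estimate 2.0 sits between the measurement (3.0) and the prior mean (0): this shrinkage is the "bias" in the following slides.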

Page 8–11:

Likelihood

[Figure sequence: the likelihood shown as a function of θ and m, both axes running from −8 to 8, with slices for particular measurements m]

Page 12:

Prior

[Figure: the prior over θ, shown in the same (θ, m) coordinates, axes from −8 to 8]

Page 13:

Computing the posterior

[Figure: likelihood × prior ∝ posterior, plotted over θ for a measurement m]

Page 14:

Making a Bayesian Estimate:

[Figure: likelihood × prior ∝ posterior for a measurement m*; the estimate is shifted from m* toward the prior, a "bias"]

Page 15:

High Measurement Noise: large bias

[Figure: likelihood × prior ∝ posterior; a broad likelihood yields a large shift toward the prior]

Page 16:

Low Measurement Noise: small bias

[Figure: likelihood × prior ∝ posterior; a narrow likelihood yields only a small shift toward the prior]

Page 17:

Bayesian Estimation:

• Likelihood and prior combine to form posterior

• The Bayesian estimate is always biased toward the prior, relative to the ML estimate.
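In the Gaussian case the size of this bias can be computed directly from the closed-form posterior mean (A/(A+σ²))·m: as the measurement noise σ² grows, the estimate is pulled farther from the measurement toward the (zero-mean) prior. The numbers below are illustrative:

```python
A, m = 4.0, 3.0                        # prior variance and measurement (illustrative)
biases = []
for sigma2 in (0.1, 1.0, 10.0):        # increasing measurement noise
    est = A / (A + sigma2) * m         # posterior mean under the Gaussian model
    biases.append(m - est)             # shift away from the ML estimate (= m)
    print(sigma2, round(est, 3), round(m - est, 3))
# the bias toward the prior grows monotonically with the noise
```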

Page 18–19:

Application #1: Biases in Motion Perception

[Figures: fixation cross with two drifting gratings of different contrast]

Which grating moves faster?

Page 20:

Explanation from Weiss, Simoncelli & Adelson (2002):

• In the limit of a zero-contrast grating, likelihood becomes infinitely broad ⇒ percept goes to zero-motion.

[Figure: prior, likelihood, and posterior for high- vs. low-contrast gratings]

Noisier measurements make the likelihood broader ⇒ the posterior has a larger shift toward 0 (the prior = no motion).

• Claim: explains why people actually speed up when driving in fog!

Page 21:

summary

• 3 ingredients for Bayesian estimation (prior, likelihood, loss)

• Bayes’ least squares (BLS) estimator (posterior mean)

• maximum a posteriori (MAP) estimator (posterior mode)

• accounts for stimulus-quality dependent bias in motion perception (Weiss, Simoncelli & Adelson 2002)