parameter estimation - i-systems.github.ioi-systems.github.io/hse545/machine learning all/11...

Parameter Estimation Industrial AI Lab.

Upload: dangdieu

Post on 17-Jun-2018


Page 1: Parameter Estimation - i-systems.github.ioi-systems.github.io/HSE545/machine learning all/11 Parameter... · •Learning Theory (Reza Shadmehr, ... –Accuracy or uncertainty information

Parameter Estimation

Industrial AI Lab.


Generative Model

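• $y = \omega^T x + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, \sigma^2)$

[Figure: graphical model with nodes X, $\omega$, $\sigma^2$, and Y]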


Maximum Likelihood Estimation (MLE)

• Estimate the parameters $\theta = (\omega, \sigma^2)$ that maximize the likelihood under a given generative model

– Given observed data

– Generative model structure (assumption)


Maximum Likelihood Estimation (MLE)

• Find parameters 𝜔 and 𝜎 that maximize the likelihood over the observed data

• Likelihood:

• Perhaps the simplest (but widely used) parameter estimation method
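
For the generative model $y = \omega^T x + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2)$, the likelihood can be written out explicitly. The sketch below assumes $m$ independent observed pairs $(x_i, y_i)$; the index $i$ and the count $m$ are notation introduced here for illustration.

```latex
\mathcal{L}(\omega, \sigma)
  = \prod_{i=1}^{m} P(y_i \mid x_i; \omega, \sigma)
  = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\left(-\frac{(y_i - \omega^T x_i)^2}{2\sigma^2}\right)
```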


Drawn from a Gaussian Distribution

• You will often see the following derivation

Drawn from a Gaussian Distribution

• To maximize, set $\frac{\partial \ell}{\partial \mu} = 0$ and $\frac{\partial \ell}{\partial \sigma} = 0$

• BIG Lesson

– We often compute a mean and variance to represent data statistics

– In doing so, we implicitly assume that the data set is Gaussian distributed

– Good news: the sample mean is approximately Gaussian distributed by the central limit theorem
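
Setting both derivatives of the Gaussian log-likelihood to zero gives the sample mean and the sample variance (divided by $m$, not $m - 1$). A minimal sketch, assuming NumPy is available; the function name `gaussian_mle` is hypothetical and chosen for illustration only:

```python
import numpy as np

def gaussian_mle(data):
    """Closed-form MLE of a univariate Gaussian; returns (mu_hat, var_hat)."""
    x = np.asarray(data, dtype=float)
    mu_hat = x.mean()                       # solves d(loglik)/d(mu) = 0
    var_hat = np.mean((x - mu_hat) ** 2)    # solves d(loglik)/d(sigma) = 0 (divides by m)
    return mu_hat, var_hat

# Illustrative data: samples from N(2, 3^2); the estimates should land near 2 and 9.
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=3.0, size=10_000)
mu_hat, var_hat = gaussian_mle(samples)
print(mu_hat, var_hat)
```

Note that the MLE of the variance is the biased estimator; the unbiased version divides by $m - 1$.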


Numerical Example

• Compute the likelihood function, then

– Maximize the likelihood function

– Adjust the mean and variance of the Gaussian to maximize the product of its densities at the data points
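
One way to carry out this numerical maximization is a brute-force grid search over candidate means and standard deviations. This is only a sketch under assumptions: it maximizes the log-likelihood (which has the same maximizer as the product of densities but is numerically safer), and the data values and grid ranges are made up for illustration.

```python
import numpy as np

def grid_search_mle(data, mus, sigmas):
    """Return the (mu, sigma) grid point with the highest Gaussian log-likelihood."""
    x = np.asarray(data, dtype=float)
    mus = np.asarray(mus, dtype=float)
    sig = np.asarray(sigmas, dtype=float)[None, :]            # shape (1, n_sigma)
    # Sum of squared deviations for every candidate mean, shape (n_mu,)
    sq = ((x[None, :] - mus[:, None]) ** 2).sum(axis=1)
    # Log-likelihood on the whole grid, shape (n_mu, n_sigma)
    ll = -x.size * np.log(np.sqrt(2 * np.pi) * sig) - sq[:, None] / (2 * sig ** 2)
    i, j = np.unravel_index(np.argmax(ll), ll.shape)
    return mus[i], sig[0, j]

data = [1.0, 2.0, 3.0, 4.0]                  # illustrative observations
mus = np.linspace(0.0, 5.0, 501)             # candidate means, step 0.01
sigmas = np.linspace(0.1, 3.0, 291)          # candidate standard deviations, step 0.01
mu_hat, sigma_hat = grid_search_mle(data, mus, sigmas)
print(mu_hat, sigma_hat)
```

The grid maximizer should agree, up to the grid resolution, with the closed-form answer (sample mean and the square root of the mean squared deviation).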


Numerical Example for Gaussian


When Mean is Unknown


When Variance is Unknown


Probabilistic Machine Learning

• Probabilistic Machine Learning

– I personally believe this is a more fundamental way of looking at machine learning

• Maximum Likelihood Estimation (MLE)

• Maximum a Posteriori (MAP)

• Probabilistic Regression

• Probabilistic Classification

• Probabilistic Clustering

• Probabilistic Dimension Reduction


Maximum Likelihood Estimation (MLE)


Linear Regression: A Probabilistic View

• Linear regression model with (Gaussian) normal errors


Linear Regression: A Probabilistic View

• BIG Lesson

– Same as least-squares optimization
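
Under Gaussian errors, maximizing the log-likelihood with respect to $\omega$ is the same as minimizing the sum of squared residuals, so the MLE of $\omega$ can be computed with an ordinary least-squares solver. A minimal sketch, assuming NumPy and a design matrix `X` whose rows are the inputs; the function name and the synthetic data are illustrative only:

```python
import numpy as np

def fit_linear_mle(X, y):
    """MLE of omega and sigma^2 for y = X @ omega + eps, eps ~ N(0, sigma^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Least-squares solution; identical to the Gaussian MLE of omega.
    omega, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ omega
    sigma2 = np.mean(residuals ** 2)         # MLE of the noise variance (divides by m)
    return omega, sigma2

# Synthetic example: intercept 0.5, slope 2.0, noise standard deviation 0.1.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.uniform(-1, 1, size=200)])
y = X @ np.array([0.5, 2.0]) + rng.normal(scale=0.1, size=200)
omega_hat, sigma2_hat = fit_linear_mle(X, y)
print(omega_hat, sigma2_hat)
```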


Maximum a Posteriori (MAP)


Data Fusion with Uncertainties

• Learning Theory (Reza Shadmehr, Johns Hopkins University)

– YouTube link

• In a matrix form

[Figure: diagram relating X to the two measurements $y_a$ and $y_b$]

Data Fusion with Uncertainties

• Find $\hat{x}_{ML}$

• $(C^T R^{-1} C)^{-1} C^T R^{-1}$
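
The matrix expression above can be applied to a measurement vector directly. The sketch below assumes a linear measurement model $y = Cx + \text{noise}$ with symmetric noise covariance $R$, so that $\hat{x}_{ML} = (C^T R^{-1} C)^{-1} C^T R^{-1} y$; the function name `fuse_ml` and the sensor values are hypothetical.

```python
import numpy as np

def fuse_ml(C, R, y):
    """Return x_ML = (C^T R^-1 C)^-1 C^T R^-1 y and its covariance (C^T R^-1 C)^-1."""
    C = np.asarray(C, dtype=float)
    R = np.asarray(R, dtype=float)
    y = np.asarray(y, dtype=float)
    Rinv_C = np.linalg.solve(R, C)               # R^-1 C without forming R^-1
    info = C.T @ Rinv_C                          # C^T R^-1 C
    x_ml = np.linalg.solve(info, Rinv_C.T @ y)   # uses symmetry of R
    cov = np.linalg.inv(info)                    # uncertainty of the fused estimate
    return x_ml, cov

# Two sensors measuring the same scalar x with noise variances 1 and 4.
C = [[1.0], [1.0]]
R = [[1.0, 0.0], [0.0, 4.0]]
y = [10.0, 12.0]
x_ml, cov = fuse_ml(C, R, y)
print(x_ml, cov)
```

With these numbers the estimate sits closer to the more accurate sensor, and the fused variance is smaller than either sensor's variance.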


Data Fusion with Less Uncertainty

• Summary

• BIG Lesson:

– Two sensors are better than one ⟹ less uncertainty

– Information about each sensor's accuracy (uncertainty) is also important
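
For two scalar sensors with independent Gaussian noise, the fused estimate is an inverse-variance weighted average, and its variance is never larger than that of the better sensor. A minimal sketch; the function name and the numbers are illustrative assumptions:

```python
def fuse_two_sensors(mu_a, var_a, mu_b, var_b):
    """Inverse-variance weighted fusion of two scalar measurements."""
    precision = 1.0 / var_a + 1.0 / var_b
    w_a = (1.0 / var_a) / precision          # weight on sensor a
    x_ml = w_a * mu_a + (1.0 - w_a) * mu_b   # fused estimate
    var_ml = 1.0 / precision                 # fused variance
    return x_ml, var_ml

# Equal variances: the estimate is the midpoint and the variance is halved.
print(fuse_two_sensors(0.0, 1.0, 2.0, 1.0))
# Sensor a is noisier: the estimate moves toward sensor b.
print(fuse_two_sensors(0.0, 4.0, 2.0, 1.0))
```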

[Figure: two panels showing $\mu_a$, $\hat{x}_{ML}$, and $\mu_b$ on a line; one panel for $\sigma_a^2 = \sigma_b^2$, the other for $\sigma_a^2 > \sigma_b^2$]

1D Examples

• Example of Two Rulers

• How the brain combines measurements from both haptic and visual channels


Data Fusion with 1D Example


Data Fusion with 2D Example


Maximum a Posteriori (MAP) Estimation

• Choose $\theta$ that maximizes the posterior probability of $\theta$ (i.e., the probability of $\theta$ in light of the observed data)

• The posterior probability of $\theta$ is given by Bayes' rule: $P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$

– $P(\theta)$: prior probability of $\theta$ (before seeing any data)

– $P(D \mid \theta)$: likelihood

– $P(D)$: probability of the data (independent of $\theta$)

• Bayes' rule lets us update our belief about $\theta$ in light of the observed data


Maximum a Posteriori (MAP) Estimation

• While doing MAP, we usually maximize the log of the posterior probability

• For multiple observations $D = \{d_1, d_2, \cdots, d_m\}$

• Same as MLE except for the extra log-prior term

• MAP allows incorporating our prior knowledge about $\theta$ into its estimation
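
Written out, the log-posterior objective looks as follows. This assumes the observations $d_1, \cdots, d_m$ are conditionally independent given $\theta$; the term $\log P(D)$ is dropped because it does not depend on $\theta$.

```latex
\theta_{MAP}
  = \arg\max_{\theta} \log P(\theta \mid D)
  = \arg\max_{\theta} \left[ \sum_{i=1}^{m} \log P(d_i \mid \theta) + \log P(\theta) \right]
```

The sum is exactly the MLE objective; the only difference is the added $\log P(\theta)$.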


MAP for mean of a univariate Gaussian

• Suppose that $\theta$ is a random variable with $\theta \sim \mathcal{N}(\mu, 1^2)$ as prior knowledge ($\theta$ unknown; $\mu$ and $\sigma^2$ known)

– Observations $D = \{d_1, d_2, \cdots, d_m\}$: conditionally independent given $\theta$

– Joint Probability


MAP for mean of a univariate Gaussian

• MAP: choose $\theta_{MAP}$
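
A sketch of how $\theta_{MAP}$ can be obtained in closed form. It assumes the prior $\theta \sim \mathcal{N}(\mu, 1^2)$ and, as an assumption made here, that each observation follows $d_i \mid \theta \sim \mathcal{N}(\theta, \sigma^2)$; $\bar{X}$ denotes the sample mean.

```latex
\begin{aligned}
\log P(\theta \mid D)
  &= -\frac{1}{2\sigma^2}\sum_{i=1}^{m}(d_i - \theta)^2
     - \frac{1}{2}(\theta - \mu)^2 + \text{const} \\
\frac{\partial}{\partial\theta}\log P(\theta \mid D)
  &= \frac{1}{\sigma^2}\sum_{i=1}^{m}(d_i - \theta) - (\theta - \mu) = 0 \\
\theta_{MAP}
  &= \frac{\frac{1}{\sigma^2}\sum_{i=1}^{m} d_i + \mu}{\frac{m}{\sigma^2} + 1}
   = \frac{m}{m + \sigma^2}\,\bar{X} + \frac{\sigma^2}{m + \sigma^2}\,\mu,
\qquad \bar{X} = \frac{1}{m}\sum_{i=1}^{m} d_i
\end{aligned}
```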


MAP for mean of a univariate Gaussian

• ML interpretation:

• BIG Lesson: a prior acts like data

• Note: prior knowledge

– Education

– Getting older

– School ranking
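
One way to see why a prior acts like data: with a prior $\theta \sim \mathcal{N}(\mu, 1^2)$ and observations assumed to follow $d_i \mid \theta \sim \mathcal{N}(\theta, \sigma^2)$, the MAP estimate equals a maximum likelihood estimate computed on the data plus one extra pseudo-measurement of value $\mu$ with variance 1. The sketch below checks this numerically; both function names are hypothetical.

```python
import numpy as np

def map_mean(data, mu, sigma2):
    """Closed-form MAP estimate: (sum(d) + sigma2 * mu) / (m + sigma2)."""
    d = np.asarray(data, dtype=float)
    return (d.sum() + sigma2 * mu) / (d.size + sigma2)

def ml_with_pseudo_observation(data, mu, sigma2):
    """Inverse-variance weighted mean of the data plus the prior as one extra point."""
    d = np.asarray(data, dtype=float)
    values = np.append(d, mu)                                # prior mean as a measurement
    weights = np.append(np.full(d.size, 1.0 / sigma2), 1.0)  # prior variance is 1
    return np.sum(weights * values) / np.sum(weights)

data = [4.0, 6.0, 8.0]
print(map_mean(data, mu=0.0, sigma2=2.0))
print(ml_with_pseudo_observation(data, mu=0.0, sigma2=2.0))
print(map_mean([], mu=3.0, sigma2=4.0))   # no data: the estimate is the prior mean
```

With no observations the estimate is the prior mean, and as the number of observations grows the pseudo-measurement is outweighed and the estimate approaches the sample mean.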

[Figure: $\theta_{MAP}$ lies between the prior mean $\mu$ (at $m = 0$) and the sample mean $\bar{X}$ (as $m \to \infty$)]

MAP for mean of a univariate Gaussian

Example: experiment in class

• Which one do you think is heavier?

– with eyes closed

– with visual inspection

– with haptic (touch) inspection


MAP Python code

• Suppose that $\theta$ is a random variable with $\theta \sim \mathcal{N}(\mu, 1^2)$ as prior knowledge ($\theta$ unknown; $\mu$ and $\sigma^2$ known)

– For the mean of a univariate Gaussian
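
The code from the original slides is not preserved in this transcript. As a stand-in, here is a minimal simulation sketch, not the lecture's own code. It assumes the prior $\theta \sim \mathcal{N}(\mu, 1^2)$ and observations drawn from $\mathcal{N}(\theta, \sigma^2)$; the default values `mu=5.0`, `sigma=2.0` and the function name are made up for illustration.

```python
import numpy as np

def simulate_map(m, mu=5.0, sigma=2.0, seed=0):
    """Draw theta from the prior, then m observations; return (theta, ML, MAP)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(mu, 1.0)               # unknown parameter, drawn from the prior
    d = rng.normal(theta, sigma, size=m)      # m noisy observations of theta
    sample_mean = d.mean()                    # ML estimate of theta
    w = m / (m + sigma ** 2)                  # weight placed on the data
    theta_map = w * sample_mean + (1.0 - w) * mu
    return theta, sample_mean, theta_map

for m in (1, 10, 100, 1000):
    theta, x_bar, theta_map = simulate_map(m)
    print(f"m={m:5d}  theta={theta:.3f}  ML={x_bar:.3f}  MAP={theta_map:.3f}")
```

For small `m` the MAP estimate is pulled toward the prior mean; for large `m` it becomes nearly identical to the ML estimate.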


Object Tracking in Computer Vision

• Optional

• Lecture: Introduction to Computer Vision by Prof. Aaron Bobick at Georgia Tech


Kernel Density Estimation

• Non-parametric estimate of a probability density

• Lecture: Learning Theory (Reza Shadmehr, Johns Hopkins University)
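
A kernel density estimate places a small kernel on every sample and averages them, so no parametric form is assumed for the density. The sketch below uses a Gaussian kernel with a hand-picked bandwidth; both choices are assumptions, since the slides do not specify them, and the function name is hypothetical.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on the given grid points."""
    s = np.asarray(samples, dtype=float)[None, :]   # shape (1, n_samples)
    g = np.asarray(grid, dtype=float)[:, None]      # shape (n_grid, 1)
    z = (g - s) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)   # standard normal kernel
    return kernels.mean(axis=1) / bandwidth         # average over samples, rescale

samples = [-1.0, 0.0, 2.0]                          # illustrative data
grid = np.linspace(-10.0, 10.0, 2001)
density = gaussian_kde_1d(samples, grid, bandwidth=0.5)
print(density.sum() * (grid[1] - grid[0]))          # numerical integral of the estimate
```

A smaller bandwidth gives a spikier estimate and a larger one gives a smoother estimate; in practice the bandwidth is usually chosen by a rule of thumb or cross-validation.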
