Introduction to Machine Learning: Multivariate Methods


Page 1: Introduction to  Machine Learning Multivariate Methods

Introduction to Machine Learning

Multivariate Methods

Name: 李政軒

Page 2: Introduction to  Machine Learning Multivariate Methods

Multivariate Data

Multiple measurements:
- d inputs/features/attributes: d-variate
- N instances/observations/examples

The data can be collected in an N × d matrix:

$$\mathbf{X} = \begin{bmatrix} X_1^1 & X_2^1 & \cdots & X_d^1 \\ X_1^2 & X_2^2 & \cdots & X_d^2 \\ \vdots & \vdots & & \vdots \\ X_1^N & X_2^N & \cdots & X_d^N \end{bmatrix}$$

Page 3: Introduction to  Machine Learning Multivariate Methods

Multivariate Parameters

Mean: E[x] = μ = (μ1, ..., μd)^T

Covariance: σ_ij ≡ Cov(X_i, X_j)

Correlation: Corr(X_i, X_j) ≡ ρ_ij = σ_ij / (σ_i σ_j)

$$\boldsymbol{\Sigma} \equiv \mathrm{Cov}(\mathbf{X}) = E\big[(\mathbf{X}-\boldsymbol{\mu})(\mathbf{X}-\boldsymbol{\mu})^{T}\big] = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1d} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2d} \\ \vdots & \vdots & & \vdots \\ \sigma_{d1} & \sigma_{d2} & \cdots & \sigma_d^2 \end{bmatrix}$$

Page 4: Introduction to  Machine Learning Multivariate Methods

Parameter Estimation

Sample mean m:

$$m_i = \frac{1}{N}\sum_{t=1}^{N} x_i^t, \qquad i = 1, \ldots, d$$

Covariance matrix S:

$$s_{ij} = \frac{1}{N}\sum_{t=1}^{N} (x_i^t - m_i)(x_j^t - m_j)$$

Correlation matrix R:

$$r_{ij} = \frac{s_{ij}}{s_i\, s_j}$$

Page 5: Introduction to  Machine Learning Multivariate Methods

Estimation of Missing Values

What to do if certain instances have missing attributes?
- Ignore those instances: not a good idea if the sample is small
- Use 'missing' as an attribute: may give information
- Imputation: fill in the missing value
  - Mean imputation: use the most likely value
  - Imputation by regression: predict based on other attributes
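As a concrete sketch of the two imputation strategies, here is a toy example (NumPy only; the data matrix and the choice of which attribute is missing are made up for illustration):

```python
import numpy as np

# Toy data matrix (N=5 instances, d=2 attributes); np.nan marks missing values.
X = np.array([[1.0, 2.0],
              [2.0, np.nan],
              [3.0, 6.0],
              [4.0, 8.0],
              [5.0, np.nan]])

# Mean imputation: replace each missing entry with the column mean
# computed over the observed values.
col_means = np.nanmean(X, axis=0)
X_mean_imp = np.where(np.isnan(X), col_means, X)

# Imputation by regression: predict the missing attribute (column 1)
# from the observed attribute (column 0) using a least-squares line
# fitted on the complete rows.
complete = ~np.isnan(X[:, 1])
a, b = np.polyfit(X[complete, 0], X[complete, 1], deg=1)
X_reg_imp = X.copy()
missing = np.isnan(X[:, 1])
X_reg_imp[missing, 1] = a * X[missing, 0] + b
```

Mean imputation ignores the relationship between attributes; the regression imputer exploits it, which is why it recovers the underlying values exactly on this noiseless toy data.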

Page 6: Introduction to  Machine Learning Multivariate Methods

Multivariate Normal Distribution

x ~ N_d(μ, ∑)

$$p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\!\Big[-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\Big]$$
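The density can be written out directly from the formula (a minimal sketch; `mvn_pdf` is a hypothetical helper name, and the explicit matrix inverse is fine for illustration, though a Cholesky-based solve is preferred numerically):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N_d(mu, Sigma) at x, written out from the formula above."""
    d = len(mu)
    diff = x - mu
    maha = diff @ np.linalg.inv(Sigma) @ diff          # (x-mu)^T Sigma^{-1} (x-mu)
    norm = (2 * np.pi) ** (-d / 2) * np.linalg.det(Sigma) ** (-0.5)
    return norm * np.exp(-0.5 * maha)

# Sanity check: the standard bivariate normal at its mean has density 1/(2*pi).
p = mvn_pdf(np.zeros(2), np.zeros(2), np.eye(2))
```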

Page 7: Introduction to  Machine Learning Multivariate Methods

Mahalanobis distance: (x – μ)T ∑–1 (x – μ)

measures the distance from x to μ in terms of ∑ (normalizes for difference in variances and correlations)
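A small numeric illustration of this normalization (the mean and covariance are toy numbers, assumed for the example):

```python
import numpy as np

# Mahalanobis distance (squared form, as above): (x - mu)^T Sigma^{-1} (x - mu).
mu = np.array([0.0, 0.0])
Sigma = np.array([[4.0, 0.0],
                  [0.0, 1.0]])   # variances 4 and 1, no correlation
x = np.array([2.0, 1.0])

diff = x - mu
d_maha = diff @ np.linalg.inv(Sigma) @ diff

# Each axis is normalized by its variance: 2^2/4 + 1^2/1 = 2,
# whereas the squared Euclidean distance would be 5.
```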

Bivariate: d = 2

$$\boldsymbol{\Sigma} = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}$$

With the standardized variables z_i = (x_i − μ_i)/σ_i:

$$p(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\!\Big[-\frac{1}{2(1-\rho^2)}\big(z_1^2 - 2\rho z_1 z_2 + z_2^2\big)\Big]$$

Page 8: Introduction to  Machine Learning Multivariate Methods

Bivariate Normal

Probability contour plot of the bivariate normal distribution. Its center is given by the mean, and its shape and orientation depend on the covariance matrix.

Page 9: Introduction to  Machine Learning Multivariate Methods

If the x_i are independent, the off-diagonal entries of ∑ are 0, and the Mahalanobis distance reduces to a weighted (by 1/σ_i) Euclidean distance. If the variances are also equal, it reduces to the Euclidean distance.

Independent Inputs: Naive Bayes

$$p(\mathbf{x}) = \prod_{i=1}^{d} p_i(x_i) = \frac{1}{(2\pi)^{d/2}\prod_{i=1}^{d}\sigma_i} \exp\!\Big[-\frac{1}{2}\sum_{i=1}^{d}\Big(\frac{x_i-\mu_i}{\sigma_i}\Big)^{2}\Big]$$

Page 10: Introduction to  Machine Learning Multivariate Methods

Parametric Classification

If p(x | Ci) ~ N(μi, ∑i):

$$p(\mathbf{x}\mid C_i) = \frac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}_i|^{1/2}} \exp\!\Big[-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{T}\boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)\Big]$$

Discriminant functions:

$$g_i(\mathbf{x}) = \log p(\mathbf{x}\mid C_i) + \log P(C_i) = -\frac{d}{2}\log 2\pi - \frac{1}{2}\log|\boldsymbol{\Sigma}_i| - \frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{T}\boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i) + \log P(C_i)$$

Page 11: Introduction to  Machine Learning Multivariate Methods

Estimation of Parameters

$$\hat{P}(C_i) = \frac{\sum_t r_i^t}{N}, \qquad \mathbf{m}_i = \frac{\sum_t r_i^t\, \mathbf{x}^t}{\sum_t r_i^t}, \qquad \mathbf{S}_i = \frac{\sum_t r_i^t\, (\mathbf{x}^t - \mathbf{m}_i)(\mathbf{x}^t - \mathbf{m}_i)^{T}}{\sum_t r_i^t}$$

Dropping the constant term:

$$g_i(\mathbf{x}) = -\frac{1}{2}\log|\mathbf{S}_i| - \frac{1}{2}(\mathbf{x}-\mathbf{m}_i)^{T}\mathbf{S}_i^{-1}(\mathbf{x}-\mathbf{m}_i) + \log \hat{P}(C_i)$$
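These estimators and the resulting discriminant can be sketched for a two-class toy problem (synthetic data; the class labels play the role of the indicators r_i^t):

```python
import numpy as np

# Two well-separated Gaussian classes, 50 instances each (toy data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1.0, size=(50, 2)),
               rng.normal([3, 3], 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

priors, means, covs = [], [], []
for c in (0, 1):
    Xc = X[y == c]
    priors.append(len(Xc) / len(X))                    # P^(Ci) = sum_t r_i^t / N
    means.append(Xc.mean(axis=0))                      # m_i
    covs.append(np.cov(Xc, rowvar=False, bias=True))   # S_i (ML estimate, divides by N_i)

def g(x, i):
    """g_i(x) = -1/2 log|S_i| - 1/2 (x-m_i)^T S_i^{-1} (x-m_i) + log P^(Ci)."""
    diff = x - means[i]
    return (-0.5 * np.log(np.linalg.det(covs[i]))
            - 0.5 * diff @ np.linalg.inv(covs[i]) @ diff
            + np.log(priors[i]))

# A point near the first class mean should get the larger g_0.
x = np.array([0.1, -0.2])
pred = int(g(x, 1) > g(x, 0))
```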

Page 12: Introduction to  Machine Learning Multivariate Methods

Different Si

Quadratic discriminant

$$g_i(\mathbf{x}) = \mathbf{x}^{T}\mathbf{W}_i\mathbf{x} + \mathbf{w}_i^{T}\mathbf{x} + w_{i0}$$

where

$$\mathbf{W}_i = -\frac{1}{2}\mathbf{S}_i^{-1}, \qquad \mathbf{w}_i = \mathbf{S}_i^{-1}\mathbf{m}_i, \qquad w_{i0} = -\frac{1}{2}\mathbf{m}_i^{T}\mathbf{S}_i^{-1}\mathbf{m}_i - \frac{1}{2}\log|\mathbf{S}_i| + \log\hat{P}(C_i)$$
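One can check numerically that this expanded quadratic form agrees with the direct discriminant (S_i, m_i, the prior, and the test point are arbitrary toy numbers):

```python
import numpy as np

# W_i = -1/2 S_i^{-1}, w_i = S_i^{-1} m_i,
# w_i0 = -1/2 m_i^T S_i^{-1} m_i - 1/2 log|S_i| + log P^(Ci).
S = np.array([[2.0, 0.3], [0.3, 1.0]])
m = np.array([1.0, -1.0])
logP = np.log(0.4)
Sinv = np.linalg.inv(S)

W = -0.5 * Sinv
w = Sinv @ m
w0 = -0.5 * m @ Sinv @ m - 0.5 * np.log(np.linalg.det(S)) + logP

x = np.array([0.5, 2.0])
g_direct = -0.5 * np.log(np.linalg.det(S)) - 0.5 * (x - m) @ Sinv @ (x - m) + logP
g_quad = x @ W @ x + w @ x + w0
```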

Page 13: Introduction to  Machine Learning Multivariate Methods

Figure: likelihoods, posterior for C1, and the discriminant P(C1|x) = 0.5.

Page 14: Introduction to  Machine Learning Multivariate Methods

Common Covariance Matrix S

Shared common sample covariance S:

$$\mathbf{S} = \sum_i \hat{P}(C_i)\,\mathbf{S}_i$$

The discriminant reduces to

$$g_i(\mathbf{x}) = -\frac{1}{2}(\mathbf{x}-\mathbf{m}_i)^{T}\mathbf{S}^{-1}(\mathbf{x}-\mathbf{m}_i) + \log\hat{P}(C_i)$$

which is a linear discriminant:

$$g_i(\mathbf{x}) = \mathbf{w}_i^{T}\mathbf{x} + w_{i0}, \qquad \mathbf{w}_i = \mathbf{S}^{-1}\mathbf{m}_i, \qquad w_{i0} = -\frac{1}{2}\mathbf{m}_i^{T}\mathbf{S}^{-1}\mathbf{m}_i + \log\hat{P}(C_i)$$

Page 15: Introduction to  Machine Learning Multivariate Methods

Common Covariance Matrix S

Covariances may be arbitrary but shared by both classes.

Page 16: Introduction to  Machine Learning Multivariate Methods

Diagonal S

When the x_j, j = 1, ..., d, are independent, ∑ is diagonal:

$$p(\mathbf{x}\mid C_i) = \prod_j p_j(x_j \mid C_i)$$

Classify based on weighted (in s_j units) Euclidean distance to the nearest mean:

$$g_i(\mathbf{x}) = -\frac{1}{2}\sum_{j=1}^{d}\Big(\frac{x_j^t - m_{ij}}{s_j}\Big)^{2} + \log\hat{P}(C_i)$$

Page 17: Introduction to  Machine Learning Multivariate Methods

Diagonal S

Variances may be different.

Page 18: Introduction to  Machine Learning Multivariate Methods

Diagonal S, equal variances

Nearest mean classifier: classify based on Euclidean distance to the nearest mean:

$$g_i(\mathbf{x}) = -\frac{\lVert \mathbf{x}^t - \mathbf{m}_i \rVert^{2}}{2s^{2}} + \log\hat{P}(C_i) = -\frac{1}{2s^{2}}\sum_{j=1}^{d}\big(x_j^t - m_{ij}\big)^{2} + \log\hat{P}(C_i)$$

Each mean can be considered a prototype or template, and this is template matching.
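A minimal nearest-mean (template-matching) classifier, assuming equal priors and equal variances as above (the prototype coordinates are made up):

```python
import numpy as np

# One prototype (class mean) per class.
means = np.array([[0.0, 0.0],
                  [5.0, 5.0]])

def nearest_mean(x):
    """Return the index of the class whose mean is closest in Euclidean distance."""
    dists = np.linalg.norm(means - x, axis=1)
    return int(np.argmin(dists))
```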

Page 19: Introduction to  Machine Learning Multivariate Methods

Diagonal S, equal variances


All classes have equal, diagonal covariance matrices of equal variances on both dimensions.

Page 20: Introduction to  Machine Learning Multivariate Methods

Model Selection

As we increase complexity (less restricted S), bias decreases and variance increases.

Assume simple models (allow some bias) to control variance (regularization).

Page 21: Introduction to  Machine Learning Multivariate Methods

Different cases of the covariance matrices fitted to the same data lead to different boundaries.

Page 22: Introduction to  Machine Learning Multivariate Methods

Discrete Features

Binary features: p_ij ≡ p(x_j = 1 | Ci)

If the x_j are independent (naive Bayes'):

$$p(\mathbf{x}\mid C_i) = \prod_{j=1}^{d} p_{ij}^{x_j}\,(1-p_{ij})^{1-x_j}$$

The discriminant is linear:

$$g_i(\mathbf{x}) = \log p(\mathbf{x}\mid C_i) + \log P(C_i) = \sum_j \big[x_j \log p_{ij} + (1-x_j)\log(1-p_{ij})\big] + \log P(C_i)$$

Estimated parameters:

$$\hat{p}_{ij} = \frac{\sum_t x_j^t r_i^t}{\sum_t r_i^t}$$
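A sketch of this linear discriminant for binary features (the probability table and the priors are toy numbers, not estimates from data):

```python
import numpy as np

# p[i, j] = P(x_j = 1 | Ci) for two classes and three binary attributes.
p = np.array([[0.9, 0.2, 0.1],    # class 0
              [0.1, 0.8, 0.7]])   # class 1
log_prior = np.log([0.5, 0.5])

def g(x):
    """g_i(x) = sum_j [x_j log p_ij + (1-x_j) log(1-p_ij)] + log P(Ci), for all i."""
    return x @ np.log(p.T) + (1 - x) @ np.log(1 - p.T) + log_prior

x = np.array([1.0, 0.0, 0.0])   # pattern that looks like class 0
pred = int(np.argmax(g(x)))
```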

Page 23: Introduction to  Machine Learning Multivariate Methods

Discrete Features

Multinomial (1-of-n_j) features: x_j ∈ {v_1, v_2, ..., v_{n_j}}

Let z_{jk} = 1 if x_j = v_k and 0 otherwise, with p_{ijk} ≡ p(x_j = v_k | Ci) = p(z_{jk} = 1 | Ci).

If the x_j are independent:

$$p(\mathbf{x}\mid C_i) = \prod_{j=1}^{d}\prod_{k=1}^{n_j} p_{ijk}^{z_{jk}}$$

$$g_i(\mathbf{x}) = \sum_j \sum_k z_{jk}\log p_{ijk} + \log P(C_i)$$

$$\hat{p}_{ijk} = \frac{\sum_t z_{jk}^t r_i^t}{\sum_t r_i^t}$$

Page 24: Introduction to  Machine Learning Multivariate Methods

Multivariate Regression

$$r^t = g(\mathbf{x}^t \mid w_0, w_1, \ldots, w_d) + \varepsilon$$

Multivariate linear model:

$$g(\mathbf{x}^t) = w_0 + w_1 x_1^t + w_2 x_2^t + \cdots + w_d x_d^t$$

$$E(w_0, w_1, \ldots, w_d \mid \mathcal{X}) = \frac{1}{2}\sum_t \big(r^t - w_0 - w_1 x_1^t - \cdots - w_d x_d^t\big)^{2}$$

Multivariate polynomial model: define new higher-order variables z_1 = x_1, z_2 = x_2, z_3 = x_1^2, z_4 = x_2^2, z_5 = x_1 x_2 and use the linear model in this new z space.
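The least-squares fit of the linear model can be sketched via `numpy.linalg.lstsq` (toy noiseless data generated from known weights, so the fit recovers them exactly); the polynomial model is the same fit applied after mapping x to the z variables:

```python
import numpy as np

# Toy data: r^t = w0 + w1 x1^t + w2 x2^t with known weights and no noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
r = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

# Prepend a column of 1s so that w0 is fit alongside w1, ..., wd.
D = np.hstack([np.ones((len(X), 1)), X])
w, *_ = np.linalg.lstsq(D, r, rcond=None)
```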