unsupervisedffa rnnjcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · plz j...

10
U nsupervisedff A RNNJ.TO eyiK MEANS MIXTURE of Gaussian EM t t Supervised Points TECHNIQUES ARE VALUABLE PED Atonally PDRactIf K MEANS GIN X X C Rd k the afekstees DI find AN assignment of X to ONE of k dustees C j Point c IN clostery How do we find Flea daters le I I i a

Upload: others

Post on 22-Sep-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

UnsupervisedffARNNJ.TOeyiKMEANS MIXTUREofGaussian EM

t t

Supervised Points

TECHNIQUES ARE VALUABLE PEDAtonally PDRactIf

K MEANS

GIN X X CRd k the afekstees

DI find AN assignment of X to ONE ofk dusteesC j Point c IN clostery

How do wefindFleadatersle

I I ia

Page 2: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

1 L1 Randomly Init U MZ Assegai EACHpointto cluster C Angmar11µm X if

L l K3 ComputeNEWclustersCENTERS 5

REPEATUNTIL No boats dead Nb fqXwhere Aj i C j compute

MEAN

Comments

Does KMEANS TERMINATE Yes

JCC µ IEKX Mccall decr.sn gmouoioegCIN NOTES

Does itfindA minimum NOTnecessarily NPHARD

SIDENOTE K MEANS ft 2007 from GREETStanfordstudentsIMPROVED Approximation Bourds through CleverinitDEFAULT IN SKLEARN

How do you choose K No on RightANSWERxxx xxx Xxx xxxxx

m F

kxtur offaus.sn

ToyAstRoNomyfxample BASED on Real 0W Example

Quasars stars ARE sourcesof lightBoth emit light AND WE Observe photons

Page 3: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

3 gh ss r phot

xxxx

Gone Assign CAct Photos tolight source PCE LProbability Point 2 belongs toObject j

Cf kMEANS this is A Soft AssignmentChallenge

Many Sources SAYWE know al sourcesSourceshavedifferent Intensity State

ASSUMEL 1 Sources ARE Gaussian like Ceej 052 WE DO not Assume CQual of Points Pensource mixture

NI IN this Example Physics can aceck how Plausible it is

M E Ga gi Model SETUP

w.io 7aebe.aei onitsa.s

OBSER.vn N7 IFwF kNEw cluster labels SOLVE WIK ADA

Comore µ llc Done

eyE we don't

Page 4: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

x X C R AND K 70

DI Finis Prob Z for E I n to one al kClostees

PLZ j soft Assignment

Gma morselpfx z PIX I z P 2 I Bayes RuleK

n multinomial I sit Eloi I d Zo

X Iz j NGAIthe PARAMETERS TO BE Found ARE IN BLUE

WE call Z A hidden on latent variableZ IS NOT DIRECTLY OBSERVED

teal has1 I GENERATEdata

E o 7 02 0.3

µ I µ 2 2I 2 0,2 05 2

GMmalgouthm CFAmous Algorithm HypeMirrors K MEANS

1 E STEP Guess latent VALUES of 2 For Eau Pont

2 M STEP UPDATE PARAMETERS MLE

whyAbstract Verygeneral Em Algorithm

ESTF.pt Emmi

GIVEN DATA CURRENT PARAMETERS X f Gmo

Page 5: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

Inn i 11 herD Predict latent value Z for i s a

wig Plz jlx s Hmo Innowenagisdam

P Z y j µo BAYES RULEPCxci

pcxcislzh y.jo y Plzd 2ldTEe Puli Iz 1 D a Plz 110

exp Cx u5o Howlikelyfromthis Gaussian

on Ole Howlikely thispointcamefromCloster

i WE CAN compute all teams RETURN

MSTEP_Given W PLZ j Estimate for all c em j chokesD Estimate the otherobseries PARAMETERS usy MLE

e.g by IEWj a fractionalElements in Clostery

Nj Ewj x soft coaster Center

wy

gag

Page 6: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

J

i MLE moregenerally Let's MAKERyonous

ETO0 Convexity JENSEN

THIS IS A KEYRESULT SO WANT To Goslowly Ask

A SET A IS CONTEX if for any a bGELTHE line going a b IS IN A

IN SymbolsF1 C o D a BEAkata Hb C A

CONVEX NOTCONVEXNEED to GECK Fab ER

Given A function f the gruff is Gf definers

Gf x y y fix

A function is coNu if itsgraph Is convexCABASEDCONVEX MEANS

CONVEXNOTCONVEX X a fca th 1 lb1lb C I

feary t.hn ek letz 1atCi x b

f Afca Ci 1 fcba b a EE b

Every C4ORD IS ABOVE function

If f Is twice diffcefecbleVx f C o f is convexPfi fca fez f z ca z t f g Ca Zf SaC G Z

Page 7: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

Ga 3f b f z t f Z b z t f Gb a Z 3bC 213

Afca 11Hfcb flz t f't 711b z t CI D

WE Say f is strictly convex if Fx cDomLf fix o

f x x2 f x 2 StronglyCONVEXfCx x 112 Is thegraphABOVE that is not convex

JENSENts.IN altyEfflxD3f EExT for convex f

X takes value a with prob Itakes value b with prob I I

Effed Afca 11 2 fcbf ECB FIZ for z chatCt Hb

for cowee f our defations ABOVE Implies JENSEN's4hmNB for finitely supported distributions ProvedJensen'sbg Induction

stronger if f is strongly Convex AND EACH f FIXthen X is A constant experts almostsurely

WE NEED CONCAVE functions g is concave if g is CONVEX

E guna GLEED

Exigallogcx g a X Z ow lo D 1 NEGATIVE

F curb Below foretrow

Page 8: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

WHAT ABOUThcx ax b x o convex AND concave

EchoxD NEED 4 Echl E hCECx

Ethan NEED bareandy d Exoecation

CNDDETOORCMAlgoruthms

asmanlikehhoodllole.EE log

isParameters

DATA

we assume Plx o PC x Z 0 of 6mmLATENTVARIABLE

PicturedAlgorithmetii I I

Hop CO IS EASIER TO OPTIMIZE

IT thanLeosAT 0Gt may O

RooghAlgorethmE STEP 1 Find Lt f given 0

4

m step 2 0 arghfax lol

Howdo WEconstruct to

Page 9: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

Howd s c

let's EXAMINE A singlePoint

log PCx2s0 log iQCz o foranyQiQI2

Pick QC27 Sit QC27 1 QC2 70

log 1 E JostSymbolPosting

a E log JENSEN logconcare

P X Z oQA log symbol pushingdefoeE

this holds for Any Q satisfying I

this gives A family of lowerbounds PROPERTY 1 ABOVE

Pick QC2 TO SATISFY Property 2 that is

log Patio log Rx E el l WHEN IS JENSE.no yhtif WE Pick Plqz c i.e QC27 Plz Ix O

Nd Lz depends on 0 TX diluentQ foreogpont

WE DEFINE Evidence Based lowerbound ELBO sum over 2

F LBO x Q f QE log

Page 10: UnsupervisedffA RNNJcs229.stanford.edu/livenotes2020spring/cs229-livenotes-lecture11-dr… · PLZ j soft Assignment Gma morsel pfx z PIX Iz P 2 I BayesRule K n multinomial I sit Eloi

WE'VE 540Wh do 3 Ef ELBOIX Q 0 forany Q

1 0 ELBO x Q je't forchoice QABOVE

WRAPUP

1 E STEP Q t Plz x j o

2 m STEPt

ARffmax lo

in which Leto ELBO X Q

WHY DOES This terminate l At I fo'tIS IT Globally OPTIMAL NODE SEE PICTURE

IN this lecture WE sayHARD SOFT clustering METHODSWE DERIVED General Algorithm Em IN terms

of MLE