lecture 2: inference - carnegie mellon school of computer ...nasmith/psnlp/lecture2.pdfinference...

Lecture 2: Inference



Inference: A Ubiquitous Obstacle

•  Decoding is inference.
•  Subroutines for learning are inference.
•  Learning is inference.

•  Exact inference is #P-complete.
   – Even approximations within a given absolute or relative error are hard.

Probabilistic Inference Problems

Given values for some random variables (X ⊂ V) …
•  Most Probable Explanation: what are the most probable values of the rest of the r.v.s, V \ X?

(More generally …)
•  Maximum A Posteriori (MAP): what are the most probable values of some other r.v.s, Y ⊂ (V \ X)?

•  Random sampling from the posterior over values of Y
•  Full posterior over values of Y
•  Marginal probabilities from the posterior over Y

•  Minimum Bayes risk: what is the Y with the lowest expected cost?
•  Cost-augmented decoding: what is the most dangerous Y?

Approaches to Inference

•  exact
   – variable elimination
   – dynamic programming
   – ILP
•  approximate
   – randomized
      – MCMC (e.g., Gibbs)
      – importance sampling
      – randomized search (e.g., simulated annealing)
   – deterministic
      – variational (e.g., mean field)
      – loopy belief propagation
      – LP relaxations (e.g., dual decomposition)
      – local search (e.g., beam search)

[Figure: tree diagram, flattened in the transcript; the nesting above is reconstructed. The slide also carries the annotations "lecture 6" and "today", but the branches they mark are not recoverable.]

Exact Marginal for Y

•  This will be a generalization of algorithms you already know: the forward and backward algorithms.
•  The general name is variable elimination.
•  After we see it for the marginal, we'll see how to use it for the MAP.

Simple Inference Example

•  Goal: P(D)

[Figure: chain A → B → C → D with tables for P(A), P(B | A), P(C | B) and P(D | C) over the values 0 and 1; the table entries are not recoverable from the transcript.]

•  Let's calculate P(B) from things we have.


•  Note that C and D do not matter.

[Figure: the table for P(A) combined with the table for P(B | A) gives the table for P(B); the entries are not recoverable from the transcript.]

\[ P(B = b) = \sum_{a \in \{0, 1\}} P(A = a) \, P(B = b \mid A = a) \]

•  We now have a Bayesian network for the marginal distribution P(B, C, D).

[Figure: chain B → C → D with tables for P(B), P(C | B) and P(D | C).]

•  We can repeat the same process to calculate P(C).
•  We already have P(B)!

[Figure: the table for P(B) combined with the table for P(C | B) gives the table for P(C).]

\[ P(C = c) = \sum_{b \in \{0, 1\}} P(B = b) \, P(C = c \mid B = b) \]

•  We now have P(C, D).
•  Marginalizing out A and B happened in two steps, and we are exploiting the Bayesian network structure.

[Figure: chain C → D with tables for P(C) and P(D | C).]

•  Last step to get P(D):

\[ P(D = d) = \sum_{c \in \{0, 1\}} P(C = c) \, P(D = d \mid C = c) \]

•  Notice that the same step happened for each random variable:
   – We created a new CPD over the variable and its "successor".
   – We summed out (marginalized) the variable.

That Was Variable Elimination

•  We reused computation from previous steps and avoided doing the same work more than once.
   – Dynamic programming à la forward algorithm!
•  We exploited the Bayesian network structure (each subexpression only depends on a small number of variables).
•  Exponential blowup avoided!
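The chain walk-through above can be sketched in a few lines of Python. The CPD numbers are made up for illustration (the slide's table entries did not survive in the transcript), and the helper name `push_forward` is hypothetical.

```python
# Hypothetical CPDs for the chain A -> B -> C -> D (binary variables).
# cpd[parent_value][child_value] = P(child | parent); all numbers are made up.
p_a = [0.6, 0.4]
p_b_given_a = [[0.7, 0.3], [0.2, 0.8]]
p_c_given_b = [[0.9, 0.1], [0.5, 0.5]]
p_d_given_c = [[0.4, 0.6], [0.1, 0.9]]


def push_forward(prior, cpd):
    """Form the joint over (variable, successor), then sum out the variable."""
    n_child = len(cpd[0])
    return [
        sum(prior[x] * cpd[x][y] for x in range(len(prior)))
        for y in range(n_child)
    ]


p_b = push_forward(p_a, p_b_given_a)   # eliminate A
p_c = push_forward(p_b, p_c_given_b)   # eliminate B
p_d = push_forward(p_c, p_d_given_c)   # eliminate C


def brute_force_p_d():
    """Reference answer: sum the full joint over A, B and C."""
    out = [0.0, 0.0]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                for d in range(2):
                    out[d] += (p_a[a] * p_b_given_a[a][b]
                               * p_c_given_b[b][c] * p_d_given_c[c][d])
    return out
```

Each `push_forward` call touches only one small table, so the work grows linearly with the chain length, while `brute_force_p_d` enumerates every joint assignment.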

What Remains

•  Some machinery
•  Variable elimination in general
•  The maximization version (for MAP inference)
•  A bit about approximate inference

Factor Graphs

•  Variable nodes (circles)
•  Factor nodes (squares)
   – Can be MN factors or BN conditional probability distributions!
•  Edge between variable and factor if the factor depends on that variable.
•  The graph is bipartite.

[Figure: factor graph with variable nodes X, Y, Z and factor nodes φ1, φ2, φ3, φ4; the edges are not recoverable from the transcript.]


Products of Factors

•  Given two factors with different scopes, we can calculate a new factor equal to their product.

A  B  ϕ1(A,B)
0  0  30
0  1  5
1  0  1
1  1  10

B  C  ϕ2(B,C)
0  0  100
0  1  1
1  0  1
1  1  100

ϕ1 · ϕ2 = ϕ3:

A  B  C  ϕ3(A,B,C)
0  0  0  3000
0  0  1  30
0  1  0  5
0  1  1  500
1  0  0  100
1  0  1  1
1  1  0  10
1  1  1  1000
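A minimal sketch of the factor product, assuming binary variables and a factor stored as a scope tuple plus a dictionary from assignments to values. The function name `factor_product` and this representation are illustrative choices, not something the slides prescribe.

```python
from itertools import product


def factor_product(scope1, table1, scope2, table2, domain=(0, 1)):
    """Multiply two factors; the result's scope is the union of both scopes."""
    scope = tuple(scope1) + tuple(v for v in scope2 if v not in scope1)
    table = {}
    for values in product(domain, repeat=len(scope)):
        assignment = dict(zip(scope, values))
        key1 = tuple(assignment[v] for v in scope1)   # row of the first factor
        key2 = tuple(assignment[v] for v in scope2)   # row of the second factor
        table[values] = table1[key1] * table2[key2]
    return scope, table


# The two factors from the slide.
phi1 = {(0, 0): 30, (0, 1): 5, (1, 0): 1, (1, 1): 10}      # phi1(A, B)
phi2 = {(0, 0): 100, (0, 1): 1, (1, 0): 1, (1, 1): 100}    # phi2(B, C)
scope3, phi3 = factor_product(("A", "B"), phi1, ("B", "C"), phi2)
```

Every row of the product agrees with both inputs on the shared variable B, which is why the result has eight rows rather than sixteen.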

Factor Marginalization

•  Given X and Y (Y ∉ X), we can turn a factor ϕ(X, Y) into a factor ψ(X) via marginalization:

\[ \psi(X) = \sum_{y \in \mathrm{Val}(Y)} \phi(X, y) \]

Example: "summing out" B from P(C | A, B).

P(C | A, B)   A,B = 0,0   0,1   1,0   1,1
C = 0         0.5         0.4   0.2   0.1
C = 1         0.5         0.6   0.8   0.9

A  C  ψ(A,C)
0  0  0.9
0  1  1.1
1  0  0.3
1  1  1.7

Example: "summing out" C from the same P(C | A, B).

A  B  ψ(A,B)
0  0  1
0  1  1
1  0  1
1  1  1

•  We can refer to this new factor by ∑Y ϕ.
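Summing out a variable can be sketched as follows, using the P(C | A, B) table from the slide stored as a factor over (A, B, C). The function name `sum_out` and the dictionary representation are assumptions made for illustration.

```python
def sum_out(scope, table, var):
    """Return the factor obtained by summing `var` out of (scope, table)."""
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {}
    for values, weight in table.items():
        key = values[:idx] + values[idx + 1:]        # drop var's value
        new_table[key] = new_table.get(key, 0.0) + weight
    return new_scope, new_table


# P(C | A, B) from the slide, keyed by (A, B, C).
p_c_given_ab = {
    (0, 0, 0): 0.5, (0, 0, 1): 0.5,
    (0, 1, 0): 0.4, (0, 1, 1): 0.6,
    (1, 0, 0): 0.2, (1, 0, 1): 0.8,
    (1, 1, 0): 0.1, (1, 1, 1): 0.9,
}
scope_ac, psi_ac = sum_out(("A", "B", "C"), p_c_given_ab, "B")
scope_ab, psi_ab = sum_out(("A", "B", "C"), p_c_given_ab, "C")
```

Summing out C from a CPD for C gives a factor of all ones, while summing out the parent B gives a factor that is not a probability distribution.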

Marginalizing Everything?

•  Take a Markov network's "product factor" by multiplying all of its factors.
•  Sum out all the variables (one by one).
•  What do you get?

Factors Are Like Numbers

•  Products are commutative: ϕ1 · ϕ2 = ϕ2 · ϕ1
•  Products are associative: (ϕ1 · ϕ2) · ϕ3 = ϕ1 · (ϕ2 · ϕ3)
•  Sums are commutative: ∑X ∑Y ϕ = ∑Y ∑X ϕ
•  Distributivity of multiplication over summation: [equation not recoverable from the transcript]
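The distributivity formula itself is missing from the transcript. A standard statement, assuming the summed variable X is not in the scope of ϕ1, is sketched below; treat it as a reconstruction rather than the slide's exact wording.

```latex
\[
X \notin \mathrm{Scope}(\phi_1)
\;\Longrightarrow\;
\sum_{X} \left( \phi_1 \cdot \phi_2 \right) \;=\; \phi_1 \cdot \sum_{X} \phi_2
\]
```

This is the property that lets a sum over one variable be pushed inside the product, past every factor that does not mention that variable.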

Eliminating One Variable

Input: Set of factors Φ, variable Z to eliminate
Output: new set of factors Ψ

1. Let Φ′ = {ϕ ∈ Φ | Z ∈ Scope(ϕ)}
2. Let Ψ = {ϕ ∈ Φ | Z ∉ Scope(ϕ)}
3. Let ψ be ∑Z ∏ϕ∈Φ′ ϕ
4. Return Ψ ∪ {ψ}

Example

•  Query: P(Flu | runny nose)
•  Let's eliminate H.
   1. Φ′ = {ϕSH}
   2. Ψ = {ϕF, ϕA, ϕFAS, ϕSR}
   3. ψ = ∑H ∏ϕ∈Φ′ ϕ = ∑H ϕSH
   4. Return Ψ ∪ {ψ}

[Figure: factor graph over Flu, All., S.I., R.N. and H. with factors ϕF, ϕA, ϕFAS, ϕSR and ϕSH. After the step, H. and ϕSH are gone and the new factor ψ is attached to S.I.]

P(H | S)   S = 0   S = 1
H = 0      0.8     0.1
H = 1      0.2     0.9

S  ψ(S)
0  1.0
1  1.0

•  We can actually ignore the new factor, equivalently just deleting H!
   – Why?
   – In some cases eliminating a variable is really easy!

[Figure: factor graph over Flu, All., S.I. and R.N. with factors ϕF, ϕA, ϕFAS and ϕSR; the all-ones factor ψ(S) has been dropped.]

Variable Elimination

Input: Set of factors Φ, ordered list of variables Z to eliminate
Output: new factor ψ

1. For each Zi ∈ Z (in order):
   – Let Φ = Eliminate-One(Φ, Zi)
2. Return ∏ϕ∈Φ ϕ
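The two procedures above (Eliminate-One and the outer loop) can be sketched in Python. This is an illustrative sketch under stated assumptions: binary variables, factors stored as `(scope, table)` pairs, and made-up CPD numbers for the chain A → B → C → D. All function names are hypothetical.

```python
from itertools import product


def multiply(f, g, domain=(0, 1)):
    """Factor product of two (scope, table) pairs."""
    (scope_f, table_f), (scope_g, table_g) = f, g
    scope = scope_f + tuple(v for v in scope_g if v not in scope_f)
    table = {}
    for values in product(domain, repeat=len(scope)):
        a = dict(zip(scope, values))
        table[values] = (table_f[tuple(a[v] for v in scope_f)]
                         * table_g[tuple(a[v] for v in scope_g)])
    return scope, table


def sum_out(f, var):
    """Marginalize `var` out of a factor."""
    scope, table = f
    i = scope.index(var)
    out = {}
    for values, w in table.items():
        key = values[:i] + values[i + 1:]
        out[key] = out.get(key, 0.0) + w
    return scope[:i] + scope[i + 1:], out


def multiply_all(factors):
    result = ((), {(): 1.0})             # constant factor 1 with empty scope
    for f in factors:
        result = multiply(result, f)
    return result


def eliminate_one(factors, z):
    touching = [f for f in factors if z in f[0]]       # step 1: Phi'
    rest = [f for f in factors if z not in f[0]]       # step 2: Psi
    if not touching:                                   # z appears nowhere
        return factors
    psi = sum_out(multiply_all(touching), z)           # step 3
    return rest + [psi]                                # step 4


def variable_elimination(factors, order):
    for z in order:
        factors = eliminate_one(factors, z)
    return multiply_all(factors)


# Hypothetical chain A -> B -> C -> D; all numbers are made up.
chain = [
    (("A",), {(0,): 0.6, (1,): 0.4}),
    (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}),
    (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}),
    (("C", "D"), {(0, 0): 0.4, (0, 1): 0.6, (1, 0): 0.1, (1, 1): 0.9}),
]
scope_fwd, p_d_fwd = variable_elimination(chain, ["A", "B", "C"])
scope_rev, p_d_rev = variable_elimination(chain, ["C", "B", "A"])
```

Both orderings return the same marginal over D; they differ only in the size of the intermediate factors they build along the way.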

Example

•  Query: P(Flu | runny nose)
•  H is already eliminated.
•  Let's now eliminate S.
   1. Φ′ = {ϕSR, ϕFAS}
   2. Ψ = {ϕF, ϕA}
   3. ψFAR = ∑S ∏ϕ∈Φ′ ϕ = ∑S ϕSR · ϕFAS
   4. Return Ψ ∪ {ψFAR}

[Figure: before the step, a factor graph over Flu, All., S.I. and R.N. with factors ϕF, ϕA, ϕFAS and ϕSR; after it, a graph over Flu, All. and R.N. with factors ϕF, ϕA and ψFAR.]

•  Finally, eliminate A.
   1. Φ′ = {ϕA, ψFAR}
   2. Ψ = {ϕF}
   3. ψFR = ∑A ϕA · ψFAR
   4. Return Ψ ∪ {ψFR}

[Figure: before the step, a factor graph over Flu, All. and R.N. with factors ϕF, ϕA and ψFAR; after it, a graph over Flu and R.N. with factors ϕF and ψFR.]

Markov Chain, Again

•  Earlier, we eliminated A, then B, then C.
•  Now let's start by eliminating C.

[Figure: chain A → B → C → D with tables for P(A), P(B | A), P(C | B) and P(D | C). Multiplying P(C | B) by P(D | C) gives an eight-row factor ϕ′(B, C, D); summing out C gives a four-row factor ψ(B, D). The table entries are not recoverable from the transcript.]

\[ \phi'(B, C, D) = P(C \mid B) \cdot P(D \mid C), \qquad \psi(B, D) = \sum_{C} \phi'(B, C, D) \]

•  Eliminating B will be similarly complex.

Variable Elimination: Comments

•  Can prune away all non-ancestors of the query variables.
•  Ordering makes a difference!
•  Works for Markov networks and Bayesian networks.
   – Factors need not be CPDs and, in general, new factors won't be.

What about Evidence?

•  So far, we've just considered the posterior/marginal P(Y).
•  Next: conditional distribution P(Y | X = x).
•  It's almost the same: the additional step is to reduce factors to respect the evidence.

Example

•  Query: P(Flu | runny nose)
•  Let's reduce to R = true (runny nose).

[Figure: factor graph over Flu, All., S.I., R.N. and H. with factors ϕF, ϕA, ϕFAS, ϕSR and ϕSH. The table P(R | S) is rewritten as a factor ϕSR(S, R) with rows (S, R) = (0, 0), (0, 1), (1, 0), (1, 1). Keeping only the rows with R = 1 gives the reduced factor ϕ′S(S), with one row for S = 0 and one for S = 1. In the reduced graph R.N. is gone and ϕSR is replaced by ϕ′S. The table entries are not recoverable from the transcript.]

•  Now run variable elimination all the way down to one factor (for F).
   1. H can be pruned for the same reasons as before.
   2. Eliminate S, leaving ϕF, ϕA and a new factor ψFA.
   3. Eliminate A, leaving ϕF and a new factor ψF.
   4. Take the final product: ϕF · ψF.

Variable Elimination for Conditional Probabilities

Input: Graphical model on V, set of query variables Y, evidence X = x
Output: factor ϕ and scalar α

1. Φ = factors in the model
2. Reduce factors in Φ by X = x
3. Choose variable ordering on Z = V \ Y \ X
4. ϕ = Variable-Elimination(Φ, Z)
5. α = ∑y∈Val(Y) ϕ(y)
6. Return ϕ, α

Note

•  For Bayesian networks, the final factor will be P(Y, X = x) and the sum α = P(X = x).
   – Dividing gives the conditional: P(Y | X = x) = ϕ(Y) / α.
•  This equates to a Gibbs distribution with partition function = α.
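The conditional-probability procedure can be sketched as below. The model is a hypothetical three-variable chain F → S → R with made-up CPDs (the flu network's tables are not in the transcript), and every function name is an illustrative assumption.

```python
from itertools import product


def multiply(f, g, domain=(0, 1)):
    """Factor product; each factor is a (scope, table) pair."""
    (scope_f, table_f), (scope_g, table_g) = f, g
    scope = scope_f + tuple(v for v in scope_g if v not in scope_f)
    table = {}
    for values in product(domain, repeat=len(scope)):
        a = dict(zip(scope, values))
        table[values] = (table_f[tuple(a[v] for v in scope_f)]
                         * table_g[tuple(a[v] for v in scope_g)])
    return scope, table


def sum_out(f, var):
    scope, table = f
    i = scope.index(var)
    out = {}
    for values, w in table.items():
        key = values[:i] + values[i + 1:]
        out[key] = out.get(key, 0.0) + w
    return scope[:i] + scope[i + 1:], out


def reduce_factor(f, var, value):
    """Keep the rows that agree with var == value, then drop var from the scope."""
    scope, table = f
    if var not in scope:
        return f
    i = scope.index(var)
    kept = {k[:i] + k[i + 1:]: w for k, w in table.items() if k[i] == value}
    return scope[:i] + scope[i + 1:], kept


def conditional_ve(factors, evidence, order):
    for var, value in evidence.items():                       # step 2
        factors = [reduce_factor(f, var, value) for f in factors]
    for z in order:                                           # step 4
        touching = [f for f in factors if z in f[0]]
        if not touching:
            continue
        joint = touching[0]
        for f in touching[1:]:
            joint = multiply(joint, f)
        factors = [f for f in factors if z not in f[0]] + [sum_out(joint, z)]
    phi = factors[0]
    for f in factors[1:]:
        phi = multiply(phi, f)
    alpha = sum(phi[1].values())                              # step 5
    return phi, alpha


# Hypothetical chain F -> S -> R with made-up CPDs; evidence R = 1.
p_f = (("F",), {(0,): 0.9, (1,): 0.1})
p_s_given_f = (("F", "S"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.9})
p_r_given_s = (("S", "R"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7})

phi, alpha = conditional_ve([p_f, p_s_given_f, p_r_given_s], {"R": 1}, ["S"])
posterior = {k: w / alpha for k, w in phi[1].items()}   # P(F | R = 1)
```

Here `phi` plays the role of P(F, R = 1), `alpha` the role of P(R = 1), and dividing one by the other gives the conditional.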

Variable Elimination

•  In general, time and space requirements are exponential in the induced width of the ordering you choose.
•  It's NP-hard to find the best elimination ordering.
•  If you can avoid "big" intermediate factors, you can make inference linear in the size of the original factors.
   – Chordal graphs
   – Polytrees

Additional Comments

•  Runtime depends on the size of the intermediate factors.
•  Hence, variable elimination ordering matters a lot.
   – But it's NP-hard to find the best one.
   – For MNs, chordal graphs permit inference in time linear in the size of the original factors.
   – For BNs, polytree structures do the same.

Getting Back to NLP

•  Traditional structured NLP models were sometimes subconsciously chosen for these properties.
   – HMMs, PCFGs (with a little work)
   – But not: IBM model 3
•  Need MAP inference for decoding!
•  Need approximate inference for complex models!

From Marginals to MAP

•  Replace factor marginalization steps with maximization.
   – Add bookkeeping to keep track of the maximizing values.
•  Add a traceback at the end to recover the solution.
•  This is analogous to the connection between the forward algorithm and the Viterbi algorithm.
   – Ordering challenge is the same.

Factor Maximization

•  Given X and Y (Y ∉ X), we can turn a factor ϕ(X, Y) into a factor ψ(X) via maximization:

\[ \psi(X) = \max_{y \in \mathrm{Val}(Y)} \phi(X, y) \]

•  We can refer to this new factor by maxY ϕ.

Example: "maximizing out" B.

A  B  C  ϕ(A,B,C)
0  0  0  0.9
0  0  1  0.3
0  1  0  1.1
0  1  1  1.7
1  0  0  0.4
1  0  1  0.7
1  1  0  1.1
1  1  1  0.2

A  C  ψ(A,C)  maximizing value
0  0  1.1     B=1
0  1  1.7     B=1
1  0  1.1     B=1
1  1  0.7     B=0
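Maximizing out a variable while recording the maximizing value can be sketched as follows, using the ϕ(A, B, C) table from the slide. The function name `max_out` and the separate `argbest` dictionary are illustrative assumptions about how the bookkeeping might be stored.

```python
def max_out(scope, table, var):
    """Maximize `var` out of a factor, remembering which value achieved the max."""
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    best = {}      # reduced assignment -> maximum value
    argbest = {}   # reduced assignment -> value of var that achieves it
    for values, weight in table.items():
        key = values[:idx] + values[idx + 1:]
        if key not in best or weight > best[key]:
            best[key] = weight
            argbest[key] = values[idx]
    return new_scope, best, argbest


# phi(A, B, C) from the slide, keyed by (A, B, C).
phi = {
    (0, 0, 0): 0.9, (0, 0, 1): 0.3,
    (0, 1, 0): 1.1, (0, 1, 1): 1.7,
    (1, 0, 0): 0.4, (1, 0, 1): 0.7,
    (1, 1, 0): 1.1, (1, 1, 1): 0.2,
}
scope_ac, psi, best_b = max_out(("A", "B", "C"), phi, "B")
```

The `best_b` dictionary corresponds to the "B=1 / B=0" annotations in the table above and is what a later traceback would consult.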

Distributive Property

•  A useful property we exploited in variable elimination: [equation not recoverable from the transcript]
•  Under the same conditions, factor multiplication distributes over max, too: [equation not recoverable from the transcript]
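Both formulas on this slide are missing from the transcript. The standard pair, assuming X is not in the scope of ϕ1 and that factors are non-negative (needed for the max version), is sketched below as a reconstruction, not as the slide's exact wording.

```latex
\[
X \notin \mathrm{Scope}(\phi_1)
\;\Longrightarrow\;
\sum_{X} \left( \phi_1 \cdot \phi_2 \right) \;=\; \phi_1 \cdot \sum_{X} \phi_2
\]
\[
X \notin \mathrm{Scope}(\phi_1),\;\; \phi_1 \ge 0
\;\Longrightarrow\;
\max_{X} \left( \phi_1 \cdot \phi_2 \right) \;=\; \phi_1 \cdot \max_{X} \phi_2
\]
```

The second identity is what allows the marginalization steps of variable elimination to be swapped for maximization steps without changing the rest of the algorithm.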

Traceback

Input: Sequence of factors with associated variables: (ψZ1, …, ψZk)
Output: z*

•  Each ψZ is a factor with scope including Z and variables eliminated after Z.
•  Work backwards from i = k to 1:
   – Let zi = argmaxz ψZi(z, zi+1, zi+2, …, zk)
•  Return z* = (z1, …, zk)

About the Traceback

•  No extra (asymptotic) expense.
   – Linear traversal over the intermediate factors.
•  The factor operations for both sum-product VE and max-product VE can be generalized.
   – Example: get the K most likely assignments

Eliminating One Variable (Max-Product Version with Bookkeeping)

Input: Set of factors Φ, variable Z to eliminate
Output: new set of factors Ψ, bookkeeping factor ψ

1. Let Φ′ = {ϕ ∈ Φ | Z ∈ Scope(ϕ)}
2. Let Ψ = {ϕ ∈ Φ | Z ∉ Scope(ϕ)}
3. Let τ be maxZ ∏ϕ∈Φ′ ϕ
   – Let ψ be ∏ϕ∈Φ′ ϕ (bookkeeping)
4. Return Ψ ∪ {τ}, ψ

Variable Elimination (Max-Product Version with Decoding)

Input: Set of factors Φ, ordered list of variables Z to eliminate
Output: new factor

1. For each Zi ∈ Z (in order):
   – Let (Φ, ψZi) = Eliminate-One(Φ, Zi)
2. Return ∏ϕ∈Φ ϕ, Traceback({ψZi})
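Max-product elimination with bookkeeping and traceback can be sketched as below. The sketch assumes binary variables, assumes every variable is eliminated (so the final product is a single number), and uses a hypothetical chain A → B → C → D with made-up CPDs. All function names are illustrative.

```python
from itertools import product


def multiply(f, g, domain=(0, 1)):
    """Factor product; each factor is a (scope, table) pair."""
    (scope_f, table_f), (scope_g, table_g) = f, g
    scope = scope_f + tuple(v for v in scope_g if v not in scope_f)
    table = {}
    for values in product(domain, repeat=len(scope)):
        a = dict(zip(scope, values))
        table[values] = (table_f[tuple(a[v] for v in scope_f)]
                         * table_g[tuple(a[v] for v in scope_g)])
    return scope, table


def max_out(f, var):
    scope, table = f
    i = scope.index(var)
    out = {}
    for values, w in table.items():
        key = values[:i] + values[i + 1:]
        out[key] = max(out.get(key, float("-inf")), w)
    return scope[:i] + scope[i + 1:], out


def eliminate_one_max(factors, z):
    touching = [f for f in factors if z in f[0]]    # assumed non-empty
    rest = [f for f in factors if z not in f[0]]
    psi = touching[0]
    for f in touching[1:]:
        psi = multiply(psi, f)                      # bookkeeping factor
    tau = max_out(psi, z)
    return rest + [tau], psi


def traceback(bookkeeping, domain=(0, 1)):
    """bookkeeping is a list of (variable, psi) pairs in elimination order."""
    best = {}
    for z, (scope, table) in reversed(bookkeeping):
        def score(value):
            a = dict(best)
            a[z] = value
            return table[tuple(a[v] for v in scope)]
        best[z] = max(domain, key=score)            # later variables are fixed
    return best


def max_product_ve(factors, order):
    bookkeeping = []
    for z in order:
        factors, psi = eliminate_one_max(factors, z)
        bookkeeping.append((z, psi))
    value = 1.0
    for _, table in factors:                        # all scopes are empty now
        value *= table[()]
    return value, traceback(bookkeeping)


# Hypothetical chain A -> B -> C -> D; all numbers are made up.
chain = [
    (("A",), {(0,): 0.6, (1,): 0.4}),
    (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}),
    (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}),
    (("C", "D"), {(0, 0): 0.4, (0, 1): 0.6, (1, 0): 0.1, (1, 1): 0.9}),
]
map_value, map_assignment = max_product_ve(chain, ["A", "B", "C", "D"])
```

On a chain this is the Viterbi algorithm in disguise: the forward pass keeps the best score per value, and the traceback reads the stored factors from the last eliminated variable back to the first.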

Variable Elimination Tips

•  Any ordering will be correct.
•  Most orderings will be too expensive.
•  There are heuristics for choosing an ordering (you are welcome to find them and test them out).

(Rocket Science: True MAP)

•  Evidence: X = x
•  Query: Y
•  Other variables: Z = V \ X \ Y
•  First, marginalize out Z, then do MAP inference over Y given X = x
•  This is not usually attempted in NLP, with some exceptions.

Sketch of Gibbs Sampling

•  MCMC: design (on paper) a graph where each configuration from Val(V) is a node.
   – Transitions in the graph designed to give a Markov chain whose stationary distribution is the posterior.
•  Simulate a random walk in the graph.
•  If you walk long enough, your position is distributed according to P(V).

Transitions in Gibbs Sampling

•  A transition in the Markov chain equates to changing a subset of the random variables.
•  Gibbs: resample Vi's value according to P(Vi | V \ {Vi}).
   – Only need the local factors that affect Vi: take product, marginalize, and randomly choose new value.
•  Simply lock evidence variables X.
•  Maximizing version gradually shifts sampler in favor of most probable value for Vi.
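A Gibbs sampler along these lines can be sketched as below. The model is a hypothetical chain F → S → R with made-up CPDs and evidence R = 1 held fixed; the function names, burn-in length and number of sweeps are illustrative assumptions.

```python
import random


def conditional(factors, state, var, domain=(0, 1)):
    """P(var | all other variables), using only factors whose scope contains var."""
    weights = []
    for value in domain:
        trial = dict(state)
        trial[var] = value
        w = 1.0
        for scope, table in factors:
            if var in scope:                       # local factors only
                w *= table[tuple(trial[v] for v in scope)]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def gibbs_marginals(factors, state, free_vars, sweeps, rng,
                    burn_in=1000, domain=(0, 1)):
    """Estimate P(var = 1 | evidence) for each free variable."""
    state = dict(state)                            # evidence entries never change
    counts = {v: 0 for v in free_vars}
    for t in range(burn_in + sweeps):
        for var in free_vars:
            probs = conditional(factors, state, var, domain)
            state[var] = rng.choices(domain, weights=probs)[0]
        if t >= burn_in:
            for v in free_vars:
                counts[v] += state[v]
    return {v: c / sweeps for v, c in counts.items()}


# Hypothetical chain F -> S -> R with made-up CPDs.
factors = [
    (("F",), {(0,): 0.9, (1,): 0.1}),
    (("F", "S"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.9}),
    (("S", "R"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7}),
]
start = {"F": 0, "S": 0, "R": 1}                   # R = 1 is the locked evidence
estimate = gibbs_marginals(factors, start, ["F", "S"],
                           sweeps=20000, rng=random.Random(0))
```

For this toy model the exact answer is P(F = 1 | R = 1) = 0.064 / 0.262, so the estimate can be checked against it; in the models where Gibbs sampling is actually needed, no such exact reference is available.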

Sketch of Mean Field Variational Inference

•  Inference with our distribution P is hard.
•  Choose an "easier" distribution family, Q. Then find: [equation not recoverable from the transcript]
•  Usually iterative methods are required to "fit" Q to P.
   – These often resemble familiar learning algorithms like EM!

Energy Functional

[Equation not recoverable from the transcript.]

•  Expectations under simpler distribution family, Q.
   – Every element of Q is an approximate solution.
   – We try to find the best one.

Tangent: Variational Methods

•  This is a simple example.
•  For any λ and any x: [inequality not recoverable from the transcript]

[Figure: family of functions gλ(x).]

•  Further, for any x, there is some λ where the bound is tight.
   – λ is called a variational parameter.
•  For us, log P(X = x) is like −ln(x), and Q is like λ.
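The inequality on these slides is missing from the transcript. One standard bound that matches every bullet (a lower bound on −ln(x), a family gλ(x) indexed by λ, tightness for some λ at each x) is the tangent-line bound sketched below. It is offered as a plausible reconstruction under that assumption, not as the slide's confirmed content.

```latex
\[
-\ln(x) \;\ge\; g_\lambda(x) \;=\; -\lambda x + \ln(\lambda) + 1
\qquad \text{for all } \lambda > 0,\; x > 0
\]
\[
\text{Setting } \tfrac{\partial}{\partial \lambda}\, g_\lambda(x) = -x + \tfrac{1}{\lambda} = 0
\text{ gives } \lambda = \tfrac{1}{x}, \text{ where }
g_{1/x}(x) = -1 + \ln\!\left(\tfrac{1}{x}\right) + 1 = -\ln(x).
\]
```

Each gλ is a straight line in x lying below the convex curve −ln(x) and touching it at x = 1/λ, which is the sense in which a simple family, tuned through λ, bounds a harder function.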

Structured Variational Approach

•  Maximize the energy functional over a family Q that is well-defined.
   – A graphical model!
   – Probably not an I-map for P. (Bound isn't tight.)
•  Simpler structures lead to easier inference.
   – Mean field is the simplest: [equation not recoverable from the transcript]

Parting Shots

•  You will probably never implement the general variable elimination algorithm.
•  You will rarely use exact inference.
•  There is value in understanding the problem that approximation methods are trying to solve, and what an exact (if intractable) solution would look like!