
Page 1: Probabilistic Graphical Models (lions.epfl.ch/wp-content/uploads/2019/01/GM_Belief...)

Probabilistic Graphical Models

Lecture 2: Graphical Models. Belief Propagation

Volkan Cevher, Matthias Seeger
Ecole Polytechnique Fédérale de Lausanne

30/9/2011

(EPFL) Graphical Models 30/9/2011 1 / 30


Outline

1 Graphical Models

2 Belief Propagation

(EPFL) Graphical Models 30/9/2011 2 / 30


Graphical Models

Literature

Excellent book about graphical models and belief propagation, written by one of the pioneers in these topics:

Pearl, J.: Probabilistic Reasoning in Intelligent Systems (1990)

(EPFL) Graphical Models 30/9/2011 3 / 30


Graphical Models

The Need to Factorize

Variables x1, x2, . . . , xn

P(x1) = Σ_{x2} · · · Σ_{xn} P(x1, x2, . . . , xn)

Marginalization: Exponential time. Storage: Exponential space ⇒ Need factorization

Independence? But probabilistic modelling is about dependencies!

Conditional independence: Dependencies may have simple structure

(EPFL) Graphical Models 30/9/2011 4 / 30
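The cost is easy to see in code. A minimal sketch (toy uniform joint, purely illustrative, not from the slides): storing the joint over n binary variables as a flat table and marginalizing out x2, . . . , xn both scale as 2^n.

```python
import itertools

n = 10  # 2**10 = 1024 joint entries; n = 50 would already be infeasible
# Hypothetical joint table: uniform, purely for illustration.
P = {x: 1.0 / 2**n for x in itertools.product((0, 1), repeat=n)}

# P(x1) = sum over x2, ..., xn of P(x1, x2, ..., xn): touches every entry
P_x1 = {v: sum(p for x, p in P.items() if x[0] == v) for v in (0, 1)}
print(P_x1[0], P_x1[1])  # 0.5 0.5
```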



Graphical Models

Towards Bayesian Networks: Tracking a fly

Path pretty random. Positions not independent. But conditionally independent (Markovian)

Remember:

P(x1, . . . , xn) = P(x1) P(x2|x1) · · · P(xn|xn−1, . . . , x1) ?

Here: P(xn|xn−1, . . . , x1) = P(xn|xn−1) ⇒ Linear storage
Causal factorization ⇒ Bayesian networks

(EPFL) Graphical Models 30/9/2011 5 / 30
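The Markov assumption's linear storage can be sketched (random toy chain; sizes and names are illustrative assumptions, not from the slides):

```python
import itertools
import numpy as np

K, n = 3, 5  # K states per position, n positions: the full joint has K**n entries
rng = np.random.default_rng(0)
p1 = rng.dirichlet(np.ones(K))                                  # P(x1)
A = [rng.dirichlet(np.ones(K), size=K) for _ in range(n - 1)]   # A[t][i, j] = P(next = j | current = i)

def joint(x):
    """P(x1, ..., xn) via the causal (Markov) factorization."""
    p = p1[x[0]]
    for t in range(n - 1):
        p *= A[t][x[t], x[t + 1]]
    return p

# The O(n K^2) tables define a valid joint: it sums to 1 over all K**n states.
total = sum(joint(x) for x in itertools.product(range(K), repeat=n))
print(round(total, 10))
```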


Graphical Models

Bayesian Networks (Directed Graphical Models)
Causal factorization:

P(x1, . . . , xn) = ∏_{i=1}^{n} P(xi | xπi)

Bayesian network (aka directed graphical model, aka causal network):

Graphical representation of ancestry [DAG]. P(xi | xπi): Conditional probability table (CPT)

Conditional Independence: A ⊥ B | C ⇔ P(A, B | C) = P(A | C) P(B | C) ⇔ P(A | C, B) = P(A | C)

(EPFL) Graphical Models 30/9/2011 6 / 30
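The causal factorization and the CI definition above can be checked numerically on a tiny chain X → Y → Z (hypothetical CPT values, chosen only for illustration):

```python
import itertools

# Hypothetical CPTs for a chain X -> Y -> Z
P_X = {0: 0.3, 1: 0.7}
P_Y_given_X = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
P_Z_given_Y = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}

# Causal factorization: P(x, y, z) = P(x) P(y | x) P(z | y)
P = {(x, y, z): P_X[x] * P_Y_given_X[x][y] * P_Z_given_Y[y][z]
     for x, y, z in itertools.product((0, 1), repeat=3)}

def cond_X(x, y, z):  # P(X = x | Y = y, Z = z)
    return P[(x, y, z)] / sum(P[(xx, y, z)] for xx in (0, 1))

# P(X | Y, Z) does not depend on Z, i.e. X is independent of Z given Y,
# matching the slide's definition P(A | C, B) = P(A | C).
for x, y in itertools.product((0, 1), repeat=2):
    assert abs(cond_X(x, y, 0) - cond_X(x, y, 1)) < 1e-12
print("chain X -> Y -> Z: X independent of Z given Y")
```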


Graphical Models

Did It Rain Tonight?

(EPFL) Graphical Models 30/9/2011 7 / 30


Graphical Models

Monty Hall Problem

Let’s make a deal!
Door with car (hidden)
First choice of yours (remains closed)
Host opens door with goat, H ≠ F, D
Do you switch?

“Intuition”: Fifty-fifty. F, H give no information. He would be stupid, wouldn’t he?

(EPFL) Graphical Models 30/9/2011 8 / 30


Graphical Models

Winning with Bayes (I)

[Figure: Bayesian network with nodes D (door with car), F (first choice), H (host opens), and I (D = F?)]

Intuition “H does not tell anything” correct in principle. But about what?
Add latent I = I{D=F} = I{first choice correct}

Gut feeling: F, H give no information about I. “He will not tell me whether I am correct.” P(I|F, H) = P(I). Will use Bayes to see that.

OK, but P(Switch wins) = P(I = 0 | F, H) = P(I = 0) = 2/3!

Bayes makes you switch and double your chance of winning!

(EPFL) Graphical Models 30/9/2011 9 / 30


Graphical Models

Winning with Bayes (II)

[Figure: Bayesian network with nodes D (door with car), F (first choice), H (host opens), and I (D = F?)]

To show: P(I|H, F) = P(I). P(I|F) = P(I), because D, F independent.

P(I|H, F) = P(I|F) ⇔ I ⊥ H | F ⇔ H ⊥ I | F ⇔ P(H|I, F) = P(H|F) (independence is symmetric)

P(H|F, I = 1) = (1/2) I{H ≠ F}: if F = D, host picks a random goat

P(H|F, I = 0) = (1/2) I{H ≠ F}: D, F independent, and H ≠ D, F

Working with Graphical Models: Intermediate between lots of head scratching and doing all sums. Powerful division of inference into manageable, local steps

(EPFL) Graphical Models 30/9/2011 10 / 30
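The slide's conclusion can be verified by brute-force enumeration. A sketch (exact arithmetic via fractions; assumes, as the slides do, that D and F are independent and uniform and the host picks uniformly among allowed goat doors):

```python
import itertools
from fractions import Fraction

doors = (1, 2, 3)
post, norm = {}, {}  # (F, H) -> P(I = 0, F, H) and P(F, H), unnormalized
for D, F in itertools.product(doors, doors):
    p_df = Fraction(1, 9)  # D, F independent and uniform
    goats = [h for h in doors if h != D and h != F]
    for H in goats:
        p = p_df / len(goats)  # host uniform among doors != D, F
        norm[(F, H)] = norm.get((F, H), Fraction(0)) + p
        if D != F:  # I = 0: first choice wrong, so switching wins
            post[(F, H)] = post.get((F, H), Fraction(0)) + p

# P(switch wins | F, H) = P(I = 0 | F, H) = 2/3 for every observed (F, H)
assert all(post[k] / norm[k] == Fraction(2, 3) for k in norm)
print("P(switch wins | F, H) = 2/3 for every (F, H)")
```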


Graphical Models

Why Graphical Models?

1 Easy way of communicating ideas about dependencies, models
2 Precise semantics: Conditional independence constraints on distributions. Efficient algorithms for testing these
3 Lead to large savings in computations (belief propagation)

(EPFL) Graphical Models 30/9/2011 11 / 30


Graphical Models

Graphical Models in Practice
Dependency structures, and efficient ways to propagate information or constraints, are fundamental.

Coding / Information Theory: LDPC codes and BP decoding revolutionized this field (resurrection of Gallager codes). Used from deep space communication (Mars rovers) over satellite transmission to CD players / hard drives

Courtesy MacKay: Information Theory . . . (2003)

(EPFL) Graphical Models 30/9/2011 12 / 30


Graphical Models

Graphical Models in Practice
Dependency structures, and efficient ways to propagate information or constraints, are fundamental.

Expert systems done right:
QMR-DT: Invert causal network for helping medical diagnoses
Hugin: Advanced decision support (Lauritzen)

http://www.hugin.com/

Promedas: Medical diagnostic advisory system (SNN Nijmegen)

http://www.promedas.nl/

(EPFL) Graphical Models 30/9/2011 12 / 30


Graphical Models

Graphical Models in Practice
Dependency structures, and efficient ways to propagate information or constraints, are fundamental.

Computer Vision: Markov Random Fields

Denoising, super-resolution, restoration (early work by Besag). Depth / reconstruction from stereo, matching, correspondences. Segmentation, matting, blending, stitching, inpainting, . . .

Courtesy MSR

(EPFL) Graphical Models 30/9/2011 12 / 30


Graphical Models

Conditional Independence Semantics

Graphical model formally equivalent to long (finite) list of conditional independence constraints: xA1 ⊥ xB1 | xC1, xA2 ⊥ xB2 | xC2, . . . Which do you prefer?

Graphs not just simpler for us: Linear-time algorithm to test such constraints (Bayes ball)
Distribution consistent with graph iff all CI constraints are met. P(x1) P(x2) . . . P(xn): Consistent with all graphs
How do I see whether xA ⊥ xB | xC from the graph?
Graph separation: If paths A ↔ B blocked by C
For Bayesian networks (directed graphical models): d-separation ⇒ You’ll find out in the exercises!

(EPFL) Graphical Models 30/9/2011 13 / 30


Graphical Models

Undirected Graphical Models (Markov Random Fields)

Bayesian Networks: Describe CIs with directed graphs (DAGs)
Markov Random Fields: Describe CIs with undirected graphs

CI semantics of undirected models: Really just graph separation

(EPFL) Graphical Models 30/9/2011 14 / 30


Graphical Models

Undirected Graphical Models (II)

Why two frameworks? Each can capture setups the other cannot. More important: In practice, some problems are much easier to parameterize (therefore: to learn) as MRFs, others much easier as Bayes nets

What do distributions P for an MRF graph G look like? Hammersley / Clifford:

Maximal cliques (completely connected parts) Cj of G
P(x) consistent with MRF G ⇔

P(x) = Z⁻¹ ∏_j φj(xCj),  Z := Σ_x ∏_j φj(xCj)

with potentials φj(xCj) ≥ 0. Z: Partition function. Potentials need not normalize to 1

(EPFL) Graphical Models 30/9/2011 15 / 30
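The Hammersley / Clifford form is concrete on a tiny example. A sketch for a 3-node MRF x1 − x2 − x3 (hypothetical potential values; any nonnegative tables would do), showing that dividing by Z turns the unnormalized clique products into a distribution:

```python
import itertools
import math

# Pairwise potentials on the maximal cliques {x1, x2} and {x2, x3}
# (hypothetical values; they favor agreement and need not normalize to 1).
def phi12(a, b): return math.exp(2.0 if a == b else 0.0)
def phi23(b, c): return math.exp(1.0 if b == c else 0.0)

states = (0, 1)
Z = sum(phi12(a, b) * phi23(b, c)
        for a, b, c in itertools.product(states, repeat=3))  # partition function

def P(a, b, c):
    return phi12(a, b) * phi23(b, c) / Z

total = sum(P(*x) for x in itertools.product(states, repeat=3))
print(round(total, 6))  # 1.0
```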


Graphical Models

Undirected Graphical Models (III)

(EPFL) Graphical Models 30/9/2011 16 / 30


Graphical Models

Directed vs. Undirected

Sampling x ∼ P(x): Always simple from a Bayes net. Can be very hard for an MRF

Implicit, symmetrical knowledge? Little idea about causal links (pixels of image, correspondences)? MRFs more useful then
Bottom line: Usually, one or the other is much more suitable. Better know both well!

(EPFL) Graphical Models 30/9/2011 17 / 30


Belief Propagation

Towards Efficient Marginalization

With sufficient Markovian CI constraints (directed or undirected):

P(x1, . . . , xn) ∝ ∏_j φj(xNj),  |Nj| ≪ n

Can store that. But what about computation?
Short answer: It depends on global graph structure properties, beyond local factorization

Storage: Linear in n
Computation: Exponential in n^{1/2} [P ≠ NP]

(EPFL) Graphical Models 30/9/2011 18 / 30


Belief Propagation

Node Elimination

Chain: P(x1, . . . , x7) = φ1(x1, x2) φ2(x2, x3) · · · φ6(x6, x7)

Σ_{x4} P(x1, . . . , x4, . . . , x7)
= φ1(x1, x2) φ2(x2, x3) ( Σ_{x4} φ3(x3, x4) φ4(x4, x5) ) φ5(x5, x6) φ6(x6, x7)
= φ1(x1, x2) φ2(x2, x3) M35(x3, x5) φ5(x5, x6) φ6(x6, x7)

(EPFL) Graphical Models 30/9/2011 19 / 30
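The eliminated factor M35 is just a matrix product, which a short check confirms (toy random potentials with K states per variable; an assumed illustration, not the slides' code):

```python
import numpy as np

K = 4
rng = np.random.default_rng(1)
phi = [rng.random((K, K)) + 0.1 for _ in range(6)]  # phi[i]: potential on (x_{i+1}, x_{i+2})

# Eliminate x4: M35(x3, x5) = sum_{x4} phi3(x3, x4) * phi4(x4, x5),
# which is exactly a K x K matrix product -- K**3 work instead of K**7.
M35 = phi[2] @ phi[3]

def explicit(x3, x5):
    return sum(phi[2][x3, x4] * phi[3][x4, x5] for x4 in range(K))

assert all(np.isclose(M35[i, j], explicit(i, j)) for i in range(K) for j in range(K))
print("sum over x4 done as a matrix product")
```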


Belief Propagation

Tree Graphs

(EPFL) Graphical Models 30/9/2011 20 / 30


Belief Propagation

Factor Graphs

Factor graphs: Yet another type of graphical model

Bipartite graph: variable / factor nodes
No probability semantics
Just for deriving Markovian propagation algorithms
Factor graph = tree ⇒ Fast computation

Undirected GM → Factor graph: Immediate
Directed GM → Factor graph: Easy exercise

(EPFL) Graphical Models 30/9/2011 21 / 30


Belief Propagation

Towards Belief Propagation

(EPFL) Graphical Models 30/9/2011 22 / 30


Belief Propagation

What is a Message?

(EPFL) Graphical Models 30/9/2011 23 / 30


Belief Propagation

What is a Message?

Formally: Directed potential over one variable
Intuition: Message T2 → a: What T2 thinks xa should be
Naive “definition”:

Product: All of T2, and edge → a
Sum: All except xa

⇒ Real definition recursive (G tree!)

Subtle points:
Messages: Not conditional / marginal distributions of P. Message µT2→a(xa) has seen T2 only
Strictly speaking: Two types of messages: variable → factor and factor → variable
⇒ Understand the idea behind the formalities

(EPFL) Graphical Models 30/9/2011 24 / 30


Belief Propagation

Message Passing: The RecipeMessage µT!a(xa) to a

1 Expand factor: (T1, b1),(T2, b2)

2 Product: Gather potentials�j(xa, xb1 , xb2)All µ?!b1(xb1), exceptb1 �jµ?!b2(xb2) dito

3 Sum: Over xb1 , xb2

µT!a(xa) /X

xb1 ,xb2

�j(xab1b2)

0

@Y

T̃ :T1\b1

µT̃!b1(xb1)

1

A

0

@Y

T̃ :T2\b2

µT̃!b2(xb2)

1

A

(EPFL) Graphical Models 30/9/2011 25 / 30
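The product-then-sum recipe above can be sketched in code. This is a minimal illustration, not the lecture's implementation: the function name, the `incoming` dictionary, and the tensor layout (one array axis per variable in the factor) are assumptions made for the example.

```python
import numpy as np

def factor_to_variable_message(phi, incoming, target_axis):
    """Sum-product message from a factor to one of its variables.

    phi         : factor table over (x_a, x_b1, x_b2, ...), one axis per variable
    incoming    : dict {axis: message vector}, the product of all messages into
                  each neighbouring variable b, EXCEPT the one from this factor
    target_axis : axis of phi corresponding to the target variable x_a
    """
    msg = phi.astype(float)
    # Product step: multiply each incoming message in along its own axis
    for axis, m in incoming.items():
        shape = [1] * msg.ndim
        shape[axis] = m.size
        msg = msg * np.asarray(m, dtype=float).reshape(shape)
    # Sum step: marginalize out every variable except the target
    sum_axes = tuple(ax for ax in range(msg.ndim) if ax != target_axis)
    msg = msg.sum(axis=sum_axes)
    return msg / msg.sum()   # messages may be normalized up to a constant
```

Note the per-axis reshape: it lets NumPy broadcasting do the "product over all incoming messages" without materializing the full joint table more than once.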

Belief Propagation

Message Passing: The Idea

I can never remember these message passing equations. What I remember:

  Messages are partial information, given part of the graph
  Message passing is information propagation
  Product: Predict. Sum: Marginalize (cover your tracks)
  Directed, like filtering ⇒ we'll see cases where ∏ is not ∏, and Σ is not Σ
  Marginal distributions (our goal!) are obtained by combining messages ↔ combining information from all parts
  MP works on trees, because information cannot go around in cycles

(EPFL) Graphical Models 30/9/2011 26 / 30

Belief Propagation

Belief Propagation: More than Node Elimination

Marginalization by message passing:

P(x_a) = Σ_{x \ x_a} P(x) ∝ φ_a(x_a) ∏_{j ∈ N_a} µ_{T_j→a}(x_a)

  N_a: factor nodes neighbouring a ↔ factors φ_j(x_a, ...)
  φ_a: can be ≡ 1

All marginals P(x1), P(x2), ...? Do this n times. Right?
⇒ NO! Do this twice only!
⇒ If you understand that, you've understood belief propagation

Belief Propagation on Trees
  A message is uniquely defined, independent of its use or the order of computation
  A message can be computed once all its inputs have been received. Once computed, it does not change anymore
  Compute all messages (2 per edge) ⇒ all marginals, O(1) each

(EPFL) Graphical Models 30/9/2011 27 / 30
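The "do this twice only" claim is easiest to see on a chain, where the two sweeps are one forward and one backward pass, and every node marginal then costs O(1). A minimal sketch under assumed conventions (`unaries[i]` is a vector over x_i, `pairwise[i]` a matrix over (x_i, x_{i+1}); the function name is illustrative):

```python
import numpy as np

def chain_marginals(unaries, pairwise):
    """All node marginals of P(x) ∝ ∏_i u_i(x_i) ∏_i ψ_i(x_i, x_{i+1})
    on a chain, via one inward and one outward pass (2 messages per edge)."""
    n = len(unaries)
    fwd = [None] * n          # message into node i from the left
    bwd = [None] * n          # message into node i from the right
    fwd[0] = np.ones_like(unaries[0])
    for i in range(1, n):     # inward pass: leaves → root (left → right)
        m = (unaries[i - 1] * fwd[i - 1]) @ pairwise[i - 1]
        fwd[i] = m / m.sum()  # normalize to avoid under/overflow
    bwd[n - 1] = np.ones_like(unaries[-1])
    for i in range(n - 2, -1, -1):  # outward pass: root → leaves
        m = pairwise[i] @ (unaries[i + 1] * bwd[i + 1])
        bwd[i] = m / m.sum()
    marginals = []
    for i in range(n):        # combine messages from both sides, O(1) per node
        p = unaries[i] * fwd[i] * bwd[i]
        marginals.append(p / p.sum())
    return marginals
```

Running this twice-only schedule instead of n separate eliminations is exactly the saving the slide emphasizes.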

Belief Propagation

Implementation of Belief Propagation

Belief Propagation (Sum-Product) on Trees
1. Designate a node (any will do!) as root
2. Inward pass: Compute messages leaves → root
3. Outward pass: Compute messages root → leaves

Messages can be normalized at will:

µ_{T→a}(x_a) = C Σ_{x_{C_j} \ a} φ_j(x_{C_j}) ∏ µ_{T̃→b1}(x_{b1}) · · ·

Avoiding underflow / overflow (yes, it does matter):
  Renormalize each message to sum to 1
  Better: Work in the log domain (log-messages, log-potentials): ∏ → +, Σ → logsumexp [careful with zeros!]

logsumexp(v) := log Σ_{i=1}^{k} e^{v_i} = M + log Σ_{i=1}^{k} e^{v_i − M}   (numerically stable),   M = max_i v_i

(EPFL) Graphical Models 30/9/2011 28 / 30
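The logsumexp identity above translates directly into code; the "careful with zeros!" warning corresponds to entries equal to log 0 = −∞, which this sketch handles explicitly:

```python
import numpy as np

def logsumexp(v):
    """Numerically stable log(sum(exp(v))): shift by the maximum M so the
    largest exponent is 0 and nothing overflows."""
    v = np.asarray(v, dtype=float)
    M = np.max(v)
    if np.isneginf(M):        # every entry is log(0): the sum is 0
        return -np.inf
    return M + np.log(np.sum(np.exp(v - M)))
```

With this, each log-domain message update becomes additions followed by one logsumexp per summed-out variable, and no renormalization is needed for stability.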

Belief Propagation

Searching for the Mode: Max-Product

Decoding:

x* ∈ argmax_x P(x)

(max, ∏): Same decomposition as (Σ, ∏). Better: (max, Σ) in the log domain.

Max-messages:

µ_{T→a}(x_a) = max_{x_{C_j} \ a} ( log φ_j(x_{C_j}) + Σ_{b ∈ C_j \ a} µ_{T_b→b}(x_b) )

Back-pointer tables:

δ_{T→a}(x_a) ∈ argmax_{x_{C_j} \ a} ( log φ_j(x_{C_j}) + Σ_{b ∈ C_j \ a} µ_{T_b→b}(x_b) )

(EPFL) Graphical Models 30/9/2011 29 / 30
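On a chain, max-sum with back-pointers is the familiar Viterbi recursion: a forward pass of max-messages, then backtracking through the δ tables. A sketch under assumed conventions (`log_unaries[i]` over x_i, `log_pairwise[i]` over (x_i, x_{i+1}); the function name is illustrative):

```python
import numpy as np

def chain_mode(log_unaries, log_pairwise):
    """Max-sum decoding of x* = argmax_x [ Σ_i log u_i(x_i) + Σ_i log ψ_i(x_i, x_{i+1}) ]
    on a chain: forward max-messages plus back-pointer tables, then backtrack."""
    n = len(log_unaries)
    mu = log_unaries[0].copy()                     # max-message into node 0
    back = []                                      # back-pointer tables δ
    for i in range(1, n):
        scores = mu[:, None] + log_pairwise[i - 1]   # indexed (x_{i-1}, x_i)
        back.append(np.argmax(scores, axis=0))       # best predecessor per x_i
        mu = log_unaries[i] + np.max(scores, axis=0)
    x = [int(np.argmax(mu))]                       # best final state
    for bp in reversed(back):                      # follow back-pointers
        x.append(int(bp[x[-1]]))
    return list(reversed(x)), float(np.max(mu))
```

The same two ingredients (max instead of sum, argmax tables stored on the way in) generalize to arbitrary trees.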

Belief Propagation

Wrap-Up

Belief propagation (sum-product) on trees:
  All marginals in linear time, by local information propagation
  Max-product, max-sum, logsumexp-sum, ...: What matters is the graph!

What about general graphs?
  Decomposable graphs. Treewidth of a graph
  Junction tree algorithm

Interested?
  PMR Edinburgh slides: http://www.inf.ed.ac.uk/teaching/courses/pmr/slides/jta-2x2.pdf
  Lauritzen, S.; Spiegelhalter, D. Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems. JRSS-B, 50: 157–224 (1988)

Beware (not surprising): Inference on general graphs is NP-hard.
In general, approximations are a must

(EPFL) Graphical Models 30/9/2011 30 / 30