Download - Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Stat 375: Inference in Graphical ModelsLectures 7-8-9

Andrea Montanari

Stanford University

May 6, 2012

Andrea Montanari (Stanford) Stat375: Lecture 7, 8, 9 May 6, 2012 1 / 83

Page 2: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Variational methods

Idea

I know a lot about (convex) optimization. . .

Page 3: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Variational methods

Idea

I know a lot about (convex) optimization. . .

Page 4: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Outline

1 Gibbs Variational Principle

2 Naive Mean Field

3 Bethe Free Energy

4 Region-Based Approximation

5 Tree-based Convexi�cations

Page 5: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Undirected Pairwise Graphical Model

x1

x2 x3 x4

x5x6

x7x8x9x10

x11x12

G = (V ;E), V = [n ], x = (x1; : : : ; xn), xi 2 X , jX j <1

�(x ) =1

Z

Y(ij )2E

ij (xi ; xj ) :

Page 6: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Notation

`Actual' probability �! �

`Trial' probability (`belief') �! b

Page 7: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Gibbs Variational Principle

Page 8: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

I want to compute

� � logZ

Page 9: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Gibbs Free Energy

G : XV ! R ;

b 7! G (b) :

Proposition

G is strictly concave, and achieves its unique maximum at b = �.

Further

G (b = �) = � :

Page 10: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Gibbs Free Energy

G : XV ! R ;

b 7! G (b) :

Proposition

G is strictly concave, and achieves its unique maximum at b = �.

Further

G (b = �) = � :

Page 11: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Gibbs Free Energy

�(x ) =1

Z

Y(ij )2E

ij (xi ; xj ) �1

Z tot(x ) :

De�nition

G(b) = Eb log tot(x ) +H (b)

=X

(ij )2E

Xxi ;xj2X

b(xi ; xj ) log ij (xi ; xj )�Xx2XV

b(x ) log b(x )

Page 12: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Gibbs Free Energy

�(x ) =1

Z

Y(ij )2E

ij (xi ; xj ) �1

Z tot(x ) :

De�nition

=X

(ij )2E

Xxi ;xj2X

b(x ) log b(x )

Page 13: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Gibbs Free Energy

�(x ) =1

Z

Y(ij )2E

ij (xi ; xj ) �1

Z tot(x ) :

De�nition

=X

(ij )2E

Xxi ;xj2X

b(x ) log b(x )

Page 14: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proof

1. De�ne the Lagrangian

L(b; �) = G(b)� �n Xx2XV

b(x )� 1o

and di�erentiate

2. Observe that

z 7! z log z is convex on R+

Page 15: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proof

1. De�ne the Lagrangian

L(b; �) = G(b)� �n Xx2XV

b(x )� 1o

and di�erentiate

2. Observe that

z 7! z log z is convex on R+

Page 16: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Naive Mean Field

Page 17: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Good news/Bad news

Counting = Convex Optimization

M(XV ) is jX jV � 1 dimensional.

Page 18: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Good news/Bad news

Page 19: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Good news/Bad news

Page 20: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Idea

S

�b�

Maximize Gibbs free energy on a low-dim subset

� � supb2S

G (b) :

Page 21: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Naive Mean Field Idea

S =nb 2 M(X n) : b = b1 � b2 � � � � � bn

o;

Abuse b � fbigi2V

FMF(b = fbigi2V ) = G (b1 � � � � � bn)

Page 22: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

S =nb 2 M(X n) : b = b1 � b2 � � � � � bn

o;

Abuse b � fbigi2V

FMF(b = fbigi2V ) = G (b1 � � � � � bn)

Page 23: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

S =nb 2 M(X n) : b = b1 � b2 � � � � � bn

o;

Abuse b � fbigi2V

FMF(b = fbigi2V ) = G (b1 � � � � � bn)

Page 24: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Explicitly

FMF(b) =X

(i ;j )2E

Ebi�bj log ij (xi ; xj ) +H (b1 � � � � � bn)

=X

(i ;j )2E

Xxi ;xj

bi (xi )bj (xj ) log ij (xi ; xj )�Xxi

bi (xi ) log bi (xi )

Problem: Not convex.

Page 25: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Explicitly

FMF(b) =X

(i ;j )2E

=X

(i ;j )2E

Xxi ;xj

Page 26: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Explicitly

FMF(b) =X

(i ;j )2E

=X

(i ;j )2E

Xxi ;xj

Page 27: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Stationarity condition

Lagrangian

L(b; �) = FMF(b) +Xi2V

�i

n Xxi2X

bi (xi )� 1o;

@L

@bi (xi )(b; �) =

Xj2@i

Xxj2X

bj (xj ) log ij (xi ; xj )� 1� log bi (xi ) + �i = 0 ;

bi (xi ) �= expn Xj2@i

Xxj

log ij (xi ; xj )bj (xj )o� FMF(b)i (xi ) ;

[Naive Mean Field Equations]

Typically a �xed point is searched by iteration: b(t+1) = FMF(b(t))

Page 28: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

�i

n Xxi2X

bi (xi )� 1o;

@L

@bi (xi )(b; �) =

Xj2@i

Xxj2X

Xxj

Page 29: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

�i

n Xxi2X

bi (xi )� 1o;

@L

@bi (xi )(b; �) =

Xj2@i

Xxj2X

Xxj

Page 30: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

�i

n Xxi2X

bi (xi )� 1o;

@L

@bi (xi )(b; �) =

Xj2@i

Xxj2X

Xxj

Page 31: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

�i

n Xxi2X

bi (xi )� 1o;

@L

@bi (xi )(b; �) =

Xj2@i

Xxj2X

Xxj

Page 32: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

�i

n Xxi2X

bi (xi )� 1o;

@L

@bi (xi )(b; �) =

Xj2@i

Xxj2X

Xxj

Page 33: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Another point of view

Exercise : For ij (xi ; xj ) = e�ij (xi ;xj )

�i (xi ) �= E�

8<: e

Pj2@i

�ij (xi ;Xj )

Px 0ieP

j2@i�ij (x

0i;Xj )

9=;

If we could move the expectation to exponents

�i (xi ) �= eE�

Pj2@i

�ij (xi ;Xj )

[= Naive Mean Field]

Page 34: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Another point of view

Exercise : For ij (xi ; xj ) = e�ij (xi ;xj )

�i (xi ) �= E�

8<: e

Pj2@i

�ij (xi ;Xj )

Px 0ieP

j2@i�ij (x

0i;Xj )

9=;

If we could move the expectation to exponents

�i (xi ) �= eE�

Pj2@i

�ij (xi ;Xj )

[= Naive Mean Field]

Page 35: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Bethe Free Energy

Page 36: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Problem

One dimensional marginals give a very poor approximation.

Example: x1; x2 2 f0; 1g

�(x ) =1

2I(x1 � x2 = 0)

Would like to account exactly for the correlations induced by edges.

Page 37: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Problem

One dimensional marginals give a very poor approximation.

Example: x1; x2 2 f0; 1g

�(x ) =1

2I(x1 � x2 = 0)

Would like to account exactly for the correlations induced by edges.

Page 38: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Would like

F : M(X � X )E �M(X )V ! Rb = fbij ; big(i ;j )2E ;i2V 7! F(b)

bij = bij (xi ; xj ), bi = bi (xi ),

such that

argmaxb

F(b) � � ;

maxb

F(b) � � ;

Page 39: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

What is really the domain?

x1 x2 2 f0; 1g

Example:

b1 =

"0:10:9

#; b2 =

"0:90:1

#; b1 =

"0:4 0:10:1 0:4

#:

Page 40: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Natural guess

b 2 MARG(G)

MARG(G) =nb = fbi ; bij g : marginals of a distribution on XV

o;

bi (xi ) =XxV ni

p(x ) ;

bi ;j (xi ; xj ) =X

xV nfi;jg

p(x ) ;

p 2 M(XV ) :

Page 41: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Natural guess

b 2 MARG(G)

o;

bi (xi ) =XxV ni

p(x ) ;

bi ;j (xi ; xj ) =X

xV nfi;jg

p(x ) ;

p 2 M(XV ) :

Page 42: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Natural guess

b 2 MARG(G)

o;

bi (xi ) =XxV ni

p(x ) ;

bi ;j (xi ; xj ) =X

xV nfi;jg

p(x ) ;

p 2 M(XV ) :

Page 43: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Bad news

In general checking b 2 MARG(G) in NP-hard.

Page 44: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Second attempt

b 2 LOC(G)

LOC(G) =nb = fbi ; bij g : locally consistent marginals

o;

nfbi ; bij g :

Xxj

bij (xi ; xj ) = bi (xi );

Xxi

bi (xi ) = 1o

Page 45: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Second attempt

b 2 LOC(G)

o;

nfbi ; bij g :

Xxj

Xxi

bi (xi ) = 1o

Page 46: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Second attempt

b 2 LOC(G)

o;

nfbi ; bij g :

Xxj

Xxi

bi (xi ) = 1o

Page 47: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Geometric picture

LOC(G)

MARG(G)

Polytopes

Page 48: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Bethe Free Energy

F : LOC(G)! R

Intuition

F(b) � Eb log tot(x ) +H (b)

� Energy + Entropy :

Page 49: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Bethe Free Energy

F : LOC(G)! R

Intuition

Page 50: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Bethe Free Energy

F : LOC(G)! R

Intuition

Page 51: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Energy

Eb log tot(x ) =X

(i ;j )2E

Eb log ij (xi ; xj )

=X

(i ;j )2E

Ebij log ij (xi ; xj )

=X

(i ;j )2E

Xxi ;xj

bij (xi ; xj ) log ij (xi ; xj )

Page 52: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Entropy?

Page 53: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Entropy?

Idea: Consider G a tree.

Proposition

If G is a tree, then

�(x ) =Y

(i ;j )2E

�i ;j (xi ; xj )

�i (xi )�j (xj )

Yi2V

�i (xi ) :

Corollary

H (�) =Xi2V

H (�i )�X

(i ;j )2E

I (�ij ) :

Page 54: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Entropy?

Idea: Consider G a tree.

Proposition

�(x ) =Y

(i ;j )2E

�i ;j (xi ; xj )

�i (xi )�j (xj )

Yi2V

�i (xi ) :

Corollary

H (�) =Xi2V

H (�i )�X

(i ;j )2E

I (�ij ) :

Page 55: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Mutual Information

I (�1;2) =Xx1;x2

�1;2(x1; x2) log�1;2(x1; x2)

�1(x1)�2(x2)

= H (�1) +H (�2)�H (�1;2) :

Page 56: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proof of the proposition

By induction over n .

n = 1: Trivial

Page 57: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

By induction over n .

n = 1: Trivial

Page 58: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

True for n , V = [n ], new vertex i = n + 1, connected to j = n

�(xV ; xn) = �(xV )�(xn+1jxV )

= �(xV )�(xn+1jxn) [Markov]

= �(xV )�(xn ; xn+1)

�(xn)�(xn+1)�(xn+1) [Bayes]

=Y

(i ;j )6=(n ;n+1)

�i ;j (xi ; xj )

�i (xi )�j (xj )

Yi2V

�i (xi )�(xn ; xn+1)

�(xn)�(xn+1)�(xn+1)

QED

Page 59: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

�(xV ; xn) = �(xV )�(xn+1jxV )

= �(xV )�(xn ; xn+1)

=Y

(i ;j )6=(n ;n+1)

�i ;j (xi ; xj )

�i (xi )�j (xj )

Yi2V

�i (xi )�(xn ; xn+1)

�(xn)�(xn+1)�(xn+1)

QED

Page 60: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

�(xV ; xn) = �(xV )�(xn+1jxV )

= �(xV )�(xn ; xn+1)

=Y

(i ;j )6=(n ;n+1)

�i ;j (xi ; xj )

�i (xi )�j (xj )

Yi2V

�i (xi )�(xn ; xn+1)

�(xn)�(xn+1)�(xn+1)

QED

Page 61: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

�(xV ; xn) = �(xV )�(xn+1jxV )

= �(xV )�(xn ; xn+1)

=Y

(i ;j )6=(n ;n+1)

�i ;j (xi ; xj )

�i (xi )�j (xj )

Yi2V

�i (xi )�(xn ; xn+1)

�(xn)�(xn+1)�(xn+1)

QED

Page 62: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

�(xV ; xn) = �(xV )�(xn+1jxV )

= �(xV )�(xn ; xn+1)

=Y

(i ;j )6=(n ;n+1)

�i ;j (xi ; xj )

�i (xi )�j (xj )

Yi2V

�i (xi )�(xn ; xn+1)

�(xn)�(xn+1)�(xn+1)

QED

Page 63: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Let's just export it

HBethe : LOC(G)! R :

HBethe(b) =Xi2V

H (bi )�X

(i ;j )2E

I (bij ) :

Page 64: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Putting everything together

F(b) = Eb log tot(x ) +HBethe(b)

=X

(i ;j )2E

Ebij log ij (xi ; xj )�X

(i ;j )2E

I (bij ) +Xi2V

H (bi )

=X

(i ;j )2E

Xxi ;xj

�X

(i ;j )2E

Xxi ;xj

bij (xi ; xj ) logbij (xi ; xj )

bi (xi )bj (xj )�Xi2V

Xxi

Page 65: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

=X

(i ;j )2E

I (bij ) +Xi2V

H (bi )

=X

(i ;j )2E

Xxi ;xj

�X

(i ;j )2E

Xxi ;xj

Xxi

Page 66: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

=X

(i ;j )2E

I (bij ) +Xi2V

H (bi )

=X

(i ;j )2E

Xxi ;xj

�X

(i ;j )2E

Xxi ;xj

Xxi

Page 67: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

F(b) =X

(i ;j )2E

Xxi ;xj

�X

(i ;j )2E

Xxi ;xj

bij (xi ; xj ) log bij (xi ; xj )

�Xi2V

(1� deg(i))Xxi

Page 68: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Problem

Want to maximize F(b).

F(b) is not concave.

Page 69: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Remark

Assume � a BP �xed point

�i!j (xi ) �=Y

k2@inj

n Xxk2X

ik (xi ; xk ) �k!i (xk )o:

Page 70: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

De�ne (as you would do on a tree)

bi (xi ) �=Y

k2@inj

n Xxk2X

ik (xi ; xk ) �k!i (xk )o;

bij (xi ; xj ) �= �i!j (xi ) ij (xi ; xj ) �j!i (xj ) :

Lemma

With these de�nitions, b 2 LOC(G).

Page 71: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

L(b; �) = F(b)�Xi2V

�i

nXxi

bi (xi )� 1o

�X

(i ;j )2~E

Xxi

�i!j (xi )nX

xj

bij (xi ; xj )� bi (xi )o

rbijL(b; �) = �1� bij (xi ; xj ) + log ij (xi ; xj )� �i!j (xi )� �j!i (xi ) ;

rbiL(b; �) = �(1� deg(i)) log[bi (xi ) e ]� �i +Xj2@i

�i!j (xi )

Page 72: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

L(b; �) = F(b)�Xi2V

�i

nXxi

bi (xi )� 1o

�X

(i ;j )2~E

Xxi

�i!j (xi )nX

xj

�i!j (xi )

Page 73: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

L(b; �) = F(b)�Xi2V

�i

nXxi

bi (xi )� 1o

�X

(i ;j )2~E

Xxi

�i!j (xi )nX

xj

�i!j (xi )

Page 74: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

bij (xi ; xj ) = ij (xi ; xj ) exp�� 1� �i!j (xi )� �j!i (xj )

;

bi (xi ) �= expn�

1

deg(i)� 1

Xj2@i

�i!j (xi )o

Xxj

bij (xi ; xj ) = bi (xi )

Did you recognize this?

Page 75: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

bij (xi ; xj ) = ij (xi ; xj ) exp�� 1� �i!j (xi )� �j!i (xj )

;

bi (xi ) �= expn�

1

deg(i)� 1

Xj2@i

�i!j (xi )o

Xxj

bij (xi ; xj ) = bi (xi )

Did you recognize this?

Page 76: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

De�ning

�i!j (xi ) �= e��i!j (xi ) :

We get

bi (xi ) �=Y

k2@inj

n Xxk2X

bij (xi ; xj ) �= �i!j (xi ) ij (xi ; xj ) �j!i (xj ) ;

+Local Consistency

Page 77: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

De�ning

�i!j (xi ) �= e��i!j (xi ) :

We get

bi (xi ) �=Y

k2@inj

n Xxk2X

bij (xi ; xj ) �= �i!j (xi ) ij (xi ; xj ) �j!i (xj ) ;

+Local Consistency

Page 78: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

We proved the following

Theorem (Yedidia, Freeman, Weiss, 2003)

Fixed points of BP are in one-to-one correspondence with

stationary points of Bethe free energy.

Fixed point messages are (exponentials of) the dual parameters at

the �xed point.

Page 79: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

We proved the following

Theorem (Yedidia, Freeman, Weiss, 2003)

Fixed points of BP are in one-to-one correspondence with

stationary points of Bethe free energy.

Fixed point messages are (exponentials of) the dual parameters at

the �xed point.

Page 80: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Uses

I Alternative algorithms to �nd �xed points (e.g. gradient ascent).[e.g. Heskes 2002]

I Include higher order marginals.[Yedidia, Freeman, Weiss, 2003]

I Convexify Bethe free energy.[Wainwright, Jaakkola, Willsky, 2005]

I Asymptotically tight estimates on logZ for graph sequences.[e.g. Dembo, Montanari 2010]

Page 81: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Region-Based Approximation

Page 82: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Idea

Vertices ! Edges ! Regions

Naive Mean Field ! Bethe Free Energy ! Region-Based Free Energy

MF Equations ! Belief Propagation ! Generalized BP

[Cluster variational method, Kikuchi 1951]

Page 83: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Region

R = (VR;ER), s.t.

I If (i ; j ) 2 ER then i ; j 2 VR

Page 84: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Region

Free energy of region R: FR : M(XVR)! R

FR(bR) = EbR log tot;R(xR) +H (bR)

=XxR

X(i ;j )2ER

bR(xR) log ij (xi ; xj )�XxR

bR(xR) log bR(xR) :

Page 85: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Region

FR(bR) = EbR log tot;R(xR) +H (bR)

=XxR

X(i ;j )2ER

bR(xR) log ij (xi ; xj )�XxR

bR(xR) log bR(xR) :

Page 86: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Region

Can be evaluated for small regions (complexity jX jjRj).

Page 87: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Region-based Approximation

Collection of regions

R =�R1;R2; : : : ;Rm

:

Coe�cients

cR =�cR1 ; cR2 ; : : : ; cRm

; cRi

2 R :

Free Energy approximation:

FR : M(XV (R1))� � � � �M(XV (Rm ))! R

bR = (bR1 ; : : : ; bRm ) 7! FR(bR) =XR2R

cR FR(bR)

Page 88: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

R =�R1;R2; : : : ;Rm

:

Coe�cients

cR =�cR1 ; cR2 ; : : : ; cRm

; cRi

2 R :

FR : M(XV (R1))� � � � �M(XV (Rm ))! R

bR = (bR1 ; : : : ; bRm ) 7! FR(bR) =XR2R

cR FR(bR)

Page 89: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

R =�R1;R2; : : : ;Rm

:

Coe�cients

cR =�cR1 ; cR2 ; : : : ; cRm

; cRi

2 R :

FR : M(XV (R1))� � � � �M(XV (Rm ))! R

bR = (bR1 ; : : : ; bRm ) 7! FR(bR) =XR2R

cR FR(bR)

Page 90: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

R =�R1;R2; : : : ;Rm

:

Coe�cients

cR =�cR1 ; cR2 ; : : : ; cRm

; cRi

2 R :

FR : M(XV (R1))� � � � �M(XV (Rm ))! R

bR = (bR1 ; : : : ; bRm ) 7! FR(bR) =XR2R

cR FR(bR)

Page 91: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example: Bethe Free Energy

Regions

R =�Ri : i 2 V

[�Rij : (i ; j ) 2 E

;

Ri = (fig; ;) ;

Rij = (fi ; j g; f(i ; j )g) :

Coe�cients

ci = 1� deg(i) ; cij = 1 :

Free energy

FR(b) =Xi2V

f1� deg(i)gH (bi ) +X

(i ;j )2E

nH (bij ) + Ebij log ij (xi ; xj )

o

Page 92: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Regions

R =�Ri : i 2 V

[�Rij : (i ; j ) 2 E

;

Ri = (fig; ;) ;

Rij = (fi ; j g; f(i ; j )g) :

Coe�cients

ci = 1� deg(i) ; cij = 1 :

Free energy

FR(b) =Xi2V

(i ;j )2E

o

Page 93: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Regions

R =�Ri : i 2 V

[�Rij : (i ; j ) 2 E

;

Ri = (fig; ;) ;

Rij = (fi ; j g; f(i ; j )g) :

Coe�cients

ci = 1� deg(i) ; cij = 1 :

Free energy

FR(b) =Xi2V

(i ;j )2E

o

Page 94: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Regions

R =�Ri : i 2 V

[�Rij : (i ; j ) 2 E

;

Ri = (fig; ;) ;

Rij = (fi ; j g; f(i ; j )g) :

Coe�cients

ci = 1� deg(i) ; cij = 1 :

Free energy

FR(b) =Xi2V

(i ;j )2E

o

Page 95: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Questions?

1. What about domain/consistency?

2. How to choose coe�cients?

3. How to choose regions?

Page 96: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Valid Region-Based Approximations[Yedidia, Freeman, Weiss, 2003]

Condition 1: Consistency

R 2 R; R0 � R ) R0 2 R :

Condition 2: Vertex countingXR2R

cR I(i 2 R) = 1 for all i 2 V :

Condition 3: Edge countingXR2R

cR I((i ; j ) 2 R) = 1 for all (i ; j ) 2 E :

Page 97: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example

Page 98: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example

Page 99: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example

Add intersections!

Page 100: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example

Add intersections!

Page 101: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example

Add intersections!

Page 102: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Why? #1

R 2 R; R0 � R ) R0 2 R :

Clean local consistecy conditionsXxRnR0

bR(xR) = bR0(xR0) for all R0 � R :

LOC(G ;R)

Page 103: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Why? #1

R 2 R; R0 � R ) R0 2 R :

Clean local consistecy conditionsXxRnR0

bR(xR) = bR0(xR0) for all R0 � R :

LOC(G ;R)

Page 104: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Geometric picture

LOC(G)

LOC(G ;R)

MARG(G)

Polytopes

Page 105: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Why? #2

XR2R

Consider ij (xi ; xj ) = 1, bR(xR) = Uniform

XR2R

FR(bR) =XR2R

cRH (bR)

=XR2R

cR jV (R)j log jX j

=Xi2V

8<:XR2R

cR I(i 2 R)

9=; log jX j = jV j log jX j

Page 106: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Why? #2

XR2R

Consider ij (xi ; xj ) = 1, bR(xR) = Uniform

XR2R

FR(bR) =XR2R

cRH (bR)

=XR2R

cR jV (R)j log jX j

=Xi2V

8<:XR2R

cR I(i 2 R)

9=; log jX j = jV j log jX j

Page 107: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Why? #3

XR2R

Neglect entropy (e.g. ij (xi ; xj ) = e� �ij (xi ;xj ), � !1)XR2R

FR(bR) = �XR2R

cRXxR

bR(xR)X

(ij )2E(R)

�ij (xi ; xj ) +O�(1)

= �XR2R

cRX

(ij )2E(R)

Ebij �ij (xi ; xj ) +O�(1)

= �X

(ij )2E

8<:XR2R

cR I((i ; j ) 2 R)

9=;Ebij �ij (xi ; xj ) +O�(1)

= �X

(ij )2E

Page 108: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Why? #3

XR2R

Neglect entropy (e.g. ij (xi ; xj ) = e� �ij (xi ;xj ), � !1)XR2R

FR(bR) = �XR2R

cRXxR

bR(xR)X

(ij )2E(R)

�ij (xi ; xj ) +O�(1)

= �XR2R

cRX

(ij )2E(R)

= �X

(ij )2E

8<:XR2R

cR I((i ; j ) 2 R)

9=;Ebij �ij (xi ; xj ) +O�(1)

= �X

(ij )2E

Page 109: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

How do you compute the coe�cients?

Page 110: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

The Region Graph

Page 111: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

The Region Graph

cloop = +1

cedge = +1

cvert = +1

cR = 1�X

R02ANCESTORS(R)

cR

Page 112: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Was it worth it?

10� 10 Ising model with random potentials [Yedidia et al. 2003]

Page 113: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Tree-Based Convexi�cations

Page 114: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Intermezzo: Exponential Families

T : XV ! Rm ;

x 7! T (x ) = (T1(x ); : : : ;Tm(x )) :

Exponential family f�� : � 2 Rmg

��(x ) =1

Z (�)exp

nh�;T (x )i

o; F (�) = logZ (�)

Page 115: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Intermezzo: Exponential Families

T : XV ! Rm ;

x 7! T (x ) = (T1(x ); : : : ;Tm(x )) :

Exponential family f�� : � 2 Rmg

��(x ) =1

Z (�)exp

nh�;T (x )i

o; F (�) = logZ (�)

Page 116: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Exponential Families: Basic Properties

Proposition

(1) � 7! F (�) is convex;

(2) r�F (�) = E�fT (x )g � � (�) ;

(3) r2�F (�) = Cov�fT (x );T (x )

�;

(4) Image(� ) = MARG(T ) :

MARG(T ) � conv�fT (x ) : x 2 XV g

�

=n

E�T (x ) : � 2 M(XV )o

Page 117: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proposition

(1) � 7! F (�) is convex;

(2) r�F (�) = E�fT (x )g � � (�) ;

(3) r2�F (�) = Cov�fT (x );T (x )

�;

(4) Image(� ) = MARG(T ) :

�

=n

E�T (x ) : � 2 M(XV )o

Page 118: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proposition

(1) � 7! F (�) is convex;

(2) r�F (�) = E�fT (x )g � � (�) ;

(3) r2�F (�) = Cov�fT (x );T (x )

�;

(4) Image(� ) = MARG(T ) :

�

=n

E�T (x ) : � 2 M(XV )o

Page 119: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proofs

(1), (2), (3): Exercises

(4): A bit more di�cult

Claim 1: A closed convex set is the closure of its relative interior.[Hint: Assume the set has full dimension. Each point has a cone of full

dimension around it.]

Claim 2: Let �� 2 relint(MARG(T )). Then �� = E��fT (x )g for some�� s.t. ��(x ) > 0 for all x 2 XV .[Hint: Consider the set of signed weigths � such that

Px�(x )T (x ) = ��. If

the claim was false, it would be tangent to the simplex.]

Claim 3: There exists �� 2 Rm such that E��fT (x )g = E��fT (x )g.

Page 120: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proofs

(1), (2), (3): Exercises

(4): A bit more di�cult

Claim 1: A closed convex set is the closure of its relative interior.[Hint: Assume the set has full dimension. Each point has a cone of full

dimension around it.]

Claim 2: Let �� 2 relint(MARG(T )). Then �� = E��fT (x )g for some�� s.t. ��(x ) > 0 for all x 2 XV .[Hint: Consider the set of signed weigths � such that

Px�(x )T (x ) = ��. If

the claim was false, it would be tangent to the simplex.]

Claim 3: There exists �� 2 Rm such that E��fT (x )g = E��fT (x )g.

Page 121: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proof of Claim 3

Wlog f1;T1; : : : ;Tmg linearly independent.Consider

F (�; ��) � F (�)� h��; �i

= logn Xx2XV

exp�h�;T (x )i

�o� E��fh�;T (x )ig

I F ( � ; ��) : Rm ! R is di�erentiable and convex.

I If �� is a stationary point, then E��fT (x )g = E��fT (x )g.

I As � !1, F (�; ��)!1.

Implies the thesis.

Page 122: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proof of Claim 3

F (�; ��) � F (�)� h��; �i

= logn Xx2XV

exp�h�;T (x )i

I As � !1, F (�; ��)!1.

Implies the thesis.

Page 123: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proof of Claim 3

F (�; ��) � F (�)� h��; �i

= logn Xx2XV

exp�h�;T (x )i

I As � !1, F (�; ��)!1.

Implies the thesis.

Page 124: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

As � !1, F��(�)!1

Let � = � v , � 2 R+

F (�; ��) = logn Xx2XV

exp�h�;T (x )i

� �hmaxxhv ;T (x )i � E��fhv ;T (x )ig

i

and [ : : : ] > 0 strictly because ��(x ) > 0 for all x .

Page 125: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Duality structure

F�(� ) � inf�2Rm

�F (�)� h�; �i

;

F� : MARG(T )! R ; concave.

F (�) � sup�2MARG(T )

�F�(� ) + h�; �i

;

F : Rm ! R ; convex.

Page 126: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Duality structure

F�(� ) � inf�2Rm

�F (�)� h�; �i

;

F� : MARG(T )! R ; concave.

F (�) � sup�2MARG(T )

�F�(� ) + h�; �i

;

F : Rm ! R ; convex.

Page 127: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Let's apply all this

x1

x2 x3 x4

x5x6

x7x8x9x10

x11x12

G = (V ;E), V = [n ], x = (x1; : : : ; xn), xi 2 X ,

Ti ;�(x ) = I(xi = �) ; i 2 V ; � 2 X ;

Tij ;�1;�2(x ) = I(xi = �1) I(xj = �2) ; (i ; j ) 2 E ; �1; �2 2 X ;

overcomplete!Andrea Montanari (Stanford) Stat375: Lecture 7, 8, 9 May 6, 2012 73 / 83

Page 128: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

The exponential family

��(x ) =1

Z (�)exp

8<:

X(i ;j )2E ;�1;�22X

�ij (�1; �2)Tij �1�2(x ) +X

i2V ;�2X

�i (�)Ti�(x )

9=;

=1

Z (�)exp

8<:

X(i ;j )2E

�ij (xi ; xj ) +Xi2V

�i (xi )

9=;

(General pairwise model)

The � parameters

bi (�) = E�fTi (�)g = ��(xi = �); for i 2 V ;

bij (�1; �2) = E�fTij (�1; �2)g = ��(xi = �1; xj = �2) ; for (i ; j ) 2 E :

Page 129: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

The exponential family

��(x ) =1

Z (�)exp

8<:

X(i ;j )2E ;�1;�22X

�ij (�1; �2)Tij �1�2(x ) +X

i2V ;�2X

�i (�)Ti�(x )

9=;

=1

Z (�)exp

8<:

X(i ;j )2E

�ij (xi ; xj ) +Xi2V

�i (xi )

9=;

(General pairwise model)

The � parameters

bi (�) = E�fTi (�)g = ��(xi = �); for i 2 V ;

bij (�1; �2) = E�fTij (�1; �2)g = ��(xi = �1; xj = �2) ; for (i ; j ) 2 E :

Page 130: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

The duality structure

F (�)$ F�(b) ;

F� : MARG(G)! R :

We want to evaluate at � = F (�� = log )':

� = supb2MARG(G)

nF�(b) + h��; bi

o= Entropy + Energy

New interpretation

Bethe entropy is an approximate expression for F�(b).

Page 131: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

F (�)$ F�(b) ;

F� : MARG(G)! R :

� = supb2MARG(G)

nF�(b) + h��; bi

o= Entropy + Energy

New interpretation

Page 132: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

F (�)$ F�(b) ;

F� : MARG(G)! R :

� = supb2MARG(G)

nF�(b) + h��; bi

o= Entropy + Energy

New interpretation

Page 133: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Interpretation works �ne on trees

Proposition

If G is a tree, then MARG(G) = LOC(G) and

F�(b) =Xi2V

H (bi )�X

(i ;j )2E

I (bij ) = F =1(b)

As a consequence, F : LOC(G)! R is concave.

Proof: Exercise.

Page 134: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proposition

F�(b) =Xi2V

H (bi )�X

(i ;j )2E

I (bij ) = F =1(b)

Proof: Exercise.

Page 135: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Proposition

F�(b) =Xi2V

H (bi )�X

(i ;j )2E

I (bij ) = F =1(b)

Proof: Exercise.

Page 136: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

What about general graphs?

Write G as a convex combination of trees.

Page 137: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Abuse: I will use T to denote trees, not functions.

Page 138: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Convex combinations

T (G) =�spanning trees in G

;

� : T (G) ! [0; 1] ;

T 7! �T ; weights ;

XT2T (G)

�T = 1 ;

XT2T (G)

�T �T = � :

Page 139: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Convex combinations

;

� : T (G) ! [0; 1] ;

XT2T (G)

�T = 1 ;

XT2T (G)

�T �T = � :

Page 140: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Convex combinations

;

� : T (G) ! [0; 1] ;

XT2T (G)

�T = 1 ;

XT2T (G)

�T �T = � :

Page 141: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Convex combinations

� = F (�) = F� XT2T (G)

�T �T�

�X

T2T (G)

�T F (�T )

I Fix weigths �T .

I Minimize over �T (convex!)

Problem: Exponentially many spanning trees.

Page 142: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Convex combinations

� = F (�) = F� XT2T (G)

�T �T�

�X

T2T (G)

�T F (�T )

Page 143: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Convex combinations

� = F (�) = F� XT2T (G)

�T �T�

�X

T2T (G)

�T F (�T )

Page 144: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Minimization over (�T )T2T (G)

minimizeX

T2T (G)

�T F (�T ) ;

subject toX

T2T (G)

�T �Tij (xi ; xj ) = �ij (xi ; xj ) ;

XT2T (G)

�T �Ti (xi ) = �i (xi ) :

Convex Problem

Page 145: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Minimization over (�T )T2T (G)

minimizeX

T2T (G)

�T F (�T ) ;

subject toX

T2T (G)

�T �Tij (xi ; xj ) = �ij (xi ; xj ) ;

XT2T (G)

�T �Ti (xi ) = �i (xi ) :

Convex Problem

Page 146: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

L((�T ); b) =XT

�T F (�T )

�X

(ij )2E

Xxi ;xj

bij (xi ; xj )nX

T

�T �Tij (xi ; xj )� �ij (xi ; xj )

o

�Xi2V

Xxi

bi (xi )nX

T

�T �Ti (xi )� �i (xi )

o

=XT

�T

nF (�T )� hb; �T i

o+ hb; �i

Separable in �T

Page 147: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

L((�T ); b) =XT

�T F (�T )

�X

(ij )2E

Xxi ;xj

bij (xi ; xj )nX

T

o

�Xi2V

Xxi

bi (xi )nX

T

�T �Ti (xi )� �i (xi )

o

=XT

�T

nF (�T )� hb; �T i

o+ hb; �i

Separable in �T

Page 148: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

L((�T ); b) =XT

�T F (�T )

�X

(ij )2E

Xxi ;xj

bij (xi ; xj )nX

T

o

�Xi2V

Xxi

bi (xi )nX

T

�T �Ti (xi )� �i (xi )

o

=XT

�T

nF (�T )� hb; �T i

o+ hb; �i

Separable in �T

Page 149: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

min(�T )

L((�T ); b) =XT

�TF�(b; �) + hb; �i

=XT

�T

nXi2V

H (bi )�X

(ij )2E(T )

I (bij )o+ hb; �i

=Xi2V

H (bi )n XT : i2V

�T

o�

X(i ;j )2V

I (bij )n XT : (i ;j )2E(T )

�T

o+ hb; �i

=Xi2V

H (bi )�X

(i ;j )2V

�(ij )I (bij ) + hb; �i

Page 150: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

min(�T )

L((�T ); b) =XT

�TF�(b; �) + hb; �i

=XT

�T

nXi2V

H (bi )�X

(ij )2E(T )

I (bij )o+ hb; �i

=Xi2V

H (bi )n XT : i2V

�T

o�

X(i ;j )2V

I (bij )n XT : (i ;j )2E(T )

�T

o+ hb; �i

=Xi2V

H (bi )�X

(i ;j )2V

�(ij )I (bij ) + hb; �i

Page 151: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

min(�T )

L((�T ); b) =XT

�TF�(b; �) + hb; �i

=XT

�T

nXi2V

H (bi )�X

(ij )2E(T )

I (bij )o+ hb; �i

=Xi2V

H (bi )n XT : i2V

�T

o�

X(i ;j )2V

I (bij )n XT : (i ;j )2E(T )

�T

o+ hb; �i

=Xi2V

H (bi )�X

(i ;j )2V

�(ij )I (bij ) + hb; �i

Page 152: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Lagrangian

min(�T )

L((�T ); b) =XT

�TF�(b; �) + hb; �i

=XT

�T

nXi2V

H (bi )�X

(ij )2E(T )

I (bij )o+ hb; �i

=Xi2V

H (bi )n XT : i2V

�T

o�

X(i ;j )2V

I (bij )n XT : (i ;j )2E(T )

�T

o+ hb; �i

=Xi2V

H (bi )�X

(i ;j )2V

�(ij )I (bij ) + hb; �i

Page 153: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Tree-reweighted free energy

FTRW(b) =Xi2V

H (bi )�X

(i ;j )2V

�(ij )I (bij ) + hb; �i

Compare with Bethe free energy

F(b)Xi2V

H (bi )�X

(i ;j )2V

I (bij ) + hb; �i

�(i ; j ) = 0 Obviously concave upper bound.

�(i ; j ) = 1 Bethe free energy.

Page 154: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

FTRW(b) =Xi2V

H (bi )�X

(i ;j )2V

�(ij )I (bij ) + hb; �i

F(b)Xi2V

H (bi )�X

(i ;j )2V

I (bij ) + hb; �i

Page 155: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

FTRW(b) =Xi2V

H (bi )�X

(i ;j )2V

�(ij )I (bij ) + hb; �i

F(b)Xi2V

H (bi )�X

(i ;j )2V

I (bij ) + hb; �i

Page 156: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Edge weights

� = ( �(e) : e 2 E)Intepretation

�(e) = P�fe 2 E(T )g ; P�(T ) = �T :

Spanning-Tree polytopeX(i ;j )2E

�(i ; j ) = jV j � 1 ;

X(i ;j )2E(U )

�(i ; j ) � jU j � 1 ; for all U � V :

Page 157: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Edge weights

� = ( �(e) : e 2 E)Intepretation

�(e) = P�fe 2 E(T )g ; P�(T ) = �T :

Spanning-Tree polytopeX(i ;j )2E

�(i ; j ) = jV j � 1 ;

X(i ;j )2E(U )

�(i ; j ) � jU j � 1 ; for all U � V :

Page 158: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example

k -regular graph

jV j = n ; jE j =nk

2:

Take all the weights equal (not necessarily ok, but. . . )

�(i ; j ) =2(n � 1)

nk�

2

k

For (some) models on locally tree-like graphs, �(i ; j ) = 1 isapproximately correct ! �(n) error.

Page 159: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Example

k -regular graph

jV j = n ; jE j =nk

2:

Take all the weights equal (not necessarily ok, but. . . )

�(i ; j ) =2(n � 1)

nk�

2

k

For (some) models on locally tree-like graphs, �(i ; j ) = 1 isapproximately correct ! �(n) error.

Download - Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

Top Related