
Page 1: MAP Estimation Algorithms in

MAP Estimation Algorithms in

M. Pawan Kumar, University of Oxford

Pushmeet Kohli, Microsoft Research

Computer Vision - Part I

Page 2: MAP Estimation Algorithms in

Aim of the Tutorial

• Description of some successful algorithms

• Computational issues

• Enough details to implement

• Some proofs will be skipped :-(

• But references to them will be given :-)

Page 3: MAP Estimation Algorithms in

A Vision Application: Binary Image Segmentation

How ?

Cost function: models our knowledge about natural images

Optimize cost function to obtain the segmentation

Page 4: MAP Estimation Algorithms in

Object - white, Background - green/grey

Graph G = (V,E)

Each vertex corresponds to a pixel

Edges define a 4-neighbourhood grid graph

Assign a label to each vertex from L = {obj,bkg}

A Vision Application: Binary Image Segmentation

Page 5: MAP Estimation Algorithms in

Graph G = (V,E)

Cost of a labelling f : V → L

Per Vertex Cost

Cost of label ‘obj’ low; cost of label ‘bkg’ high

Object - white, Background - green/grey

A Vision Application: Binary Image Segmentation

Page 6: MAP Estimation Algorithms in

Graph G = (V,E)

Cost of a labelling f : V → L

Cost of label ‘obj’ high; cost of label ‘bkg’ low

Per Vertex Cost

UNARY COST

Object - white, Background - green/grey

A Vision Application: Binary Image Segmentation

Page 7: MAP Estimation Algorithms in

Graph G = (V,E)

Cost of a labelling f : V → L

Per Edge Cost

Cost of same label low

Cost of different labels high

Object - white, Background - green/grey

A Vision Application: Binary Image Segmentation

Page 8: MAP Estimation Algorithms in

Graph G = (V,E)

Cost of a labelling f : V → L

Cost of same label high

Cost of different labels low

Per Edge Cost

PAIRWISE COST

Object - white, Background - green/grey

A Vision Application: Binary Image Segmentation

Page 9: MAP Estimation Algorithms in

Graph G = (V,E)

Problem: Find the labelling with minimum cost f*

Object - white, Background - green/grey

A Vision Application: Binary Image Segmentation

Page 10: MAP Estimation Algorithms in

Graph G = (V,E)

Problem: Find the labelling with minimum cost f*

A Vision Application: Binary Image Segmentation

Page 11: MAP Estimation Algorithms in

Another Vision Application: Object Detection using Parts-based Models

How ?

Once again, by defining a good cost function

Page 12: MAP Estimation Algorithms in

Graph G = (V,E), with vertices H, T, L1, L2, L3, L4

Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’

Edges define a TREE

Assign a label to each vertex from L = {positions}

Another Vision Application: Object Detection using Parts-based Models

Page 13: MAP Estimation Algorithms in


Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’

Assign a label to each vertex from L = {positions}

Graph G = (V,E)

Edges define a TREE


Another Vision Application: Object Detection using Parts-based Models

Page 14: MAP Estimation Algorithms in


Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’

Assign a label to each vertex from L = {positions}

Graph G = (V,E)

Edges define a TREE


Another Vision Application: Object Detection using Parts-based Models

Page 15: MAP Estimation Algorithms in

Cost of a labelling f : V → L

Unary cost : How well does part match image patch?

Pairwise cost : Encourages valid configurations

Find best labelling f*

Graph G = (V,E)


Another Vision Application: Object Detection using Parts-based Models

Page 16: MAP Estimation Algorithms in

Cost of a labelling f : V → L

Unary cost : How well does part match image patch?

Pairwise cost : Encourages valid configurations

Find best labelling f*

Graph G = (V,E)


Another Vision Application: Object Detection using Parts-based Models

Page 17: MAP Estimation Algorithms in

Yet Another Vision Application: Stereo Correspondence

Disparity Map

How ?

Minimizing a cost function

Page 18: MAP Estimation Algorithms in

Yet Another Vision Application: Stereo Correspondence

Graph G = (V,E)

Vertex corresponds to a pixel

Edges define grid graph

L = {disparities}

Page 19: MAP Estimation Algorithms in

Yet Another Vision Application: Stereo Correspondence

Cost of labelling f :

Unary cost + Pairwise Cost

Find minimum cost f*

Page 20: MAP Estimation Algorithms in

The General Problem

(Figure: example graph on vertices a, b, c, d, e, f, each assigned a label from {1, 2, 3}.)

Graph G = (V, E)

Discrete label set L = {1, 2, …, h}

Assign a label to each vertex f: V → L

Cost of a labelling Q(f)

Unary Cost Pairwise Cost

Find f* = arg min Q(f)
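The general problem can be made concrete with a brute-force sketch. The potential values below are illustrative (not the figure's), and the helper names (`Q`, `theta_unary`, `theta_pair`) are our own:

```python
from itertools import product

# Illustrative 4-vertex chain with 2 labels (made-up values):
# theta_unary[a][i] is the cost of vertex a taking label i,
# theta_pair[(a, b)][i][k] the cost of edge (a, b) taking labels (i, k).
theta_unary = [[5, 2], [2, 4], [6, 3], [7, 3]]
theta_pair = {(0, 1): [[0, 1], [1, 0]],
              (1, 2): [[0, 1], [1, 0]],
              (2, 3): [[0, 1], [1, 0]]}

def Q(f):
    """Cost of a labelling f = (f(0), ..., f(3)): unary plus pairwise terms."""
    return (sum(theta_unary[a][f[a]] for a in range(len(f))) +
            sum(theta_pair[(a, b)][f[a]][f[b]] for (a, b) in theta_pair))

# Brute force over all |L|^|V| labellings -- only feasible for tiny graphs.
f_star = min(product(range(2), repeat=4), key=Q)
q_star = Q(f_star)
```

Here `f_star` comes out as (1, 0, 1, 1) with `q_star` = 12; the point of the algorithms that follow is to avoid this exponential enumeration.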

Page 21: MAP Estimation Algorithms in

Outline

• Problem Formulation
– Energy Function
– MAP Estimation
– Computing min-marginals

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing

Page 22: MAP Estimation Algorithms in

Energy Function

Va Vb Vc Vd

Label l0

Label l1

Da Db Dc Dd

Random Variables V = {Va, Vb, …}

Labels L = {l0, l1, …}    Data D

Labelling f: {a, b, …} → {0, 1, …}

Page 23: MAP Estimation Algorithms in

Energy Function

Va Vb Vc Vd

Da Db Dc Dd

Q(f) = ∑a θa;f(a)

Unary Potential

(Figure: unary potential values 2, 5, 4, 2, 6, 3, 3, 7 for labels l0 and l1 at Va, Vb, Vc, Vd.)

Easy to minimize

Neighbourhood

Page 24: MAP Estimation Algorithms in

Energy Function

Va Vb Vc Vd

Da Db Dc Dd

E : (a,b) ∈ E iff Va and Vb are neighbours

E = { (a,b) , (b,c) , (c,d) }

(Figure: unary potentials as before.)

Page 25: MAP Estimation Algorithms in

Energy Function

Va Vb Vc Vd

Da Db Dc Dd

Q(f) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

Pairwise Potential

(Figure: pairwise potential values 0, 1, 1, 0, 0, 2, 1, 1, 4, 1, 0, 3 on the edges (a,b), (b,c), (c,d); unary potentials as before.)

Page 26: MAP Estimation Algorithms in

Energy Function

Va Vb Vc Vd

Da Db Dc Dd

(Figure: unary and pairwise potentials as before.)

Parameter θ = { θa;i , θab;ik }

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

Page 27: MAP Estimation Algorithms in

Outline

• Problem Formulation
– Energy Function
– MAP Estimation
– Computing min-marginals

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing

Page 28: MAP Estimation Algorithms in

MAP Estimation

Va Vb Vc Vd

(Figure: unary potential values 2, 5, 4, 2, 6, 3, 3, 7 and pairwise potential values 0, 1, 1, 0, 0, 2, 1, 1, 4, 1, 0, 3 for labels l0 and l1.)

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

Page 29: MAP Estimation Algorithms in

MAP Estimation

Va Vb Vc Vd

(Figure: potentials as above, with one labelling highlighted.)

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

2 + 1 + 2 + 1 + 3 + 1 + 3 = 13

Page 30: MAP Estimation Algorithms in

MAP Estimation

Va Vb Vc Vd

(Figure: potentials as above.)

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

Page 31: MAP Estimation Algorithms in

MAP Estimation

Va Vb Vc Vd

(Figure: potentials as above, with another labelling highlighted.)

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

5 + 1 + 4 + 0 + 6 + 4 + 7 = 27

Page 32: MAP Estimation Algorithms in

MAP Estimation

Va Vb Vc Vd

(Figure: potentials as above.)

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

f* = arg min Q(f; θ)

q* = min Q(f; θ) = Q(f*; θ)

Page 33: MAP Estimation Algorithms in

MAP Estimation

f(a) f(b) f(c) f(d) Q(f; θ)

0 0 0 0 18

0 0 0 1 15

0 0 1 0 27

0 0 1 1 20

0 1 0 0 22

0 1 0 1 19

0 1 1 0 27

0 1 1 1 20

16 possible labellings

f(a) f(b) f(c) f(d) Q(f; θ)

1 0 0 0 16

1 0 0 1 13

1 0 1 0 25

1 0 1 1 18

1 1 0 0 18

1 1 0 1 15

1 1 1 0 23

1 1 1 1 16

f* = {1, 0, 0, 1}    q* = 13

Page 34: MAP Estimation Algorithms in

Computational Complexity

Segmentation

2^|V| possible labellings

|V| = number of pixels ≈ 320 * 480 = 153600

Page 35: MAP Estimation Algorithms in

Computational Complexity

|L| = number of candidate positions ≈ number of pixels ≈ 153600

Detection

|L|^|V| possible labellings

Page 36: MAP Estimation Algorithms in

Computational Complexity

|V| = number of pixels ≈ 153600

Stereo

|L|^|V| possible labellings

Can we do better than brute-force?

MAP Estimation is NP-hard !!

Page 37: MAP Estimation Algorithms in

Computational Complexity

|V| = number of pixels ≈ 153600

Stereo

|L|^|V| possible labellings

Exact algorithms do exist for special cases

Good approximate algorithms for general case

But first … two important definitions

Page 38: MAP Estimation Algorithms in

Outline

• Problem Formulation
– Energy Function
– MAP Estimation
– Computing min-marginals

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing

Page 39: MAP Estimation Algorithms in

Min-Marginals

Va Vb Vc Vd

(Figure: potentials as above.)

Min-marginal qa;i = min Q(f; θ) such that f(a) = i

Not a marginal (no summation)

Page 40: MAP Estimation Algorithms in

Min-Marginals

16 possible labellings; qa;0 = 15

f(a) f(b) f(c) f(d) Q(f; θ)

0 0 0 0 18

0 0 0 1 15

0 0 1 0 27

0 0 1 1 20

0 1 0 0 22

0 1 0 1 19

0 1 1 0 27

0 1 1 1 20

f(a) f(b) f(c) f(d) Q(f; θ)

1 0 0 0 16

1 0 0 1 13

1 0 1 0 25

1 0 1 1 18

1 1 0 0 18

1 1 0 1 15

1 1 1 0 23

1 1 1 1 16

Page 41: MAP Estimation Algorithms in

Min-Marginals

16 possible labellings; qa;1 = 13

f(a) f(b) f(c) f(d) Q(f; θ)

1 0 0 0 16

1 0 0 1 13

1 0 1 0 25

1 0 1 1 18

1 1 0 0 18

1 1 0 1 15

1 1 1 0 23

1 1 1 1 16

f(a) f(b) f(c) f(d) Q(f; θ)

0 0 0 0 18

0 0 0 1 15

0 0 1 0 27

0 0 1 1 20

0 1 0 0 22

0 1 0 1 19

0 1 1 0 27

0 1 1 1 20

Page 42: MAP Estimation Algorithms in

Min-Marginals and MAP

• Minimum min-marginal of any variable = energy of MAP labelling

mini qa;i = mini ( minf Q(f; θ) s.t. f(a) = i ) = minf Q(f; θ)

Va has to take one label
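The identity above can be checked numerically: compute every min-marginal by brute force and compare its minimum with the MAP energy. A sketch with illustrative values (the potential values and function names are ours, not the figure's):

```python
from itertools import product

# Illustrative 4-vertex chain, 2 labels (made-up values).
unary = [[5, 2], [2, 4], [6, 3], [7, 3]]
pair = {(0, 1): [[0, 1], [1, 0]],
        (1, 2): [[0, 1], [1, 0]],
        (2, 3): [[0, 1], [1, 0]]}

def Q(f):
    return (sum(unary[a][f[a]] for a in range(4)) +
            sum(pair[e][f[e[0]]][f[e[1]]] for e in pair))

def min_marginal(a, i):
    """q_{a;i}: minimum energy over all labellings with f(a) fixed to i."""
    return min(Q(f) for f in product(range(2), repeat=4) if f[a] == i)

# Every labelling assigns *some* label to Va, so minimizing q_{a;i} over i
# recovers the MAP energy.
map_energy = min(Q(f) for f in product(range(2), repeat=4))
assert min(min_marginal(0, i) for i in range(2)) == map_energy
```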

Page 43: MAP Estimation Algorithms in

Summary

MAP Estimation

f* = arg min Q(f; θ)

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

Min-marginals

qa;i = min Q(f; θ) s.t. f(a) = i

Energy Function

Page 44: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing

Page 45: MAP Estimation Algorithms in

Reparameterization

Va Vb

(Figure: θa;0 = 5, θa;1 = 2; θb;0 = 2, θb;1 = 4; θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0; annotation: +2 on each θa;i, -2 on each θb;k.)

f(a) f(b) Q(f; θ)

0 0 7

0 1 10

1 0 5

1 1 6

Add a constant to all θa;i

Subtract that constant from all θb;k

Page 46: MAP Estimation Algorithms in

Reparameterization

f(a) f(b) Q(f; θ')

0 0 7 + 2 - 2

0 1 10 + 2 - 2

1 0 5 + 2 - 2

1 1 6 + 2 - 2

Add a constant to all θa;i

Subtract that constant from all θb;k

Q(f; θ') = Q(f; θ)

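The invariance Q(f; θ') = Q(f; θ) is easy to verify exhaustively for the two-variable example (θa = (5, 2), θb = (2, 4), pairwise 0, 1, 1, 0); every labelling uses exactly one unary of Va and one of Vb, so the +2 and -2 cancel. A sketch (function names are ours):

```python
from itertools import product

unary = {"a": [5, 2], "b": [2, 4]}   # the two-variable example's unaries
pairwise = [[0, 1], [1, 0]]          # theta_ab;ik

def Q(fa, fb, un, pw):
    """Energy of the labelling (fa, fb) under unaries `un`, pairwise `pw`."""
    return un["a"][fa] + un["b"][fb] + pw[fa][fb]

# Add a constant to every unary of Va, subtract it from every unary of Vb.
c = 2
reparam = {"a": [u + c for u in unary["a"]],
           "b": [u - c for u in unary["b"]]}

# The energy of every labelling is unchanged.
for fa, fb in product(range(2), repeat=2):
    assert Q(fa, fb, unary, pairwise) == Q(fa, fb, reparam, pairwise)
```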

Page 47: MAP Estimation Algorithms in

Reparameterization

(Figure: potentials as before; annotation: +3 on θb;1, -3 on θab;01 and θab;11.)

f(a) f(b) Q(f; θ)

0 0 7

0 1 10

1 0 5

1 1 6

Add a constant to one θb;k

Subtract that constant from θab;ik for all ‘i’

Page 48: MAP Estimation Algorithms in

Reparameterization

(Figure: reparameterized potentials.)

f(a) f(b) Q(f; θ')

0 0 7

0 1 10 - 3 + 3

1 0 5

1 1 6 - 3 + 3

Q(f; θ') = Q(f; θ)

Add a constant to one θb;k

Subtract that constant from θab;ik for all ‘i’

Page 49: MAP Estimation Algorithms in

Reparameterization

(Figure: three further reparameterizations of the two-variable example.)

In general:

θ'a;i = θa;i + Mba;i

θ'b;k = θb;k + Mab;k

θ'ab;ik = θab;ik - Mab;k - Mba;i

Q(f; θ') = Q(f; θ)

Page 50: MAP Estimation Algorithms in

Reparameterization

θ' is a reparameterization of θ, iff

Q(f; θ') = Q(f; θ), for all f

Equivalently,

θ'a;i = θa;i + Mba;i

θ'b;k = θb;k + Mab;k

θ'ab;ik = θab;ik - Mab;k - Mba;i

Kolmogorov, PAMI, 2006

(Figure: the earlier +2/-2 example.)

Page 51: MAP Estimation Algorithms in

Recap

MAP Estimation

f* = arg min Q(f; θ)

Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)

Min-marginals

qa;i = min Q(f; θ) s.t. f(a) = i

Reparameterization: Q(f; θ') = Q(f; θ), for all f

Page 52: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties

• Tree-reweighted Message Passing

Page 53: MAP Estimation Algorithms in

Belief Propagation

• Belief Propagation gives exact MAP for chains

• Remember, some MAP problems are easy

• Exact MAP for trees

• Clever Reparameterization

Page 54: MAP Estimation Algorithms in

Two Variables

(Figure: θa;0 = 5, θa;1 = 2; θb;0 = 2, θb;1 = 4; θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0.)

Choose the right constant: θ'b;k = qb;k

Add a constant to one θb;k

Subtract that constant from θab;ik for all ‘i’

Page 55: MAP Estimation Algorithms in

Two Variables

(Figure: as above.)

Choose the right constant: θ'b;k = qb;k

Mab;0 = min { θa;0 + θab;00 , θa;1 + θab;10 } = min { 5 + 0 , 2 + 1 }

Page 56: MAP Estimation Algorithms in

Two Variables

(Figure: after the update, θ'b;0 = 2 + 3 = 5; θ'ab;00 = -3, θ'ab;10 = -2.)

Choose the right constant: θ'b;k = qb;k

Page 57: MAP Estimation Algorithms in

Two Variables

(Figure: as above.)

Choose the right constant: θ'b;k = qb;k

θ'b;0 = qb;0, attained at f(a) = 1

Potentials along the red path add up to 0

Page 58: MAP Estimation Algorithms in

Two Variables

(Figure: as above.)

Choose the right constant: θ'b;k = qb;k

Mab;1 = min { θa;0 + θab;01 , θa;1 + θab;11 } = min { 5 + 1 , 2 + 0 }

Page 59: MAP Estimation Algorithms in

Two Variables

(Figure: after the update, θ'b;1 = 4 + 2 = 6; θ'ab;01 = -1, θ'ab;11 = -2.)

Choose the right constant: θ'b;k = qb;k

θ'b;0 = qb;0 and θ'b;1 = qb;1, each attained at f(a) = 1

Minimum of min-marginals = MAP estimate

Page 60: MAP Estimation Algorithms in

Two Variables

(Figure: as above.)

Choose the right constant: θ'b;k = qb;k

f*(b) = 0, f*(a) = 1

Page 61: MAP Estimation Algorithms in

Two Variables

(Figure: as above.)

Choose the right constant: θ'b;k = qb;k

We get all the min-marginals of Vb

Page 62: MAP Estimation Algorithms in

Recap

We only need to know two sets of equations

General form of Reparameterization:

θ'a;i = θa;i + Mba;i

θ'b;k = θb;k + Mab;k

θ'ab;ik = θab;ik - Mab;k - Mba;i

Reparameterization of (a,b) in Belief Propagation:

Mab;k = mini { θa;i + θab;ik }

Mba;i = 0

Page 63: MAP Estimation Algorithms in

Three Variables

(Figure: chain Va-Vb-Vc with unary and pairwise potentials for labels l0 and l1.)

Reparameterize the edge (a,b) as before

Page 64: MAP Estimation Algorithms in

Three Variables

(Figure: after reparameterizing (a,b): θ'b;0 = 5 and θ'b;1 = 6, each attained at f(a) = 1.)

Reparameterize the edge (a,b) as before

Page 65: MAP Estimation Algorithms in

Three Variables

(Figure: as above.)

Reparameterize the edge (a,b) as before

Potentials along the red path add up to 0

Page 66: MAP Estimation Algorithms in

Three Variables

(Figure: as above.)

Reparameterize the edge (b,c) as before

Potentials along the red path add up to 0

Page 67: MAP Estimation Algorithms in

Three Variables

(Figure: the chain after reparameterizing (b,c); the minimizing label f(b) is recorded for each label of Vc.)

Reparameterize the edge (b,c) as before

Potentials along the red path add up to 0

Page 68: MAP Estimation Algorithms in

Three Variables

(Figure: as above.)

Reparameterize the edge (b,c) as before

Potentials along the red path add up to 0

The unary potentials of Vc are now its min-marginals qc;0 and qc;1

Page 69: MAP Estimation Algorithms in

Three Variables

(Figure: as above.)

f*(c) = 0, f*(b) = 0, f*(a) = 1

Generalizes to any length chain

Page 70: MAP Estimation Algorithms in

Three Variables

(Figure: as above.)

f*(c) = 0, f*(b) = 0, f*(a) = 1

Only Dynamic Programming

Page 71: MAP Estimation Algorithms in

Why Dynamic Programming?

3 variables = 2 variables + book-keeping

n variables = (n-1) variables + book-keeping

Start from left, go to right

Reparameterize current edge (a,b):

Mab;k = mini { θa;i + θab;ik }

θ'b;k = θb;k + Mab;k    θ'ab;ik = θab;ik - Mab;k

Repeat

Page 72: MAP Estimation Algorithms in

Why Dynamic Programming?

Start from left, go to right

Reparameterize current edge (a,b):

Mab;k = mini { θa;i + θab;ik }

θ'b;k = θb;k + Mab;k    θ'ab;ik = θab;ik - Mab;k

Repeat

The constants are Messages, so this is Message Passing

Why stop at dynamic programming?

Page 73: MAP Estimation Algorithms in

Three Variables

(Figure: the chain after the forward pass.)

Reparameterize the edge (c,b) as before

Page 74: MAP Estimation Algorithms in

Three Variables

(Figure: after reparameterizing (c,b).)

Reparameterize the edge (c,b) as before

θ'b;i = qb;i

Page 75: MAP Estimation Algorithms in

Three Variables

(Figure: as above.)

Reparameterize the edge (b,a) as before

Page 76: MAP Estimation Algorithms in

Three Variables

(Figure: after reparameterizing (b,a).)

Reparameterize the edge (b,a) as before

θ'a;i = qa;i

Page 77: MAP Estimation Algorithms in

Three Variables

(Figure: the fully reparameterized chain.)

Forward Pass + Backward Pass

All min-marginals are computed

Page 78: MAP Estimation Algorithms in

Belief Propagation on Chains

Start from left, go to right

Reparameterize current edge (a,b):

Mab;k = mini { θa;i + θab;ik }

θ'b;k = θb;k + Mab;k    θ'ab;ik = θab;ik - Mab;k

Repeat till the end of the chain

Then start from right, go to left

Repeat till the start of the chain
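The two passes can be sketched as one routine that applies the message update explicitly as a reparameterization, so that on return every unary vector holds that variable's min-marginals. A sketch (the function name and the example values in the usage note are ours):

```python
def chain_min_marginals(unary, pairwise):
    """Min-sum belief propagation on a chain, as reparameterization.

    unary[a][i]       -- theta_a;i
    pairwise[a][i][k] -- theta between vertex a and vertex a+1
    Returns updated unaries equal to the exact min-marginals q_a;i.
    """
    u = [list(x) for x in unary]
    p = [[list(row) for row in m] for m in pairwise]
    n, L = len(u), len(u[0])
    # Forward pass: fold M_ab;k = min_i (theta_a;i + theta_ab;ik) into the
    # next unary and subtract it from the pairwise column.
    for a in range(n - 1):
        for k in range(L):
            m = min(u[a][i] + p[a][i][k] for i in range(L))
            u[a + 1][k] += m
            for i in range(L):
                p[a][i][k] -= m
    # Backward pass: same update from right to left (rows instead of columns).
    for a in range(n - 2, -1, -1):
        for i in range(L):
            m = min(u[a + 1][k] + p[a][i][k] for k in range(L))
            u[a][i] += m
            for k in range(L):
                p[a][i][k] -= m
    return u

mm = chain_min_marginals([[5, 2], [2, 4], [6, 3]],
                         [[[0, 1], [1, 0]], [[0, 1], [1, 0]]])
```

For this illustrative three-variable chain, `mm` is [[11, 9], [9, 9], [11, 9]]: every variable's minimum min-marginal equals the MAP energy 9, as the slides argue.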

Page 79: MAP Estimation Algorithms in

Belief Propagation on Chains

• A way of computing reparam constants

• Generalizes to chains of any length

• Forward Pass (start to end): MAP estimate; min-marginals of final variable

• Backward Pass (end to start): all other min-marginals

Won’t need this, but good to know

Page 80: MAP Estimation Algorithms in

Computational Complexity

• Each constant takes O(|L|)

• Number of constants - O(|E||L|)

O(|E||L|2)

• Memory required ?

O(|E||L|)

Page 81: MAP Estimation Algorithms in

Belief Propagation on Trees

(Figure: a tree rooted at Va, with leaves Vd, Ve, Vg, Vh.)

Forward Pass: Leaf → Root

All min-marginals are computed

Backward Pass: Root → Leaf

Page 82: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties

• Tree-reweighted Message Passing

Page 83: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

Where do we start? Arbitrarily

(Figure: 4-cycle Va, Vb, Vc, Vd with unary potentials θa;i, θb;i, θc;i, θd;i for labels l0, l1.)

Reparameterize (a,b)

Page 84: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: the unaries of Vb updated to θ'b;k.)

Potentials along the red path add up to 0

Page 85: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: the unaries of Vc updated to θ'c;k.)

Potentials along the red path add up to 0

Page 86: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: the unaries of Vd updated to θ'd;k.)

Potentials along the red path add up to 0

Page 87: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: the unaries of Va updated to θ'a;i.)

Potentials along the red path add up to 0

Page 88: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: completing the cycle at Va; the original θa;i is subtracted.)

Potentials along the red path add up to 0

θ'a;0 - θa;0 = qa;0    θ'a;1 - θa;1 = qa;1

Page 89: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Pick minimum min-marginal. Follow red path.

θ'a;0 - θa;0 = qa;0    θ'a;1 - θa;1 = qa;1

Page 90: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: a second traversal of the cycle, starting from the minimum min-marginal.)

Potentials along the red path add up to 0

Page 91: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Potentials along the red path add up to 0

Page 92: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Potentials along the red path add up to 0

Page 93: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Potentials along the red path add up to 0

θ'a;1 - θa;1 = qa;1    θ'a;0 - θa;0 ≤ qa;0

Page 94: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Problem Solved

θ'a;1 - θa;1 = qa;1    θ'a;0 - θa;0 ≤ qa;0

Page 95: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Problem Not Solved

θ'a;1 - θa;1 = qa;1    θ'a;0 - θa;0 ≥ qa;0

Page 96: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Reparameterize (a,b) again

Page 97: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: the unaries of Vb updated again, to θ''b;k.)

Reparameterize (a,b) again

But doesn’t this overcount some potentials?

Page 98: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Reparameterize (a,b) again

Yes. But we will do it anyway

Page 99: MAP Estimation Algorithms in

Belief Propagation on Cycles

Va Vb

Vd Vc

(Figure: as above.)

Keep reparameterizing edges in some order

Hope for convergence and a good solution

Page 100: MAP Estimation Algorithms in

Belief Propagation

• Generalizes to any arbitrary random field

• Complexity per iteration ?

O(|E||L|2)

• Memory required ?

O(|E||L|)

Page 101: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties

• Tree-reweighted Message Passing

Page 102: MAP Estimation Algorithms in

Computational Issues of BP

Complexity per iteration O(|E||L|2)

Special Pairwise Potentials: θab;ik = wab d(|i - k|)

(Figure: the distance function d(|i - k|) for Potts, Truncated Linear, and Truncated Quadratic potentials.)

O(|E||L|) Felzenszwalb & Huttenlocher, 2004
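For the Potts case the speedup is easy to see: with θab;ik = wab·[i ≠ k], every entry of the message is either h[k] itself or the single value min_i h[i] + wab, so the |L|² minimization collapses to one pass. A sketch (the function names are ours; h[i] stands for the unary plus accumulated messages):

```python
def potts_message(h, w):
    """O(|L|) message for a Potts pairwise potential w * [i != k]."""
    base = min(h) + w  # best predecessor label, paying the Potts penalty once
    return [min(hk, base) for hk in h]

def naive_message(h, w):
    """Naive O(|L|^2) message, for checking."""
    L = len(h)
    return [min(h[i] + (w if i != k else 0) for i in range(L))
            for k in range(L)]

assert potts_message([5, 2, 7, 1], 2) == naive_message([5, 2, 7, 1], 2)
```

The truncated linear and quadratic cases use the same idea via distance transforms over the label line, again giving O(|L|) per message.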

Page 103: MAP Estimation Algorithms in

Computational Issues of BP

Memory requirements O(|E||L|)

Half of original BP Kolmogorov, 2006

Some approximations exist

But memory still remains an issue

Yu, Lin, Super and Tan, 2007

Lasserre, Kannan and Winn, 2007

Page 104: MAP Estimation Algorithms in

Computational Issues of BP

Order of reparameterization

• Randomly

• In some fixed order

• The one that results in maximum change (Residual Belief Propagation, Elidan et al., 2006)

Page 105: MAP Estimation Algorithms in

Theoretical Properties of BP

Exact for Trees Pearl, 1988

What about any general random field?

Run BP. Assume it converges.

Page 106: MAP Estimation Algorithms in

Theoretical Properties of BP

Exact for Trees Pearl, 1988

What about any general random field?

Choose variables in a tree. Change their labels. Value of energy does not decrease

Page 107: MAP Estimation Algorithms in

Theoretical Properties of BP

Exact for Trees Pearl, 1988

What about any general random field?

Choose variables in a cycle. Change their labels. Value of energy does not decrease

Page 108: MAP Estimation Algorithms in

Theoretical Properties of BP

Exact for Trees Pearl, 1988

What about any general random field?

For cycles, if BP converges then exact MAP. Weiss and Freeman, 2001

Page 109: MAP Estimation Algorithms in

Results: Object Detection (Felzenszwalb and Huttenlocher, 2004)

(Figure: detected parts H, T, A1, A2, L1, L2.)

Labels - Poses of parts

Unary Potentials: fraction of foreground pixels

Pairwise Potentials: favour valid configurations

Page 110: MAP Estimation Algorithms in

Results: Object Detection (Felzenszwalb and Huttenlocher, 2004)

(Figure: detection results.)

Page 111: MAP Estimation Algorithms in

Results: Binary Segmentation (Szeliski et al., 2008)

Labels - {foreground, background}

Unary Potentials: -log(likelihood) using learnt fg/bg models

Pairwise Potentials: 0, if same labels; exp(-|Da - Db|), if different labels

Page 112: MAP Estimation Algorithms in

Results: Binary Segmentation (Szeliski et al., 2008)

Labels - {foreground, background}

Unary Potentials: -log(likelihood) using learnt fg/bg models

Pairwise Potentials: 0, if same labels; exp(-|Da - Db|), if different labels

Belief Propagation

Page 113: MAP Estimation Algorithms in

Results: Binary Segmentation (Szeliski et al., 2008)

Labels - {foreground, background}

Unary Potentials: -log(likelihood) using learnt fg/bg models

Pairwise Potentials: 0, if same labels; exp(-|Da - Db|), if different labels

Global optimum

Page 114: MAP Estimation Algorithms in

Results: Stereo Correspondence (Szeliski et al., 2008)

Labels - {disparities}

Unary Potentials: similarity of pixel colours

Pairwise Potentials: 0, if same labels; exp(-|Da - Db|), if different labels

Page 115: MAP Estimation Algorithms in

Results: Stereo Correspondence (Szeliski et al., 2008)

Labels - {disparities}

Unary Potentials: similarity of pixel colours

Pairwise Potentials: 0, if same labels; exp(-|Da - Db|), if different labels

Belief Propagation

Page 116: MAP Estimation Algorithms in

Results: Stereo Correspondence (Szeliski et al., 2008)

Labels - {disparities}

Unary Potentials: similarity of pixel colours

Pairwise Potentials: 0, if same labels; exp(-|Da - Db|), if different labels

Global optimum

Page 117: MAP Estimation Algorithms in

Summary of BP

Exact for chains

Exact for trees

Approximate MAP for general cases

Not even convergence guaranteed

So can we do something better?

Page 118: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties

Page 119: MAP Estimation Algorithms in

TRW Message Passing

• Convex (not Combinatorial) Optimization

• A different look at the same problem

• A similar solution

• Combinatorial (not Convex) Optimization

We will look at the most general MAP estimation

Not trees No assumption on potentials

Page 120: MAP Estimation Algorithms in

Things to Remember

• Forward-pass computes min-marginals of root

• BP is exact for trees

• Every iteration provides a reparameterization

• Basics of Mathematical Optimization

Page 121: MAP Estimation Algorithms in

Mathematical Optimization

min g0(x)

subject to gi(x) ≤ 0 i=1, … , N

• Objective function • Constraints

• Feasible region = {x | gi(x) ≤ 0}

Optimal Solution: x* = arg min g0(x)

Optimal Value: g0(x*)

Page 122: MAP Estimation Algorithms in

Integer Programming

min g0(x)

subject to gi(x) ≤ 0 i=1, … , N

• Objective function • Constraints

• Feasible region = {x | gi(x) ≤ 0}

Optimal Solution: x* = arg min g0(x)

Optimal Value: g0(x*)

Additional constraint: xk ∈ Z

Page 123: MAP Estimation Algorithms in

Feasible Region

Generally NP-hard to optimize

Page 124: MAP Estimation Algorithms in

Linear Programming

min g0(x)

subject to gi(x) ≤ 0 i=1, … , N

• Objective function • Constraints

• Feasible region = {x | gi(x) ≤ 0}

Optimal Solution: x* = arg min g0(x)

Optimal Value: g0(x*)

Page 125: MAP Estimation Algorithms in

Linear Programming

min g0(x)

subject to gi(x) ≤ 0 i=1, … , N

• Linear objective function • Linear constraints

• Feasible region = {x | gi(x) ≤ 0}

Optimal Solution: x* = arg min g0(x)

Optimal Value: g0(x*)

Page 126: MAP Estimation Algorithms in

Linear Programming

min cTx

subject to Ax ≤ b

• Linear objective function • Linear constraints

• Feasible region = {x | Ax ≤ b}

Optimal Solution: x* = arg min cTx

Optimal Value: cTx*

Polynomial-time Solution

Page 127: MAP Estimation Algorithms in

Feasible Region

Polynomial-time Solution

Page 128: MAP Estimation Algorithms in

Feasible Region

Optimal solution lies on a vertex (obj func linear)

Page 129: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties

Page 130: MAP Estimation Algorithms in

Integer Programming Formulation

(Figure: Va, Vb with labels l0, l1.)

Unary Potentials: θa;0 = 5, θa;1 = 2, θb;0 = 2, θb;1 = 4

Labelling: f(a) = 1, f(b) = 0

ya;0 = 0, ya;1 = 1, yb;0 = 1, yb;1 = 0

Any f(.) has equivalent boolean variables ya;i

Page 131: MAP Estimation Algorithms in

Integer Programming Formulation

(Figure: as above.)

Unary Potentials: θa;0 = 5, θa;1 = 2, θb;0 = 2, θb;1 = 4

Labelling: f(a) = 1, f(b) = 0

ya;0 = 0, ya;1 = 1, yb;0 = 1, yb;1 = 0

Find the optimal variables ya;i

Page 132: MAP Estimation Algorithms in

Integer Programming Formulation

(Figure: as above.)

Unary Potentials: θa;0 = 5, θa;1 = 2, θb;0 = 2, θb;1 = 4

Sum of Unary Potentials: ∑a ∑i θa;i ya;i

ya;i ∈ {0,1}, for all Va, li

∑i ya;i = 1, for all Va

Page 133: MAP Estimation Algorithms in

Integer Programming Formulation

(Figure: as above.)

Pairwise Potentials: θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0

Sum of Pairwise Potentials: ∑(a,b) ∑ik θab;ik ya;i yb;k

ya;i ∈ {0,1}    ∑i ya;i = 1

Page 134: MAP Estimation Algorithms in

Integer Programming Formulation

(Figure: as above.)

Pairwise Potentials: θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0

Sum of Pairwise Potentials: ∑(a,b) ∑ik θab;ik yab;ik

ya;i ∈ {0,1}    ∑i ya;i = 1    yab;ik = ya;i yb;k

Page 135: MAP Estimation Algorithms in

Integer Programming Formulation

min ∑a ∑i θa;i ya;i + ∑(a,b) ∑ik θab;ik yab;ik

ya;i ∈ {0,1}

∑i ya;i = 1

yab;ik = ya;i yb;k
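A quick check that this objective reproduces Q(f; θ): encoding a labelling in the boolean variables ya;i and yab;ik = ya;i yb;k and expanding the sum gives exactly the unary plus pairwise cost. A sketch using the two-variable values above (helper names are ours):

```python
# Two-variable example: theta_a = (5, 2), theta_b = (2, 4), pairwise 0,1,1,0.
theta_unary = {("a", 0): 5, ("a", 1): 2, ("b", 0): 2, ("b", 1): 4}
theta_pair = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def objective(fa, fb):
    """The IP objective theta^T y for the y encoding labelling (fa, fb)."""
    y_a = [int(fa == i) for i in range(2)]   # y_{a;i} = [f(a) == i]
    y_b = [int(fb == k) for k in range(2)]
    val = sum(theta_unary[("a", i)] * y_a[i] for i in range(2))
    val += sum(theta_unary[("b", k)] * y_b[k] for k in range(2))
    # y_{ab;ik} = y_{a;i} * y_{b;k}: exactly one pairwise term survives.
    val += sum(theta_pair[(i, k)] * y_a[i] * y_b[k]
               for i in range(2) for k in range(2))
    return val

def Q(fa, fb):
    return theta_unary[("a", fa)] + theta_unary[("b", fb)] + theta_pair[(fa, fb)]

for fa in range(2):
    for fb in range(2):
        assert objective(fa, fb) == Q(fa, fb)
```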

Page 136: MAP Estimation Algorithms in

Integer Programming Formulation

min θTy

ya;i ∈ {0,1}

∑i ya;i = 1

yab;ik = ya;i yb;k

θ = [ … θa;i … ; … θab;ik … ]

y = [ … ya;i … ; … yab;ik … ]

Page 137: MAP Estimation Algorithms in

One variable, two labels

(Figure: the two points satisfying ya;0 + ya;1 = 1.)

ya;0 ∈ {0,1}    ya;1 ∈ {0,1}    ya;0 + ya;1 = 1

y = [ ya;0 ya;1 ]    θ = [ θa;0 θa;1 ]

Page 138: MAP Estimation Algorithms in

Two variables, two labels

θ = [ θa;0 θa;1 θb;0 θb;1 θab;00 θab;01 θab;10 θab;11 ]

y = [ ya;0 ya;1 yb;0 yb;1 yab;00 yab;01 yab;10 yab;11 ]

ya;0 ∈ {0,1}    ya;1 ∈ {0,1}    ya;0 + ya;1 = 1

yb;0 ∈ {0,1}    yb;1 ∈ {0,1}    yb;0 + yb;1 = 1

yab;00 = ya;0 yb;0    yab;01 = ya;0 yb;1

yab;10 = ya;1 yb;0    yab;11 = ya;1 yb;1

Page 139: MAP Estimation Algorithms in

In General

Marginal Polytope

Page 140: MAP Estimation Algorithms in

In General

θ ∈ R^(|V||L| + |E||L|²)

y ∈ {0,1}^(|V||L| + |E||L|²)

Number of constraints: |V||L| + |V| + |E||L|²

ya;i ∈ {0,1}    ∑i ya;i = 1    yab;ik = ya;i yb;k

Page 141: MAP Estimation Algorithms in

Integer Programming Formulation

min θTy

ya;i ∈ {0,1}

∑i ya;i = 1

yab;ik = ya;i yb;k

θ = [ … θa;i … ; … θab;ik … ]

y = [ … ya;i … ; … yab;ik … ]

Page 142: MAP Estimation Algorithms in

Integer Programming Formulation

min θᵀy

ya;i ∈ {0,1}

∑i ya;i = 1

yab;ik = ya;i yb;k

Solve to obtain MAP labelling y*

Page 143: MAP Estimation Algorithms in

Integer Programming Formulation

min θᵀy

ya;i ∈ {0,1}

∑i ya;i = 1

yab;ik = ya;i yb;k

But we can’t solve it in general

Page 144: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing
  – Integer Programming Formulation
  – Linear Programming Relaxation and its Dual
  – Convergent Solution for Dual
  – Computational Issues and Theoretical Properties

Page 145: MAP Estimation Algorithms in

Linear Programming Relaxation

min θᵀy

ya;i ∈ {0,1}

∑i ya;i = 1

yab;ik = ya;i yb;k

Two reasons why we can’t solve this

Page 146: MAP Estimation Algorithms in

Linear Programming Relaxation

min θᵀy

ya;i ∈ [0,1]

∑i ya;i = 1

yab;ik = ya;i yb;k

One reason why we can’t solve this

Page 147: MAP Estimation Algorithms in

Linear Programming Relaxation

min θᵀy

ya;i ∈ [0,1]

∑i ya;i = 1

∑k yab;ik = ∑k ya;i yb;k

One reason why we can’t solve this

Page 148: MAP Estimation Algorithms in

Linear Programming Relaxation

min θᵀy

ya;i ∈ [0,1]

∑i ya;i = 1

One reason why we can’t solve this

∑k yab;ik = ya;i ∑k yb;k    (∑k yb;k = 1)

Page 149: MAP Estimation Algorithms in

Linear Programming Relaxation

min θᵀy

ya;i ∈ [0,1]

∑i ya;i = 1

∑k yab;ik = ya;i

One reason why we can’t solve this

Page 150: MAP Estimation Algorithms in

Linear Programming Relaxation

min θᵀy

ya;i ∈ [0,1]

∑i ya;i = 1

∑k yab;ik = ya;i

No reason why we can’t solve this*

*memory requirements, time complexity
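The relaxation above is an ordinary linear program, so any LP solver handles small instances. A sketch for the two-variable, two-label example using `scipy.optimize.linprog` (potential values assumed; since a single edge is a tree, the LP optimum here is integral):

```python
from scipy.optimize import linprog
import numpy as np

# Variables: y = [ya;0, ya;1, yb;0, yb;1, yab;00, yab;01, yab;10, yab;11].
# Cost vector theta: assumed unaries (5,2) and (2,4) plus a Potts edge.
c = np.array([5.0, 2.0, 2.0, 4.0, 0.0, 1.0, 1.0, 0.0])

A_eq = np.array([
    [1, 1, 0, 0,  0,  0,  0,  0],   # sum_i ya;i = 1
    [0, 0, 1, 1,  0,  0,  0,  0],   # sum_i yb;i = 1
    [1, 0, 0, 0, -1, -1,  0,  0],   # ya;0 = yab;00 + yab;01
    [0, 1, 0, 0,  0,  0, -1, -1],   # ya;1 = yab;10 + yab;11
    [0, 0, 1, 0, -1,  0, -1,  0],   # yb;0 = yab;00 + yab;10
    [0, 0, 0, 1,  0, -1,  0, -1],   # yb;1 = yab;01 + yab;11
])
b_eq = np.array([1, 1, 0, 0, 0, 0])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 8)
print(res.fun, res.x)  # optimal value 5.0, an integral vertex (tree case)
```

On loopy graphs the same LP can return fractional vertices of the local polytope, which is exactly why the dual and message-passing machinery of the following slides is needed.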

Page 151: MAP Estimation Algorithms in

One variable, two labels

ya;0

ya;1

ya;0 ∈ {0,1}    ya;1 ∈ {0,1}    ya;0 + ya;1 = 1

y = [ ya;0 ya;1 ]    θ = [ θa;0 θa;1 ]

Page 152: MAP Estimation Algorithms in

One variable, two labels

ya;0

ya;1

ya;0 ∈ [0,1]    ya;1 ∈ [0,1]    ya;0 + ya;1 = 1

y = [ ya;0 ya;1 ]    θ = [ θa;0 θa;1 ]

Page 153: MAP Estimation Algorithms in

Two variables, two labels

θ = [ θa;0 θa;1 θb;0 θb;1 θab;00 θab;01 θab;10 θab;11 ]

y = [ ya;0 ya;1 yb;0 yb;1 yab;00 yab;01 yab;10 yab;11 ]

ya;0 ∈ {0,1}    ya;1 ∈ {0,1}    ya;0 + ya;1 = 1

yb;0 ∈ {0,1}    yb;1 ∈ {0,1}    yb;0 + yb;1 = 1

yab;00 = ya;0 yb;0 yab;01 = ya;0 yb;1

yab;10 = ya;1 yb;0 yab;11 = ya;1 yb;1

Page 154: MAP Estimation Algorithms in

Two variables, two labels

θ = [ θa;0 θa;1 θb;0 θb;1 θab;00 θab;01 θab;10 θab;11 ]

y = [ ya;0 ya;1 yb;0 yb;1 yab;00 yab;01 yab;10 yab;11 ]

ya;0 ∈ [0,1]    ya;1 ∈ [0,1]    ya;0 + ya;1 = 1

yb;0 ∈ [0,1]    yb;1 ∈ [0,1]    yb;0 + yb;1 = 1

yab;00 = ya;0 yb;0 yab;01 = ya;0 yb;1

yab;10 = ya;1 yb;0 yab;11 = ya;1 yb;1

Page 155: MAP Estimation Algorithms in

Two variables, two labels

θ = [ θa;0 θa;1 θb;0 θb;1 θab;00 θab;01 θab;10 θab;11 ]

y = [ ya;0 ya;1 yb;0 yb;1 yab;00 yab;01 yab;10 yab;11 ]

ya;0 ∈ [0,1]    ya;1 ∈ [0,1]    ya;0 + ya;1 = 1

yb;0 ∈ [0,1]    yb;1 ∈ [0,1]    yb;0 + yb;1 = 1

yab;00 + yab;01 = ya;0

yab;10 = ya;1 yb;0 yab;11 = ya;1 yb;1

Page 156: MAP Estimation Algorithms in

Two variables, two labels

= [ a;0 a;1 b;0 b;1

ab;00 ab;01 ab;10 ab;11]y = [ ya;0 ya;1 yb;0 yb;1

yab;00 yab;01 yab;10 yab;11]

ya;0 [0,1] ya;1 [0,1] ya;0 + ya;1 = 1

yb;0 [0,1] yb;1 [0,1] yb;0 + yb;1 = 1

yab;00 + yab;01 = ya;0

yab;10 + yab;11 = ya;1

Page 157: MAP Estimation Algorithms in

In General

Marginal Polytope

Local Polytope

Page 158: MAP Estimation Algorithms in

In General

θ ∈ R^(|V||L| + |E||L|²)

y ∈ [0,1]^(|V||L| + |E||L|²)

Number of constraints

|V||L| + |V| + |E||L|

Page 159: MAP Estimation Algorithms in

Linear Programming Relaxation

min θᵀy

ya;i ∈ [0,1]

∑i ya;i = 1

∑k yab;ik = ya;i

No reason why we can’t solve this

Page 160: MAP Estimation Algorithms in

Linear Programming Relaxation

Extensively studied

Optimization

Schlesinger, 1976

Koster, van Hoesel and Kolen, 1998

Theory

Chekuri et al., 2001; Archer et al., 2004

Machine Learning

Wainwright et al., 2001

Page 161: MAP Estimation Algorithms in

Linear Programming Relaxation

Many interesting Properties

• Global optimal MAP for trees

Wainwright et al., 2001

But we are interested in NP-hard cases

• Preserves solution for reparameterization

Page 162: MAP Estimation Algorithms in

Linear Programming Relaxation

• Large class of problems

• Metric Labelling
• Semi-metric Labelling

Many interesting Properties - Integrality Gap

Manokaran et al., 2008

• Most likely, provides best possible integrality gap

Page 163: MAP Estimation Algorithms in

Linear Programming Relaxation

• A computationally useful dual

Many interesting Properties - Dual

Optimal value of dual = Optimal value of primal

Easier-to-solve

Page 164: MAP Estimation Algorithms in

Dual of the LP Relaxation (Wainwright et al., 2001)

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

min θᵀy

ya;i ∈ [0,1]

∑i ya;i = 1

∑k yab;ik = ya;i

Page 165: MAP Estimation Algorithms in

Dual of the LP Relaxation (Wainwright et al., 2001)

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

1

2

3

4 5 6

1

2

3

4 5 6

∑i ρi θi = θ

ρi ≥ 0

Page 166: MAP Estimation Algorithms in

Dual of the LP Relaxation (Wainwright et al., 2001)

1

2

3

4 5 6

q*(θ1)

∑i ρi θi = θ

q*(θ2)

q*(θ3)

q*(θ4) q*(θ5) q*(θ6)

∑i ρi q*(θi)

Dual of LP

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

ρi ≥ 0

max

Page 167: MAP Estimation Algorithms in

Dual of the LP Relaxation (Wainwright et al., 2001)

1

2

3

4 5 6

q*(θ1)

∑i ρi θi = θ

q*(θ2)

q*(θ3)

q*(θ4) q*(θ5) q*(θ6)

Dual of LP

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

ρi ≥ 0

max ∑i ρi q*(θi)

Page 168: MAP Estimation Algorithms in

Dual of the LP Relaxation (Wainwright et al., 2001)

∑i ρi θi = θ

max ∑i ρi q*(θi)

I can easily compute q*(θi)

I can easily maintain reparam constraint

So can I easily solve the dual?

Page 169: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing
  – Integer Programming Formulation
  – Linear Programming Relaxation and its Dual
  – Convergent Solution for Dual
  – Computational Issues and Theoretical Properties

Page 170: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

VaVb Vc

VdVe Vf

VgVh Vi

1

2

3

1

2

3

4 5 6

4 5 6

∑i ρi θi = θ

∑i ρi q*(θi)

Pick a variable Va

Page 171: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

∑i ρi θi = θ

∑i ρi q*(θi)

Vc Vb Va

θ1c;0

θ1c;1

θ1b;0

θ1b;1

θ1a;0

θ1a;1

Va Vd Vg

θ4a;0

θ4a;1

θ4d;0

θ4d;1

θ4g;0

θ4g;1

Page 172: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ1 + ρ4θ4 + θrest

ρ1 q*(θ1) + ρ4 q*(θ4) + K

Vc Vb Va Va Vd Vg

Reparameterize to obtain min-marginals of Va

θ1c;0

θ1c;1

θ1b;0

θ1b;1

θ1a;0

θ1a;1

θ4a;0

θ4a;1

θ4d;0

θ4d;1

θ4g;0

θ4g;1

Page 173: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ'1 + ρ4θ'4 + θrest

Vc Vb Va

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ'1a;0

θ'1a;1

Va Vd Vg

θ'4a;0

θ'4a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

One pass of Belief Propagation

ρ1 q*(θ'1) + ρ4 q*(θ'4) + K
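The reparameterization that exposes the min-marginals of Va amounts to one forward pass of min-sum belief propagation along each tree. A minimal sketch for a chain ending at Va (function name and the potential values are illustrative, not from the slides):

```python
import numpy as np

def min_marginals_last(unaries, pairwise):
    """Min-marginals of the last chain variable by one min-sum forward pass.
    unaries: list of length-L cost arrays, one per chain variable;
    pairwise: list of LxL cost matrices, one per consecutive edge."""
    m = np.array(unaries[0], dtype=float)
    for theta_ab, theta_b in zip(pairwise, unaries[1:]):
        # For each label k of the next node, minimize over this node's label.
        m = np.min(m[:, None] + np.asarray(theta_ab, float), axis=0) + theta_b
    return m

# Two-node chain with a Potts edge; result matches brute-force enumeration.
q = min_marginals_last([[5, 2], [2, 4]], [[[0, 1], [1, 0]]])
print(q)  # -> [5. 6.]
```

Here q[i] is the minimum cost over all chain labellings in which the last variable takes label i, which is exactly the quantity q*(θ') that the slides read off the root of each tree.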

Page 174: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ'1 + ρ4θ'4 + θrest

Vc Vb Va Va Vd Vg

Remain the same

ρ1 q*(θ'1) + ρ4 q*(θ'4) + K

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ'1a;0

θ'1a;1

θ'4a;0

θ'4a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

Page 175: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ'1 + ρ4θ'4 + θrest

ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1} + K

Vc Vb Va Va Vd Vg

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ'1a;0

θ'1a;1

θ'4a;0

θ'4a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

Page 176: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ'1 + ρ4θ'4 + θrest

Vc Vb Va Va Vd Vg

Compute weighted average of min-marginals of Va

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ'1a;0

θ'1a;1

θ'4a;0

θ'4a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1} + K

Page 177: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ'1 + ρ4θ'4 + θrest

Vc Vb Va Va Vd Vg

θ''a;0 = (ρ1 θ'1a;0 + ρ4 θ'4a;0) / (ρ1 + ρ4)

θ''a;1 = (ρ1 θ'1a;1 + ρ4 θ'4a;1) / (ρ1 + ρ4)

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ'1a;0

θ'1a;1

θ'4a;0

θ'4a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1} + K
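The node-averaging step is a one-liner once the per-tree min-marginals are in hand; a sketch (function name and sample numbers are illustrative assumptions):

```python
def node_average(unaries, rhos):
    """TRW node-averaging for a variable shared by several trees.
    unaries: per-tree min-marginal pairs [theta'_a;0, theta'_a;1];
    rhos: matching tree weights. Returns theta''_a, shared by all trees."""
    total = sum(rhos)
    avg0 = sum(r * u[0] for r, u in zip(rhos, unaries)) / total
    avg1 = sum(r * u[1] for r, u in zip(rhos, unaries)) / total
    return [avg0, avg1]

# e.g. tree 1 sees min-marginals (5, 7), tree 4 sees (6, 3), both weight 1:
print(node_average([[5.0, 7.0], [6.0, 3.0]], [1.0, 1.0]))  # -> [5.5, 5.0]
```

After averaging, every tree containing Va is given the same unary θ''a, which is what makes the dual objective comparison on the next slides possible.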

Page 178: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ''1 + ρ4θ''4 + θrest

Vc Vb Va Va Vd Vg

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ''a;0

θ''a;1

θ''a;0

θ''a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1} + K

θ''a;0 = (ρ1 θ'1a;0 + ρ4 θ'4a;0) / (ρ1 + ρ4)

θ''a;1 = (ρ1 θ'1a;1 + ρ4 θ'4a;1) / (ρ1 + ρ4)

Page 179: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ''1 + ρ4θ''4 + θrest

Vc Vb Va Va Vd Vg

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ''a;0

θ''a;1

θ''a;0

θ''a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1} + K

θ''a;0 = (ρ1 θ'1a;0 + ρ4 θ'4a;0) / (ρ1 + ρ4)

θ''a;1 = (ρ1 θ'1a;1 + ρ4 θ'4a;1) / (ρ1 + ρ4)

Page 180: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ''1 + ρ4θ''4 + θrest

Vc Vb Va Va Vd Vg

ρ1 min{θ''a;0, θ''a;1} + ρ4 min{θ''a;0, θ''a;1} + K

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ''a;0

θ''a;1

θ''a;0

θ''a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

θ''a;0 = (ρ1 θ'1a;0 + ρ4 θ'4a;0) / (ρ1 + ρ4)

θ''a;1 = (ρ1 θ'1a;1 + ρ4 θ'4a;1) / (ρ1 + ρ4)

Page 181: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ''1 + ρ4θ''4 + θrest

Vc Vb Va Va Vd Vg

(ρ1 + ρ4) min{θ''a;0, θ''a;1} + K

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ''a;0

θ''a;1

θ''a;0

θ''a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

θ''a;0 = (ρ1 θ'1a;0 + ρ4 θ'4a;0) / (ρ1 + ρ4)

θ''a;1 = (ρ1 θ'1a;1 + ρ4 θ'4a;1) / (ρ1 + ρ4)

Page 182: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ''1 + ρ4θ''4 + θrest

Vc Vb Va Va Vd Vg

(ρ1 + ρ4) min{θ''a;0, θ''a;1} + K

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ''a;0

θ''a;1

θ''a;0

θ''a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

min{p1 + p2, q1 + q2} ≥ min{p1, q1} + min{p2, q2}
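This inequality is what guarantees the dual bound never decreases after node-averaging. A quick numerical sanity check (a sketch, not part of the original tutorial):

```python
import random

# min{p1+p2, q1+q2} >= min{p1, q1} + min{p2, q2}: the right-hand side is a
# lower bound on both p1+p2 and q1+q2, hence on their minimum.
random.seed(0)
for _ in range(1000):
    p1, p2, q1, q2 = (random.uniform(-10, 10) for _ in range(4))
    assert min(p1 + p2, q1 + q2) >= min(p1, q1) + min(p2, q2)
print("inequality holds on all samples")
```

In the TRW setting, p and q are the two averaged min-marginal values contributed by different trees; before averaging the dual collects the left-hand side, after averaging the (smaller or equal) right-hand side is replaced by a single shared minimum, so the objective can only go up or stay constant.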

Page 183: MAP Estimation Algorithms in

TRW Message Passing (Kolmogorov, 2006)

ρ1θ''1 + ρ4θ''4 + θrest

Vc Vb Va Va Vd Vg

Objective function increases or remains constant

θ'1c;0

θ'1c;1

θ'1b;0

θ'1b;1

θ''a;0

θ''a;1

θ''a;0

θ''a;1

θ'4d;0

θ'4d;1

θ'4g;0

θ'4g;1

(ρ1 + ρ4) min{θ''a;0, θ''a;1} + K

Page 184: MAP Estimation Algorithms in

TRW Message Passing

Initialize θi. Take care of reparam constraint

Choose random variable Va

Compute min-marginals of Va for all trees

Node-average the min-marginals

REPEAT

Kolmogorov, 2006

Can also do edge-averaging

Page 185: MAP Estimation Algorithms in

Example 1

Va Vb

0

1 1

0

2

5

4

2

l0

l1

Vb Vc

0

2 3

1

4

2

6

3

Vc Va

1

4 1

0

6

3

6

4

ρ1 = 1, ρ2 = 1, ρ3 = 1

5

6

7

Pick variable Va. Reparameterize.

Page 186: MAP Estimation Algorithms in

Example 1

Va Vb

-3

-2 -1

-2

5

7

4

2

Vb Vc

0

2 3

1

4

2

6

3

Vc Va

-3

1 -3

-3

6

3

10

7

ρ1 = 1, ρ2 = 1, ρ3 = 1

5

6

7

Average the min-marginals of Va

l0

l1

Page 187: MAP Estimation Algorithms in

Example 1

Va Vb

-3

-2 -1

-2

7.5

7

4

2

Vb Vc

0

2 3

1

4

2

6

3

Vc Va

-3

1 -3

-3

6

3

7.5

7

ρ1 = 1, ρ2 = 1, ρ3 = 1

7

6

7

Pick variable Vb. Reparameterize.

l0

l1

Page 188: MAP Estimation Algorithms in

Example 1

Va Vb

-7.5

-7 -5.5

-7

7.5

7

8.5

7

Vb Vc

-5

-3 -1

-3

9

6

6

3

Vc Va

-3

1 -3

-3

6

3

7.5

7

ρ1 = 1, ρ2 = 1, ρ3 = 1

7

6

7

Average the min-marginals of Vb

l0

l1

Page 189: MAP Estimation Algorithms in

Example 1

Va Vb

-7.5

-7 -5.5

-7

7.5

7

8.75

6.5

Vb Vc

-5

-3 -1

-3

8.75

6.5

6

3

Vc Va

-3

1 -3

-3

6

3

7.5

7

ρ1 = 1, ρ2 = 1, ρ3 = 1

6.5

6.5

7

Value of dual does not increase

l0

l1

Page 190: MAP Estimation Algorithms in

Example 1

Va Vb

-7.5

-7 -5.5

-7

7.5

7

8.75

6.5

Vb Vc

-5

-3 -1

-3

8.75

6.5

6

3

Vc Va

-3

1 -3

-3

6

3

7.5

7

ρ1 = 1, ρ2 = 1, ρ3 = 1

6.5

6.5

7

Maybe it will increase for Vc

NO

l0

l1

Page 191: MAP Estimation Algorithms in

Example 1

Va Vb

-7.5

-7 -5.5

-7

7.5

7

8.75

6.5

Vb Vc

-5

-3 -1

-3

8.75

6.5

6

3

Vc Va

-3

1 -3

-3

6

3

7.5

7

ρ1 = 1, ρ2 = 1, ρ3 = 1

Strong Tree Agreement

Exact MAP Estimate

f1(a) = 0 f1(b) = 0 f2(b) = 0 f2(c) = 0 f3(c) = 0 f3(a) = 0

l0

l1

Page 192: MAP Estimation Algorithms in

Example 2

Va Vb

0

1 1

0

2

5

2

2

Vb Vc

1

0 0

1

0

0

0

0

Vc Va

0

1 1

0

0

3

4

8

ρ1 = 1, ρ2 = 1, ρ3 = 1

4

0

4

Pick variable Va. Reparameterize.

l0

l1

Page 193: MAP Estimation Algorithms in

Example 2

Va Vb

-2

-1 -1

-2

4

7

2

2

Vb Vc

1

0 0

1

0

0

0

0

Vc Va

0

0 1

-1

0

3

4

9

ρ1 = 1, ρ2 = 1, ρ3 = 1

4

0

4

Average the min-marginals of Va

l0

l1

Page 194: MAP Estimation Algorithms in

Example 2

Va Vb

-2

-1 -1

-2

4

8

2

2

Vb Vc

1

0 0

1

0

0

0

0

Vc Va

0

0 1

-1

0

3

4

8

ρ1 = 1, ρ2 = 1, ρ3 = 1

4

0

4

Value of dual does not increase

l0

l1

Page 195: MAP Estimation Algorithms in

Example 2

Va Vb

-2

-1 -1

-2

4

8

2

2

Vb Vc

1

0 0

1

0

0

0

0

Vc Va

0

0 1

-1

0

3

4

8

ρ1 = 1, ρ2 = 1, ρ3 = 1

4

0

4

Maybe it will decrease for Vb or Vc

NO

l0

l1

Page 196: MAP Estimation Algorithms in

Example 2

Va Vb

-2

-1 -1

-2

4

8

2

2

Vb Vc

1

0 0

1

0

0

0

0

Vc Va

0

0 1

-1

0

3

4

8

ρ1 = 1, ρ2 = 1, ρ3 = 1

f1(a) = 1 f1(b) = 1 f2(b) = 1 f2(c) = 0 f3(c) = 1 f3(a) = 1

f2(b) = 0 f2(c) = 1

Weak Tree Agreement

Not Exact MAP Estimate

l0

l1

Page 197: MAP Estimation Algorithms in

Example 2

Va Vb

-2

-1 -1

-2

4

8

2

2

Vb Vc

1

0 0

1

0

0

0

0

Vc Va

0

0 1

-1

0

3

4

8

ρ1 = 1, ρ2 = 1, ρ3 = 1

Weak Tree Agreement

Convergence point of TRW

l0

l1

f1(a) = 1 f1(b) = 1 f2(b) = 1 f2(c) = 0 f3(c) = 1 f3(a) = 1

f2(b) = 0 f2(c) = 1

Page 198: MAP Estimation Algorithms in

Obtaining the Labelling

Only solves the dual. Primal solutions?

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

θ' = ∑i ρi θi

Fix the label of Va

Page 199: MAP Estimation Algorithms in

Obtaining the Labelling

Only solves the dual. Primal solutions?

Va Vb Vc

Vd Ve Vf

Vg Vh Vi

θ' = ∑i ρi θi

Fix the label of Vb

Continue in some fixed order

Meltzer et al., 2006
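The greedy rounding described above can be sketched as follows: fix variables in a fixed order, each to the label that minimizes its reparameterized unary plus the pairwise terms to already-fixed neighbours. All names and the toy numbers are illustrative assumptions, not the authors' code:

```python
import numpy as np

def round_primal(order, unary, pairwise, neighbours):
    """Greedy primal rounding from a reparameterization theta'.
    unary: dict var -> length-L cost list; pairwise: dict (a,b) -> LxL
    matrix with entry [i][k] = cost(a=i, b=k); neighbours: adjacency dict."""
    labels = {}
    for a in order:
        costs = np.array(unary[a], dtype=float)
        for b in neighbours[a]:
            if b in labels:                     # only fixed neighbours count
                if (a, b) in pairwise:
                    edge = np.asarray(pairwise[(a, b)], dtype=float)
                else:                           # stored in the other order
                    edge = np.asarray(pairwise[(b, a)], dtype=float).T
                costs += edge[:, labels[b]]
        labels[a] = int(np.argmin(costs))
    return labels

# Tiny two-node example with a Potts edge (illustrative numbers):
lab = round_primal(["a", "b"], {"a": [5, 2], "b": [2, 4]},
                   {("a", "b"): [[0, 1], [1, 0]]},
                   {"a": ["b"], "b": ["a"]})
print(lab)  # -> {'a': 1, 'b': 0}
```

This only yields the exact MAP when the dual has converged to tree agreement; in general it is a heuristic for extracting a primal labelling from the dual solution.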

Page 200: MAP Estimation Algorithms in

Outline

• Problem Formulation

• Reparameterization

• Belief Propagation

• Tree-reweighted Message Passing
  – Integer Programming Formulation
  – Linear Programming Relaxation and its Dual
  – Convergent Solution for Dual
  – Computational Issues and Theoretical Properties

Page 201: MAP Estimation Algorithms in

Computational Issues of TRW

• Speed-ups for some pairwise potentials

Basic Component is Belief Propagation

Felzenszwalb & Huttenlocher, 2004

• Memory requirements cut down by half

Kolmogorov, 2006

• Further speed-ups using monotonic chains

Kolmogorov, 2006

Page 202: MAP Estimation Algorithms in

Theoretical Properties of TRW

• Always converges, unlike BP

Kolmogorov, 2006

• Strong tree agreement implies exact MAP

Wainwright et al., 2001

• Optimal MAP for two-label submodular problems

Kolmogorov and Wainwright, 2005

θab;00 + θab;11 ≤ θab;01 + θab;10
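The submodularity condition is cheap to verify for a given energy; a sketch (function name assumed, costs given as 2x2 matrices per edge):

```python
def is_submodular(pairwise_costs):
    """Check theta_ab;00 + theta_ab;11 <= theta_ab;01 + theta_ab;10
    for every edge of a two-label problem."""
    return all(t[0][0] + t[1][1] <= t[0][1] + t[1][0] for t in pairwise_costs)

print(is_submodular([[[0, 1], [1, 0]]]))   # Potts edge: True
print(is_submodular([[[1, 0], [0, 1]]]))   # label-flipping edge: False
```

When every edge passes this test, TRW-S reaches the global optimum (and the same energies are also solvable exactly by graph cuts).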

Page 203: MAP Estimation Algorithms in

Results: Binary Segmentation (Szeliski et al., 2008)

Labels - {foreground, background}

Unary Potentials: -log(likelihood) using learnt fg/bg models

Pairwise Potentials: 0, if same labels

1 - exp(|Da - Db|), if different labels

Page 204: MAP Estimation Algorithms in

Results: Binary Segmentation

Labels - {foreground, background}

Unary Potentials: -log(likelihood) using learnt fg/bg models

Szeliski et al., 2008

Pairwise Potentials: 0, if same labels

1 - exp(|Da - Db|), if different labels

TRW

Page 205: MAP Estimation Algorithms in

Results: Binary Segmentation

Labels - {foreground, background}

Unary Potentials: -log(likelihood) using learnt fg/bg models

Szeliski et al., 2008

Belief Propagation

Pairwise Potentials: 0, if same labels

1 - exp(|Da - Db|), if different labels

Page 206: MAP Estimation Algorithms in

Results: Stereo Correspondence (Szeliski et al., 2008)

Labels - {disparities}

Unary Potentials: Similarity of pixel colours

Pairwise Potentials: 0, if same labels

1 - exp(|Da - Db|), if different labels

Page 207: MAP Estimation Algorithms in

Results (Szeliski et al., 2008)

Labels - {disparities}

Unary Potentials: Similarity of pixel colours

Pairwise Potentials: 0, if same labels

1 - exp(|Da - Db|), if different labels

TRW

Stereo Correspondence

Page 208: MAP Estimation Algorithms in

Results (Szeliski et al., 2008)

Labels - {disparities}

Unary Potentials: Similarity of pixel colours

Belief Propagation

Pairwise Potentials: 0, if same labels

1 - exp(|Da - Db|), if different labels

Stereo Correspondence

Page 209: MAP Estimation Algorithms in

Results: Non-submodular Problems (Kolmogorov, 2006)

BP TRW-S

30×30 grid, K50

BP TRW-S

BP outperforms TRW-S

Page 210: MAP Estimation Algorithms in

Summary

• Trees can be solved exactly - BP

• No guarantee of convergence otherwise - BP

• Strong Tree Agreement - TRW-S

• Submodular energies solved exactly - TRW-S

• TRW-S solves an LP relaxation of MAP estimation

• Loopier graphs give worse results Rother and Kolmogorov, 2006

Page 211: MAP Estimation Algorithms in

Related New(er) Work

• Solving the Dual

Globerson and Jaakkola, 2007

Komodakis, Paragios and Tziritas 2007

Weiss et al., 2006

Schlesinger and Giginyak, 2007

• Solving the Primal

Ravikumar, Agarwal and Wainwright, 2008

Page 212: MAP Estimation Algorithms in

Related New(er) Work

• More complex relaxations

Sontag and Jaakkola, 2007

Komodakis and Paragios, 2008

Kumar, Kolmogorov and Torr, 2007

Werner, 2008

Sontag et al., 2008

Kumar and Torr, 2008

Page 213: MAP Estimation Algorithms in

Questions on Part I ?

Code + Standard Data

http://vision.middlebury.edu/MRF