energy minimization - university of chicagottic.uchicago.edu/~rurtasun/courses/cv/lecture15.pdfp =...

97
Energy Minimization Raquel Urtasun TTI Chicago Feb 28, 2013 Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 1 / 48

Upload: others

Post on 10-Aug-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy Minimization

Raquel Urtasun

TTI Chicago

Feb 28, 2013

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 1 / 48

Page 2: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

The ST-mincut problem

Suppose we have a graph G = {V ,E ,C}, with vertices V , Edges E andcosts C .

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 2 / 48

Page 3: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

The ST-mincut problem

An st-cut (S,T) divides the nodes between source and sink.

The cost of a st-cut is the sum of cost of all edges going from S to T

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 3 / 48

Page 4: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

The ST-mincut problem

The st-mincut is the st-cut with the minimum cost

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 4 / 48

Page 5: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Back to our energy minimization

Construct a graph such that

1 Any st-cut corresponds to an assignment of x

2 The cost of the cut is equal to the energy of x : E(x)

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 5 / 48

Page 6: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

St-mincut and Energy Minimization

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 6 / 48

Page 7: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

How are they equivalent?

A B

C D

0 1

0

1 xi

xj

= A + 0 0

C-A C-A

0 1

0

1

0 D-C

0 D-C

0 1

0

1

0 B+C-A-D

0 0

0 1

0

1 + +

if x1=1 add C-A

if x2 = 1 add D-C

B+C-A-D ! 0 is true from the submodularity of !ij

A = !ij (0,0) B = !ij(0,1) C = !ij

(1,0) D = !ij (1,1)

!ij (xi,xj) = !ij(0,0) + (!ij(1,0)-!ij(0,0)) xi + (!ij(1,0)-!ij(0,0)) xj + (!ij(1,0) + !ij(0,1) - !ij(0,0) - !ij(1,1)) (1-xi) xj

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 7 / 48

Page 8: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 9: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 10: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 11: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 12: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 13: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 14: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 15: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 16: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 8 / 48

Page 17: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

How to compute the St-mincut?

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 9 / 48

Page 18: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 10 / 48

Page 19: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

How does the code look like

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 10 / 48

Page 20: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 10 / 48

Page 21: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 10 / 48

Page 22: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph cuts for multi-label problems

Exact Transformation to QPBF [Roy and Cox 98] [Ishikawa 03] [Schlesingeret al. 06] [Ramalingam et al. 08]

Very high computational cost

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 11 / 48

Page 23: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Computing the Optimal Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 12 / 48

Page 24: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Move Making Algorithms

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 13 / 48

Page 25: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 14 / 48

Page 26: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 14 / 48

Page 27: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 14 / 48

Page 28: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Metric vs Semimetric

Two general classes of pairwise interactions

Metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

V (α, β) ≤ V (α, γ) + V (γ, β)

Semi-metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 15 / 48

Page 29: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Metric vs Semimetric

Two general classes of pairwise interactions

Metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

V (α, β) ≤ V (α, γ) + V (γ, β)

Semi-metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 15 / 48

Page 30: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 16 / 48

Page 31: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 16 / 48

Page 32: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 16 / 48

Page 33: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 16 / 48

Page 34: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Binary Moves

α− β moves works for semi-metrics

α expansion works for V being a metric

Figure: Figure from P. Kohli tutorial on graph-cuts

For certain x1 and x2, the move energy is sub-modular QPBF

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 17 / 48

Page 35: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Swap Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 18 / 48

Page 36: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Swap Move

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 18 / 48

Page 37: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Expansion Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 19 / 48

Page 38: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Expansion Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 19 / 48

Page 39: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 20 / 48

Page 40: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 20 / 48

Page 41: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 20 / 48

Page 42: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 20 / 48

Page 43: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 20 / 48

Page 44: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 20 / 48

Page 45: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Example

Figure: (a) Current partition (b) local move (c) α− β-swap (d) α-expansion.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 21 / 48

Page 46: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Algorithms

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 22 / 48

Page 47: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 23 / 48

Page 48: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 23 / 48

Page 49: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 23 / 48

Page 50: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

The set of vertices includes the two terminals α and β, as well as imagepixels p in the sets Pα and Pβ (i.e., fp ∈ {α, β}).

Each pixel p ∈ Pαβ is connected to the terminals α and β, called t-links.

Each set of pixels p, q ∈ Pαβ which are neighbors is connected by an edgeep,q

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 24 / 48

Page 51: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Computing the Cut

Any cut must have a single t-link not cut.

This defines a labeling

There is a one-to-one correspondences between a cut and a labeling.

The energy of the cut is the energy of the labeling.

See Boykov et al, ”fast approximate energy minimization via graph cuts”PAMI 2001.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 25 / 48

Page 52: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Properties

For any cut, then

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 26 / 48

Page 53: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Finding the optimal α expansion

Given an input labeling f (partition P) and a label α we want to find alabeling f that minimizes E over all labelings within one α-expansion of f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gα = (Vα, Eα).

The structure of this graph is dynamically determined by the currentpartition P and by the label α.

Different graph than the α− β swap.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 27 / 48

Page 54: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Finding the optimal α expansion

Given an input labeling f (partition P) and a label α we want to find alabeling f that minimizes E over all labelings within one α-expansion of f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gα = (Vα, Eα).

The structure of this graph is dynamically determined by the currentpartition P and by the label α.

Different graph than the α− β swap.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 27 / 48

Page 55: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Finding the optimal α expansion

Given an input labeling f (partition P) and a label α we want to find alabeling f that minimizes E over all labelings within one α-expansion of f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gα = (Vα, Eα).

The structure of this graph is dynamically determined by the currentpartition P and by the label α.

Different graph than the α− β swap.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 27 / 48

Page 56: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Finding the optimal α expansion

Given an input labeling f (partition P) and a label α we want to find alabeling f that minimizes E over all labelings within one α-expansion of f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gα = (Vα, Eα).

The structure of this graph is dynamically determined by the currentpartition P and by the label α.

Different graph than the α− β swap.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 27 / 48

Page 57: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

The set of vertices includes the two terminals α and α, as well as all imagepixels p ∈ P.

Additionally, for each pair of neighboring pixels p, q such that fp 6= fq wecreate an auxiliary node ap,q.

Each pixel p is connected to the terminals α and α, called t-links.

Each set of pixels p, q which are neighbors and fp = fq, we connect with andn-link.

For each pair of neighboring pixels such that fp 6= fq, we create a triplet{ep,a, ea,q, tαa }.

The set of edges is then

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 28 / 48

Page 58: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

The set of vertices includes the two terminals α and α, as well as all imagepixels p ∈ P.

Additionally, for each pair of neighboring pixels p, q such that fp 6= fq wecreate an auxiliary node ap,q.

Each pixel p is connected to the terminals α and α, called t-links.

Each set of pixels p, q which are neighbors and fp = fq, we connect with andn-link.

For each pair of neighboring pixels such that fp 6= fq, we create a triplet{ep,a, ea,q, tαa }.

The set of edges is then

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 28 / 48

Page 59: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

The set of vertices includes the two terminals α and α, as well as all imagepixels p ∈ P.

Additionally, for each pair of neighboring pixels p, q such that fp 6= fq wecreate an auxiliary node ap,q.

Each pixel p is connected to the terminals α and α, called t-links.

Each set of pixels p, q which are neighbors and fp = fq, we connect with andn-link.

For each pair of neighboring pixels such that fp 6= fq, we create a triplet{ep,a, ea,q, tαa }.

The set of edges is then

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 28 / 48

Page 60: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

The set of vertices includes the two terminals α and α, as well as all imagepixels p ∈ P.

Additionally, for each pair of neighboring pixels p, q such that fp 6= fq wecreate an auxiliary node ap,q.

Each pixel p is connected to the terminals α and α, called t-links.

Each set of pixels p, q which are neighbors and fp = fq, we connect with andn-link.

For each pair of neighboring pixels such that fp 6= fq, we create a triplet{ep,a, ea,q, tαa }.

The set of edges is then

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 28 / 48

Page 61: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

The set of vertices includes the two terminals α and α, as well as all imagepixels p ∈ P.

Additionally, for each pair of neighboring pixels p, q such that fp 6= fq wecreate an auxiliary node ap,q.

Each pixel p is connected to the terminals α and α, called t-links.

Each set of pixels p, q which are neighbors and fp = fq, we connect with andn-link.

For each pair of neighboring pixels such that fp 6= fq, we create a triplet{ep,a, ea,q, tαa }.

The set of edges is then

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 28 / 48

Page 62: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

The set of vertices includes the two terminals α and α, as well as all imagepixels p ∈ P.

Additionally, for each pair of neighboring pixels p, q such that fp 6= fq wecreate an auxiliary node ap,q.

Each pixel p is connected to the terminals α and α, called t-links.

Each set of pixels p, q which are neighbors and fp = fq, we connect with andn-link.

For each pair of neighboring pixels such that fp 6= fq, we create a triplet{ep,a, ea,q, tαa }.

The set of edges is then

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 28 / 48

Page 63: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Graph Construction

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 29 / 48

Page 64: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Properties

There is a one-to-one correspondences between a cut and a labeling.

The energy of the cut is the energy of the labeling.

See Boykov et al, ”fast approximate energy minimization via graph cuts”PAMI 2001.

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 30 / 48

Page 65: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Global Minimization Techniques

Ways to get an approximate solution typically

Dynamic programming approximations

Sampling

Simulated annealing

Graph-cuts: imposes restrictions on the type of pairwise cost functions

Message passing: iterative algorithms that pass messages between nodes inthe graph.

Now we can solve for the MAP (approximately) in general energies. We can solve

for other problems than stereo

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 31 / 48

Page 66: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Let’s look at data/bechmarks

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 32 / 48

Page 67: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Benchmarks

Two benchmarks with very different characteristics

(Middlebury) (KITTI)

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 33 / 48

Page 68: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Middlebury Dataset

Middlebury Stereo Evaluation – Version 2

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 34 / 48

Page 69: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Middlebury Dataset

Middlebury Stereo Evaluation – Version 2

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 34 / 48

Page 70: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Middlebury Dataset

Middlebury Stereo Evaluation – Version 2

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 34 / 48

Page 71: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Middlebury Dataset

Middlebury Stereo Evaluation – Version 2

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 34 / 48

Page 72: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Middlebury Dataset

Middlebury Stereo Evaluation – Version 2

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 34 / 48

Page 73: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Benchmarks for Stereo and metrics

Middlebury Stereo Evaluation – Version 2

Best methods < 3% errors (for all non-occluded regions)

http://vision.middlebury.edu/stereo/data/

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 35 / 48

Page 74: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Benchmarks: KITTI Data Collection

Two stereo rigs (1392× 512 px, 54 cm base, 90◦ opening)

Velodyne laser scanner, GPS+IMU localization

6 hours at 10 frames per second!

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 36 / 48

Page 75: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

The KITTI Vision Benchmark Suite

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 37 / 48

Page 76: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Novel Challenges

Fast guided cost-volume filtering (Rhemann et al., CVPR 2011)

Middlebury, Errors: 2.7%

Error threshold: 1 px (Middlebury) / 3 px (KITTI)

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 38 / 48

Page 77: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Novel Challenges

Fast guided cost-volume filtering (Rhemann et al., CVPR 2011)

Middlebury, Errors: 2.7% KITTI, Errors: 46.3%

Error threshold: 1 px (Middlebury) / 3 px (KITTI)

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 38 / 48

Page 78: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Novel Challenges

So what is the difference?

Middlebury

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

KITTI

Moving vehicle

Specularities

Sensor saturation

Large label set

Strong slants

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 39 / 48

Page 79: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Novel Challenges

So what is the difference?

Middlebury

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

KITTI

Moving vehicle

Specularities

Sensor saturation

Large label set

Strong slants

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 39 / 48

Page 80: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Novel Challenges

So what is the difference?

Middlebury

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

KITTI

Moving vehicle

Specularities

Sensor saturation

Large label set

Strong slants

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 39 / 48

Page 81: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Novel Challenges

So what is the difference?

Middlebury

d = 50 px

d = 16 px

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

KITTI

d = 150 px

d = 0 px

Moving vehicle

Specularities

Sensor saturation

Large label set

Strong slants

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 39 / 48

Page 82: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Novel Challenges

So what is the difference?

Middlebury

Laboratory

Lambertian

Rich in texture

Medium-size label set

Largely fronto-parallel

KITTI

Moving vehicle

Specularities

Sensor saturation

Large label set

Strong slants

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 39 / 48

Page 83: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Stereo Evaluation

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 40 / 48

Page 84: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

MRFs for stereo

Global methods: define a Markov random field over

Pixel-level

Fronto-parallel planes

Slanted planes

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 41 / 48

Page 85: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Plane MRFs

First segment an image into small regions, i.e., superpixels

Assume that the 3D world is compose of small frontal/slanted planes

Good representation if the superpixels are small and respect boundaries

E (x1, · · · , xn) =∑i

C (xi ) +∑i

∑j∈Nj

C (xi , xj)

with xi ∈ < for the fronto-parallel planes, and xi ∈ <3 for the slanted planes

This are continuous variables. Is this a problem?

What can I do to solve this? Discretize the problem

The unitary are usually agreegation of cost over the local matching on thepixels in that superpixel

Pairwise is typically smoothness

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 42 / 48

Page 86: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Plane MRFs

First segment an image into small regions, i.e., superpixels

Assume that the 3D world is compose of small frontal/slanted planes

Good representation if the superpixels are small and respect boundaries

E (x1, · · · , xn) =∑i

C (xi ) +∑i

∑j∈Nj

C (xi , xj)

with xi ∈ < for the fronto-parallel planes, and xi ∈ <3 for the slanted planes

This are continuous variables. Is this a problem?

What can I do to solve this? Discretize the problem

The unitary are usually agreegation of cost over the local matching on thepixels in that superpixel

Pairwise is typically smoothness

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 42 / 48

Page 87: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Plane MRFs

First segment an image into small regions, i.e., superpixels

Assume that the 3D world is compose of small frontal/slanted planes

Good representation if the superpixels are small and respect boundaries

E (x1, · · · , xn) =∑i

C (xi ) +∑i

∑j∈Nj

C (xi , xj)

with xi ∈ < for the fronto-parallel planes, and xi ∈ <3 for the slanted planes

This are continuous variables. Is this a problem?

What can I do to solve this? Discretize the problem

The unitary are usually agreegation of cost over the local matching on thepixels in that superpixel

Pairwise is typically smoothness

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 42 / 48

Page 88: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Plane MRFs

First segment an image into small regions, i.e., superpixels

Assume that the 3D world is compose of small frontal/slanted planes

Good representation if the superpixels are small and respect boundaries

E (x1, · · · , xn) =∑i

C (xi ) +∑i

∑j∈Nj

C (xi , xj)

with xi ∈ < for the fronto-parallel planes, and xi ∈ <3 for the slanted planes

This are continuous variables. Is this a problem?

What can I do to solve this? Discretize the problem

The unitary are usually agreegation of cost over the local matching on thepixels in that superpixel

Pairwise is typically smoothness

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 42 / 48

Page 89: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Plane MRFs

First segment an image into small regions, i.e., superpixels

Assume that the 3D world is compose of small frontal/slanted planes

Good representation if the superpixels are small and respect boundaries

E (x1, · · · , xn) =∑i

C (xi ) +∑i

∑j∈Nj

C (xi , xj)

with xi ∈ < for the fronto-parallel planes, and xi ∈ <3 for the slanted planes

This are continuous variables. Is this a problem?

What can I do to solve this? Discretize the problem

The unitary are usually agreegation of cost over the local matching on thepixels in that superpixel

Pairwise is typically smoothness

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 42 / 48

Page 90: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Plane MRFs

First segment an image into small regions, i.e., superpixels

Assume that the 3D world is compose of small frontal/slanted planes

Good representation if the superpixels are small and respect boundaries

E (x1, · · · , xn) =∑i

C (xi ) +∑i

∑j∈Nj

C (xi , xj)

with xi ∈ < for the fronto-parallel planes, and xi ∈ <3 for the slanted planes

This are continuous variables. Is this a problem?

What can I do to solve this? Discretize the problem

The unitary are usually agreegation of cost over the local matching on thepixels in that superpixel

Pairwise is typically smoothness

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 42 / 48

Page 91: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Plane MRFs

First segment an image into small regions, i.e., superpixels

Assume that the 3D world is compose of small frontal/slanted planes

Good representation if the superpixels are small and respect boundaries

E (x1, · · · , xn) =∑i

C (xi ) +∑i

∑j∈Nj

C (xi , xj)

with xi ∈ < for the fronto-parallel planes, and xi ∈ <3 for the slanted planes

This are continuous variables. Is this a problem?

What can I do to solve this? Discretize the problem

The unitary are usually agreegation of cost over the local matching on thepixels in that superpixel

Pairwise is typically smoothness

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 42 / 48

Page 92: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Slanted-plane MRFs

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 43 / 48

Page 93: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

A more sophisticated occlusion model

MRF on continuous variables (slanted planes) and discrete var. (boundary)

Combines depth ordering (segmentation) and stereo

!"#$%&'(

)*+,*$- !"#$"%&'()*+),-").&$-*%/01/2.&$*/"3/4*+,*$-

./0%1)*2'()*+),-"5*.&6"$4782/9*-:**$/4*+,*$-4/

;/4-&-*4/

<==.#48"$ >8$+* ?"2.&$&'

?"$6$#"#4/@&'8&9.*/

184='*-*/@&'8&9.*/)#2*'28A*.4/BC?D/EF'9*.&*GH/*-/&.I/JKLLM/

////////////////////////&$%/)NO?/EF=7&$-&H/*-/&.I/JKLKMP/

Takes as input disparities computed by any local algorithm

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 44 / 48

Page 94: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy of PCBP-Stereo

y the set of slanted 3D planes, o the set of discrete boundary variables

E (y, o) = Ecolor (o) + Ematch(y, o) + Ecompatibility (y, o) + Ejunction(o)

!"#"$%&'()$)&' *"+,$-'.)'/,'()0$%1%&'

*,2'"#%3,'

4)$)&'5"#"$%&".-'

!"#"$%&'

6"55"#"$%&'

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 45 / 48

Page 95: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy of PCBP-Stereo

y the set of slanted 3D planes, o the set of discrete boundary variables

E (y, o) = Ecolor (o) + Ematch(y, o) + Ecompatibility (y, o) + Ejunction(o)

!"#$$%$&'()*'+(#$,-.'(/0(*&1-'(2*,13#*'4(%31(

5/%1-'$2(64(3&4(%3'7+*&"(%$'+/2((((89/2*:$2(,$%*;"./63.(%3'7+*&"<(

=*,13#*'4(%31( >.3&'$2(1.3&$(

?#-&73'$2(@-32#3A7(0-&7A/&

B&(6/-&23#4(CB77.-,*/&D(E(F/#$"#/-&2(,$"%$&'(/)&,(6/-&23#4(

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 46 / 48

Page 96: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy of PCBP-Stereo

y the set of slanted 3D planes, o the set of discrete boundary variables

E (y, o) = Ecolor (o) + Ematch(y, o) + Ecompatibility (y, o) + Ejunction(o)

!"#$%&'()*&+,*-) .*&$/,%)0*&$,-)!"#$%&

1',2,',$3,)"2)+"#$%&'()*&+,*) 45"0*&$&')6)78$9,)6):33*#-8"$;)

<&/3=)

>:33*#-8"$?)

>78$9,?)

>5"0*&$&'?)

i j i j

i j

i j

on boundary

in both segments

@<0"-,)0,$&*/()

4A;)

4B;)

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 47 / 48

Page 97: Energy Minimization - University of Chicagottic.uchicago.edu/~rurtasun/courses/CV/lecture15.pdfP = fP ljl 2Lg, where P l = fp 2Pjf p = lgis a subset of pixels assigned label l. There

Energy of PCBP-Stereo

y the set of slanted 3D planes, o the set of discrete boundary variables

E (y, o) = Ecolor (o) + Ematch(y, o) + Ecompatibility (y, o) + Ejunction(o)

!""#$%&'()*'$(+,-.)-/,%'(&(0)

123'%%&*#/)",%/%)

!""#$%&'()4-'(5)

6,"7)8&(0/)9'3#,(,-)

:;,#&7)<=>?@)

A/(,#&B/)&23'%%&*#/)C$("D'(%)

Raquel Urtasun (TTI-C) Computer Vision Feb 28, 2013 48 / 48