CVPR 2012 Tutorial: GraphCut-based Optimisation for Computer Vision

GraphCut-based Optimisation for Computer Vision
Ľubor Ladický


Page 1: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

GraphCut-based Optimisation for Computer Vision

Ľubor Ladický

Page 2: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Overview

• Motivation

• Min-Cut / Max-Flow (Graph Cut) Algorithm

• Markov and Conditional Random Fields

• Random Field Optimisation using Graph Cuts

• Submodular vs. Non-Submodular Problems

• Pairwise vs. Higher Order Problems

• 2-Label vs. Multi-Label Problems

• Recent Advances in Random Field Optimisation

• Conclusions

Page 3: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Overview

• Motivation

• Min-Cut / Max-Flow (Graph Cut) Algorithm

• Markov and Conditional Random Fields

• Random Field Optimisation using Graph Cuts

• Submodular vs. Non-Submodular Problems

• Pairwise vs. Higher Order Problems

• 2-Label vs. Multi-Label Problems

• Recent Advances in Random Field Optimisation

• Conclusions

Page 4: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

Assign a label to each image pixel

[Figure: example labelling tasks – image denoising, geometry estimation, object segmentation (labels: building, sky, tree, grass), depth estimation]

Page 5: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• Labellings highly structured

Possible labelling · Improbable labelling · Impossible labelling

Page 6: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• Labellings highly structured

• Labels highly correlated with very complex dependencies

• Neighbouring pixels tend to take the same label

• Low number of connected components

• Only a limited set of classes tends to be present in one image

• Geometric / Location consistency

• Planarity in depth estimation

• … many others (task dependent)

Page 7: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• Labelling highly structured

• Labels highly correlated with very complex dependencies

• Independent label estimation too hard

Page 8: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• Labelling highly structured

• Labels highly correlated with very complex dependencies

• Independent label estimation too hard

• Whole labelling should be formulated as one optimisation problem

Page 9: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• Labelling highly structured

• Labels highly correlated with very complex dependencies

• Independent label estimation too hard

• Whole labelling should be formulated as one optimisation problem

• Number of pixels up to millions

• Hard to train complex dependencies

• Optimisation problem is hard to infer

Page 10: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• Labelling highly structured

• Labels highly correlated with very complex dependencies

• Independent label estimation too hard

• Whole labelling should be formulated as one optimisation problem

• Number of pixels up to millions

• Hard to train complex dependencies

• Optimisation problem is hard to infer

Vision is hard !

Page 11: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• You can

either

• Change the subject from Computer Vision to the History of Renaissance Art

Vision is hard !

Page 12: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Image Labelling Problems

• You can

either

• Change the subject from Computer Vision to the History of Renaissance Art

or

• Learn everything about Random Fields and Graph Cuts

Vision is hard !

Page 13: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Foreground / Background Estimation

Rother et al. SIGGRAPH04

Page 14: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Foreground / Background Estimation

Data term Smoothness term

Data term

Estimated using FG / BG colour models

Smoothness term

Intensity-dependent smoothness: neighbouring pixels are penalised for taking different labels, and the penalty decreases with the intensity difference between them
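As a rough illustration of how such terms are often computed (a sketch in the spirit of Rother et al. SIGGRAPH04; the particular colour models and constants below are assumptions of mine, not the slides' definitions): the data term scores each pixel under a foreground or background colour model, and the smoothness term penalises neighbouring pixels that take different labels, with a penalty that fades across strong intensity edges.

```python
import math

def data_term(pixel_intensity, fg_model, bg_model, label):
    # Negative log-likelihood under the FG (label 1) or BG (label 0) colour model.
    model = fg_model if label == 1 else bg_model
    return -math.log(max(model(pixel_intensity), 1e-12))

def smoothness_term(label_i, label_j, intensity_i, intensity_j, lam=2.0, beta=0.01):
    # Intensity-dependent smoothness: label changes are cheap across strong edges.
    if label_i == label_j:
        return 0.0
    return lam * math.exp(-beta * (intensity_i - intensity_j) ** 2)

# Toy colour models (assumed): foreground prefers bright pixels, background dark ones.
fg = lambda c: max(1e-6, min(1.0, c / 255.0))
bg = lambda c: max(1e-6, min(1.0, 1.0 - c / 255.0))
print(data_term(200, fg, bg, label=1))   # small cost: a bright pixel fits the FG model
print(smoothness_term(1, 0, 100, 105))   # ~1.56: similar intensities, a label change is expensive
print(smoothness_term(1, 0, 100, 200))   # ~0: strong image edge, a label change is cheap
```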

Page 15: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Foreground / Background Estimation

Data term Smoothness term

Page 16: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Foreground / Background Estimation

Data term Smoothness term

How to solve this optimisation problem?

Page 17: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Foreground / Background Estimation

Data term Smoothness term

How to solve this optimisation problem?

• Transform into min-cut / max-flow problem

• Solve it using min-cut / max-flow algorithm

Page 18: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Overview

• Motivation

• Min-Cut / Max-Flow (Graph Cut) Algorithm

• Markov and Conditional Random Fields

• Random Field Optimisation using Graph Cuts

• Submodular vs. Non-Submodular Problems

• Pairwise vs. Higher Order Problems

• 2-Label vs. Multi-Label Problems

• Applications

• Conclusions

Page 19: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: a directed network with a source, a sink, and six intermediate nodes; each edge is labelled with its capacity]

Task :

Maximize the flow from the source to the sink such that

1) The flow is conserved at each node

2) The flow along each pipe does not exceed its capacity

Page 20: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the same flow network]

Page 21: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the flow network with its edge capacities]

Formally, with V the set of nodes and E the set of edges: maximise the flow from the source, F = Σ_{(s,i)∈E} f_si, subject to conservation of flow at every non-terminal node, Σ_j f_ji = Σ_j f_ij, and the capacity constraint 0 ≤ f_ij ≤ c_ij for every edge (i, j) ∈ E.

Page 22: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the same flow network]

Ford & Fulkerson algorithm (1956)

Find the path from source to sink

While (path exists)

flow += maximum flow that fits through the path (its bottleneck capacity)

Build the residual graph (“subtract” the flow)

Find the path in the residual graph

End

Page 23: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the flow network – an augmenting path from source to sink is selected. The Ford & Fulkerson pseudocode above is repeated alongside the figure on this and the following slides.]

Page 24: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: a path with bottleneck capacity 3 is augmented]

flow = 3

Page 25: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – capacities along the augmenting path are reduced by 3 and the reverse edges gain 3]

flow = 3

Page 26: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph after the first augmentation]

flow = 3

Page 27: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – the next augmenting path is selected]

flow = 3

Page 28: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: a path with bottleneck capacity 3 is augmented]

flow = 6

Page 29: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – path capacities reduced by 3, reverse edges increased]

flow = 6

Page 30: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph after the second augmentation]

flow = 6

Page 31: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – the next augmenting path is selected]

flow = 6

Page 32: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: a path with bottleneck capacity 5 is augmented]

flow = 11

Page 33: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – path capacities reduced by 5, reverse edges increased]

flow = 11

Page 34: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph after the third augmentation]

flow = 11

Page 35: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – the next augmenting path is selected]

flow = 11

Page 36: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: a path with bottleneck capacity 2 is augmented]

flow = 13

Page 37: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – path capacities reduced by 2, reverse edges increased]

flow = 13

Page 38: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph after the fourth augmentation]

flow = 13

Page 39: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – the next augmenting path is selected]

flow = 13

Page 40: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: a path with bottleneck capacity 2 is augmented]

flow = 15

Page 41: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – path capacities reduced by 2, reverse edges increased]

flow = 15

Page 42: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph after the fifth augmentation]

flow = 15

Page 43: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: residual graph – no augmenting path from source to sink remains]

flow = 15

Page 44: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: final residual graph – the algorithm terminates]

flow = 15

Page 45: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: final residual graph]

Ford & Fulkerson algorithm (1956)

Why is the solution globally optimal ?

flow = 15

Page 46: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: final residual graph – S marks the set of nodes reachable from the source]

Ford & Fulkerson algorithm (1956)

Why is the solution globally optimal ?

1. Let S be the set of nodes reachable from the source in the residual graph

flow = 15

Page 47: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

Ford & Fulkerson algorithm (1956)

Why is the solution globally optimal ?

1. Let S be the set of nodes reachable from the source in the residual graph

2. The flow from S to V - S equals the sum of capacities from S to V - S

[Figure: the original graph with the cut between S and V - S]

Page 48: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

Ford & Fulkerson algorithm (1956)

Why is the solution globally optimal ?

1. Let S be the set of nodes reachable from the source in the residual graph

2. The flow from S to V - S equals the sum of capacities from S to V - S

3. The flow from any A to V - A is upper bounded by the sum of capacities from A to V - A

[Figure: the original graph with an arbitrary node set A]

Page 49: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

flow = 15

Ford & Fulkerson algorithm (1956)

Why is the solution globally optimal ?

1. Let S be the set of nodes reachable from the source in the residual graph

2. The flow from S to V - S equals the sum of capacities from S to V - S

3. The flow from any A to V - A is upper bounded by the sum of capacities from A to V - A

4. The solution is globally optimal

[Figure: the final flow on every edge shown as flow/capacity, e.g. 8/9, 5/5, 5/6, 8/8, 2/4, 0/2, 2/2, …]

Individual flows obtained by summing up all paths

Page 50: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

flow = 15

Ford & Fulkerson algorithm (1956)

Why is the solution globally optimal ?

1. Let S be the set of nodes reachable from the source in the residual graph

2. The flow from S to V - S equals the sum of capacities from S to V - S

3. The flow from any A to V - A is upper bounded by the sum of capacities from A to V - A

4. The solution is globally optimal

[Figure: the final flow/capacity on every edge]

Page 51: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: network with four edges of capacity 1000 and one middle edge of capacity 1]

Ford & Fulkerson algorithm (1956)

Order does matter

Page 52: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the same network – a poor choice of augmenting path goes through the capacity-1 edge]

Ford & Fulkerson algorithm (1956)

Order does matter

Page 53: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: after pushing 1 unit through the middle edge, two capacities drop to 999]

Ford & Fulkerson algorithm (1956)

Order does matter

Page 54: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the same residual network – another poor augmenting path is chosen through the middle edge]

Ford & Fulkerson algorithm (1956)

Order does matter

Page 55: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: after two augmentations of 1 unit each, all four large capacities are 999]

Ford & Fulkerson algorithm (1956)

Order does matter

Page 56: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the worst-case network]

Ford & Fulkerson algorithm (1956)

Order does matter

• Standard algorithm not polynomial

• Breadth-first search leads to O(VE²)

• A path is found in O(E)

• At least one edge gets saturated per augmentation

• The distance of the saturated edge from the source has to increase and is at most V, leading to O(VE) augmentations

(Edmonds & Karp, Dinic)

Page 57: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Max-Flow Problem

[Figure: the worst-case network]

Ford & Fulkerson algorithm (1956)

Order does matter

• Standard algorithm not polynomial

• Breadth-first search leads to O(VE²) (Edmonds & Karp)

• Various methods use different algorithms to find the augmenting path and vary in complexity
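To make the augmenting-path scheme above concrete, here is a minimal Edmonds–Karp sketch (Ford & Fulkerson with breadth-first path search) in Python. This is my own illustration, not code from the tutorial, and the small graph at the bottom is an assumed toy example rather than the network used in the slides.

```python
from collections import deque, defaultdict

def edmonds_karp(capacity, source, sink):
    """Max-flow by repeatedly augmenting along shortest (BFS) residual paths."""
    residual = defaultdict(int)   # residual capacities; reverse edges start at 0
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)             # the reverse edge lives in the residual graph

    flow = 0
    while True:
        # Find a path from source to sink in the residual graph (BFS).
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow           # no augmenting path left: the flow is maximal

        # Bottleneck = maximum flow that fits through this path.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)

        # "Subtract" the flow: build the residual graph.
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        flow += bottleneck

# Assumed toy network (not the one from the slides); its max flow is 5.
caps = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 'b'): 1, ('a', 't'): 2, ('b', 't'): 3}
print(edmonds_karp(caps, 's', 't'))   # -> 5
```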

Page 58: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the same network; the edge capacities are now read as edge costs]

Task :

Minimize the cost of the cut

1) Each node is either assigned to the source S or sink T

2) The cost of the edge (i, j) is paid if i ∈ S and j ∈ T
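A tiny sketch of the cut-cost definition just stated (the toy capacities are assumed, the same ones as in the max-flow sketch above): every non-terminal node is assigned to S or T, and a directed edge (i, j) is paid exactly when i ∈ S and j ∈ T. Minimising this cost over all assignments is the min-cut problem, and on this toy graph the minimum matches the max flow of 5.

```python
from itertools import product

def cut_cost(capacity, side):
    """Cost of the cut induced by `side` (maps node name -> 'S' or 'T')."""
    side = dict(side, s='S', t='T')          # the terminals are fixed
    return sum(c for (i, j), c in capacity.items()
               if side[i] == 'S' and side[j] == 'T')

caps = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 'b'): 1, ('a', 't'): 2, ('b', 't'): 3}
print(cut_cost(caps, {'a': 'S', 'b': 'T'}))  # -> 5

# Brute-force the minimum cut over the two non-terminal nodes.
nodes = ['a', 'b']
best = min(product('ST', repeat=len(nodes)),
           key=lambda assign: cut_cost(caps, dict(zip(nodes, assign))))
print(best, cut_cost(caps, dict(zip(nodes, best))))  # minimum cut cost = 5 = max flow
```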

Page 59: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with edge costs]

Page 60: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with edge costs; legend – source set, sink set, edge costs]

Page 61: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: one possible cut – source set S, sink set T]

cost = 18

Page 62: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: another example cut]

cost = 25

Page 63: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: another example cut]

cost = 23

Page 64: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with binary variables x1 … x6 attached to the non-terminal nodes]

Page 65: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with variables x1 … x6]

transformable into Linear program (LP)

Page 66: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with variables x1 … x6]

Dual to max-flow problem

transformable into Linear program (LP)

Page 67: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with variables x1 … x6]

After the substitution of the constraints :

Page 68: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with variables x1 … x6]

xi = 0 if node i is assigned to the source set S;  xi = 1 if it is assigned to the sink set T

The cut cost decomposes into source edges, sink edges, and edges between variables

Page 69: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

[Figure: the network with variables x1 … x6]

C(x) = 5x1 + 9x2 + 4x3 + 3x3(1-x1) + 2x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 6x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 3x6(1-x2) + 2x4(1-x5) + 3x6(1-x5) + 6(1-x4)

+ 8(1-x5) + 5(1-x6)


Page 70: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 5x1 + 9x2 + 4x3 + 3x3(1-x1) + 2x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 6x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 3x6(1-x2) + 2x4(1-x5) + 3x6(1-x5) + 6(1-x4)

+ 8(1-x5) + 5(1-x6)

[Figure: the same network]

Page 71: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 2x1 + 9x2 + 4x3 + 2x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 3x6(1-x2) + 2x4(1-x5) + 3x6(1-x5) + 6(1-x4)

+ 5(1-x5) + 5(1-x6)

+ 3x1 + 3x3(1-x1) + 3x5(1-x3) + 3(1-x5)

[Figure: the same network]

Page 72: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 2x1 + 9x2 + 4x3 + 2x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 3x6(1-x2) + 2x4(1-x5) + 3x6(1-x5) + 6(1-x4)

+ 5(1-x5) + 5(1-x6)

+ 3x1 + 3x3(1-x1) + 3x5(1-x3) + 3(1-x5)

3x1 + 3x3(1-x1) + 3x5(1-x3) + 3(1-x5)

=

3 + 3x1(1-x3) + 3x3(1-x5)

[Figure: the same network]

Page 73: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 3 + 2x1 + 9x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 3x6(1-x2) + 2x4(1-x5) + 3x6(1-x5) + 6(1-x4)

+ 5(1-x5) + 5(1-x6) + 3x3(1-x5)

[Figure: the reparameterised network]

Page 74: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 3 + 2x1 + 9x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 3x6(1-x2) + 2x4(1-x5) + 3x6(1-x5) + 6(1-x4)

+ 5(1-x5) + 5(1-x6) + 3x3(1-x5)

[Figure: the reparameterised network]

Page 75: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 3 + 2x1 + 6x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 2x5(1-x4) + 6(1-x4)

+ 2(1-x5) + 5(1-x6) + 3x3(1-x5)

+ 3x2 + 3x6(1-x2) + 3x5(1-x6) + 3(1-x5)

3x2 + 3x6(1-x2) + 3x5(1-x6) + 3(1-x5)

=

3 + 3x2(1-x6) + 3x6(1-x5)

[Figure: the reparameterised network]

Page 76: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 6 + 2x1 + 6x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 2x5(1-x4) + 6(1-x4) + 3x2(1-x6) + 3x6(1-x5)

+ 2(1-x5) + 5(1-x6) + 3x3(1-x5)

[Figure: the reparameterised network]

Page 77: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 6 + 2x1 + 6x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 2x2(1-x3) + 5x3(1-x2) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 5x6(1-x3) + 1x3(1-x6)

+ 2x5(1-x4) + 6(1-x4) + 3x2(1-x6) + 3x6(1-x5)

+ 2(1-x5) + 5(1-x6) + 3x3(1-x5)

[Figure: the reparameterised network]

Page 78: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 11 + 2x1 + 1x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 6x3(1-x6)

+ 2x5(1-x4) + 6(1-x4) + 3x2(1-x6) + 3x6(1-x5)

+ 2(1-x5) + 3x3(1-x5)

[Figure: the reparameterised network]

Page 79: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 11 + 2x1 + 1x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x4(1-x1)

+ 1x5(1-x1) + 3x5(1-x3) + 6x3(1-x6)

+ 2x5(1-x4) + 6(1-x4) + 3x2(1-x6) + 3x6(1-x5)

+ 2(1-x5) + 3x3(1-x5)

[Figure: the reparameterised network]

Page 80: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 13 + 1x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x1(1-x4)

+ 1x5(1-x1) + 3x5(1-x3) + 6x3(1-x6)

+ 2x4(1-x5) + 6(1-x4) + 3x2(1-x6) + 3x6(1-x5)

+ 3x3(1-x5)

[Figure: the reparameterised network]

Page 81: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 13 + 1x2 + 2x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x1(1-x4)

+ 1x5(1-x1) + 3x5(1-x3) + 6x3(1-x6)

+ 2x4(1-x5) + 6(1-x4) + 3x2(1-x6) + 3x6(1-x5)

+ 3x3(1-x5)

[Figure: the reparameterised network]

Page 82: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 15 + 1x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x1(1-x4)

+ 1x5(1-x1) + 6x3(1-x6) + 6x3(1-x5)

+ 2x5(1-x4) + 4(1-x4) + 3x2(1-x6) + 3x6(1-x5)

[Figure: the final reparameterised network]

Page 83: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 15 + 1x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x1(1-x4)

+ 1x5(1-x1) + 6x3(1-x6) + 6x3(1-x5)

+ 2x5(1-x4) + 4(1-x4) + 3x2(1-x6) + 3x6(1-x5)

[Figure: the final reparameterised network]

Page 84: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 15 + 1x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x1(1-x4)

+ 1x5(1-x1) + 6x3(1-x6) + 6x3(1-x5)

+ 2x5(1-x4) + 4(1-x4) + 3x2(1-x6) + 3x6(1-x5)

[Figure: final residual network – the min cut separates S (nodes reachable from the source) from T; no remaining term with a positive coefficient crosses it, so the cut costs 0 on top of the constant 15]

• All coefficients positive

• Must be global minimum

S – set of reachable nodes from s

Page 85: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Min-Cut Problem

C(x) = 15 + 1x2 + 4x3 + 5x1(1-x3)

+ 3x3(1-x1) + 7x2(1-x3) + 2x1(1-x4)

+ 1x5(1-x1) + 6x3(1-x6) + 6x3(1-x5)

+ 2x5(1-x4) + 4(1-x4) + 3x2(1-x6) + 3x6(1-x5)

[Figure: final residual network and the min cut]

• All coefficients positive

• Must be global minimum

T – set of nodes that can reach t

(not necessarily the same as the complement of S)

Page 86: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Overview

• Motivation

• Min-Cut / Max-Flow (Graph Cut) Algorithm

• Markov and Conditional Random Fields

• Random Field Optimisation using Graph Cuts

• Submodular vs. Non-Submodular Problems

• Pairwise vs. Higher Order Problems

• 2-Label vs. Multi-Label Problems

• Recent Advances in Random Field Optimisation

• Conclusions

Page 87: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Markov and Conditional RF

• Markov / Conditional Random fields model probabilistic dependencies of the set of random variables

Page 88: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Markov and Conditional RF

• Markov / Conditional Random fields model conditional dependencies between random variables

• Each variable is conditionally independent of all other variables given its neighbours

Page 89: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Markov and Conditional RF

• Markov / Conditional Random fields model conditional dependencies between random variables

• Each variable is conditionally independent of all other variables given its neighbours

• Posterior probability of the labelling x given data D is :

P(x | D) = (1 / Z(D)) exp( - Σ_{c ∈ C} ψ_c(x_c) )

(Z – partition function, ψ_c – potential functions, C – set of cliques)

Page 90: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Markov and Conditional RF

• Markov / Conditional Random fields model conditional dependencies between random variables

• Each variable is conditionally independent of all other variables given its neighbours

• Posterior probability of the labelling x given data D is :

P(x | D) = (1 / Z(D)) exp( - E(x) )

• Energy of the labelling is defined as :

E(x) = Σ_{c ∈ C} ψ_c(x_c)

(Z – partition function, ψ_c – potential functions, C – set of cliques)

Page 91: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Markov and Conditional RF

• The most probable (Max a Posteriori (MAP)) labelling is defined as:

x* = argmax_x P(x | D) = argmin_x E(x)

Page 92: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Markov and Conditional RF

• The most probable (Max a Posteriori (MAP)) labelling is defined as:

• The only distinction (MRF vs. CRF) is that in the CRF the conditional dependencies between variables depend also on the data

Page 93: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Markov and Conditional RF

• The most probable (Max a Posteriori (MAP)) labelling is defined as:

• The only distinction (MRF vs. CRF) is that in the CRF the conditional dependencies between variables depend also on the data

• Typically we define an energy first and then pretend there is an underlying probabilistic distribution there, but there isn’t really (Pssssst, don’t tell anyone)

Page 94: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Overview

• Motivation

• Min-Cut / Max-Flow (Graph Cut) Algorithm

• Markov and Conditional Random Fields

• Random Field Optimisation using Graph Cuts

• Submodular vs. Non-Submodular Problems

• Pairwise vs. Higher Order Problems

• 2-Label vs. Multi-Label Problems

• Recent Advances in Random Field Optimisation

• Conclusions

Page 95: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Standard CRF Energy

Pairwise CRF models

E(x) = Σ_i ψ_i(x_i) + Σ_(i,j) ψ_ij(x_i, x_j)    (data term + smoothness term)
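A minimal sketch (my own toy numbers) evaluating this pairwise CRF energy for a given labelling:

```python
def crf_energy(unary, pairwise, labelling):
    """E(x) = sum_i psi_i(x_i) + sum_(i,j) psi_ij(x_i, x_j)."""
    data = sum(unary[i][labelling[i]] for i in unary)            # data term
    smooth = sum(psi(labelling[i], labelling[j])                 # smoothness term
                 for (i, j), psi in pairwise.items())
    return data + smooth

# Two-pixel toy problem with labels {0, 1} and a Potts smoothness of weight 2.
unary = {0: {0: 1.0, 1: 3.0}, 1: {0: 2.5, 1: 0.5}}
potts = lambda a, b: 0.0 if a == b else 2.0
print(crf_energy(unary, {(0, 1): potts}, {0: 0, 1: 1}))   # 1.0 + 0.5 + 2.0 = 3.5
```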

Page 96: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

The energy / potential is called submodular (only) if for every pair of variables :

ψ_ij(0,0) + ψ_ij(1,1) ≤ ψ_ij(0,1) + ψ_ij(1,0)
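A one-line check of this condition (my own helper, for illustration):

```python
def is_submodular(psi):
    """psi is a dict over binary label pairs: psi[(xi, xj)] -> cost."""
    return psi[(0, 0)] + psi[(1, 1)] <= psi[(0, 1)] + psi[(1, 0)]

print(is_submodular({(0, 0): 0, (0, 1): 2, (1, 0): 2, (1, 1): 0}))  # True  (Potts-like)
print(is_submodular({(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 2}))  # False (rewards disagreement)
```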

Page 97: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

For 2-label problems x ∈ {0, 1} :

Page 98: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

For 2-label problems x ∈ {0, 1} :

Pairwise potential can be transformed into :

where

Page 99: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

For 2-label problems x ∈ {0, 1} :

Pairwise potential can be transformed into :

where

Page 100: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

For 2-label problems x ∈ {0, 1} :

Pairwise potential can be transformed into :

where

Page 101: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

After summing up :

Page 102: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

After summing up :

Let :

Page 103: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

After summing up :

Let : Then :

Page 104: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

After summing up :

Let : Then :

Equivalent st-mincut problem is :
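One standard way to carry out the transformation sketched on the last few slides is to split each submodular pairwise table into a constant, two unary terms, and a single non-negative coefficient on x_j(1 - x_i), which becomes the capacity of the edge i → j in the st-mincut graph. The sketch below (the variable names are mine) shows that decomposition and verifies it on a small table:

```python
def decompose_pairwise(psi):
    """psi[(xi, xj)] -> (constant, unary_i, unary_j, edge_weight) with

        psi(xi, xj) = constant + unary_i*xi + unary_j*xj + edge_weight*xj*(1 - xi)

    edge_weight = psi(0,1) + psi(1,0) - psi(0,0) - psi(1,1) is non-negative
    exactly when psi is submodular; it is the capacity of the edge i -> j.
    """
    A, B = psi[(0, 0)], psi[(0, 1)]
    C, D = psi[(1, 0)], psi[(1, 1)]
    return A, C - A, D - C, B + C - A - D

psi = {(0, 0): 1, (0, 1): 4, (1, 0): 3, (1, 1): 2}      # a small submodular table
const, ui, uj, w = decompose_pairwise(psi)
for xi in (0, 1):
    for xj in (0, 1):
        assert const + ui*xi + uj*xj + w*xj*(1 - xi) == psi[(xi, xj)]
print(const, ui, uj, w)   # 1 2 -1 4 ; w >= 0, so the term is graph-representable
```

Negative unary coefficients are handled by the usual rewrite c·x = c + (-c)(1 - x), which moves the cost onto the opposite terminal edge; summing the decompositions of all unary and pairwise terms gives the equivalent st-mincut problem.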

Page 105: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Foreground / Background Estimation

Data term Smoothness term

Data term

Estimated using FG / BG colour models

Smoothness term

Intensity-dependent smoothness: neighbouring pixels are penalised for taking different labels, and the penalty decreases with the intensity difference between them

Page 106: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Foreground / Background Estimation

[Figure: st-mincut graph built over the image – every pixel is linked to the source, the sink, and its neighbours]

Rother et al. SIGGRAPH04

Page 107: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Extendable to (some) Multi-label CRFs

• each state of multi-label variable encoded using multiple binary variables

• the cost of every possible cut must be the same as the associated energy

• the solution obtained by inverting the encoding

Graph Cut based Inference

[Flowchart: multi-label energy → encoding → build graph → graph cut → obtain solution]

Page 108: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Left Camera Image Right Camera Image Dense Stereo Result

• For each pixel assign a disparity label z
• Disparities from the discrete set {0, 1, …, D}

Page 109: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Data term

Left image feature

Shifted right image feature

Left Camera Image Right Camera Image

Page 110: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Data term

Left image feature

Shifted right image feature

Smoothness term
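A schematic of the data term just described (absolute intensity difference stands in for whatever feature distance the slides use; this is an illustration, not the tutorial's code): the cost of assigning disparity d to a left-image pixel compares it with the right-image pixel shifted by d.

```python
def stereo_unary_cost(left, right, y, x, d):
    """Data cost of disparity d at left-image pixel (y, x); images are 2D lists."""
    if x - d < 0:                        # shifted position falls outside the image
        return float('inf')
    return abs(left[y][x] - right[y][x - d])

left  = [[10, 10, 50, 50]]
right = [[10, 50, 50, 90]]
print([stereo_unary_cost(left, right, 0, 2, d) for d in range(3)])   # [0, 0, 40]
```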

Page 111: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Ishikawa PAMI03

Encoding

[Figure: each pixel i is represented by a chain of binary nodes xi0 … xiD between source and sink]

Page 112: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Ishikawa PAMI03

Unary Potential

[Figure: unary potentials placed on the chain edges; the cut shown pays the unary costs of the selected labels; ∞ edges prevent a chain from being cut more than once]

The cost of every cut should be equal to the corresponding energy under the encoding

Page 113: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Ishikawa PAMI03

Unary Potential

[Figure: a different cut and the unary costs it pays]

The cost of every cut should be equal to the corresponding energy under the encoding

Page 114: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Ishikawa PAMI03

Unary Potential

[Figure: a cut that crosses an ∞ edge – cutting a chain more than once costs ∞, so such labellings are never selected]

The cost of every cut should be equal to the corresponding energy under the encoding

Page 115: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Ishikawa PAMI03

Pairwise Potential

[Figure: interchain edges of weight K encode the pairwise cost; the cut shown pays the two unary costs plus K]

The cost of every cut should be equal to the corresponding energy under the encoding

Page 116: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

Ishikawa PAMI03

Pairwise Potential

[Figure: a cut where the two chains are cut D levels apart pays the two unary costs plus D·K]

The cost of every cut should be equal to the corresponding energy under the encoding

Page 117: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Estimation

See [Ishikawa PAMI03] for more details

[Figure: the chain construction with general interchain edge weights]

Extendable to any convex cost

Convex function
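A toy illustration (my own naming, not code from the tutorial) of the encoding idea behind this construction: each multi-label variable becomes a monotone chain of binary variables, the position at which the chain is cut selects the label, and the infinite edges are what exclude cuts that would cross a chain more than once.

```python
def encode(label, D):
    """Thermometer encoding of a label from {0, ..., D} into D binary variables."""
    return [1 if d < label else 0 for d in range(D)]

def decode(bits):
    """Invert the encoding: the label is the position where the chain is cut."""
    assert all(a >= b for a, b in zip(bits, bits[1:])), \
        "non-monotone chain: such cuts are ruled out by the infinite edges"
    return sum(bits)

D = 4
for label in range(D + 1):
    assert decode(encode(label, D)) == label
print(encode(2, D))   # [1, 1, 0, 0] decodes back to label 2
```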

Page 118: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Move making algorithms

• Original problem decomposed into a series of subproblems solvable with graph cut

• In each subproblem we find the optimal move from the current solution in a restricted search space

[Flowchart: initial solution → propose move → encoding → build graph → graph cut → update solution]

Page 119: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Example : [Boykov et al. 01]

α-expansion
– each variable either keeps its old label or changes it to α
– expansion moves for all labels α are iteratively performed till convergence

Transformation function :

α-swap
– each variable taking label α or β can change its label to α or β
– swap moves for all pairs α, β are iteratively performed till convergence
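A schematic of the α-expansion outer loop just described. `solve_expansion_move` is a placeholder for the graph-cut solve of one binary expansion move (encode the move, build the graph, run min-cut); it is not a real library call.

```python
def alpha_expansion(labels, energy, initial, solve_expansion_move, max_sweeps=10):
    """Iterate expansion moves over all labels until no move lowers the energy.

    energy(x)                  -> energy of a complete labelling x
    solve_expansion_move(x, a) -> best labelling in which every variable either
                                  keeps x[i] or switches to label a (assumed to
                                  be solved exactly by a graph cut)
    """
    x, best = dict(initial), energy(initial)
    for _ in range(max_sweeps):
        improved = False
        for alpha in labels:                      # one expansion move per label
            proposal = solve_expansion_move(x, alpha)
            e = energy(proposal)
            if e < best:                          # accept only improving moves
                x, best, improved = proposal, e, True
        if not improved:                          # converged: no label helps any more
            break
    return x, best
```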

Page 120: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Sufficient conditions for move making algorithms (all possible moves must be submodular) :

α-swap : semi-metricity

Proof:

Page 121: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Sufficient conditions for move making algorithms (all possible moves must be submodular) :

α-expansion : metricity

Proof:

Page 122: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Object-class Segmentation

Data term + Smoothness term

Data term : discriminatively trained classifier

Smoothness term : contrast-sensitive penalty on neighbouring pixels taking different labels

Page 123: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Object-class Segmentation

[Figure: original image and initial solution – everything labelled grass]

Page 124: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Object-class Segmentation

[Figure: original image, initial solution (grass), and the result after the building expansion]

Page 125: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Object-class Segmentation

[Figure: original image, initial solution, building expansion, sky expansion]

Page 126: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Object-class Segmentation

[Figure: original image, initial solution, building, sky, and tree expansions]

Page 127: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Object-class Segmentation

[Figure: original image, the successive expansion moves, and the final solution (grass, building, sky, tree, aeroplane)]

Page 128: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Range-expansion [Kumar, Veksler & Torr]
– expansion version of range-swap moves
– each variable can change its label to any label in a convex range or keep its old label

Range-(swap) moves [Veksler 07]
– each variable with a label in the convex range can change its label to any other label in that range
– all range moves are iteratively performed till convergence

Page 129: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Data term

Smoothness term

Same as before

Left Camera Image Right Camera Image Dense Stereo Result

Page 130: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Data term

Smoothness term

Same as before

Left Camera Image Right Camera Image Dense Stereo Result

Convex range Truncation

Page 131: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Original Image Initial Solution

Page 132: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Original Image Initial Solution After 1st expansion

Page 133: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Original Image Initial Solution

After 2nd expansion

After 1st expansion

Page 134: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Original Image Initial Solution

After 2nd expansion

After 1st expansion

After 3rd expansion

Page 135: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Original Image Initial Solution

After 2nd expansion

After 1st expansion

After 3rd expansion Final solution

Page 136: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

• Extendible to certain classes of binary higher order energies

• Higher order terms have to be transformable to the pairwise submodular energy with binary auxiliary variables z

[Equation: higher-order term over a clique c transformed into a pairwise term with binary auxiliary variables z]

[Flowchart: initial solution → propose move → encoding → transform to pairwise submodular → graph cut → update solution]

Page 137: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

• Extendible to certain classes of binary higher order energies

• Higher order terms have to be transformable to the pairwise submodular energy with binary auxiliary variables z

[Equation: higher-order term over a clique c transformed into a pairwise term]

Example :

Kolmogorov ECCV06, Ramalingam et al. DAM12

Page 138: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Enforces label consistency of the clique as a weak constraint

Input image Pairwise CRF Higher order CRF

Kohli et al. CVPR07

Page 139: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

[Flowchart: initial solution → propose move → encoding → transform to pairwise submodular → graph cut → update solution]

If the energy is not submodular / graph-representable

– Overestimation by a submodular energy

– Quadratic Pseudo-Boolean Optimisation (QPBO)

Page 140: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

[Flowchart: initial solution → propose move → encoding → overestimate by pairwise submodular → graph cut → update solution]

Overestimation by a submodular energy (convex / concave)

– We find an energy E'(t) s.t.

– it is tight at the current solution : E'(t0) = E(t0)

– it overestimates E(t) for all t : E'(t) ≥ E(t)

– We replace E(t) by E'(t)

– The moves are not optimal, but guaranteed to converge

– The tighter the over-estimation, the better

Example :

Page 141: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Quadratic Pseudo-Boolean Optimisation

– Each binary variable is encoded using two binary variables xi and x̄i s.t. x̄i = 1 - xi

Page 142: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Graph Cut based Inference

Quadratic Pseudo-Boolean Optimisation

[Figure: QPBO graph – each variable contributes two nodes, xi and x̄i, each connected to the source and the sink]

– Energy submodular in xi and x̄i

– Solved by dropping the constraint x̄i = 1 - xi

– If x̄i = 1 - xi holds in the solution, the label of xi is optimal

Page 143: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Overview

• Motivation

• Min-Cut / Max-Flow (Graph Cut) Algorithm

• Markov and Conditional Random Fields

• Random Field Optimisation using Graph Cuts

• Submodular vs. Non-Submodular Problems

• Pairwise vs. Higher Order Problems

• 2-Label vs. Multi-Label Problems

• Recent Advances in Random Field Optimisation

• Conclusions

Page 144: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Motivation

• To propose formulations enforcing certain structure properties of the output useful in computer vision

– Label Consistency in segments as a weak constraint

– Label agreement over different scales

– Label-set consistency

– Multiple domains consistency

• To propose feasible Graph-Cut based inference method

Page 145: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Structures in CRF

• Taskar et al. 02 – associative potentials

• Woodford et al. 08 – planarity constraint

• Vicente et al. 08 – connectivity constraint

• Nowozin & Lampert 09 – connectivity constraint

• Roth & Black 09 – field of experts

• Woodford et al. 09 – marginal probability

• Delong et al. 10 – label occurrence costs

Page 146: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Object-class Segmentation Problem

• Aims to assign a class label for each pixel of an image

• Classifier trained on the training set

• Evaluated on never seen test images

Page 147: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Pairwise CRF models

Standard CRF Energy for Object Segmentation

Cannot encode global consistency of labels!!

Local context

Ladický, Russell, Kohli, Torr ECCV10

Page 148: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

• Thing – Thing • Stuff - Stuff • Stuff - Thing

[ Images from Rabinovich et al. 07 ]

Encoding Co-occurrence

Co-occurrence is a powerful cue [Heitz et al. '08][Rabinovich et al. ‘07]

Ladický, Russell, Kohli, Torr ECCV10

Page 149: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

• Thing – Thing • Stuff - Stuff • Stuff - Thing

Encoding Co-occurrence

Co-occurrence is a powerful cue [Heitz et al. '08][Rabinovich et al. ‘07]

Proposed solutions :

1. Csurka et al. 08 – hard decision for label estimation
2. Torralba et al. 03 – GIST-based unary potential
3. Rabinovich et al. 07 – fully-connected CRF

[ Images from Rabinovich et al. 07 ] Ladický, Russell, Kohli, Torr ECCV10

Page 150: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

So...

What properties should these global co-occurrence potentials have ?

Ladický, Russell, Kohli, Torr ECCV10

Page 151: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions

Ladický, Russell, Kohli, Torr ECCV10

Page 152: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions

Incorporation in probabilistic framework

Unlikely possibilities are not completely ruled out

Ladický, Russell, Kohli, Torr ECCV10

Page 153: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions2. Invariance to region size

Ladický, Russell, Kohli, Torr ECCV10

Page 154: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions2. Invariance to region size

Cost for occurrence of {people, house, road etc .. } invariant to image area

Ladický, Russell, Kohli, Torr ECCV10

Page 155: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions2. Invariance to region size

The only possible solution :

Local context Global contextCost defined over the

assigned labels L(x)

L(x)={ , , }

Ladický, Russell, Kohli, Torr ECCV10

Page 156: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions2. Invariance to region size3. Parsimony – simple solutions preferred

L(x)={ building, tree, grass, sky }

L(x)={ aeroplane, tree, flower, building, boat, grass, sky }

Ladický, Russell, Kohli, Torr ECCV10

Page 157: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions2. Invariance to region size3. Parsimony – simple solutions preferred4. Efficiency

Ladický, Russell, Kohli, Torr ECCV10

Page 158: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Desired properties

1. No hard decisions2. Invariance to region size3. Parsimony – simple solutions preferred4. Efficiency

a) Memory requirements scale as O(n) with the image size and the number of labels

b) Inference tractable

Ladický, Russell, Kohli, Torr ECCV10

Page 159: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

• Torralba et al.(2003) – Gist-based unary potentials

• Rabinovich et al.(2007) - complete pairwise graphs

• Csurka et al.(2008) - hard estimation of labels present

Previous work

Ladický, Russell, Kohli, Torr ECCV10

Page 160: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

• Zhu & Yuille 1996 – MDL prior
• Bleyer et al. 2010 – Surface Stereo MDL prior
• Hoiem et al. 2007 – 3D Layout CRF MDL prior
• Delong et al. 2010 – label occurrence cost

Related work

C(x) = K |L(x)|

C(x) = Σ_L K_L δ_L(x)

Ladický, Russell, Kohli, Torr ECCV10
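Both label-set costs shown above are easy to state in code; a small sketch (the example K values are mine):

```python
def mdl_cost(x, K):
    """C(x) = K * |L(x)| : a constant K per label occurring in the labelling x."""
    return K * len(set(x.values()))

def per_label_cost(x, K_L):
    """C(x) = sum_L K_L * delta_L(x) : pay K_L for every label L present in x."""
    return sum(K_L[l] for l in set(x.values()))

x = {0: 'grass', 1: 'grass', 2: 'sky', 3: 'tree'}     # toy labelling of four pixels
print(mdl_cost(x, K=2))                                # 3 distinct labels -> 6
print(per_label_cost(x, {'grass': 1, 'sky': 2, 'tree': 5, 'cow': 9}))   # 1 + 2 + 5 = 8
```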

Page 161: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

• Zhu & Yuille 1996 – MDL prior
• Bleyer et al. 2010 – Surface Stereo MDL prior
• Hoiem et al. 2007 – 3D Layout CRF MDL prior
• Delong et al. 2010 – label occurrence cost

Related work

C(x) = K |L(x)|

C(x) = Σ_L K_L δ_L(x)

All special cases of our model

Ladický, Russell, Kohli, Torr ECCV10

Page 162: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Label indicator functions

Co-occurrence representation

Ladický, Russell, Kohli, Torr ECCV10

Page 163: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy Cost of current label set

Ladický, Russell, Kohli, Torr ECCV10

Page 164: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

Decomposition to α-dependent and α-independent part

α-independent α-dependent

Ladický, Russell, Kohli, Torr ECCV10

Page 165: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

Decomposition to α-dependent and α-independent part

Either α or all labels in the image after the move

Ladický, Russell, Kohli, Torr ECCV10

Page 166: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

submodular non-submodular

Ladický, Russell, Kohli, Torr ECCV10

Page 167: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

non-submodular

Non-submodular energy overestimated by E'(t)
• E'(t) = E(t) for the current solution
• E'(t) ≥ E(t) for any other labelling

Ladický, Russell, Kohli, Torr ECCV10

Page 168: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

non-submodular

Non-submodular energy overestimated by E'(t)
• E'(t) = E(t) for the current solution
• E'(t) ≥ E(t) for any other labelling

Occurrence - tight

Ladický, Russell, Kohli, Torr ECCV10

Page 169: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

non-submodular

Non-submodular energy overestimated by E'(t)
• E'(t) = E(t) for the current solution
• E'(t) ≥ E(t) for any other labelling

Co-occurrence overestimation

Ladický, Russell, Kohli, Torr ECCV10

Page 170: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

non-submodular

Non-submodular energy overestimated by E'(t)
• E'(t) = E(t) for the current solution
• E'(t) ≥ E(t) for any other labelling

General case [See the paper]

Ladický, Russell, Kohli, Torr ECCV10

Page 171: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for Co-occurrence

Move Energy

non-submodular

Non-submodular energy overestimated by E'(t)
• E'(t) = E(t) for the current solution
• E'(t) ≥ E(t) for any other labelling

Quadratic representation

Ladický, Russell, Kohli, Torr ECCV10

Page 172: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Potential Training for Co-occurrence

Ladický, Russell, Kohli, Torr ICCV09

Generatively trained co-occurrence potential

Approximated by 2nd order representation

Page 173: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Results on MSRC data set

Comparisons of results with and without co-occurrence

[Figure: columns – input image, without co-occurrence, with co-occurrence]

Ladický, Russell, Kohli, Torr ICCV09

Page 174: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Pixel CRF Energy for Object-class Segmentation

Pairwise CRF models

Data term Smoothness term

Input Image Pairwise CRF Result

Shotton et al. ECCV06

Page 175: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Pixel CRF Energy for Object-class Segmentation

Pairwise CRF models

Data term Smoothness term

• Lacks long range interactions

• Results oversmoothed

• Data term information may be insufficient

Page 176: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Pairwise CRF models

Data term Smoothness term

Input Image    Unsupervised Segmentation

Shi, Malik PAMI 2000; Comaniciu, Meer PAMI 2002; Felzenszwalb, Huttenlocher IJCV 2004

Page 177: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Pairwise CRF models

Data term Smoothness term

Input Image    Pairwise CRF Result

Batra et al. CVPR08, Yang et al. CVPR07, Zitnick et al. CVPR08, Rabinovich et al. ICCV07, Boix et al. CVPR10

Page 178: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Pairwise CRF models

Data term Smoothness term

• Allows long range interactions

• Does not distinguish between instances of objects

• Cannot recover from incorrect segmentation

Page 179: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Pairwise CRF models

Data term Smoothness term

• Allows long range interactions

• Does not distinguish between instances of objects

• Cannot recover from incorrect segmentation

Page 180: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Input Image    Higher order CRF

Kohli, Ladický, Torr CVPR08

Page 181: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Kohli, Ladický, Torr CVPR08

Page 182: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Kohli, Ladický, Torr CVPR08

No dominant label

Page 183: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Kohli, Ladický, Torr CVPR08

Dominant label l

Page 184: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Kohli, Ladický, Torr CVPR08

• Enforces consistency as a weak constraint

• Can recover from incorrect segmentation

• Can combine multiple segmentations

• Does not use features at segment level
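One common way to write a Robust P^N-style consistency potential (this particular truncated-linear parameterisation is my assumption for illustration, not a formula taken from the slides): the clique cost grows with the number of pixels disagreeing with the dominant label and is capped at a maximum penalty, which is what makes the consistency a weak constraint.

```python
def robust_pn_cost(clique_labels, gamma_max, slope):
    """Truncated-linear consistency cost over one segment / clique."""
    counts = {}
    for l in clique_labels:
        counts[l] = counts.get(l, 0) + 1
    disagreeing = len(clique_labels) - max(counts.values())   # pixels off the dominant label
    return min(gamma_max, slope * disagreeing)

print(robust_pn_cost(['sky'] * 10, gamma_max=4.0, slope=0.5))           # 0.0  (fully consistent)
print(robust_pn_cost(['sky'] * 8 + ['tree'] * 2, 4.0, 0.5))             # 1.0  (two outliers tolerated)
print(robust_pn_cost(['sky', 'tree', 'grass', 'cow'] * 5, 4.0, 0.5))    # 4.0  (truncated at the cap)
```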

Page 185: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Kohli, Ladický, Torr CVPR08

Transformable to pairwise graph with auxiliary variable

Page 186: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Ladický, Russell, Kohli, Torr ICCV09

where

Page 187: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Segment CRF Energy for Object-class Segmentation

Robust PN approach

Pairwise CRF + Segment consistency term

Input Image    Higher order CRF

Ladický, Russell, Kohli, Torr ICCV09

Page 188: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

Can be generalized to Hierarchical model

• Allows unary potentials for region variables

• Allows pairwise potentials for region variables

• Allows multiple layers and multiple hierarchies

Page 189: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

AH CRF Energy for Object-class Segmentation

Pairwise CRF Higher order term

Page 190: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

AH CRF Energy for Object-class Segmentation

Pairwise CRF Higher order term

Higher order term recursively defined as

Segment unary term

Segment pairwise term

Segment higher order term
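
A hedged sketch of the recursion (the layer superscripts are my notation, not the slide's): with auxiliary region variables x^{(1)} defined over the segments,

E(\mathbf{x}) = \sum_i \psi_i(x_i) + \sum_{ij} \psi_{ij}(x_i, x_j) + \min_{\mathbf{x}^{(1)}} \Big( \sum_c \psi_c(\mathbf{x}_c, x^{(1)}_c) + \sum_c \psi^{(1)}_c(x^{(1)}_c) + \sum_{cd} \psi^{(1)}_{cd}(x^{(1)}_c, x^{(1)}_d) + E^{(2)}(\mathbf{x}^{(1)}) \Big)

where \psi_c(\mathbf{x}_c, x^{(1)}_c) is the robust consistency term linking a segment variable to its pixels, \psi^{(1)}_c and \psi^{(1)}_{cd} are the segment unary and pairwise terms, and E^{(2)} repeats the same construction one layer up the hierarchy.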

Page 191: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

AH CRF Energy for Object-class Segmentation

Pairwise CRF Higher order term

Higher order term recursively defined as

Segment unary term

Segment pairwise term

Segment higher order term

Why is this generalisation useful?

Page 192: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

Let us analyze the case with

• one segmentation

• potentials only over segment level

Page 193: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

Let us analyze the case with

• one segmentation

• potentials only over segment level

1) Minimum is segment-consistent

Page 194: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

Let us analyze the case with

• one segmentation

• potentials only over segment level

1) Minimum is segment-consistent

2) The cost of every segment-consistent labelling equals the cost of a pairwise CRF over the segments

Page 195: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

Let us analyze the case with

• one segmentation

• potentials only over segment level

Equivalent to pairwise CRF over segments!

Page 196: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Associative Hierarchical CRFs

Ladický, Russell, Kohli, Torr ICCV09

• Merges information over multiple scales

• Easy to train potentials and learn parameters

• Allows multiple segmentations and hierarchies

• Allows long range interactions

• Limited (?) to associative interlayer connections

Page 197: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Graph Cut based move making algorithms [Boykov et al. 01]

α-expansion transformation function for the base layer

Ladický (thesis) 2011
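
For reference, the standard α-expansion transformation function maps the current labelling x and a binary move vector t to a new labelling; the 0/1 convention below is one common choice and may be reversed in the thesis:

T_\alpha(x_i, t_i) = \begin{cases} x_i & \text{if } t_i = 0 \\ \alpha & \text{if } t_i = 1 \end{cases}

Each expansion move minimises E(T_\alpha(\mathbf{x}, \mathbf{t})) over all binary vectors t with a single min-cut, so every base-layer pixel either keeps its label or switches to α.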

Page 198: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Graph Cut based move making algorithms [Boykov et al. 01]

α-expansion transformation function for the base layer

Ladický (thesis) 2011

α-expansion transformation function for auxiliary layer

(2 binary variables per node)
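
A sketch of why two binary variables per auxiliary node suffice (my reading of the construction, not a verbatim reproduction of the slide): during an α-expansion an auxiliary segment variable must be able to end up in one of three states (keep its old label, switch to α, or become "free"), which can be encoded by a pair of binary variables, e.g. (0,0) for the old label, (1,1) for α and (1,0) for free, with the fourth combination ruled out by a sufficiently large pairwise cost.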

Page 199: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

Interlayer connection between the base layer and the first auxiliary layer:

Page 200: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

Interlayer connection between the base layer and the first auxiliary layer:

The move energy is:


Page 201: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

The move energy for interlayer connection between the base and auxiliary layer is:

Can be transformed to binary submodular pairwise function:
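
As an illustration of the general pattern (an assumption, not necessarily the thesis' exact expression): a consistency term that is truncated linear in the move variables, f(\sum_i t_i) = \min(k \sum_i t_i, \gamma) with k, \gamma \ge 0, can be written with one extra binary variable m as

\min_{m \in \{0,1\}} \Big( \gamma \, m + k \, (1 - m) \sum_i t_i \Big)

Each pairwise term \theta(m, t_i) in this expression satisfies \theta(0,0) + \theta(1,1) \le \theta(0,1) + \theta(1,0), i.e. it is submodular, so the whole move energy can still be minimised exactly with a single graph cut.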

Page 202: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

The move energy for interlayer connection between auxiliary layers is:

Can be transformed to binary submodular pairwise function:

Page 203: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

Graph constructions

Page 204: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

Pairwise potential between auxiliary variables:

Page 205: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

Move energy if x_c = x_d:

Can be transformed to binary submodular pairwise function:

Page 206: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

Move energy if x_c ≠ x_d:

Can be transformed to binary submodular pairwise function:

Page 207: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference for AH-CRF

Ladický (thesis) 2011

Graph constructions

Page 208: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Potential training for Object class Segmentation

Pixel unary potential

Unary likelihoods based on spatial configuration (Shotton et al. ECCV06)

Classifier trained using boosting

Ladický, Russell, Kohli, Torr ICCV09
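
A minimal sketch of how such classifier responses are commonly turned into pixel unary potentials; the softmax calibration below is an assumption, not necessarily the exact conversion used in the paper:

import numpy as np

def unary_from_scores(scores):
    # scores: (num_pixels, num_labels) array of boosted classifier responses.
    # Returns unary potentials psi_i(x_i) = -log P(x_i | features), obtained
    # here with a softmax over the raw scores.
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    return -np.log(probs + 1e-12)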

Page 209: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Potential training for Object class Segmentation

Segment unary potential

Ladický, Russell, Kohli, Torr ICCV09

Classifier trained using boosting (resp. Kernel SVMs)

Page 210: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Results on MSRC dataset

Page 211: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Results on Corel dataset

Page 212: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Results on VOC 2010 dataset

Input Image Result Input Image Result

Page 213: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

VOC 2010-Qualitative

Input Image Result Input Image Result

Page 214: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

CamVid-Qualitative

Our Result

Input Image

Ground Truth

Brostow et al.

Page 215: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Quantitative comparison

MSRC dataset

Corel dataset

VOC2009 dataset

Page 216: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Quantitative comparison

Ladický, Russell, Kohli, Torr ICCV09

MSRC dataset

VOC2009 dataset

Page 217: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Unary Potential

Unary cost dependent on the similarity of patches, e.g. cross-correlation

Disparity = 0
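
As a rough illustration of such a matching cost (an assumed implementation, not the one used in the tutorial), the per-pixel unary for one candidate disparity can be computed with normalised cross-correlation over small patches:

import numpy as np

def ncc_unary_cost(left, right, disparity, half=3, eps=1e-6):
    # left, right: greyscale images as float arrays of equal shape.
    # disparity:   integer candidate disparity (pixel x in the left image is
    #              compared with pixel x - disparity in the right image).
    # Returns a cost map where low values mean the two patches match well.
    h, w = left.shape
    cost = np.full((h, w), 2.0)                  # worst possible cost
    for y in range(half, h - half):
        for x in range(half + disparity, w - half):
            p = left[y - half:y + half + 1, x - half:x + half + 1]
            q = right[y - half:y + half + 1,
                      x - disparity - half:x - disparity + half + 1]
            p = p - p.mean()
            q = q - q.mean()
            ncc = (p * q).sum() / (np.sqrt((p * p).sum() * (q * q).sum()) + eps)
            cost[y, x] = 1.0 - ncc               # ncc in [-1, 1] -> cost in [0, 2]
    return cost

Evaluating this cost for every candidate disparity gives the unary term of the stereo CRF.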

Page 218: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Unary cost dependent on the similarity of patches, e.g. cross-correlation

Unary Potential

Disparity = 5

Page 219: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Unary cost dependent on the similarity of patches, e.g. cross-correlation

Unary Potential

Disparity = 10

Page 220: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

• Encourages label consistency in adjacent pixels

• Cost based on the distance of labels

Truncated linear    Truncated quadratic

Pairwise Potential
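
Written out (λ is a generic smoothness weight and T a truncation threshold):

\psi_{ij}(d_i, d_j) = \lambda \min(|d_i - d_j|, T) \quad \text{(truncated linear)}, \qquad \psi_{ij}(d_i, d_j) = \lambda \min((d_i - d_j)^2, T) \quad \text{(truncated quadratic)}

Truncation keeps the cost bounded across genuine depth discontinuities.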

Page 221: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Does not work for Road Scenes!

Original Image Dense Stereo Reconstruction

Page 222: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Does not work for Road Scenes!

Patches can be matched to any other patch on flat surfaces

Different brightness between the two cameras

Page 223: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dense Stereo Reconstruction

Does not work for Road Scenes!

Patches can be matched to any other patch on flat surfaces

Different brightness between the two cameras

Could object recognition for road scenes help?

Recognition of road scenes is relatively easy

Page 224: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Dense Stereo Reconstruction and Object class Segmentation

sky

• Object class and 3D location are mutually informative

• Sky always at infinity (disparity = 0)

Page 225: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Dense Stereo Reconstruction and Object class Segmentation

building    car    sky

• Object class and 3D location are mutually informative

• Sky always at infinity (disparity = 0)

• Cars, buses & pedestrians have their typical height

Page 226: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Dense Stereo Reconstruction and Object class Segmentation

building    car    sky

• Object class and 3D location are mutually informative

• Sky always at infinity (disparity = 0)

• Cars, buses & pedestrians have their typical height

• Road and pavement on the ground plane

road

Page 227: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Dense Stereo Reconstruction and Object class Segmentation

building    car    sky

• Object class and 3D location are mutually informative

• Sky always at infinity (disparity = 0)

• Cars, buses & pedestrians have their typical height

• Road and pavement on the ground plane

• Buildings and pavement on the sides

road

Page 228: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Dense Stereo Reconstruction and Object class Segmentation

building    car    sky

• Object class and 3D location are mutually informative

• Sky always at infinity (disparity = 0)

• Cars, buses & pedestrians have their typical height

• Road and pavement on the ground plane

• Buildings and pavement on the sides

• Both problems formulated as CRF

• Joint approach possible?

road

Page 229: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Formulation

• Each pixel takes a label z_i = [ x_i, y_i ] ∈ L_1 × L_2

• Dependency of x_i and y_i encoded as unary and pairwise potentials, e.g.

• strong correlation between x = road, y = near ground plane

• strong correlation between x = sky, y = 0

• Correlation of edge in object class and disparity domain

Page 230: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Formulation

• Weighted sum of object class, depth and joint potential

• Joint unary potential based on histograms of height

Object layer

Disparity layer

Joint unary links

Unary Potential
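
A hedged sketch of the joint energy (the weights and the exact form of the joint terms are assumptions): with z_i = [x_i, y_i],

E(\mathbf{z}) = E^{O}(\mathbf{x}) + E^{D}(\mathbf{y}) + \sum_i \psi^{J}_i(x_i, y_i) + \sum_{ij} \psi^{J}_{ij}(z_i, z_j)

where E^{O} and E^{D} are the object-class and disparity CRFs from the previous slides, \psi^{J}_i is the joint unary derived from class-specific height histograms (for example, low cost for road at ground-plane height or for sky at disparity 0), and \psi^{J}_{ij} couples class boundaries with disparity boundaries.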

Page 231: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Formulation

Pairwise Potential

• Object class and depth edges correlated

• Transitions in depth occur often at the object boundaries

Object layer

Disparity layer

Joint pairwise links

Page 232: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Joint Formulation

Page 233: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference

• Standard α-expansion
• Each node in each expansion move keeps its old label or takes a new label [x_{L1}, y_{L2}]

• Possible in case of metric pairwise potentials

Page 234: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference

• Standard α-expansion
• Each node in each expansion move keeps its old label or takes a new label [x_{L1}, y_{L2}]

• Possible in case of metric pairwise potentials

Too many moves ( |L_1| × |L_2| )! Impractical!

Page 235: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Inference

• Projected move for the product label space
• One / some of the label components remain(s) constant after the move
• Set of projected moves:
  • α-expansion in the object class projection
  • Range-expansion in the depth projection
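
A sketch of the two projections (the transformation functions below are my notation): an object-class expansion with label α ∈ L_1 uses

T_\alpha([x_i, y_i], t_i) = \begin{cases} [x_i, y_i] & \text{if } t_i = 0 \\ [\alpha, y_i] & \text{if } t_i = 1 \end{cases}

so the disparity component is left untouched, while a depth range-expansion lets y_i move to any disparity inside an interval of L_2 with x_i held fixed. Sweeping over all class labels and all depth ranges explores the product space with far fewer moves than the |L_1| · |L_2| expansions of the naive approach.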

Page 236: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Dataset

• Leuven Road Scene dataset
• Contained:
  • 3 sequences
  • 643 pairs of images
• We labelled:
  • 50 training + 20 test images
  • Object class (7 labels)
  • Disparity (100 labels)
• Publicly available:
  • http://cms.brookes.ac.uk/research/visiongroup/files/Leuven.zip

Left camera Right camera Object GT Disparity GT

Page 237: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Qualitative Results

• Large improvement for dense stereo estimation

• Minor improvement in object class segmentation

Page 238: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Quantitative Results

Ratio of correctly labelled pixels as a function of the maximum allowed disparity error (delta)

Page 239: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Summary

• We proposed:
  • New models enforcing higher order structure
    • Label-consistency in segments
    • Consistency between scales
    • Label-set consistency
    • Consistency between different domains
  • Graph-Cut based inference methods
• Source code: http://cms.brookes.ac.uk/staff/PhilipTorr/ale.htm

There is more to come!

Page 240: CVPR2012: Tutorial: Graphcut-based Optimisation for Computer Vision

Questions?

Thank you