CVPR 2010: Higher-Order Models in Computer Vision: Part 4
TRANSCRIPT
Schedule
8:30-9:00 Introduction
9:00-10:00 Models: small cliques and special potentials
10:00-10:30 Tea break
10:30-12:00 Inference: relaxation techniques: LP, Lagrangian, Dual Decomposition
12:00-12:30 Models: global potentials and global parameters + discussion
MRF with global potential: the GrabCut model [Rother et al. '04]

E(x, θF, θB) = ∑_i Fi(θF) xi + Bi(θB)(1 - xi) + ∑_{i,j∈N} |xi - xj|

with Fi = -log Pr(zi | θF) and Bi = -log Pr(zi | θB),

where z is the input image, x the binary output segmentation, and θF, θB are Gaussian Mixture Models over foreground and background colours.

(Figure: foreground and background colour GMMs in the R-G plane; input image z and output segmentation x.)

Problem: for unknown x, θF, θB the optimization is NP-hard! [Vicente et al. '09]
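For illustration, a minimal sketch of how such unary terms could be computed from fitted GMMs; sklearn's GaussianMixture and the helper name unary_costs are my assumptions, not the tutorial's code:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def unary_costs(z, fg_pixels, bg_pixels, n_components=5):
    """Fit fg/bg colour GMMs (thetaF, thetaB) and return per-pixel
    unary costs Fi = -log Pr(zi | thetaF), Bi = -log Pr(zi | thetaB).

    z: (N, 3) pixel colours; fg_pixels / bg_pixels: colour samples
    used to fit each model (e.g. from a user-drawn bounding box).
    """
    theta_f = GaussianMixture(n_components).fit(fg_pixels)
    theta_b = GaussianMixture(n_components).fit(bg_pixels)
    F = -theta_f.score_samples(z)  # -log likelihood under foreground GMM
    B = -theta_b.score_samples(z)  # -log likelihood under background GMM
    return F, B
```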
GrabCut: Iterated Graph Cuts [Rother et al. Siggraph '04]

1. Learning of the colour distributions: min over θF, θB of E(x, θF, θB), for fixed x
2. Graph cut to infer the segmentation: min over x of E(x, θF, θB), for fixed θF, θB

Iterate 1 and 2 until convergence (see the sketch below). Most systems with global variables work like this, e.g. [ObjCut Kumar et al. '05, PoseCut Bray et al. '06, LayoutCRF Winn et al. '06].
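A minimal sketch of this alternating scheme, reusing the illustrative unary_costs helper above and assuming a hypothetical graph_cut solver for the min-cut step:

```python
import numpy as np

def grabcut(z, x_init, n_iters=10):
    """Alternate between the two steps; E(x, thetaF, thetaB) can only
    decrease, so the iteration converges to a local minimum."""
    x = x_init
    for _ in range(n_iters):
        # Step 1: for fixed x, the optimal colour models are fit to the
        # pixels currently labelled foreground / background.
        F, B = unary_costs(z, fg_pixels=z[x == 1], bg_pixels=z[x == 0])
        # Step 2: for fixed theta, minimize over x with one min-cut
        # (graph_cut is a stand-in for any s-t min-cut segmentation solver).
        x_new = graph_cut(F, B)
        if np.array_equal(x_new, x):
            break  # segmentation stopped changing: converged
        x = x_new
    return x
```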
GrabCut: Iterated Graph Cuts

(Figure: the energy after each iteration and the corresponding result, iterations 1-4. The foreground and background colour models in the R-G plane start out overlapping and separate as the graph cut is iterated.)
Optimizing over the θ's helps

(Figure: input image; result with no iteration [Boykov & Jolly '01]; result after convergence [GrabCut '04].)
Global optimality?

(Figure: GrabCut result, a local optimum, vs. the global optimum [Vicente et al. '09].)

Is it a problem of the optimization or of the model?
… a first attempt to solve it [Lempitsky et al. ECCV '08]

Model a discrete subset of colour models: fit 8 Gaussians to the whole image, and let binary indicator vectors select which Gaussians make up the foreground and background models, e.g. wF = (1,1,0,1,0,0,0,0), wB = (1,0,0,0,0,0,0,1). Number of solutions: |wF| × |wB| = 2^16.

E(x, wF, wB) = ∑_{i∈V} Fi(wF) xi + Bi(wB)(1 - xi) + ∑_{ij∈E} wij |xi - xj|

Global optimum: exhaustive search takes 65,536 graph cuts; Branch-and-MinCut takes only ~130-500 graph cuts (depends on the image).

(Figure: the 8 Gaussians, numbered 1-8, in the R-G colour plane.)
Branch-and-MinCut

Search a tree over partial assignments of (wF, wB): at a leaf both vectors are fully specified, e.g. wF = (1,1,1,1,0,0,0,0), wB = (1,0,1,1,0,1,0,0); at an inner node some entries are still free, e.g. wF = (0,0,*,*,*,*,*,*), wB = (0,*,*,*,*,*,*,*); the root leaves everything free: wF = wB = (*,*,*,*,*,*,*,*).

Each node is bounded from below by pushing the minimization over the remaining w inside the sums; the right-hand side can be evaluated by a single graph cut:

min_{x,wF,wB} E(x, wF, wB) = min_{x,wF,wB} [ ∑_i Fi(wF) xi + Bi(wB)(1 - xi) + ∑_ij wij(xi, xj) ]
  ≥ min_x [ ∑_i min_{wF} Fi(wF) xi + min_{wB} Bi(wB)(1 - xi) + ∑_ij wij(xi, xj) ]
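A schematic of the best-first branch-and-bound loop this bound enables; solve_mincut_bound, node.split() and node.is_leaf() are illustrative names standing in for the paper's machinery:

```python
import heapq
import itertools

def branch_and_mincut(root, solve_mincut_bound):
    """Best-first search over partial assignments of (wF, wB).

    solve_mincut_bound(node): ONE graph cut evaluating the lower bound
    above (exact when the node is a leaf, i.e. wF, wB fully specified).
    node.split(): children obtained by fixing one more free entry to 0 / 1.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = [(solve_mincut_bound(root), next(tie), root)]
    best_e, best_node = float('inf'), None
    while heap:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_e:
            break  # all remaining nodes are bounded worse: optimum found
        if node.is_leaf():
            best_e, best_node = bound, node
            continue
        for child in node.split():
            b = solve_mincut_bound(child)
            if b < best_e:
                heapq.heappush(heap, (b, next(tie), child))
    return best_e, best_node
```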
Results …

(Figure, two examples: GrabCut E = -618 vs. Branch-and-MinCut E = -624, speed-up 481 over exhaustive search; GrabCut E = -593 vs. Branch-and-MinCut E = -584, speed-up 141. A third energy is shown for each example, E = -628 and E = -607 respectively.)
Object Recognition & Segmentation

The same machinery works with shape priors. Given exemplar shapes, minimize min_w E(x, w) with w = Templates × Positions and |w| ~ 2,000,000. Test: speed-up ~900; accuracy 98.8%.
… a second attempt to solve it [Vicente et al. ICCV '09]

Eliminate the global colour model θF, θB by minimizing it out:

E'(x) = min_{θF,θB} E(x, θF, θB),  where  E(x, θF, θB) = ∑_i Fi(θF) xi + Bi(θB)(1 - xi) + ∑_ij wij |xi - xj|

The colour space is discretized into K = 16^3 bins, and θF ∈ [0,1]^K is a distribution over bins, ∑_k θF_k = 1 (background the same).

(Figure: image histogram over bins k; foreground distribution θF and background distribution θB for a given x.)

Given x, the optimal θF, θB are simply the empirical histograms:

θF_k = nF_k / nF,  with  nF = ∑_{i∈V} xi (# foreground pixels) and nF_k = ∑_{i∈Bk} xi (# foreground pixels in bin k),

where Bk is the set of pixels falling into bin k.
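A small sketch of these empirical histograms, assuming a precomputed bin_index array mapping each pixel to its colour bin (an illustrative name):

```python
import numpy as np

def empirical_histograms(bin_index, x, K):
    """Optimal theta for a fixed segmentation x: the empirical
    foreground / background colour histograms.

    bin_index: (N,) bin id in [0, K) per pixel; x: (N,) binary labels.
    Returns thetaF, thetaB with thetaF[k] = nF_k / nF (bg analogous).
    """
    nF_k = np.bincount(bin_index[x == 1], minlength=K)  # fg count per bin
    nB_k = np.bincount(bin_index[x == 0], minlength=K)  # bg count per bin
    thetaF = nF_k / max(nF_k.sum(), 1)  # nF = total foreground pixels
    thetaB = nB_k / max(nB_k.sum(), 1)  # nB = total background pixels
    return thetaF, thetaB
```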
Eliminate the colour model

With histograms the energy reads

E(x, θF, θB) = ∑_k [ -nF_k log θF_k - nB_k log θB_k ] + ∑_ij wij |xi - xj|,  with θF_k = nF_k / nF.

Substituting the optimal histograms gives an energy in x alone:

E'(x) = min_{θF,θB} E(x, θF, θB) = g(nF) + ∑_k hk(nF_k) + ∑_ij wij |xi - xj|,  with nF = ∑_i xi and nF_k = ∑_{i∈Bk} xi.

(Figure: g is convex on nF ∈ [0, n] with its minimum at n/2, so it prefers an "equal area" segmentation. Each hk is concave on nF_k ∈ [0, max_k], so it prefers each colour bin to be entirely foreground or entirely background.)
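One consistent reading of this substitution (matching the convex/concave shapes above): up to constants, g(nF) = nF log nF + (n - nF) log(n - nF) and hk(nF_k) = -nF_k log nF_k - (nk - nF_k) log(nk - nF_k), where n is the number of pixels and nk the size of bin k. A sketch of these two terms, with the 0 log 0 = 0 convention:

```python
import numpy as np

def xlogx(v):
    """v * log(v), with the convention 0 * log 0 = 0."""
    v = np.asarray(v, dtype=float)
    return v * np.log(np.maximum(v, 1e-300))  # at v = 0 this yields 0.0

def g(nF, n):
    """Convex area term: minimal at nF = n/2 ('equal area')."""
    return xlogx(nF) + xlogx(n - nF)

def h_k(nF_k, n_k):
    """Concave per-bin term: minimal at nF_k = 0 or nF_k = n_k,
    i.e. bin k prefers to be entirely foreground or background."""
    return -xlogx(nF_k) - xlogx(n_k - nF_k)
```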
How to optimize … Dual Decomposition

Split the energy into two tractable parts and couple them through a vector y:

E(x) = g(nF) + ∑_k hk(nF_k) + ∑_ij wij |xi - xj| = E1(x) + E2(x)

min_x E(x) = min_x [ E1(x) + yᵀx + E2(x) - yᵀx ]
  ≥ min_{x'} [ E1(x') + yᵀx' ] + min_x [ E2(x) - yᵀx ] =: L(y)

Goal: maximize the concave function L(y) using the sub-gradient method. This only tightens a lower bound; there are no guarantees on E itself (the problem is NP-hard).

(Figure: L(y) lower-bounds E(x'); the two copies x' and x are optimized independently.)

"Parametric maxflow" gives the optimal y = λ1 efficiently [Vicente et al. ICCV '09]. The slide labels the two subproblems "Simple (no MRF)" and "Robust Pn Potts".
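A generic sketch of the sub-gradient ascent on L(y); the oracle names solve_E1 and solve_E2 (returning the minimizers of the two subproblems) and the stopping test are my assumptions:

```python
import numpy as np

def maximize_lower_bound(solve_E1, solve_E2, n_pixels, n_iters=100):
    """Sub-gradient ascent on the concave bound
    L(y) = min_x' [E1(x') + y.x'] + min_x [E2(x) - y.x].

    solve_E1(y): binary x' minimizing E1(x') + y.x';
    solve_E2(y): binary x  minimizing E2(x)  - y.x.
    (x' - x) is a sub-gradient of L at y, so we step along it.
    """
    y = np.zeros(n_pixels)
    for t in range(1, n_iters + 1):
        x1, x2 = solve_E1(y), solve_E2(y)
        if np.array_equal(x1, x2):
            break  # the two copies agree: the bound is tight at x1
        y += (x1.astype(float) - x2.astype(float)) / t  # diminishing steps
    return y
```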
Some results … global optimum reached in 61% of cases (GrabCut database).

(Figure: Input | GrabCut | Global Optimum (DD) | Local Optimum (DD).)
Insights on the GrabCut model

(Figure: segmentations as the convex area term is rescaled: g, 0.4 g, 0.3 g, 1.5 g. Recall: hk is concave on [0, max_k], so each colour ends up either fore- or background; g is convex on [0, n] with minimum at n/2, so it prefers an "equal area" segmentation.)
Relationship to Soft Pn Potts

(Figure: an image segmented with a pairwise CRF only, TextonBoost [Shotton et al. '06], and with robust Pn Potts potentials [Kohli et al. '08] over one super-pixelization and over another super-pixelization.)

GrabCut clusters all colours together: it is just a different type of clustering.
Marginal Probability Field (MPF): what is the prior of a MAP-MRF solution? [Woodford et al. ICCV '09]

Take a training image that is 60% black and 40% white, and an 8-pixel test image. Under an i.i.d. MRF prior the MAP solution is all-black: prior(x) = 0.6^8 ≈ 0.017. All other labellings are less likely, e.g. prior(x) = 0.6^5 × 0.4^3 ≈ 0.005, even though a 5/3 split actually matches the training statistics. The MRF is a bad prior, since it ignores the shape of the (feature) distribution!

Remedy: introduce a global term which controls the global statistic.
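The arithmetic behind this example, as a quick check:

```python
# i.i.d. MRF prior learned from a training image: 60% black, 40% white
p_black, p_white = 0.6, 0.4

prior_all_black = p_black ** 8                 # ~0.017: the MAP labelling
prior_5_3_split = p_black ** 5 * p_white ** 3  # ~0.005: matches training stats

# The all-black image is ~3.4x more probable under the prior, although the
# 5/3 split reproduces the 60/40 training statistic: the MRF prior ignores
# the shape of the marginal distribution.
print(prior_all_black, prior_5_3_split, prior_all_black / prior_5_3_split)
```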
Marginal Probability Field (MPF) [Woodford et al. ICCV '09]

Optimization is done with Dual Decomposition (different decompositions than above).

(Figure: cost as a function of a global statistic, from 0 to max: the MRF implies a linear cost, whereas the true energy is non-linear.)
Examples

Segmentation: (Figure: pairwise MRF as the prior strength is increased.)
In-painting: (Figure: ground truth; noisy input; result with a global gradient prior.)
Schedule
8:30-9:00 Introduction
9:00-10:00 Models: small cliques and special potentials
10:00-10:30 Tea break
10:30-12:00 Inference: relaxation techniques: LP, Lagrangian, Dual Decomposition
12:00-12:30 Models: global potentials and global parameters + discussion
Open Questions
• Many exciting future directions:
  – exploiting the latest ideas for applications (object recognition etc.)
  – many other higher-order cliques: topology, grammars, etc. (this conference)
• Comparison of inference techniques needed:
  – factor graph message passing vs. transformation vs. LP relaxation?
• Learning higher-order Random Fields