ambiguity and uncertainty - ucla statisticssczhu/courses/ucla/stat_232b/handouts/ch6_… ·...
TRANSCRIPT
5/20/2013
1
Part 3: Computing Multiple Solutions from MCMC
Chapter 6 Advanced Algorithms for Inference in Gibbs fields
Computing multiple distinct solution is of great importance for preserving ambiguities and uncertainty. We introduce two algorithms
1, C4: Clustering with Cooperative and Competitive Constraints. in J. Porway, PAMI 2011
2, K-adventure’s algorithm in DDMCMC.
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
in Z. Tu, PAMI 2002
Ambiguity and uncertainty
Ambiguities : multiple distinct solutionsp(x)
Uncertainty : a range of variants
Both increase the entropy of the perception (posterior probability):
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
H(W) | I = entropy (p(W | I )
When H(W) | I is too large, we say W is imperceptible. Thus we seek for a simpleand low-dimensional perception W_. Taught in stat232A.
5/20/2013
2
Rationale for computing multiple solutions
Computing multiple solutions to avoid premature commitments and to preserve the intrinsic ambiguities.
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Two criteria for designing “effective algorithms”:
1, Fast convergence to global optimal solutions: “not greedy”
Rationale for computing multiple solutions (cont.)
g g p g y
But this is not enough, especially when the probability has multiple modes.
2, Exploring diverse solutions (liquidate, fast mixing) : “not stubborn”
1 0 2 2 0 1 0 2 1 … is more dynamic than 1 1 1 2 2 2 0 0 0
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
And thus the former can better represent the underlying distributions by a shorter sequence.
5/20/2013
3
Sampling Probabilities with Multiple Modes
The two criteria for MCMC design: 1, Short “burn-in” period --- The MC reaches the equilibrium fast 2, Fast “mixing rate” --- The MC states are less correlated in time
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
p(X)
Sometimes, to design ambiguities is to create abstract art.
Rationale for computing multiple solutions (cont.)
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
5/20/2013
4
Example: creating abstract art from picture
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
M.T. Zhao, “Abstract Painting with Interactive Control of Perceptual Entropy,” ACM Trans. on Applied Perception, 2013.
Example of computing multiple solutions in vision
a. Input image b. Segmented texture regions c. synthesis by texture models
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
d. curve processes + bkgd region e. synthesis by curve models
5/20/2013
5
Ambiguities are ubiquitous in images
Example of google earth image in aerial image understanding
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Local interpretations are often strongly coupled !
They form “clusters” and the search algorithms often get stuck.
Revisit: Swendsen-Wang and cluster sampling
(a) Augment with auxiliary bonding variable U along edges E:
)(
,
1),();,(1
)( ts xxtst
tss exxxx
ZXp
Ising model:
}}10{{ XtU }}1,0{;,,{ stst uXtsuU
(c) Select a connected component V0 and update its nodes’ labels.state A state B
V2V2
(b) Turn edges “on” (ust = +1) or “off” (ust = 0) probabilistically.
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
V0V0
V1V1
5/20/2013
6
Potts Model with Negative Edges (“frustrated” system)
)()(1
)(X
Split our edge set E into E+ and E- to represent positive and negative interactions:
)(1)(1
,,
),(,),(
),(),()(
tsts xxts
xxts
tEji
stEts
s
exxexx
xxxxZ
Xp
Form components similarly to Swendsen-Wang.
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Components now consist of positively connected sub-components connected by negative interactions.
Experiments on Negative Edge Ising Model
Created “checkerboard” constraint problem.p
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Initial State Solution 1 Solution 2
5/20/2013
7
CCCP: Composite Connected Components
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Edge probabilities for clustering cccp’s
For positive edge, the edge variable follows a Bernoulli probability
1
01)|(
10
01)|(
1
01)|(
ij
jiij
ij
ij
jiij
ij
ij
jiij
uxxup
u
uxxup
u
uxxup
p y
ue ~ Bernoulli(qe1(xs=xt))
For negative edge, it follows
ue ~ Bernoulli(qe1(xs xt))
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
10
01)|(
1
ij
ij
jiij
ijjiij
u
uxxup
u
5/20/2013
8
Mixing rate
Solution states over time for Ising model (two states)
Sta
teS
tate
SW
C4
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
TimeC4 has- Low burn-in time (converges quickly).- High mixing rate (samples remaining unbiased over short run).
Simulating the Potts model
Energy vs. time
SW
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
5/20/2013
9
Representation: Candidacy Graphs
We formulate a candidacy graph representation, as it can represents1, MRF and CRF structures2, Soft and hard constraints.3, Positive (collaborative) and negative (competitive) edges.
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Flattening the Candidacy Graphs
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
This becomes a graph coloring problem: each color is a solution.
5/20/2013
10
Ex: Candidacy Graph for Necker Cube
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Some labels/edges omitted for clarity
Application I: Line Drawing
Solution1 Solution2
Solution 2
Solution 1 Solution 2Solution 1
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
C4 finds both solutions, swaps corner labels continuously.
5/20/2013
11
Line drawing interpretation
+ ++ + ++ - -
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
+
++
+
+ +++
++
-
+-
+ +++
+
-+
-+
-
- ---
-+
Application II: CRF for Building Detection
- Learn a CRF for building detection1.
- CRF labels each window of image as {0,1} .
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
1S. Kumar, M. Hebert, “Man-Made Structure Detection in Natural Images using a Causal Multiscale Random Field”, CVPR, 2003.
5/20/2013
12
Comparison: CRF for Building Detection
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
LBP = Loopy Belief Propagation ICM = Iterated Conditional Modes
Comparison: CRF Building Detection
Algorithm FP (per image) Detection Rate (%)
Loopy BP 23.18 0.451
Swendsen‐Wang 156.05 0.468
C4 47.12 0.696
ICM 61.78 0.697
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Note: Couldn’t recreate Kumar’s exact results, so compared our CRF with his inference method (ICM) and ours (C4).
5/20/2013
13
Application III: Aerial Image Segmentation
-Detect objects in an aerial image.
- MANY false positives.
- Have C4 determine which ones best explain the image by turning detected objects “on” and “off”.
Car
Car
Car
Parking Lot
Tree Car
Car
Car
Parking Lot
Tree
(a) (b)
Car
Car
Car
Parking Lot
Tree
(c)
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Car
Building
Car
Car
BuildingCCP1
CCP2
Positive Edges
Negative Edges
In Current Explanation
Not In Current Explanation
Car
Car
BuildingCCP1
CCP2
Results: Aerial Image SegmentationTreesRoadsCarsRoofsParking Lots
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Initial Bottom-up Detections C4-Selected Subset
5/20/2013
14
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Results: Aerial Image Segmentation
P-R curve for original detections vs. C4 pruned detections.
Original
C4
Example of C4’s benefits: Misinterpreted
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Misinterpreted section of image is later updated by large composite move. (b)(a)
5/20/2013
15
Problem with “Flat”-- C4
The “love triangles”:create inconsistent clusters
B
C
Ears
Eye
Head
Love triangle
A
C
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Eye
Beak Nosee.g. Love triangles in the duck/rabbit illusion.
Hierarchical C4
The candidacy graph so far represent pairwise edges, high-order relations arerepresented by extended candidacy graphs
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
System will now flip between duck and rabbit without love triangle issue.
5/20/2013
16
Elephant Illusion
Elephant
Hierarchical part model Layered representation of hierarchy
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Leg Pair, Head, Back
Leg, Trunk
Line
Composing the Bottom Layer
Trunk compatibilitiesp
Leg compatibilities
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Elephant
Leg Pair, Head, Back
Leg, Trunk
Line
5/20/2013
17
Part Binding for Next Layer
Trunk bindingsTrunk bindings
Leg bindings
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Elephant
Leg Pair, Head, Back
Leg, Trunk
Line
Continue Sampling/Binding for each layer
Leg Pair compatibilities
Leg Pair bindings
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Elephant
Leg Pair, Head, Back
Leg, Trunk
Line
5/20/2013
18
Top level bindings and sampling
Elephantcompatibilities
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Elephant
Leg Pair, Head, Back
Leg, Trunk
Line
1, A candidacy graph representation incorporates
MRF and CRF structures
Summary
Soft and hard constraints.Positive (collaborative) and negative (competitive) edges.
hierarchic compositional structures for high-order relations.
2, An algorithm with large composite moves
Fast convergence to global solutions
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Fast convergence to global solutionsEffective mixing between solutions
Relaxation labeling Gibbs sampling SW Cut C4
(Rosenfeld/Zucker/Hammel 1978) (Geman/ Geman 1984) (Barbu/Zhu 2005) (Porway/ Zhu 2009)
5/20/2013
19
General MCMC Design
Design Strategy I Design Strategy II
InputI
1. Proposal Q(x,y) 2. Transition K(x,y)3. Initial prob.
MC=(, K, )
x1, x2, …., xn ~ (x) Solutions))(ˆ||)((minarg xxKL **
2*1 ,...,, Kxxx
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Computing Multiple Solutions
To faithfully preserve the posterior probability p(W|I),
We compute a set of weighted scene particles {W1, W2, …, WM},
A mathematical principle:
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
5/20/2013
20
Computing Multiple Solutions
To faithfully preserve the posterior probability p(W|I),
We compute a set of weighted scene particles {W1, W2, …, WM},
A mathematical principle:
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Pursuit of Multiple Solutions
The Kullback-Leibler divergence can be computed if we assumemixture of Gaussian distributions.
--- a simple fact: the KL-divergence of two Gaussians is the signal-to-noise rati
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
a simple fact: the KL divergence of two Gaussians is the signal to noise rati
Intuition: S includes global maximum, local modes, apart from each other.
5/20/2013
21
A k-adventurer algorithm
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Preserving Distinct Particles
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
x1 x2 x3 x4
5/20/2013
22
An Example of Keeping Multiple Solutions
An example of illustration:
1
23 4
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
Preserving Distinct Particles
min D(q || q1)
An example of illustration:min | q – q2 |
Stat 232B-CS276B Stat computing and inference. © S.C. Zhu
A model p with 50 particles two approximate models q with 6 particles
q q1 q2