Learning to Perceive Transparency from the Statistics of Natural Scenes
Anat Levin
School of Computer Science and Engineering
The Hebrew University of Jerusalem
Joint work with Assaf Zomet and Yair Weiss
Transparency
$I(x,y) = I_1(x,y) + I_2(x,y)$
One layer: $I = I_1 + 0$
Two layers: $I = I_1 + I_2$
How does our visual system choose the right decomposition?
•Why not the “simpler” one-layer solution?
•Which two layers, out of infinitely many possibilities?
Talk Outline
•Motivation and previous work
•Our approach
•Results and future work
Transparency in the real world
“Fashion Planet's photographers have spent the last five years working to bring you clean photographs of the windows on New York especially without the reflections that usually occur in such photography”
http://www.fashionplanet.com/sept98/features/reflections/home
Transparency and shading
Transparency: $I(x,y) = I_1(x,y) + I_2(x,y)$
Shading: $I(x,y) = L(x,y)\,R(x,y)$
Transparency in human vision
• Metelli's conditions (Metelli 74)
•T-junctions, X-junctions, doubly reversing junctions (Adelson and Anandan 90, Anderson 99)
[Figure panels: two layers, one layer]
Not obvious how to apply “junction catalogs” to real images.
Transparency from multiple frames
•Two frames with polarizer using ICA (Farid and Adelson 99, Zibulevsky 02)
•Multiple frames with specific motions (Irani et al. 94, Szeliski et al. 00, Weiss 01)
Shading from a single frame
Transparency: $I(x,y) = I_1(x,y) + I_2(x,y)$
Shading: $I(x,y) = L(x,y)\,R(x,y)$
•Retinex (Land and McCann 71).
•Color (Drew, Finlayson and Hordley 02)
•Learning approach (Tappen, Freeman and Adelson 02)
Talk Outline
•Motivation and previous work
•Our approach
•Results and future work
Our Approach
Ill-posed problem.
Assume probability distributions $\Pr(I_1)$, $\Pr(I_2)$
and search for the most probable solution.
(ICA with a single microphone)
$I(x,y) = I_1(x,y) + I_2(x,y)$
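The search for the most probable solution can be written as a constrained maximization. This is only a sketch of the formulation, assuming the two layers are independent; the hatted symbols are notation introduced here for illustration.

```latex
(\hat{I}_1, \hat{I}_2) \;=\; \arg\max_{I_1 + I_2 = I} \; \bigl[ \log \Pr(I_1) + \log \Pr(I_2) \bigr]
```

The constraint removes half of the unknowns: once $I_1$ is chosen, $I_2 = I - I_1$ follows.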
Statistics of natural scenes
[Figure panels: input image, dx histogram, dx log histogram]
$P(x) \propto e^{-|x|^{\alpha}/s}$, $\alpha < 1$
Statistics of derivative filters
Generalized Gaussian distribution (Mallat 89, Simoncelli 95)
[Figure: log histogram of derivative filter outputs, with a plot of log probability against $x$ for the Gaussian ($-x^2$), the Laplacian ($-|x|$), and $-|x|^{1/2}$]
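The sparsity of derivative filter outputs can be checked numerically. The sketch below is an illustration only: it uses a synthetic image made of random rectangles as a stand-in for a natural scene (an assumption, no real photograph is loaded), and the helper names `dx`, `kurtosis`, and `log_histogram` are hypothetical. A kurtosis well above 3 indicates a distribution that is more peaked and heavier-tailed than a Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "scene": a few constant rectangles, so edges are sparse (assumption).
scene = np.zeros((128, 128))
for _ in range(10):
    y0, x0 = rng.integers(0, 88, size=2)
    h, w = rng.integers(8, 40, size=2)
    scene[y0:y0 + h, x0:x0 + w] += rng.uniform(0.2, 1.0)

# White Gaussian noise for comparison.
noise = rng.normal(size=(128, 128))


def dx(img):
    """Horizontal derivative filter output (forward differences), flattened."""
    return np.diff(img, axis=1).ravel()


def kurtosis(x):
    """Fourth standardized moment; equals 3 for a Gaussian."""
    x = x - x.mean()
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2


def log_histogram(x, bins=51):
    """Bin centers and log relative frequency, skipping empty bins."""
    counts, edges = np.histogram(x, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0
    return centers[keep], np.log(counts[keep] / counts.sum())


k_scene = kurtosis(dx(scene))
k_noise = kurtosis(dx(noise))
centers, log_p = log_histogram(dx(scene))
print("kurtosis, toy scene:", k_scene)
print("kurtosis, white noise:", k_noise)
```

Plotting `log_p` against `centers` gives the kind of log histogram shown on the slide; for the toy scene it is sharply peaked at zero.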
Is sparsity enough?
[Figure: the input image shown as two alternative sums of layers (image = layer + layer, or: image = layer + layer)]
Exactly the same derivatives exist in the single-layer solution as in the two-layer solution.
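A toy check of why a sparsity cost on derivatives alone cannot choose between the decompositions. This is a sketch under stated assumptions: the two layers are a vertical and a horizontal step edge, the derivatives are forward differences, and the cost $\sum |d|^{\alpha}$ with $\alpha = 0.7$ is a hypothetical stand-in for the log prior on filter outputs.

```python
import numpy as np

ALPHA = 0.7  # assumed sparse exponent


def sparsity_cost(img, alpha=ALPHA):
    """Sum of |dx|^alpha + |dy|^alpha over forward-difference derivatives."""
    dx = np.diff(img, axis=1)
    dy = np.diff(img, axis=0)
    return np.sum(np.abs(dx) ** alpha) + np.sum(np.abs(dy) ** alpha)


n = 16
layer1 = np.zeros((n, n))
layer1[:, n // 2:] = 1.0          # vertical step edge
layer2 = np.zeros((n, n))
layer2[n // 2:, :] = 1.0          # horizontal step edge
image = layer1 + layer2

# One-layer explanation: (image, empty layer).
one_layer_cost = sparsity_cost(image) + sparsity_cost(np.zeros((n, n)))
# Two-layer explanation: (layer1, layer2).
two_layer_cost = sparsity_cost(layer1) + sparsity_cost(layer2)

print(one_layer_cost, two_layer_cost)
```

In this toy case the two costs coincide, because every nonzero derivative of the sum already appears in one of the layers.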
Beyond sparseness
• Higher order statistics of filter outputs (e.g. Portilla and Simoncelli 2000).
•Marginals of more complicated feature detectors (e.g. Zhu and Mumford 97, Della Pietra Della Pietra and Lafferty 96).
Corners and transparency
•In typical images, edges are sparse.
•Adding typical images is expected to increase the number of corners.
•Not true for white noise
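[Figure: an image shown as the sum of two layers]
Harris-like operator
$c(x_0,y_0) = \det\left( \sum_{x,y} w(x,y) \begin{pmatrix} I_x^2(x,y) & I_x(x,y)\,I_y(x,y) \\ I_x(x,y)\,I_y(x,y) & I_y^2(x,y) \end{pmatrix} \right)$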
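A minimal sketch of a Harris-like corner operator, to make the claim about corners concrete. The choices here are assumptions, not taken from the talk: forward-difference derivatives, a uniform 3x3 window for $w$, and the helper names `gradients`, `box_sum`, and `corner_operator`.

```python
import numpy as np


def gradients(img):
    """Forward-difference derivatives, zero in the last column / row."""
    ix = np.zeros(img.shape)
    iy = np.zeros(img.shape)
    ix[:, :-1] = np.diff(img, axis=1)
    iy[:-1, :] = np.diff(img, axis=0)
    return ix, iy


def box_sum(a, r=1):
    """Sum over a (2r+1) x (2r+1) window, zero-padded at the borders."""
    p = np.pad(a, r)
    h, w = a.shape
    out = np.zeros(a.shape)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out


def corner_operator(img, r=1):
    """Determinant of the windowed second-moment matrix of the gradient."""
    ix, iy = gradients(img)
    sxx = box_sum(ix * ix, r)
    syy = box_sum(iy * iy, r)
    sxy = box_sum(ix * iy, r)
    return sxx * syy - sxy ** 2


n = 16
layer1 = np.zeros((n, n))
layer1[:, n // 2:] = 1.0          # vertical step edge
layer2 = np.zeros((n, n))
layer2[n // 2:, :] = 1.0          # horizontal step edge

c1 = corner_operator(layer1)
c2 = corner_operator(layer2)
c_sum = corner_operator(layer1 + layer2)
print(np.count_nonzero(c1), np.count_nonzero(c2), np.count_nonzero(c_sum))
```

Each straight edge alone gives a zero response everywhere, while their sum responds around the point where the edges cross.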
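Corner histograms
[Figure panels: histograms for the derivative filter and for the corner operator]
Fitting:
Derivative filter: $P(x) \propto e^{-x^{\alpha}/s_1}$
Corner operator: $P(x) \propto e^{-x^{\beta}/s_2}$
Typical exponents for natural images: $\alpha = 0.7$, $\beta = 0.2$
Ratio of scales: $s_1 / s_2$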
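Simple prior for transparency prediction
$\log P(I) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$
$\alpha = 0.7$, $\beta = 0.2$, $\eta = s_1 / s_2$
The probability of a decomposition $I = I_1 + I_2$:
$\log P(I_x, I_y) = \log P(I_{1,x}, I_{1,y}) + \log P(I_{2,x}, I_{2,y})$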
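A small sketch of how such a prior can score competing decompositions. It reads the prior as a sum of a gradient term $|\nabla I|^{\alpha}$ and a weighted corner term $c^{\beta}$, up to the constant $\log Z$. The weight `eta = 1.0`, the uniform 3x3 window, the forward differences, and the toy layers are all assumptions made for illustration.

```python
import numpy as np


def gradients(img):
    """Forward-difference derivatives, zero in the last column / row."""
    ix = np.zeros(img.shape)
    iy = np.zeros(img.shape)
    ix[:, :-1] = np.diff(img, axis=1)
    iy[:-1, :] = np.diff(img, axis=0)
    return ix, iy


def box_sum(a, r=1):
    """Sum over a (2r+1) x (2r+1) window, zero-padded at the borders."""
    p = np.pad(a, r)
    h, w = a.shape
    out = np.zeros(a.shape)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out


def log_prior(img, alpha=0.7, beta=0.2, eta=1.0):
    """Unnormalized log prior: -(sum |grad|^alpha + eta * sum c^beta)."""
    ix, iy = gradients(img)
    grad_mag = np.sqrt(ix ** 2 + iy ** 2)
    c = box_sum(ix * ix) * box_sum(iy * iy) - box_sum(ix * iy) ** 2
    c = np.maximum(c, 0.0)  # guard against tiny negative round-off
    return -np.sum(grad_mag ** alpha + eta * c ** beta)


n = 16
layer1 = np.zeros((n, n))
layer1[:, n // 2:] = 1.0          # vertical step edge
layer2 = np.zeros((n, n))
layer2[n // 2:, :] = 1.0          # horizontal step edge
image = layer1 + layer2

# log Z appears twice in both explanations, so it cancels in the comparison.
one_layer = log_prior(image) + log_prior(np.zeros((n, n)))
two_layers = log_prior(layer1) + log_prior(layer2)
print("one layer:", one_layer, "two layers:", two_layers)
```

In this toy example the corner term penalizes the one-layer explanation, so the two-layer decomposition scores higher.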
Does this predict transparency?
[Figure: two panels, each labeled by a relation between $I_1$ and $I$]
How important are the statistics?
Is it important that the statistics are non Gaussian? Would any cost that penalized high gradients and corners work?
$\log P(I) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$
$\alpha = 0.7$, $\beta = 0.2$
The importance of being non Gaussian
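[Figure: two panels, one for $\alpha = 0.7$, $\beta = 0.2$ and one for $\alpha = 2$, $\beta = 2$, each labeled by a relation between $I_1$ and $I$]
$\log P(I) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$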
The “scalar transparency” problem
Consider a prior over positive scalars, $P(x) \propto e^{-x^{\alpha}}$.
For which priors is the MAP solution sparse?
$a + b = 1$, with $a \ge 0$, $b \ge 0$
Observation:
The MAP solution is obtained with a=0, b=1 or a=1, b=0 if and only if f(x)=log P(x) is convex.
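[Figure: two plots of $f$ on the interval from 0 to 1, comparing $f\left(\frac{a+b}{2}\right)$ with $\frac{f(a)+f(b)}{2}$. Left panel: MAP solution $a=0$, $b=1$. Right panel: MAP solution $a=0.5$, $b=0.5$.]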
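The observation can be checked with a brute-force search over the split. This is a sketch assuming the prior $P(x) \propto e^{-x^{\alpha}}$, so that $f(x) = \log P(x) = -x^{\alpha}$ up to a constant; the function name `best_split` and the grid resolution are hypothetical choices.

```python
import numpy as np


def best_split(alpha, num=101):
    """Return the a in [0, 1] maximizing f(a) + f(1 - a), with f(x) = -x**alpha."""
    a = np.linspace(0.0, 1.0, num)
    b = 1.0 - a
    log_p = -(a ** alpha) - (b ** alpha)
    return a[np.argmax(log_p)]


# alpha < 1: f is convex on positive scalars, so the MAP split is sparse.
sparse_a = best_split(0.7)
# alpha = 2 (Gaussian): f is concave, so the MAP split is the even one.
gaussian_a = best_split(2.0)

print("alpha = 0.7 ->", sparse_a)
print("alpha = 2.0 ->", gaussian_a)
```

With the sparse exponent all of the mass goes to one of the two scalars, while the Gaussian exponent spreads it evenly.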
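The importance of being non Gaussian
[Figure: two plots over the interval from 0 to 1, one for $\alpha = 0.7$, $\beta = 0.2$ and one for $\alpha = 2$, $\beta = 2$, each labeled by a relation between $I_1$ and $I$]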
Can we perform a global optimization?
$\log P(I) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$
Conversion to discrete MRF
For the decomposition $I = I_1 + I_2$:
$g_i = (I_{1,x}, I_{1,y})_i$: discretization of the gradient of $I_1$ at location $i$
$f_i = (I_x, I_y)_i - g_i$
Local potential - derivative filters:
$\Psi_i(g_i) = e^{-\left( |g_i|^{\alpha} + |f_i|^{\alpha} \right)}$
Pairwise potential - pairwise approximation to the corner operator:
$\Psi_{i,j}(g_i, g_j) = e^{-\eta \left( \det(g_i g_i^T + g_j g_j^T)^{\beta} + \det(f_i f_i^T + f_j f_j^T)^{\beta} \right)}$
Enforcing integrability:
$\Psi_{i,j,k}(g_i, g_j, g_k)$
[Figure: grid of MRF nodes $g_1, \dots, g_{15}$ in three rows of five]
$P(g) = \frac{1}{Z} \prod_i \Psi_i(g_i) \prod_{i,j} \Psi_{i,j}(g_i, g_j) \prod_{i,j,k} \Psi_{i,j,k}(g_i, g_j, g_k)$
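A sketch of how the local and pairwise potentials could be evaluated for candidate gradients. It assumes the local potential has the form $e^{-(|g|^{\alpha} + |f|^{\alpha})}$ and the pairwise potential the form $e^{-\eta(\det(g_i g_i^T + g_j g_j^T)^{\beta} + \det(f_i f_i^T + f_j f_j^T)^{\beta})}$, where $g$ is the gradient assigned to layer 1 and $f$ the remainder assigned to layer 2. The value `eta = 1.0` and all function names are hypothetical.

```python
import numpy as np


def det_outer_sum(u, v):
    """det(u u^T + v v^T) for 2-vectors; zero when u and v are parallel."""
    m = np.outer(u, u) + np.outer(v, v)
    return m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0]


def local_potential(g, f, alpha=0.7):
    """Sparse penalty on the gradient magnitudes of both layers at one site."""
    return np.exp(-(np.linalg.norm(g) ** alpha + np.linalg.norm(f) ** alpha))


def pairwise_potential(gi, gj, fi, fj, beta=0.2, eta=1.0):
    """Corner-like penalty on a pair of neighboring sites, in both layers."""
    cg = max(det_outer_sum(gi, gj), 0.0)
    cf = max(det_outer_sum(fi, fj), 0.0)
    return np.exp(-eta * (cg ** beta + cf ** beta))


e_x = np.array([1.0, 0.0])
e_y = np.array([0.0, 1.0])
zero = np.zeros(2)

# Neighboring gradients along the same direction: a straight edge, no penalty.
straight = pairwise_potential(e_x, e_x, zero, zero)
# Neighboring gradients in orthogonal directions: a corner, penalized.
corner = pairwise_potential(e_x, e_y, zero, zero)
print(straight, corner)
```

The determinant vanishes whenever the two gradients are parallel, which is what lets a pairwise term approximate the corner operator.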
Optimizing discrete MRF
For the decomposition $I = I_1 + I_2$:
$g_i = (I_{1,x}, I_{1,y})_i$: discretization of the gradient of $I_1$ at location $i$
$f_i = (I_x, I_y)_i - g_i$: discretization of the gradient of $I_2$ at location $i$
$P(g) = \frac{1}{Z} \prod_i \Psi_i(g_i) \prod_{i,j} \Psi_{i,j}(g_i, g_j) \prod_{i,j,k} \Psi_{i,j,k}(g_i, g_j, g_k)$
$N^{g}$ possible assignments.
Solution: use max-product belief propagation.
The MRF has many cycles, but BP works in similar problems (Freeman and Pasztor 99, Frey et al. 2001, Sun et al. 2002).
Converges to a strong local minimum (Weiss and Freeman 2001).
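To show what max-product message passing computes, here is a minimal sketch on a chain, where it is exact. This is a simplification and not the setup of the talk, which runs BP on a grid with cycles and triple potentials. The random log-potentials and all names are hypothetical; the result is compared with brute-force enumeration.

```python
import itertools
import numpy as np


def chain_score(labels, unary, pairwise):
    """Total log-potential of a labeling on a chain."""
    s = sum(unary[i, l] for i, l in enumerate(labels))
    s += sum(pairwise[a, b] for a, b in zip(labels[:-1], labels[1:]))
    return s


def max_product_chain(unary, pairwise):
    """MAP labeling of a chain via max-product in the log domain."""
    n, k = unary.shape
    best = np.zeros((n, k))             # best[i, l]: best prefix score ending in l
    back = np.zeros((n, k), dtype=int)  # back-pointers for decoding
    best[0] = unary[0]
    for i in range(1, n):
        scores = best[i - 1][:, None] + pairwise  # rows: previous, cols: current
        back[i] = scores.argmax(axis=0)
        best[i] = scores.max(axis=0) + unary[i]
    labels = np.zeros(n, dtype=int)
    labels[-1] = best[-1].argmax()
    for i in range(n - 1, 0, -1):
        labels[i - 1] = back[i, labels[i]]
    return labels


rng = np.random.default_rng(0)
n_nodes, n_labels = 6, 3
unary = rng.normal(size=(n_nodes, n_labels))
pairwise = rng.normal(size=(n_labels, n_labels))

bp_labels = max_product_chain(unary, pairwise)
brute_labels = max(
    itertools.product(range(n_labels), repeat=n_nodes),
    key=lambda lab: chain_score(lab, unary, pairwise),
)
bp_score = chain_score(bp_labels, unary, pairwise)
brute_score = chain_score(brute_labels, unary, pairwise)
print(bp_labels, bp_score, brute_score)
```

Enumeration visits every labeling, which grows exponentially with the number of nodes, while message passing on the chain only needs work linear in the number of nodes.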
Drawbacks of BP for this problem
•Large memory and time complexity.
•Convergence depends on update order.
•Discretization artifacts
Talk Outline
•Motivation and previous work
•Our approach
•Results and future work
Results
[Figures: two examples, each with panels for the input, output layer 1, and output layer 2]
Future Work
•Dealing with a more complex texture
[Figure: original image and nonlinear filter output; a textured image shown as the sum of two layers]
•Use application-specific priors (e.g. Manhattan World)
•Extend to shading and illumination.
•Apply other optimization methods.
•Learn discriminative features automatically
•A coarse qualitative separation.
Conclusions
•Natural scene statistics predict perception of transparency.
•First algorithm that can decompose a single image into the sum of two images.