A Fast Algorithm for Finding Crosswalks using Figure-Ground Segmentation
James Coughlan and Huiying Shen
Aging population means more & more visual impairments
Worldwide: 37 million blind, 124 million low vision
Common causes:
– Diabetic retinopathy
– Age-related macular degeneration (AMD)
– Cataract
– Glaucoma
Incidence of eye disease rises sharply with age
Orientation for the blind & visually impaired
“Wayfinding” is a huge challenge:
– Where am I with respect to the environment?
– How can I move to my destination?
Our approach: attack small but important wayfinding problems using computer vision
– Today’s paper: Traffic intersections
– Tomorrow: “Cell Phone-based Wayfinding for the Visually Impaired”, Workshop on Mobile Vision
Ultimate goal: combine various functionalities in one cell phone
Traffic intersections
Very dangerous for blind and visually impaired
Many blind people agree:
– intersection fairly easy to find using cane
– timing also easy to determine by listening to traffic sounds
– but hard to traverse crosswalk in the right direction!
Existing aid: audible walk light signals
– Useful for those intersections where installed
– Extra cost
– Noise causes some nearby residents to complain
Computer vision solution
Blind pedestrian approaches and photographs intersection using digital camera connected to small computer
Computer vision algorithm
– locate crosswalk pattern in image
– determine orientation of crosswalk in image
Tell pedestrian which way to turn for proper alignment, using synthesized speech
Algorithm should take only a few seconds of processing time!
Past Work
Past computer vision work on finding crosswalks assumes favorable road/viewing conditions
Under these conditions, Hough transform is well-suited to finding crosswalk stripes
Indeed, many past approaches use Hough as a first processing step [Utcke ’98; Se & Brady ’03; Uddin & Shioyama ’05]
Real-world conditions
Glare, shadows, occlusions, uneven paint very common
[Figure: typical intersection images photographed by blind pedestrian; panels: Two-stripe, Zebra]
Limitations of Hough transform
Hough has problems on these real-world images:
– assumes globally straight lines
– bin quantization hard to choose:
    – coarse bins may not resolve separate stripe edges
    – fine bins may not capture enough votes from non-ideal lines
– more problems when stripe edges are fragmented (due to occlusion, shadows, etc.)
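The bin-quantization trade-off can be illustrated with a toy vote histogram. This is only a sketch: it quantizes a single offset axis (a stand-in for the rho axis of a Hough accumulator, not the full transform), and the edge positions, the wobble, and the name `vote_histogram` are made-up assumptions, not values from the talk.

```python
from collections import Counter

def vote_histogram(rhos, bin_width):
    """Quantize edge-point offsets (rho values, in pixels) into Hough-style bins."""
    return Counter(int(rho // bin_width) for rho in rhos)

# Hypothetical data: two parallel stripe edges 6 px apart,
# each made of 6 edge points that wobble by +/- 1 px (non-ideal lines).
wobble = (-1, 0, 1, 0, -1, 1)
edge_a = [100 + d for d in wobble]
edge_b = [106 + d for d in wobble]
rhos = edge_a + edge_b

coarse = vote_histogram(rhos, bin_width=16)  # both edges fall into one bin
fine = vote_histogram(rhos, bin_width=1)     # votes of each edge are spread out

print("coarse bins:", dict(coarse))
print("fine bins:  ", dict(fine))
```

With the coarse bins the two edges are indistinguishable; with the fine bins no single bin collects more than a third of the points on either edge.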
Focus on zebra crosswalks
Easier than the other common crosswalk type (the two-stripe)
Color is usually yellow or white
Our approach: figure-ground segmentation
Variable number of stripes (painted or visible)
Deformable template is too slow without proper initialization, especially given unknown number of stripes
Zebra crosswalk is like a texture: quasi-periodic arrangement of similar elements
Segment it using object-specific figure-ground process [Yu & Shi ’03]
Figure-ground framework: graphical model
Extract image elements to be labeled as figure or ground
Create graphical model (MRF), specifically a CRF (conditional random field)
Each element defines a node. Connect nearby nodes
Each node has unknown state: figure or ground
Define pairwise potential to express compatibility between pairs of elements
Use belief propagation to estimate F/G assignment of each node
Graphical model framework
N pieces of data (e.g. stripe fragment elements) in image: {yi } where i=1, …, N
Each node xi has unknown state xi=0 or 1, where 0 is “ground” and 1 is “figure”
Graphical model framework
Joint probability of all F/G assignments:
$$P(x_1, \ldots, x_N) = \frac{1}{Z} \prod_i \psi_i(x_i) \prod_{\langle ij \rangle} \psi_{ij}(x_i, x_j)$$
Unary potential ψi(xi), binary potential ψij(xi,xj)
Our potential function convention: 1 means neutral, > 1 means likely, < 1 means less likely
Set ψi(xi =0)=1 and ψi(xi =1) to have a value that reflects how “figure-like” the element is (without considering other elements)
Graphical model framework
Binary potentials strategy: set F-G and G-G potentials to be neutral
Make F-F potential express compatibilities between two elements – i.e. are the two elements both likely to be assigned to F?
ψij(xi=0,xj=0) =1, ψij(xi=0,xj=1) =1, ψij(xi=1,xj=0) =1
and ψij(xi=1,xj=1) is a function of the compatibility between element yi and yj
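As a rough illustration of this convention, the joint probability of a tiny model can be computed by brute-force enumeration. The sketch below assumes a three-node chain with made-up potential values; `unary_fig`, `pair_ff`, and both function names are hypothetical, and enumeration is only practical for a handful of nodes.

```python
from itertools import product

def joint_unnormalized(x, unary_fig, pair_ff):
    """Product of unary and pairwise potentials for a labeling x (tuple of 0/1)."""
    p = 1.0
    for i, xi in enumerate(x):
        p *= unary_fig[i] if xi == 1 else 1.0          # psi_i(x_i=0) = 1 (neutral)
    for (i, j), ff in pair_ff.items():
        p *= ff if (x[i] == 1 and x[j] == 1) else 1.0  # only F-F is non-neutral
    return p

def joint_distribution(unary_fig, pair_ff):
    """Normalize over all 2^N labelings (Z is the sum of unnormalized scores)."""
    n = len(unary_fig)
    scores = {x: joint_unnormalized(x, unary_fig, pair_ff)
              for x in product((0, 1), repeat=n)}
    z = sum(scores.values())
    return {x: s / z for x, s in scores.items()}

# Hypothetical values: three elements in a chain 0 - 1 - 2
unary_fig = [0.1, 0.1, 0.1]            # psi_i(x_i=1) for each node
pair_ff = {(0, 1): 3.0, (1, 2): 3.0}   # psi_ij(1,1) for each connected pair
dist = joint_distribution(unary_fig, pair_ff)
```

In this toy setting a compatible connected pair labeled figure, such as (1, 1, 0), is more probable than an unconnected pair labeled figure, such as (1, 0, 1).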
Zebra crosswalks: constructing elements from image features
Elements are crosswalk stripe fragments, or stripelets
Steps:
1.) Find edges in image
2.) Construct straight-line edge segments in a greedy way from edges
3.) Construct stripelets from suitable edge segment pairs
Constructing elements from image features
[Figure panels: straight-line edge segments; stripelet elements]
Crosswalk: unary potential
Observation: perspective dictates that stripes higher in the image appear narrower (assuming image is right-side-up)
Characteristic width as function of height in image
Crosswalk: unary potential
Empirical data: width of all stripelets in image as function of row
Stripelets on crosswalk tend to fit straight line reflecting characteristic width property
Straight line forms lower envelope boundary in each image
Find envelope automatically
Crosswalk: unary potential
Closer to envelope: more likely that stripelet belongs to figure
Let E=distance between point and envelope in (y,w) space
Let En = E/w* be the normalized distance, where w* is the width of the stripelet
Crosswalk: unary potential
Another unary cue: longer stripelet more likely to belong to figure
Let a and b denote the upper and lower lengths of a stripelet…
Crosswalk: unary potential
Putting both unary cues together:
Ground: ψi(xi =0) = 1
Figure: ψi(xi =1) = (1/10) min [1, (ab)^(1/4) (1-En)]
Note that maximum value of ψi(xi =1) is 1/10, much less than the ground value of 1
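A minimal sketch of the combined unary cue follows. It assumes the product of the two cues is capped at 1 (a min against 1), which is what the stated maximum of 1/10 implies. The clamp at zero for En > 1 and the function name are assumptions added here so the potential never goes negative; they are not stated in the talk.

```python
def unary_figure_potential(a, b, e_n):
    """psi_i(x_i=1) = (1/10) * min(1, (a*b)**(1/4) * (1 - E_n)).

    a, b : upper and lower lengths of the stripelet (pixels)
    e_n  : normalized distance to the width envelope, E / w*
    """
    score = (a * b) ** 0.25 * (1.0 - e_n)
    score = max(0.0, score)            # assumption: clamp so the potential is >= 0
    return 0.1 * min(1.0, score)       # cap at 1, so the result never exceeds 1/10

GROUND_POTENTIAL = 1.0                 # psi_i(x_i=0) = 1 by convention

# Hypothetical stripelets
long_on_envelope = unary_figure_potential(16, 16, 0.0)   # saturates at the cap
short_off_envelope = unary_figure_potential(1, 1, 0.5)   # penalized by both cues
```

Because even the best stripelet scores only 1/10 against a ground value of 1, an element needs support from compatible neighbors before it can be labeled figure.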
Crosswalk: binary potential
Based on cross ratio tests
Inspired by [Uddin & Shioyama ’05]
[Figure: Stripelet 1 and Stripelet 2 crossed by vertical “probe” lines, which give cross ratios r1 and r2]
Crosswalk: binary potential
Test two properties using cross ratios:
1.) All four edges of the two stripelets are collinear in 3-D ⇒ r1 ≅ r2
2.) Since stripe widths and gaps between stripes are all equal in 3-D, expect r1, r2 to be close to 1/4
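The value 1/4 can be checked numerically. The talk does not spell out its cross-ratio formula, so the sketch below assumes one common definition, (|p1p2|·|p3p4|) / (|p1p3|·|p2p4|), chosen because it gives exactly 1/4 for four equally spaced edges. The function `projective_map` and its coefficients are hypothetical stand-ins for perspective along a probe line.

```python
def cross_ratio(p1, p2, p3, p4):
    """Cross ratio of four collinear points given as 1-D coordinates along the line."""
    return (abs(p2 - p1) * abs(p4 - p3)) / (abs(p3 - p1) * abs(p4 - p2))

def projective_map(t, a=2.0, b=1.0, c=0.1, d=1.0):
    """Hypothetical 1-D projective map t -> (a*t + b) / (c*t + d)."""
    return (a * t + b) / (c * t + d)

# Edge positions on the road: stripe, gap, stripe, all of equal width
road = [0.0, 1.0, 2.0, 3.0]
# The same four edges as seen along a probe line in the image
image = [projective_map(t) for t in road]

r_road = cross_ratio(*road)    # 0.25 by construction
r_image = cross_ratio(*image)  # unchanged by the projective map

print(r_road, r_image)
```

The spacing between the image points is no longer uniform, yet the cross ratio is preserved, which is why it can be tested directly in the image.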
Crosswalk: binary potential
Putting both binary cues together:
ψij(xi=0,xj=0) =1, ψij(xi=0,xj=1) =1, ψij(xi=1,xj=0) =1
and
ψij(xi=1,xj=1) = (10/3) exp(-10R)
where R = 1/2 (|r1 - 1/4| + |r2 - 1/4|) + 2|r1 - r2|
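The figure-figure potential can be written directly from these two formulas. This is a sketch; the function names are illustrative and not taken from the talk.

```python
import math

def cross_ratio_error(r1, r2):
    """R = 1/2 (|r1 - 1/4| + |r2 - 1/4|) + 2 |r1 - r2|."""
    return 0.5 * (abs(r1 - 0.25) + abs(r2 - 0.25)) + 2.0 * abs(r1 - r2)

def pairwise_ff_potential(r1, r2):
    """psi_ij(x_i=1, x_j=1) = (10/3) * exp(-10 R); all other entries are 1."""
    return (10.0 / 3.0) * math.exp(-10.0 * cross_ratio_error(r1, r2))

ideal = pairwise_ff_potential(0.25, 0.25)   # R = 0, the largest possible value
poor = pairwise_ff_potential(0.30, 0.25)    # R = 0.125, drops below neutral
```

Under the "1 means neutral" convention, a pair supports a joint figure label only while (10/3) exp(-10R) > 1, that is while R < ln(10/3)/10 ≈ 0.12.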
Constructing graph
Connect stripelet pairs that satisfy three criteria:
1.) Distance between stripelet centroids sufficiently small
2.) Cross ratio error measure R sufficiently low
3.) Monotonicity requirement: the lower of the two stripelets must be wider
Eliminate stripelets without any connections
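The three criteria and the pruning step might be sketched as follows. Everything specific here is an assumption for illustration: the `Stripelet` fields, the thresholds `max_dist` and `max_error`, and the `error_fn` callback that is expected to return the cross ratio error R for a pair. The talk gives no numeric thresholds.

```python
from dataclasses import dataclass

@dataclass
class Stripelet:
    cx: float      # centroid column (pixels)
    cy: float      # centroid row (pixels); larger means lower in the image
    width: float   # stripelet width (pixels)

def build_graph(stripelets, error_fn, max_dist=80.0, max_error=0.1):
    """Return (edges, kept): compatible index pairs and nodes with >= 1 connection."""
    edges = []
    for i in range(len(stripelets)):
        for j in range(i + 1, len(stripelets)):
            s, t = stripelets[i], stripelets[j]
            dist = ((s.cx - t.cx) ** 2 + (s.cy - t.cy) ** 2) ** 0.5
            lower, upper = (s, t) if s.cy > t.cy else (t, s)
            close_enough = dist <= max_dist               # criterion 1
            low_error = error_fn(s, t) <= max_error       # criterion 2
            monotonic = lower.width > upper.width         # criterion 3
            if close_enough and low_error and monotonic:
                edges.append((i, j))
    kept = sorted({k for edge in edges for k in edge})    # drop unconnected stripelets
    return edges, kept

# Hypothetical stripelets: two stacked stripes and one far-away distractor
candidates = [Stripelet(100, 200, 30), Stripelet(100, 150, 20), Stripelet(400, 50, 25)]
edges, kept = build_graph(candidates, error_fn=lambda s, t: 0.0)
```

Here the distractor has no connections and is eliminated before belief propagation runs.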
Constructing graph
[Figure panels: original stripelets; stripelets remaining after determining connections]
Belief propagation
Run a few sweeps of BP (asynchronous message updates)
Calculate unary beliefs
Any element with unary belief P(xi=1) > 0.9 is labeled as figure
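A compact sum-product sketch of this step is shown below, assuming binary states, neutral potentials everywhere except F-F, and in-place (asynchronous) message updates. The inputs `unary_fig` and `pair_ff`, their values, and the function name are hypothetical; this is not the authors' implementation.

```python
def run_bp(unary_fig, pair_ff, sweeps=5):
    """Asynchronous sum-product BP; returns the belief P(x_i = 1) for every node."""
    n = len(unary_fig)
    unary = [(1.0, u) for u in unary_fig]          # (psi_i(0), psi_i(1))
    nbrs = {i: [] for i in range(n)}
    psi = {}
    for (i, j), ff in pair_ff.items():
        nbrs[i].append(j)
        nbrs[j].append(i)
        psi[(i, j)] = psi[(j, i)] = ((1.0, 1.0), (1.0, ff))   # symmetric 2x2 table
    # msg[(i, j)] is the message from node i to node j, initialized uniform
    msg = {(i, j): (1.0, 1.0) for i in range(n) for j in nbrs[i]}

    def incoming(i, exclude=None):
        """Unary potential of i times all incoming messages except from `exclude`."""
        prod = list(unary[i])
        for k in nbrs[i]:
            if k != exclude:
                prod[0] *= msg[(k, i)][0]
                prod[1] *= msg[(k, i)][1]
        return prod

    for _ in range(sweeps):
        for (i, j) in list(msg):                   # each update is visible immediately
            h = incoming(i, exclude=j)
            m = [sum(h[xi] * psi[(i, j)][xi][xj] for xi in (0, 1)) for xj in (0, 1)]
            total = m[0] + m[1]
            msg[(i, j)] = (m[0] / total, m[1] / total)

    beliefs = []
    for i in range(n):
        b = incoming(i)
        beliefs.append(b[1] / (b[0] + b[1]))
    return beliefs

# Hypothetical three-node chain 0 - 1 - 2
beliefs = run_bp([0.1, 0.1, 0.1], {(0, 1): 3.0, (1, 2): 3.0})
labels = [b > 0.9 for b in beliefs]                # figure if P(x_i = 1) > 0.9
```

On a chain like this one the graph has no loops, so the beliefs match the exact marginals; on a graph with loops BP gives only approximate beliefs.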
Belief propagation: result
Experimental results
Execution speed: a few seconds per image
Small dataset – but includes all images taken by one blind subject in a single session
Conclusions
Crosswalk visibility very important for safety
Pedestrian must align him/herself properly to them before crossing
Computer vision solution: first step is to detect and localize crosswalk
Use figure-ground approach, implemented as graphical model
Encouraging experimental results: fast and robust
Much more algorithm work is required, as well as extensive user testing…
Ongoing/future work
Learn unary and binary potentials from large labeled database
Include many possible cues to see which are most effective…
For instance, how about relative color cues? E.g. using color ratios [Funt & Finlayson ’95]
ROC analysis (e.g. false positive crosswalk detections)
Ongoing/future work
Basic geometric analysis of detected crosswalk to determine orientation
User testing: is tracking needed once you begin crossing?
Tackle two-stripe crosswalk patterns
Determine intersection layout (3-way, 4-way, etc.)
Eventually… port this to camera cell phone. Software could be downloaded for free
Another application of F/G framework: finding text in cluttered scenes [Shen & Coughlan ICPR ’06]
Thanks to…
John Brabyn, Bill Gerrey, Tom Fowle and Josh Miele (Smith-Kettlewell)
Roberto Manduchi (UC Santa Cruz)