A Fast Algorithm for Finding Crosswalks using Figure-Ground Segmentation

James Coughlan and Huiying Shen

Aging population means more & more visual impairments

Worldwide: 37 million blind, 124 million low vision

Common causes:
– Diabetic retinopathy
– Age-related macular degeneration (AMD)
– Cataract
– Glaucoma

Incidence of eye disease rises sharply with age

Orientation for the blind & visually impaired

“Wayfinding” is a huge challenge:
– Where am I with respect to the environment?
– How can I move to my destination?

Our approach: attack small but important wayfinding problems using computer vision
– Today’s paper: Traffic intersections
– Tomorrow: “Cell Phone-based Wayfinding for the Visually Impaired”, Workshop on Mobile Vision

Ultimate goal: combine various functionalities in one cell phone

Traffic intersections

Very dangerous for blind and visually impaired

Many blind people agree:
– intersection fairly easy to find using cane
– timing also easy to determine by listening to traffic sounds
– but hard to traverse crosswalk in the right direction!

Existing aid: audible walk light signals
– Useful for those intersections where installed
– Extra cost
– Noise causes some nearby residents to complain

Computer vision solution

Blind pedestrian approaches and photographs intersection using digital camera connected to small computer

Computer vision algorithm
– locate crosswalk pattern in image
– determine orientation of crosswalk in image

Tell pedestrian which way to turn for proper alignment, using synthesized speech

Algorithm should take only a few seconds of processing time!

Past Work

Past computer vision work on finding crosswalks assumes favorable road/viewing conditions

Under these conditions, Hough transform is well-suited to finding crosswalk stripes

Indeed, many past approaches use Hough as a first processing step [Utcke ’98; Se & Brady ’03; Uddin & Shioyama ’05]

Typical intersection images photographed by blind pedestrian

Real-world conditions

Glare, shadows, occlusions, uneven paint very common

Crosswalk types shown: two-stripe and zebra

Limitations of Hough transform

Hough has problems on these real-world images:
– assumes globally straight lines
– bin quantization hard to choose:
  – coarse bins may not resolve separate stripe edges
  – fine bins may not capture enough votes from non-ideal lines
– more problems when stripe edges are fragmented (due to occlusion, shadows, etc.)

Focus on zebra crosswalks

Easier than the other common crosswalk type (the two-stripe)

Color is usually yellow or white

Our approach: figure-ground segmentation

Variable number of stripes (painted or visible)

Deformable template is too slow without proper initialization, especially given unknown number of stripes

Zebra crosswalk is like a texture: quasi-periodic arrangement of similar elements

Segment it using object-specific figure-ground process [Yu & Shi ’03]

Figure-ground framework: graphical model

Extract image elements to be labeled as figure or ground

Create graphical model (MRF), specifically a CRF (conditional random field)

Each element defines a node. Connect nearby nodes

Each node has unknown state: figure or ground

Define pairwise potential to express compatibility between pairs of elements

Use belief propagation to estimate F/G assignment of each node

Graphical model framework

N pieces of data (e.g. stripe fragment elements) in image: {yi } where i=1, …, N

Each node xi has unknown state xi=0 or 1, where 0 is “ground” and 1 is “figure”


Joint probability of all F/G assignments:

$$P(x_1, \ldots, x_N) = \frac{1}{Z} \prod_i \psi_i(x_i) \prod_{\langle ij \rangle} \psi_{ij}(x_i, x_j)$$

Unary potential ψi(xi), binary potential ψij(xi,xj)

Our potential function convention: 1 means neutral, > 1 means likely, < 1 means less likely

Set ψi(xi =0)=1 and ψi(xi =1) to have a value that reflects how “figure-like” the element is (without considering other elements)
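As a rough illustration of the joint probability of F/G assignments, the Python sketch below enumerates every assignment of a tiny model by brute force and normalizes by Z. The function name `joint_distribution`, the data layout, and the numeric potentials are assumptions made purely for illustration; none of them come from the slides, and brute-force enumeration is only feasible for a handful of nodes.

```python
from itertools import product

def joint_distribution(unary, pairwise):
    """Return P(x_1..x_N) for every binary assignment of a small pairwise model.

    unary:    list of [psi_i(0), psi_i(1)] per node (0 = ground, 1 = figure)
    pairwise: dict {(i, j): 2x2 table psi_ij[x_i][x_j]} for connected nodes
    """
    n = len(unary)
    weights = {}
    for x in product((0, 1), repeat=n):
        w = 1.0
        for i, xi in enumerate(x):
            w *= unary[i][xi]              # product of unary potentials
        for (i, j), table in pairwise.items():
            w *= table[x[i]][x[j]]         # product over connected pairs
        weights[x] = w
    z = sum(weights.values())              # normalization constant Z
    return {x: w / z for x, w in weights.items()}

# Illustrative (made-up) potentials: two nodes joined by one edge.
example_unary = [[1.0, 0.5], [1.0, 0.5]]
example_pairwise = {(0, 1): [[1.0, 1.0], [1.0, 4.0]]}
example_dist = joint_distribution(example_unary, example_pairwise)
```

With these made-up numbers each node alone leans toward ground, but the F-F entry above 1 raises the weight of the joint "both figure" assignment, which is the effect the potential convention is meant to produce.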


Binary potentials strategy: set F-G and G-G potentials to be neutral

Make F-F potential express compatibilities between two elements – i.e. are the two elements both likely to be assigned to F?

ψij(xi=0,xj=0) =1, ψij(xi=0,xj=1) =1, ψij(xi=1,xj=0) =1

and ψij(xi=1,xj=1) is a function of the compatibility between element yi and yj

Zebra crosswalks: constructing elements from image features

Elements are crosswalk stripe fragments, or stripelets

Steps:
1.) Find edges in image
2.) Construct straight-line edge segments in a greedy way from edges
3.) Construct stripelets from suitable edge segment pairs

Constructing elements from image features

Straight-line edge segments

Stripelet elements

Crosswalk: unary potential

Observation: perspective dictates that stripes higher in the image appear narrower (assuming image is right-side-up)

Characteristic width as function of height in image


Empirical data: width of all stripelets in image as function of row

Stripelets on crosswalk tend to fit straight line reflecting characteristic width property

Straight line forms lower envelope boundary in each image

Find envelope automatically
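The slides do not say how the envelope is found, so the following Python sketch shows only one possible way to fit a lower-envelope line to (row, width) data. It is an assumption, not the authors' method: it tries every line through two data points, rejects lines that leave any point below them, and keeps the line with the most points within a hypothetical `band` above it. The names `lower_envelope`, `tol`, and `band` are illustrative.

```python
from itertools import combinations

def lower_envelope(points, tol=1e-6, band=1.0):
    """Fit width = m * row + c so that no point lies below the line.

    points: list of (row, width) pairs, one per stripelet.
    Returns (m, c) for the supporting line with the most points within
    `band` above it. Brute force over point pairs: O(N^3), fine for a sketch.
    """
    best = None
    for (y1, w1), (y2, w2) in combinations(points, 2):
        if y1 == y2:
            continue                       # vertical line in (row, width) space
        m = (w2 - w1) / (y2 - y1)
        c = w1 - m * y1
        residuals = [w - (m * y + c) for y, w in points]
        if min(residuals) < -tol:
            continue                       # some point falls below this line
        support = sum(1 for r in residuals if r <= band)
        if best is None or support > best[0]:
            best = (support, m, c)
    if best is None:
        raise ValueError("no supporting line found")
    return best[1], best[2]

# Synthetic data: five stripelets on the line width = 0.5 * row + 2,
# plus four wider clutter elements lying above that line.
on_line = [(10, 7), (20, 12), (30, 17), (40, 22), (50, 27)]
clutter = [(15, 30), (25, 50), (35, 40), (45, 60)]
m, c = lower_envelope(on_line + clutter)
```

A real image would likely need some tolerance for noisy stripelets that fall slightly below the true envelope; this sketch treats the lower boundary as strict.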


The closer a stripelet is to the envelope, the more likely it belongs to figure

Let E=distance between point and envelope in (y,w) space

Let En = E/w* be the normalized distance, where w* is the width of the stripelet


Another unary cue: longer stripelet more likely to belong to figure

Let a and b denote the upper and lower lengths of a stripelet…


Putting both unary cues together:

Ground: ψi(xi =0)=1

Figure: ψi(xi =1) = (1/10) min [1, (ab)^(1/4) (1-En)]

Note that maximum value of ψi(xi =1) is 1/10, much less than the ground value of 1
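A minimal Python sketch of the combined unary potential follows. It rests on several assumptions that the slides do not settle: E is taken as the vertical (width-axis) distance to an envelope line width = m * row + c, the bracketed operator is treated as a min (which is what the stated maximum of 1/10 implies), and negative scores are clamped to zero so the potential stays non-negative. The function and parameter names are hypothetical.

```python
def normalized_envelope_distance(row, width, m, c):
    """E_n = E / w*, with E assumed to be the vertical distance in (y, w) space."""
    envelope_width = m * row + c           # characteristic width at this row
    return abs(width - envelope_width) / width

def unary_figure_potential(a, b, row, width, m, c):
    """psi_i(x_i = 1) = (1/10) * min[1, (ab)^(1/4) * (1 - E_n)].

    a, b: upper and lower lengths of the stripelet (pixels)
    The ground potential psi_i(x_i = 0) is fixed at 1 and is not computed here.
    """
    e_n = normalized_envelope_distance(row, width, m, c)
    score = (a * b) ** 0.25 * (1.0 - e_n)
    score = max(score, 0.0)                # assumption: clamp so potential >= 0
    return 0.1 * min(1.0, score)

# A long stripelet lying on the envelope reaches the cap of 1/10.
psi_long = unary_figure_potential(a=100, b=100, row=20, width=12, m=0.5, c=2)
# A short stripelet twice as wide as the envelope predicts scores lower.
psi_short = unary_figure_potential(a=1, b=1, row=20, width=24, m=0.5, c=2)
```

Because the figure potential never exceeds 1/10 while ground is fixed at 1, an isolated stripelet always favors ground; only support from compatible neighbors can push it toward figure.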

Crosswalk: binary potential

Based on cross ratio tests

Inspired by [Uddin & Shioyama ’05]

Figure: Stripelet 1 and Stripelet 2 crossed by vertical “probe” lines, which give cross ratios r1 and r2


Test two properties using cross ratios:

1.) All four edges of the two stripelets are collinear in 3-D ⇒ r1 ≅ r2

2.) Since stripe widths and gaps between stripes are all equal in 3-D, expect r1 and r2 to be close to 1/4
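The slides do not spell out which cross-ratio convention is used, so the Python sketch below assumes the form (AB · CD) / (AC · BD) for four points A, B, C, D along a probe line, chosen because it gives exactly 1/4 for equally spaced points. It also checks numerically that the value is unchanged by a 1-D projective map, which is why the test can be applied in the image. The map coefficients are arbitrary illustrative values.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio (AB * CD) / (AC * BD) of four collinear points.

    Points are scalar positions along a probe line, assumed ordered a < b < c < d
    (the four stripe edges crossed by the probe).
    """
    return ((b - a) * (d - c)) / ((c - a) * (d - b))

def projective_map(t, p=1.0, q=0.0, r=0.1, s=1.0):
    """1-D projective transformation t -> (p*t + q) / (r*t + s), illustrative only."""
    return (p * t + q) / (r * t + s)

# Equal stripe width and gap in 3-D: edges at 0, 1, 2, 3 along the ground.
equal_spacing = [0.0, 1.0, 2.0, 3.0]
r_world = cross_ratio(*equal_spacing)
# The same edges after perspective foreshortening along the probe line.
r_image = cross_ratio(*[projective_map(t) for t in equal_spacing])
```

Both values come out at 1/4 up to floating-point error, even though the projected spacings are no longer equal.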


Putting both binary cues together:

ψij(xi=0,xj=0) =1, ψij(xi=0,xj=1) =1, ψij(xi=1,xj=0) =1

and

ψij(xi=1,xj=1) = (10/3) e^(-10R)

where R= 1/2 (|r1- ¼| + |r2- ¼|) + 2|r1- r2|
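The binary potential can be sketched directly from these formulas. The Python below builds the 2x2 table for one stripelet pair from its two cross ratios; the function names and the table layout (indexed as `table[x_i][x_j]`) are illustrative choices rather than anything specified in the slides.

```python
import math

def cross_ratio_error(r1, r2):
    """R = 1/2 (|r1 - 1/4| + |r2 - 1/4|) + 2 |r1 - r2|."""
    return 0.5 * (abs(r1 - 0.25) + abs(r2 - 0.25)) + 2.0 * abs(r1 - r2)

def pairwise_potential(r1, r2):
    """Return psi_ij as a 2x2 table; only the figure-figure entry is non-neutral."""
    ff = (10.0 / 3.0) * math.exp(-10.0 * cross_ratio_error(r1, r2))
    return [[1.0, 1.0],
            [1.0, ff]]

# Perfectly consistent pair: R = 0, so the F-F entry takes its maximum 10/3.
ideal = pairwise_potential(0.25, 0.25)
# Slightly inconsistent pair: R = 0.125, so the F-F entry drops just below 1.
noisy = pairwise_potential(0.30, 0.25)
```

Under this formula the F-F entry exceeds the neutral value 1 only when R < ln(10/3)/10 ≈ 0.12, so pairs with a larger cross-ratio error actively discourage a joint figure label.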

Constructing graph

Connect stripelet pairs that satisfy three criteria:

1.) Distance between stripelet centroids sufficiently small

2.) Cross ratio error measure R sufficiently low

3.) Monotonicity requirement: the lower of the two stripelets must be wider

Eliminate stripelets without any connections
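The three connection criteria and the pruning step could be sketched as follows. The `Stripelet` fields, the `pair_error` callback (standing in for the cross ratio error measure R), and both thresholds are hypothetical; the slides give no numeric values for "sufficiently small" or "sufficiently low". Rows are assumed to increase downward, so the lower stripelet is the one with the larger row.

```python
import math
from dataclasses import dataclass

@dataclass
class Stripelet:
    cx: float      # centroid column
    cy: float      # centroid row (increases downward in the image)
    width: float   # stripelet width in pixels

def build_graph(stripelets, pair_error, max_dist=80.0, max_error=0.2):
    """Return (edges, kept) where kept lists stripelets with >= 1 connection."""
    edges = []
    for i in range(len(stripelets)):
        for j in range(i + 1, len(stripelets)):
            a, b = stripelets[i], stripelets[j]
            # 1.) centroids must be close enough
            if math.hypot(a.cx - b.cx, a.cy - b.cy) > max_dist:
                continue
            # 2.) cross ratio error R must be low enough
            if pair_error(a, b) > max_error:
                continue
            # 3.) monotonicity: the lower stripelet must be wider
            lower, upper = (a, b) if a.cy > b.cy else (b, a)
            if lower.width <= upper.width:
                continue
            edges.append((i, j))
    kept = sorted({k for edge in edges for k in edge})
    return edges, kept

# Illustrative data; the constant error function is a stand-in for R.
demo = [Stripelet(100, 200, 30), Stripelet(100, 160, 24), Stripelet(400, 50, 10)]
demo_edges, demo_kept = build_graph(demo, pair_error=lambda a, b: 0.05)
```

In this toy input the third stripelet is too far from the others to connect, so it is eliminated before inference.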

Figure panels: original stripelets; stripelets remaining after determining connections

Belief propagation

Run a few sweeps of BP (asynchronous message updates)

Calculate unary beliefs

Any element with unary belief P(xi=1) > 0.9 is labeled as figure
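A compact Python sketch of this inference step is given below. It assumes sum-product belief propagation on the pairwise model with binary states, and updates messages in place so each update sees the latest values (one reading of "asynchronous"). The function names, the default sweep count, and the data layout are assumptions; only the 0.9 threshold comes from the slide.

```python
def belief_propagation(unary, pairwise, sweeps=5):
    """Return normalized unary beliefs [P(x_i=0), P(x_i=1)] for each node.

    unary:    list of [psi_i(0), psi_i(1)]
    pairwise: dict {(i, j): 2x2 table psi_ij[x_i][x_j]}
    """
    neighbors = {i: [] for i in range(len(unary))}
    msgs = {}
    for (i, j) in pairwise:
        neighbors[i].append(j)
        neighbors[j].append(i)
        msgs[(i, j)] = [1.0, 1.0]          # message from i to j
        msgs[(j, i)] = [1.0, 1.0]          # message from j to i

    def psi(i, j, xi, xj):
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    for _ in range(sweeps):
        for (i, j) in list(msgs):          # in-place (asynchronous) updates
            new = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    prod = unary[i][xi]
                    for k in neighbors[i]:
                        if k != j:
                            prod *= msgs[(k, i)][xi]
                    total += prod * psi(i, j, xi, xj)
                new.append(total)
            norm = sum(new)
            msgs[(i, j)] = [v / norm for v in new]

    beliefs = []
    for i in range(len(unary)):
        b = list(unary[i])
        for k in neighbors[i]:
            b = [b[x] * msgs[(k, i)][x] for x in (0, 1)]
        norm = sum(b)
        beliefs.append([v / norm for v in b])
    return beliefs

def label_figure(beliefs, threshold=0.9):
    """Label a node as figure when its belief P(x_i = 1) exceeds the threshold."""
    return [b[1] > threshold for b in beliefs]
```

On a graph without loops these beliefs equal the exact marginals; on the loopy stripelet graph they are approximations, which is presumably why only a few sweeps are run.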

Belief propagation: result

Experimental results

Execution speed: a few seconds per image

Small dataset – but includes all images taken by one blind subject in a single session

Conclusions

Crosswalk visibility very important for safety

Pedestrian must align him/herself properly to them before crossing

Computer vision solution: first step is to detect and localize crosswalk

Use figure-ground approach, implemented as graphical model

Encouraging experimental results: fast and robust

Much more algorithm work is required, as well as extensive user testing…

Ongoing/future work

Learn unary and binary potentials from large labeled database

Include many possible cues to see which are most effective…

For instance, how about relative color cues? E.g. using color ratios [Funt & Finlayson ’95]

ROC analysis (e.g. false positive crosswalk detections)

Basic geometric analysis of detected crosswalk to determine orientation

User testing: is tracking needed once you begin crossing?

Tackle two-stripe crosswalk patterns

Determine intersection layout (3-way, 4-way, etc.)

Eventually… port this to camera cell phone. Software could be downloaded for free

Another application of F/G framework: finding text in cluttered scenes [Shen & Coughlan ICPR ’06]

Thanks to…

John Brabyn, Bill Gerrey, Tom Fowle and Josh Miele (Smith-Kettlewell)

Roberto Manduchi (UC Santa Cruz)
