
Page 1:

Lecture 17: Parts-based models and context

CS6670: Computer Vision, Noah Snavely

Page 2:

Announcements

• Project 3: Eigenfaces
  – due Wednesday, November 11 at 11:59pm
  – solo project

Page 3:

Object → Bag of ‘words’

Page 4:

What about spatial info?

Page 5:

Problem with bag-of-words

• All of these images have equal probability under bag-of-words methods

• Location information is important

Page 6:

Model: Parts and Structure

Page 7:

Deformable objects

Images from D. Ramanan’s dataset

Page 8:

Deformable objects

Images from Caltech-256


Page 9:

Part-based representation

• Objects are decomposed into parts and the spatial relations among them


Fischler and Elschlager ‘73

Page 10:

Pictorial structures

• Two components:

– Appearance model
  • How much does a given window look like a given part?

– Spatial model
  • How well do the parts match the expected shape?

Page 11:
Page 12:

Pictorial Structure

• Matching = Local part evidence + Global constraint

• m_i(l_i): matching cost for part i

• d_ij(l_i, l_j): deformation cost for connected pairs of parts

• (v_i, v_j): connection (edge) between parts i and j
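Putting these terms together, the overall matching cost takes the standard pictorial-structures form; this is a sketch in LaTeX notation, since the slide's own equation is not reproduced in this transcript:

    E(l_1, \dots, l_n) = \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j)

The best match is the configuration of part locations (l_1, ..., l_n) that minimizes E.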

Page 13:
Page 14:

Part-based representation

• Tree model: efficient inference by dynamic programming


Page 15:

Appearance model

• Each part has an associated appearance model

– E.g., a reference patch, gradient histogram, etc.

[Figure: face model with parts left eye, right eye, nose, left mouth, right mouth, chin]
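To make the appearance term concrete, here is a minimal sketch, not the lecture's code, that scores every window of a grayscale image against a reference patch with sum of squared differences; lower cost means the window looks more like the part. The SSD choice and the function name are illustrative assumptions.

```python
import numpy as np

def appearance_cost_map(image, patch):
    """m_i(l_i): cost of placing one part at each location l_i = (row, col).

    image: 2D grayscale array; patch: smaller 2D reference patch for the part.
    Uses sum of squared differences, one simple choice; the lecture also
    mentions gradient histograms as an alternative appearance model.
    """
    image = np.asarray(image, dtype=float)
    patch = np.asarray(patch, dtype=float)
    H, W = image.shape
    h, w = patch.shape
    costs = np.empty((H - h + 1, W - w + 1))
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            window = image[r:r + h, c:c + w]
            costs[r, c] = np.sum((window - patch) ** 2)
    return costs  # lower cost = window looks more like this part
```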

Page 16:

Spatial model

• Each edge represents a spring with a certain relative offset and covariance
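Written out, a spring with mean relative offset \mu_{ij} and covariance \Sigma_{ij} corresponds to a deformation cost of Mahalanobis form; this is a sketch consistent with the description above, not an equation shown on the slide:

    d_{ij}(l_i, l_j) = (l_j - l_i - \mu_{ij})^\top \Sigma_{ij}^{-1} (l_j - l_i - \mu_{ij})

A stiff spring (small covariance) penalizes deviation from the expected offset heavily; a loose spring tolerates more deformation.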

Page 17:

Matching on tree structure

• For each candidate location l_1, find the best l_2 (the location minimizing part 2's matching cost plus the deformation cost to l_1)

• Remove v_2 and repeat with the smaller tree, until only a single part remains

• Complexity: O(nk^2) for n parts and k candidate locations per part (see the sketch below)

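The matching procedure above is plain dynamic programming on the tree. Here is a minimal Python sketch, not the lecture's code, assuming every part has the same k candidate locations, the matching costs m_i are given as arrays, and the deformation cost d_ij is given as a function; all names are illustrative.

```python
import numpy as np

def match_pictorial_structure(m, edges, d, root=0):
    """Minimize sum_i m[i][l_i] + sum_(i,j) d(i, j, l_i, l_j) over a tree.

    m: list of n arrays; m[i][a] is the matching cost of part i at location a
    edges: list of (parent, child) pairs forming a tree rooted at `root`
    d: function d(i, j, a, b) giving the deformation cost for part j at
       location b when its parent i is at location a
    Returns (total cost, chosen location index per part), in O(n k^2) time.
    """
    n, k = len(m), len(m[root])
    children = {i: [] for i in range(n)}
    for p, c in edges:
        children[p].append(c)

    best = [np.asarray(mi, dtype=float).copy() for mi in m]  # B_i(l_i)
    argbest = {}

    def upward(i):
        # Post-order pass: fold each child's best subtree cost into its parent.
        for c in children[i]:
            upward(c)
            fold = np.empty(k)
            arg = np.empty(k, dtype=int)
            for a in range(k):  # each candidate parent location
                costs = [best[c][b] + d(i, c, a, b) for b in range(k)]
                arg[a] = int(np.argmin(costs))
                fold[a] = costs[arg[a]]
            best[i] += fold
            argbest[c] = arg

    upward(root)
    locs = [None] * n
    locs[root] = int(np.argmin(best[root]))

    def downward(i):
        # Backtrack: each child takes its best location given its parent's.
        for c in children[i]:
            locs[c] = int(argbest[c][locs[i]])
            downward(c)

    downward(root)
    return float(best[root][locs[root]]), locs
```

For a star-shaped model (every part attached to a single root part), `edges` is simply [(0, j) for j in range(1, n)].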

Page 18:

Putting it all together

[Figure: per-part evidence maps for left eye, right eye, nose, left mouth, right mouth, and chin, combined into a marginal on the nose location]

Page 19:

Sample result on matching a human


Page 20:

Sample result on matching a human


Page 21:

Matching results

Page 22:

Learning the model parameters

• Easiest approach: supervised learning

– Someone chooses the number and meaning of the parts and labels them in a set of training examples

– Use these labels to learn the appearance and spatial models (a sketch of the spatial step follows below)

• A lot of work has been done on unsupervised learning of these models
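As an example of the spatial half of that supervised step, here is a minimal sketch, with illustrative names, that estimates one edge's spring parameters (mean relative offset and covariance) from labeled part locations:

```python
import numpy as np

def fit_spring(parent_locs, child_locs):
    """Estimate (mu, Sigma) for one edge of the model from labeled data.

    parent_locs, child_locs: (N, 2) arrays with the labeled (x, y) locations
    of two connected parts across N training images.
    Returns the mean offset child - parent and its covariance, i.e. the
    parameters of the Gaussian "spring" from the spatial-model slide.
    """
    offsets = np.asarray(child_locs, float) - np.asarray(parent_locs, float)
    mu = offsets.mean(axis=0)
    Sigma = np.cov(offsets, rowvar=False)
    return mu, Sigma
```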

Page 23:

Some learned object models

Page 24:

Part-based representation

• k-fans model (D. Crandall et al., 2005)


Page 25:

How much does shape help?

• Crandall, Felzenszwalb, Huttenlocher, CVPR ’05
• Shape variance decreases with increasing model complexity
• We do get some benefit from shape

Page 26:

3-minute break

Page 27:

Context: thinking outside the (bounding) box

Slides courtesy Alyosha Efros

© Oliva & Torralba

Page 28:

Eye of the Beholder

Claude Monet, Gare St. Lazare, Paris, 1877

Page 29:

Eye of the Beholder

where did it go?

Page 30:

Seeing less than you think…

Page 31:

Seeing less than you think…

Page 32:

“The Miserable Life of a Person Detector”

Page 33:

What the Detector Sees

Page 34:

What the Detector Does

[Figure: detector output with true detections, missed detections, and false detections]

Local detector: [Dalal-Triggs 2005]

Page 35:

[Figure: scene with detections labeled road, table, chair, keyboard, car]

If we have 1000 categories (detectors), and each detector produces 1 FP every 10 images, we will have 100 false alarms per image… pretty much garbage…
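Spelled out, the arithmetic behind that estimate is simply:

    1000 \text{ detectors} \times \frac{1 \text{ false positive}}{10 \text{ images}} = 100 \text{ false positives per image}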

with hundreds of categories…

Slide by Antonio Torralba

Page 36:

We know there is a keyboard present in this scene even if we cannot see it clearly.

We know there is no keyboard present in this scene

… even if there actually is one.

Slide by Antonio Torralba

Context to the rescue!

Page 37:

When is context helpful?

[Plot: information vs. distance from the object, for local features and contextual features]

Slide by Antonio Torralba

Page 38:

Is it just for small / blurry things?

Slide by Antonio Torralba

Page 39:

Is it just for small / blurry things?

Slide by Antonio Torralba

Page 40:

Is it just for small / blurry things?

Slide by Antonio Torralba

Page 41:

Context is hard to fight!

Thanks to Paul Viola for showing me these

Page 42:

more “Look-alikes”

Page 43:


Don’t even need to see the object

Slide by Antonio Torralba

Page 44:

Don’t even need to see the object

Chance ~ 1/30000

Slide by Antonio Torralba

Page 45:

The influence of an object extends beyond its physical boundaries

But an object can say a lot about the scene