
Page 1: Attention in Computer Vision
Page 2: Attention in Computer Vision

Attention in Computer Vision

Mica Arie-Nachimson and Michal Kiwkowitz

May 22, 2005
Advanced Topics in Computer Vision

Weizmann Institute of Science

Page 3: Attention in Computer Vision

Problem Definition – Search Order

Object recognition

NO

• Vision applications apply “expensive” algorithms (e.g. recognition) to image patches

• Mostly naïve selection of patches

• The selection of patches determines the number of calls to the “expensive” algorithm

Page 4: Attention in Computer Vision

Problem Definition - Search Order

Object recognition

NO / YES

• A more sophisticated selection of patches would mean fewer calls to the “expensive” algorithm

• Attention is used to focus efficiently on incoming data (better use of limited processing capacity)

Page 5: Attention in Computer Vision

Problem Definition - Search Order

Object recognition

(Patches examined in order 1, 2, 3, 4, 5, 6)

Page 6: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 7: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 8: Attention in Computer Vision

Attention

• Attention implies allocating resources, perceptual or cognitive, to some things at the expense of others.

Page 9: Attention in Computer Vision

What is Attention

• You are sitting in class listening to a lecture.

• Two people behind you are talking. – Can you hear the lecture?

• One of them mentions the name of a friend of yours. – How did you know?

Page 10: Attention in Computer Vision

Attention in Other Applications

• Face Detection (feature selection)

• Video Analysis (temporal block selection)

• Robot Navigation (select locations)

• …

Page 11: Attention in Computer Vision

Attention is Directed by:

Bottom-up:
• From small to large units of meaning
• Rapid
• Task-independent

Page 12: Attention in Computer Vision

Attention is Directed by:

Top-down:
• Use higher levels (context, expectation) to process incoming information (guess)
• Slower
• Task-dependent

http://www.rybak-et-al.net/nisms.html

Page 13: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 14: Attention in Computer Vision

When is information selected (filtered)?
– Early selection (Broadbent, 1958)
– Cocktail party phenomenon (Moray, 1959)
– Late selection (Treisman, 1960) – attenuation

• All information is sent to perceptual systems for processing
• Some is selected for complete processing
• Some is more likely to be selected

Attention – which?

Page 15: Attention in Computer Vision

Parallel Search

Is there a green O?

+

A. Treisman, G. Gelade, 1980

Page 16: Attention in Computer Vision

Conjunction Search

Is there a green N ?

+

A. Treisman, G. Gelade, 1980

Page 17: Attention in Computer Vision

Results

A. Treisman, G. Gelade, 1980

Page 18: Attention in Computer Vision

Conjunction Search

+

A. Treisman, G. Gelade, 1980

Page 19: Attention in Computer Vision

Color map Orientation map

A. Treisman, G. Gelade, 1980

Page 20: Attention in Computer Vision

Color map Orientation map

A. Treisman, G. Gelade, 1980

Page 21: Attention in Computer Vision

Conjunction Search

+

A. Treisman, G. Gelade, 1980

Page 22: Attention in Computer Vision

Primitives

• Intensity
• Orientation
• Color
• Curvature
• Line End
• Movement

[Slide shows example stimulus arrays for each feature dimension]

Page 23: Attention in Computer Vision

Feature Integration Theory

Attention – two stages:

Pre-attention:
• Parallel processing
• Low-level features
• Fast
• Parallel search

Attention:
• Serial processing
• Localized focus
• Slower
• Conjunctive search

How is the focus found & shifted?

A. Treisman, G. Gelade, 1980

Page 24: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 25: Attention in Computer Vision

Shifts in Attention

“Shifts in selective visual attention: towards the underlying neural circuitry”,

Christof Koch, and Shimon Ullman, 1985

C. Koch, and S. Ullman, 1985

[Diagram: several feature maps (orientation, color, curvature, line end, movement) feed into a central representation – a saliency map – on which attention operates]

Page 26: Attention in Computer Vision

Saliency

“A model of saliency-based visual attention for rapid scene analysis”

Laurent Itti, Christof Koch, and Ernst Niebur, 1998

L. Itti, C. Koch, and E. Niebur, 1998

• Salient - stands out

• Example – telephone & road sign have high saliency

Page 27: Attention in Computer Vision

Figure from C. Koch; L. Itti, C. Koch, and E. Niebur, 1998

Page 28: Attention in Computer Vision

Intensity

L. Itti, C. Koch, and E. Niebur, 1998

Cells in the retina

Page 29: Attention in Computer Vision

Intensity

Create 8 spatial scales using Gaussian pyramids

L. Itti, C. Koch, and E. Niebur, 1998

Page 30: Attention in Computer Vision

Intensity

Center-Surround difference operator:
– Sensitive to local spatial discontinuities
– Principal computation in the retina & primary visual cortex
– Subtracts the coarse scale from the fine scale (center “+” taken at a fine scale, surround “−” at a coarse scale)

L. Itti, C. Koch, and E. Niebur, 1998
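As a rough illustration (not the authors' code), the operator can be sketched in a few lines of Python with OpenCV: build a Gaussian pyramid, upsample the coarse (surround) level back to the fine (center) resolution, and take the point-by-point absolute difference. The file name and parameter choices are purely illustrative.

```python
import cv2
import numpy as np

def gaussian_pyramid(image, levels=9):
    """Pyramid levels 0..levels-1; level 0 is the original resolution."""
    pyr = [image.astype(np.float32)]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def center_surround(pyr, c, s):
    """|I(c) - interp(I(s))|: fine level c minus coarse level s, upsampled to level c."""
    fine = pyr[c]
    coarse = cv2.resize(pyr[s], (fine.shape[1], fine.shape[0]),
                        interpolation=cv2.INTER_LINEAR)
    return np.abs(fine - coarse)

# e.g. one intensity map with center scale 2 and surround scale 5 (illustrative image path)
img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
intensity_2_5 = center_surround(gaussian_pyramid(img), c=2, s=5)
```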

Page 31: Attention in Computer Vision

Toy Example

Fine level:
0    0    0
0  255    0
0    0    0

Coarse level (after the Gaussian pyramid and interpolation back to the fine resolution):
0    0    0
0    0    0
0    0    0

Point-by-point subtraction:
0    0    0
0  255    0
0    0    0

An isolated bright pixel gives a strong center-surround response.

Page 32: Attention in Computer Vision

Toy Example

Fine level:
255  255  255
255  255  255
255  255  255

Coarse level (after the Gaussian pyramid and interpolation):
255  255  255
255  255  255
255  255  255

Point-by-point subtraction:
0    0    0
0    0    0
0    0    0

A uniform region gives no center-surround response.

Page 33: Attention in Computer Vision

Intensity

Compute:
I(c, s) = | I(c) ⊖ I(s) |,  with c ∈ {2, 3, 4},  s = c + δ,  δ ∈ {3, 4}

→ 6 intensity maps

Different center/surround ratios give multiscale feature extraction, e.g.:
I(2, 5) = | I(2) ⊖ I(5) |
I(2, 6) = | I(2) ⊖ I(6) |
I(3, 6) = | I(3) ⊖ I(6) |

L. Itti, C. Koch, and E. Niebur, 1998

Page 34: Attention in Computer Vision

Color

Same c and s as with intensity → 12 color maps

Kandel et al. (2000). Principles of Neural Science. McGraw-Hill/Appleton & Lange

L. Itti, C. Koch, and E. Niebur, 1998
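For reference, the 12 maps come from two color-opponency channels (red/green and blue/yellow), each computed for the same six (c, s) pairs. In Itti, Koch & Niebur (1998) the two channels are, up to the exact definition of the broadly tuned R, G, B, Y components:

```latex
\mathcal{RG}(c,s) = \bigl|\,(R(c) - G(c)) \ominus (G(s) - R(s))\,\bigr|, \qquad
\mathcal{BY}(c,s) = \bigl|\,(B(c) - Y(c)) \ominus (Y(s) - B(s))\,\bigr|
```

where ⊖ is the same across-scale (interpolate-and-subtract) difference used for intensity; 2 channels × 6 (c, s) pairs = 12 maps.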

Page 35: Attention in Computer Vision

Orientation

Same c and s as with intensity → 24 orientation maps (4 orientations × 6 center-surround pairs)

O(c, s, θ) = | O(c, θ) ⊖ O(s, θ) |,  θ ∈ {0°, 45°, 90°, 135°}

From Visual system presentation by S. Ullman

L. Itti, C. Koch, and E. Niebur, 1998

Page 36: Attention in Computer Vision

Figure from C. Koch; L. Itti, C. Koch, and E. Niebur, 1998

Page 37: Attention in Computer Vision

Normalization Operator N(·)

L. Itti, C. Koch, and E. Niebur, 1998
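The paper's N(·) promotes maps that contain a few strong, isolated peaks and suppresses maps with many comparable peaks. A rough Python sketch follows; the fixed range M and the local-maximum neighborhood size are illustrative choices, not the paper's exact values.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def normalize_map(fmap, M=1.0, neighborhood=7):
    """Sketch of the N(.) operator of Itti, Koch & Niebur (1998)."""
    fmap = fmap.astype(np.float64)
    rng = fmap.max() - fmap.min()
    if rng == 0:
        return np.zeros_like(fmap)
    fmap = (fmap - fmap.min()) / rng * M                       # 1. scale to a fixed range [0, M]
    local_max = (maximum_filter(fmap, size=neighborhood) == fmap) & (fmap > 0)
    maxima = fmap[local_max]
    global_max = maxima.max()
    others = maxima[maxima < global_max]
    m_bar = others.mean() if others.size else 0.0              # 2. mean of the other local maxima
    return fmap * (global_max - m_bar) ** 2                    # 3. promote maps with "lonely" peaks
```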

Page 38: Attention in Computer Vision

Saliency Map

S = ( N(I) + N(C) + N(O) ) / 3

where I, C, O are the intensity, color and orientation conspicuity maps and N(·) is the normalization operator

L. Itti, C. Koch, and E. Niebur, 1998

Page 39: Attention in Computer Vision

Algorithm – up to now

1. Extract feature maps
2. Compute center-surround maps (42 in total)
   • Intensity – I (6)
   • Color – C (12)
   • Orientation – O (24)
3. Combine each channel into a conspicuity map
4. Compute saliency by summing and normalizing the maps
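Put together, steps 1–4 amount to the following data flow (a compressed sketch that reuses normalize_map from the N(·) sketch above, assumes all maps have been resized to a common scale, and glosses over details such as the exact across-scale addition; the helper names are illustrative):

```python
import numpy as np

def conspicuity(maps):
    """Step 3: normalize each center-surround map, sum them, normalize the result again.
    normalize_map is the N(.) sketch above; maps share one resolution."""
    return normalize_map(sum(normalize_map(m) for m in maps))

def saliency(intensity_maps, color_maps, orientation_maps):
    """Step 4: average the three conspicuity maps into the saliency map S."""
    I_bar = conspicuity(intensity_maps)      # 6 maps
    C_bar = conspicuity(color_maps)          # 12 maps
    O_bar = conspicuity(orientation_maps)    # 24 maps
    return (I_bar + C_bar + O_bar) / 3.0
```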

Page 40: Attention in Computer Vision

Laurent Itti, Christof Koch, and Ernst Niebur, 1998

Page 41: Attention in Computer Vision

Winner-Take-All network of leaky integrate-and-fire neurons

“Inhibition of return”

Selection of the FOA (Focus Of Attention)

L. Itti, C. Koch, and E. Niebur, 1998
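The paper implements the winner-take-all stage with a 2-D layer of leaky integrate-and-fire neurons. For a static saliency map the resulting sequence of fixations can be mimicked by a much simpler argmax-and-suppress loop, sketched below; the inhibition radius is an illustrative parameter, not the paper's value.

```python
import numpy as np

def fixations(saliency_map, n_shifts=5, inhibition_radius=20):
    """Return the first n_shifts attended locations on a static saliency map."""
    sal = saliency_map.astype(np.float64).copy()
    h, w = sal.shape
    ys, xs = np.mgrid[0:h, 0:w]
    focus = []
    for _ in range(n_shifts):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)     # winner takes all
        focus.append((y, x))
        mask = (ys - y) ** 2 + (xs - x) ** 2 <= inhibition_radius ** 2
        sal[mask] = 0.0                                        # inhibition of return
    return focus
```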

Page 42: Attention in Computer Vision

Results

• FOA shifts: 30–70 ms
• Inhibition: 500–900 ms

Inhibition of return ends

L. Itti, C. Koch, and E. Niebur, 1998

Page 43: Attention in Computer Vision

Results

Spatial Frequency Content (SFC), Reinagel & Zador, 1997

[Figure panels: Image, SFC, Saliency, Output]

L. Itti, C. Koch, and E. Niebur, 1998

Page 44: Attention in Computer Vision

Results

[Figure panels (a)–(d): Image, SFC, Saliency, Output]

L. Itti, C. Koch, and E. Niebur, 1998; Spatial Frequency Content, Reinagel & Zador, 1997

Page 45: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 46: Attention in Computer Vision

Attention & Object Recognition

• “Is bottom-up attention useful for object recognition?” – U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Computer recognition: segmented, labeled images
Human recognition: cluttered scenes, non-labeled objects

Attention

Page 47: Attention in Computer Vision

Object Recognition

Saliency model → grow a region in the strongest feature map → pass it to object recognition (Lowe)

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004
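One possible reading of the "grow a region in the strongest map" step, sketched here with illustrative parameters (4-connectivity, a relative threshold of 0.1): flood-fill outward from the attended location over pixels of the winning feature map that stay above a fraction of the seed value, and use the resulting mask to restrict which keypoints are passed to the recognizer.

```python
from collections import deque
import numpy as np

def grow_attended_region(winning_map, seed_yx, rel_threshold=0.1):
    """Flood-fill from the attended location; keep pixels above rel_threshold * seed value."""
    h, w = winning_map.shape
    thresh = rel_threshold * winning_map[seed_yx]
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([seed_yx])
    mask[seed_yx] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx] \
                    and winning_map[ny, nx] >= thresh:
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask  # pass only keypoints inside this mask to the recognizer
```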

Page 48: Attention in Computer Vision

Attention & Object Recognition

Learning inventories – “grocery cart problem”

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Real-world scenes
1 image for training (15 fixations)
2–5 images for testing (20 fixations)

Page 49: Attention in Computer Vision

[Diagram: training image and testing images → object recognition → match]

Page 50: Attention in Computer Vision

“Grocery Cart” Problem

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

training testing1

testing2

Page 51: Attention in Computer Vision

“Grocery Cart” Problem

Downsides:

• Bias of human photography

• Small image set

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Solution:
• Robot as acquisition tool

Page 52: Attention in Computer Vision

Robot - Landmark Learning

Objective – how many objects are found and classified correctly?

Navigation – simple obstacle avoiding algorithm using infrared sensors

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Page 53: Attention in Computer Vision

Object recognition

< 3 key points

Page 54: Attention in Computer Vision

Landmark Learning

With Attention

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Page 55: Attention in Computer Vision

Landmark Learning

With Random Selection

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Page 56: Attention in Computer Vision

Landmark Learning - Results

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Page 57: Attention in Computer Vision

Saliency Based Object Recognition

• Biologically motivated
• Uses bottom-up, allows combining top-down information
• Segmentation
  – Cluttered scenes
  – Unlabeled objects
  – Multiple objects in a single image
• Static priority map

U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

Page 58: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 59: Attention in Computer Vision

Comparison

“Comparing attention operators for learning landmarks”, R. Sim, S. Polifroni, G. Dudek , June 2003

Other attention operators for low level features

R. Sim, S. Polifroni, G. Dudek , June 2003

Page 60: Attention in Computer Vision

Comparison

R. Sim, S. Polifroni, G. Dudek , June 2003

Operators compared: edge density, radial symmetry, smallest eigenvalue, Caltech saliency

Page 61: Attention in Computer Vision

Comparison

• Landmark learning

• Training – learn landmarks knowing camera pose

• Testing - determine pose of camera according to landmarks (pose estimation)

R. Sim, S. Polifroni, G. Dudek , June 2003

Page 62: Attention in Computer Vision

Comparison - Results

• All operators perform better than random

• Radial symmetry gives the worst results

• The Caltech (saliency) operator performs similarly to the edge-density and eigenvalue operators

• BUT – more complex to implement, more computing time

• → less preferred in practice

R. Sim, S. Polifroni, G. Dudek , June 2003

Page 63: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 64: Attention in Computer Vision

The Problem

Object recognition

(Patches examined in order 1, 2, 3, 4, 5, 6)

Page 65: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 66: Attention in Computer Vision

Biological Motivation

• An alternative approach: continuous search difficulty

• Based on similarity:– Between Targets and Non-Targets in the scene– Between Non-Targets and Non-Targets in the scene

• Similar structural units do not need separate treatment

• Structural units similar to a possible target get high priority

Duncan & Humphreys [89]

Page 67: Attention in Computer Vision

Biological Motivation

[Diagram: search difficulty as a function of target–nontarget similarity and nontarget–nontarget similarity; search is hardest when targets are similar to non-targets and non-targets are dissimilar to each other]

Duncan & Humphreys [89]

Page 68: Attention in Computer Vision

Biological Motivation

• Explains pop-out vs. serial search phenomenon

Non-targets:

Target:

Duncan & Humphreys [89]

Page 69: Attention in Computer Vision

Biological Motivation

• Explains pop-out vs. serial search phenomenon

Non-targets:

Target:

Duncan & Humphreys [89]

Page 70: Attention in Computer Vision

Biological Motivation

• Explains the pop-out vs. serial search phenomenon

[Diagram: the pop-out and serial-search examples placed on the search-difficulty plot of target–nontarget vs. nontarget–nontarget similarity]

Duncan & Humphreys [89]

Page 71: Attention in Computer Vision

Using Inner-scene Similarities

• Every candidate is characterized by a vector of n attributes

• n-dimensional metric space
  – A candidate is a point in the space
  – Some distance function d is associated with the space

Avraham & Lindenbaum [04] Avraham & Lindenbaum [05]

Page 72: Attention in Computer Vision

Using Inner-scene Similarities Example

• One feature only: object area

• d: regular Euclidean distance

Feature space

Page 73: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 74: Attention in Computer Vision

Difficulty of Search

• The difficulty measure is the number of queries until the first target is found

• Two main factors:
  – Distance between Targets and Non-Targets
  – Distance between Non-Targets and Non-Targets

Feature space

Page 75: Attention in Computer Vision

Difficulty of Search – Cover

Feature space

c: the number of circles in the cover

Page 76: Attention in Computer Vision

Difficulty of Search

c will be our measure of the search difficulty

We need some constraint on the circles’ size!

c: the number of circles

Page 77: Attention in Computer Vision

Difficulty of Search

dt: max–min target distance

Page 78: Attention in Computer Vision

Difficulty of Search

dt-cover: a cover by circles of diameter dt

Page 79: Attention in Computer Vision

Difficulty of Search

Minimum dt-cover

c: the number of circles in the minimal dt-cover (circle diameter = dt)

Page 80: Attention in Computer Vision

Difficulty of Search

c: the number of circles

[Figure: example cover with c = 7 circles of diameter dt]

Page 81: Attention in Computer Vision

Difficulty of Search

c: insects example

[Feature space figure: c = 3]

Page 82: Attention in Computer Vision

Difficulty of Search

Example: easy search (c = 2)

Page 83: Attention in Computer Vision

Difficulty of Search

Example: hard search (c = number of candidates)

Page 84: Attention in Computer Vision

Define the Difficulty using c

• Lower bound: Every search algorithm needs c calls to the oracle before finding the first target in the worst case

• Upper bound: There is an algorithm that needs at most c calls to the oracle to find the first target, for all search tasks

Difficulty of Search

Page 85: Attention in Computer Vision

Lower bound

Every search algorithm needs c calls to the oracle before finding the first target in the worst case

Difficulty of Search

[Figure: candidates grouped into 5 clusters, each of diameter dt; in the worst case the first target lies in the last cluster examined, so c queries are needed]

Page 86: Attention in Computer Vision

Upper bound

There is an algorithm that needs at most c calls to the oracle to find the first target, for all search tasks:

FLNN – Farthest Labeled Nearest Neighbor

Difficulty of Search

Page 87: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 88: Attention in Computer Vision

FLNN – Farthest Labeled Nearest Neighbor

Efficient Algorithms

[Figure: FLNN query order 1–5 over the candidate clusters]

c is a tight bound!
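A sketch of FLNN as its name suggests (an illustration of the idea, not the authors' code): repeatedly query the unlabeled candidate that is farthest from its nearest already-labeled neighbor, until the oracle, i.e. the expensive recognizer, reports a target.

```python
import numpy as np

def flnn_search(features, oracle):
    """features: (n, d) candidate descriptors; oracle(i) -> True iff candidate i is a target."""
    n = features.shape[0]
    unlabeled = set(range(n))
    nearest_labeled = np.full(n, np.inf)          # distance to the nearest labeled candidate
    order = []
    current = 0                                   # arbitrary first query
    while unlabeled:
        unlabeled.discard(current)
        order.append(current)
        if oracle(current):
            return current, order                 # first target found
        d = np.linalg.norm(features - features[current], axis=1)
        nearest_labeled = np.minimum(nearest_labeled, d)
        nearest_labeled[order] = -np.inf          # never re-query labeled candidates
        current = int(np.argmax(nearest_labeled)) # farthest from the labeled set goes next
    return None, order
```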

Page 89: Attention in Computer Vision

Difficulty of Search

How do we compute c?
– Need to know dt
– Compute the minimal dt-cover
– Count the number of circles (c = 7 in the example)

Page 90: Attention in Computer Vision

Difficulty of Search

How do we compute c?
– Need to know dt → but to know the exact dt we need to know all the targets and non-targets, and that’s what we’re looking for…
– Compute the minimal dt-cover → computing the minimal dt-cover is NP-complete!
– Count the number of circles = c → OK, that’s easy…
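Since the minimal dt-cover is NP-complete to find, one practical illustration is a greedy construction: repeatedly take an uncovered candidate and cover everything within dt/2 of it. Each such ball has diameter at most dt, so this produces some valid dt-cover, and its size is an upper bound on c. This is a sketch under those assumptions, not the authors' method.

```python
import numpy as np

def greedy_dt_cover_size(features, d_t):
    """features: (n, d) candidate descriptors; returns the size of a greedy dt-cover."""
    n = features.shape[0]
    uncovered = np.ones(n, dtype=bool)
    circles = 0
    while uncovered.any():
        center = int(np.argmax(uncovered))        # first uncovered candidate
        d = np.linalg.norm(features - features[center], axis=1)
        uncovered &= d > d_t / 2.0                # everything within d_t/2 is now covered
        circles += 1
    return circles                                # >= minimal c, since this is a valid cover
```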

Page 91: Attention in Computer Vision

Upper & Lower Bounds on c

• Upper bounds:
  – The number of candidates
  – If we know that dt is larger than some d0: can approximate the cover size
• Lower bounds:
  – FLNN worst case
  – If we know that dt is larger than some d0: can approximate the cover size

Difficulty of Search

Page 92: Attention in Computer Vision

Outline
• What is Attention
• Attention in Object Recognition
• Saliency Model
  – Feature Integration Theory
  – Saliency Algorithm
  – Saliency & Object Recognition
  – Comparison
• Inner Scene Similarity Model
  – Biological motivation
  – Difficulty of Search Tasks
  – Algorithms
    • FLNN
    • VSLE

Page 93: Attention in Computer Vision

Improving FLNN

• What’s wrong with FLNN?
  – Relates only to the nearest known neighbor
  – Finds only the first target efficiently
  – Cannot be easily extended to include top-down information

Efficient Algorithms

Page 94: Attention in Computer Vision

VSLE – Visual Search using Linear Estimation

• Each candidate has a probability of being a target
• Query the candidate with the highest probability
• Update the other candidates’ probabilities according to the known results
  – Every known target/non-target affects the other candidates in inverse relation to its distance
  – If we know the results for candidates 1, …, m, the remaining labels are estimated linearly (see the appendix slides)
• Dynamic priority map

Efficient Algorithms
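A sketch of the query/update loop. VSLE proper estimates the unknown labels with a linear estimator whose coefficients are solved from the pairwise distances (see the appendix slides); the kernel-weighted average below is a simplified stand-in for that estimator, and the prior and bandwidth sigma are assumptions made for illustration.

```python
import numpy as np

def vsle_like_search(features, oracle, prior=0.5, sigma=1.0):
    """features: (n, d) candidate descriptors; oracle(i) -> True iff candidate i is a target."""
    n = features.shape[0]
    prob = np.full(n, prior)                  # dynamic priority map
    labels = np.full(n, np.nan)               # 1 = target, 0 = non-target, nan = not queried yet
    for _ in range(n):
        # query the not-yet-queried candidate with the highest probability
        candidate = int(np.argmax(np.where(np.isnan(labels), prob, -np.inf)))
        labels[candidate] = 1.0 if oracle(candidate) else 0.0
        if labels[candidate] == 1.0:
            return candidate, prob            # first target found
        known = ~np.isnan(labels)
        # distance of every candidate to every known (queried) candidate
        d = np.linalg.norm(features[:, None, :] - features[known][None, :, :], axis=2)
        w = np.exp(-(d / sigma) ** 2)         # closer known results weigh more
        prob = (w @ labels[known] + prior) / (w.sum(axis=1) + 1.0)
    return None, prob
```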

Page 95: Attention in Computer Vision

Efficient Algorithms

VSLE – Visual Search using Linear Estimation

[Figure: the dynamic priority map – each candidate’s target probability, updated as queries are answered]

Page 96: Attention in Computer Vision

Efficient Algorithms

VSLE – Visual Search using Linear Estimation

[Figure: the priority map after further queries]

Page 97: Attention in Computer Vision

Combining Top-Down Information

• Simply set the initial probabilities to match previously known data

• Add known target objects to the space; this alters the probabilities accordingly and speeds up the search

Efficient Algorithms

Page 98: Attention in Computer Vision

Experiment 1: COIL-100Efficient Algorithms

Columbia Object Image Library [96]

Page 99: Attention in Computer Vision

Experiment 1: COIL-100

• Features:
  – 1st, 2nd, 3rd Gaussian derivatives → 9 basis filters
  – 5 scales → 9 × 5 = 45 features

• Euclidean distance

Efficient Algorithms

Rao & Ballard [95]
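A sketch of such a 45-dimensional descriptor: the 9 basis filters are the 2 + 3 + 4 Gaussian derivatives of orders 1, 2 and 3 in two dimensions, applied at 5 scales. Pooling the response at the patch center and the particular sigma values are illustrative assumptions, not necessarily the authors' choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

DERIV_ORDERS = [(oy, ox) for total in (1, 2, 3)
                for oy in range(total + 1) for ox in [total - oy]]   # 2 + 3 + 4 = 9 filters
SCALES = (1.0, 2.0, 4.0, 8.0, 16.0)                                  # 5 scales (assumed values)

def patch_descriptor(patch):
    """patch: 2-D grayscale array for one candidate; returns a 9 x 5 = 45-vector."""
    cy, cx = patch.shape[0] // 2, patch.shape[1] // 2
    feats = [gaussian_filter(patch.astype(np.float64), sigma=s, order=orders)[cy, cx]
             for s in SCALES for orders in DERIV_ORDERS]
    return np.array(feats)

def distance(a, b):
    return np.linalg.norm(patch_descriptor(a) - patch_descriptor(b))  # Euclidean distance
```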

Page 100: Attention in Computer Vision

Experiment 1: COIL-100 – Efficient Algorithms

[Result plots for 10 cars and 10 cups, as a function of the number of queries]

Page 101: Attention in Computer Vision

Experiment 2: hand-segmented – Efficient Algorithms

• Every large segment is a candidate
• 24 candidates
• 4 targets

Berkeley hand-segmented DB
Martin, Fowlkes, Tal & Malik [01]

Page 102: Attention in Computer Vision

Experiment 2: hand segmented

• Features: color histograms … separated into 8 bins each → 64 features

• Euclidean distance

Efficient Algorithms

Page 103: Attention in Computer Vision

Experiment 3: automatic color segmentation

• Automatically color-segmented image, used for face detection

Efficient Algorithms

Page 104: Attention in Computer Vision

Experiment 3: color segmentation

• 146 candidates

• 4 features: segment size, mean value of red, green and blue

• Euclidean distance

Efficient Algorithms

# queries

Page 105: Attention in Computer Vision

Combining top-down information

• Add known targets to the space

Efficient Algorithms

[Result plots, as a function of the number of queries: without additional targets vs. with additional targets]

Page 106: Attention in Computer Vision

Summary: saliency model vs. similarity model

Saliency model:
• Biologically motivated
• Uses bottom-up, allows combining top-down information
• Segmentation
• Static priority map

Similarity model:
• Biologically motivated
• Uses bottom-up, allows combining top-down information
• No segmentation
• Dynamic priority map
• Measures the search difficulty

Page 107: Attention in Computer Vision

Summary

• What is attention

• Aid object recognition tasks by choosing the area of interest

• Two approaches: saliency model and similarity model– Biological motivation– Algorithms

Page 108: Attention in Computer Vision

Thank You!

Page 109: Attention in Computer Vision

Linearly Estimating l(xk)

A linear estimation for l(xk):

Which, of course, minimizes the error

Solving a set of equations gives an estimation:

Page 110: Attention in Computer Vision

Linearly Estimating l(xk)

Estimation:

where l is the vector of known labels, and R is computed as follows (i, j = 1, …, m):

R and r depend only on the distances, so they can be computed once, in advance.
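The formulas themselves did not survive in this transcript. Presumably the estimator has the standard linear least-squares form; the following reconstruction is an assumption based on the description above, not a verbatim copy of the slide:

```latex
\hat{l}(x_k) = \mathbf{r}^{\top} R^{-1} \mathbf{l},
\qquad
R_{ij} = \rho\bigl(d(x_i, x_j)\bigr), \quad
r_i = \rho\bigl(d(x_k, x_i)\bigr), \quad i, j = 1, \dots, m,
```

with ρ(·) a correlation model that decreases with distance, so that R and r indeed depend only on the pairwise distances.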