© 2010 by W. W. Norton & Co., Inc.

Recognizing Objects

Chapter 3
Lecture Outline

Chapter 3: Recognizing Objects

Lecture Outline
Form Perception
Object Recognition
Word Recognition
Feature Nets
Different Objects, Different Recognition Systems?
Top-down Influences on Object Recognition

Recognizing Objects

Why is object recognition important?

Crucial for applying your knowledge
Crucial for learning

Form Perception

How do we perceive and recognize objects?

Form perception: shape and size
Object recognition: identification

Form Perception

Jerome Bruner
Gestalt Psychology

Form Perception

One set of visual features

Two possible interpretations

But only one can be seen at a time

Necker Cube

Form Perception

Knowledge can change our interpretation

Form Perception

People resolve ambiguity in everyday situations

Form Perception

Your ability to interpret these scenes is governed by a few basic principles

Form Perception

Good Continuation

Proximity

Similarity

Closure

Simplicity

Single objects

How our mind creates objects

Form Perception

Parallel Processing

Form Perception

Simpler to interpret this as one X and not two v’s

Form Perception

What is this? Hint: The black is the background.

Form Perception

Proximity, good continuation, closure

Letter and Word Recognition

Form Perception

Brain areas for basic visual features and brain areas for large-scale form are interactive

Object Recognition

Now let’s turn from form perception, the process through which the basic shape and size of an object are seen

And discuss object recognition, the process through which the object is identified

Object Recognition

Can recognize objects even when incomplete

Incomplete information

From the back

From the front

Context helps

Object Recognition

Same stimulus

H A

Top-Down Influences on Object Recognition

Bottom-up (or data-driven) processing: stimulus-driven effects

Top-down (or concept-driven) processing: knowledge- or expectation-driven effects

Object Recognition

Recognition begins with features—the small elements that result from the organized perception of form

Object Recognition

Features
Building blocks
Commonalities for variable objects
Play a role in visual search

Object Recognition

Visual Search Demo

Object Recognition

Find the vertical line (standing up)

Object Recognition

Find the green-colored line

Object Recognition

Find the vertical red-colored line (standing up)

Object Recognition

Which one was harder?
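The contrast between the three searches can be sketched in code. This is a minimal illustration, assuming each item in a display is described as a (color, orientation) tuple; the function name `search_type` and the example displays are hypothetical and are not taken from the demo slides. A search counts as a "feature" search when the target has at least one feature value that no distractor shares, and as a "conjunction" search when it can only be identified by a combination of features.

```python
def search_type(target, distractors):
    """Classify a search display as 'feature' or 'conjunction'.

    target and each distractor are tuples of feature values,
    for example ("red", "vertical"). This is an illustrative
    simplification, not a model of search times.
    """
    for i, value in enumerate(target):
        # If no distractor shares this feature value, the target
        # can be found on the basis of that single feature.
        if all(d[i] != value for d in distractors):
            return "feature"
    # Every single feature of the target is shared with some distractor,
    # so only the combination of features identifies it.
    return "conjunction"


# Hypothetical displays (assumed, not copied from the demo).
orientation_only = [("red", "horizontal")] * 5
mixed = [("red", "horizontal"), ("green", "vertical")] * 3

print(search_type(("red", "vertical"), orientation_only))
print(search_type(("red", "vertical"), mixed))
```

In this sketch, the vertical red line among horizontal red lines is a feature search, while the same target among red horizontal and green vertical lines is a conjunction search, the kind that is typically harder.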

Object Recognition

Difficulty in judging how more than one feature is bound together in objects

Integrative agnosia, parietal cortex damage
Disruption of parietal cortex via transcranial magnetic stimulation (TMS)

Word Recognition

Some methodology for studying word recognition: From tachistoscope to computers

Word Recognition

Masked words Repeated words

40 ms

Word Recognition

Word-superiority effect: when asked whether a briefly presented string contained an “E” or a “K,” people respond faster when the letter appeared within a word such as “DARK” than within a letter string such as “JPERW”

Word Recognition

Better at identifying letters in a word

Word Recognition

Why word superiority?
Probability

How likely is it that letter combinations appear in English?
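One rough way to answer that question is to count letter pairs (bigrams) in a sample of English words. The sketch below assumes a tiny hand-picked word list, `TOY_WORDS`, which is hypothetical and far too small to estimate real English frequencies; the function name `bigram_counts` is likewise an assumption made for illustration.

```python
from collections import Counter


def bigram_counts(words):
    """Count adjacent letter pairs across a list of words."""
    counts = Counter()
    for word in words:
        letters = word.upper()
        # zip pairs each letter with the one that follows it.
        for first, second in zip(letters, letters[1:]):
            counts[first + second] += 1
    return counts


# Toy sample only (an assumption for illustration, not real corpus data).
TOY_WORDS = ["the", "that", "then", "cat", "hat", "drum", "dark"]

counts = bigram_counts(TOY_WORDS)
for pair in ("TH", "AT", "DR", "TP"):
    print(pair, counts[pair])
```

Even in this toy sample, pairs such as "TH" and "AT" occur repeatedly while "TP" never occurs, which is the kind of difference in likelihood the slide refers to.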

Word Recognition

Errors also driven by probability
Likely to misread words predictably
“TPUM” is likely to be misread as “TRUM” or even “DRUM.”
But the reverse errors are rare: “DRUM” is unlikely to be misread as “TRUM” or “TPUM”

Feature Nets

Complex

Simple

Feature Nets

“Neural Network”
Have receptive fields
Fire above threshold
Like complex assemblies of neurons

Feature Nets

Recent firing = higher starting activation level
Frequency leads to higher recency
Repetition increases recency
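The idea that a detector fires above a threshold and starts from a higher activation level after recent firing can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `Detector`, the numeric values, and the fixed `priming` increment are all hypothetical, and the sketch leaves out the decay of activation over time.

```python
class Detector:
    """A single detector with a firing threshold and a baseline activation."""

    def __init__(self, name, threshold=1.0, baseline=0.0, priming=0.3):
        self.name = name
        self.threshold = threshold
        self.baseline = baseline  # starting activation level
        self.priming = priming    # assumed boost after each firing

    def receive(self, input_strength):
        """Return True if baseline plus input reaches the threshold."""
        fired = self.baseline + input_strength >= self.threshold
        if fired:
            # Recent firing raises the starting level for next time.
            # The cap keeps the detector from firing with no input at all.
            self.baseline = min(self.baseline + self.priming,
                                0.9 * self.threshold)
        return fired


detector = Detector("TH")
print(detector.receive(0.8))  # weak input, unprimed detector
print(detector.receive(1.0))  # strong input
print(detector.receive(0.8))  # same weak input, now primed
```

In this sketch the weak input fails to fire the detector at first, but the same weak input succeeds once the detector has fired recently, which is the pattern the slide describes for repeated and frequent stimuli.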

Feature Nets

To explain the word-superiority effect,

Feature Nets

Stronger baseline activity Better recognition

recover from confusion

Feature Nets

TH more frequent
CA and AT more frequent

Feature Nets

Stronger baseline activity Will correct recognition

Feature Nets

Knowledge not locally represented
But rather, distributed knowledge

Feature Nets

Errors arise from the network’s ability to deal with ambiguous inputs and to recover from errors

Accuracy sacrificed for efficiency
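The trade-off can be illustrated with a toy reading of a briefly shown string. The sketch below assumes that each letter position supplies bottom-up evidence for one or more candidate letters and that well-practiced bigram detectors contribute an extra baseline boost; the function name `read_string` and every number in the example are hypothetical values chosen for illustration, not measurements.

```python
from itertools import product


def read_string(letter_evidence, bigram_baseline):
    """Pick the reading with the highest combined activation.

    letter_evidence: one dict per position, mapping candidate
        letters to bottom-up evidence.
    bigram_baseline: dict mapping letter pairs to the assumed
        baseline activation of their bigram detectors.
    """
    best, best_score = None, float("-inf")
    candidates = [list(position) for position in letter_evidence]
    for letters in product(*candidates):
        # Bottom-up support from the individual letters.
        score = sum(ev[ch] for ev, ch in zip(letter_evidence, letters))
        # Extra support from frequently used letter pairs.
        score += sum(bigram_baseline.get(a + b, 0.0)
                     for a, b in zip(letters, letters[1:]))
        if score > best_score:
            best, best_score = letters, score
    return "".join(best)


# "TPUM" shown briefly: the second letter is only weakly registered.
evidence = [{"T": 1.0}, {"P": 0.6, "R": 0.4}, {"U": 1.0}, {"M": 1.0}]
baseline = {"TR": 0.5, "RU": 0.4, "UM": 0.3, "PU": 0.2}  # assumed values

print(read_string(evidence, baseline))
```

With these assumed values the input favors "P" at the letter level, yet the stronger "TR" and "RU" detectors tip the overall reading to "TRUM". The same bias that helps the network recover from a weak or ambiguous input also produces this kind of predictable error.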

Feature Nets

A much more complex feature net with feedforward and feedback loops

More like a brain

Feature Nets

Building blocks for objects

Feature Nets

Bottom-up recognition
Geon recognition leads to object recognition
Viewpoint invariant
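A toy sketch can show why recognition from geons is viewpoint invariant: if an object is stored as a set of geons and the relations among them, the description never mentions the viewing angle. The names `KNOWN_OBJECTS` and `recognize`, the geon labels, and the relation labels below are assumptions made for illustration; they are a loose rendering of the familiar mug-versus-pail contrast, not a full account of the recognition-by-components model.

```python
# Each object is a set of (geon, relation, geon) triples. These
# descriptions are simplified assumptions for illustration only.
KNOWN_OBJECTS = {
    "mug": frozenset({("arc", "side-of", "cylinder")}),
    "pail": frozenset({("arc", "top-of", "cylinder")}),
}


def recognize(parts):
    """Return the known object sharing the most triples, or None."""
    parts = frozenset(parts)
    best_name, best_overlap = None, 0
    for name, description in KNOWN_OBJECTS.items():
        overlap = len(parts & description)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name


print(recognize([("arc", "side-of", "cylinder")]))
print(recognize([("arc", "top-of", "cylinder")]))
print(recognize([("brick", "top-of", "cone")]))
```

The same two geons yield different objects depending on how they are related, and an input that matches no stored description is left unrecognized.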

Object Recognition

Geon Demo

Object Recognition

Write the objects you see

Feature Nets

Recognition by components

viewpoint independent
viewpoint dependent

Whole objects need to be rotated

Different Objects, Different Recognition Systems?

Some categories are special
Faces


Prosopagnosia is a type of agnosia also known as face blindness


Houses about the same upright and inverted

Faces much worse inverted and much better upright


Do these two faces look different?


Viewpoint dependence appears when
Interpreting faces
Expertise is high (e.g., dog judges)
Specific individuals have to be recognized
Configurations of component parts are important


Face Expertise Car Expertise

Bird Expertise


Holistic processing
Composite faces

The Importance of Larger Contexts

Most of the accounts we have covered in this chapter depend on bottom-up processing

However, there is a great deal of knowledge that guides our recognition

Later chapters will discuss this further

Chapter 2 Questions

Which of the following is supportive of the claim that perception is in the “eye of the beholder” and not in the stimulus itself:

a) When presented with ambiguous letters, the visual system uses context to determine their identity.

b) A traffic light can be identified even if partially occluded by a tree branch.

c) Whether someone remembers having seen an ambiguous figure (e.g., face-vase) before depends on whether the interpretation of the figure is the same.

d) all of the above

Which of the following is evidence for a feature theory of perception?

a) The visual system is specialized with cells that detect single features.

b) When researchers are able to stabilize the retinal image for an individual, preventing tiny eye movements (saccades) that refresh the rods and cones, the image stays the same.

c) In visual search paradigms, in which a single target must be found in an array of other items, target identification is faster when it shares features with the distractors.

d) Detecting an embedded figure (including its features) is independent of the way the form is parsed.

When Betty (an English speaker) is shown strings of letters tachistoscopically, they are overregularized to follow the rules of common English spelling. This is because

a) of the word superiority effect.

b) all humans are predisposed toward the visual configurations evident in “regular” bigrams; this is why English uses them.

c) of a lifetime of strengthening the bigram detectors for common English letter pairs.

d) Betty is reluctant to give answers that she cannot easily pronounce.

Which of the following methodologies does not measure brain activity or structure?

a) magnetic resonance imaging (MRI)

b) computerized axial tomography (CT)

c) positron emission tomography (PET)

d) transcranial magnetic stimulation (TMS)

The use of geons is associated with

a) the recognition-by-components (RBC) model.

b) the word superiority effect.

c) visual masking.

d) feature nets.

The “recognition-via-multiple-views” approach to object recognition is also known as _____ recognition.

a) viewpoint dependent

b) viewpoint independent

c) object

d) face

Which of the following is the clinical term we use to describe a disturbance in the initiation or organization of voluntary action?

a) aphasia

b) neglect

c) agnosia

d) none of the above