What can computational models tell us about face processing?

Garrison W. Cottrell
Gary's Unbelievable Research Unit (GURU), Computer Science and Engineering Department, Institute for Neural Computation, UCSD

Collaborators on this work (and earlier versions of this work): Ralph Adolphs, Kristin Branson, Andy Calder, Matthew Dailey, Michael Fleming, Carrie Joyce, Janet Metcalfe, Nam Nguyen, Curt Padgett, Maki Sugimoto, Matt Tong, Brian Tran
Why model?
• Models rush in where theories fear to tread.
• Models can be manipulated in ways people cannot.
• Models can be analyzed in ways people cannot.
Models rush in where theories fear to tread
• Theories are high-level descriptions of the processes underlying behavior.
• They are often not explicit about the processes involved.
• They are difficult to reason about if no mechanisms are explicit -- they may be too high level to make explicit predictions.
• Theory formation itself is difficult.
• Using machine learning techniques, one can often build a working model of a task for which we have no theories or algorithms (e.g., expression recognition).
• A working model provides an "intuition pump" for how things might work, especially if they are "neurally plausible" (e.g., development of face processing -- Dailey and Cottrell).
• A working model may make unexpected predictions (e.g., the Interactive Activation Model and SLNT).
Models can be manipulated in ways people cannot
• We can see the effects of variations in cortical architecture (e.g., split (hemispheric) vs. non-split models (Shillcock and Monaghan word perception model)).
• We can see the effects of variations in processing resources (e.g., variations in number of hidden units in Plaut et al. models).
• We can see the effects of variations in environment (e.g., what if our parents were cans, cups or books instead of humans? I.e., is there something special about face expertise versus visual expertise in general? (Sugimoto and Cottrell, Joyce and Cottrell, Tong & Cottrell)).
• We can see variations in behavior due to different kinds of brain damage within a single “brain” (e.g. Juola and Plunkett, Hinton and Shallice).
Models can be analyzed in ways people cannot
In the following, I specifically refer to neural network models.
• We can do single-unit recordings.
• We can selectively ablate and restore parts of the network, even down to the single-unit level, to assess the contribution to processing.
• We can measure the individual connections -- e.g., the receptive and projective fields of a unit.
• We can measure responses at different layers of processing (e.g., which level accounts for a particular judgment: perceptual, object, or categorization? (Dailey et al., J. Cog. Neuro. 2002)).
How (I like) to build Cognitive Models
• I like to be able to relate them to the brain, so “neurally plausible” models are preferred -- neural nets.
• The model should be a working model of the actual task, rather than a cartoon version of it.
• Of course, the model should nevertheless be simplifying (i.e., it should be constrained to the essential features of the problem at hand):
• Do we really need to model the (supposed) translation invariance and size invariance of biological perception?
• As far as I can tell, NO!
• Then, take the model "as is" and fit the experimental data: 0 fitting parameters is preferred over 1, 2, or 3.
The other way (I like) to build Cognitive Models
• Same as above, except: use them as exploratory models -- in domains where there is little direct data (e.g., no single-cell recordings in infants or undergraduates) -- to suggest what we might find if we could get the data. These can then serve as "intuition pumps."
• Examples:
• Why we might get specialized face processors
• Why those face processors get recruited for other tasks
Today: Talk #1: The Face of Fear: A Neural Network Model of Human Facial Expression Recognition
Joint work with Matt Dailey
A Good Cognitive Model Should:
• Be psychologically relevant (i.e., it should be in an area with a lot of real, interesting psychological data).
• Actually be implemented.
• If possible, perform the actual task of interest rather than a cartoon version of it.
• Be simplifying (i.e., it should be constrained to the essential features of the problem at hand).
• Fit the experimental data.
• Make new predictions to guide psychological research.
The Issue: Are Similarity and Categorization Two Sides of the Same Coin?
• Some researchers believe perception of facial expressions is a new example of categorical perception:
• Like the colors of a rainbow, the brain separates expressions into discrete categories, with:
• sharp boundaries between expressions, and…
• higher discrimination of faces near those boundaries.
[Figure: percent of subjects who pressed each response button along the morph continuum.]
The Issue: Are Similarity and Categorization Two Sides of the Same Coin?
• Some researchers believe the underlying representation of facial expressions is NOT discrete:
• There are two (or three) underlying dimensions, e.g., intensity and valence (found by MDS).
• Our perception of expressive faces induces a similarity structure that results in a circle in this space.
Human MDS Configuration
Anger
Disgust
Fear
Happiness
Sadness
Surprise
The Face Processing System
[Diagram: Pixel (Retina) Level → Gabor Filtering (Perceptual/V1 Level) → PCA (Object/IT Level) → Neural Net (Category Level: Happy, Sad, Afraid, Angry, Surprised, Disgusted)]
The Face Processing System
[Diagram: Pixel (Retina) Level → Gabor Filtering (Perceptual/V1 Level) → PCA (Object/IT Level) → Neural Net (Category Level: Bob, Carol, Ted, Alice)]
The Face Processing System
[Diagram: Pixel (Retina) Level → Gabor Filtering (Perceptual/V1 Level; Feature level) → PCA (Object/IT Level) → Neural Net (Category Level: Bob, Carol, Ted, Cup, Can, Book)]
The Face Processing System
[Diagram: Pixel (Retina) Level → Gabor Filtering (Perceptual/V1 Level) → LSF PCA and HSF PCA (Object/IT Level) → Neural Net (Category Level: Bob, Carol, Ted, Cup, Can, Book)]
The Gabor Filter Layer
• Basic feature: the 2-D Gabor wavelet filter (Daugman, 1985).
• These model the processing in early visual areas.
[Diagram: the image is convolved with the Gabor filter bank; the response magnitudes are subsampled in a 29x36 grid.]
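The pipeline on this slide (build a Gabor wavelet, convolve, keep magnitudes, subsample on a grid) can be sketched as follows. The frequencies, orientations, kernel size, and grid step below are illustrative choices, not the parameters of the original model:

```python
import numpy as np

def gabor_kernel(freq, theta, sigma=2.0, size=15):
    """Complex 2-D Gabor wavelet (Daugman, 1985): a complex sinusoid
    windowed by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate to orientation theta
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.exp(1j * 2 * np.pi * freq * xr)

def gabor_magnitudes(image, freqs=(0.1, 0.2), n_orient=8, step=8):
    """Convolve with the filter bank, keep response magnitudes, and
    subsample each response map on a coarse grid."""
    F = np.fft.fft2(image)
    features = []
    for freq in freqs:
        for k in range(n_orient):
            kern = gabor_kernel(freq, np.pi * k / n_orient)
            K = np.fft.fft2(kern, s=image.shape)  # circular convolution via FFT
            resp = np.abs(np.fft.ifft2(F * K))    # magnitudes, as on the slide
            features.append(resp[::step, ::step].ravel())
    return np.concatenate(features)

img = np.random.rand(64, 64)
feats = gabor_magnitudes(img)   # 2 freqs x 8 orientations x an 8x8 grid
```

Taking the magnitude discards phase, which is why the representation is noninvertible, a point that matters later when the receptive fields are visualized.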
Principal Components Analysis
• The Gabor filters give us 40,600 numbers.
• We use PCA to reduce this to 50 numbers.
• PCA is like Factor Analysis: it finds the underlying directions of maximum variance.
• PCA can be computed in a neural network through a competitive Hebbian learning mechanism.
• Hence this is also a biologically plausible processing step.
• We suggest this leads to representations similar to those in inferior temporal cortex.
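One standard Hebbian route to PCA is Sanger's generalized Hebbian rule (a sketch, since the slide does not name the exact mechanism): each unit learns with a local Hebbian update, minus what earlier units already account for, so the units converge on the leading principal components.

```python
import numpy as np

def sanger_pca(X, n_components=2, lr=0.001, epochs=50, seed=0):
    """Sanger's generalized Hebbian rule: purely local updates whose
    fixed points are the leading principal components of X."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                       # PCA assumes centered data
    W = rng.normal(scale=0.1, size=(n_components, X.shape[1]))
    for _ in range(epochs):
        for x in X:
            y = W @ x                            # unit activations
            # Hebbian term minus the part earlier units already explain.
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W

# Toy data whose variance is dominated by the first axis: the first
# unit's weight vector should align with that axis at roughly unit norm.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5)) * np.array([5.0, 1.0, 0.5, 0.2, 0.1])
W = sanger_pca(X)
```

The update uses only pre- and post-synaptic activity plus competition among the output units, which is what makes this a biologically plausible stand-in for exact PCA.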
How to do PCA with a neural network
(Cottrell, Munro & Zipser, 1987; Cottrell & Fleming 1990; Cottrell & Metcalfe 1990; O’Toole et al. 1991)
A self-organizing network that learns whole-object representations (features, Principal Components, Holons, eigenfaces).
[Diagram: Input from Perceptual Layer → Holons (Gestalt layer)]
Holons
• They act like face cells (Desimone, 1991):
• Response of single units is strong despite occluding eyes, e.g.
• Response drops off with rotation.
• Some fire to my dog's face.
• A novel representation: distributed templates --
• each unit's optimal stimulus is a ghostly looking face (template-like),
• but many units participate in the representation of a single face (distributed).
• For this audience: neither exemplars nor prototypes!
• They explain holistic processing:
• Why? If stimulated with a partial match, the firing represents votes for this template: units "downstream" don't know what caused this unit to fire. (More on this later…)
The Final Layer: Classification
(Cottrell & Fleming 1990; Cottrell & Metcalfe 1990; Padgett & Cottrell 1996; Dailey & Cottrell 1999; Dailey et al. 2002)
The holistic representation is then used as input to a categorization network trained by supervised learning.
• Excellent generalization performance demonstrates the sufficiency of the holistic representation for recognition.
[Diagram: Input from Perceptual Layer → Holons → Categories. Output: Cup, Can, Book, Greeble, Face, Bob, Carol, Ted, Happy, Sad, Afraid, etc.]
The Final Layer: Classification
• Categories can be at different levels: basic, subordinate.
• Simple learning rule (~delta rule). It says (mild lie here):
• add inputs to your weights (synaptic strengths) when you are supposed to be on,
• subtract them when you are supposed to be off.
• This makes your weights "look like" your favorite patterns -- the ones that turn you on.
• When there are no hidden units => no backpropagation of error.
• When there are hidden units, we get task-specific features (most interesting when we use the basic/subordinate distinction).
Outline
• An overview of our facial expression recognition system.
• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.
• How our model accounts for the “categorical” data
• How our model accounts for the “two-dimensional” data
• Discussion
• Conclusions for part 1
Facial Expression Database
• Ekman and Friesen quantified the muscle movements (Facial Actions) involved in prototypical portrayals of happiness, sadness, fear, anger, surprise, and disgust.
• Result: the Pictures of Facial Affect database (1976).
• 70% agreement on emotional content by naive human subjects.
• 110 images, 14 subjects, 7 expressions: Anger, Disgust, Neutral, Surprise, Happiness (twice), Fear, and Sadness.
• This is actor "JJ": the easiest for humans (and our model) to classify.
The Final Layer: Classification
• The final layer is trained on the putative emotion being expressed by the actor (what the majority of Ekman's subjects responded when given a 6-way forced choice).
• Uses a very simple learning rule: the delta rule. It says:
• add inputs to your weights (synaptic strengths) when you are supposed to be on,
• subtract them when you are supposed to be off.
• This makes your weights "look like" your favorite patterns -- the ones that turn you on.
• Also: no hidden units => no backpropagation of error.
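The rule described above can be sketched in a few lines. The 4-pixel patterns and sigmoid output units here are illustrative stand-ins for the real Gabor/PCA inputs and the 6 emotion units:

```python
import numpy as np

def delta_rule_epoch(W, X, T, lr=0.5):
    """One pass of the delta rule: add the input to a unit's weights when
    it should be on, subtract it when it should be off (scaled by error)."""
    for x, t in zip(X, T):
        y = 1.0 / (1.0 + np.exp(-W @ x))      # sigmoid output units
        W += lr * np.outer(t - y, x)          # (target - output) * input
    return W

# Two toy 'expressions' as 4-pixel patterns, one output unit per category.
X = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
T = np.eye(2)                                 # one-hot targets
W = np.zeros((2, 4))
for _ in range(100):
    W = delta_rule_epoch(W, X, T)
# Each unit's weight vector now 'looks like' its favorite pattern:
# positive where that pattern is on, negative where the other is on.
```

With no hidden layer there is nothing to backpropagate through, which is the point made on the slide.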
Results (Generalization)
• Kendall’s tau (rank order correlation): .667, p=.0441
• Note: This is an emergent property of the model!
Expression   Network % Correct   Human % Agreement
Happiness    100.0%              98.7%
Surprise     100.0%              92.4%
Disgust      100.0%              92.3%
Anger        89.2%               88.9%
Sadness      82.9%               89.2%
Fear         66.7%               87.7%
Average      89.9%               91.6%
Correlation of Net/Human Errors
• Like all good Cognitive Scientists, we like our models to make the same mistakes people do!
• Networks and humans each have a 6x6 confusion matrix for the stimulus set.
• This suggests looking at the off-diagonal terms: the errors.
• Correlation of off-diagonal terms: r = 0.567 [F(1,28) = 13.3; p = 0.0011].
• Again, this correlation is an emergent property of the model: it was not told which expressions were confusing.
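The off-diagonal comparison can be sketched as follows; the 3x3 matrices below are hypothetical toy data, not the actual 6x6 results:

```python
import numpy as np

def offdiag_correlation(conf_a, conf_b):
    """Pearson correlation of the error (off-diagonal) cells of two
    confusion matrices of the same size."""
    mask = ~np.eye(conf_a.shape[0], dtype=bool)   # keep only the errors
    return np.corrcoef(conf_a[mask], conf_b[mask])[0, 1]

# Hypothetical confusion matrices (rows = true class, cols = response).
net = np.array([[0.90, 0.10, 0.00],
                [0.10, 0.80, 0.10],
                [0.00, 0.20, 0.80]])
hum = np.array([[0.95, 0.05, 0.00],
                [0.10, 0.70, 0.20],
                [0.05, 0.15, 0.80]])
r = offdiag_correlation(net, hum)   # positive if they err on the same pairs
```

Dropping the diagonal matters: overall accuracy inflates the diagonal cells, so including them would overstate the agreement on *which* confusions occur.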
Outline
• An overview of our facial expression recognition system.
• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.
• How our model accounts for the “categorical” data
• How our model accounts for the “two-dimensional” data
• Discussion
• Conclusions for part 1
Examining the Net’s Representations
• We want to visualize "receptive fields" in the network.
• But the Gabor magnitude representation is noninvertible.
• We can learn an approximate inverse mapping, however.
• We used linear regression to find the best linear combination of Gabor magnitude principal components for each image pixel.
• Then projecting each unit's weight vector into image space with the same mapping visualizes its "receptive field."
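The regression-based inversion can be sketched with stand-in data (random codes and pixel values; the real model regressed actual face images on 50 Gabor-magnitude principal components):

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_pcs, n_pixels = 40, 50, 100
codes = rng.normal(size=(n_faces, n_pcs))             # PCA code per face
pixels = codes @ rng.normal(size=(n_pcs, n_pixels))   # stand-in face images

# Linear regression from codes to each pixel; the column of ones carries
# the 'y-intercept', i.e. the average face value at that pixel.
A = np.hstack([codes, np.ones((n_faces, 1))])
M, *_ = np.linalg.lstsq(A, pixels, rcond=None)        # (n_pcs+1, n_pixels)

# Project a unit's weight vector into image space with the same mapping
# (intercept term set to zero), approximating its 'receptive field'.
w = rng.normal(size=n_pcs)
rf_image = np.append(w, 0.0) @ M                      # one value per pixel
```

Zeroing the intercept term is the "subtract the average face" step on the next slide: what remains shows what the unit attends to beyond the mean face.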
Examining the Net’s Representations
• The “y-intercept” coefficient for each pixel is simply the average pixel value at that location over all faces, so subtracting the resulting “average face” shows more precisely what the units attend to:
• Apparently local features appear in the global templates.
Outline
• An overview of our facial expression recognition system.
• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.
• How our model accounts for the “categorical” data
• How our model accounts for the “two-dimensional” data
• Discussion
• Conclusions for part 1
Morph Transition Perception
• Morphs help psychologists study categorization behavior in humans.
• Example: JJ Fear-to-Sadness morph, in steps of 0%, 10%, 30%, 50%, 70%, 90%, 100%.
• Young et al. (1997) "Megamix": presented images from morphs of all 6 emotions (15 sequences) to subjects in random order; the task is a 6-way forced-choice button push.
Results: classical Categorical Perception: sharp boundaries…
…and higher discrimination of pairs of images when they cross a perceived category boundary.
[Figure: 6-way alternative forced choice and percent correct discrimination along the Happiness–Fear–Sadness–Disgust–Anger–Surprise morph continuum.]
Results: Non-categorical RTs
• "Scalloped" reaction times.
[Figure: button-push reaction time along the Happiness–Fear–Sadness–Disgust–Anger–Surprise morph continuum.]
Results: More non-categorical effects
[Figure: "Mixed-In Expression Detection" -- average rank score vs. percent mixing (10%, 30%, 50%) for the mixed-in expression and unrelated expressions (humans).]
• Young et al. also had subjects rate the 1st, 2nd, and 3rd most apparent emotion.
• At the 70/30 morph level, subjects were above chance at detecting the mixed-in emotion. These data seem more consistent with continuous theories of emotion.
Modeling Megamix
• 1 trained neural network = 1 human subject.
• 50 networks; 7 random examples of each expression for training, the remainder for holdout.
• Identification = average of network outputs.
• Response time = uncertainty of the maximal output (1.0 - ymax).
• Mixed-in expression detection: record the 1st, 2nd, and 3rd largest outputs.
• Discrimination: 1 - correlation of layer representations.
• We can then find the layer that best accounts for the data.
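The mapping from network outputs to behavioral measures listed above can be sketched directly; the 6-vector below is an illustrative single-network output, not real data:

```python
import numpy as np

def model_responses(outputs):
    """Map a 6-way output vector onto the behavioral measures:
    forced choice, reaction time (uncertainty of the winner), top-3 ranking."""
    outputs = np.asarray(outputs, dtype=float)
    identification = int(np.argmax(outputs))       # 6-way forced choice
    reaction_time = 1.0 - outputs.max()            # 1.0 - ymax
    top3 = np.argsort(outputs)[::-1][:3]           # 1st/2nd/3rd apparent emotion
    return identification, reaction_time, top3

def discrimination(rep_a, rep_b):
    """Dissimilarity of two layer representations: 1 - correlation."""
    return 1.0 - np.corrcoef(rep_a, rep_b)[0, 1]

out = [0.05, 0.70, 0.10, 0.05, 0.05, 0.05]         # one confident output
ident, rt, top3 = model_responses(out)             # ident=1, rt=0.30
```

Because `discrimination` can be computed at any layer (perceptual, object, or category), the same function supports the search for the layer that best accounts for the human data.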
Modeling Six-Way Forced Choice
• Overall correlation r = .9416, with NO FIT PARAMETERS!
[Figure: model vs. human identification curves along the Happiness–Fear–Sadness–Disgust–Anger–Surprise morph continuum.]
Model Discrimination Scores
• The model fits the data best at a precategorical layer: the layer we call the "object" layer, NOT at the category level.
[Figure: percent correct discrimination along the morph continuum for humans, the model's output layer (R = 0.36), and the model's object layer (R = 0.61).]
Discrimination
• Classically, one requirement for "categorical perception" is higher discrimination of two stimuli a fixed distance apart when those two stimuli cross a category boundary.
• Indeed, Young et al. found in two kinds of tests that discrimination was highest at category boundaries.
• The result that we fit the data best at a layer before any categorization occurs is significant: in some sense, the category boundaries are "in the data," or at least in our representation of the data.
Outline
• An overview of our facial expression recognition system.
• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.
• How our model accounts for the “categorical” data
• How our model accounts for the “non-categorical” data
• Discussion
• Conclusions for part 1
Reaction Time: Human/Model
• Correlation between model & data: .6771, p < .001.
[Figure: human subjects' reaction time vs. model reaction time (1 - max output) along the Happiness–Fear–Sadness–Disgust–Anger–Surprise morph continuum.]
Mix Detection in the Model
[Figure: "Mixed-In Expression Detection" -- average rank score vs. percent mixing (10%, 30%, 50%) for the mixed-in expression and unrelated expressions, for both humans and networks.]
Can the network account for the continuous data as well as the categorical data? YES.
Human/Model Circumplexes
• These are derived from similarities between images using non-metric multidimensional scaling.
• For humans: similarity is the correlation between 6-way forced-choice button pushes.
• For networks: similarity is the correlation between 6-category output vectors.
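A sketch of the embedding step. The original work used non-metric MDS; classical (metric) MDS, shown here with random stand-in output vectors, is a simpler substitute that still illustrates the 1 - correlation dissimilarity:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical MDS: embed a dissimilarity matrix in k dimensions via
    double-centering and eigendecomposition."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # inner products from distances
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]       # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Stand-in: one 6-category output vector per expression; dissimilarity
# is 1 - correlation, as used for the network circumplex.
rng = np.random.default_rng(0)
outputs = rng.random((6, 6))
D = 1.0 - np.corrcoef(outputs)
coords = classical_mds(D)                    # 2-D configuration, one point
                                             # per expression
```

With real data the two dimensions correspond roughly to valence and intensity, and the six expressions fall around a circle in this plane.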
Outline
• An overview of our facial expression recognition system.
• How our model accounts for the “categorical” data
• How our model accounts for the “two-dimensional” data
• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.
• Discussion
• Conclusions for part 1
Discussion
• Our model of facial expression recognition:
• Performs the same task people do
• On the same stimuli
• At about the same accuracy
• Without actually "feeling" anything, and without any access to the surrounding culture, it nevertheless:
• Organizes the faces in the same order around the circumplex
• Correlates very highly with human responses
• Has about the same rank-order difficulty in classifying the emotions
Discussion
• The discrimination correlates with human results most accurately at a precategorization layer: The discrimination improvement at category boundaries is in the representation of data, not based on the categories.
• These results suggest that for expression recognition, the notion of “categorical perception” simply is not necessary to explain the data.
• Indeed, most of the data can be explained by the interaction between the similarity of the representations and the categories imposed on the data: Fear faces are similar to surprise faces in our representation – so they are near each other in the circumplex.
![Page 57: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/57.jpg)
Conclusions from this part of the talk
• The best models perform the same task people do
• Concepts such as “similarity” and “categorization” need to be understood in terms of models that do these tasks
• Our model simultaneously fits data supporting both categorical and continuous theories of emotion
• The fits, we believe, are due to the interaction of the way the categories slice up the space of facial expressions and the way facial expressions inherently resemble one another.
• It also suggests that the continuous theories are correct: “discrete categories” are not required to explain the data.
• We believe our results will easily generalize to other visual tasks, and other modalities.
![Page 58: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/58.jpg)
Outline for the next two parts
• Why would a face area process BMW’s?
• Behavioral & brain data
• Model of expertise
• Results
• Why is this model wrong, and what can we do to fix it?
![Page 59: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/59.jpg)
Are you a perceptual expert?
Take the expertise test!!!**
“Identify this object with the first name that comes to mind.”
**Courtesy of Jim Tanaka, University of Victoria
![Page 60: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/60.jpg)
“Car” - Not an expert
“2002 BMW Series 7” - Expert!
![Page 61: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/61.jpg)
“Bird” or “Blue Bird” - Not an expert
“Indigo Bunting” - Expert!
![Page 62: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/62.jpg)
“Face” or “Man” - Not an expert
“George Dubya” - Expert!
“Jerk” or “Megalomaniac” - Democrat!
![Page 63: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/63.jpg)
How is an object to be named?
Bird: Basic Level (Rosch et al., 1971)
Indigo Bunting: Subordinate (Species) Level
Animal: Superordinate Level
![Page 64: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/64.jpg)
Entry Point Recognition
[Diagram: Visual analysis delivers recognition at the entry point, “Bird” (basic level); semantic analysis links it upward to “Animal,” while fine-grain visual analysis takes it down to “Indigo Bunting.” In experts the entry point shifts downward: the Downward Shift Hypothesis.]
Dog and Bird Expert Study
• Each expert had at least 10 years of experience in their respective domain of expertise.
• None of the participants were experts in both dogs and birds.
• Participants provided their own controls.
Tanaka & Taylor, 1991
![Page 66: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/66.jpg)
Object Verification Task
[Diagram: Participants verify category labels at three levels against pictures: superordinate (Animal vs. Plant), basic (Bird vs. Dog), subordinate (Robin vs. Sparrow), responding YES or NO at each level.]
![Page 67: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/67.jpg)
[Figure: Mean reaction time (msec, 600–900) at superordinate (Animal), basic (Bird/Dog), and subordinate (Robin/Beagle) levels, for the expert vs. novice domain, showing the downward shift.]
Dog and bird experts recognize objects in their domain of expertise at subordinate levels.
![Page 68: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/68.jpg)
Is face recognition a general form of perceptual expertise?
George W.Bush
IndigoBunting
2002Series 7
BMW
![Page 69: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/69.jpg)
[Figure (Tanaka, 2001): Mean reaction time (msec, 600–1200) at superordinate, basic, and subordinate levels for faces vs. objects, showing the downward shift.]
Face experts recognize faces at the individual level of unique identity.
![Page 70: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/70.jpg)
Event-related Potentials and Expertise
[Figure: N170 ERP waveforms for face experts (Bentin, Allison, Puce, Perez & McCarthy, 1996) and for object experts in their novice vs. expert domains (Tanaka & Curran, 2001; see also Gauthier, Curran, Curby & Collins, 2003, Nature Neuro.).]
![Page 71: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/71.jpg)
Neuroimaging of face, bird and car experts
[Figure (Gauthier et al., 2000): Fusiform gyrus activation in “face experts,” car experts (Cars-Objects), and bird experts (Birds-Objects).]
![Page 72: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/72.jpg)
How to identify an expert?
Behavioral benchmarks of expertise
• Downward shift in entry point recognition
• Improved discrimination of novel exemplars from learned and related categories
Neurological benchmarks of expertise
• Enhancement of N170 ERP brain component
• Increased activation of fusiform gyrus
![Page 73: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/73.jpg)
End of Tanaka Slides
![Page 74: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/74.jpg)
• Kanwisher showed the FFA is specialized for faces
• But she forgot to control for what???
![Page 75: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/75.jpg)
Greeble Experts (Gauthier et al. 1999)
• Subjects trained over many hours to recognize individual Greebles.
• Activation of the FFA increased for Greebles as the training proceeded.
![Page 76: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/76.jpg)
The “visual expertise mystery”
• If the so-called “Fusiform Face Area” (FFA) is specialized for face processing, then why would it also be used for cars, birds, dogs, or Greebles?
• Our view: the FFA is an area associated with a process: fine level discrimination of homogeneous categories.
• But the question remains: why would an area that presumably starts as a face area get recruited for these other visual tasks? Surely, they don’t share features, do they?
Sugimoto & Cottrell (2001), Proceedings of the Cognitive Science Society
![Page 77: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/77.jpg)
Solving the mystery with models
• Main idea:
• There are multiple visual areas that could compete to be the Greeble expert: “basic” level areas and the “expert” (FFA) area.
• The expert area must use features that distinguish similar-looking inputs -- that’s what makes it an expert.
• Perhaps these features will be useful for other fine-level discrimination tasks.
• We will create:
• Basic level models, trained to identify an object’s class
• Expert level models, trained to identify individual objects
• Then we will put them in a race to become Greeble experts.
• Then we can deconstruct the winner to see why it won.
Sugimoto & Cottrell (2001), Proceedings of the Cognitive Science Society
![Page 78: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/78.jpg)
Model Database
• A network that can differentiate faces, books, cups and cans is a “basic level network.”
• A network that can also differentiate individuals within ONE class (faces, cups, cans OR books) is an “expert.”
![Page 79: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/79.jpg)
Model
• Pretrain two groups of neural networks on different tasks.
• Compare their abilities to learn a new individual Greeble classification task.
[Diagram: A hidden layer feeding either expert outputs (cup, can, book, Bob, Ted, Carol) or non-expert outputs (cup, can, book, face); both groups are then retrained to output Greeble1, Greeble2, Greeble3.]
![Page 80: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/80.jpg)
Expertise begets expertise
• Learning to individuate cups, cans, books, or faces first, leads to faster learning of Greebles (can’t try this with kids!!!).
• The more expertise, the faster the learning of the new task!
• Hence in a competition with the object area, FFA would win.
• If our parents were cans, the FCA (Fusiform Can Area) would win.
[Figure: Amount of training required to become a Greeble expert, plotted against training time on the first task.]
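The race described on this slide can be sketched as a toy transfer experiment. Everything here is a stand-in: TinyNet, the random 20-dimensional “stimuli,” and all layer sizes are hypothetical, and nothing guarantees the expert wins on this toy data; the point is the pretrain-then-retrain protocol:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

class TinyNet:
    """One-hidden-layer network trained by full-batch gradient descent."""
    def __init__(self, n_in, n_hid, n_out):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hid))
        self.W2 = rng.normal(0.0, 0.5, (n_hid, n_out))
    def forward(self, X):
        self.H = sigmoid(X @ self.W1)
        return sigmoid(self.H @ self.W2)
    def train(self, X, Y, lr=0.5, max_epochs=5000, criterion=0.05):
        for epoch in range(1, max_epochs + 1):
            P = self.forward(X)
            err = P - Y
            if np.mean(err ** 2) < criterion:
                return epoch              # epochs to reach criterion
            d_out = err * P * (1 - P)     # squared-error + sigmoid gradient
            d_hid = (d_out @ self.W2.T) * self.H * (1 - self.H)
            self.W2 -= lr * (self.H.T @ d_out)
            self.W1 -= lr * (X.T @ d_hid)
        return max_epochs
    def new_task(self, n_out):
        # keep the learned hidden features; only the output task changes
        self.W2 = rng.normal(0.0, 0.5, (self.W2.shape[0], n_out))

# Pretraining stimuli (stand-ins for faces etc.): 12 random 20-d patterns
X_pre = rng.normal(0.0, 1.0, (12, 20))
expert = TinyNet(20, 15, 12)
expert.train(X_pre, np.eye(12))                  # individuate: expert task
basic = TinyNet(20, 15, 3)
basic.train(X_pre, np.repeat(np.eye(3), 4, 0))   # 3 coarse classes: basic task

# The race: new "Greeble" stimuli, individual-level labels
X_greeble = rng.normal(0.0, 1.0, (12, 20))
expert.new_task(12)
basic.new_task(12)
expert_epochs = expert.train(X_greeble, np.eye(12))
basic_epochs = basic.train(X_greeble, np.eye(12))
```

Comparing `expert_epochs` to `basic_epochs` across many runs and pretraining lengths is the model analogue of the “expertise begets expertise” curve above.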
![Page 81: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/81.jpg)
Entry Level Shift: Subordinate RT decreases with training
(RT = uncertainty of response = 1.0 - max(output))
[Figure: Human data and network data; RT for subordinate (---) vs. basic responses, plotted against number of training sessions.]
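The RT proxy used here, RT = 1.0 - max(output), is one line of code (function name hypothetical):

```python
import numpy as np

def network_rt(output):
    """Model 'reaction time' as response uncertainty: 1.0 - max(output).

    A confident response (one category unit near 1) gives a low RT;
    an uncertain, spread-out response gives a high RT.
    """
    return 1.0 - float(np.max(output))

# a confident subordinate response is "faster" than an uncertain one
confident = network_rt([0.05, 0.9, 0.05])
uncertain = network_rt([0.4, 0.35, 0.25])
```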
![Page 82: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/82.jpg)
How do experts learn the task?
• Expert level networks must be sensitive to within-class variation:
• Representations must amplify small differences
• Basic level networks must ignore within-class variation:
• Representations should reduce differences
![Page 83: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/83.jpg)
Observing hidden layer representations
• Principal Components Analysis on hidden unit activations:
• PCA of hidden unit activations allows us to reduce the dimensionality (to 2) and plot representations.
• We can then observe how tightly clustered stimuli are in a low-dimensional subspace.
• We expect basic level networks to separate classes, but not individuals.
• We expect expert networks to separate classes and individuals.
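A minimal version of this analysis, assuming the hidden activations have been collected into a matrix (function name is hypothetical):

```python
import numpy as np

def hidden_pca_2d(H):
    """Project hidden-unit activations onto their first two principal
    components for plotting cluster structure.

    H: (n_stimuli, n_hidden) activation matrix.
    Returns (n_stimuli, 2) PC scores.
    """
    Hc = H - H.mean(axis=0)                        # center each hidden unit
    _, _, Vt = np.linalg.svd(Hc, full_matrices=False)
    return Hc @ Vt[:2].T                           # scores on first 2 PCs
```

The within-class scatter of the returned coordinates can then be compared between basic and expert networks.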
![Page 84: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/84.jpg)
Subordinate level training magnifies small differences within object representations
[Figure: hidden-representation scatter plots for the Face, Basic, and Greeble networks at 1, 80, and 1280 epochs of training.]
![Page 85: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/85.jpg)
Greeble representations are spread out prior to Greeble training
[Figure: Greeble stimuli plotted in the hidden space of the Face, Basic, and Greeble networks before Greeble training.]
![Page 86: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/86.jpg)
Variability Decreases Learning Time
[Figure: Greeble learning time plotted against Greeble variance prior to learning Greebles (r = -0.834).]
![Page 87: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/87.jpg)
Examining the Net’s Representations
• We want to visualize “receptive fields” in the network.• But the Gabor magnitude representation is noninvertible.• We can learn an approximate inverse mapping, however.• We used linear regression to find the best linear
combination of Gabor magnitude principal components for each image pixel.
• Then projecting each hidden unit’s weight vector into image space with the same mapping visualizes its “receptive field.”
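A sketch of this two-step procedure with plain least squares (names and shapes are hypothetical; the original used Gabor magnitude principal components as the features F):

```python
import numpy as np

def fit_inverse_map(F, I):
    """Least-squares map from feature space back to pixel space.

    The Gabor-magnitude representation is noninvertible, so we fit an
    approximate linear inverse: find M minimizing ||F @ M - I||^2.
    F: (n_images, n_features) features; I: (n_images, n_pixels) images.
    """
    M, *_ = np.linalg.lstsq(F, I, rcond=None)
    return M

def receptive_field(weights, M, shape):
    """Visualize a hidden unit: push its weight vector through the
    approximate inverse map into image space."""
    return (np.asarray(weights) @ M).reshape(shape)
```

On synthetic data generated by an exactly linear map, `fit_inverse_map` recovers that map; on real Gabor features the result is only an approximation, which is all the visualization needs.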
![Page 88: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/88.jpg)
Two hidden unit receptive fields (HU 16 and HU 36), after training as a face expert and after further training on Greebles.
NOTE: These are not face-specific!
![Page 89: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/89.jpg)
Controlling for the number of classes
• We obtained 13 classes from hemera.com:
• 10 of these are learned at the basic level.
• 10 faces, each with 8 expressions, make the expert task.
• 3 (lamps, ships, swords) are used for the novel expertise task.
![Page 90: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/90.jpg)
Results: Pre-training
• New initial tasks of similar difficulty: in previous work, the basic level task was much easier.
• These are the learning curves for the 10 object classes and the 10 faces.
![Page 91: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/91.jpg)
Results
• As before, experts still learned new expert level tasks faster.
[Figure: Number of epochs to learn swords after learning faces or objects, plotted against number of training epochs on faces or objects.]
![Page 92: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/92.jpg)
Outline
• Why would a face area process BMW’s?
• Why this model is wrong…
![Page 93: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/93.jpg)
background
• I have a model of face and object processing that accounts for a lot of data…
![Page 94: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/94.jpg)
The Face and Object Processing System
[Diagram: Pixel (Retina) Level → Perceptual (V1) Level (Gabor filtering) → Gestalt Level (PCA) → Hidden Layer → Category Level. The hidden layer feeds two classifiers: a Basic Level Classifier (book, can, cup, face; LOC) and an Expert Level Classifier (book, can, cup, face, Bob, Sue, Jane; FFA).]
![Page 95: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/95.jpg)
Effects accounted for by this model
• Categorical effects in facial expression recognition (Dailey et al., 2002)
• Similarity effects in facial expression recognition (ibid)
• Why fear is “hard” (ibid)
• Rank of difficulty in configural, featural, and inverted face discrimination (Zhang & Cottrell, 2006)
• Holistic processing of identity and expression, and the lack of interaction between the two (Cottrell et al., 2000)
• How a specialized face processor may develop under realistic developmental constraints (Dailey & Cottrell, 1999)
• How the FFA could be recruited for other areas of expertise (Sugimoto & Cottrell, 2001; Joyce & Cottrell, 2004; Tran et al., 2004; Tong et al., 2005)
• The other-race advantage in visual search (Haque & Cottrell, 2005)
• Generalization gradients in expertise (Nguyen & Cottrell, 2005)
• Priming effects (face or character) on discrimination of ambiguous Chinese characters (McCleery et al., under review)
• Age of acquisition effects in face recognition (Lake & Cottrell, 2005; L&C, under review)
• Memory for faces (Dailey et al., 1998, 1999)
• Cultural effects in facial expression recognition (Dailey et al., in preparation)
![Page 96: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/96.jpg)
background
• I have a model of face and object processing that accounts for a lot of data…
• But it is fundamentally wrong in an important way!
![Page 97: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/97.jpg)
background
• I have a model of face and object processing that accounts for a lot of data…
• But it is fundamentally wrong in an important way!
• It assumes that the whole face is processed to the same level of detail everywhere….
![Page 98: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/98.jpg)
background
• I have a model of face and object processing that accounts for a lot of data…
• But it is fundamentally wrong in an important way!
• It assumes that the whole face is processed to the same level of detail everywhere….and that the face is always aligned with “the brain”
![Page 99: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/99.jpg)
background
• Instead, people and animals actively sample the world around them with eye movements.
• The images our eyes take in “slosh around” on the retina like an MTV video.
• We only get patches of information from any fixation.
• But somehow (see previous talk), we perceive a stable visual world.
![Page 100: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/100.jpg)
background
• So, a model of visual processing needs to be dynamically integrating information over time.
• We need to decide where to look at any moment:
• There are both bottom-up (inherently interesting things) and top-down (task demands) influences on this.
![Page 101: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/101.jpg)
Questions for today
• What form should a model take in order to model eye movements without getting into muscle control?
• Do people actually gain information from saccades in face processing?
• How is the information integrated?
• How do features change over time?
![Page 102: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/102.jpg)
answers
• What form should a model take in order to model eye movements without getting into muscle control? Probabilistic.
• Do people actually gain information from saccades? Yes.
• How is the information integrated? In a Bayesian way.
• How do features change over time? We don’t know yet, but we hypothesize that the effective receptive fields grow over time.
![Page 103: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/103.jpg)
This (brief!) part of the talk
• What form should a model take in order to model eye movements without getting into muscle control? Probabilistic.
• Do people actually gain information from saccades? Yes.
• How is the information integrated? In a Bayesian way: NIMBLE.
• How do features change over time? We don’t know yet, but we hypothesize that the effective receptive fields grow over time.
![Page 104: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/104.jpg)
Where to look?
• We have lots of ideas about this (but, obviously, none as good as Baldi’s surprise!), but for the purposes of this talk, we will just use bottom up salience:
• Figure out where the image is “interesting,” and look there
• “interesting” = where there are areas of high contour
![Page 105: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/105.jpg)
Interest points created using Gabor filter variance
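A rough sketch of this kind of salience map. As a stand-in for the full Gabor bank, “response” here is just gradient magnitude, and salience is its local variance (the function name, window size, and k are all illustrative):

```python
import numpy as np

def interest_points(img, patch=5, k=10):
    """Fixation candidates where local filter-response variance is high.

    Stand-in for the slide's Gabor-bank salience: 'response' is gradient
    magnitude, and salience is its variance over a patch x patch window;
    the k most salient pixel locations are returned, best first.
    """
    gy, gx = np.gradient(img.astype(float))
    resp = np.hypot(gx, gy)                      # contour strength
    r = patch // 2
    sal = np.zeros_like(resp)
    for y in range(r, resp.shape[0] - r):
        for x in range(r, resp.shape[1] - r):
            sal[y, x] = resp[y - r:y + r + 1, x - r:x + r + 1].var()
    flat = np.argsort(sal, axis=None)[::-1][:k]  # most salient first
    return [tuple(int(v) for v in np.unravel_index(i, sal.shape))
            for i in flat]
```

On a simple image, the returned points cluster on high-contour regions (edges), which is the behavior the slide's interest operator is after.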
![Page 106: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/106.jpg)
Given those fragments,how do we recognize people?
• The basic idea is that we simply store the fragments with labels:
• Bob, Carol, Ted or Alice
• In the study set / not in the study set
(Note that we have cast face recognition as a categorization problem.)
• When we look at a new face, we look at the labels of the most similar fragments, and vote, using kernel density estimation, comparing a Gaussian kernel and 1-nearest neighbor.
• This can be done easily in a (Naïve) Bayesian framework
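A minimal sketch of such a fragment memory, with a Gaussian kernel density per label and naive-Bayes combination across fixations (the class name, sigma, and the unnormalized density are illustrative choices, not NIMBLE's actual implementation):

```python
import numpy as np

class FragmentMemory:
    """Store labeled fixation fragments; classify by per-label kernel
    density estimation, combined naive-Bayes style across fixations."""

    def __init__(self, sigma=1.0):
        self.sigma = sigma
        self.store = {}                       # label -> list of fragments

    def study(self, fragment, label):
        self.store.setdefault(label, []).append(np.asarray(fragment, float))

    def log_density(self, fragment, label):
        X = np.stack(self.store[label])
        d2 = np.sum((X - np.asarray(fragment, float)) ** 2, axis=1)
        # Gaussian-kernel density estimate (unnormalized: fine for argmax)
        return np.log(np.mean(np.exp(-d2 / (2 * self.sigma ** 2))) + 1e-300)

    def classify(self, fragments):
        # naive Bayes: fragments treated as independent, log-densities add
        scores = {lab: sum(self.log_density(f, lab) for f in fragments)
                  for lab in self.store}
        return max(scores, key=scores.get)
```

Replacing the mean over kernel values with the single largest value gives the 1-nearest-neighbor variant the slide mentions.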
![Page 107: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/107.jpg)
A basic test: Face Recognition
• Following Duchaine & Nakayama (2005):
• Images are displayed for 3 seconds, which means about 10 fixations, which means about 10 fragments are stored, chosen at random using bottom-up salience
• 30 lures
• Normal human ROC area (A’) is from 0.9 to 1.0
• NIMBLE results:
![Page 108: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/108.jpg)
Multi-Class Memory Tasks
• An identification task tests NIMBLE on novel images from multiple classes
• Face identification:
• 29 identities from the FERET dataset• Changes in expression and illumination
• Object identification:
• 20 object classes from the COIL-100 dataset• Changes in orientation, scale and illumination
![Page 109: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/109.jpg)
Identification Results
• [Belongie, Malik & Puzicha, 2002] report 97.6% accuracy on a 20-class, 4 training image set from COIL-100
![Page 110: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/110.jpg)
Do people really need more than one fixation to recognize a face?
• To answer this question, we ran a simple face recognition experiment:
• Subjects saw 32 faces, then were given an old/new recognition task with those 32 faces and 32 more.
• However, subjects were restricted to 1, 2, 3, or unlimited fixations on the face.
![Page 111: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/111.jpg)
• Using a gaze-contingent display, we can limit the fixations on the face itself to 1, 2, 3, or unlimited fixations
• The average face mask prevents subjects from acquiring information about the face from peripheral vision
![Page 112: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/112.jpg)
Fixations during training
![Page 113: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/113.jpg)
Fixations during testing
![Page 114: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/114.jpg)
Overall Data
• Overall fixation maps during training
1st Fixation 2nd Fixation 3rd Fixation 4+ Fixations Overall
![Page 115: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/115.jpg)
Overall Data
• Overall fixation maps during testing
1st Fixation 2nd Fixation 3rd Fixation 4+ Fixations Overall
![Page 116: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/116.jpg)
Results
• This says you gain the most recognition benefit by making two fixations; a third doesn’t help.
• Hence, if a picture is worth a thousand words, each fixation is worth about 500 words!
![Page 117: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/117.jpg)
Long term questions
• Do people have visual routines for sampling from stimuli?
• Do people have different routines for different tasks on identical stimuli?
• Do the routines change depending on the information gained at any particular location?
• Do different people have different routines?
• How do those routines change with expertise?
• How do they change over the acquisition of expertise?
![Page 118: What can computational models tell us about face processing?](https://reader030.vdocuments.mx/reader030/viewer/2022032612/5681334e550346895d9a5803/html5/thumbnails/118.jpg)
END