Talk #3: EMPATH
A Neural Network Model of Human Facial Expression Recognition

Gary Cottrell
Joint work with Matt Dailey, Curt Padgett, and Ralph Adolphs
(Based on previous work with Mike Fleming and Janet Metcalfe)


Post on 20-Jan-2016


Why model?

• Models rush in where theories fear to tread.

• Models can be manipulated in ways people cannot.

• Models can be analyzed in ways people cannot.

International Cognitive Science Summer School, Sofia, 2015

Page 3: Talk #3: EMPATH A Neural Network Model of Human Facial Expression Recognition Gary Cottrell Joint work with Matt Dailey, Curt Padgett, and Ralph Adolphs

How (I like) to build Cognitive Models

• Start with an area with lots of data and lots of controversy about what the data means.

• I like to be able to relate them to the brain, so “neurally plausible” models are preferred -- neural nets.

• The model should be a working model of the actual task, rather than a cartoon version of it.

• Of course, the model should nevertheless be simplifying (i.e. it should be constrained to the essential features of the problem at hand):
    • Do we really need to model the (supposed) translation invariance and size invariance of biological perception?
    • As far as I can tell, NO!

• Then, take the model “as is” and fit the experimental data: 0 fitting parameters is preferred over 1, 2, or 3.


The other way (I like) to build Cognitive Models

• Same as above, except:
    • Use them as exploratory models -- in domains where there is little direct data (e.g. no single cell recordings in infants or undergraduates) to suggest what we might find if we could get the data. These can then serve as “intuition pumps.”

• Examples:
    • Why we might get specialized face processors
    • Why those face processors get recruited for other tasks


The Issue: Are Similarity and Categorization Two Sides of the Same Coin?

• Some researchers believe perception of facial expressions is a new example of categorical perception:
    • Like the colors of a rainbow, the brain separates expressions into discrete categories, with:
        • Sharp boundaries between expressions, and…
        • Higher discrimination of faces near those boundaries.


Categorical Perception Requirement 1: Sharp boundaries between Categories

[Figure: category labels SAD and FEAR]


Categorical Perception Requirement 2: Better discrimination near category boundaries

It is more difficult to tell these two apart:

[Figure: first image pair]

than it is to tell these two apart:

[Figure: second image pair]

[Figure labels: Hard (first pair), Easy (second pair)]


The Issue: Are Similarity and Categorization Two Sides of the Same Coin?

• Some researchers believe the underlying representation of facial expressions is NOT discrete:
    • There are two (or three) underlying dimensions, e.g., intensity and valence (found by MDS).
    • Our perception of expressive faces induces a similarity structure that results in a circle in this space.


The Issue: Are Similarity and Categorization Two Sides of the Same Coin?

The data supporting the continuous, underlying-dimensions view:


Outline

• An overview of our facial expression recognition system.
• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.
• The data from “Facial Expression Megamix”
• How our model accounts for the “categorical” data
• How our model accounts for the “two-dimensional” data
• Discussion
• Conclusions


The Face Processing System

[Figure: processing pipeline. Pixel (Retina) Level → Gabor Filtering → Perceptual (V1) Level → PCA → Object (IT) Level → Neural Net → Category Level. Category outputs: Happy, Sad, Afraid, Angry, Surprised, Disgusted.]


The Face Processing System

[Figure: the same pipeline, now with identity outputs at the Category Level: Bob, Carol, Ted, Alice.]


The Face Processing System

[Figure: the same pipeline, with Category Level outputs Bob, Carol, Ted, Cup, Can, Book, and an added label “Feature level”.]


The Face Processing System

[Figure: the same pipeline, with two PCA branches labeled LSF and HSF between the Perceptual (V1) Level and the Object (IT) Level. Category Level outputs: Bob, Carol, Ted, Cup, Can, Book.]


The Gabor Filter Layer

• Basic feature: the 2-D Gabor wavelet filter (Daugman, 85):
• These model the processing in early visual areas

[Figure: image * Gabor filters (convolution) → magnitudes → subsample in a 29x36 grid]
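
The three steps named on this slide (convolution, magnitudes, subsampling) can be sketched roughly as follows. This is a minimal NumPy illustration, not the model's actual code: the kernel size, wavelengths, orientations, `sigma`, the toy image, and the function names `gabor_kernel` and `gabor_magnitudes` are all assumptions. Only the 29x36 grid comes from the slide.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex 2-D Gabor: Gaussian envelope times an oriented complex sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)        # coordinate along the wave
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * xr / wavelength)
    return envelope * carrier

def gabor_magnitudes(image, kernels, grid=(29, 36)):
    """Convolve with each kernel, take magnitudes, subsample on a regular grid."""
    height, width = image.shape
    rows = np.linspace(0, height - 1, grid[0]).round().astype(int)
    cols = np.linspace(0, width - 1, grid[1]).round().astype(int)
    image_f = np.fft.fft2(image)
    feats = []
    for kernel in kernels:
        # FFT product = circular convolution (wrap-around borders; fine for a sketch).
        response = np.fft.ifft2(image_f * np.fft.fft2(kernel, s=image.shape))
        magnitude = np.abs(response)
        feats.append(magnitude[np.ix_(rows, cols)].ravel())
    return np.concatenate(feats)

rng = np.random.default_rng(0)
image = rng.random((64, 72))                          # toy stand-in for a face image
bank = [gabor_kernel(15, wl, th, sigma=wl / 2.0)      # hypothetical small filter bank
        for wl in (4.0, 8.0)
        for th in np.arange(4) * np.pi / 4]
feats = gabor_magnitudes(image, bank)
```

Taking the magnitude of the complex response discards phase, which makes the features less sensitive to small shifts of the image.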


Principal Components Analysis

• The Gabor filters give us 40,600 numbers
• We use PCA to reduce this to 50 numbers
• PCA is like Factor Analysis: It finds the underlying directions of Maximum Variance
• PCA can be computed in a neural network through a competitive Hebbian learning mechanism
• Hence this is also a biologically plausible processing step
• We suggest this leads to representations similar to those in Inferior Temporal cortex
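
For reference, the reduction to 50 numbers can be sketched with an ordinary batch PCA. Note that this uses a singular value decomposition rather than the Hebbian network the slide mentions, and the toy data sizes and the names `fit_pca` and `project` are assumptions for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean and the top principal directions (one per row)."""
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Coordinates of each row of X along the principal directions."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 400))        # toy stand-in for Gabor magnitude vectors
mean, components = fit_pca(X, n_components=50)
Z = project(X, mean, components)          # 50 numbers per input
```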


How to do PCA with a neural network

(Cottrell, Munro & Zipser, 1987; Cottrell & Fleming 1990; Cottrell & Metcalfe 1990; O’Toole et al. 1991)

A self-organizing network that learns whole-object representations (features, Principal Components, Holons, eigenfaces)

[Figure: network diagram. Input from Perceptual Layer → Holons (Gestalt layer)]
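
One well-known Hebbian-style rule whose weight vector converges to the first principal component is Oja's rule. The sketch below uses it as a stand-in to show the idea of PCA emerging from local learning. It is not claimed to be the network from the papers cited above, and the toy data, learning rate, and number of passes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy zero-mean data with one dominant direction of variance (an assumption).
basis, _ = np.linalg.qr(rng.standard_normal((3, 3)))
X = (rng.standard_normal((5000, 3)) * np.array([3.0, 1.0, 0.5])) @ basis.T

w = rng.standard_normal(3)
w /= np.linalg.norm(w)
lr = 0.002
for _ in range(5):                     # a few passes over the data
    for x in X:
        y = w @ x                      # output of a single linear unit
        w += lr * y * (x - y * w)      # Hebbian term y*x plus Oja's decay term

# Compare with the leading eigenvector of the sample covariance.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
top = eigvecs[:, -1]                   # eigh sorts eigenvalues in ascending order
cosine = abs(w @ top) / np.linalg.norm(w)
```

A `cosine` close to 1 means the learned weights point along the first principal direction (up to sign).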


Holons

• They act like face cells (Desimone, 1991):
    • Response of single units is strong despite occluding eyes, e.g.
    • Response drops off with rotation
    • Some fire to my dog’s face


Holons

• A novel representation: Distributed templates --
    • each unit’s optimal stimulus is a ghostly looking face (template-like),
    • but many units participate in the representation of a single face (distributed).


Holons

• They can explain holistic processing:

Why? If stimulated with a partial match, the firing represents votes for this template:

Units “downstream” don’t know what caused this unit to fire.


The Final Layer: Classification

(Cottrell & Fleming 1990; Cottrell & Metcalfe 1990; Padgett & Cottrell 1996; Dailey & Cottrell, 1999; Dailey et al. 2002)

The holistic representation is then used as input to a categorization network trained by supervised learning.

• Excellent generalization performance demonstrates the sufficiency of the holistic representation for recognition

[Figure: network diagram. Input from Perceptual Layer → Holons → Categories. Output: Cup, Can, Book, Greeble, Face, Bob, Carol, Ted, Happy, Sad, Afraid, etc.]


The Final Layer: Classification

• Categories can be at different levels: basic, subordinate.
• Simple learning rule (~delta rule). It says (mild lie here):
    • add inputs to your weights (synaptic strengths) when you are supposed to be on,
    • subtract them when you are supposed to be off.
• This makes your weights “look like” your favorite patterns – the ones that turn you on.
• When no hidden units => No back propagation of error.
• When hidden units: we get task-specific features (most interesting when we use the basic/subordinate distinction)
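
A minimal sketch of the delta rule for a network with no hidden units is given below. The "mild lie" is visible in the code: inputs are added or subtracted in proportion to the error (target minus output), not wholesale. The sigmoid units, toy data, learning rate, and the name `train_delta_rule` are assumptions for illustration.

```python
import numpy as np

def train_delta_rule(X, T, lr=0.1, epochs=200):
    """Train a single layer of sigmoid units with the delta rule."""
    W = np.zeros((X.shape[1], T.shape[1]))
    b = np.zeros(T.shape[1])
    for _ in range(epochs):
        Y = 1.0 / (1.0 + np.exp(-(X @ W + b)))
        err = T - Y                       # positive when a unit should be more "on"
        W += lr * X.T @ err / len(X)      # add inputs where err > 0, subtract where err < 0
        b += lr * err.mean(axis=0)
    return W, b

rng = np.random.default_rng(0)
centers = np.array([[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]])   # three toy categories
labels = np.repeat(np.arange(3), 30)
X = centers[labels] + 0.5 * rng.standard_normal((90, 2))
T = np.eye(3)[labels]                     # one output unit per category
W, b = train_delta_rule(X, T)
pred = np.argmax(X @ W + b, axis=1)
accuracy = (pred == labels).mean()
```

After training, each column of `W` points toward the inputs that should turn that unit on, which is the sense in which the weights come to "look like" the unit's favorite patterns.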


Facial Expression Database

• Janet Metcalfe and I collected pictures of UCSD Psychology Sophomores making facial expressions.
• We told them to “look happy”, “look sad”, etc.
• Unfortunately, we got the Irish Setter Effect…


The Many Moods of an Irish Setter (Gary Larson)


This happens in other species as well…


Facial Expression Database

• Ekman and Friesen quantified muscle movements (Facial Actions) involved in prototypical portrayals of happiness, sadness, fear, anger, surprise, and disgust.
• Result: the Pictures of Facial Affect Database (1976).
• 70% agreement on emotional content by naive human subjects.
• 110 images, 14 subjects, 7 expressions.

[Figure: Anger, Disgust, Neutral, Surprise, Happiness (twice), Fear, and Sadness. This is actor “JJ”: the easiest for humans (and our model) to classify.]


Results (Generalization)

• Kendall’s tau (rank order correlation): .667, p=.0441

• Note: This is an emergent property of the model!

Expression    Network % Correct    Human % Agreement
Happiness     100.0%               98.7%
Surprise      100.0%               92.4%
Disgust       100.0%               92.3%
Anger         89.2%                88.9%
Sadness       82.9%                89.2%
Fear          66.7%                87.7%
Average       89.9%                91.6%
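
As a worked check, the rank correlation can be recomputed from the six expression rows of the table (the Average row is left out). The sketch assumes the simplest convention, in which tied pairs count as neither concordant nor discordant and the denominator is all 15 pairs; that convention is an assumption, though it does give .667. The p-value is not recomputed here.

```python
from itertools import combinations

network = [100.0, 100.0, 100.0, 89.2, 82.9, 66.7]   # Network % Correct
human = [98.7, 92.4, 92.3, 88.9, 89.2, 87.7]         # Human % Agreement

concordant = discordant = 0
for i, j in combinations(range(len(network)), 2):
    sign = (network[i] - network[j]) * (human[i] - human[j])
    if sign > 0:
        concordant += 1        # both orderings agree on this pair
    elif sign < 0:
        discordant += 1        # the orderings disagree (Anger vs. Sadness)
    # sign == 0: tied in the network column, counted as neither

n_pairs = len(network) * (len(network) - 1) // 2
tau = (concordant - discordant) / n_pairs
print(concordant, discordant, round(tau, 3))   # 11 1 0.667
```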


Correlation of Net/Human Errors

• Like all good Cognitive Scientists, we like our models to make the same mistakes people do!

• Networks and humans have a 6x6 confusion matrix for the stimulus set.

• This suggests looking at the off-diagonal terms: The errors

• Correlation of off-diagonal terms: r = 0.567. [F (1,28) = 13.3; p = 0.0011]

• Again, this correlation is an emergent property of the model: It was not told which expressions were confusing.
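
The off-diagonal comparison can be sketched as below. A 6x6 matrix has 30 off-diagonal cells, which matches the 28 degrees of freedom in the F statistic above. The two matrices here are random placeholders, not the actual human or network data, and `off_diagonal_correlation` is a hypothetical helper name.

```python
import numpy as np

def off_diagonal_correlation(conf_a, conf_b):
    """Pearson r between the off-diagonal (error) cells of two square matrices."""
    mask = ~np.eye(conf_a.shape[0], dtype=bool)
    return np.corrcoef(conf_a[mask], conf_b[mask])[0, 1]

rng = np.random.default_rng(0)
human_conf = rng.random((6, 6))                               # placeholder, not real data
model_conf = human_conf + 0.3 * rng.standard_normal((6, 6))   # placeholder, not real data
n_errors = int((~np.eye(6, dtype=bool)).sum())                # number of error cells
r = off_diagonal_correlation(human_conf, model_conf)
```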


Examining the Net’s Representations

• We want to visualize “receptive fields” in the network.
• But the Gabor magnitude representation is noninvertible.
• We can learn an approximate inverse mapping, however.
• We used linear regression to find the best linear combination of Gabor magnitude principal components for each image pixel.

• Then projecting each unit’s weight vector into image space with the same mapping visualizes its “receptive field.”


Examining the Net’s Representations

• The “y-intercept” coefficient for each pixel is simply the average pixel value at that location over all faces, so subtracting the resulting “average face” shows more precisely what the units attend to:

• Apparently local features appear in the global templates.
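
The regression-based visualization can be sketched as follows, assuming the principal component projections are zero-mean (which is what makes the intercept equal the average face). All sizes, the random stand-in data, and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_pcs, n_pixels = 80, 10, 64          # toy sizes (assumptions)
pixels = rng.random((n_faces, n_pixels))        # stand-in for face images, one per row
pcs = rng.standard_normal((n_faces, n_pcs))     # stand-in for PCA projections
pcs -= pcs.mean(axis=0)                         # PCA projections are zero-mean

# Least-squares fit of pixels ~ intercept + pcs @ coef, for all pixels at once.
design = np.hstack([np.ones((n_faces, 1)), pcs])
solution, *_ = np.linalg.lstsq(design, pixels, rcond=None)
intercept, coef = solution[0], solution[1:]

# Project a (hypothetical) unit's weight vector over the PCs into image space.
unit_weights = rng.standard_normal(n_pcs)
with_average = intercept + unit_weights @ coef  # includes the "average face"
receptive_field = unit_weights @ coef           # average face subtracted
```

With zero-mean regressors the fitted intercept for each pixel is that pixel's mean over all faces, so dropping it is the same as subtracting the average face.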


Morph Transition Perception

• Morphs help psychologists study categorization behavior in humans
• Example: JJ Fear to Sadness morph:

[Figure: morph sequence images at 0%, 10%, 30%, 50%, 70%, 90%, 100%]

Young et al. (1997) Megamix: presented images from morphs of all 6 emotions (15 sequences) to subjects in random order, task is 6-way forced choice button push.


Young et al. (1997) Megamix: What was the point?

They presented images from morphs of all 6 emotions (15 sequences) to subjects in random order, task is 6-way forced choice button push (6-AFC).

Why?

They reasoned that if the dimensional theory was correct, then as they morphed from one expression to another, nearby emotions would “intrude.”


E.g., a morph from Fear to Happy would “pass by” Surprise, so we should see Surprise responses along the morph.

Again, as we will see, this will demonstrate that theories are hard to reason about.


In other words, every experiment they did was designed to try to distinguish between:

- the discrete, categorical perception theory,

and

- the continuous, underlying dimensions theory.


Megamix Results: classical Categorical Perception: sharp boundaries…

…and higher discrimination of pairs of images when they cross a perceived category boundary

[Figure: panels labeled 6-WAY ALTERNATIVE FORCED CHOICE and PERCENT CORRECT DISCRIMINATION. Axis labels: Happiness, Surprise, Fear, Sadness, Disgust, Anger, Happiness.]


Megamix Results: Non-categorical RTs

“Scalloped” Reaction Times, suggesting sensitivity to boundaries

[Figure: labels REACTION TIME and BUTTON PUSH. Axis labels: Happiness, Surprise, Fear, Sadness, Disgust, Anger, Happiness.]


Results: More non-categorical effects

Young et al. also had subjects rate the 1st, 2nd, and 3rd most apparent emotion.

At the 30/70 morph level, subjects were above chance at detecting the mixed-in emotion. These data are more consistent with continuous theories of emotion.


Modeling Megamix

• 1 trained neural network = 1 human subject.

• We used 50 networks, 7 random examples of each expression for training, remainder for holdout.


Modeling Megamix: How to map neural net responses to behavioral measures

• Percent button push for each image = average of network outputs

• Response time = uncertainty of maximal output (1.0 - ymax).

• Mixed-in expression detection: record 1st, 2nd, 3rd largest outputs.
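
These three mappings can be sketched for a single stimulus as below. The outputs are random placeholders standing in for 50 trained networks. Averaging the response-time measure over networks and ranking the averaged outputs (rather than ranking per network) are assumptions, since the slide does not spell out either detail.

```python
import numpy as np

EXPRESSIONS = ["Happy", "Sad", "Afraid", "Angry", "Surprised", "Disgusted"]

rng = np.random.default_rng(0)
# Placeholder outputs of 50 networks for one image (rows: networks, columns: expressions).
outputs = rng.random((50, 6))

identification = outputs.mean(axis=0)           # "percent button push" per expression
model_rt = (1.0 - outputs.max(axis=1)).mean()   # 1.0 - ymax per network, then averaged
ranked = np.argsort(-identification)[:3]        # indices of 1st, 2nd, 3rd largest outputs
top_three = [EXPRESSIONS[i] for i in ranked]
```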


• Facial similarity:
    • We present a face to the network, and record the responses at different layers.
    • We do the same for a second face.
    • Then we compute the correlation between the responses to obtain how similar each layer considers the stimuli.

• We can then find the layer that best accounts for the data


• Face discrimination: We use (1 – correlation) of layer representations to represent how discriminable the stimuli are at that layer.

• We can then find the layer that best accounts for the data
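
The similarity and discrimination measures, and the choice of the best-fitting layer, can be sketched as follows. The layer responses and human scores are random placeholders, and the helper names (`similarity`, `discriminability`, `best_layer`) are hypothetical.

```python
import numpy as np

def similarity(resp_a, resp_b):
    """Correlation between one layer's responses to two faces."""
    return np.corrcoef(resp_a, resp_b)[0, 1]

def discriminability(resp_a, resp_b):
    """1 - correlation: larger means easier to tell apart at this layer."""
    return 1.0 - similarity(resp_a, resp_b)

def best_layer(model_scores, human_scores):
    """Name of the layer whose scores correlate most with the human data."""
    return max(model_scores,
               key=lambda name: np.corrcoef(model_scores[name], human_scores)[0, 1])

rng = np.random.default_rng(0)
face_a = rng.standard_normal(50)                  # placeholder layer response
face_b = face_a + 0.5 * rng.standard_normal(50)   # a similar second face
d = discriminability(face_a, face_b)

human = rng.random(20)                            # placeholder human scores
model_scores = {
    "output": rng.random(20),                               # unrelated to human scores
    "object": human + 0.1 * rng.standard_normal(20),        # tracks human scores
}
best = best_layer(model_scores, human)
```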


Modeling Megamix

• 1 trained neural network = 1 human subject.
• 50 networks, 7 random examples of each expression for training, remainder for holdout.
• Identification = average of network outputs
• Response time = uncertainty of maximal output (1.0 - ymax).
• Mixed-in expression detection: record 1st, 2nd, 3rd largest outputs.
• Similarity: correlation between layer responses for two faces
• Discrimination: 1 – correlation of layer representations
• We can then find the layer that best accounts for the data


Modeling Six-Way Forced Choice

• Overall correlation r=.9416, with NO FIT PARAMETERS!

[Figure: six-way forced-choice results; expression labels: Happiness, Surprise, Fear, Sadness, Disgust, Anger]


Model Discrimination Scores

• The model fits the data best at a precategorical layer: The layer we call the “object” layer; NOT at the category level

[Figure: percent correct discrimination; expression labels: Happiness, Surprise, Fear, Sadness, Disgust, Anger. Panels: human; model output layer (r=0.36); model object layer (r=0.61)]


Discrimination

• Classically, one requirement for “categorical perception” is higher discrimination of two stimuli at a fixed distance apart when those two stimuli cross a category boundary

• Indeed, Young et al. found in two kinds of tests that discrimination was highest at category boundaries.

• The result that we fit the data best at a layer before any categorization occurs is significant:
  • In some sense, the category boundaries are “in the data,” or at least, in our representation of the data.

• This result is probably not true in general - but it seems to be a property of facial expressions: So, were we evolved to make this easy?


Outline

• An overview of our facial expression recognition system.

• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.

• The data from “Facial Expression Megamix”

• How our model accounts for the “categorical” data

• How our model accounts for the “two-dimensional” data

• Discussion

• Conclusions


Reaction Time: Human/Model

Correlation between model & data: .6771, p<.001

[Figure: human subjects' reaction time and model reaction time (1 - MAX_OUTPUT); expression labels: Happiness, Surprise, Fear, Sadness, Disgust, Anger]


Mix Detection in the Model

Can the network account for the continuous data as well as the categorical data? YES.


Human/Model Circumplexes

These are derived from similarities between images using non-metric multi-dimensional scaling.

For humans: similarity is correlation between 6-way forced-choice button push.

For networks: similarity is correlation between 6-category output vectors.
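As a rough sketch of the input to the scaling step for the networks, the code below builds a pairwise dissimilarity matrix from 6-category output vectors. Converting similarity to dissimilarity as 1 - r is an assumption, the output vectors are made up, and the multi-dimensional scaling itself is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def dissimilarity_matrix(output_vectors):
    """Pairwise (1 - correlation) between 6-category output vectors."""
    n = len(output_vectors)
    return [[1.0 - pearson(output_vectors[i], output_vectors[j])
             for j in range(n)] for i in range(n)]

# Hypothetical output vectors for three images.
outputs = [
    [0.80, 0.10, 0.05, 0.02, 0.02, 0.01],
    [0.60, 0.30, 0.05, 0.02, 0.02, 0.01],
    [0.02, 0.30, 0.60, 0.05, 0.02, 0.01],
]

matrix = dissimilarity_matrix(outputs)
for row in matrix:
    print([round(value, 2) for value in row])
```

The same construction would apply to the human data by replacing each output vector with the distribution of button presses for that image.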


Outline

• An overview of our facial expression recognition system.

• The internal representation shows the model’s prototypical representations of Fear, Sadness, etc.

• The data from “Facial Expression Megamix”

• How our model accounts for the “categorical” data

• How our model accounts for the “two-dimensional” data

• Discussion

• Conclusions


Discussion

• Our model of facial expression recognition:
  • Performs the same task people do
  • On the same stimuli
  • At about the same accuracy

• Without actually “feeling” anything, without any access to the surrounding culture, it nevertheless:
  • Organizes the faces in the same order around the circumplex
  • Correlates very highly with human responses.
  • Has about the same rank order difficulty in classifying the emotions


Discussion

• The discrimination correlates with human results most accurately at a precategorization layer: The discrimination improvement at category boundaries is in the representation of data, not based on the categories.

• These results suggest that for expression recognition, the notion of “categorical perception” simply is not necessary to explain the data.

• Indeed, most of the data can be explained by the interaction between the similarity of the representations and the categories imposed on the data: Fear faces are similar to surprise faces in our representation – so they are near each other in the circumplex.


Conclusions

• The best models perform the same task people do

• Concepts such as “similarity” and “categorization” need to be understood in terms of models that do these tasks

• Our model simultaneously fits data supporting both categorical and continuous theories of emotion

• The fits, we believe, are due to the interaction of the way the categories slice up the space of facial expressions and the way facial expressions inherently resemble one another.

• It also suggests that the continuous theories are correct: “discrete categories” are not required to explain the data.

• We believe our results will easily generalize to other visual tasks, and other modalities.


END

for now…