Jay Turcot - Emotion AI Developer Day 2016
TRANSCRIPT
@affectiva
Metrics & How Affectiva Software Works
Jay Turcot, Director of Applied AI, Affectiva Inc. (@pjturcot)
Nov 16, 2016
Emotion AI Developer Day 2016
Outline
• Why the face? & FACS
• How the technology works
• Metrics and emotions
• How it's used
• Digging deeper
  • Computer vision pipeline
  • Static image vs. video analysis
  • Pose & luminance
Why the face?
• Spontaneous
• Real-time feedback
• Front-facing camera
• Transmits rich information
  • Emotional state
  • Intensity
• Human interpretable
Ekman and Friesen's Facial Action Coding System (FACS), 1978
• Codifies facial expressions
• Action Units (AUs)
  • Independent movements of the face
  • Numeric codes
  • Associated with specific facial muscles
  • 5 intensity ratings (A, B, C, D, E)
• Examples:
  • AU 1 – Inner Eyebrow Raise
  • AU 9 – Nose Wrinkle
FACS: example
• Common language:
  • "Not bad" meme
  • Frown (North America only!)
• FACS:
  • 15E+17E (decoded in the sketch below)
  • Lip Corner Depressor (AU 15)
  • Chin Raiser (AU 17)
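To make the coding concrete, here is a minimal sketch (plain Python, not Affectiva tooling) of how a FACS code like 15E+17E can be decoded. The AU names follow the slides; the dictionary layout and helper function are purely illustrative.

```python
# Minimal, illustrative decoding of FACS codes; not Affectiva tooling.
# AU names follow the slides; the data layout is one possible encoding.

ACTION_UNITS = {
    1: "Inner Eyebrow Raise",
    9: "Nose Wrinkle",
    15: "Lip Corner Depressor",
    17: "Chin Raiser",
}

# FACS intensity is scored from A (trace) to E (maximum).
INTENSITY = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

def describe(code: str) -> str:
    """Turn a code like '15E' into a readable description."""
    au, level = int(code[:-1]), code[-1]
    return f"{ACTION_UNITS[au]} (AU {au}), intensity {level} ({INTENSITY[level]}/5)"

# The "not bad" expression from the slide: 15E+17E
for code in "15E+17E".split("+"):
    print(describe(code))
```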
How it works
• Our computer vision algorithms identify key landmarks on the face
• Machine learning algorithms analyze pixels in those regions to classify facial expressions
• Combinations of facial expressions are mapped to emotions (the overall pipeline is sketched below)
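The three stages above can be pictured as a simple function pipeline. The sketch below is structural only: the function names and return shapes are assumptions made for illustration, not the Affectiva SDK API.

```python
# Structural sketch of the three-stage pipeline; function names and return
# shapes are assumptions for illustration, not the Affectiva SDK API.

def find_landmarks(frame):
    """Stage 1: computer vision locates key facial landmarks."""
    ...

def classify_expressions(frame, landmarks):
    """Stage 2: machine learning analyzes pixels around the landmarks and
    returns per-expression probabilities, e.g. {'smile': 0.93}."""
    ...

def map_to_emotions(expressions):
    """Stage 3: combinations of expressions are mapped to emotion metrics,
    e.g. {'joy': 0.91, 'valence': 0.8}."""
    ...

def process_frame(frame):
    landmarks = find_landmarks(frame)
    expressions = classify_expressions(frame, landmarks)
    return map_to_emotions(expressions)
```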
Nuanced facial expressions
Lip Suck, Lip Pucker, Lip Press, Smile, Lip Corner Depressor, Nose Wrinkle, Chin Raise, Mouth Open, Attention, Brow Furrow, Brow Raise, Inner Brow Raise, Eye Closure, Smirk, Upper Lip Raise
A range of emotions
• Valence: how positive or negative a person's facial expressions are
• Engagement: overall level and intensity of the emotion (one possible valence computation is sketched below)
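The talk does not give the exact formula for valence, but one way to picture it is as positive expressions pushing the score up and negative ones pulling it down. The expression names and weights below are made up for illustration; they are not Affectiva's actual weighting.

```python
# Hypothetical valence computation; the actual weighting used by the SDK
# is not given in the talk, so the expression names and weights are made up.

POSITIVE = {"smile": 1.0}
NEGATIVE = {"brow_furrow": 1.0, "lip_corner_depressor": 1.0, "nose_wrinkle": 1.0}

def valence(expressions: dict) -> float:
    """Map 0-100 expression scores to a rough [-100, 100] valence value."""
    pos = max((expressions.get(name, 0.0) * w for name, w in POSITIVE.items()),
              default=0.0)
    neg = max((expressions.get(name, 0.0) * w for name, w in NEGATIVE.items()),
              default=0.0)
    return pos - neg

print(valence({"smile": 80.0}))        # 80.0  -> positive
print(valence({"brow_furrow": 60.0}))  # -60.0 -> negative
```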
And emojis
Laughing, Smiley, Relaxed, Wink, Kissing, Stuck Out Tongue, Stuck Out Tongue and Winking Eye, Scream, Flushed, Smirk, Disappointed, Rage
A person's appearance
• Gender: identifies the human perception of gender expression
• Glasses: presence of eyeglasses or sunglasses
Emotion analytics using state-of-the-art computer vision and machine learning
See more at: blog.affectiva.com
How are they used?
• Trigger events based on facial expressions (see the sketch below)
• Build predictive models from raw data
• Analytics over time & across individuals
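As a concrete example of the first use case, the sketch below fires a callback on the rising edge of a smile metric. The metric name, the 0-100 scale, and the on_frame hook are assumptions, not a specific SDK interface.

```python
# Minimal sketch of "trigger events based on facial expressions": fire a
# callback when a metric crosses a threshold. The metric name, 0-100 scale,
# and on_frame hook are assumptions, not a specific SDK interface.

SMILE_THRESHOLD = 50.0

class SmileTrigger:
    def __init__(self, callback, threshold=SMILE_THRESHOLD):
        self.callback = callback
        self.threshold = threshold
        self.active = False

    def on_frame(self, metrics: dict):
        smiling = metrics.get("smile", 0.0) >= self.threshold
        if smiling and not self.active:
            self.callback()          # rising edge: smile just started
        self.active = smiling

trigger = SmileTrigger(lambda: print("User smiled!"))
trigger.on_frame({"smile": 10.0})    # no event
trigger.on_frame({"smile": 80.0})    # prints "User smiled!"
```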
Digging deeper
Face Detection
• Initial detection of a face in an image
• Position and scale (bounding box)
• In the SDK:
  • Near-frontal & upright faces
  • Looking for faces at multiple positions and multiple scales is time-consuming (illustrated in the sketch below)
  • For multi-face use, face detection needs to run periodically to scan for new individuals (until the maximum number of faces is reached)
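The SDK's detector internals are not shown in the talk; the sketch below uses OpenCV's stock Haar cascade as a generic stand-in to illustrate multi-scale detection and why it is expensive.

```python
# Generic multi-scale face detection with OpenCV, as an illustration only;
# this is not Affectiva's detector, just a common open-source equivalent.
import cv2

# Haar cascade shipped with opencv-python; tuned for near-frontal, upright
# faces, mirroring the limitation mentioned above.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # scaleFactor controls how many scales are scanned: smaller steps find
    # more faces but cost more time, which is why detection is run only
    # periodically when scanning for additional faces.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Each result is a bounding box: (x, y, width, height).
```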
Landmark Tracking
• Locate the 2D positions of face landmarks
  • 34 landmarks
  • Allows head-angle estimation (sketched below)
• In the SDK:
  • Once tracking, can follow a face through position/scale/orientation changes
  • Tracking allows a wider range of non-frontal angles & upside-down faces
  • Sudden position changes can disrupt tracking
  • Tracking confidence is checked at every frame
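As one way to see how 2D landmarks allow a head-angle estimate, the sketch below fits a generic 3D face model to a handful of tracked points with OpenCV's solvePnP. The model coordinates and landmark choice are illustrative, not the SDK's 34-point model.

```python
# Sketch of estimating head angle from 2D landmarks using a generic 3D face
# model and OpenCV's solvePnP; the landmark set and model coordinates are
# illustrative, not the SDK's 34-point model.
import numpy as np
import cv2

# Approximate 3D positions (in mm) of a few landmarks on a generic head model.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye outer corner
    (225.0, 170.0, -135.0),    # right eye outer corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
])

def head_pose(image_points, frame_size):
    """image_points: 6x2 array of tracked 2D landmarks matching MODEL_POINTS."""
    h, w = frame_size
    focal = w  # rough pinhole-camera assumption
    camera_matrix = np.array([[focal, 0, w / 2],
                              [0, focal, h / 2],
                              [0, 0, 1]], dtype=float)
    ok, rvec, tvec = cv2.solvePnP(
        MODEL_POINTS, np.asarray(image_points, dtype=float),
        camera_matrix, distCoeffs=None)
    return rvec  # rotation vector encoding yaw/pitch/roll
```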
Expression Detection
• Detect a facial expression
  • Probability of expression (correlated with intensity)
• In the SDK:
  • Frame-by-frame analysis
  • Detection relies on visual texture information: shading, wrinkling (see the toy example below)
  • Robustness gained through observing thousands of real-world examples
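A toy version of texture-based detection: extract shading- and wrinkle-sensitive HOG features from a face crop and score them with a trained classifier. Nothing here reflects the SDK's actual models or training pipeline.

```python
# Toy illustration of texture-based expression detection; not the SDK's
# actual model. HOG features respond to the shading and wrinkling cues
# mentioned above.
from skimage.feature import hog
from sklearn.linear_model import LogisticRegression

def texture_features(face_crop):
    """face_crop: grayscale face image, e.g. 96x96, as a 2D numpy array."""
    return hog(face_crop, orientations=8, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Assume X (features from thousands of labeled face crops) and y (0/1
# expression labels) were built offline from real-world examples.
def train_detector(X, y):
    return LogisticRegression(max_iter=1000).fit(X, y)

def expression_probability(model, face_crop):
    # The probability output correlates with intensity, per the slide.
    return model.predict_proba([texture_features(face_crop)])[0, 1]
```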
Expression Interpretation
• Interpret expressions
  • Basic 6 emotions, plus contempt
  • Valence, engagement
  • Emoji
• In the SDK:
  • Emotional interpretations are available on a frame-by-frame basis
  • Great starting point for analysis
  • You can always drill down into the underlying expressions (a simplified mapping is sketched below)
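A simplified illustration of the interpretation step. Real FACS-based emotion mappings combine several action units per emotion; the rules below are stand-ins, not Affectiva's mapping.

```python
# Simplified sketch of interpreting expression scores as emotions and emoji;
# the rules below are stand-ins, not Affectiva's actual mapping.

def interpret(expr: dict) -> dict:
    """expr: per-expression scores on a 0-100 scale."""
    emotions = {
        # e.g. joy driven by smile; disgust by nose wrinkle + upper lip raise
        "joy": expr.get("smile", 0.0),
        "disgust": min(expr.get("nose_wrinkle", 0.0),
                       expr.get("upper_lip_raise", 0.0)),
    }
    emoji = "laughing" if emotions["joy"] > 50 else None
    return {"emotions": emotions, "emoji": emoji}

# Frame-by-frame: interpret(classify_expressions(...)), then drill down
# into the raw expression scores whenever the summary isn't enough.
```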
A few more notes
Video vs. Images
• Expression detection requires a baseline (neutral)
• With video, you can build an estimate of a person's neutral face
  • Person-specific appearance
  • Improves accuracy & sensitivity
• With a still image, no additional information is available (one simple baseline scheme is sketched below)
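One simple way to realize the baseline idea, assuming per-expression scores arrive frame by frame: track the lowest recent score as an estimate of the person's neutral level and subtract it. The talk does not specify the SDK's actual method; this is only a sketch.

```python
# Sketch of a person-specific neutral baseline from video; one simple
# approach, not the SDK's actual method.
from collections import deque

class NeutralBaseline:
    def __init__(self, window=300):           # e.g. ~10 s at 30 fps
        self.history = deque(maxlen=window)

    def correct(self, score: float) -> float:
        self.history.append(score)
        baseline = min(self.history)           # estimate of the neutral level
        return score - baseline

# With a single still image there is no history, so no person-specific
# correction is possible; the raw score must stand on its own.
```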
Pose & Luminance
• Metrics are robust to lighting
  • Lighting of the face! (not the lighting of the room)
• Pose is more challenging for some expressions
  • Beyond certain ranges we are no longer confident in the metrics and stop reporting them, despite still tracking the face (see the gating sketch below)
• Improvements in future releases as we learn from more and more data
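A minimal sketch of the gating behavior described above: the face keeps being tracked, but metrics are withheld once head pose leaves a confident range. The +/-30 degree limits are placeholders, not Affectiva's thresholds.

```python
# Sketch of gating metrics by head pose: keep tracking the face, but stop
# reporting expression metrics beyond a confident range. The +/-30 degree
# limits are placeholders, not Affectiva's actual thresholds.

MAX_YAW = 30.0
MAX_PITCH = 30.0

def report_metrics(metrics: dict, yaw: float, pitch: float):
    if abs(yaw) > MAX_YAW or abs(pitch) > MAX_PITCH:
        return None  # face still tracked, but metrics withheld
    return metrics
```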
Questions?
@affectiva