Jay Turcot - Emotion AI Developer Day 2016
TRANSCRIPT
@affectiva
Metrics & How Affectiva Software Works
Jay Turcot, Director of Applied AI, Affectiva Inc. (@pjturcot)
Nov 16, 2016
Emotion AI Developer Day 2016
Outline
• Why the face? & FACS
• How the technology works
• Metrics and emotions
• How it's used
• Digging deeper
  • Computer vision pipeline
  • Static image vs. video analysis
  • Pose & luminance
Why the face?
• Spontaneous
• Real-time feedback
• Front-facing camera
• Transmits rich information
  • Emotional state
  • Intensity
• Human interpretable
Ekman and Friesen's Facial Action Coding System (FACS), 1978
• Codifies facial expressions
• Action Units (AUs)
  • Independent movements of the face
  • Numeric codes
  • Associated with specific facial muscles
  • 5 intensity ratings (A, B, C, D, E)
• Examples:
  • AU 1 – Inner Eyebrow Raise
  • AU 9 – Nose Wrinkle
FACS: example
• Common language:
  • "Not bad" meme
  • Frown (North America only!)
• FACS:
  • 15E+17E (decoded in the sketch below)
  • Lip Corner Depressor (AU 15)
  • Chin Raiser (AU 17)
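To make the coding concrete, here is a minimal sketch (plain Python, not Affectiva tooling) of how a FACS code like 15E+17E can be decoded. The AU names follow the slides; the dictionary layout and helper function are purely illustrative.

```python
# Minimal, illustrative decoding of FACS codes; not Affectiva tooling.
# AU names follow the slides; the data layout is one possible encoding.

ACTION_UNITS = {
    1: "Inner Eyebrow Raise",
    9: "Nose Wrinkle",
    15: "Lip Corner Depressor",
    17: "Chin Raiser",
}

# FACS intensity is scored from A (trace) to E (maximum).
INTENSITY = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

def describe(code: str) -> str:
    """Turn a code like '15E' into a readable description."""
    au, level = int(code[:-1]), code[-1]
    return f"{ACTION_UNITS[au]} (AU {au}), intensity {level} ({INTENSITY[level]}/5)"

# The "not bad" expression from the slide: 15E+17E
for code in "15E+17E".split("+"):
    print(describe(code))
```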
How it works
• Our computer vision algorithms identify key landmarks on the face
• Machine learning algorithms analyze pixels in those regions to classify facial expressions
• Combinations of facial expressions are mapped to emotions (the overall pipeline is sketched below)
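The three stages above can be pictured as a simple function pipeline. The sketch below is structural only: the function names and return shapes are assumptions made for illustration, not the Affectiva SDK API.

```python
# Structural sketch of the three-stage pipeline; function names and return
# shapes are assumptions for illustration, not the Affectiva SDK API.

def find_landmarks(frame):
    """Stage 1: computer vision locates key facial landmarks."""
    ...

def classify_expressions(frame, landmarks):
    """Stage 2: machine learning analyzes pixels around the landmarks and
    returns per-expression probabilities, e.g. {'smile': 0.93}."""
    ...

def map_to_emotions(expressions):
    """Stage 3: combinations of expressions are mapped to emotion metrics,
    e.g. {'joy': 0.91, 'valence': 0.8}."""
    ...

def process_frame(frame):
    landmarks = find_landmarks(frame)
    expressions = classify_expressions(frame, landmarks)
    return map_to_emotions(expressions)
```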
Nuanced facial expressions
Lip Suck, Lip Pucker, Lip Press, Smile, Lip Corner Depressor, Nose Wrinkle, Chin Raise, Mouth Open, Attention, Brow Furrow, Brow Raise, Inner Brow Raise, Eye Closure, Smirk, Upper Lip Raise
A range of emotions
• Valence: how positive or negative a person's facial expressions are
• Engagement: overall level and intensity of the emotion (one possible valence computation is sketched below)
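The talk does not give the exact formula for valence, but one way to picture it is as positive expressions pushing the score up and negative ones pulling it down. The expression names and weights below are made up for illustration; they are not Affectiva's actual weighting.

```python
# Hypothetical valence computation; the actual weighting used by the SDK
# is not given in the talk, so the expression names and weights are made up.

POSITIVE = {"smile": 1.0}
NEGATIVE = {"brow_furrow": 1.0, "lip_corner_depressor": 1.0, "nose_wrinkle": 1.0}

def valence(expressions: dict) -> float:
    """Map 0-100 expression scores to a rough [-100, 100] valence value."""
    pos = max((expressions.get(name, 0.0) * w for name, w in POSITIVE.items()),
              default=0.0)
    neg = max((expressions.get(name, 0.0) * w for name, w in NEGATIVE.items()),
              default=0.0)
    return pos - neg

print(valence({"smile": 80.0}))        # 80.0  -> positive
print(valence({"brow_furrow": 60.0}))  # -60.0 -> negative
```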
And emojis
Laughing, Smiley, Relaxed, Wink, Kissing, Stuck Out Tongue, Stuck Out Tongue and Winking Eye, Scream, Flushed, Smirk, Disappointed, Rage
A person's appearance
• Gender: identifies the human perception of gender expression
• Glasses: presence of eyeglasses or sunglasses
Emotion analytics using state-of-the-art computer vision and machine learning
See more at: blog.affectiva.com
How are they used?
• Trigger events based on facial expressions (see the sketch below)
• Build predictive models from raw data
• Analytics over time & across individuals
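As a concrete example of the first use case, the sketch below fires a callback on the rising edge of a smile metric. The metric name, the 0-100 scale, and the on_frame hook are assumptions, not a specific SDK interface.

```python
# Minimal sketch of "trigger events based on facial expressions": fire a
# callback when a metric crosses a threshold. The metric name, 0-100 scale,
# and on_frame hook are assumptions, not a specific SDK interface.

SMILE_THRESHOLD = 50.0

class SmileTrigger:
    def __init__(self, callback, threshold=SMILE_THRESHOLD):
        self.callback = callback
        self.threshold = threshold
        self.active = False

    def on_frame(self, metrics: dict):
        smiling = metrics.get("smile", 0.0) >= self.threshold
        if smiling and not self.active:
            self.callback()          # rising edge: smile just started
        self.active = smiling

trigger = SmileTrigger(lambda: print("User smiled!"))
trigger.on_frame({"smile": 10.0})    # no event
trigger.on_frame({"smile": 80.0})    # prints "User smiled!"
```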
Digging deeper
Face Detection
• Initial detection of a face in an image
• Position and scale (bounding box)
• In the SDK:
  • Near-frontal & upright faces
  • Looking for faces at multiple positions and multiple scales is time-consuming (illustrated in the sketch below)
  • For multi-face use, face detection needs to run periodically to scan for new individuals (until the maximum number of faces is reached)
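The SDK's detector internals are not shown in the talk; the sketch below uses OpenCV's stock Haar cascade as a generic stand-in to illustrate multi-scale detection and why it is expensive.

```python
# Generic multi-scale face detection with OpenCV, as an illustration only;
# this is not Affectiva's detector, just a common open-source equivalent.
import cv2

# Haar cascade shipped with opencv-python; tuned for near-frontal, upright
# faces, mirroring the limitation mentioned above.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # scaleFactor controls how many scales are scanned: smaller steps find
    # more faces but cost more time, which is why detection is run only
    # periodically when scanning for additional faces.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Each result is a bounding box: (x, y, width, height).
```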
Landmark Tracking
• Locate the 2D positions of face landmarks
  • 34 landmarks
  • Allows head-angle estimation (sketched below)
• In the SDK:
  • Once tracking, can follow a face through position/scale/orientation changes
  • Tracking allows a wider range of non-frontal angles & upside-down faces
  • Sudden position changes can disrupt tracking
  • Tracking confidence is checked at every frame
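As one way to see how 2D landmarks allow a head-angle estimate, the sketch below fits a generic 3D face model to a handful of tracked points with OpenCV's solvePnP. The model coordinates and landmark choice are illustrative, not the SDK's 34-point model.

```python
# Sketch of estimating head angle from 2D landmarks using a generic 3D face
# model and OpenCV's solvePnP; the landmark set and model coordinates are
# illustrative, not the SDK's 34-point model.
import numpy as np
import cv2

# Approximate 3D positions (in mm) of a few landmarks on a generic head model.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye outer corner
    (225.0, 170.0, -135.0),    # right eye outer corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
])

def head_pose(image_points, frame_size):
    """image_points: 6x2 array of tracked 2D landmarks matching MODEL_POINTS."""
    h, w = frame_size
    focal = w  # rough pinhole-camera assumption
    camera_matrix = np.array([[focal, 0, w / 2],
                              [0, focal, h / 2],
                              [0, 0, 1]], dtype=float)
    ok, rvec, tvec = cv2.solvePnP(
        MODEL_POINTS, np.asarray(image_points, dtype=float),
        camera_matrix, distCoeffs=None)
    return rvec  # rotation vector encoding yaw/pitch/roll
```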
Expression Detection
• Detect a facial expression
  • Probability of expression (correlated with intensity)
• In the SDK:
  • Frame-by-frame analysis
  • Detection relies on visual texture information: shading, wrinkling (see the toy example below)
  • Robustness gained through observing thousands of real-world examples
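A toy version of texture-based detection: extract shading- and wrinkle-sensitive HOG features from a face crop and score them with a trained classifier. Nothing here reflects the SDK's actual models or training pipeline.

```python
# Toy illustration of texture-based expression detection; not the SDK's
# actual model. HOG features respond to the shading and wrinkling cues
# mentioned above.
from skimage.feature import hog
from sklearn.linear_model import LogisticRegression

def texture_features(face_crop):
    """face_crop: grayscale face image, e.g. 96x96, as a 2D numpy array."""
    return hog(face_crop, orientations=8, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Assume X (features from thousands of labeled face crops) and y (0/1
# expression labels) were built offline from real-world examples.
def train_detector(X, y):
    return LogisticRegression(max_iter=1000).fit(X, y)

def expression_probability(model, face_crop):
    # The probability output correlates with intensity, per the slide.
    return model.predict_proba([texture_features(face_crop)])[0, 1]
```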
Expression Interpretation
• Interpret expressions
  • Basic 6 emotions, plus contempt
  • Valence, engagement
  • Emoji
• In the SDK:
  • Emotional interpretations are available on a frame-by-frame basis
  • Great starting point for analysis
  • You can always drill down into the underlying expressions (a simplified mapping is sketched below)
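A simplified illustration of the interpretation step. Real FACS-based emotion mappings combine several action units per emotion; the rules below are stand-ins, not Affectiva's mapping.

```python
# Simplified sketch of interpreting expression scores as emotions and emoji;
# the rules below are stand-ins, not Affectiva's actual mapping.

def interpret(expr: dict) -> dict:
    """expr: per-expression scores on a 0-100 scale."""
    emotions = {
        # e.g. joy driven by smile; disgust by nose wrinkle + upper lip raise
        "joy": expr.get("smile", 0.0),
        "disgust": min(expr.get("nose_wrinkle", 0.0),
                       expr.get("upper_lip_raise", 0.0)),
    }
    emoji = "laughing" if emotions["joy"] > 50 else None
    return {"emotions": emotions, "emoji": emoji}

# Frame-by-frame: interpret(classify_expressions(...)), then drill down
# into the raw expression scores whenever the summary isn't enough.
```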
A few more notes
Video vs. Images
• Expression detection requires a baseline (neutral)
• With video, you can build an estimate of a person's neutral face
  • Person-specific appearance
  • Improves accuracy & sensitivity
• With a still image, no additional information is available (one simple baseline scheme is sketched below)
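One simple way to realize the baseline idea, assuming per-expression scores arrive frame by frame: track the lowest recent score as an estimate of the person's neutral level and subtract it. The talk does not specify the SDK's actual method; this is only a sketch.

```python
# Sketch of a person-specific neutral baseline from video; one simple
# approach, not the SDK's actual method.
from collections import deque

class NeutralBaseline:
    def __init__(self, window=300):           # e.g. ~10 s at 30 fps
        self.history = deque(maxlen=window)

    def correct(self, score: float) -> float:
        self.history.append(score)
        baseline = min(self.history)           # estimate of the neutral level
        return score - baseline

# With a single still image there is no history, so no person-specific
# correction is possible; the raw score must stand on its own.
```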
Pose & Luminance
• Metrics are robust to lighting
  • Lighting of the face! (not the lighting of the room)
• Pose is more challenging for some expressions
  • Beyond certain ranges we are no longer confident in the metrics and stop reporting them, despite still tracking the face (see the gating sketch below)
• Improvements in future releases as we learn from more and more data
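A minimal sketch of the gating behavior described above: the face keeps being tracked, but metrics are withheld once head pose leaves a confident range. The +/-30 degree limits are placeholders, not Affectiva's thresholds.

```python
# Sketch of gating metrics by head pose: keep tracking the face, but stop
# reporting expression metrics beyond a confident range. The +/-30 degree
# limits are placeholders, not Affectiva's actual thresholds.

MAX_YAW = 30.0
MAX_PITCH = 30.0

def report_metrics(metrics: dict, yaw: float, pitch: float):
    if abs(yaw) > MAX_YAW or abs(pitch) > MAX_PITCH:
        return None  # face still tracked, but metrics withheld
    return metrics
```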
Questions?
@affectiva