Page 1: Human Emotion Synthesis

Human Emotion Synthesis

David Oziem, Lisa Gralewski, Neill Campbell, Colin Dalton, David Gibson, Barry Thomas

University of Bristol, Motion Ripper, 3CR Research

Page 2: Human Emotion Synthesis

Synthesising Facial Emotions – University of Bristol – 3CR Research

Project Group

• Motion Ripper Project

– Methods of motion capture.
– Re-using captured motion signatures.
– Synthesising new or extended motion sequences.
– Tools to aid animation.

• Collaboration between University of Bristol CS, Matrix Media & Granada.

Page 3: Human Emotion Synthesis

Introduction

• What is an emotion?

• Ekman outlined 6 basic emotions.
– Joy, disgust, surprise, fear, anger and sadness.

• Emotional states relate to one's expression and movement.

• Synthesising video footage of an actress expressing different emotions.

Page 4: Human Emotion Synthesis

Page 5: Human Emotion Synthesis

Video Textures

• Video textures or temporal textures are textures with motion. (Szummer’96)

• Schödl’00 reordered frames from the original footage to produce loops or continuous sequences.

– Doesn’t produce new footage.

• Campbell’01, Fitzgibbon’01 and Reissell’01 used an autoregressive process (ARP) to synthesise frames.

Examples of Video Textures
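
The frame-reordering idea credited to Schödl’00 above can be sketched roughly as follows. This is a minimal illustration, not the published method in full: the function name `transition_probabilities`, the Euclidean frame distance and the `sigma` smoothing parameter are assumptions chosen for the example.

```python
import numpy as np

def transition_probabilities(frames, sigma):
    """Probability of jumping from frame i to frame j when replaying footage.

    frames: array with one frame per entry along the first axis.
    Returns an (N - 1, N) matrix whose rows sum to 1.
    """
    frames = np.asarray(frames, dtype=float).reshape(len(frames), -1)
    # D[i, j] = Euclidean distance between frame i and frame j.
    # (Builds an N x N x pixels array, so this suits short clips only.)
    diff = frames[:, None, :] - frames[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    # A jump i -> j looks smooth when frame j resembles frame i + 1,
    # so row i of P is built from row i + 1 of D.
    P = np.exp(-D[1:, :] / sigma)
    return P / P.sum(axis=1, keepdims=True)
```

Sampling the next frame from each row replays existing frames in a new order, which is why this family of methods cannot produce footage that was never recorded.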

Page 6: Human Emotion Synthesis

Autoregressive Process

• Statistical model

• Calculating the model involves working out the parameter vector (a1…an) and w.

• n is known as the order of the sequence.

$$y(t) = -a_1\,y(t-1) - a_2\,y(t-2) - \dots - a_n\,y(t-n) + w\,\varepsilon$$

Here y(t) is the current value at time t, (a1,…,an) is the parameter vector, and w.ε is the noise term.
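
As a rough illustration of the model on this slide, the sketch below fits the parameter vector (a1…an) and the noise scale w of a scalar ARP by ordinary least squares, then generates new values. The names `fit_arp` and `synthesise`, the least-squares estimator and the Gaussian choice for ε are assumptions made for the example; the slides do not state how the parameters were estimated.

```python
import numpy as np

def fit_arp(y, n):
    """Least-squares fit of y(t) = -a1*y(t-1) - ... - an*y(t-n) + w*eps."""
    y = np.asarray(y, dtype=float)
    # Column k holds y(t-k) for every t from n to len(y) - 1.
    X = np.column_stack([y[n - k:len(y) - k] for k in range(1, n + 1)])
    target = y[n:]
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    a = -coeffs                        # match the sign convention of the slide
    w = np.std(target - X @ coeffs)    # noise scale taken from the residuals
    return a, w

def synthesise(a, w, seed_values, length, rng):
    """Generate `length` new values, starting from at least n seed values."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    out = list(seed_values)
    n_seed = len(out)
    for _ in range(length):
        past = np.array(out[-1:-n - 1:-1])   # y(t-1), ..., y(t-n)
        out.append(float(-a @ past + w * rng.standard_normal()))
    return np.array(out[n_seed:])
```

A typical use would be `a, w = fit_arp(signal, n)` followed by `synthesise(a, w, signal[-n:], 200, np.random.default_rng())`, where `signal` is a hypothetical one-dimensional training sequence.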

Page 7: Human Emotion Synthesis

Autoregressive Process

• Statistical model

• Increasing dimensionality of y drastically increases the complexity in calculating (a1…an).

$$y(t) = -a_1\,y(t-1) - a_2\,y(t-2) - \dots - a_n\,y(t-n) + w\,\varepsilon$$

Page 8: Human Emotion Synthesis

Autoregressive Process

[Figure: PCA analysis of the Sad footage in 2D, plotted as primary mode against secondary mode.]

• Principal Components Analysis (PCA) is used to reduce the number of dimensions in the original sequence.
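
A minimal sketch of this reduction step, assuming each frame has been flattened into one row of a matrix. The names `pca_reduce` and `pca_reconstruct` and the use of an SVD are illustrative choices, not details taken from the slides.

```python
import numpy as np

def pca_reduce(frames, k):
    """Project frames (one flattened frame per row) onto the first k modes."""
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    # Rows of vt are the principal modes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    modes = vt[:k]
    coords = (frames - mean) @ modes.T
    return coords, modes, mean

def pca_reconstruct(coords, modes, mean):
    """Map low-dimensional coordinates back to frame space."""
    return coords @ modes + mean
```

With k = 2 the two columns of `coords` correspond to the primary and secondary modes used in the 2D plots.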

Page 9: Human Emotion Synthesis

Autoregressive Process

[Figure: two 2D plots of primary mode against secondary mode. Panels: PCA analysis of the Sad footage; sequence generated using an ARP.]

• A non-Gaussian distribution is incorrectly modelled by an ARP.

Page 10: Human Emotion Synthesis

Face Modelling

• Campbell’01 synthesised a talking head.

• Cootes and Taylor’00: combined appearance model.
– Isolates shape and texture.

• Requires labelled frames.
– Important features on the face must be labelled.

[Figure: labelled points]

Page 11: Human Emotion Synthesis

Combined Appearance

[Diagram: shape space]

Hand-labelled video footage provides a point set that represents the shape space of the clip.

Page 12: Human Emotion Synthesis

Combined Appearance

[Diagram: shape space and texture space]

Warping each frame into a standard pose creates the texture space.

The standard pose is the mean position of the points.

Page 13: Human Emotion Synthesis

Combined Appearance

[Diagram: shape space and texture space joined into the combined space]

Joining the shape and texture space and then re-analysing using PCA produces the combined space.
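
The joining step can be sketched as below, assuming the shape points and the warped textures are already available as matrices with one row per frame. The function names, the numbers of modes and the variance-balancing weight `r` are assumptions for illustration; the slides do not say how the two spaces were scaled before the second PCA.

```python
import numpy as np

def pca(data, k):
    """Return the first k PCA coordinates, the modes and the mean of data."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return (data - mean) @ vt[:k].T, vt[:k], mean

def combined_space(shapes, textures, k_shape, k_texture, k_combined):
    """Join shape and texture parameters, then re-analyse them with PCA."""
    b_s, _, _ = pca(shapes, k_shape)        # shape parameters per frame
    b_g, _, _ = pca(textures, k_texture)    # texture parameters per frame
    # Assumed weighting: scale the shape parameters so that their total
    # variance matches that of the texture parameters.
    r = np.sqrt(b_g.var(axis=0).sum() / b_s.var(axis=0).sum())
    joined = np.hstack([r * b_s, b_g])
    return pca(joined, k_combined)
```

Each row of the returned coordinates describes one frame in the combined space, so a sequence of frames becomes a trajectory that an ARP can be trained on.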

Page 14: Human Emotion Synthesis

Combined Appearance

[Diagram: shape space, texture space and combined space]

Reconstruction of the original sequence from the combined space.

Page 15: Human Emotion Synthesis

Combined Appearance

[Figure: two 2D plots of primary mode against secondary mode. Panels: original sequence; combined appearance sequence. Caption: change in distribution after applying the combined appearance technique.]

Page 16: Human Emotion Synthesis

Combined Appearance

[Figure: two 2D plots of primary mode against secondary mode. Panels: original sequence; sequence generated by the ARP model.]

• Visually, the generated plot appears to come from the same stochastic process as the original.

Page 17: Human Emotion Synthesis

Copying and ARP

• Combine the benefits of copying with ARP.
– New motion signatures.
– Handles non-Gaussian distributions.

Page 18: Human Emotion Synthesis

Copying and ARP

[Diagram: original input → PCA → reduced input]

• Important to reduce the complexity of the search process.
• Need around 30 to 40 dimensions in this example.

Page 19: Human Emotion Synthesis

Copying and ARP

[Diagram: original input → PCA → reduced input → segmented input → PCA → reduced segments]

• Temporal segments of between 15 and 30 frames.
• Each segment needs to be reduced before ARPs can be trained on it.
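
One possible way to cut such temporal segments is sketched below. The slides give only the length range, so the random choice of start and length, and the name `random_segments`, are assumptions for illustration.

```python
import numpy as np

def random_segments(sequence, count, rng, min_len=15, max_len=30):
    """Cut `count` windows of min_len to max_len frames from a sequence.

    sequence: array with one (reduced) frame per row.
    """
    segments = []
    for _ in range(count):
        # rng.integers excludes the upper bound, hence the + 1.
        length = int(rng.integers(min_len, max_len + 1))
        start = int(rng.integers(0, len(sequence) - length + 1))
        segments.append(sequence[start:start + length])
    return segments
```

Each returned window could then be reduced again with PCA before an ARP is trained on it.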

Page 20: Human Emotion Synthesis

Copying and ARP

[Diagram: original input → PCA → reduced input → segmented input → PCA → reduced segments → ARP → synthesised segments]

• Many of the learned models are unstable.
• 10-20% are usable.
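
One standard way to detect an unstable model is to check the roots of its characteristic polynomial. The sketch below does this for a scalar ARP written in the sign convention used earlier in the deck; it is an assumption that a test of this kind was used to decide which models are usable, and the name `is_stable` is hypothetical.

```python
import numpy as np

def is_stable(a):
    """True if y(t) = -a1*y(t-1) - ... - an*y(t-n) + w*eps does not blow up.

    The model is stable when every root of
    z**n + a1*z**(n-1) + ... + an lies strictly inside the unit circle.
    """
    roots = np.roots(np.concatenate(([1.0], np.asarray(a, dtype=float))))
    return bool(np.all(np.abs(roots) < 1.0))
```

Unstable models can then be discarded with a filter such as `[a for a in fitted if is_stable(a)]`, where `fitted` is a hypothetical list of learned parameter vectors.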

Page 21: Human Emotion Synthesis

Copying and ARP

[Diagram: original input → PCA → reduced input → segmented input → PCA → reduced segments → ARP → synthesised segments → segment selection → output sequence]

Page 22: Human Emotion Synthesis

Example

[Figure: first mode against time t, marking the end of the generated sequence, the possible segments and the compared section.]

Page 23: Human Emotion Synthesis

Example

[Figure: first mode against time t.]

The closest 3 segments are chosen.

Page 24: Human Emotion Synthesis

Example

[Figure: first mode against time t.]

The segment to be copied is randomly selected from the closest 3.

Page 25: Human Emotion Synthesis

Example

[Figure: first mode against time t.]

Segments are blended together using a small overlap and averaging the overlapping pixels.
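
The selection and blending steps walked through in this example can be sketched as follows. The Euclidean distance between the compared sections, the equal-weight average over the overlap and the names `pick_segment` and `blend_append` are assumptions for illustration; the slides specify only "closest 3", a random choice and averaging over a small overlap.

```python
import numpy as np

def pick_segment(generated, segments, compare_len, rng, n_closest=3):
    """Randomly pick one of the n_closest segments to the end of `generated`.

    generated and each segment hold one frame per row; each segment must
    contain at least compare_len frames.
    """
    tail = generated[-compare_len:]          # the compared section
    dists = [np.linalg.norm(seg[:compare_len] - tail) for seg in segments]
    closest = np.argsort(dists)[:n_closest]
    return segments[int(rng.choice(closest))]

def blend_append(generated, segment, overlap):
    """Append a segment, averaging the frames that overlap."""
    mixed = 0.5 * (generated[-overlap:] + segment[:overlap])
    return np.vstack([generated[:-overlap], mixed, segment[overlap:]])
```

Calling these two functions in a loop extends the output for as long as required.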

Page 26: Human Emotion Synthesis

Copying and ARP

[Figure: two 2D plots of primary mode against secondary mode. Panels: PCA analysis of the Sad footage; sequence generated by the copying and ARP model.]

• Potentially infinitely long.
• Includes novel motions.

Page 27: Human Emotion Synthesis

Results (Angry)

[Videos: source footage; copying with ARP; combined appearance ARP]

• Combined appearance produces higher resolution frames.

• Better motion from the copying and ARP approach.

Page 28: Human Emotion Synthesis

Results (Sad)

[Videos: source footage; copying with ARP; combined appearance ARP]

• Similar results to the angry footage.
– The copied approach is less blurred due to the reduced variance.

Page 29: Human Emotion Synthesis

Comparison Results

[Figure legend: combined appearance; segment copying]

• Simple objective comparison.
– Randomly selected temporal segments.

Page 30: Human Emotion Synthesis

Comparison

• Perceptually, is it better to have good motion or higher resolution?

Page 31: Human Emotion Synthesis

[Videos: combined appearance; segment copying with ARP]

Page 32: Human Emotion Synthesis

Other potential uses

• Self Organising Map

• Uses combined appearance, as each ARP model provides a minimal representation of the given emotion.

• Can navigate between emotions to create new interstates.

[Figure labels: Angry, Sad, Happy]

Page 33: Human Emotion Synthesis

Conclusions

• Both methods can produce synthesised clips of a given emotion.

• Combined appearance produces higher definition frames.

• Copying with ARPs generates more natural movements.

Page 34: Human Emotion Synthesis

Questions