
Producing Emotional Speech

Thanks to Gabriel Schubiner

Papers

Generation of Affect in Synthesized Speech

Corpus-based approach to synthesis

Expressive visual speech using talking head

Demos

Affect Editor Quiz/Demo

Synface Demo

Affect in Speech: Goals

Addition of emotion to synthetic speech

Acoustic Model

Typology of parameters of emotional speech

Quantification

Addresses problem of expressiveness

What benefit is gained from expressive speech?

Emotion Theory/Assumptions

Emotion -> Nervous System -> Speech Output

Binary distinction: parasympathetic vs. sympathetic

Based on physical changes

Assumes universal emotions

Approaches to Affect

Generative

Emotion -> Physical -> Acoustic

Descriptive

Observed acoustic parameters imposed directly on synthesis

Descriptive Framework

4 Parameter groups

Pitch

Timing

Voice Quality

Articulation

Assumption of independence

How could this affect design and results?

Pitch & Timing

Pitch:

Accent Shape

Average Pitch

Contour Slope

Final Lowering

Pitch Range

Reference Line

Exaggeration (not used)

Timing:

Fluent Pauses

Hesitation Pauses

Speech Rate

Stress Frequency

Ratio of stressed to stressable syllables

Voice Quality & Articulation

Voice quality:

Breathiness

Brilliance

Loudness

Pause Discontinuity

Pitch Discontinuity

Tremor

Laryngealization

Articulation:

Precision

Implementation

Each parameter has a scale

Each scale is independent of the other parameters

Each scale runs between negative and positive values

Implementation

Settings grouped into preset conditions for each emotion (see the sketch below)

based on prior studies
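A minimal sketch of how such presets could be represented, assuming Cahn-style independent scales from -10 to +10; the parameter names follow the typology above, but the specific values are illustrative assumptions, not the Affect Editor's published settings.

```python
# Sketch: emotion presets as settings on independent parameter scales
# (assumed range -10..+10, neutral = 0). The numbers below are
# illustrative assumptions, not the Affect Editor's published values.
NEUTRAL = 0
PARAMS = ["average_pitch", "pitch_range", "speech_rate", "loudness",
          "breathiness", "final_lowering", "stress_frequency", "precision"]

PRESETS = {
    "sadness": {"average_pitch": -5, "pitch_range": -8, "speech_rate": -6,
                "breathiness": 4, "final_lowering": 3, "precision": -4},
    "anger":   {"average_pitch": 6, "pitch_range": 8, "speech_rate": 5,
                "loudness": 8, "stress_frequency": 7, "precision": 5},
}

def settings_for(emotion: str) -> dict:
    """Full parameter dict for an emotion; unset scales default to neutral."""
    preset = PRESETS.get(emotion, {})
    return {p: preset.get(p, NEUTRAL) for p in PARAMS}

print(settings_for("sadness"))
```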

Program Flow: Input

Emotion -> parameter representation

Utterance -> clauses

Agent, Action, Object, Locative

Clause and lexeme annotations

Finds all possible locations for affect and chooses whether or not to use each one (sketched below)
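A sketch of what the annotated input might look like; the structure and field names are assumptions made for illustration, not the program's actual representation.

```python
# Sketch: emotion plus an utterance broken into clauses, each carrying
# role (agent/action/object/locative) and lexeme annotations. Structure
# and field names are illustrative assumptions.
input_spec = {
    "emotion": "sadness",  # mapped to a parameter preset, as above
    "clauses": [{
        "roles": {"agent": "I", "action": "finished",
                  "object": None, "locative": None},
        "lexemes": [{"word": "I'm", "stressable": False},
                    {"word": "almost", "stressable": True},
                    {"word": "finished", "stressable": True}],
    }],
}

# Candidate affect locations: every stressable lexeme; the system then
# chooses whether or not to use each one.
candidates = [lx["word"] for cl in input_spec["clauses"]
              for lx in cl["lexemes"] if lx["stressable"]]
print(candidates)  # ['almost', 'finished']
```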

Program Flow

Utterance -> Tree structure -> linear phonology

“Compiled” for a specific synthesizer, with software simulating effects not available in the hardware

Perception

30 Utterances

5 sentences * 6 affects

Forced choice of one of six affects

Subjects also gave magnitude ratings and comments

Elicitation Sentences

Intro

I’m almost finished

I’m going to the city

I saw your name in the paper X

I thought you really meant it

Look at that picture

Pop Quiz!!!

Pop Quiz Solutions

“I’m almost finished” -> Disgust : Surprise : Sadness : Gladness : Anger : Fear

“I’m going to the city” -> Surprise : Gladness : Anger : Disgust : Sadness : Fear

“I thought you really meant it” -> Anger : Disgust : Gladness : Sadness : Fear : Surprise

“Look at that picture” -> Anger : Fear : Disgust : Sadness : Gladness : Surprise

Results

Approx. 50% overall recognition rate (chance over six affects is about 17%)

91% for sadness

Conclusions

Effective?

Thoughts?

Corpus-based Approach to Expressive Speech Synthesis

Corpus

Collect utterances in each emotion

emotion-dependent semantics

One speaker

Good news, Bad news, Question

Model: Feature Vector

Features:

Lexical stress

Phrase-level stress

Distance from beginning of phrase

Distance from end of phrase

POS (part of speech)

Phrase type

End-of-syllable pitch
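A minimal sketch of this feature vector as a data structure; the field names and types paraphrase the list above and are assumptions, not the paper's actual encoding.

```python
from dataclasses import dataclass

# Sketch of the per-syllable feature vector; names and types are
# assumptions paraphrasing the feature list above.
@dataclass
class SyllableFeatures:
    lexical_stress: bool         # syllable carries lexical stress
    phrase_stress: bool          # phrase-level stress
    dist_from_phrase_start: int  # in syllables
    dist_from_phrase_end: int    # in syllables
    pos: str                     # part of speech of the containing word
    phrase_type: str             # e.g. statement vs. question
    end_pitch: float             # end-of-syllable pitch (Hz)
```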

Model: Classification

Predicts F0

5 syllable window

Uses feature vector to predict observation vector

observation vector: log(p), Δp

p = end of syllable pitch

Decision Tree
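A minimal sketch of tree-based F0 prediction over a 5-syllable window, using scikit-learn and synthetic data as stand-ins for the paper's own decision-tree construction and corpus.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Sketch: predict the observation vector (log p, delta p) per syllable
# from features over a 5-syllable window. Synthetic data stands in for
# the corpus; scikit-learn stands in for the paper's tree builder.
rng = np.random.default_rng(0)
n_syl, n_feat = 200, 7  # 7 features per syllable (see the list above)
feats = rng.normal(size=(n_syl, n_feat))

# Stack each syllable's features with two neighbors on either side.
pad = np.pad(feats, ((2, 2), (0, 0)), mode="edge")
windows = np.hstack([pad[i:i + n_syl] for i in range(5)])

pitch = 100 + 20 * rng.random(n_syl)  # fake end-of-syllable pitch (Hz)
obs = np.column_stack([np.log(pitch), np.gradient(pitch)])  # (log p, delta p)

tree = DecisionTreeRegressor(max_depth=6).fit(windows, obs)
print(tree.predict(windows[:3]))  # predicted (log p, delta p) per syllable
```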

Model: Target Duration

Similar to predicting F0

Build a tree whose leaves each give a Gaussian over durations

Use the mean of the leaf's class as the target duration

discretization
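A sketch of the leaf-Gaussian idea under the same assumptions as the previous snippet: training durations are routed to leaves, and each leaf's mean serves as the target duration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Sketch: a tree over the same windowed features, targeting duration.
# Each leaf collects roughly Gaussian durations; the leaf mean is the
# target duration at synthesis time. Data is synthetic.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 35))         # windowed features, as before
durations = 80 + 40 * rng.random(200)  # fake syllable durations (ms)

tree = DecisionTreeRegressor(max_depth=4).fit(X, durations)
leaf_ids = tree.apply(X)               # leaf index for each training sample
leaf_means = {leaf: durations[leaf_ids == leaf].mean()
              for leaf in set(leaf_ids)}

leaf = tree.apply(X[:1])[0]            # route a new sample to its leaf
print(f"target duration: {leaf_means[leaf]:.1f} ms")
```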

Models

Uses an acoustic analogue of n-grams

Captures a sense of context, compared to describing the full emotion as a sequence

Compared to the Affect Editor:

Uses only F0 and length

Includes information about which utterance the features are derived from

Intentional bias; is it justified?

Model: Synthesis

Data tagged with original expression and emotion

Expression-cost matrix

Noted trade-off: emotional intensity vs. smoothness
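A sketch of how an expression-cost matrix could enter a unit-selection style search; the cost weights and matrix entries are illustrative assumptions, not the paper's numbers.

```python
# Sketch: scoring a candidate unit during selection. The expression-cost
# matrix penalizes synthesizing one expression from a unit recorded in
# another; all numbers are illustrative assumptions.
EXPRESSION_COST = {  # target expression -> source expression -> cost
    "good_news": {"good_news": 0.0, "neutral": 0.4,
                  "bad_news": 1.0, "question": 0.6},
    # ... remaining rows omitted
}

def unit_cost(target_expr, unit_expr, target_cost, join_cost,
              w_target=1.0, w_join=1.0, w_expr=2.0):
    """Acoustic target cost + join (smoothness) cost + expression cost.
    Raising w_expr favors emotional intensity; raising w_join favors
    smoothness: the trade-off noted above."""
    return (w_target * target_cost + w_join * join_cost
            + w_expr * EXPRESSION_COST[target_expr][unit_expr])

print(unit_cost("good_news", "neutral", target_cost=0.3, join_cost=0.2))
```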

Paralinguistic events

SSML

Compare to Cahn’s typology

Abstraction layers

Perception Experiment

Distinguish same utterance spoken with neutral and affected prosody

Semantic content problematic?

Results

Binary decision

Reasonable gain over baseline?

Conclusion

Major contributions?

Paths forward?

Synthesis of Expressive Visual Speech on a Talking Head

(Not these Talking Heads...)

Synthesis Background

Manipulation of video images

Virtual model with deformation parameters

Synchronized with time-aligned transcription

Articulatory Control Model

Cohen & Massaro (1993)

Data

Single actor

Given specific emotion as instruction

6 emotions + neutral

Facial Animation Parameters

Face-independent

FAP matrix * scaling factor + position_0 (neutral position)

Weighted deformations of the distances between vertices and a feature point
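A sketch of that deformation arithmetic; the array shapes, weights, and values are assumptions about how FAP-driven displacements might be applied to mesh vertices.

```python
import numpy as np

# Sketch: new vertex positions = neutral positions (position_0) plus
# weighted, scaled FAP displacements. Shapes and values are illustrative.
rng = np.random.default_rng(2)
position0 = rng.random((4, 3))            # neutral positions, 4 vertices
fap_disp = rng.random((4, 3))             # per-vertex FAP displacement
scale = 0.8                               # FAP scaling factor
weights = np.array([1.0, 0.6, 0.3, 0.0])  # falloff with distance from feature point

positions = position0 + weights[:, None] * scale * fap_disp
print(positions)
```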

Modeling

Phonetic segments assigned target parameter vector

Temporal blending via dominance functions (see the sketch after this list)

Principal components
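A sketch of dominance-function blending in the Cohen & Massaro (1993) style: each segment's target parameter value is weighted by a dominance function centered on the segment, and overlapping dominances are normalized. The exponential form is the model's general shape; the constants here are illustrative assumptions.

```python
import numpy as np

# Sketch of dominance-function blending (after Cohen & Massaro 1993):
# each segment s has target T[s] and dominance
#   D_s(t) = alpha * exp(-theta * |t - center_s| ** c);
# the trajectory is the dominance-weighted average of the targets.
def dominance(t, center, alpha=1.0, theta=2.0, c=1.0):
    return alpha * np.exp(-theta * np.abs(t - center) ** c)

t = np.linspace(0.0, 1.0, 101)
centers = [0.2, 0.5, 0.8]   # segment midpoints (s); illustrative
targets = [0.1, 0.9, 0.4]   # target values for one parameter; illustrative

D = np.array([dominance(t, c0) for c0 in centers])  # (segments, time)
trajectory = (D * np.array(targets)[:, None]).sum(axis=0) / D.sum(axis=0)
print(trajectory[::25])     # smoothly blended parameter values
```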

ML

Separate models for each emotion

6:1 training:testing ratio

Models -> principal-component trajectories -> FAP trajectories * emotion parameter matrix

Results

More extreme emotions easier to perceive

73% sad, 60% angry, 40% sad

Synface Demo

Discussion

Changes in approach from Cahn to Eide

Production compared to Detection
