IEEE 2009 Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing, Brisbane


Expression and Analysis of Emotions: Survey and Experiment

Changrong Yu (1), Jiehan Zhou (2), Jukka Riekki (3)

(1) University of Oulu, English Philology
(2) University of Oulu, Department of Electrical and Information Engineering
(3) Intelligent System Group, Department of Electrical and Information Engineering, University of Oulu

Oulu, Finland
[email protected]; [email protected]; [email protected]

Abstract

Emotions are everywhere in our daily interaction. Emotional interchanges among recipients involve coordinated verbal (linguistic) and non-verbal (paralinguistic) cues and kinesic actions (e.g. facial expressions, sighs, laughter, nods, and gestures). This paper explores how speakers jointly achieve emotion exchange and mutual emotion comprehension in conversation, by analyzing linguistic features with the assistance of the Anvil video annotation tool. The data analysis of a positive scenario, complimenting, is carried out by integrating a broader discourse-analytical and sociolinguistic approach with a cognitive approach. An initial emotion labeling scheme is also proposed. Finally, the paper considers the potential of applying our results to emerging emotion-oriented computing for perceiving and discerning emotions in conversation.

Keywords: Emotion expression, emotion annotation, positive emotion, AmE framework

1. Introduction

Emotion is everywhere in our daily interaction. With the rapid advance of computer-based devices and communication technology, e.g. ambient intelligence [1], pervasive computing [2], and ubiquitous computing [3], an emotion-aware ambient intelligence (AmE) framework was proposed in our previous work [4][5] for the instant provision of emotion-aware mobile services (e.g. instant multimedia messages, or IMS, and mobile video and voice call services). These instant and appropriate mobile services react to human emotions, helping to turn negative emotions into positive ones. For example, a real-time maintenance service can dispel a housewife's distress over a broken washing machine through such an emotion-aware service.

To approach this AmE paradigm, basic issues such as emotion modeling, emotion computing, and matching emotions with mobile services need to be addressed. This paper views conversation as one of the major channels for communicating emotion, and it makes a two-fold contribution: (1) an overview of approaches to emotion, oriented towards emotion expression and analysis in English conversation; (2) an experiment on emotion modeling and emotion annotation with the Anvil tool [6].

The remainder of the paper is organized as follows. Section 2 briefly presents the AmE-driven research approach. Section 3 overviews approaches to emotion. Section 4 presents our experiment on emotion annotation with the Anvil tool. Section 5 concludes the paper.

2. AmE-driven research approach

Our approach is driven by the AmE vision [5], aiming to suggest an emotion model for the expression and analysis of emotions in English conversation. The emotion model will be used for analyzing emotional information in artificial or real-life data. Incorporating this model into a pervasive computing environment enables computers to process emotion information and provide corresponding mobile services in order to regulate people's emotions. Figure 1 illustrates the AmE-driven approach to emotion, which consists of three main building blocks:

978-0-7695-3737-5/09 $25.00 © 2009 IEEE. DOI: 10.1109/UIC-ATC.2009.17

Figure 1. AmE-driven approach to emotion in English conversation.

Emotion modeling. Conversation is one of the main channels for communicating emotion, yet the expression and analysis of emotion in English conversation has not been systematically studied. However, a large body of existing research on emotion by biologists [7-9], psychologists, anthropologists [10-12], social constructionists [13], social psychologists, and linguists [14, 15] sets the foundations. We highlight the view that people's behaviors are associated with emotions, and these behaviors correspond with appropriate services (e.g. mobile voice and video calls). In this paper, we take a broad discourse perspective to study the emotion model in English conversation. That includes, e.g., addressing how people jointly achieve mutual emotion comprehension and how emotion perception fails in conversation, through an analysis of the linguistic features of conversation.

Computer-aided emotion analysis and annotation. Emotion expression and analysis in English conversation will be studied by integrating computer technology and existing video annotation technology. The emotion model will be encoded from the emotion annotations of naturally occurring conversation. There are two ways of creating emotion annotations: one by semi-automation, the other by full computer automation. The experiment on emotion annotation in this paper uses semi-automation. Section 4 details the experiment.

Emotion-aware service computing. We assume that appropriate and real-time responses can regulate and mediate people's emotional experiences. Advanced mobile services (e.g. voice call, video call, short message service (SMS), and multimedia message service (MMS)) make this assumption viable. In this block, people's emotional experiences will be regulated by communicating with matched, appropriate mobile services in a pervasive computing environment.

3. Survey on emotion research paradigms

This section surveys approaches to emotion in social science, focusing on the approaches of sociologists, social constructionists, social psychologists, and sociolinguists.

Izard [16] provides evidence that using cognitive processes alone to explain emotion activation is incomplete. Some cognitive theories study emotion in discourse and social life, such as the approach of discursive psychology. Discursive psychology (DP) has been profoundly influenced by conversation analysis (CA), which offered an approach for dealing with interactional materials. Discursive approaches have started to study evaluative expressions in naturally occurring interaction as part of varied social practices, considering what such expressions are doing rather than their relationship to attitudinal objects or other putative mental entities; they study how discourse is situated sequentially and rhetorically [17]. This approach goes beyond the function of emotion signs. The cognitive approach tries to establish what emotion is, while sociolinguistic approaches mainly focus on the relationship between emotion and language. However, studies of the relationship between language, emotion, and cognition are still rare. In what follows, we review several studies on the relationship between emotion and language.

Emotion in discourse has been studied in various social settings. Gottman and Levenson [18] proposed affective reciprocity in emotional interaction by studying audio-recorded marital conversations of dissatisfied couples; Goodwin and Goodwin [19] studied recordings of children's play on the street; and Sandlund [20] studied emotions in feedback sessions of academic talk-in-interaction, in terms of their sequential environment, their interactional elicitors, and their management and closing, using the conversation analytic approach. She studied three main themes: frustration, embarrassment, and enjoyment. Within each, an assortment of practices for doing emotions was found. Frustration was primarily located in the context of violations of activity-specific turn-taking norms. Enjoyment was found to be collaboratively pursued between and within institutional activities. The findings indicate that emotion displays can be viewed as transforming a situated action, opening up alternative trajectories for sequences-in-progress, and also functioning as actions in themselves.

Within sociolinguistics and anthropological linguistics, there is a substantial body of research into the lexicology of emotions (see e.g. [21]). Linguistically, emotions are viewed as lexical entries and refer to valenced internal states or reactions, or expressions of an internal reaction [15]. Chafe [14] emphasizes that emotion is present in everyday conversation; emotion is what gives communication life. Coordination between partners in conversation occurs at many levels, and all of these levels are grounds for emotion. Emotion is thus identified as intersubjective. Emotion is not just something which is known, defined, and judged; it is something which variously holds social situations together or pulls them apart [22].

Emotions can be classified into two types, positive and negative [10]. A rough distinction between these two types has been proposed by the cognitive approach. We can regard them as polarized: there is a dividing line where one type of emotion changes into the other. In everyday language, we express our emotions on a positive-negative scale and in variable magnitudes. Fredrickson [23, 24] has developed a "broaden-and-build" perspective on the value of positive emotions. She maintains that positive emotions are important in that they broaden attention and create situations where cognitive, physical, and social resources can be built. Negative emotions express an attempt or intention to exclude. Negative emotions are fueled by an underlying fear of the unknown, a fear of the actions of others, and a need to control or stop them to avoid being harmed.
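The positive-negative scale with variable magnitudes described above can be sketched as a simple data structure. This is a hypothetical illustration of the idea, not part of the paper's implementation; the class and field names are our own.

```python
from dataclasses import dataclass

@dataclass
class EmotionLabel:
    """An emotion observation on a polarized scale.

    valence: a signed magnitude in [-1.0, 1.0]; the sign encodes the
    positive/negative type, the absolute value its intensity.
    """
    name: str
    valence: float

    def polarity(self) -> str:
        # The dividing line between the two types sits at zero valence.
        if self.valence > 0:
            return "positive"
        if self.valence < 0:
            return "negative"
        return "neutral"

# Example labels placed on the scale
enjoyment = EmotionLabel("enjoyment", 0.8)
frustration = EmotionLabel("frustration", -0.6)
```

Under this sketch, the "dividing line" in the text is simply the zero point of the valence axis.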

4. Experiment

This section presents our experiment on emotion annotation in English conversation (EC). The experiment consists of the following five main steps:

Step 1 - Choosing a tool for encoding the emotion model. Our experiment aims to apply the initially summarized emotion model and theory of English conversation to create emotion annotations on audiovisual (AV) data. In our experiment, we use Anvil, a free video annotation program [28].

Step 2 - Encoding the emotion model. To make use of computer annotation systems, the extracted emotion model needs to be encoded in a computer language. In our experiment, we use XML (eXtensible Markup Language) to specify the emotion annotation from interdisciplinary perspectives. The specification file consists of seven tracks based on emotion cues: emotional lexical words, emotional syntax, prosody, facial expression, gesture, laughter, and sequential positioning. After settling on the seven tracks, we assign each track the forty-eight emotion elements as emotion labels, i.e. emotional lexical words in the data may carry any of the forty-eight emotion labels.
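The seven-track specification above can be sketched programmatically. The element and attribute names below are illustrative only; Anvil's real specification format is defined in its user manual [28], and the three labels shown stand in for the forty-eight in the actual scheme.

```python
import xml.etree.ElementTree as ET

# The seven emotion-cue tracks named in the paper.
TRACKS = [
    "emotional_lexical_words", "emotional_syntax", "prosody",
    "facial_expression", "gesture", "laughter", "sequential_positioning",
]

def build_spec(labels):
    """Build a minimal annotation specification: one element per track,
    each admitting the same set of emotion labels.

    Hypothetical schema - the real Anvil spec file differs.
    """
    root = ET.Element("annotation-spec")
    for track in TRACKS:
        t = ET.SubElement(root, "track", name=track)
        for label in labels:
            ET.SubElement(t, "label", name=label)
    return root

spec = build_spec(["enjoyment", "surprise", "frustration"])
```

The point of the sketch is the structure: every cue channel gets its own track, and the same label inventory is attached to each track.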

Step 3 - Choosing AV data. The occurrence and expression of our emotions cannot be separated from their social settings and environment. As a result, naturally occurring conversation is a privileged setting for studying how our emotions interact. Part of the data in our experiment comes from the videotaped transcript called 'Never in Canada', collected in the Department of English, University of Oulu, in 2003. The rest of the data comes from the audio recordings and transcripts of the Santa Barbara Corpus of Spoken American English (SBCSAE), collected in the Department of Linguistics, University of California at Santa Barbara. Unlike the videotaped 'Never in Canada' data, SBCSAE is an audio-taped recording. Both represent naturally occurring and grammatically standard English. The corpus data were transcribed using the conventions proposed by Du Bois et al. [25]. The data is transcribed into intonation units, i.e. stretches of speech uttered under a single intonation contour, such that each line represents one intonation unit [14].

Step 4 - Preprocessing AV data. Sometimes AV data is too long or does not conform to the formats required by the annotation software. In this situation, the AV data needs to be preprocessed before annotation. In our experiment, the SUPER [26] software is used for format conversion and VirtualDub [27] for splitting AV data.
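SUPER and VirtualDub are interactive tools; the splitting step itself amounts to cutting a recording into annotation-sized clips. A small sketch of how clip boundaries could be computed, purely illustrative and not how the paper's tools were driven:

```python
def clip_boundaries(duration_s: float, clip_len_s: float):
    """Return (start, end) pairs in seconds that partition a recording
    of duration_s seconds into clips of at most clip_len_s seconds."""
    if clip_len_s <= 0:
        raise ValueError("clip length must be positive")
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + clip_len_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds

# A 141-second recording split into 60-second clips:
clips = clip_boundaries(141.0, 60.0)
# -> [(0.0, 60.0), (60.0, 120.0), (120.0, 141.0)]
```

Each resulting (start, end) pair would then be cut out of the source file with whatever splitting tool is at hand.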

Step 5 - Creating and reporting the annotation. After designing the annotation scheme and preprocessing the AV data, a new annotation can be created. In our experiment, Anvil first asks you to open the AV data you want to annotate. Second, Anvil asks you to open the emotion specification file. Third, Anvil provides four main windows [28]. Our annotation is set out as in Table 1, in which we annotate the emotion flow of each recipient from its linguistic and paralinguistic features. The data extract is from 'Never in Canada', a 2.21-minute narration. Jason is the storyteller. Jason's narration starts at 5.04 minutes into the whole conversation, and it spans more than 120 intonation units in this 2.21-minute episode. We are unable to provide the transcripts in this paper due to length limitations.

Table 1. Emotion inter-correlation between Jason's narration and Mary's emotion

Jason's narration: 1. Prelude: eager encouragement and curiosity from two recipients
Mary's emotion:    1. Reactive: surprise, curiosity

Jason's narration: 2. Preface and development: Jason's willingness to share his story with the recipients; linguistic features of Jason's expression: vivid lexical choices, rising tone and lengthened prosody
Mary's emotion:    2. Positive reaction: tease, acceptance, excitement (excited laughter)

Jason's narration: 3. Climax: dramatized reiteration
Mary's emotion:    3. Lively positive: compliment

Jason's narration: 4. Denouement: self-evaluation of the whole story
Mary's emotion:    4. Empathy: speaking her own voice

In this episode, Jason tells his friends in what situation and how he told people that he was Canadian. While he and some other exchange students were waiting for a taxi in a long queue at four in the morning at minus twenty degrees, he shouted out: <VOX this is the dumbest, fucking thing, I have ever seen, in my entire life VOX>, and then <VOX no offense, (0.7) we just don't do that, in Canada VOX>. Afterwards, they walked up the road and hailed a taxi instead of waiting for their turn in the gigantic queue. Jason's story draws excited laughter and compliments from the other two recipients. To attract the other recipients to affiliate with his story, he needs to perceive and discern their emotions and stance, as well as make his narration attractive. He is skilful at telling the story, using direct speech, indirect speech, verbatim quotes, self-commentary, and evaluation. Importantly, emotion interchanges with the other two recipients are intertwined in his narration. We classify the narrative interaction into four stages, in terms of the emotion flow: prelude, preface and development, climax, and denouement with evaluation.

Of the two recipients, Mary and Sophie, Sophie has heard the story before, and she encourages Jason to re-tell the story to Mary. Since the story is still new to Mary, she seems to evaluate the story more; however, Sophie always affiliates with Mary. We use Anvil to track these three speakers' emotions, recording the start and end time of each emotion expression. After annotating each speaker's emotions, we save these annotations into a table for comparing the recipients' mutual emotional interaction. The analysis suggests that the positive emotion and the compliments of the recipients help push Jason's narration to its climax.
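The comparison step, relating each speaker's time-stamped emotion spans to another speaker's, can be sketched as follows. This is a hypothetical illustration of the data handling; Anvil's own export format is not reproduced here, and the toy labels merely echo Table 1.

```python
from typing import NamedTuple

class EmotionSpan(NamedTuple):
    speaker: str
    label: str
    start: float  # seconds from the start of the clip
    end: float

def overlapping(a: EmotionSpan, b: EmotionSpan) -> bool:
    """True when two annotated spans overlap in time."""
    return a.start < b.end and b.start < a.end

def co_occurrences(spans, speaker_a, speaker_b):
    """Pair up temporally overlapping emotion spans of two speakers,
    to inspect how one participant's emotion display relates to the
    other's (e.g. Jason's narration vs. Mary's reactions)."""
    a_spans = [s for s in spans if s.speaker == speaker_a]
    b_spans = [s for s in spans if s.speaker == speaker_b]
    return [(a.label, b.label)
            for a in a_spans for b in b_spans if overlapping(a, b)]

# Toy annotations in the spirit of Table 1:
spans = [
    EmotionSpan("Jason", "dramatized reiteration", 60.0, 90.0),
    EmotionSpan("Mary", "compliment", 85.0, 95.0),
    EmotionSpan("Mary", "surprise", 0.0, 10.0),
]
pairs = co_occurrences(spans, "Jason", "Mary")
# -> [("dramatized reiteration", "compliment")]
```

Tabulating such co-occurring label pairs is one simple way to make the "emotion inter-correlation" of Table 1 inspectable by a program.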

Our finding is similar to theories of emotion coincidence, emotion contagion, and empathy in cognitive science and sociology: the expression of an emotional state in one person often leads to the experience or expression of a similar emotion in another person [14].

From the generated annotation tracks, we obtain the emotion inter-correlations between Jason's narration and Mary's emotion, as seen in Table 1. We find that Mary's emotion deepens and becomes more and more positive as Jason's story progresses; their emotions interact well. In the prelude of the story, Mary is reactive (surprise). Then, in the preface and development stage, she becomes quite positive, and her laughter is a sign of acceptance that offers the storyteller, Jason, a relaxing and encouraging atmosphere. Following the laughter, she gives her evaluation, "Nice", and in the climax of Jason's narrative, her emotion becomes lively positive, which is shown by her compliments and empathy: "That's funny. Yeah I guess I wouldn't stand in a line like that..."

The above analysis shows that the emotion flow of Jason's narration mirrors Mary's emotion flow well. Jason's narration could not achieve such a positive effect without Mary's collaboration, and the recipients show different degrees of affiliation with each other during the interaction.

The result is also consistent with the cognitive finding that positive emotions result from goal congruence and produce more creative and variable actions [29]. The annotation tool can help us understand how people jointly achieve emotion comprehension, and why, where, and how assessments or evaluations occur and what is evaluated in conversation.

5. Conclusion and implications

Emotions are pervasive in human behaviour. Pervasive emotional behaviors can be identified and then regulated by mobile service provisioning in a pervasive computing environment. Within the AmE vision, this paper extracts a model of emotion expression and analysis in English conversation. A survey of emotion approaches from an interdisciplinary perspective was presented, and an initial experiment was conducted on emotion scheme specification and emotion annotation. Some lessons were learned, e.g. in using the Anvil annotation software and in selecting emotion-intensive data. Future work will extend the emotion labelling, further study the model of emotion expression and analysis in English conversation, and enhance our empirical study with more case analyses.

Acknowledgement

This work was carried out in the Ubiquitous Computing and Diversity of Communication (MOTIVE) project, funded by the Academy of Finland's Research Programme. Special thanks to Dr. Elise Kärkkäinen for reviewing the paper.

References

[1] "Special issue on automation and engineering for ambient intelligence," IEEE Transactions on Automation Science and Engineering, vol. 4, p. 295, 2007.
[2] "IEEE Pervasive Computing Call for Papers," IEEE Pervasive Computing, vol. 7, p. c4, 2008.
[3] J. Riekki, J. Huhtinen, P. Ala-Siuru, P. Alahuhta, J. Kaartinen and J. Roning, "Genie of the Net, an Agent Platform for Managing Services on Behalf of the User," Computer Communications Journal, Special Issue on Ubiquitous Computing, vol. 26, pp. 1188-1198, 2003.
[4] J. Zhou, C. Yu, J. Riekki and E. Kärkkäinen, "AmE Framework: a Model for Emotion-aware Ambient Intelligence," in The Second International Conference on Affective Computing and Intelligent Interaction (ACII2007), Lisbon, Portugal, 2007.
[5] C. Yu and J. Zhou, "Research on service-mediated emotion computing and communication," in The First Finnish Symposium on Emotions and Human Technology Interaction, May 2008, pp. 48-52.
[6] "ANVIL - the video annotation research tool."
[7] C. Darwin, The Expression of Emotion in Man and Animals. IndyPublish, 1872.
[8] W. James, "What is an Emotion?" Mind, vol. 9, pp. 188-205, 1884.
[9] S. Freud, Beyond the Pleasure Principle. New York: Norton, 1975.
[10] M. Lewis and J. M. Haviland, Eds., Handbook of Emotions. New York: The Guilford Press, 1993.
[11] C. Lutz, Unnatural Emotions: Everyday Sentiments on a Micronesian Atoll and their Challenge to Western Theory. Chicago: University of Chicago Press, 1988.
[12] R. Cornelius, The Science of Emotions. Upper Saddle River, NJ: Prentice-Hall, 1996.
[13] R. Plutchik and H. Kellerman, Eds., Emotion: Theory, Research and Experience, vol. 1. New York: Academic Press, 1980.
[14] W. Chafe, Discourse, Consciousness, and Time: The Flow and Displacement of Conscious Experience in Speaking and Writing. Chicago: University of Chicago Press, 1994.
[15] N. L. Stein, R. S. Bernas and D. Calicchia, "Conflict talk: Understanding and resolving arguments," in Conversation: Cognitive, Communicative and Social Perspectives, T. Givón, Ed. Amsterdam: John Benjamins, 1996.
[16] C. E. Izard, "Four systems for emotion activation: Cognitive and noncognitive processes," Psychological Review, vol. 100, pp. 68-90, 1993.
[17] S. Wiggins and J. Potter, "Attitudes and evaluative practices: Category vs. item and subjective vs. objective constructions in everyday food assessments," British Journal of Social Psychology, vol. 42, pp. 513-531, 2003.
[18] J. M. Gottman and R. W. Levenson, "A valid procedure for obtaining self-report of affect in marital interaction," Journal of Consulting and Clinical Psychology, vol. 53, pp. 151-160, 1985.
[19] M. H. Goodwin and C. Goodwin, "Emotion within situated activity," in Communication: An Arena of Development, N. Budwig, I. Uzgiris and J. Wertsch, Eds. Stamford: Ablex Publishing Corporation, 2000, pp. 33-53. Available: http://www.sscnet.ucla.edu/clic/cgoodwin/00emot_act.pdf
[20] E. Sandlund, Feeling by Doing: The Social Organization of Everyday Emotions in Academic Talk-in-Interaction. Karlstad University, 2004.
[21] J. A. Russell, "Culture and the Categorization of Emotions," Psychological Bulletin, vol. 110, pp. 426-450, 1991.
[22] R. Collins, "Stratification, emotional energy, and the transient emotions," in Research Agendas in the Sociology of Emotions, T. D. Kemper, Ed. New York: SUNY Press, 1990, pp. 27-57.
[23] B. L. Fredrickson, "What good are positive emotions?" Review of General Psychology: Special Issue: New Directions in Research on Emotion, vol. 2, pp. 300-319, 1998.
[24] B. L. Fredrickson and T. Joiner, "Positive emotions trigger upward spirals toward emotional well-being," Psychological Science, vol. 13, pp. 172-175, 2002.
[25] J. W. Du Bois et al., "Outline of discourse transcription," in Talking Data: Transcription and Coding in Discourse Research, J. A. Edwards and M. D. Lampert, Eds. Hillsdale, NJ: Erlbaum, 1993, pp. 45-89.
[26] SUPER, http://www.erightsoft.com/SUPER.html. Retrieved 5 April 2009.
[27] VirtualDub, http://www.afterdawn.com/guides/archive/cut_avi_with_virtualdub.cfm. Retrieved 5 April 2009.
[28] M. Kipp, "Anvil 4.0: Annotation of video and spoken language, user manual," University of the Saarland, German Research Center for Artificial Intelligence, Germany, 2003.
[29] B. E. Kahn and A. M. Isen, "The influence of positive affect on variety seeking among safe, enjoyable products," Journal of Consumer Research, vol. 20, pp. 257-270, 1993.