Expression and Analysis of Emotions: Survey and Experiment
Changrong Yu1, Jiehan Zhou2, Jukka Riekki3
1University of Oulu, English Philology 2University of Oulu, Department of Electrical and Information Engineering
3Intelligent System Group, Department of Electrical and Information Engineering, University of Oulu
Oulu, Finland [email protected]; [email protected]; [email protected]
Abstract
Emotions are everywhere in our daily interaction.
Emotional interchanges among participants involve
coordinated verbal (linguistic) and non-verbal
(paralinguistic) cues and kinesic actions (e.g. facial
expression, sighs, laughter, nods, and gestures). The
paper explores how speakers jointly achieve
emotion exchange and mutual emotion comprehension in
conversation by analyzing linguistic features with the
assistance of the Anvil video annotation tool. The
data analysis of a positive scenario, complimenting, is
carried out by integrating a broader discourse-analytical
and sociolinguistic approach with a cognitive approach.
An initial emotion labeling scheme is also proposed.
Finally, the paper discusses the potential of applying
our results to emerging emotion-oriented computing for
perceiving and discerning emotions in conversation.
Keywords: Emotion expression, emotion annotation,
positive emotion, AmE framework
1. Introduction
Emotion is everywhere in our daily interaction.
With the rapid advance of computer-based devices
and communication technology, e.g. ambient
intelligence [1], pervasive computing [2], and
ubiquitous computing [3], an emotion-aware ambient
intelligence (AmE) framework was proposed in our
previous work [4][5] for instant provision of
emotion-aware mobile services (e.g. instant
multimedia message or IMS, mobile services of video
and voice calls). These instant and appropriate mobile
services react to human emotions, helping to turn
negative emotions into positive ones. For example, a
real-time maintenance service can dispel a
housewife’s distress over a broken washing machine.
To approach this AmE paradigm, basic issues such
as emotion modeling, emotion computing, and
matching emotions with mobile services need to be
addressed. This paper takes the view that conversation
is one of the major channels for communicating
emotion, and it makes a two-fold contribution: (1) an
overview of approaches to emotion, oriented towards
emotion expression and analysis in English
conversation; (2) an experiment on emotion modeling
and emotion annotation with the Anvil tool [6].
The remainder of the paper is organized as follows.
Section 2 briefly presents an AmE-driven research
approach. Section 3 overviews approaches to
emotion. Section 4 presents our experiment on
emotion annotation with the Anvil tool. Section 5
concludes the paper.
2. AmE-driven research approach
Our approach is driven by the AmE vision [5],
aiming to suggest an emotion model for expression
and analysis of emotions in English conversation. An
emotion model will be used for analyzing emotional
information in artificial or real-life data. Incorporating
this model into a pervasive computing environment
enables computers to process emotion information
and provide corresponding mobile services in order to
Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing
978-0-7695-3737-5/09 $25.00 © 2009 IEEE
DOI 10.1109/UIC-ATC.2009.17
428
regulate people’s emotions. Figure 1 illustrates the
AmE-driven emotion approach, which consists of
three main building blocks:
Figure 1. AmE-driven approach to emotion in
English conversation.
Emotion modeling. Conversation is one of the
main channels for communicating emotion. The
expression and analysis of emotion in English
conversation has not been systematically studied.
However, a large body of existing research on emotion
by biologists [7-9], psychologists, anthropologists
[10-12], social constructionists [13], social
psychologists, and linguists [14, 15] lays the foundations.
We highlight the view that people’s behaviors are
associated with emotions, and these behaviors
correspond with appropriate services (e.g. mobile
voice and video call). In this paper, we take a broad
discourse perspective to study the emotion model in
English conversation. This includes, for example,
addressing how people jointly achieve mutual emotion
comprehension, and how emotion perception fails,
by analyzing the linguistic features of conversation.
Computer-aided emotion analysis and
annotation. The emotion expression and analysis in
English conversation will be studied by integrating
computer technology and existing video annotation
technology. The emotion model will be encoded from
the emotion annotations of naturally occurring
conversation. There are two ways of creating emotion
annotations: semi-automatically and fully
automatically. The experiment on emotion annotation
in this paper uses the semi-automatic approach.
Section 4 gives the details of the experiment.
Emotion-aware service computing. We assume
that appropriate and real-time responses can regulate
and mediate people’s emotion experiences. Advanced
mobile services (e.g. voice call, video call, short
message service (SMS), and multimedia message service
(MMS)) make this assumption viable. In this block,
people’s emotional experiences will be regulated by
communicating with matched appropriate mobile
services in a pervasive computing environment.
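The matching step of this block can be sketched as a lookup from a detected emotion label to a candidate mobile service. The labels, services, and function below are our own illustrative assumptions, not part of the published AmE framework:

```python
# Hypothetical sketch of the AmE matching step: a detected (negative)
# emotion label is mapped to a candidate mobile service. Both the
# labels and the services are illustrative assumptions.

SERVICE_MAP = {
    "distress": "real-time maintenance voice call",
    "boredom": "multimedia message (MMS) with entertainment content",
    "sadness": "video call with a family member",
}

def match_service(emotion: str) -> str:
    """Return a candidate mobile service for a detected emotion label."""
    return SERVICE_MAP.get(emotion, "no intervention (neutral or positive state)")

print(match_service("distress"))  # real-time maintenance voice call
```

In a full AmE deployment this table would be replaced by context-aware service discovery; the sketch only shows the emotion-to-service direction of the mapping.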
3. Survey on emotion research paradigms
This section surveys approaches to emotion in
social science, focusing on approaches by
sociologists, social constructionists, social
psychologists, and sociolinguists.
Izard [16] provides evidence that using cognitive
processes alone to explain emotion activation is
incomplete. Some cognitive theories study emotion in
discourse and social life, such as the approach of
discursive psychology. Discursive psychology (DP)
has been profoundly influenced by conversation
analysis (CA), which offers an approach for dealing
with interactional materials. Discursive approaches
have started to study evaluative expressions in
naturally occurring interaction as part of varied social
practices, considering what such expressions are
doing rather than their relationship to attitudinal
objects or other putative mental entities; they study
how discourse is situated sequentially and rhetorically
[17]. This approach goes beyond the function of
emotion signs. The cognitive approach tries to answer
what emotion is, while sociolinguistic approaches
mainly focus on the relationship between
emotion and language. However, studies of the
relationship between language, emotion, and
cognition are still rare. In what follows, we review
several studies on the relationship between emotion
and language.
Emotion in discourse has been studied in various
social settings. Gottman and Levenson [18] proposed
affective reciprocity in emotional interaction by
studying audio-recorded marital conversations of
dissatisfied couples, and Goodwin and Goodwin [19]
analyzed recordings of children’s play on the street.
Sandlund [20] studied emotions in academic
talk-in-interaction in terms of sequential environment,
their interactional elicitors and their management and
closing by using the conversation analytic approach.
She studied three main themes: frustration,
embarrassment, and enjoyment. Within each,
assortments of practices for doing emotions were
found. Frustration was primarily located in the
context of violations of activity-specific turn-taking
norms. Enjoyment was found to be collaboratively
pursued between and within institutional activities.
The findings indicate that emotion displays can be
viewed as transforming a situated action, opening up
alternative trajectories for sequences-in-progress, and
also functioning as actions in themselves.
Within sociolinguistics and anthropological
linguistics, there is a substantial body of research into
the lexicology of emotions (see e.g. [21]).
Linguistically, emotions are viewed as lexical entries
and refer to valenced internal states or reactions, or
expressions of an internal reaction [15]. Chafe [14]
emphasizes that emotion is present in everyday
conversation. Emotion is what gives communication
life. Coordination between partners in conversation
occurs at many levels, and they are all grounds for
emotion. Emotion is thus identified as intersubjective.
Emotion is not just something which is known,
defined and judged. It is something which variously
holds social situations together or pulls them apart
[22].
Emotions can be classified into two types, positive
emotion and negative emotion [10]. This rough
distinction has been proposed by the cognitive
approach. The two types can be regarded as polarized,
with a dividing line where one type of emotion
changes into the other. In everyday language, we
express our emotions on a positive-negative scale and
in variable magnitudes. Fredrickson [23, 24] has
developed a “broaden-and-build” perspective on the
value of positive emotions. She maintains that
positive emotions are important in that they broaden
attention and create situations where cognitive,
physical, and social resources can be built. Negative
emotions express an attempt or intention to exclude.
Negative emotions are fueled by an underlying fear of
the unknown, a fear of the actions of others, and a
need to control them or stop them to avoid being
harmed.
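The positive-negative scale with variable magnitude described above admits a minimal data representation. The field names and the cut-off at zero valence below are our own illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class EmotionState:
    """Minimal sketch of an emotion on a polarized valence scale."""
    label: str        # e.g. "enjoyment", "frustration"
    valence: float    # -1.0 (most negative) .. +1.0 (most positive)
    magnitude: float  # 0.0 (faint) .. 1.0 (intense)

    @property
    def polarity(self) -> str:
        # The dividing line between the two emotion types sits at valence 0.
        return "positive" if self.valence > 0 else "negative"

print(EmotionState("enjoyment", 0.8, 0.6).polarity)     # positive
print(EmotionState("frustration", -0.5, 0.9).polarity)  # negative
```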
4. Experiment
This section presents our experiment on emotion
annotation in English conversation (EC). The
experiment consists of the following five main steps:
Step 1 - Choosing a tool for encoding emotion
model. Our experiment aims to apply the initially
summarized emotion model and theory to English
conversation in order to create emotion annotations
from audiovisual (AV) data. In our experiment, we use
Anvil, a free video annotation program [28].
Step 2 - Encoding emotion model. To make use of
computer annotation systems, the extracted emotion
model needs to be encoded with a certain computer
language. In our experiment, we use XML
(eXtensible Markup Language) to specify the
emotion annotation from interdisciplinary
perspectives. The specified emotion file consists of
seven tracks based on the emotion cues of emotional
lexical words, emotional syntax, prosody, facial
expression, gesture, laughter, and sequential
positioning. After defining the seven tracks, we
assign each track the forty-eight emotion elements as
emotion labels, i.e. an emotional lexical word in the
data may carry any of the forty-eight emotion labels.
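A simplified stand-in for such a specification file can be generated programmatically. The element and attribute names below are illustrative and do not reproduce Anvil's actual specification schema; only a handful of the forty-eight labels are shown:

```python
import xml.etree.ElementTree as ET

# Seven tracks as named in the text; a few illustrative labels stand in
# for the full set of forty-eight emotion elements.
TRACKS = ["emotional-lexical-words", "emotional-syntax", "prosody",
          "facial-expression", "gesture", "laughter",
          "sequential-positioning"]
LABELS = ["joy", "surprise", "curiosity", "frustration", "embarrassment"]

spec = ET.Element("annotation-spec")
for track_name in TRACKS:
    track = ET.SubElement(spec, "track", name=track_name)
    value_set = ET.SubElement(track, "value-set")
    for label in LABELS:
        ET.SubElement(value_set, "value").text = label

xml_text = ET.tostring(spec, encoding="unicode")
print(xml_text.count("<track"))  # 7
```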
Step 3 - Choosing AV data. The occurrence and
expression of our emotions cannot be separated from
their social settings and environment. Naturally
occurring conversation is therefore a particularly
suitable setting for understanding how our
emotions interact. The available data in our
experiment comes from the videotaped transcript
called ‘Never in Canada’ collected in the Department
of English, University of Oulu, in 2003. Some of the
data come from the audio-recording and transcripts of
‘the Santa Barbara Corpus of Spoken American
English, or SBCSAE in short, collected in the
Department of Linguistics, University of California at
Santa Barbara. Different from the videotaped data
‘Never in Canada’, SBCSAE is an audio-taped
recording. They both represent naturally-occurring
and grammatically standard English. The corpus data
were transcribed using the conventions proposed in
Du Bois et al. [25]. The data is transcribed into
intonation units, or stretches of speech uttered under a
single intonation contour, such that each line
represents one intonation unit [14].
Step 4 - Preprocessing AV data. Sometimes AV
data is too long or does not conform to the required
formats of the annotation software. In this situation,
AV data needs to be preprocessed before making
annotation. In our experiment, SUPER [26] software
is used for format conversion. VirtualDub [27] is used
for splitting AV data.
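SUPER and VirtualDub are GUI tools; an equivalent command-line preprocessing step could also be scripted. The use of ffmpeg, the file names, and the cut points below are our own illustrative assumptions, not part of the original experiment:

```python
# Build (but do not run) an ffmpeg command that copies a time slice out
# of a longer AV file without re-encoding; pass the resulting list to
# subprocess.run() when ffmpeg is installed.

def split_command(src: str, dst: str, start_s: float, duration_s: float) -> list:
    """ffmpeg invocation cutting [start_s, start_s + duration_s) from src."""
    return ["ffmpeg", "-ss", str(start_s), "-i", src,
            "-t", str(duration_s), "-c", "copy", dst]

# Hypothetical cut: a roughly 140-second episode starting about five
# minutes into the recording.
cmd = split_command("never_in_canada.avi", "episode1.avi", 304.0, 141.0)
print(" ".join(cmd))
```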
Step 5 - Creating and reporting annotation.
After designing the annotation scheme and
preprocessing the AV data, a new annotation can be
created. In our experiment, Anvil first asks the user
to open the AV data to be annotated, and then to open
the emotion specification file; Anvil then provides
four main windows [28]. Our annotation is summarized
in Table 1, in which we annotate the emotion flow
of each recipient from its linguistic features and
paralinguistic features. The data extract is from
‘Never in Canada’, which is a 2.21-minute narration.
Jason is the story-teller. Jason’s narration starts at
5.04 minutes into the whole conversation, and it
spans more than 120 intonation units in this
2.21-minute episode. We are unable to provide the
transcript in this paper due to length limitations.
Table 1. Emotion inter-correlation between
Jason’s narration and Mary’s emotion

Emotion flow of Jason’s narration | Emotion flow of Mary
1. Prelude: eager encouragement and curiosity from the two recipients | 1. Reactive: surprise, curiosity
2. Preface and development: Jason’s willingness to share his story with the recipients; linguistic features of Jason’s expression: vivid lexical choices, rising tone, and lengthened prosody | 2. Positive reaction: tease, acceptance, excitement (excited laughter)
3. Climax: dramatized reiteration | 3. Lively positive: compliment
4. Denouement: self-evaluation of the whole story | 4. Empathy: speaking her own voice
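Once such annotations are exported, per-speaker emotion spans can be collected into a comparison table automatically. The XML layout below is a simplified assumption for illustration, not Anvil's native export format, and the span data are invented rather than taken from the corpus:

```python
import xml.etree.ElementTree as ET

# Invented sample in a simplified, hypothetical export layout.
SAMPLE = """
<annotation>
  <track name="Jason">
    <el start="0.0" end="12.5" label="eagerness"/>
    <el start="12.5" end="80.0" label="excitement"/>
  </track>
  <track name="Mary">
    <el start="1.0" end="13.0" label="surprise"/>
    <el start="13.0" end="82.0" label="acceptance"/>
  </track>
</annotation>
"""

def spans_by_speaker(xml_text: str) -> dict:
    """Collect (label, start, end) emotion spans per speaker track."""
    root = ET.fromstring(xml_text)
    return {
        track.get("name"): [
            (el.get("label"), float(el.get("start")), float(el.get("end")))
            for el in track.iter("el")
        ]
        for track in root.iter("track")
    }

table = spans_by_speaker(SAMPLE)
print(table["Mary"][0])  # ('surprise', 1.0, 13.0)
```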
In this episode, Jason tells his friends in which
situation and how he told people that he was
Canadian. While he and some other exchange
students were waiting for a taxi in a long queue at
four in the morning at minus twenty degrees, he
shouted out: <VOX this is the dumbest, fucking thing,
I have ever seen, in my entire life VOX>, and then
<VOX no offense, (0.7) we just don't do that, in
Canada VOX>. Afterwards, they walked up the road
and hailed a taxi instead of waiting for their turn in
the gigantic queue. Jason’s story earns excited
laughter and compliments from the other two
recipients. To attract the other recipients to affiliate
with his story, he needs to perceive and discern their
emotions and stances, as well as make his narration
attractive. He is skilful in storytelling, using direct speech,
indirect speech, verbatim quotes and self-commentary
as well as evaluation. Importantly, emotion
interchanges with the other two recipients are
intertwined in his narration. We classify the narrative
interaction into four stages: prelude, preface and
development, climax, and denouement with
evaluation, in terms of the emotion flow.
Of the two recipients, Mary and Sophie, Sophie
has heard the story before, and she encourages Jason
to retell it to Mary. Since the story is still new
to Mary, she seems to evaluate it more.
Sophie, however, always affiliates with Mary. We use
Anvil to track these three speakers’ emotions, and the
starting time and the end time of the emotion
expression were recorded. After annotating each
speaker’s emotion, we saved these annotations
into a table for comparing the mutual emotional
interaction of the recipients. The analysis suggests
that the positive emotion and the compliments of the
recipients help push Jason’s narration to the climax.
Our finding is consistent with theories of emotion
coincidence, emotion contagion, and empathy in
cognitive science and sociology. The expression of an
emotional state in one person often leads to the
experience or expression of a similar emotion in
another person [14].
From the generated annotation track, we obtain
emotion inter-correlations between Jason’s narration
and Mary’s emotion, as seen in Table 1. We find
that Mary’s emotion deepens and becomes
increasingly positive as Jason’s story progresses.
Their emotions interact well. In the
prelude of the story, Mary is reactive (surprise), then
in the stage of preface and development, she becomes
quite positive, then her laughter is a sign of
acceptance and offers the story-teller, Jason, a
relaxing and encouraging atmosphere. Following the
laughter, she gives her evaluation, “Nice”, and in the
climax of Jason’s narrative, her emotion becomes
lively positive, which is shown by her compliments
and empathy: “That’s funny. Yeah I guess I wouldn’t stand
in a line like that...”
The above analysis shows that the emotion flow
of Jason’s narration closely mirrors Mary’s emotion
flow. Jason’s narration could not have achieved such a
positive effect without Mary’s collaboration, and the
recipients show different degrees of affiliation with
each other during the interaction.
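The degree of affiliation between recipients can also be quantified from the recorded start and end times, for example as the total time during which two speakers' positive-emotion spans overlap. The spans below are invented for illustration, not taken from the 'Never in Canada' annotations:

```python
def overlap(a, b):
    """Length in seconds of the intersection of two (start, end) spans."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def total_overlap(spans_a, spans_b):
    """Total time during which any span of A co-occurs with any span of B."""
    return sum(overlap(a, b) for a in spans_a for b in spans_b)

# Hypothetical positive-emotion spans (in seconds) for two speakers.
jason = [(10.0, 40.0), (60.0, 90.0)]   # e.g. preface, climax
mary = [(12.0, 38.0), (62.0, 95.0)]    # e.g. laughter, compliments

print(total_overlap(jason, mary))  # 54.0
```

A high overlap relative to total span length would correspond to the close emotional mirroring observed between Jason and Mary above.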
The result is also consistent with the cognitive
finding that positive emotions result from goal
congruence and produce more creative and variable
actions [29]. The annotation tool can help us
understand how people jointly achieve emotion
comprehension, why assessments or evaluations
occur, where they occur, how they occur, and what is
evaluated in conversation.
5. Conclusion and implications
Emotions are pervasive in human behaviours.
Pervasive emotional behaviours can be identified and
then regulated by mobile service provisioning in a
pervasive computing environment. Within the AmE
vision, this paper extracts a model of emotion
expression and analysis in English conversation. A
survey was presented on emotion approaches from an
interdisciplinary perspective, and an initial
experiment was conducted on emotion scheme
specification and emotion annotation. Some lessons
were learned, e.g. in using the Anvil annotation
software and selecting emotion-intensive data. Future
work will extend the emotion labelling, further study
the model of emotion expression and analysis in
English conversation, and enhance our empirical
study with more case analyses.
Acknowledgement
This work was carried out in the Ubiquitous
Computing and Diversity of Communication
(MOTIVE) programme funded by the Academy of
Finland. Special thanks to Dr. Elise Kärkkäinen for
reviewing the paper.
References
[1] Anonymous "Special issue on automation and
engineering for ambient intelligence," Automation Science
and Engineering, IEEE Transactions on, vol. 4, pp.
295-295, 2007.
[2] Anonymous "IEEE Pervasive Computing Call for
Papers," Pervasive Computing, IEEE, vol. 7, pp. c4-c4,
2008.
[3] J. Riekki, J. Huhtinen, P. Ala-Siuru, P. Alahuhta, J.
Kaartinen and J. Roning, "Genie of the Net, an Agent
Platform for Managing Services on Behalf of the User,"
Computer Communications Journal, Special Issue on
Ubiquitous Computing, vol. 26, pp. 1188-1198, 2003.
[4] J. Zhou, C. Yu, J. Riekki and E. Kärkkäinen, "AmE
Framework: a Model for Emotion-aware Ambient
Intelligence," in The Second International Conference
on Affective Computing and Intelligent Interaction
(ACII 2007), Lisbon, Portugal, 2007.
[5] C. Yu and J. Zhou, "Research on service-mediated
emotion computing and communication," in The First
Finnish Symposium on Emotions and Human
Technology Interaction, May 2008, pp. 48-52.
[6] Anonymous, "ANVIL - the video annotation
research tool."
[7] C. Darwin, The Expression of the Emotions in Man
and Animals. Indy Publish, 1872.
[8] W. James, "What is an Emotion?" Mind, vol. 9, pp.
188-205, 1884.
[9] S. Freud, Beyond the Pleasure Principle. New York:
Norton, 1975.
[10] M. Lewis and J. M. Haviland, Eds., Handbook of
Emotions. New York: The Guilford Press, 1993.
[11] C. Lutz, Unnatural Emotions: Everyday Sentiments
on a Micronesian Atoll and their Challenge to Western
Theory. Chicago: University of Chicago Press, 1988.
[12] R. Cornelius, The Science of Emotions. Upper Saddle
River, NJ: Prentice-Hall, 1996.
[13] R. Plutchik and H. Kellerman, Eds., Emotion:
Theory, Research and Experience, vol. 1. New York:
Academic Press, 1980.
[14] W. Chafe, Discourse, Consciousness, and Time: The
Flow and Displacement of Conscious Experience in
Speaking and Writing. Chicago: University of Chicago
Press, 1994.
[15] N. L. Stein, R. S. Bernas, and D. Calicchia, "Conflict
talk: Understanding and resolving arguments," in
Conversation: Cognitive, Communicative and Social
Perspectives, T. Givón, Ed. Amsterdam: John Benjamins,
1996.
[16] C. E. Izard, "Four systems for emotion activation:
Cognitive and noncognitive processes," Psychological
Review, vol. 100, pp. 68-90, 1993.
[17] S. Wiggins and J. Potter, "Attitudes and
evaluative practices: Category vs. item and subjective vs.
objective constructions in everyday food assessments,"
British Journal of Social Psychology, vol. 42, pp. 513-531,
2003.
[18] J. M. Gottman and R. W. Levenson, "A valid
procedure for obtaining self-report of affect in marital
interaction," Journal of Consulting and Clinical
Psychology, vol. 53, pp. 151-160, 1985.
[19] M. H. Goodwin and C. Goodwin, "Emotion within
situated activity," in Communication: An Arena of
Development, N. Budwig, I. Uzgiris, and J. Wertsch, Eds.
Stamford: Ablex Publishing Corporation, 2000, pp. 33-53.
Available: http://www.sscnet.ucla.edu/clic/cgoodwin/00emot_act.pdf
[20] E. Sandlund, Feeling by Doing: The Social
Organization of Everyday Emotions in Academic
Talk-in-Interaction. Karlstad University, 2004.
[21] J. A. Russell, "Culture and the Categorization of
Emotions," Psychological Bulletin, vol. 110, pp. 426-450,
1991.
[22] R. Collins, "Stratification, emotional energy, and the
transient emotions," in Research Agendas in the Sociology
of Emotions Kemper T. D., Ed. New York: SUNY Press,
1990, pp. 27-57.
[23] B. L. Fredrickson, "What good are positive
emotions?" Review of General Psychology: Special Issue:
New Directions in Research on Emotion, vol. 2, pp.
300–319, 1998.
[24] B. L. Fredrickson and T. Joiner, "Positive emotions
trigger upward spirals toward emotional well-being,"
Psychological Science, vol. 13, pp. 172–175, 2002.
[25] J. W. Du Bois, S. Schuetze-Coburn, S. Cumming, and
D. Paolino, "Outline of discourse transcription," in Talking
Data: Transcription and Coding in Discourse Research,
J. A. Edwards and M. D. Lampert, Eds. Hillsdale, NJ:
Erlbaum, 1993, pp. 45-89.
[26] SUPER, http://www.erightsoft.com/SUPER.html,
retrieved 5 April 2009.
[27] VirtualDub,
http://www.afterdawn.com/guides/archive/cut_avi_with_virtualdub.cfm,
retrieved 5 April 2009.
[28] M. Kipp, "Anvil 4.0: Annotation of Video and
Spoken Language, User Manual," University of the
Saarland, German Research Center for Artificial
Intelligence, Germany, 2003.
[29] B. E. Kahn and A. M. Isen, "The influence of positive
affect on variety seeking among safe, enjoyable products,"
Journal of Consumer Research, vol. 20, pp. 257-270, 1993.