Russian multimodal corpora
Andrej A. Kibrik (Inst. of Linguistics RAN and MSU)
2
Multimodality
Traditional linguistic approach: language = verbal material
Multimodal approach: linguistic communication involves several modes, or channels
Apart from the verbal mode, also:
• non-segmental sound (= prosody)
• visual mode (= “body language”)
These modes are no less important for linguistic communication than the traditional verbal mode
“Any use of language is inescapably multimodal” (Scollon 2006)
In this talk:
I. Corpora annotated for prosody
II. Corpora annotated for gesticulation and prosody
3
I. CORPORA ANNOTATED FOR PROSODY
• Night Dream Stories
• Siberian Life Stories
• Funny Life Stories
This work is currently supported by the Russian Academy of Sciences project “Corpus Linguistics”: http://www.corpling-ran.ru/
This online service has been created: http://mib431.ru/corpus/#
4
Night Dream Stories
Authors: E.A.Korabelnikova, A.A.Kibrik, V.I.Podlesskaya, A.O.Litvinenko, N.A.Korotaev, M.K.Buryakov et al.
Goals:
• Multi-purpose corpus of spoken Russian
• Comparison of language produced by normal speakers and speakers with neural disorders
Discourse type: personal stories
Speakers: children and adolescents
Setting:
• When:
  • Recorded in the 1990s
  • 2000-2009: the NDS project
  • 2011: the current stage
• Where: mostly in a clinic
• How: immediately after wake-up
5
Night Dream Stories
Composition:
• Audio files
  • Marked for temporal structure
• Transcripts of three levels of detail: minimal, medium, and full
Volume:
• 129 stories
• Almost 2 hours
  • Conservative estimate: transcribing one minute of talk takes an experienced transcriber 5 hours of work
• 14,000 words
• 3776 elementary discourse units (EDUs), the basic building blocks of spoken language
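The volume figures above imply a rough transcription effort, which the following minimal sketch works out. It uses only the numbers on the slide; rounding "almost 2 hours" up to 120 minutes is an assumption, so the result is an upper bound rather than a figure reported by the project.

```python
# Figures taken from the slide. "Almost 2 hours" is rounded up to 120 minutes
# (an assumption), so the effort below is an upper bound, not an exact value.
AUDIO_MINUTES_UPPER_BOUND = 120
HOURS_PER_AUDIO_MINUTE = 5  # the slide's conservative estimate
WORDS = 14_000
EDUS = 3776


def transcription_hours(audio_minutes: float, hours_per_minute: float) -> float:
    """Return the estimated transcriber effort in working hours."""
    return audio_minutes * hours_per_minute


total_hours = transcription_hours(AUDIO_MINUTES_UPPER_BOUND, HOURS_PER_AUDIO_MINUTE)
words_per_edu = WORDS / EDUS

print(f"Estimated effort: up to {total_hours:.0f} hours")
print(f"Average EDU length: {words_per_edu:.2f} words")
```

Under these assumptions the corpus represents up to about 600 hours of transcription work, with EDUs averaging a little under four words each.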
6
Night Dream Stories
What’s in the transcript?
• EDUs
• Temporal dynamics
• Pauses
• Disfluencies
• Accents
• Tone in accents
• Illocutionary characteristics
• Phase
• Emphasis
• Reduction
• Tempo
• Tonal register
• General characterization
• Comments on specific EDUs
• Etc., etc.
7
Night Dream Stories
• Project site
• Example: 016z
• Play
• Three levels of detail in transcript
• Play by EDU
8
Night Dream Stories: ELAN annotation
9
Siberian Life Stories
Authors: K.V.Orlova, N.A.Korotaev, V.I.Podlesskaya, A.O.Litvinenko, M.L. Pal’ko, M.L.Buryakov, E.I.Il’yina
Differences from the Night Dream Stories corpus:
• Various age groups
• “Tell me about a remarkable episode in your life”
• Temporal dynamics was annotated in a more sophisticated way
Volume:
• 17 stories
• 40 min.
• 1267 EDUs
10
Funny Life Stories
Authors: A.A.Kibrik, N. Molchanova, T. Sokolova, N.A. Korotaev et al.
Goal: resource for comparing written and spoken discourse
Differences from the Night Dream Stories corpus:
• Students
• “Tell me about a funny episode in your life”
• Next week: “Write down the funny episode”
• Each story is represented in a spoken (audio + transcript) and a written version
Volume:
• 40 spoken and 40 written stories
• Spoken: 70 minutes, 2391 EDUs, 7000 words
• Written: 10,000 words
11
Spoken corpora: Problems and perspectives
Problem:
• Adobe Flash Player, integrated into browsers, does not find the proper end of an EDU
• An HTML5 player is used (refresh rate 0.25 s), but the result is not satisfactory
• Solution?
Perspectives:
• Downloadable version
• ELAN multi-tier annotation
• Customization of transcription
• Search and statistics:
  • Prosody, such as accents, disfluencies, etc.
  • Frequent lexicon
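One possible reading of the playback problem, offered as an assumption rather than as the project's own diagnosis: a player that checks its position only once per refresh interval notices the end of an EDU at the first check at or after the boundary, so playback can overrun by almost a full interval. The sketch below illustrates that arithmetic. The function names and the EDU end times are hypothetical.

```python
REFRESH_INTERVAL_MS = 250  # the 0.25 s refresh rate mentioned on the slide


def detected_stop_ms(edu_end_ms: int, interval_ms: int = REFRESH_INTERVAL_MS) -> int:
    """Return the first polling tick at or after the EDU boundary.

    Assumes (hypothetically) that ticks fall on exact multiples of the
    interval, counted from the start of the recording.
    """
    ticks = -(-edu_end_ms // interval_ms)  # integer ceiling division
    return ticks * interval_ms


def overshoot_ms(edu_end_ms: int, interval_ms: int = REFRESH_INTERVAL_MS) -> int:
    """Return how long playback runs past the EDU boundary."""
    return detected_stop_ms(edu_end_ms, interval_ms) - edu_end_ms


# Hypothetical EDU end times in milliseconds, not taken from the corpus.
for end_ms in (1000, 1010, 1240):
    print(end_ms, "->", detected_stop_ms(end_ms), "overshoot", overshoot_ms(end_ms))
```

In this model the overshoot is always shorter than one interval, but it can still exceed a short pause between EDUs, in which case the beginning of the next EDU would be heard.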
12
Another spoken corpus: Stories about presents and skiing
Authors: V.G. Xurshudyan, V.I. Podlesskaya, N.A. Korotaev, A.O.Litvinenko, O.A. Savel’eva et al.
Goal:
• Comparison of original comics-based stories and subsequent retellings
• Cross-linguistic comparison
  • Russian, Belorussian, Polish, Armenian, Italian, French, Japanese, English
Design:
• Stories elicited from pictures
• Retellings (by the same speaker) on the next day
• 10 speakers for each language
Volume (Russian):
• 35 min.
• 10 stories
• 5500 words
Hyperfull transcription (intonation constructions)
13
II. CORPORA ANNOTATED FOR GESTURES
• Pear Stories 1
• Pear Stories 2
14
Pear Stories 1
Author: Julia V. Nikolaeva
Goal: Study the coordination between gestures and discourse structure, both local and global
Discourse type:
• Retellings of the Pear Film (Chafe 1980)
• Monologue with backchannels
Speakers: students (pairwise)
Setting:
• When: recorded in 2006
• Where: Faculty of Foreign Languages, MSU
• How:
  • To a person who had not seen the film
  • The picture includes both interlocutors
15
Pear Stories 1
Composition:
• Video
• Audio
• ELAN annotation
Volume:
• 8 retellings
• 20 minutes
• 2500 words
• 596 EDUs
• 325 gestures
16
Pear Stories 1: Tiers
• Transcript
• Gesture 1
• Rhythmic gesture 1
• Hand(s) 1
• Gesture 2
• Rhythmic gesture 2
• Hand(s) 2
• Comments
• Discourse level
• Catchment
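ELAN stores its annotations in an XML file (.eaf) in which each tier is a TIER element and time-aligned annotations point to shared time slots. The sketch below shows how such tiers could be read with Python's standard library. It runs on a hand-written, heavily reduced fragment: the two tier names come from the slide, while the annotation values, the times, and the `read_tiers` helper are invented for illustration and are not taken from the corpus.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily reduced EAF-like fragment (a real file has a header
# and more attributes). Annotation values and times are invented.
EAF_SAMPLE = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1200"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="300"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="900"/>
  </TIME_ORDER>
  <TIER TIER_ID="Transcript">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hypothetical EDU text</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Gesture 1">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
                            TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>Iconic</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_tiers(eaf_xml: str) -> dict[str, list[tuple[int, int, str]]]:
    """Map each tier name to a list of (start_ms, end_ms, value) annotations.

    Simplification: every time slot is assumed to carry a TIME_VALUE,
    which is true of the sample above but not guaranteed in real files.
    """
    root = ET.fromstring(eaf_xml.strip())
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
    }
    tiers: dict[str, list[tuple[int, int, str]]] = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            rows.append((start, end, ann.findtext("ANNOTATION_VALUE", default="")))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


tiers = read_tiers(EAF_SAMPLE)
for name, rows in tiers.items():
    print(name, rows)
```

Because all tiers share one timeline, a gesture annotation can be related to the EDU it overlaps simply by comparing the start and end times.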
17
Pear Stories 1: Tier “Transcript”
• Verbal component
• Local discourse structure: EDUs
• Dialog structure
• Prosody
  • Pauses
  • Disfluencies
  • Illocutionary and phasal structure
  • Reduction
  • Smiling and laughing
• Gestures
  • Punctual: |
    • Short gestures: beats
    • Emphasized points in extended gestures
  • Extended
    • Beginning: {
    • PEAK PHASE
    • End: }
18
19
Pear Stories 1: Tier “Gesture”
GESTURE TYPES: Pointing Iconic Rhythmic Beats Metatextual Emblems Blurred Unclear
20
Pear Stories 1: Tier “Hands”
• Right
• Left
• Two hands
21
Pear Stories 1: Tier “Catchment”
• Gesture shape, gesture location and meaning are kept throughout several EDUs
• “Gestural sentence”
• Switch to ELAN, 387-392
22
Pear Stories 2
Authors: O.V.Fedorova, S. Maljutina, Ju. Akinina, O.V.Dragoj
Goal:
• Study of discourse strategies in aphasics, compared to normal participants
• Parallel corpus of normal and aphasic retellings
Composition:
• Video
• Audio
• Transcripts
  • Verbal component
  • Pauses
  • Disfluencies
  • Comments
Volume:
• 30 normal and 23 aphasic retellings
• 12,000 words
23
Pear Stories 2a
Authors: O.V.Fedorova, A. Fejn, E. Pavlova
Goal:
• Study the inheritance of discourse strategies between the original and second retellings
• The status of the discourse protagonist, as reflected in the verbal vs. gestural component
Corpus:
• Three original retellings (normal speakers)
• 3x8=24 second retellings
Composition:
• Video
• Audio
• Transcripts
  • Verbal component
  • Pauses
  • Disfluencies
  • Comments
  • Gestures
Multimodal analysis provides richer information on speakers’ strategies than the verbal component alone
24
25
Conclusion
• Developing multimodal corpora brings us closer to a genuine understanding of human communication
• In a better world, the reasonable sequence in the scientific study of language would have been:
  (1) basic, original use of language: spoken face-to-face communication
  (2) derived, secondary use of language: written texts
• But since we cannot reverse the history of linguistics, let us explore the fundamental form of language now: better late than never
26
Conclusion
• For other major languages, some multimodal corpora already exist; see http://www.multimodal-corpora.org/
• Often designers of multimodal corpora just add gesture and other visual information to the verbal component
• But it is particularly important to also include the prosodic channel
• Only a combination of all three can give us a realistic picture of human communication
[Diagram: language, comprising the verbal channel, the visual channel, and the prosodic channel]
27
Conclusion
• Russian multimodal corpora are still in their incipient stage
• But they are steps in the right direction
• On the basis of the accumulated expertise, we could undertake a multimodal corpus that
  • is prosodically highly detailed
  • at the same time, contains sufficiently detailed gesture and body language annotation
  • and therefore approaches an ecologically realistic model of actual human communication
28
Conclusion
Uses of such a future product:
• linguistic research
• psychological research
• sociological research
• as well as various applied uses, such as spoken human-computer interaction and language teaching
29
Thank you for your attention!