Sound and Unsound Documentation:
Questions about the roles of audio in language documentation
David Nathan
Endangered Languages Archive
School of Oriental and African Studies
University of London
www.hrelp.org
A paradigm shift?
From evidence to performance…
Documentation output
Wittenburg & Mosel (following Himmelmann):
“… the corpus should consist of a variety of text types and genres.
Multimedia (sound and video) recordings form the basis of the documentation work. These recordings should be associated with an orthographic or phonemic transcription, a translation in one of the major languages of the world, and/or glossings in a local lingua franca and English…”
Documentation output
Johnson & Dwyer:
“Genre
  Interaction: conversation, verbal contest, interview, meeting/gathering, riddling, consultation, greeting/leave-taking, humor, insult/praise, letter
  Explanation: procedure, recipe, description, instruction, commentary, essay, report/news
  Performance: narrative, oratory, ceremony, poetry, song, drama, prayer, lament, joke
  Teaching: textbook, primer, workbook, reader, exam, guide, problems
  Analysis: dictionary, word-list, grammar, sketch, field notes
Register: informal/conversational, formal, honorific, jargon, baby/caretaker talk, joking, foreigner talk
Style: ordinary speech, code-switching, play language, metrical organization, parallelism, rhyming, nonsense/unintelligible speech”
A paradigm shift?
Sound as evidence in documentary linguistics …
  data not independent of a theory which uses it
  what is it? Disk, sound recording, file, file + metadata, transcription etc
  how to represent and store it
  how to present it
  what to do with it
Recorded/recording events as performances
  Reifications of pattern or ideal
  Distinguish between event and record of it (fundamental for documentary linguistics)
  Repeatable, comparable; implies genre, audience
  Assists with protocol (attributes and participation)
  Allows editing to be methodologically possible
  Links us to existing fields’ knowledge and experience, e.g. radio, cinematography, performing arts, music, musicology, ethnography …
Archivism
However, what we got was archivism
Archivism: capitulation of language documenters to the agenda and priorities of archives and information technology
Why did this happen?
  for historical reasons
  rapid changes in technology
  we left a vacuum
From evidence to archivism
Positive aspects of archivism (for some, for now)
  endangered languages field is luckier than others
  clear imperative to archive data
  benefits of new technologies (media, storage, convergence)
  funding and resources: DoBeS, EMELD, HRELP etc
However
  may be short-lived
  we are thrust into competing with entities like banks
  not enough contribution to language strengthening etc
  not nurturing documentary linguistics
  a 'productivity paradox' as experienced by the financial sector?
What have we missed?
Contact with wisdom and experience of established fields, e.g.
  radio/broadcasting (eg mics, MD)
  cinematography (eg quality and specialisation)
  journalism (eg equipment handling)
  audio archives (linguists had input to IASA before 80s or so)
What have we missed?
Woodbury: most developments are "what's been happening around the emergence of a documentary linguistics", particularly technology, which has raised expectations more than changed practices
Examples
(Schüller) audio professionals use the trained ear as evaluator of quality, while linguists prefer wave-forms etc
  cf value of binaural recording
media people know that signals emanate from events but do not represent them
recording to edit
Lost opportunities?
Technical
  stereo, binaural
  monitoring while recording (headphones)
  environment and psychoacoustics
  microphones and handling
  editing
Content
  everyday expressions, eg Yuwaalaraay ngarigaa
  capturing environment/eliminating environment
  preludes to stories that explain who is talking and why etc.
Wider question is: in a mature documentary linguistics, is there a clear, or even valid, boundary between these two?
Did we get what we needed?
What did we get?
  advice about formats, parameters, what to avoid
  'silver bullet' equipment and formats
  fundamentalism and format wars
What do we need?
  If we continue to be 'lone wolf' fieldworkers, how to get good quality signals?
  Quality is relative to purpose. But given exhortations to make 'best record', what influences quality?
What influences audio quality?
A large number of factors:
  physical environment (inside, outside)
  control/management of environment
  acoustics: room, objects
  microphone selection, placement, handling, compatibility
  mono/stereo/binaural
  sources of noise/interference
  recorder and recording medium handling
Clearly these span fields: do they tell us anything about the scope of documentary linguistics?
Disappearing recorders
Zounds! Where’s my recorder?
  storage (eg iPod etc)
  A-D and storage (eg laptop)
  transducer (microphone)
Reasons for using a recorder (not laptop)
  workflow
  quality assurance
  consistency
  power
There are principles involved!
How much sound?
Under archivism, repositories are seen to determine amount as well as quality of data
ELDP experience
  some applicants propose amounts of audio in terms of technologies, eg flash cards only hold a few hours; or (on other hand) voice recorders can hold hundreds!
  to get a grant!
Understandable
  lurching back and forth between extremes
  rapid changes in technology, and advice about it
  more information available about documentation agenda and technologies
  competition for grants as opportunities in linguistics decrease?
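The "few hours" versus "hundreds" contrast is largely arithmetic on recording parameters rather than a property of the language or the project. A minimal sketch, assuming uncompressed PCM audio and a hypothetical 1 GB card; the card size and both parameter sets are illustrative assumptions, not figures from ELDP applications:

```python
def recording_hours(capacity_bytes, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Hours of uncompressed PCM audio that fit in capacity_bytes."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return capacity_bytes / bytes_per_second / 3600


# Hypothetical 1 GB card (assumption for illustration only).
one_gb = 1_000_000_000

# Assumed "archive-style" settings: 44.1 kHz, 16-bit, stereo.
cd_quality = recording_hours(one_gb)

# Assumed "voice recorder" settings: 8 kHz, 8-bit, mono, still uncompressed.
voice_quality = recording_hours(one_gb, sample_rate_hz=8_000, bit_depth=8, channels=1)

print(f"44.1 kHz/16-bit/stereo: {cd_quality:.1f} hours")
print(f"8 kHz/8-bit/mono:       {voice_quality:.1f} hours")
```

Under these assumptions the same card holds roughly 22 times more audio at the lower settings, before any lossy compression is applied. The quantity of audio a device can hold therefore says little about how much recording a project needs.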
How much sound?
Determined by lists of output types and genres?
Wittenburg & Mosel:
“… the corpus should consist of a variety of text types and genres.
Multimedia (sound and video) recordings form the basis of the documentation work. These recordings should be associated with an orthographic or phonemic transcription, a translation in one of the major languages of the world, and/or glossings in a local lingua franca and English…”
How much sound?
Johnson & Dwyer:
“Genre
  Interaction: conversation, verbal contest, interview, meeting/gathering, riddling, consultation, greeting/leave-taking, humor, insult/praise, letter
  Explanation: procedure, recipe, description, instruction, commentary, essay, report/news
  Performance: narrative, oratory, ceremony, poetry, song, drama, prayer, lament, joke
  Teaching: textbook, primer, workbook, reader, exam, guide, problems
  Analysis: dictionary, word-list, grammar, sketch, field notes
Register: informal/conversational, formal, honorific, jargon, baby/caretaker talk, joking, foreigner talk
Style: ordinary speech, code-switching, play language, metrical organization, parallelism, rhyming, nonsense/unintelligible speech”
How much sound?
Possible answers:
  distinguish recording from outputs/products (incl archive deposit as one output)
  ELDP/ELAR: demonstrate 10% commitment
  let language community members and academic peers judge, not archives or technologies
Un-sound documentation?
Johnston & Schembri: Documenting AUSLAN
  no writing or widely-used transcription system
  no standardization associated with the culture and history of writing
  no written literature; little known about genres etc
  no possibility of processing, eg corpus work or 'text mining'
Un-sound documentation?
Johnston sees tools like MPI’s ELAN as the equivalent of 'writing' for signed languages
Problems annotating video for SL also raise issues being questioned in mainstream linguistics
  eg existence and atomicity of grammatical categories
Sound interfaces
Spoken Karaim and ShoeHorn
Is audio the prime representation?
Multi-tiered, multi-scoped annotation
cf recent ELAP workshop where meaning in documentation seen as:
  at different linguistic levels
  changing and ongoing over time
  messy, irreconcilable, contested
  drawing on meanings and texts outside the text in question
Suggests that audio recording is merely one (important!) aspect of the documenter’s toolset
Other questions
Who does the recording?
Can community members only use cassettes?
What changes if we shoot video as well?
Are community members more motivated if they can shoot video?
Would we collect data by phone if there was sufficient bandwidth?
What audio resources are most effective for language strengthening?
Have we conflated fieldwork methodology with documentation’s outputs?
Conclusions
In language documentation, a twin shift to data orientation and digitisation has led us into domains where there is a wealth of existing experience that we cannot easily tap into, and where we compete against those we cannot possibly match
Treat audio as a way to capture various kinds of performances, not as the object of description
We are lacking interfaces and software for working with and presenting audio
Thank you