A Framework for Publishing Oral History Interviews to the Web Stephen Paul Davis Director, Libraries Digital Program Columbia University OCLC Western Digital Forum August 2006 rev. 10/2011

Page 1: A Framework for Publishing Oral History Interviews to the Web

A Framework for Publishing Oral History Interviews to the Web

Stephen Paul Davis

Director, Libraries Digital Program

Columbia University

OCLC Western Digital Forum, August 2006

rev. 10/2011

Page 2: A Framework for Publishing Oral History Interviews to the Web

The Players

Columbia's Libraries Digital Program

Columbia Center for Oral History (formerly: Oral History Research Office)

Columbia's Digital Knowledge Ventures (ceased operations)

Backstage Library Works (formerly: OCLC Preservation Services)

George Blood, L.P. (formerly: Safe Sound Archive)

OCLC Digital Archive

Page 4: A Framework for Publishing Oral History Interviews to the Web

The Script

Sessions: 10 interviewees in 193 individual interview sessions

Recordings: 205 hours on 170 Tapes (109 Cassettes, 53 Five-inch Reels, 8 Seven-inch Reels)

Transcriptions

◦ 11,064 pages of typescript in 72 notebook binders

◦ 2,644 pages in MS Word format

Related material: name indexes, biographies, tables of contents, photos

Page 5: A Framework for Publishing Oral History Interviews to the Web

The Plot

Online audio in Real & MP3 format, both downloadable & streaming

Audio segments directly correlated with transcriptions at the paragraph level

Page images of transcriptions in PDF

OCR'd transcriptions plus TEI/XML markup

Full-text search and retrieval

Name index entries linked back to references in text

Abstract of each interview

A general introduction

A few pictures

Rights and permissions cleared in advance

Page 6: A Framework for Publishing Oral History Interviews to the Web

The Revised Plot

Online audio in Real & MP3 format, both downloadable & streaming

Audio segments directly correlated with transcriptions at the session level (revised from: paragraph level)

Page images of transcriptions in PDF

Re-keyed transcriptions (revised from: OCR'd) plus TEI/XML markup

Full-text search and retrieval

Name index entries linked back to references in text

Abstract of each interview

Three general introductory essays & a video interview with OHRO director emeritus (revised from: a general introduction)

Ten introductions for the interviewees

50 pictures (revised from: a few)

Ten new, detailed tables of contents

Ten audio & text 'excerpts' to provide interview lead-ins

Rights and permissions cleared in advance

◦ Dropped: Robert F. Wagner, Kitty Carlisle Hart, Alice Hartley Neel, Schuyler Garrison Chapin, Ed Koch (1997)

◦ Almost dropped: Foner (bad language)

◦ Added: Mamie Clark, Mary Lasker, Frances Perkins, John Oakes
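As a rough illustration of how name index entries can be linked back to references in marked-up text, here is a minimal Python sketch. It is not the project's actual encoding: the sample fragment, the placeholder names, the `key` values, and the use of `n` attributes on session and paragraph elements are all assumptions made for the example. Real TEI documents also carry an XML namespace, which is omitted here for brevity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified TEI-like fragment (no namespace, invented content).
SAMPLE = """
<text>
  <div type="session" n="1">
    <p n="1">I first met <persName key="doe_jane">Jane Doe</persName> that year.</p>
    <p n="2">Nothing much happened afterwards.</p>
  </div>
  <div type="session" n="2">
    <p n="1"><persName key="doe_jane">Miss Doe</persName> and
       <persName key="roe_richard">Richard Roe</persName> were both there.</p>
  </div>
</text>
"""


def build_name_index(xml_text):
    """Map each persName key to the (session, paragraph) pairs that mention it."""
    root = ET.fromstring(xml_text.strip())
    index = defaultdict(list)
    for div in root.iter("div"):
        if div.get("type") != "session":
            continue
        for para in div.iter("p"):
            for name in para.iter("persName"):
                index[name.get("key")].append((div.get("n"), para.get("n")))
    return dict(index)


if __name__ == "__main__":
    for key, refs in sorted(build_name_index(SAMPLE).items()):
        print(key, refs)
```

Each index entry can then be rendered as a link to the session and paragraph where the name occurs, whatever addressing scheme the published site actually uses.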

Page 7: A Framework for Publishing Oral History Interviews to the Web

Cataloging & Metadata

Cataloging options:

◦ Audio: the original audio collection, the complete WAV files, the complete MP3 files, the segmented Real files

◦ Transcriptions: the original typescripts and/or Word files; the converted XML files; the generated HTML files

Cataloging decisions:

◦ Previous catalog records for oral history transcripts left intact under “Reminiscences of …”

◦ New collection-level catalog record created for the entire Notable New Yorkers (NNY) site

◦ New “analytic” catalog records created for each Notable New Yorker subsite as a component of the NNY collection site:

773 0_ |7 nnbc |a Notable New Yorkers |h [electronic resource]. |w (OCoLC65181290)
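The 773 field shown above is MARC's host item entry, which ties each analytic record to the collection-level record; subfield w carries the host record's control number. The following Python sketch splits that display string into its tag, indicators, and subfields. The function name and the assumption that "|" is the subfield delimiter in this display form are illustrative only; this is not a general MARC parser.

```python
def parse_marc_display_field(line, delimiter="|"):
    """Split a displayed MARC field into (tag, indicators, [(code, value), ...]).

    Assumes the display convention seen on the slide: tag, then indicators,
    then subfields each introduced by the delimiter and a one-character code.
    """
    head, _, rest = line.partition(delimiter)
    parts = head.split()
    tag = parts[0]
    indicators = parts[1] if len(parts) > 1 else ""
    subfields = []
    for chunk in rest.split(delimiter):
        chunk = chunk.strip()
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return tag, indicators, subfields


FIELD_773 = "773 0_ |7 nnbc |a Notable New Yorkers |h [electronic resource]. |w (OCoLC65181290)"

if __name__ == "__main__":
    tag, indicators, subfields = parse_marc_display_field(FIELD_773)
    print(tag, indicators)
    for code, value in subfields:
        print(f"  ${code} {value}")
```

Reading subfield w out of each analytic record is one way to confirm that every subsite record points at the same collection-level record.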

Page 8: A Framework for Publishing Oral History Interviews to the Web

Ticket Prices

Scanning, keying & XML markup: $12,200

Audio transfers, file header edits, MP3 creation & media: $13,720

Audio time coding & post-processing: $9,000

Web site (outsource): $17,150

◦ Pre-production, $2,600

◦ Rights research & permissions, $1,000

◦ Web site design, $3,850

◦ Web programming, $7,500

◦ Copy editing & QA, $1,400

◦ XSLT generation of HTML from METS/TEI, $2,000

Additional site content: $12,800

◦ Introductory essays, $5,700

◦ Tables of contents, etc., $5,900

◦ Video shoot & post-production, $1,200

Oral History Research Office Contributions: "Priceless"

◦ Text preprocessing

◦ Audio inventory

◦ Rights and permissions clearances

◦ Editorial review

Digital Library Program Contributions: “Ditto”

◦ Project and vendor coordination

◦ Text QC, post-processing, METS file creation

◦ Text indexing & retrieval system (Lucene)

◦ Application integration
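The slide gives no grand total, so the figures can be tallied with a short Python sketch. The dictionary layout and function names are assumptions for the example; the dollar amounts are copied from the slide as transcribed. Summing the five stated category figures gives $64,870. The itemized "Additional site content" lines add up to their stated $12,800, but the itemized "Web site" lines as transcribed add up to $18,350 rather than the stated $17,150, a $1,200 difference that the transcript does not explain.

```python
# Figures as transcribed from the "Ticket Prices" slide, in US dollars.
BUDGET = {
    "Scanning, keying & XML markup": {"stated": 12200, "items": {}},
    "Audio transfers, file header edits, MP3 creation & media": {"stated": 13720, "items": {}},
    "Audio time coding & post-processing": {"stated": 9000, "items": {}},
    "Web site (outsource)": {
        "stated": 17150,
        "items": {
            "Pre-production": 2600,
            "Rights research & permissions": 1000,
            "Web site design": 3850,
            "Web programming": 7500,
            "Copy editing & QA": 1400,
            "XSLT generation of HTML from METS/TEI": 2000,
        },
    },
    "Additional site content": {
        "stated": 12800,
        "items": {
            "Introductory essays": 5700,
            "Tables of contents, etc.": 5900,
            "Video shoot & post-production": 1200,
        },
    },
}


def stated_total(budget):
    """Sum the stated category figures."""
    return sum(category["stated"] for category in budget.values())


def itemized_gaps(budget):
    """Return {category: itemized sum minus stated figure} where the two differ."""
    gaps = {}
    for name, category in budget.items():
        if category["items"]:
            diff = sum(category["items"].values()) - category["stated"]
            if diff != 0:
                gaps[name] = diff
    return gaps


if __name__ == "__main__":
    print(f"Stated total: ${stated_total(BUDGET):,}")
    for name, diff in itemized_gaps(BUDGET).items():
        print(f"{name}: line items differ from stated figure by ${diff:,}")
```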

Page 9: A Framework for Publishing Oral History Interviews to the Web

Challenges 1

Problems with Rights & Permissions

◦ Permission status uncertain

◦ Permission withdrawn

◦ Permission equivocal

Problems with Source Material

◦ Incomplete / outdated inventory of original media

◦ Missing tapes, audio files

◦ Patrons using only (single) copy of transcripts

◦ Misnumbered pages in transcriptions

◦ Missing pages in transcriptions

Scanning & Keying Vendor / Digital Program Relations

◦ Novelty of / unfamiliarity with oral history content

◦ Delays in providing vendor with source material

◦ Recognition that typescripts could not be OCR’d because of poor quality; instead 100% rekeying of originals

◦ Clarity, interpretation, accuracy of markup specs

Page 10: A Framework for Publishing Oral History Interviews to the Web

Challenges 2

Web Design Vendor / Digital Program Relations

◦ Outsourced design of a web site intended to be maintained afterwards in-house

◦ Differences in development process, methodology

◦ Difference in “one shot” site versus ongoing collection-driven site

◦ Differences in design “values,” e.g., aesthetics versus usability; “teaching & learning” ethos versus “easy & effective access” ethos; role of branding

◦ Differences in familiarity and experience with full-text / cross-text search and retrieval

◦ Availability of time to meet & discuss issues, project management by email, deadlines

Curatorial / Digital Program Relations

◦ Curatorial time and staffing constraints

◦ Curatorial enthusiasm leading to requirements creep

◦ Assumptions about feasibility of “last minute changes”

Textual Issues

◦ Identity of the “master file” after online publication?

◦ “Fixity” of transcriptions in MS Word

◦ Retaining consistency of references / citations in paper version and in online version

Page 11: A Framework for Publishing Oral History Interviews to the Web

Challenges 3

Issues Relating to the Practice of Oral History

◦ Publishing oral history interviews reflecting older, “outdated” practice along with those reflecting current practice

◦ Making available original, unedited audio files in conjunction with transcriptions reviewed & edited by the interviewees

◦ Web exposure of interviews that were originally to be available onsite to scholars and researchers

◦ Influence on current and prospective interview subjects who know that their comments will be published on the Web

Page 12: A Framework for Publishing Oral History Interviews to the Web

The Moral (Lessons Learned) 1

Commit to doing more planning up front than you think you need to do;

Set up a rigorous schedule of face-to-face meetings with key stakeholders even if they don't think you need to;

Make sure all content pieces are agreed to, in hand, fixed, and have clear permissions to publish before agreeing to do the project (or at least before contracting with vendors);

Oral Histories are by their nature fuzzy in their fixity;

Widows often object to their husbands' bad language long after their husbands are gone;

Keep detailed inventories of all content pieces before, during and after the project (good asset management);

Enthusiasm can often lead to scope creep;

Page 13: A Framework for Publishing Oral History Interviews to the Web

The Moral (Lessons Learned) 2

Push off non-essential scope creep to Phase 2;

Don't try to edit Emeritus' prose;

Many people don't like RealMedia / RealPlayer any more (I blame Microsoft);

Curators often have other things to do than what you're interested in having them do;

Library Digital Program staff always have other things to do than the project the curator is interested in;

If a Digital Project is successful it becomes a permanent part of your life and will always need care and feeding even if you think you're finished with it, so get used to it;

There are less expensive ways to do projects like Notable New Yorkers but not that much less expensive.