By Ben Brumfield. Colloquium Itinera Nova. Tools, people & history (City Archives Leuven), April 25th, 2013
Itinera Nova in the World(s) of Crowdsourcing and TEI
Ben Brumfield
International Colloquium Itinera Nova
Crowdsourced Transcription
Offline projects from the 1990s
Van Papier Naar Digitaal (NL)
FreeBMD/FreeREG/FreeCEN (GB)
Demogen (BE)
Arkivalieronline (DK)
Western Michigan Genealogy Society (US)
Crowdsourced Transcription
Online tools developed from 2005
Diverse projects released from 2006
2006 FamilySearch Indexing
2008 FromThePage
2008 Wikisource (+ ProofreadPage)
2009 North American Bird Phenology Program
Tension
Volunteer transcribers vs. Professional editors
Easy tools vs. Powerful tools
Easy Tools, Hard Mark-up
Amateurs + Mark-up = ???
So get rid of the mark-up, right?
Power vs Usability
• Power can enable users.
• Lack of power frustrates users.
• For transcription, mark-up is power.
Power vs Usability
• A little story about scrambled eggs...
TEI
Ultimate in mark-up
Standard since 1990
Ubiquitous in scholarly editing
Usually hand-edited XML in offline tools
“TEI? That's just for data entry.”
TEI
Strengths
– Powerful data model
– Tools for presentation and analysis
– Active community
Genetic Edition Module
– Represents changes to texts
– Still in development
TEI
But how was that encoded?
TEI
Amateurs + TEI = ????
Rarely attempted
– 29 projects in crowdsourced transcription tool directory
– Only 7 claim to “support TEI”
TEI + Amateurs
• Tag Buttons
– T-PEN/CCL
– TEI Toolbar (TB)
• Tag Menus
– VdU
– Papyrological Editor
Buttons: Transcribe Bentham
Menus: MOM-CA
Button Limitations
• Users outgrow buttons
– “I believe one or two transcribers now add tags manually rather than use the toolbar, which says something about the improvement in their IT skills.” –Tim Causer (TB)
• Users ignore buttons
– “One editor for exampled prefered to put || for <lb> as he was used from the preparation of a printed edition.” –Georg Vogeler (VdU)
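The `||` habit described above suggests one way tools could meet transcribers halfway: accept the shorthand at data entry and convert it afterwards. What follows is only a minimal sketch under that assumption, not how VdU or any other project mentioned here works. The function name `shorthand_to_tei` is hypothetical, and the only mapping it knows is `||` to the TEI line-break element.

```python
import re
from xml.sax.saxutils import escape


def shorthand_to_tei(line: str) -> str:
    """Convert a transcriber's '||' shorthand into TEI <lb/> elements.

    Hypothetical helper for illustration only.
    """
    # Escape &, < and > first so stray characters typed by the
    # transcriber cannot break the resulting XML.
    escaped = escape(line)
    # Replace each '||' (and any whitespace around it) with <lb/>.
    return re.sub(r"\s*\|\|\s*", "<lb/>", escaped)


print(shorthand_to_tei("In nomine || domini"))
```

The transcriber keeps the notation learned from print editing, and the angle brackets only appear in the stored text.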
Button Limitations
“...there were something like 67 necessary buttons, and it was maddening to fish around for the desired button. And the research assistants, who had been encoding in oXygen, just typed in angle brackets and memorized tags, instead of using the buttons.” – Abigail Firey (CCL)
A New Way
For data entry, consider alternatives to TEI
– Existing print notations (e.g. Leiden+)
– Robust data entry tools
Use TEI for data models and presentation
– Papyrological Editor
Opportunities to combine crowdsourcing tools with TEI
– Skylark project at UMD MITH
• Zooniverse transcription components
• Genetic edition TEI module
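The split proposed here, print notation for data entry and TEI for the data model, might be sketched roughly as follows. This is not the actual Leiden+ grammar and not the Papyrological Editor's implementation. It assumes a single simplified, Leiden-style rule in which square brackets mark text restored by the editor, mapped to the TEI `<supplied reason="lost">` element. The function names and the `<ab>` wrapper are choices made for this illustration.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Assumed rule: [text] marks editorially restored text.
SUPPLIED = re.compile(r"\[([^\[\]]+)\]")


def brackets_to_tei(text: str) -> str:
    """Turn bracket notation into a small TEI fragment (hypothetical helper)."""
    escaped = escape(text)  # keep &, <, > from breaking the XML
    body = SUPPLIED.sub(r'<supplied reason="lost">\1</supplied>', escaped)
    return f"<ab>{body}</ab>"


def supplied_readings(tei: str) -> list[str]:
    """Use the TEI as a data model: list every restored reading."""
    root = ET.fromstring(tei)
    return [el.text for el in root.iter("supplied")]


tei = brackets_to_tei("Imp[erator] Caes[ar]")
print(tei)
print(supplied_readings(tei))
```

The transcriber types only brackets, while the stored XML can still be queried or styled by TEI-aware tools.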
Questions
Ben Brumfield
@benwbrum
FromThePage.com
Slides and transcript at
http://manuscripttranscription.blogspot.com/