INFuture 2009, Zagreb

Slovenian Biographical Lexicon – From a Digital Edition to an On-Line Application

Jan Jona Javoršek* Tomaž Erjavec* Petra Vide Ogrin**

*Jožef Stefan Institute, Ljubljana, Slovenia

**Slovenian Academy of Sciences and Arts, Library, Ljubljana, Slovenia

Outline

Digitization

Encoding methodology

XML–TEI structure

On-line application

Future plans

Slovenian Biographical Lexicon

The printed edition comprises 15 volumes plus an index, published between 1925 and 1991

Includes notable figures important for Slovenian cultural life, from the earliest times to the present

Contains 5,042 biographical entries covering over 5,100 persons, because some entries describe whole families

Data in the articles are checked against the relevant primary sources

Example page from SBL

Encoding methodology

Use of open standards and software

Use of TEI P5: specific elements for describing biographical and prosopographical data, e.g.:

<birth>, <death>, <date>, <placeName>, <sex>, <faith>, <occupation>, <floruit>

Up-conversion into TEI–XML: OpenOffice – TEI OO package (XSLT stylesheets) → TEI–XML document (basic structure)

Semi-automatic extraction of metadata: Perl, XSLT + manual intervention
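The semi-automatic step can be pictured roughly as follows. This is only a sketch in Python (the slides name Perl and XSLT as the actual tools): the sample heading, its format, and the regular expression are all invented for illustration, and real SBL headings are formatted differently. A heading that matches the pattern becomes a TEI `<person>` with `<birth>` and `<death>`; anything else returns `None` and is left for manual intervention.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample heading; real SBL headings do not look like this.
SAMPLE = "Novak Janez, teacher, b. 1850-03-12 in Ljubljana, d. 1910-07-01 in Graz."

# Hypothetical pattern that only fits the invented format above.
PATTERN = re.compile(
    r"b\. (?P<birth>\d{4}-\d{2}-\d{2}) in (?P<bplace>\w+), "
    r"d\. (?P<death>\d{4}-\d{2}-\d{2}) in (?P<dplace>\w+)"
)


def extract_person(heading):
    """Return a TEI-style <person> element, or None if manual work is needed."""
    match = PATTERN.search(heading)
    if match is None:
        return None  # no automatic match: flag for manual intervention
    person = ET.Element("person")
    events = (
        ("birth", match["birth"], match["bplace"]),
        ("death", match["death"], match["dplace"]),
    )
    for tag, date, place in events:
        event = ET.SubElement(person, tag)
        ET.SubElement(event, "date", when=date)
        ET.SubElement(event, "placeName").text = place
    return person


person = extract_person(SAMPLE)
print(ET.tostring(person, encoding="unicode"))
```

Returning `None` instead of guessing keeps the automatic and manual parts of the workflow clearly separated.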

SBL article structure

<div>
  <listPerson>
    <person n="main">
      <!-- other elements for biographical data: birth, death, occupation … -->
    </person>
    <person n="author">
      <!-- author's name -->
    </person>
  </listPerson>
  <p>
    <!-- the annotated text of the article -->
  </p>
</div>
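As an illustration of how this structure can be consumed, the following Python sketch reads one article and separates the subject of the biography from the author of the article by the `n` attribute. The names and the text in the sample are invented, and the sample omits the TEI namespace that a real TEI P5 document would declare, so the paths would need adjusting for real data.

```python
import xml.etree.ElementTree as ET

# Invented sample article following the structure shown on the slide.
ARTICLE = """
<div>
  <listPerson>
    <person n="main">
      <persName>Janez Novak</persName>
      <occupation>teacher</occupation>
    </person>
    <person n="author">
      <persName>Ana Kovač</persName>
    </person>
  </listPerson>
  <p>Annotated text of the article.</p>
</div>
"""


def read_article(xml_text):
    """Return the subject, occupation, author and text of one article."""
    root = ET.fromstring(xml_text)
    subject = root.find("listPerson/person[@n='main']")
    author = root.find("listPerson/person[@n='author']")
    return {
        "subject": subject.findtext("persName"),
        "occupation": subject.findtext("occupation"),
        "author": author.findtext("persName"),
        "text": root.findtext("p"),
    }


article = read_article(ARTICLE)
print(article)
```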

Example of various attribute values for <persName>

@type value    Count
adopted        2
artistic       21
incorrect      6
married        193
monastic       4
nickname       37
operosorum     21
partisan       96
pseudo         2350
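A tally like the one above can be produced by counting the `type` attribute over all `<persName>` elements. The Python sketch below shows the idea on a tiny invented sample; it is not the procedure used for SBL, and it again ignores the TEI namespace.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Invented sample; the names are placeholders.
SAMPLE = """
<listPerson>
  <person>
    <persName type="pseudo">A. B.</persName>
    <persName type="married">C D</persName>
  </person>
  <person>
    <persName type="pseudo">E. F.</persName>
    <persName>G H</persName>
  </person>
</listPerson>
"""


def count_name_types(xml_text):
    """Count persName elements by @type; untyped names are grouped as '(none)'."""
    root = ET.fromstring(xml_text)
    return Counter(name.get("type", "(none)") for name in root.iter("persName"))


counts = count_name_types(SAMPLE)
for value, total in counts.most_common():
    print(value, total)
```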

SBL online application

Fedora Commons: extensible framework for storage, management and dissemination of complex objects and object relationships

Repository + a digital library of biographical articles, enabling browsing and searching

Fedora Generic Search – provides a native Fedora Commons interface between an external search system and the Fedora Commons API

Solr: a search system based on the Apache Lucene search and indexing library

OAI-PMH protocol, REST and SOAP protocols
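To give an idea of what harvesting over OAI-PMH involves, the Python sketch below builds a `ListRecords` request URL. The verb and the `metadataPrefix` and `from` parameters come from the OAI-PMH specification; the base URL is a placeholder, since the slides do not give the address of the SBL endpoint.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real SBL OAI-PMH address is not given in the slides.
BASE_URL = "http://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # optional selective harvesting by date
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL)
print(url)
```

Only the URL is constructed here; sending the request and parsing the XML response are left out.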

Example entry

Advanced search options

Advanced search

Drop-down menus for occupations – integrated taxonomy

Drop-down menus for place names: search by category (e.g. country, district, settlement); multilingual search for some places, e.g. Gradec (Slovenian) – Graz (German)

Search by forename, by surname, and by the different language forms of a person's name

Search by role name, e.g. bishop, or by title of nobility, e.g. count, knight, baron
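The multilingual place-name search can be sketched as a lookup table that maps every language variant to one canonical key. The Python example below uses the Gradec/Graz pair from the slide. The data layout and function names are assumptions for illustration, and the real application performs its searches through Solr rather than an in-memory dictionary.

```python
# Hypothetical data layout: canonical name -> variants by language code.
PLACES = {
    "Graz": {"sl": "Gradec", "de": "Graz"},
}


def build_index(places):
    """Map every case-folded name variant to its canonical place name."""
    index = {}
    for canonical, variants in places.items():
        for name in {canonical, *variants.values()}:
            index[name.casefold()] = canonical
    return index


def find_place(index, query):
    """Return the canonical place for any known variant, else None."""
    return index.get(query.strip().casefold())


INDEX = build_index(PLACES)
print(find_place(INDEX, "Gradec"))
```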

Future plans

Expansion and normalization of numerous abbreviations – problem: Slovenian is a highly inflected language

Named Entity Recognition: to enable (semi-)automatic extraction and encoding of personal and place names occurring in the full text

Encode other information in the full text: relatives within SBL, person disambiguation, links within SBL and to external sources, e.g. COBISS bibliographical records, Wikisource (online literature publication)

Map place names on an atlas, e.g. Google Maps

Slovenian Biographical Hub – SBL joined by other biographical resources

Welcome to the beta version:

http://nl.ijs.si/fedora/sbl

Thank you!