Europeana Newspapers
Belgrade Workshop
WP5 Metadata – Structural Metadata
Belgrade, 14th June 2013
Günter Mühlberger, Innsbruck University
WP5 leader
This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp
WP 5 Metadata
• The main objectives are:
  • To gather and analyse the metadata models that libraries currently use for the digitisation of newspapers.
  • To design and release a comprehensive metadata model based on de-facto standards such as METS, MODS, MARC, ALTO, etc.
  • To manage the feedback cycles in which stakeholders comment on the format.
  • To prepare an online resource (wiki, database or similar website) that contains the rules for applying the format and for using it within a digitisation project.
• ENMAP (Europeana Newspaper METS ALTO Profile)
  • Already used within the project – standardization aspect
  • Public release in October 2013
Structural Metadata
• Idea of a data dictionary
  • Addition to ENMAP
  • What is what? Definition of structural elements / text types
• Structural elements
  • Title section
  • Headline
  • Advertisement
  • Illustration
  • Caption line
  • Running title (column title)
  • Page number
  • ...
Text types
• Text types (or sub-genres)
  • breaking news
  • short news
  • book review, theatre review, software review, ...
  • obituary
  • advertisement
  • family notice
  • job announcement
  • weather forecast
  • novel, poem, ...
  • etc.
Objectives
• Set up a data dictionary
  • Provide a comprehensive list of elements and text types
  • Use clear definitions
  • Make the criteria for definitions transparent
  • Include many examples from several newspapers
  • Make it an open dictionary, so that people can contribute
  • Classify structural elements according to their intention
• Rationale
  • Many libraries need to define these elements for
    • service providers
    • search services (faceted search)
    • crowd-based services (apply these metadata)
  • Currently no other standard is available (TEI covers this only partly)
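To make the idea of a data dictionary more concrete, here is a minimal sketch of what one entry and a small lookup structure might look like. It is not part of ENMAP or of the WP5 paper: the class names, field names, definitions and "intention" values are all hypothetical and serve only to illustrate the objectives listed above (clear definition, transparent criteria, examples, classification by intention).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DictionaryEntry:
    """One entry of a hypothetical structural-metadata data dictionary."""
    name: str                                           # e.g. "Headline"
    kind: str                                           # "structural element" or "text type"
    definition: str                                     # clear, human-readable definition
    criteria: list[str] = field(default_factory=list)   # what makes something count as this entry
    intention: str = ""                                 # assumed label, e.g. "inform"
    examples: list[str] = field(default_factory=list)   # pointers to newspaper examples


class DataDictionary:
    """Open dictionary: anyone can add entries, names must stay unique."""

    def __init__(self) -> None:
        self._entries: dict[str, DictionaryEntry] = {}

    def add(self, entry: DictionaryEntry) -> None:
        key = entry.name.lower()
        if key in self._entries:
            raise ValueError(f"entry already defined: {entry.name}")
        self._entries[key] = entry

    def lookup(self, name: str) -> DictionaryEntry:
        return self._entries[name.lower()]

    def by_intention(self, intention: str) -> list[str]:
        # Names of all entries classified under the given intention.
        return [e.name for e in self._entries.values() if e.intention == intention]


# Placeholder content; the definitions and intentions below are illustrative only.
dictionary = DataDictionary()
dictionary.add(DictionaryEntry(
    name="Headline",
    kind="structural element",
    definition="Title line that introduces an article.",
    criteria=["set apart typographically from the body text"],
    intention="navigate",
))
dictionary.add(DictionaryEntry(
    name="Obituary",
    kind="text type",
    definition="Notice reporting the death of a person.",
    intention="inform",
))
```

A service provider or search service could then call `dictionary.lookup("headline")` to retrieve the agreed definition, or `dictionary.by_intention("inform")` to build a facet.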
Some considerations to take home...
• Understand text as an interaction with the reader:
• A text may want to inform, entertain, convince, activate, support, etc. users. What are the main interactions in (historical) newspapers?
• Does the layout define the interaction or the semantic content or a combination of both?
• Are family notices, obituaries, crossword puzzles, poems, novels, etc. articles or (intellectual) items?
• Is the headline of an article a piece of information, or does it support the user in navigating through a newspaper?
• Imagine a crowd-based service where users can apply text types from a list.
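As a thought experiment for the last point, the following sketch shows how such a service might restrict user input to a fixed list of text types. The list is taken from the text types named earlier in this talk; the function name, the storage layout and the item and user identifiers are assumptions made for illustration and do not describe an existing service.

```python
# Controlled list of text types, copied from the examples given in this talk.
TEXT_TYPES = {
    "breaking news", "short news",
    "book review", "theatre review", "software review",
    "obituary", "advertisement", "family notice",
    "job announcement", "weather forecast", "novel", "poem",
}


def apply_text_type(annotations, item_id, user, text_type):
    """Record that `user` assigned `text_type` to the newspaper item `item_id`.

    `annotations` is a plain dict of the form {item_id: {user: text_type}}.
    Labels outside the controlled list are rejected.
    """
    label = text_type.strip().lower()
    if label not in TEXT_TYPES:
        raise ValueError(f"unknown text type: {text_type!r}")
    annotations.setdefault(item_id, {})[user] = label
    return label


# Hypothetical usage: two users tag the same item.
annotations = {}
apply_text_type(annotations, "item-001", "user-a", "Obituary")
apply_text_type(annotations, "item-001", "user-b", "family notice")
```

The two users in the usage example disagree about the same item, which is exactly where clear definitions and transparent criteria in the data dictionary would be needed.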
Want to contribute?
• Send me your lists of structural elements and how you defined them!
• For project partners: Have a look at the updated version of the paper on structural metadata: WP5/documents/structural MD
• Do not hesitate to take part in the discussion!
Thank you for your attention!
Günter Mühlberger <[email protected]>