EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Archaeology and Cultural Heritage
Presented by
Keith May @Keith_May
Incorporating work by
Ceri Binding & Prof Doug Tudhope, Faculty of Advanced Technology, University of South Wales
& Anja Masur & Gerald Hiebel, ARIADNE project partners
Overview of Presentation
• Intro to LOD research
• Barriers Encountered
• Possible Solutions
• Opportunities
• Conclusions
Increasing Access and Integration of Excavation records through use of conceptual modelling
• CRM-EH, and now CRMarchaeo, focus on the common ‘core’ concepts of our archaeological processes
• Common concepts (e.g. stratigraphic relationships, as in the Harris matrix) are crucial for relating individual records
• Modelled/mapped only a limited degree of the minute archaeological detail to CIDOC CRM (ISO 21127)
• Different broad categories of contexts (Deposits, Masonry, Timber, etc) handled by separate forms but modelled together
• Model already "complex" enough - most archaeologists find it a little daunting
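To illustrate why stratigraphic relationships are such a useful common concept, here is a minimal sketch that treats a Harris matrix as a directed graph and derives one valid earliest-to-latest sequence from it. The context numbers and relationships are hypothetical, and this is plain Python rather than anything taken from CRM-EH or CRMarchaeo.

```python
from graphlib import TopologicalSorter

# Hypothetical context numbers. Each key lies stratigraphically above
# (is later than) every context in its set.
is_above = {
    "100": {"101", "102"},  # topsoil seals a pit fill and a wall
    "101": {"103"},         # pit fill lies within pit cut 103
    "102": {"104"},         # wall stands on natural 104
    "103": {"104"},         # pit cut truncates natural 104
    "104": set(),           # natural, nothing recorded below it
}


def earliest_first(relations):
    """Return contexts in one valid order from earliest to latest."""
    # TopologicalSorter reads the mapped values as predecessors, so the
    # contexts lying below are emitted before the contexts above them.
    return list(TopologicalSorter(relations).static_order())


sequence = earliest_first(is_above)
print(sequence)
```

A contradictory record (context A above B and B above A) makes `static_order()` raise `graphlib.CycleError`, which is a cheap consistency check on recorded relationships.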
Details of Context on recording form
Prototype Controlled Vocabulary searching
Barriers
Documentation
• Different excavation methods bring differing documentation
• Comparison of different documentation sheets
Similarities and Differences
With thanks to Anja Masur
What about comparing records across different countries?
Context
Locus
Unit
Spit
Level
Stratum
Behälter (Troy)
Layer
Semantics: one concept, one meaning, different terms
Stratigraphic Unit
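The "one concept, different terms" idea can be sketched as a simple lookup. This follows the slide in treating all the listed terms as labels for a single shared concept. The concept identifier, the record layout and the `concept_for` helper are hypothetical and do not come from a published vocabulary.

```python
# Hypothetical identifier for the shared concept; not a published URI.
STRATIGRAPHIC_UNIT = "concept:stratigraphic-unit"

# Local recording terms from the slide, all pointing at the one concept.
LOCAL_TERMS = [
    "Context", "Locus", "Unit", "Spit",
    "Level", "Stratum", "Behälter", "Layer",
]
TERM_TO_CONCEPT = {term.casefold(): STRATIGRAPHIC_UNIT for term in LOCAL_TERMS}


def concept_for(term):
    """Return the shared concept id for a local term, or None if unmapped."""
    return TERM_TO_CONCEPT.get(term.strip().casefold())


# Hypothetical records from three projects using different terminology.
records = [
    {"project": "A", "unit_type": "Context", "id": "1001"},
    {"project": "B", "unit_type": "Locus", "id": "L-17"},
    {"project": "C", "unit_type": "Small find", "id": "SF3"},
]

# Cross-search on the concept rather than on the local term.
matches = [r["id"] for r in records if concept_for(r["unit_type"]) == STRATIGRAPHIC_UNIT]
print(matches)
```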
“…another of my examples has something about some flint that is ‘snuff coloured’ & I don’t know if I’ve ever seen snuff, let alone know what colour it is, or might have been over 150 years ago, and I would think it would make sense to take some kind of integrated approach from the outset,….” [G. Carver]
For data entry: semi-controlled vocabularies represent a useful compromise somewhere between descriptive and controlled vocabularies, the best of both worlds!
For data retrieval: the worst of all worlds (e.g. "find all the Iron Age post holes").
This problem arises from trying to do two different things within a single input field. We should do both, but separately:
1) describe using free-text description fields, and
2) index using controlled index fields
Barrier: Semi-controlled vocabularies…
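A minimal sketch of the "do both, but separately" suggestion: one free-text field that is never validated, plus index fields that only accept controlled terms. The term lists, the `ContextRecord` class and the sample records are all hypothetical. A real system would draw its terms from a published thesaurus.

```python
from dataclasses import dataclass

# Hypothetical controlled lists, standing in for published thesauri.
CONTROLLED_PERIODS = {"IRON AGE", "ROMAN", "MEDIEVAL"}
CONTROLLED_FEATURES = {"POST HOLE", "PIT", "DITCH"}


@dataclass
class ContextRecord:
    context_id: str
    description: str   # free text: anything the excavator wants to say
    period: str        # index field: must be a controlled term
    feature_type: str  # index field: must be a controlled term

    def __post_init__(self):
        # Only the index fields are validated; the description is left alone.
        if self.period not in CONTROLLED_PERIODS:
            raise ValueError(f"unknown period term: {self.period!r}")
        if self.feature_type not in CONTROLLED_FEATURES:
            raise ValueError(f"unknown feature term: {self.feature_type!r}")


def find(records, period, feature_type):
    """Retrieve context ids by exact match on the controlled index fields."""
    return [
        r.context_id
        for r in records
        if r.period == period and r.feature_type == feature_type
    ]


records = [
    ContextRecord("1001", "Small posthole?, snuff coloured fill with flint", "IRON AGE", "POST HOLE"),
    ContextRecord("1002", "Shallow ditch, possibly recut", "ROMAN", "DITCH"),
    ContextRecord("1003", "post-hole, prob. IA", "IRON AGE", "POST HOLE"),
]

print(find(records, "IRON AGE", "POST HOLE"))
```

Retrieval never has to parse the descriptions, so "posthole?" and "post-hole, prob. IA" are both found through their index terms.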
Unlocking Some Barriers
Archaeological Terms represented as Concepts with Relationships
SKOS = Simple Knowledge Organisation System
STELLAR Project Tools - SKOS Template
Using SKOS - W3C standard for Web-based Terminologies
[Diagram: SKOS relationships between concepts]
skos:Concept Castle:c789 and skos:Concept Motte:c456, linked by skos:broader / skos:narrower
skos:Concept Bailey:c789 and skos:Concept Motte:c456, linked by skos:related in both directions
skos:Concept Motte:c456 linked to skos:ConceptScheme Monument:s123 by skos:inScheme
SKOS_CONCEPTS – scheme_id, broader_id, related_id
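As a rough illustration of what a template like this produces, the sketch below turns rows with `scheme_id`, `broader_id` and `related_id` columns into SKOS statements in N-Triples form. It is not the STELLAR tool itself. The `concept_id` and `label` columns, the `http://example.org/` base URI and the `c123` identifier for Castle are assumptions made for the example, as is the choice of Castle as the broader concept.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical base URI, not HeritageData's

# Hypothetical rows; concept_id and label are assumed column names.
rows = [
    {"concept_id": "c123", "label": "Castle", "scheme_id": "s123", "broader_id": "", "related_id": ""},
    {"concept_id": "c456", "label": "Motte", "scheme_id": "s123", "broader_id": "c123", "related_id": "c789"},
    {"concept_id": "c789", "label": "Bailey", "scheme_id": "s123", "broader_id": "", "related_id": "c456"},
]


def uri(local_id):
    """Wrap a local identifier as an N-Triples URI reference."""
    return f"<{BASE}{local_id}>"


def skos_triples(rows):
    """Return one N-Triples line per SKOS statement implied by the rows."""
    triples = []
    for row in rows:
        subject = uri(row["concept_id"])
        # Escape backslashes and quotes so the label is a valid literal.
        label = row["label"].replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .")
        triples.append(f'{subject} <{SKOS}prefLabel> "{label}"@en .')
        triples.append(f"{subject} <{SKOS}inScheme> {uri(row['scheme_id'])} .")
        if row["broader_id"]:
            triples.append(f"{subject} <{SKOS}broader> {uri(row['broader_id'])} .")
        if row["related_id"]:
            triples.append(f"{subject} <{SKOS}related> {uri(row['related_id'])} .")
    return triples


triples = skos_triples(rows)
print("\n".join(triples))
```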
▪ Controlled vocabularies online
  ▪ Vocabularies from EH, RCAHMS, RCAHMW (England, Scotland, Wales)
  ▪ Conversion to a common standard format (SKOS)
  ▪ Persistent globally unique identifiers (URIs) for every concept
  ▪ Made available online as Linked Open Data
  ▪ Also downloadable data files and listings
▪ Web services
  ▪ Facilitate concept searching, browsing, suggestion, validation
▪ Tools to use controlled vocabularies
  ▪ Browser-based ‘widget’ user interface controls
  ▪ Search, browse, suggest, select concepts
▪ Use Cases
  ▪ Legacy data to thesaurus alignment
  ▪ Thesaurus to thesaurus alignment
  ▪ Third party use of project outcomes
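The "legacy data to thesaurus alignment" use case can be approximated locally with fuzzy string matching. The sketch below is not the SENESCHAL web service. It uses Python's standard `difflib` on a tiny thesaurus whose terms and identifiers are hypothetical, and the 0.8 similarity cutoff is an arbitrary choice.

```python
from difflib import get_close_matches

# Hypothetical preferred terms and concept ids, not real HeritageData URIs.
THESAURUS = {
    "POST HOLE": "concept/1",
    "PIT": "concept/2",
    "DITCH": "concept/3",
}


def align(legacy_term, cutoff=0.8):
    """Suggest the closest thesaurus term for a legacy value, or None."""
    # Normalise case, hyphens and stray whitespace before comparing.
    normalised = legacy_term.strip().upper().replace("-", " ")
    candidates = get_close_matches(normalised, list(THESAURUS), n=1, cutoff=cutoff)
    if not candidates:
        return None
    term = candidates[0]
    return term, THESAURUS[term]


for legacy in ["posthole", "post-hole", "Ditches", "Hearth"]:
    print(legacy, "->", align(legacy))
```

Terms that fall below the cutoff come back as `None`, so a person can review them rather than having them silently forced onto the nearest concept.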
Vocabulary Widgets (RDF/XML) – e.g. for OASIS & Related Archives
[Widget screenshots: composite control, top concepts, scheme details, scheme list]
More Widget details on HeritageData.org
Allows us to embed controlled vocabularies in web pages and web forms to better index data and improve search and access to research archives
LOD Heritage Vocabularies: http://www.heritagedata.org
Thesaurus searching and browsing
Opportunities
Opportunities: E.g. British Oceanographic Data Centre - LOD
EH Thesauri of Maritime Craft
With Thanks to Adam Leadbetter
SENESCHAL - Semantic ENrichment Enabling Sustainability of arCHAeological Links
Opportunities
▪ Clwyd-Powys (Wales) Archaeological Trust (SENESCHAL widgets embedded into HER application and mobile field recording app)
Opportunities for Alignment of Thesauri: Getty Art & Architecture Thesaurus (AAT) vocabulary as SKOS LOD
STAR - Semantic Technologies for Archaeological Resources
With thanks to Andreas Vlachidis
Natural Language Processing (NLP)
• NLP Information Extraction (IE) of concepts from OASIS grey literature (GL) reports, such as:
  • Place
  • Period
  • Object
• Utilise semantic annotation XML files
• MySQL database server to store relevant thesauri structures
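A much simplified sketch of the idea: look up known Place, Period and Object terms in report text and write the hits out as XML annotations. This is plain gazetteer matching, not the STAR project's actual pipeline. The term lists, the sample sentence and the XML element names are all hypothetical, and in practice the terms would come from the thesauri rather than being hard-coded.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mini-gazetteers standing in for stored thesaurus terms.
GAZETTEER = {
    "Place": ["York"],
    "Period": ["Iron Age", "Roman"],
    "Object": ["post hole", "pit"],
}


def extract(text):
    """Return (start, end, type, surface) tuples sorted by position."""
    found = []
    for concept_type, terms in GAZETTEER.items():
        for term in terms:
            # Whole-word, case-insensitive match of each gazetteer term.
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                found.append((match.start(), match.end(), concept_type, match.group()))
    return sorted(found)


def to_xml(text, annotations):
    """Serialise the text and its annotations as a small XML document."""
    root = ET.Element("document")
    ET.SubElement(root, "text").text = text
    for start, end, concept_type, surface in annotations:
        node = ET.SubElement(root, "annotation", type=concept_type,
                             start=str(start), end=str(end))
        node.text = surface
    return ET.tostring(root, encoding="unicode")


report = "A Roman pit cut an Iron Age post hole near York."
annotations = extract(report)
print(to_xml(report, annotations))
```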
Stages for making Data Open
LOD may blur existing boundaries as (Big) data integration becomes more dynamic
STAR outcomes suggest there are still 4 key stages for coherent data integration in the Archaeological Research Cycle:
Excavation archive stage
Results of Analysis
"Final" Publication
Integrated Archive for new Research
Open Archaeological Data somewhere over the horizon?
Different archaeological recording systems share common conceptual frameworks and semantic relationships
By conceptualising common relationships in our different data sets at a broad level and aligning vocabularies of shared reference terms we can cross-search data for patterns and broader answers to related research questions
The technologies are being developed in other domains (e.g. Biology), but is there a common will to share archaeological data openly for re-use in the interests of improving research methods?
References
Catalin Pavel. "Describing and Interpreting the Past"
Tudhope, May, Binding, Vlachidis. "Connecting Archaeological Data and Grey Literature via Semantic Cross Search". Internet Archaeology Vol 30. http://dx.doi.org/10.11141/ia.30.5
CIDOC CRM. http://cidoc.ics.forth.gr/
HeritageData LOD Vocabularies. http://www.heritagedata.org
Contact: @Keith_May