Open Provenance Model Tutorial Session 4: Use cases from data.gov.uk
Outline
• Background about data.gov.uk
• The use cases
  – XML serialization
  – Data transformation on the fly
  – Complex and nested processes
data.gov.uk
• Linking UK government data
• Aims:
  – Provide a set of best practices for government agencies
  – Provide the minimum set of tooling and specification to facilitate the publication of data
  – Encourage “responsible” data publishing
XML -> RDF
[Diagram: XSLT Processor with an input and an output (RDF File); XSLT ParameterBinding; XSLT Stylesheet; XSLT Template. Annotations: "Downloaded from; Unzipped from, etc."; "Made accessible"; "Who, when, which version, how".]
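The "who, when, which version, how" questions on this slide can be sketched as a small provenance graph. The example below is only an illustration: the relation names (used, wasGeneratedBy, wasControlledBy, wasDerivedFrom) are the Open Provenance Model edge types, while the `ProvenanceGraph` class, the `record_xslt_run` helper, the file names, and the agent identifier are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceGraph:
    """Tiny in-memory graph: annotated nodes plus (source, relation, target, role) edges."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, kind, **annotations):
        self.nodes[node_id] = {"kind": kind, **annotations}

    def add_edge(self, source, relation, target, role=None):
        self.edges.append((source, relation, target, role))


def record_xslt_run(graph, source_xml, stylesheet, output_rdf, agent, version):
    """Record one XML -> RDF transformation. All identifiers are hypothetical."""
    run_id = f"xslt-run:{output_rdf}"
    for artifact in (source_xml, stylesheet, output_rdf):
        graph.add_node(artifact, "artifact")
    graph.add_node(agent, "agent")                        # who
    graph.add_node(
        run_id,
        "process",
        processor_version=version,                        # which version
        started=datetime.now(timezone.utc).isoformat(),   # when
    )
    # how: the process used two artifacts and generated a third
    graph.add_edge(run_id, "used", source_xml, role="input")
    graph.add_edge(run_id, "used", stylesheet, role="stylesheet")
    graph.add_edge(output_rdf, "wasGeneratedBy", run_id, role="output")
    graph.add_edge(run_id, "wasControlledBy", agent, role="operator")
    graph.add_edge(output_rdf, "wasDerivedFrom", source_xml)
    return run_id


graph = ProvenanceGraph()
run = record_xslt_run(
    graph, "input.xml", "xml2rdf.xsl", "output.rdf", "agent:publisher", "1.0"
)
```

Earlier steps such as "downloaded from" or "unzipped from" could be added the same way, as further wasDerivedFrom edges leading back from `input.xml`.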
On-the-fly Transformation
[Diagram: Data transformation wrapper; http://mytransportatio.db/j10. Annotation: "Who, when, which version, how".]
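One way to picture a transformation wrapper that answers these questions is a function decorator that returns a provenance record alongside every transformed result. This is a sketch under assumptions, not the data.gov.uk implementation: `with_provenance`, `to_triples`, the agent name, the version string, and the example.org URIs are hypothetical; only the source URI is taken from the slide.

```python
import functools
from datetime import datetime, timezone


def with_provenance(agent, version):
    """Wrap a transformation so each call also yields a provenance record."""
    def decorator(transform):
        @functools.wraps(transform)
        def wrapper(source_uri, payload):
            result = transform(payload)
            record = {
                "process": transform.__name__,                  # how
                "used": source_uri,                             # input that was transformed
                "wasControlledBy": agent,                       # who
                "version": version,                             # which version
                "when": datetime.now(timezone.utc).isoformat(), # when
            }
            return result, record
        return wrapper
    return decorator


@with_provenance(agent="agent:wrapper-service", version="0.1")
def to_triples(rows):
    """Toy transformation: (subject, predicate, object) rows to N-Triples-like lines."""
    return [f'<{s}> <{p}> "{o}" .' for s, p, o in rows]


triples, provenance = to_triples(
    "http://mytransportatio.db/j10",
    [("http://example.org/j10", "http://example.org/name", "Junction 10")],
)
```

Because the data is produced on request, the record is created per call rather than stored once with a file.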
Complex Data Creation Pipeline
GATE Pipeline
GateXMLRegressionTransformation
GateXMLRdfaTransformation
RdfaRdfXmlTransformation
Courtesy of Paul Appleby from TSO (Data Enrichment Service)
Complex Data Creation Pipeline: GATE Pipeline components
• Document Reset PR
• ANNIE English Tokeniser
• ANNIE English Splitter
• ANNIE POS Tagger
• Data.gov.uk Morphological Analyzer
• Data.gov.uk Flexible Roof Gazetteer
• Data.gov.uk Generic Gazetteer
• GATE Noun Phrase Chunker
• Data.gov.uk Generic Transducer
• TSO Coreference
[Diagram: provenance captured at two levels of detail.
Level 1: Provenance of execution at higher level
Level 0: Provenance of execution at detailed level
Other elements: Services used by executions; Artifacts; A data collection
Edge labels: wasGeneratedBy, hasParentProcess, iterationOfProcess, followed, wasDerivedFrom, wasTriggeredBy, accessedService]
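The relationship between the two levels can be sketched as a roll-up: detailed wasGeneratedBy edges at Level 0 are lifted to Level 1 by following hasParentProcess. The reading of hasParentProcess as "this detailed step belongs to that higher-level process" is an assumption here, and the `roll_up` helper and the artifact identifiers are hypothetical; the process names come from the pipeline slides.

```python
def roll_up(generated_by, parent_of):
    """Lift artifact -> process edges from Level 0 to Level 1.

    A process without a recorded parent is kept as-is.
    """
    return {
        artifact: parent_of.get(process, process)
        for artifact, process in generated_by.items()
    }


# Level 0 step -> Level 1 process (assumed meaning of hasParentProcess)
has_parent_process = {
    "ANNIE English Tokeniser": "GATE Pipeline",
    "ANNIE POS Tagger": "GATE Pipeline",
}

# Level 0 provenance: artifact -> process that generated it
was_generated_by = {
    "artifact:tokens": "ANNIE English Tokeniser",
    "artifact:pos-tags": "ANNIE POS Tagger",
    "artifact:rdfa": "GateXMLRdfaTransformation",
}

level1 = roll_up(was_generated_by, has_parent_process)
```

A consumer who only needs the higher-level account can read `level1`, while the detailed edges remain available for anyone who needs to audit individual steps.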
Non-digital Data Objects
• Organizations
  – Organizational structure changes over time
  – Origin organization, resulting organization
• Boundary
• Legislation
An organization ontology: http://www.epimorphics.com/public/vocabulary/org.html
The Challenges
• Data of different representations, physical forms, and granularity
• No tooling support
• Provenance across different types of systems
  – Identification
  – Different terminologies
The Gaps
• A vocabulary able to describe the provenance of all types of data, from different systems
• A vocabulary that still provides enough terms to describe provenance accurately
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License
(http://creativecommons.org/licenses/by-sa/3.0/)