Information intermediaries for government linked data Dave Reynolds, Epimorphics Ltd

Upload: dave-reynolds

Post on 08-May-2015


Page 1: Information Intermediaries

Information intermediaries for government linked data

Dave Reynolds, Epimorphics Ltd

Page 2: Information Intermediaries

Governments around the world are releasing data

Page 3: Information Intermediaries

Why?

transparency, openness, it’s public data
tap creativity, enthusiasm of web developers
stimulate applications for citizens & commerce
  track crime in your area
  understand where funding is going
  plan travel
  choose a school

Page 4: Information Intermediaries

Theme for this talk

how to accelerate this uptake?
  reduce cost of exploiting public data?
  stimulate an ecosystem of value added services?

data dump and information intermediaries
linked data approach
intermediaries for a linked data world

Page 5: Information Intermediaries

Traditional publication approach: data dumps

publish individual datasets – typically CSV
easy for publisher
consumer has complete control
  no complex formats or query languages
  manage data as they want to
  familiar technology stack
growing set of intermediaries
  web services to help you work with datasets
  not specific to public sector data

Page 6: Information Intermediaries

Intermediary services

Service: Discovery
  Features: Metadata search; Faceted search; Social annotation; Aggregation across repositories
  Examples: CKAN, Numbrary, Guardian data store, Socrata, Infochimps, Factual

Service: API access
  Features: Programmatic access; Query support; RESTful access API; Multiple formats
  Examples: Factual, Google spreadsheets (e.g. Guardian data store)

Service: Data model
  Features: Access to the data model, schema or ontology
  Examples: Factual

Page 7: Information Intermediaries

Intermediary services

Service: Data exploration
  Features: Table views; Slice, dice, drill down
  Examples: Swivel, Factual, Socrata

Service: Visualization & comparison
  Features: Interactive charting; Graph one set against another
  Examples: Swivel, ManyEyes, Google public data explorer

Service: Embeddable views
  Features: Static embeddable charts/graphs; Embeddable interactive widgets
  Examples: Swivel, Factual

Page 8: Information Intermediaries

Intermediary services

Service: Data quality
  Features: Ability to correct data; Provenance tracking for corrections
  Examples: Factual [Several support social annotations]

Service: Commerce support
  Features: Marketplace for datasets; Bids and offers; Pay per set (pay per use?)
  Examples: Infochimps, Microsoft

Page 9: Information Intermediaries

Limitations to data dumps

Silo design pattern
  each application does its own data integration
  hard to share or reuse efforts between applications
Static
  local stores which require management and update

*http://www.flickr.com/photos/zoomzoom/

Page 10: Information Intermediaries

Linked data: public sector data web

How:
URIs to identify things described
dereference to RDF (& other formats)
SPARQL endpoints for query
vocabularies and patterns for statistics, versioning, provenance ...
standard URI sets
  time periods, regions, departments, schools ...
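
As an illustration of "dereference to RDF (& other formats)", the sketch below prepares HTTP requests for the same URI with different Accept headers, which is the usual content negotiation mechanism for linked data. The school URI is the one that appears later in the deck; the helper name `negotiated_request` and the choice of media types are assumptions made for this example, and nothing is sent over the network.

```python
from urllib.request import Request

# URI taken from the JSON example later in this deck; it may no longer resolve.
SCHOOL_URI = "http://education.data.gov.uk/id/school/123242"


def negotiated_request(uri, media_type):
    """Prepare (but do not send) a GET request asking for one representation."""
    return Request(uri, headers={"Accept": media_type})


# Same identifier, two requested representations.
rdf_request = negotiated_request(SCHOOL_URI, "application/rdf+xml")
json_request = negotiated_request(SCHOOL_URI, "application/json")

for req in (rdf_request, json_request):
    print(req.full_url, "Accept:", req.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual dereference, assuming the server is still running and supports the requested format.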

Page 11: Information Intermediaries

Public sector data web

[Diagram of linked datasets: Schools, Time Periods, Gov. Bodies, Admin Geography, Edubase, Ofsted, DCSF]

Page 12: Information Intermediaries

Benefits of linked data approach

integrated (linked!) data
  standard identifiers enable linking of other sets
  seed connections between third party sets
fine grain addressing of data
  annotations (e.g. provenance)
fine grained programmatic access
  consume live or cache, not forced to use static
data model directly linked from data

Page 13: Information Intermediaries

But ...

barrier to entry too high - “just give us CSV”
  alien data model
  alien query methods
  alien representation formats
  overall mismatch to typical web developer tool kit

Page 14: Information Intermediaries

Solution

middleware to provide web-friendly access
run at publisher end or as an intermediary
publish as linked data -> automatic API
  configure automatically from ontology
  customize configuration (e.g. URI patterns) if needed

Page 15: Information Intermediaries

Linked data API

Access
  RESTful API design
  serve lists of resources or individual resources
  automatic sorting, paging of lists
  simple web API to control filtering, viewing

Formatting
  developer-friendly JSON & XML
  retain resource-centric model
  remove round-tripping requirements
  rooted graph
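
To make "simple web API to control filtering, viewing" concrete from the client side, here is a minimal sketch that assembles a list URL with a range filter and a page number. The `min-` prefix and the `_page` parameter follow the URLs shown in the JSON example later in the deck; the host `example.org` and the helper `build_list_url` are hypothetical names used only for illustration.

```python
from urllib.parse import urlencode


def build_list_url(base, filters=None, page=None):
    """Build a list-endpoint URL with optional range filters and a page number.

    filters maps a property name to an (operator, value) pair, where the
    operator is a prefix such as "min" (as in min-schoolCapacity on the slides).
    """
    params = []
    for prop, (op, value) in (filters or {}).items():
        params.append((f"{op}-{prop}", value))
    if page is not None:
        params.append(("_page", page))
    query = urlencode(params)
    return f"{base}?{query}" if query else base


# Hypothetical host; the path mirrors the one used on the slides.
url = build_list_url(
    "http://example.org/doc/schools/district/Oxford.json",
    filters={"schoolCapacity": ("min", 1200)},
    page=0,
)
print(url)
```

The point of the design is that a developer only manipulates ordinary query parameters; no SPARQL appears on the client side.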

Page 16: Information Intermediaries

Structure

[Diagram. Labelled elements:
  request: GET /doc/schools/district/Oxford.json?min-capacity=1200
  API specification, with an Endpoint made up of a selector, a viewer and a formatter
  selector query: SELECT ?item WHERE { ... }
  viewer query: DESCRIBE <x> <y>
  Data source: SPARQL endpoint
  vocabulary of data set
  cache
  response]

Page 17: Information Intermediaries

Operation

[Diagram: processing steps for the request /doc/schools/district/Oxford.json?min-capacity=1200]

Match endpoint
  /doc/schools/district/{d}

Retrieve matches
  SELECT ?r WHERE {
    ?r a school:School ;
       school:district [ rdfs:label 'Oxford' ] ;
       school:capacity ?c .
    FILTER (?c >= 1200)
  } OFFSET 0 LIMIT 10

Build response
  List: page N (school i, school i, school i), linked to page N-1 and page N+1
  metadata: query and configuration

Select format: JSON
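
The "match endpoint" and "retrieve matches" steps on this slide can be sketched roughly as follows. This is not the Linked Data API implementation, only a simplified illustration under stated assumptions: the URI template and the query shape are copied from the slide, while the function names, the handling of the format suffix, and the fixed page size of 10 are choices made for this example. The `school:` and `rdfs:` prefixes are left undeclared, as on the slide.

```python
import re

TEMPLATE = "/doc/schools/district/{d}"  # URI template from the slide


def template_to_regex(template):
    """Turn a template with {name} variables into an anchored regex."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        # Even positions are literal text, odd positions are variable names.
        pattern += re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
    return re.compile(f"^{pattern}$")


def match_endpoint(path, template=TEMPLATE):
    """Return (bindings, format) if the path matches the template, else None."""
    stem, _, fmt = path.rpartition(".")
    if not stem or "/" in fmt:  # no format suffix such as ".json"
        stem, fmt = path, None
    match = template_to_regex(template).match(stem)
    return (match.groupdict(), fmt) if match else None


def build_select(bindings, params, page=0, page_size=10):
    """Build the selector query for one page of results."""
    district = bindings["d"].replace("\\", "\\\\").replace("'", "\\'")
    lines = [
        "SELECT ?r WHERE {",
        "  ?r a school:School ;",
        f"     school:district [ rdfs:label '{district}' ] ;",
        "     school:capacity ?c .",
    ]
    if "min-capacity" in params:
        # int() rejects anything that is not a plain number.
        lines.append(f"  FILTER (?c >= {int(params['min-capacity'])})")
    lines.append(f"}} OFFSET {page * page_size} LIMIT {page_size}")
    return "\n".join(lines)


bindings, fmt = match_endpoint("/doc/schools/district/Oxford.json")
query = build_select(bindings, {"min-capacity": "1200"})
print(fmt)
print(query)
```

In the real system the mapping from `min-capacity` to `school:capacity` would come from the API configuration and the data set's vocabulary rather than being hard-coded as it is here.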

Page 18: Information Intermediaries

JSON serialization

"results":[
  {
    "_about":"http://.../district/Oxford?min-schoolCapacity=1200&_page=0",
    "first":"http://.../district/Oxford?&min-schoolCapacity=1200&_page=0",
    "isPartOf":"http://.../district/Oxford?&min-schoolCapacity=1200",
    "page":0, "pageSize":10,
    "type":"http://www.epimorphics.com/vocabularies/api#Page",
    "contains":[
      {
        "_about":"http://education.data.gov.uk/id/school/123242",
        "label":"Peers School",
        "districtAdministrative":{
          "_about":"http://statistics.data.gov.uk/id/local-authority-district/38UC",
          "label":"Oxford" },
        "phaseOfEducation":{
          "_about":"http://education.data.gov.uk/def/school/PhaseOfEducation_Secondary",
          "label":"Secondary" },
        "schoolCapacity":1220,
        "type":[
          {
            "_about":"http://education.data.gov.uk/def/school/School",
            "label":"School" } ]
      }, ...
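
A short sketch of how a developer without an RDF stack might consume such a page. The sample below is a trimmed, self-contained copy of the page object from this slide (the elided `http://.../` prefix is kept exactly as shown); the function name `summarize` and the choice of output fields are assumptions for illustration.

```python
import json

# Trimmed copy of the page object shown on the slide.
PAGE_JSON = """
{
  "_about": "http://.../district/Oxford?min-schoolCapacity=1200&_page=0",
  "page": 0,
  "pageSize": 10,
  "contains": [
    {
      "_about": "http://education.data.gov.uk/id/school/123242",
      "label": "Peers School",
      "districtAdministrative": {
        "_about": "http://statistics.data.gov.uk/id/local-authority-district/38UC",
        "label": "Oxford"
      },
      "schoolCapacity": 1220
    }
  ]
}
"""


def summarize(page):
    """Flatten each item in a page into a small dict of plain values."""
    rows = []
    for item in page.get("contains", []):
        district = item.get("districtAdministrative", {})
        rows.append({
            "uri": item["_about"],  # the resource keeps its URI identity
            "name": item.get("label"),
            "district": district.get("label"),
            "capacity": item.get("schoolCapacity"),
        })
    return rows


rows = summarize(json.loads(PAGE_JSON))
for row in rows:
    print(row["name"], row["district"], row["capacity"], row["uri"])
```

Nested resources such as the district are ordinary nested objects, each with its own `_about` URI, which is what lets the JSON stay resource-centric while remaining easy to walk.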

Page 19: Information Intermediaries

Linked data API: outcomes

lowers barrier to entry
  very positive reception
  build linked data applications with e.g. jQuery
  no need for full RDF stack
stepping stone to linked data world
  retain concept of resources with URIs
  retain schema-less model
  look at the SPARQL you made, look at API config
open specification (Epimorphics, Talis, TSO)
  multiple implementations, including open source
  http://code.google.com/p/linked-data-api/

Page 20: Information Intermediaries

What other mediators are needed for a linked data world?

Service: Discovery
  Features: Metadata search; Search on entity/concept use
  Examples: Sindice; DCAT, VOiD

Service: Integration
  Features: Entity co-reference discovery; Ontology mapping; Link to text (named entity rec.)
  Examples: Uberlic, SameAs.org, TSO doc service, Freebase

Service: Enrichment
  Features: Inference closure; Structure transformation
  Examples: WebPIE; ??

Service: Exploration
  Features: Follow linked data graph
  Examples: Tabulator, ODE, Disco, Zitgist, sig.ma ...

Service: Visualization
  Features: Interactive slice, compare, visualize
  Examples: ??

Page 21: Information Intermediaries

Conclusions

intermediary services, such as LD access API, can
  make the power and flexibility of linked data available to broader range of developers
  meet public sector goals of stimulating network of value added applications for citizens and business

lots more to do ...