Linking UK Government Data, John Sheridan

John Sheridan, Linked Data lead for data.gov.uk, Head of Legislation Services at The [UK] National Archives

Upload: semantic-web-company

Post on 11-Jun-2015

DESCRIPTION

Keynote presentation by John Sheridan at the OGD2011 conference on 16 June 2011 in Vienna: Linking UK Government Data (in English).

TRANSCRIPT

Page 1: Linking UK Government Data, John Sheridan

John Sheridan, Linked Data lead for data.gov.uk, Head of Legislation Services at The [UK] National Archives

Page 2: Linking UK Government Data, John Sheridan

Page 3: Linking UK Government Data, John Sheridan

Page 4: Linking UK Government Data, John Sheridan

Page 5: Linking UK Government Data, John Sheridan

16. GOVERNMENT TRANSPARENCY

The Government believes that we need to throw open the doors of public bodies, to enable the public to hold politicians and public bodies to account. We also recognise that this will help to deliver better value for money in public spending, and help us achieve our aim of cutting the record deficit. Setting government data free will bring significant economic benefits by enabling businesses and non-profit organisations to build innovative applications and websites.

We will require public bodies to publish online the job titles of every member of staff and the salaries and expenses of senior officials paid more than the lowest salary permissible in Pay Band 1 of the Senior Civil Service pay scale, and organograms that include all positions in those bodies.

We will ensure that all data published by public bodies is published in an open and standardised format, so that it can be used easily and with minimal cost by third parties.

Page 6: Linking UK Government Data, John Sheridan

Page 7: Linking UK Government Data, John Sheridan

Formats for people
Focused on presentation or typographic layout
Look good, but hard to access the underlying data

Formats for machines
Focused on data interchange between computers
Look dreadful, hard for people to understand but easy to import into other systems and use

Page 8: Linking UK Government Data, John Sheridan

Single source of data

Formats for people
Focused on presentation or typographic layout

Formats for machines
Focused on data interchange between computers

Page 9: Linking UK Government Data, John Sheridan

Download
Good for static information
Small files
Used for export/import
Easy for publishers
Most of the data registered on data.gov.uk

Programmatic access
Good for dynamic or real-time information or very large datasets
Lets developers select and use just the information they need
Retains more control for the publisher
More complicated to implement but much more powerful
Vital for many useful datasets

Page 10: Linking UK Government Data, John Sheridan

Page 11: Linking UK Government Data, John Sheridan

He also developed the first industrially practical screw-cutting lathe in 1800, allowing standardisation of screw thread sizes for the first time. This allowed the concept of interchangeability (an idea that was already taking hold) to be practically applied to nuts and bolts. Before this, all nuts and bolts had to be made as matching pairs only. This meant that when machines were disassembled, careful account had to be kept of the matching nuts and bolts ready for when reassembly took place.

http://en.wikipedia.org/wiki/Henry_Maudslay

Page 12: Linking UK Government Data, John Sheridan

In 1841, Joseph Whitworth created a design that, through its adoption by many British railroad companies, became a national standard for the United Kingdom called British Standard Whitworth. During the 1840s through 1860s, this standard was often used in the United States and Canada as well, in addition to myriad intra- and inter-company standards.

http://en.wikipedia.org/wiki/Screw_thread#History_of_standardization

Page 13: Linking UK Government Data, John Sheridan

* make your stuff available on the Web (whatever format) under an open licence

** make it available as structured data (e.g., Excel instead of image scan of a table)

*** use non-proprietary formats (e.g., CSV instead of Excel)

**** use URIs to identify things, so that people can point at your stuff

***** link your data to other data to provide context

Page 14: Linking UK Government Data, John Sheridan

Page 15: Linking UK Government Data, John Sheridan

Give names, or web identifiers (URIs), to things

Publish information about them as Web Resources

Use RDF triples (subject, property, value)

Link to other data about those things
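
A minimal sketch of these steps in plain Python, holding each RDF triple as a (subject, property, value) tuple. The department URI is one that appears later in the talk; the label text and the example.org link target are hypothetical placeholders, and no RDF library is assumed.

```python
# Each RDF triple is a (subject, property, value) tuple.
DEPARTMENT = "http://reference.data.gov.uk/id/department/CO"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

triples = [
    # Information published about the named thing (illustrative label).
    (DEPARTMENT, LABEL, "Department CO"),
    # A link to data about the same thing held elsewhere (hypothetical target).
    (DEPARTMENT, SAME_AS, "http://example.org/other-dataset/department-co"),
]


def describe(subject, graph):
    """Return every (property, value) pair known about a subject."""
    return [(p, v) for s, p, v in graph if s == subject]


for prop, value in describe(DEPARTMENT, triples):
    print(prop, "->", value)
```

Because the subject is a URI, anyone else can reuse it as the subject or value of their own triples, which is what makes the linking step possible.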

Page 16: Linking UK Government Data, John Sheridan

Enables web-scale data publishing - distributed publication with web-based discovery mechanisms

Everything is a resource – follow your nose to discover more about properties, classes, or codes within a code list

Everything can be annotated - make comments about observations, data series, points on a map

Easy to extend - create new properties as required, no need to plan everything up-front

Easy to merge - slot together RDF graphs, no need to worry about name clashes
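
The "easy to extend" and "easy to merge" points can be sketched with plain Python sets, treating a graph as a set of (subject, property, value) tuples. The station URI appears later in the talk; every example.org property URI and every value below is hypothetical, and this illustrates the idea rather than any particular RDF library.

```python
# A graph is modelled as a set of (subject, property, value) triples.
STATION = "http://transport.data.gov.uk/id/station/WAT"

# Two publishers describe the same station independently. Both chose the
# local name "code", but the full property URIs differ (both hypothetical).
graph_a = {(STATION, "http://example.org/transport/code", "WAT")}
graph_b = {(STATION, "http://example.org/timetable/code", "hypothetical-42")}

# Merging is set union: full URIs keep the two "code" properties apart.
merged = graph_a | graph_b

# Extending needs no up-front schema change: add a triple with a new property.
merged.add((STATION, "http://example.org/notes/comment", "checked by hand"))

for triple in sorted(merged):
    print(triple)
```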

Page 17: Linking UK Government Data, John Sheridan
Page 18: Linking UK Government Data, John Sheridan

developing standards for responsible publishing of key types of data (financial data, organisation data, aggregate statistics, location data)

developing guidance, practices and tools that make it easy to publish data in Linked Data form, at low cost

making it easy for people to consume data in a programmatic way

Page 19: Linking UK Government Data, John Sheridan

      2008    2009    2010
A    1,345   1,456   2,301
B    2,112   3,543   2,111
C    2,345   2,987   2,455
D    6,342   6,256   6,123
E    7,435   7,432   8,102

Transaction  Date        Supplier           Amount
A-1263       09/09/2010  Spottiswoode & Co  £2,345
A-1264       09/09/2010  JSB & Sons         £2,111
A-1265       09/09/2010  BLG Ltd            £2,455
A-1266       09/09/2010  Spottiswoode & Co  £6,123
A-1267       09/09/2010  BLG Ltd            £8,102

Director General
Director (Operations)
Director (Strategy)
Deputy Director (A)
Deputy Director (A)

Page 20: Linking UK Government Data, John Sheridan

URI = Uniform Resource Identifier
Everything starts with HTTP, which gives us actionable names
There is a choice about how to make URIs
We are using {sector}.data.gov.uk/id/{something}
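
As a small illustration, the {sector}.data.gov.uk/id/{something} pattern can be filled in with a one-line Python helper. The function name `make_uri` is hypothetical, and the `http://` prefix is assumed from the "everything starts with HTTP" point.

```python
def make_uri(sector: str, something: str) -> str:
    """Fill the {sector}.data.gov.uk/id/{something} pattern."""
    return f"http://{sector}.data.gov.uk/id/{something}"


# Rebuilds one of the identifiers listed later in the talk.
print(make_uri("reference", "department/CO"))
```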

Page 21: Linking UK Government Data, John Sheridan

Page 22: Linking UK Government Data, John Sheridan
Page 23: Linking UK Government Data, John Sheridan

If you visit legislation.gov.uk you will see we have taken great care with naming things

Returns an HTML document for United Kingdom Public General Act (ukpga), 2005, Chapter 14, Section 1

Returns an HTML document with a list from all legislation types where the title contains “wildlife”

Page 24: Linking UK Government Data, John Sheridan

UK Public General Act (ukpga), 1981, Chapter 69, Section 5, as it extends to England, as it stood on 30th January 2001, displayed as an HTML document with the timeline on

Although URIs are opaque, having this type of design changes how people use the service

Page 25: Linking UK Government Data, John Sheridan

Page 26: Linking UK Government Data, John Sheridan

Everything on legislation.gov.uk is available as open data under the terms of our Open Government Licence

To access the data, visit any page and add:
/data.xml
/data.rdf
/data.xht
For lists: /data.feed
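
A minimal Python sketch of that rule: given a page URL, append the suffix for the wanted format. The helper name `data_url` and the sample page URL are assumptions for illustration; only the four suffixes come from the slide.

```python
# Suffixes listed on the slide, keyed by a short format name.
DATA_SUFFIXES = {
    "xml": "/data.xml",
    "rdf": "/data.rdf",
    "xht": "/data.xht",
    "feed": "/data.feed",  # for list pages
}


def data_url(page_url: str, fmt: str) -> str:
    """Append the data suffix for `fmt` to a legislation.gov.uk page URL."""
    return page_url.rstrip("/") + DATA_SUFFIXES[fmt]


# Illustrative page URL (an assumption, not taken from the slides).
page = "http://www.legislation.gov.uk/ukpga/2005/14/section/1"
print(data_url(page, "rdf"))
```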

Page 27: Linking UK Government Data, John Sheridan

Re-use where we can, create where we must

Small, high-level, lightweight vocabularies

Examples include datacube, organization, provenance

Create local specialisations

Examples include payments, central-government

Post hoc linking

Page 28: Linking UK Government Data, John Sheridan

[Diagram: the Data Cube vocabulary (qb:) and the SDMX and SKOS terms it links to]

Classes: qb:DataSet, qb:Slice, qb:Observation, qb:SliceKey, qb:DataStructureDefinition, qb:ComponentSpecification, qb:ComponentProperty, qb:DimensionProperty, qb:AttributeProperty, qb:MeasureProperty, qb:CodedProperty, sdmx:Concept, sdmx:ConceptRole, sdmx:CodeList, sdmx:FrequencyRole, sdmx:CountRole, sdmx:EntityRole, sdmx:TimeRole ..., skos:Concept, skos:ConceptScheme

Properties: qb:dataset, qb:structure, qb:slice, qb:subSlice, qb:observation, qb:sliceKey, qb:sliceStructure, qb:componentProperty, qb:dimension, qb:attribute, qb:measure, qb:measureType, qb:concept, qb:codeList

qb:ComponentSpecification attributes: qb:componentRequired : boolean, qb:componentAttachment : rdfs:Class, qb:order : xsd:int
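
As a rough illustration of how a few of these terms fit together, the Python sketch below writes one statistical observation as (subject, property, value) tuples. Only the qb: term names come from the slide; the example.org URIs and the dimension and measure names are hypothetical, and the figures reuse the first cell of the statistics table on page 19.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB = "http://purl.org/linked-data/cube#"
EX = "http://example.org/stats/"  # hypothetical namespace

observation = EX + "obs/A-2008"

triples = [
    (observation, RDF_TYPE, QB + "Observation"),
    # Property name spelled as it appears on the slide.
    (observation, QB + "dataset", EX + "dataset/example"),
    (observation, EX + "category", "A"),   # hypothetical dimension
    (observation, EX + "year", 2008),      # hypothetical dimension
    (observation, EX + "count", 1345),     # hypothetical measure
]


def value_of(subject, prop, graph):
    """Return the first value recorded for (subject, prop), or None."""
    for s, p, v in graph:
        if s == subject and p == prop:
            return v
    return None


print(value_of(observation, EX + "count", triples))
```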

Page 29: Linking UK Government Data, John Sheridan

[Diagram: payments vocabulary, with links to qb:, foaf:, org:, skos: and interval: terms]

Classes: PaymentDataset, Payment, ExpenditureLine, Purchase, Item, ItemCategory, foaf:Agent, org:OrganizationalUnit, interval:Interval, skos:Concept

Properties and other labels: qb:dataset, qb:slice, qb:structure, payer, payee, unit, date, payment, paymentReference, transactionReference, totalAmountIncludingVAT, totalAmountExcludingVAT, expenditureLine, expenditureCode, amountIncludingVAT, amountExcludingVAT, vatCategory, vatRate, narrative, purchase, order, invoice, contract, item, procurementCategory, redacted, capital, revenue

Page 30: Linking UK Government Data, John Sheridan

*new* Government Linked Data Working Group

Provenance Working Group

Page 31: Linking UK Government Data, John Sheridan

Page 32: Linking UK Government Data, John Sheridan

http://reference.data.gov.uk/id/day/2011-06-16

http://reference.data.gov.uk/id/department/CO

http://transport.data.gov.uk/id/station/WAT

http://education.data.gov.uk/id/school/341451

http://location.data.gov.uk/id/3245677362123

http://www.legislation.gov.uk/id/ukpga/2009/12/section/2

Page 33: Linking UK Government Data, John Sheridan

http://reference.data.gov.uk/id/day/2011-06-1
There are similar URIs for seconds, minutes, hours, weeks, months, quarters, years
We were a bit slow (170 years) to move from the Julian to Gregorian Calendar (see the Calendar Act, 1750)
To transition, we lost 11 days in 1752
Convoluted explanation of why the tax year in the UK starts on the 6th April
Our URIs for time intervals work this way too and the British time intervals URI Set is linked to the legislation
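
A short Python sketch of two points on this slide. The `day_uri` helper (a hypothetical name) follows the day URI pattern shown earlier in the talk. The date arithmetic follows the commonly cited account of the 6 April tax year, which the slide does not spell out and is stated here as an assumption: the year used to begin on 25 March, the 11 lost days moved that to 5 April, and the skipped leap day in 1800 moved it one day further.

```python
from datetime import date, timedelta


def day_uri(d: date) -> str:
    """URI for a calendar day, following the /id/day/ pattern in the talk."""
    return f"http://reference.data.gov.uk/id/day/{d.isoformat()}"


# Assumption: the old year start was 25 March. Python dates are proleptic
# Gregorian, which is fine for counting days within the month.
old_year_start = date(1752, 3, 25)
after_lost_days = old_year_start + timedelta(days=11)  # lands on 5 April
after_1800 = after_lost_days + timedelta(days=1)       # lands on 6 April

print(day_uri(date(2011, 6, 16)))
print(after_lost_days.strftime("%d %B"), "->", after_1800.strftime("%d %B"))
```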

Page 34: Linking UK Government Data, John Sheridan

Page 35: Linking UK Government Data, John Sheridan

Malcolm Gladwell article on Ron Popeil from 2000 in the New Yorker:

“And how do you persuade people to disrupt their lives? Not merely by ingratiation or sincerity, and not by being famous or beautiful. You have to explain the invention to consumers - not once or twice but three or four times, with a different twist each time. You have to show them exactly how it works and why it works, and make them follow your hands as you chop liver with it, and then tell them precisely how it fits into their routine, and, finally, sell them on the paradoxical fact that, revolutionary as the gadget is, it's not at all hard to use.”

Page 36: Linking UK Government Data, John Sheridan

Page 37: Linking UK Government Data, John Sheridan

Page 38: Linking UK Government Data, John Sheridan

Page 39: Linking UK Government Data, John Sheridan

Page 40: Linking UK Government Data, John Sheridan

Open Standard
Generic approach for creating APIs from Linked Data
Sits on top of a Linked Data store
Several implementations, most mature is Puelia

Page 41: Linking UK Government Data, John Sheridan

Page 42: Linking UK Government Data, John Sheridan

Page 43: Linking UK Government Data, John Sheridan

Page 44: Linking UK Government Data, John Sheridan

Page 45: Linking UK Government Data, John Sheridan

We will require public bodies to publish online the job titles of every member of staff and the salaries and expenses of senior officials paid more than the lowest salary permissible in Pay Band 1 of the Senior Civil Service pay scale, and organograms that include all positions in those bodies.

Page 46: Linking UK Government Data, John Sheridan

October 2010
CSV template and PDFs of organograms, typically authored using PowerPoint
Emphasis on visual appearance led to inconsistent datasets which are very hard to re-use
No relationship between the organogram and data
Not using web standards

Page 47: Linking UK Government Data, John Sheridan
Page 48: Linking UK Government Data, John Sheridan

“The Government has published the most comprehensive organisational charts of the UK Civil Service ever released online, taking another step towards its goal of being the most transparent government in the world and opening up the structure of the Civil Service to public scrutiny”

Page 49: Linking UK Government Data, John Sheridan

100s of UK Government organisations have published their organisation data as Linked Data
Distributed data publishing
It is the largest number of organisations joining the Web of Linked Data in a single day!
The data is deeply linked (Departments, Grades, Professions, date of the snapshot)
Cross-dataset queries are perhaps the most interesting
Proves Linked Data is moving from research topic to commodity publishing
We can now extend this approach to other types of dataset and link our transparency data

Page 50: Linking UK Government Data, John Sheridan

Make it as simple as possible for people in Departments to create Linked Data
Create high quality, consistent data that matches the policy intent and guidance
Distributed capture and publishing
Create open data in open standards using open source tools
Human readable and machine readable from single source
Provide download and API access in different formats (CSV, XML, JSON, RDF, HTML)
Evolutionary route to create longitudinal datasets, reconciling against previous data
Enable everyone to publish 5 Star Linked Data

Page 51: Linking UK Government Data, John Sheridan

Capture organisation data using a spreadsheet, which verifies policy rules and datatypes
Upload spreadsheet
Preview organogram
Download RDF and two CSVs
Publish on your website and register with data.gov.uk
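
A minimal Python sketch of the capture-and-convert idea: rows from a spreadsheet export are checked against simple rules and then turned into (subject, property, value) triples. The column names, both validation rules, and all example.org URIs are hypothetical; the templates and mappings used by the real tool are not shown in the slides. The job titles are borrowed from the organogram example on page 19.

```python
import csv
import io

# Hypothetical CSV export of an organogram spreadsheet.
CSV_TEXT = """post_id,job_title,reports_to
1,Director General,
2,Director (Operations),1
3,Director (Strategy),1
"""

BASE = "http://example.org/id/post/"            # hypothetical URI pattern
JOB_TITLE = "http://example.org/def/jobTitle"    # hypothetical property
REPORTS_TO = "http://example.org/def/reportsTo"  # hypothetical property


def rows_to_triples(text):
    """Validate each row, then emit (subject, property, value) triples."""
    rows = list(csv.DictReader(io.StringIO(text)))
    known_ids = {row["post_id"] for row in rows}
    triples = []
    for row in rows:
        # Hypothetical rule: every post needs a job title.
        if not row["job_title"]:
            raise ValueError(f"post {row['post_id']} has no job title")
        # Hypothetical rule: a post may only report to a post in the file.
        if row["reports_to"] and row["reports_to"] not in known_ids:
            raise ValueError(f"post {row['post_id']} reports to an unknown post")
        subject = BASE + row["post_id"]
        triples.append((subject, JOB_TITLE, row["job_title"]))
        if row["reports_to"]:
            triples.append((subject, REPORTS_TO, BASE + row["reports_to"]))
    return triples


triples = rows_to_triples(CSV_TEXT)
for triple in triples:
    print(triple)
```

Checking the rows before conversion mirrors the slide's point that the spreadsheet itself verifies rules and datatypes, so bad data is caught before anything is published.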

Page 52: Linking UK Government Data, John Sheridan

It’s the tool most Civil Servants have
This *does* also work in LibreOffice / OpenOffice etc.

Page 53: Linking UK Government Data, John Sheridan

Page 54: Linking UK Government Data, John Sheridan

Page 55: Linking UK Government Data, John Sheridan

Page 56: Linking UK Government Data, John Sheridan
Page 57: Linking UK Government Data, John Sheridan

[Diagram: organogram publishing pipeline]

Steps: 1. Upload Excel; 2. Create CSVs; 3. Create Mapping; 4. Query (SPARQL); 5. Create RDF; 6. Load RDF; 7. Query (SPARQL)

Components: Organogram (PHP), XLWrap, Sesame RDF Store, TDB, Linked Data API, Reconciliation

Files and outputs: Excel file, Senior CSV, Junior CSV, Mapping TRiG, RDF file, API Config, Organogram HTML, CSS & JavaScript, JSON, XML, HTML

Page 58: Linking UK Government Data, John Sheridan
Page 59: Linking UK Government Data, John Sheridan

Implicit properties are made explicit (person, role, person in a role)

Reconciliation adds value by automatic linking to other data

Provenance

Example data

Explicit open licence

Page 60: Linking UK Government Data, John Sheridan

Page 61: Linking UK Government Data, John Sheridan
Page 62: Linking UK Government Data, John Sheridan

Linked Data is essential to realising the promise of Open Government Data
Using Linked Data means working on: Standards, Reference Data, Production, Publishing
Lots of opportunities for international collaboration
Best advice, just start

Page 63: Linking UK Government Data, John Sheridan

email: john@johnlsheridan
Twitter: @johnlsheridan
Skype: johnlsheridan