
Linked (Open) Data

Opportunities and challenges

Makx Dekkers
mail@makxdekkers.com

Outline

• Basic notions
• Recent developments
• Comparing objectives
• Opportunities and risks
• Conclusions

© 2011 Makx Dekkers, Journées ABES 2011

BASIC NOTIONS

The idea and its history

• 1989: Tim Berners-Lee already talked about linking documents and data together (http://www.w3.org/History/1989/proposal.html)

• 2001: Tim Berners-Lee and Ora Lassila introduced the “Semantic Web” (http://www.scientificamerican.com/article.cfm?id=the-semantic-web)

• 2006: Tim Berners-Lee presented the initial design issues (rules) for Linked Data (http://www.w3.org/DesignIssues/LinkedData.html)

W3C Semantic Web initiative

• Objective
  – to create a universal medium for the exchange of data […] to smoothly interconnect personal information management, enterprise application integration, and the global sharing of commercial, scientific and cultural data
• Main results
  – Resource Description Framework (RDF), RDFa (RDF-in-HTML), SPARQL Query Language

Core Linked Data Specifications

• Transport
  – HTTP: Hypertext Transfer Protocol
• Identification
  – URI: Uniform Resource Identifier
• Description and linking
  – RDF: Resource Description Framework
• Search and access
  – SPARQL: Query Language for RDF

The four rules of Linked Data

• TBL’s recommendations:
  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names
  3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
  4. Include links to other URIs so that they can discover more things
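
As a minimal sketch of rules 1 to 3, the Python fragment below prepares (but does not send) an HTTP lookup for the URI of a thing and asks for an RDF description through content negotiation. The helper name `build_lookup` and the preferred media types are illustrative assumptions; the URI is DBpedia's identifier for Montpellier.

```python
from urllib.request import Request

# Media types requested from the server; this preference order is an assumption.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_lookup(uri: str) -> Request:
    """Prepare an HTTP GET that asks for an RDF description of `uri`.

    Hypothetical helper: it only builds the request and never touches the network.
    """
    # Rule 2: names should be HTTP URIs so that they can be looked up.
    if not uri.startswith(("http://", "https://")):
        raise ValueError(f"not an HTTP URI: {uri}")
    # Rule 3: ask for useful, standards-based information (RDF).
    return Request(uri, headers={"Accept": RDF_ACCEPT})


request = build_lookup("http://dbpedia.org/resource/Montpellier")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup. Rule 4 then comes into play on the publisher's side: the description returned should itself contain further URIs to follow.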

The basic model of RDF

• Resource Description Framework “triple”:
  – Subject: the “thing” (resource) described
  – Predicate: the characteristic of the resource
  – Object: the value of the characteristic

[Diagram: Subject → Predicate → Object]
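
The triple model can be illustrated with a few lines of Python. This is a simplified sketch, not a full RDF implementation: the `Triple` type and the `to_ntriples` helper are hypothetical names, the presentation URI is made up, and the serializer ignores escaping, datatypes and language tags.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the characteristic
    obj: str        # value of the characteristic: a URI or a plain literal


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as an N-Triples-style line (simplified)."""
    def term(value: str) -> str:
        # Treat anything that looks like an HTTP URI as a resource, the rest as text.
        is_uri = value.startswith(("http://", "https://"))
        return f"<{value}>" if is_uri else f'"{value}"'
    return f"<{t.subject}> <{t.predicate}> {term(t.obj)} ."


# The subject URI is hypothetical; the predicate is the Dublin Core "creator" element.
statement = Triple(
    "http://example.org/presentation",
    "http://purl.org/dc/elements/1.1/creator",
    "Makx Dekkers",
)
print(to_ntriples(statement))
```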

Complex structures in RDF

[Diagram: an RDF graph connecting the nodes “This presentation”, “Makx Dekkers”, “Barcelona”, “Journées ABES”, “ABES”, “Montpellier” and “17-18 May 2011” through the predicates presenter, hometown, partOf, organizer, location (twice) and date]
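
One plausible reading of this graph can be written down as a list of triples and queried with a simple pattern matcher, similar in spirit to a SPARQL triple pattern. Which predicate joins which pair of nodes is an assumption, since the arrows of the original figure are not preserved in the transcript; the `match` helper is a hypothetical name.

```python
# Each tuple is (subject, predicate, object). The pairing of nodes and
# predicates is an assumed reading of the diagram, not taken from the source.
GRAPH = [
    ("This presentation", "presenter", "Makx Dekkers"),
    ("This presentation", "partOf", "Journées ABES"),
    ("Makx Dekkers", "hometown", "Barcelona"),
    ("Journées ABES", "organizer", "ABES"),
    ("Journées ABES", "location", "Montpellier"),
    ("Journées ABES", "date", "17-18 May 2011"),
    ("ABES", "location", "Montpellier"),
]


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in graph
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Everything that has Montpellier as its location.
located = [s for (s, _, _) in match(GRAPH, predicate="location", obj="Montpellier")]
print(located)
```

Because every statement has the same three-part shape, a complex structure is simply a larger set of triples that share nodes.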

Linked (Open / Enterprise) Data

• Commonalities
  – Using Semantic Web technologies (RDF)
  – Linking information resources, people, places
• Differences
  – Open Data with open licenses; Enterprise Data mostly for closed, controlled environments
  – Open Data links to other Open Data, available for external use; Enterprise Data may link to external data but not openly available for external use

Linked Data -- Open Data

• Linked Data: focus on technology
  – Semantic Web: Resource Description Framework, and other Web standards
  – Final solutions still under development
• Open Data: focus on strategy
  – Based on notion that sharing is important and benefits all
  – Technology is secondary

The five-star system

Source: http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/

The LOD diagram: 2007 (25 datasets)

The LOD diagram: 2008 (45 datasets)

The LOD diagram: 2009 (95 datasets)

The LOD diagram: 2010 (203 datasets)

Linking Open Data cloud diagrams, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

RECENT DEVELOPMENTS

W3C communities

• LinkingOpenData SWEO Community Project
  – Goal: to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources (http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData)
• Library Linked Data Incubator Group
  – to help increase global interoperability of library data on the Web (http://www.w3.org/2005/Incubator/lld/)

More W3C communities

• Government Linked Data Working Group
  – to provide standards and other information which help governments around the world publish their data as effective and usable linked data (http://www.w3.org/2011/gld/charter)
• Semantic Web Health Care and Life Sciences (HCLS) Interest Group
  – to develop, advocate for, and support the use of Semantic Web technologies for health care and life science (e.g. biology, medicine) (http://www.w3.org/2001/sw/hcls/)

Open Knowledge Foundation, okfn.org

• not-for-profit organization promoting open knowledge: any kind of data and content that can be freely used, reused, and redistributed

• Working and Interest Groups, e.g.
  – Open Data in Science, Open Government Data, Open Bibliographic Data, Cultural Heritage etc.

• CKAN.net: registry of open datasets and other “knowledge resources”

Linked Data initiatives

Predicate vocabularies (descriptors)
• Resource Description and Access (RDA): http://metadataregistry.org/rdabrowse.htm
• The Bibliographic Ontology (BIBO): http://bibliontology.com/
• Dublin Core: http://dublincore.org/

Object vocabularies (values)
• Virtual International Authority File (VIAF): http://viaf.org/
• Library of Congress authorities: http://id.loc.gov/authorities/
• AGROVOC (agricultural terminology), e.g. http://aims.fao.org/aos/agrovoc/c_550
• DBpedia (based on Wikipedia), e.g. http://dbpedia.org/page/Montpellier

Bibliographic data
• LIBRIS Sweden, e.g. http://libris.kb.se/library/S
• British Library: http://www.bl.uk/bibliographic/datasamples.html
• CrossRef (DOI metadata): http://www.crossref.org/CrossTech/linked_data/

More Linked Data initiatives

Broadcasting, publishing
• BBC: http://www.bbc.co.uk/blogs/bbcinternet/linked_data/
• New York Times: http://data.nytimes.com/

Governments (small sample)
• USA: http://data.gov/
• France: http://opendata.paris.fr/
• Finland: http://data.suomi.fi/
• UK: http://data.gov.uk/
• Spain (Cataluña): http://dadesobertes.gencat.cat/
• Norway: http://data.norge.no
• Netherlands: http://www.overheid.nl/opendata
• Australia: http://data.gov.au/

COMPARING OBJECTIVES

Strategic aspects Linked Data

• Achieving global interoperability with minimal coordination
• Aggregating human knowledge
• Supporting democracy, transparency and accountability
• Enhancing and enriching information
• Enabling user-driven and user-generated applications

Strategic aspects libraries

• Organizing information for use by specific users for specific goals
• Ensuring and maintaining quality
• Sustaining services economically
• Preserving information for the long term
• Providing trusted services

Functional aspects Linked Data

• Searching distributed collections
• “Following your nose” – navigating links between pieces of content
• Distributing responsibility for making statements about things
• Leaving to the user whom and what to trust
• Leaving development of products and services to an open market (apps)
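
“Following your nose” can be sketched as a breadth-first walk over links. In the sketch below an in-memory dictionary stands in for the Web, so that no network access is needed; all URIs, predicates and the function name `follow_your_nose` are hypothetical.

```python
from collections import deque

# A toy "web": each URI maps to the statements its publisher serves for it.
# All URIs and predicates here are hypothetical stand-ins.
WEB = {
    "http://example.org/talk": [
        ("presenter", "http://example.org/person/makx"),
        ("partOf", "http://example.org/event/abes2011"),
    ],
    "http://example.org/person/makx": [
        ("hometown", "http://example.org/place/barcelona"),
    ],
    "http://example.org/event/abes2011": [
        ("location", "http://example.org/place/montpellier"),
    ],
    "http://example.org/place/barcelona": [],
    "http://example.org/place/montpellier": [],
}


def follow_your_nose(start, max_hops):
    """Breadth-first walk over links; returns {uri: hops needed to reach it}."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        if seen[uri] == max_hops:
            continue  # do not expand beyond the hop limit
        for _predicate, target in WEB.get(uri, []):
            if target not in seen:
                seen[target] = seen[uri] + 1
                queue.append(target)
    return seen


reached = follow_your_nose("http://example.org/talk", max_hops=2)
print(sorted(reached.items(), key=lambda item: item[1]))
```

In a real setting each dictionary lookup would be an HTTP request to a different publisher, which is where the questions of trust and distributed responsibility raised in the list come in.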

Functional aspects libraries

• Describing information by professionals
• Bringing together and managing aggregations of information
• Selecting relevant information
• Mixing analogue and digital resources

Technical aspects Linked Data

• Publishing and using machine-readable statements (“data that speak for themselves”)
• Focusing on Semantic Web technology
• Enabling inferences across large distributed data sets
• (Still to be done) Solving issues around harvesting, caching and real-time updating
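
To give a flavour of inference across distributed data sets, the sketch below merges two tiny, separately published sets of statements and derives a statement that neither contains on its own. The data, the `locatedIn` predicate and the rule that it is transitive are all assumptions made for illustration.

```python
# Two independently published data sets (illustrative content).
DATASET_A = {("Montpellier", "locatedIn", "Hérault")}
DATASET_B = {("Hérault", "locatedIn", "France")}


def infer_transitive(triples, predicate):
    """Add (a, p, c) whenever (a, p, b) and (b, p, c) exist, until nothing new appears."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current pairs so the set can grow while we loop.
        pairs = [(s, o) for (s, p, o) in closed if p == predicate]
        for a, b in pairs:
            for b2, c in pairs:
                new = (a, predicate, c)
                if b == b2 and new not in closed:
                    closed.add(new)
                    changed = True
    return closed


merged = DATASET_A | DATASET_B
inferred = infer_transitive(merged, "locatedIn") - merged
print(inferred)
```

The merge step is a plain set union because both sources use the same names for the same things, which is what shared URIs are meant to guarantee.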

Technical aspects libraries

• Using proven technology to provide high-quality services
• Managing production systems and services
• Guaranteeing performance, uptime, consistency across data

Agility versus sustainability

• In the Linked Data space:
  – Things move fast
  – Trial-and-error
  – Lots of development by volunteers (hackers)
• In the library domain:
  – Operational systems need to evolve
  – Need to handle legacy data
  – Development by professionals in managed projects

Data versus services

• In the Linked Data space:
  – Focus on availability of “raw data”
  – Quality is secondary
  – Data and technology should lead to useful results
• In the library domain:
  – Focus on services
  – Quality is essential
  – Data and technology in support of the service

Economic aspects

• In the Linked Data space:
  – “Information wants to be free” – a human right?
  – Short-term thinking: today is hot, yesterday is not
  – Focus on applications to create value out of data
• In the library domain:
  – Long-term view: sustainability is crucial
  – Public money to provide community services
  – Expected to do more with less money

OPPORTUNITIES AND RISKS

Strong points Linked Data

• Attempt to create a common technical platform for machine-readable data
• Lots of enthusiasm in publishing open data
• Promise of global interoperability
• Mix of researchers, user communities, hackers, professional data providers
• High visibility on political level

Risks Linked Data

• Driven by technology, not by requirements
• Technology may not (yet) be stable – RDF 2.0?
• Operational issues far from solved (reliability, performance, quality, security, trust)
• Hope for general agreement across domains may not be realistic
• Promise may turn into disappointment

Strong points libraries

• Long time operational experience in managing information

• Professional intermediaries between users and information needs

• Sustainable business models (albeit with eternally shrinking budgets)

• Long-term perspective: the past (legacy data) as well as the future (preservation)

Risks libraries

• Technologies change rapidly
• New skills difficult to spread through the organization
• Some people see libraries as a thing of the past (“the book museum”)
• Underestimation of information handling skills
• Information overload, human intervention does not scale, need for better tools

Meeting both worlds

• An example: Europeana.eu
  – Started out with domain perspectives (libraries, archives, museums, audiovisual archives)
  – “Traditional” approach (metadata mappings) works but is insufficient
  – Using Linked Data approach preserves domain specifics but allows for generalization to support common services
  – Cross-domain (but co-ordinated) interoperability

Europeana Data Model

[Figures: classes and properties; simple example; complex example]

Source at: http://version1.europeana.eu/web/europeana-project/technicaldocuments/

CONCLUSION

Libraries and Linked Data

• Using Linked Data technology as the next step in connecting services

• Offering information management skills to the technology domain

• Creating a quality hub in the Linked Data space

Best of both worlds

• Libraries providing stability and sustainability to Linked Data spaces

• Library professionals helping to manage the distributed collections

• Libraries delivering high-quality linked data to the Web

• Technologists to provide the next generation of systems and tools

Linked (Open) Data: opportunity for libraries!

Thank you!

Makx Dekkers
mail@makxdekkers.com