Introduction to Linked Data
Juan F. Sequeda
Semantic Technology Conference
June 2011
What is the Semantic Web?
Internet != Web
What is the Web?
“… the Web, is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images […] and navigate between them via hyperlinks”
http://en.wikipedia.org/wiki/World_Wide_Web
Current Web = internet + links + docs
History of the Web
• Created by Tim Berners-Lee at CERN in 1989
• Mosaic browser in 1993
• W3C created in 1994
• Exponential growth mid 90s
• Amazon, Ebay – 1995
• Search engines – Google 1998
• Dot-com boom 1997 – 2001
• Web 2.0 – blogs, Facebook, Twitter, etc
What is the problem?
WHAT’S THE WEATHER IN SAN FRANCISCO TODAY?
What is the problem?
• The web is full of documents
• We aren’t always interested in documents
  – We are interested in THINGS
  – These THINGS might be in documents
• We can read an HTML document rendered in a browser and find what we are searching for
  – This is hard for computers.
  – Computers have to guess (even though they are pretty good at it)
The Web is a Data Shredder
Structured Data
Unstructured Data
Thanks Martin Hepp
What would we like?
• Make it easy for computers/software to find THINGS
Do you SEARCH or do you FIND?
Search for
Football players who went to the University of Texas at Austin and played for the Dallas Cowboys as cornerback
Why can’t we just FIND it…
Guess how I FOUND out?
On a Semantic Web
• Besides publishing documents on the web
  – which computers can’t understand easily
• Let’s publish on the web something that computers can understand
DATA
The Semantic Web is a web of linked data
The current web is a web of linked documents
But wait… doesn’t the web already have data?
Current Data on the Web
• Relational Databases
• APIs
• XML
• CSV
• XLS
• …
• Can’t computers and applications already consume that data on the web?
Yes! But it is all in different formats and data models!
This makes it hard to integrate data
The data in different data sources aren’t linked
For example, how do I know that the Juan Sequeda on Facebook is the same as the Juan Sequeda on Twitter?
Or if I create a mashup from different services, I have to learn different APIs and I get different formats of data back
Data is Siloed
Wouldn’t it be great if we had a standard way of publishing data on the Web?
We have a standardized way of publishing documents on the web, right?
HTML
Then why can’t we have a standard way of publishing data on the Web?
Good question! And the answer is YES. There is!
RDF
Resource Description Framework (RDF)
• A data model
  – A way to model data
  – e.g. relational databases use the relational data model
• RDF is a triple data model
• Labeled Graph
• Subject, Predicate, Object
• <Juan> <was born in> <California>
• <California> <is part of> <the USA>
• <Juan> <has hobby> <Salsa dancing>
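A minimal sketch in Python of the triple model above, assuming plain strings stand in for subjects, predicates, and objects (real RDF would use URIs). The helper name objects_of is hypothetical and used only for illustration.

```python
# Each triple is (subject, predicate, object); a set of triples forms a labeled graph.
triples = {
    ("Juan", "was born in", "California"),
    ("California", "is part of", "the USA"),
    ("Juan", "has hobby", "Salsa dancing"),
}


def objects_of(graph, subject, predicate):
    """Return every object linked from `subject` by the edge labeled `predicate`."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# Follow two edges: where was Juan born, and what is that place part of?
birthplaces = objects_of(triples, "Juan", "was born in")
countries = [c for place in birthplaces for c in objects_of(triples, place, "is part of")]
print(birthplaces, countries)
```

Because the object of one triple can be the subject of another, following edges like this is what makes the set of triples a graph rather than a flat list of facts.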
RDF can be serialized in different ways
• RDF/XML
• RDFa (RDF in HTML)
• N3
• Turtle
• JSON
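To make "serialized" concrete, here is a small sketch that writes one triple as a line of N-Triples, a line-based syntax whose documents are also valid Turtle. The http://example.org/ namespace and the function name to_ntriples_line are assumptions for illustration; they do not come from the slides.

```python
EX = "http://example.org/"  # hypothetical namespace, not from the slides


def to_ntriples_line(subject, predicate, obj, obj_is_literal=False):
    """Format one triple as a single N-Triples line."""
    if obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    else:
        obj_term = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} ."


line = to_ntriples_line(EX + "Juan", EX + "hasHobby", "Salsa dancing", obj_is_literal=True)
print(line)
```

The same triple could equally be written as RDF/XML or RDFa; the data model stays the same and only the syntax changes.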
So does that mean that I have to publish my data in RDF now?
You don’t have to… but we would like you to
An example
Document on the Web
Databases back up documents
Isbn              | Title                        | Author       | PublisherID | ReleasedDate
978-0-596-15381-6 | Programming the Semantic Web | Toby Segaran | 1           | July 2009
…                 | …                            | …            | …           | …

PublisherID | PublisherName
1           | O’Reilly Media
…           | …
This is a THING: a book titled “Programming the Semantic Web” by Toby Segaran, …
THINGS have PROPERTIES: a book has a title, an author, …
Let’s represent the data in RDF
<book> <title> "Programming the Semantic Web"
<book> <isbn> "978-0-596-15381-6"
<book> <author> "Toby Segaran"
<book> <publisher> <Publisher>
<Publisher> <name> "O’Reilly"
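The step from table rows to triples can be sketched as follows. This is only an illustration under stated assumptions: the rows are held in plain Python structures, and the identifiers "book/…" and "publisher/…" are hypothetical local names, not the URIs used later in the deck.

```python
# The two tables from the slide, as plain Python data.
books = [
    {
        "Isbn": "978-0-596-15381-6",
        "Title": "Programming the Semantic Web",
        "Author": "Toby Segaran",
        "PublisherID": 1,
    },
]
publishers = {1: "O'Reilly Media"}


def rows_to_triples(book_rows, publisher_names):
    """Turn each row into triples: the row is the subject, each column a predicate."""
    result = set()
    for row in book_rows:
        book = "book/" + row["Isbn"]                         # hypothetical identifier
        publisher = "publisher/" + str(row["PublisherID"])   # hypothetical identifier
        result.add((book, "title", row["Title"]))
        result.add((book, "isbn", row["Isbn"]))
        result.add((book, "author", row["Author"]))
        # The foreign key becomes an edge to another node instead of a number.
        result.add((book, "publisher", publisher))
        result.add((publisher, "name", publisher_names[row["PublisherID"]]))
    return result


triples = rows_to_triples(books, publishers)
for triple in sorted(triples):
    print(triple)
```

Note how the join between the two tables disappears: the publisher is simply another node that the book points to.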
Remember that we are on the web
Everything on the web is identified by a URI
And now let’s link the data to other data
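<http://…/isbn978> <title> "Programming the Semantic Web"
<http://…/isbn978> <isbn> "978-0-596-15381-6"
<http://…/isbn978> <author> "Toby Segaran"
<http://…/isbn978> <publisher> <http://…/publisher1>
<http://…/publisher1> <name> "O’Reilly"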
And now consider the data from Revyu.com
<http://…/isbn978> <hasReview> <http://…/review1>
<http://…/review1> <description> "Awesome Book"
<http://…/review1> <hasReviewer> <http://…/reviewer>
<http://…/reviewer> <name> "Juan Sequeda"
Let’s start to link data
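<http://…/isbn978> <title> "Programming the Semantic Web"
<http://…/isbn978> <isbn> "978-0-596-15381-6"
<http://…/isbn978> <author> "Toby Segaran"
<http://…/isbn978> <publisher> <http://…/publisher1>
<http://…/publisher1> <name> "O’Reilly"

<http://…/isbn978> <sameAs> <http://…/isbn978>

<http://…/isbn978> <hasReview> <http://…/review1>
<http://…/review1> <description> "Awesome Book"
<http://…/review1> <hasReviewer> <http://…/reviewer>
<http://…/reviewer> <name> "Juan Sequeda"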
Juan Sequeda publishes data too
<http://juansequeda.com/id> <name> "Juan Sequeda"
<http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
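Let’s link more data
<http://…/isbn978> <hasReview> <http://…/review1>
<http://…/review1> <description> "Awesome Book"
<http://…/review1> <hasReviewer> <http://…/reviewer>
<http://…/reviewer> <name> "Juan Sequeda"

<http://…/reviewer> <sameAs> <http://juansequeda.com/id>

<http://juansequeda.com/id> <name> "Juan Sequeda"
<http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>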
And more
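<http://…/isbn978> <title> "Programming the Semantic Web"
<http://…/isbn978> <isbn> "978-0-596-15381-6"
<http://…/isbn978> <author> "Toby Segaran"
<http://…/isbn978> <publisher> <http://…/publisher1>
<http://…/publisher1> <name> "O’Reilly"

<http://…/isbn978> <sameAs> <http://…/isbn978>

<http://…/isbn978> <hasReview> <http://…/review1>
<http://…/review1> <description> "Awesome Book"
<http://…/review1> <hasReviewer> <http://…/reviewer>
<http://…/reviewer> <name> "Juan Sequeda"

<http://…/reviewer> <sameAs> <http://juansequeda.com/id>

<http://juansequeda.com/id> <name> "Juan Sequeda"
<http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>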
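The linking steps above can be sketched in Python as a union of triple sets plus a walk over sameAs edges. The slides elide the book and review hosts, so the books.example and reviews.example URIs below are hypothetical stand-ins, as are the helper names aliases and values; only http://juansequeda.com/id and http://dbpedia.org/Austin are taken from the slides.

```python
SAME_AS = "sameAs"

# Three independently published data sets (hosts ending in .example are made up).
book_data = {
    ("http://books.example/isbn978", "title", "Programming the Semantic Web"),
}
review_data = {
    ("http://reviews.example/isbn978", "hasReview", "http://reviews.example/review1"),
    ("http://reviews.example/review1", "hasReviewer", "http://reviews.example/reviewer"),
}
person_data = {
    ("http://juansequeda.com/id", "livesIn", "http://dbpedia.org/Austin"),
}
# The links that say two names refer to the same THING.
links = {
    ("http://books.example/isbn978", SAME_AS, "http://reviews.example/isbn978"),
    ("http://reviews.example/reviewer", SAME_AS, "http://juansequeda.com/id"),
}

# Because everything is triples, integrating the sources is just a set union.
graph = book_data | review_data | person_data | links


def aliases(graph, node):
    """All names reachable from `node` over sameAs links, in either direction."""
    seen, todo = {node}, [node]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if p != SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    todo.append(b)
    return seen


def values(graph, node, predicate):
    """Objects of `predicate` for `node` or for any of its sameAs aliases."""
    names = aliases(graph, node)
    return {o for s, p, o in graph if s in names and p == predicate}


reviews = values(graph, "http://books.example/isbn978", "hasReview")
reviewers = {r for review in reviews for r in values(graph, review, "hasReviewer")}
cities = {c for person in reviewers for c in values(graph, person, "livesIn")}
print(cities)
```

Starting from the book, the walk crosses three data sets and ends at the city where the reviewer lives, which no single source could answer on its own.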
Data on the Web that is in RDF and is linked to other RDF data is LINKED DATA
Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up (dereference) those names.
3. When someone looks up a URI, provide useful information.
4. Include links to other URIs so that they can discover more things.
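Principles 2 and 3 amount to an ordinary HTTP lookup in which the client asks for data rather than an HTML page. The sketch below only prepares such a request and does not send it. The URI is the one written on the slides; whether a given server answers with Turtle for the text/turtle media type is an assumption, and build_lookup_request is a hypothetical helper name.

```python
from urllib.request import Request


def build_lookup_request(uri, media_type="text/turtle"):
    """Prepare (but do not send) an HTTP GET that asks for RDF instead of HTML."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


request = build_lookup_request("http://dbpedia.org/Austin")
print(request.get_method(), request.full_url, request.get_header("Accept"))
# To actually dereference the URI one would call urllib.request.urlopen(request).
```

The response to such a lookup would then contain more URIs (principle 4), each of which can be looked up the same way.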
Linked Data makes the web appear as ONE GIANT HUGE GLOBAL DATABASE!
I can query a database with SQL. Is there a way to query Linked Data with a query language?
Yes! There is actually a standardized language for that
SPARQL
FIND all the reviews on the book “Programming the Semantic Web” by people who live in Austin
SELECT ?review ?comment
WHERE {
  isbn:978 ex:hasReview ?review .
  ?review ex:description ?comment .
  ?review ex:hasReviewer ?person .
  ?person ex:livesIn dbpedia:Austin .
}
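To show roughly how such a query is answered, here is a toy pattern matcher in Python. It is a sketch, not how a real SPARQL engine works: prefixed names are treated as plain strings, the rev:review1 and juan:id names are hypothetical, the sameAs links are assumed to be already resolved so that the reviewer is identified directly as juan:id, and the predicate ex:livesIn follows the livesIn edge label in the graph.

```python
def is_var(term):
    """Query variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")


def match(patterns, graph, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        for pattern_term, value in zip(first, triple):
            if is_var(pattern_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(pattern_term, value) != value:
                    break
            elif pattern_term != value:
                break
        else:
            # All three positions matched; continue with the remaining patterns.
            yield from match(rest, graph, candidate)


graph = {
    ("isbn:978", "ex:hasReview", "rev:review1"),
    ("rev:review1", "ex:description", "Awesome Book"),
    ("rev:review1", "ex:hasReviewer", "juan:id"),
    ("juan:id", "ex:livesIn", "dbpedia:Austin"),
}

query = [
    ("isbn:978", "ex:hasReview", "?review"),
    ("?review", "ex:description", "?comment"),
    ("?review", "ex:hasReviewer", "?person"),
    ("?person", "ex:livesIn", "dbpedia:Austin"),
]

results = [(b["?review"], b["?comment"]) for b in match(query, graph)]
print(results)
```

Each line of the WHERE clause is a triple with holes in it; answering the query means finding values for the holes so that every resulting triple exists in the graph.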
This looks cool, but let’s be realistic. What is the incentive to publish Linked Data?
What was your incentive to publish an HTML page in 1990?
1) Share data in documents
2) Because your neighbor was doing it
… later on …
3) Marketing, Advertising, SEO
So why should we publish Linked Data in 2011?
1) Share data as data
2) Because your neighbor is doing it
…
3) (Semantic) SEO ++
Linked Data Publishers
• UK Government
• US Government
• BBC
• Open Calais – Thomson Reuters
• Freebase/Google
• NY Times
• Best Buy
• CNET
• DBpedia
• Overstock.com
• O’Reilly Media
• …
[Slides: the growing cloud of published Linked Data, shown at May 2007, Oct 2007, Nov 2007, Feb 2008, Mar 2008, Sept 2008, Mar 2009 (1), Mar 2009 (2), July 2009, September 2010, June 2011]
YOU GET THE PICTURE
IT’S BIG and getting BIGGER and BIGGER
QUESTIONS?