ist16-04 an introduction to rdf
E. Della Valle – http://emanueledellavalle.org - @manudellavalle
Interoperability and Semantic Technologies 2015-16
An Introduction to RDF
Emanuele Della Valle
DEIB - Politecnico di Milano
http://emanueledellavalle.org - @manudellavalle
This work is licensed under the Creative Commons Attribution 3.0 Unported License.
You are free:
to Share: to copy, distribute and transmit the work
to Remix: to adapt the work
Under the following conditions
Attribution: You must attribute the work by inserting “by E. Della Valle – http://emanueledellavalle.org - @manudellavalle” at the end of each reused slide
To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/
Share, Remix, Reuse — Legally
2
Data Interchange: RDF
3
RDF in a nutshell
Looking for a flexible data model
Why?
• Applications are always changing (competitive environment)
• People are always adding more features
• Graceful evolution is important
Optimal: relational model
• The relational model is remarkably flexible
• Supports graceful evolution
  – Change => add another table
  – Existing queries are unaffected
• Easily accommodates new data
  – Without affecting existing queries
• Allows data to be easily combined ("joined") in new ways
• 25+ years of relational database experience
4
RDF in a nutshell
Resource Description Framework
The adaptation of the relational model to the Web gives rise to RDF
From tuples to triples
Any relational data can be represented as triples
• Row Key --> Subject
• Column --> Property
• Value --> Value
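The row-to-triple mapping can be sketched in a few lines of Python. This is only an illustration: the helper name rows_to_triples is hypothetical, and the sample row reuses the drug data that appears elsewhere in these slides.

```python
def rows_to_triples(rows, key_column):
    """Turn every non-key cell of a table into a (subject, property, value) triple."""
    triples = []
    for row in rows:
        subject = row[key_column]                      # row key --> subject
        for column, value in row.items():
            if column == key_column:
                continue
            triples.append((subject, column, value))   # column --> property, value --> value
    return triples


drug_rows = [{"ID": "BB.2", "Category": "Anxiolytics", "Formula": "C16H21NO2"}]
drug_triples = rows_to_triples(drug_rows, key_column="ID")
for triple in drug_triples:
    print(triple)
```

Each row yields one triple per non-key column, so adding a column later only adds triples and leaves the existing ones untouched.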
5
RDF in a nutshell
Representing relational data in RDF (almost)
E.g., drug data
Table Drug:
ID    Category     Formula
BB.2  Anxiolytics  C16H21NO2
Table Drug Names:
ID    Name              Lang
BB.2  Propranolol       en
BB.2  Propranololo      it
BB.2  プロプラノロール  jp
Represented in RDF (almost)
[Figure: a graph in which BB.2 is linked by "Is a" to Drug, by "Category" to Anxiolytics, by "Formula" to C16H21NO2, and by "Name" to Propranolol, Propranololo and プロプラノロール. Legend: resource, literal]
6
RDF in a nutshell
Representing relational data in RDF (almost)
Two important problems:
- internal IDs (e.g., BB.2)
- internal names of schema elements (e.g., Category)
become meaningless once out of the database
RDF solves this problem by using URIs
• Internal IDs should be replaced by URIs
• Internal schema names should be replaced by URIs
• Values do not (always) need to be URI-fied
[Figure: the drug graph with URIs. The resource http://bio2rdf.org/drugbank_drugs:DB00571 is linked by
• http://www.w3.org/1999/02/22-rdf-syntax-ns#type to http://bio2rdf.org/page/drugbank_ontology:drugs
• http://www.w3.org/2004/02/skos/core#subject to http://dbpedia.org/resource/Category:Anxiolytics
• http://bio2rdf.org/drugbank_ontology:chemicalFormula to C16H21NO2
• http://www.w3.org/2000/01/rdf-schema#label to Propranolol, Propranololo and プロプラノロール
Legend: resource, literal]
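A minimal sketch of the URI-fication step, assuming hand-written lookup tables (the dictionaries and the function name urify are hypothetical; the URIs are the ones shown on this slide):

```python
# Hypothetical lookup tables; the URIs themselves are taken from the slide.
SUBJECT_URIS = {"BB.2": "http://bio2rdf.org/drugbank_drugs:DB00571"}
PROPERTY_URIS = {
    "Category": "http://www.w3.org/2004/02/skos/core#subject",
    "Formula": "http://bio2rdf.org/drugbank_ontology:chemicalFormula",
    "Name": "http://www.w3.org/2000/01/rdf-schema#label",
}
VALUE_URIS = {"Anxiolytics": "http://dbpedia.org/resource/Category:Anxiolytics"}


def urify(triple):
    """Replace internal IDs and schema names by URIs; values stay literals unless mapped."""
    subject, prop, value = triple
    return (SUBJECT_URIS[subject], PROPERTY_URIS[prop], VALUE_URIS.get(value, value))


internal = [
    ("BB.2", "Category", "Anxiolytics"),
    ("BB.2", "Formula", "C16H21NO2"),
    ("BB.2", "Name", "Propranolol"),
]
external = [urify(t) for t in internal]
for triple in external:
    print(triple)
```

The formula and the name have no entry in VALUE_URIS, so they are left as plain values, which mirrors the rule that values do not always need to be URI-fied.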
7
RDF in a nutshell
Representing data in RDF Q/A 1/3
Which URI should we use?
• Popular ones! Data merge will take place automatically!
[Figure: two graphs that share the node http://bio2rdf.org/drugbank_drugs:DB00571 are merged (+) into one (=). The first links the node to http://dbpedia.org/resource/Category:Anxiolytics via http://www.w3.org/2004/02/skos/core#subject; the second connects it with http://dbpedia.org/resource/James_W._Black via http://dbpedia.org/ontology/knownFor. The merged graph contains both links.]
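The automatic merge can be illustrated with plain Python sets of triples. This is a sketch under assumptions: the direction of the knownFor arc is not recoverable from the transcript and is assumed here, and statements_about is a hypothetical helper.

```python
DRUG = "http://bio2rdf.org/drugbank_drugs:DB00571"

graph_a = {
    (DRUG,
     "http://www.w3.org/2004/02/skos/core#subject",
     "http://dbpedia.org/resource/Category:Anxiolytics"),
}
# Assumption: the arc goes from the person to the drug.
graph_b = {
    ("http://dbpedia.org/resource/James_W._Black",
     "http://dbpedia.org/ontology/knownFor",
     DRUG),
}

# Merging two RDF graphs is a set union: equal URIs denote the same node.
merged = graph_a | graph_b


def statements_about(graph, node):
    """All triples in which the node appears as subject or object."""
    return sorted(t for t in graph if node in (t[0], t[2]))


for triple in statements_about(merged, DRUG):
    print(triple)
```

No join key has to be declared: the two sources meet on the drug node only because they use the same URI for it.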
8
RDF in a nutshell
Representing data in RDF Q/A 2/3
Where do I find popular URIs?
• Ontology Dowsing
  – http://www.w3.org/wiki/Ontology_Dowsing
• 1st solution: you know them ;-)
  – The RDF vocabulary already includes:
    - rdf:type : the “instance of” relationship between an individual and its class
    - rdf:Property : the “concept” of property
  – RDF comes with a schema description language (RDF-S):
    - rdfs:Class : the “concept” of class
    - rdfs:subClassOf : the “is a” relationship between two classes
    - rdfs:label : the standard way to provide a human-readable label
  – …
• 2nd solution: you search for them
  – With specialized search engines (e.g., http://vocab.cc/ )
  – In popular repositories (e.g., http://lov.okfn.org/dataset/lov/ )
  – …
9
RDF in a nutshell
Representing data in RDF Q/A 3/3
What is a value? When shall we URI-fy a value?
• Literals cannot be used to merge different data sets
• E.g., what if we want to merge two resources based on their labels?
  – BRCA may refer to different things on the Web, e.g., try http://en.wikipedia.org/wiki/BRCA
• URI-fy any value that may eventually be used to merge different data sets, and leave the other values as literals
[Figure: two resources, each with the http://www.w3.org/2000/01/rdf-schema#label BRCA; merging them (+) gives an unclear result (= ?)]
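A small sketch of why a shared label is not a safe merge key. The two example.org URIs are invented for this illustration, and resources_labelled is a hypothetical helper.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical resources: two different things that happen to carry the same label.
graph = {
    ("http://example.org/gene/BRCA", RDFS_LABEL, "BRCA"),
    ("http://example.org/organization/BRCA", RDFS_LABEL, "BRCA"),
}


def resources_labelled(graph, label):
    """Every subject that carries the given rdfs:label."""
    return sorted(s for s, p, o in graph if p == RDFS_LABEL and o == label)


candidates = resources_labelled(graph, "BRCA")
print(candidates)
```

The lookup returns two distinct resources, so merging on the literal would wrongly fuse them; merging on URIs keeps them apart.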
10
RDF in a nutshell
Other data structures in RDF
Trees can be represented in RDF
Anything can be represented in RDF
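As an illustration of the tree case, a nested dictionary can be flattened into parent/child triples. This is a sketch only: the property name hasChild and the sample tree are assumptions, and node names are assumed to be unique.

```python
def tree_to_triples(tree, parent=None):
    """Flatten a nested dict into (parent, 'hasChild', child) triples."""
    triples = []
    for node, subtree in tree.items():
        if parent is not None:
            triples.append((parent, "hasChild", node))
        triples.extend(tree_to_triples(subtree, parent=node))
    return triples


tree = {"root": {"a": {"a1": {}, "a2": {}}, "b": {}}}
tree_triples = tree_to_triples(tree)
for triple in tree_triples:
    print(triple)
```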
11
RDF in a nutshell
Multi Source Data Integration with RDF
[Figure: statements about uniprot:P05067 coming from different sources (labels in the figure: Gene Ontology, iRefIndex) are merged into one RDF graph: uniprot:P05067 rdf:type uniprot:Protein; uniprot:P05067 "Interacts with" Uniprot:P05067; uniprot:P05067 locatedIn Go:Membrane]
12
RDF in a nutshell
Serializing RDF
Three alternatives to write triples:
1. RDF/XML: standard serialization in XML
   <Description about="subject"> <property>value</property> </Description>
   – e.g., http://dbpedia.org/data/Propranolol.rdf
   – Check out the triples using http://www.w3.org/RDF/Validator/
2. N-Triples: simple (verbose) reference serialization (for specifications only)
   <http://...subject> <http://...predicate> "value" .
   – http://www.w3.org/RDF/Validator/ARPServlet?URI=http%3A%2F%2Fdbpedia.org%2Fresource%2FPropranolol&PARSE=Parse+URI%3A+&TRIPLES_AND_GRAPH=PRINT_TRIPLES&FORMAT=PNG_EMBED
3. N3 and Turtle: developer-friendly serializations
   :subject :property "value" .
   – e.g., http://dbpedia.org/data/Propranolol.n3
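To make the N-Triples shape concrete, here is a minimal serializer sketch. It is not a complete implementation: it assumes that any term starting with http:// or https:// is a URI and everything else is a plain literal, and the function names are hypothetical.

```python
def nt_term(term):
    """URIs go in angle brackets; anything else is written as a quoted plain literal."""
    if term.startswith(("http://", "https://")):
        return f"<{term}>"
    escaped = term.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_ntriples(triples):
    """One 'subject predicate object .' statement per line."""
    return "\n".join(f"{nt_term(s)} {nt_term(p)} {nt_term(o)} ." for s, p, o in triples)


triples = [
    ("http://dbpedia.org/resource/Propranolol",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Propranolol"),
]
print(to_ntriples(triples))
```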
13
RDF in a nutshell
Serializing RDF in Turtle - namespaces
URI terms can be abbreviated using namespaces
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
dbpedia:Propranolol rdf:type dbpedia-owl:Drug .
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> can be abbreviated as 'a'
dbpedia:Propranolol a dbpedia-owl:Drug .
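What a Turtle parser does with prefixes can be sketched as a simple string expansion. This is an illustration only, not a Turtle parser: it assumes one whitespace-separated prefixed triple per line, and the helper names are hypothetical.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dbpedia": "http://dbpedia.org/resource/",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}
RDF_TYPE = PREFIXES["rdf"] + "type"


def expand(term, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full URI; the keyword 'a' stands for rdf:type."""
    if term == "a":
        return RDF_TYPE
    prefix, _, local = term.partition(":")
    return prefixes[prefix] + local


def expand_triple(line):
    """Expand a line such as 'dbpedia:Propranolol a dbpedia-owl:Drug .'"""
    subject, predicate, obj = line.rstrip(" .").split()
    return (expand(subject), expand(predicate), expand(obj))


long_form = expand_triple("dbpedia:Propranolol rdf:type dbpedia-owl:Drug .")
short_form = expand_triple("dbpedia:Propranolol a dbpedia-owl:Drug .")
print(long_form == short_form)
```

Both spellings expand to the same three URIs, which is why rdf:type and 'a' are interchangeable.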
14
RDF in a nutshell
Serializing RDF in Turtle - Literals
• Plain literals: "Propranolol"
• Literals with language tags: "プロプラノロール"@ja
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  dbpedia:Propranolol rdfs:label "Propranolol"@en .
  dbpedia:Propranolol rdfs:label "Propranololo"@it .
  dbpedia:Propranolol rdfs:label "プロプラノロール"@ja .
• Typed literals: "3.14"^^xsd:float
  dbpedia:Propranolol dbpprop:molecularWeight "259.34"^^xsd:float .
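The three literal forms can be modelled with one small class. This is a sketch with hypothetical names; the datatype is kept as a prefixed name such as xsd:float, exactly as it would be written in Turtle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    value: str
    lang: Optional[str] = None       # language tag, e.g. "en" or "it"
    datatype: Optional[str] = None   # prefixed datatype name, e.g. "xsd:float"

    def to_turtle(self):
        # Turtle syntax allows a language tag or a datatype on a literal, not both.
        if self.lang and self.datatype:
            raise ValueError("use either a language tag or a datatype")
        text = '"' + self.value.replace("\\", "\\\\").replace('"', '\\"') + '"'
        if self.lang:
            return f"{text}@{self.lang}"
        if self.datatype:
            return f"{text}^^{self.datatype}"
        return text


print(Literal("Propranolol").to_turtle())
print(Literal("Propranololo", lang="it").to_turtle())
print(Literal("259.34", datatype="xsd:float").to_turtle())
```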
15
RDF in a nutshell
Serializing RDF in Turtle - Convenience Syntax
Abbreviating repeated subjects:
dbpedia:Propranolol rdfs:label "Propranololo"@it .
dbpedia:Propranolol dbpprop:molecularWeight "259.34"^^xsd:float .
... is the same as ...
dbpedia:Propranolol rdfs:label "Propranololo"@it ;
                    dbpprop:molecularWeight "259.34"^^xsd:float .
Abbreviating repeated subject/predicate pairs:
dbpedia:Propranolol rdfs:label "Propranolol"@en .
dbpedia:Propranolol rdfs:label "Propranololo"@it .
dbpedia:Propranolol rdfs:label "プロプラノロール"@ja .
... is the same as ...
dbpedia:Propranolol rdfs:label "Propranolol"@en ,
                               "Propranololo"@it ,
                               "プロプラノロール"@ja .
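The two abbreviations amount to grouping triples first by subject and then by predicate. A minimal sketch, assuming the terms are already written in Turtle form (the function name abbreviate is hypothetical):

```python
def abbreviate(triples):
    """Group triples by subject (';') and by subject/predicate pair (',')."""
    grouped = {}
    for subject, predicate, obj in triples:
        grouped.setdefault(subject, {}).setdefault(predicate, []).append(obj)
    blocks = []
    for subject, by_predicate in grouped.items():
        parts = [f"{p} " + " , ".join(objects) for p, objects in by_predicate.items()]
        blocks.append(f"{subject} " + " ;\n    ".join(parts) + " .")
    return "\n".join(blocks)


triples = [
    ("dbpedia:Propranolol", "rdfs:label", '"Propranolol"@en'),
    ("dbpedia:Propranolol", "rdfs:label", '"Propranololo"@it'),
    ("dbpedia:Propranolol", "dbpprop:molecularWeight", '"259.34"^^xsd:float'),
]
turtle = abbreviate(triples)
print(turtle)
```

Three full statements collapse into one block with a single subject, one ';' and one ','.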
16
RDF in a nutshell
Are you following? Let's check!
[source http://origin-www.yoursingapore.com/content/traveller/zh/browse/see-and-do/hands-on/_jcr_content/flash/image.img.png ]
17
RDF in a nutshell
Are you following? Let's check!
Represent the following database in the graphical syntax presented at slide 6 (the drug data example)
Table Doctor:
DID  Name
D1   Alice
D2   Bob
Table Patient:
PID  Name
P1   Carl
P2   David
Table treats:
DID  PID
D1   P2
D1   P1
D2   P1
Is this only a pedantic application of slide 6?
Serialize it in Turtle syntax
• Using rdf:type to say "is a" and rdfs:label to say "name"
• Using "@prefix : <http://www.ex.org/>" for all other terms, e.g., :Doctor, :treats, etc.
Shorten the serialization using the convenience syntax
Check correctness at http://www.easyrdf.org/converter
Results are published on the course Web site
18
RDF in a nutshell
Representing data in RDF Q/A 4/4
What if I cannot think of a good URI?
• When no good URI exists, you can use blank nodes
• The following relational data …
Person  Bio Event  Date
Sofia   Birth      1974-02-28
Sofia   Marriage   1995-08-04
• … can be translated into RDF, using the BIO vocabulary [1], as follows
[Figure: http://www.sofia.org/#me is linked by http://purl.org/vocab/bio/0.1/event to two blank nodes. The first has http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/vocab/bio/0.1/Birth and http://purl.org/vocab/bio/0.1/date 1974-02-28; the second has type http://purl.org/vocab/bio/0.1/Marriage and date 1995-08-04]
[1] http://vocab.org/bio/0.1.html
19
RDF in a nutshell
Serializing RDF in Turtle – blank nodes
The following RDF model (see previous slide), including two blank nodes, can be serialized as shown hereafter
@prefix bio: <http://purl.org/vocab/bio/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://www.sofia.org/#me> bio:event _:1 , _:2 .
_:1 a bio:Birth ; bio:date "1974-02-28"^^xsd:date .
_:2 a bio:Marriage ; bio:date "1995-08-04"^^xsd:date .
Blank node identifiers: _:1 and _:2
[Figure: the RDF graph of the previous slide, with the two blank nodes highlighted]
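Generating blank nodes for rows that have no natural URI can be sketched as follows. The function name and the row format are assumptions; the vocabulary URIs are those of the BIO example.

```python
from itertools import count

BIO = "http://purl.org/vocab/bio/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def events_to_triples(person_uri, rows):
    """Give each (event type, date) row its own blank node and attach it to the person."""
    labels = count(1)
    triples = []
    for event_type, date in rows:
        node = f"_:{next(labels)}"   # blank node label, meaningful only inside this graph
        triples.append((person_uri, BIO + "event", node))
        triples.append((node, RDF_TYPE, BIO + event_type))
        triples.append((node, BIO + "date", date))
    return triples


sofia = events_to_triples(
    "http://www.sofia.org/#me",
    [("Birth", "1974-02-28"), ("Marriage", "1995-08-04")],
)
for triple in sofia:
    print(triple)
```

The labels _:1 and _:2 only tie the statements about one event together; they are not identifiers that other data sets can refer to.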
20
RDF in a nutshell
Serializing RDF in Turtle – blank nodes
The following RDF model, including two blank nodes, can be serialized as shown hereafter
@prefix bio: <http://purl.org/vocab/bio/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://www.sofia.org/#me> bio:event
    [ a bio:Birth ; bio:date "1974-02-28"^^xsd:date ] ,
    [ a bio:Marriage ; bio:date "1995-08-04"^^xsd:date ] .
Blank node convenience syntax: [ ... ]
[Figure: the same RDF graph, with the two blank nodes highlighted]
21
RDF in a nutshell
XML vs. RDF w.r.t. Evolving Data
Scenario: describe printer capabilities
V1 has several features
[Figure panels: XML | RDF]
22
RDF in a nutshell
XML vs. RDF w.r.t. Evolving Data
V1.1 adds two features
• What effect on existing client software?
  – Regenerate stubs?
  – Recompile?
  – Did any queries break?
  – (Depends how they're written. Best programmers?)
[Figure panels: XML | RDF]
23
RDF in a nutshell
XML vs. RDF w.r.t. Evolving Data
V1.2 adds three more features
• What effect on existing client software?
[Figure panels: XML | RDF]
24
RDF in a nutshell
XML vs. RDF w.r.t. Evolving Data
V2 adds colors
• What effect on existing client software?
[Figure panels: XML | RDF]
25
RDF in a nutshell
XML vs. RDF w.r.t. Evolving Data
Version n combines printer, scanner, fax
Problem: how to combine trees?
• Printer and fax both have output paper settings (red)
• Scanner and fax both have input image settings (blue)
26
RDF in a nutshell
XML vs. RDF w.r.t. Evolving Data
Flexibility is important
• Products are always changing (competitive environment)
• People are always adding more features
• Graceful evolution is important
• Relational data is remarkably flexible
XML syntax is important
• Lots of applications that use XML are already available
• Lots of tools for XML are already available
• Trees allow for simple parsing without loading the entire model (e.g., XML parsing using SAX)
27
RDF in a nutshell
Serializing RDF in XML - basics
W3C standardized an RDF/XML syntax [1]
The basic idea is to insert an XML element for each node (subject and value) and arc (predicate)
E.g.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.org/"
         xmlns:sid="URN:org:example:staffid:"
         xmlns:dc="http://purl.org/dc/elements/1.1/">   <!-- root tag -->
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator>   <!-- property element -->
      <rdf:Description rdf:about="URN:org:example:staffid:85740"/>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
[Figure: ex:index.html --dc:creator--> sid:85740]
[1] RDF/XML Syntax Specification available at http://www.w3.org/TR/rdf-syntax-grammar/
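The same node/arc nesting can be produced programmatically. The sketch below uses Python's standard xml.etree.ElementTree rather than an RDF library, so it only illustrates the element structure; the helper name describe is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def describe(parent, subject_uri):
    """Add a node element (rdf:Description) for the given resource."""
    return ET.SubElement(parent, f"{{{RDF}}}Description", {f"{{{RDF}}}about": subject_uri})


root = ET.Element(f"{{{RDF}}}RDF")                      # root tag
page = describe(root, "http://www.example.org/index.html")
creator = ET.SubElement(page, f"{{{DC}}}creator")       # property element (the arc)
describe(creator, "URN:org:example:staffid:85740")      # node element for the value

rdf_xml = ET.tostring(root, encoding="unicode")
print(rdf_xml)
```

Node elements and property elements alternate as the nesting gets deeper, which is the striping pattern of the hand-written example.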
28
RDF in a nutshell
Serializing RDF in XML - abbreviation
A possible abbreviation is using rdf:resource in the property elements
E.g.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.org/"
         xmlns:sid="URN:org:example:staffid:"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator rdf:resource="URN:org:example:staffid:85740"/>
  </rdf:Description>
</rdf:RDF>
29
RDF in a nutshell
Serializing RDF in XML – more statements
Other statements can be added
• E.g. [Figure: ex:index.html --dc:creator--> sid:85740 --foaf:email--> mailto:[email protected]]
• afterwards
<rdf:Description rdf:about="http://www.example.org/index.html">
  <dc:creator rdf:resource="URN:org:example:staffid:85740"/>
</rdf:Description>
<rdf:Description rdf:about="URN:org:example:staffid:85740">
  <foaf:email rdf:resource="mailto:[email protected]"/>
</rdf:Description>
• or inline, nested in the property element that points to their subject
<rdf:Description rdf:about="http://www.example.org/index.html">
  <dc:creator>
    <rdf:Description rdf:about="URN:org:example:staffid:85740">
      <foaf:email rdf:resource="mailto:[email protected]"/>
    </rdf:Description>
  </dc:creator>
</rdf:Description>
30
RDF in a nutshell
Serializing RDF in XML – rdf:type abbreviation
It’s possible to abbreviate rdf:type
• E.g. [Figure: ex:index.html --rdf:type--> ex:pagina_web]
• Long form
<rdf:Description rdf:about="http://www.example.org/index.html">
  <rdf:type rdf:resource="http://www.example.org/pagina_web"/>
</rdf:Description>
• Abbreviated form
<ex:pagina_web rdf:about="http://www.example.org/index.html" />
31
RDF in a nutshell
Serializing RDF in XML – a larger example
A compact XML serialization of
[Figure: ex:index.html --rdf:type--> ex:pagina_web; ex:index.html --dc:creator--> sid:85740; sid:85740 --rdf:type--> ex:employee; sid:85740 --foaf:email--> mailto:[email protected]]
is
<ex:pagina_web rdf:about="http://www.example.org/index.html">
  <dc:creator>
    <ex:employee rdf:about="URN:org:example:staffid:85740" foaf:email="mailto:[email protected]"/>
  </dc:creator>
</ex:pagina_web>
32
RDF in a nutshell
Merging XML files 1/2
Suppose you have to merge the two following XML documents
<Park rdf:about="Yosemite">
  <contains> <Camp rdf:about="North-Pines"/> </contains>
  <crossedBy> <Path rdf:about="S11"/> </crossedBy>
</Park>
<Camp rdf:about="North-Pines" locatedIn="Yosemite">
  <accessibleBy> <Path rdf:about="S11"/> </accessibleBy>
</Camp>
Merging the XML trees is difficult, but being RDF …
[Figure: the two corresponding RDF graphs. First: Yosemite rdf:type Park; Yosemite contains North-Pines; Yosemite crossedBy S11; North-Pines rdf:type Camp; S11 rdf:type Path. Second: North-Pines rdf:type Camp; North-Pines locatedIn Yosemite; North-Pines accessibleBy S11; S11 rdf:type Path]
33
RDF in a nutshell
Merging XML files 2/2
It’s (just) a matter of merging the two RDF graphs
[Figure: the union (∪) of the two graphs of the previous slide is a single graph: Yosemite rdf:type Park; Yosemite contains North-Pines; Yosemite crossedBy S11; North-Pines rdf:type Camp; North-Pines locatedIn Yosemite; North-Pines accessibleBy S11; S11 rdf:type Path]
NOTE: It works out nicely because both RDF/XML documents refer to the same resources and use the same vocabularies.
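The union of the two park graphs can be checked with Python sets. This is a sketch: the triples are transcribed by hand from the two documents, using the short names of the figure rather than full URIs.

```python
graph_1 = {
    ("Yosemite", "rdf:type", "Park"),
    ("Yosemite", "contains", "North-Pines"),
    ("Yosemite", "crossedBy", "S11"),
    ("North-Pines", "rdf:type", "Camp"),
    ("S11", "rdf:type", "Path"),
}
graph_2 = {
    ("North-Pines", "rdf:type", "Camp"),
    ("North-Pines", "locatedIn", "Yosemite"),
    ("North-Pines", "accessibleBy", "S11"),
    ("S11", "rdf:type", "Path"),
}

merged = graph_1 | graph_2   # set union: statements found in both documents appear once
shared = graph_1 & graph_2   # statements that both documents make

print(len(graph_1), len(graph_2), len(merged), len(shared))
```

Unlike the XML trees, the triples carry no nesting or ordering, so there is nothing to reconcile beyond dropping duplicates.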
34
RDF in a nutshell
RDFa
Resource Description Framework in Attributes (RDFa)
• First proposed in 2004
• W3C recommendation since October 14, 2008: http://www.w3.org/TR/rdfa-syntax/
The essence
• It provides a set of markup attributes to augment the visual information on the Web with machine-readable hints
35
RDF in a nutshell
RDFa – An example
Consider the HTML snippet hereafter
<p>The microformats.org site was launched on 2005-06-20 at the Supernova Conference in San Francisco, CA, USA.</p>
Let’s annotate it using RDFa with the schema.org vocabulary
<p>
  <span vocab="http://schema.org/" typeof="Event">
    <span property="name">The microformats.org site was launched</span>
    on <span property="startDate">2005-06-20</span>
    at the Supernova Conference in
    <span property="location" typeof="City">San Francisco, CA, USA</span>.
  </span>
</p>
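To show what a machine can read from such markup, here is a rough sketch based on Python's standard html.parser. It is not an RDFa processor: it ignores vocab, typeof and subject resolution, assumes well-nested markup without void elements, and the class name PropertyCollector is hypothetical.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, text) pairs from elements carrying a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.open_elements = []   # one [property_or_None, text_chunks] entry per open element
        self.found = []

    def handle_starttag(self, tag, attrs):
        self.open_elements.append([dict(attrs).get("property"), []])

    def handle_data(self, data):
        # Text belongs to every element that is currently open.
        for _, chunks in self.open_elements:
            chunks.append(data)

    def handle_endtag(self, tag):
        prop, chunks = self.open_elements.pop()
        if prop is not None:
            self.found.append((prop, " ".join("".join(chunks).split())))


SNIPPET = """<p><span vocab="http://schema.org/" typeof="Event">
<span property="name">The microformats.org site was launched</span>
on <span property="startDate">2005-06-20</span> at the Supernova Conference in
<span property="location" typeof="City">San Francisco, CA, USA</span>.</span></p>"""

collector = PropertyCollector()
collector.feed(SNIPPET)
collector.close()
for prop, text in collector.found:
    print(prop, "=", text)
```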
36
RDF in a nutshell
RDFa – the attributes
• about – a URI specifying the resource the metadata is about
• rel and rev – specifying a relationship and reverse-relationship with another resource, respectively
• src, href and resource – specifying the partner resource
• property – specifying a property for the content of an element or the partner resource
• content – optional attribute that overrides the content of the element when using the property attribute
• datatype – optional attribute that specifies the datatype of text specified for use with the property attribute
• typeof – optional attribute that specifies the RDF type(s) of the subject or the partner resource
37
RDF in a nutshell
RDFa – resources
For more information
• http://www.slideshare.net/ivan_herman/introduction-to-rdfa/
Validation tool
• http://validator.w3.org/nu/
38
Publishing RDF as Linked Data
Principles
1. Things must be identified with dereferenceable HTTP URIs.
2. If such a URI is dereferenced asking for the MIME-type application/rdf+xml, a data source must return an RDF/XML description of the identified resource.
3. Besides RDF links to resources within the same data source, RDF descriptions should also contain RDF links to resources provided by other data sources.
39
Publishing RDF as Linked Data
Example based on the DBpedia approach
E.g.
• Resource identifier
  – http://dbpedia.org/resource/Propranolol
• RDF representation describing Propranolol
  – http://dbpedia.org/data/Propranolol
• HTML representation describing Propranolol
  – http://dbpedia.org/page/Propranolol
Test it! http://idi.fundacionctic.org/vapour
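Dereferencing a resource URI while asking for RDF/XML can be sketched with Python's standard urllib. The request is only built here, not sent, and the function name rdf_request is hypothetical; how a server answers or redirects depends on its current configuration.

```python
import urllib.request


def rdf_request(resource_uri):
    """Build (but do not send) a request that asks for the RDF/XML description."""
    return urllib.request.Request(resource_uri, headers={"Accept": "application/rdf+xml"})


request = rdf_request("http://dbpedia.org/resource/Propranolol")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# Sending it would be: urllib.request.urlopen(request)
```

The same resource URI serves both people and programs; only the Accept header tells the server which representation is wanted.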
40
RDF in a nutshell
RDF Resources
RDF at the W3C - primer and specifications
• http://www.w3.org/RDF/
Semantic Web tools - community maintained list; includes triple stores, programming environments, tool sets, and more
• http://esw.w3.org/topic/SemanticWebTools
302 Semantic Web Videos and Podcasts - includes a section specifically on RDF videos
• http://www.semanticfocus.com/blog/entry/title/302-semantic-web-videos-and-podcasts/
41
Credits
Some slides are derived from
• “How to Publish Linked Data on the Web” by Chris Bizer, Richard Cyganiak and Tom Heath, http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/
• “Bio2RDF and Beyond!” by Michel Dumontier, http://www.slideshare.net/micheldumontier/bio2rdf-and-beyond
42