linked data for librarians
![Page 1: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/1.jpg)
Linked Data Fundamentals
Trevor Thornton
Senior Applications Developer, NYPL Labs
The New York Public Library
![Page 2: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/2.jpg)
Linked Data
Data published on the Web in accordance with principles
designed to facilitate linkages between resources
The potential for linked data in libraries:
• Eliminates data silos - makes data accessible on the Web
and promotes sharing and re-use
• Promotes discovery of related resources through links
(to common people, subjects, etc.)
• Supports cooperative description
(‘open world assumption’)
![Page 3: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/3.jpg)
Key aspects of linked data
• Based on the core Web technologies (HTTP, URIs)
• Uses a simple data structure based on atomic statements
about resources (RDF)
• Can be interpreted by machines (semantic data)
• Focus on connecting resources, rather than simply
describing them (though it can do both)
![Page 4: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/4.jpg)
HTTP (Hypertext Transfer Protocol)
The foundation of data communication for the Web
Client/User agent (e.g. web browser) → HTTP request → Web server
Web server → HTTP response → Client/User agent
![Page 5: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/5.jpg)
URI (Uniform Resource Identifier)
Globally unique identifier for a resource on a computer
or a network.
HTTP URIs identify resources on the Web.
http://www.yourdomain.org/something
![Page 6: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/6.jpg)
URI vs. URL
URLs (Uniform Resource Locators) are a subset of URIs
that, in addition to identifying a resource, provide a means of
locating it.
A URI does not necessarily point to a document;
a URL does.
A URI can identify a real-world object.
![Page 7: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/7.jpg)
The Semantic Web
Proposed by Tim Berners-Lee, James Hendler, and Ora Lassila in a 2001 article in
Scientific American
“The Semantic Web is not a separate Web but an extension of the current one, in
which information is given well-defined meaning, better enabling computers and
people to work in cooperation…
In the near future, these developments will usher in significant new functionality
as machines become much better able to process and ‘understand’ the data that
they merely display at present.”
![Page 8: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/8.jpg)
The Linked Data Principles
Tim Berners-Lee, 2006
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful
information, using the standards (RDF, SPARQL).
4. Include links to other URIs so that they can discover
more things.
![Page 9: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/9.jpg)
RDF (Resource Description Framework)
A framework for describing Web resources.
A Web resource is anything that can be retrieved or identified
on the Web via a URI.
RDF descriptions are based on simple
subject-predicate-object expressions called “triples”.
![Page 10: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/10.jpg)
The RDF Triple
Subject - the resource being described
Predicate - a property of that resource
Object - the value of the property
Subject and predicate are defined using URIs.
Object can either be a URI or a literal value
(text, number, date, etc.)
subject → predicate → object
![Page 11: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/11.jpg)
Here is some metadata…
Robert Moses Papers
CREATOR:
Moses, Robert, 1888-1981
EXTENT:
142 linear feet
REPOSITORY:
The New York Public Library. Manuscripts and Archives Division.
![Page 12: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/12.jpg)
Here are some triples
Robert Moses Papers → creator → Moses, Robert, 1888-1981
http://archives.nypl.org/mss/2071 → http://purl.org/dc/terms/creator → http://viaf.org/viaf/52866196
Robert Moses Papers → extent → ‘142 linear feet’
http://archives.nypl.org/mss/2071 → http://purl.org/dc/terms/extent → ‘142 linear feet’
Robert Moses Papers → repository → NYPL Manuscripts & Archives
http://archives.nypl.org/mss/2071 → http://purl.org/archival/vocab/arch#heldBy → http://data.nypl.org/org_units/mss
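The three statements on this slide can be sketched in Python to show the shape of a triple. This is only an illustration: the class names `URI`, `Literal` and `Triple` are invented for the sketch and do not come from any RDF library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    """A resource identified by a URI."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain value such as text, a number, or a date."""
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI               # the resource being described
    predicate: URI             # a property of that resource
    obj: Union[URI, Literal]   # the value of the property ("obj" avoids shadowing Python's built-in "object")


PAPERS = URI("http://archives.nypl.org/mss/2071")

triples = [
    Triple(PAPERS, URI("http://purl.org/dc/terms/creator"),
           URI("http://viaf.org/viaf/52866196")),
    Triple(PAPERS, URI("http://purl.org/dc/terms/extent"),
           Literal("142 linear feet")),
    Triple(PAPERS, URI("http://purl.org/archival/vocab/arch#heldBy"),
           URI("http://data.nypl.org/org_units/mss")),
]
```

Only the extent statement has a literal as its object; the other two objects are URIs, so they can themselves be the subject of further statements.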
![Page 13: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/13.jpg)
A set of related triples = a graph
http://archives.nypl.org/mss/2071 → http://purl.org/dc/terms/creator → http://viaf.org/viaf/52866196
http://archives.nypl.org/mss/2071 → http://purl.org/dc/terms/extent → ‘142 linear feet’
http://archives.nypl.org/mss/2071 → http://purl.org/archival/vocab/arch#heldBy → http://data.nypl.org/org_units/mss
![Page 14: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/14.jpg)
This is another graph
http://www.worldcat.org/oclc/834874 → http://purl.org/dc/terms/creator → http://viaf.org/viaf/44312399
http://www.worldcat.org/oclc/834874 → http://purl.org/dc/terms/subject → http://viaf.org/viaf/52866196
![Page 15: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/15.jpg)
Put the graphs together to make a new graph
Robert Moses Papers (http://archives.nypl.org/mss/2071)
→ http://purl.org/dc/terms/creator → http://viaf.org/viaf/52866196
→ http://purl.org/dc/terms/extent → ‘142 linear feet’
→ http://purl.org/archival/vocab/arch#heldBy → http://data.nypl.org/org_units/mss
The Power Broker (http://www.worldcat.org/oclc/834874)
→ http://purl.org/dc/terms/creator → http://viaf.org/viaf/44312399
→ http://purl.org/dc/terms/subject → http://viaf.org/viaf/52866196
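Combining graphs can be sketched in Python by treating each graph as a set of (subject, predicate, object) tuples, so that merging is a set union. This is a simplification: URIs and literals are both plain strings here, and the helper name `linked_to` is made up for the illustration.

```python
DCT = "http://purl.org/dc/terms/"
ARCH = "http://purl.org/archival/vocab/arch#"
MOSES = "http://viaf.org/viaf/52866196"
PAPERS = "http://archives.nypl.org/mss/2071"
BOOK = "http://www.worldcat.org/oclc/834874"

# Each graph is a set of (subject, predicate, object) tuples.
nypl_graph = {
    (PAPERS, DCT + "creator", MOSES),
    (PAPERS, DCT + "extent", "142 linear feet"),
    (PAPERS, ARCH + "heldBy", "http://data.nypl.org/org_units/mss"),
}
worldcat_graph = {
    (BOOK, DCT + "creator", "http://viaf.org/viaf/44312399"),
    (BOOK, DCT + "subject", MOSES),
}

# Merging two graphs is a set union; the shared VIAF URI joins them.
merged = nypl_graph | worldcat_graph


def linked_to(graph, uri):
    """Return sorted (subject, predicate) pairs whose object is `uri`."""
    return sorted((s, p) for s, p, o in graph if o == uri)


for subject, predicate in linked_to(merged, MOSES):
    print(subject, predicate)
```

Because both graphs use the same URI for Robert Moses, the merged graph connects the archival collection and the biography without any extra matching step.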
![Page 16: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/16.jpg)
RDF serialization formats
‘Serialization’ = to record one or more RDF graphs in a
machine-readable file. There are 2 basic options:
RDF in a standalone text file:
• RDF/XML
• N3 (Notation 3)
• Turtle (Terse RDF Triple Language)
• N-Triples
RDF embedded in HTML:
• RDFa (RDF in attributes)
![Page 17: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/17.jpg)
<http://archives.nypl.org/mss/2071> <http://purl.org/dc/terms/creator> <http://viaf.org/viaf/52866196> .
<http://archives.nypl.org/mss/2071> <http://purl.org/dc/terms/extent> "142 linear feet" .
<http://archives.nypl.org/mss/2071> <http://purl.org/archival/vocab/arch#heldBy> <http://data.nypl.org/org_units/mss> .
Basic triples in N-Triples
N-Triples is the most basic expression of RDF.
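To show how little syntax N-Triples needs, here is a minimal Python sketch that writes triples out one per line. It assumes each triple carries a flag saying whether its object is a URI, and it only escapes backslashes and double quotes; a real serializer handles more cases. The function name `to_ntriples` is invented for this example.

```python
def to_ntriples(triples):
    """Serialize (subject, predicate, object, object_is_uri) tuples as N-Triples."""
    lines = []
    for subject, predicate, obj, obj_is_uri in triples:
        if obj_is_uri:
            term = f"<{obj}>"
        else:
            # Simplified escaping: backslash and double quote only.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            term = f'"{escaped}"'
        # One complete statement per line, terminated by " ."
        lines.append(f"<{subject}> <{predicate}> {term} .")
    return "\n".join(lines)


example = [
    ("http://archives.nypl.org/mss/2071",
     "http://purl.org/dc/terms/creator",
     "http://viaf.org/viaf/52866196", True),
    ("http://archives.nypl.org/mss/2071",
     "http://purl.org/dc/terms/extent",
     "142 linear feet", False),
]
print(to_ntriples(example))
```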
![Page 18: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/18.jpg)
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix arch: <http://purl.org/archival/vocab/arch#> .
<http://archives.nypl.org/mss/2071>
    dcterms:creator <http://viaf.org/viaf/52866196> ;
    dcterms:extent "142 linear feet" ;
    arch:heldBy <http://data.nypl.org/org_units/mss> .
Basic triples in N3/Turtle
Statements about the same resource are grouped together.
Property URIs are shortened using prefixes (‘q-names’).
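The prefix mechanism is plain string substitution, which a short Python sketch can make concrete. The helper names `shorten` and `expand` are hypothetical; the two namespaces are the ones declared on this slide.

```python
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "arch": "http://purl.org/archival/vocab/arch#",
}


def shorten(uri, prefixes=PREFIXES):
    """Return prefix:local if a declared namespace matches, else <uri>."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return f"<{uri}>"


def expand(name, prefixes=PREFIXES):
    """Turn prefix:local back into the full URI."""
    prefix, _, local = name.partition(":")
    return prefixes[prefix] + local


print(shorten("http://purl.org/dc/terms/creator"))
print(expand("arch:heldBy"))
```

A URI with no declared prefix, such as the VIAF identifier, stays in its full angle-bracket form.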
![Page 19: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/19.jpg)
Basic triples in RDF/XML
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:arch="http://purl.org/archival/vocab/arch#">
  <rdf:Description rdf:about="http://archives.nypl.org/mss/2071">
    <dcterms:creator rdf:resource="http://viaf.org/viaf/52866196" />
    <dcterms:extent>142 linear feet</dcterms:extent>
    <arch:heldBy rdf:resource="http://data.nypl.org/org_units/mss" />
  </rdf:Description>
</rdf:RDF>
![Page 20: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/20.jpg)
RDFa (RDF in Attributes)
RDFa allows RDF data to be embedded within HTML.
Rendered HTML:
The Power Broker, by Robert Caro, is a biography of Robert Moses.
HTML code:
<div about="http://www.worldcat.org/oclc/834874"
     prefix="dcterms: http://purl.org/dc/terms/">
  The Power Broker, by <span property="dcterms:creator"
  resource="http://viaf.org/viaf/44312399">Robert Caro</span>, is a biography of
  <span property="dcterms:subject"
  resource="http://viaf.org/viaf/52866196">Robert Moses</span>.
</div>
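To illustrate how a machine could read the triples back out of this markup, here is a toy extractor built on Python's standard `html.parser`. It is not a conforming RDFa processor: it understands only the `about`, `prefix`, `property` and `resource` attributes used on this slide, and the class name `TinyRDFaExtractor` is invented for the sketch.

```python
from html.parser import HTMLParser


class TinyRDFaExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples from a small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}
        self.subject = None
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "prefix" in attrs:
            # The value alternates "name:" and namespace URI.
            parts = attrs["prefix"].split()
            for name, namespace in zip(parts[0::2], parts[1::2]):
                self.prefixes[name.rstrip(":")] = namespace
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs and "resource" in attrs:
            prefix, _, local = attrs["property"].partition(":")
            predicate = self.prefixes[prefix] + local
            self.triples.append((self.subject, predicate, attrs["resource"]))


markup = """
<div about="http://www.worldcat.org/oclc/834874"
     prefix="dcterms: http://purl.org/dc/terms/">
  The Power Broker, by <span property="dcterms:creator"
  resource="http://viaf.org/viaf/44312399">Robert Caro</span>, is a biography of
  <span property="dcterms:subject"
  resource="http://viaf.org/viaf/52866196">Robert Moses</span>.
</div>
"""

parser = TinyRDFaExtractor()
parser.feed(markup)
parser.close()
for triple in parser.triples:
    print(triple)
```

The visible text is untouched; the attributes carry the same two statements that the WorldCat graph expresses.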
![Page 21: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/21.jpg)
RDF Ontologies/vocabularies
• Define categories of things and the relationships that they
can have to each other
• Provide the semantics that allow data to be interpreted
by machines
• Establish rules of inference – what can be assumed to
be true based on what is asserted by a triple
![Page 22: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/22.jpg)
RDFS (RDF Schema)
A basic vocabulary for ontology development.
RDFS defines RDF classes and properties.
Class: a category of resources; a resource in such a
category is said to be an instance of the class
Property: a relation between a subject and object in a triple
![Page 23: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/23.jpg)
Classes and subClasses
The subClassOf property (used in defining a class) allows a
broad class to serve as the basis of a more specific class.
Defining a class (A) as a subClassOf another class (B)
means that any instance of A can be inferred to also be an
instance of B.
Class B
Class A
![Page 24: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/24.jpg)
A simple Class/subClass example
Based on these class definitions:
‘Dog’ is a Class
‘Poodle’ is a Class
‘Poodle’ is a subClassOf ‘Dog’
And the statement:
Fido is a Poodle.
It can be inferred that:
Fido is a Dog.
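The inference on this slide can be sketched in a few lines of Python. The sketch assumes each class has at most one direct superclass, which RDFS does not require, and the names `SUBCLASS_OF`, `ASSERTED_TYPES` and `inferred_types` are made up for the illustration.

```python
# Class definitions: 'Poodle' is a subClassOf 'Dog'.
SUBCLASS_OF = {"Poodle": "Dog"}

# Asserted statement: Fido is a Poodle.
ASSERTED_TYPES = {"Fido": {"Poodle"}}


def inferred_types(resource, asserted=ASSERTED_TYPES, subclass_of=SUBCLASS_OF):
    """Return asserted classes plus every superclass reachable via subClassOf."""
    types = set(asserted.get(resource, ()))
    pending = list(types)
    while pending:
        parent = subclass_of.get(pending.pop())
        if parent is not None and parent not in types:
            types.add(parent)
            pending.append(parent)
    return types


print(inferred_types("Fido"))
```

Nothing ever states that Fido is a Dog; the result follows from the class definitions alone.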
![Page 25: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/25.jpg)
RDFS Properties
The predicates in RDF triples are properties.
Properties themselves have two important properties:
domain: asserts that the subject of the triple is an instance
of a specific class
range: asserts that the object of the triple is an instance of
a specific class
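Domain and range drive inference in the same way, as this small Python sketch suggests. The property `hasOwner`, its domain and range, and the resources `Fido` and `Alice` are all hypothetical examples and do not come from any published vocabulary.

```python
# Hypothetical property definition: hasOwner has domain Dog and range Person.
PROPERTIES = {
    "hasOwner": {"domain": "Dog", "range": "Person"},
}


def infer_types(triples, properties=PROPERTIES):
    """Infer class membership of subjects and objects from domain and range."""
    types = {}
    for subject, predicate, obj in triples:
        definition = properties.get(predicate)
        if definition is None:
            continue  # nothing is known about this property
        types.setdefault(subject, set()).add(definition["domain"])
        types.setdefault(obj, set()).add(definition["range"])
    return types


print(infer_types([("Fido", "hasOwner", "Alice")]))
```

Note that domain and range do not validate data; they add conclusions. Using `hasOwner` on a cat would lead a reasoner to infer that the cat is a Dog.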
![Page 26: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/26.jpg)
OWL (Web Ontology Language)
Provides an extended set of properties used in
ontology/vocabulary definitions (used in conjunction with
RDFS)
• Equivalence/disjunction
• Advanced property definitions
• Restrictions and cardinality
owl:sameAs: A property that asserts that two resources are
the same (i.e. two URIs refer to the same thing)
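One practical effect of owl:sameAs is that statements made about either URI can be pooled. The Python sketch below rewrites every URI to a single representative of its sameAs group. It is illustrative only: the pairing of the VIAF URI with a DBpedia URI for Robert Moses is an assumption made for the example, and the function names are invented.

```python
def make_resolver(same_as_pairs):
    """Return a function mapping each URI to one representative of its sameAs group."""
    parent = {}

    def find(uri):
        # Follow links until reaching a URI that has no recorded parent.
        while parent.get(uri, uri) != uri:
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Arbitrary choice: the lexicographically smaller URI represents the group.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return find


def merge_identities(triples, same_as_pairs):
    """Rewrite subjects and objects so equivalent URIs become identical."""
    resolve = make_resolver(same_as_pairs)
    return {(resolve(s), p, resolve(o)) for s, p, o in triples}


VIAF = "http://viaf.org/viaf/52866196"
DBPEDIA = "http://dbpedia.org/resource/Robert_Moses"  # assumed equivalent, for illustration

triples = {
    ("http://archives.nypl.org/mss/2071", "http://purl.org/dc/terms/creator", VIAF),
    ("http://www.worldcat.org/oclc/834874", "http://purl.org/dc/terms/subject", DBPEDIA),
}
merged = merge_identities(triples, [(VIAF, DBPEDIA)])
```

After merging, both statements point at the same node, so a query about one identifier also finds data published under the other.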
![Page 27: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/27.jpg)
SKOS (Simple Knowledge Organization System)
Defines classes and properties to support the use of
thesauri, classification schemes, subject heading systems
and taxonomies in RDF
• Classes: skos:ConceptScheme, skos:Concept
• Properties: skos:broader, skos:narrower, skos:related,
skos:prefLabel, skos:altLabel
![Page 28: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/28.jpg)
Library of Congress Linked Data Service (id.loc.gov)
• Provides URIs for LC controlled vocabularies, thesauri,
language codes, classification schemes
• Most terms defined using SKOS + RDF representation
of MADS (where applicable)
• Complete vocabularies available as free downloads
![Page 29: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/29.jpg)
FOAF (Friend of a Friend)
• Provides a vocabulary for describing people and their
relationships to each other and to the things they
make and do
• Originally intended for web-based social networks,
FOAF has gained wider acceptance in describing
historical figures and their relationships
• Classes: Agent, Person, Organization, Group
• Properties: knows, name, based_near
![Page 30: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/30.jpg)
VIAF (Virtual International Authority File)
• Clusters names in authority files from numerous national
libraries and other agencies
• Named entities vs. just names
• OCLC is actively establishing links between VIAF and
Wikipedia, building an invaluable resource for
libraries/archives/museums to provide context for their
collections
![Page 31: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/31.jpg)
Dublin Core Metadata Initiative
• Terms for general use in describing resources
• Properties relating to simple and qualified Dublin Core
elements
• Classes for general material types (Text, Image,
PhysicalObject, etc.)
• Classes for other resources referenced by DCMI
properties (FileFormat, RightsStatement,
ProvenanceStatement, etc.)
![Page 32: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/32.jpg)
Schema.org
• Cooperative project between Bing, Google and Yahoo to
provide a mechanism for describing web content via
standardized vocabularies
• Structured data is included in HTML content via microdata
(similar to RDFa)
• Basis of Google Knowledge Graph
• OCLC now provides Schema.org linked data for all
records in WorldCat
![Page 33: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/33.jpg)
DBpedia
• Crowd-sourced community effort to extract structured
information from Wikipedia
• Enables sophisticated queries against Wikipedia
• Makes Wikipedia data freely available for re-use
![Page 34: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/34.jpg)
Other useful/notable linked data sources
Vocabularies/ontologies
• Bibliographic ontology
• Archival ontology
• Relationship ontology
Data sources
• GeoNames, Europeana, MusicBrainz, data.gov,
nytimes.com, BBC, Project Gutenberg…
![Page 35: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/35.jpg)
The obligatory linked data cloud slide
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
![Page 36: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/36.jpg)
Technical things to know a little about
• Triplestore – a database for storing RDF data
• SPARQL (SPARQL Protocol and RDF Query Language)
The primary query language for RDF data (analogous to
SQL for relational databases)
• SPARQL endpoint – Web service that provides direct
access to RDF data stores via SPARQL queries
• HTTP content negotiation – process for delivering
content (data) in different formats (e.g. RDF vs. HTML)
based on HTTP request
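Content negotiation comes down to the client stating its preferred format in the `Accept` header. The Python sketch below builds such a request with the standard library but does not send it. Whether a particular server answers this URI with Turtle is an assumption, and the helper name `rdf_request` is invented for the example.

```python
import urllib.request


def rdf_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request asking for an RDF representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


request = rdf_request("http://viaf.org/viaf/52866196")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# Sending it would look like this; the response format depends on the server:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

A browser requesting the same URI would typically ask for `text/html`, and a server that supports content negotiation can return a human-readable page instead.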
![Page 37: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/37.jpg)
Linked data attribution
A growing concern in the linked data community is the need
to include attribution with data in order to determine whether
or not it can/should be trusted.
• RDF reification – allows source attribution to be associated with an
RDF triple
• Named graphs – Extension of RDF that allows attribution and other
metadata to be associated with RDF descriptions
• Quad stores – Similar to triplestores but with an additional element
that connects the triple with its source
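The idea behind quads can be sketched in Python by adding a fourth slot that names the graph a triple came from. The graph names under `http://example.org/graphs/` are placeholders, and `Quad` and `triples_from` are names made up for this illustration.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    object: str
    graph: str  # name of the graph (source) the triple belongs to


NYPL = "http://example.org/graphs/nypl"          # placeholder graph name
WORLDCAT = "http://example.org/graphs/worldcat"  # placeholder graph name

quads = [
    Quad("http://archives.nypl.org/mss/2071", "http://purl.org/dc/terms/creator",
         "http://viaf.org/viaf/52866196", NYPL),
    Quad("http://www.worldcat.org/oclc/834874", "http://purl.org/dc/terms/subject",
         "http://viaf.org/viaf/52866196", WORLDCAT),
]


def triples_from(quads, graph):
    """Return only the triples asserted in the given named graph."""
    return [(q.subject, q.predicate, q.object) for q in quads if q.graph == graph]


print(triples_from(quads, NYPL))
```

Because the graph name is itself a URI, further statements (who published it, when, under what license) can be made about it like any other resource.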
![Page 38: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/38.jpg)
Linked Open Data
Linked data that is freely usable, reusable, and
redistributable — subject, at most, to attribution and ‘share
alike’ requirements
![Page 39: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/39.jpg)
Open data licensing
Creative Commons (CC): a nonprofit organization that enables the sharing and use of creativity and knowledge through free legal tools.
CC provides alternatives to “all rights reserved” copyright.
![Page 40: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/40.jpg)
Creative Commons Licenses
OPEN DATA (:
Attribution (CC BY)
Allows distribution and reuse in any way as long as you get credit
Attribution-ShareAlike (CC BY-SA)
Allows distribution and reuse in any way as long as you get credit and derivative works are released under the same license
NOT OPEN DATA ):
Attribution-NoDerivs (CC BY-ND)
Requires that the original is used unchanged and in whole, with credit to you
Attribution-NonCommercial (CC BY-NC)
Allows distribution and reuse in any way, for non-commercial purposes only, as long as you get credit
Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)
Allows distribution and reuse in any way, for non-commercial purposes only, as long as you get credit and derivative works are released under the same license
Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)
Only permits use as-is, for non-commercial purposes, and with credit to you; the most restrictive CC license available
![Page 41: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/41.jpg)
CC0 (‘CC Zero’)
• Allows creators to waive all rights to work and to place it
as completely as possible into the public domain.
• Designed to make it as clear as is legally possible that any
use of your content is allowed
• Quickly becoming the preferred license for open data
![Page 42: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/42.jpg)
LC Bibliographic Framework Initiative
• Developing a new bibliographic framework (to replace
MARC) based on linked data principles
• First draft of the Bibliographic Framework (BIBFRAME)
model published in November 2012
![Page 43: Linked data for librarians](https://reader036.vdocuments.mx/reader036/viewer/2022082700/5490b0bcb47959ed448b457b/html5/thumbnails/43.jpg)
LC Bibliographic Framework Initiative