“Days of Knowledge Organization.” Oslo and Akershus University, Department of Archivistics,
Library and Information Science. 30 May 2016
Seeding the Linked Data Cloud: The present and future of library
identifiers
Carol Jean Godby
Senior Research Scientist, OCLC Research
Founded in 1967 as the Ohio College Library Center
16,957 members worldwide
1,200+ staff
18 offices in 10 countries
OCLC headquarters in Dublin, OH, USA
365 million records
2.3 billion holdings
46 million digital items
17 million eBooks
*As of February 26, 2016
A web of documents; a web of data
Albert Einstein
Person
Relativity: The Special and General Theory
Work
Physics
Concept
author
about
Entities and relationships
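The entity-and-relationship picture above can be written down as subject-predicate-object statements. The following is a minimal sketch, not the model used by any particular service: the `ex:` prefix, the local names, and the helper `objects` are all hypothetical, and the direction of each relationship (work has author, work is about concept) is an assumption read off the diagram.

```python
# Hypothetical identifiers; a real dataset would use full URIs.
EINSTEIN = "ex:AlbertEinstein"
RELATIVITY = "ex:Relativity"
PHYSICS = "ex:Physics"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (EINSTEIN, "rdf:type", "ex:Person"),
    (RELATIVITY, "rdf:type", "ex:Work"),
    (PHYSICS, "rdf:type", "ex:Concept"),
    (RELATIVITY, "ex:author", EINSTEIN),
    (RELATIVITY, "ex:about", PHYSICS),
]


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(RELATIVITY, "ex:author"))
print(objects(RELATIVITY, "ex:about"))
```

Because every node is an identifier rather than a text string, the same statements can be joined with statements published elsewhere about the same identifiers.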
https://www.wikidata.org/wiki/Q937
http://viaf.org/viaf/75121530/
Wikidata and the Virtual International Authority File (VIAF)
http://experiment.worldcat.org/entity/work/data/369081611
WorldCat Works
http://id.loc.gov/authorities/subjects/sh85101653.html
Library of Congress Subject Headings
author
about
described in entity hubs and “linked”
URI
URL
ID
• Persistent
• Globally unique
• ‘Thing’
• Web accessible document
• Database record
The evolution of identifiers
Database record ID: Library of Congress control number. 78078534 is the source of the heading “Hemingway, Ernest, 1899-1961”.
URL: Web document for the LC Name Authority record: https://lccn.loc.gov/n78078534
URI: http://id.loc.gov/authorities/names/n78078534/. Refers to the “concept” Ernest Hemingway.
URI: http://id.loc.gov/rwo/agents/n78078534. Refers to the “person” Ernest Hemingway.
An example: Ernest Hemingway
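The Hemingway example shows one control number surfacing in several identifier forms. The sketch below simply rebuilds those forms by string formatting. The function name `identifier_forms` and the dictionary keys are hypothetical; the URL patterns are modeled on the single example on the slide (trailing slash omitted from the concept URI) and are not taken from any published URI policy.

```python
def identifier_forms(lccn: str) -> dict:
    """Build the identifier forms shown for one LC name authority record.

    `lccn` is the control number with its prefix, e.g. "n78078534".
    """
    return {
        # Local key of the database record.
        "record_id": lccn,
        # Web document describing the authority record.
        "url": f"https://lccn.loc.gov/{lccn}",
        # URI for the authority "concept".
        "concept_uri": f"http://id.loc.gov/authorities/names/{lccn}",
        # URI for the real-world "person".
        "person_uri": f"http://id.loc.gov/rwo/agents/{lccn}",
    }


for form, value in identifier_forms("n78078534").items():
    print(f"{form}: {value}")
```

The concept URI and the person URI differ only in their path, yet they name different things: one is a heading maintained by catalogers, the other is the author himself.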
MORE CONTEXT: OCLC’S 2015 INTERNATIONAL LINKED DATA SURVEY
SOURCE: KAREN SMITH-YOSHIMURA
Linked Data Survey Respondents
Geographic breakdown of 90 responding institutions; 20 countries represented
[Bar chart: responding institutions per country]
USA
Spain
UK
The Netherlands
Norway
Canada
Australia
France
Germany
Italy
Switzerland
Austria
Czech Republic
Hungary
Ireland
Japan
Malaysia
Portugal
Singapore
Sweden
2015 responding institutions by type
• Academic library: 31%
• National library: 20%
• Network: 14%
• Government: 10%
• Scholarly: 8%
• Public library: 7%
• Museum: 4%
• Other: 6%
What is published as linked data
[Bar chart; categories:]
Authority files
Bibliographic data
Data about museum objects
Datasets
Descriptive metadata
Digital collections
Encoded archival descriptions
Geographic data
Ontologies/vocabularies
Other
Linked data resources most consumed
VIAF
DBpedia
GeoNames
id.loc.gov
“Resources we convert to linked data ourselves”
Getty's Art and Architecture Thesaurus
FAST (Faceted Application of Subject Terminology)
WorldCat.org
data.bnf.fr
Deutsche Nationalbibliothek Linked Data Service
http://bnb.data.bl.uk
PUBLISHING LINKED DATA
IDENTIFIERS: LESSONS FROM
OCLC’S EXPERIENCE
Benefits for data publishers
• Data is easier to manage.
• Data is broadly understandable.
• The cost of description can be shared.
• Data is easier to integrate.
[Chart axes: conformance to linked data principles; perceived benefits]
CONVERTING LEGACY
DESCRIPTIONS
Goals
Format conversion; one-to-one mapping.
Objective: publishing and maintaining persistent identifiers (URIs).
Outcomes
• A low-cost start
• A technical proof of concept
• A test of current ontologies
• A [small] break from the past
[Chart axes: conformance to linked data principles; perceived benefits]
Authority record
A MARC record and three RDF descriptions

British Library Data Model   Schema.org            BIBFRAME
bibo:BibliographicResource   schema:CreativeWork   bf:Instance
dc:title                     schema:name           bf:Title
dcterms:language             schema:inLanguage     bf:language
dc:creator                   schema:creator        bf:creator
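A one-to-one mapping of this kind amounts to a lookup table. The sketch below renames the properties of a flat description from the British Library column into one of the other two vocabularies, using only the three property rows shown on the slide. The names `CROSSWALK` and `translate`, the target keys `"schema"` and `"bibframe"`, and the sample description (including the language value) are hypothetical and for illustration only.

```python
# Property crosswalk copied from the slide's three property rows.
CROSSWALK = {
    "dc:title": {"schema": "schema:name", "bibframe": "bf:Title"},
    "dcterms:language": {"schema": "schema:inLanguage", "bibframe": "bf:language"},
    "dc:creator": {"schema": "schema:creator", "bibframe": "bf:creator"},
}


def translate(description: dict, target: str) -> dict:
    """Rename mapped properties for `target`; keep unmapped ones unchanged."""
    return {
        CROSSWALK.get(prop, {}).get(target, prop): value
        for prop, value in description.items()
    }


# Illustrative description; the values are assumptions, not catalog data.
description = {
    "dc:title": "Relativity: The Special and General Theory",
    "dc:creator": "Albert Einstein",
    "dcterms:language": "en",
}

print(translate(description, "schema"))
print(translate(description, "bibframe"))
```

A format conversion like this changes the property names but leaves the values as strings, which is why it counts only as a small break from the past.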
The 2016 Library of Congress audiovisual study

Schema                  Search engine discovery   MARC, FOAF, Product Types Ontology
BIBFRAME                OPAC discovery            MARC, FRBR, RDA, Schema, FOAF, Dublin Core
AV model and ontology   Curation                  MARC, RDA, PREMIS, FOAF
What we have to get right
Defining the right Things
Mapping strings to “Things”
Breaking away from legacy
Solving the essential problem
BUILDING ENTITY HUBS
Goals
Aggregating evidence.
Objective: resolving URIs to the same entity.
Establishing real-world references.
Evaluating quality and truthfulness of source data.
Outcomes
• A knowledge store or vault about important entities or ‘Things’
• A resource that can be integrated outside its original creation context
• A radical break with the past
[Chart axes: conformance to linked data principles; perceived benefits]
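Resolving URIs to the same entity can be pictured as grouping identifiers that are connected by "same as" assertions. The sketch below uses a union-find structure for that grouping; it is one common technique, not a description of how VIAF or any OCLC hub is built. The function name `cluster_same_as` and the `example.org` URIs are hypothetical; the Wikidata and VIAF URIs are the ones shown earlier in the deck.

```python
def cluster_same_as(links):
    """Group URIs into clusters, given pairs asserted to name the same entity."""
    parent = {}

    def find(uri):
        # Walk up to the cluster representative, shortening the path as we go.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


links = [
    ("https://www.wikidata.org/wiki/Q937", "http://viaf.org/viaf/75121530/"),
    ("http://viaf.org/viaf/75121530/", "http://example.org/person/einstein"),
    ("http://example.org/person/a", "http://example.org/person/b"),
]

for cluster in cluster_same_as(links):
    print(sorted(cluster))
```

The grouping is only as good as the assertions fed into it, which is where evaluating the quality and truthfulness of source data comes in.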
WorldCat Linked Data for “A Farewell to Arms Control”
http://experiment.worldcat.org/entity/nnnnnnn#Topic
United States
Anti-Missile Missiles
Nuclear Weapons
Military Defences
Nuclear Disarmament
Arms Limitation
VIAF: An entity hub
http://viaf.org/viaf/89803084
OCLC’s published identifiers
WorldCat Catalog
WorldCat Works
FAST
VIAF
ISNI
SOME LESSONS AND NEXT
STEPS
Janet A. Smith
Name Authority File 2
Janet B. A. Smith
Uncontrolled local URIs…
Janet B. Adam Smith
Wikidata
Janet Adam Smith
DBpedia
Janet A. Smith
Janet B. A. Smith
Name Authority File 1
‘Person’ entity hub
Beyond legacies
Janet Adam Smith
Oxford Biography Index
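The name forms above differ only in which middle names are present and whether they are spelled out. A rough way to flag such forms as candidates for the same ‘Person’ hub is to check whether one name's tokens appear in order inside the other's, letting an initial stand for a full name. This is a deliberately naive sketch under those assumptions; the function names are hypothetical, and a heuristic like this would also wrongly match, say, “Janet Alice Smith”, so it can only propose candidates for review.

```python
def tokens(name):
    """Lower-case the name parts and drop the periods after initials."""
    return [t.rstrip(".").lower() for t in name.split() if t.rstrip(".")]


def token_match(a, b):
    """A one-letter token is treated as an initial of the other token."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return a == b


def compatible(name_a, name_b):
    """True if the shorter name's tokens occur, in order, within the longer."""
    short, long_ = sorted((tokens(name_a), tokens(name_b)), key=len)
    matched = 0
    for tok in long_:
        if matched < len(short) and token_match(short[matched], tok):
            matched += 1
    return matched == len(short)


variants = [
    "Janet A. Smith",
    "Janet B. A. Smith",
    "Janet B. Adam Smith",
    "Janet Adam Smith",
]

print(all(compatible(a, b) for a in variants for b in variants))
print(compatible("Janet A. Smith", "John A. Smith"))
```

String evidence of this kind can suggest a cluster, but the decision that the forms name one person still rests on authority work.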
Some obvious gaps
Defining creator roles beyond the
published monograph. Tracking
creators throughout their careers.
Respecting their privacy. Tracking
pseudonyms, collective names, and
personas. Linking to 3rd-party datasets.
Connecting creators to works. Defining
the model of “format” that users
understand. Graphic Novel. BluRay.
Virtual Reality. Delivering the objects
that users ask for. Identifying the
simplest possible model of ‘work’ that
cuts across all formats and genres.
work, place, person, event, concept, organization
Aspirations
In the linked data
paradigm, authority
control is more
important than ever.
Takk!
Carol Jean Godby
References
• Godby, Carol Jean, Shenghui Wang and Jeffrey K. Mixter. 2015. Library Linked Data in the Cloud: OCLC’s Experiments with New Models of Resource Description. Morgan & Claypool.
• Lyons, B. and Van Malssen, K. 2016. “BIBFRAME AV Assessment: Technical, Structural, and Preservation Metadata.” https://www.loc.gov/bibframe/docs/pdf/bf-avtechstudy-01-04-2016.pdf
• Smith-Yoshimura, Karen. 2016. “Linked Data Implementations—Who, What and Why?” CNI Spring Membership Meeting, 4 April 2016, San Antonio, Texas (USA).
• Smith-Yoshimura, Karen, et al. 2016. “Addressing the Challenges with Organizational Identifiers and ISNI.” Dublin, Ohio: OCLC Research. http://www.oclc.org/content/dam/research/publications/2016/oclcresearch-organizational-identifiers-and-isni-2016.pdf
• Smith-Yoshimura, Karen, et al. 2014. “Registering Researchers in Authority Files.” Dublin, Ohio: OCLC Research. http://www.oclc.org/content/dam/research/publications/library/2014/oclcresearch-registering-researchers-2014.pdf