LD4L OCLC Data Strategy
TRANSCRIPT
Richard Wallis
OCLC Data Strategy
Technology Evangelist
@rjw
LD4L Workshop – Stanford University – February 23rd 2015
Building on a Web of Knowledge
The Web of …
Documents
Active Documents
Discovery
Data
Knowledge
http://www.opte.org/
A Web of Data
http://www.opte.org/
The Web of Data
A Library-Shaped Black Hole?
Entities in a Knowledge Graph
Open Linked Data - Silos
Library Linked Data Projects
British Library
German National Library
French National Library
Swedish National Library
Open Linked Data - Silos Behind a Vocabulary Barrier
Library Linked Data
A general purpose vocabulary for describing things on the web
"Used by 5 million domains"
"25% of pages in our indexes"
"15% of the Web"
de facto
• Linked Data
• Embedded in HTML
• RDFa, Microdata, JSON-LD
• Descriptive data
• Active links
Towards THE LIBRARY KNOWLEDGE GRAPH
person place
object concept
organization work
The library knowledge graph: A graph of relationships
person place
object concept
organization work
start here
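A graph of relationships between these entity types can be pictured as a small set of subject, predicate, object statements. The sketch below is only an illustration: the ex: identifiers and predicate names are invented, and starting the walk from a work is an assumption about where "start here" points.

```python
from collections import defaultdict, deque

# One hypothetical entity per type named on the slide.
entity_types = {
    "ex:work1": "work",
    "ex:person1": "person",
    "ex:place1": "place",
    "ex:org1": "organization",
    "ex:concept1": "concept",
    "ex:object1": "object",
}

# Relationships as (subject, predicate, object) triples; predicate names are invented.
triples = [
    ("ex:work1", "author", "ex:person1"),
    ("ex:work1", "about", "ex:concept1"),
    ("ex:work1", "publisher", "ex:org1"),
    ("ex:org1", "location", "ex:place1"),
    ("ex:object1", "exampleOfWork", "ex:work1"),
]

def reachable(start, triples):
    """Return every entity connected to `start` by any chain of relationships."""
    links = defaultdict(set)
    for subject, _predicate, obj in triples:
        links[subject].add(obj)
        links[obj].add(subject)  # follow links in both directions
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in links[node] - seen:
            seen.add(other)
            queue.append(other)
    return seen

connected = reachable("ex:work1", triples)
print(sorted(entity_types[e] for e in connected))
```

Starting from one entity and following links is what makes the graph navigable: every other entity in this toy example is reached from the single work.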
The library knowledge graph: Putting entities in library workflows
Cataloging
ILL and Analytics
Discovery
Integration with the web
Entities and library workflows: Cataloging
Improve data quality
• Link to authoritative sources
A new approach to cataloging
• Point and click cataloging
• Managing entities instead of managing records
Consistent with RDA
Entities and library workflows: Discovery
Entities and library workflows: Web exposure
Be found on the web
Connect your users to unique content
What the web requires for web exposure:
• Aggregation
• Familiar structures
• A Network of Links
• Entity Identifiers
WHAT’S HAPPENING: A Library Data Revolution
person place
object concept
organization work
OCLC’s Approach to Discoverable Data
Model things of interest to the web.
Make those things available via structures familiar to the web.
Improve library workflows.
Schema Bib Extend – http://www.w3.org/community/schemabibex
BiblioGraph.net – http://bibliograph.net
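One way to read "structures familiar to the web" is Schema.org markup embedded in an HTML page, for example as JSON-LD. The following is a minimal sketch under that assumption. The book title, author name and example.org URLs are invented for illustration and are not taken from the talk.

```python
import json

def schema_org_book(url, title, author_name, author_url):
    """Describe a book as a JSON-LD dict using Schema.org terms."""
    return {
        "@context": "http://schema.org",
        "@id": url,
        "@type": "Book",
        "name": title,
        "author": {
            "@id": author_url,  # a link to another entity, not just a text string
            "@type": "Person",
            "name": author_name,
        },
    }

def embed_in_html(description):
    """Wrap a JSON-LD description in a script element for an HTML page."""
    body = json.dumps(description, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"

# Hypothetical example values.
book = schema_org_book(
    url="http://example.org/book/1",
    title="An Example Book",
    author_name="A. N. Author",
    author_url="http://example.org/person/1",
)
snippet = embed_in_html(book)
print(snippet)
```

The same description could equally be expressed as RDFa or Microdata attributes on the visible HTML; JSON-LD is used here only because it is the shortest to show.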
ENTITIES AND WORLDCAT: The Library Data Revolution
person place
object concept
organization work
Getting from here to there
Data from a converted record does not an entity make
Transformation into Linked Data is just a beginning …
• Mine and analyse the aggregate
• Identify, map, merge - evidence based
• Relate to external sources
• Share authoritative entities
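The "identify, map, merge" step can be illustrated with a toy clustering of record-level name strings into candidate person entities. This is a sketch only: the slides do not describe OCLC's actual matching rules, so the evidence used here (same normalised name and same birth year), the field names and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict

def normalise(name):
    """Lower-case, drop punctuation and sort name parts so variants match."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))

def merge_people(records):
    """Cluster records that share the same evidence into one candidate entity."""
    clusters = defaultdict(lambda: {"names": set(), "records": set()})
    for rec in records:
        # Assumed evidence: normalised name plus birth year.
        key = (normalise(rec["name"]), rec.get("birth_year"))
        clusters[key]["names"].add(rec["name"])
        clusters[key]["records"].add(rec["record_id"])
    return dict(clusters)

# Invented sample records: two name variants for one person, plus a namesake.
records = [
    {"record_id": "r1", "name": "Austen, Jane", "birth_year": 1775},
    {"record_id": "r2", "name": "Jane Austen", "birth_year": 1775},
    {"record_id": "r3", "name": "Austen, Jane", "birth_year": 1950},
]
entities = merge_people(records)
for key, entity in entities.items():
    print(key, sorted(entity["records"]))
```

The namesake born in a different year stays a separate entity, which is the point of merging on evidence rather than on the name string alone.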
Getting from here to there
The Work Entity
• 197+ million Work descriptions and URIs
• Schema.org + BiblioGraph.net
• RDF Data formats
• RDF/XML, Turtle, Triples, JSON-LD
• Links to WorldCat manifestations
• Links to Dewey, LCSH, LCNAF, VIAF, FAST
• Open Data license via Linked Data Explorer
• 2015: Discovery API, Metadata API
• Released April 2014
http://www.oclc.org/data
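When an entity URI is published in several RDF formats, a client commonly selects one through HTTP content negotiation. The sketch below prepares such a request without sending it. It assumes that "Triples" on the slide means N-Triples and that the server honours the Accept header; the example.org URI is a placeholder, not a real Work URI.

```python
import urllib.request

# Standard media types for the RDF serialisations listed on the slide
# (reading "Triples" as N-Triples is an assumption).
MEDIA_TYPES = {
    "RDF/XML": "application/rdf+xml",
    "Turtle": "text/turtle",
    "N-Triples": "application/n-triples",
    "JSON-LD": "application/ld+json",
}

def build_request(entity_uri, rdf_format):
    """Prepare, but do not send, an HTTP request asking for one serialisation."""
    return urllib.request.Request(
        entity_uri,
        headers={"Accept": MEDIA_TYPES[rdf_format]},
    )

# Placeholder URI; substitute a real entity URI before sending.
req = build_request("http://example.org/entity/work/id/123", "Turtle")
print(req.full_url, req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the fetch; that call is left out so the example runs without network access.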
The Person Entity
• 98+ million Person descriptions and URIs
• Person entities with authority: 20.2 million
• Person entities without authority: 78.3 million
• Schema.org + BiblioGraph.net
• Harvested from WorldCat data and enriched from other hubs
• RDF Data formats
• RDF/XML, Turtle, Triples, JSON-LD
• Links to WorldCat Works. Added links from WC Works.
• Open Data license via Linked Data Explorer
• 2015: Linked Data Explorer, Discovery API
http://www.oclc.org/data
Photo credit: http://measuringupblog.com/app/wp-content/uploads/2013/11/blogpic2.jpg
Can we measure impact?
Monthly Unique Visitors
OCLC Entity-Based Data Strategy
2012, 2013, 2014
✓ VIAF, ISNI, FAST Publish Linked Data
✓ WorldCat.org Linked Data Release – using Schema.org
✓ Internal agreement on data strategy
✓ Evangelism
✓ Research & Design with Data Architecture Group
✓ Data mining of WorldCat resources
✓ WorldCat Works Released
…
• Application Integration: WorldCat Discovery, Analytics, Discovery API, Cataloging
• More Entities Released: Person, Manifestation, Organization, Concept
• New Products
• Continuing Evangelism
• New Services
• Continuing Innovation