LD4L OCLC Data Strategy

Richard Wallis, OCLC Data Strategy Technology Evangelist, @rjw. LD4L Workshop – Stanford University – February 23rd 2015

Upload: richard-wallis

Post on 14-Jul-2015

TRANSCRIPT

Page 2: LD4L OCLC Data Strategy

Richard Wallis

OCLC Data Strategy

Technology Evangelist

@rjw

LD4L Workshop – Stanford University – February 23rd 2015

Building on a Web of Knowledge

Page 10: LD4L OCLC Data Strategy

The Web of …

Documents

Active Documents

Discovery

Data

Knowledge

Page 11: LD4L OCLC Data Strategy

http://www.opte.org/

A Web of Data

Page 13: LD4L OCLC Data Strategy

http://www.opte.org/

The Web of Data

A Library Shaped Black Hole?

Page 14: LD4L OCLC Data Strategy
Page 15: LD4L OCLC Data Strategy
Page 16: LD4L OCLC Data Strategy
Page 17: LD4L OCLC Data Strategy

Entities in a Knowledge Graph

Page 19: LD4L OCLC Data Strategy

Open Linked Data - Silos

Library Linked Data Projects

Page 24: LD4L OCLC Data Strategy

Library Linked Data

British Library

German National Library

French National Library

Swedish National Library

Open Linked Data - Silos

Behind A Vocabulary Barrier

Page 25: LD4L OCLC Data Strategy
Page 26: LD4L OCLC Data Strategy
Page 34: LD4L OCLC Data Strategy

A general purpose vocabulary for describing things on the web

"Used by 5 million domains"

"25% of pages in our indexes"

"15% of the Web"

de facto

• Linked Data
• Embedded in HTML
• RDFa, Microdata, JSON-LD
• Descriptive data
• Active links

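To make the bullets on this slide concrete, here is a minimal sketch of descriptive data serialized as JSON-LD for embedding in an HTML page. It assumes the vocabulary on this slide is Schema.org, which later slides name. The book, the author, and the example.org identifiers are invented for illustration; only the type and property names (Book, Person, name, author, sameAs) come from the vocabulary.

```python
import json


def book_jsonld(book_uri, title, author_uri, author_name, same_as):
    """Build a Schema.org description of a book as a JSON-LD dictionary."""
    return {
        "@context": "http://schema.org",
        "@type": "Book",
        "@id": book_uri,                 # identifier for the thing described
        "name": title,                   # descriptive data
        "author": {                      # active link to another entity
            "@type": "Person",
            "@id": author_uri,
            "name": author_name,
        },
        "sameAs": same_as,               # link out to an external description
    }


def embed_in_html(doc):
    """Wrap a JSON-LD dictionary in the script element used to embed it in HTML."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# All URIs below are hypothetical placeholders.
example = book_jsonld(
    book_uri="http://example.org/book/1",
    title="An Example Book",
    author_uri="http://example.org/person/1",
    author_name="Example Author",
    same_as="http://example.org/other-hub/book/1",
)
snippet = embed_in_html(example)
print(snippet)
```

The same description could equally be expressed as RDFa or Microdata attributes on the visible HTML; JSON-LD is used here only because it is the shortest of the three to show.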
Page 35: LD4L OCLC Data Strategy

Towards
THE LIBRARY KNOWLEDGE GRAPH

person place

object concept

organization work

Page 38: LD4L OCLC Data Strategy

The library knowledge graph
A graph of relationships

person place

object concept

organization work

start here
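A graph of relationships between these entity types can be sketched as a list of subject, predicate, object triples. The sketch below is illustrative only: the identifiers and relationship names are invented, and starting from a work is an assumption, since the transcript does not say which entity "start here" points at.

```python
from collections import defaultdict

# Hypothetical entities and relationships, invented for illustration.
TRIPLES = [
    ("work:1", "author", "person:1"),
    ("work:1", "about", "concept:1"),
    ("work:1", "publisher", "organization:1"),
    ("person:1", "birthPlace", "place:1"),
    ("organization:1", "location", "place:1"),
]


def build_graph(triples):
    """Index triples by subject so outgoing links can be followed."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def reachable(graph, start):
    """Return every entity reachable from start by following links."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for _predicate, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


graph = build_graph(TRIPLES)
found = reachable(graph, "work:1")
print(sorted(found))
```

Starting from one entity and following links reaches the person, the concept, the organization, and the place, which is the practical difference between a graph of entities and a set of isolated records.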

Page 40: LD4L OCLC Data Strategy

The library knowledge graph
Putting entities in library workflows

Cataloging

Discovery

ILL and Analytics

Integration with the web

Page 41: LD4L OCLC Data Strategy

Entities and library workflows
Cataloging

Improve data quality
• Link to authoritative sources

A new approach to cataloging
• Point and click cataloging
• Managing entities instead of managing records

Consistent with RDA

Page 44: LD4L OCLC Data Strategy

Entities and library workflows
Discovery

Page 45: LD4L OCLC Data Strategy

Entities and library workflows
Web exposure

Be found on the web

Connect your users to unique content

What the web requires for web exposure:
• Aggregation
• Familiar structures
• A Network of Links
• Entity Identifiers

Page 52: LD4L OCLC Data Strategy

WHAT’S HAPPENING
A Library Data Revolution

person place

object concept

organization work

OCLC’s Approach to Discoverable Data

Model things of interest to the web.

Make those things available via structures familiar to the web.

Improve library workflows.

Schema Bib Extend – http://www.w3.org/community/schemabibex

BiblioGraph.net – http://bibliograph.net

Page 53: LD4L OCLC Data Strategy

ENTITIES AND WORLDCAT
The Library Data Revolution

person place

object concept

organization work

Page 59: LD4L OCLC Data Strategy

Getting from here to there

Data from a converted record does not an entity make

Transformation into Linked Data is just a beginning …
• Mine and analyse the aggregate
• Identify, map, merge - evidence based
• Relate to external sources
• Share authoritative entities
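The "identify, map, merge - evidence based" step can be illustrated with a toy clustering routine. This is a minimal sketch under stated assumptions, not OCLC's method: the records are invented, and the matching rule (normalized name plus birth year) is a deliberately simple stand-in for whatever evidence a real pipeline would weigh.

```python
import unicodedata
from collections import defaultdict

# Invented sample records, as they might look after conversion from catalog data.
RECORDS = [
    {"id": "rec1", "name": "Austen, Jane", "birth": "1775"},
    {"id": "rec2", "name": "AUSTEN, JANE.", "birth": "1775"},
    {"id": "rec3", "name": "Austen, Jane", "birth": None},
    {"id": "rec4", "name": "Dickens, Charles", "birth": "1812"},
]


def normalize(name):
    """Lowercase a name and strip punctuation, accents, and extra spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = [ch for ch in decomposed if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).lower().split())


def cluster(records):
    """Group record ids that share the same evidence key (name, birth year)."""
    clusters = defaultdict(list)
    for record in records:
        key = (normalize(record["name"]), record["birth"])
        clusters[key].append(record["id"])
    return dict(clusters)


clusters = cluster(RECORDS)
for key, ids in clusters.items():
    print(key, ids)
```

Records rec1 and rec2 merge because both pieces of evidence agree, while rec3 stays separate because its birth year is missing. That is the point of the slide: a converted record is only a candidate, and it becomes part of an entity when the evidence supports the merge.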

Page 60: LD4L OCLC Data Strategy
Page 61: LD4L OCLC Data Strategy

The Work Entity
http://www.oclc.org/data

• 197+ million Work descriptions and URIs
• Schema.org + BiblioGraph.net
• RDF Data formats
  • RDF/XML, Turtle, Triples, JSON-LD
• Links to WorldCat manifestations
• Links to Dewey, LCSH, LCNAF, VIAF, FAST
• Open Data license via Linked Data Explorer
• 2015: Discovery API, Metadata API
• Released April 2014
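One common way to offer a single URI in several RDF formats is HTTP content negotiation, where the client names the format it wants in an Accept header. The sketch below only builds such a request and does not send it. Whether the Work URIs are served this way is an assumption, as is reading "Triples" as N-Triples; the example.org URI is a placeholder, while the media types are the registered ones for each format.

```python
import urllib.request

# Registered media types for the four formats listed on the slide.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "ntriples": "application/n-triples",  # assumed meaning of "Triples"
    "jsonld": "application/ld+json",
}


def build_request(entity_uri, fmt):
    """Prepare (but do not send) a request asking for one RDF serialization."""
    return urllib.request.Request(entity_uri, headers={"Accept": MEDIA_TYPES[fmt]})


# Hypothetical entity URI, used only to show the shape of the request.
req = build_request("http://example.org/entity/work/id/0000", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the fetch; it is left out here so the example runs without network access.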

Page 62: LD4L OCLC Data Strategy
Page 63: LD4L OCLC Data Strategy
Page 64: LD4L OCLC Data Strategy
Page 65: LD4L OCLC Data Strategy
Page 66: LD4L OCLC Data Strategy

The Person Entity
http://www.oclc.org/data

• 98+ million Person descriptions and URIs
  • Person entities with authority: 20.2 million
  • Person entities without authority: 78.3 million
• Schema.org + BiblioGraph.net
• Harvested from WorldCat data and enriched from other hubs
• RDF Data formats
  • RDF/XML, Turtle, Triples, JSON-LD
• Links to WorldCat Works. Added links from WC Works.
• Open Data license via Linked Data Explorer
• 2015: Linked Data Explorer, Discovery API

Page 67: LD4L OCLC Data Strategy
Page 69: LD4L OCLC Data Strategy

Can we measure impact?

Photo credit: http://measuringupblog.com/app/wp-content/uploads/2013/11/blogpic2.jpg

Page 70: LD4L OCLC Data Strategy
Page 71: LD4L OCLC Data Strategy

Monthly Unique Visitors

Page 72: LD4L OCLC Data Strategy
Page 74: LD4L OCLC Data Strategy

OCLC Entity-Based Data Strategy

2012

2013

2014

✓ VIAF, ISNI, FAST Publish Linked Data
✓ WorldCat.org Linked Data Release – using Schema.org

✓ Internal agreement on data strategy
✓ Evangelism
✓ Research & Design with Data Architecture Group
✓ Data mining of WorldCat resources
✓ WorldCat Works Released

• Application Integration
• WorldCat Discovery
• Analytics
• Discovery API
• Cataloging

• More Entities Released
• Person
• Manifestation
• Organization
• Concept

• New Products
• Continuing Evangelism
• New Services
• Continuing Innovation

Page 80: LD4L OCLC Data Strategy

Richard Wallis

OCLC Data Strategy

Technology Evangelist @rjw

LD4L Workshop – Stanford University – February 23rd 2015

Building on a Web of Knowledge