Linked data and the implications for library cataloguing: metadata models and structures in the Semantic Web
Gordon Dunsire
Presented at the Canadian Library Association Annual Conference, 26-29 May 2011, Halifax, Nova Scotia

Page 1: Linked data and the implications for library cataloguing: metadata models and structures in the Semantic Web

Linked data and the implications for library cataloguing:

metadata models and structures in the Semantic Web

Gordon Dunsire
Presented at the Canadian Library Association Annual Conference, 26-29 May 2011, Halifax, Nova Scotia

Page 2

Outline

Context: evolution of the catalogue record
RDF 101
Library metadata models/schemas in RDF
    FRBR, RDA, ISBD, DCT, BiBO, ...
From record to triples: worked example

Page 3

A short history of the evolution of the library catalogue record

Page 4

In the beginning ...
... the catalogue card

Lee, T. B.
Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author.
1. Metadata

Page 5

From flat-file record ...
... to relational record

Bibliographic description
    Author: Lee, T. B.
    Title: Cataloguing has a future
    Content type: Spoken word
    Carrier type: Audio disc
    Provenance: Donated by the author
    Subject: Metadata

Name authority
    Name:
    Biography:
    ...

Subject authority
    Term:
    Definition:
    ...

Page 6

From flat-file description ...
... to FRBR record

Bibliographic description
    Author: Lee, T. B.
    Title: Cataloguing has a future
    Content type: Spoken word
    Carrier type: Audio disc
    Provenance: Donated by the author
    Subject: Metadata

Name authority
    Name:
    Biography:
    ...

Subject authority
    Term:
    Definition:
    ...

FRBR entities: Item, Manifestation, Expression, Work
Attributes shown against the entities: Author; Content type (Spoken word); Subject

Page 7

From FRBR record ...
... to extinction!

FRBR entities: Item, Manifestation, Expression, Work

Statements
    Author:
    Subject:
    Title: Cataloguing has a future
    Content type: Spoken word
    Carrier type: Audio disc
    Provenance: Donated by the author

Linked sources
    Name authority (Name: Lee, T. B.)
    Subject authority (Term: Metadata)
    RDA content type (Term:)
    RDA carrier type (Term:)
    Donor:
    Amazon/Publisher (Title:)

Page 8

Where is the record?

Implicit, not explicit
Everywhere and nowhere
A Semantic Web will allow machines to create the record just-in-time
We will not have to maintain records just-in-case
The user will have control over the presentation
    I want to see an archive or library or museum or Amazon or Google or Flickr or ? display
And by avoiding duplication, we can all get on with describing new stuff ...

Page 9

The hyperdimensional (Tardis) card

Lee, T. B.
Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author.
1. Metadata

Audio shop
Lee Museum
Spoken word archive
W3C Library

“TARDIS four port USB hub, for office-bound Time Lords: Open a time vortex on your desk” – Pocket-lint

Page 10

RDF 101

Page 11

Semantic Web

“machine-readable metadata”
    Faster! 24/7/365! Global!
Metadata expressed as “atomic” statements
    A simple, single, irreducible statement
    The title of this book is “Treasure island”
In a standard machine-processable format
    Resource Description Framework (RDF)

Page 12

Resource Description Framework

Metadata statement constructed in 3 parts
    “Triple”
The title of this book is “Treasure island”
    Subject of the statement = Subject: This book
    Nature of the statement = Predicate: has title
    Value of the statement = Object: “Treasure island”
This book – has title – “Treasure island”
subject – predicate – object
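
A minimal sketch of the three-part statement above, in Python. The `Triple` type and its field names are illustrative assumptions, not part of RDF itself, and at this stage the parts are still human labels rather than URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic metadata statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# The statement: The title of this book is "Treasure island"
statement = Triple(subject="This book", predicate="has title", obj="Treasure island")

# Show the three parts in subject, predicate, object order.
print(" | ".join(statement))
```

Keeping each statement as its own small value is what later makes it possible to store, share, and recombine statements independently of any record.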

Page 13

Identifiers

Need unambiguous way of identifying each part of the triple for efficient machine-processing
    Human labels (“This book”, “has title”) no good
    Same thing, different labels; different things, same label
Exploit the utility of the URL
    Machine-readable, regular syntax, unambiguous
Uniform Resource Identifier (URI)

Page 14

Uniform Resource Identifier

Can be any unique combination of numbers and letters
    No intrinsic meaning; it’s just an identifying label
Can look like a URL
    http://iflastandards.info/ns/isbd/elements/P1001
    But does not lead to a Web page (in principle ...)
RDF requires the subject and predicate of triple to be URIs
    Object can be a URI, or a literal string (“Treasure island”)

Page 15

Namespaces

URI can be constructed from a base plus a unique, identifying suffix
    http://iflastandards.info/ns/isbd/elements/ + P1001
Base is known as a namespace
Can be abbreviated by human programmer
    “isbd” = http://iflastandards.info/ns/isbd/elements/
    isbd:P1001
Machine expands abbreviation for processing
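
The expansion step can be sketched as a simple prefix lookup. The `isbd` entry comes from the slide; the `NAMESPACES` table and the `expand` function are hypothetical names used only for illustration.

```python
# Prefix table: the "isbd" namespace is the one given on the slide.
NAMESPACES = {"isbd": "http://iflastandards.info/ns/isbd/elements/"}


def expand(qname: str) -> str:
    """Expand 'prefix:suffix' into a full URI by looking up the prefix."""
    prefix, suffix = qname.split(":", 1)
    return NAMESPACES[prefix] + suffix


print(expand("isbd:P1001"))
```

An unknown prefix raises `KeyError` here, which seems a reasonable default: an abbreviation with no declared namespace cannot be turned into an identifier.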

Page 16

Everything as triples in RDF

Every aspect of the metadata must be expressed in RDF to be machine-processable
    Metadata about real-world objects (books, people, etc.)
    Metadata about the predicates (definition, label, scope, etc.)
Common predicates apply to many types of thing (human-readable label, etc.)
    High-level RDF namespaces (rdfs, owl)
RDF is expressed in RDF (“bootstrap”)

Page 17

Library namespaces

Page 18

Creating namespaces and URIs

FRBR/FRAD/FRSAD, ISBD, and RDA are using the Open Metadata Registry
    Can assign a running “number” to the base to create a new URI
Set of properties for creating basic triples
    Properties = predicates
    rdfs:label for assigning a human-readable label to the subject
    isbd:P1001 - rdfs:label - “has content form”

Page 19
Page 20
Page 21
Page 22
Page 23

Subject | Predicate | Object
isbd:P1001 | rdfs:label | “has content form”

Page 24
Page 25
Page 26
Page 27
Page 28

Subject | Predicate | Object
isbdcf:T1008 | skos:prefLabel | “spoken word”

Page 29
Page 30
Page 31

Application profile

Need a way to specify how a useful “record” can be constructed from RDF triples
Which triples are involved, and from which namespaces?
Sequence? Repeatable? Mandatory?
Sub-component aggregations
    Publication statement = place + name + date
Content rules?

Page 32

Mandatory
Not repeatable
Aggregation of simpler elements
Syntax of aggregation (punctuation)
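
One way to picture an application profile rule is as a small data structure that records these four properties and is checked when a display string is built. This is only a sketch: `ElementRule`, `render`, the separator punctuation, and the sample values are all assumptions made for illustration, not taken from any published profile.

```python
from dataclasses import dataclass


@dataclass
class ElementRule:
    """Constraints on one element in a hypothetical application profile."""
    name: str
    mandatory: bool
    repeatable: bool
    parts: tuple       # simpler elements aggregated into this one
    separators: tuple  # punctuation placed between consecutive parts


# Publication statement = place + name + date (separators are assumed).
publication = ElementRule(
    name="publication statement",
    mandatory=True,
    repeatable=False,
    parts=("place", "name", "date"),
    separators=(" : ", ", "),
)


def render(rule, statements):
    """Check mandatory/repeatable, then aggregate each statement's parts."""
    if rule.mandatory and not statements:
        raise ValueError(f"{rule.name} is mandatory")
    if not rule.repeatable and len(statements) > 1:
        raise ValueError(f"{rule.name} is not repeatable")
    rendered = []
    for values in statements:
        text = values[rule.parts[0]]
        for separator, part in zip(rule.separators, rule.parts[1:]):
            text += separator + values[part]
        rendered.append(text)
    return rendered


# Placeholder values, not data from a real record.
sample = {"place": "Placeville", "name": "Example Press", "date": "2004"}
print(render(publication, [sample]))
```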

Page 33

Getting triples from records

Page 34

Linking Open Data cloud (LOD)

Diagram by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Page 35

LOD: “Library” corner

Page 36

Why get involved?

To share our data
    We work for “society”
To share our expertise and experience
    150+ years
To promote the power of libraries (and archives and museums)
To survive

Page 37

From record to triples (in 9 stages)

Very large numbers of records
    Catalogue records, finding aids, etc.
    300 million; 1 billion?
High quality metadata
    In comparison with other communities
Each record may generate many triples
    200 “raw” triples (no inferences) per MARC record?
Very, very large numbers of triples
    Billions? Trillions?
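
As a back-of-the-envelope check, the slide's own guesses can simply be multiplied out. The function name is hypothetical and the inputs are the speaker's rough estimates, not measurements.

```python
TRIPLES_PER_RECORD = 200  # the slide's guess for one MARC record


def estimate_triples(records: int, per_record: int = TRIPLES_PER_RECORD) -> int:
    """Raw triple count (no inferences) for a given number of records."""
    return records * per_record


for records in (300_000_000, 1_000_000_000):
    print(f"{records:,} records -> {estimate_triples(records):,} raw triples")
```

Under these assumptions the raw count lands between 60 billion and 200 billion triples, so "billions" is already conservative before any inferred triples are added.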

Page 38

1. Take a record

Field/attribute | Value
Record ID | 54321
Title | Museum archives: an introduction
Author | Wythe, Deborah
Date | 2004
LCSH | Museum archives
Media/GMD | Electronic
Content form | Text

Page 39

2. Disaggregate to single statements

Record | Attribute | Value
54321 | (has) title | Museum archives: an introduction
54321 | (has) author | Wythe, Deborah
54321 | (has) date | 2004
54321 | (has) LCSH | Museum archives
54321 | (has) media type | Electronic
54321 | (has) content form | Text
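
Stages 1 and 2 can be sketched as turning one flat record into a list of single statements. The field names and values are those of the example record; the `disaggregate` function is a hypothetical helper.

```python
# Stage 1: the example record as a flat field/value structure.
record = {
    "Record ID": "54321",
    "Title": "Museum archives: an introduction",
    "Author": "Wythe, Deborah",
    "Date": "2004",
    "LCSH": "Museum archives",
    "Media/GMD": "Electronic",
    "Content form": "Text",
}


def disaggregate(rec, id_field="Record ID"):
    """Stage 2: one (record ID, attribute, value) statement per field."""
    record_id = rec[id_field]
    return [
        (record_id, attribute, value)
        for attribute, value in rec.items()
        if attribute != id_field
    ]


for statement in disaggregate(record):
    print(statement)
```

The record identifier is repeated in every statement, which is what lets each statement stand alone once the record itself is gone.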

Page 40

3. Create URI for record

Must be unique, so 54321 no good on its own
http URIs are a good thing (W3C)
So add record ID to a unique http domain
    E.g. http://MyLibraryX.com (unique to the library) + 54321
    http://MyLibraryX.com/54321
    (or http://MyLibraryX.com#54321)
This is not a URL!

Page 41

4. Replace record ID with URI

URI | Attribute | Value
mlx:54321 | (has) title | Museum archives: an introduction
mlx:54321 | (has) author | Wythe, Deborah
mlx:54321 | (has) date | 2004
mlx:54321 | (has) LCSH | Museum archives
mlx:54321 | (has) media type | Electronic
mlx:54321 | (has) content form | Text

“mlx” = qname (xmlns) = shorthand for “http://MyLibraryX.com/”
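
Stages 3 and 4 amount to appending the local record ID to a base that only this library uses, then substituting the result into each statement. The base and the `mlx` prefix come from the slides; the function names are hypothetical.

```python
BASE = "http://MyLibraryX.com/"  # unique to the library, per the slide
PREFIX = "mlx"                   # shorthand for BASE


def record_uri(record_id: str) -> str:
    """Stage 3: full URI for a record."""
    return BASE + record_id


def record_qname(record_id: str) -> str:
    """Abbreviated form of the same URI."""
    return f"{PREFIX}:{record_id}"


# Two of the statements from stage 2.
statements = [
    ("54321", "title", "Museum archives: an introduction"),
    ("54321", "author", "Wythe, Deborah"),
]

# Stage 4: swap the local record ID for the globally unique identifier.
with_uris = [(record_qname(rid), attr, value) for rid, attr, value in statements]

print(record_uri("54321"))
print(with_uris)
```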

Page 42

5. Find URIs for attributes

Attributes are modelled as RDF properties (predicates) in “element set” namespaces
    E.g. Dublin Core terms (dct); ISBD (isbd); FRBR (frbrer); RDA (rdaxxx); Bibliographic Ontology (bibo); etc.
Choose a namespace, find property with same (or closest) “meaning” (e.g. definition) as attribute
    Nearest property minimises loss of information
Get URI for property
    If no suitable property, choose another namespace
Properties do not have to come from single namespace
    Match and mix!

Page 43

5 (cont). Find URI for title

http://purl.org/dc/terms/title (dct:title)
http://iflastandards.info/ns/isbd/elements/P1014 (isbd:P1014)
    hasTitleProper
http://RDVocab.info/Elements/titleProper (rdaGR1:titleProper)

Page 44

5 (cont). Find URI for author

dct:creator
rdarole:author
(isbd does not cover “headings”)

Page 45

5 (cont). Find URI for date

dct:date
isbd:P1018
    hasDateOfPublicationProductionDistribution
rdaGr1:dateOfPublication

Page 46

5 (cont). Find URI for LCSH

LCSH is a subject vocabulary
    Controlled terms
So attribute is really “subject”
    And the term itself is the value
dct:subject

Page 47

5 (cont). Find URI for media type

Assuming record uses new ISBD Area 0 ...
isbd:P1003
    hasMediaType

Page 48

5 (cont). Find URI for content form

Assuming record uses new ISBD Area 0 ...
isbd:P1001
    hasContentForm

Page 49

6. Replace attributes with URIs

URI | URI | Value
mlx:54321 | isbd:P1014 | Museum archives: an introduction
mlx:54321 | rdarole:author | Wythe, Deborah
mlx:54321 | isbd:P1018 | 2004
mlx:54321 | dct:subject | Museum archives
mlx:54321 | isbd:P1003 | Electronic
mlx:54321 | isbd:P1001 | Text
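
Stages 5 and 6 can be sketched as a lookup from local attribute names to the properties chosen on the preceding slides. The mapping reproduces the choices made in the worked example; `PROPERTY_FOR` and `replace_attributes` are hypothetical names, and a real mapping would be the outcome of comparing property definitions by hand.

```python
# Property chosen for each attribute in the worked example ("match and mix").
PROPERTY_FOR = {
    "title": "isbd:P1014",
    "author": "rdarole:author",
    "date": "isbd:P1018",
    "LCSH": "dct:subject",
    "media type": "isbd:P1003",
    "content form": "isbd:P1001",
}


def replace_attributes(statements):
    """Return (mapped, unmapped): statements with a property, and the rest."""
    mapped, unmapped = [], []
    for subject, attribute, value in statements:
        if attribute in PROPERTY_FOR:
            mapped.append((subject, PROPERTY_FOR[attribute], value))
        else:
            unmapped.append((subject, attribute, value))
    return mapped, unmapped


mapped, unmapped = replace_attributes([
    ("mlx:54321", "title", "Museum archives: an introduction"),
    ("mlx:54321", "LCSH", "Museum archives"),
    ("mlx:54321", "shelfmark", "X.1"),  # invented attribute with no property yet
])
print(mapped)
print(unmapped)
```

Returning the unmapped statements separately, rather than dropping them, keeps visible the attributes for which another namespace still has to be searched.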

Page 50

7. Find URIs for values

If object of a triple is a URI, it can link to the subject of another triple with the same URI
    Linked data!
Values from controlled vocabularies may have URIs
    Possible vocabularies: author, subject, ISBD Area 0
    NOT: title, date
For author: Virtual International Authority File (VIAF)
For LCSH: Library of Congress Authorities & Vocabularies
For ISBD Area 0: Open Metadata Registry

Page 51

7 (cont). Find URI for author

Author: Wythe, Deborah
VIAF: http://www.viaf.org/
viaf:31899419/#Wythe,+Deborah

Page 52

7 (cont). Find URI for subject (LCSH)

LCSH: Museum archives
LoC: http://id.loc.gov/authorities/
lcsh:/sh85088707#concept

Page 53

7 (cont). Find URIs for ISBD Area 0

Media type: Electronic
    ISBD media type
    isbdmt:T1002
Content form: Text
    ISBD Content form
    isbdcf:T1009

Page 54

8. Replace values with URIs

subject | predicate | object
mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh85088707#concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009
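
Stages 7 and 8 can be sketched as a second lookup, this time on values. The table below stands in for searches against VIAF, the Library of Congress service, and the Open Metadata Registry; it only hard-codes the identifiers found in the worked example, and `VALUE_URIS` and `replace_values` are hypothetical names.

```python
# Identifiers found for controlled values in the worked example.
VALUE_URIS = {
    ("rdarole:author", "Wythe, Deborah"): "viaf:31899419/#Wythe,+Deborah",
    ("dct:subject", "Museum archives"): "lcsh:/sh85088707#concept",
    ("isbd:P1003", "Electronic"): "isbdmt:T1002",
    ("isbd:P1001", "Text"): "isbdcf:T1009",
}


def replace_values(statements):
    """Return (subject, predicate, object, object_is_uri) for each statement.

    Values with no identifier (title, date) stay as literal strings.
    """
    result = []
    for subject, predicate, value in statements:
        uri = VALUE_URIS.get((predicate, value))
        if uri is None:
            result.append((subject, predicate, value, False))
        else:
            result.append((subject, predicate, uri, True))
    return result


triples = replace_values([
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "rdarole:author", "Wythe, Deborah"),
    ("mlx:54321", "isbd:P1018", "2004"),
])
for triple in triples:
    print(triple)
```

Keying the lookup on the predicate as well as the value avoids confusing, say, a subject term with an identical string used as a title.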

Page 55

9. Publish triples (linked data)

mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh85088707#concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009
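
One common way to publish triples is as N-Triples, a line-based RDF serialization in which every URI is written in full. The sketch below covers only the three triples whose namespaces are spelled out in the slides; the `expand` and `to_ntriples` helpers are hypothetical, and dropping the doubled slash when joining the `lcsh` namespace to its suffix is an assumption about what the slide intends.

```python
# Namespaces as given in the slides.
NAMESPACES = {
    "mlx": "http://MyLibraryX.com/",
    "isbd": "http://iflastandards.info/ns/isbd/elements/",
    "dct": "http://purl.org/dc/terms/",
    "lcsh": "http://id.loc.gov/authorities/",
}


def expand(qname):
    """Expand prefix:suffix, avoiding a doubled slash at the join (assumption)."""
    prefix, local = qname.split(":", 1)
    base = NAMESPACES[prefix]
    if base.endswith("/") and local.startswith("/"):
        local = local[1:]
    return base + local


def to_ntriples(triples):
    """Serialize (subject, predicate, object, object_is_uri) tuples."""
    lines = []
    for subject, predicate, obj, obj_is_uri in triples:
        if obj_is_uri:
            obj_text = f"<{expand(obj)}>"
        else:
            # Escape backslashes and double quotes inside literal strings.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            obj_text = f'"{escaped}"'
        lines.append(f"<{expand(subject)}> <{expand(predicate)}> {obj_text} .")
    return "\n".join(lines)


triples = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction", False),
    ("mlx:54321", "isbd:P1018", "2004", False),
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept", True),
]
print(to_ntriples(triples))
```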

Page 56

Linked data chains

mlx:54321 | dct:subject | lcsh:/sh85088707#concept
lcsh:/sh85088707#concept | skos:related | rameau:XXX
rameau:XXX | frbrer:isSubjectOf | mly:98765
rameau:XXX | skos:prefLabel | “archives du musée”
mly:98765 | rda:titleOfTheWork | “Managing archives in museums”
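
The chain works because the object of one triple reappears as the subject of another. A minimal sketch of following such links, using the five triples from the slide (including its `rameau:XXX` placeholder); the `reachable` function is a hypothetical helper doing a plain breadth-first walk.

```python
from collections import defaultdict

TRIPLES = [
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),
    ("lcsh:/sh85088707#concept", "skos:related", "rameau:XXX"),
    ("rameau:XXX", "frbrer:isSubjectOf", "mly:98765"),
    ("rameau:XXX", "skos:prefLabel", "archives du musée"),
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
]


def reachable(start, triples):
    """Every triple reached from `start` by following object -> subject links."""
    by_subject = defaultdict(list)
    for triple in triples:
        by_subject[triple[0]].append(triple)
    seen, queue, found = {start}, [start], []
    while queue:
        node = queue.pop(0)
        for triple in by_subject.get(node, []):
            found.append(triple)
            if triple[2] not in seen:
                seen.add(triple[2])
                queue.append(triple[2])
    return found


for subject, predicate, obj in reachable("mlx:54321", TRIPLES):
    print(subject, "|", predicate, "|", obj)
```

Starting from one library's record, the walk ends at a title held by a different library, which is the point of the slide: neither catalogue needed to know about the other in advance.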

Page 57

Linked data cluster = “record”

mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh85088707#concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009

Page 58

Metadata focus

Shift of focus of metadata creation, maintenance, storage, preservation (by professionals, amateurs, machines)
    From Record
    To Statement(s) = triple(s)
But metadata display ...
    ... aggregates triples (from multiple sources) to create records on the fly
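
Building a record on the fly can be sketched as gathering every triple that shares a subject, whichever source it came from. The triples are from the worked example; splitting them across two sources and the `build_record` function are assumptions made for illustration.

```python
from collections import defaultdict

# Two separate stores of triples (the split between them is hypothetical).
library_triples = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "isbd:P1018", "2004"),
]
other_triples = [
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
]


def build_record(subject, *sources):
    """Cluster all triples about `subject` into a predicate -> values mapping."""
    record = defaultdict(list)
    for source in sources:
        for s, predicate, obj in source:
            if s == subject:
                record[predicate].append(obj)
    return dict(record)


print(build_record("mlx:54321", library_triples, other_triples))
```

Nothing called a record is stored anywhere in this sketch; the cluster exists only for as long as the display needs it.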

Page 59

Thank you

[email protected]

Open Metadata Registry
http://metadataregistry.org/