Metadata: first principles
Pat Bell
Knowledge, Analysis and Intelligence
Definition
“Metadata is data about data … structured information
about a resource”
Instances of metadata
resource: book
metadata: catalogue record
Instances of metadata
resource: record
metadata: corporate file plan
Instances of metadata
resource: person
metadata: directory entry
… (Right click on web page) …
… (Select view source) …
Instances of metadata
resource: web page
metadata: tags
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<!-- InstanceBeginEditable name="doctitle" -->
<title>HM Revenue & Customs: Child Benefit & Guardian's Allowance</title>
<!-- InstanceBeginEditable name="Metadata" -->
<meta name="title" lang="eng" content="" />
<meta name="description" lang="eng" content="" />
<meta name="keywords" lang="eng" content="" />
<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />
<meta name="DCTERMS.audience" lang="eng" content="all" />
<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />
<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />
<meta name="DC.date.modified" scheme="W3CDTF" content="" />
<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />
<meta name="DC.identifier" scheme="URI" content="" />
<meta name="DC.format" lang="eng" content="text/html" />
<meta name="DC.language" scheme="ISO639-2/T" content="eng" />
<meta name="DC.publisher" lang="eng" content="HM Revenue and Customs" />
<meta name="eGMS.rights.copyright" lang="eng" content="HM Revenue and Customs" />
<!-- InstanceEndEditable -->
1st principle: one resource, one description
The resource
The metadata
Title: Mona Lisa            Title: Mona Lisa
Creator: Da Vinci           Creator: Bell
Relation: (Very distant)    Relation:
Uses of metadata today
Resource discovery
Resource administration
Technical support
search
authentication
navigation
disposal
version control
filtering
Intellectual property rights
preservation
Uses of metadata tomorrow: the semantic web
“An extension of the web … that will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can carry out sophisticated tasks for users”
Tim Berners-Lee et al, Scientific American 17 May 2001
Uses of metadata: building blocks for the semantic web
• Metadata …
• … expressed using the Resource Description Framework (RDF) …
• … in standardised XML (eXtensible Markup Language) documents.
Find out more at the World Wide Web Consortium (W3C): www.w3.org/
Components of metadata: statement
<meta name="title" lang="eng" content="" />
<meta name="description" lang="eng" content="" />
<meta name="keywords" lang="eng" content="" />
<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />
<meta name="DCTERMS.audience" lang="eng" content="all" />
<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />
<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />
<meta name="DC.date.modified" scheme="W3CDTF" content="" />
<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />
<meta name="DC.identifier" scheme="URI" content="" />
<meta name="DC.format" lang="eng" content="text/html" />
<meta name="DC.language" scheme="ISO639-2/T" content="eng" />
<meta name="DC.publisher" lang="eng" content="HM Revenue and Customs" />
<meta name="eGMS.rights.copyright" lang="eng" content="HM Revenue and Customs" />
Components of metadata: elements
<meta name="title" lang="eng" content="" />
<meta name="description" lang="eng" content="" />
<meta name="keywords" lang="eng" content="" />
<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />
<meta name="DCTERMS.audience" lang="eng" content="all" />
<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />
<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />
<meta name="DC.date.modified" scheme="W3CDTF" content="" />
<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />
Components of metadata: refinements (qualifiers)
<meta name="title" lang="eng" content="" />
<meta name="description" lang="eng" content="" />
<meta name="keywords" lang="eng" content="" />
<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />
<meta name="DCTERMS.audience" lang="eng" content="all" />
<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />
<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />
<meta name="DC.date.modified" scheme="W3CDTF" content="" />
<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />
2nd principle: dumb-down
A valid value for a refinement must also be valid for the unrefined element.
A "date issued" value (2007-07-25) is still a valid "date".
A "date updating frequency" value (monthly) is not.
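The dumb-down test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dotted names ("date.issued", "date.updatingFrequency"), the helper functions and the simple date pattern are all hypothetical, and the pattern checks the shape of a date rather than whether the date exists.

```python
import re

# Hypothetical rule for the unrefined "date" element: yyyy, yyyy-mm or
# yyyy-mm-dd. It checks the shape only, not that the date is a real one.
DATE_PATTERN = re.compile(r"[0-9]{4}(-[0-9]{2}(-[0-9]{2})?)?")


def dumb_down(name):
    """Drop the refinement from a dotted name: 'date.issued' -> 'date'."""
    return name.split(".")[0]


def survives_dumb_down(name, value):
    """Return True if the value is still valid for the unrefined element."""
    if dumb_down(name) == "date":
        return DATE_PATTERN.fullmatch(value) is not None
    return True  # no rule is modelled for other elements in this sketch


print(survives_dumb_down("date.issued", "2007-07-25"))          # True
print(survives_dumb_down("date.updatingFrequency", "monthly"))  # False
```

A refinement that fails this kind of check is a sign that it belongs under a different element.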
Components of metadata: encoding schemes
<meta name="title" lang="eng" content="" />
<meta name="description" lang="eng" content="" />
<meta name="keywords" lang="eng" content="" />
<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />
<meta name="DCTERMS.audience" lang="eng" content="all" />
<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />
<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />
<meta name="DC.date.modified" scheme="W3CDTF" content="" />
<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />
Components of metadata: encoding schemes
Two sorts:
• Controlled vocabulary (Pick list)
eg Library of Congress Subject Headings
• Syntax (Prescribed format)
eg Date format yyyy-mm-dd
(and you can have free text tags, like Title)
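The two sorts of encoding scheme suggest two sorts of check. The Python sketch below is illustrative only: the pick list is a made-up stand-in for a real controlled vocabulary, the function names are hypothetical, and it assumes that several terms in one value are separated by commas.

```python
import re
from datetime import datetime

# Hypothetical pick list standing in for a real controlled vocabulary.
SUBJECT_PICK_LIST = {"Tax", "Benefits", "Health", "Education"}


def valid_vocabulary(value, pick_list=SUBJECT_PICK_LIST):
    """Controlled vocabulary: every comma-separated term must be on the list."""
    terms = [term.strip() for term in value.split(",")]
    return all(term in pick_list for term in terms)


def valid_date_syntax(value):
    """Syntax scheme: the value must be a real date written as yyyy-mm-dd."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")  # rejects dates such as 2006-02-30
    except ValueError:
        return False
    return True


print(valid_vocabulary("Tax, Benefits"))  # True
print(valid_date_syntax("2006-03-24"))    # True
print(valid_date_syntax("2006/04/01"))    # False
```

Under this sketch, a slash-separated date such as 2006/04/01 would fail the yyyy-mm-dd check. Free-text elements such as Title need no check of this kind.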
Components of metadata: values
<meta name="title" lang="eng" content="" />
<meta name="description" lang="eng" content="" />
<meta name="keywords" lang="eng" content="" />
<meta name="eGMS.subject.category" lang="eng" scheme="GCL" content="Tax, Benefits" />
<meta name="DCTERMS.audience" lang="eng" content="all" />
<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />
<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />
<meta name="DC.date.modified" scheme="W3CDTF" content="" />
<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />
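As a rough illustration of how one statement breaks down into its components, the Python sketch below reads meta tags with the standard library's html.parser. The class name and the dictionary keys are hypothetical, and it assumes the dotted naming pattern seen in the tags above (namespace, then element, then an optional refinement).

```python
from html.parser import HTMLParser


class MetaStatementParser(HTMLParser):
    """Collect each <meta> tag as a dict of metadata components."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumed convention: "namespace.element.refinement", e.g. DC.date.issued.
        parts = (attributes.get("name") or "").split(".")
        has_namespace = len(parts) > 1
        self.statements.append({
            "namespace": parts[0] if has_namespace else None,
            "element": parts[1] if has_namespace else parts[0],
            "refinement": parts[2] if len(parts) > 2 else None,
            "scheme": attributes.get("scheme"),    # encoding scheme, if any
            "value": attributes.get("content"),
        })


parser = MetaStatementParser()
parser.feed('<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />')
print(parser.statements[0])
```

For the tag shown, the sketch reports the element as date, the refinement as issued, the encoding scheme as W3CDTF and the value as 2006-03-24. A tag with no dots in its name, such as title, has no namespace or refinement.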
3rd principle: appropriate values
• Develop policies to support local requirements
• But keep in mind wider needs
• The metadata can be used by people as well as machines
Summary
• Metadata is structured resource description
• A very abstract name for more concrete activities
• For resource discovery and administration, and technical support
• A building block of the semantic web
• Three principles: one to one, dumb-down and appropriate values
• Statements break down into elements, refinements, encoding schemes and values
The role of the information professional
Not tagging huge numbers of resources for someone else
Part of implementing a system (website, EDRM…)
Part of managing the system
Expert and guardian of standards
Guidance to the people who do the tagging