metadata and ontologies
DESCRIPTION
Slides from the Introduction and Theoretical Foundations of New Media course of the Interactive Media and Knowledge Environments master program (Tallinn University).
TRANSCRIPT
Introduction and Theoretical Foundations of New Media
Metadata and Ontologies
David Lamas, TLU, 2011
Contents
Metadata
Ontologies
Folksonomies
The semantic web
The internet of things
Metadata
Metadata
So, why is metadata relevant?
Or… why should we care about metadata?
Metadata
As a concept, it is not new
Metadata has long been used for managing document collections such as the ones kept by libraries
But the term itself was only coined in 1968
By Philip Bagley, a pioneer of computerized document retrieval
Metadata
Literally a set of data that describes and gives information about other data, metadata in our context is:
Machine readable
Descriptive
For the purposes of resource…
Discovery
Management
Delivery
Access control
Use
Re-use
Long-term preservation
Metadata
Or in other words, metadata allows for the description of the…
Definition
Structure; and
Administration
of selected resources with all contents in context to ease the further use of the resource
MARC
Or… Machine-Readable Cataloging
Is still the main metadata standard in the library world
although it is not a full cataloguing scheme being
UDC, AACR2 and RDA
Universal Decimal Classification
A multilingual classification scheme for all fields of knowledge
Available at… http://www.udcc.org/udcsummary/php/index.php
Anglo-American Cataloguing Rules
For use in the construction of catalogues
Available at… http://www.aacr2.org/
Resource Description and Access
Available at… http://www.rda-jsc.org/rda.html
Z39.50, SRW and SRU
Z39.50
A client–server protocol for searching and retrieving information, widely used in library environments
Search & Retrieve Web Service
An intended standard web-based text-searching interface
Search/Retrieval via URL
A standard XML-focused search protocol for Internet search queries, which uses the Contextual Query Language
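To make Search/Retrieval via URL a little more concrete, here is a minimal sketch of how a search request could be put together as a plain URL. The parameter names (operation, version, query, maximumRecords) follow SRU 1.2; the server address and the function name build_sru_url are assumptions made up for this illustration, not a real service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sru_url(base_url, cql_query, max_records=10):
    """Return a searchRetrieve request URL for an SRU 1.2 server."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,  # the query itself is written in CQL
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint: replace with the address of a real SRU server.
url = build_sru_url("http://sru.example.org/catalogue", 'dc.title = "metadata"')
print(url)

# Decoding the URL again shows that the CQL query survives the encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

The whole search is carried by the URL, so it can be bookmarked, linked or issued from any HTTP client; the server would answer with an XML document.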
But…
This should not bother you other than to note that…
Metadata tends to get more complicated the longer you think about it
As for the web…
It was recognized early on that finding what you need was going to get difficult
We’re talking about the mid-nineties, when the web’s size was referred to in terms of tens of thousands
Users, mainly information science specialists, began trying to catalogue it by hand
Do you remember Yahoo’s earlier versions?
As for the web…
The first search engines appeared and authors began to realize that the metadata they embedded into web pages might be important
<html>
<head>
<title>A web page</title>
<meta name="keywords" content="some, key, words" />
<meta name="description" content="a summary" />
</head>
<body>
…
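As a rough illustration of what an early search engine might have done with such a page, the sketch below reads the title and the named meta tags using Python's standard html.parser module. The class name MetaExtractor and the sample page are assumptions for this example; real crawlers are considerably more involved.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title> text and the named <meta> tags of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            # Self-closing tags such as <meta ... /> also end up here.
            self.meta[attributes["name"].lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# A completed version of the page from the slide (the body is left empty).
PAGE = """<html>
<head>
<title>A web page</title>
<meta name="keywords" content="some, key, words" />
<meta name="description" content="a summary" />
</head>
<body></body>
</html>"""

parser = MetaExtractor()
parser.feed(PAGE)
parser.close()

keywords = [word.strip() for word in parser.meta["keywords"].split(",")]
print(parser.title)
print(keywords)
print(parser.meta["description"])
```

Note that nothing here checks whether the keywords and the description are truthful, which is exactly the weakness discussed next.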
As for the web…
Then came Google
And metadata lost some relevance, as Google’s PageRank algorithm takes note of links between pages but places less emphasis on embedded metadata to avoid…
Metaspam
<meta name="description" content="a summary" />
Metacrap
<title>put your title here</title>
Dublin Core
Despite the initial drawbacks, work continued on embedded metadata, and the Dublin Core was and still is one of the main players with its 15 elements…
Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
…embedded into web pages or encoded using XML
The initial intention was to improve indexing by search engines
But whereas its promoters forgot about metaspam and metacrap, the search engines didn’t
And so, the main search engines still ignore embedded metadata
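For a feel of what Dublin Core embedded into a web page can look like, here is a small sketch that turns a record into meta tags. It assumes the common convention of prefixing element names with "DC." and declaring the element set with a schema.DC link; the function name dublin_core_meta and the sample record are made up for this example.

```python
from html import escape

# The 15 elements of the Dublin Core, in the order listed above.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError("not Dublin Core elements: %s" % sorted(unknown))
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />']
    for element in DC_ELEMENTS:
        if element in record:
            content = escape(record[element], quote=True)
            lines.append('<meta name="DC.%s" content="%s" />' % (element, content))
    return "\n".join(lines)


# Sample record describing this very slide deck.
record = {
    "title": "Metadata and Ontologies",
    "creator": "David Lamas",
    "date": "2011",
}
tags = dublin_core_meta(record)
print(tags)
```

All elements are optional and repeatable in the Dublin Core; this sketch only handles one value per element to stay short.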
Metadata
Remarkably, there has been fairly widespread adoption of metadata principles, especially in policy terms, namely in government
(look into http://www.esd.org.uk/standards/egms/viewer/viewer.aspx for an interesting example)
And in:
Education
Health
Cultural heritage
Environmental agencies, and…
Libraries, of course
Metadata
This resulted in the…
Growth of metadata cataloguing rules
(although every community has its own rules)
Growth in use of additional elements for particular communities
(and again, every community’s additions are different)
Adoption of application profiles to document the distinct cataloguing rules and additions
Institution of the Dublin Core Metadata Initiative as an organization engaged in the development of interoperable metadata standards that support a broad range of purposes and business models
Metadata
But the Dublin Core isn’t alone, far from it
Many other standards were and are being developed such as these, just to name two:
RDF (Resource Description Framework)
LOM (Learning Object Metadata)
Resource Description Framework
Developed by the W3C, the Resource Description Framework (RDF) is the envisioned standard for the semantic web
Its goal is to allow software to automatically navigate and reason about web content, thus enabling…
A web of (linked) data
Learning Object Metadata
Learning Object Metadata is a data model
Usually encoded in XML, it is used to describe learning objects and similar digital resources used to support learning.
Metadata
As said in the beginning…
Metadata tends to get more complicated the longer we think about it
The lack of coherence and cohesion in current metadata efforts, both within standards and within communities, is a good example
And that is why we will next look into ontologies
So… do we care about metadata?
Why are we interested?
Metadata
I guess the answer is yes, we care.
And yes, we are interested, because metadata is everywhere
Sometimes it is explicitly available,
Other times it is hidden or not so readily available, but anyway…
It would be foolish not to make use of it
Metadata
Further, there is increasing pressure to expose metadata on the web for others to mash up, and this is especially true today in settings such as…
Education;
Research; and
Government
And finally, metadata becomes paramount in scenarios where
content is data; or
the required information cannot easily be derived from content
Ontologies
Ontologies
One way of dealing with the lack of coherence and cohesion in current metadata efforts, both within standards and within communities, is to evolve to an ontology-based metadata approach
But what does this mean?
Ontologies
An ontology is a logical theory which gives an explicit partial account of a conceptualization
An intensional semantic structure which encodes the implicit rules constraining the structure of a piece of reality
In this light, the aim of an ontology is to define which primitives, provided with their associated semantics, are necessary for knowledge representation in a given context
Thomas R. Gruber (1993). Toward principles for the design of ontologies used for knowledge sharing. Originally in N. Guarino and R. Poli (Eds.), International Workshop on Formal Ontology, Padova, Italy. Revised August 1993. Published in International Journal of Human-Computer Studies, Volume 43, Issue 5-6, Nov./Dec. 1995, Pages 907-928, special issue on the role of formal ontology in the information technology.
Ontologies
Ontologies are usually characterized by their…
Coverage
The extent to which the primitives mobilized by the perceived usage scenarios are covered by the ontology
Specificity
The extent to which ontological primitives are precisely identified
Granularity
The extent to which primitives are precisely and formally defined
Formality
The extent to which primitives are described in a formal language
Ontologies
And ontologies are not… taxonomies
But a taxonomy might be perceived as a specific case of an ontology
A taxonomy is a particular classification arranged in a hierarchical structure
Typically it is organized by supertype/subtype relationships, also called generalization/specialization relationships
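A taxonomy in this sense can be sketched with very little code: every term points to its supertype, and generalization is just a walk up the hierarchy. The terms below (novel, book, document, resource) and the function names are invented for this illustration.

```python
# Each subtype maps to its single supertype (hypothetical example terms).
SUPERTYPE = {
    "novel": "book",
    "book": "document",
    "report": "document",
    "document": "resource",
}


def supertypes(term):
    """Return all generalizations of a term, nearest first."""
    chain = []
    while term in SUPERTYPE:
        term = SUPERTYPE[term]
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True if candidate is the term itself or one of its supertypes."""
    return term == candidate or candidate in supertypes(term)


print(supertypes("novel"))
print(is_a("novel", "document"))
print(is_a("report", "book"))
```

The only relationship available here is supertype/subtype. An ontology would also let us state other kinds of relationships and constraints between terms, which is why a taxonomy is at most a special case.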
Why ontologies?
[Image: pipe]
In short, we interpret, machines don’t
As such, an effort must be undertaken in order to support adequate usage of digital resources
So, what’s missing?
Among others…
The possibility to share a common understanding of the structure of information within a specific domain
The possibility to reuse domain knowledge
The possibility to make domain assumptions explicit
The possibility to analyze domain knowledge
Ontologies and the web
It is estimated that by 2010…
70% of public web pages will have some level of metadata, but only
20% will use more extensive semantic web approaches such as ontology-based metadata
But why should we care?
http://www.afsg.nl/InformationManagement/images/nieuws/finding%20and%20exploiting%20value%20of%20semantic%20tech%20on%20web.pdf
Ontologies and the web
An emerging ontological approach is OWL or…
Web Ontology Language
A vocabulary extension of the Resource Description Framework, which adds more vocabulary for describing characteristics of properties and classes or relations between classes
Web Ontology Language
OWL enables ontology-based information sharing and manipulation together with RDF and XML
In reverse order…
XML allows users to add arbitrary structure to their documents but says nothing about what such structures mean
RDF enables expression of meaning over XML (and other) structures
Using subject, verb and object triples
OWL enables machines to comprehend semantic documents and data
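The subject, verb and object triples mentioned for RDF can be illustrated without any RDF library at all. The sketch below keeps triples as plain tuples and answers simple pattern queries; the slide deck URL is a made-up example, the Dublin Core element URIs are used as verbs, and the function name match is an assumption for this illustration rather than part of any RDF toolkit.

```python
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical identifier for this slide deck.
SLIDES = "http://example.org/slides/metadata-and-ontologies"

# Each statement is a (subject, verb, object) triple.
TRIPLES = [
    (SLIDES, DC + "title", "Metadata and Ontologies"),
    (SLIDES, DC + "creator", "David Lamas"),
    (SLIDES, DC + "date", "2011"),
]


def match(triples, subject=None, verb=None, obj=None):
    """Return the triples that agree with every part that is not None."""
    pattern = (subject, verb, obj)
    return [
        triple
        for triple in triples
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


# Who created the slides?
creators = [obj for _, _, obj in match(TRIPLES, subject=SLIDES, verb=DC + "creator")]
print(creators)

# Everything that is said about the slides.
for triple in match(TRIPLES, subject=SLIDES):
    print(triple)
```

Because both the subject and the verb are URIs, statements published by different parties about the same resource can simply be merged into one list, which is the basic idea behind a web of linked data.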
Web Ontology Language
http://www.w3.org/TR/owl-features/
Ontologies
That said, and while addressing some of the weaknesses of current metadata efforts, present-day ontologies still largely depend on explicit human intervention to be useful
And that is why we will next look into folksonomies
Folksonomies
Folksonomies
Are mainly a bottom-up social classification system
A way to organize and share contents by tagging resources
Synonyms are…
Ethno-classification; and
Collaborative tagging
Folksonomies
Folksonomies are created by users and have…
No structure
No fixed vocabulary
No explicit relationships between terms, and
No authority
Folksonomies
Folksonomies also are…
Distributed, and
Collaboratively built and maintained
You can tag items owned by others
You can get instant feedback
All items for the same tag
All tags for the same item
You can adapt your tags to the group norm
But you are never forced
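The two kinds of instant feedback listed above (all items for the same tag, all tags for the same item) suggest a very simple data structure. Here is a minimal sketch, assuming a folksonomy is nothing more than a set of (user, item, tag) entries with two lookup indexes; the class name, user names and item names are invented for this example.

```python
from collections import defaultdict


class Folksonomy:
    """A flat set of (user, item, tag) entries with two lookup indexes."""

    def __init__(self):
        self.taggings = set()
        self._items_by_tag = defaultdict(set)
        self._tags_by_item = defaultdict(set)

    def tag(self, user, item, tag):
        # Anyone may tag any item with any word: no vocabulary, no authority.
        self.taggings.add((user, item, tag))
        self._items_by_tag[tag].add(item)
        self._tags_by_item[item].add(tag)

    def items_for(self, tag):
        """All items for the same tag."""
        return sorted(self._items_by_tag.get(tag, ()))

    def tags_for(self, item):
        """All tags for the same item."""
        return sorted(self._tags_by_item.get(item, ()))


folk = Folksonomy()
folk.tag("ana", "slides-42", "metadata")
folk.tag("ben", "slides-42", "ontology")
folk.tag("ben", "paper-7", "metadata")

print(folk.items_for("metadata"))
print(folk.tags_for("slides-42"))
```

Seeing the tags others already used for an item is what lets a user adapt to the group norm, yet nothing in the structure enforces it.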
Folksonomies
Some of their apparent benefits are…
Being cheap and easy to build and use
Being able to adapt very quickly to changes and users’ needs
Scaling well
Fostering serendipity
Semantic browsing instead of searching
Lowering the cooperation barriers
Folksonomies
But they have limits such as…
Semantic ambiguity
Polysemy, synonymy, cardinality and the use of acronyms
Syntax free
Spaces and multiple words are used without rules
Language
Different languages can be used for the same tag
Being eventually shortsighted
Failing to depict the general overview
Lack of (or minimal) structure
No explicit relationships between otherwise related tags
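Some of these limits can be softened with a little clean-up before tags are counted or compared. The sketch below is only an assumption about how that might be done: it lowercases tags, treats spaces, hyphens and underscores alike, and applies a small hand-made synonym table. The table and the sample tags are invented; polysemy (one tag with several meanings) is not addressed at all.

```python
import re
from collections import Counter

# Hypothetical, hand-maintained mapping of acronyms and synonyms.
SYNONYMS = {"semweb": "semantic web"}


def normalize(tag):
    """Lowercase a tag and unify its word separators, then map synonyms."""
    tag = re.sub(r"[\s_\-]+", " ", tag.strip().lower())
    return SYNONYMS.get(tag, tag)


raw_tags = ["Semantic Web", "semantic-web", "semantic_web", "semweb",
            " Metadata ", "metadata"]
counts = Counter(normalize(tag) for tag in raw_tags)
print(counts.most_common())
```

Six differently written tags collapse into two, but only because someone wrote the rules and the synonym table, which already moves a step away from a pure bottom-up folksonomy.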
Folksonomies and ontologies
Folksonomies
Domains
Large corpus
Informal categories
Unstable entities
Unclear edges
Participants
Naïve cataloguers
No authority
Uncoordinated users
Amateur users
Critical mass needed
Ontologies
Domains
Small corpus
Formal categories
Stable entities
Restricted entities
Clear edges
Participants
Expert cataloguers
Authoritative sources of judgment
Coordinated users
Expert users
Folksonomies and ontologies
How do we choose?
Folksonomies are useful when all that is needed is the ability to link items to topics
Ontologies are useful when what is needed is to formally define meaning
But… do we need to choose?
Not really, at least that is what current research is exploring
Folksonomies and ontologies
Research directions include
The combination of the folksonomy and ontology approaches into a hybrid system where the most consensual constructs would last while others would be forgotten or redefined
An approach that combines the ease and adaptability of a folksonomy with the formality and semantic richness of an ontology
Quantitative tag analysis and qualitative use analysis in current online social networking services
To understand whether tag usage converges or not
To understand how a folksonomy is formed
To… any ideas?
Semantic web
Semantic Web
The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help
One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well-defined meanings (in at least some terms) for its columns, the structure of the data is not evident to a robot browsing the web
Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.
Internet of things
The internet of things
The internet of things might be described as a self-configuring wireless network of sensors whose purpose would be to interconnect all things
And the concept is attributed to the former Auto-ID Center, founded in 1999 and based at the time at MIT
An alternative view focuses instead on making all things addressable by the existing naming protocols
In the current vision, objects themselves do not interact, but they may now be referred to by other agents, such as centralized servers acting for their human users
Metadata and Ontologies recap
Metadata
Ontologies
Folksonomies
The semantic web
The internet of things