istic thesaurus ws-keizer_2010-10-22
Post on 08-May-2015
The role of Thesauri and Standard Vocabularies in linking data
Dr. Johannes Keizer
FAO of the United Nations
Office of Knowledge Exchange, Research and Extension
Knowledge and Capacity for Development
dr johannes keizer - FAO of the United Nations - knowledge and capacity for development
Thesaurus Workshop – CAS Beijing, 2010-10-22
The Development of the Internet
Open World Mindset
“Closed” (“normal”) IT environments
• Data sources carefully controlled.
• Data formats “custom-defined” for an application.
Linked data based on an “open world mindset”
• Integrating data from the open Web
• Systems designed to incorporate new information incrementally
• By design, tolerance of incomplete information
The Linked Data Universe: http://www.linkeddata.org (July 2009)
The Linked Data Universe: http://www.linkeddata.org (July 2010)
Example: BBC Wildlife Finder
Humboldt Squid page, pulled together from a diversity of Linked Data sources
• Animal Diversity Web: nocturnal way of life
• BBC TV Documentary
• BBC News item
• Wikipedia
RDF – a grammar for the language of data
Resource A --relatedTo--> Resource B
Resource A --describedBy--> Some text
1. Describe resources using interrelated “statements” (“triples”).
2. Use URIs – unique, globally managed identifiers – as the “words” of statements.
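As a rough illustration of the two points above, statements can be modelled as (subject, predicate, object) tuples whose "words" are URIs. This is a minimal sketch in plain Python, not an RDF library; the `http://example.org/` namespace and the helper name `objects` are assumptions made for the example.

```python
# Each statement is a (subject, predicate, object) triple.
# URIs under http://example.org/ are placeholders, not real identifiers.
EX = "http://example.org/"

triples = {
    (EX + "ResourceA", EX + "relatedTo", EX + "ResourceB"),  # object is another resource
    (EX + "ResourceA", EX + "describedBy", "Some text"),     # object is a literal
}


def objects(graph, subject, predicate):
    """Return all objects of statements matching subject and predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


related = objects(triples, EX + "ResourceA", EX + "relatedTo")
described = objects(triples, EX + "ResourceA", EX + "describedBy")
```

Because every name is a URI, two statements about `ResourceA` written by different people refer to the same thing without any further agreement.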
RDF as a common format for merging data
• http://www.w3.org/2007/Talks/0221-Bangalore-IH/
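One way to picture merging: if two sources both express their data as triples and use the same URI for the same thing, merging is little more than a set union. The sketch below assumes hypothetical sources and a placeholder `http://example.org/` namespace; none of these identifiers come from the talk.

```python
EX = "http://example.org/"  # placeholder namespace (assumption)

# Two independent sources that happen to describe the same resource.
source_a = {
    (EX + "gene/42", EX + "label", "gene 42"),
    (EX + "gene/42", EX + "locatedOn", EX + "chromosome/7"),
}
source_b = {
    (EX + "gene/42", EX + "associatedWith", EX + "disease/9"),
    (EX + "gene/42", EX + "label", "gene 42"),  # same statement as in source_a
}

# Merging is a set union: identical statements collapse, new ones are added.
merged = source_a | source_b


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject."""
    return sorted((p, o) for (s, p, o) in graph if s == subject)


description = describe(merged, EX + "gene/42")
```

No schema had to be agreed beforehand, which is the "open world" point: a third source can be unioned in later without changing the first two.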
Finding things related to “genes” across databases
Source: Joanne Luciano, Mitre, and the W3C HCLS IG
Role of thesauri/concept schemes
• Born as tools to assure consistency in the indexing of library collections
• Thesauri were based on “terms”, but terms already represented concepts in a non-explicit way
• Hierarchical and associative relationships represented generic ontological domain knowledge
• Candidate building blocks for the semantic web
…from thesaurus to ontologies…
AGROVOC today
• around 30,000 concepts
• 600,000 labels in around 20 languages
• one-stop shop for terminological knowledge related to agriculture in general
• a knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence)
• a concept/term/string-based system
• concepts may be organized in multiple categories
Semantic Relationships
Concept to Concept
isA (hierarchy), isPestOf, hasPest
Concept to Term
has_lexicalization (links concepts to their lexical realizations)
Term to Term
isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation
Term to String
hasSpellingVariant, hasSingular
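The four levels above can be read as typing rules: each relationship connects a particular kind of subject to a particular kind of object. A minimal sketch of such a check follows; the relationship names are taken from the slide, while the table layout and the function name `is_valid` are assumptions for illustration.

```python
# Relationship name -> (level of the subject, level of the object).
RELATION_LEVELS = {
    "isA": ("concept", "concept"),
    "isPestOf": ("concept", "concept"),
    "hasPest": ("concept", "concept"),
    "has_lexicalization": ("concept", "term"),
    "isSynonymOf": ("term", "term"),
    "isTranslationOf": ("term", "term"),
    "hasAcronym": ("term", "term"),
    "hasAbbreviation": ("term", "term"),
    "hasSpellingVariant": ("term", "string"),
    "hasSingular": ("term", "string"),
}


def is_valid(subject_level, relation, object_level):
    """True if the relation is defined between these two levels."""
    return RELATION_LEVELS.get(relation) == (subject_level, object_level)


ok = is_valid("concept", "hasPest", "concept")
wrong_level = is_valid("term", "hasPest", "concept")
```

Keeping the levels apart is what lets a relationship such as hasPest hold between concepts regardless of the language in which they are labelled.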
The AGROVOC SKOS-XL Model
[Diagram: concepts 8171, 1474, 12332 and 6211, each of rdf:type SKOS Concept and linked by skos:broader; AgrovocConceptScheme (rdf:type SKOS ConceptScheme), attached via skos:inScheme and skos:topConceptOf; label resources :foo and :bar (rdf:type SKOS Label), attached via skosxl:prefLabel and skosxl:altLabel and carrying the skosxl:literalForm values “maize” and “corn”.]
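In SKOS-XL a label is a resource of its own, so reading a concept's label takes two hops: concept to label resource, then label resource to literal form. The sketch below illustrates that with plain tuples. It assumes that :foo is the preferred label with the literal form "maize", that :bar is the alternative label "corn", and that both belong to concept 12332; the diagram residue does not state these pairings explicitly, and the `agrovoc:` prefix is shorthand introduced here.

```python
# Assumed reading of the diagram: c_12332 --prefLabel--> :foo ("maize"),
# c_12332 --altLabel--> :bar ("corn"). Prefixed names are shorthand only.
triples = [
    ("agrovoc:c_12332", "rdf:type", "skos:Concept"),
    ("agrovoc:c_12332", "skos:inScheme", "agrovoc:AgrovocConceptScheme"),
    ("agrovoc:c_12332", "skosxl:prefLabel", ":foo"),
    ("agrovoc:c_12332", "skosxl:altLabel", ":bar"),
    (":foo", "rdf:type", "skosxl:Label"),
    (":foo", "skosxl:literalForm", "maize"),
    (":bar", "rdf:type", "skosxl:Label"),
    (":bar", "skosxl:literalForm", "corn"),
]


def literal_forms(graph, concept, label_property):
    """Follow concept -> label resource -> literal form (two hops)."""
    label_nodes = [o for (s, p, o) in graph if s == concept and p == label_property]
    return [o for (s, p, o) in graph
            if s in label_nodes and p == "skosxl:literalForm"]


preferred = literal_forms(triples, "agrovoc:c_12332", "skosxl:prefLabel")
alternative = literal_forms(triples, "agrovoc:c_12332", "skosxl:altLabel")
```

Because labels are resources, further statements (for example term-to-term relationships) can be attached to :foo or :bar themselves.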
http://www.w3.org/2004/02/skos/
SKOS-XL output
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>
URI of AGROVOC concept
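To show how such output might be consumed, here is a sketch that reads the fragment above with Python's standard `xml.etree.ElementTree`. The slide shows only the three `rdf:Description` elements, so the wrapping `rdf:RDF` element and its namespace declarations are assumptions added to make the fragment parseable; the function names are also hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The rdf:RDF wrapper and xmlns declarations are assumed; the three
# rdf:Description elements are copied from the slide.
DOCUMENT = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="{SKOS}ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="{SKOS}Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="{SKOSXL}" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="{SKOSXL}Label"/>
</rdf:Description>
</rdf:RDF>"""


def resources_by_type(xml_text):
    """Map each rdf:type URI to the list of resource URIs declared with it."""
    result = {}
    for description in ET.fromstring(xml_text).findall(f"{{{RDF}}}Description"):
        about = description.get(f"{{{RDF}}}about")
        for type_element in description.findall(f"{{{RDF}}}type"):
            type_uri = type_element.get(f"{{{RDF}}}resource")
            result.setdefault(type_uri, []).append(about)
    return result


def literal_forms(xml_text):
    """Return (label URI, language, text) for each skosxl:literalForm."""
    forms = []
    for description in ET.fromstring(xml_text).findall(f"{{{RDF}}}Description"):
        for form in description.findall(f"{{{SKOSXL}}}literalForm"):
            forms.append((description.get(f"{{{RDF}}}about"),
                          form.get(XML_LANG), form.text))
    return forms


by_type = resources_by_type(DOCUMENT)
labels = literal_forms(DOCUMENT)
```

A dedicated RDF parser would handle the other RDF/XML serialisation variants as well; plain XML parsing is enough here only because the fragment uses the simple `rdf:Description` form throughout.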
Linking vocabularies
AGROVOC | EUROVOC | UNBIS | Relationship
http://aims.fao.org/aos/agrovoc/c_207 | http://eurovoc.europa.eu/219055 | agroforestry | skos:exactMatch / owl:sameAs
http://aims.fao.org/aos/agrovoc/c_4826 | http://eurovoc.europa.eu/220018 | MILK | skos:exactMatch / owl:sameAs
http://aims.fao.org/aos/agrovoc/c_12332 | http://eurovoc.europa.eu/219871 | MAIZE | skos:exactMatch / owl:sameAs
http://agris.fao.org/agris-search/search/display.do?f=2004/ZA/ZA04002.xml;ZA2004000049
http://aims.fao.org/aos/agrovoc/c_7825
http://eurovoc.europa.eu/218754
Linking data through common URIs
[Diagram: the concept “Maize” (skosxl:literalForm) in AGROVOC, Eurovoc and UNBIS, connected by owl:sameAs/exactMatch, with one linked document per vocabulary]
AGROVOC: http://aims.fao.org/aos/agrovoc/c_12332
  document: http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026
Eurovoc: http://eurovoc.europa.eu/219871
  document: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF
UNBIS
  document: http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs http://eurovoc.europa.eu/219871
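The practical payoff of such a link is that a query for one vocabulary's concept can also retrieve documents indexed with the equivalent concept elsewhere. The following sketch illustrates this under stated assumptions: the concept and document URIs come from the slides, but treating each document as indexed with the maize concept, and the function names, are illustrative guesses. UNBIS is left out because no UNBIS concept URI is given.

```python
# Equivalence link stated on the slide (owl:sameAs / skos:exactMatch).
SAME_AS = [
    ("http://aims.fao.org/aos/agrovoc/c_12332", "http://eurovoc.europa.eu/219871"),
]

# Document -> concept it is indexed with (assumed for illustration).
INDEXED_WITH = {
    "http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026":
        "http://aims.fao.org/aos/agrovoc/c_12332",
    "http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF":
        "http://eurovoc.europa.eu/219871",
}


def equivalents(uri, same_as):
    """Return uri plus every URI reachable via equivalence links in either direction."""
    seen = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for a, b in same_as:
            for source, target in ((a, b), (b, a)):
                if source == current and target not in seen:
                    seen.add(target)
                    frontier.append(target)
    return seen


def documents_about(uri):
    """Find documents indexed with the concept or any equivalent concept."""
    concept_uris = equivalents(uri, SAME_AS)
    return sorted(doc for doc, concept in INDEXED_WITH.items() if concept in concept_uris)


from_agrovoc = documents_about("http://aims.fao.org/aos/agrovoc/c_12332")
from_eurovoc = documents_about("http://eurovoc.europa.eu/219871")
```

Starting from either URI yields the same two documents, which is the effect the slide describes as linking data through common URIs.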
What are we doing with unstructured data?
• We have enormous amounts of unstructured material
• Most of the documents that we produce are still semantically unstructured
• Human cataloguing and indexing is becoming ever rarer
• We need machines to do automatic semantic markup of text
• If machines are trained and based on concept schemes, they are able to do so
AgroTagger
• Identifies concepts in unstructured texts
• Uses AGROVOC as a controlled vocabulary
• Prototype under testing with excellent results (entire repository of ICARDA indexed)
• Will in future produce structured RDF files that can be used to link data, like “OpenCalais”
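To give a feel for vocabulary-based concept identification, here is a deliberately naive dictionary lookup. It is not AgroTagger's actual method, which the slides do not describe; it only shows the idea of mapping words in free text to concept URIs. The concept URIs for maize, milk and agroforestry appear elsewhere in the talk, while "corn" as an alternative label of the maize concept is an assumption based on the SKOS-XL model slide.

```python
import re

# Tiny label-to-concept table (a real vocabulary would have hundreds of
# thousands of labels). "corn" -> c_12332 is an assumption.
LABELS = {
    "maize": "http://aims.fao.org/aos/agrovoc/c_12332",
    "corn": "http://aims.fao.org/aos/agrovoc/c_12332",
    "milk": "http://aims.fao.org/aos/agrovoc/c_4826",
    "agroforestry": "http://aims.fao.org/aos/agrovoc/c_207",
}


def tag(text):
    """Return the sorted set of concept URIs whose labels occur in text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({LABELS[w] for w in words if w in LABELS})


tags = tag("Corn and maize yields improved; milk output fell.")
```

Note that "corn" and "maize" collapse into a single concept URI, which is what makes the tags usable for linking across documents and languages.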
Live demo: semantic markup:
http://viewer.opencalais.com/
http://agropedialabs.iitk.ac.in/Tagger/Agrotagger_text.php
The concept scheme workbench
The workbench
• A web-based working environment for managing the AGROVOC Concept Server
• Facilitates the collaborative editing of multilingual terminology and semantic concept information
• Includes administration and group management features
• Includes workflows for maintenance, validation and quality assurance of the data pool
• The CS is freely accessible to everybody, to facilitate collaborative editing
Group/Action/Status
GROUP
Non-registered users, Term editors, Ontology editors, Validators, Publishers, Administrators
ACTION
concept-create, concept-delete, concept-edit, term-create, term-edit, term-delete, ..........
STATUS
Proposed by guest, Proposed, Revised by guest, Revised, Validated, Published, Proposed deprecated, Deprecated
Concept Life Cycle
GUEST <concept-create> -> Proposed by guest
VALIDATOR <validates> -> Validated
PUBLISHER <publishes> -> Published
TERM EDITOR <concept-edit> -> Revised
ADMINISTRATOR <validates> -> Published
ONTOLOGY EDITOR <concept-delete> -> Proposed deprecated
PUBLISHER <validates> -> Deprecated
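The life cycle can be read as a lookup from (role, action) to the resulting status. The sketch below encodes exactly the seven pairs listed on the slide. The slide does not say which status each step starts from, so this simplified version ignores the current status; the function names are hypothetical.

```python
# (role, action) -> resulting status, as listed on the slide.
TRANSITIONS = {
    ("GUEST", "concept-create"): "Proposed by guest",
    ("VALIDATOR", "validates"): "Validated",
    ("PUBLISHER", "publishes"): "Published",
    ("TERM EDITOR", "concept-edit"): "Revised",
    ("ADMINISTRATOR", "validates"): "Published",
    ("ONTOLOGY EDITOR", "concept-delete"): "Proposed deprecated",
    ("PUBLISHER", "validates"): "Deprecated",
}


def next_status(role, action):
    """Return the status after the action, or raise if the pair is not listed."""
    try:
        return TRANSITIONS[(role, action)]
    except KeyError:
        raise PermissionError(f"{role} cannot perform {action}") from None


def run(steps):
    """Apply a sequence of (role, action) steps and return the status history."""
    return [next_status(role, action) for role, action in steps]


history = run([
    ("GUEST", "concept-create"),
    ("VALIDATOR", "validates"),
    ("PUBLISHER", "publishes"),
])
```

A fuller model would key the table on the current status as well, so that, for example, a publisher's validation leads to "Deprecated" only for a concept that is already "Proposed deprecated".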
Modules
• Home
• Search
• Concept/Term Management
• Relationship Management
• Classification Scheme Management
• Validation
• Consistency Check
• Import/Export
• User/Group Management
• Statistics/Preferences
Search
• by string: the user can specify whether the system should search by exact match, beginning with, contains or fuzzy
• by URI or term code, or by range of term codes (e.g. between 123 and 9876)
• by classification scheme
• by creation or modification date
• by specific relationship (e.g. search all concepts using “has_pest”)
• by status or language
• by notes/attributes
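The four string-search modes can be sketched in a few lines of Python. This is an illustration only, not the workbench's implementation: the mode names, the sample labels and the use of the standard library's `difflib` for fuzzy matching (with an arbitrary 0.7 similarity cutoff) are all assumptions.

```python
import difflib


def search(labels, query, mode="exact"):
    """Search labels case-insensitively using one of four matching modes."""
    q = query.lower()
    if mode == "exact":
        return [label for label in labels if label.lower() == q]
    if mode == "begins":
        return [label for label in labels if label.lower().startswith(q)]
    if mode == "contains":
        return [label for label in labels if q in label.lower()]
    if mode == "fuzzy":
        # Compare against lowercased labels, then map back to the originals.
        lowered = {label.lower(): label for label in labels}
        close = difflib.get_close_matches(q, list(lowered), n=5, cutoff=0.7)
        return [lowered[c] for c in close]
    raise ValueError(f"unknown mode: {mode}")


LABELS = ["maize", "maize flour", "milk", "agroforestry"]  # sample data

exact = search(LABELS, "maize")
begins = search(LABELS, "maize", mode="begins")
contains = search(LABELS, "flour", mode="contains")
fuzzy = search(LABELS, "maise", mode="fuzzy")  # misspelled query
```

Fuzzy matching is the mode that tolerates spelling variants and typing errors, at the cost of occasionally returning labels the user did not intend.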
Graph Visualization
Java applet-based TouchGraph
Visualizes a concept and its relationships with other concepts in a graphical view
Web services
[Diagram with the elements AGROVOC CS Workbench, TripleStore, Web services, SKOS and Other Applications, connected by arrows labelled “maintain”, “access”, “response” and “uses”]
AGROVOC Web Services
Architecture of the System
System Overview
[Architecture diagram with three tiers: front end, middleware and back end. Components shown: Administrative Database (MySQL), Protégé Triple Store (MySQL), Hibernate Layer, Protégé OWL API, Gilead, Intermediate Layer, Google Web Toolkit (GWT), Graph Visualization, GWT Incubator, Web services]
Giving it a try…
A demo version of the AWB, with all functionalities, available to users for testing purposes: http://202.73.13.50:55234/agrovocdevv10d/
Latest stable release, version 1.0 (read/write): http://202.73.13.50:55381/agrovocv10i/
Latest stable release, version 1.0 (read only; visitors with view privilege only): http://202.73.13.50:55481/agrovocv10i/
…and more: http://aims.fao.org
Thank You!