the agricultural ontology service. a comprehensive framework for building multilingual domain...
Upload: aims-agricultural-information-management-standards-fao-of-the-un
Post on 11-May-2015
Johannes Keizer
Food and Agriculture
Organization of the UN
Library and Documentation
Systems Division
Semantic Standards for
the Web
28-10-2002
A Comprehensive Framework for
Building Multilingual
Domain Ontologies: Creating a Prototype
Biosecurity Ontology
AFITA 2002
Beijing
The AGRICULTURAL ONTOLOGY SERVICE
A Comprehensive Framework for Building Multilingual Domain Ontologies:
Creating an ontology on Food Safety, Animal and Plant Health
(OFsAPH)
Johannes Keizer
Information Systems Officer
Food and Agriculture Organization of the UN
AFITA 2002, Beijing 28th October 2002
Team: Boris Lauser, Allison Poullos, Tanja Wildemann, Frehiwot Fisseha
Slide 2
FAO's mandate
• Reducing the number of hungry people by 50% by the year 2015 (World Food Summit 1996).
• WAICENT (World Agricultural Information Center) is FAO’s approach to fighting hunger with information
• FAO itself produces a huge amount of content in its subject area
• It is also within FAO's mandate to make available useful information from other information providers
• FAO collaborates in information networks
Slide 3
Introduction
It has become a commonplace to point out how difficult it is to find relevant information on the web
Slide 4
The Search Problem
How to evaluate search results?
Recall = Number of Relevant Documents Identified / Number of Relevant Documents in the Collection
Precision = Number of Relevant Documents Identified / Total Number of Documents Identified
Both parameters rank low today!
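The two measures on this slide can be computed for a single query as in the following minimal sketch. Recall is the share of the relevant documents in the collection that the search identified, and precision is the share of the identified documents that are relevant. The function name and the document identifiers are hypothetical and serve only as illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: documents the search system identified
    relevant:  all relevant documents in the collection
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents identified
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 5 relevant documents exist.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the four returned documents are relevant, and two of the five relevant documents were found.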
Slide 5
State of Search Systems
• Full-text search engines based on statistical text analysis are imprecise by nature
• New systems based only on “machine intelligence” do not show very promising results
• Recognition of meaning (semantic analysis) by machines is only possible by using knowledge organization systems
– Agreed metadata schemas
– Controlled vocabularies
– Machine-readable encoding
Slide 6
Knowledge Organization Systems: Vocabularies
AGROVOC
NAL Thesaurus
CABI Thesaurus
Dedicated KOSs
Non-dedicated KOSs
e.g., ASFA thesaurus
e.g., the Multilingual Forestry Thesaurus
e.g., the Sustainable Development website classification
e.g., biological taxonomies such as NCBI and ITIS
GEMET
Other thematic thesauri
Existing Thesauri and Knowledge Organization Systems (KOSs)
Common concepts are not declared
No or very limited interoperability
Insufficient subject + language coverage
Severe maintenance problems
Very limited machine readability
Only very simple encoding of semantic relations
Slide 7
Ontologies?
An ontology is a formal knowledge organization system
• It contains
– concepts (and instances)
– a formal description of the application knowledge
– definitions of concepts and instances
– relations between concepts and instances
– the possibility of machine processing
• Nearly everyone tries to build (implicit) ontologies
– Directory structures, navigation trees
– Humans can overcome bad organization by intuition
– Machines have no intuition; machines need formal information
Slide 8
What benefits do we expect from Ontologies?
• Semantic organization of websites
– Knowledge maps
– Guided discovery of knowledge
– Easy retrievability of information without using complicated Boolean logic
• Text processing by machines
– Text mining on the Web (meaning-oriented access)
– Automatic indexing and text annotation tools
– Full-text search engines that create meaningful classification (semantic clustering), e.g. FAO-Schwartz is not related to FAO
• Intelligent search of the Web
– Building dynamic catalogues from machine-readable metadata
• Natural language processing
– Better machine translation
– Queries using natural language
Slide 9
The Example: International Portal on Food Safety, Animal and Plant Health
• Goal: To create an ontology, i.e. an explicit, formal specification of a shared conceptualization of a domain of interest
Slide 10
Ontology: conceptual model
[Diagram: a Concept has a label, synonyms, a stem and a description; a relationship links a Concept to another Concept]
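The conceptual model on this slide (a concept with a label, synonyms, a stem and a description, linked to other concepts by named relationships) could be represented along the following lines. This is only a sketch under assumptions: the class and method names are hypothetical, the example synonym is invented for illustration, and the slide does not say how the project actually stored its model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    label: str
    synonyms: List[str] = field(default_factory=list)
    stem: str = ""
    description: str = ""


@dataclass(frozen=True)
class Relationship:
    source: str  # label of the concept the relationship starts from
    name: str    # e.g. "is a step in the process"
    target: str  # label of the concept the relationship points to


class Ontology:
    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self.relationships: List[Relationship] = []

    def add_concept(self, concept: Concept) -> None:
        self.concepts[concept.label] = concept

    def relate(self, source: str, name: str, target: str) -> None:
        # Both ends of a relationship must already be known concepts.
        for label in (source, target):
            if label not in self.concepts:
                raise KeyError(f"unknown concept: {label}")
        self.relationships.append(Relationship(source, name, target))

    def find(self, term: str) -> Optional[Concept]:
        """Look a term up by label or synonym, ignoring case."""
        wanted = term.lower()
        for concept in self.concepts.values():
            names = [concept.label] + concept.synonyms
            if wanted in (n.lower() for n in names):
                return concept
        return None


onto = Ontology()
onto.add_concept(Concept("risk assessment"))
# The synonym below is an invented example, not taken from the slides.
onto.add_concept(Concept("hazard identification", synonyms=["identification of hazards"]))
onto.relate("hazard identification", "is a step in the process", "risk assessment")
print(onto.find("Identification of Hazards").label)
```

Keeping relationships as separate records rather than as attributes of a concept makes it easy to count concepts and relationships independently, as the later slides do.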
Slide 11
Ontology: RDFS model, machine readable encoding
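The slide's figure is not preserved in the transcript. As a rough illustration of what a machine-readable RDFS encoding of a multilingual concept can look like, the sketch below writes a concept as N-Triples lines using the standard `rdf:type`, `rdfs:Class`, `rdfs:label` and `rdfs:subClassOf` terms. The base namespace `http://example.org/ofsaph#`, the function names, and the choice of "cheese" as a subclass of "milk products" are assumptions made for the example; the project's actual RDFS model is not shown in the source.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
BASE = "http://example.org/ofsaph#"  # hypothetical namespace


def uri(local_name):
    return f"<{BASE}{local_name}>"


def literal(text, lang):
    # Escape backslashes and quotes, then attach the language tag.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_triples(local_name, labels, broader=None):
    """Return N-Triples lines for one concept.

    labels:  dict mapping a language code to the label in that language
    broader: local name of the parent class, if any
    """
    subject = uri(local_name)
    lines = [f"{subject} <{RDF_TYPE}> <{RDFS}Class> ."]
    for lang, text in sorted(labels.items()):
        lines.append(f"{subject} <{RDFS}label> {literal(text, lang)} .")
    if broader is not None:
        lines.append(f"{subject} <{RDFS}subClassOf> {uri(broader)} .")
    return lines


lines = concept_triples(
    "cheese",
    {"en": "cheese", "fr": "fromage"},
    broader="milk_products",  # assumed hierarchy, for illustration only
)
print("\n".join(lines))
```

One label per language on the same resource is what makes the encoding multilingual: the concept keeps a single identifier while each language contributes its own label.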
Slide 12
Processes to create a Domain Ontology
• Ontology acquisition (2 paths)
– Creating core ontology from scratch
– Automatic extraction of ontological knowledge from base vocabulary and domain specific text sources
• Merging into one ontology
• Refinement and Extension
• Evaluation and Assessment
Slide 13
Creation of the core ontology
• Information resources: brainstorming, Codex Alimentarius, SPS Agreement
• 3 subject specialists
• Ontology Editor (SOEP)
• Result: Core Ontology with 67 concepts and 91 relationships
Slide 14
1st Acquisition Approach: Focused Web Crawling
• Input: Core Ontology (67 concepts, 91 relationships)
• Output: list of 257 food safety domain web pages
• Grouping into main sites
List of extracted main sites:
– http://www.foodsafety.gov/ Gateway to Government Food Safety Information
– http://vm.cfsan.fda.gov/ Center for Food Safety & Applied Nutrition
– http://www.inspection.gc.ca/ Canadian Food Inspection Agency
– http://www.extension.iastate.edu/foodsafety/ Iowa State University - Food Safety Project
– http://www.foodsafety.iastate.edu Iowa State University - Food Safety Consortium
– http://www.fsis.usda.gov/ United States Department of Agriculture, Food Safety and Inspection Service
– http://www.nal.usda.gov/foodborne/index.html Foodborne Illness Education Information Center
– http://www.euro.who.int/foodsafety World Health Organization – Regional Office for Europe Food Safety Programme
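The step of grouping crawled pages into main sites can be sketched as follows. The slide does not describe how the grouping was done, so this example simply assumes that pages are grouped by host name; the page paths are made up, and only the host names come from the list of main sites. Note that some of the listed main sites include a path (for example `/foodsafety/`), which a host-only grouping would not distinguish.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_site(page_urls):
    """Group page URLs by their host name."""
    sites = defaultdict(list)
    for url in page_urls:
        host = urlparse(url).netloc.lower()
        sites[host].append(url)
    return dict(sites)


def rank_sites(page_urls):
    """Return (host, pages) pairs, largest group first, ties by host name."""
    groups = group_by_site(page_urls)
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical crawler output: the paths are invented for illustration.
pages = [
    "http://www.fsis.usda.gov/page-a.html",
    "http://www.inspection.gc.ca/page-b.html",
    "http://www.fsis.usda.gov/page-c.html",
    "http://www.foodsafety.gov/page-d.html",
]
ranked = rank_sites(pages)
for host, urls in ranked:
    print(host, len(urls))
```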
Slide 15
Selection of Documents
• Domain Set: Manual selection
– 11 documents
– Codex Alimentarius: Description, Code of Ethics, Food Hygiene, Food Import and Export
– Report of consultation on risk assessment of microbiological hazards in foods
– Ensuring food quality and safety, Protecting food quality and safety
• Domain Set: Focused Crawler Output
– 5 documents extracted:
– http://vm.cfsan.fda.gov/
– http://www.inspection.gc.ca/
– http://www.foodsafety.iastate.edu
– http://www.extension.iastate.edu/foodsafety/
– http://www.euro.who.int/foodsafety
• Generic documents: Manual Selection
– 8 documents
– www.nytimes.com
– Several documents of the animal feed domain
Slide 16
2nd Acquisition Approach: Thesaurus Pruning
• Inputs: food safety documents, generic documents, and AGROVOC (27365 keywords; thesaurus entries of the form "Rice BT … NT … RT … RT … RT … …")
• Automatic pruning: 5 evaluation runs
• 1632 frequent terms
• Extracted ontological structure: 504 concepts, taxonomic depth 5
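The slide names the inputs of the pruning step (domain documents, generic documents and a thesaurus) but not the pruning rule itself. One plausible rule, shown here purely as an assumption, keeps a thesaurus term when it occurs in the domain documents and is clearly more frequent there than in the generic documents. The function names, the `ratio` threshold and the tiny example texts are all hypothetical.

```python
import re


def term_frequencies(documents, vocabulary):
    """Relative frequency of each vocabulary term across a list of texts."""
    counts = {term: 0 for term in vocabulary}
    total_words = 0
    for text in documents:
        lowered = text.lower()
        total_words += len(re.findall(r"\w+", lowered))
        for term in vocabulary:
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[term] += len(re.findall(pattern, lowered))
    if total_words == 0:
        return {term: 0.0 for term in vocabulary}
    return {term: counts[term] / total_words for term in vocabulary}


def prune(vocabulary, domain_docs, generic_docs, ratio=2.0):
    """Keep terms that occur in the domain set and are at least `ratio`
    times as frequent there as in the generic set (assumed rule)."""
    domain = term_frequencies(domain_docs, vocabulary)
    generic = term_frequencies(generic_docs, vocabulary)
    return [
        term
        for term in vocabulary
        if domain[term] > 0 and domain[term] >= ratio * generic[term]
    ]


# Invented miniature corpora standing in for the real document sets.
vocabulary = ["food safety", "hazard", "rice", "market"]
domain_docs = ["Food safety depends on hazard identification. Each hazard is assessed."]
generic_docs = ["The rice market grew. The market for rice is large."]
kept = prune(vocabulary, domain_docs, generic_docs)
print(kept)
```

The generic documents act as a baseline: a term that is just as common in everyday text as in the domain texts says little about the domain and is dropped.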
Slide 17
Merging of Ontologies and Refinement
• Core Ontology: 67 concepts, 91 relationships
• 1632 terms from pruning process: 12 new concepts extracted
• Ontological structure extracted from AGROVOC: 23 new concepts with hierarchical relationships extracted
• Assembly step: 92 new relationships created
• Food Safety Ontology Prototype: 102 concepts, 183 relationships
Slide 18
Final Prototype
• Core Ontology: 67 concepts, 91 relationships, 1.36 relationships per concept
• Food Safety Ontology Prototype: 102 concepts, 183 relationships, 1.79 relationships per concept
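The figures on the merging and final-prototype slides can be cross-checked with a few lines of arithmetic: the core ontology's 67 concepts plus the 12 and 23 newly extracted concepts give the prototype's concept count, the 91 core relationships plus the 92 new ones give its relationship count, and dividing relationships by concepts gives the reported densities. The variable names are only for illustration.

```python
core = {"concepts": 67, "relationships": 91}
new_concepts = {"frequent terms": 12, "AGROVOC structure": 23}
new_relationships = 92  # created in the assembly step

prototype = {
    "concepts": core["concepts"] + sum(new_concepts.values()),
    "relationships": core["relationships"] + new_relationships,
}


def density(ontology):
    """Relationships per concept."""
    return ontology["relationships"] / ontology["concepts"]


print(prototype)
print(f"core:      {density(core):.2f} relationships per concept")
print(f"prototype: {density(prototype):.2f} relationships per concept")
```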
Slide 19
102 Concepts
Agreement on Agriculture; ALOP; ALOP, Codex; ALOP, OIE; ALR; animal byproducts; animal diseases; animal fats; animal feed additives; animal feed contaminants; animal feed ingredients; animal feeding; animal health; animal processing; animal products; animal waste; animals; antibiotics; Bacteria; bakery products; biological agent; CAC; Cartagena protocol; CCFH; cereal products; cheese
chemical agent; Codex Committees; commodities; Consumer health; diseases; eggs; exposure assessment; fabrication; FAO; fishes; food; food additives; food consumption; food contaminants; food export; food import; food ingredients; food safety; food-borne diseases; fungi; good hygienic practices; hazard; hazard characterization; hazard identification; human health; human nutrition
humans; international agreements; international food trade; international governmental organizations; IPPC; labelling; meat; microorganisms; microorganisms byproducts; microorganisms processing; microorganisms products; microorganisms waste; milk; milk products; milk products; non-pathogens; OIE; packaging; parasites; pathogens; physical agent; plant byproducts; plant diseases; plant feed additives; plant feed contaminants
plant feed ingredients; plant feeding; plant health; plant processing; plant products; plant waste; plants; processed animal products; processed plant products; processed products; processing; risk analysis; risk assessment; risk characterization; risk communication; risk management; slaughter; SPS agreement; standards; sugar; TBT agreement; transport; viruses; WHO; WTO
Slide 20
29 Unique Relationships
adopts; adversely affect; are included in; are produced by; are the source for; can be used as; constitutes; describes; determines; ensures; establishes; govern; has economical impact on; Implies; includes
influences; interacts with; is a consequence of; is a step in the process; is comprised of; is established by; is protected by; originate from; refer to; requires; rule; sustains; trades; uses
Slide 21
Current project status
Ontology creation: 2nd application of framework
[Diagram with the following elements:]
• Food Safety Ontology Prototype: 102 concepts, 183 relationships
• 1st acquisition approach
• 2nd acquisition approach
• Text-To-Onto with ~100 domain-specific documents
• List of frequent terms
• AGROVOC
• Revised Ontology Pruner
• Pruned AGROVOC: ~3000 concepts
• Ontology Editor (OIModeler)
• Merging & Refinement
Slide 22
Usage Scenario
• Biosecurity Portal: the user enters "Search: Risk assessment"
• Ontology Enabled Search Application: ontology based search extension
– Risk characterization, Hazard characterization, Hazard identification, Exposure assessment: "is a step in the process" of Risk assessment
– Risk assessment, Risk management, Risk communication: "is a step in the process" of Risk analysis
– A further relationship shown in the diagram: "interacts with"
• Extended Search: "Mark the terms below, which you might want to include in your search:" Risk assessment, Risk characterization, Risk analysis
• The extended search uses the ontology and the doc base to produce the search results
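The usage scenario can be sketched as a simple query expansion: the user's term is looked up in the ontology, neighbouring concepts are offered, and the terms the user marks are added to the search. The relationship list below follows one reading of the slide's diagram (four steps in the process of risk assessment, three steps in the process of risk analysis). The function names and the OR-joined query syntax are assumptions, not the portal's actual implementation.

```python
STEP = "is a step in the process"

# (source concept, relationship, target concept), as read from the diagram.
RELATIONS = [
    ("Hazard identification", STEP, "Risk assessment"),
    ("Hazard characterization", STEP, "Risk assessment"),
    ("Exposure assessment", STEP, "Risk assessment"),
    ("Risk characterization", STEP, "Risk assessment"),
    ("Risk assessment", STEP, "Risk analysis"),
    ("Risk management", STEP, "Risk analysis"),
    ("Risk communication", STEP, "Risk analysis"),
]


def related_terms(term, relations=RELATIONS):
    """Concepts directly linked to `term`, in either direction."""
    found = []
    for source, _name, target in relations:
        if source == term and target not in found:
            found.append(target)
        elif target == term and source not in found:
            found.append(source)
    return found


def expand_query(term, selected, relations=RELATIONS):
    """Build an OR query from the term plus the suggestions the user marked.

    Marked terms that the ontology does not relate to `term` are ignored.
    """
    allowed = set(related_terms(term, relations))
    extra = [t for t in selected if t in allowed]
    return " OR ".join(f'"{t}"' for t in [term] + extra)


suggestions = related_terms("Risk assessment")
query = expand_query("Risk assessment", ["Risk characterization", "Risk analysis"])
print(suggestions)
print(query)
```

Leaving the final choice to the user, as the slide's "Mark the terms below" prompt does, avoids widening the search with related terms the user is not interested in.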
Slide 23
Current project status
Application scenario: 2 use cases
• Use Case 1: Indexing the subject of a document (form fields: Title: …; Subject: Risk; …)
• Use Case 2: Searching information on the portal (Search: Risk; …)
• Both use cases draw on OFsAPH
Slide 24
Current project status: Application: Ontology Browser for the Ontology on Food Safety, Animal and Plant Health
Slide 25
The Project for an Agricultural Ontology Service
• Only agreed semantic standards guarantee knowledge discovery between different applications
• The definition of knowledge organization systems is resource intensive
• Therefore FAO started initiatives to bring interested partners together
– October 2000: Launch of the AGStandards initiative to agree on metadata standards
– July 2001: Concept paper on Agricultural Ontology Service
Slide 26
What does Agricultural Ontology Service mean?
The Agricultural Ontology Service is an approach to organizing knowledge organization systems that is
• International: The Internet must become plurilingual
• Multidisciplinary: The area of subjects is broad and needs various inputs
• Cooperative: Different expert knowledge has to be associated and used
• Distributed: No central ownership should be sought
• Coordinated: Coordination must ensure reusability and standardization
Slide 27
AOS: Iterative Knowledge Registration
[Diagram: a cycle around the Agricultural Ontology Service (AOS), a federated storage and description facility]
• AOS holds components: terms, definitions, relationships
• A KOS uses components (terms, definitions, relationships) to build an application
• Users search and browse the application using components
• User feedback
• Discussions and choices for amendments to components
Slide 28
Activities up to now
• The first workshop took place in Rome, November 2001
• A launch group was established with participation of
– Content providers (FAO, CABI)
– Solution providers in the agricultural area (ATO-Wageningen, University of Florida)
– Ontology development groups (AIFB Karlsruhe, CNR Italy)
• Two further workshops were organized in January and May 2002
• Ontology prototypes are under development
Slide 29
AOS – a “business model”
• A consortium of information providers
• A clearinghouse for semantic standards in the relevant subject areas
• One-stop access to agreed standards (ontologies, metadata schemas, vocabularies…)
• Participation as a consortium in semantic web activities to get funding for specific projects (“Semkos” for EU 6th framework)
• Organization of seminars and workshops to further develop and promote the use of semantic standards
Slide 30
Further Information
http://www.fao.org/agris/AOS
http://www.fao.org/agris/AGMES