EnviroInfo 2006, 05/09/06 Graz
Automatic Concept Space Generation in Support of Resource Discovery in Spatial Data Infrastructures
Paul Smits, Anders Friis-Christensen
European Commission, DG Joint Research Centre
Institute for Environment and Sustainability
Spatial Data Infrastructures Unit
TP 262, Ispra (VA), Italy
JRC’s Mission
The mission of the JRC is to provide customer-driven scientific and technical support for the conception, development, implementation and monitoring of EU policies.
As a service of the European Commission, the JRC functions as a reference centre of science and technology for the Union.
Close to the policy-making process, it serves the common interest of the Member States, while being independent of special interests, whether private or national.
Outline
• Introduction
• Objectives of the study
• Approach
• Results
• Conclusions
Introduction – components of a European SDI
• GI Policy
• GI standards
• Spatial Information Services
• Fundamental GI data sets
Introduction
INSPIRE requirements
• metadata*
• spatial data sets and spatial data services*
• network services*
– EU geo-portal
• access and rights of use for Community institutions and bodies**
• monitoring and reporting mechanisms**
• process and procedures
* technical: under JRC responsibility
** legal/procedural: under Eurostat responsibility
Introduction
• European interoperability framework for pan-European eGovernment services
• Recommendations related to multilingualism, e.g.,
– For the Pan-European services provided via portals, the top-level EU portal interface should be fully multilingual, the second-level pages (introductory texts and the descriptions of links) should be offered in the official languages and the external links and related pages on the national websites should be available in at least one other language (for example English) in addition to the national language(s).
http://europa.eu.int/idabc
EcoInformatics meeting, 17/01/06 Ispra
Introduction
Issues on multilingualism identified by the INSPIRE DT on Network Services
– Multilingualism is only mentioned in the context of the interoperability of spatial data sets and services, for key attributes and corresponding multilingual thesauri
– Granularity: should the list of available languages be a service feature, or be defined at the data set or even at the feature attribute level?
– Metadata/data: should only metadata be multilingual, or data sets as well?
– Attribute label versus attribute value: should only attribute labels be multilingual, or should attribute values be multilingual as well?
Introduction
[Figure: a central node and information communities 1, 1.1, 1.2, 2, 3 and 4, each drawn as a «view». Diagram labels: Metadata creation; Collections of metadata (e.g., portal, search engine); Define query and consult metadata. Links between collections are labelled "harvest / distributed search"; links from users are labelled "search".]
Objective of the study
• Focus on discovery of resources
• Answer the question:
– Is, from a technical point of view, a common ontology or thesaurus desirable and feasible for multi-lingual resource discovery in a European Spatial Data Infrastructure?
Approach
• Implement and extend the work of H. Chen et al., "A Parallel Computing Approach to Creating Engineering Concept Spaces for Semantic Retrieval: The Illinois Digital Library Initiative Project," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, pp. 771-782, 1996.
• Integrate thesauri, vocabularies and gazetteers in resource discovery
• Experiments: P. Smits, A. Friis-Christensen, "Resource Discovery in a European Spatial Data Infrastructure," IEEE Transactions on Knowledge and Data Engineering (accepted for publication)
Approach
• What is a Concept Space?
• Simply put:
– An index of all concepts existing in a metadata repository
– With numerical relationships defined between any two concepts
– To be queried by associative retrieval
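The three properties above can be sketched as a small data structure. This is only an illustration: the class name `ConceptSpace`, its methods, and the resource identifier are hypothetical and do not come from the study, and treating a missing link as weight 0 is an assumption.

```python
from collections import defaultdict


class ConceptSpace:
    """Index of concepts plus weighted, directed links between concept pairs."""

    def __init__(self):
        # concept -> identifiers of the resources whose descriptors mention it
        self.index = defaultdict(set)
        # (source concept, target concept) -> association weight in [0, 1]
        self.weights = {}

    def add_occurrence(self, concept, resource_id):
        self.index[concept].add(resource_id)

    def set_weight(self, source, target, weight):
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        self.weights[(source, target)] = weight

    def weight(self, source, target):
        # Assumption: pairs without a stored link count as unrelated (0).
        return self.weights.get((source, target), 0.0)

    def neighbours(self, source):
        """Concepts linked from `source` with a non-zero weight."""
        return {t: w for (s, t), w in self.weights.items()
                if s == source and w > 0}


space = ConceptSpace()
space.add_occurrence("soil", "resource-1")
space.add_occurrence("sub-surface", "resource-1")
space.set_weight("soil", "sub-surface", 0.7)
```

Storing only non-zero links keeps the structure sparse, which matters once every pair of concepts in a repository can in principle carry a weight.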
Approach
• Two-step approach
– Creation of multi-lingual concept space
– Associative retrieval based on a neural network
H. Chen, B. Schatz, T. Ng, J. Martinez, A. Kirchhoff, C. Lin, A parallel computing approach to creating engineering concept spaces for semantic retrieval: the Illinois digital library initiative project. IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 18, No. 8, August 1996, pp. 771-782.
[Figure: flow from Start to End through three steps: 1. Collect resource descriptors; 2. Filter and index concepts; 3. Cluster analysis. Databases shown: Ontology and vocabulary, Resource descriptors, Index, Unidentified Concepts, Concept Space.]
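The slides name the cluster analysis step without giving its formula. As a rough sketch of how numerical relationships between concepts could be derived from indexed descriptors, the example below uses a plain asymmetric co-occurrence ratio. That ratio is an assumption made for illustration; the cited work by Chen et al. may use a more elaborate weighting. The function name and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence_weights(descriptors):
    """descriptors: mapping resource id -> set of concepts found in it."""
    occurrences = defaultdict(int)  # concept -> number of resources
    together = defaultdict(int)     # (a, b) -> resources containing both
    for concepts in descriptors.values():
        for concept in concepts:
            occurrences[concept] += 1
        for a, b in permutations(concepts, 2):
            together[(a, b)] += 1
    # W(a -> b): share of the resources mentioning a that also mention b
    return {(a, b): n / occurrences[a] for (a, b), n in together.items()}


descriptors = {
    "r1": {"soil", "sub-surface"},
    "r2": {"soil", "sub-surface", "geology"},
    "r3": {"soil"},
    "r4": {"sub-surface"},
}
weights = cooccurrence_weights(descriptors)
```

The weights are directional: here "geology" always appears together with "soil", so the link from "geology" to "soil" is stronger than the link in the opposite direction.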
Approach
• Creation of the multi-lingual concept space
– Collection of resource descriptors
– Object filtering and indexing
• identify those concepts and terms that we already have in our human-created ontology, which includes any thesauri and vocabularies
• filter out irrelevant terms, such as stop words, to improve performance
• store any remaining terms in the concept space
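A minimal sketch of the filtering and indexing step follows, assuming single-word terms and free-text descriptors. `ONTOLOGY_TERMS`, `STOP_WORDS`, the function name, and the sample descriptors are hypothetical stand-ins, not the vocabularies or data used in the study.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins for the human-created ontology and a stop-word list.
ONTOLOGY_TERMS = {"soil", "bodem", "geology"}
STOP_WORDS = {"the", "of", "and", "a", "in", "for"}


def filter_and_index(descriptors):
    """descriptors: mapping resource id -> free-text description."""
    index = defaultdict(set)  # term -> ids of the resources that contain it
    unidentified = set()      # kept terms that the ontology does not know
    for resource_id, text in descriptors.items():
        # Split on anything that is not a letter (Unicode-aware).
        for token in re.findall(r"[^\W\d_]+", text.lower()):
            if token in STOP_WORDS:
                continue                      # filter out irrelevant terms
            if token not in ONTOLOGY_TERMS:
                unidentified.add(token)       # not in the ontology, but kept
            index[token].add(resource_id)     # store the term
    return index, unidentified


index, unidentified = filter_and_index({
    "r1": "Soil map of the Netherlands",
    "r2": "Bodem and geology of Limburg",
})
```

Keeping the unidentified terms separate mirrors the "Unidentified Concepts" store in the flow diagram, so that they could later be reviewed and added to the ontology.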
Approach - Associative query
• Initialize the associative retrieval
– The neural network is initialized at query time by assigning initial membership values to the units of the neural network (= concepts in the Concept Space)
• Terms in the concept space that match a query term exactly: 1
• Partial matches get a membership value < 1
• Terms that do not match the query: 0
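The initialization rule can be sketched as follows. The slides do not say how partial matches are scored, so this example assumes a string-similarity ratio from Python's `difflib`, a hypothetical cut-off `min_partial`, and a cap of 0.99 so that a partial match never ties with an exact one.

```python
from difflib import SequenceMatcher


def initial_membership(query_terms, concepts, min_partial=0.6):
    """Assign each concept an initial membership value in [0, 1]."""
    queries = [q.lower() for q in query_terms]
    membership = {}
    for concept in concepts:
        name = concept.lower()
        best = 0.0                       # default: no match with the query
        for q in queries:
            if name == q:
                best = 1.0               # exact match
                break
            score = SequenceMatcher(None, q, name).ratio()
            if score >= min_partial:
                # Assumed scoring: similarity ratio, capped below 1.
                best = max(best, min(score, 0.99))
        membership[concept] = best
    return membership


mu0 = initial_membership(["soil"], ["soil", "soils", "information"])
```

With this scoring, "soils" receives a value between 0 and 1 for the query "soil", while "information" stays at 0.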
Approach - Associative query
• Initialize the associative retrieval
Query: “soil”
[Figure: concept network, situation at t=0. Soil, bodem = 1; Sub-surface = 0; information = 0. Link weights shown: Wij = 0.7 and Wij = 0.]
Approach - Associative query
• Iterate through the neural network
[Figure: concept network at two time steps, with link weights Wij = 0.7 and Wij = 0 unchanged.
Situation at t=0: Soil, bodem = 1; Sub-surface = 0; information = 0.
Situation at t=1: Soil, bodem = 1; Sub-surface = 0.7; information = 0.]
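One iteration of the network can be sketched as below. The update rule, in which a unit takes the larger of its current value and the strongest incoming activation times the link weight, is an assumption chosen because it reproduces the values in the figure; the cited work by Chen et al. may use a different activation function. Attaching the weight 0.7 to the link towards "sub-surface" and the weight 0 to the link towards "information" is a reading of the flattened figure, not something the slide states explicitly.

```python
def iterate(membership, weights, steps=1):
    """membership: concept -> value in [0, 1]; weights: (i, j) -> Wij."""
    current = dict(membership)
    for _ in range(steps):
        updated = dict(current)
        for (i, j), w_ij in weights.items():
            # Activation passed from unit i to unit j along the link i -> j.
            updated[j] = max(updated[j], current[i] * w_ij)
        current = updated
    return current


mu_t0 = {"soil, bodem": 1.0, "sub-surface": 0.0, "information": 0.0}
weights = {
    ("soil, bodem", "sub-surface"): 0.7,  # assumed pairing, see lead-in
    ("soil, bodem", "information"): 0.0,
}
mu_t1 = iterate(mu_t0, weights, steps=1)
```

Taking the maximum keeps the directly matched concept at 1 while related concepts gain membership in proportion to their link weight.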
Approach - Associative query
• Link membership values of concepts to resource descriptors
– Membership > threshold?
– Use index to find resources that contain the concept
– Order found resources in order of relevance, based on membership values
[Figure: concept network, situation at t=1. Soil, bodem = 1; Sub-surface = 0.7; information = 0. Link weights shown: Wij = 0.7 and Wij = 0.]
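The final linking step can be sketched as a threshold test followed by an index lookup and a sort. The threshold of 0.5, the rule that a resource is scored by its best-matching concept, and the resource identifiers are all assumptions made for this illustration.

```python
def rank_resources(membership, index, threshold=0.5):
    """Return (resource id, score) pairs, most relevant first."""
    scores = {}
    for concept, value in membership.items():
        if value <= threshold:
            continue                      # membership not above threshold
        for resource_id in index.get(concept, ()):
            # Assumed scoring: a resource takes its best concept's value.
            scores[resource_id] = max(scores.get(resource_id, 0.0), value)
    # Sort by descending score; ties are broken by resource id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


membership = {"soil, bodem": 1.0, "sub-surface": 0.7, "information": 0.0}
index = {
    "soil, bodem": {"r1", "r2"},
    "sub-surface": {"r2", "r3"},
    "information": {"r4"},
}
ranking = rank_resources(membership, index)
```

In this example the resource that only mentions "information" is not returned, because that concept's membership stays below the threshold.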
[Figure: system architecture. Components: Collection of web addresses, Harvester, Wrapper, XSLT file library, Metadata repository («database»), Ontology (thesauri, vocabularies), Ontology importer, Ontology editor, Reasoner, Concept Space Manager, Concept Space, Associative retriever, query. Interface and format labels: WMS, RDF, XML, OWL, DIG.]
Results
• Creating the metadata repository
Results
• Query computationally expensive

| Query | Remark | Time required for four iterations of neural network (600 MHz, 512 MB RAM) |
|---|---|---|
| soil (eng) | Query term found in the concept space (GEMET 2001.1 concept no. 7843) | 16.1 s |
| infrastructuur (nld) | Query term not literally defined in the concept space or ontology | 27.8 s |
Conclusions from the study
• It will be impractical to rely only on one common ontology for resource discovery in a European SDI
• Using human-created ontologies in combination with automatic concept space generation and associative retrieval is a powerful means of discovering geospatial resources.
• The proposed approach is useful and merits further investigation and development
• The importance of structured information, using metadata standards, is underlined by our study and is also a basic assumption of our work.