Representing Contextual Aspects of Data
WP2 - Andreas Harth
1st year review, Luxembourg, December 2011
Work Plan View
[Gantt chart: timeline axis in project months, 0 to 48 in steps of 6; partners shown: FUB, KIT]
Tasks:
• Task 2.1 Data quality assessment and repair
• Task 2.2 Temporal, spatial and social aspects of data
• Task 2.3 Recommendations for enhancing best practices for data publishing
Deliverables:
• D2.1 Conceptual model and best practices for high-quality data publishing
• D2.2 Methods for quality repair
• D2.3 Modelling and processing contextual aspects of data
• D2.4 Update of D2.1
• D2.5 Proof-of-concept evaluation for modelling space and time
• D2.6 Methods for assessing the quality of sensor data
• D2.7 Recommendations for contextual data publishing
Outline
• Motivation
• NeoGeo Vocabularies
• Mappings and Community Activities
• Demo
• Conclusion and Future Work
Motivation
• Geospatial data is becoming increasingly relevant
  • Location-based services, mobile applications
  • Ever increasing amount of sensor data (phones, satellites…)
• Applications require integrated access to geospatial data
  • Spatial querying
  • Spatial reasoning
Geospatial Data Scenario
[Diagram: Source 1, Source 2, ..., Source n; Wrapper 1; Mapping 1, Mapping 2, ..., Mapping n; Integration]
Challenges
• Disparate data formats
  • Integrated data format (syntax) and access (data transfer protocol): Linked Data (RDF, HTTP)
• Most data sources provide just points (geo:lat, geo:long)
  • Create vocabulary/method for publishing regions
• Each data source uses its own way to encode geospatial data
  • Allow for syntax differences, provide mappings where possible (KML, GML, WKT, RDF…)
• Instance data not interlinked across sources
  • Create mappings between instances
• Geospatial data is political
„Unpolitical“ Datasets
GADM-RDF – http://gadm.geovocab.org/
RDF representation of the administrative regions of the GADM project (http://gadm.org/)
NUTS-RDF – http://nuts.geovocab.org/
RDF representation of Eurostat's NUTS nomenclature.
GADM and NUTS serve as:
• New geospatial information on the Semantic Web.
• Bridges between already published spatial datasets.
• Experimentation and evaluation datasets.
NeoGeo Vocabularies
Prototypical Geo-Ontology
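[Diagram: a Feature is linked to a Geometry via "geometry"; features are connected by spatial relations, geometries by spatial functions. Example: „Luxembourg“ in „Europe“, with „Luxembourg“ linked to its geometry]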
Features vs. Geometries
• NeoGeo vocabularies are based on the General Feature Model
• General Feature Model makes a distinction between the feature (resource to which the region belongs), and the actual geometry.
• Semantics of the feature are more important than the representation of the geometry.
• Instances of the feature are related to the type of the feature.
• A feature can be related to multiple distinct geometries; also allows for modelling different geometric properties for one single feature (e.g. different scales).
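The feature/geometry split described above can be sketched as a small data model. This is only an illustration: the class names, the `scale` label, the example.org URIs and the coordinates are assumptions made for the example, not part of the NeoGeo vocabularies.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Geometry:
    """One concrete representation of a region (hypothetical fields)."""
    uri: str
    scale: str  # illustrative label such as "coarse" or "point"
    wkt: str    # geometry serialised as WKT text


@dataclass
class Feature:
    """The real-world thing; its type carries the semantics."""
    uri: str
    feature_type: str
    geometries: list = field(default_factory=list)

    def geometry_at(self, scale):
        """Return the first geometry with the given scale label, or None."""
        for geometry in self.geometries:
            if geometry.scale == scale:
                return geometry
        return None


# Illustrative data only: URIs and coordinates are made up for the sketch.
luxembourg = Feature("http://example.org/feature/luxembourg", "Country")
luxembourg.geometries.append(Geometry(
    "http://example.org/geometry/luxembourg-coarse", "coarse",
    "POLYGON((5.7 49.4, 6.5 49.4, 6.5 50.2, 5.7 50.2, 5.7 49.4))"))
luxembourg.geometries.append(Geometry(
    "http://example.org/geometry/luxembourg-point", "point",
    "POINT(6.13 49.61)"))
```

One feature carries two distinct geometries here, which mirrors the "different scales" case from the last bullet.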
NeoGeo Vocabularies
• Spatial Vocabulary
  • Representation and reasoning on topological relations based on the Region Connection Calculus (RCC).
• Geometry Vocabulary
  • Representation of geo-referenced geometric shapes.
[Diagram: spatial:Feature linked via ngeo:geometry to ngeo:Geometry; spatial:* relations; RESTful services]
RCC: Randell, D. A., Cui, Z. and Cohn, A. G.: A spatial logic based on regions and connection, 3rd Int. Conf. on Knowledge Representation and Reasoning, 1992.
Modelling Spatial Relations
[Table: spatial relation properties used by each dataset; columns: Disjoint, Touches, Overlaps, Within, Contains, Equals, Nearby]
• UN FAO: hasBorderWith, isInGroup
• Ordnance Survey: disjoint, touches, partiallyOverlaps, within, contains, Equals
• geo.linkeddata.es: formaParteDe, formadoPor
• LinkedGeoData: no entries
• GeoNames.org: neighbour / neighbouringFeatures, parentFeature, childrenFeatures, nearby / nearbyFeatures
• Uberblic.org: adjoining_location, containing_location
• DBpedia: locatedInArea
• NUTS: Part of
• GADM: Part of
RCC-8 Properties
• partially overlapping (PO)
• tangential proper part (TPP)
• non-tangential proper part (NTPP)
• equal (EQ)
• tangential proper part inverse (TPPi)
• non-tangential proper part inverse (NTPPi)
• externally connected (EC)
• disconnected (DC)

• proper part (PP)
• proper part inverse (PPi)
• part (P)
• part inverse (Pi)
• overlaps (O)
• connects with (C)
• discrete from (DR)
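To make the eight base relations and the coarser relations built from them concrete, here is a minimal sketch that classifies two axis-aligned rectangles. Restricting regions to rectangles given as `(xmin, ymin, xmax, ymax)` tuples is an assumption made to keep the example short; the function names are hypothetical and not part of any NeoGeo service.

```python
def rcc8(a, b):
    """Return the RCC-8 base relation between rectangles a and b.

    Rectangles are (xmin, ymin, xmax, ymax) tuples with xmin < xmax
    and ymin < ymax, treated as closed regions.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    if a == b:
        return "EQ"
    closures_meet = ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1
    if not closures_meet:
        return "DC"
    interiors_meet = ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1
    if not interiors_meet:
        return "EC"  # only the boundaries touch
    if bx0 <= ax0 and ax1 <= bx1 and by0 <= ay0 and ay1 <= by1:
        strict = bx0 < ax0 and ax1 < bx1 and by0 < ay0 and ay1 < by1
        return "NTPP" if strict else "TPP"
    if ax0 <= bx0 and bx1 <= ax1 and ay0 <= by0 and by1 <= ay1:
        strict = ax0 < bx0 and bx1 < ax1 and ay0 < by0 and by1 < ay1
        return "NTPPi" if strict else "TPPi"
    return "PO"


# Coarser relations as unions of base relations.
DERIVED = {
    "PP": {"TPP", "NTPP"},
    "PPi": {"TPPi", "NTPPi"},
    "P": {"TPP", "NTPP", "EQ"},
    "Pi": {"TPPi", "NTPPi", "EQ"},
    "O": {"PO", "TPP", "NTPP", "TPPi", "NTPPi", "EQ"},
    "DR": {"DC", "EC"},
    "C": {"EC", "PO", "TPP", "NTPP", "TPPi", "NTPPi", "EQ"},
}


def holds(relation, a, b):
    """True if the base or derived relation holds between a and b."""
    base = rcc8(a, b)
    return base == relation or base in DERIVED.get(relation, ())


outer = (0, 0, 10, 10)
inner = (4, 4, 5, 5)
```

With these rectangles, `rcc8(inner, outer)` is NTPP, so the coarser PP and P relations hold as well.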
Spatial Vocabulary
• http://geovocab.org/spatial#
• Uses RCC for the representation of topological relations between regions
• Supports RCC5 and RCC8 relations
• Inference available for most RCC relations
• However, some rules require „negation as failure“, which requires a closed world assumption
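The difference between the two kinds of rule can be sketched as follows. The first function applies a positive rule (proper part is transitive), which is safe under the open world assumption. The second concludes "discrete from" from the absence of facts, which is only justified if the fact base is assumed complete. Both functions and the sample facts are illustrative assumptions, not the rule set shipped with the spatial vocabulary.

```python
def pp_closure(pp_facts):
    """Transitive closure of proper-part facts given as (part, whole) pairs."""
    closure = set(pp_facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # PP(a, b) and PP(b, d) entail PP(a, d)
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def overlap_known(a, b, pp):
    """True if the known PP facts show that a and b share a part."""
    if a == b or (a, b) in pp or (b, a) in pp:
        return True
    parts_of_a = {x for (x, whole) in pp if whole == a}
    parts_of_b = {x for (x, whole) in pp if whole == b}
    return bool(parts_of_a & parts_of_b)


def discrete_under_cwa(a, b, pp):
    """Negation as failure: DR(a, b) is concluded because no overlap is known.

    Sound only under a closed world assumption; on the open Web a missing
    fact does not mean the fact is false.
    """
    return not overlap_known(a, b, pp)


# Illustrative facts only.
facts = {
    ("Esch-sur-Alzette", "Luxembourg"),
    ("Luxembourg", "Europe"),
    ("Germany", "Europe"),
}
closure = pp_closure(facts)
```

Here `("Esch-sur-Alzette", "Europe")` is derived by the positive rule, while `discrete_under_cwa("Luxembourg", "Germany", closure)` holds only because nothing in the fact base says otherwise.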
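Modelling Regions
[Table: vocabulary used by each dataset per region encoding; columns: Point, Bounding Box, Points in Lists, Single predicate, Literal. Entries are listed in their original order]
• UN FAO: Own
• Ordnance Survey: W3C Geo / GeoRSS; Own / GML
• geo.linkeddata.es: W3C Geo; Own; Own / GML
• LinkedGeoData: W3C Geo; Own
• GeoNames.org: W3C Geo
• Uberblic.org: Own
• DBpedia.org: W3C Geo
• NUTS: no entries
• GADM: no entries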
Modelling Regions - Alternatives
• RDF List of W3C Geo Points
• RDF List of latitude/longitude pairs
• RDF Literal of latitude/longitude
• RDF Literal in GML/WKT format (GeoSPARQL)
• …
• Geometries have own URI, content format negotiated (KML, GML, WKT…)
• Geometry vocabulary describes high-level features of geospatial regions
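Two of the alternatives above can be sketched in a few lines: serialising a list of latitude/longitude pairs as a WKT polygon literal, and asking for a geometry URI in a particular format via the HTTP Accept header. The function names, the example.org URI and the longitude-before-latitude axis order are assumptions for illustration; the WKT media type in particular is a placeholder, since the slides do not say which types the geometry URIs serve.

```python
from urllib.request import Request

# Media type per format. The KML and GML types are the registered ones;
# the WKT entry is an assumption made for this sketch.
MEDIA_TYPES = {
    "kml": "application/vnd.google-earth.kml+xml",
    "gml": "application/gml+xml",
    "wkt": "text/plain",
}


def to_wkt_polygon(lat_long_pairs):
    """Serialise (lat, long) pairs as a WKT polygon with a closed ring.

    Coordinates are written as "long lat" (x before y), an assumed axis order.
    """
    ring = list(lat_long_pairs)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT rings repeat the first point at the end
    coords = ", ".join(f"{lon} {lat}" for lat, lon in ring)
    return f"POLYGON(({coords}))"


def geometry_request(geometry_uri, fmt):
    """Build (but do not send) a request asking for one serialisation."""
    return Request(geometry_uri, headers={"Accept": MEDIA_TYPES[fmt]})


literal = to_wkt_polygon([(49.4, 5.7), (49.4, 6.5), (50.2, 6.5)])
request = geometry_request("http://example.org/geometry/1", "kml")
```

Giving each geometry its own URI keeps the RDF description small: the client picks the serialisation through the request header instead of the publisher embedding every format as a literal.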
Mappings and Community Activities
Vocabulary Mappings
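[Table: mappings FROM (rows) TO (columns) among NeoGeo, DBpedia, LinkedGeoData, geo.linkeddata.es and GeoNames]
• From NeoGeo: SC, SP to each of DBpedia, LinkedGeoData, geo.linkeddata.es and GeoNames
• From DBpedia: no entries
• From LinkedGeoData: wip
• From geo.linkeddata.es: wip
• From GeoNames: no entries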
SC: rdfs:subClassOf, SP: rdfs:subPropertyOf, SA: owl:sameAs
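What an SC or SP mapping buys a data consumer can be sketched with the two standard RDFS entailment rules for rdfs:subPropertyOf and rdfs:subClassOf, applied to plain tuples. The `ex:` terms and the particular mapping statements below are hypothetical placeholders, not mappings taken from the NeoGeo mapping files.

```python
SUBPROP = "rdfs:subPropertyOf"
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_entail(triples):
    """Apply the subPropertyOf and subClassOf rules until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in inferred:
            for (x, rel, y) in inferred:
                # (p subPropertyOf y) and (s p o) entail (s y o)
                if rel == SUBPROP and x == p:
                    new.add((s, y, o))
                # (o subClassOf y) and (s type o) entail (s type y)
                if rel == SUBCLASS and p == TYPE and x == o:
                    new.add((s, TYPE, y))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


# Hypothetical mapping statements and instance data.
data = {
    ("ex:locatedIn", SUBPROP, "spatial:P"),
    ("ex:Region", SUBCLASS, "spatial:Feature"),
    ("ex:lux", "ex:locatedIn", "ex:europe"),
    ("ex:lux", TYPE, "ex:Region"),
}
entailed = rdfs_entail(data)
```

After entailment, a query phrased only in terms of the target vocabulary also matches data published with the source vocabulary.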
Instance Mappings
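[Table: instance mappings FROM (rows) TO (columns) among NeoGeo NUTS, NeoGeo GADM, DBpedia, LinkedGeoData, geo.linkeddata.es and GeoNames. Entries are listed in column order; "-" marks the diagonal]
• From NeoGeo NUTS: -, EQ, PPi
• From NeoGeo GADM: EQ, -, PPi, PPi, PPi, tbd
• From DBpedia: EQ, PP (wip), -
• From LinkedGeoData: PP (wip), -
• From geo.linkeddata.es: PP (wip), -
• From GeoNames: -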
VoCamp
Planning underway for VoCamp at UPM (jointly organised with KIT), weekend of Feb 4th and 5th
Co-located with VoCamp Santa Barbara.
http://vocamp.org/
Demo!
Summary
• NeoGeo vocabularies and experiences on publishing geo-spatial data as Linked Data
• NUTS and GADM datasets online
• Integration vocabulary online, including vocabulary mappings
• GADM mappings to NUTS, DBpedia, LinkedGeoData…
• Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox)
• Work on similarity metrics (with optimisations and evaluation) for geospatial regions
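As a rough illustration of the last two points, the sketch below computes a bounding box, tests whether a point lies inside it, and scores two boxes with a Jaccard overlap ratio. The slides do not say how the withinRegion and boundingBox services work or which similarity metric is used, so the Jaccard ratio is a stand-in and every function name here is hypothetical.

```python
def bounding_box(points):
    """Smallest axis-aligned box (xmin, ymin, xmax, ymax) around (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def within_box(point, box):
    """True if the point lies inside or on the edge of the box."""
    x, y = point
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def area(box):
    xmin, ymin, xmax, ymax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def jaccard_similarity(box_a, box_b):
    """Intersection area divided by union area of two boxes (stand-in metric)."""
    overlap = (
        max(box_a[0], box_b[0]), max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]), min(box_a[3], box_b[3]),
    )
    shared = area(overlap)
    union = area(box_a) + area(box_b) - shared
    return shared / union if union > 0 else 0.0


region_a = bounding_box([(0, 0), (2, 1), (1, 2)])
region_b = (1, 1, 3, 3)
```

A bounding-box score like this is cheap to compute but coarse; comparing the actual region outlines would be needed for the precision mentioned under future work.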
Future Work
• Finalise NeoGeo vocabularies (VoCamp)
• Improvement of precision of spatial similarity; publish similarity service online
• More instance mappings to GADM
• Earth and space science data
• More experiments: querying of integrated data
• Reasoning
• Temporal context
• First deliverable due in March 2012