
Page 1: Quality issues in Spatial Databases

M. Mostafavi, G. Edwards, R. Jeansoulin

CRG & GEOIDE & REVIGIS

Victoria, May 2003

Page 2

Contents

Introduction

Problems

Objective

Methodology

Results

Discussion

Conclusions and perspectives

Page 3

Introduction

Data fusion and data quality

Multi-source spatial data

Vector data: BNDT, BDTQ, …

Raster data: satellite images, aerial images, …

Need for better quality

Logical consistency

Completeness

Semantic accuracy

Temporal accuracy

Positional accuracy

and more …

Decision making (Effective crisis management (MSPQ))

Page 4

A real case problem

BNDT: good geometry

Statistics Canada database (SC), Canada election database (EC): rich descriptive information but weak geometry

How to reconcile these two data sets?

[Figure labels: BNDT; SC, EC]

Page 5

Context

[Diagram: fusion of SDB1, SDB2 and SDB3 yields information of greater quality]

User vision (fitness for use)

Producer vision (product ontology)

Page 6

Logical consistency

Logical consistency is an important element of data quality. It defines the degree to which the data conform to their specifications.

Integrity constraints

Explicit rules stated in the data specifications (e.g. connectivity between two objects)

Implicit rules (e.g. a river always flows downstream)

Ontology vs. specifications

[Figure labels: Ontology, Specifications]

Page 7

Project definition

[Workflow diagram]

NTDB ontology, BDTQ ontology

NTDB data, BDTQ data

Ontology consistency

Data consistency (vs. BNDT)

Steps 1 to 5

Mapping the ontologies

Ontology fusion

Integrated ontology

Data fusion

New dataset

Does this help? Yes? … No

Lack of explicit rules

Page 8

Consistency in NTDB

[Diagram: NTDB ontologies, Delphi interface, Prolog, Dataset; Step 1, Step 2]

Studying the logical consistency of the dataset

Page 9

Formalizing the ontology

BNDT ontology

Knowledge base: facts, rules

Queries
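The knowledge base on this slide has three parts: facts, rules and queries. The slides say the ontology was translated into Prolog; the sketch below is only an analogy in Python, not the authors' program. The `Connection` type and the function names are hypothetical. The codes and cardinalities are sample values taken from the results slides, and the rule that a cardinality of 0-0 marks an illegal connection comes from the inconsistency rules.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    """One connection fact: source code, target code, cardinality C1."""
    source: int
    target: int
    c1: str


# Facts (a tiny illustrative subset; entity names appear as comments only).
FACTS = [
    Connection(3002, 3660, "1-2"),  # railway -> road
    Connection(788, 147, "0-0"),    # gas and oil facilities -> building
]


def is_legal(fact: Connection) -> bool:
    """Rule: a connection whose cardinality is 0-0 is treated as illegal."""
    return fact.c1 != "0-0"


def legal_targets(source: int) -> list[int]:
    """Query: which codes may the given code legally connect to?"""
    return [f.target for f in FACTS if f.source == source and is_legal(f)]


print(legal_targets(3002))  # [3660]
print(legal_targets(788))   # []
```

In Prolog the same split would be written as ground facts, clauses and goals; the Python version just makes the three roles explicit.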

Page 10

Spatial relations in NTDB

Spatial relations in NTDB are:

1. Connection relations

2. Sharing relations

3. Adjacency relations

4. Superposition relations

[Figure: four panels, numbered 1 to 4, illustrating the relations with objects A to E]

Page 11

Logical approach - facts

For NTDB the facts consist of:

Taxonomy of NTDB
• Themes
• Entities
• Allowed combinations
• Code (NTDB identity code)
• Geometric representations

Spatial relations
• Connection
• Sharing
• Superposition / adjacency
• Minimal values (e.g. distance constraints between objects)

Page 12

Logical approach - facts

Types of facts             Total groups    Facts
Taxonomy                   368             386
Connection                 574             330 796
Sharing                    523             15 853
Adjacency/superposition    138             1 637
Total                                      348 672

There are about 350,000 facts describing the NTDB.

• Remark: regrouping of objects for programming purposes has created some inconsistencies

Page 13

Logical approach - rules

Several rules are defined to analyze the ontological consistency of the NTDB.

Inconsistency rules, by relation:

Connection
a. Object A is connected to object B and the inverse relation is not defined.
b. Connection is illegal (C1 = 0-0) and for the same objects we have C1 ≠ 0-0.

Sharing
a. Object A shares with object B but the inverse relation is not defined.
b. The same objects share with different values of C2.

Superposition / adjacency
a. Two objects are superposed and are adjacent at the same time.
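The inconsistency rules above can be sketched as set operations over relation facts. This is a minimal Python illustration, not the authors' Prolog rules. All function names are hypothetical, each fact is assumed to be a tuple `(code_a, code_b, value)`, and the sample data is illustrative (the connection codes and values are borrowed from the results slides; the superposition and adjacency pairs are invented).

```python
def missing_inverses(facts):
    """Rule a: the pair (A, B) is defined but (B, A) is not."""
    pairs = {(a, b) for a, b, _ in facts}
    return sorted(p for p in pairs if (p[1], p[0]) not in pairs)


def conflicting_values(facts):
    """Rule b (sharing): the same ordered pair carries several values."""
    values = {}
    for a, b, value in facts:
        values.setdefault((a, b), set()).add(value)
    return {pair: vals for pair, vals in values.items() if len(vals) > 1}


def illegal_yet_allowed(facts, illegal="0-0"):
    """Rule b (connection): a pair is marked illegal and also has another value."""
    conflicts = conflicting_values(facts)
    return sorted(pair for pair, vals in conflicts.items() if illegal in vals)


def superposed_and_adjacent(superposed, adjacent):
    """Superposition/adjacency rule: a pair must not appear in both relations."""
    return sorted(set(superposed) & set(adjacent))


# Illustrative connection facts: (code A, code B, cardinality C1).
connection = [(3002, 3660, "1-2"), (788, 147, "0-0"), (788, 147, "-")]

print(missing_inverses(connection))     # [(788, 147), (3002, 3660)]
print(illegal_yet_allowed(connection))  # [(788, 147)]
# Invented pairs, only to exercise the last rule.
print(superposed_and_adjacent([(10, 20)], [(10, 20), (10, 30)]))  # [(10, 20)]
```

Treating each relation as a set of pairs keeps every rule to a few lines, which mirrors how such checks read as logic queries.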

Page 14

Results (1/2)

Inconsistency (inverse connection)

Data dictionary (generic relation):
between themes: Railway (L) connected to Road (L)
between themes: Road (L) connected to Railway (L)

Table of connection and cardinalities

Code    Entity     Combination                                      C1              Code
3002    Railway    Standard, Ground level, Operational, Multiple    1-2             3660
3660    Road       Secondary, Ground level, Hard surface            Not verified

?

Page 15

Results (2/2)

Inconsistency (different values for cardinality one, C1)

Data dictionary (generic relation):
Gas and oil facilities (P) is connected to Building (P)

Table of connection and cardinalities

Code    Entity                    Combination        C1                  Code
788     Gas and oil facilities    Generic/unknown    0-0                 147
788     Gas and oil facilities    Generic/unknown    - (Not verified)    147

?

Page 16

Consistency in data

[Diagram: NTDB ontologies, Delphi interface, Prolog, VB interface, Dataset; Step 1, Step 2]

Studying the logical consistency of the dataset

Page 17

GeoMedia Professional

Spatial operations

• Meet
• Entirely contained
• Entirely contained by
• Contains
• Contained by
• Spatially equal
• Touch

[Figure labels: Meet, Overlap]

Page 18

Mapping

Polygon – Polygon relations

Relations: Disjoint, Meet, Equal, Inside, Contains, Covered by, Covers, Overlap

Connection

Sharing x x x x

Superposition x x x x x x

Adjacent x

Page 19

Mapping problems

Several problems

Confusions in spatial relations

Unique mapping is not possible

Cardinalities cannot be considered
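Why a unique mapping is not possible can be illustrated with a small lookup. The sketch below is hypothetical: the cell assignments in `MAPPING` are assumptions made for illustration only (the number of cells per NTDB relation follows the mapping table, but which topological relations they fall under is not taken from the source). The point is only that as soon as one topological relation is a candidate for two NTDB relations, the topological result alone cannot decide between them.

```python
# ASSUMPTION: these cells are illustrative, not the authors' actual table.
MAPPING = {
    "sharing": {"meet", "equal", "covers", "covered by"},
    "superposition": {"equal", "inside", "contains",
                      "covers", "covered by", "overlap"},
    "adjacency": {"meet"},
}


def candidates(topological):
    """NTDB relations that a given topological relation could stand for."""
    return sorted(name for name, cells in MAPPING.items() if topological in cells)


def ambiguous():
    """Topological relations with more than one candidate NTDB relation."""
    seen = set().union(*MAPPING.values())
    return {t: candidates(t) for t in sorted(seen) if len(candidates(t)) > 1}


print(candidates("meet"))      # ['adjacency', 'sharing']
print(candidates("disjoint"))  # []
print(sorted(ambiguous()))     # ['covered by', 'covers', 'equal', 'meet']
```

Resolving such ambiguity needs information beyond the topological operator, for example the entity types involved.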

Page 20

Step 2: BNDT data vs. its ontology

Page 21

Data vs. ontology

File: 21E05

Region: Sherbrooke

68 entities

23,283 objects

Analyzed binary relations:
• Contours vs. water bodies
• Buildings vs. roads
• Water bodies vs. buildings
• Liquid depot vs. liquid depot
• Roads vs. water bodies
• …

Page 22

Results

Liquid depot vs. liquid depot

Spatial representations (Point, Area)

Spatial relations

Ontology / specification: superposition is illegal

Data: a superposition case is found
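The comparison on this slide, a relation declared illegal in the ontology but found in the data, can be sketched as a lookup of each observed relation against a rule table. This is a minimal Python illustration under stated assumptions: the `RULES` layout, the `check` function, the status labels and the object identifiers are all hypothetical. Only the liquid depot rule comes from the slide; the road and building observation is included to show what happens when the ontology holds no rule for a pair.

```python
# (entity A, entity B, relation) -> True if allowed, False if illegal.
RULES = {
    ("liquid depot", "liquid depot", "superposition"): False,
}

# Observed relations; the object identifiers are invented for the example.
OBSERVED = [
    ("liquid depot", "liquid depot", "superposition", ("obj_17", "obj_42")),
    ("road", "building", "superposition", ("obj_3", "obj_8")),
]


def check(observed, rules):
    """Label each observed relation by comparing it with the ontology rules."""
    report = []
    for a, b, relation, objects in observed:
        allowed = rules.get((a, b, relation))
        if allowed is None:
            status = "no rule in ontology"
        elif allowed:
            status = "consistent"
        else:
            status = "inconsistent data"
        report.append((objects, status))
    return report


for objects, status in check(OBSERVED, RULES):
    print(objects, status)
```

Keeping "no rule in ontology" separate from "inconsistent data" lets the same pass report both data errors and gaps in the rule set.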

Page 23

Results

• Problem: Road crosses a water body
• Illegal relation with respect to semantics of the objects
• Incomplete ontology

Page 24

Results

• Problem: Cut line crosses a water body
• Illegal relation with respect to semantic definition of the objects
• Incomplete ontology

Page 25

Results

• Problem: Contour crosses water body
• Illegal relation with respect to the ontology
• Inconsistent data

Page 26

Results

• Problem: Road crosses water body
• Illegal relation with respect to the ontology
• Inconsistent data

Page 27

Results

• Problem: Road crosses building
• Illegal relation with respect to the semantics of objects
• Incomplete ontology

Page 28

Results

• Problem: Water body (L) superposed on vegetation (A)
• Illegal relation with respect to the ontology
• Inconsistent data
• Control system problem

Page 29

Results

• Problem: Buildings (S) superposed on water body (A)
• Illegal relation with respect to the semantics of objects
• Inconsistent data

Page 30

Results

• Problem: Building (A) overlaps vegetation (A)
• Illegal relation with respect to the semantics of objects
• Inconsistent data

Page 31

Suggestions, solutions

Adding new rules
• Building (A) and vegetation (A): illegal superposition
• Road (L) and building: conditional superposition

A better control system is needed

Find exceptions

Page 32

Current situation

Product ontology is analyzed

Mapping of topological relations to binary relations

Ontology translation into Prolog (Delphi program)

Consistency study of spatial relations
• Connection (table C)
• Sharing (table D)
• Superposition and adjacency (table E)

Consistency between different relations (fusion of facts)
• Connection and sharing
• Connection and superposition / adjacency
• Sharing and superposition / adjacency

Consistency of data vs. specifications is studied

Page 33

Future work

Logical consistency of other available datasets

Mapping of ontologies

Fusion of ontologies

Fusion of data

Consistency of the newly created data set