Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li


Page 1: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Spatial Databases: Building Spatial DB

Spring, 2015

Ki-Joune Li

Page 2: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

STEMPNU

Importance of Database

Application of Spatial Databases (e.g. GIS)

- Garbage In, Garbage Out
- About 70% of GIS development cost is DB cost

Page 3: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Comparison with Software Lifecycle

Software Life Cycle (Waterfall Model)
1. Requirement Analysis
2. Functional Specification
3. Design
4. Development Environments
5. Coding
6. Test
7. Maintenance

DB Life Cycle
1. Requirement Analysis
2. Modeling
3. Schema Design
4. DB Environments
5. Data Collection and Input
6. Quality Control
7. Maintenance

Page 4: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Requirement Analysis

Analysis of the status as it is and as it shall be.

Output of Analysis
- Use-Case Diagram of UML: workflow analysis
- Data items that have been maintained and are to be maintained
- Description of each item: Data Dictionary
- Relationships and constraints on items
- Required accuracy
  - Spatial precision
  - Temporal precision

[Figure: transition from the current state (as it is) to the required state (as it must be)]

Page 5: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Data Dictionary

Definitions and representation of data items, such as:
- Precise definition of data elements
- Integrity constraints
- Stored procedures and trigger rules
- Specification of the producer and consumer of each data element

Why is it so important?
- Common understanding of data items
- Consistency of databases
- Important input to data modeling
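One way to picture a data dictionary entry is as a small record holding the items listed above. The sketch below is only an illustration: the class name DataDictionaryEntry, its fields, the helper build_dictionary, and the sample "Road" entry are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataDictionaryEntry:
    """One data item, described once so that all parties share the definition."""
    name: str
    definition: str          # precise definition of the data element
    constraints: tuple = ()  # integrity constraints, stated in words here
    producer: str = ""       # who creates and maintains the item
    consumers: tuple = ()    # who reads the item


def build_dictionary(entries):
    """Index entries by name and reject duplicates to keep definitions unique."""
    dictionary = {}
    for entry in entries:
        if entry.name in dictionary:
            raise ValueError(f"duplicate data item: {entry.name}")
        dictionary[entry.name] = entry
    return dictionary


# Hypothetical sample entry, invented for illustration only.
road = DataDictionaryEntry(
    name="Road",
    definition="A paved way open to vehicle traffic",
    constraints=("must not overlap any building",),
    producer="Survey team",
    consumers=("Navigation service",),
)
data_dictionary = build_dictionary([road])
```

Rejecting duplicate names is one simple way to enforce the "common understanding" goal: each data item has exactly one definition.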

Page 6: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Data Modeling

- Understanding the real world and the application
  - A very small piece of the real world
  - According to viewpoint
  - Determined by applications
- Drawing what you have understood in a formal method
  - Class Diagram in UML
- 4 steps
  1. Definition of Entities
  2. Attributes of each Entity
  3. Relationships
  4. Constraints

Page 7: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Class Diagram: Basic

[Figure: UML class diagram of a rental example. Classes shown: DVD Movie, VHS Movie, Video Game, Rental Item {abstract}, Rental Invoice, Customer, Checkout Screen. Multiplicities shown: 1..*, 1, 0..1. Notation labels: Class, Abstract Class, Simple Association, Simple Aggregation, Generalization, Composition, (Dependency), Multiplicity.]

[Figure: class box notation]
MyClassName
+SomePublicAttribute : SomeType
-SomePrivateAttribute : SomeType
#SomeProtectedAttribute : SomeType
+ClassMethodOne()
+ClassMethodTwo()
Responsibilities -- can optionally be described here.

Page 8: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Definition of Entities

- Extract nouns from
  - Problem statement
  - Use-Case Diagram
- Delete unnecessary entities
  - Duplication
  - Attributes rather than entities (ex. loan amount)
- Definition of Features
  - Geographic entity
  - Granularity

[Figure: class box showing only the class name, MyClassName]

Page 9: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Definition of Features

- Feature
  - Meaningful object of GIS in the real world
  - Must have a geometry: Point, Line, Polygon, etc.
- How to define the granularity of features
  - Example: How to define “a” coastline?
  - Is the highway from Pusan to Seoul one long feature?
  - How to separate this long road?

Page 10: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Definition of Attributes

- Attributes of a Feature
  - Geometric type: spatial attribute
  - Non-spatial attributes
- Geometric Type: different Levels of Detail (LOD)
  - Building: polygon at 1/1,000 scale, point at 1/1,000,000 scale
  - Road: polygon at 1/1,000 scale, polyline at 1/1,000,000 scale

[Figure: class box notation with a geometric attribute]
MyClassName
+SomePublicAttribute : SomeType
-SomePrivateAttribute : SomeType
#SomeProtectedAttribute : SomeType
+GeometricAttribute
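The level-of-detail idea can be sketched as a feature that stores one geometry per map scale and picks a representation for the scale being drawn. This is a minimal sketch under stated assumptions: the Feature class, the geometry_for_scale method, the selection rule (use the coarsest stored geometry that is not coarser than the requested scale), and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A feature with non-spatial attributes and one geometry per scale."""
    name: str
    # Maps a scale denominator (1000 means 1/1,000) to (geometry type, coordinates).
    geometries: dict = field(default_factory=dict)

    def geometry_for_scale(self, scale_denominator):
        """Pick the coarsest stored geometry that is not coarser than requested.

        Falls back to the most detailed geometry when the requested scale is
        more detailed than anything stored.
        """
        usable = [d for d in self.geometries if d <= scale_denominator]
        chosen = max(usable) if usable else min(self.geometries)
        return self.geometries[chosen]


# Invented sample data: a building stored as a polygon and as a point.
building = Feature(
    name="City Hall",
    geometries={
        1_000: ("Polygon", [(0, 0), (20, 0), (20, 10), (0, 10)]),
        1_000_000: ("Point", [(10, 5)]),
    },
)
```

With this rule, a 1/5,000 map would draw the polygon and a 1/2,000,000 map would draw the point, which matches the building example on the slide.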

Page 11: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Relationship

- Non-spatial relationship
- Spatial relationship: topology

Page 12: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Constraints

- Example
  - No building on a road surface
  - More than 50 meters between two poles
- Implementation
  - Internal functions for checking constraints (or constructor)
  - Spatial OCL (Object Constraint Language)
- More detailed and complete constraints lead to better DB quality
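The "internal function" approach to the pole-spacing example could look roughly like the following. It is a sketch, not the course's implementation: it assumes planar coordinates in meters, and the function name pole_spacing_violations and the sample poles are made up for illustration.

```python
import math
from itertools import combinations

MIN_POLE_SPACING_M = 50.0  # from the slide: more than 50 meters between two poles


def pole_spacing_violations(poles, min_spacing=MIN_POLE_SPACING_M):
    """Return pairs of pole ids whose distance is not more than min_spacing.

    poles maps a pole id to an (x, y) position in meters (assumed planar).
    """
    return [
        (id_a, id_b)
        for (id_a, pos_a), (id_b, pos_b) in combinations(sorted(poles.items()), 2)
        if math.dist(pos_a, pos_b) <= min_spacing
    ]


# Invented sample data: P1 and P2 are exactly 50 m apart, which is not "more than 50".
poles = {"P1": (0.0, 0.0), "P2": (30.0, 40.0), "P3": (100.0, 0.0)}
violations = pole_spacing_violations(poles)
```

The pairwise check is quadratic in the number of poles; for a large data set a spatial index would normally be used to limit the pairs compared.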

Page 13: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Quality Control for Data Modeling

For quality control: a simulation with a pre-defined test scenario

Page 14: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Schema Design

- Automatic conversion from data model to schema
- Check points: performance issues
  - Materialization
  - Index
  - Geographic distribution of DB: clustering
- Based on workload analysis
  - Distribution of operations
  - Distribution of values

Page 15: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Materialization

In SQL, a view is a virtual table derived from a SELECT statement.

Example:

    CREATE VIEW ExcellentStudents AS
    SELECT Name, Department, Score
    FROM Students
    WHERE Score > 4.0

    SELECT Name
    FROM ExcellentStudents
    WHERE Department = 'CS'

[Figure labels: Invoke, ExcellentStudents, Materialization]
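The difference between a view and a materialized copy can be tried out with SQLite from Python. This is a minimal sketch: SQLite has no built-in materialized views, so the copy is emulated with an ordinary table, and the table name ExcellentStudentsMat, the helper functions, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (Name TEXT, Department TEXT, Score REAL);
    INSERT INTO Students VALUES
        ('Kim', 'CS', 4.3), ('Lee', 'CS', 3.5), ('Park', 'EE', 4.1);

    -- Virtual table: the SELECT is re-evaluated on every query.
    CREATE VIEW ExcellentStudents AS
        SELECT Name, Department, Score FROM Students WHERE Score > 4.0;

    -- Emulated materialization: the result is stored as a real table.
    CREATE TABLE ExcellentStudentsMat AS
        SELECT Name, Department, Score FROM Students WHERE Score > 4.0;
""")


def cs_names(relation):
    """Run the slide's query against the given view or table."""
    rows = conn.execute(
        f"SELECT Name FROM {relation} WHERE Department = 'CS' ORDER BY Name"
    )
    return [name for (name,) in rows]


def refresh():
    """Propagate base-table updates by rebuilding the stored copy."""
    conn.executescript("""
        DELETE FROM ExcellentStudentsMat;
        INSERT INTO ExcellentStudentsMat
            SELECT Name, Department, Score FROM Students WHERE Score > 4.0;
    """)


# An update to the base table is visible through the view immediately,
conn.execute("UPDATE Students SET Score = 4.5 WHERE Name = 'Lee'")
from_view = cs_names("ExcellentStudents")
# but the stored copy stays stale until the update is propagated.
from_stale_copy = cs_names("ExcellentStudentsMat")
refresh()
from_fresh_copy = cs_names("ExcellentStudentsMat")
```

Queries on the stored copy avoid re-running the SELECT, at the price of the refresh step after every relevant update.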

Page 16: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Materialize or Not?

- Materialization means duplication
  - Not 3NF (BCNF)
  - Can cause an inconsistency between the original and derived tables
  - Update: overhead due to update propagation
  - Extra space requirements
- Should be determined depending on the WORKLOAD
  - Frequency of updates
  - Cost of update propagation, especially when the materialized view is geographically distributed

Page 17: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Spatial Index

- Index: accelerates search
- Spatial index
  - Spatial predicates: contain, overlap, k-NN
  - Greatly improves query processing performance
  - Has a performance overhead for insertion/deletion

[Figure: two-phase search. Labels: Search Condition, { Block# }, Database on Disk, 1st Phase, 2nd Phase]
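The two-phase search can be sketched with a uniform grid, where each grid cell plays the role of a disk block. The slides do not name a specific index structure, so the grid, the class GridIndex, the cell size, and the sample points are all assumptions chosen to keep the example short.

```python
from collections import defaultdict

CELL_SIZE = 10.0  # hypothetical cell width; each cell stands in for one disk block


def cell_of(x, y):
    """Map a coordinate to the (column, row) of its grid cell."""
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))


class GridIndex:
    def __init__(self):
        self.cells = defaultdict(list)  # cell -> list of (object id, x, y)

    def insert(self, obj_id, x, y):
        self.cells[cell_of(x, y)].append((obj_id, x, y))

    def candidate_cells(self, xmin, ymin, xmax, ymax):
        """1st phase: turn the search condition into a set of block numbers."""
        (cx0, cy0), (cx1, cy1) = cell_of(xmin, ymin), cell_of(xmax, ymax)
        return [
            (cx, cy)
            for cx in range(cx0, cx1 + 1)
            for cy in range(cy0, cy1 + 1)
            if (cx, cy) in self.cells
        ]

    def range_query(self, xmin, ymin, xmax, ymax):
        """2nd phase: read only the candidate blocks and test each object."""
        hits = []
        for cell in self.candidate_cells(xmin, ymin, xmax, ymax):
            for obj_id, x, y in self.cells[cell]:
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(obj_id)
        return sorted(hits)


# Invented sample points.
index = GridIndex()
for obj_id, x, y in [("a", 1, 1), ("b", 12, 3), ("c", 25, 25), ("d", 9, 9)]:
    index.insert(obj_id, x, y)

blocks = index.candidate_cells(0, 0, 12, 5)
result = index.range_query(0, 0, 12, 5)
```

Point "d" shares a block with "a", so it is read in the second phase and then rejected by the exact test; point "c" lies in a block that is never read.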

Page 18: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Clustering: Placement of Records

Vertical Fragmentation vs. Horizontal Fragmentation
- Vertical fragmentation: decomposition of a table
- Horizontal fragmentation: placement of objects
- Consideration of workload

[Figure: Vertical Fragmentation, Horizontal Fragmentation]
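The two kinds of fragmentation can be illustrated on a small in-memory table. The sketch assumes rows are dictionaries; the function names, the column grouping, and the sample building records are hypothetical.

```python
def vertical_fragments(rows, key, column_groups):
    """Decompose a table by columns; every fragment keeps the key column."""
    return [
        [{col: row[col] for col in (key, *cols)} for row in rows]
        for cols in column_groups
    ]


def horizontal_fragments(rows, place):
    """Split a table by rows; place(row) names the fragment a row goes to."""
    fragments = {}
    for row in rows:
        fragments.setdefault(place(row), []).append(row)
    return fragments


# Invented sample table.
buildings = [
    {"id": 1, "name": "City Hall", "region": "Pusan", "floors": 5},
    {"id": 2, "name": "Library", "region": "Seoul", "floors": 3},
    {"id": 3, "name": "Station", "region": "Pusan", "floors": 2},
]

# Vertical: columns that are queried together are stored together.
by_columns = vertical_fragments(buildings, "id", [("name",), ("region", "floors")])
# Horizontal: objects that are accessed together are placed together.
by_region = horizontal_fragments(buildings, lambda row: row["region"])
```

Which columns to group and which placement function to use would follow from the workload analysis mentioned on the slide.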

Page 19: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

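Clustering

Clustering: grouping objects so as to maximize $\mathrm{Prob}(a \in C,\ b \in C)$ when $O_k = a$ and $O_{k+1} = b$, for any two objects $a$ and $b$ of the same group $C$. Here $O_k$ and $O_{k+1}$ are two consecutive accesses.

Spatial Clustering, basic assumption:

If $\mathrm{dist}(a, b) < \mathrm{dist}(a, c)$, then $\mathrm{Prob}(O_k = a,\ O_{k+1} = b) > \mathrm{Prob}(O_k = a,\ O_{k+1} = c)$.

[Figure: objects a, b, and c]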
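Under the basic assumption that nearby objects tend to be accessed consecutively, one simple placement strategy is to sort objects along a space-filling curve and cut the sorted list into blocks. The sketch below uses the Z-order (Morton) curve, which is a technique swapped in for illustration and is not named in the slides; the function names, block capacity, and sample objects are likewise assumptions, and coordinates are taken to be non-negative integers.

```python
def morton_code(x, y, bits=16):
    """Interleave the bits of x and y (x on even positions, y on odd ones)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def cluster_into_blocks(objects, capacity):
    """Sort (id, x, y) objects along the Z-order curve and cut into blocks."""
    ordered = sorted(objects, key=lambda obj: morton_code(obj[1], obj[2]))
    return [
        [obj_id for obj_id, _, _ in ordered[start:start + capacity]]
        for start in range(0, len(ordered), capacity)
    ]


# Invented sample data: two groups of four objects in opposite corners.
objects = [
    ("n1", 7, 7), ("s1", 0, 0), ("n2", 6, 7), ("s2", 1, 0),
    ("n3", 7, 6), ("s3", 0, 1), ("n4", 6, 6), ("s4", 1, 1),
]
blocks = cluster_into_blocks(objects, capacity=4)
```

Objects that are close in space usually receive nearby codes and land in the same block, although a Z-order curve does have jumps where neighbors are separated.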

Page 20: Spatial Databases: Building Spatial DB Spring, 2015 Ki-Joune Li

Spatial Clustering Methods

- k-Means
- CLARANS, in IEEE TKDE 2002, 14(5)
- BIRCH, in Proc. ACM SIGMOD 1996
- DBSCAN, in Proc. KDD 1996
- SMTIN, in Proc. ACM-GIS 1997
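Of the methods listed, k-means is the simplest to sketch. The following is a minimal version of the standard Lloyd iteration in plain Python, assuming the caller supplies the initial centroids; the function name kmeans and the sample points are made up for illustration, and none of the other listed methods is shown.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Assign points to the nearest centroid, recompute centroids, repeat."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: a centroid moves to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters


# Invented sample data: two well-separated groups.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The result depends on the initial centroids, which is why they are passed in explicitly here rather than chosen at random.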