An Interactive Framework for Raster Data Spatial Joins

Page 1: An Interactive Framework for  Raster Data Spatial Joins

1 ACM GIS 2007

An Interactive Framework for Raster Data Spatial Joins

Wan Bae (Computer Science, University of Denver)

Petr Vojtěchovský (Mathematics, University of Denver)

Shayma Alkobaisi (Computer Science, University of Denver)

Scott T. Leutenegger (Computer Science, University of Denver)

Seon Ho Kim (Computer Science, University of Denver)

Page 2: An Interactive Framework for  Raster Data Spatial Joins

Outline

Introduction

Issues and Problems

Probabilistic Joins

Sampling Joins

Interactive Framework

Experiments

Conclusion

Page 3: An Interactive Framework for  Raster Data Spatial Joins

Geographic Information Systems

[Diagram labels: data, GIS, Web application, Users]

GIS

• Collect
• Store
• Retrieve

• Integration of georeferenced data
• Spatial queries
• Complex spatial data analysis & modeling for decision support

Page 4: An Interactive Framework for  Raster Data Spatial Joins

Raster Data Model

[Figure: (a) Satellite Image; (b) Raster Model, a grid with rows and columns numbered 0 to 9 whose cells carry the codes R, T, and H]

• A great portion of georeferenced data
• Simple data structure but greater storage space
• Continuously changing data

Page 5: An Interactive Framework for  Raster Data Spatial Joins

Continuously Changing Data

Page 6: An Interactive Framework for  Raster Data Spatial Joins

Raster Data Spatial Joins

[Figure: panels (a) and (b)]

“Find the regions where rainfall rate is greater than 1.0 and wind speed is greater than 50”
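The query above can be read as a cell-by-cell join of two aligned raster layers. The following is a minimal sketch under that assumption; the function name `raster_join`, the list-of-lists layout, and the sample values are illustrative and do not come from the slides.

```python
def raster_join(rain, wind, rain_min=1.0, wind_min=50.0):
    """Return (row, col) of every cell where both predicates hold.

    Assumes `rain` and `wind` are aligned grids of identical shape.
    """
    hits = []
    for i, (rain_row, wind_row) in enumerate(zip(rain, wind)):
        for j, (r, w) in enumerate(zip(rain_row, wind_row)):
            if r > rain_min and w > wind_min:
                hits.append((i, j))
    return hits


# Hypothetical 2 x 3 layers, for illustration only.
rain = [[0.5, 1.2, 2.0],
        [1.1, 0.9, 1.5]]
wind = [[60.0, 55.0, 40.0],
        [70.0, 80.0, 51.0]]
regions = raster_join(rain, wind)
```

This exhaustive scan touches every cell of both layers, which is the cost the approximate joins in the rest of the talk try to avoid.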

Page 7: An Interactive Framework for  Raster Data Spatial Joins

Issues for User-driven Data Exploration

Fast query response time

– Time consuming for exact answers due to large size of data sets

– Time intensive GIS decision support queries

– Lack of optimization and approximation techniques for raster data joins

Interactive query processing

– Lack of interactivity in traditional GIS

– No user control over query processing

Visualization increases the utility of the GIS

Page 8: An Interactive Framework for  Raster Data Spatial Joins

Our Approach

For faster and more effective decision support queries:

Fast approximation of query results

1. probabilistic join

2. sampling join

Visualize intermediate results

1. “big picture” of query result

2. partial result: non-blocking joins

Allow users to control query processing

Page 9: An Interactive Framework for  Raster Data Spatial Joins

Our Approximations

1. What is the probability that R joins S?

R (8/16), S (9/16): they must join, because 8 + 9 > 16 cells

2. Can we use the result of a subset of data cell joins for the final answer?

1 join / 2 cells → ? / 16 cells

Page 10: An Interactive Framework for  Raster Data Spatial Joins

Augmented Quad-trees

Both data sets are indexed using Quad-trees

[Figure: two quad-trees, one per data set; each node has four children labeled NW, NE, SW, SE]
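The slide does not spell out what the augmentation stores. A minimal sketch, assuming each node keeps the number of qualifying cells in its quadrant (so that fractions such as 8/16 can be read off at any level); the class name `QuadNode` and the sample grid are hypothetical, and the grid side is assumed to be a power of two:

```python
class QuadNode:
    """Quad-tree node over a square block of a 0/1 raster.

    `count` is the assumed augmentation: the number of qualifying
    cells (value 1) inside the block.
    """

    def __init__(self, grid, row=0, col=0, size=None):
        self.size = len(grid) if size is None else size
        self.row, self.col = row, col
        self.children = {}
        if self.size == 1:
            self.count = grid[row][col]
            return
        half = self.size // 2
        offsets = {"NW": (0, 0), "NE": (0, half),
                   "SW": (half, 0), "SE": (half, half)}
        for name, (dr, dc) in offsets.items():
            self.children[name] = QuadNode(grid, row + dr, col + dc, half)
        # Aggregate the children's counts on the way back up.
        self.count = sum(c.count for c in self.children.values())

    def fraction(self):
        """Fraction of the block's cells that qualify."""
        return self.count / (self.size * self.size)


# Hypothetical 4 x 4 layer: 1 marks a cell that satisfies the predicate.
grid = [[1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 1]]
root = QuadNode(grid)
```

With counts stored per node, the fraction for any quadrant is available without visiting its cells, which is what a level-by-level approximate join needs.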

Page 11: An Interactive Framework for  Raster Data Spatial Joins

Join Probability

Let X = [0, 1], and let m and n be randomly chosen intervals in X of lengths a and b. What is the probability p that m ∩ n ≠ ∅?

Join probability p(m ∩ n ≠ ∅) = ?

Page 12: An Interactive Framework for  Raster Data Spatial Joins

1-d Join Probability

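[Figure: intervals m = [a1, a2] of length a and n = [b1, b2] of length b on [0, 1], shown overlapped; n is placed at [x, x + b] with 0 ≤ x ≤ 1 − b]

$$p_1(a, b) = \frac{1}{(1-a)(1-b)} \int_0^{1-b} \left( \min\{x+b,\ 1-a\} - \max\{x-a,\ 0\} \right) dx$$

Here p_1 denotes the 1-d join probability.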
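As a sanity check on the 1-d case, the sketch below evaluates the integral numerically, taking the integrand to be min{x + b, 1 − a} − max{x − a, 0} over x in [0, 1 − b], and compares it with a closed form. The closed form, 1 − (1 − a − b)² / ((1 − a)(1 − b)) for a + b < 1 and 1 otherwise, is derived here from the geometry and is not stated on the slide; the function names are hypothetical. Both functions assume 0 < a, b < 1.

```python
def p1_closed(a, b):
    """Closed-form 1-d join probability (derived, not from the slide)."""
    if a + b >= 1.0:
        return 1.0  # the two intervals cannot avoid each other
    return 1.0 - (1.0 - a - b) ** 2 / ((1.0 - a) * (1.0 - b))


def p1_numeric(a, b, steps=10000):
    """Midpoint-rule evaluation of the integral over x in [0, 1 - b]."""
    h = (1.0 - b) / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        # Length of the range of start positions of m that overlap [x, x + b].
        total += min(x + b, 1.0 - a) - max(x - a, 0.0)
    return total * h / ((1.0 - a) * (1.0 - b))


closed = p1_closed(0.2, 0.2)
numeric = p1_numeric(0.2, 0.2)
```

For a = b = 0.2 the integral works out by hand to 0.28 / 0.64 = 0.4375, and both functions should agree with that value.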

Page 13: An Interactive Framework for  Raster Data Spatial Joins

2-d Join Probability

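[Figure: rectangles m, with sides a1, a2 and area a, and n, with sides b1, b2 and area b]

$$p(a, b) = \frac{1}{(1-a)(1-b)} \int_a^1 \int_b^1 p_1(a_1, b_1)\, p_1\!\left(\frac{a}{a_1}, \frac{b}{b_1}\right) db_1\, da_1$$

Here p_1 is the 1-d join probability and p is the 2-d join probability.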
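A numerical sketch of the 2-d case, assuming the formula averages the product of two 1-d probabilities, p_1(a1, b1) for the horizontal extents and p_1(a/a1, b/b1) for the vertical extents, over a1 in [a, 1] and b1 in [b, 1]. The function names, the midpoint rule, and the closed form used for p_1 are choices made for this illustration rather than anything given in the slides.

```python
def p1(a, b):
    """1-d join probability in closed form (derived for this sketch)."""
    if a + b >= 1.0:
        return 1.0
    return 1.0 - (1.0 - a - b) ** 2 / ((1.0 - a) * (1.0 - b))


def p2(a, b, steps=200):
    """Midpoint-rule double integral over a1 in [a, 1], b1 in [b, 1].

    Assumes 0 < a < 1 and 0 < b < 1.
    """
    ha = (1.0 - a) / steps
    hb = (1.0 - b) / steps
    total = 0.0
    for i in range(steps):
        a1 = a + (i + 0.5) * ha          # width of rectangle m
        for j in range(steps):
            b1 = b + (j + 0.5) * hb      # width of rectangle n
            # Rectangles overlap when both projections overlap.
            total += p1(a1, b1) * p1(a / a1, b / b1)
    return total * ha * hb / ((1.0 - a) * (1.0 - b))


half_half = p2(0.5, 0.5)
```

Values computed this way can be compared against a precomputed look-up table; at run time a table look-up avoids repeating the double integral for every pair of nodes.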

Page 14: An Interactive Framework for  Raster Data Spatial Joins

Look-up table for 2-d Join Probability

Rows: a; columns: b

p(a, b)   0.1      0.2      0.3      0.4      0.5
0.1       0.4636   0.6228   0.7414   0.8317   0.8997
0.2       0.6228   0.7683   0.8640   0.9277   0.9681
0.3       0.7414   0.8640   0.9343   0.9738   0.9930
0.4       0.8317   0.9277   0.9738   0.9937   0.9995
0.5       0.8997   0.9681   0.9930   0.9995   1.0

Page 15: An Interactive Framework for  Raster Data Spatial Joins

Probabilistic Join (PJ)

p(2/4, 2/4)

p(8/16, 9/16)
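One way to read the slide is that, at a chosen quad-tree level, each pair of corresponding blocks is assigned the join probability of their qualifying-cell fractions. The sketch below follows that reading. It recomputes block fractions directly from the grids instead of reading them from a quad-tree, treats a block pair as non-joining when either block is empty, and takes the probability function as a parameter; all of these, together with the names `block_fractions`, `probabilistic_join`, and the stand-in `toy_prob`, are assumptions of this illustration.

```python
def block_fractions(grid, level):
    """Fraction of qualifying cells in each block at a quad-tree level.

    Level 0 is the whole grid, level 1 its four quadrants, and so on.
    Assumes a square 0/1 grid whose side is divisible by 2 ** level.
    """
    blocks = 2 ** level
    size = len(grid) // blocks
    fractions = []
    for bi in range(blocks):
        row = []
        for bj in range(blocks):
            count = sum(grid[i][j]
                        for i in range(bi * size, (bi + 1) * size)
                        for j in range(bj * size, (bj + 1) * size))
            row.append(count / (size * size))
        fractions.append(row)
    return fractions


def probabilistic_join(r, s, level, prob):
    """Join probability for every pair of corresponding blocks."""
    fr = block_fractions(r, level)
    fs = block_fractions(s, level)
    return [[prob(a, b) if a > 0 and b > 0 else 0.0
             for a, b in zip(row_r, row_s)]
            for row_r, row_s in zip(fr, fs)]


def toy_prob(a, b):
    """Placeholder only; a real run would use the 2-d join probability."""
    return min(1.0, a + b)


# Hypothetical layers: R has 8 of 16 qualifying cells, S has 9 of 16.
r = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
s = [[0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
level0 = probabilistic_join(r, s, 0, toy_prob)
level1 = probabilistic_join(r, s, 1, toy_prob)
```

Going one level deeper refines the picture: at level 0 the whole area is flagged, while at level 1 only the north-west quadrant remains a candidate.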

Page 16: An Interactive Framework for  Raster Data Spatial Joins

Probabilistic Join Result

(a) data set Q (65536 × 65536)

(b) data set S (65536 × 65536)

(c) 2nd level joins (d) 3rd level joins (e) 4th level joins

Page 17: An Interactive Framework for  Raster Data Spatial Joins

Incremental Stratified Sampling Join (ISSJ)

Utilize the stratified random sampling technique on the quad-trees of two data sets R and S

Data randomization: Acceptance/Rejection method

1. Sampling step: sample data from the outer data set R

2. Spatial joining step: join each sample with the corresponding data cell of the inner data set S

3. Refining step: update running estimates and confidence intervals

4. Visualization: display partial results (actual join results)
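The four steps can be sketched as a generator that hands back a running estimate and the partial result after every round. This is an illustration under several assumptions: both layers are aligned 0/1 grids, the strata are given as explicit lists of cells, and cells are randomized by shuffling each stratum and sampling without replacement, which replaces the acceptance/rejection method named on the slide. The name `issj` and all parameters are hypothetical, and confidence intervals are left out of this sketch.

```python
import random


def issj(r, s, strata, batch, rounds, seed=0):
    """Yield (estimated join count, partial result) after each round.

    r, s   -- aligned 0/1 grids (1 = cell satisfies that layer's predicate)
    strata -- list of strata, each a list of (row, col) cells
    batch  -- cells sampled per stratum per round, without replacement
    """
    rng = random.Random(seed)
    pools = [list(cells) for cells in strata]
    for pool in pools:
        rng.shuffle(pool)
    sampled = [0] * len(strata)
    joined = [0] * len(strata)
    partial = []  # actual join results found so far
    for _round in range(rounds):
        for k, pool in enumerate(pools):
            for _draw in range(min(batch, len(pool))):
                i, j = pool.pop()            # 1. sample a cell of R
                sampled[k] += 1
                if r[i][j] and s[i][j]:      # 2. join with the same cell of S
                    joined[k] += 1
                    partial.append((i, j))
        # 3. refine: scale each stratum's sample proportion to its size
        estimate = sum(len(strata[k]) * joined[k] / sampled[k]
                       for k in range(len(strata)) if sampled[k])
        yield estimate, list(partial)        # 4. pass results to a display


# Hypothetical 4 x 4 layers with the four quadrants as strata.
r = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
s = [[0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
strata = [[(i, j) for i in rows for j in cols]
          for rows in (range(0, 2), range(2, 4))
          for cols in (range(0, 2), range(2, 4))]
results = list(issj(r, s, strata, batch=2, rounds=2))
```

Because the generator yields after every round, a caller can redraw the display or stop early, which matches the non-blocking behavior the talk asks for. Once every cell has been sampled, the estimate equals the exact join count.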

Page 18: An Interactive Framework for  Raster Data Spatial Joins

Stratified Random Sampling

[Figure: the data set divided into four strata, ST1, ST2, ST3, ST4, shown with the values 0, 2, 2, 1]

Page 19: An Interactive Framework for  Raster Data Spatial Joins

Estimates and Confidence Interval

Population proportion: the fraction of the population that has a particular characteristic of interest

Estimated value: the statistic computed from sample information to estimate the population proportion

Confidence interval: a range of values, computed from the sample, that contains the population parameter with a specified probability

Confidence level: the specified probability
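These definitions can be made concrete with a small example. The slides do not say which interval formula is used, so the sketch below assumes the standard normal-approximation (Wald) interval for a proportion; the function name `proportion_interval` and the sample counts are hypothetical.

```python
import math


def proportion_interval(hits, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    hits -- sampled cells that joined
    n    -- sampled cells in total
    z    -- critical value; 1.96 corresponds to a 95% confidence level
    """
    p_hat = hits / n                                   # estimated value
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    low = max(0.0, p_hat - half_width)
    high = min(1.0, p_hat + half_width)
    return p_hat, low, high


p_hat, low, high = proportion_interval(50, 100)
```

With 50 joins among 100 sampled cells, the half-width is 1.96 × 0.05 = 0.098, so the interval is roughly 0.402 to 0.598. Sampling more cells shrinks the interval in proportion to 1/√n, which is why the estimate can be refined incrementally.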

Page 20: An Interactive Framework for  Raster Data Spatial Joins

Incremental Sampling Join Result

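[Figure: (a) Estimated result, a table with columns state, airports, and confidence interval; states listed: IA, NE, WI, CO, KS, MI; airports values listed: 13, 22, 19, 15, 11, 8; each of the six rows shows 0.05 and 95; progress indicator: 10% done. (b) Partial result]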

Page 21: An Interactive Framework for  Raster Data Spatial Joins

Interactive Join Framework

Page 22: An Interactive Framework for  Raster Data Spatial Joins

Experiments

PJ and ISSJ compared to full Quad-tree join.

Confidence level set to 95% in ISSJ

Varied buffer size and data set size.

Data sets:

– Synthetic: U E, E U, U U (65536 × 65536 and 262144 × 262144)

– Real: 6 mineral resource data sets for each of the states AZ, CO, OR, and WY, from the U.S. Geological Survey (65536 × 65536)

Page 23: An Interactive Framework for  Raster Data Spatial Joins

Actual joins vs. 2-d PJ

sample size   actual joins   2-d PJ (error)
5%            54             48 (0.1060)
10%           109            99 (0.0917)
20%           218            197 (0.0963)
50%           545            494 (0.0936)

Page 24: An Interactive Framework for  Raster Data Spatial Joins

Accuracy of Estimates of ISSJ

[Plot: estimates vs. exact value for real data sets; x-axis: number of processed cells]

Page 25: An Interactive Framework for  Raster Data Spatial Joins

Time for Confidence Interval of ISSJ

[Plot: confidence interval and I/Os for real data sets; series: sampling join, full quad-tree join]

Page 26: An Interactive Framework for  Raster Data Spatial Joins

ISSJ vs. PJ vs. Actual joins

[Figure: (a) ISSJ w/ 10% CI (b) ISSJ w/ 5% CI (c) Actual join (d) PJ]

Page 27: An Interactive Framework for  Raster Data Spatial Joins

Time for Confidence Intervals

I/Os of PJ, ISSJ and the full quad-tree join for Colorado

Page 28: An Interactive Framework for  Raster Data Spatial Joins

Conclusion

A novel spatial join, Probabilistic Join, that gives a “big picture” visualization of the query answer for raster data joins

An interactive raster spatial join algorithm, Incremental Refining Spatial Join, that gives estimated query answers for raster data joins, bounded by confidence intervals

Page 29: An Interactive Framework for  Raster Data Spatial Joins

Thank you!