

Efficient Placement of Geographical Data Over Broadcast Channel for Spatial Range Query Under Quadratic Cost Model

Jianting Zhang

Le Gruenwald

School of Computer Science
The University of Oklahoma
Norman, Oklahoma, 73019, USA
{jianting, ggruenwald}@ou.edu

Outline

• Introduction
• Related Work
• Review of the Cost Model
• The Optimization Method
  – General Ideas
  – The Approximation Algorithm
  – An Example
• Experiments and Results
• Conclusions and Future Work Directions

Introduction: What is Geographical Information?

• Mailing Address: Engineering Laboratory, Room 139, 200 Felgar Street, Norman, OK 73019-6151

• Relative Direction: 4 miles southeast of Norman

• Coordinates (Longitude/Latitude): (-97.443067, 35.194425)

Introduction: Why Broadcasting?

Broadcasting helps solve several key problems in mobile computing:

• Bandwidth
  – Independent of the number of users
  – Excellent scalability
• Power Consumption
  – Listen/sleep mode consumes less power than send mode
• Mobility
  – No mobility management is required at either the server side or the client side

Introduction: Types of Broadcasting

• Pull based
  – Needs explicit client requests
  – Only requested data are broadcast
  – Needs frequent scheduling
• Push based
  – Schedules the broadcast sequence without explicit requests
  – Needs prior knowledge for efficient sequencing
  – Suitable for pushing data to a large number of users

Introduction: Why GI Broadcasting?

• Public information
  – Service locations: ATMs, restaurants
  – Traffic and road conditions
  – Weather information
• Large number of potential users of GI (a metropolitan area, for example)
• Relatively static, with low update frequency
• Mostly read-only
• Privacy is not a big concern
• Distributed in nature

Introduction: Disk Access vs. Air Access

[Figure: disk access vs. air access]

Introduction: Parameters in Broadcast System

Access Time (Latency), AT:
– The time from when the broadcast channel is first accessed to when all requested data have been retrieved
– The user may switch to sleep mode between periods of active downloading

Introduction: Parameters in Broadcast System

Tune-in Time, TT:
– Time spent downloading data from the broadcast sequence
– Mobile hosts are in active (listen) mode during this time

Introduction: Research Objectives (A Big Picture)

[Figure: overview of the research objectives]

Introduction: Broadcast Scheme Under Consideration

Index and data use separate channels.

Related Work

• General Data Broadcasting:
  – Tree-indexing, hashing, signature, hybrid, etc.
  – Suitable for one-dimensional and/or categorical data
  – Allows only one data item per access
  – Focuses on the trade-off between TT and AT using replication
• Object-Oriented/Relational Database Broadcasting:
  – Allows multiple data items per access
  – Assumes data accesses follow predefined orders

Related Work

• Geographical data are multi-dimensional and continuous.

• A spatial range query result set may contain multiple data items, and these items may not have a predefined order.

• Existing broadcast techniques cannot be applied to geographical data broadcast for efficient spatial range query processing.

Cost Model: Brief Review

Assumption 1: It takes unit time to broadcast a single data item. Thus the difference in position (D) between two data items in a broadcast sequence can be used as the measure of data access time.

Assumption 2: Data and index are broadcast exactly once in a broadcast cycle. We do not consider replication in this study.

Assumption 3: The number of range queries (M) requested within a region is proportional to its area (A): M = c*A

The cost is measured by A*D.

Cost Model: Brief Review

[Figure: points P1 and P2 with regions A1 and A12; the extents of the query window around each point are labeled qx/2 and qy/2]

\[
\mathrm{Cost} = \sum_{1 \le i < j \le n} w_{i,j}\,|\pi(i) - \pi(j)|
+ \sum_{1 \le i < j < k \le n} w_{i,j,k}\,[\max(\pi(i),\pi(j),\pi(k)) - \min(\pi(i),\pi(j),\pi(k))]
+ \dots
+ w_{1,2,\dots,n}\,[\max(\pi(1),\pi(2),\dots,\pi(n)) - \min(\pi(1),\pi(2),\dots,\pi(n))]
\]

\[
w_{i,j} = \sum_{(q_x,q_y) \in Q} \tilde{A}_{i,j}^{(q_x,q_y)}, \qquad
w_{i,j,k} = \sum_{(q_x,q_y) \in Q} \tilde{A}_{i,j,k}^{(q_x,q_y)}, \qquad \dots, \qquad
w_{1,2,\dots,n} = \sum_{(q_x,q_y) \in Q} \tilde{A}_{1,2,\dots,n}^{(q_x,q_y)}
\]

\[
\tilde{A}_{i} = A_{i} - \bigcup_{j=1,\ j \ne i}^{n} A_{i,j}, \qquad
\tilde{A}_{i,j} = A_{i,j} - \bigcup_{k=1,\ k \ne i,\ k \ne j}^{n} A_{i,j,k}, \qquad \dots, \qquad
\tilde{A}_{1,2,\dots,n} = A_{1,2,\dots,n}
\]
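As an illustration of the pairwise region A12 in the figure, here is a minimal Python sketch. It assumes (this is a reading of the figure, not something the slides state) that A_i is the qx-by-qy rectangle of query-window centres whose window contains point P_i, so that A_{i,j} is the intersection of two such rectangles. The function name `pair_overlap_area` is hypothetical, and the sketch computes only the plain intersection A_{i,j}, not the exclusive region \tilde{A}_{i,j}.

```python
def pair_overlap_area(p, q, qx, qy):
    """Area of the set of query-window centres whose window covers both p and q.

    Assumption: a window of size qx-by-qy centred at c covers a point when the
    point lies within qx/2 of c horizontally and qy/2 of c vertically.
    """
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    # Two qx-by-qy rectangles centred at p and q overlap in a rectangle of
    # width (qx - dx) and height (qy - dy), or not at all.
    return max(0.0, qx - dx) * max(0.0, qy - dy)


# Two points half a window apart horizontally share half of a window's area.
a12 = pair_overlap_area((0.50, 0.50), (0.55, 0.50), 0.1, 0.1)
```

Under Assumption 3 of the cost model, a weight proportional to such an area would stand for the number of queries that retrieve both points.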

Optimization: General Ideas

Requirements:

• Low cost: near-real-time scheduling (sequence 100-10000 nodes in 0-5 minutes)

• Trade-off between greedy and non-greedy methods (n! possible orderings)

MinLA algorithm of (Bar-Yehuda, 2001):
• Divide-and-conquer strategy
• Space complexity: $O(2^{\mathrm{depth}(T)}) = O(n)$
• Time complexity: $O\left(\sum_{t \in T} 2^{\mathrm{depth}(t)}\right) = O(n^2)$
• Examines $2^{n-1}$ orderings in $O(n^2)$ time

Optimization: General Ideas

Graph Minimum Linear Arrangement problem:

\[
la(G) = \sum_{(u,v) \in E} w(u,v)\,|\pi(u) - \pi(v)|
\]

\[
\max\{\pi(n_1), \pi(n_2), \dots, \pi(n_k)\} - \min\{\pi(n_1), \pi(n_2), \dots, \pi(n_k)\}
\]

Quadratic:

\[
g(L_2) = L - \frac{1}{L}\left[\frac{(L - L_2)(L - L_2 - 1)}{2}\right]
\]

Optimization: General Ideas

Observation: there is a monotonic relationship between $L_2$ and $g(L_2)$.

Motivation: use $L_2$ to approximate $g(L_2)$.

Assumption: an ordering optimized according to the definition of la(G), which is linear with respect to $L_2$, is also a good ordering under the cost model that is quadratic with respect to $L_2$.

\[
g(L_2) = L - \frac{1}{L}\left[\frac{(L - L_2)(L - L_2 - 1)}{2}\right]
\]
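The monotonic relationship can be checked numerically. The sketch below is illustrative only and rests on assumptions that are not in the slides: g is read as g(L2) = L - (1/L) * [(L - L2)(L - L2 - 1) / 2], L is taken to be the broadcast cycle length in slots, and L2 the number of slots spanned by the requested items. The client model in `g_by_enumeration` is also an assumption: a client that tunes in during the span must wait a full cycle, and otherwise waits for the next span to start and then reads it.

```python
from fractions import Fraction


def g(L, L2):
    """Closed form of the quadratic cost, as reconstructed from the slide."""
    gap = L - L2
    return L - Fraction(gap * (gap - 1), 2 * L)


def g_by_enumeration(L, L2):
    """Average access time over all L tune-in slots (assumed client model).

    The requested items occupy slots 0 .. L2-1 of every cycle.
    """
    total = 0
    for t in range(L):
        if t < L2:
            total += L             # tuned in during the span: wait a full cycle
        else:
            total += (L - t) + L2  # wait for the next span, then read it
    return Fraction(total, L)


L = 10
values = [g(L, L2) for L2 in range(1, L + 1)]
is_monotonic = all(a <= b for a, b in zip(values, values[1:]))
```

Under this assumed model the enumeration and the closed form agree, and g never decreases as L2 grows, which is the property the approximation relies on.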

Optimization: The Approximation Algorithm

[Figure: an example BDT with nodes labeled 0-11]

Optimization: The Approximation Algorithm

A BDT:

• A binary tree that has all the nodes in a graph as its leaf nodes

• Each internal node has two options: 0-orientation and 1-orientation

• The number of possible orderings is $2^{n-1}$ if the BDT is full and balanced

[Figure: 1-orientation (+) places the sub-trees in the order L, R; 0-orientation (-) places them in the order R, L]

Optimization: The Approximation Algorithm

The algorithm:

• Starts at the root of the BDT and recursively computes the costs of the two possible orientations of its two sub-trees.

• Keeps the orientation with the lower cost and discards the one with the higher cost.

• The orientations computed at the intermediate nodes of the BDT form an orientation tree that has the same structure as the BDT.

• The orientation tree determines an ordering of all the nodes in the graph.
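The search space described above can be sketched in Python. This is an exhaustive enumeration of every orientation of a small BDT, not the dynamic program of (Bar-Yehuda, 2001): it recomputes the full linear-arrangement cost for each candidate and does not use cuts or implicit positions. The tree encoding (nested 2-tuples with leaves as node labels), the function names, and the toy weights are all assumptions made for illustration.

```python
def orderings(tree):
    """Yield every leaf ordering obtainable by orienting each internal node."""
    if not isinstance(tree, tuple):
        yield [tree]  # a leaf has a single ordering
        return
    left, right = tree
    for l in orderings(left):
        for r in orderings(right):
            yield l + r  # left sub-tree first
            yield r + l  # right sub-tree first


def la_cost(order, weights):
    """Linear-arrangement cost: sum of w(u, v) * |pi(u) - pi(v)|."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(w * abs(pos[u] - pos[v]) for (u, v), w in weights.items())


def best_orientation(tree, weights):
    """Return the cheapest ordering among all orientations of the tree."""
    return min(orderings(tree), key=lambda o: la_cost(o, weights))


# Toy graph (hypothetical weights): a heavy edge between "a" and "c".
tree = (("a", "b"), ("c", "d"))
weights = {("a", "c"): 5, ("a", "b"): 1, ("c", "d"): 1, ("b", "d"): 1}
best = best_orientation(tree, weights)
```

For this four-leaf tree the enumeration visits 2^(4-1) = 8 orderings, and the cheapest ones place "a" next to "c".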

Optimization: The Approximation Algorithm

\[
\mathrm{Cost}_{V(t),L,R}[\pi] = \sum_{(u,v) \in E}
\begin{cases}
w(u,v)\,|\pi(u) - \pi(v)| & u \in V(t),\ v \in V(t) \\
w(u,v)\,\pi(u) & u \in V(t),\ v \in L \\
w(u,v)\,(|V(t)| - \pi(u)) & u \in V(t),\ v \in R \\
0 & \text{otherwise}
\end{cases}
\]

• In_cut: u ∈ V(t), v ∈ V(t)

• Left_cut: u ∈ V(t), v ∈ L

• Right_cut: u ∈ V(t), v ∈ R

[Figure: the node set V(t) placed between the node sets L and R]

Optimization: The Approximation Algorithm

Positions are used implicitly when computing the cost (the access time in our application), which is very efficient.

Optimization: An Example

[Figure: BDT with root T0, sub-trees T11 and T12, and leaves 1 2 3 4, next to the regions $\tilde{A}_{i}$, $\tilde{A}_{i,j}$, $\tilde{A}_{i,j,k}$]

In_cuts:

T11: {1,2}=22
T12: {3,4}=8

T0: {1,3}=2, {2,3}=38, {2,4}=3, {1,2,3}=14, {2,3,4}=4, total=61

Optimization: An Example (DBW)

[Figure: orderings of nodes 1-4 annotated with cut weights, used to evaluate one orientation of sub-tree T11]

Left_cut(1) = 2+14+22 = 38
Right_cut(1) = 0
Cost(1) = Left_cut(1) = 38

Left_cut(2) = 38+3+4 = 45
Right_cut(2) = 22
Cost(2) = Left_cut(2) = 45

Left_cut(T11) = 45+38-22 = 61
Right_cut(T11) = 22+0-22 = 0
Cost(T11) = 45+38+(22-22)*1+(38-22)*1 = 99

Optimization: An Example (DBW)

[Figure: orderings of nodes 1-4 annotated with cut weights, used to evaluate the other orientation of sub-tree T11]

Left_cut(1) = 2
Right_cut(1) = 22
Cost(1) = Left_cut(1) = 2

Left_cut(2) = 38+4+3+22+14 = 81
Right_cut(2) = 0
Cost(2) = Left_cut(2) = 81

Left_cut(T11) = 2+81-22 = 61
Right_cut(T11) = 22+0-22 = 0
Cost(T11) = 2+81+(22-22)*1+(81-22)*1 = 142

Optimization: An Example

T11: 1-orientation 99, 0-orientation 142

T12: 1-orientation 15, 0-orientation 66

T0 = 1-orientation: 99 + 15 = 114, ordering [4,3,2,1]

T11: 1-orientation 81, 0-orientation 38

T12: 1-orientation 127, 0-orientation 76

T0 = 0-orientation: 38 + 76 = 114, ordering [1,2,3,4]

These are the best among all 4! = 24 possible orderings.
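The claim about the 24 orderings can be checked by brute force. The sketch below is a check, not the approximation algorithm itself. It assumes that the In_cut values listed for T11, T12 and T0 in the example are the weights w of the corresponding node groups, and that each group is charged its weight times the span (maximum position minus minimum position) of its members, as in the cost model. The names `WEIGHTS` and `span_cost` are hypothetical.

```python
from itertools import permutations

# Group weights taken from the In_cut values of the example (assumed to be w).
WEIGHTS = {
    (1, 2): 22, (3, 4): 8,
    (1, 3): 2, (2, 3): 38, (2, 4): 3,
    (1, 2, 3): 14, (2, 3, 4): 4,
}


def span_cost(order, weights):
    """Sum of w * (max position - min position) over all weighted groups."""
    pos = {item: i for i, item in enumerate(order)}
    total = 0
    for group, w in weights.items():
        p = [pos[x] for x in group]
        total += w * (max(p) - min(p))
    return total


# Evaluate all 4! = 24 orderings and collect the cheapest ones.
costs = {order: span_cost(order, WEIGHTS) for order in permutations([1, 2, 3, 4])}
best = min(costs.values())
best_orders = sorted(o for o, c in costs.items() if c == best)
```

Under these assumptions the minimum cost is 114, reached only by [1,2,3,4] and its mirror image [4,3,2,1], which matches the two orderings found through the BDT.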

Optimization: An Example

How good is the approximation?

Answer from the example:

Experiments and Results: Generating Data Sets

• Five synthetic point data sets: 100/200/300/400/500 data points

• Data space: [0,1) × [0,1)

• Query window size: (0.1, 0.1)

Experiments and Results: Ordering Heuristics

• Hilbert Space Filling Curve Ordering

• R-Tree Traversal Ordering

[Figure: example ordering of six data items: 1 4 3 6 2 5]
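The Hilbert heuristic can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes points in [0,1) × [0,1) are snapped to an n-by-n grid (n a power of two, chosen arbitrarily here) and sorted by the standard iterative cell-to-distance conversion for the Hilbert curve. The function names and the default grid size are assumptions.

```python
def hilbert_index(n, x, y):
    """Hilbert-curve distance of grid cell (x, y) on an n x n grid.

    n must be a power of two; x and y are integers in 0 .. n-1.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_order(points, n=16):
    """Sort points in [0,1) x [0,1) by the Hilbert index of their grid cell."""
    def key(p):
        return hilbert_index(n, int(p[0] * n), int(p[1] * n))
    return sorted(points, key=key)


pts = [(0.9, 0.1), (0.1, 0.1), (0.1, 0.9), (0.9, 0.9)]
ordered = hilbert_order(pts, n=2)
```

Consecutive cells along the curve are always grid neighbours, which is why the ordering tends to keep spatially close points close together in the broadcast sequence.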

Experiments and Results: Comparison of Random Orderings

Experiments and Results: Comparison of Two Heuristics

Experiments and Results: Optimization of R-Tree Ordering

Conclusions

• We observed the structural similarity between the quadratic cost model we previously developed and the MinLA problem, as well as the monotonic relationship between the cost in terms of DPW+DBW and the DBW for a single query.

• We proposed using the access time of DBW to approximate the access time of DPW+DBW, which converts the optimization problem under the quadratic cost model into a MinLA optimization problem.

• Experiments with the optimization method on the five synthetic data sets showed that the optimized ordering is 21%-32% better than the average of 1000 random orderings under the quadratic cost model.

Future Work

• Extend our cost models to handle the access time of both the data channel and the index channel

• Explore more ordering heuristics as well as exact and/or approximate optimization methods

• Perform more experiments using both synthetic and real data sets with different sizes, distributions and densities to examine the effectiveness and scalability of the optimization methods


Thanks!

Questions?