
An Efficient Method for Indexing Future Positions of Continuously Moving Objects

ABSTRACT
Over the last decade, we have witnessed an increasing use of location-aware devices in a number of different applications. A critical and common requirement in many of these applications is to efficiently query the locations of moving objects. Indexing methods are a natural solution to support this key operation. Since these applications are characterized by a large number of continuously moving objects, a key requirement for indexing methods in this domain is to efficiently support both updates and queries. Previous work on indexing such databases can be broadly divided into two categories: indexing the past positions and indexing the future predicted positions. In this paper we focus on an efficient method for indexing the future positions of moving objects.

The indexing method proposed in this paper indexes the predicted trajectories in a dual-transformed space. Trajectories for objects in d-dimensional space become points in a higher-dimensional 2d-space. In this dual transformed space, a regular hierarchical grid decomposition indexing structure is used, which results in leaf nodes that are the size of a disk block and non-leaf nodes that are much smaller. An efficient technique is used for clustering the non-leaf nodes on disk pages. This new indexing method can evaluate a range of queries including timestamp, window, and moving queries. We have compared the performance of the proposed index with a recently proposed indexing method (TPR*-tree), and show that our approach has significant performance advantages for both updates and search queries.

1. INTRODUCTION
Past work on indexing predicted trajectories has produced a range of indexing structures that index the trajectories either in native space or in dual-transformed space. Most of the recent work

Importance of moving object database. Two classes of indexing issues, past and predicted trajectories. Need to be efficient w.r.t. updates and queries. A number of techniques. Our proposal based on dual transformation to a higher dimensional space. Trajectories in two-dimension become points in a four-dimensional space. The higher dimensional space is indexed using a regular hierarchical decomposition with leaf pages being the size of a disk block, and the internal nodes being small records that are clustered by the children

STRIPES = A Scalable Trajectory Index for Predicted Positions in Moving Object Databases. The STRIPES index maps queries into regions that look like stripes when projected onto a two-dimensional space.

2. BACKGROUND AND MODEL
<Prasad: can you take a crack at this section too, modeling it after the way the SETI paper introduces the background and model. The only caveat is we need to use the same terminology as is used in Section 4, see the notations table.> Explain the background of trajectories – represented as a sequence of moving points. Historical trajectories are a time series of points in the physical space.

Future trajectories – different models for the predicted trajectories. The Sistla model, which is widely used. We use that here too.

Explain the notion of horizon.

2.1 Query Types
Explain the query types

Figure 1: Query Examples for Objects Moving in a One-dimensional Space (axes: time t and position p; queries Q1, Q2, Q3; objects o1 to o5)

3. TPR AND TPR*-TREE INDICES
In this section we review two popular indices for predicted trajectories, the TPR-tree index [20] and the TPR*-tree index [26].

3.1 TPR-Tree
The TPR-tree [20] is essentially a time-parameterized R*-tree. The index stores the velocities of the elements along with their positions in nodes. Since the elements are not static, the corresponding MBRs are dynamic (see Figure 2).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Conference’04, Month 1–2, 2004, City, State, Country.
Copyright 2004 ACM 1-58113-000-0/00/0004…$5.00.


The index structure, as well as the algorithms used for search, insert, and delete, are very similar to those of the R*-tree [2]. The R*-tree uses a number of static parameters such as the area, perimeter, distance from the centroid, and the intersection between two MBRs. The TPR-tree uses time-parameterized metrics for these parameters. The time-parameterized metric is computed using the formula $\int_{T_0}^{T_0+H} M(t)\,dt$, where M(t) is some metric that is used in the original R*-tree (for example the area), and H is the lifetime of the index. The lifetime of an index is the time for which the index is used and queried.

Figure 2 shows an example of the time-parameterized area metric for four objects a, b, c, and d, moving in a two-dimensional space. The MBR of the index node at time T0 is labeled as A, and the MBR of the node at time T0+H is labeled as B. The size and the position of B are calculated by extrapolating the (position, velocity) of the entries within the node. The area metric used is then the volume of the trapezoid that is formed by moving the MBR of the node from time T0 to T0+H. All the other metrics are computed similarly.

Figure 2: Area Computation in the TPR-Tree: The actual data objects are shown as shaded boxes. The “area” computation is the volume of the three-dimensional shape shown in the figure.
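As an illustration of the time-parameterized area metric, the following minimal Python sketch integrates the area of a moving MBR over the index lifetime. It assumes a conservative bounding rectangle whose lower (upper) edges move with the minimum (maximum) velocity of the enclosed points, and it takes T0 = 0; the names `MovingPoint` and `integrated_area` are hypothetical and not part of any existing implementation.

```python
from dataclasses import dataclass


@dataclass
class MovingPoint:
    x: float   # position at time T0
    y: float
    vx: float  # velocity components
    vy: float


def integrated_area(points, horizon):
    """Integral of the moving-MBR area A(t) over [0, horizon].

    Assumption: the MBR at time t has width w0 + dw*t and height h0 + dh*t,
    where dw and dh are the spreads of the velocities (conservative bounding).
    """
    w0 = max(p.x for p in points) - min(p.x for p in points)
    h0 = max(p.y for p in points) - min(p.y for p in points)
    dw = max(p.vx for p in points) - min(p.vx for p in points)
    dh = max(p.vy for p in points) - min(p.vy for p in points)
    H = horizon
    # Closed form of the integral of (w0 + dw*t) * (h0 + dh*t) dt from 0 to H
    return w0 * h0 * H + (w0 * dh + h0 * dw) * H ** 2 / 2 + dw * dh * H ** 3 / 3
```

For two points whose MBR is 2 x 1 at T0 and whose width grows by one unit per time unit, the area is A(t) = 2 + t, so the metric over a horizon of 2 is 6.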

The insert algorithm chooses a node such that the expansion in volume is the smallest at non-leaf nodes and the expansion in integrated perimeter is the smallest at the leaf node level. When such a node is full, it is split as in the R*-tree. In addition to sorting the boundaries of the elements, the velocity vectors are also sorted to choose the best distribution of the elements.

The TPR-tree inherits all the issues related to the R*-tree, such as overlap and dead space. Since the positions and the velocities are estimated and can change, the optimal combination of elements cannot be maintained at all times in the future.

3.2 TPR*-Tree
The recently proposed TPR*-tree [26] provides a number of optimizations to the basic TPR-tree algorithms.

<Prasad: This part still needs some work, as it needs to clearly convey the key insight of the TPR*-tree algorithm regarding using the PQ to find the optimal leaf. Feel free to use a figure to explain this clearly.> The insert algorithm of the TPR*-tree recognizes that a locally optimal choice at one level may be the result of a broken tie (two elements have the same deterioration), and that the children of the chosen element may not be optimal. It therefore traverses down the tree to the leaf level while maintaining a priority queue of deteriorations, so that an optimal node can be determined. The authors argue that the extra cost incurred in traversing can be offset by the benefits of finding an optimal node for insertion. This algorithm leads to a tighter packing of elements in nodes and thus better query performance.
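The priority-queue idea can be sketched as a best-first descent. The Python fragment below is a simplified illustration and not the algorithm of [26] itself: it assumes that each node carries a precomputed, non-negative deterioration `cost` for accepting the new entry, and the names `Node` and `choose_leaf` are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cost: float  # assumed deterioration if the new entry is routed through this node
    children: list = field(default_factory=list)


def choose_leaf(root):
    """Return the leaf with the smallest accumulated deterioration and that total."""
    heap = [(0.0, 0, root)]  # (accumulated deterioration, tie-breaker, node)
    counter = 1
    while heap:
        total, _, node = heapq.heappop(heap)
        if not node.children:
            # With non-negative costs, the first leaf popped is optimal.
            return node, total
        for child in node.children:
            heapq.heappush(heap, (total + child.cost, counter, child))
            counter += 1
    raise ValueError("empty tree")
```

If two subtrees tie at the first level, a greedy descent commits to one of them, whereas the queue keeps both candidates alive until a leaf with the smallest accumulated deterioration is reached.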

The TPR*-tree handles overflowing nodes by first forcing a reinsert and then splitting the node. The entries are first sorted along all the 8 (4*d) possible dimensions, and the first (=30%) entries from the best possible sort are chosen for reinsertion. If a node overflows during the reinsertion, the node is split. The authors propose a heuristic to reduce the number of sorts to just one, by recognizing that the elements at leaf nodes can be assumed to be uniformly distributed, and that the largest extent over all the dimensions (positions and velocities) would give the best benefit.

The authors thus propose two complex algorithms that optimize the TPR-tree algorithms at the cost of extra I/O and CPU. The authors of the TPR*-tree also propose a cost model as well as a hypothetical optimal tree for predictive indices that use the TPR-tree style of indexing. They show that their proposed algorithmic improvement produces an index structure that is close to the optimal index. Consequently, one can conclude that the TPR*-tree is currently the best known practical indexing technique for predicted trajectories.

Figure 3: TPR*-tree figure goes here

4. STRIPES
This section presents the techniques to represent moving objects, the index structure, and the algorithms used to query, insert, and delete data points in STRIPES. We start with one-dimensional moving objects, and then extend to two dimensions and further to d dimensions. In this section, we will use the notations described in Table 1.

Table 1. Notations

Notation    Description
d           Number of dimensions in real space
D           Number of dimensions in dual (transformed) space (=2d)
L           Index lifetime
            Reference time, a.k.a. initialization time of an index
            Vector of maximum velocities
            Maximum absolute velocity value in dimension
            Maximum position value of a moving object in the dimension
            Position vector of an object in original space
            Velocity vector of an object in original space
            Reference position of an object in dimension of original space
            Position of an object in dimension of original space
            Velocity of an object in dimension of original space
            Position vector of an object in transformed dual space
            Velocity vector of an object in transformed dual space
            Reference position of an object in plane of transformed dual space
            Total number of mobile objects
            Page capacity (number of objects per page)
=           Minimum number of pages to store the mobile objects
K           Number of objects reported by a query
            Minimum number of I/O’s to report the query answer
f           Non-leaf node fanout

4.1 Representing Moving Objects using Dual Transform
The STRIPES index represents moving objects in a dual transformed space.

The basic idea of a dual transform in this case is to transform a linear trajectory defined by equation in d+1-dimensional space (time being the additional dimension) into a point ( ) in 2d-dimensional dual space, where and are the transformed velocity and reference position vectors. We incorporate both negative and positive values for velocity by applying the following transform: given , the velocity and position vectors of an object, the corresponding transformed velocity and reference position vectors are calculated as follows:

Thus the range for is and the range for is .

Since time is monotonically increasing, the value of is not bounded, and also in the discussion of queries in the next section, the slope of the lines bounding query regions can be potentially infinite as time advances far into the future relative to . This presents a potential problem with the accuracy and functionality of the index structure.

To solve this problem, we assume that objects must (i) issue an update when they cross over the boundaries and (ii) issue an update periodically to keep alive with the system. We also employ a dual-index system in which we keep two distinct index structures, starting from time . We empirically assign the same lifetime value to each of the index structures, which also becomes the enforced update period. We then enforce the following rules: starting from , the reference time of the first index is , and it indexes objects that issue updates within the period , and the second index has a reference time and contains the objects that issue updates within the period . Since an update consists of the deletion of the old entry and the insertion of the new entry, when an update with timestamp > comes in, we are sure that the first index is either empty or expired. It is empty if all objects in it have issued an update with a timestamp within , which, given the periodic update rule, means that they have been removed from the first index and inserted into the second one. Otherwise, it contains only objects that failed to issue an update within the required time frame, which we treat as expired. At this time, we clear the first index structure, update its to be , and insert the new updates with timestamps within , and so on. This way, each time we create a new index structure we remove an empty or expired one. It is easy to see that now the range for in each of the indexes is . To simplify the computation of index entry coordinates, we add to at transform time and convert the range to , thus the transform equation becomes:

And the linear motion equation becomes:
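A minimal Python sketch of the shifted transform and of the resulting motion equation follows. It assumes, per dimension, V = v + v_max and P_ref = p - v(t - t_ref) + v_max L, which keeps V within [0, 2 v_max] and P_ref within [0, p_max + 2 v_max L], the coordinate ranges labelled on the axes of Figures 4 to 8. These exact forms, and the names `to_dual` and `position_at`, are assumptions made for illustration.

```python
def to_dual(p, v, t, t_ref, v_max, lifetime):
    """Map position p and velocity v, reported at time t, to a dual point (V, P_ref).

    Assumed transform (per dimension i):
        V[i]     = v[i] + v_max[i]
        P_ref[i] = p[i] - v[i] * (t - t_ref) + v_max[i] * lifetime
    """
    V = [vi + vm for vi, vm in zip(v, v_max)]
    P_ref = [pi - vi * (t - t_ref) + vm * lifetime
             for pi, vi, vm in zip(p, v, v_max)]
    return V, P_ref


def position_at(V, P_ref, t, t_ref, v_max, lifetime):
    """Inverse view: predicted position at time t from the dual point."""
    return [pr - vm * lifetime + (Vi - vm) * (t - t_ref)
            for Vi, pr, vm in zip(V, P_ref, v_max)]
```

For example, a one-dimensional object at position 100 with velocity -2, reporting at t = 30 to an index with t_ref = 0, v_max = 3, and L = 120, maps to the dual point (1, 520), and `position_at` recovers position 100 at t = 30.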

4.2 Index Structure
The STRIPES index is essentially a disk-based multi-dimensional bucket quadtree. Each of the dual planes is equally partitioned into 4 quads, resulting in altogether partitions that we call grids; the fanout of non-leaf nodes is thus . Each leaf node consists of an integer level indicator (the root node has level 0), a pointer to its parent non-leaf node, the grid it is associated with, and an array of objects, each of which consists of a unique object ID and the transformed tuple. Each non-leaf node consists of an integer level indicator, a pointer to its parent non-leaf node (that of the root node is null), and an array of length of pointers to its child nodes, whether they are non-leaf or leaf nodes.

Note that the grids are not strict-sense enclosed hyper-cubes; rather, they consist of a series of quads from the two-dimensional planes. Thus each grid is uniquely identified by the tuple , where ( ) is the vector of velocity (reference position) coordinates of the leftmost (lowest) vertex of the quads, and ( ) is the vector of side lengths along the velocity (reference position) axis of the quads. Each grid is also identified locally within its parent grid by its local index. Child nodes of a non-leaf node are ordered in a generalized row-major fashion, with their local indexes being the array index in their parent node. In the algorithms section, we discuss how child node coordinates are assigned given a local index, and how the child node is identified given an object to insert or delete.

4.3 Insertion
Being a dynamic index structure, STRIPES allows the insertion of objects on the fly.

We first discuss the algorithm used to find the index and target leaf node given the object to insert into the tree.

Given the tuple of a moving object, we first obtain the tuple using the transform algorithm discussed in Section 4.1. Then, starting from the root node, we recursively identify the next-level node to insert into by calculating the local index of the target node using the following formula:

where , , , and are the velocity, reference position, velocity side length, and reference position side length parameters in the dual plane.

The recursion terminates when either of the following two cases occurs: i) the target leaf node is non-existent; ii) the target leaf node is found. Since case ii) has two sub-cases, depending on whether the leaf node is full or not, we arrive at the following three cases for insertion.

Case 1: the target leaf node is non-existent.

Case 2: the target leaf node is found and not full.

Case 3: the target leaf node is found and is full.

We discuss the above 3 cases separately.

In case 1, a new leaf node is created into which the new entry is inserted. The parameters of the grid of the new leaf node are determined as follows.

(Note: Multiplications and divisions between vectors above denote element-wise multiplications and divisions that result in new vectors whose elements are the products or quotients of the corresponding elements of the two vectors involved in the operation.)

Here, compose the parameters of the newly created leaf node; comes from the new entry to be inserted; and come from the parameters of the current node.

In case 2, the object is directly inserted into the leaf node.

In case 3, a split operation is performed, where the target leaf node is upgraded to a non-leaf node, and new leaf nodes are created, following the same process defined in case 1, into which the data entries are inserted, including the newly arrived entry.

One thing to note is that new nodes are only created when necessary, i.e., when there are entries that need to be inserted into the new nodes.
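The three insertion cases can be sketched as follows in Python. This is an in-memory illustration under stated assumptions: the local index is taken to be one base-4 digit per dual plane (one possible reading of the generalized row-major ordering), leaf capacity is a small constant rather than a disk page, and the names `Grid`, `Node`, `local_index`, `child_grid`, and `insert` are hypothetical. The sketch also assumes that the inserted dual points are distinct, since identical points can never be separated by a split.

```python
from dataclasses import dataclass

CAPACITY = 2  # illustrative leaf capacity; a real leaf would fill one disk page


@dataclass
class Grid:
    v0: list  # lower velocity coordinate of the quad in each dual plane
    p0: list  # lower reference-position coordinate in each dual plane
    sv: list  # side length along the velocity axis in each plane
    sp: list  # side length along the reference-position axis in each plane


@dataclass
class Node:
    grid: Grid
    entries: list = None   # leaf: list of (oid, V, P_ref) tuples
    children: dict = None  # non-leaf: local index -> child Node


def local_index(grid, entry):
    """Assumed local index: one base-4 digit (quad number) per dual plane."""
    _, V, P = entry
    idx = 0
    for i in range(len(V)):
        hi_v = V[i] >= grid.v0[i] + grid.sv[i] / 2
        hi_p = P[i] >= grid.p0[i] + grid.sp[i] / 2
        idx += (2 * hi_p + hi_v) * 4 ** i
    return idx


def child_grid(grid, idx):
    """Grid parameters of the child with the given local index."""
    v0, p0, sv, sp = [], [], [], []
    for i in range(len(grid.v0)):
        quad = (idx // 4 ** i) % 4
        sv.append(grid.sv[i] / 2)
        sp.append(grid.sp[i] / 2)
        v0.append(grid.v0[i] + (quad % 2) * sv[i])
        p0.append(grid.p0[i] + (quad // 2) * sp[i])
    return Grid(v0, p0, sv, sp)


def insert(node, entry):
    while node.children is not None:      # descend through non-leaf nodes
        idx = local_index(node.grid, entry)
        if idx not in node.children:      # case 1: create the leaf on demand
            node.children[idx] = Node(child_grid(node.grid, idx), entries=[])
        node = node.children[idx]
    if len(node.entries) < CAPACITY:      # case 2: leaf has room
        node.entries.append(entry)
        return
    pending = node.entries + [entry]      # case 3: upgrade the leaf and redistribute
    node.entries, node.children = None, {}
    for e in pending:
        insert(node, e)
```

Because children are stored in a dictionary keyed by local index, child leaves exist only for the quads that actually received entries, which mirrors the on-demand node creation described above.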


Next we discuss deletion of objects from the index.

4.4 Deletion
Deletion of objects is categorized into expiration deletion, when objects are considered expired by the system, and update deletion, when objects update their motion parameters. Expiration deletion is implicit in the index: in our dual-index structure, when a new index is to be created, the old index is either empty or contains only expired objects, so we delete the expired index as a whole and do not need to locate individual objects within it. As a result, we consider only update deletion, where we assume that objects send in their updated motion parameters together with the old parameters, which are used to locate the old entry of the object in the index. In the case where the old entry cannot be found due to expiration deletion, the object issuing the update is treated as a new object, is inserted into the index, and is assigned a new unique ID.

Given the above, we assume that for each deletion, we are able to locate the target entry to delete, using the same locating algorithm described in the insertion section above.

At deletion time, we check whether a non-leaf node is under-filled, defined as whether the number of objects contained within this node is less than or equal to the capacity of a leaf node. The following two cases apply:

Case 1: the non-leaf node is not under-filled, then we directly delete the target entry.

Case 2: the non-leaf node is under-filled, then we collect all the entries within this node and down-grade the non-leaf node to a leaf node, and re-insert the entries into the new leaf node, and delete the target entry.

The rationale for this pre-processing approach, as opposed to a post-processing approach in which the under-fullness of a non-leaf node is checked after the deletion, is as follows. With post-processing, we might leave a leaf node with a number of entries exactly equal to its capacity, so that the next insert requires an additional split. With pre-processing, we always leave at least one free slot in the leaf node, which acts as a buffer to absorb one more entry before splitting, thus avoiding the additional I/O cost of extraneous splits.

4.5 Update
Updates issued by objects contain the tuple and are performed as a deletion followed by an insertion, after determining which of the two indexes the old and new entries belong to, respectively, by examining and .
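The routing of an update to one of the two indexes can be sketched as below. The sketch assumes that time is divided into consecutive slots of length L starting at the initial time, that the slot number modulo 2 selects the index, and that an index is cleared wholesale when it is reused for a newer slot. Plain dictionaries stand in for the two STRIPES trees, and the class name `DualIndex` is hypothetical.

```python
class DualIndex:
    """Two rotating indexes; plain dicts stand in for the two STRIPES trees."""

    def __init__(self, t0, lifetime):
        self.t0 = t0
        self.lifetime = lifetime
        self.slots = [0, 1]    # lifetime slot currently owned by each index
        self.trees = [{}, {}]  # oid -> motion parameters

    def _slot(self, t):
        return int((t - self.t0) // self.lifetime)

    def _tree_for_insert(self, t):
        slot = self._slot(t)
        which = slot % 2
        if slot > self.slots[which]:
            # The index still holds an older slot: everything left in it has expired.
            self.trees[which].clear()
            self.slots[which] = slot
        return self.trees[which]

    def update(self, oid, t_old, t_new, params):
        """Delete the old entry (if it has not expired), then insert the new one."""
        if t_old is not None:
            old_slot = self._slot(t_old)
            if self.slots[old_slot % 2] == old_slot:
                self.trees[old_slot % 2].pop(oid, None)
        self._tree_for_insert(t_new)[oid] = params
```

Under these assumptions, the reference time of the index that owns slot k is the initial time plus k times L, and an object that never updates within a lifetime silently disappears when its index is recycled.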

4.6 Queries
We consider three types of queries: time-slice query, window query, and moving query.

Let < , < , and < be the vectors of lower bounds and upper bounds in position, and , < be three time instants not earlier than current time, then the three types of queries are defined as follows:

Time-slice query: specifies a hyper-rectangle bounded by [ , ] at time .

Window query: specifies a hyper-rectangle bounded by [ , ] that covers the time interval , i.e., this query retrieves points with trajectories in space crossing the (d+1)-dimensional hyper-rectangle .

Moving query: specifies the (d+1)-dimensional trapezoid obtained by connecting the hyper-rectangle bounded by at time and the hyper-rectangle bounded by at time . Figure 1 illustrates the three query types on one-dimensional data.

In Figure 1, Q1 is a time-slice query that returns object o1, Q2 is a window query that returns objects o2 and o3, and Q3 is a moving query that returns objects o4 and o5.

To be as general as possible, we assume all queries are moving queries, i.e., all queries issued are of the form . Then they are transformed into the following set of inequalities:

Eq. 1

4.6.1 Time-slice Queries
For time-slice queries, , , and , Eq. 1 effectively becomes:

Eq. 2

The query region for the one-dimensional time-slice query Q1 shown in Figure 1 is illustrated in Figure 4.


Figure 4: Transformed One-dimensional Time-slice Query: This example uses query Q1 from Figure 1. (Axes: V, marked at 0, v_max, and 2 v_max; P_ref, marked at p_max + 2 v_max L; query bounds p_l and p_u.)
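To make the shape of a time-slice query region concrete, the following Python sketch tests a dual point against a time-slice query. It assumes that the predicted position in each dimension is P_ref - v_max L + (V - v_max)(t - t_ref), i.e. the inverse of a transform that shifts the velocity by v_max and the reference position by v_max L; this form and the function names are assumptions for illustration. Under this assumption the qualifying points in each dual plane lie between two parallel lines whose slope depends on the query time, which produces the stripe-like region.

```python
def predicted_position(V, P_ref, t, t_ref, v_max, lifetime):
    """Assumed inverse transform for one dimension."""
    return P_ref - v_max * lifetime + (V - v_max) * (t - t_ref)


def in_time_slice(V, P_ref, p_lo, p_hi, t_q, t_ref, v_max, lifetime):
    """True if the object is inside [p_lo, p_hi] in every dimension at time t_q."""
    for i in range(len(V)):
        pos = predicted_position(V[i], P_ref[i], t_q, t_ref, v_max[i], lifetime)
        if not (p_lo[i] <= pos <= p_hi[i]):
            return False
    return True
```

A brute-force scan with this predicate is also a convenient way to check the answers returned by an index during testing.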

4.6.2 Window Queries
For window queries, , , Eq. 1 effectively becomes:

Eq. 3

The query region for the one-dimensional window query Q2 from Figure 1 is illustrated in Figure 5.

Figure 5: Transformed One-dimensional Window Query: This example uses query Q2 from Figure 1. (Axes: V, marked at 0, v_max, and 2 v_max; P_ref, marked at p_max + 2 v_max L; query bounds p_l and p_u.)

4.6.3 Moving Query
Figure 6 illustrates the query region for a one-dimensional moving query, using query Q3 from Figure 1 as an example.

Figure 6: Transformed One-dimensional Moving Query: This example uses query Q3 from Figure 1. (Axes: V, marked at 0, v_max, and 2 v_max; P_ref, marked at p_max + 2 v_max L; query bounds p_l1, p_u1, p_l2, and p_u2; region corner points U1, U2, U3, L1, L2, and L3.)

In all cases, the query region is a bounded polygon that is confined within an upper bound and a lower bound. Note that the upper bound and the lower bound are not necessarily straight lines (Figure 5). Thus we define the query region with six points, U1, U2, U3, L1, L2, and L3 in Figure 6, among which the four marginal points U1, U3, L1, and L3 are obtained by calculating the intersections of the following lines with the boundaries.

L2 is obtained by calculating the intersection of the following set of lines:

U2 is obtained by calculating the intersection of the following set of lines:

In the case where either L2 or U2 is outside the boundaries, the end points are used.

As a result, a query body consists of such distinctive query regions corresponding to the dual transformed planes.

4.6.4 STRIPES Search Algorithm
Queries are processed in STRIPES as follows. At level l, each of the f grids is tested for its relative position to the query body, which is in turn performed as a conjunction of d two-dimensional relative position tests between data regions and the corresponding query region. Relative positions include INSIDE, OVERLAP, and DISJUNCT. A grid is INSIDE a query body if and only if all the sub-queries return INSIDE; it is DISJUNCT as soon as one of the sub-queries returns DISJUNCT; otherwise OVERLAP is returned. For all the grids that return an INSIDE result, we immediately retrieve the entries within. DISJUNCT results are discarded and OVERLAP results are further probed recursively. Figure 7 shows the algorithm for the relative position test between a data region and a query region. Figure 8 shows the relative positions between data regions and the query region. As shown in Figure 8, R3 is DISJUNCT to the query region, while R2 is INSIDE the query region and R1 OVERLAPs the query region.

Figure 7: Algorithm to Test the Relative Positions of Data and Query Regions

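Figure 8: Relative Positions of Data and Query Regions for One-dimensional Points (Axes: V, marked at 0, v_max, and 2 v_max; P_ref, marked at p_max + 2 v_max L; query region corner points U1, U2, U3, L1, L2, and L3; data regions R1, R2, and R3.)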

5. EXPERIMENTAL EVALUATION
In this section, we present results comparing the performance of the STRIPES index and the TPR*-tree index.

5.1 Implementation Details and Experimental Platform
We implement both STRIPES and the TPR*-tree [26] on top of SHORE [4]. The storage manager uses a 4KB page size, and we set the buffer pool size to 2048 pages (we followed the same philosophy as in previous studies [11, 20, 26] and use a small buffer pool size to keep the experiments manageable). The TPR*-tree is implemented using the algorithms described in [26]. <Prasad: we need to say something here about the sorting implementation and the pick worst algorithms. (Essentially need to say something here that convinces the reader that we have paid attention to the details).>

The experimental platform used in these experiments is a 2 GHz Intel Xeon machine with a 512KB L2 cache and a 40GB Western Digital 7200 RPM IDE hard drive, running Red Hat Linux 9.

5.2 Data Sets and Workload
We generated data sets using the GSTD data generator [28], which has also been used extensively in previous studies. The GSTD data generator has a number of different parameters to vary the distribution of the data, control …

Describe the workload generator, the mix of updates and queries. Workloads: 2-D space and a time dimension

Update-Queries: 80-20, 50-50, 20-80

Page Size 4KB

Buffer pool size: 2048 pages (As in previous studies to keep the experiments manageable, we choose a relatively small buffer pool size.)

Default number of moving object: 500K

Horizon: 1200?

Range:?

Number of different parameters. Present results using the default parameters, which are representative of the other settings that we tried.

Default parameters: query mix is 60% time-stamp, 20% window queries, and 20% moving window queries. The area of the queries on the spatial dimensions is 0.025% of the entire spatial area, and

The experiments are based on workloads that contain both updates and queries to simulate real index usage across a period of time. Also, at the beginning of each workload, the parameters of objects at time 0 are bulkloaded.

For workload generation we employ the workload generator described in [ŠALTENIS et al. 2000]. We briefly describe it here. The generator allows for both uniform data generation and workloads where two-dimensional objects move in a network of routes connecting a number of destinations. The parameters that can be varied are: N, the number of moving objects [cardinal number]; ND, number of destinations [cardinal number]; UI, update interval length [time units]; W, querying window size [time units]; and QS, query size [% of the data space].

In uniform workloads, the initial positions of objects are uniformly distributed in space. The directions of the velocity vectors are assigned randomly, both initially and on each update. The speeds are uniformly distributed between 0 and 3 km/min. The time interval between successive updates is uniformly distributed between 0 and 2UI.

In skewed workloads, ND destinations are distributed randomly in a two-dimensional space and serve as the vertices in a fully connected graph of routes. Initially, objects are placed at random positions on routes. The objects are assigned with equal probability to one of three groups of points with maximum speeds of 0.75, 1.5, and 3 km/min. During the first sixth of a route, objects accelerate from zero speed to their maximum speeds; during the middle two thirds, they travel at their maximum speeds; and during the last sixth of a route, they decelerate. When an object reaches its destination, a new destination is assigned to it at random.

In all workloads used in the experiments, the simulated objects move in a bounded square region of dimensions 1000 × 1000 kilometers. We vary N from 10,000 to 50,000, with an increment of 5,000. UI is set to 60 minutes in all cases. We set ND=20 for all skewed workloads. No objects disappear, and no new objects appear for the duration of the simulation. Each of the workloads is run for 600 time units (minutes). For UI=60, this corresponds to a number of updates equal to 10 times the cardinality of the data.

In addition to updates, queries are also generated in the workloads. Four queries are generated within each time unit, thus resulting in a total of 2400 queries for each of the workloads. Time-slice, window, and moving queries are generated with probabilities 0.6, 0.2, and 0.2. The queries start at current time and have random intervals with length no greater than W. The spatial extents of the queries are squares occupying a fraction QS of the space (QS=0.25% in all experiments). Time-slice and window queries have random locations, while for moving queries, the center of a query follows the trajectory of one of the points currently in the index.
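The query part of the workload can be sketched in Python as follows. This is not the generator of [ŠALTENIS et al. 2000]; it is a simplified illustration that assumes a hypothetical value for W, places time-slice queries at the current time, and gives every query a random location (moving queries that follow an object's trajectory are not modeled). The function name `generate_queries` and its parameters are assumptions.

```python
import math
import random


def generate_queries(duration=600, per_unit=4, qs=0.0025, extent=1000.0,
                     window=40.0, seed=0):
    """Generate (kind, t_start, t_end, square) tuples.

    qs is the fraction of the data space covered by each square query;
    window (W) is an illustrative value, not one taken from the experiments.
    """
    rng = random.Random(seed)
    side = math.sqrt(qs) * extent  # side length of a square covering fraction qs
    kinds = ["time-slice", "window", "moving"]
    queries = []
    for t in range(duration):
        for _ in range(per_unit):
            kind = rng.choices(kinds, weights=[0.6, 0.2, 0.2])[0]
            length = 0.0 if kind == "time-slice" else rng.uniform(0.0, window)
            x = rng.uniform(0.0, extent - side)
            y = rng.uniform(0.0, extent - side)
            queries.append((kind, t, t + length, (x, y, x + side, y + side)))
    return queries
```

With the default arguments the sketch produces 4 × 600 = 2400 queries, and QS = 0.25% of a 1000 × 1000 km space corresponds to squares with a side of 50 km.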

In the experiments, we employ the dual-index technique for both STRIPES and the TPR*-tree, with the index lifetime L set to 2UI=120 minutes.

5.3 Effect of Workload Mix
In one graph draw a line graph with the total execution time for the three workloads for STRIPES and the TPR*-tree index.

In a second bar graph breakup the cost of the workload into total IO and CPU components.

Draw a third bar graph with only the I/O and CPU breakdown for queries

Draw a fourth bar graph with only the I/O and CPU breakdown for queries

5.3.1 Update performance
Discuss the update performance.

5.3.2 Query performance
Discuss the query performance

5.4 Scaling with Increasing Number of Users
For 20-80 and 80-20 workload plot the total I/O and CPU components in a bar graph for 100K, 500K and 900K users.

5.5 Effect of Data Skew
Show that STRIPES can effectively deal with data skew

5.6 Summary
Summarize the experimental evaluations.

6. RELATED WORK
Within the broader context of indexing trajectories for moving objects, there are two broad classes of related work. The first comprises methods for indexing the historical trajectories and the current locations of moving objects. The second, and more closely connected, class of research is on indexing the predicted locations of moving objects. The methods for indexing the past and the current locations are typically concerned with queries on exact trajectory points, whereas methods for indexing the future locations are concerned primarily with indexing the parameters of the predicted trajectory representations, which typically include a velocity vector and a start position vector. However, both these classes of indices are concerned with efficient indexing mechanisms for supporting fast updates and queries on spatial representations of the trajectories. In the next paragraph we briefly review the methods for indexing the past trajectory locations, and then turn our attention to the more closely related work on indexing predicted trajectories.

Most of the work on indexing the past locations of trajectories is based on variations of the R-tree [6] and the R*-tree [2]. These methods include the 3-D R-tree [29], which simply treats time as a third dimension. The MR-tree [31] and the HR-tree [13] are also 3-D R-tree structures, and maintain a separate R-tree for each time stamp. The MV3R-tree [25] is a hybrid structure that uses a multi-version R-tree (MVR) for time-stamp queries and a small 3-D R-tree for time-interval queries. This indexing structure has been shown to outperform other historical trajectory indexing structures, such as the popular TB-tree [15]. In the area of indexing historical trajectories, SEB [24] and SETI [5] are two additional indexing techniques that partition the spatial extents and build indices on the temporal dimension. A number of indexing methods have also focused on efficiently indexing the current locations of moving objects [9, 10, 14, 23]. None of the methods described in this paragraph is concerned with indexing predicted locations, and all of them index the native space of the trajectories. In contrast, STRIPES indexes the predicted locations in a dual-transformed space.

Two main approaches have been used for indexing the predicted locations of trajectories: a) methods that index the predicted trajectories in the original spatial and temporal dimensions, and b) methods that transform the predicted trajectories into a dual-transformed space and index the spatial representation in that space.

One of the early works on indexing predicted trajectories is by Tayeb et al. [27], in which trajectories in a d-dimensional space are treated as lines in a (d+1)-dimensional space (time is the additional dimension). The lines are then indexed using a PMR quadtree [21]. The drawbacks of this approach are that the index may have excessive dead space, since it is indexing high-dimensional lines, and that there could be a lot of replication in the index. However, this work by Tayeb et al., carried out within the context of the MOST [22] project, strongly influenced and stimulated interest in methods for querying moving object databases.

The TPR-tree [20] is a popular indexing structure for indexing predicted trajectories. This index structure uses the basic R-tree indexing structure and extends the notion of bounding boxes to time-parameterized bounding boxes. A time-parameterized bounding box can be viewed as a spatial bounding box whose extents are functions of time, and has also been used by other related indexing structures [3, 16]. One of the problems with time-parameterized boxes is that estimating them requires reasoning about the positions of the objects enclosed by the box over some period of time. The original TPR-tree paper [20] used a conservative bounding box, but this has been improved in a number of different ways [17-19], often by exploiting additional parameters such as expiration times or the maximum speed. The TPR*-tree [26] is an index structure that improves on the methods proposed in the original TPR-tree, and has been shown to be significantly faster than the TPR-tree. In this paper we compare STRIPES with the TPR*-tree, and show that STRIPES outperforms the TPR*-tree.
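To make the idea of a conservative time-parameterized bounding box concrete, the following minimal sketch bounds a set of linearly moving points by a box whose lower edges move with the minimum velocity and whose upper edges move with the maximum velocity in each dimension. The class and function names (MovingPoint, TPBox, conservative_box) are hypothetical, and the sketch is only an illustration of the general technique, not the exact formulation used in [20].

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec = Tuple[float, ...]


@dataclass
class MovingPoint:
    pos: Vec  # position at the reference time
    vel: Vec  # constant velocity per dimension


@dataclass
class TPBox:
    lo: Vec    # lower corner at the reference time
    hi: Vec    # upper corner at the reference time
    vlo: Vec   # velocity of the lower edges (minimum over the points)
    vhi: Vec   # velocity of the upper edges (maximum over the points)

    def at(self, dt: float) -> Tuple[Vec, Vec]:
        """Box extents dt >= 0 time units after the reference time."""
        lo = tuple(l + v * dt for l, v in zip(self.lo, self.vlo))
        hi = tuple(h + v * dt for h, v in zip(self.hi, self.vhi))
        return lo, hi


def conservative_box(points: Sequence[MovingPoint]) -> TPBox:
    """Bound all points for every dt >= 0 (never shrinks, hence conservative)."""
    d = len(points[0].pos)
    lo = tuple(min(p.pos[i] for p in points) for i in range(d))
    hi = tuple(max(p.pos[i] for p in points) for i in range(d))
    vlo = tuple(min(p.vel[i] for p in points) for i in range(d))
    vhi = tuple(max(p.vel[i] for p in points) for i in range(d))
    return TPBox(lo, hi, vlo, vhi)
```

Because the edges use the extreme velocities, such a box encloses every point at all future times, but it can grow much faster than the points actually spread, which is the dead-space problem that later proposals try to reduce.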

Dual transformation techniques have also been proposed for indexing predicted trajectories [30]. These indexing methods include the kinetic data structure [1], the R-tree based parameterized space indexing method [16], and the SV-model [7]. Perhaps the most popular approach that uses dual transformation techniques for indexing predicted trajectories is the work by Kollios et al. [8]. In this work, the authors derive lower bounds on the cost of answering predictive queries using dual transformation. Most of the paper is concerned with objects moving in one-dimensional space, and the paper sketches extensions to higher-dimensional spaces. In addition, the paper only considers window queries. This largely theoretical approach has served as the basis for some of the choices made in the TPR-tree [20], but has largely been dismissed by recent work that takes a more systems-oriented approach to investigating this area [20, 26]. The dual-transformation method used in the STRIPES index is based on the Hough-X transform that is used in the work by Kollios et al. [8]. STRIPES can also handle moving window queries, and we show that STRIPES vastly outperforms the best currently known methods for indexing predicted trajectories.
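As an illustration of the kind of dual transform used by Kollios et al. [8], the sketch below maps a one-dimensional linear trajectory to a point (velocity, reference position) and evaluates a time-slice range predicate on that dual point. The function names and the choice of reference time 0 are assumptions made for the example; they are not the STRIPES implementation.

```python
def to_dual(x, v, t, t_ref=0.0):
    """Map a 1-D trajectory observed at position x, velocity v, time t
    to the dual point (v, p_ref), where p_ref is the position at t_ref."""
    return (v, x - v * (t - t_ref))


def matches_time_slice(dual_point, x_lo, x_hi, t_q, t_ref=0.0):
    """True if the object is predicted to lie in [x_lo, x_hi] at time t_q.

    In the dual plane this is the set of points between the two parallel
    lines p_ref = x_lo - v * (t_q - t_ref) and p_ref = x_hi - v * (t_q - t_ref).
    """
    v, p_ref = dual_point
    x_at_q = p_ref + v * (t_q - t_ref)
    return x_lo <= x_at_q <= x_hi
```

For example, an object seen at x = 10 with velocity 2 at time 3 maps to the dual point (2, 4), and it satisfies a time-slice query over [13, 15] at time 5 because its predicted position is 14. In d dimensions the same mapping is applied per dimension, so each trajectory becomes a point in a 2d-dimensional space.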

For a more detailed overview of related work in this area, the reader is directed to a comprehensive recent review [12].

7. CONCLUSIONS AND FUTURE WORK

8. REFERENCES

[1] Agarwal, P.K., Arge, L. and Erickson, J., Indexing Moving Points. In Proceedings of the Nineteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, (Dallas, Texas, USA, 2000), 175-186.

[2] Beckmann, N., Kriegel, H.-P., Schneider, R. and Seeger, B., The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, (Atlantic City, NJ, 1990), 322-331.

[3] Cai, M.C., Keshwani, D. and Revesz, P.Z., Parametric Rectangles: A Model for Querying and Animation of Spatiotemporal Databases. In Advances in Database Technology-EDBT 2000, Proceedings, (2000), 430-444.

[4] Carey, M.J., DeWitt, D.J., Franklin, M.J., Hall, N.E., McAuliffe, M.L., Naughton, J.F., Schuh, D.T., Solomon, M.H., Tan, C.K., Tsatalos, O.G., White, S.J. and Zwilling, M.J., Shoring Up Persistent Applications. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, (Minneapolis, Minnesota, 1994), 383-394.

[5] Chakka, V.P., Everspaugh, A.C. and Patel, J.M., Indexing Large Trajectory Data Sets with SETI. In the First Biennial Conference on Innovative Data Systems Research (CIDR), (Asilomar, CA, 2003), 164-175.

[6] Guttman, A., R-Trees: A Dynamic Index Structure for Spatial Searching. In SIGMOD'84, Proceedings of Annual Meeting, (Boston, Massachusetts, 1984), 47-57.

[7] Chon, H.D., Agrawal, D. and El Abbadi, A., Storage and Retrieval of Moving Objects. In Mobile Data Management, (2001), 173-184.

[8] Kollios, G., Gunopulos, D. and Tsotras, V.J., On Indexing Mobile Objects. In Proceedings of the 18th SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), (Philadelphia, Pennsylvania, 1999), 261-272.

[9] Kwon, D., Lee, S. and Lee, S., Indexing the Current Positions of Moving Objects Using the Lazy Update R-tree. In Mobile Data Management, (2002), 113-120.

[10] Lee, M.-L., Hsu, W., Jensen, C.S., Cui, B. and Teo, K.L., Supporting Frequent Updates in R-Trees: A Bottom-Up Approach. In VLDB, (Berlin, Germany, 2003), 608-619.

[11] Leutenegger, S.T. and Lopez, M.A., The Effect of Buffering on the Performance of R-trees. In IEEE Transactions on Knowledge and Data Engineering, (2000), 33-44.

[12] Mokbel, M.F., Ghanem, T.M. and Aref, W.G. Spatio-Temporal Access Methods. IEEE Data Engineering Bulletin, 26 (2). 40-49, 2003.

[13] Nascimento, M.A. and Silva, J.R.O., Towards Historical R-trees. In Proceedings of ACM Symposium on Applied Computing (ACM-SAC), (1998), 235-240.

[14] Nascimento, M.A., Silva, J.R.O. and Theodoridis, Y., Evaluation of Access Structures for Discretely Moving Points. In Spatio-Temporal Database Management, (Edinburgh, Scotland, 1999), 171-188.

[15] Pfoser, D., Jensen, C. and Theodoridis, Y., Novel Approaches to the Indexing of Moving Objects Trajectories. In Proceedings of the 26th International Conference on Very Large Data Bases (VLDB), (Cairo, Egypt, 2000), 395-406.

[16] Porkaew, K., Lazaridis, I. and Mehrotra, S., Querying Mobile Objects in Spatio-temporal Databases. In Advances in Spatial and Temporal Databases, Proceedings, (2001), 59-78.

[17] Prabhakar, S., Xia, Y.N., Kalashnikov, D.V., Aref, W.G. and Hambrusch, S.E., Query Indexing and Velocity Constrained Indexing: Scalable Techniques for Continuous Queries on Moving Objects. In IEEE Transactions on Computers, (2002), 1124-1140.

[18] Procopiuc, C.M., Agarwal, P.K. and Har-Peled, S., STAR-Tree: An Efficient Self-adjusting Index for Moving Objects. In Algorithm Engineering and Experiments, (2002), 178-193.

[19] Saltenis, S. and Jensen, C.S., Indexing of Moving Objects for Location-Based Services. In Proceedings of the International Conference on Data Engineering, (2002).

[20] Saltenis, S., Jensen, C.S., Leutenegger, S.T. and Lopez, M.A., Indexing the Positions of Continuously Moving Objects. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, (Dallas, Texas, USA, 2000), 331-342.

[21] Samet, H. The Quadtree and Related Hierarchical Data Structures. Computing Surveys, 16 (2). 187-260.

[22] Sistla, A.P., Wolfson, O., Chamberlain, S. and Dao, S., Modeling and Querying Moving Objects. In Proc. of the 13th International Conference on Data Engineering, (Birmingham, U.K., 1997), 422-432.

[23] Song, Z. and Roussopoulos, N., Hashing Moving Objects. In Mobile Data Management, (Hong Kong Polytechnic University, Hong Kong, 2001), 161-172.

[24] Song, Z.X. and Roussopoulos, N., SEB-tree: An Approach to Index Continuously Moving Objects. In Mobile Data Management, Proceedings, (2003), 340-344.

[25] Tao, Y. and Papadias, D., MV3R-Tree: A Spatio-Temporal Access Method for Timestamp and Interval Queries. In The VLDB Journal, (2001), 431-440.

[26] Tao, Y., Papadias, D. and Sun, J., The TPR*-Tree: An Optimized Spatio-Temporal Access Method for Predictive Queries. In VLDB, (2003), 790-801.

[27] Tayeb, J., Ulusoy, O. and Wolfson, O. A Quadtree-based Dynamic Attribute Indexing Method. Computer Journal, 41 (3). 185-200.

[28] Theodoridis, Y., Silva, J.R.O. and Nascimento, M.A., On the Generation of Spatiotemporal Datasets. In Symposium on Large Spatial Databases, (1999), 147-164.

[29] Theodoridis, Y., Vazirgiannis, M. and Sellis, T.K., Spatio-Temporal Indexing for Large Multimedia Applications. In International Conference on Multimedia Computing and Systems, (1996), 441-448.

[30] Wolfson, O., Xu, B., Chamberlain, S. and Jiang, L. Moving Objects Databases: Issues and Solutions. In Proc. of the 10th International Conference on Scientific and Statistical Database Management, (Capri, Italy, 1998), 111-122.

[31] Xu, X., Han, J. and Lu, W. RT-tree: An Improved R-Tree Index Structure for Spatiotemporal Databases.