Query Processing in Connectivity-Challenged Environments

Priyanka Puri, Sharma Chakravarthy, Gururaj Poornima, Mohan Kumar

Information Technology Laboratory

Computer Science and Engineering Department, The University of Texas at Arlington, Arlington, TX 76009

Email: sharma@cse.uta.edu

URL: http://itlab.uta.edu/sharma

• This effort is supported by AFRL under Contract Number: FA8750-09-2-0199

• Sanjay Madria and Raytheon (Waseem Naqvi) are also involved in this project

May 23, 2010 Sharma: AF Mobility Workshop

Query Processing

• Has been addressed in the context of centralized DBMSs

• Has been addressed in the context of distributed DBMSs

• Cost-based plan generation is typically used

• So, is there anything more/new to do?

[Figure: Ground Controllers 1, 2, ..., n and UAVs 1, 2, 3, 4, 5]

[Figure: Ground Controllers 1, 2, ..., n and UAVs 1, 2, 3, 5, 6]

Currently

• Data is dumped into a central server and queried

• Bandwidth, QoS issues are not addressed

• No collaboration among nodes

• No continuous query processing, notification, fusion, context usage, and real- or near real-time support

Proposed long-term Architecture

[Figure: architecture diagram. Recoverable labels are listed below.]

• Network of computing nodes: unmanned vehicles, sensors, robots, PCs, servers, ground controlling devices

• Node capabilities: Fault Tolerance Services, Context/Knowledge Base, Local fusion/Materialization, Publish/Subscribe Capability, Query Capability (raw data / fused data / data from other nodes)

• Inputs: Queries, Tasks, Requests, Continuous Queries, Publish/Subscribe

• SOA Distributed Middleware: task planning, join computation, composition, pub/sub, context-aware notification, resource management, data management

• Constraints: limited resources, mobility, heterogeneity, disconnections

Query Processing

MyObjects Table at each node

Column        Size
Timestamp     8 bytes
Node_id       4 bytes
Longitude     4 bytes
Latitude      4 bytes
Obj_type      8 chars
Obj_desc      Varchar (64)
Object_ptr    Pointer (8 bytes)

Total width: 100 bytes

Cardinality (number of tuples), selectivity, and replication site of data are known (part of the metadata)
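The 100-byte total can be checked by packing one tuple with the stated widths. This is a minimal sketch: the slide gives only the column widths, so the concrete encodings below (signed 64-bit timestamp, 32-bit floats for the coordinates, fixed-width byte strings) and the helper name pack_row are assumptions for illustration.

```python
import struct

# One MyObjects tuple, packed without padding ("<" disables alignment).
# The encodings are assumptions; the slide only states the widths.
FIELDS = [
    ("Timestamp", "q"),    # 8 bytes
    ("Node_id", "I"),      # 4 bytes
    ("Longitude", "f"),    # 4 bytes
    ("Latitude", "f"),     # 4 bytes
    ("Obj_type", "8s"),    # 8 chars
    ("Obj_desc", "64s"),   # Varchar (64), stored at full width here
    ("Object_ptr", "Q"),   # pointer, 8 bytes
]
ROW_FORMAT = "<" + "".join(code for _, code in FIELDS)
ROW_WIDTH = struct.calcsize(ROW_FORMAT)


def pack_row(ts, node_id, lon, lat, obj_type, obj_desc, ptr):
    """Serialize one tuple; short strings are null-padded by struct."""
    return struct.pack(ROW_FORMAT, ts, node_id, lon, lat,
                       obj_type.encode(), obj_desc.encode(), ptr)


row = pack_row(1274572800, 3, -97.11, 32.73, "vehicle", "truck heading north", 0)
print(ROW_WIDTH, len(row))
```

A fixed row width makes transfer cost proportional to cardinality, which is why cardinality and selectivity in the metadata are enough to estimate data movement.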

Query Plan Format

Operation 1   Param   Operand1   Operand1 Loc   Operand2   Operand2 Loc   Result Name   Result Loc
Operation 2   Param   Operand1   Operand1 Loc   Operand2   Operand2 Loc   Result Name   Result Loc
…             …       …          …              …          …              …             …
Operation n   Param   Operand1   Operand1 Loc   Operand2   Operand2 Loc   Result Name   Result Loc

Operations in Plan format

Operation   Param        Operand1   Operand1 Loc   Operand2   Operand2 Loc   Result Name   Result Loc
Select      A > 100      R1         1              Null       Null           R1'           1
Project     A1, A3, A4   R1'        1              Null       Null           R1''          1
Move        Null         R1''       1              Null       Null           R''           2
Copy        Null         R1''       1              Null       Null           R14           4
SemiJoin    A = C        R''        2              R2         2              SR1           2
Join        B = D        R12        2              R2''       2              JR1           2
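The eight-field plan row can be modeled as a small record, which makes it easy to see which steps move data between sites. This is an illustrative sketch only: the class name PlanStep, its field names, and the moves_data helper are hypothetical, and the rows are copied from the table as given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlanStep:
    operation: str
    param: Optional[str]
    operand1: str
    operand1_loc: int
    operand2: Optional[str]
    operand2_loc: Optional[int]
    result_name: str
    result_loc: int

    def moves_data(self) -> bool:
        # A step ships data when its result lands on a site other than operand 1's.
        return self.result_loc != self.operand1_loc


plan = [
    PlanStep("Select", "A > 100", "R1", 1, None, None, "R1'", 1),
    PlanStep("Project", "A1, A3, A4", "R1'", 1, None, None, "R1''", 1),
    PlanStep("Move", None, "R1''", 1, None, None, "R''", 2),
    PlanStep("Copy", None, "R1''", 1, None, None, "R14", 4),
    PlanStep("SemiJoin", "A = C", "R''", 2, "R2", 2, "SR1", 2),
    PlanStep("Join", "B = D", "R12", 2, "R2''", 2, "JR1", 2),
]

# (operation, source site, destination site) for every step that crosses sites.
transfers = [(s.operation, s.operand1_loc, s.result_loc) for s in plan if s.moves_data()]
print(transfers)
```

Only the Move and Copy rows cross sites here, so they are the steps whose cost depends on connectivity and bandwidth.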

Plan using Semijoin chains

SELECT c1 R1
MOVE R11 To Site2
SELECT c2 R2
SJ R11 R21 : J1
MOVE J1 To Site3
SELECT c3 R3
SJ J1 R31 : J2
MOVE J2 To Site2
SJ J2 R21 : J3
MOVE J3 To Site1
SJ J3 R11 : J4
COPY R To Site7 : J

Total Cost = 14720 + 32000 = 46720

[Figure: semijoin chain across Sites 1, 2, and 3, with the final join J at Site 7. Base relations: R1 [1000], R2 [5000], R3 [3000]. After select/project: R11 [800], R21 [3000], R31 [600]. Intermediate results: J1 [1200], J2 [240], J3 [1200], J4 [320]. Costs shown: 3200, 4800, 1920, 4800; JCost = 32000. Attribute labels: [lat], [long], [lat, nodeid], [long, nodeid].]
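The total quoted for the semijoin-chain plan can be reproduced from the numbers in the figure. The four Cost values sum to 14720, which suggests (this is a reading of the figure, not something the slide states) that 14720 is the semijoin-chain part and 32000 is the final join at Site 7.

```python
# Cost values and the final join cost, as printed in the figure.
step_costs = [3200, 4800, 1920, 4800]
join_cost = 32000

semijoin_cost = sum(step_costs)
total_cost = semijoin_cost + join_cost
print(semijoin_cost, total_cost)
```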

Semi-join/join plan generation

• We are developing algorithms that generate the plan space and prune it to find the “best” (or a “good”) plan for each input query (expressed as a join query)

• It is a cost-based algorithm based on System R and SDD approaches extended to include connectivity and bandwidth issues

• The complexity of plan generation is k^n, where n is the number of joins and k is the number of alternatives for each join

• Assuming fewer than 5 joins in a query

• Integrate replication into the algorithm
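The k^n plan space can be illustrated with an exhaustive enumeration that picks one alternative per join and keeps the cheapest combination. This is a sketch under stated assumptions, not the algorithm being developed: the alternative names and the COST table are invented for the example, the cost model is a plain sum, and no pruning is done.

```python
from itertools import product


def best_plan(alternatives_per_join, cost_of):
    """Enumerate every combination (one alternative per join); return the cheapest."""
    candidates = list(product(*alternatives_per_join))
    return min(candidates, key=cost_of), len(candidates)


# Hypothetical per-join alternatives and costs, for illustration only.
COST = {"semijoin": 3, "ship_left": 5, "ship_right": 4}
alternatives = [list(COST)] * 4  # n = 4 joins, k = 3 alternatives each


def cost_of(plan):
    return sum(COST[choice] for choice in plan)


plan, explored = best_plan(alternatives, cost_of)
print(plan, explored)
```

With fewer than 5 joins the space stays small (81 candidates for k = 3, n = 4), which is what makes exhaustive generation followed by pruning practical.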

Plan Generation Alternatives

• A Query Plan (QP) is a numbered sequence of operations for executing a query

• A QP includes how data is moved as part of execution

• Plan generation alternatives
  - Static plan: generated once and executed in a distributed manner
  - Dynamic plan: generated incrementally at each node as the query progresses, using current connectivity information
  - Parallel plan: partial plans are executed in parallel
  - Interactive plan: get some estimate by asking nodes that have relevant data

Static plan

• The physical plan generated will have node information for data propagation.

• This will be mapped to “actual connectivity” by the physical layer for execution

• It is possible that no connectivity exists by the time execution is performed for a generated query plan

• In that case, either a new plan can be generated (using the same algorithm with current metadata) or the existing plan can be modified incrementally
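The fallback described for static plans can be sketched as a check of the planned transfers against current connectivity, followed by regeneration if any hop is missing. The function names, the set-of-links representation, and the replan callback are all hypothetical. The sketch also treats connectivity as direct, undirected links, which is a simplification of the mapping the physical layer would perform.

```python
def broken_hops(plan_moves, links):
    """Return the planned (src, dst) transfers that have no link right now."""
    return [(s, d) for s, d in plan_moves
            if (s, d) not in links and (d, s) not in links]


def execute_or_replan(plan_moves, links, replan):
    """Keep the static plan if every hop is available; otherwise regenerate it."""
    if not broken_hops(plan_moves, links):
        return plan_moves
    return replan(links)  # same algorithm, current metadata


# Transfers taken from the semijoin-chain plan: Site1 -> 2 -> 3 -> 2 -> 1 -> 7.
moves = [(1, 2), (2, 3), (3, 2), (2, 1), (1, 7)]
links = {(1, 2), (2, 3), (1, 7)}

print(broken_hops(moves, links))
print(broken_hops(moves, links - {(2, 3)}))
```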

Dynamic plan

• Generate plan for the first join and defer the rest of the plan
  - Join plans are generated one at a time
  - Current connectivity information can be used
  - Result size estimation will also be more accurate

• Query execution and (partial) plan generation are intertwined

• Does not increase the complexity of plan generation or plan execution (compared to static)
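The interleaving of planning and execution in a dynamic plan can be sketched as a loop that plans one join at a time, using whatever connectivity holds at that moment and the actual size of the previous result. Everything here is hypothetical: the function name, the greedy smallest-relation-first choice, the reachable_now callback, and the placeholder execute_join are assumptions made for the example.

```python
def run_dynamic(start, others, size, reachable_now, execute_join):
    """Plan and execute one join at a time, consulting connectivity before each."""
    current, current_size = start, size[start]
    pending = set(others)
    order = []
    while pending:
        reachable = reachable_now()  # one connectivity snapshot per step
        candidates = [r for r in pending if r in reachable]
        if not candidates:
            break  # disconnected: defer the rest of the plan
        nxt = min(candidates, key=lambda r: size[r])
        current_size = execute_join(current_size, size[nxt])  # actual size, not an estimate
        current = f"({current} JOIN {nxt})"
        order.append(nxt)
        pending.remove(nxt)
    return current, current_size, order


size = {"R1": 800, "R2": 3000, "R3": 600}
snapshots = iter([{"R2"}, {"R2", "R3"}])  # R3 becomes reachable only later
result, final_size, order = run_dynamic(
    "R1", ["R2", "R3"], size,
    reachable_now=lambda: next(snapshots),
    execute_join=lambda a, b: min(a, b),  # placeholder result-size model
)
print(result, final_size, order)
```

A static planner that looked only at sizes would join R3 first; here R2 goes first because it is the only relation reachable when the first join is planned.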

Parallel plan

• All local operations/computations (select, project, and even some joins) can be done in parallel
  - Join plans are still generated one at a time
  - Increases message/information exchange
  - Current connectivity information can be used
  - Result size estimation will also be more accurate

• Dealing with responses, plan generation, and execution may be slightly more complicated than in the previous cases

Interactive plan

• When a query comes in, send out requests for local processing and get processing time and size information

• Use the above to generate partial plans
  - Join plans are still generated using information obtained interactively
  - Increases message/information exchange
  - Current connectivity information can be used
  - Result size estimation will also be more accurate

• Combines dynamic and parallel execution in an interactive manner

Replication Issues

• Algorithm for replication
  - Single copy replication that “minimizes” the data transmission cost and “maximizes” the number of paths (to deal with connectivity)

• Algorithm for replication utilization
  - Given a replication, determine the utility of that replica in terms of query evaluation cost for a reasonable load

• Reconcile the above two to come up with a replication strategy that balances the competing tradeoffs
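One simple way to picture the tradeoff between transmission cost and number of paths is a weighted score per candidate site. This is purely an illustrative sketch: the function name, the linear scoring formula, the weights alpha and beta, and the candidate numbers are all assumptions, and the slides do not specify how the two objectives are actually reconciled.

```python
def choose_replica_site(candidates, alpha=1.0, beta=100.0):
    """candidates maps site -> (transmission_cost, number_of_paths).

    Lower cost and more paths are both better, so the score subtracts
    a path bonus from the weighted cost; the lowest score wins.
    """
    def score(site):
        cost, paths = candidates[site]
        return alpha * cost - beta * paths
    return min(candidates, key=score)


# Hypothetical candidate sites: (transmission cost, number of paths).
sites = {2: (4800, 1), 3: (5200, 4), 7: (9000, 5)}

print(choose_replica_site(sites))              # cost dominates
print(choose_replica_site(sites, beta=500.0))  # path redundancy weighted higher
```

Raising beta shifts the choice from the cheapest site toward a better-connected one, which is the kind of balance the replication strategy has to strike.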

Thank You !
