query evaluation techniques for cluster database systems
DESCRIPTION
Query Evaluation Techniques for Cluster Database Systems. Andrey V. Lepikhov , Leonid B. Sokolinsky South Ural State University Russia. Outline. Motivation Problem Statement Background Partial mirroring method Results Future work. Motivation. Top500 Cluster: 84.8% MPP : 14.8% - PowerPoint PPT PresentationTRANSCRIPT
Query Evaluation Techniques for Cluster Database Systems
Andrey V. Lepikhov, Leonid B. Sokolinsky
South Ural State University
Russia
22 September 2010
1
ADBIS 2010
ADBIS 20102
Outline
22 September 2010
MotivationProblem StatementBackgroundPartial mirroring methodResultsFuture work
Motivation
22 September 20103
Top500• Cluster: 84.8%• MPP: 14.8%• Others: 0.4%
ADBIS 2010
Problem Statement
22 September 20104
Not expensive parallel hardware needs not expensive parallel database management system
Today we have no such chip parallel database management system
ADBIS 2010
Background
22 September 20105 ADBIS 2010
Exchange operator
22 September 2010ADBIS 20106
pE p: port ψ: distributing function
Parallel plan for query Q = RS
22 September 2010ADBIS 20107
Query processing in cluster system
22 September 20108 ADBIS 2010
The problem
22 September 2010ADBIS 20109
Load balancing
Partial mirroring method
Fragmentation strategy Replication strategy
10
Fragmentation strategy
22 September 2010ADBIS 201011
FR
AG
ME
NT
AT
IO
N
Source relation P0
P1
Pn
• Relation is divided into fragments distributed among cluster nodes
• Each fragment is divided into sequence of segments with an equal length
• Segment is the minimal unit of replication
. . .
. . .. . .
. . .
Replication strategy
22 September 2010ADBIS 201012
Disk Di Disk Dj
ρj = 50%
Ri
jiR
Load balancing method
22 September 2010ADBIS 201013
D1 D2
t1
D1 D2
t2
D1 D2
- scheduled to process
D1 D2
t4
- processed
t3
P1
P2P1 P2
P1
P1
P2
P2
- not used
12S
1S 2S
21S
12S
1S 2S
21S
12S
1S 2S
21S
12S
1S 2S
21S
Parallel agent with two input streams
22 September 2010ADBIS 201014
Load balancing algorithm
22 September 2010ADBIS 201015
Parameters of experiments
22 September 2010ADBIS 201016
Speedup versus replication factor
22 September 2010ADBIS 201017
Speedup versus skew factor θ
22 September 2010ADBIS 201018
• 0.68 corresponds to the ”80-20” rule (80 percents of tuples of the relation will be stored in 20 percents of fragments)• 0 corresponds to the uniform distribution
Future Work
22 September 2010ADBIS 201019
To incorporate the proposed technique of parallel query execution into open source PostgreSQL DBMS.
To extend this approach on GRID DBMS for clusters with multicor processors.
22 September 2010ADBIS 201020
Thank you