query evaluation techniques for cluster database systems

20
Query Evaluation Techniques for Cluster Database Systems Andrey V. Lepikhov, Leonid B. Sokolinsky South Ural State University Russia 22 September 2010 1 ADBIS 2010

Upload: iona-cleveland

Post on 02-Jan-2016

22 views

Category:

Documents


0 download

DESCRIPTION

Query Evaluation Techniques for Cluster Database Systems. Andrey V. Lepikhov , Leonid B. Sokolinsky South Ural State University Russia. Outline. Motivation Problem Statement Background Partial mirroring method Results Future work. Motivation. Top500 Cluster: 84.8% MPP : 14.8% - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Query Evaluation Techniques  for Cluster Database Systems

Query Evaluation Techniques for Cluster Database Systems

Andrey V. Lepikhov, Leonid B. Sokolinsky

South Ural State University

Russia

22 September 2010

1

ADBIS 2010

Page 2: Query Evaluation Techniques  for Cluster Database Systems

ADBIS 20102

Outline

22 September 2010

MotivationProblem StatementBackgroundPartial mirroring methodResultsFuture work

Page 3: Query Evaluation Techniques  for Cluster Database Systems

Motivation

22 September 20103

Top500• Cluster: 84.8%• MPP: 14.8%• Others: 0.4%

ADBIS 2010

Page 4: Query Evaluation Techniques  for Cluster Database Systems

Problem Statement

22 September 20104

Not expensive parallel hardware needs not expensive parallel database management system

Today we have no such chip parallel database management system

ADBIS 2010

Page 5: Query Evaluation Techniques  for Cluster Database Systems

Background

22 September 20105 ADBIS 2010

Page 6: Query Evaluation Techniques  for Cluster Database Systems

Exchange operator

22 September 2010ADBIS 20106

pE p: port ψ: distributing function

Page 7: Query Evaluation Techniques  for Cluster Database Systems

Parallel plan for query Q = RS

22 September 2010ADBIS 20107

Page 8: Query Evaluation Techniques  for Cluster Database Systems

Query processing in cluster system

22 September 20108 ADBIS 2010

Page 9: Query Evaluation Techniques  for Cluster Database Systems

The problem

22 September 2010ADBIS 20109

Load balancing

Page 10: Query Evaluation Techniques  for Cluster Database Systems

Partial mirroring method

Fragmentation strategy Replication strategy

10

Page 11: Query Evaluation Techniques  for Cluster Database Systems

Fragmentation strategy

22 September 2010ADBIS 201011

FR

AG

ME

NT

AT

IO

N

Source relation P0

P1

Pn

• Relation is divided into fragments distributed among cluster nodes

• Each fragment is divided into sequence of segments with an equal length

• Segment is the minimal unit of replication

. . .

. . .. . .

. . .

Page 12: Query Evaluation Techniques  for Cluster Database Systems

Replication strategy

22 September 2010ADBIS 201012

Disk Di Disk Dj

ρj = 50%

Ri

jiR

Page 13: Query Evaluation Techniques  for Cluster Database Systems

Load balancing method

22 September 2010ADBIS 201013

D1 D2

t1

D1 D2

t2

D1 D2

- scheduled to process

D1 D2

t4

- processed

t3

P1

P2P1 P2

P1

P1

P2

P2

- not used

12S

1S 2S

21S

12S

1S 2S

21S

12S

1S 2S

21S

12S

1S 2S

21S

Page 14: Query Evaluation Techniques  for Cluster Database Systems

Parallel agent with two input streams

22 September 2010ADBIS 201014

Page 15: Query Evaluation Techniques  for Cluster Database Systems

Load balancing algorithm

22 September 2010ADBIS 201015

Page 16: Query Evaluation Techniques  for Cluster Database Systems

Parameters of experiments

22 September 2010ADBIS 201016

Page 17: Query Evaluation Techniques  for Cluster Database Systems

Speedup versus replication factor

22 September 2010ADBIS 201017

Page 18: Query Evaluation Techniques  for Cluster Database Systems

Speedup versus skew factor θ

22 September 2010ADBIS 201018

• 0.68 corresponds to the ”80-20” rule (80 percents of tuples of the relation will be stored in 20 percents of fragments)• 0 corresponds to the uniform distribution

Page 19: Query Evaluation Techniques  for Cluster Database Systems

Future Work

22 September 2010ADBIS 201019

To incorporate the proposed technique of parallel query execution into open source PostgreSQL DBMS.

To extend this approach on GRID DBMS for clusters with multicor processors.

Page 20: Query Evaluation Techniques  for Cluster Database Systems

22 September 2010ADBIS 201020

Thank you