nested-loop joins

35
Nested-Loop joins “one-and-a-half” pass method, since one relation will be read just once. Tuple-Based Nested-loop Join Algorithm: FOR each tuple s in S DO FOR each tuple r in R DO IF r and s join to make a tuple t THEN output t Improvement to Take Advantage of Disk I/O Model Instead of retrieving tuples of R, T(S) times, load memory with as many tuples of S as can fit, and match tuples of R against all Stuples in memory.

Upload: september-middleton

Post on 03-Jan-2016

57 views

Category:

Documents


1 download

DESCRIPTION

Nested-Loop joins. “one-and-a-half” pass method, since one relation will be read just once. Tuple -Based Nested-loop Join Algorithm: FOR each tuple s in S DO FOR each tuple r in R DO IF r and s join to make a tuple t THEN output t - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Nested-Loop joins

Nested-Loop joins• “one-and-a-half” pass method, since one relation will be

read just once.• Tuple-Based Nested-loop Join Algorithm:

FOR each tuple s in S DO

FOR each tuple r in R DO

IF r and s join to make a tuple t THEN

output t

Improvement to Take Advantage of Disk I/O Model • Instead of retrieving tuples of R, T(S) times, load memory

with as many tuples of S as can fit, and match tuples of R against all S tuples in memory.

Page 2: Nested-Loop joins

Block-based nested loops• Assume B(S) ≤ B(R), and B(S) > M • Read M-1 blocks of S into main memory and compare to

all of R, block by block

FOR each chunk of M-1 blocks of S DO

FOR each block b of R DO

FOR each tuple t of b DO

find the tuples of S in memory that join with t

output the join of t with each of these tuples

Page 3: Nested-Loop joins

Example• B(R) = 1000, B(S) = 500, M = 101

• Important Aside: 101 buffer blocks is not as unrealistic as it sounds. There may be many queries at the same time, competing for main memory buffers.

• Outer loop iterates 5 times at 100 I/O’s each• At each iteration we read M-1 (i.e. 100) blocks of S and all

of R (i.e. 1000) blocks.• Total time: 5*(100 + 1000) = 5500 I/O’s

• Question: What if we reversed the roles of R and S?• We would iterate 10 times, and in each we would read

100+500 blocks, for a total of 6000 I/O’s.• Compare with one-pass join, if it could be done!• We would need 1500 disk I/O’s if B(S) M-1

Page 4: Nested-Loop joins

Analysis of blocks nested loops• Number of disk I/O’s:

[B(S)/(M-1)]*(M-1 +B(R))

or

B(S) + [B(S)B(R)/(M-1)]

or approximately B(S)*B(R)/M

Page 5: Nested-Loop joins

Two-pass algorithms based on sorting• This special case of multi-pass algorithms is sufficient for

most of the relation sizes.

Main idea for unary operations on R • Suppose B(R) M (main memory size in blocks)

• First pass: • Read M blocks of R into MM• Sort the content of MM• Write the sorted result (sublist/run) into M blocks on disk.

• Second pass: create final result

Page 6: Nested-Loop joins

Duplicate elimination using sorting• In the second phase (merging) we don’t sort but copy each

tuple just once.

• We can do that because the identical tuples will appear “at the same time,” i.e. they will be all the first ones at the buffers (for the sorted sublists).

• As usual, if one buffer gets empty we refill it.

Page 7: Nested-Loop joins

Duplicate-Elimination using Sorting Example• Assume M=3, each block holds 2 records and relation R

consists of the following 17 tuples:

2, 5, 2, 1, 2, 2, 4, 5, 4, 3, 4, 2, 1, 5, 2, 1, 3

• After the first pass the following sorted sub-lists are created:

1, 2, 2, 2, 2, 5

2, 3, 4, 4, 4, 5

1, 1, 2, 3, 5

• In the second pass we dedicate a memory buffer to each sub-list.

Page 8: Nested-Loop joins

Example (Cont’d)

Page 9: Nested-Loop joins

Example (Cont’d)

Page 10: Nested-Loop joins

Example (Cont’d)

Page 11: Nested-Loop joins

Analysis of (R) • 2B(R) when creating sorted sublists• B(R) to read each sublist in phase 2• Total: 3B(R)• How large can R be?

– There can be no more than M sublists since we need one buffer for each one. So, B(R)/M ≤ M, (B(R)/M is the number of sublists)

i.e. B(R) ≤ M2

• To compute (R) we need at least sqrt(B(R)) blocks of MM.

Page 12: Nested-Loop joins

Sort-based , , -Exampe: set union. • Create sorted sublists of R and S

• Use input buffers for sorted sublists of R and S, one buffer per sublist

• Output each tuple once. We can do that since all the identical tuples appear “at the same time.”

• Analysis: 3(B(R) + B(S)) disk I/O’s• Condition: B(R) + B(S) ≤ M2

• Similar algorithms for sort based intersection and difference (bag or set versions).

Page 13: Nested-Loop joins

Join• A problem for joins but not for the previous operators: The

number of joining tuples from the two relations can exceed what fits in memory.

• First, we can try to maximize the number of available buffers for putting the joining tuples.

• How, we can do this? • By minimizing the number of sorted sublists (since we

need a buffer for each one of them).

Page 14: Nested-Loop joins

Simple sort-based join

• For R(X,Y) S(Y,Z) with M buffers of memory:• Sort R on Y, sort S on Y

Merge phase• Use 2 input buffers: 1 for R, 1 for S.• Pick tuple t with smallest Y value in the buffer for R (or for S)• If t doesn’t match with the first tuple in the buffer for S, then

just remove t.• Otherwise, read all the tuples from R with the same Y value

as t and put them in the M-2 part of the memory. • When the input buffer for R is exhausted fill it again and

again.

• Then, read the tuples of S that match. For each one we produce the join of it with all the tuples of R in the M-2 part of the memory.

Page 15: Nested-Loop joins

Example of sort join• B(R) = 1000, B(S) = 500, M= 101• To sort R, we need 4*B(R) I/O’s, same for S.

– Total disk I/O’s = 4*(B(R) + B(S))• Doing the join in the merge phase:

– Total disk I/O’s = B(R) + B(S) • Total disk I/O’s = 5*(B(R) + B(S)) = 7500

• Memory Requirement: To be able to do the sort, should have B(R) ≤ M2 and B(S) ≤ M2

• Recall: for nested-loop join, we needed 5500 disk I/O’s, but the memory requirement was quadratic (it is linear, here), i.e., nested-loop join is not good for joining relations that are much larger than MM.

Page 16: Nested-Loop joins

Potential problem ...R(X , Y)----------- x1 a x2 a … xn a

S(Y, Z)--------- a z1

a z2

... a zm

What if n+1 > M-1 andm+1 > M-1?

• If the tuples from R (or S) with the same value y of Y do not fit in M-1 buffers, then we use all M-1 buffers to do a nested-loop join on the tuples with Y-value y from both relations.

• Observe that we can “smoothly” continue with the nested loop join when we see that the R tuples with Y-value y do not fit in M-1 buffers.

Page 17: Nested-Loop joins

• Do we really need the fully sorted files?

Can We Improve on Sort Join?

R

S

Join?

sorted runs

Page 18: Nested-Loop joins

A more efficient sort-based join• Suppose we are not worried about many common Y values

• Create Y-sorted sublists of R and S• Bring first block of each sublist into a buffer (assuming we

have at most M sublists)• Find smallest Y-value from heads of buffers. Join with other

tuples in heads of buffers, use other possible buffers, if there are “many” tuples with the same Y values.

• Disk I/O: 3*(B(R) + B(S))• Requirement: B(R) + B(S) ≤ M2

Page 19: Nested-Loop joins

Example of more efficient sort-join• B(R) = 1000, B(S) = 500, M= 101

• Total of 15 sorted sublists• If too many tuples join on a value y, use the remaining 86

MM buffers for a one pass join on y• Total cost: 3(1000 + 500) = 4500 disk I/O’s• M2 =10201 > B(R) + B(S), so the requirement is satisfied

Page 20: Nested-Loop joins

Summary of sort-based algorithms

Operators Approx. M required Disk I/O

, Sqrt(B) 3B

, , - Sqrt(B(R) + B(S)) 3(B(R)+B(S))

Sqrt(max(B(R)+B(S))) 5(B(R)+B(S))

Sqrt(max(B(R)+B(S))) 3(B(R)+B(S))

Page 21: Nested-Loop joins

Two-pass algorithms based on hashing

• Main idea: Let B(R) > M• instead of sorted sublists, create partitions, based on

hashing• Second pass to create result from partitions

Page 22: Nested-Loop joins

Creating partitions• Here partitions are created based on all attributes of the relation except

for grouping and join, where the partitions are based on the grouping and join-attributes respectively.

• Why bucketize? Tuples with “matching” values end up in the same bucket.

Initialize M-1 buckets using M-1 empty buffers;FOR each block b of relation R DO

read block b into the M-th buffer;IF the buffer for bucket h(t) has no room for t THEN

copy the buffer to disk;initialize a new empty block in that buffer;

ENDIF ;copy t to the buffer for bucket h(t);

ENDFOR;FOR each bucket DO

IF the buffer for this bucket is not empty THENwrite the buffer to disk;

Page 23: Nested-Loop joins

Hash : even / odd

Simple Example

S54

123

138

1114

R243589

2 4 8 4 12 8 14

3 5 9 5 3 13 11

Even

Odd

Buckets

R S

Page 24: Nested-Loop joins

Hash-based duplicate elimination• Pass 1: create partitions by hashing on all attributes• Pass 2: for each partition, use the one-pass method for

duplicate elimination

• Cost: 3B(R) disk I/O’s

• Requirement: B(R) ≤ M*(M-1)

(B(R)/M is the approximate size of one bucket)

i.e. the req. is approximately B(R) ≤ M2

Page 25: Nested-Loop joins

Hash-based grouping and aggregation

• Pass 1: create partitions by hashing on grouping attributes• Pass 2: for each partition, use one-pass method.

• Cost: 3B(R), Requirement: B(R) ≤ M2

• If B(R) > M2

• Read blocks of partition one by one• Create one slot in memory for each group-value• Requirement:

where L is the list of grouping attributes

MM

RB L )1(

))(((

Page 26: Nested-Loop joins

Hash-based set union• Pass 1: create partitions R1,…,RM-1 of R, and S1,…,SM-1 of S

(with the same hash function)• Pass 2: for each pair Ri, Si compute Ri Si using the one-

pass method.

• Cost: 3(B(R) + B(S))• Requirement? • min(B(R),B(S)) ≤ M2

• Similar algorithms for intersection and difference (set and bag versions)

Page 27: Nested-Loop joins

Partition hash-join

• Pass 1: create partitions R1, ..,RM-1 of R, and S1, ..,SM-1 of S, based on the join attributes (the same hash function for both R and S)

• Pass 2: for each pair Ri, Si compute Ri Si using the one-pass method.

• Cost: 3(B(R) + B(S))• Requirement: min(B(R),B(S)) ≤ M2

Page 28: Nested-Loop joins

• B(R) = 1000 blocks• B(S) = 500 blocks• Memory available = 101 blocks• R S on common attribute C• Use 100 buckets

– Read R– Hash– Write buckets

Example

...

...

10 blocks

100R

Same for S

Page 29: Nested-Loop joins

• Read one R bucket• Build memory hash table• Read corresponding S bucket block by block.

RS

...

R

Memory...

Cost• “Bucketize:”

– Read + write R– Read + write S

• Join– Read R– Read S

Total cost = 3*[1000+500] = 4500

Page 30: Nested-Loop joins

Saving some disk I/O’s (I)• If we have more memory than we need to hold one block per

bucket, then we can use several buffers for each bucket, and write them out as a group saving in seek time and rotational latency.

• Also, we can read the buckets in group in the second pass and saving in seek time and rotational latency.

• Well, these techniques don’t save disk I/O’s, but make them faster.

• What about saving some I/O’s?

Page 31: Nested-Loop joins

• Suppose that to join R with S we decide to create k buckets where k is much smaller than M.

• When we hash S we can keep m of the k buckets in memory, while keeping only one block for each of the other k-m buckets.

• We can do so provided

– m*(B(S)/k) +(k-m) M

– B(S)/k is the approximate size of a bucket of S.

• Now, when we read the tuples of R, to hash them into buckets, we keep in memory:

• The m buckets of S that were never written out to disk, and

• One block for each of the k-m buckets of R whose corresponding buckets of S were written to disk.

Saving some disk I/O’s (II)

Page 32: Nested-Loop joins

• If a tuple t of R hashes to one of the first m buckets, then we immediately join it with all the tuples of the corresponding S-bucket.

• If a tuple t of R hashes to a bucket whose corresponding S-bucket is on disk, then t is sent to the main memory buffer for that bucket, and eventually migrates to disk, as for a two pass, hash-based join.

• In the second pass, we join the corresponding buckets of R and S as usual (but only m-k).

• The savings in I/O’s is equal to two for every block of the S-buckets that remain in memory, and their corresponding R-buckets. Since m/k of the buckets are in memory we save 2(m/k)(B(S)+B(R)) .

Saving some disk I/O’s (III)

Page 33: Nested-Loop joins

• All but k-m of the memory buffers can be used to hold tuples of S, and the more of these tuples, the fewer the disk I/O’s.

• Thus, we want to minimize k, the number of buckets.

• We do so by making each bucket about as big as can fit in memory, i.e. the buckets are of (approximately) M size, and therefore k=B(S)/M.

• If that is the case, then there is room for one bucket in memory, i.e. m=1. – We have to make the bucket actually M-k blocks, but we are talking

here approximately, when k<<M.

• So, we have that the savings in I/O’s are:

2(M/B(S))*(B(R) + B(S))

• And, the total cost is:

(3 - 2(M/B(S)))*(B(R) + B(S))

Saving some disk I/O’s (IV)How can we choose m and k?

Page 34: Nested-Loop joins

Summary of hash-based methods

Operators Approx. M required

Disk I/O

, Sqrt(B) 3B

, , - Sqrt(B(S)) 3(B(R)+B(S))

Sqrt(B(S)) 3(B(R)+B(S))

Sqrt(B(S)) (3-2M/B(S))(B(R)+B(S))

Page 35: Nested-Loop joins

Sort vs. Hash based algorithms• Hash-based algorithms have a size requirement that

depends only on the smaller of the two arguments rather than on the sum of the argument sizes, as for sort-based algorithms.

• Sort-based algorithms allow us to produce the result in sorted order and take advantage of that sort later. The result can be used in another sort-based algorithm later.

• Hash-based algorithms depend on the buckets being of nearly equal size. Well, what about a join with a very few values for the join attribute…