LECTURE 40: (A,B)- AND B-TREES CSC 213 – Large Scale Programming

Upload: giles-white

Post on 13-Jan-2016


Problems with Search Trees

Great at organizing information for searching
  Processing is maintained at consistent O(log n) time
But sucks at locality (both spatial and temporal)
  Each node contains only 1 piece of data
  Jumps to child after using that piece of data
  All of these references mean nodes end up spaced randomly

Big Search Trees

Excellent test of (roommates) system

(a,b) Trees to the Rescue!

Real solution to frequent hikes to Germany
  Linux & MacOS use them to track files & directories
  MySQL & other databases use them to hold all the data
  Found in many other places where paging occurs
Simple rules define working of any (a,b) tree
  Grows upward so that all leaves are found at the same level
  At least a children for each internal node
  Every internal node has at most b children
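The three rules above can be checked mechanically. The sketch below is only an illustration: the `Node` class and the function names are hypothetical, and it assumes the usual textbook exception that the root may have as few as 2 children, which the slide does not state.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    # Hypothetical node type for illustration: sorted keys plus child links.
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def leaf_depths(node: Node, depth: int = 0) -> Set[int]:
    """Collect every depth at which a leaf appears below node."""
    if not node.children:
        return {depth}
    depths: Set[int] = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def is_ab_tree(root: Node, a: int, b: int) -> bool:
    """Check the slide's rules: leaves on one level, a..b children per internal node."""
    def child_counts_ok(node: Node, is_root: bool) -> bool:
        if not node.children:          # leaves have no child-count rule
            return True
        low = 2 if is_root else a      # assumption: root may have as few as 2 children
        if not (low <= len(node.children) <= b):
            return False
        return all(child_counts_ok(c, False) for c in node.children)

    return len(leaf_depths(root)) == 1 and child_counts_ok(root, True)


balanced = Node([4, 9], [Node([2]), Node([6, 8]), Node([15])])
lopsided = Node([4], [Node([2]), Node([6], [Node([5]), Node([7])])])
print(is_ab_tree(balanced, 2, 4), is_ab_tree(lopsided, 2, 4))
```

The checker looks only at shape (leaf depth and child counts), not at key ordering, since those are the rules the slide lists.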

What is “the BTree?”

Common multi-way tree implementation
Describe B-Tree using order (“BTree of order m”)
  m/2 to m children per internal node
  Root node can have m or fewer elements
Many variants exist to improve some failing
  Each variant is specialized for some niche use
  Only minor differences between the variants
  Lecture describes the most basic B-Tree

BTree Order

Select order minimizing paging when created
  Elements & references to kids in a full node fill a page
  Nodes have at least m/2 elements, even at their smallest
  In memory, this guarantees each page is at least 50% full
How many pages touched by each operation?
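A rough way to put numbers on this: pick the largest order whose full node still fits in one page, then estimate how many levels a search must descend. The sketch below is an assumption-laden illustration, not the lecture's method. It assumes a full node stores m child references and m - 1 elements, and the page, element, and reference sizes are made-up example values.

```python
def max_order(page_bytes: int, element_bytes: int, ref_bytes: int) -> int:
    """Largest m with m references + (m - 1) elements fitting in one page (assumed layout)."""
    return (page_bytes + element_bytes) // (element_bytes + ref_bytes)


def height_bound(n: int, m: int) -> int:
    """Smallest h with ceil(m/2)**h >= n: a rough cap on the levels a search descends."""
    if m < 3:
        raise ValueError("order must be at least 3 so every node branches")
    fanout = (m + 1) // 2          # minimum children per internal node
    h, reach = 0, 1
    while reach < n:
        reach *= fanout
        h += 1
    return h


m = max_order(page_bytes=4096, element_bytes=8, ref_bytes=8)
print(m)                              # 256
print(height_bound(1_000_000, m))     # 3
print(height_bound(1_000_000, 4))     # 20
```

Under these assumptions a search over a million elements touches about 3 pages with order 256, against roughly 20 node visits when nodes branch only 2 ways.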

Removal from BTree

Swap element with successor in parent of a leaf
  Process is similar to removal in a (2,4) tree
If under m/2 elements in node after the removal
  See if can move element from sibling to parent & steal element from parent
  Else, merge with sibling & steal element from parent
  But this might propagate underflow to parent node!

(2,4) Tree Is An (a,b) Tree

Grows upward so all leaves found at same level

At least a children for each internal node

Every internal node has at most b children

Case 1: Transfer

Adjacent sibling Node has Entry to lend
  Steal parent’s Entry closest to underfilled node
  Prevent loneliness & promote sibling’s Entry
  No further processing needed in this case
Example: remove(15)

[Figure: tree diagram; root holds 4 9, leaves hold 2 and 6 8]
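The transfer case can be sketched in a few lines. This is an illustration under assumptions: `Node` and `transfer` are hypothetical names, `min_keys=1` matches a (2,4) tree, and the sample values follow the slide's remove(15) example as best it can be read from the diagram.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical node type for illustration: sorted keys plus child links.
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def transfer(parent: Node, idx: int, min_keys: int = 1) -> bool:
    """Refill parent.children[idx] from an adjacent sibling that can spare a key.

    Returns True if a transfer happened, False if no sibling had a key to lend.
    """
    kids = parent.children
    node = kids[idx]
    if idx > 0 and len(kids[idx - 1].keys) > min_keys:
        left = kids[idx - 1]
        node.keys.insert(0, parent.keys[idx - 1])   # steal the parent's closest key
        parent.keys[idx - 1] = left.keys.pop()      # promote the sibling's largest key
        if left.children:                           # internal nodes also hand over a child
            node.children.insert(0, left.children.pop())
        return True
    if idx < len(kids) - 1 and len(kids[idx + 1].keys) > min_keys:
        right = kids[idx + 1]
        node.keys.append(parent.keys[idx])          # steal the parent's closest key
        parent.keys[idx] = right.keys.pop(0)        # promote the sibling's smallest key
        if right.children:
            node.children.append(right.children.pop(0))
        return True
    return False


# State after 15 has been deleted from the rightmost leaf, leaving it empty.
root = Node([4, 9], [Node([2]), Node([6, 8]), Node([])])
transfer(root, 2)
print(root.keys, [c.keys for c in root.children])   # [4, 8] [[2], [6], [9]]
```

The parent keeps the same number of keys, which is why no further processing is needed after a transfer.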

Case 2: Fusion

Emptied node has only ½ filled siblings
  Merge node & sibling into single nearly filled node
  Look to parent & steal Entry between siblings
  May propagate underflow to parent!
Example: remove(15)

[Figure: tree diagram; parent ("Mom") holds 9 14, leaves hold 2 5 7 and 10]
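Fusion can be sketched the same way. Again this is only an illustration: `Node` and `fuse` are hypothetical names, the choice to prefer the left sibling is arbitrary, and the sample values follow the slide's diagram as best it can be read.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical node type for illustration: sorted keys plus child links.
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def fuse(parent: Node, idx: int, min_keys: int = 1) -> bool:
    """Merge parent.children[idx] with an adjacent sibling.

    The parent's key between the two siblings moves down into the merged node.
    Returns True if the parent is now underfilled, so the caller must repair it.
    """
    kids = parent.children
    left_idx = idx - 1 if idx > 0 else idx     # prefer the left sibling when one exists
    left, right = kids[left_idx], kids[left_idx + 1]
    separator = parent.keys.pop(left_idx)      # steal the Entry between the siblings
    left.keys += [separator] + right.keys
    left.children += right.children
    del kids[left_idx + 1]                     # the right-hand node is absorbed
    return len(parent.keys) < min_keys


# State after 15 has been deleted from the rightmost leaf, leaving it empty.
mom = Node([9, 14], [Node([2, 5, 7]), Node([10]), Node([])])
underflow = fuse(mom, 2)
print(mom.keys, [c.keys for c in mom.children], underflow)
# [9] [[2, 5, 7], [10, 14]] False
```

When the parent had only one key to begin with, `fuse` returns True, which is the propagated underflow the slide warns about.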

In Case Of Overflow…

If addition overfills node, split into 2 new nodes
  Each new node gets ½ of the Entrys (& children)
  Splitting now makes sure nodes are at least ½ full!

[Figure: tree diagram; root holds 15 24, leaves hold 12, 18, and the overfull 27 30 32 35]
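A split can be sketched as below. The slide only says each new node gets half of the Entrys; promoting the middle Entry into the parent is the usual textbook step and is an assumption here, as are the `Node` and `split_child` names. `max_keys=3` matches a (2,4) tree, and the sample values follow the slide's diagram as best it can be read.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical node type for illustration: sorted keys plus child links.
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def split_child(parent: Node, idx: int, max_keys: int = 3) -> bool:
    """Split parent.children[idx] if it holds more than max_keys keys.

    Returns True if a split happened. The parent gains one key, so the caller
    should then check whether the parent itself is now overfull.
    """
    child = parent.children[idx]
    if len(child.keys) <= max_keys:
        return False
    mid = len(child.keys) // 2
    promoted = child.keys[mid]                  # assumption: middle key moves up
    right = Node(child.keys[mid + 1:], child.children[mid + 1:])
    child.keys = child.keys[:mid]               # left half stays in the old node
    child.children = child.children[:mid + 1]
    parent.keys.insert(idx, promoted)
    parent.children.insert(idx + 1, right)
    return True


root = Node([15, 24], [Node([12]), Node([18]), Node([27, 30, 32, 35])])
split_child(root, 2)
print(root.keys, [c.keys for c in root.children])
# [15, 24, 32] [[12], [18], [27, 30], [35]]
```

Because the parent gains a key, an overflow can travel upward in the same way an underflow can during removal.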

For Next Lecture

By end of day, should submit your program #3
Weekly activity due tomorrow
For Friday, must finish program for portfolio