CS 310 – Data Structures
All figures labeled with “Figure X.Y”
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. photo ©Oregon Scenics used with permission
Trees Part II
2
Balanced binary search trees
• Binary search trees with an extra condition
– Left and right subtrees must have the same height.
– In practice, we use the condition that left and right subtrees can differ in height by no more than one.
– Note: It is convenient to denote the height of an empty tree as -1.
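The balance condition above can be checked directly. The following is a minimal sketch, not the course's implementation; the class and field names (BalanceCheck, Node) are assumptions made for illustration. It recomputes heights on every call, which is simple but slow; a practical AVL tree would store the height in each node.

```java
final class BalanceCheck {
    /** Minimal binary tree node; name and fields are illustrative assumptions. */
    static final class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    /** Height of a tree, using the convention that an empty tree has height -1. */
    static int height(Node t) {
        return t == null ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    /** True if, at every node, left and right subtree heights differ by at most one. */
    static boolean isBalanced(Node t) {
        if (t == null) {
            return true;
        }
        return Math.abs(height(t.left) - height(t.right)) <= 1
                && isBalanced(t.left)
                && isBalanced(t.right);
    }

    public static void main(String[] args) {
        Node leaf = new Node(1, null, null);
        Node chain = new Node(3, new Node(2, leaf, null), null); // 3 -> 2 -> 1, all left links
        System.out.println(height(leaf));      // 0
        System.out.println(isBalanced(chain)); // false
    }
}
```

With the -1 convention, a single leaf has height 0 and no special case is needed for missing children.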
3
• Keeping the tree balanced results in logarithmic search time.
AVL trees
• Adelson-Velskii & Landis
• First balanced tree – 1962
4
Unbalanced tree
insert(tree, 1) – results in an unbalanced tree
5
Which nodes to rebalance?
Nodes along the path from the insertion point to the root might need rebalancing.
6
How can we get into trouble?
Insertion into left subtree of left child
Insertion into right subtree of left child
7
Where else?
• Symmetric problems when inserting into the right subtree
– Insertion in the left subtree of the right child of the node in question.
– Insertion in the right subtree of the right child of the node in question.
8
The outside cases
• When the imbalance comes from inserting on the outside of the tree, we can fix the problem with one rotation.
[Figure: nodes k1 and k2 before and after the rotation]
9
Single rotation
10
Concrete example
11
Symmetric case for single rotation
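A minimal sketch of the two single rotations, assuming a bare node with left and right links. The method names follow the rotateWithLeftChild / rotateWithRightChild calls that appear later in these slides; the Node class and the example keys are assumptions.

```java
final class SingleRotation {
    /** Illustrative node type (an assumption, not the course's class). */
    static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Outside case on the left: the left child k1 becomes the new subtree root. */
    static Node rotateWithLeftChild(Node k2) {
        Node k1 = k2.left;
        k2.left = k1.right; // k1's right subtree moves under k2
        k1.right = k2;
        return k1;
    }

    /** Symmetric outside case on the right: the right child k2 becomes the new root. */
    static Node rotateWithRightChild(Node k1) {
        Node k2 = k1.right;
        k1.right = k2.left; // k2's left subtree moves under k1
        k2.left = k1;
        return k2;
    }

    public static void main(String[] args) {
        // Left chain 3 -> 2 -> 1: an insertion into the left subtree of the left child.
        Node root = new Node(3);
        root.left = new Node(2);
        root.left.left = new Node(1);
        root = rotateWithLeftChild(root);
        System.out.println(root.key); // 2
    }
}
```

The caller must store the returned node in whatever link used to point at the old subtree root.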
12
13
The inside cases
• Rotations on the inside are more difficult.
14
Double rotation
[Figure: tree with nodes 25, 15, 19 and subtrees A, B, C, D, before and after rotation 1]
Rotation 1: swap child and grandchild
15
Double Rotation
[Figure: tree with nodes 25, 15, 19 and subtrees A, B, C, D, before and after rotation 2]
Rotation 2: rotation between grandparent and new parent
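The two steps can be sketched as a pair of single rotations. This is an illustrative sketch under assumptions: the Node class and the name doubleWithLeftChild are not from the slides, and the example assumes 15 is the left child of 25 with 19 as the right child of 15, which is what binary search tree ordering implies for those keys.

```java
final class DoubleRotation {
    /** Illustrative node type (an assumption). */
    static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node rotateWithLeftChild(Node k2) {
        Node k1 = k2.left;
        k2.left = k1.right;
        k1.right = k2;
        return k1;
    }

    static Node rotateWithRightChild(Node k1) {
        Node k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        return k2;
    }

    /** Inside case on the left: right subtree of the left child is too deep. */
    static Node doubleWithLeftChild(Node grandparent) {
        // Rotation 1: swap child and grandchild.
        grandparent.left = rotateWithRightChild(grandparent.left);
        // Rotation 2: rotate between the grandparent and its new left child.
        return rotateWithLeftChild(grandparent);
    }

    public static void main(String[] args) {
        Node root = new Node(25);
        root.left = new Node(15);
        root.left.right = new Node(19);
        root = doubleWithLeftChild(root);
        System.out.println(root.left.key + " " + root.key + " " + root.right.key); // 15 19 25
    }
}
```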
16
Practice makes perfect
• Practice with http://webpages.ull.es/users/jriera/Docencia/AVL/AVL%20tree%20applet.htm
17
AVL implementation
• Implementation is difficult
• Basic idea
– For insertion into tree T, insert into the appropriate left or right subtree Tlr.
– If the height of Tlr remains the same, all done.
– Otherwise, we need to see if T has become unbalanced. If so, perform appropriate repairs with root T.
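The basic idea can be sketched recursively: insert on the way down, repair on the way back up. This is a simplified sketch, not the textbook's code; the class name AvlSketch and the stored height field are assumptions. It also skips the "height unchanged, all done" shortcut and simply rechecks the balance at every node on the return path.

```java
final class AvlSketch {
    /** Illustrative node with a cached height; a new leaf has height 0. */
    static final class Node {
        int key, height;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node t) { return t == null ? -1 : t.height; }

    static void update(Node t) {
        t.height = 1 + Math.max(height(t.left), height(t.right));
    }

    static Node rotateWithLeftChild(Node k2) {
        Node k1 = k2.left;
        k2.left = k1.right;
        k1.right = k2;
        update(k2);
        update(k1);
        return k1;
    }

    static Node rotateWithRightChild(Node k1) {
        Node k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        update(k1);
        update(k2);
        return k2;
    }

    /** Inserts key into the subtree rooted at t and returns the new subtree root. */
    static Node insert(Node t, int key) {
        if (t == null) return new Node(key);
        if (key < t.key) t.left = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        else return t; // duplicates are ignored in this sketch

        if (height(t.left) - height(t.right) == 2) {
            // Inside case first needs a rotation between child and grandchild.
            if (key > t.left.key) t.left = rotateWithRightChild(t.left);
            t = rotateWithLeftChild(t);
        } else if (height(t.right) - height(t.left) == 2) {
            if (key < t.right.key) t.right = rotateWithLeftChild(t.right);
            t = rotateWithRightChild(t);
        }
        update(t);
        return t;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k = 1; k <= 7; k++) root = insert(root, k);
        System.out.println(root.key + " " + root.height); // 4 2
    }
}
```

Inserting 1 through 7 in sorted order would produce a chain in a plain binary search tree; here the repairs keep the tree at height 2.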
18
19
AVL
• In practice, requires two passes through tree
– down: insertion
– up: repair
• Better schemes have been proposed.
20
Red-Black trees
• Single top-down pass for insertion & deletion
• Binary search trees with:
– Colored nodes: red or black (null nodes treated as black)
– Root is always black
– If a node is red, its children are black
– Constant black depth: every path from a node to a null link has the same number of black nodes.
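The coloring rules above can be turned into a small checker. This is a hedged sketch with assumed names (RedBlackCheck, Node, a boolean red flag); it uses plain null links rather than sentinels and does not verify the search-tree ordering of the keys.

```java
final class RedBlackCheck {
    /** Illustrative node; red == false means the node is black. */
    static final class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    /** Null links are treated as black. */
    static boolean isRed(Node t) { return t != null && t.red; }

    /**
     * Returns the number of black nodes on every path from t down to a null link,
     * or -1 if a red node has a red child or two paths disagree.
     */
    static int blackHeight(Node t) {
        if (t == null) return 0;
        if (t.red && (isRed(t.left) || isRed(t.right))) return -1;
        int l = blackHeight(t.left);
        int r = blackHeight(t.right);
        if (l < 0 || r < 0 || l != r) return -1;
        return l + (t.red ? 0 : 1);
    }

    /** Root must be black and the subtree rules must hold everywhere. */
    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }

    public static void main(String[] args) {
        Node ok = new Node(2, false, new Node(1, true, null, null), new Node(3, true, null, null));
        System.out.println(isValid(ok)); // true
    }
}
```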
21
Sentinels
• Many implementations of red-black trees create special nodes called sentinels.
• Sentinel nodes are used in place of the null link to indicate leaves. In red-black trees, sentinels are always colored black.
22
Properties
• If there are B black nodes along each of the paths, the tree must have at least 2^B - 1 black nodes.
• As there are never two consecutive red nodes:
– height is at most 2 log(N+1)
– which implies logarithmic search
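One way to see where the bound comes from, assuming the logarithm is base 2 and the height h is counted as the number of nodes on the longest path from the root to a null link (the slide states neither):

```latex
\begin{align*}
N &\ge 2^{B} - 1 && \text{(counting black nodes only)}\\
B &\le \log_2(N+1)\\
h &\le 2B \le 2\log_2(N+1) && \text{(no two consecutive red nodes, so at most $B$ red nodes per path)}
\end{align*}
```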
23
Sample red-black tree
24
Insertion
• New nodes are always inserted as leaves.
• What color?
– black? Other paths will no longer have the same number of black nodes.
– red?
• If parent is black, then we are okay:
25
Inserting when the parent is red
• What color is the parent’s sibling?
– black and inserted leaf is an outside child relative to grandparent: single rotation & recolor
Color change ensures we do not have two consecutive red nodes.
Note: Figure does not assume X is a leaf.
26
Red parent & black sibling
• We saw that outside children need a single rotation.
• Inside children need a double rotation.
27
Inserting when the parent is red
• Parent’s sibling is also red?
28
Insertions with red parent & sibling
• Problems
– consecutive red nodes
– recoloring doesn’t help
[Figure: insert 95; panels "Before rotations" and "Single rotation", each showing nodes 60, 70, 80, 85, 90, 95]
29
Insertions with red parent & sibling
• Double rotation
[Figure: insert 79; three panels showing nodes 60, 70, 79, 80, 85, 90: after the insert, after first rotation, after second rotation]
What if 80’s parent (originally node 70’s parent) had been red?
Remember: This is a subtree; you could not construct a tree that looked like this.
30
What if 80’s parent had been red?
• We could try to propagate this up the tree, applying the rotations to the next higher level.
• Unfortunately, this puts us in the same situation as the AVL tree which requires two traversals.
[Figure: subtree with nodes 60, 70, 79, 80, 85, 90; the node above it is marked "?"]
31
Red-black tree: top-down insertion
• We only get into trouble when the parent of the inserted node has a red sibling.
• By recognizing this, we can prevent it from happening.
32
Swapping colors
• On our way down, if we see a node with two red children:
• we swap the parent and child colors (the parent becomes red, both children become black):
33
Swapping colors
• Is this all right?
– Black depth is preserved.
– What if the new red node is the root?
• It is a problem, but we can just recolor it black.
– What if the parent is red?
• Let us think about this…
[Figure: parent P and parent’s sibling PS, with the color of PS marked "?"]
34
Swapping colors when the parent is also red
• Parent’s sibling is black?
– Swap colors.
– Repair with
• single (slide 28) or
• double rotation (slide 29)
• Parent’s sibling is red?
– Can’t happen!
– Why not?
[Figure: four panels showing parent P and parent’s sibling PS; in two of them PS is marked "?"]
35
Top-down red-black tree
• Implementation is complicated by some special cases.
• Two tricks to ease implementation:
– Instead of null links, we have a sentinel node which is always black.
– The root pointer points to a pseudoroot node
• Contains a key smaller than any other value (-∞).
• Right pointer points to real root.
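The two tricks might be set up roughly as follows. This is a sketch under assumptions, not the textbook's class: RedBlackSkeleton and isEmpty are hypothetical names, while nullNode, header, and compare mirror identifiers used in the code on later slides. No real key equals -∞, so the sketch has compare treat the pseudoroot as smaller than every item.

```java
final class RedBlackSkeleton<T extends Comparable<? super T>> {
    static final int RED = 0;
    static final int BLACK = 1;

    /** Illustrative node; new nodes default to black in this sketch. */
    static final class RedBlackNode<T> {
        T element;
        RedBlackNode<T> left, right;
        int color = BLACK;
        RedBlackNode(T element) { this.element = element; }
    }

    final RedBlackNode<T> nullNode; // sentinel used in place of null links, always black
    final RedBlackNode<T> header;   // pseudoroot; header.right is the real root

    RedBlackSkeleton() {
        nullNode = new RedBlackNode<>(null);
        nullNode.left = nullNode.right = nullNode; // sentinel's children are itself
        header = new RedBlackNode<>(null);
        header.left = header.right = nullNode;     // empty tree
    }

    /** Compares item with the key in t; the pseudoroot behaves as -infinity. */
    int compare(T item, RedBlackNode<T> t) {
        return t == header ? 1 : item.compareTo(t.element);
    }

    boolean isEmpty() {
        return header.right == nullNode;
    }

    public static void main(String[] args) {
        RedBlackSkeleton<Integer> tree = new RedBlackSkeleton<>();
        System.out.println(tree.isEmpty()); // true
    }
}
```

Because the sentinel's links point back to itself, an expression such as current.left.color is always safe to evaluate, even at the bottom of the tree.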
36
37
38
39
40
41
42
[Figure: tree with nodes 50, 60, 65, 70, 80, 82, 85, 90]
43
44
insert 76
while (compare(item, current) != 0) {
    great = grand;
    grand = parent;
    parent = current;
    current = compare(item, current) < 0 ? current.left : current.right;
    // Check if two red children; fix if so
    if (current.left.color == RED && current.right.color == RED)
        handleReorient(item);
}
[Figure: tree with root 50 and children 25 and 75; 75 has children 65 and 80; 80 has children 78 and 82]
45
insert 76
private void handleReorient(T item) {
    // flip colors: current becomes red, its two children become black
    current.color = RED;
    current.left.color = BLACK;
    current.right.color = BLACK;
    …
}
[Figure: same tree as on slide 44]
46
insert 76
private void handleReorient(T item) {
    … // slightly rewritten from Weiss
    boolean leftOfGrand = compare(item, grand) < 0;
    boolean leftOfParent = compare(item, parent) < 0;
    if (parent.color == RED) {
        grand.color = RED;
        if (leftOfGrand != leftOfParent) {
            // double rotation
            parent = rotate(item, grand);
        }
        current = rotate(item, great);
        current.color = BLACK;
    }
    header.right.color = BLACK;
}
[Figure: same tree as on slide 44]
47
insert 76
private void handleReorient(T item) {
    … // slightly rewritten from Weiss
    boolean leftOfGrand = false;   // compare(76, 50) < 0 is false
    boolean leftOfParent = false;  // compare(76, 75) < 0 is false
    if (parent.color == RED) {
        grand.color = RED;
        if (leftOfGrand != leftOfParent) {
            // double rotation (not taken here: both flags are equal)
            parent = rotate(item, grand);
        }
        current = rotate(item, great);
        current.color = BLACK;
    }
    header.right.color = BLACK;
}
[Figure: same tree as on slide 44]
48
insert 76
private RedBlackNode<T> rotate(T item, RedBlackNode<T> parent) {
    if (compare(item, parent) < 0) {
        …
    } else {
        return parent.right = compare(item, parent.right) < 0 ?
            rotateWithLeftChild(parent.right) :
            rotateWithRightChild(parent.right);
    }
}
[Figure: same tree as on slide 44]
49
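insert 76
private RedBlackNode<T> rotate(T item, RedBlackNode<T> parent) {
    if (compare(item, parent) < 0) {
        …
    } else {
        return parent.right = compare(item, parent.right) < 0 ?
            rotateWithLeftChild(parent.right) :
            rotateWithRightChild(parent.right);
    }
}
[Figure: the tree from slide 44 before the rotation, and after it: root 75 with children 50 and 80; 50 has children 25 and 65; 80 has children 78 and 82]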
50
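insert 76
private void handleReorient(T item) {
    … // slightly rewritten from Weiss
    boolean leftOfGrand = false;   // compare(76, 50) < 0 is false
    boolean leftOfParent = false;  // compare(76, 75) < 0 is false
    if (parent.color == RED) {
        grand.color = RED;
        if (leftOfGrand != leftOfParent) {
            // double rotation (not taken here: both flags are equal)
            parent = rotate(item, grand);
        }
        current = rotate(item, great);
        current.color = BLACK;
    }
    header.right.color = BLACK;
}
[Figure: tree after the rotation: root 75 with children 50 and 80; 50 has children 25 and 65; 80 has children 78 and 82]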
51
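insert 76
Insert continues until current is the sentinel, and now we can insert 76 as a red node.
[Figure: tree after the rotation (root 75; children 50 and 80; grandchildren 25, 65, 78, 82), with the node labeled "parent" marked]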
52
Red/Black tree applets
• http://gauss.ececs.uc.edu/RedBlack/redblack.html
• or http://webpages.ull.es/users/jriera/Docencia/AVL/AVL%20tree%20applet.htm
53
Red/Black tree deletion
• More complicated than insertion.
• We will cover the basic ideas and omit the implementation:
– Deleting black nodes causes problems, so make sure we delete red nodes.
– We replace the values in internal nodes.
54
Red/Black tree deletion
• X – current node
• S – sibling
• P – parent
• Assume the root sentinel (the pseudoroot) is red
• Consider what we can do when X’s children are black.
55
Case 1
• If S has black children, we can perform a color flip.
[Figure: P with children X and S, before and after the color flip]
56
Case 2
• If sibling’s outer child is red, perform a single rotation.
[Figure: P with children X and S, where R is the red outer child of S, before and after the single rotation]
57
Case 3
• If sibling’s inner child is red, perform a double rotation.
[Figure: P with children X and S, where L is the red inner child of S, before and after the double rotation]
58
Deletion when X is a leaf
• Recall sentinels are colored black.
• Therefore, we can consider X to have two black children and use the 3 cases just described.
59
X has a red child?
• In all three cases, X is colored red, so when X has a red child the color flip and rotations are inappropriate.
• Without covering the specifics, we can move to the next level and perform an operation to make X one of the following:
– red
– a leaf node: use one of the three cases
– or X has a single child:
• red child: delete X, make child black
• black child: use one of the three cases
60
B-trees
• Log N search seems less convincing when some operations are orders of magnitude slower than others.
• A fast hard drive in 2006 can find a disk block in about 7.5 ms, or about 133 uncorrelated accesses per second.
• In contrast, an AMD Sempron 3600+ (top of the value line in late 2006) can execute over 3,000 MFLOPS.
• Successful search in a 10-million-record balanced binary search tree requires about 25 comparisons.
– Trivial in RAM
– About 0.2 s on disk, assuming nobody else is using the system.
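The figures above follow from simple arithmetic. The sketch below reproduces them, assuming one disk access per comparison; the class and method names are made up for illustration.

```java
final class DiskSearchCost {
    /** Number of uncorrelated accesses per second for a given access time in ms. */
    static double accessesPerSecond(double accessMs) {
        return 1000.0 / accessMs;
    }

    /** Seconds spent if every comparison costs one disk access. */
    static double searchSeconds(int comparisons, double accessMs) {
        return comparisons * accessMs / 1000.0;
    }

    public static void main(String[] args) {
        System.out.println(accessesPerSecond(7.5)); // roughly 133
        System.out.println(searchSeconds(25, 7.5)); // 0.1875, i.e. about 0.2 s
    }
}
```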
61
M-ary B-trees
• Data stored in leaves
• Nonleaf nodes
– store up to M-1 keys
– ith key is the smallest key in the (i+1)th subtree
• Root either
– is a leaf or
– has between 2 and M children.
• All interior nodes (except root) have ceil(M/2) to M children.
• All leaves are at the same depth and hold ceil(L/2) to L data items (L is the maximum number of items; a root leaf may have fewer).
62
Sample B-tree
63
Selecting M & L
• Optimal choice of M and L depends upon the minimum amount of information that can be addressed on a disk.
• While disks typically operate on small blocks of bytes (512), many operating systems group the blocks into a larger unit called a cluster.
64
Selecting M & L
• We typically choose M & L based upon the cluster size.
• M = floor(cluster size / key size)
• L = floor(cluster size / record size)
• In the worst case, we will access about log_{M/2}(N) clusters.
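A worked example of these formulas, using made-up sizes: an 8192-byte cluster, 32-byte keys, 256-byte records, and 10 million records. None of these numbers come from the slides; they are assumptions chosen only to make the arithmetic concrete.

```java
final class BTreeSizing {
    /** M = floor(cluster size / key size); integer division floors for positive values. */
    static int branching(int clusterBytes, int keyBytes) {
        return clusterBytes / keyBytes;
    }

    /** L = floor(cluster size / record size). */
    static int leafCapacity(int clusterBytes, int recordBytes) {
        return clusterBytes / recordBytes;
    }

    /** Approximate worst-case cluster accesses: log base M/2 of N. */
    static double worstCaseAccesses(long n, int m) {
        return Math.log(n) / Math.log(m / 2.0);
    }

    public static void main(String[] args) {
        int m = branching(8192, 32);     // assumed sizes, see lead-in
        int l = leafCapacity(8192, 256);
        System.out.println(m + " " + l); // 256 32
        System.out.println(Math.ceil(worstCaseAccesses(10_000_000L, m))); // 4.0
    }
}
```

Under these assumptions a search touches about 4 clusters, compared with roughly 25 disk accesses for the balanced binary search tree.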
65
Adding to a B-tree
• Very easy when there’s room in the leaf node: insert 57
66
Adding to a full node
• Split into two leaves if possible: insert 55
67
When the interior node is full
• insert 40 cannot split easily
68
Splitting an interior node
• interior node covering 40: 8, 18, 26, and 35
• leaf node: 35, 36, 37, 38, 39
• Split leaf: [35, 36, 37], and [38, 39, 40]
• Split the interior node to: [8, 18], promote 26 to parent, [35, 38]
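The leaf split in this example can be sketched as follows, assuming distinct integer keys and a leaf capacity of L = 5 (the value implied by the example). LeafSplit and its insert method are hypothetical names, and the sketch handles only the leaf; updating or splitting the parent is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LeafSplit {
    /**
     * Inserts key into a sorted leaf. Returns one leaf if it still fits within
     * maxItems, otherwise two leaves holding the lower and upper halves.
     */
    static List<List<Integer>> insert(List<Integer> leaf, int key, int maxItems) {
        List<Integer> items = new ArrayList<>(leaf);
        int pos = Collections.binarySearch(items, key);
        if (pos < 0) pos = -pos - 1; // convert "not found" result to insertion point
        items.add(pos, key);

        List<List<Integer>> result = new ArrayList<>();
        if (items.size() <= maxItems) {
            result.add(items);
            return result;
        }
        int mid = items.size() / 2;
        result.add(new ArrayList<>(items.subList(0, mid)));
        result.add(new ArrayList<>(items.subList(mid, items.size())));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> leaf = new ArrayList<>();
        Collections.addAll(leaf, 35, 36, 37, 38, 39);
        System.out.println(insert(leaf, 40, 5)); // [[35, 36, 37], [38, 39, 40]]
    }
}
```

The smallest key of the new right leaf (38 here) is the key the parent gains, which is what overflows the interior node 8, 18, 26, 35 in the example.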
69
Splitting an interior node
70
B-tree insertion
• If parent is full, the process can be repeated.
71
B-tree deletion
• Just remove the item when at least ceil(L/2) items will still remain in the leaf.
• When the number of items is too small, merge nodes.