ECE750-TXB Lecture 7:
Red-Black Trees, Heaps, and Treaps
Todd L. Veldhuizen
Red-Black Trees
Heaps
Treaps
Bibliography
ECE750-TXB Lecture 7: Red-Black Trees, Heaps, and Treaps
Todd L. Veldhuizen
Electrical & Computer Engineering, University of Waterloo
Canada
February 14, 2007
Binary Search Trees
- Recall that in a binary search tree of height h, the time required to find or insert an element is O(h).
- In the worst case h = n, the number of elements.
- To keep h ∈ O(log n) one needs a balancing strategy.
- Balancing strategies may be either:
  - Randomized: e.g. a random insert order results in an expected height of c log n (natural logarithm) with c ≈ 4.311.
  - Deterministic (in the sense of not random).
- Today we will see an example of each:
  - Red-black trees: deterministic balancing.
  - Treaps: randomized. They also demonstrate persistence and unique representation.
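To make the gap between the worst case and a random insert order concrete, here is a minimal sketch (not part of the original slides) of an unbalanced binary search tree insert that records the height. The name `bst_height`, the list-based node layout, the fixed seed, and the choice of 1000 keys are illustrative assumptions.

```python
import random

def bst_height(keys):
    """Insert keys into an unbalanced BST in the given order and return
    its height, counted as the number of nodes on the longest path."""
    root = None          # each node is a list [key, left, right]
    height = 0
    for k in keys:
        if root is None:
            root = [k, None, None]
            depth = 1
        else:
            node, depth = root, 1
            while True:
                depth += 1
                side = 1 if k < node[0] else 2   # 1 = left, 2 = right
                if node[side] is None:
                    node[side] = [k, None, None]
                    break
                node = node[side]
        height = max(height, depth)
    return height

n = 1000
sorted_height = bst_height(range(n))     # sorted input: every node has one child

shuffled = list(range(n))
random.Random(0).shuffle(shuffled)       # fixed seed only for repeatability
random_height = bst_height(shuffled)     # random order: logarithmic in expectation
```

With sorted input the tree degenerates into a chain, so `sorted_height` equals n; with shuffled input the height stays a small multiple of log n.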
Red-black trees
- Red-black trees are a popular form of binary search tree with a deterministic balancing strategy.
- Nodes are coloured red or black.
- Properties of the node colouring ensure that the longest path to a leaf is no more than twice the length of the shortest path.
- This ensures a height of ≤ $2 \log_2(n + 1)$, which implies search, min, and max in O(log n) worst-case time.
- Insert and delete can also be performed in O(log n) worst-case time.
- Invented by Bayer [2]; the red-black formulation is due to Guibas and Sedgewick [9]. Other sources: [5, 10].
Red-Black Trees: Invariants
- Balance invariants:
  1. No red node has a red child.
  2. Within any subtree, every path from the subtree's root down to a leaf contains the same number of black nodes.
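The two invariants can be checked mechanically. The following is a minimal sketch, assuming a hypothetical immutable `Node` record in which `None` stands for an empty subtree (a leaf); it checks only the colouring invariants, not the search-tree ordering of the keys.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    colour: str                 # "R" (red) or "B" (black)
    left: Optional["Node"]      # None stands for an empty subtree (leaf)
    key: int
    right: Optional["Node"]

def black_count(t):
    """Return the number of black nodes on every path from t down to a
    leaf, raising ValueError if either balance invariant is violated."""
    if t is None:
        return 0
    if t.colour == "R":
        for child in (t.left, t.right):
            if child is not None and child.colour == "R":
                raise ValueError(f"invariant 1: red node {t.key} has a red child")
    lb, rb = black_count(t.left), black_count(t.right)
    if lb != rb:
        raise ValueError(f"invariant 2: black counts differ below {t.key}")
    return lb + (1 if t.colour == "B" else 0)

def is_red_black(t):
    """True if both colouring invariants hold everywhere in t."""
    try:
        black_count(t)
        return True
    except ValueError:
        return False

ok = Node("B", Node("R", None, 1, None), 2, Node("R", None, 3, None))
red_red = Node("B", Node("R", Node("R", None, 0, None), 1, None), 2, None)
uneven = Node("B", Node("B", None, 1, None), 2, None)
```

Here `ok` satisfies both invariants, `red_red` violates invariant 1, and `uneven` violates invariant 2 because its left paths contain more black nodes than its right paths.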
Red-Black Trees: Balance I
Let bh(x) be the number of black nodes along any path from a node x to a leaf, excluding the leaf.
Lemma 1.1. The number of internal nodes in the subtree rooted at x is at least $2^{bh(x)} - 1$.
Proof.
Red-Black Trees: Balance II
By induction on height:
1. Base case: If x has height 0, then x is a leaf, and bh(x) = 0; the number of internal (non-leaf) descendants of x is $0 = 2^{bh(x)} - 1$.
2. Induction step: Assume the hypothesis is true for height ≤ h. Consider a node x of height h + 1. From invariant (2), the children have black height either bh(x) − 1 (if the child is black) or bh(x) (if the child is red). By the induction hypothesis, each child subtree has at least $2^{bh(x)-1} - 1$ internal nodes. The total number of internal nodes in the subtree rooted at x is therefore
   $\geq (2^{bh(x)-1} - 1) + 1 + (2^{bh(x)-1} - 1) = 2^{bh(x)} - 1$.
Red-Black Trees: Balance
Theorem. A red-black tree with n internal nodes has height at most $2 \log_2(n + 1)$.
Proof. Let h be the tree height. From invariant 1 (a red node must have both children black), at least half of the nodes on any path from the root to a leaf are black, so the black-height of the root must be ≥ h/2. Applying Lemma 1.1, the number of internal nodes n of the tree satisfies $n \geq 2^{h/2} - 1$. Rearranging, $h \leq 2 \log_2(n + 1)$.
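The rearrangement in the last step of the proof can be spelled out line by line. This is only a worked version of the step already stated; no new assumptions are introduced.

```latex
\begin{align*}
n &\ge 2^{h/2} - 1 \\
n + 1 &\ge 2^{h/2} \\
\log_2(n + 1) &\ge h/2 \\
h &\le 2 \log_2(n + 1)
\end{align*}
```

For instance, a red-black tree with n = 1023 internal nodes has height at most $2 \log_2(1024) = 20$.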
Red-Black Trees: Balance
- As with all non-randomized binary search trees, balance must be maintained when insert or delete operations are performed.
- These operations may disrupt the invariants, so rotations and recolourings are needed to restore them.
- Insert for a red-black tree:
  1. Insert the new key as a red node, using the usual binary search tree insert.
  2. Perform restructurings and recolourings along the path from the newly added node to the root to restore the invariants.
  3. Colour the root black; the root is always black.
Red-Black Trees: Balance
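- Four cases for red nodes with red children:
  [Figure: the four configurations; not preserved in this transcript]
- Restructure/recolour to correct: each of the above cases becomes
  [Figure: the corrected subtree; not preserved in this transcript]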
Red-Black Trees: Example
- Insertion of [1,2,3,4,5] into a red-black tree:
  [Figure: the tree after each insertion; not preserved in this transcript]
- Implementation of rebalancing is straightforward but a bit involved.
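As one possible illustration of the three insert steps, here is a sketch of a purely functional insert in the style described by Okasaki [10]. Whether the slides' figures show exactly this transformation is an assumption; the names `Node`, `balance`, and `insert` are illustrative. Each of the four red-red configurations beneath a black node is rewritten into the same shape: a red node with two black children.

```python
from typing import NamedTuple, Optional

R, B = "R", "B"

class Node(NamedTuple):
    colour: str
    left: Optional["Node"]
    key: int
    right: Optional["Node"]

def is_red(t):
    return t is not None and t.colour == R

def balance(colour, l, key, r):
    """Rebuild a node, repairing a red child that has a red child beneath
    a black node. All four cases yield the same red-rooted subtree."""
    if colour == B:
        if is_red(l) and is_red(l.left):
            a, x, b = l.left.left, l.left.key, l.left.right
            y, c, z, d = l.key, l.right, key, r
        elif is_red(l) and is_red(l.right):
            a, x = l.left, l.key
            b, y, c = l.right.left, l.right.key, l.right.right
            z, d = key, r
        elif is_red(r) and is_red(r.left):
            a, x = l, key
            b, y, c = r.left.left, r.left.key, r.left.right
            z, d = r.key, r.right
        elif is_red(r) and is_red(r.right):
            a, x, b = l, key, r.left
            y, c, z, d = r.key, r.right.left, r.right.key, r.right.right
        else:
            return Node(colour, l, key, r)
        return Node(R, Node(B, a, x, b), y, Node(B, c, z, d))
    return Node(colour, l, key, r)

def insert(tree, key):
    def ins(t):
        if t is None:
            return Node(R, None, key, None)       # step 1: new node is red
        if key < t.key:                           # step 2: repair on the way up
            return balance(t.colour, ins(t.left), t.key, t.right)
        if key > t.key:
            return balance(t.colour, t.left, t.key, ins(t.right))
        return t                                  # key already present
    t = ins(tree)
    return Node(B, t.left, t.key, t.right)        # step 3: root is black

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

def height(t):
    return 0 if t is None else 1 + max(height(t.left), height(t.right))

tree = None
for k in [1, 2, 3, 4, 5]:
    tree = insert(tree, k)
```

Under this sketch, inserting 1 through 5 in order yields a black root 2 with a black left child 1 and a red right child 4, whose children 3 and 5 are black. Other red-black insert algorithms (for example the rotation-based one in [5]) may produce a different but equally valid tree.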
Heaps and Treaps
- Treaps are randomized search trees that combine TRees and hEAPS.
- First, let’s look at heaps.
- Consider determining the maximum element of a set.
  - We could iterate through the array and keep track of the maximum element seen so far. Time taken: Θ(n).
  - We could build a binary search tree (e.g. red-black). We can obtain the maximum (minimum) element in O(h) time by following rightmost (leftmost) branches. If the tree is balanced, this requires O(n log n) time to build the tree, and O(log n) time to retrieve the maximum element.
- A heap is a highly efficient data structure for maintaining the maximum element of a set. It is a rudimentary example of a dynamic algorithm/data structure.
Dynamic Algorithms
- A static problem is one where we are given an instance of a problem to solve, we solve it, and are done (e.g., sort an array).
- A dynamic problem is one where we are given a problem to solve, and we solve it.
  - Then the problem is changed slightly and we re-solve.
  - ...ad infinitum.
- The challenge goes from solving a single instance of a problem to maintaining a solution as the problem is modified.
- It is usually more efficient to update the solution than to recompute it from scratch.
- e.g., binary search trees can be viewed as a method for dynamically maintaining an ordered list as elements are inserted and removed.
Heaps
- A heap dynamically maintains the maximum element in a collection (or, dually, the minimum element). A binary heap can:
  - Obtain the maximum element in O(1) time;
  - Remove the maximum element in O(log n) time;
  - Insert a new element in O(log n) time.
  Heaps are a natural implementation of the PriorityQueue ADT.
- There are several flavours of heaps: binary heaps, binomial heaps, Fibonacci heaps, pairing heaps. The more sophisticated of these support merging (melding) two heaps.
- We will look at binary heaps.
Binary Heap Invariants
1. A binary heap is a complete binary tree of height h − 1, plus a possibly incomplete level of height h filled from left to right.
2. The key stored at each node is ≥ the key(s) stored in its children.
Binary Heap
- A binary heap may be stored as a (1-based) array, where
  - Parent(j) = ⌊j/2⌋
  - LeftChild(i) = 2 ∗ i
  - RightChild(i) = 2 ∗ i + 1
- e.g., [17, 11, 13, 9, 6, 2, 12, 4, 3, 1] is an array representation of the heap:
  level 0: 17
  level 1: 11 13
  level 2: 9 6 2 12
  level 3: 4 3 1
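The index arithmetic can be tried directly on the example array. The sketch below assumes a Python list whose slot 0 is unused padding, so that the heap occupies positions 1..n as on the slide; the function names are illustrative.

```python
def parent(j):
    return j // 2            # floor(j / 2)

def left_child(i):
    return 2 * i

def right_child(i):
    return 2 * i + 1

def is_max_heap(a):
    """Check the ordering invariant; a[0] is unused padding so that the
    heap occupies a[1..n]."""
    n = len(a) - 1
    return all(a[parent(j)] >= a[j] for j in range(2, n + 1))

heap = [None, 17, 11, 13, 9, 6, 2, 12, 4, 3, 1]

# The node at position 2 holds 11; its children sit at positions 4 and 5.
children_of_11 = (heap[left_child(2)], heap[right_child(2)])   # (9, 6)
```

Because the array is filled level by level from left to right, the shape invariant holds automatically, and only the ordering invariant needs an explicit check.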
Heap operations
- To insert a key k into the heap:
  - Place k at the next available position.
  - Swap k with its parent(s) until the heap invariant is satisfied. (Takes O(log n) time.)
- The maximum element is just the key stored at the root, which can be read off in O(1) time.
- To delete the maximum element:
  - Place the key at the last heap position at the root (overwriting the current maximum), and decrease the size of the heap by one.
  - Choose the largest of the root and its two children, and swap it into the root position; perform this procedure recursively on the affected subtree until the heap invariant is satisfied. (Takes O(log n) time.)
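The operations just described can be sketched on the 1-based array representation. The function names `heap_insert` and `heap_delete_max` are illustrative assumptions, slot 0 of the list is unused padding, and the starting array is the example [17, 11, 13, 9, 6, 2, 12, 4, 3, 1]; `heap_delete_max` assumes a non-empty heap.

```python
def heap_insert(a, k):
    """Append k at the next free position, then swap it upward."""
    a.append(k)
    j = len(a) - 1
    while j > 1 and a[j // 2] < a[j]:
        a[j // 2], a[j] = a[j], a[j // 2]
        j //= 2

def heap_delete_max(a):
    """Remove and return the root; assumes the heap is non-empty."""
    top = a[1]
    last = a.pop()                 # shrink the heap by one
    n = len(a) - 1
    if n >= 1:
        a[1] = last                # overwrite the old maximum
        i = 1
        while True:
            largest = i
            for c in (2 * i, 2 * i + 1):
                if c <= n and a[c] > a[largest]:
                    largest = c
            if largest == i:       # heap invariant restored
                break
            a[i], a[largest] = a[largest], a[i]
            i = largest
    return top

heap = [None, 17, 11, 13, 9, 6, 2, 12, 4, 3, 1]
heap_insert(heap, 23)              # 23 rises past 6, 11 and 17 to the root
after_insert = list(heap)
removed = heap_delete_max(heap)    # 6 moves to the root and sinks back down
```

In this run the inserted key travels the full height of the tree, and deleting the maximum afterwards restores the original array.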
Heap: insert example
- Example: insert 23 into the heap and restore the heap invariant.
Heap: delete-max example
- To delete the max element, move the element from the last position (2) to the root.
- To restore the heap invariant, swap the root with its largest child if that child is greater than it, and repeat down the heap.
Treaps
Treaps (binary TRee + hEAP)
- a randomized binary search tree
- with O(log n) average-case insert, delete, search
- with O(∆ log n) average-case union, intersection, ⊆, ⊇, where ∆ = |(A \ B) ∪ (B \ A)| is the size of the symmetric difference of the two sets
- uniquely represented (to be explained)
- easily made persistent (to be explained)
- Due to Vuillemin [14] and, independently, Seidel and Aragon [11]. Additional references: [3, 16, 15].
Treaps: Basics
- Keys are assigned (randomly chosen) priorities.
- Two total orders on keys:
  - The usual key order;
  - A randomly chosen priority order, often obtained by assigning each key a random integer, or using an appropriate hash function.
- Treaps are kept sorted by key in the usual way (an inorder tree traversal visits keys in order).
- The heap property is maintained with respect to the priority order.
Treap ordering
- Each node has key k and priority p
- Ordering invariants:

            (k2, p2)
            /      \
     (k1, p1)      (k3, p3)

  Key order: $k_1 \le k_2 \le k_3$
  Priority order: $p_2 \ge_p p_1$ and $p_2 \ge_p p_3$

Every node has a higher priority than its descendants.
Treaps: Basics
- If priorities are chosen randomly, the tree is on average balanced, and insert, delete, search take O(log n) expected time.
- Random priorities behave like a random insertion order: the structure of the treap is exactly that obtained by inserting the keys into a binary search tree in descending order of heap priority.
- If keys are unique (no duplicates), and priorities are unique, then the treap has the unique representation property.
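A small sketch can make the unique representation property tangible. It assumes priorities derived from a hash of the key (one of the options mentioned earlier), here the first 8 bytes of SHA-256, and uses the usual rotation-based insert; the names `priority`, `insert`, `shape`, and `build` are illustrative, and the priorities of the five example keys are assumed to be distinct.

```python
import hashlib
from itertools import permutations

def priority(key):
    """Hypothetical priority function: equal keys always get equal priorities."""
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:8], "big")

class Node:
    def __init__(self, key):
        self.key, self.prio = key, priority(key)
        self.left = self.right = None

def rotate_right(t):
    l = t.left
    t.left, l.right = l.right, t
    return l

def rotate_left(t):
    r = t.right
    t.right, r.left = r.left, t
    return r

def insert(t, key):
    """Ordinary BST insert, then rotate the new node up while it has a
    higher priority than its parent."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
        if t.left.prio > t.prio:
            t = rotate_right(t)
    elif key > t.key:
        t.right = insert(t.right, key)
        if t.right.prio > t.prio:
            t = rotate_left(t)
    return t

def shape(t):
    """Nested-tuple view of the tree structure, for comparison."""
    return None if t is None else (shape(t.left), t.key, shape(t.right))

def build(keys):
    t = None
    for k in keys:
        t = insert(t, k)
    return t

# Every one of the 120 insertion orders of these keys gives the same tree.
shapes = {shape(build(p)) for p in permutations("treap")}
```

The set `shapes` ends up with a single element: the structure depends only on the keys and their priorities, not on the insertion order.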
Unique representation
- Unique representation: each set is represented by a unique data structure [1, 13, 12].
- Most tree data structures do not have this property: depending on the order of inserts, deletes, etc., the tree can have different forms for the same set of keys.
- Recall there are $C_n \sim 4^n n^{-3/2} \pi^{-1/2}$ ways to place n keys in a binary search tree (Catalan numbers), e.g. $C_{20} = 6564120420$.
- Deterministic (i.e., not randomized) uniquely represented search trees are known to require $\Omega(\sqrt{n})$ worst-case time for insert, delete, search [12].
- Treaps are randomized (not deterministic), and have O(log n) average-case time for insert, delete, search.
- If you memoize or cache the constructors of a uniquely represented data structure, you can do equality testing in O(1) time by comparing pointers.
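The Catalan count quoted above is easy to verify. The sketch below uses the closed form $C_n = \binom{2n}{n} / (n + 1)$, a standard identity not stated on the slide, and compares it with the asymptotic estimate; the function names are illustrative.

```python
from math import comb, pi, sqrt

def catalan(n):
    """Exact n-th Catalan number: the number of binary search tree shapes
    on n keys."""
    return comb(2 * n, n) // (n + 1)

def catalan_asymptotic(n):
    """Leading-order estimate 4^n / (n^(3/2) * sqrt(pi))."""
    return 4 ** n / (n ** 1.5 * sqrt(pi))

c20 = catalan(20)                          # 6564120420
ratio = c20 / catalan_asymptotic(20)       # approaches 1 as n grows
```

At n = 20 the estimate is already within about 6% of the exact value, which shows how many distinct shapes a non-canonical search tree could take for the same 20 keys.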
Treap: Example
Treap A1 = R.insert("f");  // Insert the key f into R
Treap A2 = A1.insert("u"); // Insert the key u into A1
Treap B1 = R.insert("u");  // Insert the key u into R
Treap B2 = B1.insert("f"); // Insert the key f into B1
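The example can be reproduced with a small persistent treap whose node constructor is memoized, so that structurally equal treaps are the very same object. This is a sketch under stated assumptions, not the lecture's implementation: the names `mk`, `_table`, and `prio` are hypothetical, priorities come from SHA-256 of the key, and the starting treap R is assumed to hold the keys "c" and "m".

```python
import hashlib
from typing import NamedTuple, Optional

class Node(NamedTuple):
    left: Optional["Node"]
    key: str
    right: Optional["Node"]

_table = {}   # memo table for the constructor; it also keeps nodes alive

def mk(left, key, right):
    """Memoized constructor: the same (left, key, right) triple always
    yields the same object, so equality is a pointer comparison."""
    ident = (id(left), key, id(right))
    node = _table.get(ident)
    if node is None:
        node = _table[ident] = Node(left, key, right)
    return node

def prio(key):
    return hashlib.sha256(key.encode()).digest()

def insert(t, key):
    """Return a treap containing key; t itself is never modified."""
    if t is None:
        return mk(None, key, None)
    if key == t.key:
        return t
    if key < t.key:
        l = insert(t.left, key)
        if prio(l.key) > prio(t.key):      # rotate right by building new nodes
            return mk(l.left, l.key, mk(l.right, t.key, t.right))
        return mk(l, t.key, t.right)
    r = insert(t.right, key)
    if prio(r.key) > prio(t.key):          # rotate left by building new nodes
        return mk(mk(t.left, t.key, r.left), r.key, r.right)
    return mk(t.left, t.key, r)

def keys(t):
    return [] if t is None else keys(t.left) + [t.key] + keys(t.right)

R = insert(insert(None, "m"), "c")   # assumed starting treap
A1 = insert(R, "f")
A2 = insert(A1, "u")
B1 = insert(R, "u")
B2 = insert(B1, "f")
same = A2 is B2                      # O(1) equality test
```

All five versions remain usable after the inserts (persistence), and `A2 is B2` holds because both contain the same keys and every node was built through the memoized constructor.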
Canonical forms
- The structure of the treap does not depend on the order in which the operations are carried out.
- Treaps give a canonical form for sets: if A, B are sets, we can determine whether A = B by constructing treaps containing the elements of A and B (using the same priority assignment), and comparing them. The sets are equal exactly when the treaps are the same.
- Treaps give an easy decision procedure for equality of terms modulo associativity, commutativity, and idempotency.
- Treaps are very useful in program analysis (e.g., for compilers) for solving fixpoint equations on sets.
Persistent Data Structures
Literature: [7, 8, 4, 6]
- Partially persistent: Can access previous versions of a data structure, but cannot derive new versions from them (read-only access to a linear past).
- Fully persistent: Can make changes in previous versions of the data structure: versions can “fork.”
  - Any linked data structure with constant bounded in-degree can be made fully persistent with amortized O(1) space and time overhead, and worst-case O(1) overhead for access [7].
- Confluently persistent: Can branch into two versions of the data structure, and later reconcile these branches.
The Version Graph
The version graph shows how versions of a data structure are derived from one another.
- Vertices: Data structures
- Edges: Show how one data structure was derived from another
- Treaps example:

          R
         / \
       A1   B1
       |     |
       A2   B2
Version graph
- Partial persistence: the version graph is a linear sequence of versions, each derived from the previous version.
- Full persistence: get a version tree.
- Confluent persistence: get a version DAG (directed acyclic graph).
  [Figure: example version DAG with vertices X, Y1, Y2, Z, and W; the edge layout is not recoverable from this transcript]
Purely Functional Data Structures
- Literature: [10]
- Functional data structures: cannot modify a node of the data structure once it is created. (One implication: no cyclic data structures.)
- Functional data structures are by nature persistent: we can always hold onto pointers to old versions of the data structure, and new versions can be derived from any of them.
Scopes
- Partial persistence is very useful for managing scopes in compilers and program analysis.
- A scope is a representation of the names that are visible at a given program point:
int foo(int a, int b)              // S1
{
  int x = a*a, y = b*b, z=0;       // S2
  for (int k=0; k < x; ++k)        // S3
    for (int l=0; l < y; ++l)      // S4
      ++c;                         // S5
  return x;
}
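The scopes S1 to S5 in the fragment above can be modelled as versions of one persistent structure. The sketch below deliberately uses a simple immutable linked list of declarations rather than a treap, to keep it short; the names `Scope`, `declare`, `lookup`, and `visible`, and the exact program points that S1 to S5 denote, are assumptions.

```python
from typing import NamedTuple, Optional

class Scope(NamedTuple):
    name: str                    # most recently declared name
    type_: str                   # its declared type
    outer: Optional["Scope"]     # the scope this one was derived from

def declare(scope, name, type_):
    """Return a new scope; the old scope stays valid and unchanged."""
    return Scope(name, type_, scope)

def lookup(scope, name):
    """Type of name if it is visible in scope, otherwise None."""
    while scope is not None:
        if scope.name == name:
            return scope.type_
        scope = scope.outer
    return None

def visible(scope):
    """Sorted list of all names visible in scope."""
    names = set()
    while scope is not None:
        names.add(scope.name)
        scope = scope.outer
    return sorted(names)

S0 = None                                              # nothing declared yet
S1 = declare(declare(S0, "a", "int"), "b", "int")      # parameters
S2 = declare(declare(declare(S1, "x", "int"), "y", "int"), "z", "int")
S3 = declare(S2, "k", "int")                           # outer loop variable
S4 = declare(S3, "l", "int")                           # inner loop variable
S5 = S4                                                # loop body adds no names
```

Each scope shares its tail with the scope it was derived from, so keeping S2 around after building S4 costs nothing extra. Note that `lookup(S5, "c")` returns None here, because `c` is not declared anywhere in the fragment.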
Bibliography
[1] A. Andersson and T. Ottmann. Faster uniquely represented dictionaries. In Proceedings of the 32nd Annual Symposium on Foundations of Computer Science, San Juan, Puerto Rico, October 1–4, 1991, pages 642–649. IEEE Computer Society Press, 1991.
[2] Rudolf Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Informatica, 1:290–306, 1972.
[3] Guy E. Blelloch and Margaret Reid-Miller. Fast set operations using treaps. In Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 16–26, Puerto Vallarta, Mexico, June 1998.
[4] Adam L. Buchsbaum and Robert E. Tarjan. Confluently persistent deques via data-structural bootstrapping. In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 155–164. ACM Press, 1993.
[5] Thomas H. Cormen, Charles E. Leiserson, and Ronald R. Rivest. Introduction to Algorithms. McGraw Hill, 1991.
[6] P. F. Dietz. Fully persistent arrays. In F. Dehne, J.-R. Sack, and N. Santoro, editors, Proceedings of the Workshop on Algorithms and Data Structures, volume 382 of LNCS, pages 67–74, Berlin, August 1989. Springer.
[7] James R. Driscoll, Neil Sarnak, Daniel Dominic Sleator, and Robert Endre Tarjan. Making data structures persistent. In ACM Symposium on Theory of Computing, pages 109–121, 1986.
[8] Amos Fiat and Haim Kaplan. Making data structures confluently persistent. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA-01), pages 537–546, New York, January 7–9, 2001. ACM Press.
[9] Leonidas J. Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In FOCS, pages 8–21. IEEE, 1978.
[10] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, Cambridge, UK, 1998.
[11] Raimund Seidel and Cecilia R. Aragon. Randomized search trees. Algorithmica, 16(4/5):464–497, 1996.
[12] Lawrence Snyder. On uniquely representable data structures. In 18th Annual Symposium on Foundations of Computer Science, pages 142–146, Long Beach, Ca., USA, October 1977. IEEE Computer Society Press.
[13] R. Sundar and R. E. Tarjan. Unique binary search tree representations and equality-testing of sets and sequences. In Baruch Awerbuch, editor, Proceedings of the 22nd Annual ACM Symposium on the Theory of Computing, pages 18–25, Baltimore, MD, May 1990. ACM Press.
[14] Jean Vuillemin. A unifying look at data structures. Communications of the ACM, 23(4):229–239, 1980.
[15] M. A. Weiss. A note on construction of treaps and Cartesian trees. Information Processing Letters, 54(2):127–127, April 1995.
[16] Mark Allen Weiss. Linear-time construction of treaps and Cartesian trees. Information Processing Letters, 52(5):253–257, December 1994.