CIS210 1
Topic: Binary Search Trees (Non-Linear Data Structures for Searching)
CIS210 2
The Searching Problem
Fundamental to a variety of computer problems!
[Figure: a search key is used to look up an item in a data structure that stores (key, data) pairs.]
CIS210 3
Search Trees
CIS210 4
A Search Tree?
A tree structure used to store data because its organization allows more efficient access to the data.
A tree that maintains its data in some sorted order and supports efficient search operations.
It does so by constraining the relative positions of the nodes in the tree.
CIS210 5
Binary Search Trees (BST) as
Non-linear Data Structures
CIS210 6
A Binary Search Tree?
A binary tree + a search tree.
A special kind of binary tree with an ordering condition:
Between every node and the nodes in its left subtree.
Between every node and the nodes in its right subtree.
BST Order Property!
CIS210 7
The Order Condition of BST
BST property - For any node N
The key value in every node in N’s left subtree is less
than or equal to the key value K in N.
The key value in every node in N’s right subtree is
greater than the key value K in N.
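A minimal sketch of a checker for this order property. The bare `Node` struct, the integer keys, and the function names are illustrative assumptions, not the class declared later in these slides. Each recursive call carries the range that the keys of the subtree must fall in.

```cpp
#include <cassert>
#include <climits>

// Hypothetical node type for illustration only.
struct Node {
    int key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// True if every key in the subtree lies in the range (low, high]
// and the subtree itself satisfies the BST order property.
bool isBSTInRange(const Node* n, long long low, long long high) {
    if (n == nullptr) return true;               // an empty tree is a BST
    if (n->key <= low || n->key > high) return false;
    // Left subtree: keys <= n->key.  Right subtree: keys > n->key.
    return isBSTInRange(n->LchildPtr, low, n->key) &&
           isBSTInRange(n->RchildPtr, n->key, high);
}

bool isBST(const Node* root) {
    return isBSTInRange(root, LLONG_MIN, LLONG_MAX);
}
```

Checking only each node against its two children is not enough; the range argument is what enforces the condition against every node in the subtree.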
CIS210 8
Logical Structure of BST
[Figure: a root node with a left subtree Tleft and a right subtree Tright.]
CIS210 9
Binary Search Trees as ADTs
CIS210 10
Operations on Binary Search Trees
Create an empty binary search tree.
Destroy a binary search tree.
Insert a new item to the binary search tree.
Delete the item with a given search key from a binary search tree.
Search/Retrieve the item with a given search key from a binary search tree.
Determine whether a binary search tree is empty.
Traverse the items in a binary search tree in preorder, inorder or postorder.
...
CIS210 11
A Pointer-Based Representation using Template Class
template<class DataType>
class BST;    // Forward declaration, needed by the friend declaration below

template<class DataType>
class BSTnode
{
public:
    BSTnode();
    BSTnode(DataType D, BSTnode<DataType>* l,
            BSTnode<DataType>* r)
        : data(D), LchildPtr(l), RchildPtr(r) { }
    friend class BST<DataType>;
private:
    DataType data;                  // Item stored in this node
    BSTnode<DataType>* LchildPtr;   // Pointer to the left child
    BSTnode<DataType>* RchildPtr;   // Pointer to the right child
};
CIS210 12
A Pointer-Based Representation using Template Class
template<class DataType>
class BST
{
public:
BST();
…
private:
BSTnode<DataType>* rootBT;
…
};
CIS210 13
Binary Search Tree ADT
template < class DT, class KF > // Forward dec. of the BSTree class
class BSTree;
template < class DT, class KF >
class BSTreeNode // Facilitator for the BSTree class
{
private:
// Constructor
BSTreeNode ( const DT &nodeDataItem,
BSTreeNode *leftPtr, BSTreeNode *rightPtr );
// Data members
KF searchKey;
DT dataItem; // Binary search tree data item
BSTreeNode *left, // Pointer to the left child
*right; // Pointer to the right child
friend class BSTree<DT,KF>;
};
CIS210 14
Binary Search Tree ADT
template < class DT, class KF > // DT : tree data item
class BSTree // KF : key field
{
public:
// Constructor
BSTree ();
// Destructor
~BSTree ();
CIS210 15
Binary Search Tree ADT
// Binary search tree manipulation operations
void insert (KF searchKey, const DT &newDataItem ); // Insert data item
bool retrieve ( KF searchKey, DT &searchDataItem ) const;
// Retrieve data item
bool remove ( KF deleteKey ); // Remove data item
void writeKeys () const; // Output keys
void clear (); // Clear tree
CIS210 16
Binary Search Tree ADT
// Binary search tree status operations
bool isEmpty () const; // Tree is empty
bool isFull () const; // Tree is full
// Output the tree structure -- used in testing/debugging
void showStructure () const;
int getHeight () const; // Height of tree
void writeLessThan ( KF searchKey ) const; // Output keys
// < searchKey
CIS210 17
Binary Search Tree ADT
private:
// Recursive partners of the public member functions -- insert
// prototypes of these functions here.
void insertSub ( BSTreeNode<DT,KF> *&p, KF searchKey,
const DT &newDataItem );
bool retrieveSub ( BSTreeNode<DT,KF> *p, KF searchKey,
DT &searchDataItem) const;
bool removeSub ( BSTreeNode<DT,KF> *&p, KF deleteKey );
void clearSub ( BSTreeNode<DT,KF> *p );
void showSub ( BSTreeNode<DT,KF> *p, int level ) const;
int getHeightSub ( BSTreeNode<DT,KF> *p ) const;
CIS210 18
Binary Search Tree ADT
// Data member
BSTreeNode<DT,KF> *root; // Pointer to the root node
};
CIS210 19
Example: Insertions of D, B, F, A, C and E
[Figure: the tree after each insertion. Final tree: root D with children B and F; B has children A and C; F has left child E.]
CIS210 20
Example: What order?
[Figure: a balanced tree with root 4; children 2 and 6; leaves 1, 3, 5 and 7.]
Insertion Order: 4, 2, 6, 1, 3, 5 and 7
CIS210 21
Example: What order?
Insertion Order: 1, 2, 3, 4, 5, 6 and 7
[Figure: a right-leaning chain 1, 2, 3, 4, 5, 6, 7; every node has only a right child.]
CIS210 22
Example: What order?
[Figure: a zig-zag chain 1, 7, 2, 6, 3, 5, 4; every node has exactly one child, alternating right and left.]
Insertion Order: 1, 7, 2, 6, 3, 5 and 4
CIS210 23
Insertion Operation - Recursive
Insert (BST, newitem)
If BST == NULL (empty tree) then
Create a new node; let BST point to this new node; copy
newitem into the new node’s data portion; set the pointers
in the new node to NULL.
else if newitem.Key < BST->Key then
Insert (BST->LchildPtr, newitem)
else
Insert (BST->RchildPtr, newitem)
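The pseudocode above can be sketched in C++ as follows. The `Node` struct with a `char` key and the `destroy` helper are illustrative assumptions; the member names follow the slides. Like the pseudocode, this sketch sends equal keys to the right, and the examples in these slides use distinct keys.

```cpp
#include <cassert>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr = nullptr;
    Node* RchildPtr = nullptr;
    explicit Node(char k) : key(k) {}
};

// The pointer is passed by reference, so assigning to `bst` links the
// new node into its parent (or sets the root of an empty tree).
void insert(Node*& bst, char newKey) {
    if (bst == nullptr) {
        bst = new Node(newKey);           // empty tree: create the node here
    } else if (newKey < bst->key) {
        insert(bst->LchildPtr, newKey);   // smaller keys go left
    } else {
        insert(bst->RchildPtr, newKey);   // all other keys go right
    }
}

// Frees every node of the tree (postorder).
void destroy(Node* bst) {
    if (bst == nullptr) return;
    destroy(bst->LchildPtr);
    destroy(bst->RchildPtr);
    delete bst;
}
```

Passing `Node*&` mirrors the `BSTreeNode<DT,KF> *&p` parameter of `insertSub` in the class declaration earlier in the slides.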
CIS210 24
BST with the Same Data
Are several different binary search trees possible for
the same data?
Yes: the shape depends on the order in which the items are inserted.
CIS210 25
Insertion Order and Shape of BST
Insertion in search-key (sorted) order produces
a maximum-height binary search tree!
Insertion in random order tends to produce
a near-minimum-height binary search tree (expected height O(log N))!
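To make the effect of insertion order concrete, the following sketch builds a tree from a given key sequence and reports its height, counting a single node as height 1 as in the proofs later in these slides. The `Node` struct and the helper `heightAfterInserting` are hypothetical names introduced for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical node type for illustration only.
struct Node {
    int key;
    Node* LchildPtr = nullptr;
    Node* RchildPtr = nullptr;
    explicit Node(int k) : key(k) {}
};

void insert(Node*& bst, int newKey) {
    if (bst == nullptr)          bst = new Node(newKey);
    else if (newKey < bst->key)  insert(bst->LchildPtr, newKey);
    else                         insert(bst->RchildPtr, newKey);
}

// Height of an empty tree is 0; a single node has height 1.
int height(const Node* bst) {
    if (bst == nullptr) return 0;
    return 1 + std::max(height(bst->LchildPtr), height(bst->RchildPtr));
}

void destroy(Node* bst) {
    if (bst == nullptr) return;
    destroy(bst->LchildPtr);
    destroy(bst->RchildPtr);
    delete bst;
}

// Inserts the keys in the given order and returns the resulting height.
int heightAfterInserting(const std::vector<int>& keys) {
    Node* root = nullptr;
    for (int k : keys) insert(root, k);
    int h = height(root);
    destroy(root);
    return h;
}
```

With the three insertion orders from the preceding examples, `{4, 2, 6, 1, 3, 5, 7}` gives height 3, while `{1, 2, 3, 4, 5, 6, 7}` and `{1, 7, 2, 6, 3, 5, 4}` both give height 7.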
CIS210 26
Example: Search F (Successful)
[Figure: searching for F in a BST that contains the keys A, B, C, D, E, F and G.]
CIS210 27
Example: Search H (Unsuccessful)
[Figure: searching for H in a BST that contains the keys A, B, C, D, E, F and G.]
CIS210 28
Search Operation - Recursive
Search(BST, SearchKey):
If BST == NULL (empty tree) then
Not Found (Unsuccessful search)
else if SearchKey == BST->Key then
Found (Successful search)
else if SearchKey < BST->Key then
Search (BST->LchildPtr, SearchKey)
else
Search (BST->RchildPtr, SearchKey)
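A minimal C++ sketch of the recursive search above. The `Node` struct with a `char` key is an illustrative assumption; returning the node pointer (or `nullptr`) stands in for "Found" and "Not Found".

```cpp
#include <cassert>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Returns the node holding searchKey, or nullptr if the key is absent.
const Node* search(const Node* bst, char searchKey) {
    if (bst == nullptr) return nullptr;               // unsuccessful search
    if (searchKey == bst->key) return bst;            // successful search
    if (searchKey < bst->key)
        return search(bst->LchildPtr, searchKey);     // continue in the left subtree
    return search(bst->RchildPtr, searchKey);         // continue in the right subtree
}
```

Each call descends one level, so the number of key comparisons is bounded by the height of the tree.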
CIS210 29
BST Search vs Binary Search
Searching for a key value V in a binary search tree is
similar to performing a binary search in a sorted
array.
If V = the key data, then the search succeeds.
If V < the key data, the search continues
BST: in the left subtree.
Binary search: in the left half of the current part of the array.
If V > the key data, the search continues
BST: in the right subtree.
Binary search: in the right half of the current part of the array.
CIS210 30
Find Min (Smallest) Operation - Iterative
FindMin(BST):
Start at the root node BST.
Follow the chain of left subtrees until we get to the
node that has an empty left subtree.
The key in that node is the smallest in the BST.
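The iterative description above can be sketched as a simple loop. The `Node` struct is an illustrative assumption, and returning `nullptr` for an empty tree is a choice made for this sketch.

```cpp
#include <cassert>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Follows left links until a node with an empty left subtree is reached.
const Node* findMin(const Node* bst) {
    if (bst == nullptr) return nullptr;       // empty tree has no minimum
    while (bst->LchildPtr != nullptr) {
        bst = bst->LchildPtr;
    }
    return bst;
}
```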
CIS210 31
Find Min (Smallest) Operation - Recursive
FindMin(BST):
If BST == NULL then return NULL.
If BST->LchildPtr == NULL (No left subtree) then
Return BST
else
Return FindMin (BST->LchildPtr)
CIS210 32
Example: FindMin
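[Figure: FindMin on two example BSTs: the subtree rooted at R (keys R, Z) and the full example tree (keys B, J, K, L, M, N, P, Q, R, Z).]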
CIS210 33
Find Max (Largest) Operation - Iterative
FindMax(BST):
Start at the root node BST.
Follow the chain of right subtrees until we get to the
node that has an empty right subtree.
The key in that node is the largest in the BST.
CIS210 34
Find Max (Largest) Operation - Recursive
FindMax(BST):
If BST == NULL then return NULL.
If BST->RchildPtr == NULL (No right subtree) then
Return BST
else
Return FindMax (BST->RchildPtr)
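The recursive version translates almost line for line into C++. The `Node` struct is an illustrative assumption; note that the recursive call's result has to be returned to the caller.

```cpp
#include <cassert>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Returns the node with the largest key, or nullptr for an empty tree.
const Node* findMax(const Node* bst) {
    if (bst == nullptr) return nullptr;              // empty tree
    if (bst->RchildPtr == nullptr) return bst;       // no right subtree: this is the maximum
    return findMax(bst->RchildPtr);                  // otherwise keep going right
}
```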
CIS210 35
Example: FindMax
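[Figure: FindMax on two example BSTs: the subtree rooted at L (keys K, L, M, N, P) and the full example tree (keys B, J, K, L, M, N, P, Q, R, Z).]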
CIS210 36
Traversal Operation on BST
Preorder traversal
Inorder traversal
Postorder traversal
CIS210 37
Inorder Traversal of BST
The inorder traversal of a binary search tree
visits the nodes
in sorted search-key order.
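A short sketch of an inorder traversal that appends each visited key to a string, so the sorted order is easy to see. The `Node` struct and the use of `std::string` as the output are illustrative assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Visits the left subtree, then the node, then the right subtree.
void inorder(const Node* bst, std::string& out) {
    if (bst == nullptr) return;
    inorder(bst->LchildPtr, out);
    out.push_back(bst->key);
    inorder(bst->RchildPtr, out);
}
```

Applied to the ten-key example tree used in the following slides (root J), this produces the sequence B J K L M N P Q R Z.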
CIS210 38
Find Inorder Predecessor Operation
InorderPredecessor(BST):
The immediate predecessor of the node in the inorder
traversal, if it exists.
If the node’s left subtree is nonempty then
The largest key in the node’s left subtree
FindMax(BST->LchildPtr)
else (the node’s left subtree is empty)
The lowest ancestor A of the node such that the node lies in A’s right subtree
(A’s right child is the node itself or another ancestor of the node).
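A sketch of both cases, assuming distinct keys and nodes without parent pointers (as in the node classes of these slides). Because there is no parent pointer, the ancestor case is handled by walking down from the root and remembering the last node at which the path turned right. The `Node` struct and function names are illustrative.

```cpp
#include <cassert>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

const Node* findMax(const Node* bst) {
    while (bst != nullptr && bst->RchildPtr != nullptr) bst = bst->RchildPtr;
    return bst;
}

// Returns the inorder predecessor of node n in the tree rooted at root,
// or nullptr if n holds the smallest key. Assumes n is in the tree.
const Node* inorderPredecessor(const Node* root, const Node* n) {
    if (n->LchildPtr != nullptr) {
        return findMax(n->LchildPtr);     // case 1: largest key in the left subtree
    }
    const Node* pred = nullptr;           // case 2: lowest ancestor reached by a right turn
    const Node* cur = root;
    while (cur != nullptr && cur != n) {
        if (n->key > cur->key) {
            pred = cur;                   // n is in cur's right subtree
            cur = cur->RchildPtr;
        } else {
            cur = cur->LchildPtr;
        }
    }
    return pred;
}
```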
CIS210 39
Example: Inorder Predecessor of Q
[Figure: the example tree. Q’s left subtree (rooted at L) is nonempty, so FindMax on that subtree gives the predecessor, P.]
Inorder: B J K L M N P Q R Z
CIS210 40
Example: Inorder Predecessor of K
[Figure: the example tree. K has an empty left subtree, so the predecessor is the lowest ancestor with K in its right subtree, J.]
Inorder: B J K L M N P Q R Z
CIS210 41
Example: Inorder Predecessor of R
[Figure: the example tree. R has an empty left subtree and is the right child of Q, so the predecessor is Q.]
Inorder: B J K L M N P Q R Z
CIS210 42
Find Inorder Successor Operation
InorderSuccessor(BST):
The immediate successor of the node in the inorder
traversal, if it exists.
If the node’s right subtree is nonempty then
The smallest key in the node’s right subtree
FindMin(BST->RchildPtr)
else (the node’s right subtree is empty)
The lowest ancestor A of the node such that the node lies in A’s left subtree
(A’s left child is the node itself or another ancestor of the node).
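The successor is the mirror image of the predecessor. This sketch again assumes distinct keys and no parent pointers, so the ancestor case walks down from the root and remembers the last node at which the path turned left. The `Node` struct and function names are illustrative.

```cpp
#include <cassert>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

const Node* findMin(const Node* bst) {
    while (bst != nullptr && bst->LchildPtr != nullptr) bst = bst->LchildPtr;
    return bst;
}

// Returns the inorder successor of node n in the tree rooted at root,
// or nullptr if n holds the largest key. Assumes n is in the tree.
const Node* inorderSuccessor(const Node* root, const Node* n) {
    if (n->RchildPtr != nullptr) {
        return findMin(n->RchildPtr);     // case 1: smallest key in the right subtree
    }
    const Node* succ = nullptr;           // case 2: lowest ancestor reached by a left turn
    const Node* cur = root;
    while (cur != nullptr && cur != n) {
        if (n->key < cur->key) {
            succ = cur;                   // n is in cur's left subtree
            cur = cur->LchildPtr;
        } else {
            cur = cur->RchildPtr;
        }
    }
    return succ;
}
```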
CIS210 43
Example: Inorder Successor of Q
[Figure: the example tree. Q’s right subtree (R, Z) is nonempty, so FindMin on that subtree gives the successor, R.]
Inorder: B J K L M N P Q R Z
CIS210 44
Example: Inorder Successor of P
[Figure: the example tree. P has an empty right subtree, so the successor is the lowest ancestor with P in its left subtree, Q.]
Inorder: B J K L M N P Q R Z
CIS210 45
Example: Inorder Successor of K
[Figure: the example tree. K has an empty right subtree and is the left child of L, so the successor is L.]
Inorder: B J K L M N P Q R Z
CIS210 46
Deletion Operation
Delete(BST, SearchKey):
If BST == NULL (empty tree) then
Not Found (nothing to delete)
else if SearchKey < BST->Key then
Delete (BST->LchildPtr, SearchKey)
else if SearchKey > BST->Key then
Delete (BST->RchildPtr, SearchKey)
else (SearchKey == BST->Key)
DeleteNode (BST)
CIS210 47
Example: Delete Z
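[Figure: before and after. Before: root J with children B and Q; Q has children L and R; R has right child Z. After: the leaf Z is simply removed.]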
CIS210 48
Example: Delete S
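[Figure: before and after. Before: root J with children B and Q; Q has children L and S; S has only a right child Z. After: Z takes the place of S.]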
CIS210 49
Example: Delete S
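[Figure: before and after. Before: root J with children B and Q; Q has children L and S; S has only a left child R. After: R takes the place of S.]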
CIS210 50
Example: Delete Q
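[Figure: before and after. Before: root J with children B and Q; Q has two children, L and R; R has right child Z. After: which node should take the place of Q?]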
CIS210 51
Deletion Operation
Delete by Merging
Delete by Copying
CIS210 52
Deletion by Merging
Observation!
CIS210 53
Deletion by Copying
By copying IOP
By copying IOS
CIS210 54
Deletion Operation
Delete(BST, SearchKey):
If BST == NULL (empty tree) then
Not Found (nothing to delete)
else if SearchKey < BST->Key then
Delete (BST->LchildPtr, SearchKey)
else if SearchKey > BST->Key then
Delete (BST->RchildPtr, SearchKey)
else (SearchKey == BST->Key)
DeleteNode (BST)
CIS210 55
DeleteNode
If N is a leaf then
Remove N.
else if N has only one child then
Replace N with its child.
else (N has two children)
Find M, the node that contains N’s inorder predecessor (or successor).
Inorder predecessor (IOP) =
• The rightmost node in N’s left subtree
• The largest key in N’s left subtree
Inorder successor (IOS) =
• The leftmost node in N’s right subtree
• The smallest key in N’s right subtree
Copy the item from node M into node N.
Delete (N->LchildPtr (or N->RchildPtr), M’s key) // Remove M from the BST.
See Figure 6.32 (p. 251)
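A minimal sketch of deletion by copying the inorder predecessor, assuming distinct `char` keys. The `Node` struct, the function name `removeKey`, and the `bool` return value are illustrative choices; copying the inorder successor would mirror the two-children branch.

```cpp
#include <cassert>

// Hypothetical node type for illustration only.
struct Node {
    char key;
    Node* LchildPtr = nullptr;
    Node* RchildPtr = nullptr;
    explicit Node(char k) : key(k) {}
};

void insert(Node*& bst, char newKey) {
    if (bst == nullptr)          bst = new Node(newKey);
    else if (newKey < bst->key)  insert(bst->LchildPtr, newKey);
    else                         insert(bst->RchildPtr, newKey);
}

// Removes the node holding key; returns false if the key is not present.
bool removeKey(Node*& bst, char key) {
    if (bst == nullptr) return false;                       // not found
    if (key < bst->key) return removeKey(bst->LchildPtr, key);
    if (key > bst->key) return removeKey(bst->RchildPtr, key);

    if (bst->LchildPtr != nullptr && bst->RchildPtr != nullptr) {
        // Two children: find the IOP (rightmost node of the left subtree),
        // copy its key into this node, then remove the IOP instead.
        Node* m = bst->LchildPtr;
        while (m->RchildPtr != nullptr) m = m->RchildPtr;
        bst->key = m->key;
        return removeKey(bst->LchildPtr, m->key);
    }
    // Leaf or one child: splice the node out of the tree.
    Node* old = bst;
    bst = (bst->LchildPtr != nullptr) ? bst->LchildPtr : bst->RchildPtr;
    delete old;
    return true;
}

void destroy(Node* bst) {
    if (bst == nullptr) return;
    destroy(bst->LchildPtr);
    destroy(bst->RchildPtr);
    delete bst;
}
```

The IOP never has a right child, so the second, recursive removal always ends in the leaf or one-child case.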
CIS210 56
Example: Delete Q
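[Figure: before and after. Before: root J with children B and Q; Q has two children, L and R; R has right child Z. After: which node should take the place of Q?]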
CIS210 57
Example: Delete Q
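[Figure: deleting Q by copying its IOP. The IOP of Q is L; L is copied into Q’s node and the original L node is removed. Result: root J with children B and L; L has right child R; R has right child Z.]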
CIS210 58
Example: Delete Q
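[Figure: deleting Q by copying its IOS. The IOS of Q is R; R is copied into Q’s node and the original R node is removed. Result: root J with children B and R; R has children L and Z.]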
CIS210 59
Quiz: Delete Q
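[Figure: deleting Q from the ten-key example tree by copying its IOP. The IOP of Q is P; P is copied into Q’s node and the original P node is removed. Result: root J with children B and P; P has children L and R; L has children K and N; N has left child M; R has right child Z.]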
CIS210 60
Quiz: Delete Q
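[Figure: deleting Q from the ten-key example tree by copying its IOS. The IOS of Q is R; R is copied into Q’s node and the original R node is removed. Result: root J with children B and R; R has children L and Z; L has children K and N; N has children M and P.]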
CIS210 61
Delete by Merging Vs. Delete by Copying
Delete by Merging
…
Delete by Copying
…
CIS210 62
Analysis of BST Operations
The number of comparisons for a search/retrieval,
insertion or deletion is
the level (depth) of the element in the binary search
tree.
The maximum number of comparisons for a
retrieval, insertion or deletion is
the height of the binary search tree!
CIS210 63
Properties of Binary Search Trees
What is the minimum number of nodes that a binary
search tree of height h can have?
h
The minimum number of nodes that a binary
search tree of height h can have is h.
CIS210 64
The minimum number of nodes that a binary search tree of height
h can have is h.
Proof (by Induction on h):
Base case: h = 1:
N = 1 = h
Inductive hypothesis:
The minimum number of nodes that a binary search tree
of height h = k, for some $k \ge 1$, can have is k.
CIS210 65
The minimum number of nodes that a binary search tree of height
h can have is h.
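Consider h = k+1:
A minimum-size tree of height k+1 consists of the root and a single
nonempty subtree of height k; the other subtree is empty.
N = 1 + # of nodes in the subtree with height k
= 1 + k
= k + 1 = h
[Figure: a root with only a left subtree Tleft, and a root with only a right subtree Tright.]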
CIS210 66
Properties of Binary Search Trees
What is the maximum number of nodes that a binary
search tree of height h can have?
$2^h - 1$
The maximum number of nodes that a
binary search tree of height h can have is
$2^h - 1$.
CIS210 67
The maximum number of nodes that a binary search tree of height
h can have is $2^h - 1$.
Proof (by Induction on h):
Base case: h = 1:
$N = 1 = 2^1 - 1$
Inductive hypothesis:
The maximum number of nodes that a binary search
tree of height h = k, for some $k \ge 1$, can have is $2^k - 1$.
CIS210 68
The maximum number of nodes that a binary search tree of height
h can have is $2^h - 1$.
Consider h = k+1:
N = 1 + # of nodes in the two subtrees, each of height k
$= 1 + 2 (2^k - 1)$
$= 1 + 2^{k+1} - 2$
$= 2^{k+1} - 1 = 2^h - 1$
[Figure: a root with two subtrees, Tleft and Tright, each of height k.]
CIS210 69
Properties of Binary Search Trees
N= The number of nodes in a binary search tree.
h = The height of a binary search tree.
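$h \le N \le 2^h - 1$, and therefore $\log_2 (N+1) \le h \le N$.
Lower Bound: $h = \Omega(\log N)$
Upper Bound: $h = O(N)$
[Figure: tree shapes at the two extremes, height N and height about log N.]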
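The two bounds can be checked numerically. This sketch computes the smallest height h with $2^h - 1 \ge N$, which is the minimum height of a binary tree with N nodes; the maximum height is simply N. The function names are hypothetical.

```cpp
#include <cassert>

// Smallest h such that a tree of height h can hold n nodes,
// i.e. the smallest h with 2^h - 1 >= n.
int minHeight(long long n) {
    int h = 0;
    long long capacity = 0;          // capacity == 2^h - 1 throughout the loop
    while (capacity < n) {
        ++h;
        capacity = 2 * capacity + 1; // one more full level doubles capacity and adds the root
    }
    return h;
}

// A chain of n nodes is the tallest possible tree.
long long maxHeight(long long n) {
    return n;
}
```

For example, 7 nodes fit in height 3, but 8 nodes need height 4.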
CIS210 70
Analysis of Search/Retrieval Operation
Worst case
O(N)
Average Case
O(log N)
CIS210 71
Analysis of Insertion Operation
Worst case
O(N)
Average Case
O(log N)
CIS210 72
Analysis of Deletion Operation
Worst case
O(N)
Average Case
O(log N)
CIS210 73
Analysis of Traversal Operation
Worst case
O(N)
Average Case
O(N)
CIS210 74
Quiz
What is the maximum number of nodes that a D-ary
tree of height h can have?
$(D^h - 1) / (D - 1)$
Prove by induction?
...
CIS210 75
The maximum number of nodes that a D-ary tree of height h can
have is $(D^h - 1)/(D - 1)$.
Proof (by Induction on h):
Base case: h = 1:
$N = (D^1 - 1)/(D - 1) = 1$
Inductive hypothesis:
The maximum number of nodes that a D-ary tree of
height h = k, for some $k \ge 1$, can have is $(D^k - 1)/(D - 1)$.
CIS210 76
The maximum number of nodes that a D-ary tree of height h can
have is $(D^h - 1)/(D - 1)$.
Consider h = k+1:
N = 1 + # of nodes in the D subtrees, each of height k
$= 1 + D \cdot (D^k - 1)/(D - 1)$
$= 1 + (D^{k+1} - D)/(D - 1)$
$= (D - 1 + D^{k+1} - D)/(D - 1)$
$= (D^{k+1} - 1)/(D - 1) = (D^h - 1)/(D - 1)$.
[Figure: a root with D subtrees T1, T2, ..., TD.]
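As a numerical sanity check of the closed form, this sketch counts the nodes of a full D-ary tree level by level and compares the total with $(D^h - 1)/(D - 1)$. The function names are hypothetical, and the closed form assumes $D \ge 2$.

```cpp
#include <cassert>

// Sums the level sizes 1 + D + D^2 + ... + D^(h-1) of a full D-ary tree.
long long maxNodesByLevels(int d, int h) {
    long long total = 0;
    long long levelSize = 1;             // the root level holds one node
    for (int level = 1; level <= h; ++level) {
        total += levelSize;
        levelSize *= d;                  // each node has D children
    }
    return total;
}

// Closed form (D^h - 1) / (D - 1); requires d >= 2.
long long maxNodesClosedForm(int d, int h) {
    long long power = 1;
    for (int i = 0; i < h; ++i) power *= d;
    return (power - 1) / (d - 1);
}
```

With D = 2 the formula reduces to $2^h - 1$, the binary tree bound proved earlier.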
CIS210 77
Iterators for BST
CIS210 78
Design Pattern: The Iterator Pattern
Container
Iterator
An iterator is an object of an iterator class!
CIS210 79
Iterator Operations
Inequality compare (!=)
Dereference (*)
Increment (++)
Overloaded operators!
CIS210 80
Types of Iterators for BST
Pre-order Traversal Iterator
In-order Traversal Iterator
Post-order Traversal Iterator
Level-order Traversal Iterator
CIS210 81
Example: Iterators for BST
J
B
R
Z
Q
K
P
L
M
N
Preorder: J B Q L K N M P R Z
Inorder: B J K L M N P Q R Z
Postorder: B K M P N L Z R Q J
CIS210 82
Iterators for BST
bst<int> bst1;
bst<int>::inorder_iterator p;
for (p=bst1.begin();p!=bst1.end(); ++p)
// Process *p
cout << *p << " ";
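The slides do not show how `inorder_iterator` is implemented. One possible approach, sketched here under that caveat, keeps an explicit stack of the nodes still to be visited and overloads the three operators listed earlier. The `Node` struct and the class name `InorderIterator` are hypothetical, and a default-constructed iterator plays the role of `end()`.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical node type for illustration only.
struct Node {
    int key;
    Node* LchildPtr;
    Node* RchildPtr;
};

class InorderIterator {
public:
    InorderIterator() = default;                          // "end" iterator: empty stack
    explicit InorderIterator(const Node* root) { pushLeft(root); }

    // Dereference: the key of the node currently being visited.
    int operator*() const { return path.top()->key; }

    // Increment: leave the current node and move to its inorder successor.
    InorderIterator& operator++() {
        const Node* n = path.top();
        path.pop();
        pushLeft(n->RchildPtr);
        return *this;
    }

    // Inequality: two iterators differ if their pending paths differ.
    bool operator!=(const InorderIterator& other) const {
        return path != other.path;
    }

private:
    // Pushes n and all of its left descendants; the top is the next node to visit.
    void pushLeft(const Node* n) {
        for (; n != nullptr; n = n->LchildPtr) path.push(n);
    }
    std::stack<const Node*> path;
};
```

A loop such as `for (InorderIterator p(root), end; p != end; ++p)` then visits the keys in sorted order, using extra space proportional to the height of the tree.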
CIS210 83
BST & BST Iterator Class Diagram
[Figure: class diagram showing an association between BST and BSTIterator.]
CIS210 84
Binary Search Trees
Vs
Skip Lists
CIS210 85
BSTs vs Skip Lists
Search
…
Insert
…
Delete
…
CIS210 86
Search Trees
CIS210 87
An M-Way Search Tree
A rooted tree in which
Each node has at most (m-1) sorted data items and
m subtrees.
The values in the leftmost subtree are less than the
first node item.
The values in the second subtree are between the first
node item and the second node item.
And so on ...
The values in the rightmost subtree are greater than
the last node item.
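A sketch of searching an M-way search tree under the definition above. The `MNode` layout is an assumption made for illustration: `keys` holds the sorted items of a node, and `child[i]` is the subtree whose values fall between `keys[i-1]` and `keys[i]`, with `nullptr` standing for an empty tree.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node layout for illustration only.
struct MNode {
    std::vector<int> keys;             // sorted, at most m-1 items
    std::vector<const MNode*> child;   // keys.size()+1 subtrees, or empty for a leaf
};

// Returns true if target is stored somewhere in the tree rooted at n.
bool contains(const MNode* n, int target) {
    while (n != nullptr) {
        // First key that is not less than target.
        auto it = std::lower_bound(n->keys.begin(), n->keys.end(), target);
        if (it != n->keys.end() && *it == target) return true;
        // Otherwise descend into the subtree just left of that key.
        std::size_t i = static_cast<std::size_t>(it - n->keys.begin());
        n = n->child.empty() ? nullptr : n->child[i];
    }
    return false;
}
```

With at most one key per node this reduces to the binary search tree search shown earlier.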
CIS210 88
Example: An M-way Search Tree with M=4
[Figure: a 4-way search tree holding the keys 1 to 7, with empty subtrees drawn explicitly.]
CIS210 89
Example: An M-way Search Tree with M=4
[Figure: the same 4-way search tree holding the keys 1 to 7, drawn without the empty subtrees.]
CIS210 90
The World of Search Trees
General Trees
M-way Search Trees
CIS210 91
The World of Search Trees
Binary Search Trees
M-way Search Trees
M = 2
* Each node has at most 1 data item and 2 subtrees.
CIS210 92
The World of Trees & Search Trees
General Trees
N-ary Trees
Binary Trees
M-way Search Trees
Binary Search Trees = Binary Trees with the BST Property = M-way Search Trees with M = 2