e.g.m. petrakistrees1 definition (recursive): finite set of elements (nodes) which is either empty...
Post on 20-Dec-2015
222 views
TRANSCRIPT
E.G.M. Petrakis Trees 1
Trees
• Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T0,T1,…Tm where T0 contains one node (root) and T1, T2…Tm are trees
T0
T2T1 Tm…
T:
E.G.M. Petrakis Trees 2
• Depth of Tree node: length of the path from the root to the node– The root has depth 0
• Height of Tree: depth of the deepest node
• Leaf: node with no children
• Full Tree: each internal node has either two children or it is a leaf
• Complete Tree: all levels except perhaps level d-1 are full
E.G.M. Petrakis Trees 3
Aroot
root (T) = Adescendents (B) = {F,G} sons (B)ancestors (H) = {E} parent (H)degree (T) = 4 : max number of sonsdegree (E) = 3 level (H) = 2 level (A) = 0depth (T) = 3 max level
DB EC
HF JI
K
G
sub-tree
leaf
E.G.M. Petrakis Trees 4
• Full binary tree: binary tree of level d with exactly 2l nodes at level l <= d– Total number of nodes N = 2d+1 –1– N = 1222...22 1
0
10
d
d
j
jd
B
A
C
E FD GAll leaf nodes
at the same level
E.G.M. Petrakis Trees 5
• Binary trees: finite set of nodes which is either empty or it is partitioned into sets T0, Tleft, Tright where T0 is the root and Tleft, Tright are binary trees
• If a binary tree has n nodes at level l then it has at most 2n nodes at level l+1
Τ0
Τleft Τright
E.G.M. Petrakis Trees 6
• Traversal: list all nodes in ordera) Preorder (depth first order): root
traverse Tleft preorder
traverse Tright preorderb) Inorder (symmetric order):
traverse Tleft inorderroot
traverse Tright inorderc) Postorder:
traverse Tleft
traverse Tright
root
E.G.M. Petrakis Trees 7
A
C
E
B
F
HG
D
I
Preorder: ABDGCEHIFInorder: DGBAHEICFPostorder: GDBHIEFCA
E.G.M. Petrakis Trees 8
B
A
F
C
G
J LIK
E
D
H
Preorder: ABCEIFJDGHKLInorder: EICFJBGDKHLAPostorder: IEJFCGKLHDBA
E.G.M. Petrakis Trees 9
Traversal of Trees of Degree >= 2
T0
T2T1 TΜ…
preorder•T0 (root)•T1 preorder•T2 preorder• . . . .•TM preorder
inorder•T1 inorder•T0 (root)•T2 inorder• ….•TM inorder
postorder•T1 postorder•T2 postorder• ….•TM postorder•T0 (root)
E.G.M. Petrakis Trees 10
B
C ED
A
G
H
F
preorder: ABCDHEFGinorder: CBHDEAGFpostorder: CHDEBGFA
E.G.M. Petrakis Trees 11
• Binary Tree Node ADT:
class BinNode { //Bin.Tree node class
public:
BinNode* leftchild( ) const; //pointer to left child
BinNode* rightchild( ) const //pointer to right child
BELEM value( ) const; //node value
void setValue(BELEM val); //set node value
bool isLeaf( ) const; //check if node is a leaf
};
E.G.M. Petrakis Trees 12
enum bool {true = ‘1’, false=‘0’};typedef int BELEM;
class BinNode { public:
BELEM element; BinNode *left, *right;
BinNode ( ) {left = right = NULL;} //constructors BinNode (BELEM e,BinNode *l = NULL,BinNode *r =
NULL) {element = e; left = l; right = r;}
~BinNode ( ) { } //destructor
BinNode *leftchild ( ) const { return left;}
…..
E.G.M. Petrakis Trees 13
BinNode *rightchild ( ) const {return right ;}
BELEM value ( ) const { return element;}
void setValue (BELEM val) {element = val;}
bool isLeaf ( ) const { return (left == NULL ) && ( right == NULL);}
bool isLeft (BinNode *p) {….} }
E.G.M. Petrakis Trees 14
void preorder (BinNode *tree){ if ( tree = = NULL) return ;
else { ……………..
cout << ‘’node:‘’ << treeelement << ‘’\n‘’ ; preorder ( treeleftchild ( ) );
preorder (treerightchild ( ) ); }}
E.G.M. Petrakis Trees 15
• Binary Search Trees (BST): binary tree – Each node stores key K
– The nodes of Tleft have keys < K
– The nodes of Tright have keys >= K
10
3
3015
20
1
25
E.G.M. Petrakis Trees 16
• Search a BST to find key x: – Compare with root– If root(key) == x key found!!
– If key(root) < x search the Tleft recursively
– If key(root) >= x search Tright recursively
30
50
35
60
20 65
E.G.M. Petrakis Trees 17
• Find the maximum key in a BST: follow nodes on Tright until key is found or NULL
• Find the minimum key in a BST: follow nodes on Tleft
50
30
35
60
20 65max
min
E.G.M. Petrakis Trees 18
• Insert key x in a BST: search for x in the BST until a leaf is found– Insert it as left child of leaf if key(leaf) < x– Insert it as right child of leaf if key(leaf) >= x
50
35
40
37
6030
2065
ρ
E.G.M. Petrakis Trees 19
• Shape of BST depends on insertion order– Example: 10 20 5 7 1
– Example: 1 5 7 10 20
10
5
71
20
1
5
20
7
10
for ordered insertion sequences the BST
becomes list
E.G.M. Petrakis Trees 20
• Delete key x from BST: three casesa) X is a leaf: simply delete it
b) X is not a leaf; it has exactly one sub-tree• The father node points to the node next to x
• Delete node(x)
c) X is not a leaf; it has two sub-trees• Find p: minimum of Tright or maximum of Tleft
• This has at most one sub-tree!!
• Delete it as in (a) or (b)
• Substitute x with p
E.G.M. Petrakis Trees 21
Delete key from BST: the key is a leaf
E.G.M. Petrakis Trees 22
Delete key from BST: the key has exactly one sub-tree
E.G.M. Petrakis Trees 23
Delete key from BST: the key has two sub-trees
E.G.M. Petrakis Trees 24
• Application of BST: sorting – Insert keys in a BST– Output inorder sequence of keys– insertion sequence: 50 30 60 20 35 40 37 65
30
50
37
60
3520
40
65
Complexity: Average case: O(nlogn)Worst case: O(n2)
E.G.M. Petrakis Trees 25
• Implementation of BSTs: each node stores the key and pointers to roots of Tleft, Tright
– Array-based implementation: • known maximum number of nodes
• fast
• good for heaps
– Dynamic memory allocation:• no restriction on the number of nodes
• slower but
• more general
E.G.M. Petrakis Trees 26
• Implementation with dynamic memory: – Two C++ classes– BinNode: node class
• The same as in general binary trees• Elementary Operations on node• left, right, setValue, isleaf, father …
– BST: tree class • Composite operations on trees• Find, insert, remove, traverse ….
– Use of help operations: • Easier interface to class • Encapsulation: implementation becomes private
E.G.M. Petrakis Trees 27
class BST { private: BinNode *root; void inserthelp (BinNode *&, const BELEM); void removehelp(BinNode *&,const BELEM); BinNode * findhelp (BinNode *,const BELEM) const; void printhelp (const BinNode *,const int)const; void clearhelp (BinNode *); public: BST ( ) { root = NULL; } ~BST ( ) { clearhelp (root); } void clear ( ) { clearhelp (root); root = NULL;} void insert (const BELEM & val) { inserthelp (root, val); }
continued ...
E.G.M. Petrakis Trees 28
continued ... void remove (const BELEM &val )
{removehelp (root , val); }
BinNode *deletemin (BinNode *&);
BinNode *find (const BELEM & val) const { return findhelp (root ,val);
bool isEmpty ( ) const {return root == NULL }
void print ( ) const { if ( root == NULL ) cout < < ‘’BST empty \n ‘’ ; else printhelp (root , 0); }}
E.G.M. Petrakis Trees 29
BinNode*BST::findhelp(BinNode *rt, const BELEM val) const { if (rt == NULL) return NULL ; else if ( val < rtvalue ( ) ) return findhelp (rtleftchild ( ), val ); else if ( val == rtvalue ( ) ) return rt; else return findhelp (rtrightchild ( ), val ); }
void BST :: inserthelp (BinNode *& rt, const BELEM val) { if (rt = = NULL) rt = new BinNode (val, NULL, NULL ); else if (val < rtvalue ( ) ) inserthelp (rtleft, val ); else inserthelp (rtright , val ); }
E.G.M. Petrakis Trees 30
BinNode *BST:: deletemin (BinNode *& rt ) { assert (rt ! = NULL ); if (rtleft ! = NULL return deletemin (rtleft ); else { BinNode * temp = rt; rt = rtright; return temp; } }
E.G.M. Petrakis Trees 31
• Array-based implementation of trees– General for trees with degree d >= 2– Many alternative implementations
Α
Β C
D E F G
E.G.M. Petrakis Trees 32
Α
Β C
D E F G
• Array with pointers to father nodes:
1 A 0 2 B 1 3 C 1 4 D 2 5 E 2 6 F 2 7 G 3
• Easy to find ancestors• Difficult to find children• Minimum space
E.G.M. Petrakis Trees 33
• Array with pointers to children nodes
1 A 2 3 2 B 4 5 6 3 C 7 4 D 5 E 6 F 7 G
– Easy to find children nodes– Difficult to find ancestor nodes
Α
Β C
D E F G
E.G.M. Petrakis Trees 34
• Array with lists to children nodes
1 Α 2 3 2 B 4 5 6 3 C 7 4 D 5 E 6 F 7 G
– Easy to find children nodes– Difficult to find ancestor nodes
Α
Β C
D E F G
E.G.M. Petrakis Trees 35
• General array implementation7
4 6
1 2 3 5
r
7
4 6
1 2 3 5
0 1 2 3 4 5 6
7 4 6 1 2 3 5
parent ( r ) = (r-1)/2 if 0 < r < nleftchild ( r ) = 2r+1 if 2r + 1 < nrightchild ( r ) = 2r+2 if 2r + 2 < nleftsibling ( r ) = r-1 if r even 0 < r < nrightsibling ( r ) = r+1 if r odd 0 < r+1 < n
E.G.M. Petrakis Trees 36
Heap
• Binary tree (not BST), complete– Max heap: every node has a value >= children
7
6 4
3
Max element: root
E.G.M. Petrakis Trees 37
– Min heap: every node has a value <= children
2
10
5 4
Min element: root
E.G.M. Petrakis Trees 38
• Operations on heaps: – Insert new element in heap– Build heap from n elements– Delete max (min) element from heap– Shift-down: move an element from the root to a
leaf (elementary operation)
• Array-Based implementation
• Assume max heap
E.G.M. Petrakis Trees 39
• Insert: the new element is inserted as a leaf– Last in the array
– Then shifted it to its proper position
– Swap the new elements with parent elements as long as it has greater value
– Complexity O(logn): at most d shift operations in array
• d: depth of tree
New element
57
25
65
48=>
57
65
25
48 =>
65
57
25
48
E.G.M. Petrakis Trees 40
• Build a new heap from n elements: insert elements in an initially empty heap– Complexity: O(nlogn)– Better method: Build heap bottom-up
• Assume that H1, H2 are heaps and shift R down to its proper position in H1 or H2 (depending on which root from H1, H2 has greater value)
• Do the same (recursively) within H1, H2
R
H2H1
E.G.M. Petrakis Trees 41
1
5
2
7
6 3
=>
7
5 1
4 2 6 3
=>
7
5 6
4 2 1 3
RH2 H1 R
H2 H1
4
Starts from d-2 level, the leafs need not be accessed
E.G.M. Petrakis Trees 42
• Complexity: – Up to n/2 elements at d-1 level 0 moves– Up to n/4 elements at d-2 level 1 moves/elem
…– Up to n/2i elements at level i i-1 moves/elem
)(2/)1(log
1
i nnind
i
Complexity(total number of moves)
leafs
E.G.M. Petrakis Trees 43
• Delete: the max element is removed– Swap with last element in array– Delete last elements– Shift-down new root to find its proper position– Complexity: O(logn)
65
57
25
48=>
25
57
65
48
=>48
57 25
57
25 48
E.G.M. Petrakis Trees 44
class heap { private: ΕLEM* Heap ; int size ; int n ; void siftdown(int) ;public: heap(ELEM*,int, int) ; int heapsize( ) const ; bool isLeaf(int) const ; int leftchild(int) const ; int rightchild(int) const ; int parent(int) const ; void insert(const ELEM) ; ELEM removemax( ) ; ELEM remove(int) ; void buildheap( ) ;} ;
// Pointer to the heap array// Maximum size of the heap// Number of elements now in the heap// Put an element in its correct place
// Constructor// Return current size of the heap// TRUE if pos is a leaf position// Return position for left child// Return position for right child// Return position for parent of pos// Insert value into heap // Remove maximum value// Remove value at specified position// Heapify contents of Heap
Class Heap
E.G.M. Petrakis Trees 45
heap :: heap(ELEM* h, int num , int max) // Constructor { heap = h; n = num; size = max; buildheap( ); }
int heap :: heapsize( ) const { return n; } // Size of heap
bool heap :: isLeaf(int pos) const // TRUE if leaf { return (pos >= n/2) && (pos < n); }
int heap :: leftchild(int pos) const { // position of left child assert(pos < n/2); // Must be a position with a left child return 2*pos + 1; } int heap :: rightchild(int pos) const {// position of right child assert(pos < (n-1)/2); // Must be a position with a left child return 2*pos + 2;}
E.G.M. Petrakis Trees 46
int heap :: parent(int pos) const { // position for parent assert(pos > 0); // Posmust have parent return (pos-1)/2; }
void heap :: insert(const ELEM val) { // Insert value assert(n < size); int curr = n++; Heap[curr] = val; // Start at end of heap, then // sift it up until curr’s parent > curr while ((curr != 0) && (key(Heap[curr]) >
key(Heap[parent(curr)]))) { swap(Heap[curr], Heap[parent(curr)]); curr = parent(curr); } }
E.G.M. Petrakis Trees 47
ELEM heap :: removemax( ) { // Remove maximum value assert(n > 0); swap(Heap[0] , Heap[--n]) ; // Swap maximum with last
value if (n !=0) // Not on last element shiftdown(0); // Put new root in correct
place return Heap[n];}
E.G.M. Petrakis Trees 48
// Remove value at specified position ELEM heap :: remove(int pos) { assert((pos > 0) && (pos < n)); swap(Heap[pos] , Heap[--n]); // Swap with last value if (n != 0) // Not on last element shiftdown(pos); // Put new heap root val in correct
place return Heap[n]; } // starts from elements with higher index valuesvoid heap :: buildheap( ) // Heapify contents of Heap { for (int i = n/2-1; i>= 0; i--) siftdown(i); }
E.G.M. Petrakis Trees 49
void heap :: shiftdown(int pos) {//put elem in proper position assert((pos >= 0) && (pos < n)); while (!isLeaf(pos)) { int j = leftchild(pos); if ((j < (n-1)) && (key(Heap[j]) < key(Heap[j+1]))) j++; // j is now index of child with greater value if (key(Heap[pos]) >= key(Heap[j])) return; // Done swap(Heap[pos], Heap[j]); pos = j; // Move down }}
rightchild
E.G.M. Petrakis Trees 50
• Application: heap sort– Insert all elements in new heap– At step i delete max element
• Delete: Put max element last in array
– Store this element at position i+1– After n steps the array is sorted!!– Complexity: O(nlogn) why??
E.G.M. Petrakis Trees 51
92
37 86
33 12 48 57
25
(a) Initial maxheap
86
37 57
33 12 48 25
92
57
37 48
33 12 25 86
92
x[1]
x[3]
x[5]
x[7]
x[6]
x[8]
(b) x[8] = delete(92)
(c) x[7] = delete(86)
x[4]
x[2]
E.G.M. Petrakis Trees 52
48
37 25
33 12
92
57 86
37
33 25
12 48 57 86
92
(d) x[6] = delete(57) (e) x[5] = delete(48)
E.G.M. Petrakis Trees 53
33
12 25
37 48 57 86
92
25
12 33
37 48 57 86
92
12
25 33
37 48 57 86
92
(ζ) x[4] = delete(37)
(f) x[3] = delete(33) (g) x[2] = delete(25)
x[1]
x[3]x[2]
x[4]
x[8]
x[5] x[6] x[7]
E.G.M. Petrakis Trees 54
Huffman Coding Trees
• Goal: improvement in space requirements– In exchange of a penalty in running time
• Huffman coding assigns codes to symbols• Main idea: the length of a code depends on
its frequency– Code: binary number– Assign short codes to the more frequent
symbols
E.G.M. Petrakis Trees 55
• Input: sequence of symbols (letters)– Typically 8 bits/symbol
• Output: a binary representation– The codes of all symbols concatenated in the
same order– Requirement: A code cannot be prefix of
another– n: Less than 8 bits/symbol on the average
N
ii
N
iii
n
pnn
1
1
E.G.M. Petrakis Trees 56
• The Huffman code of each letter is derived from a binary tree– Each leaf corresponds to a letter– Goal: build a tree with the min. external path
weight (epw)– epw: sum. of weighted path lengths– Min epw min. n– A letter with high weight (frequency) should
have low depth short code – A letter with low weight may be pushed deeper
in the Huffman tree longer code
E.G.M. Petrakis Trees 57
• Building the Huffman tree: – The tree is build bottom-up– Order the letters by ascending weight
(frequency)– Remove the first two letters and assign them to
leaves of the Huffman tree– Substitute the two letters with one with weight
equal to the sum of their weights– Put this new element back on the ordered list
(in its correct place)– Repeat until only one element (root of Huffman
tree) remains in the list
E.G.M. Petrakis Trees 58
• Assigning codes to letters: beginning at the root:– Assign 0 to left branches and 1 to right
branches– The Huffman code of a letter is the binary
number determined by the path from the root to the leaf of that letter
– Longer paths correspond to less frequent letters (and the reverse)
– Replace each letter in the input with its code.
E.G.M. Petrakis Trees 59
0Weights: .12 .40 .15 .08 .25Symbols: a b c d eWeights: .20 .40 .15 .25Symbols: b c e
42 d a
Weights: .35 .40 .25Symbols: b e
1
c
d a
3Weights: .60 .40Symbols: b
e
c
d a
Weights: 1.0Symbols:
ΗuffmanTree
b
ec
d a
E.G.M. Petrakis Trees 60
Huffman codesa: 1111b: 0c:110d:1110e:10
ΗuffmanTree
b
ec
d a
Input: acde Huffman code: 1111 110 1110 10
E.G.M. Petrakis Trees 61
• Decoding: repeat until end of code– Beginning at the root, take right branch for each
1 and left branch for each 0 until we reach letter– Output the letter
• We need the Huffman tree for the decoding
• Example:
1 1 1 1 ¦ 1 1 1 0 ¦ 1 0 ¦ 0 ¦ 1 1 0 a ¦ d ¦ e ¦ b ¦ c
ΗuffmanTree
b
ec
d a
E.G.M. Petrakis Trees 62
• Advantages: – Compression (40-80%)– Cheaper transmission– Some security (encryption)– Locally optimal code
• Disadvantages:– CPU cost– Difficult error correction– We need the Huffman tree for the decoding– Code not globally optimal