e.g.m. petrakistrees1 definition (recursive): finite set of elements (nodes) which is either empty...

62
E.G.M. Petrakis Trees 1 Trees • Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0 ,T 1 , …T m where T 0 contains one node (root) and T 1 , T 2 …T m are trees T 0 T 2 T 1 T m T:

Post on 20-Dec-2015

222 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 1

Trees

• Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T0,T1,…Tm where T0 contains one node (root) and T1, T2…Tm are trees

T0

T2T1 Tm…

T:

Page 2: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 2

• Depth of Tree node: length of the path from the root to the node– The root has depth 0

• Height of Tree: depth of the deepest node

• Leaf: node with no children

• Full Tree: each internal node has either two children or it is a leaf

• Complete Tree: all levels except perhaps level d-1 are full

Page 3: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 3

Aroot

root (T) = Adescendents (B) = {F,G} sons (B)ancestors (H) = {E} parent (H)degree (T) = 4 : max number of sonsdegree (E) = 3 level (H) = 2 level (A) = 0depth (T) = 3 max level

DB EC

HF JI

K

G

sub-tree

leaf

Page 4: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 4

• Full binary tree: binary tree of level d with exactly 2l nodes at level l <= d– Total number of nodes N = 2d+1 –1– N = 1222...22 1

0

10

d

d

j

jd

B

A

C

E FD GAll leaf nodes

at the same level

Page 5: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 5

• Binary trees: finite set of nodes which is either empty or it is partitioned into sets T0, Tleft, Tright where T0 is the root and Tleft, Tright are binary trees

• If a binary tree has n nodes at level l then it has at most 2n nodes at level l+1

Τ0

Τleft Τright

Page 6: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 6

• Traversal: list all nodes in ordera) Preorder (depth first order): root

traverse Tleft preorder

traverse Tright preorderb) Inorder (symmetric order):

traverse Tleft inorderroot

traverse Tright inorderc) Postorder:

traverse Tleft

traverse Tright

root

Page 7: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 7

A

C

E

B

F

HG

D

I

Preorder: ABDGCEHIFInorder: DGBAHEICFPostorder: GDBHIEFCA

Page 8: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 8

B

A

F

C

G

J LIK

E

D

H

Preorder: ABCEIFJDGHKLInorder: EICFJBGDKHLAPostorder: IEJFCGKLHDBA

Page 9: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 9

Traversal of Trees of Degree >= 2

T0

T2T1 TΜ…

preorder•T0 (root)•T1 preorder•T2 preorder• . . . .•TM preorder

inorder•T1 inorder•T0 (root)•T2 inorder• ….•TM inorder

postorder•T1 postorder•T2 postorder• ….•TM postorder•T0 (root)

Page 10: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 10

B

C ED

A

G

H

F

preorder: ABCDHEFGinorder: CBHDEAGFpostorder: CHDEBGFA

Page 11: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 11

• Binary Tree Node ADT:

class BinNode { //Bin.Tree node class

public:

BinNode* leftchild( ) const; //pointer to left child

BinNode* rightchild( ) const //pointer to right child

BELEM value( ) const; //node value

void setValue(BELEM val); //set node value

bool isLeaf( ) const; //check if node is a leaf

};

Page 12: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 12

enum bool {true = ‘1’, false=‘0’};typedef int BELEM;

class BinNode { public:

BELEM element; BinNode *left, *right;

BinNode ( ) {left = right = NULL;} //constructors BinNode (BELEM e,BinNode *l = NULL,BinNode *r =

NULL) {element = e; left = l; right = r;}

~BinNode ( ) { } //destructor

BinNode *leftchild ( ) const { return left;}

…..

Page 13: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 13

BinNode *rightchild ( ) const {return right ;}

BELEM value ( ) const { return element;}

void setValue (BELEM val) {element = val;}

bool isLeaf ( ) const { return (left == NULL ) && ( right == NULL);}

bool isLeft (BinNode *p) {….} }

Page 14: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 14

void preorder (BinNode *tree){ if ( tree = = NULL) return ;

else { ……………..

cout << ‘’node:‘’ << treeelement << ‘’\n‘’ ; preorder ( treeleftchild ( ) );

preorder (treerightchild ( ) ); }}

Page 15: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 15

• Binary Search Trees (BST): binary tree – Each node stores key K

– The nodes of Tleft have keys < K

– The nodes of Tright have keys >= K

10

3

3015

20

1

25

Page 16: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 16

• Search a BST to find key x: – Compare with root– If root(key) == x key found!!

– If key(root) < x search the Tleft recursively

– If key(root) >= x search Tright recursively

30

50

35

60

20 65

Page 17: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 17

• Find the maximum key in a BST: follow nodes on Tright until key is found or NULL

• Find the minimum key in a BST: follow nodes on Tleft

50

30

35

60

20 65max

min

Page 18: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 18

• Insert key x in a BST: search for x in the BST until a leaf is found– Insert it as left child of leaf if key(leaf) < x– Insert it as right child of leaf if key(leaf) >= x

50

35

40

37

6030

2065

ρ

Page 19: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 19

• Shape of BST depends on insertion order– Example: 10 20 5 7 1

– Example: 1 5 7 10 20

10

5

71

20

1

5

20

7

10

for ordered insertion sequences the BST

becomes list

Page 20: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 20

• Delete key x from BST: three casesa) X is a leaf: simply delete it

b) X is not a leaf; it has exactly one sub-tree• The father node points to the node next to x

• Delete node(x)

c) X is not a leaf; it has two sub-trees• Find p: minimum of Tright or maximum of Tleft

• This has at most one sub-tree!!

• Delete it as in (a) or (b)

• Substitute x with p

Page 21: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 21

Delete key from BST: the key is a leaf

Page 22: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 22

Delete key from BST: the key has exactly one sub-tree

Page 23: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 23

Delete key from BST: the key has two sub-trees

Page 24: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 24

• Application of BST: sorting – Insert keys in a BST– Output inorder sequence of keys– insertion sequence: 50 30 60 20 35 40 37 65

30

50

37

60

3520

40

65

Complexity: Average case: O(nlogn)Worst case: O(n2)

Page 25: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 25

• Implementation of BSTs: each node stores the key and pointers to roots of Tleft, Tright

– Array-based implementation: • known maximum number of nodes

• fast

• good for heaps

– Dynamic memory allocation:• no restriction on the number of nodes

• slower but

• more general

Page 26: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 26

• Implementation with dynamic memory: – Two C++ classes– BinNode: node class

• The same as in general binary trees• Elementary Operations on node• left, right, setValue, isleaf, father …

– BST: tree class • Composite operations on trees• Find, insert, remove, traverse ….

– Use of help operations: • Easier interface to class • Encapsulation: implementation becomes private

Page 27: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 27

class BST { private: BinNode *root; void inserthelp (BinNode *&, const BELEM); void removehelp(BinNode *&,const BELEM); BinNode * findhelp (BinNode *,const BELEM) const; void printhelp (const BinNode *,const int)const; void clearhelp (BinNode *); public: BST ( ) { root = NULL; } ~BST ( ) { clearhelp (root); } void clear ( ) { clearhelp (root); root = NULL;} void insert (const BELEM & val) { inserthelp (root, val); }

continued ...

Page 28: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 28

continued ... void remove (const BELEM &val )

{removehelp (root , val); }

BinNode *deletemin (BinNode *&);

BinNode *find (const BELEM & val) const { return findhelp (root ,val);

bool isEmpty ( ) const {return root == NULL }

void print ( ) const { if ( root == NULL ) cout < < ‘’BST empty \n ‘’ ; else printhelp (root , 0); }}

Page 29: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 29

BinNode*BST::findhelp(BinNode *rt, const BELEM val) const { if (rt == NULL) return NULL ; else if ( val < rtvalue ( ) ) return findhelp (rtleftchild ( ), val ); else if ( val == rtvalue ( ) ) return rt; else return findhelp (rtrightchild ( ), val ); }

void BST :: inserthelp (BinNode *& rt, const BELEM val) { if (rt = = NULL) rt = new BinNode (val, NULL, NULL ); else if (val < rtvalue ( ) ) inserthelp (rtleft, val ); else inserthelp (rtright , val ); }

Page 30: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 30

BinNode *BST:: deletemin (BinNode *& rt ) { assert (rt ! = NULL ); if (rtleft ! = NULL return deletemin (rtleft ); else { BinNode * temp = rt; rt = rtright; return temp; } }

Page 31: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 31

• Array-based implementation of trees– General for trees with degree d >= 2– Many alternative implementations

Α

Β C

D E F G

Page 32: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 32

Α

Β C

D E F G

• Array with pointers to father nodes:

1 A 0 2 B 1 3 C 1 4 D 2 5 E 2 6 F 2 7 G 3

• Easy to find ancestors• Difficult to find children• Minimum space

Page 33: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 33

• Array with pointers to children nodes

1 A 2 3 2 B 4 5 6 3 C 7 4 D 5 E 6 F 7 G

– Easy to find children nodes– Difficult to find ancestor nodes

Α

Β C

D E F G

Page 34: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 34

• Array with lists to children nodes

1 Α 2 3 2 B 4 5 6 3 C 7 4 D 5 E 6 F 7 G

– Easy to find children nodes– Difficult to find ancestor nodes

Α

Β C

D E F G

Page 35: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 35

• General array implementation7

4 6

1 2 3 5

r

7

4 6

1 2 3 5

0 1 2 3 4 5 6

7 4 6 1 2 3 5

parent ( r ) = (r-1)/2 if 0 < r < nleftchild ( r ) = 2r+1 if 2r + 1 < nrightchild ( r ) = 2r+2 if 2r + 2 < nleftsibling ( r ) = r-1 if r even 0 < r < nrightsibling ( r ) = r+1 if r odd 0 < r+1 < n

Page 36: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 36

Heap

• Binary tree (not BST), complete– Max heap: every node has a value >= children

7

6 4

3

Max element: root

Page 37: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 37

– Min heap: every node has a value <= children

2

10

5 4

Min element: root

Page 38: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 38

• Operations on heaps: – Insert new element in heap– Build heap from n elements– Delete max (min) element from heap– Shift-down: move an element from the root to a

leaf (elementary operation)

• Array-Based implementation

• Assume max heap

Page 39: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 39

• Insert: the new element is inserted as a leaf– Last in the array

– Then shifted it to its proper position

– Swap the new elements with parent elements as long as it has greater value

– Complexity O(logn): at most d shift operations in array

• d: depth of tree

New element

57

25

65

48=>

57

65

25

48 =>

65

57

25

48

Page 40: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 40

• Build a new heap from n elements: insert elements in an initially empty heap– Complexity: O(nlogn)– Better method: Build heap bottom-up

• Assume that H1, H2 are heaps and shift R down to its proper position in H1 or H2 (depending on which root from H1, H2 has greater value)

• Do the same (recursively) within H1, H2

R

H2H1

Page 41: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 41

1

5

2

7

6 3

=>

7

5 1

4 2 6 3

=>

7

5 6

4 2 1 3

RH2 H1 R

H2 H1

4

Starts from d-2 level, the leafs need not be accessed

Page 42: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 42

• Complexity: – Up to n/2 elements at d-1 level 0 moves– Up to n/4 elements at d-2 level 1 moves/elem

…– Up to n/2i elements at level i i-1 moves/elem

)(2/)1(log

1

i nnind

i

Complexity(total number of moves)

leafs

Page 43: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 43

• Delete: the max element is removed– Swap with last element in array– Delete last elements– Shift-down new root to find its proper position– Complexity: O(logn)

65

57

25

48=>

25

57

65

48

=>48

57 25

57

25 48

Page 44: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 44

class heap { private: ΕLEM* Heap ; int size ; int n ; void siftdown(int) ;public: heap(ELEM*,int, int) ; int heapsize( ) const ; bool isLeaf(int) const ; int leftchild(int) const ; int rightchild(int) const ; int parent(int) const ; void insert(const ELEM) ; ELEM removemax( ) ; ELEM remove(int) ; void buildheap( ) ;} ;

// Pointer to the heap array// Maximum size of the heap// Number of elements now in the heap// Put an element in its correct place

// Constructor// Return current size of the heap// TRUE if pos is a leaf position// Return position for left child// Return position for right child// Return position for parent of pos// Insert value into heap // Remove maximum value// Remove value at specified position// Heapify contents of Heap

Class Heap

Page 45: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 45

heap :: heap(ELEM* h, int num , int max) // Constructor { heap = h; n = num; size = max; buildheap( ); }

int heap :: heapsize( ) const { return n; } // Size of heap

bool heap :: isLeaf(int pos) const // TRUE if leaf { return (pos >= n/2) && (pos < n); }

int heap :: leftchild(int pos) const { // position of left child assert(pos < n/2); // Must be a position with a left child return 2*pos + 1; } int heap :: rightchild(int pos) const {// position of right child assert(pos < (n-1)/2); // Must be a position with a left child return 2*pos + 2;}

Page 46: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 46

int heap :: parent(int pos) const { // position for parent assert(pos > 0); // Posmust have parent return (pos-1)/2; }

void heap :: insert(const ELEM val) { // Insert value assert(n < size); int curr = n++; Heap[curr] = val; // Start at end of heap, then // sift it up until curr’s parent > curr while ((curr != 0) && (key(Heap[curr]) >

key(Heap[parent(curr)]))) { swap(Heap[curr], Heap[parent(curr)]); curr = parent(curr); } }

Page 47: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 47

ELEM heap :: removemax( ) { // Remove maximum value assert(n > 0); swap(Heap[0] , Heap[--n]) ; // Swap maximum with last

value if (n !=0) // Not on last element shiftdown(0); // Put new root in correct

place return Heap[n];}

Page 48: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 48

// Remove value at specified position ELEM heap :: remove(int pos) { assert((pos > 0) && (pos < n)); swap(Heap[pos] , Heap[--n]); // Swap with last value if (n != 0) // Not on last element shiftdown(pos); // Put new heap root val in correct

place return Heap[n]; } // starts from elements with higher index valuesvoid heap :: buildheap( ) // Heapify contents of Heap { for (int i = n/2-1; i>= 0; i--) siftdown(i); }

Page 49: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 49

void heap :: shiftdown(int pos) {//put elem in proper position assert((pos >= 0) && (pos < n)); while (!isLeaf(pos)) { int j = leftchild(pos); if ((j < (n-1)) && (key(Heap[j]) < key(Heap[j+1]))) j++; // j is now index of child with greater value if (key(Heap[pos]) >= key(Heap[j])) return; // Done swap(Heap[pos], Heap[j]); pos = j; // Move down }}

rightchild

Page 50: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 50

• Application: heap sort– Insert all elements in new heap– At step i delete max element

• Delete: Put max element last in array

– Store this element at position i+1– After n steps the array is sorted!!– Complexity: O(nlogn) why??

Page 51: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 51

92

37 86

33 12 48 57

25

(a) Initial maxheap

86

37 57

33 12 48 25

92

57

37 48

33 12 25 86

92

x[1]

x[3]

x[5]

x[7]

x[6]

x[8]

(b) x[8] = delete(92)

(c) x[7] = delete(86)

x[4]

x[2]

Page 52: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 52

48

37 25

33 12

92

57 86

37

33 25

12 48 57 86

92

(d) x[6] = delete(57) (e) x[5] = delete(48)

Page 53: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 53

33

12 25

37 48 57 86

92

25

12 33

37 48 57 86

92

12

25 33

37 48 57 86

92

(ζ) x[4] = delete(37)

(f) x[3] = delete(33) (g) x[2] = delete(25)

x[1]

x[3]x[2]

x[4]

x[8]

x[5] x[6] x[7]

Page 54: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 54

Huffman Coding Trees

• Goal: improvement in space requirements– In exchange of a penalty in running time

• Huffman coding assigns codes to symbols• Main idea: the length of a code depends on

its frequency– Code: binary number– Assign short codes to the more frequent

symbols

Page 55: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 55

• Input: sequence of symbols (letters)– Typically 8 bits/symbol

• Output: a binary representation– The codes of all symbols concatenated in the

same order– Requirement: A code cannot be prefix of

another– n: Less than 8 bits/symbol on the average

N

ii

N

iii

n

pnn

1

1

Page 56: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 56

• The Huffman code of each letter is derived from a binary tree– Each leaf corresponds to a letter– Goal: build a tree with the min. external path

weight (epw)– epw: sum. of weighted path lengths– Min epw min. n– A letter with high weight (frequency) should

have low depth short code – A letter with low weight may be pushed deeper

in the Huffman tree longer code

Page 57: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 57

• Building the Huffman tree: – The tree is build bottom-up– Order the letters by ascending weight

(frequency)– Remove the first two letters and assign them to

leaves of the Huffman tree– Substitute the two letters with one with weight

equal to the sum of their weights– Put this new element back on the ordered list

(in its correct place)– Repeat until only one element (root of Huffman

tree) remains in the list

Page 58: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 58

• Assigning codes to letters: beginning at the root:– Assign 0 to left branches and 1 to right

branches– The Huffman code of a letter is the binary

number determined by the path from the root to the leaf of that letter

– Longer paths correspond to less frequent letters (and the reverse)

– Replace each letter in the input with its code.

Page 59: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 59

0Weights: .12 .40 .15 .08 .25Symbols: a b c d eWeights: .20 .40 .15 .25Symbols: b c e

42 d a

Weights: .35 .40 .25Symbols: b e

1

c

d a

3Weights: .60 .40Symbols: b

e

c

d a

Weights: 1.0Symbols:

ΗuffmanTree

b

ec

d a

Page 60: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 60

Huffman codesa: 1111b: 0c:110d:1110e:10

ΗuffmanTree

b

ec

d a

Input: acde Huffman code: 1111 110 1110 10

Page 61: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 61

• Decoding: repeat until end of code– Beginning at the root, take right branch for each

1 and left branch for each 0 until we reach letter– Output the letter

• We need the Huffman tree for the decoding

• Example:

1 1 1 1 ¦ 1 1 1 0 ¦ 1 0 ¦ 0 ¦ 1 1 0 a ¦ d ¦ e ¦ b ¦ c

ΗuffmanTree

b

ec

d a

Page 62: E.G.M. PetrakisTrees1 Definition (recursive): finite set of elements (nodes) which is either empty or it is partitioned into m + 1 disjoint subsets T 0,T

E.G.M. Petrakis Trees 62

• Advantages: – Compression (40-80%)– Cheaper transmission– Some security (encryption)– Locally optimal code

• Disadvantages:– CPU cost– Difficult error correction– We need the Huffman tree for the decoding– Code not globally optimal