sorting31 sorting iii: heapsort sorting32 a good sorting algorithm is hard to find... quadratic...

30
sorting3 1 Sorting III: Sorting III: Heapsort

Post on 22-Dec-2015

224 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 1

Sorting III:Sorting III:

Heapsort

Page 2: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 2

A good sorting algorithm is hard A good sorting algorithm is hard to find ...to find ...

• Quadratic sorting algorithms (with running times of O(N2), such as Selectionsort & Insertionsort are unacceptable for sorting large data sets

• Mergesort and Quicksort are both better alternatives, but each has its own set of problems

Page 3: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 3

A good sorting algorithm is hard A good sorting algorithm is hard to find ...to find ...

• Both Mergesort and Quicksort have average running times of O(N log N), but:– Quicksort’s worst-case running time is

quadratic– Mergesort uses dynamic memory and may fail

for large data sets because not enough memory is available to complete the sort

Page 4: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 4

Heapsort to the rescue!Heapsort to the rescue!

• Combines the time efficiency of Mergesort with the storage efficiency of Quicksort

• Uses an element interchange algorithm similar to Selectionsort

• Works by transforming the array to be sorted into a heap

Page 5: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 5

Review of HeapsReview of Heaps

• A heap is a binary tree, usually implemented as an array

• The data entry in any node is never less than the data entries in its children

• The tree must be complete:– every level except the deepest is fully populated– at the deepest level, nodes are placed as far to

the left as possible

Page 6: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 6

Review of HeapsReview of Heaps

• Data location in an array-based heap:– the root entry is in array[0]– for any node at array[n], its parent node may be

found at array[(n-1)/2]– if a node at array[n] has children, they will be

found at:• array[2n+1] (left child)

• array[2n+2] (right child)

Page 7: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 7

How Heapsort WorksHow Heapsort Works

• Begin with an array of values to be sorted

• Heapsort algorithm treats the array as if it were a complete binary tree

• Values are rearranged so that the other heap condition is met: each parent node is greater than or equal to its children

Page 8: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 8

Heapsort in ActionHeapsort in Action

42 19 33 8 12 97 54 85 29 60 26 71

Original array

42

19 33

8 12 97 54

85 29 60 26 71 … arranged as binarytree (but not yet heap)

Page 9: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 9

Heapsort in ActionHeapsort in Action

97 85 71 29 60 42 54 8 19 12 26 33

4229

338 12

97

54

85

19

60

26

71

… rearranged as heap

Page 10: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 10

Heapsort in ActionHeapsort in ActionThe goal of the sort algorithm is to arrange entries in order,smallest to largest. Since the root element is by definition thelargest element in a heap, that largest element can be placed inits final position by simply swapping it with whatever elementhappens to be in the last position:

9733

As a result, the shaded area can now be designated the sorted side of the array, and the unsorted side is almost, but not quite, a heap.

Page 11: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 11

Heapsort in ActionHeapsort in Action

The next step is to restore the heap condition by rearranging thenodes again; the root value is swapped with its largest child, andmay be swapped further down until the heap condition is restored:

85 33 3360

Once the heap is restored, the previous operation is then repeated:the root value, which is the largest on the unsorted side, is swapped with the last value on the unsorted side:

8526

Page 12: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 12

Heapsort in ActionHeapsort in Action

Again, the unsorted side is almost a heap, and must be restored toheap condition - this is accomplished the same way as before:

71 26 2654

The process continues: root is swapped with the last value andmarked sorted; the unsorted portion is restored to heap condition

7112 1260 1233 1219

Page 13: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 13

Heapsort in ActionHeapsort in ActionAnd so on ...

601254 1242 12

And so forth ...

54842 826 8

Etc. ...

42833 829 8 331229 1219 12291226 12 26819 8 1912 128

Tada!

Page 14: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 14

Pseudocode for HeapsortPseudocode for Heapsort

• Convert array of n elements into heap

• Set index of unsorted to n (unsorted = n)

• while (unsorted > 1)unsorted--;

swap (array[0], array[unsorted]);

reHeapDown

Page 15: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 15

Code for HeapsortCode for Heapsort

template <class item>void heapsort (item array[ ], size_t n){

size_t unsorted = n;makeHeap(array, n);while (unsorted > 1){

unsorted--;swap(array[0], array[unsorted]);reHeapDown(data, unsorted);

}}

Page 16: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 16

makeHeap functionmakeHeap function

• Two common ways to build initial heap

• First method builds heap one element at a time, starting at front of array

• Uses reHeapUp process (last seen in priority queue implementation) to enforce heap condition - children are <= parents

Page 17: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 17

Code for makeHeapCode for makeHeaptemplate <class item>void makeHeap (item data[ ], size_t n){

for (size_t x=1; x<n; x++){

size_t k = x;while (data[k] > data[parent(k)]){

swap (data[k], data[parent(k)]);k = parent(k);

}}

}

Page 18: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 18

Helper functions for makeHeapHelper functions for makeHeap

template <class item>void swap (item& a, item& b){

item temp = a;a = b;b = temp;

}

size_t parent (size_t index){

size_t n;n = (index-1)/2;return n;

}

Page 19: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 19

Alternative makeHeap methodAlternative makeHeap method

• Uses a function (subHeap) that creates a heap from a subtree of the complete binary tree

• The function takes three arguments: an array of items to be sorted, the size of the array(n), and a number representing the index of a subtree.

• Function subHeap is called within a loop in the makeHeap function, as follows:for (int x = (n/2); x>0; x--)

subHeap(data, x-1, n);

Page 20: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 20

Alternative makeHeap in ActionAlternative makeHeap in Action

42 19 33 8 12 97 54 85 29 60 26 71

Original array

First call to subHeap, in the given example:subHeap (data, 5, 12);

Algorithm examines subtree to determine if any change isnecessary; since the only child of element 5 (97) is element 11(71), no change from this call.

Page 21: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 21

Second call to subHeapSecond call to subHeap

42 19 33 8 12 97 54 85 29 60 26 71

The second call to subHeap looks at the subtree with rootindex 4

The children of this node are found at indexes 9 and 10

Both children are greater than the root node, so the rootnode’s data entry is swapped with the data entry of the largerchild

60 12

Page 22: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 22

Third call to subHeapThird call to subHeap

42 19 33 8 60 97 54 85 29 12 26 71

Continuing to work backward through the array, the nextroot index is 3

Again, the children of this node are examined, and theroot data entry is swapped with the larger of its two children

85 8

Page 23: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 23

The process continues ...The process continues ...

42 19 33 85 60 97 54 8 29 12 26 7197 33

In this case, a further swap is necessary because the dataentry’s new position has a child with a larger data entry

71 33

Page 24: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 24

The process continues ...The process continues ...

42 19 97 85 60 71 54 8 29 12 26 3385 19

Again, a further swap operation is necessary

29 19

Page 25: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 25

Last call to subHeapLast call to subHeap

42 85 97 29 60 71 54 8 19 12 26 3397 4271 42

We now have a heap:

Page 26: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 26

ReHeapDown functionReHeapDown function

• Both the original HeapSort function and the subHeap function will call reHeapDown to rearrange the array

• The reHeapDown fucntion swaps an out-of-place data entry with the larger of its children

Page 27: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 27

Code for reHeapDownCode for reHeapDowntemplate <class item>void reHeapDown(item data [ ], size_t n){

size_t current = 0, big_child, heap_ok = 0; while ((!heap_ok) && (left_child(current) < n)) { if (right_child(current) >= n)

big_child = left_child(current); else if (data[left_child(current)] > data[right_child(current)]) big_child = left_child(current); else big_child = right_child(current);

...

Page 28: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 28

Code for reHeapDownCode for reHeapDown

if (data[current] < data[big_child]) { swap(data[current], data[big_child]); current = big_child; } else heap_ok = 1; } // end of while loop} // end of function

Page 29: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 29

Helper functions for Helper functions for reHeapDownreHeapDown

size_t left_child(size_t n){

return (2*n)+1;}

size_t right_child(size_t n){

return (2*n)+2;}

Page 30: sorting31 Sorting III: Heapsort sorting32 A good sorting algorithm is hard to find... Quadratic sorting algorithms (with running times of O(N 2 ), such

sorting3 30

Time analysis for HeapSortTime analysis for HeapSort

• The time required for a HeapSort is the sum of the time required for the major operations:– building the initial heap: O(N log N)– removing items from the heap: O(N log N)

• So total time required is O(2N log N), but since constants don’t count, the worst-case (and average case) for HeapSort is O(N log N)