# Chapter 09: Sorting & Searching (compiled by Dr. Mohammad Omar Alhawarat)

Post on 12-Jan-2016


CS2320: Data structures & Algorithms

Chapter 09: Sorting & Searching
Compiled by: Dr. Mohammad Omar Alhawarat

## Content

Sorting:
• Bubble Sort
• Insertion Sort
• Selection Sort
• Merge Sort
• Quicksort

Searching:
• Sequential Search
• Binary Search
• Hashing

## Sorting

Definition: Rearranging the values into a specific order (ascending or descending).

Sorting is important and is required in many applications, e.g., searching.

It is one of the fundamental problems in computer science and can be solved in many ways. Solutions may:
• be fast or slow
• use more or less memory
• depend on the data
• utilize multiple computers / processors, ...

Comparison-based sorting: determining order by comparing pairs of elements.

An internal sort requires that the collection of data fit entirely in the computer's main memory.

We can use an external sort when the collection of data cannot fit in the computer's main memory all at once but must reside in secondary storage, such as on a disk.

We will analyze only comparison-based, internal sorting algorithms.

## Bubble Sort

Idea:
• Repeatedly pass through the array.
• Swap adjacent elements that are out of order.

Easy to implement, but slow: O(N²).

[Figure: bubble sort example, shown step by step over four slides]

## Bubble Sort Analysis

Worst-case: O(n²)
• Array is in reverse order.

Average-case: O(n²)
• We have to look at all possible initial data organizations.

So, Bubble Sort is O(n²).
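The bubble sort idea can be sketched in C++ as follows. This is a minimal illustration, not code from the slides: the function name `bubbleSort`, the returned pass count, and the early exit when a pass makes no swap are all assumptions added for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `a` into ascending order with bubble sort and returns the number of
// passes made. The early exit (stop when a pass makes no swap) is an addition
// that the slides do not mention.
int bubbleSort(std::vector<int>& a) {
    int passes = 0;
    bool swapped = true;
    // After each pass the largest remaining value has reached the end,
    // so the unsorted region shrinks by one element.
    for (std::size_t end = a.size(); end > 1 && swapped; --end) {
        swapped = false;
        ++passes;
        for (std::size_t i = 0; i + 1 < end; ++i) {
            if (a[i] > a[i + 1]) {  // adjacent pair is out of order
                std::swap(a[i], a[i + 1]);
                swapped = true;
            }
        }
    }
    return passes;
}
```

On a reverse-ordered array every adjacent comparison leads to a swap, which is the worst case named in the analysis; on an already sorted array this variant stops after a single pass.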

## Insertion Sort

Insertion sort is a simple sorting algorithm that is appropriate for small inputs.

The list is divided into two parts: sorted and unsorted.

In each pass, the first element of the unsorted part is picked up, transferred to the sorted sublist, and inserted at the appropriate place.

A list of n elements will take at most n-1 passes to sort the data.

[Figure: insertion sort example]

## Insertion Sort Analysis

Worst-case: O(n²)
• Array is in reverse order.

Average-case: O(n²)
• We have to look at all possible initial data organizations.

So, Insertion Sort is O(n²).
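A minimal C++ sketch of insertion sort, assuming an array of `int` sorted in ascending order (the name `insertionSort` is chosen for this example). The outer loop makes the n-1 passes mentioned earlier; the inner loop shifts larger elements of the sorted part to the right to open a gap for the element being inserted.

```cpp
#include <cstddef>
#include <vector>

// Sorts `a` into ascending order. Elements a[0..i-1] form the sorted part;
// a[i..] is the unsorted part.
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {  // n-1 passes
        const int key = a[i];                     // first unsorted element
        std::size_t j = i;
        // Shift sorted elements that are larger than `key` one slot right.
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;  // insert at the appropriate place
    }
}
```

With reverse-ordered input every element is shifted past the whole sorted part, which gives the quadratic worst case.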

## Selection Sort

Idea:
• Find the smallest element in the array.
• Exchange it with the element in the first position.
• Find the second smallest element and exchange it with the element in the second position.
• Continue until the array is sorted.

## Example

• 1 3 2 9 6 4 8
• 8 3 2 9 6 4 1
• 8 3 4 9 6 2 1
• 8 6 4 9 3 2 1
• 8 6 9 4 3 2 1
• 8 9 6 4 3 2 1
• 9 8 6 4 3 2 1
• 9 8 6 4 3 2 1

In this example each pass moves the smallest remaining value to the last unsorted position, so the array ends up in descending order.

## Selection Sort Analysis

Worst-case: O(n²)
• Array is in reverse order.

Average-case: O(n²)
• We have to look at all possible initial data organizations.

So, Selection Sort is O(n²).
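The selection sort steps can be sketched in C++ as below. This is an illustrative version that follows the stated idea (smallest element to the first position, ascending order); the name `selectionSort` is an assumption of the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `a` into ascending order by repeatedly selecting the smallest
// element of the unsorted part a[i..] and exchanging it with a[i].
void selectionSort(std::vector<int>& a) {
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            if (a[j] < a[smallest]) {
                smallest = j;
            }
        }
        if (smallest != i) {
            std::swap(a[i], a[smallest]);
        }
    }
}
```

The inner loop always scans the whole unsorted part, so the number of comparisons is the same for every input order; only the number of exchanges varies.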

## Merge Sort

Idea:
• It is based on merging, where two sorted lists are combined in the right order.
• The starting point is to consider each element in the list as a small ordered list. Merging these lists in pairs gives a list of two-element sorted lists.
• Repeatedly combine the ordered lists until only one list remains.

## Merging Algorithm

Merging two ordered lists:
• Access the first item from both lists.
• While neither sequence is finished:
  • Compare the current items of both.
  • Copy the smaller current item to the output.
  • Access the next item from that input sequence.
• Copy any remaining items from the first sequence to the output.
• Copy any remaining items from the second sequence to the output.
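The merging steps above can be sketched in C++ as follows. The function name `mergeSorted` and the choice to return a new vector are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Merges two ascending sequences into one ascending output sequence.
std::vector<int> mergeSorted(const std::vector<int>& first,
                             const std::vector<int>& second) {
    std::vector<int> out;
    out.reserve(first.size() + second.size());
    std::size_t i = 0, j = 0;
    // While neither sequence is finished, copy the smaller current item.
    while (i < first.size() && j < second.size()) {
        if (second[j] < first[i]) {
            out.push_back(second[j++]);
        } else {
            out.push_back(first[i++]);  // ties take from `first`
        }
    }
    // Copy whatever remains; at most one of these loops does any work.
    while (i < first.size()) out.push_back(first[i++]);
    while (j < second.size()) out.push_back(second[j++]);
    return out;
}
```

Each item is copied exactly once, so merging lists with a combined length of N takes O(N) time.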

[Figure: example of merging]

[Figure: example of merge sort]

## Merge Sort Analysis

Worst-case: O(N log N)
• Merge sort splits and merges in the same way whatever the initial order, so even an array in reverse order costs O(N log N).

Average-case: O(N log N)
• We have to look at all possible initial data organizations.

So, Merge Sort is O(N log N).

But merge sort requires an extra array whose size equals that of the original array.
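A bottom-up C++ sketch of merge sort, assuming `int` elements and ascending order. It treats every element as a sorted run of length 1 and then merges runs pairwise, doubling the run length each round. The name `mergeSort` and the variable `buffer` (the extra array mentioned above) are choices made for this example; a recursive top-down version is equally common.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sorts `a` into ascending order with bottom-up merge sort.
void mergeSort(std::vector<int>& a) {
    const std::size_t n = a.size();
    std::vector<int> buffer(n);  // extra array, same size as the input
    for (std::size_t width = 1; width < n; width *= 2) {
        // Merge neighbouring runs a[lo..mid) and a[mid..hi) into buffer.
        for (std::size_t lo = 0; lo < n; lo += 2 * width) {
            const std::size_t mid = std::min(lo + width, n);
            const std::size_t hi = std::min(lo + 2 * width, n);
            std::size_t i = lo, j = mid, k = lo;
            while (i < mid && j < hi) {
                buffer[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
            }
            while (i < mid) buffer[k++] = a[i++];
            while (j < hi) buffer[k++] = a[j++];
        }
        a.swap(buffer);  // the merged runs become the input of the next round
    }
}
```

The run length doubles each round, so there are about log₂N rounds, and each round copies all N elements once.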

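## Quicksort

Idea:
• Repeatedly partition the data into two parts around a chosen element (the pivot): smaller values go to one side and larger values to the other.
• After a partition, only the pivot, which ends up between the two parts, is in its final sorted position.
• If each partition splits the data roughly in half, the data is sorted after about log₂N levels of partitioning.

Advantage: One of the best sorting algorithms in practice, O(N log₂N) in the average case.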
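A compact C++ sketch of the quicksort idea. The slides do not say how the pivot is chosen or how the partition is carried out, so both are assumptions here: the pivot is the middle element of the range, and the partition is the simple single-scan scheme. All function names are chosen for this example.

```cpp
#include <utility>
#include <vector>

// Partitions a[lo..hi] around a pivot and returns the pivot's final index.
// Pivot choice (middle element) is an assumption of this sketch.
int partitionAroundPivot(std::vector<int>& a, int lo, int hi) {
    std::swap(a[lo + (hi - lo) / 2], a[hi]);  // park the pivot at the end
    const int pivot = a[hi];
    int store = lo;  // a[lo..store-1] holds values smaller than the pivot
    for (int i = lo; i < hi; ++i) {
        if (a[i] < pivot) {
            std::swap(a[i], a[store]);
            ++store;
        }
    }
    std::swap(a[store], a[hi]);  // pivot lands between the two parts
    return store;
}

// Sorts a[lo..hi] (inclusive bounds) into ascending order.
void quickSort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;  // zero or one element: already sorted
    const int p = partitionAroundPivot(a, lo, hi);
    quickSort(a, lo, p - 1);   // values smaller than the pivot
    quickSort(a, p + 1, hi);   // values not smaller than the pivot
}

// Convenience overload that sorts the whole vector.
void quickSort(std::vector<int>& a) {
    quickSort(a, 0, static_cast<int>(a.size()) - 1);
}
```

How evenly each partition splits the range depends on the pivot, which is what separates the average case from the worst case.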

Drawbacks: O(N²) in the worst case.

## Searching

## Introduction to Search Algorithms

Search: locate an item in a list (array, vector, etc.) of information.

Three algorithms:
• Linear search (also known as sequential search)
• Binary search
• Hashing

## Linear Search Example

The following array contains:

17 23 5 11 2 29 3

• Searching for the value 11, linear search examines 17, 23, 5, and 11.
• Searching for the value 7, linear search examines 17, 23, 5, 11, 2, 29, and 3.

See pr9-01.cpp

## Linear Search Tradeoffs

Benefits:
• Easy algorithm to understand
• Array can be in any order

Disadvantage:
• Inefficient (slow), O(N): for an array of N elements, it examines N/2 elements on average for a value that is found in the array, and N elements for a value that is not in the array.

## Binary Search Algorithm

1. Divide a sorted array into three sections:
   • the middle element
   • the elements on one side of the middle element
   • the elements on the other side of the middle element
2. If the middle element is the correct value, done. Otherwise, go to step 1, using only the half of the array that may contain the correct value.
3. Continue steps 1 and 2 until either the value is found or there are no more elements to examine.

Because the data is sorted, each step can ignore one-half of the remaining data.

## Binary Search Example

Suppose the array contains:

2 3 5 11 17 23 29

• Searching for the value 11, binary search examines 11 and stops.
• Searching for the value 7, binary search examines 11, 3, 5, and stops.

See pr9-02.cpp


## Binary Search Tradeoffs

Benefit:
• Much more efficient than linear search: for an array of N elements, it performs at most about log₂N comparisons, i.e., O(log₂N).

Disadvantage:
• Requires that the array elements be sorted.
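To make the comparison between the two searches concrete, here is a C++ sketch of both, written for this transcript rather than taken from pr9-01.cpp or pr9-02.cpp. The function names and the `examined` output parameter (which counts how many elements were looked at) are assumptions of the example.

```cpp
#include <cstddef>
#include <vector>

// Linear (sequential) search: returns the index of `target`, or -1.
// `examined` receives the number of elements looked at.
int linearSearch(const std::vector<int>& a, int target, int& examined) {
    examined = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        ++examined;
        if (a[i] == target) return static_cast<int>(i);
    }
    return -1;
}

// Binary search on an ascending array: returns the index of `target`, or -1.
int binarySearch(const std::vector<int>& sorted, int target, int& examined) {
    examined = 0;
    int lo = 0;
    int hi = static_cast<int>(sorted.size()) - 1;
    while (lo <= hi) {
        const int mid = lo + (hi - lo) / 2;  // middle element
        ++examined;
        if (sorted[mid] == target) return mid;
        if (sorted[mid] < target) {
            lo = mid + 1;  // keep only the upper half
        } else {
            hi = mid - 1;  // keep only the lower half
        }
    }
    return -1;  // no more elements to examine
}
```

On the arrays used in the examples, `linearSearch` on 17 23 5 11 2 29 3 examines 4 elements to find 11 and all 7 when looking for 7, while `binarySearch` on 2 3 5 11 17 23 29 examines 1 element to find 11 and 3 elements (11, 3, 5) before giving up on 7.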

## Time Complexity Summary

| Algorithm | Worst Case | Average Case |
|---|---|---|
| Bubble Sort | O(N²) | O(N²) |
| Insertion Sort | O(N²) | O(N²) |
| Selection Sort | O(N²) | O(N²) |
| Merge Sort | O(N log N) | O(N log N) |
| Heap Sort | O(N log N) | O(N log N) |
| Quick Sort | O(N²) | O(N log N) |
| Sequential Search | O(N) | O(N) |
| Binary Search | O(log N) | O(log N) |
| Binary Search Tree | O(N) | O(log N) |

## Hashing

Hashing can be classified as one of the searching techniques; it is usually used with external storage such as a hard disk drive (HDD).

Hashing is an information retrieval strategy for providing efficient access to information based on a key.

One usage is indexing databases. In such a case, the location of a record in a database is linked to the key/index of that record.

Information can usually be accessed in constant time.

## Concept of Hashing

• The information to be retrieved is stored in a hash table, which is best thought of as an array of m locations, called buckets.
• The mapping between a key and a bucket is called the hash function.
• The time to store and retrieve data is proportional to the time to compute the hash function (constant).

## Hashing Function

The ideal function, termed a perfect hash function, would distribute all elements across the buckets such that no collisions ever occurred.

h(v) = f(v) mod m

Knuth (1973) suggests using a prime number as the value for m.

## Hash Function

The hash function determines the position of a key in the array.

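Assume the table (array) size is N.

The function f(x) maps any key x to an integer between 0 and N−1.

For example, assume that N = 15, that key x is a non-negative integer between 0 and MAX_INT, and that the hash function is f(x) = x mod 15.

Let f(x) = x mod 15. Then:

| x | 25 | 129 | 35 | 2501 | 47 | 36 |
|---|---|---|---|---|---|---|
| f(x) | 10 | 9 | 5 | 11 | 2 | 6 |

Storing the keys in the array is straightforward:

| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Key | _ | _ | 47 | _ | _ | 35 | 36 | _ | _ | 129 | 25 | 2501 | _ | _ | _ |

Thus, delete and find can be done in O(1), and so can insert, except in the following case.

What happens when you try to insert x = 65?

x = 65, f(x) = 5

| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Key | _ | _ | 47 | _ | _ | 35, 65 (?) | 36 | _ | _ | 129 | 25 | 2501 | _ | _ | _ |

This is called a collision.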
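The example with f(x) = x mod 15 can be reproduced with a short C++ sketch. The names `kTableSize`, `kEmpty`, `hashKey`, and `insertWithoutResolution` are hypothetical, and the sketch deliberately does nothing to resolve a collision: it only reports that one occurred.

```cpp
#include <vector>

constexpr int kTableSize = 15;  // N in the example
constexpr int kEmpty = -1;      // marks a free slot; keys are non-negative

// f(x) = x mod 15
int hashKey(int x) {
    return x % kTableSize;
}

// Stores `key` in its hashed slot and returns true, or returns false when
// the slot is already occupied (a collision). No resolution is attempted.
bool insertWithoutResolution(std::vector<int>& table, int key) {
    const int slot = hashKey(key);
    if (table[slot] != kEmpty) {
        return false;
    }
    table[slot] = key;
    return true;
}
```

Starting from `std::vector<int> table(kTableSize, kEmpty)`, inserting 25, 129, 35, 2501, 47, and 36 fills slots 10, 9, 5, 11, 2, and 6; inserting 65 then fails because slot 5 already holds 35.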

## Handling Collisions

A collision occurs when two different keys hash to the same value.

Example: for TableSize = 17, the keys 18 and 35 hash to the same value: 18 mod 17 = 1 and 35 mod 17 = 1.

We cannot store both data records in the same slot of the array!

Resolution:
• Separate Chaining (Closed Addressing): use a dictionary data structure (such as a linked list) to store multiple items that hash to the same slot.
• Closed Hashing (Open Addressing): search for empty slots and store the item in the first empty slot that is found.
• Multi-hash functions: use another hash function to resolve the collision.

## Separate Chaining

Let each array element be the head of a chain.

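| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chain | | | 47 | | | 65 → 35 | 36 | | | 129 | 25 | 2501 | | | |

Where would you store: 29, 16, 14, 99, 127?

| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chain | | 16 | 47 | | | 65 → 35 | 36 | 127 | | 99 → 129 | 25 | 2501 | | | 14 → 29 |

New keys go at the front of the relevant chain.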
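Separate chaining with insertion at the front of the chain can be sketched in C++ as below. The class name `ChainedHashTable`, its member names, and the use of `std::forward_list` as the chain are assumptions of this example; keys are assumed to be non-negative and the hash function is key mod table size, as in the tables above.

```cpp
#include <forward_list>
#include <vector>

// Hash table of non-negative int keys; each bucket is the head of a chain.
class ChainedHashTable {
public:
    explicit ChainedHashTable(int buckets) : chains_(buckets) {}

    // New keys go at the front of the relevant chain.
    void insert(int key) {
        chains_[index(key)].push_front(key);
    }

    // Searches only the chain that `key` hashes to.
    bool contains(int key) const {
        for (int k : chains_[index(key)]) {
            if (k == key) return true;
        }
        return false;
    }

    // Read-only access to one chain, for inspection.
    const std::forward_list<int>& chain(int bucket) const {
        return chains_[bucket];
    }

private:
    int index(int key) const {
        return key % static_cast<int>(chains_.size());
    }

    std::vector<std::forward_list<int>> chains_;
};
```

With 15 buckets, inserting 35 and then 65 leaves bucket 5 holding 65 → 35, and inserting 29 and then 14 leaves bucket 14 holding 14 → 29.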

## Closed Hashing

The hash table should be large enough to hold all the keys:

| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Key | | 16 | 47 | | | 65 | 36 | 127 | | 99 | 25 | 2501 | | | 14 |

Where would you store: 29, 60, 24, 97?

| Index | 0 | 1 | 2 | … |
|---|---|---|---|---|
| Key | 29 | 16 | 47 | … |
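One common way to "search for empty slots" is linear probing: start at the hashed slot and step forward one slot at a time, wrapping around at the end of the table. The slides do not name the probing scheme, so linear probing is an assumption of this C++ sketch, as are the names `kEmptySlot` and `insertLinearProbing`.

```cpp
#include <vector>

constexpr int kEmptySlot = -1;  // marks a free slot; keys are non-negative

// Inserts `key` using linear probing (an assumed probing scheme) and returns
// the slot where it was stored, or -1 if the table is full or has no slots.
int insertLinearProbing(std::vector<int>& table, int key) {
    const int n = static_cast<int>(table.size());
    if (n == 0) return -1;
    const int home = key % n;  // slot given by the hash function
    for (int step = 0; step < n; ++step) {
        const int slot = (home + step) % n;  // wrap around past the last slot
        if (table[slot] == kEmptySlot) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;  // every slot is occupied
}
```

Under this assumption, 29 hashes to slot 14, finds it occupied by 14, and wraps around to slot 0, which agrees with the 29 shown in slot 0 above.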
