# sdt topic-09: searching & sorting

Post on 15-Jul-2015

Topic 9: Searching and Sorting

Software Development Techniques

Learning Objectives

Develop linear and binary searches

Develop bubble and quick sorts

Apply Big O Notation to algorithm selection.

Searching

Searching algorithms allow us to find a specific element in an array.

We often want to know if an element is present in an array before performing some operation.

If we are storing a list of names, we will want to check to make sure a name is not already there.

Searching algorithms allow us to do that.

They can also tell us exactly where the element is, if that is needed.

Searching

We often need to perform search operations, not just to find out whether an element is within an array, but also to get access to its index.

We will look at two search algorithms:

1. Linear searching

2. Binary searching

Searching - Linear

The simplest and most costly search is a linear search.

We start at the beginning of an array.

We check each element in turn to see if it is our desired element.

We continue until we find it, or until we get to the end of the array.

Linear search is simple to code, and simple to understand.

But it also has performance issues.

Linear Search Pseudocode

Linear Search using Function

Write a main program.
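The slide's original pseudocode did not survive in this transcript, so the following is only a minimal Python sketch of the linear search described above, written as a function with a small main program. The names `linear_search` and `main`, and the sample list of names, are illustrative assumptions rather than the original material.

```python
def linear_search(items, target):
    """Return the index of the first match, or -1 if target is absent."""
    for index, value in enumerate(items):
        if value == target:
            return index  # stop as soon as the element is found
    return -1  # reached the end of the array without a match


def main():
    names = ["Asha", "Bikram", "Chen", "Dipa"]  # illustrative data only
    print(linear_search(names, "Chen"))  # prints 2
    print(linear_search(names, "Zed"))   # prints -1


if __name__ == "__main__":
    main()
```

Returning the index answers both questions from the earlier slides: whether the element is present, and where it is.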

Problem

We need to do a lot of checking in order to find an element.

If the desired element is at the start of the array, we can find it with one check.

If it is at the end, we find it with a number of checks equal to the size of the array.

On average, we will need to check a number of times equal to about half the size of the array.

If we have 10 elements, we need about 5 checks.

If we have 100 elements, we need about 50 checks.

Problem

If we have an array of N elements, then:

Our best case is 1 check

Resource usage/time consumption is least

Our worst case is N checks

Resource usage/time consumption is most

Our average case is N/2 checks

Resource usage/time consumption is average
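The three cases above can be checked by counting comparisons directly. This is a small sketch, assuming a hypothetical helper `checks_needed` and an array of 100 distinct numbers. The exact average comes out as (N + 1) / 2, which the slides round to N/2.

```python
def checks_needed(items, target):
    """Count how many elements a linear search inspects before stopping."""
    checks = 0
    for value in items:
        checks += 1
        if value == target:
            break
    return checks


data = list(range(1, 101))  # 100 elements: 1, 2, ..., 100

best = checks_needed(data, 1)      # target is the first element
worst = checks_needed(data, 100)   # target is the last element
average = sum(checks_needed(data, t) for t in data) / len(data)

print(best, worst, average)  # prints: 1 100 50.5
```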

Big O Notation

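Big O notation is the expression that we use to describe how the running time of an algorithm grows as the size of its input grows.

The Big O notation for

1. best case of Linear Search is O(1).

2. worst case of Linear Search is O(N).

3. average case of Linear Search is also O(N), because N/2 checks still grows in proportion to N.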

Binary Searching

If we have an array that is in ascending or descending order, we can use a binary search.

We pick the mid-point of the array.

If we have found our search term, our work is done.

If the mid-point is not our search term, we split the array into two segments.

For an ascending array, if our search term is higher than the mid-point element, binary search the top half.

If our search term is lower than the mid-point element, binary search the bottom half.

Continue until the term is found, or until there are no elements left to check.

Binary Searching

Write the pseudo code for performing a binary search on an ordered array.
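One possible answer, sketched in Python rather than pseudocode. It assumes the array is in ascending order and uses a loop instead of recursion. The name `binary_search` and the sample data are illustrative.

```python
def binary_search(items, target):
    """Return the index of target in an ascending list, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2      # pick the mid-point of the segment
        if items[mid] == target:
            return mid               # found: our work is done
        if target > items[mid]:
            low = mid + 1            # continue in the top half
        else:
            high = mid - 1           # continue in the bottom half
    return -1                        # segment is empty: term not present


scores = [3, 8, 15, 21, 34, 55, 89]  # must already be sorted
print(binary_search(scores, 55))  # prints 5
print(binary_search(scores, 4))   # prints -1
```

For a descending array the two comparisons would simply be swapped.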

Binary Searching

Binary searching scales much better than a linear search.

There is a lot less checking involved.

We say binary searching scales at O(log n).

That puts it between O(1) and O(n): each check halves the number of elements still to be searched.
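To make the difference concrete, the sketch below compares the worst-case number of checks for the two searches. The helper names are hypothetical. It relies on the fact that a binary search over n elements needs at most floor(log2(n)) + 1 checks, which for a positive integer n equals `n.bit_length()` in Python.

```python
def max_linear_checks(n):
    """Worst case for linear search: every element is checked."""
    return n


def max_binary_checks(n):
    """Worst case for binary search: floor(log2(n)) + 1 checks, n >= 1."""
    return n.bit_length()


for n in (10, 100, 1_000_000):
    print(n, max_linear_checks(n), max_binary_checks(n))
# prints:
# 10 10 4
# 100 100 7
# 1000000 1000000 20
```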

Sorting

The process for ordering an array is known as sorting.

There are many algorithms for doing this.

We will only discuss two of these.

1. Bubble sort

2. Quick sort

The Bubble Sort

The simplest form of sorting is known as the bubble sort.

It works by checking adjacent pairs and then swapping them when they are not in order.

We keep doing this until we do not make any swaps in a pass.

The end result of this is an ordered array.

How Bubble Sort Works

From http://en.wikipedia.org/wiki/Bubble_sort

Bubble Sort Pseudo code
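The slide's pseudocode is not present in this transcript. As a stand-in, here is a minimal Python sketch of the procedure described earlier: swap adjacent pairs that are out of order, and stop after a pass with no swaps. The name `bubble_sort` is illustrative, and the function returns a sorted copy rather than sorting in place, which is a design choice of this sketch.

```python
def bubble_sort(values):
    """Return a new list sorted in ascending order using bubble sort."""
    items = list(values)  # work on a copy so the input is left unchanged
    swapped = True
    while swapped:        # keep making passes until one makes no swaps
        swapped = False
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:
                # adjacent pair is out of order: swap it
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items


print(bubble_sort([5, 1, 4, 2, 8]))  # prints [1, 2, 4, 5, 8]
```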

Bubble Sort - Scaling

Bubble sorts do not scale well, because they are made up of a loop within a loop.

They have O(n^2) scaling in the worst case and O(n) scaling in the best case.

Big O Notation O(n^2)

These algorithms scale quadratically: doubling the size of the array roughly quadruples the work.

They are much worse than linear scaling.

This is true for both average and worst case.

However, they scale much better for arrays that are already partially sorted.
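The gap between the best and worst cases can be seen by counting comparisons. The sketch below assumes a common bubble sort variant that stops after a pass with no swaps and skips the already-sorted tail on each pass. The name `count_bubble_comparisons` is hypothetical.

```python
def count_bubble_comparisons(values):
    """Bubble sort a copy of values and return the number of comparisons."""
    items = list(values)
    comparisons = 0
    end = len(items) - 1  # last index that still needs checking
    swapped = True
    while swapped and end > 0:
        swapped = False
        for i in range(end):
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        end -= 1  # the largest remaining element is now in place
    return comparisons


already_sorted = list(range(1, 11))   # 1, 2, ..., 10
reversed_order = list(range(10, 0, -1))  # 10, 9, ..., 1

print(count_bubble_comparisons(already_sorted))  # prints 9
print(count_bubble_comparisons(reversed_order))  # prints 45
```

For 10 elements, the sorted input needs one pass of n - 1 = 9 comparisons, while the reversed input needs n(n - 1)/2 = 45.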

Bubble sorts, then, are simple and easy to code, but have very limited real-world use for large data sets.

They show one mechanism by which an array can be sorted.

Recursion

Recursion is a special kind of looping.

Rather than looping within a function, we loop by having the function call itself with a smaller set of data, until the data is small enough to answer directly.

Sum using recursive function
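The original example is missing from this transcript. One way a recursive sum might look in Python is sketched below, assuming the task is to add up a list of numbers. The name `recursive_sum` is illustrative.

```python
def recursive_sum(numbers):
    """Add up a list by summing the first item and the rest of the list."""
    if not numbers:
        return 0  # base case: an empty list sums to 0
    # recursive case: call ourselves with a smaller list
    return numbers[0] + recursive_sum(numbers[1:])


print(recursive_sum([1, 2, 3, 4, 5]))  # prints 15
```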

Factorial using recursive function
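Again, the slide's own code is not in the transcript, so this is only a sketch of the usual recursive definition, n! = n × (n - 1)!, with 0! = 1! = 1 as the base case. The negative-input check is an addition of this sketch.

```python
def factorial(n):
    """Return n! for a non-negative integer n, using recursion."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    if n <= 1:
        return 1  # base case: 0! and 1! are both 1
    return n * factorial(n - 1)  # recursive case: smaller problem


print(factorial(5))  # prints 120
```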

Quicksort

A much better algorithm for sorting is known as the quicksort. This is an algorithm based on the principle of divide and conquer.

It works by splitting an array into two sections. It then sorts each section individually.

The clever thing about a quicksort is that it is recursive. Subarrays are quicksorted in turn.

Quicksort Procedure

We are given an array which may or may not be sorted.

We pick a pivot value.

We will simply pick the middle element.

We separate our array into two separate arrays:

Those elements that are greater than the pivot value.

Those elements that are less than the pivot value.

We quicksort each of those separate arrays.

At the end, we join all the arrays back together in order: the smaller elements, then the pivot, then the larger elements.
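The procedure above can be sketched in Python as follows. This is a simple illustrative version that builds new lists rather than partitioning in place. It also keeps a third list for elements equal to the pivot, an assumption added here so that duplicate values are handled.

```python
def quicksort(values):
    """Return a new list sorted in ascending order using quicksort."""
    if len(values) <= 1:
        return list(values)  # base case: nothing left to sort

    pivot = values[len(values) // 2]  # pick the middle element as the pivot

    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]   # the pivot and any duplicates
    larger = [v for v in values if v > pivot]

    # quicksort each part, then join the results in order
    return quicksort(smaller) + equal + quicksort(larger)


print(quicksort([7, 2, 9, 4, 7, 1]))  # prints [1, 2, 4, 7, 7, 9]
```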

Quicksort Animation

From http://en.wikipedia.org/wiki/Quicksort

Quicksort Scaling

Worst case performance O(n^2)

Best case performance O(n log n)

Important Question

Explain why the bubble sort algorithm is less efficient than the quick sort algorithm for sorting a large number of elements.

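The bubble sort does not scale well because it uses a nested loop that compares adjacent pairs again and again, giving O(n^2) work in the average and worst cases. A quicksort uses recursion to sort ever smaller subarrays around a pivot point, selected here as the midpoint of the array, which gives O(n log n) work in the best and average cases. For a large number of elements, n log n grows far more slowly than n^2.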

http://cs.stackexchange.com/questions/3/why-is-quicksort-better-than-other-sorting-algorithms-in-practice

Conclusion

Arrays are powerful, but we need techniques for managing them.

Two of those techniques are searching and sorting.

Standard algorithms exist for both of these tasks.

Linear and binary search

Bubble and quicksort

There are many other such algorithms.

Big O notation tells us which are likely to be better for specific kinds of data.

Terminology - 1

Big O Notation

A way to state how well algorithms scale.

Linear Search

To search through every element in an array in order for a search term.

Binary Search

To search an ordered array by repeatedly halving the segment that could contain the search term.

Terminology - 2

Bubble Sort

A sort which works by repeatedly swapping adjacent elements until an array is ordered.

Quicksort

A sort which uses recursion to more efficiently sort a list of numbers.

Recursion

A loop that is created by having a function call itself with a smaller set of data.

End of Topic 9

Software Development Techniques