SDT Topic-09: Searching & Sorting

Post on 15-Jul-2015

  • Topic 9 : Searching and Sorting

    Software Development Techniques

  • Learning Objectives

    Develop linear and binary searches

    Develop bubble and quick sorts

    Apply Big O Notation to algorithm selection.

  • Searching

    Searching algorithms allow us to find a specific element in an array.

    We often want to know if an element is present in an array before performing some operation.

    If we are storing a list of names, we will want to check to make sure a name is not already there.

    Searching algorithms allow us to do that.

    They can also tell us exactly where the element is, if that is needed.

  • Searching

    We often need to search not just to find out whether an element is in an array, but also to get its index.

    We will look at two approaches:

    1. Linear searching

    2. Binary searching

  • Searching - Linear

    The simplest and most costly search is a linear search.

    We start at the beginning of an array.

    We check each element in turn to see if it is our desired element.

    We continue until we find it or reach the end of the array.

    Linear search is simple to code, and simple to understand.

    But it also has performance issues.

  • Linear Search Pseudocode
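
    The steps described for linear search can be sketched as follows. This is a minimal illustration in Python rather than the original slide's pseudocode; the function name `linear_search` and the choice of -1 as the "not found" result are assumptions.

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if it is absent."""
    for index, value in enumerate(items):
        if value == target:
            return index  # found: no need to check the rest
    return -1  # reached the end without a match


names = ["Ann", "Bob", "Cat"]
print(linear_search(names, "Bob"))   # 1
print(linear_search(names, "Dave"))  # -1
```

    Returning the index covers both uses mentioned earlier: testing for presence and finding the position.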

  • Linear Search using Function

    Write a main program.

  • Problem

    We need to do a lot of checking in order to find an element.

    If the desired element is at the start of the array, we can find it with one check.

    If it is at the end, we find it with a number of checks equal to the size of the array.

    On average, we will need a number of checks equal to about half the size of the array.

    If we have 10 elements, we need about 5 checks on average.

    If we have 100 elements, we need about 50 checks on average.

  • Problem

    If we have an array of N elements, then:

    Our best case is 1 check

    Resource usage/time consumption is least

    Our worst case is N checks

    Resource usage/time consumption is most

    Our average case is N/2 checks

    Resource usage/time consumption is average
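
    The three cases can be checked by counting. The sketch below assumes a hypothetical helper, `count_checks`, that is not in the slides. Averaged over every position, the exact figure is (N + 1) / 2, which is close to N/2.

```python
def count_checks(items, target):
    """Linear search that returns how many elements were checked."""
    checks = 0
    for value in items:
        checks += 1
        if value == target:
            break
    return checks


data = list(range(10))         # N = 10 elements: 0..9
best = count_checks(data, 0)   # target is the first element
worst = count_checks(data, 9)  # target is the last element
average = sum(count_checks(data, t) for t in data) / len(data)
print(best, worst, average)    # 1 10 5.5
```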

  • Big O Notation

    Big O notation describes how the running time of an algorithm grows as the size of its input grows.

    For linear search on N elements:

    1. The best case is O(1).

    2. The worst case is O(N).

  • Binary Searching

    If we have an array that is in ascending or descending order, we can use a binary search.

    We pick the mid-point of the array.

    If we have found our search term, our work is done.

    If the search term is not there, we split the array into two segments at the mid-point.

    For an array in ascending order: if our search term is higher than the mid-point element, binary search the top half.

    If our search term is lower than the mid-point element, binary search the bottom half.

    Continue until the term is found, or until there are no elements left to search.

  • Binary Searching

    Write the pseudo code for performing a binary search on an ordered array.

  • Binary Searching
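
    One possible answer to the exercise, sketched in Python instead of pseudocode. It assumes the array is in ascending order; the name `binary_search` and the -1 "not found" result are illustrative choices, not taken from the slides.

```python
def binary_search(items, target):
    """Return the index of target in ascending-sorted items, or -1."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2  # pick the mid-point of the current range
        if items[mid] == target:
            return mid
        if target > items[mid]:
            low = mid + 1   # keep searching the top half
        else:
            high = mid - 1  # keep searching the bottom half
    return -1  # the range is empty: target is not present


ordered = [2, 5, 8, 12, 16, 23, 38]
print(binary_search(ordered, 23))  # 5
print(binary_search(ordered, 7))   # -1
```

    A loop is used here; the same idea can also be written recursively by calling the function on the chosen half.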

  • Binary Searching

    Binary searching scales much better than a linear search.

    There is a lot less checking involved.

    We say binary searching scales at O(log n).

    That puts it between O(1) and O(n)
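
    To get a feel for O(log n), the sketch below counts how many mid-points are checked when the search term is larger than every element, which is a worst case. `binary_search_checks` is a hypothetical helper written for this illustration.

```python
def binary_search_checks(items, target):
    """Binary search on ascending items; return the number of mid-points checked."""
    low, high, checks = 0, len(items) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        checks += 1
        if items[mid] == target:
            break
        if target > items[mid]:
            low = mid + 1
        else:
            high = mid - 1
    return checks


for n in (10, 1_000, 1_000_000):
    data = range(n)  # 0..n-1, already in ascending order
    print(n, binary_search_checks(data, n))  # n itself is not in the data
# 10 4
# 1000 10
# 1000000 20
```

    A linear search for the same missing value would need n checks each time.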

  • Sorting

    The process for ordering an array is known as sorting.

    There are many algorithms for doing this.

    We will only discuss two of these.

    1. Bubble sort

    2. Quick sort

  • The Bubble Sort

    The simplest form of sorting is known as the bubble sort.

    It works by checking adjacent pairs and then swapping them when they are not in order.

    We keep doing this until we do not make any swaps in a pass.

    The end result of this is an ordered array.

  • How Bubble Sort Works

    From http://en.wikipedia.org/wiki/Bubble_sort

  • Bubble Sort Pseudo code
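
    The bubble sort described above can be sketched as follows. This is an illustrative Python version, not the original slide's pseudocode; the name `bubble_sort` and the choice to sort the list in place are assumptions.

```python
def bubble_sort(items):
    """Sort a list in place into ascending order and return it."""
    swapped = True
    while swapped:  # stop after a pass that makes no swaps
        swapped = False
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:  # adjacent pair out of order
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items


print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```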

  • Bubble Sort - Scaling

    Bubble sorts do not scale well because they are made up of a loop within a loop.

    They have O(n^2) scaling in the worst case and O(n) scaling in the best case.
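
    The gap between the best and worst case can be seen by counting comparisons. The sketch below assumes a simple bubble sort that rescans the whole array on every pass; `bubble_sort_comparisons` is a hypothetical helper name.

```python
def bubble_sort_comparisons(items):
    """Bubble sort a copy of items; return how many comparisons were made."""
    data = list(items)
    comparisons = 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(data) - 1):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
    return comparisons


n = 100
print(bubble_sort_comparisons(range(n)))         # 99: already sorted, one pass
print(bubble_sort_comparisons(range(n, 0, -1)))  # 9900: reversed, n passes
```

    For sorted input the count grows in step with n; for reversed input it grows with n * n.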

  • Big O Notation O(n^2)

    These algorithms scale quadratically: doubling the size of the array roughly quadruples the work.

    That is much worse than linear scaling.

    This is true for both average and worst case.

    However, they scale much better for arrays that are already partially sorted.

    Bubble sorts, then, are simple and easy to code, but have very limited real-world use for large data sets.

    They show one mechanism by which an array can be sorted.

  • Recursion

    Recursion is a special kind of looping.

    Rather than looping within a function, we loop by having the function call itself, each time with a smaller set of data, until a stopping condition is reached.

  • Sum using recursive function
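
    The original slide content is not in the transcript. As a minimal sketch, assuming the sum in question is 1 + 2 + ... + n, a recursive version in Python could look like this (`recursive_sum` is an illustrative name):

```python
def recursive_sum(n):
    """Return 1 + 2 + ... + n for n >= 0."""
    if n == 0:  # base case: stops the recursion
        return 0
    return n + recursive_sum(n - 1)  # same problem on a smaller n


print(recursive_sum(5))  # 15
```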

  • Factorial using recursive function
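
    The original slide content is not in the transcript. A minimal recursive factorial in Python, assuming n is a non-negative integer, might be written as:

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n <= 1:  # base case: 0! and 1! are both 1
        return 1
    return n * factorial(n - 1)  # n! = n * (n - 1)!


print(factorial(5))  # 120
```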

  • Quicksort

    A much better algorithm for sorting is known as the quicksort. This is an algorithm based on the principle of divide and conquer.

    It works by splitting an array into two sections. It then sorts each section individually.

    The clever thing about a quicksort is that it is recursive. Subarrays are quicksorted in turn.

  • Quicksort Procedure

    We are given an array which may or may not be sorted.

    We pick a pivot value.

    We will just pick the middle element.

    We separate our array into two separate arrays:

    Those elements that are greater than the pivot value.

    Those elements that are less than the pivot value.

    We quicksort each of those separate arrays.

    At the end, we join the sorted arrays back together around the pivot: lower values, then the pivot, then higher values.
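
    The procedure can be sketched in Python as follows. This version builds new lists instead of sorting in place, and it keeps elements equal to the pivot in a third list so that duplicates are not lost; both choices are assumptions made for clarity and are not stated in the slides.

```python
def quicksort(items):
    """Return a new list with the elements of items in ascending order."""
    if len(items) <= 1:  # base case: nothing left to sort
        return list(items)
    pivot = items[len(items) // 2]  # pick the middle element as the pivot
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    # Quicksort each part, then join the results back together.
    return quicksort(less) + equal + quicksort(greater)


print(quicksort([33, 4, 15, 4, 8, 42, 1]))  # [1, 4, 4, 8, 15, 33, 42]
```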

  • Quicksort Animation

    From http://en.wikipedia.org/wiki/Quicksort

  • Quicksort Scaling

    Worst case performance O(n^2)

    Best case performance O(n log n)

  • Important Question

    Explain why the bubble sort algorithm is less efficient than the quick sort algorithm for sorting a large number of elements.

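    The bubble sort does not scale well because it uses a nested loop: its average and worst case are O(n^2). A quick sort uses recursion to sort increasingly small arrays around a pivot, here chosen as the midpoint of the array, and in its best case it scales at O(n log n), so far fewer comparisons are needed when the number of elements is large.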

    http://cs.stackexchange.com/questions/3/why-is-quicksort-better-than-other-sorting-algorithms-in-practice

  • Conclusion

    Arrays are powerful, but we need techniques for managing them.

    Two of those techniques are searching and sorting.

    Standard algorithms exist for both of these tasks.

    Linear and binary search

    Bubble and quicksort

    There are many other such algorithms.

    Big O notation tells us which are likely to be better for specific kinds of data.

  • Terminology - 1

    Big O Notation

    A way to state how well algorithms scale.

    Linear Search

    To search through every element in an array, in order, for a search term.

    Binary Search

    To partition a search for increased efficiency.

  • Terminology - 2

    Bubble Sort

    A sort which works by repeatedly swapping adjacent elements until an array is ordered.

    Quicksort

    A sort which uses recursion to more efficiently sort a list of numbers.

    Recursion

    A loop that is created by having a function call itself with a smaller set of data.

  • End of Topic 9

    Software Development Techniques