TRANSCRIPT
Extended Introduction to Computer Science CS1001.py
Lecture 7: Basic Algorithms – Part 1
Binary Search, Selection Sort
Instructors: Elhanan Borenstein, Amir Rubinstein
Teaching Assistants: Michal Kleinbort,
Noam Parzanchevsky, Ori Zviran
School of Computer Science Tel-Aviv University
Spring Semester 2020 (the “Corona Semester”)
http://tau-cs1001-py.wikidot.com
Lecture 5-6A: Highlights
• Binary Numbers
• Representing integers with bits
• Integer Exponentiation - Naïve method vs. (fast) iterated squaring
Lecture 7: Plan
• Basic algorithms:
  • Binary search
  • Sorting using selection sort
  • Merging sorted lists (next time)
• Complexity of algorithms (next time)
  • The O(…) notation – a formal definition for complexity
  • Worst / best case analysis
Search
1. Sequential (linear) search
2. Binary search (on sorted lists)
[Image: web search illustration, taken from http://bizlinksinternational.com/web/web%20seo.php]
Search
• Search has always been a central computational task. In early days, search supposedly took one quarter of all computing time.
• The emergence and popularization of the world wide web has literally created a universe of data, and with it the need to pinpoint information in this universe.
• Various search engines have emerged to cope with this challenge. They constantly collect data on the web, organize it, index it, and store it in sophisticated data structures that support efficient (fast) access, resilience to failures, frequent updates (including deletions), etc.
• In this class we will deal with two much simpler search algorithms:
  • Sequential search
  • Binary search
Search in Unordered vs. Ordered Lists
Hands-on experience: searching for a word in a book vs. searching for it in a dictionary.
(We mean a real-world, hard-copy dictionary, not Python's dict, which you may have already met.)
Sequential Search
The computational problem:
Input: a set of elements, and a key
Output: the index of an element in the set with the given key (if it exists)

Possible solution:

def sequential_search(lst, key):
    for i in range(len(lst)):
        if lst[i] == key:
            return i
    # we get here when key is not in the list
    return None

Efficiency: how many iterations in the worst and best cases?
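A couple of sample runs may help (illustrative values, not from the original slides):

>>> sequential_search([31, 5, 72, 12], 72)
2
>>> sequential_search([31, 5, 72, 12], 100) == None   # key absent
True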
Sequential Search: Time Analysis
• Any sequential search in an unordered list goes over it, item by item. If the list is of length n, sequential search will take n steps in the worst case (when the key is absent, so the item is never found).
• For a rather short list, n steps is not a problem. But if n is very large, such a search will take very long.
Searching Backwards: Code
• Is going over the list in reversed order justified?
• And is reversing the list first a good idea? Think about what happens to the worst- and best-case inputs.
• What about a random order?

def sequential_search_back(lst, key):
    lst = lst[::-1]   # creates a reversed copy of the list
    for i in range(len(lst)):
        if lst[i] == key:
            return len(lst)-i-1   # index in the original list
    # we get here when key is not in the list
    return None
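Note that a reversed scan does not actually require building a reversed copy (which itself costs n operations). A sketch of such a variant, not from the original slides:

def sequential_search_back2(lst, key):
    """ scan from the end, without copying the list """
    for i in range(len(lst)-1, -1, -1):   # indices n-1, n-2, ..., 0
        if lst[i] == key:
            return i
    # we get here when key is not in the list
    return None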
Animated Example – success
Searching for the existing item, 18

index: 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21
A:     3  4  5  6  9  10 13 16 18 20 22 28 29 31 32 33 40 42 47 48 50 52

Step 1: left=0, right=21, mid=10:  A[mid] == 22 > 18, continue in the left half
Step 2: left=0, right=9,  mid=4:   A[mid] == 9  < 18, continue in the right half
Step 3: left=5, right=9,  mid=7:   A[mid] == 16 < 18, continue in the right half
Step 4: left=8, right=9,  mid=8:   A[mid] == 18       Found!
Animated Example – failure
Searching for the non-existing item, 17

index: 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21
A:     3  4  5  6  9  10 13 16 18 20 22 28 29 31 32 33 40 42 47 48 50 52

Step 1: left=0, right=21, mid=10:  A[mid] == 22 > 17, continue in the left half
Step 2: left=0, right=9,  mid=4:   A[mid] == 9  < 17, continue in the right half
Step 3: left=5, right=9,  mid=7:   A[mid] == 16 < 17, continue in the right half
Step 4: left=8, right=9,  mid=8:   A[mid] == 18 > 17, continue in the left half
Step 5: left=8, right=7:  left > right – Not Found!
Binary Search – the code

def binary_search(lst, key):
    """ iterative binary search. lst must be sorted """
    n = len(lst)
    left = 0
    right = n-1
    while left <= right:
        middle = (right + left)//2   # middle rounded down
        if key == lst[middle]:       # item found
            return middle
        elif key < lst[middle]:      # item not in top half
            right = middle-1
        else:                        # item not in bottom half
            left = middle+1
    #print(key, "not found")
    return None
Binary Search – the code (cont.)
Execution examples: in class. A few illustrative runs are sketched below.
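These runs use arbitrary sample values (not necessarily those shown in class):

>>> lst = [3, 4, 5, 6, 9, 10, 13, 16, 18, 20, 22]
>>> binary_search(lst, 18)
8
>>> binary_search(lst, 3)
0
>>> binary_search(lst, 17) == None   # 17 is not in the list
True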
Binary Search – Time Analysis
• As a measure for efficiency, we will look at the number of steps/iterations.
• An underlying assumption: the time needed for all the (basic) operations in each iteration is bounded by some constant. Therefore each iteration takes at most a constant amount of time.
  What is considered a basic operation? This is context dependent (discussion in class).
  Note that this assumption does not always hold (recall integer exponentiation as an example).
• So, how many iterations are needed, as a function of the input size? The input size in this case is the list’s length, denoted n. (A sketch of the bound appears after this list.)
• Does the result depend on the content of the input, or on its length only?
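One way to bound the iteration count (a sketch, not from the slides; the full analysis is done in class): each iteration shrinks the active range [left, right] to at most half its previous size, so

\[
\text{after } k \text{ iterations, the range length is at most } \frac{n}{2^k},
\]

and the loop ends once the range is empty, i.e. when n/2^k < 1. Solving gives k > log₂ n, so the worst case is about ⌊log₂ n⌋ + 1 = O(log n) iterations.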
Binary Search – Real Time Measurements

import time

repeat = 20   # repeat execution several times, for more significant results
for n in [10**6, 2*10**6, 4*10**6]:
    print("n=", n)
    L = [i for i in range(n)]
    key = -1   # why? (-1 is not in L, so both searches hit their worst case)
    t0 = time.perf_counter()
    for i in range(repeat):
        res = sequential_search(L, key)
    t1 = time.perf_counter()
    print("sequential search:", t1-t0)
    t0 = time.perf_counter()
    for i in range(repeat):
        res = binary_search(L, key)
    t1 = time.perf_counter()
    print("binary search:", t1-t0)
n= 1000000
sequential search: 2.9088399801573757
binary search: 0.0005532383071873426
n= 2000000
sequential search: 5.7504573815927795
binary search: 0.0005583582503900786
n= 4000000
sequential search: 11.536035866908783
binary search: 0.0005953356179659863
• How would the results change if we searched for an element that does exist in the list? Does it depend on where in the list the element is found?
Logarithmic vs. Linear Time Algorithms

[Plot of log(n) vs. n, created using https://graphsketch.com/]

Logarithmic: input ×2 ⇒ time + constant
Linear: input ×2 ⇒ time ×2 (approximately)
Sorting
The Sorting Problem
The computational problem:
Input: a set of elements
Output: a sequence of the same elements, ordered by “size”

Note that a computational problem is described in abstract terms; it is merely a desired relation between legal inputs and their outputs.
Technically, we will represent a sequence as a Python list.

Possible algorithms?
We will see at least 3 in this course, one today.
These solutions employ different strategies, which have consequences in terms of efficiency, as we will see.
Selection Sort
• The algorithm in pseudo-code:

Selection-Sort (input: lst of size n)
1. for i = 0 to n-1:
   1.1 find the minimum of the sublist of lst from index i onward
   1.2 swap it with the element at index i
2. end

• Implementation in code:

def selection_sort(lst):
    ''' sort lst (in-place) '''
    n = len(lst)
    for i in range(n):
        m_index = i   # index of minimum
        for j in range(i+1, n):
            if lst[m_index] > lst[j]:
                m_index = j
        swap(lst, i, m_index)
    return None   # no need to return lst - it is sorted in-place

def swap(lst, i, j):
    tmp = lst[i]
    lst[i] = lst[j]
    lst[j] = tmp
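As a side note (not on the slides), Python also supports a more idiomatic one-line swap via simultaneous (tuple) assignment, with the same behavior:

def swap(lst, i, j):
    lst[i], lst[j] = lst[j], lst[i]   # simultaneous assignment, no temp needed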
Selection Sort – Efficiency
• We will analyze in class the total number of iterations as a function of the list size, n (the selection_sort code is shown above).
• Then we will measure actual running time and see if it fits the formal analysis.
Selection Sort – Analysis
• As a measure for efficiency, we will look at the number of iterations.
• An underlying assumption: the time needed for all the (basic) operations in each iteration is bounded by some constant (not including inner loops). Therefore each iteration takes a constant amount of time.
  What is considered a basic operation? This is context dependent (discussion in class).
  Note that this assumption does not always hold (recall integer exponentiation as an example).
• So, how many iterations are needed, as a function of the input size? The input size in this case is the list’s length, denoted n.
• Does the result depend on the content of the list, or on its length only?
Answers: in class and on board. One way to count is sketched below.
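A sketch of the count (the full argument is given in class): in round i of the outer loop, the inner loop performs n-1-i comparisons, so the total number of iterations is

\[
\sum_{i=0}^{n-1} (n-1-i) \;=\; (n-1) + (n-2) + \cdots + 1 + 0 \;=\; \frac{n(n-1)}{2} \;=\; O(n^2),
\]

regardless of the content of the list.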
Selection Sort – Actual Running Time

import time
import random

for n in [1000, 2000, 4000]:
    lst = list(range(n))       # [0,1,2,…,n-1]
    random.shuffle(lst)        # "balagan" (Hebrew: a mess)
    t0 = time.perf_counter()   # stopper go! (time.clock() was removed in Python 3.8)
    selection_sort(lst)
    t1 = time.perf_counter()   # stopper end
    print("n=", n, t1-t0)

• Output:
n= 1000 0.06043896640494811
n= 2000 0.27381915858021066
n= 4000 1.0055912134084082

• How does running time seem to change with input size?
Selection Sort – Efficiency

[Plot comparing log(n), n, and n² growth]

Quadratic: input ×2 ⇒ time ×4 (approximately)
Crash Intro to Complexity

Time Complexity: A Crash Intro
• A computational problem is a relation between input and output:
  • description of parameters (input)
  • description of solution (output)
• An algorithm is a step-by-step procedure, a “recipe”:
  • can be represented e.g. as a computer program (but also in natural languages, diagrams, animations, etc.)
  • an abstract notion; a formal definition is given in the "Computational Models" course
• Efficient algorithms are usually preferred:
  • fastest – time complexity
  • most economical in terms of memory – space complexity
• Time complexity analysis:
  • measured in terms of operations, not actual timings: we want to say something about the algorithm, not about a specific machine/execution/programming-language implementation
  • expressed as a function of the problem size: we will be interested in how the number of operations changes with input size
Defining Complexity

We will be interested in how the number of operations changes with input size.

In most cases, we will not care about the exact function, but about its “order”, or growth rate (e.g., logarithmic, linear, quadratic, etc.).

Sometimes we will only be interested in, or able to give, an upper bound for this growth rate. We will, however, strive to make this upper bound as tight (= low) as we can. In this course, we will almost always be able to give tight upper bounds.

We need some formal definition for “an upper bound on the growth rate of the number of operations, as a function of input size”.
Big O Notation
• We say that a function f(n) is O(g(n)) if there is a constant c such that, for large enough n,
  |f(n)| ≤ c · |g(n)|
• We denote this as f(n) = O(g(n)).
• In our context, f(n) will usually denote the number of operations an algorithm performs on an input of size n:
  • a number with n bits
  • a collection with n elements (list, string, etc.)
• Sometimes f(n) will denote the number of memory cells required by the algorithm on an input of size n.
• In our context f and g are positive functions for every n, so we will omit the absolute value notation.
Big O Notation – Visualized

[Plot: f(n) = O(g(n)) – the curve f(n) eventually stays below c · g(n)]
Big O Notation – Examples

Examples:
• 3n + 7 = O(n)
• 3n + 7 = O(n²) *
• 3n + 7 ≠ O(√n)
• 5n · log₂n + 1 = O(n log n)   [where did the log base disappear?]
• 6 · log₂n = O(n) *
• 2 · log₂n + 12 = O(n) *
• 1000 · n · log₂n = O(n²) *
• 3ⁿ ≠ O(2ⁿ)
• 2^(n/100) ≠ O(n¹⁰⁰)

* not the tightest possible bound
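To see how such a claim is verified from the definition, here is one worked example (not from the slides): to show 3n + 7 = O(n), take c = 4; for every n ≥ 7,

\[
3n + 7 \;\le\; 3n + n \;=\; 4n \;=\; c \cdot n,
\]

so the definition holds for all large enough n (here, n ≥ 7) with c = 4.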
The Asymptotic Nature of Big O
• Consider the two functions f(n) = 10n · log₂n + 1 and g(n) = n² · (2 + sin(n)/3) + 2.
• It is not hard to verify that f(n) = O(g(n)).
• Yet, for small values of n, f(n) > g(n), as can be seen in the following plot:

[Plot: f(n) above g(n) for small n]

The Asymptotic Nature of Big O (cont.)
• But for large enough n, indeed f(n) < g(n), as can be seen in the next plot:

[Plot: f(n) below g(n) for large n]

• Also, remember that for big O, f(n) may be larger than g(n), as long as there is a constant c such that f(n) < c · g(n).
Complexity Hierarchy

O(1) ⊆ O(log n) ⊆ O(log²n) ⊆ … ⊆ O(n) ⊆ O(n log n) ⊆ O(n²) ⊆ … ⊆ O(2ⁿ) ⊆ O(3ⁿ) ⊆ …

constant, logarithmic, poly-logarithmic, …, linear, n log n, quadratic, …, exponential
(all the classes up to and including the quadratic one are bounded by a polynomial; O(2ⁿ), O(3ⁿ), … are exponential)

We’ll meet this guy later in the course.
Unless asked to prove it formally, you can use this hierarchical ordering as a fact.
O(1)
What is the meaning of this?
a) A very short running time
b) A running time that is independent of the input size (i.e. constant)
c) 1 operation
d) Immediate termination of the algorithm
Worst / Best Case Complexity

• In many cases, for the same size of input, the content of the input itself affects the complexity. We then distinguish between worst-case and best-case complexity:

T_worst(n) = max { time(Input) : |Input| = n }
T_best(n)  = min { time(Input) : |Input| = n }

• Examples:

                  Worst case   Best case
  Binary search   O(log n)     O(1)
  Selection sort  O(n²)        O(n²)

• Note that this statement is complete nonsense: “The best time complexity is when n is very small…”
Summary of Some Previous Results
• All these results refer to worst-case scenarios.
• Algorithms on sequences:
  • Binary search on a sorted list of length n takes O(log n) iterations
  • Selection sort on a list of length n takes O(n²) iterations
  • Merging of 2 sorted lists of sizes n and m takes O(n + m) iterations
  • Palindrome checking on a string of length n takes O(n) iterations
• Algorithms on integers:
  • Addition of two n-bit integers takes O(n) iterations
  • Multiplication of two n-bit integers takes O(n²) iterations
  • Naïve integer exponentiation a^b, where |b| = n bits, takes O(2ⁿ) multiplications *
  • Iterated squaring for a^b, where |b| = n bits, takes O(n) multiplications *

* The number of iterations depends on the size of the multiplied numbers, which isn't constant. In modular exponentiation (a^b % c), where |c| = n bits, each multiplication takes O(n²) iterations.
Input Size – Clarifications
• We measure running time (or computational complexity) as a function of the input size.
• For integers, the input size is the number of bits in the representation of the number in the computer.
  • We normally count the number of "simple" bit operations (such as adding or multiplying two bits).
• For lists/strings/dictionaries/other collections, the input size is typically the number of elements in the collection.
  • We normally count the number of "simple" operations on list elements (such as comparisons and assignments), assuming that the size of each element is bounded by some constant (therefore there is no need to account for operations on the elements themselves, such as comparison or addition).
  • There are exceptions, however: for example, a list of n strings, each of size m.
How would execution time on a very fast, modern processor (10¹⁰ operations per second, say) vary for tasks with the following time complexities and input sizes n?

        n=10          n=20          n=30         n=40          n=50          n=60
 n      1.0E-09 sec   2.0E-09 sec   3.0E-09 sec  4.0E-09 sec   5.0E-09 sec   6.0E-09 sec
 n²     1.0E-08 sec   4.0E-08 sec   9.0E-08 sec  1.6E-07 sec   2.5E-07 sec   3.6E-07 sec
 n³     1.0E-07 sec   8.0E-07 sec   2.7E-06 sec  6.4E-06 sec   1.3E-05 sec   2.2E-05 sec
 n⁵     1.0E-05 sec   0.00032 sec   0.00243 sec  0.01024 sec   0.03125 sec   0.07776 sec
 2ⁿ     1.02E-07 sec  1.05E-04 sec  0.107 sec    1.833 min     1.303 days    3.66 years
 3ⁿ     5.9E-06 sec   0.35 sec      5.72 hours   38.55 years   22764 cent.   1.34E+09 cent.

Modified from Garey and Johnson's classical book.
Tractability – Basic Distinction
Polynomial time = tractable. Exponential time = intractable.

Time Complexity – What is Tractable in Practice?
• A polynomial-time algorithm is good.
  • n¹⁰⁰ is polynomial, hence good.
• An exponential-time algorithm is bad.
  • 2^(n/100) is exponential, hence bad.

Yet for an input of size n = 4000, the n¹⁰⁰-time algorithm takes more than 10³⁵ centuries on the above-mentioned machine, while the 2^(n/100) algorithm runs in just under two minutes.
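The two-minute figure is easy to verify (a quick sanity check, using the 10¹⁰ ops/sec machine above):

\[
2^{4000/100} = 2^{40} \approx 1.1 \cdot 10^{12} \ \text{operations}, \qquad
\frac{1.1 \cdot 10^{12}}{10^{10}\ \text{ops/sec}} \approx 110\ \text{seconds} \approx 2\ \text{minutes}.
\]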
Time Complexity – Advice
• Trust, but check! Don't just mumble "polynomial-time algorithms are good", "exponential-time algorithms are bad" because the lecturer told you so.
• Asymptotic running time and the O notation are important, and in most cases help clarify and simplify the analysis.
• But when faced with a concrete task on a specific problem size, you may be far away from "the asymptotic".
• In addition, constants hidden in the O notation may have an unexpected impact on actual running time.
Time Complexity – Advice (cont.)
• We will employ both asymptotic analysis and direct measurements of the actual running time.
• For direct measurements, we will use either the time module and the time.perf_counter() function (time.clock() is gone as of Python 3.8), or the timeit module and the timeit.timeit() function.
• Both have some deficiencies, yet are highly useful for our needs.
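A sketch of the timeit alternative (not from the slides; the module name search_algs holding binary_search is a placeholder assumption):

import timeit

# time 20 worst-case binary searches on a sorted list of 10**6 items;
# assumes binary_search is importable from a (hypothetical) module search_algs
t = timeit.timeit(stmt="binary_search(L, -1)",
                  setup="from search_algs import binary_search; L = list(range(10**6))",
                  number=20)
print("binary search, 20 runs:", t, "seconds")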
Average Complexity

• Often the average complexity is more informative (e.g., when the worst and best cases are rather rare).
• However, analyzing it is usually more complicated, and requires some knowledge of the probability distribution over the inputs.
• Assuming the distribution over inputs of size n is uniform:

\[
T_{average}(n) \;=\; \frac{\sum_{|Input| = n} time(Input)}{\text{number of inputs of size } n}
\]

• Examples from our course that you will encounter in the near future:
  - Quicksort runs on average in O(n log n)
  - Hash table chains are of average length O(n/m)
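A worked example (a sketch, not from the slides): consider sequential search when the key is known to be in the list, equally likely at each of the n positions. Finding the key at position i costs i iterations, so

\[
T_{average}(n) \;=\; \frac{1}{n}\sum_{i=1}^{n} i \;=\; \frac{n+1}{2} \;=\; O(n);
\]

on average about half the list is scanned: still linear, but with a smaller constant than the worst case.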