
Algorithms and Data Structures

Solutions to Exercises

Spring 2019

Alexis Maciel

Department of Computer Science

Clarkson University

Copyright © 2019 Alexis Maciel

Contents

Preface

1 Analysis of Algorithms
1.1 Introduction
1.2 Measuring Exact Running Times
1.3 Analysis
1.4 Asymptotic Running Times
1.5 Other Asymptotic Relations
1.6 Some Common Running Times
1.7 Basic Strategies
1.8 Analyzing Summations
1.9 Worst-Case and Average-Case Analysis
1.10 The Binary Search Algorithm

2 Recursion
2.1 The Technique
2.2 When to Use Recursion
2.3 Tail Recursion
2.4 Analysis of Recursive Algorithms

3 Sorting
3.1 Selection Sort
3.2 Insertion Sort
3.3 Mergesort
3.4 Quicksort
3.5 Analysis of Quicksort
3.6 Partitioning Algorithm
3.7 A Selection Algorithm
3.8 A Lower Bound for Comparison-Based Sorting
3.9 Sorting in Linear Time

Preface

This document contains solutions to the exercises of the course notes Algorithms and Data Structures. These notes were written for the course CS344 Algorithms and Data Structures taught at Clarkson University. The solutions are organized according to the same chapters and sections as the notes.

Here’s some advice. Whether you are studying these notes as a student in a course or in self-directed study, your goal should be to understand the material well enough that you can do the exercises on your own. Simply studying the solutions is not the best way to achieve this. It is much better to spend a reasonable amount of time and effort trying to do the exercises yourself before looking at the solutions.

If you can’t do an exercise on your own, you should study the notes some more. If that doesn’t work, seek help from another student or from your instructor. Look at the solutions only to check your answer once you think you know how to do an exercise.

If you needed help doing an exercise, try redoing the same exercise later on your own. And do additional exercises.

If your solution to an exercise is different from the official solution, take the time to figure out why. Did you make a mistake? Did you forget something? Did you discover another correct solution? If you’re not sure, ask for help from another student or the instructor. If your solution turns out to be incorrect, fix it, after maybe getting some help, then try redoing the same exercise later on your own and do additional exercises.

Feedback on the notes and solutions is welcome. Please send comments to [email protected].

Chapter 1

Analysis of Algorithms

1.1 Introduction

There are no exercises in this section.

1.2 Measuring Exact Running Times


1.3 Analysis



1.4 Asymptotic Running Times

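1.4.8. First, with limits:

a)

\[ \lim_{n\to\infty} \frac{n + 10}{n} = \lim_{n\to\infty} \frac{n}{n} + \lim_{n\to\infty} \frac{10}{n} = 1 + 0 = 1 > 0 \]

Therefore, by Theorem 1.2, $n + 10 = \Theta(n)$.

b)

\[ \lim_{n\to\infty} \frac{n^2 + n}{n^2} = 1 > 0 \]

Therefore, $n^2 + n = \Theta(n^2)$.

c)

\[ \lim_{n\to\infty} \frac{3n^2 - n}{n^2} = 3 > 0 \]

Therefore, $3n^2 - n = \Theta(n^2)$.

d)

\[ \lim_{n\to\infty} \frac{3n^2 - n + 10}{n^2} = 3 > 0 \]

Therefore, $3n^2 - n + 10 = \Theta(n^2)$.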

Now, without limits, which is more elementary but also more tedious:

a) For the lower bound, it’s always true that $n + 10 \ge n$. On the other hand, for the upper bound, $n + 10 \le n + 10n = 11n$ if $n \ge 1$. Therefore, we get that $an \le n + 10 \le bn$ for every $n \ge n_0$ by setting $a = 1$, $b = 11$ and $n_0 = 1$.

Another solution comes from noticing that if $n \ge 10$, then $n + 10 \le n + n = 2n$. Therefore, we also get that $an \le n + 10 \le bn$ for every $n \ge n_0$ by setting $a = 1$, $b = 2$ and $n_0 = 10$.

b) We have that $n^2 + n \ge n^2$ if $n \ge 0$. On the other hand, $n^2 + n \le n^2 + n^2 = 2n^2$ if $n \ge 1$. Therefore, $an^2 \le n^2 + n \le bn^2$ for every $n \ge n_0$ when $a = 1$, $b = 2$ and $n_0 = 1$.

c) If $n \ge 0$, then $3n^2 - n \le 3n^2$. On the other hand, if $n \ge 1$, $3n^2 - n \ge 2n^2$ because $n^2 \ge n$. Therefore, $an^2 \le 3n^2 - n \le bn^2$ for every $n \ge n_0$ when $a = 2$, $b = 3$ and $n_0 = 1$.

d) If $n \ge 10$, then $3n^2 - n + 10 \le 3n^2$. On the other hand, still if $n \ge 10$, $3n^2 - n + 10 \ge 2n^2$ because $n^2 \ge 10n \ge 2n \ge n + 10$. Therefore, $an^2 \le 3n^2 - n + 10 \le bn^2$ for every $n \ge n_0$ when $a = 2$, $b = 3$ and $n_0 = 10$.
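The constants found in these elementary arguments can be sanity-checked numerically. The following is only a minimal sketch, not a proof: it tests finitely many values of n, so it can refute a bad choice of constants but cannot establish an asymptotic bound. The function name `bounds_hold` and the upper limit `n_max` are assumptions made for illustration.

```cpp
#include <functional>

// Returns true if a*g(n) <= f(n) <= b*g(n) for every n in [n0, n_max].
// Only finitely many values of n are checked, so a true result is
// evidence for the chosen constants, not a proof.
bool bounds_hold(const std::function<double(double)> & f,
                 const std::function<double(double)> & g,
                 double a, double b, int n0, int n_max)
{
    for (int n = n0; n <= n_max; ++n) {
        double fn = f(n);
        double gn = g(n);
        if (a * gn > fn || fn > b * gn)
            return false;
    }
    return true;
}
```

For example, with f(n) = n + 10 and g(n) = n, the constants a = 1, b = 2 pass when n0 = 10 but fail when n0 = 1, since 11 > 2 at n = 1.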

1.4.9. By definition, $n = d^{\log_d n}$. By taking $\log_c$ on both sides, and by using a well-known property of logarithms, we get that $\log_c n = \log_d n \cdot \log_c d$. Therefore, $\log_c n$ is a constant multiple of $\log_d n$, which implies that $\log_c n$ is $\Theta(\log_d n)$.

A more basic proof (one that doesn’t use that property of logs) goes like this. By definition, $n = d^{\log_d n}$ and $d = c^{\log_c d}$. Therefore, $n = (c^{\log_c d})^{\log_d n} = c^{\log_c d \cdot \log_d n}$. By definition, this implies that $\log_c n = \log_c d \cdot \log_d n$. Therefore, $\log_c n$ is a constant multiple of $\log_d n$, which implies that $\log_c n$ is $\Theta(\log_d n)$.

1.5 Other Asymptotic Relations

1.5.2. Since $t_2(n)$ is $O(n^2)$, there exists $b > 0$ such that $t_2(n) \le bn^2$ when $n$ is large enough. Therefore, since $t_1(n) \le t_2(n)$, when $n$ is large enough, $t_1(n) \le bn^2$. This implies that $t_1(n)$ is $O(n^2)$.

1.5.3. No, because $t_1(n)$ could be much smaller than $t_2(n)$. For example, suppose that $t_1(n) = n$ and $t_2(n) = n^2$. It is true that $t_1(n) \le t_2(n)$ and that $t_2(n)$ is $\Theta(n^2)$. But $t_1(n)$ is $o(n^2)$, so it can’t be $\Theta(n^2)$.

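1.5.4. We want to show that

\[ \lim_{n\to\infty} \frac{t(n)}{n^2} = 0 \]

Since $t(n)$ is $\Theta(n \log n)$, there exists $b > 0$ such that $t(n) \le bn \log n$ when $n$ is large enough. Therefore, when $n$ is large enough,

\[ \frac{t(n)}{n^2} \le \frac{bn \log n}{n^2} \]

This implies that

\[ \lim_{n\to\infty} \frac{t(n)}{n^2} \le \lim_{n\to\infty} \frac{bn \log n}{n^2} = b \lim_{n\to\infty} \frac{\log n}{n} = 0 \]

Since $t(n)$ is nonnegative (by the assumption in the definition of $\Theta$), we also have that

\[ \lim_{n\to\infty} \frac{t(n)}{n^2} \ge 0 \]

Therefore,

\[ \lim_{n\to\infty} \frac{t(n)}{n^2} = 0 \]

which shows that $t(n)$ is $o(n^2)$.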


1.6 Some Common Running Times

1.6.1.

n              10        10^3      10^6

log2 n µs      3 µs      10 µs     20 µs
n µs           10 µs     1 ms      1 s
n log2 n µs    33 µs     10 ms     20 s
n^2 µs         100 µs    1 s       12 days
n^3 µs         1 ms      17 min    32 × 10^3 years

n              10        20        40        60               80

n µs           10 µs     20 µs     40 µs     60 µs            80 µs
n log2 n µs    33 µs     86 µs     210 µs    350 µs           510 µs
n^2 µs         0.1 ms    0.4 ms    1.6 ms    3.6 ms           6.4 ms
n^3 µs         1 ms      8 ms      64 ms     220 ms           510 ms
2^n µs         1 ms      1 s       13 days   37 × 10^3 years  38 × 10^9 years
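The larger entries in these tables come from unit conversions that are easy to get wrong by hand. The sketch below shows one way to reproduce them. The function names are hypothetical, and the year length of 365 days is an assumption, so the results agree with the tables only up to rounding.

```cpp
#include <cmath>

// Converts a running time expressed in microseconds to days.
double microseconds_to_days(double us)
{
    const double us_per_day = 1e6 * 60.0 * 60.0 * 24.0;
    return us / us_per_day;
}

// Converts a running time in microseconds to years (assuming 365-day years).
double microseconds_to_years(double us)
{
    return microseconds_to_days(us) / 365.0;
}

// Running time, in days, of an algorithm that takes 2^n microseconds.
double exponential_time_in_days(int n)
{
    return microseconds_to_days(std::pow(2.0, n));
}
```

For instance, `exponential_time_in_days(40)` is a little under 13 days, and `microseconds_to_years(1e18)` (the n^3 entry for n = 10^6) is a little under 32 × 10^3 years.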


1.7 Basic Strategies

1.7.1.

a) The body of the inner loop runs in constant time $c$. The inner loop repeats $2n + 1$ times. Therefore, the running time of the inner loop is $\Theta((2n + 1)c) = \Theta(n)$. The inner loop always takes the same amount of time and is repeated $n$ times by the outer loop. Therefore, the running time of the outer loop is $\Theta(n^2)$.

b) The inner loop runs in time $\Theta(n)$ as we’ve seen before. The outer loop repeats 10 times. Therefore, the running time of the outer loop is $\Theta(10n) = \Theta(n)$.

c) The inner loop repeats 6 times. Therefore, since its body runs in constant time, the inner loop runs in constant time. The outer loop repeats the inner loop $n$ times. Therefore, the running time of the outer loop is $\Theta(n)$.

d) The inner loop repeats $n - i + 1$ times. Since $i$ varies, this implies that the running time of the inner loop varies. Let $T(k)$ be the running time of the inner loop when it repeats $k$ times. Therefore, the running time of the outer loop is

\[ \Theta(n) + \sum_{i=1}^{n} T(n - i + 1) \]

where the $\Theta(n)$ term is the total running time of the operations that control the outer loop. Note that

\[ \sum_{i=1}^{n} T(n - i + 1) = \sum_{i=1}^{n} T(i) \]

Clearly, when the inner loop repeats $i$ times, its running time is of the form $ai + b$. This implies that

\[ \sum_{i=1}^{n} T(i) = \sum_{i=1}^{n} (ai + b) = a\,\frac{n(n + 1)}{2} + bn = \Theta(n^2) \]

Therefore, the running time of the outer loop is $\Theta(n^2)$.
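To make the count in part (d) concrete, here is a small sketch that counts how often the inner body runs. The loop bounds are an assumption, since the loops from the exercise are not reproduced in these solutions; they were chosen only so that the inner loop repeats n − i + 1 times. The name `count_inner_iterations` is hypothetical.

```cpp
// Counts executions of the inner loop body for a hypothetical pair of
// nested loops in which the inner loop repeats n - i + 1 times.
long long count_inner_iterations(int n)
{
    long long count = 0;
    for (int i = 1; i <= n; ++i)
        for (int j = i; j <= n; ++j)   // j takes n - i + 1 values
            ++count;
    return count;
}
```

The result is n(n + 1)/2, which matches the quadratic term of the summation above.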

e) The inner loop repeats $2i + 1$ times. Let $T(k)$ be the running time of the inner loop when it repeats $k$ times. Therefore, the running time of the outer loop is

\[ \Theta(n) + \sum_{i=1}^{n} T(2i + 1) \]

Clearly, when the inner loop repeats $k$ times, its running time is of the form $ak + b$. This implies that

\begin{align*}
\sum_{i=1}^{n} T(2i + 1) &= \sum_{i=1}^{n} (a(2i + 1) + b) \\
&= 2a \sum_{i=1}^{n} i + (a + b)n \\
&= an(n + 1) + (a + b)n \\
&= \Theta(n^2)
\end{align*}


f) The inner loop repeats $i$ times. Let $T(k)$ be the running time of the inner loop when it repeats $k$ times. Therefore, the running time of the outer loop is

\[ \Theta(n^2) + \sum_{i=1}^{n^2} T(i) \]

Clearly, when the inner loop repeats $i$ times, its running time is of the form $ai + b$. This implies that

\[ \sum_{i=1}^{n^2} T(i) = \sum_{i=1}^{n^2} (ai + b) = a\,\frac{n^2(n^2 + 1)}{2} + bn^2 = a\,\frac{n^4 + n^2}{2} + bn^2 = \Theta(n^4) \]


1.7.2. Since large values of $n$ determine the asymptotic running time, we can ignore the if part of the if statement. Therefore, in all cases, the running time of this algorithm is $\Theta(T_A(n) + n T_C(n))$.

a) The running time is $\Theta(n) + n\Theta(\log n) = \Theta(n + n \log n) = \Theta(n \log n)$ because $n \log n$ is the dominant term.

b) The running time is $\Theta(n^2) + n\Theta(\log n) = \Theta(n^2 + n \log n) = \Theta(n^2)$ because $n^2$ is the dominant term.

c) Same as (b).

1.7.3. By the definition of $\Theta$, there are positive constants $b_1$, $b_2$ and $b_3$ such that

\[ T(n) \le b_1 n + b_2 n^2 + b_3 \]

when $n$ is large enough. Since $n \le n^2$ and $1 \le n^2$ when $n \ge 1$, this implies that

\[ T(n) \le (b_1 + b_2 + b_3)n^2 \]

Similarly, there are positive constants $a_1$, $a_2$ and $a_3$ such that

\[ T(n) \ge a_1 n + a_2 n^2 + a_3 \]

when $n$ is large enough. This implies that

\[ T(n) \ge a_2 n^2 \]

because $a_1$ and $a_3$ are positive. Therefore, when $n$ is large enough,

\[ a_2 n^2 \le T(n) \le (b_1 + b_2 + b_3)n^2 \]

This implies that $T(n)$ is $\Theta(n^2)$.

1.8 Analyzing Summations

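1.8.1. By splitting the summation at $\lfloor n/2 \rfloor$, we get that

\[ \sum_{i=1}^{n-1} i \ge \sum_{i=\lfloor n/2 \rfloor}^{n-1} i \ge \sum_{i=\lfloor n/2 \rfloor}^{n-1} \left\lfloor \frac{n}{2} \right\rfloor = \left( n - \left\lfloor \frac{n}{2} \right\rfloor \right) \left\lfloor \frac{n}{2} \right\rfloor \]

Using the fact that $n/2 - 1/2 \le \lfloor n/2 \rfloor \le n/2$, we get

\[ \sum_{i=1}^{n-1} i \ge \left( n - \frac{n}{2} \right) \left( \frac{n}{2} - \frac{1}{2} \right) = \frac{n}{2} \left( \frac{n}{2} - \frac{1}{2} \right) = \frac{n^2}{4} - \frac{n}{4} \]

This implies that the summation is $\Omega(n^2)$.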
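The lower bound n²/4 − n/4 can be spot-checked against the exact sum. The sketch below is a sanity check for specific values of n rather than a proof; the names `sum_below` and `lower_bound_holds` are hypothetical. The comparison is multiplied through by 4 to stay in integer arithmetic.

```cpp
// Returns 1 + 2 + ... + (n - 1), computed term by term.
long long sum_below(long long n)
{
    long long sum = 0;
    for (long long i = 1; i < n; ++i)
        sum += i;
    return sum;
}

// Checks that the sum is at least n^2/4 - n/4, written as
// 4 * sum >= n^2 - n to avoid fractions.
bool lower_bound_holds(long long n)
{
    return 4 * sum_below(n) >= n * n - n;
}
```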


1.9 Worst-Case and Average-Case Analysis

1.9.3. The best-case running time is when the test succeeds. In that case, the running time is $\Theta(n)$. The worst-case running time is when the test fails. In that case, the running time is $\Theta(n^2)$. Because inputs of length $n$ are equally likely to pass or fail the test, the average-case running time is simply the average of the running times of algorithms A and B. Let $T_A$ and $T_B$ be those running times. Then, when $n$ is large enough, we have that

\[ a_1 n \le T_A(n) \le b_1 n \]

and

\[ a_2 n^2 \le T_B(n) \le b_2 n^2 \]

Therefore,

\[ \frac{a_1 n + a_2 n^2}{2} \le \frac{T_A(n) + T_B(n)}{2} \le \frac{b_1 n + b_2 n^2}{2} \]

This implies that the average-case running time is also $\Theta(n^2)$.


1.10 The Binary Search Algorithm

1.10.1.

e = 42
s = [11 27 28 30 36 42 58 65]    middle = 36
    [36 42 58 65]                         58
    [36 42]                               42
    [42]
Found!

e = 30
s = [11 27 28 30 36 42 58 65]    middle = 36
    [11 27 28 30]                         28
    [28 30]                               30
    [30]
Found!

1.10.2. Once again, we’re assuming that $2^{k-1} < n \le 2^k$. After one iteration, $n_1 \ge \lfloor n/2 \rfloor$. Since $n/2 \ge 2^{k-2}$, an integer, we have that $\lfloor n/2 \rfloor \ge 2^{k-2}$, which implies that $n_1 \ge 2^{k-2}$. After two iterations, $n_2 \ge \lfloor n_1/2 \rfloor$. Since $n_1/2 \ge 2^{k-3}$, we get that $\lfloor n_1/2 \rfloor \ge 2^{k-3}$ and that $n_2 \ge 2^{k-3}$. It should be clear that this can be repeated to show that for any $i$, $n_i \ge 2^{k-i-1}$. Since $2^{k-i-1} = 2$ when $i = k - 2$, it must be that when $n_i = 1$ for the first time, $i \ge k - 1$. Therefore, $r \ge k - 1$. To relate $r$ to $n$, recall that $n \le 2^k$. This implies that $k \ge \log n$. Therefore, $r \ge \log n - 1$.

1.10.3. Suppose that computing the location of the middle element now takes time linear in the number of elements currently being searched. Then the running time of the body of the loop is of the form $am + b$, where $m$ is the number of elements currently being searched. Let $k$ be such that $2^{k-1} < n \le 2^k$. As explained in the notes, the number of elements being searched is going to be $n_0, n_1, n_2, \ldots, n_r$, where $n_0 = n$ and $n_r = 1$. The running time of the algorithm is then

\[ T(n) = \sum_{i=0}^{r} (a n_i + b) + \Theta(1) \]

As also explained in the notes, $n_i \le 2^{k-i}$. Therefore,

\[ T(n) \le \sum_{i=0}^{r} (a 2^{k-i} + b) + \Theta(1) = a \sum_{i=0}^{r} 2^{k-i} + b(r + 1) + \Theta(1) \]

Now, note that

\[ \sum_{i=0}^{r} 2^{k-i} = \sum_{i=k-r}^{k} 2^i \le \sum_{i=0}^{k} 2^i = 2^{k+1} - 1 < 4n - 1 \]

where the last step holds because $2^{k-1} < n$. In addition, as explained in the notes, $r < \log n + 1$. Therefore,

\[ T(n) \le a(4n - 1) + b(\log n + 2) + \Theta(1) \]

This implies that $T(n)$ is $O(n)$. A similar argument shows that $T(n)$ is $\Omega(n)$.

1.10.4. Right now, the middle element is the first one of the right half and when e equals the middle element, we go right. To find the first occurrence of e, use the last element of the left half as middle element and when e equals the middle element, go left.
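The modification described in 1.10.4 can be sketched as follows. This is one possible rendering, not the code from the notes: the name `find_first`, the half-open range convention `[start, stop)` and the return value −1 for "not found" are assumptions modeled on the other functions in these solutions.

```cpp
// Returns the index of the first occurrence of e in the sorted range
// a[start..stop), or -1 if e does not occur in that range.
template <class T>
int find_first(const T a[], int start, int stop, const T & e)
{
    while (stop - start >= 2) {
        int mid = start + (stop - start) / 2;  // first index of the right half
        // a[mid - 1] is the last element of the left half. If e is less
        // than or equal to it, the first occurrence (if any) is on the left.
        if (!(a[mid - 1] < e))
            stop = mid;
        else
            start = mid;
    }
    if (stop - start == 1 && a[start] == e)
        return start;
    return -1;
}
```

Going left on equality is what guarantees that an earlier copy of e is never discarded.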

Chapter 2

Recursion

2.1 The Technique


2.1.3.

template <class T>
int count(const T a[], int start, int stop, const T & e)
{
    if (start < stop) {
        int count_in_rest = count(a, start+1, stop, e);
        if (a[start] == e)
            return count_in_rest + 1;
        else
            return count_in_rest;
    }
    else
        return 0;
}

2.1.4.

template <class T>
const T & max(const T a[], int start, int stop)
{
    if (stop - start > 1) {
        const T & max_in_rest = max(a, start+1, stop);
        if (max_in_rest > a[start])
            return max_in_rest;
        else
            return a[start];
    }
    else
        return a[start];
}

2.1.5.

template <class T>
int max(const T a[], int start, int stop)
{
    if (stop - start > 1) {
        int i_max_in_rest = max(a, start+1, stop);
        if (a[i_max_in_rest] > a[start])
            return i_max_in_rest;
        else
            return start;
    }
    else
        return start;
}

2.1.6.

void print(int n)
{
    if (n > 0) {
        cout << ' ' << n;
        print(n-1);
    }
}

2.1.7.

void print(int n)
{
    if (n > 0) {
        print(n-1);
        cout << ' ' << n;
    }
}

2.1.8.

void print(int n)
{
    if (n > 1) {
        cout << ' ' << n;
        print(n-1);
        cout << ' ' << n;
    }
    else if (n == 1)
        cout << ' ' << 1;
}

2.2 When to Use Recursion

2.2.2. All of them.


2.3 Tail Recursion

2.3.4.

display(a, i, j)
    while (j > i)
        print a[i]
        ++i

binary_search(a, i, j, e)
    while (j - i >= 2)
        mid = floor((i + j) / 2)
        if (e < a[mid])
            j = mid
        else
            i = mid
    if (j - i = 1 and e = a[i])
        return i
    else
        return -1

2.3.5. The first print is the only one of those functions that’s tail recursive.

void print(int n)
{
    while (n > 0) {
        cout << ' ' << n;
        --n;
    }
}

2.4 Analysis of Recursive Algorithms

2.4.1. We’ll show that $T(n) \ge cn$ for every $n \ge 1$.

The basis is for $n = 1$. We want $T(1) \ge c \cdot 1 = c$. This will be true as long as we choose $c$ to be no greater than $T(1)$. This is our first condition on the value of $c$.

The inductive step is for $n \ge 2$. Suppose that the bound holds for $n - 1$: $T(n - 1) \ge c(n - 1)$. Then,

\[ T(n) = T(n - 1) + b \ge c(n - 1) + b = cn - c + b \]

Therefore, to show that $T(n) \ge cn$, all we need to show is that $cn - c + b \ge cn$. This is equivalent to $-c + b \ge 0$ and $c \le b$. This is our second condition on $c$.

So we choose $c = \min(T(1), b)$. With this value, the basis and inductive step work and we have shown, by induction, that $T(n) \ge cn$ for every $n \ge 1$.


2.4.2. The recurrence relation implies the following set of equations:

\begin{align*}
T(n) &= T(n - 1) + bn \\
T(n - 1) &= T(n - 2) + b(n - 1) \\
&\;\;\vdots \\
T(2) &= T(1) + b \cdot 2 \\
T(1) &= T(0) + b \\
T(0) &= a
\end{align*}

Adding, we get

\[ T(n) = \sum_{i=1}^{n} bi + a = b\,\frac{n(n + 1)}{2} + a \]

which implies that $T(n)$ is $\Theta(n^2)$.
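The closed form can be compared against a direct evaluation of the recurrence T(n) = T(n − 1) + bn with T(0) = a. The sketch below does this for concrete integer values of a and b; the function names are hypothetical and the use of integer constants is an assumption made to keep the comparison exact.

```cpp
// Evaluates T(n) directly from the recurrence
// T(0) = a, T(n) = T(n - 1) + b * n.
long long recurrence_value(int n, long long a, long long b)
{
    long long t = a;
    for (int i = 1; i <= n; ++i)
        t += b * i;
    return t;
}

// Evaluates the closed form b * n(n + 1)/2 + a obtained by adding
// the equations.
long long closed_form(int n, long long a, long long b)
{
    return b * n * (n + 1) / 2 + a;
}
```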

Let’s now prove the same result by induction. Let’s start with the upper bound. We’ll show that $T(n) \le cn^2$ for every $n \ge 1$.

The basis is for $n = 1$. Since $cn^2 = c$ when $n = 1$, we want $T(1) \le c$. This will be true as long as we choose $c$ to be at least $T(1)$. This is our first condition on the value of $c$.

The inductive step is for $n \ge 2$. Suppose that the bound holds for $n - 1$: $T(n - 1) \le c(n - 1)^2$. Then,

\[ T(n) = T(n - 1) + bn \le c(n - 1)^2 + bn = cn^2 - 2cn + c + bn \]

Therefore, to show that $T(n) \le cn^2$, all we need to show is that $cn^2 - 2cn + c + bn \le cn^2$. This is equivalent to $-2cn + c + bn \le 0$, which will be true if $-2cn + cn + bn \le 0$. This, in turn, is equivalent to $-cn + bn \le 0$ and $c \ge b$. This is our second condition on $c$.

So we choose $c = \max(T(1), b)$. With this value, the basis and inductive step work and we have shown, by induction, that $T(n) \le cn^2$ for every $n \ge 1$.

Now let’s do the lower bound. We’ll show that $T(n) \ge cn^2$ for every $n \ge 1$. (Another $c$.)

The basis is for $n = 1$. Since $cn^2 = c$ when $n = 1$, we want $T(1) \ge c$. This will be true as long as we choose $c$ to be no greater than $T(1)$. This is our first condition on the value of $c$.

The inductive step is for $n \ge 2$. Suppose that the bound holds for $n - 1$: $T(n - 1) \ge c(n - 1)^2$. Then,

\[ T(n) = T(n - 1) + bn \ge c(n - 1)^2 + bn = cn^2 - 2cn + c + bn \]

Therefore, to show that $T(n) \ge cn^2$, all we need to show is that $cn^2 - 2cn + c + bn \ge cn^2$. This is equivalent to $-2cn + c + bn \ge 0$, which will be true if $-2cn + bn \ge 0$. This is equivalent to $2cn \le bn$ and $c \le b/2$. This is our second condition on $c$.

So we choose $c = \min(T(1), b/2)$. With this value, the basis and inductive step work and we have shown, by induction, that $T(n) \ge cn^2$ for every $n \ge 1$.

Chapter 3

Sorting

3.1 Selection Sort

3.1.1.

[12 37 25 60 16 42 38]

[12 37 25 38 16 42] 60

[12 37 25 38 16] 42 60

[12 37 25 16] 38 42 60

[12 16 25] 37 38 42 60

[12 16] 25 37 38 42 60

[12] 16 25 37 38 42 60

12 16 25 37 38 42 60


3.1.2.

template <class Iterator>
void selection_sort(Iterator start, Iterator stop)
{
    int n = std::distance(start, stop);
    while (n > 1) {
        auto itr_max = std::max_element(start, stop);
        std::swap(*itr_max, *(std::prev(stop)));
        --stop;
        --n;
    }
}

3.1.3. As explained in this section of the notes, if $A$ is an array of size $n \ge 2$, the running time of selection sort on $A$ is given by the recurrence

\[ T(A) = T(A_{n-1}) + \Theta(n) \]

where $A_{n-1}$ is the array that consists of the first $n - 1$ elements of $A$ and the $\Theta(n)$ term is the total running time of all the operations except for the recursive call.

Let $T(n)$ be the best-case running time of selection sort. The above recurrence implies that when $n \ge 2$,

\[ T(n) \ge T(n - 1) + \Theta(n) \]

By the definition of $\Theta$, there are $a > 0$ and $n_0$ such that the $\Theta(n)$ term is bounded below by $an$, for every $n \ge n_0$. Therefore,

\[ T(n) \ge T(n - 1) + an \]

when $n$ is greater than or equal to both 2 and $n_0$.

Let $n_1 = \max(2, n_0)$. Then the above recurrence is valid for every $n \ge n_1$.

We can now write out the recurrence relation:

\begin{align*}
T(n) &\ge T(n - 1) + an \\
T(n - 1) &\ge T(n - 2) + a(n - 1) \\
&\;\;\vdots \\
T(n_1) &\ge T(n_1 - 1) + an_1
\end{align*}

Adding all these inequalities gives us that

\[ T(n) \ge T(n_1 - 1) + \sum_{i=n_1}^{n} ai = \Theta(1) + a \sum_{i=n_1}^{n} i \]

Now, note that

\[ \sum_{i=n_1}^{n} i = \sum_{i=1}^{n} i - \sum_{i=1}^{n_1 - 1} i = \Theta(n^2) - \Theta(1) \]

The last summation is a constant because it contains only a constant number of terms. This implies that the summation on the left is $\Theta(n^2)$. Therefore, $T(n)$ is $\Omega(n^2)$.
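One way to see why even the best case is quadratic is to count element comparisons. The sketch below instruments a vector-based variant of selection sort with a counting comparator. It is an illustration under assumptions, not the implementation from the notes: the name `selection_sort_comparisons` and the restriction to `std::vector<int>` are choices made here. It relies on the fact that `std::max_element` performs exactly N − 1 comparisons on a nonempty range of N elements.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Sorts v in increasing order with selection sort and returns the number
// of element comparisons that were performed.
long long selection_sort_comparisons(std::vector<int> & v)
{
    long long comparisons = 0;
    auto counting_less = [&comparisons](int x, int y) {
        ++comparisons;
        return x < y;
    };
    auto stop = v.end();
    while (std::distance(v.begin(), stop) > 1) {
        // Find the largest element of the unsorted part and move it to the end.
        auto itr_max = std::max_element(v.begin(), stop, counting_less);
        std::iter_swap(itr_max, std::prev(stop));
        --stop;
    }
    return comparisons;
}
```

For any input of size n the count is n(n − 1)/2, whether or not the input is already sorted.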


3.2 Insertion Sort

3.2.1.

[12] 37 25 60 16 42 38

[12 37] 25 60 16 42 38

[12 25 37] 60 16 42 38

[12 25 37 60] 16 42 38

[12 16 25 37 60] 42 38

[12 16 25 37 42 60] 38

[12 16 25 37 38 42 60]

3.3 Mergesort

3.3.1.

[22 37 25 60 16 42 38 46 19]

[22 37 25 60 16][42 38 46 19]

[16 22 25 37 60][19 38 42 46]

[16 19 22 25 37 38 42 46 60]


[22 37 25 60 16 42 38 46 19]

[22 37 25 60 16][42 38 46 19]

[22 37 25][60 16][42 38][46 19]

[22 37][25][60][16][42][38][46][19]

[22][37][25][60][16][42][38][46][19]

[22 37][25][60][16][42][38][46][19]

[22 25 37][16 60][38 42][19 46]

[16 22 25 37 60][19 38 42 46]

[16 19 22 25 37 38 42 46 60]

3.3.2.

First array         Second array      Resulting array

[16 22 25 37 60]    [19 38 42 46]     []
[22 25 37 60]       [19 38 42 46]     [16]
[22 25 37 60]       [38 42 46]        [16 19]
[25 37 60]          [38 42 46]        [16 19 22]
[37 60]             [38 42 46]        [16 19 22 25]
[60]                [38 42 46]        [16 19 22 25 37]
[60]                [42 46]           [16 19 22 25 37 38]
[60]                [46]              [16 19 22 25 37 38 42]
[60]                []                [16 19 22 25 37 38 42 46]
[]                  []                [16 19 22 25 37 38 42 46 60]
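The trace above follows the usual merging procedure: repeatedly move the smaller of the two front elements to the result. A minimal sketch of that procedure is given below. The name `merge_sorted` and the use of `std::vector<int>` are assumptions for illustration, and the code in the notes may be organized differently (for example, it may merge within a single array).

```cpp
#include <cstddef>
#include <vector>

// Merges two sorted vectors into a new sorted vector.
std::vector<int> merge_sorted(const std::vector<int> & first,
                              const std::vector<int> & second)
{
    std::vector<int> result;
    result.reserve(first.size() + second.size());
    std::size_t i = 0;  // index of the front element of first
    std::size_t j = 0;  // index of the front element of second
    while (i < first.size() && j < second.size()) {
        if (second[j] < first[i])
            result.push_back(second[j++]);
        else
            result.push_back(first[i++]);  // on ties, take from first
    }
    // At most one of the two inputs still has elements left.
    while (i < first.size())
        result.push_back(first[i++]);
    while (j < second.size())
        result.push_back(second[j++]);
    return result;
}
```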

3.3.3. Consider again the recurrence relation that gives the running time of mergesort on an array $A$ of size $n \ge 2$:

\[ T(A) = T(L_{\lfloor n/2 \rfloor}) + T(R_{\lceil n/2 \rceil}) + \Theta(n) \]

If $T(n)$ is the best-case running time of mergesort, then the above recurrence implies that when $n \ge 2$,

\[ T(n) \ge T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + \Theta(n) \]

First, we remove the asymptotics. We know that there exist $a > 0$ and $n_0$ such that for every $n \ge n_0$, the $\Theta(n)$ term is bounded below by $an$. Therefore,

\[ T(n) \ge T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + an \]

when $n$ is greater than or equal to both 2 and $n_0$.

We now show that $T(n) \ge cn \log n$ for every $n \ge 2$.

Let $n_1 = \max(4, n_0)$. The inductive step will be for $n \ge n_1$. (It’s convenient for $n$ to be at least 4 in the inductive step. This guarantees that $T(1)$ does not appear on the right-hand side of the recurrence.)

The basis is for $n \in [2, n_1 - 1]$. We want $T(n) \ge cn \log n$. This will be true if we choose

\[ c \le \min \{\, T(n)/(n \log n) \mid n \in [2, n_1 - 1] \,\} \]

The inductive step is for $n \ge n_1$. Assume that $T(k) \ge ck \log k$ for every $k \in [2, n - 1]$. Then, since $\lfloor n/2 \rfloor \ge 2$ and $\lceil n/2 \rceil \le n - 1$,

\begin{align*}
T(n) &\ge c\lfloor n/2 \rfloor \log\lfloor n/2 \rfloor + c\lceil n/2 \rceil \log\lceil n/2 \rceil + an \\
&\ge c\lfloor n/2 \rfloor \log\lfloor n/2 \rfloor + c\lceil n/2 \rceil \log\lfloor n/2 \rfloor + an \\
&= c(\lfloor n/2 \rfloor + \lceil n/2 \rceil) \log\lfloor n/2 \rfloor + an \\
&= cn \log\lfloor n/2 \rfloor + an
\end{align*}

It is easy to show that when $n \ge 2$, $\lfloor n/2 \rfloor \ge n/3$. Therefore,

\[ T(n) \ge cn \log\frac{n}{3} + an = cn \log n - cn \log 3 + an \]

This is at least $cn \log n$ if $-cn \log 3 + an \ge 0$, which is equivalent to $cn \log 3 \le an$ and $c \le a/\log 3$.

Therefore, if we choose $c$ to be small enough, the conditions on $c$ in the basis and the inductive step will be met and we get a proof that $T(n) \ge cn \log n$. This implies that the best-case running time of mergesort is $\Omega(n \log n)$.

3.4 Quicksort

3.4.3.

[22 37 25 60 16 42 38 46 19]

[16 19] 22 [37 25 60 42 38 46]

[16 19] 22 [25 37 38 42 46 60]

[16 19 22 25 37 38 42 46 60]

[22 37 25 60 16 42 38 46 19]

[16 19] 22 [37 25 60 42 38 46]

16 [19] 22 [25] 37 [60 42 38 46]

16 [19] 22 [25] 37 [42 38 46] 60

16 [19] 22 [25] 37 [38] 42 [46] 60

16 [19] 22 [25] 37 [38 42 46] 60

16 [19] 22 [25] 37 [38 42 46 60]

[16 19] 22 [25 37 38 42 46 60]


[16 19 22 25 37 38 42 46 60]

3.4.4. If the array is already sorted, then the middle element of the array is the median, which implies that the pivot will be the median and the array will be split as evenly as possible. If the elements are not reordered unnecessarily during the partitioning step, then the subarrays will again be sorted and the pivots will again be the medians. This leads to partitions that are as even as possible.

Suppose that $n$ is even. Consider the array that contains the following elements, in this order:

1, 3, 5, . . . , n − 1, 2, 4, 6, . . . , n

The pivot will be 2, which will cause the right subarray to contain the following $n - 2$ elements:

3, 5, . . . , n − 1, 4, 6, . . . , n

If the elements are not reordered unnecessarily during the partitioning step, this will repeat, with the size of the right subarray decreasing by only 2 at every partition. That’s the worst possible. (This would lead to a recurrence relation similar to that of selection sort and to a $\Theta(n^2)$ running time.)

If $n$ is odd, then consider the following array:

2, 4, 6, . . . , n − 1, 1, 3, 5, . . . , n

Note that 1 is the middle element of this array. The pivot will again be 2 and the right subarray will contain the following $n - 2$ elements:

4, 6, . . . , n − 1, 3, 5, . . . , n

The middle element is now 3 and the pattern repeats. Again, that’s the worst possible.

3.4.5. First, here’s a simple implementation of the partitioning step. It does two scans of the array and uses an additional array as temporary storage.

template <class T>
void partition(T a[], int start, int stop, int & pivot)
// Partitions the elements of an array around a
// pivot.
//
// PRECONDITION: The indices are valid, start occurs
// before stop and pivot is within the range [start,
// stop).
//
// POSTCONDITION: The elements in [start, pivot) are
// smaller than a[pivot] and the elements in
// [pivot + 1, stop) are greater than or equal to the
// pivot. Note that the pivot may move during
// partitioning. The argument pivot is updated
// accordingly.
//
// ASSUMPTION ON TEMPLATE ARGUMENT: Values of type T
// can be compared using the < operator.
{
    std::swap(a[pivot], a[start]);  // moves pivot to start
    T * temp = new T[stop - start];
    int k = 0;  // next available position in temp
    for (int i = start + 1; i < stop; ++i)
        if (a[i] < a[start]) {
            temp[k] = a[i];
            ++k;
        }
    temp[k] = a[start];  // pivot
    pivot = start + k;  // final index in a
    ++k;
    for (int i = start + 1; i < stop; ++i)
        if (!(a[i] < a[start])) {
            temp[k] = a[i];
            ++k;
        }
    std::copy(temp, temp + k, a + start);
    delete [] temp;  // temp was allocated with new []
}

Now, using this partition function, here’s an implementation of the randomized version of quicksort.

template <class T>
void quicksort(T a[], int start, int stop)
// Sorts elements in a in increasing order using the
// quicksort algorithm. Sorts elements in the range
// [start,stop). Sorts according to the < operator.
//
// PRECONDITION: The indices are valid and start
// occurs before stop.
//
// ASSUMPTION ON TEMPLATE ARGUMENT: Values of type T
// can be compared using the < operator.
{
    if (stop - start > 1) {
        int pivot = start + rand() % (stop - start);
        partition(a, start, stop, pivot);
        quicksort(a, start, pivot);
        quicksort(a, pivot + 1, stop);
    }
}

3.5 Analysis of Quicksort

3.5.1. Consider a recursion tree that represents the execution of quicksort on an array of size $n$. The top level of this tree, which we call Level 0, consists of one node where $n$ elements are being sorted. The next level, Level 1, consists of one or two nodes that together contain $n - 1$ elements. It’s $n - 1$, not $n$, because the pivot that was used at Level 0 is not present at Level 1.

How many elements are passed on to Level 2? If a node at Level 1 corresponds to a recursive case, then the pivot used at that node does not make it to Level 2. If a node at Level 1 corresponds to a base case for an array of size 1, then the element contained at that node does not make it to Level 2. All the other elements from Level 1 go on to Level 2. Therefore, because there are at most two nodes at Level 1, Level 2 contains at least $n - 1 - 2$ elements.

A similar argument shows that Level 3 has at least $n - 1 - 2 - 4$ elements and Level 4 has at least $n - 1 - 2 - 4 - 8$ elements. In general, Level $k$ has at least $n - 1 - 2 - 4 - \cdots - 2^{k-1}$ elements. Since $1 + 2 + 4 + \cdots + 2^{k-1} = 2^k - 1$, Level $k$ has at least $n - (2^k - 1) = n + 1 - 2^k$ elements.

This means that if $k$ is not too large, then Level $k$ will have a lot of elements. More precisely, Level $k$ contains at least $n/2$ elements if $n + 1 - 2^k \ge n/2$, which is true if $n/2 + 1 \ge 2^k$ and $k \le \log(n/2 + 1)$. This in turn is true if $k \le \log(n/2) = \log n - 1$.

Therefore, Levels 0 to $\log n - 1$ all contain at least $n/2$ elements. That’s a total of $\log n$ levels. At each of these levels, the time spent by the algorithm is at least $bn/2$ for some constant $b > 0$, because each element is involved in at least one operation either in a partition or in a base case. Therefore, the total running time of the algorithm is at least $(\log n)(bn/2) = (b/2) n \log n$. That’s $\Omega(n \log n)$.


3.6 Partitioning Algorithm

3.6.1.

22 37 25 60 16 42 38 46 19
||22 19 25 60 16 42 38 46 |37
22||19 25 60 16 42 38 46 |37
22 19||25 60 16 42 38 46 |37
22 19 25||60 16 42 38 46 |37
22 19 25 |60 |16 42 38 46 |37
22 19 25 16 |60 |42 38 46 |37
22 19 25 16 |60 42 |38 46 |37
22 19 25 16 |60 42 38 |46 |37
22 19 25 16 |60 42 38 46||37
22 19 25 16 37 42 38 46 60

3.7 A Selection Algorithm

3.7.1.

[22 37 25 60 16 42 38 46 19]    r = 1, pivot = 22
[16 19] 22 37 25 60 42 38 46    r = 1, pivot = 16
16 19                           return 16

[22 37 25 60 16 42 38 46 19]    r = 5, pivot = 22
16 19 22 [37 25 60 42 38 46]    r = 2, pivot = 37
25 37 60 42 38 46               return 37

[22 37 25 60 16 42 38 46 19]    r = 7, pivot = 22
16 19 22 [37 25 60 42 38 46]    r = 4, pivot = 37
25 37 [60 42 38 46]             r = 2, pivot = 60
[42 38 46] 60                   r = 2, pivot = 42
38 42 46                        return 42
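The answers in these traces can be cross-checked with the standard library. The sketch below deliberately does not implement the selection algorithm from the notes; it swaps in `std::nth_element`, which also places the element of a given rank in its sorted position. The function name `element_of_rank` and the convention that rank 1 means the smallest element are assumptions chosen to match the values of r in the traces.

```cpp
#include <algorithm>
#include <vector>

// Returns the element of rank r in v, where rank 1 is the smallest.
// PRECONDITION: 1 <= r <= v.size().
// The vector is passed by value so the caller's copy is not reordered.
int element_of_rank(std::vector<int> v, int r)
{
    std::nth_element(v.begin(), v.begin() + (r - 1), v.end());
    return v[r - 1];
}
```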

3.8 A Lower Bound for Comparison-Based Sorting

3.9 Sorting in Linear Time

3.9.1.

A: 2 3 6 3 3 4 3 4 1

   0 1 2 3 4 5 6
C: 0 1 1 4 2 0 1

A: 1 2 3 3 3 3 4 4 6
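The trace above has the shape of a simple counting sort: count how many times each value occurs, then rewrite the array from the counts. A minimal sketch follows. The name `counting_sort`, the parameter `m` for the largest possible value and the choice to return the count array are assumptions for illustration, and the version in the notes may differ in its details.

```cpp
#include <cstddef>
#include <vector>

// Sorts a in increasing order and returns the count array C, where C[x]
// is the number of occurrences of x in a.
// PRECONDITION: every element of a lies in the range [0, m].
std::vector<int> counting_sort(std::vector<int> & a, int m)
{
    std::vector<int> c(m + 1, 0);
    for (int x : a)
        ++c[x];
    std::size_t k = 0;  // next position to fill in a
    for (int value = 0; value <= m; ++value)
        for (int copies = 0; copies < c[value]; ++copies)
            a[k++] = value;
    return c;
}
```

Both passes are linear, so the running time is Θ(n + m) for n elements with values in [0, m].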

3.9.2.

A: 5A 4B 2C 8D 1E 4F 3G 4H 1I

   0  1  2  3  4  5  6  7  8
B:    1E 2C 3G 4B 5A       8D
      1I       4F
               4H

A: 1E 1I 2C 3G 4B 4F 4H 5A 8D
