CSE 2331/5331
Topic 5: Probabilistic Analysis, Randomized Algorithms, Expected Complexity
Quicksort: Deterministic and Randomized
Sorting Revisited
Quicksort
  Divide-and-conquer paradigm, but sorts in place
  Worst case: Θ(n^2)
  Randomized quicksort: expected running time O(n lg n)
Divide and Conquer
MergeSort ( A, r, s )
  if ( r ≥ s ) return;
  m = ⌊(r+s) / 2⌋;
  A1 = MergeSort ( A, r, m );
  A2 = MergeSort ( A, m+1, s );
  Merge ( A1, A2 );
Divide and Conquer
QuickSort ( A, r, s )
  if ( r ≥ s ) return;
  m = Partition ( A, r, s );    // A[m]: pivot
  QuickSort ( A, r, m-1 );
  QuickSort ( A, m+1, s );
  // Unlike MergeSort, no Merge ( A1, A2 ) step is needed: once both sides of the pivot are sorted, the whole subarray is sorted.
Partition
m = Partition ( A, p, r )
Afterwards:
  All elements in A[p, …, m-1] have value at most A[m]
  All elements in A[m, …, r] have value at least A[m]
Partition ( A, p, r )
Plan: take A[r] as pivot; return m
In-place partition!
[Figure: array with cells labelled x and y]
Case 1: y > x
[Figure: cells labelled x, z, y]
Case 2: otherwise
[Figure: cells x, z, y become x, y, z, i.e. y and z are exchanged]
Complexity: O(r - p + 1), i.e. linear in the size of A[p, …, r]
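The in-place partition can be sketched in Python as follows. This is a minimal sketch, assuming the common single-scan scheme with A[r] as pivot; the slides do not spell out the loop, so the index variables i and j, the 0-based indexing, and the reading of x as the pivot and y as the element currently examined are assumptions.

```python
def partition(A, p, r):
    """Rearrange A[p..r] in place around the pivot x = A[r]; return the pivot's final index m."""
    x = A[r]                       # pivot (last element of the subarray)
    i = p - 1                      # A[p..i] holds the elements <= x found so far
    for j in range(p, r):          # y = A[j] is the element being examined
        if A[j] <= x:              # y <= x: swap y to the end of the "<= x" region
            i += 1
            A[i], A[j] = A[j], A[i]
        # y > x: nothing to move, the "> x" region simply grows by one
    A[i + 1], A[r] = A[r], A[i + 1]   # place the pivot between the two regions
    return i + 1


A = [2, 8, 7, 1, 3, 5, 6, 4]
m = partition(A, 0, len(A) - 1)
print(m, A)   # 3 [2, 1, 3, 4, 7, 5, 6, 8]
```

The loop touches each of the r - p non-pivot elements once, which is where the linear cost comes from.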
QuickSort ( A, p, r )
  if ( p ≥ r ) return;
  m = Partition ( A, p, r );
  QuickSort ( A, p, m-1 );
  QuickSort ( A, m+1, r );
In-place
Initial call is QuickSort(A, 1, n).
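A runnable Python version of the same procedure, as a sketch: it assumes 0-based indexing (so the initial call becomes quicksort(A, 0, n-1)) and a single-scan partition with A[r] as pivot, which is one possible way to implement Partition.

```python
def partition(A, p, r):
    """In-place partition around the pivot A[r]; returns the pivot's final index."""
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1


def quicksort(A, p, r):
    """Sort A[p..r] in place; no auxiliary arrays and no merge step."""
    if p >= r:                  # zero or one element: already sorted
        return
    m = partition(A, p, r)      # A[m] is now in its final position
    quicksort(A, p, m - 1)      # elements <= pivot
    quicksort(A, m + 1, r)      # elements >= pivot


A = [3, 9, 1, 4, 7, 2]
quicksort(A, 0, len(A) - 1)     # 0-based counterpart of QuickSort(A, 1, n)
print(A)                        # [1, 2, 3, 4, 7, 9]
```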
Complexity
T(n) = T(m-1) + T(n-m) + n
Worst case: T(n) = T(0) + T(n-1) + n
                 = T(n-1) + n
                 = Θ(n^2)
Best case: T(n) = 2T(n/2) + n = Θ(n lg n)
Remarks
  If the input array is sorted or reverse sorted: quadratic time complexity
  Mixing splits: the good ones will dominate
Balanced Partitioning
Imagine that instead of splitting in the middle, we have a 9-to-1 split:
  T(n) = T(9n/10) + T(n/10) + n = O(n lg n)
Any constant fraction α (0 < α < 1) works the same:
  T(n) = T(αn) + T((1-α)n) + n = O(n lg n)
When we have a mixture of good and bad partitionings, the good ones dominate.
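To get a feel for these recurrences, the sketch below evaluates them numerically. It assumes that each Partition call on n elements costs n and splits the remaining n - 1 elements in a fixed ratio num : (den - num), with T(0) = T(1) = 0; the function names and these base cases are illustrative choices, not taken from the slides.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n, num, den):
    """Cost when every Partition splits the other n - 1 elements in the ratio num : (den - num)."""
    if n <= 1:
        return 0
    left = (n - 1) * num // den          # size of one side
    right = n - 1 - left                 # size of the other side
    return T(left, num, den) + T(right, num, den) + n


def T_worst(n):
    """T(n) = T(n-1) + n with T(1) = 0, evaluated iteratively."""
    return sum(range(2, n + 1))


for n in (1_000, 10_000, 100_000):
    nlgn = n * math.log2(n)
    # Ratios to n lg n for: even split, 9-to-1 split, worst case.
    print(n, round(T(n, 1, 2) / nlgn, 2), round(T(n, 9, 10) / nlgn, 2), round(T_worst(n) / nlgn, 2))
```

The first two ratios stay bounded as n grows, while the worst-case ratio keeps growing, which is the sense in which any constant-fraction split is as good as an even one up to a constant factor.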
Question
How can we have a good mix?
Intuitively, we will use randomization to achieve that.
Expected Complexity
Probabilistic method:
  Given a distribution for all possible inputs
  Derive expected time based on distribution
Randomized algorithm:
  Add randomness in the algorithm
  Analyze the expected behavior of the algorithm
Probabilistic Analysis
Simple example: assume there are only two types of inputs: half are best cases, half are worst cases
  Worst: T(n) = Θ(n^2)
  Best: T(n) = Θ(n lg n)
For all possible inputs, average time
  A(n) = 1/2 · n lg n + 1/2 · n^2 = Θ(n^2)
Randomization will not assume an input distribution
Randomized-QuickSort ( A, p, r )
  if ( p ≥ r ) return;
  m = Randomized-Partition ( A, p, r );
  Randomized-QuickSort ( A, p, m-1 );
  Randomized-QuickSort ( A, m+1, r );
Randomized-Partition ( A, p, r )
  Randomly choose a position s in [p, r]; take y = A[s] as pivot.
  Exchange y with x = A[r], so that the pivot sits at the end of the subarray.
  Next, run the same Partition ( A, p, r ).
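A Python sketch of the randomized variant follows. It assumes 0-based indexing, a single-scan partition with A[r] as pivot, and that the randomly chosen element is swapped into position r before partitioning; the optional rng parameter is a hypothetical convenience for reproducible runs, not something from the slides.

```python
import random


def partition(A, p, r):
    """In-place partition around the pivot A[r]; returns the pivot's final index."""
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1


def randomized_partition(A, p, r, rng=random):
    """Pick a uniformly random position s in [p, r], move A[s] to A[r], then partition as usual."""
    s = rng.randint(p, r)          # both endpoints are included
    A[s], A[r] = A[r], A[s]        # the random pivot y now sits at A[r]
    return partition(A, p, r)


def randomized_quicksort(A, p, r, rng=random):
    if p >= r:
        return
    m = randomized_partition(A, p, r, rng)
    randomized_quicksort(A, p, m - 1, rng)
    randomized_quicksort(A, m + 1, r, rng)


A = list(range(1, 11))             # sorted input: a bad case for the fixed "last element" pivot rule
randomized_quicksort(A, 0, len(A) - 1, random.Random(0))
print(A)                           # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Because the pivot no longer depends on where elements sit in the input, no particular input ordering is consistently bad.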
Time Complexity?
Intuitively, with constant probability, the pivot will cause a balanced partition.
How to make this more rigorous?
Randomized-QuickSort ( A, p, r )
  if ( p ≥ r ) return;
  s = Randomized-Partition ( A, p, r );
  Randomized-QuickSort ( A, p, s-1 );
  Randomized-QuickSort ( A, s+1, r );
ET(n) = expected running time of Randomized-QuickSort(A, 1, n)
Assume the input array has no duplicates
For the call Randomized-QuickSort(A, 1, n), the pivot position s is equally likely to be any of 1, …, n, so
  ET(n) = (1/n) ∑_{s=1..n} [ ET(s-1) + ET(n-s) ] + n
One can use the substitution method to show that ET(n) = O(n lg n)
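As a numerical sanity check (not a proof), the sketch below evaluates the expected-time recurrence directly. It assumes the recurrence ET(n) = (1/n) ∑_{s=1..n} [ET(s-1) + ET(n-s)] + n, obtained from T(n) = T(m-1) + T(n-m) + n with every pivot rank equally likely, and the base case ET(0) = 0; both sums run over the same values, so the sum is 2 (ET(0) + … + ET(n-1)).

```python
import math


def expected_time(n_max):
    """Return [ET(0), ..., ET(n_max)] for ET(n) = (2/n) * (ET(0) + ... + ET(n-1)) + n."""
    ET = [0.0] * (n_max + 1)
    prefix = 0.0                     # running value of ET(0) + ... + ET(n-1)
    for n in range(1, n_max + 1):
        ET[n] = 2.0 * prefix / n + n
        prefix += ET[n]
    return ET


ET = expected_time(100_000)
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(ET[n] / (n * math.log2(n)), 3))   # ratio ET(n) / (n lg n)
```

The printed ratios stay bounded, which is consistent with ET(n) = O(n lg n).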
Another Way of Analysis
Yet Another Way of Analysis
Using indicator random variables
  Requires a global understanding of the algorithm
  The key is to identify the right indicator random variables, similar to identifying the right events when computing an expectation.
Indicator Random Variable
Goal: count the expected number of heads in n coin flips
Straightforward:
  X : number of heads in n coin flips
  E[X] = ∑_{k=0..n} k · Prob[X = k]
New approach:
  X[j] = I { the j’th flip is a head }
       = 1 if the j’th flip is a head
         0 otherwise
  X[j] is an indicator random variable
Indicator Random Variable cont.
Lemma: Let X = I { claim A }. Then,
E[X] = Prob[claim A is true].
Reason: E[X] = 1 · Prob[claim A is true] + 0 · Prob[claim A is false].
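Counting Heads
Y : number of heads in n coin flips
Y = ∑_{j=1..n} X[j]
E[Y] = E[ ∑_{j=1..n} X[j] ] = ∑_{j=1..n} E[ X[j] ] = ∑_{j=1..n} Prob[ j’th flip is head ] = n/2 (for a fair coin)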
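The two ways of computing the expectation can be compared exactly in a few lines of Python. This sketch assumes a fair coin, so the number of heads is Binomial(n, 1/2) and each indicator has expectation 1/2; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def expected_heads_direct(n):
    """Straightforward: E[X] = sum over k of k * Prob[X = k], with Prob[X = k] = C(n, k) / 2^n."""
    return sum(k * Fraction(comb(n, k), 2 ** n) for k in range(n + 1))


def expected_heads_indicators(n):
    """Indicators: E[Y] = sum over j of Prob[j'th flip is a head] = n * 1/2."""
    return sum(Fraction(1, 2) for _ in range(n))


print(expected_heads_direct(10), expected_heads_indicators(10))   # 5 5
```

Both give n/2, but the indicator version never needs the distribution of the total count.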
Rand-QuickSort
Given A, suppose z1, z2, …, zn are its elements in sorted order.
Indicator random variable:
  X(i,j) = 1 if zi is compared to zj
           0 otherwise
Complexity = O(# comparisons Y) + O(n)
  Y = ∑_{i=1..n-1} ∑_{j=i+1..n} X(i,j)
Why?
Expected Complexity
E[Y] = E[ ∑_{i=1..n-1} ∑_{j=i+1..n} X(i,j) ] = ∑_{i=1..n-1} ∑_{j=i+1..n} E[ X(i,j) ]
     = ∑_{i=1..n-1} ∑_{j=i+1..n} Pr { zi is compared to zj }
Pr[i,j] = Pr { zi is compared to zj }
Z[i, j] : { zi, zi+1, …, zj }
Key Observation
Once two elements are in different subarrays, they will never be compared from then on
If any zk with i < k < j is chosen as pivot before both zi and zj, then X(i,j) = 0
X(i,j) = 1 if and only if zi or zj is the first element of Z[i,j] chosen as pivot
Pr[i,j] = Pr[ zi or zj is the first pivot chosen from Z[i,j] ]
        = Pr[ zi is the first pivot chosen from Z[i,j] ] + Pr[ zj is the first pivot chosen from Z[i,j] ]
        = 1 / (j - i + 1) + 1 / (j - i + 1)
        = 2 / (j - i + 1)
Why? The two events are disjoint, and each of the j - i + 1 elements of Z[i,j] is equally likely to be the first one chosen as pivot.
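The value 2 / (j - i + 1) can be checked exactly for small n without using the key observation. The sketch below assumes distinct elements identified by their ranks and a pivot chosen uniformly at random in every subarray; it computes the comparison probability by conditioning on the rank k of the pivot. The function name and argument order are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def prob_compared(lo, hi, i, j):
    """Probability that z_i and z_j (lo <= i < j <= hi) are compared when randomized
    quicksort runs on the ranks lo..hi, each pivot being uniform over the subarray."""
    size = hi - lo + 1
    total = Fraction(0)
    for k in range(lo, hi + 1):                       # k = rank of the chosen pivot
        if k == i or k == j:
            total += 1                                # the pivot is compared with every other element
        elif k < i:
            total += prob_compared(k + 1, hi, i, j)   # both end up in the right subarray
        elif k > j:
            total += prob_compared(lo, k - 1, i, j)   # both end up in the left subarray
        # i < k < j: z_i and z_j are separated and never compared (contributes 0)
    return total / size


print(prob_compared(1, 8, 2, 5))   # 1/2, i.e. 2 / (5 - 2 + 1)
```

Pivots outside Z[i,j] leave the pair together, which is why only the first pivot picked inside Z[i,j] matters.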
Expected Complexity
E[Y] = ∑_{i=1..n-1} ∑_{j=i+1..n} 2 / ( j - i + 1 )
     …
     = O ( n lg n )
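The double sum can also be evaluated exactly for concrete n. The sketch below compares it with the closed form 2(n+1)H_n - 4n, where H_n is the n-th harmonic number (obtained by grouping the pairs by their distance j - i), and with the bound 2 n ln n; it is a numerical check under these stated formulas, not the derivation elided on the slide.

```python
import math
from fractions import Fraction


def expected_comparisons(n):
    """E[Y] = sum over 1 <= i < j <= n of 2 / (j - i + 1), as an exact fraction."""
    return sum(Fraction(2, j - i + 1) for i in range(1, n) for j in range(i + 1, n + 1))


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


print(expected_comparisons(3))                       # 8/3
for n in (10, 100):
    exact = expected_comparisons(n)
    print(n, float(exact), float(2 * (n + 1) * harmonic(n) - 4 * n), 2 * n * math.log(n))
```

Since H_n grows like ln n, the exact value grows like 2 n ln n, in line with the O(n lg n) bound.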