CSC 213 Lecture 15: Sets, Union/Find, and the Selection Problem

Post on 22-Dec-2015


CSC 213

Lecture 15: Sets, Union/Find, and the Selection Problem

Set Operations (§ 10.6)

Sets define 3 basic operations:

void union(Set B)
    Add B's elements that are not in the current set to the current set

void intersect(Set B)
    Remove elements from the current set that are not in B

void subtract(Set B)
    Remove elements from the current set that are in B

Set ADT

Sets are collections of elements
Sets use Positions, not Entries
The ADT does not define a way to insert, remove, or find elements
The ADT is just a collection: there is no way of getting the next, previous, first, or last element

Set Operations (§ 10.6)

Running time of operations on sets A and B should be O(nA + nB), where nA and nB are the sizes of A and B
Sets are often implemented using a Sequence
With an unsorted Sequence, operations would need to compare all possible pairs of elements
So, for efficiency, maintain the Sequence in sorted order

Template Method Pattern

Abstract class provides the basic outline
    Outline is shared and used for several functions
    Specific actions are performed in abstract methods
Classes specialize the abstract class
    Each defines the abstract methods to perform the specific actions it needs
    All classes share the template, however

Generic Merge Template

Class defines the template method genericMerge
Template method relies on auxiliary (abstract) methods:
    aIsLess
    bIsLess
    bothAreEqual

Generic Merge Template

Algorithm genericMerge(Sequence A, Sequence B, Comparator C)
    S ← new Sequence()
    a ← A.first(); b ← B.first()
    while a ≠ null && b ≠ null
        x ← C.compare(a, b)
        if x < 0
            aIsLess(A, S)
        else if x > 0
            bIsLess(B, S)
        else
            bothAreEqual(A, B, S)
    while a ≠ null
        aIsLess(A, S)
    while b ≠ null
        bIsLess(B, S)
    return S
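The template above can be sketched in Java roughly as follows. This is a minimal sketch under some assumptions, not the course's own code: it uses java.util.List in place of the Sequence ADT, the hook methods receive the current elements rather than the sequences, and the template (not the hooks) advances through A and B. The class names UnionMerge and IntersectMerge are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Template method: the merge loop is fixed, only the three hooks vary. */
abstract class GenericMerge<E> {
    /** Merges two sorted lists into a new list, as decided by the hooks. */
    List<E> merge(List<E> a, List<E> b, Comparator<E> c) {
        List<E> s = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int x = c.compare(a.get(i), b.get(j));
            if (x < 0) {
                aIsLess(a.get(i++), s);
            } else if (x > 0) {
                bIsLess(b.get(j++), s);
            } else {
                bothAreEqual(a.get(i++), b.get(j++), s);
            }
        }
        // At most one of these loops runs: drain whichever list is left.
        while (i < a.size()) aIsLess(a.get(i++), s);
        while (j < b.size()) bIsLess(b.get(j++), s);
        return s;
    }

    abstract void aIsLess(E a, List<E> s);
    abstract void bIsLess(E b, List<E> s);
    abstract void bothAreEqual(E a, E b, List<E> s);

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 3, 6, 7, 8);
        List<Integer> b = List.of(1, 2, 4, 21, 100);
        Comparator<Integer> c = Comparator.naturalOrder();
        System.out.println(new UnionMerge<Integer>().merge(a, b, c));
        System.out.println(new IntersectMerge<Integer>().merge(a, b, c));
    }
}

/** Union: copy everything, but only one copy of elements found in both. */
class UnionMerge<E> extends GenericMerge<E> {
    void aIsLess(E a, List<E> s) { s.add(a); }
    void bIsLess(E b, List<E> s) { s.add(b); }
    void bothAreEqual(E a, E b, List<E> s) { s.add(a); }
}

/** Intersect: copy only elements found in both lists. */
class IntersectMerge<E> extends GenericMerge<E> {
    void aIsLess(E a, List<E> s) { }
    void bIsLess(E b, List<E> s) { }
    void bothAreEqual(E a, E b, List<E> s) { s.add(a); }
}
```

Each element of A and B is visited once, so a merge over sorted inputs takes time proportional to the combined size of the two lists.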

Generic Merge Implementations

Can implement all Set operations using generic merge
    union: copy all elements, but make only one copy of duplicates
    intersect: copy only duplicate elements (those in both A and B)
    subtract: copy only elements that are in A and not in B

Generic Merge Implementations

union:
    aIsLess(A, S)
        S.insertLast(a); a ← A.next(a)
    bIsLess(B, S)
        S.insertLast(b); b ← B.next(b)
    bothAreEqual(A, B, S)
        S.insertLast(a); a ← A.next(a); b ← B.next(b)

Generic Merge Implementations

intersect:
    aIsLess(A, S)
        a ← A.next(a)
    bIsLess(B, S)
        b ← B.next(b)
    bothAreEqual(A, B, S)
        S.insertLast(a); a ← A.next(a); b ← B.next(b)

Your Turn

What is the result of the following operations?
    A = {7, 8, 1, 6, 3}
    B = {1, 2, 4, 100, 21}
    C = {45, 8, 100, 2}

    A - B
    B - A
    A ∪ C
    A ∪ C ∪ B
    A ∩ B
    B ∩ C

Using the template method pattern and GenericMerge, write the subtract class.

Your Turn

What is the result of the following operations?
    A = {7, 8, 1, 6, 3}
    B = {1, 2, 4, 100, 21}
    C = {45, 8, 100, 2}

    A - B = {7, 8, 6, 3}
    B - A = {2, 4, 100, 21}
    A ∪ C = {7, 8, 1, 6, 3, 45, 100, 2}
    A ∪ C ∪ B = {7, 8, 1, 6, 3, 45, 100, 2, 4, 21}
    A ∩ B = {1}
    B ∩ C = {2, 100}
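One way to check answers like these is with the standard java.util collections, where addAll, retainAll, and removeAll play the roles of union, intersect, and subtract. This is only a sketch: the class SetOpsCheck and its helper methods are hypothetical names, and TreeSet stands in for the lecture's Set ADT (it also keeps elements sorted, so printed order differs from the slide).

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SetOpsCheck {
    static Set<Integer> setOf(Integer... xs) {
        return new TreeSet<>(Arrays.asList(xs));
    }

    /** Returns a new set holding every element of a or b. */
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new TreeSet<>(a);
        r.addAll(b);
        return r;
    }

    /** Returns a new set holding the elements in both a and b. */
    static Set<Integer> intersect(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new TreeSet<>(a);
        r.retainAll(b);
        return r;
    }

    /** Returns a new set holding the elements of a that are not in b. */
    static Set<Integer> subtract(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new TreeSet<>(a);
        r.removeAll(b);
        return r;
    }

    public static void main(String[] args) {
        Set<Integer> a = setOf(7, 8, 1, 6, 3);
        Set<Integer> b = setOf(1, 2, 4, 100, 21);
        Set<Integer> c = setOf(45, 8, 100, 2);
        System.out.println("A - B = " + subtract(a, b));
        System.out.println("B - A = " + subtract(b, a));
        System.out.println("A u C = " + union(a, c));
        System.out.println("A u C u B = " + union(union(a, c), b));
        System.out.println("A n B = " + intersect(a, b));
        System.out.println("B n C = " + intersect(b, c));
    }
}
```

Unlike the lecture's void operations, these helpers copy the first set before modifying it, so A, B, and C can be reused across several expressions.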

Your Turn

Using the template method pattern and GenericMerge, write the subtract class

public class SubtMerge extends GenericMerge {
    // a is only in A: keep it
    public void aIsLess(Sequence A, Sequence S) {
        S.insertLast(a);
        a = A.next(a);
    }

    // b is only in B: skip it
    public void bIsLess(Sequence B, Sequence S) {
        b = B.next(b);
    }

    // element is in both A and B: skip it
    public void bothAreEqual(Sequence A, Sequence B, Sequence S) {
        a = A.next(a);
        b = B.next(b);
    }
}

Partitions

ADT defining a collection of disjoint Sets
Partitions define 3 methods:
    makeSet(x): Create a new set containing x
    union(A, B): Remove A and B from the partition and add a new set containing A ∪ B
    find(p): Return the set containing the element in position p

Position Implementation

Sets are implemented using a Sequence
Positions include a reference to the element and a reference to the set the element is in

Complexity of find() is _________

Complexity of makeSet() is ________

Sequence-based Partitions

Move elements from the smaller set to the larger set during union
    Each time an element is moved, it goes into a set at least twice the size of its old set
Perform n makeSet operations, then make n union and find calls
    What is the maximum complexity?
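The smaller-into-larger rule can be sketched in Java as below. This is an illustrative sketch under assumptions: the class name ListPartition is hypothetical, each set is a java.util.ArrayList rather than the lecture's Sequence, and a HashMap from element to set stands in for the set reference that the slides store in each Position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ListPartition<E> {
    // Maps each element to the list (set) that currently holds it.
    private final Map<E, List<E>> setOf = new HashMap<>();

    /** Creates a new one-element set containing x. */
    List<E> makeSet(E x) {
        List<E> s = new ArrayList<>();
        s.add(x);
        setOf.put(x, s);
        return s;
    }

    /** Returns the set containing x, or null if x was never added. */
    List<E> find(E x) {
        return setOf.get(x);
    }

    /** Moves every element of the smaller set into the larger one. */
    List<E> union(List<E> a, List<E> b) {
        if (a == b) return a;
        List<E> small = (a.size() < b.size()) ? a : b;
        List<E> large = (small == a) ? b : a;
        for (E x : small) {
            large.add(x);
            setOf.put(x, large); // update the element's set reference
        }
        small.clear();
        return large;
    }

    public static void main(String[] args) {
        ListPartition<Integer> p = new ListPartition<>();
        for (int i = 1; i <= 4; i++) p.makeSet(i);
        p.union(p.find(1), p.find(2));
        p.union(p.find(1), p.find(3));
        System.out.println(p.find(3) == p.find(2)); // same set object
        System.out.println(p.find(4).size());
    }
}
```

Only the elements of the smaller set have their references updated, which is what makes the doubling argument on the slide apply.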

Your Turn

Write union() given this class definition:

public class Part implements Partition {
    /** Holds all the instances of Set
     *  that make up this partition. */
    Sequence sets;

    // Constructor, find(), makeSet() omitted for space…

    /** Remove A & B from Partition and
     *  add new Set equal to A ∪ B.
     *  Return this new set. */
    public Set union(Set A, Set B) {

Your Turn

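public Set union(Set A, Set B) {
    // Remove A and B from the partition; advance i only when nothing
    // was removed, since removal shifts later elements down by one.
    int i = 0;
    while (i < sets.size()) {
        if (sets.elemAtRank(i) == A) {
            sets.removeAtRank(i);
        } else if (sets.elemAtRank(i) == B) {
            sets.removeAtRank(i);
        } else {
            i++;
        }
    }
    // Merge B into A and add the combined set back to the partition.
    A.union(B);
    sets.insertLast(A);
    return A;
}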

Selection Problem

Given n elements in Sequence S = s1, s2, …, sn and a value k between 1 and n, find the kth smallest element in S
Could sort the Sequence and return the kth element
    This would take time _____________

Example: S = (7 4 9 6 2), which sorts to (2 4 6 7 9); with k=3 the answer is 6

Quick-Select (§ 10.7)

Prune-and-search algorithm
Prune: Pick a pivot x and partition S into
    L: elements less than x
    x: the pivot itself
    G: elements greater than or equal to x
Search: If x is the solution, return x; else, solve the problem recursively using L or G

Quick-Select (§ 10.7)

(Figure: pivot x splits S into L and G.)

if k ≤ L.size() then
    return quickSelect(k, L)
if k > L.size() + 1 then
    k = k - L.size() - 1
    return quickSelect(k, G)
if k == L.size() + 1 then
    return x
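A Java sketch of these three cases is shown below. It assumes a randomly chosen pivot, integer elements, and java.util.List in place of the Sequence ADT; the class name QuickSelect is hypothetical. It recurses into L when k is at most the size of L, returns the pivot when k is exactly one more than that, and otherwise recurses into G with k reduced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class QuickSelect {
    private static final Random RNG = new Random();

    /** Returns the k-th smallest element of s, for 1 <= k <= s.size(). */
    static int quickSelect(int k, List<Integer> s) {
        if (k < 1 || k > s.size()) {
            throw new IllegalArgumentException("k out of range");
        }
        // Prune: pick a random pivot x and split the rest into L and G.
        int pivotIndex = RNG.nextInt(s.size());
        int x = s.get(pivotIndex);
        List<Integer> l = new ArrayList<>(); // elements less than x
        List<Integer> g = new ArrayList<>(); // elements greater than or equal to x
        for (int i = 0; i < s.size(); i++) {
            if (i == pivotIndex) continue;   // the pivot goes in neither list
            int v = s.get(i);
            if (v < x) l.add(v); else g.add(v);
        }
        // Search: decide which part holds the answer.
        if (k <= l.size()) return quickSelect(k, l);
        if (k == l.size() + 1) return x;
        return quickSelect(k - l.size() - 1, g);
    }

    public static void main(String[] args) {
        List<Integer> s = List.of(7, 4, 9, 3, 2, 6, 5, 1, 8);
        System.out.println(quickSelect(5, s)); // prints 5
    }
}
```

Because the pivot is excluded from both lists, every recursive call works on a strictly smaller list, so the recursion always terminates even when S contains duplicates.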

Quick-Select Visualization

Like sort trees, but each node has 1 child

k=5, S=(7 4 9 3 2 6 5 1 8)
k=2, S=(7 4 9 6 5 8)
k=2, S=(7 4 6 5)
k=1, S=(7 6 5)
5

Running Time

What is worst-case running time of quickSelect?

What is expected running time of quickSelect?

Expected Running Time

(Figure: an example of a good call and a bad call of quick-select on sequences such as 7 2 9 4 3 7 6 1 9. Below it, pivot ranks 1 through 16 are marked: the middle ranks are good pivots, and the ranks at either end are bad pivots.)

Expected Running Time, Part 2

Probabilistic Fact #1: The expected number of coin tosses required in order to get one head is two
Probabilistic Fact #2: Expectation is a linear function:
    E(X + Y) = E(X) + E(Y)
    E(cX) = cE(X)

Let T(n) denote the expected running time of quick-select.
(A good call leaves a subsequence of size at most 3n/4, and a call is good with probability at least 1/2; partitioning n elements costs at most bn for some constant b.)
By Fact #2,
    T(n) < T(3n/4) + bn * (expected # of calls before a good call)
By Fact #1,
    T(n) < T(3n/4) + 2bn
That is, T(n) is bounded by a geometric series:
    T(n) < 2bn + 2b(3/4)n + 2b(3/4)^2 n + 2b(3/4)^3 n + …
So T(n) = O(n).
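For reference, the series can be summed explicitly. The step below is a standard geometric-series bound written out as a worked example; the constant 8 is not stated on the slide and follows only from the ratio 3/4.

```latex
\begin{align*}
T(n) &< 2bn + 2b\left(\tfrac{3}{4}\right)n + 2b\left(\tfrac{3}{4}\right)^{2}n + 2b\left(\tfrac{3}{4}\right)^{3}n + \cdots \\
     &\le 2bn \sum_{i=0}^{\infty} \left(\tfrac{3}{4}\right)^{i}
      = 2bn \cdot \frac{1}{1 - \tfrac{3}{4}}
      = 8bn
\end{align*}
```

Since 8bn is a constant multiple of n, this gives the O(n) bound.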