COMP251: Greedy algorithms

COMP251: Greedy algorithms. Jérôme Waldispühl, School of Computer Science, McGill University. Based on (Cormen et al., 2002). Based on slides from D. Plaisted (UNC) & (Goodrich & Tamassia, 2009).


Post on 02-Oct-2020


TRANSCRIPT

Page 1: COMP251: Greedy algorithmsjeromew/COMP251material/COMP251_Le… · Huffman’s Algorithm • Given a string X, Huffman’s algorithm construct a prefix code the minimizes the size

COMP251: Greedy algorithms

Jérôme Waldispühl, School of Computer Science

McGill University. Based on (Cormen et al., 2002)

Based on slides from D. Plaisted (UNC) & (Goodrich & Tamassia, 2009)

Page 2

Disjoint sets are represented with an array rep[] that stores the representative rep[i] of each element i. The running time of the function find(i), which returns the representative of the set containing i, is:

• Ω(1)
• O(log n)
• Θ(log n)

✓ (More interestingly, Θ(1))
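The constant-time lookup can be sketched as follows. This is a minimal illustration, assuming elements are numbered 0 to n-1; the class name `ArrayDisjointSets` and the relabelling `union` are illustrative choices that do not come from the slides.

```python
class ArrayDisjointSets:
    """Disjoint sets stored as an array rep[] of representatives (illustrative)."""

    def __init__(self, n):
        # Initially every element is alone in its set and represents itself.
        self.rep = list(range(n))

    def find(self, i):
        # A single array access: Theta(1).
        return self.rep[i]

    def union(self, i, j):
        # Assumed strategy: relabel every element of j's set with i's representative.
        ri, rj = self.rep[i], self.rep[j]
        if ri == rj:
            return
        for k in range(len(self.rep)):
            if self.rep[k] == rj:
                self.rep[k] = ri
```

With this layout find is a single array access, while the relabelling union shown here scans the whole array.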

Page 3

Let h(A) (resp. h(B)) be the height of the tree A (resp. B) rooted at x (resp. y). We assume that h(B) ≤ h(A) + 1. After union(x, y), which assertions are true?

• h(y) = h(A) + 1
• h(y) = max(h(A) + 1, h(B))
• h(y) = h(B)
• h(B) < h(y)

✓✗

Page 4

Overview

• Algorithm design technique to solve optimization problems.
• Problems exhibit optimal substructure.
• Idea (the greedy choice):
  – When we have a choice to make, make the one that looks best right now.
  – Make a locally optimal choice in the hope of getting a globally optimal solution.

Page 5

Greedy Strategy

The choice that seems best at the moment is the one we go with.

– Prove that when there is a choice to make, one of the optimal choices is the greedy choice. Therefore, it is always safe to make the greedy choice.
– Show that all but one of the sub-problems resulting from the greedy choice are empty.

Page 6

Activity-selection Problem

• Input: Set S of n activities, $a_1, a_2, \ldots, a_n$.
  – $s_i$ = start time of activity i.
  – $f_i$ = finish time of activity i.
• Output: Subset A of maximum number of compatible activities.
  – Two activities are compatible if their intervals do not overlap (an activity may start exactly when another one finishes).

Example: Activities in each line are compatible.

[Figure: activities drawn as intervals on a time axis from 0 to 10]

Page 7

Activity-selection Problem

[Figure: activities a1 to a7 drawn as intervals from $s_i$ to $f_i$ on a time axis from 0 to 10]

i      1   2   3   4   5   6   7
s_i    0   1   2   4   5   6   8
f_i    2   3   5   6   9   9   10

Activities sorted by finishing time.

Optimal compatible set: {a1, a3, a5}
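The table can be checked directly. The sketch below uses the start and finish times above; it assumes that two activities are compatible when one finishes no later than the other starts, and it uses an exhaustive search over subsets only to confirm the optimum on this small instance. The function names are illustrative.

```python
from itertools import combinations

# Start and finish times from the table, indexed by activity number.
start = {1: 0, 2: 1, 3: 2, 4: 4, 5: 5, 6: 6, 7: 8}
finish = {1: 2, 2: 3, 3: 5, 4: 6, 5: 9, 6: 9, 7: 10}

def compatible(i, j):
    # The intervals do not overlap: one ends before (or when) the other begins.
    return finish[i] <= start[j] or finish[j] <= start[i]

def mutually_compatible(subset):
    return all(compatible(i, j) for i, j in combinations(subset, 2))

def max_compatible_size():
    # Brute force over all subsets; only reasonable for tiny inputs like this one.
    ids = list(start)
    best = 0
    for r in range(1, len(ids) + 1):
        if any(mutually_compatible(c) for c in combinations(ids, r)):
            best = r
    return best
```

On this instance no compatible set has more than three activities, so {a1, a3, a5} is one of several optimal answers.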

Page 8

Optimal Substructure

• Assume activities are sorted by finishing times.
• Suppose an optimal solution includes activity $a_k$. This solution is obtained from:
  – An optimal selection of activities among $a_1, \ldots, a_{k-1}$ that are compatible with one another and finish before $a_k$ starts.
  – An optimal selection of activities among $a_{k+1}, \ldots, a_n$ that are compatible with one another and start after $a_k$ finishes.

Page 9

Optimal Substructure

• Let $S_{ij}$ = subset of activities in S that start after $a_i$ finishes and finish before $a_j$ starts. For each pair i, j:

$$S_{ij} = \{\, a_k \in S : f_i \le s_k < f_k \le s_j \,\}$$

• $A_{ij}$ = optimal solution to $S_{ij}$
• $A_{ij} = A_{ik} \cup \{a_k\} \cup A_{kj}$

Page 10

Recursive Solution

• Subproblems: Selecting the maximum number of mutually compatible activities from $S_{ij}$.
• Let c[i, j] = size of a maximum-size subset of mutually compatible activities in $S_{ij}$.

Recursive solution:

$$c[i,j] = \begin{cases} 0 & \text{if } S_{ij} = \emptyset \\ \max\limits_{i<k<j,\ a_k \in S_{ij}} \{\, c[i,k] + c[k,j] + 1 \,\} & \text{if } S_{ij} \ne \emptyset \end{cases}$$

Note: Here, we do not know which k to use for the optimal solution.
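The recurrence can be evaluated by trying every k, as in the memoized sketch below. It assumes two sentinel activities that are not part of the slide: a_0 with f_0 = 0 and a_{n+1} with s_{n+1} = ∞, so that c(0, n+1) covers the whole input. The data are the seven activities of the running example, and the names `in_S` and `c` are illustrative.

```python
import math
from functools import lru_cache

# Index 0 and index n+1 are sentinels (assumption); indices 1..n are the activities.
start = [0, 0, 1, 2, 4, 5, 6, 8, math.inf]
finish = [0, 2, 3, 5, 6, 9, 9, 10, math.inf]
n = 7

def in_S(k, i, j):
    # a_k belongs to S_ij when f_i <= s_k < f_k <= s_j.
    return finish[i] <= start[k] < finish[k] <= start[j]

@lru_cache(maxsize=None)
def c(i, j):
    # Size of a maximum set of mutually compatible activities in S_ij.
    candidates = [k for k in range(i + 1, j) if in_S(k, i, j)]
    if not candidates:
        return 0
    # Every candidate k is tried, since the best one is not known in advance.
    return max(c(i, k) + c(k, j) + 1 for k in candidates)
```

Calling `c(0, n + 1)` gives the size of an optimal solution for the whole instance.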

Page 11

Greedy choice

Theorem: Let $S_{ij} \ne \emptyset$, and let $a_m$ be the activity in $S_{ij}$ with the earliest finish time: $f_m = \min\{ f_k : a_k \in S_{ij} \}$. Then:
1. $a_m$ is used in some maximum-size subset of mutually compatible activities of $S_{ij}$.
2. $S_{im} = \emptyset$, so that choosing $a_m$ leaves $S_{mj}$ as the only nonempty subproblem.
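Part 2 of the theorem can be checked numerically on the running example. The sketch below is only a sanity check on one instance, not a proof; it assumes the same sentinel activities a_0 (f_0 = 0) and a_{n+1} (s_{n+1} = ∞) used to express the whole problem as S_{0,n+1}, and all function names are illustrative.

```python
import math

# Index 0 and index n+1 are sentinels (assumption); indices 1..n are the activities.
start = [0, 0, 1, 2, 4, 5, 6, 8, math.inf]
finish = [0, 2, 3, 5, 6, 9, 9, 10, math.inf]
n = 7

def S(i, j):
    # Indices k of the activities with f_i <= s_k < f_k <= s_j.
    return [k for k in range(1, n + 1)
            if finish[i] <= start[k] < finish[k] <= start[j]]

def earliest_finish(i, j):
    # The greedy choice a_m in S_ij, or None when S_ij is empty.
    members = S(i, j)
    return min(members, key=lambda k: finish[k]) if members else None

def greedy_choice_leaves_one_subproblem():
    # For every nonempty S_ij, check that S_im is empty.
    for i in range(n + 2):
        for j in range(i + 1, n + 2):
            m = earliest_finish(i, j)
            if m is not None and S(i, m):
                return False
    return True
```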

Page 12

Greedy choice

Proof: (1) $a_m$ is used in some maximum-size subset of mutually compatible activities of $S_{ij}$.

• Let $A_{ij}$ be a maximum-size subset of mutually compatible activities in $S_{ij}$ (i.e. an optimal solution of $S_{ij}$).
• Order the activities in $A_{ij}$ in monotonically increasing order of finish time, and let $a_k$ be the first activity in $A_{ij}$.
• If $a_k = a_m$ ⇒ done.
• Otherwise, let $A'_{ij} = (A_{ij} - \{a_k\}) \cup \{a_m\}$.
• $A'_{ij}$ is valid because $a_m$ finishes no later than $a_k$ ($f_m \le f_k$), and every other activity of $A_{ij}$ starts after $a_k$ finishes, so none of them overlaps $a_m$.
• Since $|A_{ij}| = |A'_{ij}|$ and $A_{ij}$ has maximum size ⇒ $A'_{ij}$ has maximum size too.

Page 13

Greedy choice

Proof: (2) $S_{im} = \emptyset$, so that choosing $a_m$ leaves $S_{mj}$ as the only nonempty subproblem.

If there is $a_k \in S_{im}$, then $f_i \le s_k < f_k \le s_m < f_m$ ⇒ $f_k < f_m$, which contradicts the hypothesis that $a_m$ has the earliest finish time.

Page 14

Greedy choice

                                       Before theorem                                After theorem
# subproblems in optimal solution      2                                             1
# choices to consider                  j - i - 1                                     1
Solution                               $A_{ij} = A_{ik} \cup \{a_k\} \cup A_{kj}$    $A_{ij} = \{a_m\} \cup A_{mj}$

We can now solve the problem $S_{ij}$ top-down:

• Choose $a_m \in S_{ij}$ with the earliest finish time (greedy choice).
• Solve $S_{mj}$.

Pages 15 to 18

Activity-selection Problem

[Figure, repeated on four consecutive slides: the activities a1 to a7 of Page 7 drawn as intervals on a time axis from 0 to 10, with the same table of start and finish times. Activities sorted by finishing time.]

Page 19

Recursive Algorithm

Recursive-Activity-Selector(s, f, i, n)
    m ← i + 1
    while m ≤ n and s_m < f_i        // Find the first activity in S_{i,n+1}
        do m ← m + 1
    if m ≤ n
        then return {a_m} ∪ Recursive-Activity-Selector(s, f, m, n)
        else return ∅

Initial call: Recursive-Activity-Selector(s, f, 0, n), with a fictitious activity a_0 such that f_0 = 0.
Complexity: Θ(n)

Note 1: We assume activities are already ordered by finishing time.
Note 2: Straightforward to convert the algorithm to an iterative one.
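The pseudocode translates almost line by line into Python. The sketch below assumes 1-based arrays with a fictitious activity at index 0 whose finish time is 0, and activities already sorted by finish time; it also shows one possible iterative version, as suggested by Note 2. Function names are illustrative.

```python
def recursive_activity_selector(s, f, i, n):
    """Return the indices chosen greedily among activities i+1..n."""
    m = i + 1
    # Skip the activities that start before a_i finishes.
    while m <= n and s[m] < f[i]:
        m += 1
    if m <= n:
        return [m] + recursive_activity_selector(s, f, m, n)
    return []

def iterative_activity_selector(s, f, n):
    """Same greedy choice, written as a single left-to-right scan."""
    selected = []
    last = 0  # index of the last selected activity (0 is the fictitious one)
    for m in range(1, n + 1):
        if s[m] >= f[last]:
            selected.append(m)
            last = m
    return selected

# Running example; index 0 holds the fictitious activity a_0 with f_0 = 0.
s = [0, 0, 1, 2, 4, 5, 6, 8]
f = [0, 2, 3, 5, 6, 9, 9, 10]
```

Both functions look at each activity once, which matches the Θ(n) bound stated above for sorted input.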

Page 20

Typical Steps

• Cast the optimization problem as one in which we make a choice and are left with one subproblem to solve.
• Prove that there is always an optimal solution that makes the greedy choice (greedy choice is safe).
• Show that greedy choice and optimal solution to subproblem ⇒ optimal solution to the problem.
• Make the greedy choice and solve top-down.
• You may have to preprocess input to put it into greedy order (e.g. sorting activities by finish time).
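The preprocessing step can be illustrated on activity selection when the input is not sorted. The sketch below assumes activities are given as (start, finish) pairs in arbitrary order; the function name `select_activities` is illustrative.

```python
import math

def select_activities(activities):
    """Greedy activity selection on unsorted (start, finish) pairs."""
    # Preprocessing: put the input into greedy order (earliest finish first).
    ordered = sorted(activities, key=lambda a: a[1])
    chosen = []
    last_finish = -math.inf
    for start, finish in ordered:
        # Greedy choice: take the first activity compatible with what is chosen so far.
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen

# The seven activities of the running example, deliberately shuffled.
shuffled = [(5, 9), (0, 2), (8, 10), (2, 5), (6, 9), (1, 3), (4, 6)]
```

Sorting dominates the running time here, so the whole procedure takes O(n log n) on unsorted input.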

Page 21

Elements of Greedy Algorithms

No general way to tell if a greedy algorithm is optimal, but two key ingredients are:

• Greedy-choice Property.
  – A globally optimal solution can be arrived at by making a locally optimal (greedy) choice.
• Optimal Substructure.

Page 22

Text Compression

• Given a string X, efficiently encode X into a smaller string Y
  – Saves memory and/or bandwidth
• A good approach: Huffman encoding
  – Compute frequency f(c) for each character c.
  – Encode high-frequency characters with short code words
  – No code word is a prefix of another code word
  – Use an optimal encoding tree to determine the code words

Page 23

Encoding Tree Example

• A code is a mapping of each character of an alphabet to a binary code-word
• A prefix code is a binary code such that no code-word is the prefix of another code-word
• An encoding tree represents a prefix code
  – Each external node (leaf) stores a character
  – The code word of a character is given by the path from the root to the external node storing the character (0 for a left child and 1 for a right child)

[Figure: encoding tree with leaves a, b, c, d, e; each edge is labelled 0 (left child) or 1 (right child)]

character    a     b     c     d     e
code-word    00    010   011   10    11

Page 24

Encoding Example

[Figure: the encoding tree of Page 23, with leaves a, b, c, d, e]

Initial string: X = acda
Encoded string: Y = 00 011 10 00
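The example can be reproduced with the code table of the encoding tree (a = 00, b = 010, c = 011, d = 10, e = 11). The sketch below is illustrative: the helper names are not from the slides, and decoding relies on the prefix property to emit a character as soon as the buffered bits match a code word.

```python
CODE = {"a": "00", "b": "010", "c": "011", "d": "10", "e": "11"}

def is_prefix_code(code):
    # No code word may be a prefix of a different character's code word.
    words = list(code.values())
    return not any(i != j and words[j].startswith(words[i])
                   for i in range(len(words)) for j in range(len(words)))

def encode(text, code):
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    inverse = {word: ch for ch, word in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        # Thanks to the prefix property, the first match is the only possible one.
        if buffer in inverse:
            out.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a code word")
    return "".join(out)
```

Because the code is a prefix code, the encoded bit string needs no separators between code words.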

Page 25

Encoding Tree Optimization

• Given a text string X, we want to find a prefix code for the characters of X that yields a small encoding for X
  – Frequent characters should have short code-words
  – Rare characters should have long code-words
• Example
  – X = abracadabra
  – T1 encodes X into 29 bits
  – T2 encodes X into 24 bits

[Figure: two encoding trees, T1 and T2, over the characters a, b, c, d, r]
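The size of an encoding is the sum, over all characters, of frequency times code-word length. The sketch below computes it for two code tables; the specific code words in `T1` and `T2` are an assumption, chosen to be consistent with the stated totals of 29 and 24 bits, and are not read verbatim from the original figure.

```python
from collections import Counter

X = "abracadabra"

# Hypothetical code words (assumption): in T1 the frequent 'a' sits deep in the
# tree, in T2 it sits near the root.
T1 = {"c": "00", "a": "010", "r": "011", "d": "10", "b": "11"}
T2 = {"a": "00", "c": "010", "d": "011", "b": "10", "r": "11"}

def encoded_size(text, code):
    # Sum of f(c) * len(code-word of c) over the distinct characters of the text.
    freq = Counter(text)
    return sum(freq[ch] * len(code[ch]) for ch in freq)
```

Giving the most frequent character a shorter code word is what makes the second table cheaper.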

Page 26

Example

X = abracadabra

Frequencies

character    a   b   c   d   r
frequency    5   2   1   1   2

[Figure: the tree is built bottom-up in successive steps: c and d are merged into a subtree of weight 2, b and r into a subtree of weight 4, these two subtrees into one of weight 6, and finally a (weight 5) is joined with it under a root of weight 11]

Page 27

Extended Huffman Tree Example

Page 28

Huffman's Algorithm

• Given a string X, Huffman's algorithm constructs a prefix code that minimizes the size of the encoding of X
• It runs in time O(n + d log d), where n is the size of X and d is the number of distinct characters of X
• A heap-based priority queue is used as an auxiliary structure

Algorithm HuffmanEncoding(X)
    Input: string X of size n
    Output: optimal encoding trie for X
    C ← distinctCharacters(X)
    computeFrequencies(C, X)
    Q ← new empty heap
    for all c ∈ C
        T ← new single-node tree storing c
        Q.insert(getFrequency(c), T)
    while Q.size() > 1
        f1 ← Q.minKey()
        T1 ← Q.removeMin()
        f2 ← Q.minKey()
        T2 ← Q.removeMin()
        T ← join(T1, T2)
        Q.insert(f1 + f2, T)
    return Q.removeMin()
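A compact Python sketch of this algorithm is given below, using the standard `heapq` module as the heap-based priority queue. It assumes a non-empty input string; the tuple-based tree representation, the tie-breaking counter and the function names are illustrative choices, not part of the pseudocode.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_tree(text):
    """Build a Huffman tree: a leaf is a character, an internal node a (left, right) pair."""
    freq = Counter(text)            # O(n) frequency count
    tiebreak = count()              # avoids comparing trees when frequencies are equal
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Greedy choice: join the two trees of smallest total frequency.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))
    return heap[0][2]

def code_words(tree, prefix=""):
    """Read the code words off the tree: 0 for a left child, 1 for a right child."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}   # a one-character alphabet still needs one bit
    left, right = tree
    codes = code_words(left, prefix + "0")
    codes.update(code_words(right, prefix + "1"))
    return codes

def encoded_size(text):
    codes = code_words(huffman_tree(text))
    return sum(len(codes[ch]) for ch in text)
```

For X = abracadabra this yields an encoding of 23 bits. The exact code words depend on how ties between equal frequencies are broken, but the total size does not.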

Page 29

[Figure: nine activities a1 to a9 drawn as intervals on a time axis from 0 to 10]