COMP251: Greedy algorithms
Jérôme Waldispühl, School of Computer Science, McGill University
Based on (Cormen et al., 2002)
Based on slides from D. Plaisted (UNC) & (Goodrich & Tamassia, 2009)
Disjoint sets are represented with an array rep[] that stores the representative rep[i] of each element i. The running time of the function find(i), which returns the representative of the set containing i, is:
• Ω(1)
• O(log n)
• Θ(log n)
✓ (More interestingly, Θ(1))
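The array representation in this question can be sketched as follows. This is a minimal illustration, not the course's reference implementation: the class name `QuickFind` and the relabeling strategy used in `union` are assumptions made here. It shows why find(i) is a single array lookup.

```python
class QuickFind:
    """Disjoint sets stored as an array of representatives (hypothetical name)."""

    def __init__(self, n):
        # Initially every element is alone in its set and represents itself.
        self.rep = list(range(n))

    def find(self, i):
        # One array lookup, independent of n.
        return self.rep[i]

    def union(self, i, j):
        # Assumed strategy: relabel every element of i's set, linear in n.
        ri, rj = self.rep[i], self.rep[j]
        if ri == rj:
            return
        for k in range(len(self.rep)):
            if self.rep[k] == ri:
                self.rep[k] = rj
```

With this layout the cost moves from find to union, which has to scan the whole array.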
Let h(A) (resp. h(B)) be the height of the tree A (resp. B) rooted at x (resp. y). We assume that h(B) ≤ h(A) + 1. After union(x, y), which assertions are true?
• h(y) = h(A) + 1
• h(y) = max(h(A) + 1, h(B))
• h(y) = h(B)
• h(B) < h(y)
✓
✗
✓✗
Overview
• Algorithm design technique to solve optimization problems.
• Problems exhibit optimal substructure.
• Idea (the greedy choice):
– When we have a choice to make, make the one that looks best right now.
– Make a locally optimal choice in the hope of getting a globally optimal solution.
Greedy Strategy
The choice that seems best at the moment is the one we go with.
– Prove that when there is a choice to make, one of the optimal choices is the greedy choice. Therefore, it is always safe to make the greedy choice.
– Show that all but one of the sub-problems resulting from the greedy choice are empty.
Activity-selection Problem
• Input: Set S of n activities, a1, a2, …, an.
– si = start time of activity i.
– fi = finish time of activity i.
• Output: Subset A of maximum number of compatible activities.
– Two activities are compatible if their intervals do not overlap.
Example: Activities in each line are compatible.
[Figure: activities drawn as intervals on a time axis from 0 to 10.]
Activity-selection Problem
[Figure: activities a1 to a7 drawn as intervals from si to fi on a time axis from 0 to 10.]
i    1  2  3  4  5  6  7
si   0  1  2  4  5  6  8
fi   2  3  5  6  9  9  10
Activities sorted by finishing time.
Optimal compatible set: {a1, a3, a5}
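The claim about the optimal set can be checked by exhaustive search on the seven activities of the table. The sketch below assumes that an activity may start exactly when another one finishes (as a3 does after a1); the function names are illustrative only.

```python
from itertools import combinations

# Start and finish times from the table (a1..a7 stored at indices 0..6).
START = [0, 1, 2, 4, 5, 6, 8]
FINISH = [2, 3, 5, 6, 9, 9, 10]

def compatible(i, j):
    """True if activities i and j do not overlap.
    Assumption: touching endpoints (finish == start) count as compatible."""
    return FINISH[i] <= START[j] or FINISH[j] <= START[i]

def is_compatible_set(indices):
    # Every pair in the set must be compatible.
    return all(compatible(i, j) for i, j in combinations(indices, 2))

def max_compatible_size():
    """Size of the largest compatible subset, by trying all subsets (exponential)."""
    n = len(START)
    for size in range(n, 0, -1):
        if any(is_compatible_set(c) for c in combinations(range(n), size)):
            return size
    return 0
```

Here `is_compatible_set([0, 2, 4])` corresponds to {a1, a3, a5}, and the exhaustive search confirms that no compatible set of four activities exists for this input.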
Optimal Substructure
• Assume activities are sorted by finishing times.
• Suppose an optimal solution includes activity ak. This solution is obtained from:
– An optimal selection of activities from a1, …, ak-1 that are compatible with one another and that finish before ak starts.
– An optimal selection of activities from ak+1, …, an that are compatible with one another and that start after ak finishes.
Optimal Substructure
• Let Sij = subset of activities in S that start after ai finishes and finish before aj starts:
$S_{ij} = \{ a_k \in S : f_i \le s_k < f_k \le s_j \}$
• Aij = optimal solution to Sij
• Aij = Aik ∪ {ak} ∪ Akj
Recursive Solution
• Subproblems: Selecting the maximum number of mutually compatible activities from Sij.
• Let c[i, j] = size of a maximum-size subset of mutually compatible activities in Sij.
Recursive solution:
$$c[i,j] = \begin{cases} 0 & \text{if } S_{ij} = \emptyset \\ \max\limits_{i<k<j,\ a_k \in S_{ij}} \{\, c[i,k] + c[k,j] + 1 \,\} & \text{if } S_{ij} \neq \emptyset \end{cases}$$
Note: Here, we do not know which k to use for the optimal solution.
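The recurrence can be evaluated directly with memoization, trying every k because the best one is not known in advance. The sketch below uses the example table and adds two sentinel activities, which is an assumption of this illustration: a0 finishes at time 0 and a8 starts at infinity, so that S0,8 is the whole input.

```python
from functools import lru_cache
import math

# Example table with sentinels at index 0 and 8 (assumed for this sketch).
s = [0, 0, 1, 2, 4, 5, 6, 8, math.inf]
f = [0, 2, 3, 5, 6, 9, 9, 10, math.inf]

@lru_cache(maxsize=None)
def c(i, j):
    """Size of a maximum set of mutually compatible activities in S_ij."""
    best = 0  # stays 0 when S_ij is empty
    for k in range(i + 1, j):
        # a_k belongs to S_ij if it starts after a_i finishes
        # and finishes before a_j starts.
        if f[i] <= s[k] and f[k] <= s[j]:
            best = max(best, c(i, k) + c(k, j) + 1)
    return best
```

Calling `c(0, 8)` solves the full problem; the triple loop structure (all i, j, k) is what the greedy choice later removes.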
Greedy choice
Theorem: Let Sij ≠ ∅, and let am be the activity in Sij with the earliest finish time: fm = min{ fk : ak ∈ Sij }. Then:
1. am is used in some maximum-size subset of mutually compatible activities of Sij.
2. Sim = ∅, so that choosing am leaves Smj as the only nonempty subproblem.
Greedy choice
Proof: (1) am is used in some maximum-size subset of mutually compatible activities of Sij.
• Let Aij be a maximum-size subset of mutually compatible activities in Sij (i.e. an optimal solution of Sij).
• Order the activities in Aij in monotonically increasing order of finish time, and let ak be the first activity in Aij.
• If ak = am ⇒ done.
• Otherwise, let A'ij = (Aij - {ak}) ∪ {am}.
• A'ij is valid because am finishes no later than ak, so am does not overlap any other activity of Aij.
• Since |Aij| = |A'ij| and Aij is maximum-size ⇒ A'ij is maximum-size too.
Greedy choice
Proof: (2) Sim = ∅, so that choosing am leaves Smj as the only nonempty subproblem.
If there is ak ∈ Sim, then fi ≤ sk < fk ≤ sm < fm ⇒ fk < fm, which contradicts the hypothesis that am has the earliest finish time in Sij.
Greedy choice
                                     Before theorem            After theorem
# subproblems in optimal solution    2                         1
# choices to consider                j-i-1                     1
Solution                             Aij = Aik ∪ {ak} ∪ Akj    Aij = {am} ∪ Amj
We can now solve the problem Sij top-down:
• Choose am ∈ Sij with the earliest finish time (greedy choice).
• Solve Smj.
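Part 1 of the theorem can also be checked empirically on small random inputs: enumerate all maximum-size compatible subsets and verify that at least one of them contains the activity with the earliest finish time. This is a sanity check rather than a proof, and the random instance generator (up to 7 activities on the time range 0 to 10) is an assumption of this sketch.

```python
import random
from itertools import combinations

def compatible(subset):
    # Activities are (start, finish) pairs; touching endpoints are allowed.
    return all(a[1] <= b[0] or b[1] <= a[0] for a, b in combinations(subset, 2))

def maximum_subsets(acts):
    """All maximum-size compatible subsets, by exhaustive search (exponential)."""
    for r in range(len(acts), 0, -1):
        found = [sub for sub in combinations(acts, r) if compatible(sub)]
        if found:
            return found
    return []

def greedy_choice_is_safe(acts):
    """True if some optimal solution contains the earliest-finishing activity."""
    a_m = min(acts, key=lambda a: a[1])
    return any(a_m in sub for sub in maximum_subsets(acts))

def check(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        acts = []
        for _ in range(rng.randint(1, 7)):
            start = rng.randint(0, 9)
            acts.append((start, rng.randint(start + 1, 10)))
        if not greedy_choice_is_safe(acts):
            return False
    return True
```

A failing instance would be a counterexample to the theorem, so `check()` is expected to return True for any seed.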
Recursive Algorithm
Recursive-Activity-Selector(s, f, i, n)
1. m ← i + 1
2. while m ≤ n and sm < fi      // Find first activity in Si,n+1
3.     do m ← m + 1
4. if m ≤ n
5.     then return {am} ∪ Recursive-Activity-Selector(s, f, m, n)
6. else return ∅
Initial call: Recursive-Activity-Selector(s, f, 0, n), with a fictitious activity a0 such that f0 = 0.
Complexity: Θ(n)
Note 1: We assume activities are already ordered by finishing time.
Note 2: Straightforward to convert the algorithm to an iterative one.
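The pseudocode translates almost line by line into Python, and the iterative conversion mentioned in Note 2 is a single pass over the sorted activities. Both functions below are sketches: they return activity indices rather than sets, and they assume 1-based arrays whose index 0 holds a sentinel activity with f[0] = 0.

```python
def recursive_activity_selector(s, f, i, n):
    """Indices of a maximum compatible set among activities i+1..n.
    Assumes s and f are sorted by finish time, with a sentinel at index 0."""
    m = i + 1
    while m <= n and s[m] < f[i]:   # skip activities starting before a_i finishes
        m += 1
    if m <= n:
        return [m] + recursive_activity_selector(s, f, m, n)
    return []

def iterative_activity_selector(s, f):
    """Same selection without recursion: one pass, Theta(n) after sorting."""
    selected, last = [], 0          # last = index of the latest selected activity
    for m in range(1, len(s)):
        if s[m] >= f[last]:         # a_m starts after the last selected one finishes
            selected.append(m)
            last = m
    return selected

# Example table, with the sentinel a0 at index 0.
s = [0, 0, 1, 2, 4, 5, 6, 8]
f = [0, 2, 3, 5, 6, 9, 9, 10]
```

On the example table both `recursive_activity_selector(s, f, 0, 7)` and `iterative_activity_selector(s, f)` return `[1, 3, 5]`, that is {a1, a3, a5}.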
Typical Steps
• Cast the optimization problem as one in which we make a choice and are left with one subproblem to solve.
• Prove that there is always an optimal solution that makes the greedy choice (greedy choice is safe).
• Show that greedy choice and optimal solution to subproblem ⇒ optimal solution to the problem.
• Make the greedy choice and solve top-down.
• You may have to preprocess the input to put it into greedy order (e.g. sorting activities by finish time).
Elements of Greedy Algorithms
No general way to tell if a greedy algorithm is optimal, but two key ingredients are:
• Greedy-choice Property.
– A globally optimal solution can be arrived at by making a locally optimal (greedy) choice.
• Optimal Substructure.
Text Compression
• Given a string X, efficiently encode X into a smaller string Y
– Saves memory and/or bandwidth
• A good approach: Huffman encoding
– Compute frequency f(c) for each character c.
– Encode high-frequency characters with short code-words
– No code-word is a prefix of another code-word
– Use an optimal encoding tree to determine the code-words
Encoding Tree Example
• A code is a mapping of each character of an alphabet to a binary code-word
• A prefix code is a binary code such that no code-word is the prefix of another code-word
• An encoding tree represents a prefix code
– Each external node (leaf) stores a character
– The code-word of a character is given by the path from the root to the external node storing the character (0 for a left child and 1 for a right child)
[Figure: encoding tree with leaves a, b, c, d, e; edges labeled 0 (left child) and 1 (right child).]
character   a    b     c     d    e
code-word   00   010   011   10   11
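The correspondence between an encoding tree and its code-words can be sketched with nested pairs. The nesting used below is reconstructed from the code-word table (a = 00, b = 010, c = 011, d = 10, e = 11), not read off the figure, and the helper names are illustrative.

```python
# Internal node = (left, right) pair; leaf = a single character.
TREE = (("a", ("b", "c")), ("d", "e"))

def code_words(tree, prefix=""):
    """Map each leaf character to its root-to-leaf path (0 = left, 1 = right)."""
    if isinstance(tree, str):
        return {tree: prefix}
    left, right = tree
    codes = code_words(left, prefix + "0")
    codes.update(code_words(right, prefix + "1"))
    return codes

def is_prefix_code(codes):
    """True if no code-word is a prefix of a different code-word."""
    words = list(codes.values())
    return not any(u != v and v.startswith(u) for u in words for v in words)
```

Because characters only sit at leaves, any code read off such a tree passes `is_prefix_code`.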
Encoding Example
[Figure: the encoding tree with leaves a, b, c, d, e.]
Initial string: X = acda
Encoded string: Y = 00 011 10 00
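Encoding and decoding with this prefix code can be sketched as below. The code table is the one from the encoding tree example; the function names and the error handling for leftover bits are assumptions of this illustration.

```python
CODES = {"a": "00", "b": "010", "c": "011", "d": "10", "e": "11"}

def encode(text, codes=CODES):
    # Concatenate the code-word of each character.
    return "".join(codes[ch] for ch in text)

def decode(bits, codes=CODES):
    """Read bits left to right; with a prefix code the first match is the only one."""
    inverse = {word: ch for ch, word in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code-word")
    return "".join(out)
```

For example, `encode("acda")` returns `"000111000"`, the encoded string above without the spaces, and `decode` recovers `"acda"` from it without any separators.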
Encoding Tree Optimization
• Given a text string X, we want to find a prefix code for the characters of X that yields a small encoding for X
– Frequent characters should have short code-words
– Rare characters should have long code-words
• Example
– X = abracadabra
– T1 encodes X into 29 bits
– T2 encodes X into 24 bits
[Figure: encoding trees T1 and T2 over the characters a, b, c, d, r.]
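Example
X = abracadabra
Frequencies
character   a  b  c  d  r
frequency   5  2  1  1  2
[Figure: construction of the Huffman tree. c and d are joined into a tree of weight 2, b and r into a tree of weight 4, these two trees into a tree of weight 6, and finally a (weight 5) is joined with it to form the root of weight 11.]
Extended Huffman Tree Example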
Huffman's Algorithm
• Given a string X, Huffman's algorithm constructs a prefix code that minimizes the size of the encoding of X
• It runs in time O(n + d log d), where n is the size of X and d is the number of distinct characters of X
• A heap-based priority queue is used as an auxiliary structure
Algorithm HuffmanEncoding(X)
  Input: string X of size n
  Output: optimal encoding trie for X
  C ← distinctCharacters(X)
  computeFrequencies(C, X)
  Q ← new empty heap
  for all c ∈ C
    T ← new single-node tree storing c
    Q.insert(getFrequency(c), T)
  while Q.size() > 1
    f1 ← Q.minKey()
    T1 ← Q.removeMin()
    f2 ← Q.minKey()
    T2 ← Q.removeMin()
    T ← join(T1, T2)
    Q.insert(f1 + f2, T)
  return Q.removeMin()
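A possible Python rendering of the pseudocode uses the standard `heapq` module as the heap-based priority queue. This is a sketch under a few assumptions: the text is non-empty, trees are represented as nested pairs, and a counter is added to each heap entry as a tie-breaker so that two trees with equal frequency are never compared directly.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_tree(text):
    """Encoding tree for a non-empty text: leaf = character, internal node = (left, right)."""
    freq = Counter(text)                      # frequency f(c) of each character
    tie = count()                             # tie-breaker for equal frequencies
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)       # the two trees of smallest frequency
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))  # join(T1, T2)
    return heap[0][2]

def code_words(tree, prefix=""):
    """Root-to-leaf paths (0 = left, 1 = right) for every character."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}          # a one-character alphabet still needs a bit
    left, right = tree
    codes = code_words(left, prefix + "0")
    codes.update(code_words(right, prefix + "1"))
    return codes

def encoded_length(text):
    """Number of bits needed to encode text with its Huffman code."""
    codes = code_words(huffman_tree(text))
    return sum(len(codes[ch]) for ch in text)
```

For X = abracadabra, `encoded_length` returns 23 bits; the exact code-words may differ between runs of different implementations because ties can be broken in several ways, but the total length does not.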