Lecture 18 - Stanford University (web.stanford.edu/.../lecture18-compressed.pdf)
TRANSCRIPT
Announcements
• Final exam:
• 3:30–6:30pm, Wednesday 12/13, 320-105.
• You may bring two double-sided sheets of notes
• Format: similar to the midterm
• Resources
• Practice exams online
• Office hours
• Note the modified OH schedule
• Book, notes, slides, HW, Piazza
• (Not during the exam, but to study!)
• Wednesday during “class”: review session.
• Completely optional
• Please fill out the Piazza poll about what you want covered.
More announcements
• Course feedback now open!
• Fill it out on Axess!
• Your feedback is super-important!
• It will help us make the course better!
Figure 1: Feedback
Today
• What just happened?
• A whirlwind tour of CS161
• What’s next?
• A few gems from future algorithms classes
General approach to algorithm design and analysis
Algorithm designer: Can I do better?
To answer this question we need both rigor and intuition:
• Plucky the Pedantic Penguin: detail-oriented, precise, rigorous
• Lucky the Lackadaisical Lemur: big-picture, intuitive, hand-wavey
We needed more details
Here is an input!
Worst-case analysis
big-Oh notation
$T(n) = O(f(n)) \iff \exists\, c, n_0 > 0 \text{ s.t. } \forall n \ge n_0,\ 0 \le T(n) \le c \cdot f(n)$
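As a quick sanity check of the definition, here is a worked example. The function $T(n) = 3n^2 + 5n$ is an illustrative choice, not one taken from the slides.

```latex
T(n) = 3n^2 + 5n \;\le\; 3n^2 + n^2 \;=\; 4n^2 \quad \text{for all } n \ge 5,
\qquad \text{so } T(n) = O(n^2) \text{ with } c = 4,\ n_0 = 5.
```

The only step that needs checking is $5n \le n^2$, which holds exactly when $n \ge 5$.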
Does it work?
Is it fast?
What does that mean??
Algorithm design paradigm: divide and conquer
• Like MergeSort!
• Or Karatsuba’s algorithm!
• Or SELECT!
• How do we analyze these?
[Figure: a big problem is split into smaller problems, which are split into yet smaller problems.]
Plucky the Pedantic Penguin: By careful analysis!
Jedi master Yoda: Useful shortcut, the master method is.
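As a reminder of what the divide-and-conquer pattern looks like in code, here is a minimal MergeSort sketch. The function name and the choice to return a new list are assumptions made for illustration. Its running time satisfies T(n) = 2T(n/2) + O(n), which the master method resolves to O(n log n).

```python
def merge_sort(a):
    """Return a sorted copy of the list a."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # solve the two smaller problems
    right = merge_sort(a[mid:])
    merged = []
    i = j = 0
    # Merge the two sorted halves in linear time.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```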
While we’re on the topic of sorting
Why not use randomness?
• We analyzed QuickSort!
• Still worst-case input, but we use randomness after the input is chosen.
• Always correct, usually fast.
• This is a Las Vegas algorithm
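A minimal sketch of QuickSort with a random pivot, to make the "Las Vegas" point concrete: the output is always sorted, and only the running time depends on the coin flips. This is a simple out-of-place version written for readability; the `rng` parameter is a hypothetical convenience for reproducible runs.

```python
import random

def quicksort(a, rng=random):
    """Return a sorted copy of a, choosing each pivot uniformly at random."""
    if len(a) <= 1:
        return list(a)
    pivot = rng.choice(a)  # randomness is used after the input is fixed
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less, rng) + equal + quicksort(greater, rng)
```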
All this sorting is making me wonder…
Can we do better?
• Depends on who you ask:
• RadixSort takes time O(n) if the objects are, for example, small integers!
• Can’t do better than O(n log n) in a comparison-based model.
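For contrast with comparison-based sorting, here is a minimal least-significant-digit RadixSort sketch for non-negative integers. The function name and the default base are illustrative assumptions. Each pass is a stable bucketing by one digit, so the total time is O(d(n + base)) for d-digit inputs.

```python
def radix_sort(nums, base=10):
    """LSD radix sort for a list of non-negative integers."""
    if not nums:
        return []
    out = list(nums)
    largest = max(out)
    place = 1
    while place <= largest:
        # Stable bucketing by the current digit; no comparisons between items.
        buckets = [[] for _ in range(base)]
        for x in out:
            buckets[(x // place) % base].append(x)
        out = [x for bucket in buckets for x in bucket]
        place *= base
    return out
```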
Beyond sorted arrays / linked lists:
Binary Search Trees!
• Useful data structure!
• Especially the self-balancing ones!
[Figure: a red-black tree.]
Maintain balance by stipulating that black nodes are balanced, and that there aren’t too many red nodes.
It’s just good sense!
Another way to store things
Hash tables!
[Figure: a hash function h maps items into some buckets.]
Choose h randomly from a universal hash family.
It’s better if the hash family is small! Then it takes less space to store h.
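One standard small universal family is h(x) = ((a·x + b) mod p) mod n for a prime p. The sketch below assumes integer keys in the range [0, p); the names `draw_hash` and `build_table` are hypothetical. Storing h only requires storing the two numbers a and b, which is the sense in which a small family saves space.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; keys are assumed to lie in [0, P)

def draw_hash(num_buckets, rng=random):
    """Pick h(x) = ((a*x + b) mod P) mod num_buckets at random from the family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)

    def h(x):
        return ((a * x + b) % P) % num_buckets

    return h

def build_table(keys, num_buckets, rng=random):
    """Hash every key into a list of buckets (chaining)."""
    h = draw_hash(num_buckets, rng)
    table = [[] for _ in range(num_buckets)]
    for k in keys:
        table[h(k)].append(k)
    return h, table
```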
A fundamental graph problem: shortest paths
• E.g., transit planning, packet routing, …
• Dijkstra!
• Bellman-Ford!
• Floyd-Warshall!
Bellman-Ford and Floyd-Warshall were examples of… dynamic programming
• Not programming in an action movie. Instead, an algorithmic paradigm!
• Step 1: Identify optimal substructure.
• Step 2: Find a recursive formulation for the value of the optimal solution.
• Steps 3-5: Use dynamic programming: fill in a table to find the answer!
[Figure: a big problem broken into subproblems, which share overlapping sub-subproblems.]
We saw many other examples, including Longest Common Subsequence and Knapsack Problems.
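To make the table-filling idea concrete, here is a minimal Bellman-Ford sketch. It uses the common space-saving variant that updates a single distance array in place rather than keeping one table row per round; the function signature is an assumption made for illustration.

```python
def bellman_ford(num_vertices, edges, source):
    """edges is a list of (u, v, weight) triples over vertices 0..num_vertices-1.

    Returns the list of shortest-path distances from source, or None if a
    negative cycle is reachable from source.
    """
    inf = float("inf")
    dist = [inf] * num_vertices
    dist[source] = 0
    # After round i, dist[v] is at most the cost of the best path with <= i edges.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # One more pass: any further improvement means a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None
    return dist
```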
Sometimes we can take even better advantage of optimal substructure… with
Greedy algorithms
• Make a series of choices, and commit!
• Intuitively we want to show that our greedy choices never rule out success.
• Rigorously, we usually analyzed these by induction.
• Examples!
• Activity Selection
• Job Scheduling
• Huffman Coding
• Minimum Spanning Trees
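As one instance of "make a choice and commit", here is a minimal sketch of greedy Activity Selection: repeatedly take the activity that finishes earliest among those compatible with what has been chosen so far. The interval representation as (start, finish) pairs is an assumption for illustration.

```python
def select_activities(intervals):
    """Return a maximum-size list of pairwise non-overlapping (start, finish) intervals."""
    chosen = []
    last_finish = float("-inf")
    # Consider activities in order of finish time and never reconsider a choice.
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```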
Cuts and flows
• Global minimum cut:
• Karger’s algorithm!
• Minimum s-t cut:
• is the same as maximum s-t flow!
• Ford-Fulkerson can find them!
• useful for routing
• also assignment problems
Karger’s algorithm is a Monte Carlo algorithm: it is always fast but might be wrong.
[Figure: a flow network from s to t with edge capacities.]
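The relationship between s-t flows and s-t cuts can be sketched in code. The version below is the Edmonds-Karp variant of Ford-Fulkerson (augmenting along shortest paths found by BFS), and the dictionary-based graph representation is an assumption for illustration. It returns both the flow value and the source side of a minimum cut.

```python
from collections import deque

def max_flow_min_cut(capacity, s, t):
    """capacity maps directed edges (u, v) to capacities.

    Returns (max flow value, set of vertices on the s side of a min cut).
    """
    residual = {}
    adj = {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)  # reverse edge for undoing flow
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def reachable_from_s():
        """BFS in the residual graph; returns a parent map of reached vertices."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_s()
        if t not in parent:
            # No augmenting path: the reached vertices form the s side of a min cut.
            return flow, set(parent)
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck
```

On any input, summing the capacities of the edges that leave the returned vertex set gives the same number as the flow value, which is exactly the min-cut/max-flow statement.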
What have we learned?
• A few algorithm design paradigms:
• Divide and conquer, Dynamic Programming, Greedy
• A few analysis tools:
• Worst-case analysis, asymptotic analysis, recurrence relations, probability tricks, proofs by induction
• A few common objects:
• Graphs, arrays, trees, hash functions
• A LOT of examples!
What have we learned?
We’ve filled out a toolbox
• Tons of examples give us intuition about what algorithmic techniques might work when.
• The technical skills make sure our intuition works out.
A taste of what’s to come
• CS154 – Introduction to Complexity
• CS167 – Readings in Algorithms
• CS168 – The Modern Algorithmic Toolbox
• MS&E212 – Combinatorial Optimization
• CS250 – Error Correcting Codes
• CS254 – Computational Complexity
• CS255 – Introduction to Cryptography
• CS261 – A Second Course in Algorithms
• CS264 – Beyond Worst-Case Analysis
• CS265 – Randomized Algorithms
• CS269 – Incentives in Computer Science
• EE364A/B – Convex Optimization I and II
...and many many more upper-level topics courses!
findSomeTheoryCourses():
• go to theory.stanford.edu
• Click on “People”
• Look at what we’re teaching!
Today
A few gems
• Linear programming
• Random projections
• Low-degree polynomials
This will be pretty fluffy, without much detail – take more CS theory classes for more detail!
NOTHING AFTER THIS POINT WILL BE ON THE FINAL EXAM
Linear Programming
• This is a fancy name for optimizing a linear function subject to linear constraints.
• For example:
Maximize $x + y$
subject to $x \ge 0$, $y \ge 0$, $4x + y \le 2$, $x + 2y \le 1$.
• It turns out to be an extremely general problem.
Actually we just saw it on Monday
Maximize the sum of the flows leaving s
subject to
• None of the flows are bigger than the edge capacities
• At every vertex (other than s and t), stuff going in = stuff going out.
[Figure: the flow network from s to t with edge capacities.]
Another example, in machine learning
Support Vector Machines
Maximize the margin (this can be written in a convex way)
subject to all of the points are on the correct side of the line (these are just linear inequalities).
Technically quadratic programming, not linear programming…
Linear Programming
Has a really nice geometric intuition
Maximize $x + y$
subject to $x \ge 0$, $y \ge 0$, $4x + y \le 2$, $x + 2y \le 1$.
[Figure: the feasible region cut out by the constraints, built up one constraint at a time. $x + y$ is increasing in the direction shown, and the function is maximized at a vertex of the region.]
In general
• The constraints define a polytope
• The function defines a direction
• We just want to find the vertex that is furthest in that direction.
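For a two-variable LP like the running example, the "furthest vertex" picture can be checked by brute force: intersect every pair of constraint boundaries, keep the feasible intersection points, and take the best one. This is only a sketch for tiny bounded problems, not how real LP solvers work, and the `(a, b, c)` constraint encoding is an assumption for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint (a, b, c) means a*x + b*y <= c.
CONSTRAINTS = [
    (-1, 0, 0),  # x >= 0
    (0, -1, 0),  # y >= 0
    (4, 1, 2),   # 4x + y <= 2
    (1, 2, 1),   # x + 2y <= 1
]

def solve_2d_lp(constraints, objective):
    """Maximize objective[0]*x + objective[1]*y over a bounded 2D polytope.

    Returns (best value, (x, y)), or None if no feasible vertex exists.
    """
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines do not meet in a single point
        # Cramer's rule for the intersection of the two boundary lines.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, (x, y))
    return best
```

Calling `solve_2d_lp(CONSTRAINTS, (1, 1))` on the example returns the value 5/7, attained at the vertex (3/7, 2/7) where the lines 4x + y = 2 and x + 2y = 1 cross.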
Duality
How do we know we have an optimal solution?
Maximize $x + y$
subject to $x \ge 0$, $y \ge 0$, $4x + y \le 2$, $x + 2y \le 1$.
You can check this point has value 5/7... but how would we prove it’s optimal other than by eyeballing it?
I claim that the optimum is 5/7.
Proof: say x and y satisfy the constraints.
$$x + y = \tfrac{1}{7}(4x + y) + \tfrac{3}{7}(x + 2y) \le \tfrac{1}{7} \cdot 2 + \tfrac{3}{7} \cdot 1 = \tfrac{5}{7}$$
Cute, but how did you come up with 1/7, 3/7?
Proof: say x and y satisfy the constraints.
$$x + y \le \tfrac{1}{7}(4x + y) + \tfrac{3}{7}(x + 2y) \le \tfrac{1}{7} \cdot 2 + \tfrac{3}{7} \cdot 1 = \tfrac{5}{7}$$
• I want to choose things to put here (the multipliers 1/7 and 3/7)
• So that I minimize this (the resulting upper bound, 5/7)
• Subject to these things (both inequalities in the proof holding)
That’s a linear program!
• How did I find those special values 1/7, 3/7?
• I solved some linear program.
• It’s called the dual program.
Minimize the upper bound you get, subject to the proof working.
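For this particular example the dual can be written out explicitly. The multiplier names $a$ and $b$ are introduced here for illustration; the slides only show their optimal values 1/7 and 3/7.

```latex
\begin{aligned}
\text{Minimize}\quad   & 2a + b \\
\text{subject to}\quad & 4a + b \ge 1, \\
                       & a + 2b \ge 1, \\
                       & a \ge 0,\ b \ge 0.
\end{aligned}
\qquad
\text{For feasible } (x, y) \text{ and } (a, b):\quad
x + y \;\le\; (4a + b)\,x + (a + 2b)\,y \;=\; a(4x + y) + b(x + 2y) \;\le\; 2a + b.
```

At $a = 1/7$, $b = 3/7$ both dual constraints hold with equality and the bound is $2/7 + 3/7 = 5/7$, which matches the primal value at $(x, y) = (3/7, 2/7)$.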
Primal: Maximize stuff subject to stuff
Dual: Minimize other stuff subject to other stuff
The optimal values are the same!
We’ve actually already seen this too
The Min-Cut Max-Flow Theorem!
Primal: Maximize the sum of the flows leaving s, s.t. all the flow constraints are satisfied
Dual: Minimize the sum of the capacities on a cut, s.t. it’s a legit cut
The optimal values are the same!
[Figure: the flow network from s to t with edge capacities.]
LPs and Duality are really powerful
• This general phenomenon shows up all over the place
• Min-Cut Max-Flow is a special case.
• Duality helps us reason about an optimization problem
• The dual provides a certificate that we’ve solved the primal.
• E.g., if you have a cut and a flow with the same value, you must have found a max flow and a min cut.
• We can solve LPs quickly!
• For example, by intelligently bouncing around the vertices of the feasible region.
• This is an extremely powerful algorithmic primitive.
One of my favorite tricks
Take a random projection and hope for the best.
[Figure: a high-dimensional set of points is projected down to a low-dimensional set of points.]
For example, each data point is a vector (age, height, shoe size, …)
Why would we do this?
• High dimensional data takes a long time to process.
• Low dimensional data can be processed quickly.
• “THEOREM”: Random projections approximately preserve properties of data that you care about.
Example: nearest neighbors
• I want to find which point is closest to this one. That takes a really long time in high dimensions.
Johnson-Lindenstrauss Lemma: Euclidean distance is approximately preserved by random projections.
Find the closest point down here, you’re probably pretty correct.
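A small experiment can make the Johnson-Lindenstrauss statement tangible. The sketch below uses one common construction, a matrix of independent Gaussian entries scaled by 1/sqrt(k); the dimensions (500 down to 100), the number of points, and the seed are arbitrary choices for illustration.

```python
import math
import random

def random_projection(points, k, rng):
    """Map d-dimensional points to k dimensions using a scaled Gaussian matrix."""
    d = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix] for p in points]

def distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
points = [[rng.uniform(-1, 1) for _ in range(500)] for _ in range(8)]
projected = random_projection(points, 100, rng)

# Ratio of projected distance to original distance, for every pair of points.
ratios = [
    distance(projected[i], projected[j]) / distance(points[i], points[j])
    for i in range(8)
    for j in range(i + 1, 8)
]
```

The entries of `ratios` should all be close to 1, and they concentrate more tightly around 1 as the target dimension k grows.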
Another example: Compressed Sensing
• Start with a sparse vector
• Mostly zero or close to zero
• For example: (5,0,0,0,0,0.01,0.01,5.8,32,14,0,0,0,12,0,0,5,0,.03)
[Figure: two images.] This image is sparse. This image is sparse after I take a wavelet transform.
Compressed sensing continued
• Take a random projection of that sparse vector:
(Random short fat matrix) × (Long sparse vector) = (Short vector)
Goal: Given the short vector, recover the long sparse vector.
Why would I want to do that?
• Image compression and signal processing
• Especially when you never have space to store the whole sparse vector to begin with.
• Randomly sampling (in the time domain) a signal that is sparse in the Fourier domain.
• Random measurements in an fMRI means you spend less time inside an fMRI
• A “single pixel camera” is a thing.
All examples of this:
(Random short fat matrix) × (Long sparse vector) = (Short vector)
Goal: Given the short vector, recover the long sparse vector.
But why should this be possible?
• There are tons of long vectors that map to the short vector!
Back to the geometry
Theorem: random projections preserve the geometry of sparse vectors too.
[Figure: all of the sparse vectors (infinitely many of them) are mapped to their images by multiplying by the random short fat matrix.]
This means that, in theory, we can invert that arrow.
There may be tons of vectors that map to this point, but only one of them is sparse!
If we don’t care about algorithms, that’s more than enough.
How do we do this efficiently??
An efficient algorithm?
(Random short fat matrix A) × (Long sparse vector x) = (Short vector y)
Goal: Given the short vector, recover the long sparse vector.
What we’d like to do is:
Minimize the number of nonzero entries in $x$ s.t. $Ax = y$
Problem: I don’t know how to do that efficiently! This isn’t a nice function.
Instead:
Minimize $\|x\|_1$ s.t. $Ax = y$
This norm is the sum of the absolute values of the entries of x.
• It turns out that because the geometry of sparse vectors is preserved, this optimization problem gives the same answer.
• We can use linear programming to solve this quickly!
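The $\ell_1$ objective is not itself linear, so it is worth spelling out one standard way to turn the problem into an LP. The auxiliary variables $t_i$ are introduced here for illustration and do not appear on the slides.

```latex
\begin{aligned}
\text{Minimize}\quad   & \sum_{i} t_i \\
\text{subject to}\quad & Ax = y, \\
                       & -t_i \le x_i \le t_i \quad \text{for every } i.
\end{aligned}
```

Each constraint forces $t_i \ge |x_i|$, and minimizing the sum pushes every $t_i$ down to exactly $|x_i|$, so the optimal value equals the minimum of $\|x\|_1$ subject to $Ax = y$.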
Another of my favorite tricks
Polynomial interpolation
• Say we have a few evaluation points of a low-degree polynomial.
• We can recover the polynomial.
• 2 pts determine a line, 3 pts determine a parabola, etc.
• We can recover the whole polynomial really fast.
• It’s a divide-and-conquer algorithm
• Even works if some of the points are wrong.
[Figure: plot of f(x) passing through a few evaluation points.]
One application: Communication and Storage
• Alice wants to send a message to Bob
Alice’s message: “Hi, Bob!”
$$f(x) = H + I \cdot x + B \cdot x^2 + O \cdot x^3 + B \cdot x^4$$
[Figure: Alice sends evaluations of f(x) to Bob over a noisy channel.]
Bob can do super-fast polynomial interpolation and figure out what Alice meant to say!
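Here is a minimal sketch of the Alice-and-Bob idea using plain Lagrange interpolation over the rationals. It is deliberately simplified: it runs in quadratic time rather than using the fast divide-and-conquer method, and it only handles evaluations that are lost, not ones that arrive wrong. Encoding letters by their character codes is an assumption for illustration.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x^2 + ... by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def interpolate(points):
    """Return the coefficients of the unique polynomial of degree < len(points)
    passing through the given (x, y) points, via Lagrange interpolation."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        basis = [Fraction(1)]  # coefficients of the product of (x - xj) over j != i
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            new = [Fraction(0)] * (len(basis) + 1)
            for k, b in enumerate(basis):
                new[k] -= xj * b   # the -xj * b_k * x^k term
                new[k + 1] += b    # the b_k * x^(k+1) term
            basis = new
            denom *= xi - xj
        for k, b in enumerate(basis):
            coeffs[k] += yi * b / denom
    return coeffs

message = "HIBOB"
coeffs = [ord(ch) for ch in message]                  # Alice's degree-4 polynomial
sent = [(x, evaluate(coeffs, x)) for x in range(7)]   # 7 evaluations, 2 more than needed
received = [sent[0], sent[2], sent[3], sent[5], sent[6]]  # two evaluations were lost
decoded = "".join(chr(int(c)) for c in interpolate(received))
```

Any five of the seven evaluations determine the degree-4 polynomial, so `decoded` comes back as the original message even though two evaluations never arrived.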
Another application:
Designing “random” projections that are better than random
Random short fat matrix = the matrix that treats the big long vector as Alice’s message polynomial and evaluates it REAL FAST at random points.
• This is still “random enough” to make the LP solution work.
• It is much more efficient to manipulate and store!
Today
A few gems
• Linear programming (to learn more: CS168, CS261, …)
• Random projections (to learn more: CS168, CS261, CS264, CS265, …)
• Low-degree polynomials (to learn more: CS168, CS250, …)
To see more…
• Take more classes!
• Come hang out with the theory group!
• Theory lunch, Thursdays at noon
• Theory seminar, Thursdays at 4:15
• Join the theory-seminar mailing list for updates
Stanford theory group: We are very friendly.
theory.stanford.edu