
  • Computational Lower Bounds for Statistical Estimation Problems

    Ilias Diakonikolas (USC)

    (joint with Daniel Kane (UCSD) and Alistair Stewart (USC))

    Workshop on Local Algorithms, MIT, June 2018

  • THIS TALK

    General Technique for Statistical Query Lower Bounds: Leads to Tight Lower Bounds for a range of High-dimensional Estimation Tasks

    Concrete Applications of our Technique:

    • Learning Gaussian Mixture Models (GMMs)

    • Robustly Learning a Gaussian

    • Robustly Testing a Gaussian

    • Statistical-Computational Tradeoffs

  • STATISTICAL QUERIES [KEARNS’93]

    Unknown distribution D over a domain X; samples x_1, x_2, …, x_m ∼ D.

    SQ algorithm ⟷ STAT_D(τ) oracle: the algorithm asks bounded queries φ_1, φ_2, …, φ_q, each φ_i : X → [−1, 1]; the oracle replies with values v_1, v_2, …, v_q satisfying |v_i − E_{x∼D}[φ_i(x)]| ≤ τ.

    τ is the tolerance of the query; τ = 1/√m corresponds to m samples.

    Problem P ∈ SQCompl(q, m): there exists an SQ algorithm that solves P using q queries to STAT_D(τ = 1/√m).
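
    To make the interaction concrete, here is a minimal Python sketch (my own illustration, not from the talk) of a STAT_D(τ) oracle simulated from m samples; make_stat_oracle is a hypothetical helper, and answering with the τ-rounded empirical mean is just one legal way for the oracle to respond.

    import numpy as np

    def make_stat_oracle(samples, tau):
        """Simulated STAT_D(tau) oracle: answers a bounded query phi: X -> [-1, 1]
        with the empirical mean over the given samples, rounded to the nearest
        multiple of tau (so the answer carries no extra precision)."""
        def oracle(phi):
            vals = np.clip(phi(samples), -1.0, 1.0)   # enforce the [-1, 1] range
            empirical = vals.mean()
            return tau * np.round(empirical / tau)    # any value within tau is a legal answer
        return oracle

    # Example: D = N(0, 1) on X = R, with m samples and tolerance tau = 1/sqrt(m)
    rng = np.random.default_rng(0)
    m = 10_000
    samples = rng.standard_normal(m)
    stat = make_stat_oracle(samples, tau=1.0 / np.sqrt(m))

    # Query phi(x) = sign(x); its true expectation under N(0, 1) is 0
    print(stat(lambda x: np.sign(x)))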

  • POWER OF SQ ALGORITHMS

    Restricted Model: Hope to prove unconditional computational lower bounds.

    Powerful Model: Wide range of algorithmic techniques in ML are implementable using SQs*:

    • PAC Learning: AC0, decision trees, linear separators, boosting.

    • Unsupervised Learning: stochastic convex optimization, moment-based methods, k-means clustering, EM, … [Feldman-Grigorescu-Reyzin-Vempala-Xiao/JACM’17]

    Only known exception: Gaussian elimination over finite fields (e.g., learning parities).

    For all problems in this talk, strongest known algorithms are SQ.

  • METHODOLOGY FOR SQ LOWER BOUNDS

    Statistical Query Dimension:

    • Fixed-distribution PAC Learning [Blum-Furst-Jackson-Kearns-Mansour-Rudich’95; …]

    • General Statistical Problems [Feldman-Grigorescu-Reyzin-Vempala-Xiao’13, …, Feldman’16]

    Pairwise correlation between D1 and D2 with respect to D:
    χ_D(D1, D2) := ∫_X D1(x) D2(x) / D(x) dx − 1

    Fact: Suffices to construct a large set of distributions that are nearly uncorrelated.
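
    Over a finite domain the pairwise correlation can be computed exactly. The sketch below is an illustrative example (the 4-point domain and the particular D, D1, D2 are arbitrary choices, not from the talk): χ_D(D1, D1) recovers the χ²-divergence of D1 from D, while D1 and D2 form an uncorrelated pair with respect to D.

    import numpy as np

    def pairwise_correlation(D1, D2, D):
        """Pairwise correlation chi_D(D1, D2) = sum_x D1(x) * D2(x) / D(x) - 1,
        for distributions given as probability vectors over a finite domain."""
        D1, D2, D = map(np.asarray, (D1, D2, D))
        return np.sum(D1 * D2 / D) - 1.0

    # Example over a 4-point domain with reference distribution D = uniform
    D  = np.array([0.25, 0.25, 0.25, 0.25])
    D1 = np.array([0.40, 0.10, 0.40, 0.10])
    D2 = np.array([0.25, 0.40, 0.25, 0.10])
    print(pairwise_correlation(D1, D1, D))   # chi^2 divergence of D1 from D (positive)
    print(pairwise_correlation(D1, D2, D))   # an uncorrelated pair: value is 0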

  • THIS TALK

    General Technique for Statistical Query Lower Bounds: Leads to Tight Lower Bounds for a range of High-dimensional Estimation Tasks

    Concrete Applications of our Technique:

    • Learning Gaussian Mixture Models (GMMs)

    • Robustly Learning a Gaussian

    • Robustly Testing a Gaussian

    • Statistical-Computational Tradeoffs

  • GAUSSIAN MIXTURE MODEL (GMM)

    • GMM: Distribution on ℝ^n with probability density function F(x) = Σ_{i=1}^k w_i N(μ_i, Σ_i)(x), where the weights w_i ≥ 0 sum to 1.

    • Extensively studied in statistics and TCS

    Karl Pearson (1894)
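
    For concreteness, a k-GMM is easy to evaluate and sample. The sketch below is an illustration assuming SciPy is available; the two-component parameters are arbitrary.

    import numpy as np
    from scipy.stats import multivariate_normal

    def gmm_pdf(x, weights, means, covs):
        """Density of a k-GMM:  F(x) = sum_i w_i * N(mu_i, Sigma_i)(x)."""
        return sum(w * multivariate_normal(mu, cov).pdf(x)
                   for w, mu, cov in zip(weights, means, covs))

    def gmm_sample(rng, m, weights, means, covs):
        """Draw m samples: pick a component by weight, then draw from that Gaussian."""
        ks = rng.choice(len(weights), size=m, p=weights)
        return np.array([rng.multivariate_normal(means[k], covs[k]) for k in ks])

    rng = np.random.default_rng(0)
    weights = [0.5, 0.5]
    means   = [np.zeros(2), np.array([4.0, 0.0])]
    covs    = [np.eye(2), np.eye(2)]
    X = gmm_sample(rng, 1000, weights, means, covs)
    print(gmm_pdf(np.array([0.0, 0.0]), weights, means, covs))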

  • LEARNING GMMS - PRIOR WORK (I)

    Two Related Learning Problems

    Parameter Estimation: Recover model parameters.

    • Separation Assumptions: Clustering-based Techniques [Dasgupta’99, Dasgupta-Schulman’00, Arora-Kannan’01, Vempala-Wang’02, Achlioptas-McSherry’05, Brubaker-Vempala’08]

    Sample Complexity:
    (Best Known) Runtime:

    • No Separation: Moment Method [Kalai-Moitra-Valiant’10, Moitra-Valiant’10, Belkin-Sinha’10, Hardt-Price’15]

    Sample Complexity:
    (Best Known) Runtime:

  • SEPARATION ASSUMPTIONS

    • Clustering is possible only when the components have very little overlap.

    • Formally, we want the total variation distance between components to be close to 1.

    • Algorithms for learning spherical GMMs work under this assumption.

    • For non-spherical GMMs, known algorithms require stronger assumptions.
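
    The "total variation distance close to 1" criterion can be checked numerically for one-dimensional components. The sketch below (an illustration with arbitrary parameters) approximates the TV distance as half the integral of |p − q| and shows it approaching 1 as two unit-variance Gaussians separate.

    import numpy as np
    from scipy.stats import norm

    def tv_distance_1d(p, q, lo=-30.0, hi=30.0, grid=200_000):
        """Total variation distance between two 1-D densities: (1/2) * integral |p - q|,
        approximated by a Riemann sum on a fine grid."""
        xs, dx = np.linspace(lo, hi, grid, retstep=True)
        return 0.5 * np.sum(np.abs(p(xs) - q(xs))) * dx

    # Two unit-variance Gaussians: the overlap shrinks as the means separate,
    # and the TV distance approaches 1, which is the regime where clustering works.
    for delta in [0.5, 2.0, 6.0]:
        print(delta, tv_distance_1d(norm(0.0, 1.0).pdf, norm(delta, 1.0).pdf))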

  • LEARNING GMMS - PRIOR WORK (II)

    Density Estimation: Recover the underlying distribution (within statistical distance ε).

    [Feldman-O’Donnell-Servedio’05, Moitra-Valiant’10, Suresh-Orlitsky-Acharya-Jafarpour’14, Hardt-Price’15, Li-Schmidt’15]

    Sample Complexity:

    (Best Known) Runtime:

    Fact: For separated GMMs, density estimation and parameter estimation are equivalent.

  • LEARNING GMMS – OPEN QUESTION

    Summary: The sample complexity of density estimation for k-GMMs is polynomial in n, k, and 1/ε. The sample complexity of parameter estimation for separated k-GMMs is also polynomial.

    Question: Is there a poly(n, k, 1/ε)-time learning algorithm?

  • STATISTICAL QUERY LOWER BOUND FOR LEARNING GMMS

    Theorem: Suppose that k ≤ n^c, for a sufficiently small constant c > 0. Any SQ algorithm that learns separated k-GMMs over ℝ^n to constant error requires either:
    • SQ queries of accuracy n^{−Ω(k)},

    or • at least 2^{n^{Ω(1)}}

    many SQ queries.

    Take-away: Computational complexity of learning GMMs is inherently exponential in dimension of latent space.

  • GENERAL RECIPE FOR (SQ) LOWER BOUNDS

    Our generic technique for proving SQ Lower Bounds:

    Step #1: Construct a distribution that is standard Gaussian in all directions except a hidden direction v.

    Step #2: Construct the univariate projection A in the direction v so that it matches the first m moments of N(0, 1).

    Step #3: Consider the family of instances {P_v^A : v a unit vector}.

  • HIDDEN DIRECTION DISTRIBUTION

    Definition: For a unit vector v and a univariate distribution with density A, consider the high-dimensional distribution P_v^A whose projection onto v is distributed as A and whose orthogonal complement is an independent standard Gaussian: P_v^A(x) = A(⟨v, x⟩) · φ_{n−1}(x − ⟨v, x⟩ v).

    Example: If A is a univariate mixture of separated Gaussians, P_v^A looks like "parallel pancakes" orthogonal to the hidden direction v.
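
    Sampling from P_v^A makes the definition concrete: draw the coordinate along v from A and fill the orthogonal complement with independent standard Gaussian noise. This is my own sketch; the choice of A as a two-component mixture (giving the "parallel pancakes" picture) is arbitrary.

    import numpy as np

    def sample_hidden_direction(rng, m, v, sample_A):
        """Sample from P_v^A: the projection onto v is distributed as A,
        and the orthogonal complement is an independent standard Gaussian."""
        v = np.asarray(v, dtype=float)
        v = v / np.linalg.norm(v)
        n = v.size
        z = rng.standard_normal((m, n))            # standard Gaussian in R^n
        z_perp = z - np.outer(z @ v, v)            # remove the component along v
        a = sample_A(rng, m)                       # 1-D samples from A
        return z_perp + np.outer(a, v)

    # Example: A is the (1/2, 1/2) mixture of N(-2, 1) and N(2, 1)
    def sample_A(rng, m):
        return rng.choice([-2.0, 2.0], size=m) + rng.standard_normal(m)

    rng = np.random.default_rng(0)
    v = np.array([1.0, 0.0, 0.0])
    X = sample_hidden_direction(rng, 5000, v, sample_A)
    print(X.mean(axis=0), X.var(axis=0))   # variance about 5 along v, about 1 elsewhere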

  • GENERIC SQ LOWER BOUND

    Definition: For a unit vector v and a univariate distribution with density A, consider the high-dimensional distribution P_v^A that is distributed as A in the direction v and as an independent standard Gaussian in the orthogonal complement.

    Proposition: Suppose that:
    • A matches the first m moments of N(0, 1).
    • The correlation χ_{N(0,I)}(P_v^A, P_{v′}^A) is small as long as v, v′ are nearly orthogonal.

    Then any SQ algorithm that learns an unknown P_v^A within small error requires either queries of accuracy n^{−Ω(m)} or 2^{n^{Ω(1)}} many queries.

  • WHY IS FINDING A HIDDEN DIRECTION HARD?

    Observation: Low-Degree Moments do not help.

    • A matches the first m moments of N(0, 1).
    • The first m moments of P_v^A are identical to those of N(0, I).
    • The degree-(m+1) moment tensor has n^{m+1} entries.

    Claim: Random projections do not help.

    • To distinguish between P_v^A and N(0, I), one would need exponentially many random projections.

  • ONE-DIMENSIONAL PROJECTIONS ARE ALMOST GAUSSIAN

    Key Lemma: Let Q be the distribution of ⟨v′, x⟩, where x ∼ P_v^A. Then, we have that χ²(Q, N(0, 1)) ≤ ⟨v, v′⟩^{2(m+1)} · χ²(A, N(0, 1)): the projection onto any direction v′ nearly orthogonal to v is almost standard Gaussian.
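
    A quick empirical illustration of the Key Lemma (my own sketch, with an A that matches no extra moments): the projection onto the hidden direction is visibly bimodal, while the projection onto a random, nearly orthogonal direction is hard to distinguish from N(0, 1), here probed with a Kolmogorov-Smirnov test.

    import numpy as np
    from scipy.stats import kstest

    rng = np.random.default_rng(1)
    n, m = 500, 20_000

    # P_v^A with hidden direction v = e_1 and A = (1/2) N(-2, 1) + (1/2) N(2, 1)
    X = rng.standard_normal((m, n))
    X[:, 0] = rng.choice([-2.0, 2.0], size=m) + rng.standard_normal(m)

    # Projection onto the hidden direction: clearly non-Gaussian (bimodal), the KS test rejects
    print(kstest(X[:, 0], "norm"))

    # Projection onto a random direction u (|<u, e_1>| is roughly 1/sqrt(n)):
    # the projected samples should look very close to N(0, 1)
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    print(kstest(X @ u, "norm"))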

  • PROOF OF KEY LEMMA (I)

    Write ρ = ⟨v, v′⟩. Then Q, the distribution of ⟨v′, x⟩ for x ∼ P_v^A, is the distribution of ρ·a + √(1 − ρ²)·z, where a ∼ A and z ∼ N(0, 1) are independent.

  • PROOF OF KEY LEMMA (II)

    Equivalently, Q/φ = U_ρ(A/φ), where φ is the N(0, 1) density and U_ρ is the operator over functions on ℝ known as the

    Gaussian Noise (Ornstein-Uhlenbeck) Operator

  • EIGENFUNCTIONS OF ORNSTEIN-UHLENBECK OPERATOR

    Linear Operator U_ρ acting on functions f : ℝ → ℝ:  (U_ρ f)(x) = E_{z∼N(0,1)}[ f(ρx + √(1 − ρ²) z) ].

    Fact (Mehler’66): U_ρ H_i = ρ^i H_i

    • H_i denotes the degree-i (normalized) Hermite polynomial.
    • Note that the H_i are orthonormal with respect to the inner product ⟨f, g⟩ = E_{x∼N(0,1)}[f(x) g(x)].
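
    The eigenrelation can be verified numerically. The sketch below (an illustration, not from the talk) estimates (U_ρ He_i)(x) by Monte Carlo for the probabilists' Hermite polynomial He_i and compares it with ρ^i · He_i(x); the relation holds for the unnormalized He_i as well, which is what NumPy evaluates.

    import numpy as np
    from numpy.polynomial.hermite_e import hermeval

    def noise_operator(f, rho, x, rng, m=200_000):
        """Monte Carlo estimate of the Gaussian noise (Ornstein-Uhlenbeck) operator:
        (U_rho f)(x) = E_{z ~ N(0,1)}[ f(rho * x + sqrt(1 - rho^2) * z) ]."""
        z = rng.standard_normal(m)
        return f(rho * x + np.sqrt(1.0 - rho**2) * z).mean()

    rng = np.random.default_rng(0)
    rho, x, i = 0.6, 1.3, 3

    # Probabilists' Hermite polynomial He_i, evaluated via its coefficient vector
    He_i = lambda t: hermeval(t, [0.0] * i + [1.0])

    lhs = noise_operator(He_i, rho, x, rng)   # (U_rho He_i)(x), estimated by sampling
    rhs = rho**i * He_i(x)                    # eigenrelation: U_rho He_i = rho^i He_i
    print(lhs, rhs)                           # the two values agree up to Monte Carlo error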

  • GENERIC SQ LOWER BOUND

    Definition: For a unit vector v and a univariate distribution with density A, consider the high-dimensional distribution

    Proposition: Suppose that: • A matches the first m moments of • We have as long as v, v’ are nearly

    orthogonal.

    Then any SQ algorithm that learns an unknown within error requires either queries of accuracy or many queries.

  • PROOF OF GENERIC SQ LOWER BOUND

    • Suffices to construct a large set of distributions that are nearly uncorrelated.
    • Pairwise correlation between D1 and D2 with respect to D:
      χ_D(D1, D2) := ∫_X D1(x) D2(x) / D(x) dx − 1

    Two Main Ingredients:

    Correlation Lemma: |χ_{N(0,I)}(P_v^A, P_{v′}^A)| ≤ |⟨v, v′⟩|^{m+1} · χ²(A, N(0, 1)), which is small whenever v, v′ are nearly orthogonal.

    Packing Argument: There exists an exponentially large set S of unit vectors on the sphere in ℝ^n whose pairwise inner products are all small in magnitude.
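
    The packing argument can be seen experimentally: even a large collection of independent random unit vectors in ℝ^n is nearly pairwise orthogonal. The sketch below (with arbitrary n and N) compares the largest pairwise |inner product| against the rough prediction 2·sqrt(ln N / n).

    import numpy as np

    rng = np.random.default_rng(0)
    n, N = 200, 2000            # ambient dimension and number of random unit vectors

    V = rng.standard_normal((N, n))
    V /= np.linalg.norm(V, axis=1, keepdims=True)   # random points on the unit sphere

    G = np.abs(V @ V.T)                             # pairwise |inner products|
    np.fill_diagonal(G, 0.0)

    # Even with N = 2000 vectors, every pairwise inner product is o(1);
    # the maximum concentrates around roughly 2 * sqrt(ln N / n).
    print(G.max(), 2 * np.sqrt(np.log(N) / n))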

  • APPLICATION: SQ LOWER BOUND FOR GMMS (I)

    Theorem: Any SQ algorithm that learns separated k-GMMs over ℝ^n to constant error requires either SQ queries of accuracy n^{−Ω(k)} or at least 2^{n^{Ω(1)}} many SQ queries.

    Want to show: the theorem above, by using our generic proposition:

    Proposition: Suppose that:
    • A matches the first m moments of N(0, 1).
    • The correlation χ_{N(0,I)}(P_v^A, P_{v′}^A) is small as long as v, v′ are nearly orthogonal.

    Then any SQ algorithm that learns an unknown P_v^A within small error requires either queries of accuracy n^{−Ω(m)} or 2^{n^{Ω(1)}} many queries.

  • APPLICATION: SQ LOWER BOUND FOR GMMS (II)

    Lemma: There exists a univariate distribution A that is a k-GMM with components Ai such that:
    • A agrees with N(0, 1) on the first 2k−1 moments.
    • Each pair of components is separated.
    • χ_{N(0,I)}(P_v^A, P_{v′}^A) is small whenever v and v′ are nearly orthogonal.
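
    Moment matching against N(0, 1) can be checked in closed form, since the raw moments of a univariate GMM are available analytically. The sketch below uses a hypothetical symmetric 2-GMM, not the construction from the lemma: by symmetry and unit total variance it agrees with N(0, 1) on the first 2k−1 = 3 moments and then deviates at degree 4.

    from math import comb

    def gaussian_moment(j):
        """E[Z^j] for Z ~ N(0, 1): (j-1)!! for even j (with E[Z^0] = 1), 0 for odd j."""
        if j % 2 == 1:
            return 0
        m = 1
        for r in range(j - 1, 0, -2):
            m *= r
        return m

    def mixture_moment(j, weights, mus, sigmas):
        """j-th raw moment of the univariate GMM  sum_i w_i N(mu_i, sigma_i^2)."""
        return sum(w * sum(comb(j, r) * mu ** (j - r) * s ** r * gaussian_moment(r)
                           for r in range(j + 1))
                   for w, mu, s in zip(weights, mus, sigmas))

    # Hypothetical symmetric 2-GMM with unit total variance (0.8^2 + 0.6^2 = 1)
    weights, mus, sigmas = [0.5, 0.5], [-0.8, 0.8], [0.6, 0.6]
    for j in range(1, 6):
        # mixture moment vs. Gaussian moment: they agree for j = 1, 2, 3 and differ at j = 4
        print(j, mixture_moment(j, weights, mus, sigmas), gaussian_moment(j))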

  • APPLICATION: SQ LOWER BOUND FOR GMMS (III)

    High-Dimensional Distributions look like "parallel pancakes":

    Efficiently learnable for k=2. [Brubaker-Vempala’08]

  • FURTHER RESULTS

    SQ Lower Bounds:
    • Learning GMMs
    • Robustly Learning a Gaussian
      "The error guarantees of [DKK+16] are optimal for all polynomial-time algorithms."
    • Robust Covariance Estimation in Spectral Norm:
      "Any efficient SQ algorithm requires … samples."
    • Robust k-Sparse Mean Estimation:
      "Any efficient SQ algorithm requires … samples."

    Sample Complexity Lower Bounds:
    • Robust Gaussian Mean Testing
    • Testing Spheri…
