Unified Maximum Likelihood Estimation of Symmetric Distribution Properties Jayadev Acharya Hirakendu Das Alon Orlitsky Ananda Suresh Cornell Yahoo UCSD Google Frontiers in Distribution Testing Workshop FOCS 2017


Page 1

Unified Maximum Likelihood Estimation of Symmetric Distribution Properties

Jayadev Acharya (Cornell), Hirakendu Das (Yahoo), Alon Orlitsky (UCSD), Ananda Suresh (Google)

Frontiers in Distribution Testing Workshop, FOCS 2017

Page 2

Special thanks to…

I don’t smile    I only smile

Page 3

Symmetric properties

$\mathcal{P} = \{(p_1, \ldots, p_k)\}$: collection of distributions over $\{1, \ldots, k\}$

Distribution property: $f: \mathcal{P} \to \mathbb{R}$

$f$ is symmetric if unchanged under input permutations

Entropy: $H(p) \triangleq \sum_i p_i \log \frac{1}{p_i}$

Support size: $S(p) \triangleq \sum_i \mathbb{I}\{p_i > 0\}$

Rényi entropy, support coverage, distance to uniformity, …

Determined by the probability multiset $\{p_1, p_2, \ldots, p_k\}$

Page 4

Symmetric properties

$p = (p_1, \ldots, p_k)$ ($k$ finite or infinite): discrete distribution

$f(p)$, a property of $p$

$f(p)$ is symmetric if unchanged under input permutations

Entropy: $H(p) \triangleq \sum_i p_i \log \frac{1}{p_i}$

Support size: $S(p) \triangleq \sum_i \mathbb{I}\{p_i > 0\}$

Rényi entropy, support coverage, distance to uniformity, …

Determined by the probability multiset $\{p_1, p_2, \ldots, p_k\}$
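A minimal Python sketch of the two example properties, to make "symmetric" concrete: relabeling the symbols (shuffling the probability vector) leaves both values unchanged. The function names and the example distribution are illustrative and not from the talk; entropy is computed in nats.

```python
import math
import random

def entropy(p):
    """Shannon entropy in nats: sum of p_i * log(1/p_i) over p_i > 0."""
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

def support_size(p):
    """Number of symbols with strictly positive probability."""
    return sum(1 for pi in p if pi > 0)

p = [0.5, 0.25, 0.25, 0.0]          # illustrative distribution
shuffled = p[:]
random.Random(0).shuffle(shuffled)  # relabel the symbols

print(entropy(p), entropy(shuffled))
print(support_size(p), support_size(shuffled))
```

Both properties depend only on the multiset of probabilities, which is why the shuffled copy gives the same answers.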

Page 5

Property estimation

$p$: unknown distribution in $\mathcal{P}$

Given independent samples $X_1, X_2, \ldots, X_n \sim p$

Estimate $f(p)$

Sample complexity $S(f, k, \varepsilon, \delta)$

Minimum $n$ necessary to

Estimate $f(p) \pm \varepsilon$

With error probability $< \delta$

Page 6

Plug-in estimation

Use $X_1, X_2, \ldots, X_n \sim p$ to find an estimate $\hat{p}$ of $p$

Estimate $f(p)$ by $f(\hat{p})$

How to estimate $p$?

Page 7

Sequence Maximum Likelihood (SML)

$p^{\mathrm{sml}} = \arg\max_p p(x^n) = \arg\max_p \prod_i p(x_i)$

$x^3 = h, h, t$: $p^{\mathrm{sml}}_{h,h,t} = \arg\max_p p^2(h) \cdot p(t)$

$p^{\mathrm{sml}}_{h,h,t}(h) = 2/3$, $p^{\mathrm{sml}}_{h,h,t}(t) = 1/3$

Same as empirical-frequency distribution

Multiplicity $N_x$: # times $x$ appears in $x^n$

$p^{\mathrm{sml}}(x) = \frac{N_x}{n}$
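As a small illustration, the SML distribution is just the table of empirical frequencies $N_x / n$. The sketch below (function name is an assumption, not from the talk) reproduces the $h, h, t$ example with exact fractions.

```python
from collections import Counter
from fractions import Fraction

def sml_distribution(sample):
    """Empirical-frequency (sequence maximum likelihood) distribution N_x / n."""
    counts = Counter(sample)  # multiplicities N_x
    n = len(sample)
    return {x: Fraction(c, n) for x, c in counts.items()}

p_sml = sml_distribution(["h", "h", "t"])
print(p_sml)  # h -> 2/3, t -> 1/3
```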

Page 8

Prior Work

Different estimator for each property

Use sophisticated approximation theory results

A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions

| Property | Notation | SML | Optimal |
| Entropy | $H(p)$ | $\frac{k}{\varepsilon}$ | ? |
| Support size | $\frac{S(p)}{k}$ | $k \log \frac{1}{\varepsilon}$ | ? |
| Support coverage | $\frac{S_m(p)}{m}$ | $m$ | ? |
| Distance to $u$ | $\|p - u\|_1$ | $\frac{k}{\varepsilon^2}$ | ? |

Table 2. Estimation complexity for various properties, up to a constant factor. For all properties shown, PML achieves the best known results. Citations are for specialized techniques, PML results are shown in this paper. Support and support coverage results have been normalized for consistency with existing literature.

| Property | Notation | SML | Optimal | References |
| Entropy | $H(p)$ | $\frac{k}{\varepsilon}$ | $\frac{k}{\log k} \cdot \frac{1}{\varepsilon}$ | (Valiant & Valiant, 2011a; Wu & Yang, 2016; Jiao et al., 2015) |
| Support size | $\frac{S(p)}{k}$ | $k \log \frac{1}{\varepsilon}$ | $\frac{k}{\log k} \log^2 \frac{1}{\varepsilon}$ | (Wu & Yang, 2015) |
| Support coverage | $\frac{S_m(p)}{m}$ | $m$ | $\frac{m}{\log m} \log \frac{1}{\varepsilon}$ | (Orlitsky et al., 2016) |
| Distance to $u$ | $\|p - u\|_1$ | $\frac{k}{\varepsilon^2}$ | $\frac{k}{\log k} \cdot \frac{1}{\varepsilon^2}$ | (Valiant & Valiant, 2011b; Jiao et al., 2016) |

Table 3. Estimation complexity for various properties, up to a constant factor. For all properties shown, PML achieves the best known results. Citations are for specialized techniques, PML results are shown in this paper. Support and support coverage results have been normalized for consistency with existing literature.

For several important properties:

Empirical-frequency plug-in requires $\Theta(k)$ samples

New complex (non-plug-in) estimators need $\Theta\left(\frac{k}{\log k}\right)$ samples

Page 9

Entropy estimation

SML estimate of entropy $= \sum_x \frac{N_x}{n} \log \frac{n}{N_x}$

Sample complexity: $\Theta(k/\varepsilon)$

Various corrections proposed: Miller-Maddow, jackknifed estimator, coverage adjusted, …

Sample complexity: $\Omega(k)$ for all the above estimators
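A short sketch of the SML (plug-in) entropy estimate, together with the first of the listed corrections. The correction used here is the commonly stated Miller-Madow form, which adds (number of observed symbols − 1) / (2n) to the plug-in value in nats; it is an assumption that this is the exact variant the slide refers to. Function names are illustrative.

```python
import math
from collections import Counter

def plugin_entropy(sample):
    """SML estimate: sum over observed x of (N_x / n) * log(n / N_x), in nats."""
    n = len(sample)
    return sum((c / n) * math.log(n / c) for c in Counter(sample).values())

def miller_madow_entropy(sample):
    """Plug-in estimate plus the first-order bias correction (m - 1) / (2n),
    where m is the number of distinct symbols observed (assumed variant)."""
    n = len(sample)
    observed = len(set(sample))
    return plugin_entropy(sample) + (observed - 1) / (2 * n)

sample = ["h", "h", "t"]
print(plugin_entropy(sample), miller_madow_entropy(sample))
```

The correction only shifts the plug-in value by a term depending on the number of observed symbols, which is consistent with the slide's point that such fixes still need $\Omega(k)$ samples.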

Page 10

Entropy estimation

[Paninski ’03]: $o(k)$ sample complexity (existential)

[Valiant Valiant ’11a]: Constructive LP-based methods: $\Theta_\varepsilon\left(\frac{k}{\log k}\right)$

[Valiant Valiant ’11b, Wu Yang ’14, Han Jiao Venkat Weissman ’14]: Simplified algorithms, and growth rate: $\Theta\left(\frac{k}{\varepsilon \log k}\right)$

Page 11

New (as of August) Results

Unified, simple, sample-optimal approach for all above problems

Plug-in estimator, replace sequence maximum likelihood with profile maximum likelihood

Page 12

Profiles

$h, h, t$ or $h, t, h$ or $t, h, t$ ➞ same estimate

One element appeared once, one appeared twice

Profile: multiset of multiplicities: $\Phi(X_1^n) = \{N_x : x \in X_1^n\}$

$\Phi(h, h, t) = \Phi(t, h, t) = \{1, 2\}$

$\Phi(\alpha, \gamma, \beta, \gamma) = \{1, 1, 2\}$

Sufficient statistic for symmetric properties
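A minimal sketch of the profile computation: count each symbol, then keep only the multiset of counts. The multiset is represented here as a sorted tuple, which is an implementation choice for illustration rather than anything specified in the talk.

```python
from collections import Counter

def profile(sample):
    """Profile of a sequence: the multiset of multiplicities N_x,
    represented as a sorted tuple so that equal profiles compare equal."""
    return tuple(sorted(Counter(sample).values()))

print(profile("hht"))                                # (1, 2)
print(profile("tht"))                                # (1, 2)
print(profile(["alpha", "gamma", "beta", "gamma"]))  # (1, 1, 2)
```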

Page 13

Profile maximum likelihood [+SVZ ’04]

Profile probability

$p(\Phi) = \sum_{x_1^n : \Phi(x_1^n) = \Phi} p(x_1^n)$

Maximize the profile probability

$p_\Phi^{\mathrm{pml}} = \arg\max_p p(\Phi(X_1^n))$

See “On estimating the probability multiset”, Orlitsky, Santhanam, Viswanathan, Zhang for a detailed treatment, and an argument for competitive distribution estimation.

Page 14

Profile maximum likelihood (PML) [+SVZ ’04]

Profile probability

$p(\Phi) = \sum_{x^n : \Phi(x^n) = \Phi} p(x^n)$

Distribution maximizing the profile probability

$p_\Phi^{\mathrm{pml}} = \arg\max_p p(\Phi)$

PML competitive for distribution estimation
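The profile probability can be evaluated literally from its definition for tiny examples: enumerate every length-$n$ sequence over the support, keep those with the target profile, and add up their probabilities. The brute-force sketch below is only meant to illustrate the definition (it is exponential in $n$); the function names and the two test distributions are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def profile(seq):
    """Multiset of multiplicities as a sorted tuple."""
    return tuple(sorted(Counter(seq).values()))

def profile_probability(p, target, n):
    """Sum of p(x^n) over all length-n sequences whose profile equals target.
    p maps each symbol to its probability (exact Fractions)."""
    total = Fraction(0)
    for seq in product(p, repeat=n):
        if profile(seq) == target:
            prob = Fraction(1)
            for x in seq:
                prob *= p[x]
            total += prob
    return total

empirical = {"h": Fraction(2, 3), "t": Fraction(1, 3)}
uniform = {"h": Fraction(1, 2), "t": Fraction(1, 2)}
print(profile_probability(empirical, (1, 2), 3))  # 2/3
print(profile_probability(uniform, (1, 2), 3))    # 3/4
```

For the profile $\{1, 2\}$, the uniform two-symbol distribution assigns a higher profile probability than the empirical distribution does.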

Page 15

PML example

$X^3 = h, h, t$

$p^{\mathrm{sml}}_{h,h,t}(h) = 2/3$, $p^{\mathrm{sml}}_{h,h,t}(t) = 1/3$

$\Phi(h, h, t) = \{1, 2\}$

$p(\Phi = \{1, 2\}) = p(s, s, d) + p(s, d, s) + p(d, s, s) = 3\, p(s, s, d) = \binom{3}{1} \sum_{x \ne y} p^2(x)\, p(y)$

$p^{\mathrm{sml}}(\{1, 2\}) = \binom{3}{1} \left[ \left(\frac{2}{3}\right)^2 \frac{1}{3} + \left(\frac{1}{3}\right)^2 \frac{2}{3} \right] = \frac{18}{27} = \frac{2}{3}$

Page 16

PML of $\{1, 2\}$

$p(\{1, 2\}) = p(s, s, d) + p(s, d, s) + p(d, s, s) = 3\, p(s, s, d)$

$p(s, s, d) = \sum_{x \ne y} p^2(x)\, p(y) = \sum_x p^2(x) (1 - p(x)) \le \frac{1}{4} \sum_x p(x) = \frac{1}{4}$

$p^{\mathrm{pml}}(s, s, d) = \frac{1}{4}$

$(1/2, 1/2)$ ➞ $p(s, s, d) = 1/8 + 1/8 = 1/4$

$p^{\mathrm{pml}}(\{1, 2\}) = \frac{3}{4}$. Recall: $p^{\mathrm{sml}}(\{1, 2\}) = 2/3$
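The bound above can be checked numerically. The sketch below uses the closed form $3 \sum_x p^2(x)(1 - p(x))$ for the probability of profile $\{1, 2\}$ and searches a grid of two-symbol distributions $(q, 1 - q)$. Restricting to two symbols and to a grid of step 1/100 are simplifying assumptions made for illustration; the inequality on the slide is what covers arbitrary distributions.

```python
from fractions import Fraction

def prob_profile_12(p):
    """Probability of profile {1, 2} for n = 3: 3 * sum_x p(x)^2 * (1 - p(x))."""
    return 3 * sum(px * px * (1 - px) for px in p)

# Grid search over two-symbol distributions (q, 1 - q).
grid = [Fraction(i, 100) for i in range(101)]
best_q = max(grid, key=lambda q: prob_profile_12([q, 1 - q]))
best_value = prob_profile_12([best_q, 1 - best_q])

print(best_q, best_value)                                # 1/2 3/4
print(prob_profile_12([Fraction(2, 3), Fraction(1, 3)]))  # 2/3
```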

Page 17

PML($\{1, 1, 2\}$)

$\Phi(\alpha, \gamma, \beta, \gamma) = \{1, 1, 2\}$

$p^{\mathrm{pml}}_{\{1,1,2\}} = U[5]$

PML can predict existence of new symbols

Page 18

Profile maximum likelihood

PML of $\{1, 2\}$ is $\{\frac{1}{2}, \frac{1}{2}\}$

$p^{\mathrm{pml}}(\{1, 2\}) = \binom{3}{1} \left[ \left(\frac{1}{2}\right)^2 \frac{1}{2} + \left(\frac{1}{2}\right)^2 \frac{1}{2} \right] = \frac{3}{4} > \frac{18}{27}$

$\sum_{x \ne y} p^2(x)\, p(y) = \sum_x p^2(x) (1 - p(x)) \le \frac{1}{4} \sum_x p(x) = \frac{1}{4}$

$X^n = \alpha, \gamma, \beta, \gamma$, $\Phi(X^n) = \{1, 1, 2\}$

$p^{\mathrm{pml}}_{\{1,1,2\}} = U[5]$

PML can predict existence of new symbols
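A quick way to see why five symbols show up for the profile $\{1, 1, 2\}$, even though only three were observed: among uniform distributions $U[m]$, the probability of that profile peaks at $m = 5$. The sketch below counts matching sequences by brute force. It only compares uniform distributions, which is a restriction made here for illustration; the slide's claim is that $U[5]$ maximizes over all distributions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def uniform_profile_probability(m, target, n):
    """Probability that n draws from U[m] have the given profile,
    by counting matching sequences among all m**n equally likely ones."""
    hits = sum(
        1
        for seq in product(range(m), repeat=n)
        if tuple(sorted(Counter(seq).values())) == target
    )
    return Fraction(hits, m ** n)

probs = {m: uniform_profile_probability(m, (1, 1, 2), 4) for m in range(3, 9)}
best_m = max(probs, key=probs.get)

for m, pr in probs.items():
    print(m, pr, float(pr))
print("best m:", best_m)  # 5
```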

Page 19

PML Plug-in

To estimate a symmetric property $f$:

Find $p^{\mathrm{pml}}(\Phi(X^n))$

Output $f(p^{\mathrm{pml}})$

Simple

Unified

No tuning parameters

Some experimental results (c. 2009)

Page 20

Uniform, 500 symbols, 350 samples

2x6, 3x4, 13x3, 63x2, 161x1

242 appeared, 258 did not
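The observed counts on this slide can be sanity-checked with a few lines of arithmetic, reading each entry "a x b" as "a symbols appeared b times" (an interpretation, but the only one under which the totals match). The expected number of distinct symbols under $U[500]$ is included for comparison and is a standard formula, not a figure from the talk.

```python
# (number of symbols, multiplicity) pairs read off the slide
prevalences = [(2, 6), (3, 4), (13, 3), (63, 2), (161, 1)]

seen = sum(count for count, _ in prevalences)               # distinct symbols observed
samples = sum(count * mult for count, mult in prevalences)  # total draws
unseen = 500 - seen

# Expected number of distinct symbols in 350 draws from U[500].
expected_seen = 500 * (1 - (1 - 1 / 500) ** 350)

print(seen, samples, unseen)  # 242 350 258
print(round(expected_seen, 1))
```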

Page 21

U[500], 350x, 12 experiments

Page 22

Uniform, 500 symbols, 350 samples

2x6, 3x4, 13x3, 63x2, 161x1

242 appeared, 258 did not

700 samples

Page 23

U[500], 700x, 12 experiments

Page 24

Staircase

15K elements, 5 steps, ~3x

30K samples

Observe 8,882 elts, 6,118 missing

Page 25

Zipf

Underlies many natural phenomena

$p_i = C/i$, $i = 100 \ldots 15{,}000$

30,000 samples

Observe 9,047 elts, 5,953 missing

Page 26

1990 Census - Last names

| Name | Frequency (%) | Cumulative (%) | Rank |
| SMITH | 1.006 | 1.006 | 1 |
| JOHNSON | 0.810 | 1.816 | 2 |
| WILLIAMS | 0.699 | 2.515 | 3 |
| JONES | 0.621 | 3.136 | 4 |
| BROWN | 0.621 | 3.757 | 5 |
| DAVIS | 0.480 | 4.237 | 6 |
| MILLER | 0.424 | 4.660 | 7 |
| WILSON | 0.339 | 5.000 | 8 |
| MOORE | 0.312 | 5.312 | 9 |
| TAYLOR | 0.311 | 5.623 | 10 |
| … | | | |
| AMEND | 0.001 | 77.478 | 18835 |
| ALPHIN | 0.001 | 77.478 | 18836 |
| ALLBRIGHT | 0.001 | 77.479 | 18837 |
| AIKIN | 0.001 | 77.479 | 18838 |
| ACRES | 0.001 | 77.480 | 18839 |
| ZUPAN | 0.000 | 77.480 | 18840 |
| ZUCHOWSKI | 0.000 | 77.481 | 18841 |
| ZEOLLA | 0.000 | 77.481 | 18842 |

18,839 names, 77.48% of population, ~230 million

Page 27

1990 Census - Last names

18,839 last names based on ~230 million

35,000 samples, observed 9,813 names

Page 28

Coverage (# new symbols)

Zipf distribution over 15K elements

Sample 30K times

Estimate: # new symbols in sample of size λ*30K

Good-Toulmin:

Estimate PML & predict

Extends to λ > 1

Applies to other properties

Plot panels: λ < 1, λ > 1
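For reference, a sketch of the Good-Toulmin estimator in its commonly stated form, $-\sum_{i \ge 1} (-\lambda)^i \varphi_i$, where $\varphi_i$ is the number of symbols seen exactly $i$ times. The formula on the original slide did not survive in this transcript, so this form is an assumption about what was shown; the function name is illustrative.

```python
from collections import Counter

def good_toulmin(sample, lam):
    """Estimate of the number of new symbols in lam * len(sample) further draws:
    -sum_i (-lam)**i * phi_i, with phi_i = number of symbols seen exactly i times."""
    multiplicities = Counter(sample)                # symbol -> N_x
    prevalences = Counter(multiplicities.values())  # i -> phi_i
    return -sum((-lam) ** i * phi for i, phi in prevalences.items())

print(good_toulmin("abc", 1))    # three singletons, lam = 1 -> 3
print(good_toulmin("hht", 0.5))  # phi_1 = 1, phi_2 = 1 -> 0.25
```

The alternating powers of λ grow quickly once λ exceeds 1, which is the usual reason given for this series being unreliable in that regime.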

Page 29

λ > 1

Page 30

Finding the PML distribution

EM algorithm [+ Pan, Sajama, Santhanam, Viswanathan, Zhang ’05-’08]

Approximate PML via Bethe permanents [Vontobel]

Extensions of Markov chains [Vatedka, Vontobel]

No provable algorithms known

Motivated Valiant & Valiant

Page 31

Maximum Likelihood Estimation Plug-in

General property estimation technique

$\mathcal{P}$: collection of distributions over domain $\mathcal{Z}$

$f: \mathcal{P} \to \mathbb{R}$ any property (say entropy)

MLE estimator

Given $z \in \mathcal{Z}$

Determine $p_z^{\mathrm{mle}} \triangleq \arg\max_{p \in \mathcal{P}} p(z)$

Output $f(p_z^{\mathrm{mle}})$

How good is MLE?

Page 32

Competitiveness of MLE plug-in

$\mathcal{P}$: collection of distributions over domain $\mathcal{Z}$

$\hat{f}: \mathcal{Z} \to \mathbb{R}$ any estimator such that $\forall p \in \mathcal{P}$, $Z \sim p$:

$\Pr\left( |f(p) - \hat{f}(Z)| > \varepsilon \right) < \delta$

MLE plug-in error bounded by

$\Pr\left( |f(p) - f(p_Z^{\mathrm{mle}})| > 2 \cdot \varepsilon \right) < \delta \cdot |\mathcal{Z}|$

Simple, universal, competitive with any $\hat{f}$

Page 33

Quiz: Probability of unlikely outcomes

6-sided die, $p = (p_1, p_2, \ldots, p_6)$

$p_i \ge 0$ and $\sum_i p_i = 1$, otherwise arbitrary

$Z \sim p$

$\Pr(p_Z \le 1/6)$: can be anything

$(1, 0, \ldots, 0)$ ➞ $\Pr(p_Z \le 1/6) = 0$

$(1/6, \ldots, 1/6)$ ➞ $\Pr(p_Z \le 1/6) = 1$

$\Pr(p_Z \le 0.01)$: $\le 0.06$

$\Pr(p_Z \le 0.01) = \sum_{i : p_i \le 0.01} p_i \le 6 \cdot 0.01 = 0.06$
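The quiz answer can be spot-checked by simulation: for any die, the total mass on faces with probability at most 0.01 cannot exceed 6 · 0.01. The sketch below draws skewed random dice and tracks the largest value seen. The sampling scheme (normalized fourth powers of exponential draws) is an arbitrary choice made here to produce some very small faces; it is not from the talk.

```python
import random

def prob_unlikely_outcome(p, threshold):
    """Pr(p_Z <= threshold) for Z ~ p: total mass on outcomes with small probability."""
    return sum(pi for pi in p if pi <= threshold)

rng = random.Random(0)
worst = 0.0
for _ in range(10000):
    # Skewed random die: normalize heavy-tailed positive weights.
    weights = [rng.expovariate(1.0) ** 4 for _ in range(6)]
    total = sum(weights)
    p = [w / total for w in weights]
    worst = max(worst, prob_unlikely_outcome(p, 0.01))

print(worst)  # never exceeds 6 * 0.01
```

The same counting argument, with the domain size in place of 6, is what produces the factor $|\mathcal{Z}|$ in the competitiveness bound.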

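Page 34

Competitiveness of MLE plug-in - proof

$\hat{f}: \mathcal{Z} \to \mathbb{R}$: $\forall p \in \mathcal{P}$, $Z \sim p$ ➞ $\Pr\left( |f(p) - \hat{f}(Z)| > \varepsilon \right) < \delta$

then $\Pr\left( |f(p) - f(p_Z^{\mathrm{mle}})| > 2 \cdot \varepsilon \right) < \delta \cdot |\mathcal{Z}|$

For all $z$ such that $p(z) \ge \delta$:

1) $|f(p) - \hat{f}(z)| \le \varepsilon$ (otherwise $\hat{f}$ would err with probability at least $p(z) \ge \delta$)

2) $p_z^{\mathrm{mle}}(z) \ge p(z) \ge \delta$, hence $|f(p_z^{\mathrm{mle}}) - \hat{f}(z)| \le \varepsilon$

Triangle inequality: $|f(p_z^{\mathrm{mle}}) - f(p)| \le 2\varepsilon$

If $|f(p_z^{\mathrm{mle}}) - f(p)| > 2\varepsilon$ then $p(z) < \delta$,

$\Pr\left( |f(p_Z^{\mathrm{mle}}) - f(p)| > 2\varepsilon \right) \le \Pr\left( p(Z) < \delta \right) \le \sum_{z : p(z) < \delta} p(z) \le \delta \cdot |\mathcal{Z}|$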

Page 35

PML performance bound

If $n = S(f, k, \varepsilon, \delta)$, then $S^{\mathrm{pml}}(f, k, 2 \cdot \varepsilon, |\Phi^n| \cdot \delta) \le n$

$|\Phi^n|$: number of profiles of length $n$

Profile of length $n$: partition of $n$

$\{3\}, \{1, 2\}, \{1, 1, 1\}$ ➞ $3$, $2 + 1$, $1 + 1 + 1$

$|\Phi^n|$ = partition # of $n$

Hardy-Ramanujan: $|\Phi^n| < e^{3\sqrt{n}}$

Easy: $e^{\log n \cdot \sqrt{n}}$

If $n = S(f, k, \varepsilon, e^{-4\sqrt{n}})$, then $S^{\mathrm{pml}}(f, k, 2\varepsilon, e^{-\sqrt{n}}) \le n$
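Since a profile of length $n$ is an integer partition of $n$, the count $|\Phi^n|$ can be tabulated with a standard dynamic program and compared against the $e^{3\sqrt{n}}$ bound. This is a numerical illustration for small $n$ only (the range up to 200 is an arbitrary choice), not a proof of the bound.

```python
import math

def partition_numbers(n_max):
    """parts[n] = number of integer partitions of n, for 0 <= n <= n_max,
    using the coin-change style recurrence over allowed part sizes."""
    parts = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            parts[total] += parts[total - part]
    return parts

parts = partition_numbers(200)
print(parts[3])  # 3 profiles of length 3: {3}, {1,2}, {1,1,1}
for n in (10, 50, 100, 200):
    print(n, parts[n], math.exp(3 * math.sqrt(n)))
```

The number of profiles grows only like $e^{O(\sqrt{n})}$, which is what keeps the factor $|\Phi^n|$ in the error probability manageable.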

Page 36

Summary

Symmetric property estimation

PML plug-in approach

Simple

Universal

Sample optimal for known sublinear properties

Future directions

Provably efficient algorithms

Independent proof technique

Thank You!

Page 37

PML for symmetric $f$

For any symmetric property $f$, if $n = S(f, k, \varepsilon, 0.1)$, then $S^{\mathrm{pml}}(f, k, 2 \cdot \varepsilon, 0.1) = O(n^2)$.

Proof. By the median trick, $S(f, k, \varepsilon, e^{-m}) = O(n \cdot m)$. Therefore, $S^{\mathrm{pml}}(f, k, 2 \cdot \varepsilon, e^{3\sqrt{n \cdot m} - m}) = O(n \cdot m)$. Plugging in $m = C \cdot n$ gives the desired result.
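The last step of the proof can be spelled out as follows. This is a sketch that, like the slide, ignores the constant hidden in $O(n \cdot m)$; the specific choice $C = 16$ is an illustrative assumption, and any sufficiently large constant works.

```latex
% Substitute m = C n into the error exponent 3\sqrt{n m} - m.
\begin{align*}
3\sqrt{n \cdot m} - m
  &= 3\sqrt{C}\, n - C n
   = -\left(C - 3\sqrt{C}\right) n . \\
\intertext{The exponent is negative as soon as $C > 9$. For example, with $C = 16$:}
e^{3\sqrt{n \cdot m} - m}
  &= e^{-(16 - 12)\, n}
   = e^{-4n}
   \le e^{-4} < 0.1
   \qquad (n \ge 1), \\
\text{number of samples}
  &= O(n \cdot m) = O(C n^2) = O(n^2) .
\end{align*}
```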

Page 38

Better error probabilities: warm-up

Estimating a distribution $p$ over $[k]$ to L1 distance $\varepsilon$ w.p. $> 0.9$ requires $\Theta(k/\varepsilon^2)$ samples.

Proof: Exercise.

Estimating a distribution $p$ over $[k]$ to L1 distance $\varepsilon$ w.p. $1 - e^{-k}$ requires $\Theta(k/\varepsilon^2)$ samples.

Proof:

• Empirical estimator $\hat{p}$

• $|\hat{p} - p|$ has bounded difference constant (b.d.c.) $2/n$

• Apply McDiarmid’s inequality
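The three proof bullets can be expanded into a short calculation for the upper bound. This is a sketch under the stated assumptions (empirical estimator, L1 distance as the function of the samples); the constants 8 and 1/2 are one convenient choice and are not taken from the talk, and the matching lower bound is not shown.

```latex
% Assumption: g(X_1,\dots,X_n) = \|\hat{p} - p\|_1, with bounded differences c_i = 2/n.
\begin{align*}
\Pr\big(g - \mathbb{E}[g] \ge t\big)
  &\le \exp\!\Big(-\frac{2t^2}{\sum_{i=1}^n c_i^2}\Big)
   = \exp\!\Big(-\frac{2t^2}{n \cdot (2/n)^2}\Big)
   = \exp\!\Big(-\frac{n t^2}{2}\Big), \\
\mathbb{E}[g]
  &\le \sum_{i=1}^k \sqrt{\frac{p_i (1 - p_i)}{n}}
   \le \sqrt{\frac{k}{n}} .
\end{align*}
% Take n >= 8k/eps^2. Then E[g] <= eps/2, and choosing t = eps/2 gives
% Pr(g >= eps) <= exp(-n eps^2 / 8) <= exp(-k).
```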

Page 39

Better error probabilities

Recall

$S(H, \varepsilon, k, 2/3) = \Theta\left(\frac{k}{\varepsilon \cdot \log k}\right)$

• Existing optimal estimators: high b.d.c.

• Modify them to have small b.d.c., and still be optimal

• In particular, can get b.d.c. $= n^{-c}$ with exponent $c$ close to 1

• With twice the samples error drops super-fast

$S(H, \varepsilon, k, e^{-n^{0.9}}) = \Theta\left(\frac{k}{\varepsilon \cdot \log k}\right)$

Similar results for other properties

Page 40

Even approximate PML works!

• Perhaps finding exact PML is hard.

• Even approximate PML works.

Find a distribution $q$ such that

$q(\Phi(X_1^n)) \ge e^{-n^{c}} \cdot p^{\mathrm{pml}}(\Phi(X_1^n))$ for a fixed exponent $c < 1$

Even this is optimal (for large $k$)

Page 41

In Fisher’s words…

Of course nobody has been able to prove that MLE is best under all circumstances. MLE computed with all the information available may turn out to be inconsistent. Throwing away a substantial part of the information may render them consistent.

R. A. Fisher

Page 42

Proof of PML performance

If $n = S(f, k, \varepsilon, \delta)$, then $S^{\mathrm{pml}}(f, k, 2 \cdot \varepsilon, |\Phi^n| \cdot \delta) \le n$

$S(f, k, \varepsilon, \delta)$, achieved by an estimator $\hat{f}(\Phi(X^n))$

• Profiles $\Phi(X^n)$ such that $p(\Phi(X^n)) > \delta$:

$p_\Phi^{\mathrm{pml}}(\Phi) \ge p(\Phi) > \delta$

$|f(p_\Phi^{\mathrm{pml}}) - f(p)| \le |f(p_\Phi^{\mathrm{pml}}) - \hat{f}(\Phi)| + |\hat{f}(\Phi) - f(p)| < 2\varepsilon$

• Profiles with $p(\Phi(X^n)) < \delta$:

$\Pr\left( p(\Phi(X^n)) < \delta \right) < \delta \cdot |\Phi^n|$