statistical inference for rényi entropy of integer order · discrete and continuous multivariate...
TRANSCRIPT
![Page 1: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/1.jpg)
Statistical inference for Renyi entropy of integerorder
David Kallberg
August 23, 2010
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 2: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/2.jpg)
Outline
Introduction
Measures of uncertainty
Estimation of entropy
Numerical experiment
More results
Conclusion
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 3: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/3.jpg)
Coauthor:
Oleg SeleznjevDepartment of Mathematics and Mathematical Statistics
Umea university
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 4: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/4.jpg)
Introduction
A system described by probability distribution P.
Only partial information about P is available, e.g. thecovariance matrix.
A measure of uncertainty (entropy) in P.
What P should we use if any?
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 5: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/5.jpg)
Why entropy?
The entropy maximization principle: choose P, satisfying givenconstraints, with maximum uncertainty.
Objectivity: we don’t use more information than we have.
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 6: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/6.jpg)
Measures of uncertainty. The Shannon entropy.
Discrete P = {p(k), k ∈ D}
h1(P) := −∑k
p(k) log p(k)
Continuous P with density p(x), x ∈ Rd
h1(P) := −∫Rd
log (p(x))p(x)dx
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 7: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/7.jpg)
Measures of uncertainty.The Renyi entropy
A class of entropies. Of order s ≥ 0 given by
Discrete P = {p(k), k ∈ D}
hs(P) :=1
1− slog (
∑k
p(k)s), s 6= 1
Continuos P with density p(x), x ∈ Rd
hs(P) :=1
1− slog (
∫Rd
p(x)sdx), s 6= 1
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 8: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/8.jpg)
Motivation
The Renyi entropy satisfies axioms on how a measure ofuncertainty should behave, Renyi(1961,1970).
For both discrete and continuous P, the Renyi entropy is ageneralization of the Shannon entropy,
limq→1
hq(P) = h1(P)
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 9: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/9.jpg)
Problem
Non-parametric estimation of integer order Renyientropy, fordiscrete and continuous multivariate P, from sample {X1, . . .Xn}of P-i.i.d. observations.
Estimators of entropy are widely used.
Distribution identification problems (Student-r distributions).
Average case analysis for random databases.
Clustering.
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 10: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/10.jpg)
Overview of Renyi entropy estimation
Consistency of nearest neighbor estimators for any s,Leonenko et al. (2008). Only for continuous entropy.
Consistency and asymptotic normality for quadratic case s=2,Leonenko and Seleznjev (2010).
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 11: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/11.jpg)
Notation
I (C ) indicator function of event C .
Ss,n set of all s-subsets of {1, . . . , n}.∑(s,n) is summation over Ss,n.
d(x , y) the Euclidean distance in Rd
Bε(x) := {y : d(x , y) ≤ ε} ball of radius ε with center x .
bε volume of Bε(x)
pε(x) := P(X ∈ Bε(x)) the ε-ball probability at x
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 12: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/12.jpg)
Estimation
Method relies on estimating functional
qs := E(ps−1(X )) =
∑
k p(k)s (Discrete)∫Rd p(x)sdx (Continuous)
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 13: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/13.jpg)
Estimation, discrete case
An estimator of qs ,
Qs,n :=
(s
n
)−1 ∑(s,n)
ψ(S),
where
ψ(S) :=1
s
∑i∈S
I (Xi = Xj , ∀j ∈ S).
Hs,n := 11−s log(max(Qs,n, 1/n)) estimator of hs .
Qs,n is a U-statistic, so properties follows from conventionaltheory.
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 14: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/14.jpg)
Estimation, continuous case
A U-statistic estimator of qε,s := E(pε(X )s−1),
Qs,n :=
(s
n
)−1 ∑(s,n)
ψ(S),
where
ψ(S) :=1
s
∑i∈S
I (d(Xi ,Xj) ≤ ε, ∀j ∈ S).
Qs,n := Qs,n/bε(d)s−1 asymptotically unbiased estimator of qsif ε = ε(n)→ 0 as n→∞.
Hs,n := 11−s log(max(Qs,n, 1/n)) estimator of hs .
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 15: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/15.jpg)
Smoothness conditions
Denote by H(α)(K ), 0 < α ≤ 2,K > 0, a linear space ofcontinuous in Rd functions satisfying α-Holder condition if0 < α ≤ 1 or if 1 < α ≤ 2 with continuous partial derivativessatisfying (α− 1)-Holder condition with constant K.
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 16: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/16.jpg)
Asymptotics, continuous case
ε = ε(n)→ 0 as n→∞.
L(n) > 0, n ≥ 1, a slowly varying function as n→∞.
Ks,n := max(s2(Q2s−1,n − Q2s,n), 1/n) consistent estimator of
the asymptotic variance.
Theorem
Suppose that p(x) is bounded and continuous. Let nεd → a forsome 0 < a ≤ ∞.
(i) Then Hs,n is a consistent estimator of hs .
(ii) Let p(x)s−1 ∈ H(α)(K ) for some d/2 < α ≤ 2. Ifε ∼ L(n)n−1/d and a =∞, then
√nQs,n(1− s)√
Ks,n
(Hs,n − hs)D→ N(0, 1) as n→∞.
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 17: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/17.jpg)
Numerical experiment
χ2 distribution, 4 degrees of freedom. h3 = −12 log (q3),
where q3 = 1/54.
500 simulations, each of size n = 1000, ε = 1/4.
Quantile plot and histogram supports standard normality.
Remark. Choice of ε in a practical situation remains an openproblem.
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 18: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/18.jpg)
Figures
Histogram of x
x
Sta
ndar
d no
rmal
den
sity
−3 −2 −1 0 1 2
0.0
0.1
0.2
0.3
0.4
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 19: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/19.jpg)
Figures
−3 −2 −1 0 1 2 3
−2
−1
01
2Normal Q−Q Plot
Theoretical Quantiles
Sam
ple
Qua
ntile
s
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 20: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/20.jpg)
More results
Estimation of (quadratic) Renyi entropy from m-dependentsample.
Two different distributions PX and PY . Inference for
...functionals of type∫Rd
pX (x)s1pY (x)s2dx , s1, s2 ∈ N+.
...statistical distances (Bregman)
Discrete:D2 :=
∑k
(pX (k)− pY (k))2
Continuous:
D2 :=
∫Rd
(pX (x)− pY (x))2dx
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 21: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/21.jpg)
Conclusion
Asymptotically normal estimates possible for Renyi entropy ofinteger order.
David Kallberg Statistical inference for Renyi entropy of integer order
![Page 22: Statistical inference for Rényi entropy of integer order · discrete and continuous multivariate P, from sample fX 1;:::X ng of P-i.i.d. observations. Estimators of entropy are widely](https://reader036.vdocuments.mx/reader036/viewer/2022071108/5fe34042f4cc57276105c8f5/html5/thumbnails/22.jpg)
References
Leonenko, N. ,Pronzato, L. ,Savani, V. (1982). A class ofRenyi information estimators for multidimensional densities,Annals of Statistics 36 2153-2182.
Leonenko, N. and Seleznjev, O. (2010). Statistical inferencefor ε-entropy and quadratic Renyientropy, J. MultivariateAnalysis 101, Issue 9, 1981-1994.
Renyi, A.(1961). On measures of information and entropyProceedings of the 4th Berkeley Symposium on Mathematics,Statistics and Probability 1960, 547-561.
Renyi, A. Probability theory, North-Holland PublishingCompany 1970
David Kallberg Statistical inference for Renyi entropy of integer order