
Page 1: Local Privacy and Statistical Minimax Rates

Local Privacy and Statistical Minimax Rates

John C. Duchi, Michael I. Jordan, Martin J. Wainwright

University of California, Berkeley

December 2013


Goals for this talk

Bring together some classical concepts of decision theory and newer concepts of privacy


Illustration of problem

- Have data X1, . . . , Xn
- Private views Z1, . . . , Zn constructed from the Xi
- Often: goal to get statistics of {X1, . . . , Xn} (e.g. average salary)
- Distribution P and parameter θ(P) generate the data
- Sample X1, . . . , Xn from P not observed (only the Zi)
- Goal: infer population parameter θ(P) based on Z1, . . . , Zn

[Figure: database X1, . . . , Xn drawn from P, passed through channel Q to private views Z1, . . . , Zn; infer θ(P).]
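The pipeline above can be sketched end to end. A minimal simulation, assuming a Laplace-noise channel Q (a standard choice for α-locally private release of bounded values; the channel, the bound, and all constants here are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_channel(x, alpha, bound=1.0):
    """Private view Z of x in [-bound, bound]: add Laplace noise with
    scale 2*bound/alpha, an alpha-differentially private release."""
    return x + rng.laplace(scale=2 * bound / alpha, size=np.shape(x))

n, alpha = 100_000, 1.0
X = rng.uniform(-1, 1, size=n)    # database X1, . . . , Xn (never shown to the analyst)
Z = laplace_channel(X, alpha)     # private views Z1, . . . , Zn
theta_hat = Z.mean()              # infer theta(P) = E_P[X] = 0 from the Zi alone
print(theta_hat)
```

The estimate is still consistent for the mean, but the injected noise inflates its variance relative to the non-private `X.mean()`; quantifying that gap is exactly the subject of the talk.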


Primer on minimax rates of convergence and statistical inference


Illustration of classical problem

[Figure: sample X1, . . . , Xn drawn from P; infer θ(P).]

- Have distribution P and parameter θ(P) of P
- Sample X1, . . . , Xn drawn from P and observed
- Goal: infer population parameter θ(P)

Why? Care about making future predictions

- What is the likelihood a new resident of San Francisco needs food stamps?
- Biological prediction, web advertising, search, ...



Minimax risk

Central object of study: Minimax risk

- Parameter θ(P) of distribution P
  - E.g. mean: θ(P) = E_P[X]
- Loss ρ that measures the error of the estimate θ̂ for θ: ρ(θ̂, θ)
  - E.g. ρ(θ̂, θ) = ‖θ̂ − θ‖₂², or a more esoteric/robust loss such as

        ρ_u(θ̂, θ) = (1/2)‖θ̂ − θ‖₂²      if ‖θ̂ − θ‖₂ ≤ u
                     u‖θ̂ − θ‖₂ − u²/2    otherwise

- Family of distributions P that we study
  - E.g. P such that E_P[X²] ≤ 1
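The robust loss ρ_u above (a Huber-type loss) transcribes directly; a small sketch checking that the quadratic and linear branches meet at ‖θ̂ − θ‖₂ = u, so the loss is continuous:

```python
import numpy as np

def rho_u(theta_hat, theta, u):
    """The robust loss from the slide: quadratic while the error is at
    most u, linear (slope u) beyond that."""
    err = np.linalg.norm(np.atleast_1d(np.asarray(theta_hat) - np.asarray(theta)))
    if err <= u:
        return 0.5 * err ** 2
    return u * err - u ** 2 / 2

# the two branches agree at the threshold err = u
u = 1.5
assert abs(rho_u(u, 0.0, u) - 0.5 * u ** 2) < 1e-12
print(rho_u(3.0, 0.0, 1.0))   # linear branch: 1*3 - 1/2 = 2.5
```

For small errors ρ_u behaves like squared error; for large errors it grows only linearly, which is what makes it robust to outlying estimates.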


Minimax risk

Central object of study: Minimax risk

- Parameter θ(P) of distribution P
- Loss ρ that measures the error of the estimate θ̂ for θ: ρ(θ̂, θ)
  - E.g. ρ(θ̂, θ) = ‖θ̂ − θ‖₂²
- Family of distributions P that we study

Look at the expected loss:

    Mn(θ(P), ρ) := inf_θ̂ sup_{P∈P} E_P[ρ(θ̂(X1, . . . , Xn), θ(P))]    (classical minimax risk)

- Worst case over distributions P
- Best case over all estimators θ̂ : Xⁿ → Θ

To study: rate at which Mn(θ(P), ρ) → 0 as n grows
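As a concrete (purely illustrative) reading of the definition: fix the sample-mean estimator and squared-error loss, take a tiny two-distribution stand-in for the class P with E_P[X²] ≤ 1, and estimate the worst case sup_P E_P[·] by Monte Carlo. The inf over estimators is omitted; the family, sample sizes, and trial counts are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def risk(dist, n, trials=2000):
    """Monte Carlo estimate of E_P[(mean(X1..Xn) - theta(P))^2] for the
    sample-mean estimator; dist = (draw, theta) with draw(n) ~ P^n, theta = E_P[X]."""
    draw, theta = dist
    errs = [(draw(n).mean() - theta) ** 2 for _ in range(trials)]
    return float(np.mean(errs))

# a tiny illustrative stand-in for the family P with E_P[X^2] <= 1
family = {
    "uniform": (lambda n: rng.uniform(-1, 1, n), 0.0),          # Var = 1/3
    "rademacher": (lambda n: rng.choice([-1.0, 1.0], n), 0.0),  # Var = 1
}

n = 200
worst = max(risk(P, n) for P in family.values())   # sup over P in the family
print(worst)   # about max Var / n = 1/200
```

Re-running with larger n shows the worst-case risk of the sample mean decaying like 1/n, the kind of rate statement Mn(θ(P), ρ) → 0 formalizes.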

Proving minimax bounds

This talk:

- Upper bounds will be ad hoc
- Lower bounds will be information-theoretic [Hasminskii 78, Birge 83, Ibragimov and Hasminskii 81, Yang and Barron 99, Yu 97]
- NB: many known information-theoretic upper bounds [Barron, Birge, Kivinen, ...]



Proving minimax lower bounds

Step 1: Reduce from estimation to testing

Step 2: Canonical testing problem

Step 3: Classical bounds on error probabilities


Proving minimax lower bounds

Step 1: Reduce from estimation to testing
Step 2: Canonical testing problem
Step 3: Classical bounds on error probabilities

[Figure: a 2δ-packing of Θ with points θ_i, θ_k, and a ball of radius δ around the estimate θ̂.]

- Have a 2δ-packing of Θ, i.e. {θ1, θ2, . . . , θK}
- Estimator θ̂
- At most one index is close to θ̂
- Can test the index i ∈ [K]
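The reduction above can be made concrete: any estimator induces a test v̂ that returns the packing point nearest θ̂, and since packing points are ≥ 2δ apart, the test recovers the true index whenever the estimation error is below δ. A sketch with an illustrative packing of Θ = ℝ (points and δ are assumptions for the example):

```python
import numpy as np

# An illustrative 2*delta-packing of Theta = R: points pairwise >= 2*delta apart.
delta = 0.5
packing = np.array([0.0, 1.0, 2.0, 3.0])    # theta_1, . . . , theta_K

def v_hat(theta_est):
    """Test induced by an estimator: return the index of the packing
    point nearest the estimate (at most one point lies within delta)."""
    return int(np.argmin(np.abs(packing - theta_est)))

# whenever the estimation error is < delta, the induced test recovers v
for v, theta_v in enumerate(packing):
    assert v_hat(theta_v + 0.4 * delta) == v
print("test recovers the index whenever estimation error < delta")
```

Contrapositively, if no test can identify V reliably, no estimator can have error below δ with high probability, which is the inequality on the next slide.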

Proving minimax lower bounds

Step 1: Reduce from estimation to testing
Step 2: Canonical testing problem
Step 3: Classical bounds on error probabilities

- Nature chooses a random index V ∈ [K]
- Conditional on V = v, sample X1, . . . , Xn i.i.d. from Pv
- Lower bound the minimax error:

      sup_{P∈P} E_P[ρ(θ̂, θ(P))] ≥ (1/K) Σ_{v=1}^{K} E_v[ρ(θ̂, θ_v)]
                                 ≥ (1/K) Σ_{v=1}^{K} ρ(δ) P_v(ρ(θ̂, θ_v) ≥ 2δ)

- Final canonical testing problem:

      Mn(θ(P), ρ) ≥ ρ(δ) min_v̂ P(v̂(X1, . . . , Xn) ≠ V)

Proving minimax lower bounds

Step 1: Reduce from estimation to testing
Step 2: Canonical testing problem
Step 3: Classical bounds on error probabilities

Le Cam's method:

    P0(v̂ ≠ 0) + P1(v̂ ≠ 1) ≥ 1 − ‖P0 − P1‖_TV

Fano's inequality (for K > 2):

    P(v̂ ≠ V) ≥ 1 − (I(X1, . . . , Xn; V) + log 2) / log K

Summarizing

[Figure: Markov chain V → θ_V → X1, . . . , Xn → θ̂ → v̂.]

    Mn(θ(P), ρ) ≥ ρ(δ) min_v̂ P(v̂(X1, . . . , Xn) ≠ V)
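For intuition, the two-point (Le Cam) bound is easy to evaluate for a pair of Bernoulli distributions. This sketch bounds the n-sample total variation by the subadditive estimate ‖P0ⁿ − P1ⁿ‖_TV ≤ n‖P0 − P1‖_TV, a loose but simple bound; the separations and n are illustrative, not from the talk:

```python
def tv_bernoulli(p0, p1):
    """Total variation distance between Bernoulli(p0) and Bernoulli(p1)."""
    return abs(p0 - p1)

def le_cam_lower(p0, p1, n):
    """Lower bound on P0(v_hat != 0) + P1(v_hat != 1) with n i.i.d. samples,
    via ||P0^n - P1^n||_TV <= n * ||P0 - P1||_TV (subadditivity over products)."""
    return max(0.0, 1.0 - min(1.0, n * tv_bernoulli(p0, p1)))

# a separation shrinking like 1/(2n) keeps the testing error bounded away from 0
n = 100
p0, p1 = 0.5, 0.5 + 1 / (2 * n)
bound = le_cam_lower(p0, p1, n)
print(bound)   # about 0.5: no test reliably distinguishes P0 from P1
```

Choosing the packing separation δ so that the divergence stays bounded, then reading off ρ(δ), is exactly how the minimax rate falls out of the displayed inequality.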

Proving minimax lower bounds

Summarizing

    Mn(θ(P), ρ) ≥ ρ(δ) min_v̂ P(v̂(X1, . . . , Xn) ≠ V)

Key idea: Control the information-theoretic divergences

    ‖P0 − P1‖_TV or I(X1, . . . , Xn; V)

to attain the minimax rate

Inference under privacy constraints

[Figure: database X1, . . . , Xn drawn from P, passed through channel Q to private views Z1, . . . , Zn; infer θ(P).]

- Have distribution P and parameter θ(P)
- Sample X1, . . . , Xn drawn from P and not observed
- Private views Z1, . . . , Zn constructed from the Xi
- Goal: infer population parameter θ(P) based on Z1, . . . , Zn

Model of privacy

Local Privacy: Don't trust the collector of the data (Evfimievski et al. 2003, Warner 1965)

[Figure: each individual's Xi passes through the channel Q(Z | X) to produce Zi; the collector sees only Z1, . . . , Zn and forms θ̂.]

- Individuals i ∈ {1, . . . , n} have personal data Xi ∼ Pθ
- Estimator (Z1, . . . , Zn) ↦ θ̂(Z1:n)

Differential privacy

Definition: The channel Q is α-differentially private if

max_{z, x, x′} Q(Z = z | x) / Q(Z = z | x′) ≤ e^α.

[Dwork, McSherry, Nissim, Smith 2006]

What does this mean?

I Given Z, cannot tell what x gave Z

I Testing argument: based on Z, adversary must distinguish between x and x′:

FNR + FPR ≥ 2 / (1 + e^α)

[Wasserman and Zhou 2011]
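The definition above can be made concrete with the simplest α-differentially private channel, binary randomized response. This is a minimal sketch (function names are illustrative, not from the talk):

```python
import math
import random

def randomized_response(x, alpha):
    """Report bit x truthfully with probability e^alpha / (1 + e^alpha),
    otherwise flip it."""
    p_truth = math.exp(alpha) / (1.0 + math.exp(alpha))
    return x if random.random() < p_truth else 1 - x

def worst_case_ratio(alpha):
    """max over z, x, x' of Q(Z = z | x) / Q(Z = z | x') for this channel.
    Evaluates to exactly e^alpha, meeting the definition with equality."""
    p_truth = math.exp(alpha) / (1.0 + math.exp(alpha))
    return p_truth / (1.0 - p_truth)
```

By the testing bound quoted above, an adversary seeing one such report cannot drive FNR + FPR below 2/(1 + e^α).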

Minimax risk

Central object of study: Minimax risk

I Parameter θ(P) of distribution P

I Loss ρ that measures error of estimate θ̂ for θ: ρ(θ̂, θ), e.g. ρ(θ̂, θ) = ‖θ̂ − θ‖_2^2

I Family of distributions P that we study

Look at expected loss, taking the worst case over distributions and the best case over estimators:

inf_θ̂ sup_{P∈P} E_P[ρ(θ̂(Z1, . . . , Zn), θ(P))]    (classical minimax risk)

Taking in addition the best case over all α-private channels gives the private minimax risk

Mn(θ(P), ρ, α) := inf_{Q∈Qα} inf_θ̂ sup_{P∈P} E_{P,Q}[ρ(θ̂(Z1, . . . , Zn), θ(P))]

I Worst case over distributions P

I Best case over all estimators θ̂ : Z^n → Θ

I Best case over all α-private channels Q ∈ Qα from X to Z

Goal for the rest of the talk

How does the minimax risk

Mn(θ(P), ρ, α) := inf_{Q∈Qα} inf_θ̂ sup_{P∈P} E_{P,Q}[ρ(θ̂(Z1, . . . , Zn), θ(P))]

change with privacy parameter α and number of samples n?

Many related results

I Non-population lower bounds [Hardt and Talwar 10; Nikolov, Talwar, Zhang 13; Hall, Rinaldo, Wasserman 11; Chaudhuri, Monteleoni, Sarwate 12]

I Related population bounds:

I Two-point hypotheses [Chaudhuri & Hsu 12; Beimel, Nissim, Omri 08] (e.g. for 1-dimensional bias estimation, get 1/(nα²) error)

I PAC learning results [Beimel, Brenner, Kasiviswanathan, Nissim 13]

Examples

I Mean estimation

I Fixed-design regression

I Convex risk minimization (i.e. online learning)

I Multinomial (probability) estimation

I Nonparametric density estimation

Example 1: Mean estimation

Problem: Estimate the mean of distributions P with bounded kth moment (k ≥ 2):

θ(P) := E_P[X], E_P[|X|^k] ≤ 1.

Standard estimator and usual rate:

θ̂ = (1/n) Σ_{i=1}^n Xi,    E[(θ̂ − E[X])^2] ≤ 1/n.

Proposition (D., Jordan, Wainwright): Non-private minimax rate

1/n ≲ E[(θ̂ − E[X])^2] ≲ 1/n

Proposition (D., Jordan, Wainwright): Private minimax rate

1/(nα²)^{(k−1)/k} ≲ Mn(E[X], (·)^2, α) ≲ 1/(nα²)^{(k−1)/k}

Examples:

I For two moments (k = 2), the rate degrades from the parametric 1/n to 1/√(nα²)

I For k → ∞ (bounded random variables), parametric rate with effective sample size reduced n 7→ nα²
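For bounded data, a simple α-locally private channel attaining the n 7→ nα² behavior (a sketch, not necessarily the talk's optimal mechanism) adds Laplace noise of scale 2/α to each Xi ∈ [−1, 1]; averaging the noisy reports gives an unbiased estimate whose variance scales as 1/(nα²) for small α:

```python
import math
import random

def privatize(x, alpha):
    """Release x in [-1, 1] plus Laplace(2/alpha) noise via inverse-CDF
    sampling; the output density ratio between any two inputs x, x' in
    [-1, 1] is at most e^alpha, so the channel is alpha-locally private."""
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return x - (2.0 / alpha) * sign * math.log(1.0 - 2.0 * abs(u))

def private_mean(xs, alpha):
    """Unbiased locally private mean estimate: E[Z] = E[X],
    Var(Z) = Var(X) + 8/alpha^2 per report."""
    zs = [privatize(x, alpha) for x in xs]
    return sum(zs) / len(zs)
```

The added per-report variance 8/α² is what converts the effective sample size from n to roughly nα².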

Example 2: Multinomial estimation

Problem: Get observations X ∈ [d] and wish to estimate

θj := P(X = j)

Example (salary brackets):

I $0–$10,000: θ1 = .05
I $10,001–$20,000: θ2 = .1
I $20,001–$40,000: θ3 = .2
I $40,001–$80,000: θ4 = .4
I $80,001–$160,000: θ5 = .2
I $160,001–$320,000: θ6 = .04
I $320,001+: θ7 = .01

Standard estimator (counts) and usual rate:

θ̂j = (1/n) Σ_{i=1}^n 1{Xi = j} = P̂(X = j),    E[‖θ̂ − θ‖_2^2] ≤ 1/n.

Proposition: Non-private minimax rate

1/n ≲ E[‖θ̂ − θ‖_2^2] ≲ 1/n

Proposition: Private minimax rate

d/(nα²) ≲ Mn([P(X = j)]_{j=1}^d, ‖·‖_2^2, α) ≲ d/(nα²)

Take away: Sample size reduction

n 7→ nα²/d

I Optimal mechanism: randomized response. Resample each coordinate of the bracket indicator vector by Bernoulli coin flips

[Figure: true bracket indicator vector versus its coordinate-wise resampled private version]
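A sketch of the coordinate-wise randomized response just described: each respondent releases their bracket indicator vector with every bit independently kept with probability e^{α/2}/(1 + e^{α/2}) (two indicator vectors differ in at most two coordinates, which is what makes the channel α-differentially private), and the collector debiases the empirical bit frequencies. Function names are illustrative:

```python
import math
import random

def rr_keep_prob(alpha):
    """Probability of keeping a bit; chosen so the channel is alpha-DP."""
    return math.exp(alpha / 2) / (1.0 + math.exp(alpha / 2))

def rr_privatize(x, d, alpha):
    """Flip each coordinate of the indicator e_x in {0,1}^d independently."""
    p = rr_keep_prob(alpha)
    out = []
    for j in range(d):
        bit = 1 if j == x else 0
        out.append(bit if random.random() < p else 1 - bit)
    return out

def rr_estimate(zs, alpha):
    """Debias: E[Z_j] = (2p - 1) * theta_j + (1 - p), so invert the affine map."""
    p = rr_keep_prob(alpha)
    n, d = len(zs), len(zs[0])
    return [(sum(z[j] for z in zs) / n - (1 - p)) / (2 * p - 1)
            for j in range(d)]
```

Each of the d coordinates carries its own flip noise, which is where the extra factor of d in the private rate comes from.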

Main consequences

Goal: Understand tradeoff between differential privacy bound α and sample size n

"Theorem 1": Effective sample size for essentially any^1 problem is made worse by at least

n 7→ nα²

"Theorem 2": Effective sample size for d-dimensional problems scales as

n 7→ nα²/d

^1 essentially any: any problem whose minimax rate can be controlled by information-theoretic techniques

General theory

Showing minimax bounds:

I Have possible "true" parameters {θv} we want to find

I Distribution Pv associated with each parameter

I Problem is hard when Pv ≈ Pv′

[Figure: well-separated Pv and Pv′ (easy) versus overlapping Pv and Pv′ (hard)]

Differential privacy and probability distributions

Samples: Zi are drawn Xi → Q → Zi from the marginal

Mv(Z) := ∫ Q(Z | X = x) dPv(x)

Strong data processing: If Q(Z | x)/Q(Z | x′) ≤ e^α, then

Dkl(M1‖M2) + Dkl(M2‖M1) ≤ 4(e^α − 1)^2 ‖P1 − P2‖_TV^2

[Figure: P1 and P2 pushed through channel Q to marginals M1 and M2]
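The strong data processing inequality can be checked numerically; here is a small sketch for the binary randomized-response channel and two Bernoulli inputs:

```python
import math

def kl_bern(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def marginal(p, alpha):
    """P(Z = 1) when X ~ Bernoulli(p) passes through binary randomized
    response with keep-probability e^alpha / (1 + e^alpha)."""
    keep = math.exp(alpha) / (1.0 + math.exp(alpha))
    return p * keep + (1 - p) * (1 - keep)

def check_sdpi(p1, p2, alpha):
    """Return (lhs, rhs) of the strong data processing inequality:
    symmetrized KL of the marginals vs. 4(e^alpha - 1)^2 * TV(P1, P2)^2."""
    m1, m2 = marginal(p1, alpha), marginal(p2, alpha)
    lhs = kl_bern(m1, m2) + kl_bern(m2, m1)
    tv = abs(p1 - p2)  # total variation distance between the Bernoullis
    rhs = 4 * (math.exp(alpha) - 1) ** 2 * tv ** 2
    return lhs, rhs
```

For small α the right-hand side is of order α², so the channel contracts the divergence between input distributions much more strongly than ordinary data processing.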

Page 120: Local Privacy and Statistical Minimax Rates · 2020-01-03 · Goals for this talk Bring together some classical concepts of decision theory and newer concepts of privacy

Contraction and lower bounds

Samples: Zi are drawn Xi → Q→ Zi from marginal

Mv(Z) :=

∫Q(Z | X = x)dPv(x)

Le Cam’s Method

I θ1 and θ2 are 2δ separated

I Non-private version:

Mn(Θ, (·)2)≥ δ2

(1−

√nDkl (P1||P2)

)

I Private version:

Mn

(Θ, (·)2, α

)

≥ δ2(

1−√nα2 ‖P1 − P2‖2TV

)

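To see the α² penalty concretely (an illustrative calculation, not from the slides), both Le Cam lower bounds can be evaluated for a pair of Bernoulli distributions whose means are 2δ apart:

```python
import math

# Illustrative: two Bernoulli distributions with means 2*delta apart
delta = 0.05
p1, p2 = 0.5 + delta, 0.5 - delta
n, alpha = 50, 0.5

d_kl = p1 * math.log(p1 / p2) + (1 - p1) * math.log((1 - p1) / (1 - p2))
tv = abs(p1 - p2)

# Le Cam lower bounds on minimax squared error, per the two displays above
non_private = delta ** 2 * (1 - math.sqrt(n * d_kl))
private = delta ** 2 * (1 - math.sqrt(n * alpha ** 2 * tv ** 2))

# At this n the non-private bound is already vacuous (<= 0), while the
# private bound remains a nontrivial fraction of delta^2: under privacy,
# the pair stays hard to distinguish until n ~ 1 / (alpha^2 * tv^2).
print(f"non-private bound: {non_private:.3e}")
print(f"private bound:     {private:.3e}")
assert private > 0 and private >= non_private
```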

Page 124: Local Privacy and Statistical Minimax Rates

Variational results on privacy and probability distributions

Samples: Zi are drawn via Xi → Q → Zi, with marginal Mv(Z) := ∫ Q(Z | X = x) dPv(x)

Canonical problem: Nature samples V uniformly from v = 1, . . . , K and draws Xi i.i.d. ∼ Pv when V = v

V → X → Z

Goal: identify V based on Z1, . . . , Zn

Difficulty of problem: as we saw earlier, difficulty is governed by mutual information; privacy contracts

I(X1, . . . , Xn; V) ↦ I(Z1, . . . , Zn; V)

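The canonical problem is easy to simulate (a sketch with illustrative parameters: K = 2 Bernoulli distributions and a randomized-response channel): the statistician sees only the privatized bits, and for this binary case the likelihood-ratio test for V reduces to comparing the count of 1s to n/2.

```python
import math
import random

random.seed(1)

alpha = 0.5
keep = math.exp(alpha) / (1 + math.exp(alpha))   # randomized response Q
p = {1: 0.7, 2: 0.3}                             # P_v = Bernoulli(p[v]), K = 2
n, trials, errors = 500, 400, 0

for _ in range(trials):
    v = random.choice([1, 2])                        # Nature samples V uniformly
    count = 0
    for _ in range(n):
        x = 1 if random.random() < p[v] else 0       # X_i ~ P_v
        z = x if random.random() < keep else 1 - x   # Z_i via channel Q
        count += z
    # Likelihood-ratio test: marginal M_1 puts more mass on 1 than M_2 does
    guess = 1 if count > n / 2 else 2
    errors += (guess != v)

print(f"error rate over {trials} trials: {errors / trials:.3f}")
assert errors / trials < 0.1
```

Shrinking α pushes the marginals M1, M2 together, and the error rate climbs toward 1/2; that degradation is what the contraction results below quantify.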

Page 126: Local Privacy and Statistical Minimax Rates

Fano inequality, lower bounds, contraction

V → X → Z

▶ Have parameters θ1, . . . , θK; choose V uniformly at random in [K]
▶ Sample Xi according to θv when V = v
▶ Sample Zi according to Q(· | Xi)
▶ Non-private Fano inequality:

P(Error) ≥ 1 − I(X1, . . . , Xn; V) / log K − o(1)


Page 128: Local Privacy and Statistical Minimax Rates

Fano inequality, lower bounds, contraction

V → X → Z

▶ Have parameters θ1, . . . , θK; choose V uniformly at random in [K]
▶ Sample Xi according to θv when V = v
▶ Sample Zi according to Q(· | Xi)
▶ Non-private Fano inequality:

P(Error) ≥ 1 − I(X1, . . . , Xn; V) / log K − o(1)

▶ Private Fano inequality: applying Fano to Z1, . . . , Zn and then the mutual information contraction,

P(Error) ≥ 1 − I(Z1, . . . , Zn; V) / log K − o(1) ≥ 1 − α² I(X1, . . . , Xn; V) / (d log K) − o(1)


Page 133: Local Privacy and Statistical Minimax Rates

Variational results on privacy and probability distributions

V → X → Z

▶ Define the mixture P̄ = (1/K) Σ_{v=1}^K Pv

Mutual information contraction: for any non-interactive α-locally private channel Q,

I(Z1, . . . , Zn; V) ≤ n (e^α − 1)² sup_S (1/K) Σ_{v=1}^K (Pv(S) − P̄(S))²

where the supremum term is a dimension-dependent total variation.

What happens? Roughly,

n sup_S (1/K) Σ_{v=1}^K (Pv(S) − P̄(S))² ≈ (1/d) I(X1, . . . , Xn; V)

Page 134: Local Privacy and Statistical Minimax Rates

Variational results on privacy and probability distributions

V → X → Z

▶ Define the mixture P̄ = (1/K) Σ_{v=1}^K Pv

Mutual information contraction: for any non-interactive α-locally private channel Q,

I(Z1, . . . , Zn; V) ≤ n (e^α − 1)² sup_S (1/K) Σ_{v=1}^K (Pv(S) − P̄(S))²

Since n sup_S (1/K) Σ_v (Pv(S) − P̄(S))² ≈ (1/d) I(X1, . . . , Xn; V) (the classical quantity) and (e^α − 1)² ≈ α² for small α, roughly

I(Z1, . . . , Zn; V) ≤ (α²/d) I(X1, . . . , Xn; V)
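The contraction bound can be checked exactly in a small discrete case (an illustrative instance, not from the slides): K = 2 Bernoulli distributions, n = 1 sample, and the randomized-response channel, where I(Z; V) and the supremum over sets S ⊆ {0, 1} are both computable in closed form.

```python
import math

def kl(p, q):
    """KL divergence (nats) between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

alpha = 0.5
p = [0.7, 0.3]                       # P_1, P_2 (Bernoulli parameters), K = 2
pbar = sum(p) / len(p)               # mixture Pbar

# alpha-locally-private randomized response channel
keep = math.exp(alpha) / (1 + math.exp(alpha))
m = [pv * keep + (1 - pv) * (1 - keep) for pv in p]   # marginals M_v(Z = 1)
mbar = sum(m) / len(m)

# I(Z; V) for n = 1: average KL from each marginal to the mixture of marginals
mi_z = sum(kl(mv, mbar) for mv in m) / len(m)

# sup_S (1/K) sum_v (P_v(S) - Pbar(S))^2 over subsets S of {0, 1}
sup_term = max(
    sum((pv - pbar) ** 2 for pv in p) / len(p),               # S = {1}
    sum(((1 - pv) - (1 - pbar)) ** 2 for pv in p) / len(p),   # S = {0}
)
bound = (math.exp(alpha) - 1) ** 2 * sup_term   # n = 1
assert mi_z <= bound
print(f"I(Z;V) = {mi_z:.5f} <= bound = {bound:.5f}")
```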

Page 136: Local Privacy and Statistical Minimax Rates

Summary

High level results:

▶ Formal minimax framework for local differential privacy
▶ Two main theorems bound distances between probability distributions as a function of privacy
  ▶ Pairwise contraction: Le Cam's method
  ▶ Mutual information contraction: Fano's method

Extensions and other conclusions:

▶ In essentially any problem, effective number of samples n ↦ nα²
▶ In d-dimensional problems, effective number of samples n ↦ nα²/d
▶ Rates for regression, multinomial estimation, convex optimization
▶ Dimension-dependent effects: high-dimensional problems are impossible (no logarithmic dependence on dimension)
▶ Identification of the optimal mechanism requires geometric understanding
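The effective-sample-size claim n ↦ nα² is visible in the simplest mechanism: a sketch (illustrative, for a one-dimensional mean with data in [−1, 1]) where each user reports their value plus Laplace(2/α) noise, which is α-locally private since the sensitivity is 2. The estimator's extra mean squared error is then 8/(nα²), i.e. the error of a non-private estimator with roughly nα² samples.

```python
import math
import random

random.seed(0)

def private_mean(xs, alpha):
    """alpha-locally-private mean: each user adds Laplace(2/alpha) noise
    to a value in [-1, 1] (sensitivity 2). A difference of two independent
    Exponential(mean b) variables is Laplace(b)."""
    b = 2.0 / alpha
    noisy = [x + random.expovariate(1 / b) - random.expovariate(1 / b)
             for x in xs]
    return sum(noisy) / len(noisy)

alpha, n, trials = 0.5, 2000, 200
mse = 0.0
for _ in range(trials):
    xs = [random.uniform(-1, 1) for _ in range(n)]
    err = private_mean(xs, alpha) - sum(xs) / n    # error vs the sample mean
    mse += err ** 2 / trials

# Laplace(b) has variance 2 b^2 = 8 / alpha^2 per user, so the extra MSE of
# the mean is 8 / (n alpha^2) -- the n -> n alpha^2 effective sample size.
theory = 8 / (n * alpha ** 2)
assert 0.5 * theory < mse < 2.0 * theory
print(f"empirical MSE = {mse:.4f}, theory ~ {theory:.4f}")
```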

Page 142: Local Privacy and Statistical Minimax Rates

Summary

High level results:

▶ Formal minimax framework for local differential privacy
▶ Two main theorems bound distances between probability distributions as a function of privacy
  ▶ Pairwise contraction: Le Cam's method
  ▶ Mutual information contraction: Fano's method
▶ Trade-offs between privacy and statistical utility
▶ Many open problems:
  ▶ Other models of privacy?
  ▶ Non-local notions of privacy?
  ▶ Privacy without knowing the statistical objective?

Pre/post-print online: "Local privacy and statistical minimax rates." Duchi, Jordan, & Wainwright (2013). arXiv:1302.3203 [stat.TH]