smooth sensitivity and samplingcompthink/mindswaps/oct07/local... · • review of global...

33
Smooth Sensitivity and Sampling Sofya Raskhodnikova Penn State University Joint work with Kobbi Nissim (Ben Gurion University) and Adam Smith (Penn State University)

Upload: others

Post on 09-Aug-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth Sensitivity andSampling

Sofya RaskhodnikovaPenn State University

Joint work with Kobbi Nissim (Ben Gurion University)

and Adam Smith (Penn State University)

Page 2: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Our main contributions

• Starting point: Global sensitivity framework [DMNS06]

(Cynthia’s talk)

• Two new frameworks for private data analysis

• Greatly expand the types of information that can be

released

2

Page 3: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

3

Page 4: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Model

...

x1

x2

xn

���� Trusted

agency

A�

�Compute f(x)

A(x) =f(x) + noise

Users

Each row is arbitrarily complex data supplied by 1 person.

For which functions f can we have:

• utility: little noise

• privacy: indistinguishability definition of [DMNS06]

4

Page 5: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Privacy as indistinguishability [DMNS06]

Two databases are neighbors if they differ in one row.

x = ...

x1

x2

xn

x′ = ...

x1

x′2

xn

Privacy definition

Algorithm A is ε-indistinguishable if

• for all neighbor databases x, x′

• for all sets of answers S

Pr[A(x) ∈ S] ≤ (1 + ε) · Pr[A(x′) ∈ S]

5

Page 6: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Privacy definition: composition

If A is ε-indistinguishable on each query,it is εq-indistinguishable on q queries.

...

x1

x2

xn

���� ε-indisting.

agency

A�

�Compute f(x)

A(x) =f(x) + noise

Users

6

Page 7: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Global sensitivity framework [DMNS06]

Intuition: f can be released accurately when

it is insensitive to individual entries x1, . . . , xn.

Global sensitivity GSf = maxneighbors x,x′

‖f(x) − f(x′)‖.

Example: GSaverage = 1n

if x ∈ [0, 1]n.

Theorem

If A(x) = f(x) + Lap(

GSf

ε

)then A is ε-indistinguishable.

7

Page 8: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-Based Noise

Big picture for global sensitivity framework:

• add enough noise to cover the worst case for f

• noise distribution depends only on f , not database x

Problem: for some functions that’s too much noise

8

Page 9: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-Based Noise

Big picture for global sensitivity framework:

• add enough noise to cover the worst case for f

• noise distribution depends only on f , not database x

Problem: for some functions that’s too much noise

Example: median of x1, . . . , xn ∈ [0, 1]

x = 0 · · · 0︸ ︷︷ ︸n−1

2

0 1 · · · 1︸ ︷︷ ︸n−1

2

x′ = 0 · · · 0︸ ︷︷ ︸n−1

2

1 1 · · · 1︸ ︷︷ ︸n−1

2

median(x) = 0 median(x′) = 1

GSmedian = 1

• Noise magnitude: 1ε.

8

Page 10: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-Based Noise

Big picture for global sensitivity framework:

• add enough noise to cover the worst case for f

• noise distribution depends only on f , not database x

Problem: for some functions that’s too much noise

Example: median of x1, . . . , xn ∈ [0, 1]

x = 0 · · · 0︸ ︷︷ ︸n−1

2

0 1 · · · 1︸ ︷︷ ︸n−1

2

x′ = 0 · · · 0︸ ︷︷ ︸n−1

2

1 1 · · · 1︸ ︷︷ ︸n−1

2

median(x) = 0 median(x′) = 1

GSmedian = 1

• Noise magnitude: 1ε.

Our goal: noise tuned to database x

8

Page 11: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

9

Page 12: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Local sensitivity

Local sensitivity LSf (x) = maxx′: neighbor of x

‖f(x) − f(x′)‖Reminder: GSf = max

xLSf (x)

Example: median for 0 ≤ x1 ≤ · · · ≤ xn ≤ 1, odd n

�0 1� �� ��x1 xnxm−1 xm+1xm. . . . . .

�median

LSmedian(x) = max(xm − xm−1, xm+1 − xm)

Goal: Release f(x) with less noise when LSf (x) is lower.

10

Page 13: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Local sensitivity

Local sensitivity LSf (x) = maxx′: neighbor of x

‖f(x) − f(x′)‖Reminder: GSf = max

xLSf (x)

Example: median for 0 ≤ x1 ≤ · · · ≤ xn ≤ 1, odd n

�0 1� �� ��x1 xnxm−1 xm+1xm. . . . . .

�median

new medianwhen x′

1 = 1

LSmedian(x) = max(xm − xm−1, xm+1 − xm)

Goal: Release f(x) with less noise when LSf (x) is lower.

10

Page 14: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Local sensitivity

Local sensitivity LSf (x) = maxx′: neighbor of x

‖f(x) − f(x′)‖Reminder: GSf = max

xLSf (x)

Example: median for 0 ≤ x1 ≤ · · · ≤ xn ≤ 1, odd n

�0 1� �� ��x1 xnxm−1 xm+1xm. . . . . .

�median

�new medianwhen x′

n = 0

new medianwhen x′

1 = 1

LSmedian(x) = max(xm − xm−1, xm+1 − xm)

Goal: Release f(x) with less noise when LSf (x) is lower.

10

Page 15: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-based noise: first attempt

Noise magnitude proportional to LSf (x) instead of GSf?

No! Noise magnitude reveals information.

Lesson: Noise magnitude must be an insensitive function.

11

Page 16: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth bounds on local sensitivity

Design sensitivity function S(x)

• S(x) is an ε-smooth upper bound on LSf (x) if:

– for all x: S(x) ≥ LSf (x)

– for all neighbors x, x′ : S(x) ≤ eεS(x′)

x

LSf (x)

Theorem

If A(x) = f(x) + noise

(S(x)

ε

)then A is ε′-indistinguishable.

Example: GSf is always a smooth bound on LSf (x)

12

Page 17: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth bounds on local sensitivity

Design sensitivity function S(x)

• S(x) is an ε-smooth upper bound on LSf (x) if:

– for all x: S(x) ≥ LSf (x)

– for all neighbors x, x′ : S(x) ≤ eεS(x′)

x

LSf (x)

S(x)

Theorem

If A(x) = f(x) + noise

(S(x)

ε

)then A is ε′-indistinguishable.

Example: GSf is always a smooth bound on LSf (x)

12

Page 18: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth Sensitivity

Smooth sensitivity S∗f (x)= max

y

(LSf (y)e−ε·dist(x,y)

)

LemmaFor every ε-smooth bound S: S∗

f (x) ≤ S(x) for all x.

Intuition: little noise when far from sensitive instances

database space

high local

sensitivity

low local

sensitivity

13

Page 19: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth Sensitivity

Smooth sensitivity S∗f (x)= max

y

(LSf (y)e−ε·dist(x,y)

)

LemmaFor every ε-smooth bound S: S∗

f (x) ≤ S(x) for all x.

Intuition: little noise when far from sensitive instances

database space

high local

sensitivity

low local

sensitivity

low smooth sensitivity

13

Page 20: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Computing smooth sensitivity

Example functions with computable smooth sensitivity

• Median & minimum of numbers in a bounded interval

• MST cost when weights are bounded

• Number of triangles in a graph

Approximating smooth sensitivity

• only smooth upper bounds on LS are meaningful

• simple generic methods for smooth approximations

– work for median and 1-median in Ld1

14

Page 21: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

15

Page 22: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

New goal

• Smooth sensitivity framework requires

understanding combinatorial structure of f

– hard in general

• Goal: an automatable transformation from

an arbitrary f into an ε-indistinguishable A

– A(x) ≈ f(x) for ”good” instances x

16

Page 23: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Example: cluster centers

Database entries: points in a metric space.x

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

x′

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

• Comparing sets of centers: Earthmover-like metric

• Global sensitivity of cluster centers is roughly the

diameter of the space. But intuitively, if clustering is

”good”, cluster centers should be insensitive.

• No efficient approximation for smooth sensitivity17

Page 24: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Example: cluster centers

Database entries: points in a metric space.x

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

����

x′

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

����

• Comparing sets of centers: Earthmover-like metric

• Global sensitivity of cluster centers is roughly the

diameter of the space. But intuitively, if clustering is

”good”, cluster centers should be insensitive.

• No efficient approximation for smooth sensitivity17

Page 25: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Example: cluster centers

Database entries: points in a metric space.x

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

��

��

����

x′

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

��

��

����

• Comparing sets of centers: Earthmover-like metric

• Global sensitivity of cluster centers is roughly the

diameter of the space. But intuitively, if clustering is

”good”, cluster centers should be insensitive.

• No efficient approximation for smooth sensitivity17

Page 26: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate framework

Intuition: Replace f with a less sensitive function f̃ .

f̃(x) = g(f(sample1), f(sample2), . . . , f(samples))

� �

� � �

� �

x

xi1 , . . . , xit xj1 , . . . , xjt . . . xk1 , . . . , xkt

� � �

� � �

f f f

gaggregation function

18

Page 27: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate framework

Intuition: Replace f with a less sensitive function f̃ .

f̃(x) = g(f(sample1), f(sample2), . . . , f(samples))

� �

� � �

� �

x

xi1 , . . . , xit xj1 , . . . , xjt . . . xk1 , . . . , xkt

� � �

� � �

f f f

gaggregation function

�� ��+

noise calibratedto sensitivity of f̃

output

18

Page 28: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Good aggregation functions

• average

– works for L1 and L2

• center of attention

– the center of a smallest ball containing a strict

majority of input points

– works for arbitrary metrics

(in particular, for Earthmover)

– gives lower noise for L1 and L2

19

Page 29: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate results

TheoremIf f can be approximated on x

from small samples

then f can be released with little noise

20

Page 30: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate results

TheoremIf f can be approximated on x within distance r

from small samples of size n1−δ

then f can be released with little noise ≈ rε

+ negl(n)

20

Page 31: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate results

TheoremIf f can be approximated on x within distance r

from small samples of size n1−δ

then f can be released with little noise ≈ rε

+ negl(n)

• Works in all ”interesting” metric spaces

• Example applications

– k-means cluster centers (if data is separated a.k.a.[Ostrovsky Rabani Schulman Swamy 06])

– fitting mixtures of Gaussians (if data is i.i.d., using[Vempala Wang 04, Achlioptas McSherry 05])

– PAC concepts (Adam Smith’s talk)

20

Page 32: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

21

Page 33: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Conclusion: fundamental question

Which computations are not too

sensitive to individual inputs?

Which functions f admit

ε-indistinguishable approximation A?

22