A Privacy-Preserving Framework for Personalized Social Recommendations
Zach Jorgensen (1) and Ting Yu (1,2)
1 NC State University, Raleigh, NC, USA
2 Qatar Computing Research Institute, Doha, Qatar
EDBT, March 24-28, 2014, Athens, Greece


TRANSCRIPT

Page 1: A Privacy-Preserving Framework for Personalized Social Recommendations

A Privacy-Preserving Framework for Personalized

Social Recommendations

Zach Jorgensen (1) and Ting Yu (1,2)

1 NC State University Raleigh, NC, USA

2 Qatar Computing Research Institute Doha, Qatar

EDBT, March 24-28, 2014, Athens, Greece

Page 2

Motivation

• Social recommendation task – to predict items a user might like based on the items his/her friends like

[Figure: a social recommendation system takes social relations and item preferences as input and outputs personalized recommendations (items i1-i5).]

Page 3

Motivation

Model: Top-n Social Recommender

For every item i:
    For every user u:
        Compute μ(i, u)
For every user u:
    Sort items by utility
    Recommend top n items

Input: items, users, social graph, preference graph, number of recommendations n
μ(i, u): the utility of recommending item i to user u
Output: a personalized list of the top n items (by utility), for each user
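The model above can be sketched as a short Python skeleton (illustrative only; the `utility` callback standing in for μ(i, u) is an assumption, not code from the paper):

```python
def top_n_recommendations(items, users, utility, n):
    """Non-private top-n social recommender.

    utility(i, u) returns mu(i, u), the utility of recommending
    item i to user u (computed from the social and preference graphs).
    """
    # Compute mu(i, u) for every item/user pair.
    scores = {u: {i: utility(i, u) for i in items} for u in users}
    # For every user, sort items by utility and keep the top n.
    return {
        u: sorted(items, key=lambda i: scores[u][i], reverse=True)[:n]
        for u in users
    }
```

The two loops mirror the slide: one pass to compute all utilities, then a per-user sort and truncation.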

Page 4

Motivation

The utility of recommending item i to user u:

    μ(i, u) = Σ_{v ∈ Users} sim(u, v) · w(v, i),  with μ(i, u) ∈ ℝ≥0

where sim(u, v) is a social similarity measure computed from the social graph, e.g., Common Neighbors:

    sim(u, v) = |nbrs(u) ∩ nbrs(v)|

and w(v, i) = 1 if the preference edge (v, i) exists, 0 otherwise.
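As an illustrative sketch (the data layout is an assumption: `nbrs` maps each user to a set of friends, `prefers` maps each user to the set of items they prefer), the Common Neighbors utility can be computed directly:

```python
def common_neighbors(nbrs, u, v):
    # sim(u, v) = |nbrs(u) ∩ nbrs(v)|
    return len(nbrs[u] & nbrs[v])

def utility(nbrs, prefers, i, u):
    """mu(i, u) = sum over users v of sim(u, v) * w(v, i),
    where w(v, i) = 1 if v prefers item i, else 0."""
    return sum(common_neighbors(nbrs, u, v) for v in nbrs if i in prefers[v])
```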

Page 5

Motivation

• Many existing structural similarity measures could be used [Survey: Lu & Zhou, 2011]

• We considered:
    – Common Neighbors
    – Adamic-Adar
    – Graph Distance
    – Katz

Page 6

Motivation

Two main privacy problems:
1. Protect privacy of user data from a malicious service provider (i.e., the recommender)
2. Protect privacy of user data from malicious/curious users

Our focus: preventing disclosure of individual item preferences through the output

Page 7

Motivation

Simple attack on Common Neighbors:

[Figure: Bob and Alice. From the observed recommendations, one can infer: Bob listens to Bieber!]

Page 8

Motivation

Adversary:
• Knows all preferences except the target edge
• Observes all recommendations
• Knows the algorithm

Goal: to deduce the presence/absence of a single preference edge (the target edge)

Page 9

Motivation

Differential Privacy [Dwork, 2006]
• Provides strong, formal privacy guarantees
• Informally: guarantees that recommendations will be (almost) the same with/without any one preference edge in the input

Page 10

Motivation

Related work: Machanavajjhala et al. (VLDB 2011)
• Task: for each node, recommend the node with the highest social similarity (Common Neighbors, Katz).
• No distinction between users/items or between preference/social edges.
• Negative theoretical results.

Page 11

Motivation

• We assume that social graph is public

• Often true in practice…

Page 12

Motivation

• Main Contribution: a framework that enables differential privacy guarantees for preference edges

• Demonstrate on real data sets that making accurate and private social recommendations is feasible

Page 13

Outline

• Motivation
• Differential Privacy
• Our Approach
• Experimental Results
• Conclusions

Page 14

Differential Privacy

A randomized algorithm A gives ε-differential privacy if for any neighboring data sets D, D′ and any set of outputs S ⊆ Range(A):

    Pr[A(D) ∈ S] ≤ e^ε · Pr[A(D′) ∈ S]

Neighboring data sets D = (X1, …, Xi, …, Xn) and D′ differ in a single record.

[Dwork, 2006]

Page 15

Achieving Differential Privacy

Given A : Dⁿ → ℝᵈ over a data set D = (X1, …, Xn), release A(D) plus noise calibrated to the global sensitivity of A:

    ΔA = max over neighboring D, D′ of ||A(D) − A(D′)||₁

Theorem: A(D) + Lap(ΔA/ε)ᵈ satisfies ε-differential privacy.

Smaller ε = more noise/privacy.
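A minimal sketch of the Laplace mechanism for a scalar query, assuming the global sensitivity is known:

```python
import random

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Release true_answer + Lap(sensitivity / epsilon) noise.

    For a query whose global sensitivity (the maximum change in the
    answer between neighboring data sets) is `sensitivity`, this
    satisfies epsilon-differential privacy. Smaller epsilon means
    a larger noise scale, i.e., more privacy.
    """
    scale = sensitivity / epsilon
    # The difference of two iid Exp(1) draws is Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_answer + noise
```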

Page 16

Properties of Differential Privacy

• Sequential Composition: answering queries on the same data set D through a DP interface, with budgets ε1, …, εn (each releasing A(D) + Lap(ΔA/εi)), satisfies (Σ εi)-differential privacy.

• Parallel Composition: answering queries on disjoint subsets D1, …, Dn of the data, each with budget ε (releasing A(Di) + Lap(ΔA/ε)), satisfies ε-differential privacy overall.

Page 17

Outline

• Motivation
• Differential Privacy
• Our Approach
    – Simplifying observations
    – Naïve approaches
    – Our approach
• Experimental Results
• Conclusions

Page 18

Simplifying Observations

For every item i:
    For every user u:
        Compute μ(i, u)        (iterations over items use disjoint inputs)
For every user u:
    Sort items by utility
    Recommend top n items      (post-processing)

Our focus: an ε-differentially private procedure for computing μ(i, u), for all users u and a given item i.

Page 19

Naïve Approaches

Approach 1: Noise-on-Utilities

For each item i:
    For every user u:
        Compute μ(i, u) + Lap(Δ/ε), with sensitivity Δ = max_u Σ_v sim(v, u)
For each user u:
    Sort items by utility
    Recommend top n items

Satisfies ε-differential privacy, but the noise destroys accuracy!

Page 20

Naïve Approaches

Approach 2: Noise-on-Edges
1. Add Laplace noise independently to each preference edge weight
2. Run the non-private algorithm with the resulting sanitized preference graph

Example: let

Noise will destroy accuracy!

Page 21

Our Approach

[Figure: the preference graph G_i for item i: users u1-u8 with 0/1 preference-edge weights, and a strategy S that clusters the edges into c1, c2, c3.]

For now, assume S randomly assigns edges to clusters.

Page 22

Our Approach

[Figure: G_i with edges grouped into clusters c1, c2, c3; each cluster's average weight is computed and Laplace noise is added.]

For each cluster, compute the noisy average weight.

Page 23

Our Approach

[Figure: each edge of G_i is relabeled with the noisy average weight of its cluster.]

Replace edge weights with the noisy average of their respective cluster.

Page 24

Our Approach

[Figure: the sanitized graph G_i, with cluster-average edge weights, is fed to the non-private recommender.]

For every item i:
    For each user u:
        Compute μ(i, u)
For each user u:
    Sort items by utility
    Recommend top n items
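The three steps on slides 21-24 (cluster the preference edges, compute a noisy average per cluster, replace each weight with its cluster's noisy average) can be sketched as follows; the data layout and function name are assumptions:

```python
import random

def sanitize_weights(edge_weights, clusters, epsilon, rng=random):
    """edge_weights: dict mapping edge -> 0/1 preference weight.
    clusters: dict mapping cluster id -> list of edges (disjoint).
    Returns a dict mapping edge -> noisy cluster-average weight.

    One preference edge changes a cluster's average by at most
    1/|c|, so Lap(1/(|c| * epsilon)) noise per cluster suffices;
    clusters are disjoint, so parallel composition gives
    epsilon-differential privacy overall.
    """
    sanitized = {}
    for edges in clusters.values():
        avg = sum(edge_weights[e] for e in edges) / len(edges)
        scale = 1.0 / (len(edges) * epsilon)
        # Difference of two iid Exp(1) draws is Laplace(0, 1).
        noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        noisy_avg = avg + noise
        for e in edges:
            sanitized[e] = noisy_avg  # replace weight with noisy average
    return sanitized
```

The sanitized weights can then be fed to any non-private recommender, since post-processing preserves the guarantee.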

Page 25

Our Approach: Rationale

• Adding/removing a single preference edge affects one cluster average by at most 1/|ci|
• Noise added to the average for cluster ci is Lap(1/(|ci| · ε))
• The bigger the cluster, the smaller the noise

Example: let ε = 0.1 and |c| = 50 edges; the noise scale is then 1/(50 · 0.1) = 0.2

Intuition: the bigger the cluster, the less sensitive its average weight is to any one preference edge.
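The example above (ε = 0.1, |c| = 50 edges) works out as plain arithmetic; this small check is not from the slides, just a restatement of the sensitivity argument:

```python
epsilon = 0.1
cluster_size = 50
# Sensitivity of a cluster average over |c| binary weights is 1/|c|,
# so the Laplace noise scale is (1/|c|) / epsilon = 1 / (|c| * epsilon).
scale = 1.0 / (cluster_size * epsilon)
print(scale)  # 0.2
```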

Page 26

Our Approach: Rationale

• The catch – averaging introduces approximation error!

• Need a better clustering strategy that will keep approx. error relatively low

• Strategy must not leak privacy.

Page 27

Our Approach: Clustering Strategy

[Figure: community detection on the public social graph (users u1-u8) produces user clusters c0 and c1.]

Cluster the users based on the natural community structure of the public social graph.

Page 28

Our Approach: Clustering Strategy

[Figure: the user clusters c0 and c1 induce clusters over the preference edges (u1, i) through (u8, i).]

For each item, derive clusters for preference edges based on the user clusters

Page 29

Our Approach: Clustering Strategy

[Figure: as above, the user clusters induce preference-edge clusters.]

Note: we only need to cluster the social graph once; resulting clusters used for all items

Page 30

Our Approach: Clustering Strategy

[Figure: as above.]

Key point: clustering based on the public social graph does not leak privacy!

Page 31

Our Approach: Clustering Strategy

• Louvain Method [Blondel et al. 2008]
    – Greedy modularity maximization
    – Well-studied and known to produce good communities
    – Fast enough for graphs with millions of nodes
    – No parameters to tune
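This clustering step could be sketched with networkx, whose `louvain_communities` function (networkx ≥ 3.0) implements the Louvain method; the toy graph below is an assumption, not data from the paper:

```python
import networkx as nx

# Toy public social graph: two triangles joined by one bridge edge.
G = nx.Graph()
G.add_edges_from([
    ("u1", "u2"), ("u2", "u3"), ("u1", "u3"),  # community A
    ("u4", "u5"), ("u5", "u6"), ("u4", "u6"),  # community B
    ("u3", "u4"),                              # bridge
])

# Greedy modularity maximization (the Louvain method); the only
# knob here is the random seed.
communities = nx.community.louvain_communities(G, seed=42)
for c in communities:
    print(sorted(c))
```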

Page 32

Outline

• Motivation
• Preliminaries
• Our Approach
• Experimental Results
• Conclusions

Page 33

Data Sets

Last.fm:
• 1,892 users
• 17,632 items
• Avg. user degree = 13.4 (std. 17.3)
• Avg. prefs per user = 48.7 (std. 6.9)

Flixster:
• 137,372 users
• 48,756 items
• Avg. user degree = 18.5 (std. 31.1)
• Avg. prefs per user = 54.8 (std. 218.2)

Publicly available:
Last.fm <http://ir.ii.uam.es/hetrec2011/datasets>
Flixster <http://www.sfu.ca/~sja25/datasets>

Page 34

Measuring Accuracy

• Normalized Discounted Cumulative Gain [Järvelin and Kekäläinen. 2002]

• NDCG at n – measures quality of the private recommendations relative to non-private recommendations, taking rank and utility into account

• Ranges from 0.0 to 1.0, with 1.0 meaning private recommender achieves ideal ranking

• Average over all users in data set
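A sketch of NDCG at n as described above, assuming the private ranking and the true (non-private) utilities are given; the function and argument names are illustrative:

```python
import math

def ndcg_at_n(private_ranking, true_utility, n):
    """NDCG of the private top-n list against the ideal ranking by
    true utility. Returns a value in [0, 1]; 1.0 means the private
    recommender achieved the ideal (non-private) ranking."""
    def dcg(ranking):
        # Discounted cumulative gain: utility discounted by log2(rank).
        return sum(
            true_utility[item] / math.log2(rank + 2)  # ranks start at 0
            for rank, item in enumerate(ranking[:n])
        )
    ideal = sorted(true_utility, key=true_utility.get, reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(private_ranking) / ideal_dcg if ideal_dcg > 0 else 1.0
```

Averaging this quantity over all users gives the accuracy numbers plotted in the experiments.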

Page 35

Experiments: Last.fm

[Figure: average accuracy (NDCG at n = 50) vs. privacy; accuracy on the y-axis, privacy increasing from low to high along the x-axis.]

Page 36

Experiments: Flixster

[Figure: average NDCG at 50 for 10,000 random users vs. privacy; note the different y-axis scale.]

Page 37

Experiments: Naïve Approaches

[Figure: accuracy of the naïve approaches on the Last.fm data set for Katz, Common Neighbors, Graph Distance, and Adamic-Adar, at ε = 1.0 and ε = 0.1.]

Page 38

Conclusions

• Differential privacy guarantees for item preferences

• Use clustering and averaging to trade Laplace noise for some approx. error

• Clustering via the community structure of the social graph is a useful heuristic for clustering the edges without violating privacy

• Personalized social recommendations can be both private and accurate

Page 39

THANK YOU!

Page 40

BACKUP SLIDES

Page 41

Page 42

Accuracy Metric: NDCG

• Normalized Discounted Cumulative Gain
    – items recommended to user u by the private recommender, sorted by noisy utility
    – items recommended to user u by the non-private recommender, sorted by true utility
    – NDCG ranges from 0 to 1
    – Averaged over all users in a data set

Page 43

Social Similarity Measures

• Adamic-Adar:

    sim(u, v) = Σ_{x ∈ nbrs(u) ∩ nbrs(v)} 1 / log|nbrs(x)|

• Graph Distance:

    sim(u, v) = 1 / ShortestPathLength(u, v)

• Katz:

    sim(u, v) = Σ_{l=1}^{k} α^l · |paths^{(l)}_{u,v}|

    where |paths^{(l)}_{u,v}| is the number of paths of length l between u and v, and α is a small damping factor.
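These measures can be sketched over an adjacency-set representation (assumed, not from the slides); the Katz sketch counts walks rather than simple paths, matching the usual matrix-power formulation and adequate for toy graphs:

```python
import math

def adamic_adar(nbrs, u, v):
    # sim(u, v) = sum over common neighbors x of 1 / log|nbrs(x)|
    return sum(1.0 / math.log(len(nbrs[x])) for x in nbrs[u] & nbrs[v])

def graph_distance(nbrs, u, v):
    # sim(u, v) = 1 / ShortestPathLength(u, v), via BFS (assumes u != v)
    frontier, seen, dist = {u}, {u}, 0
    while frontier:
        dist += 1
        frontier = {w for x in frontier for w in nbrs[x]} - seen
        if v in frontier:
            return 1.0 / dist
        seen |= frontier
    return 0.0  # no path

def katz(nbrs, u, v, alpha=0.05, k=3):
    # sim(u, v) = sum over l = 1..k of alpha^l * (# walks of length l)
    counts, total = {u: 1}, 0.0
    for l in range(1, k + 1):
        step = {}
        for node, c in counts.items():
            for w in nbrs[node]:
                step[w] = step.get(w, 0) + c
        counts = step
        total += alpha ** l * counts.get(v, 0)
    return total
```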

Page 44

Experiments: Last.fm

[Figure: NDCG at 10 and NDCG at 100.]

Page 45

Experiments: Flixster

[Figure: NDCG at 10 and NDCG at 100.]

Page 46

Comparison of approaches on Last.fm data set.

Low Rank Mechanism (LRM) – Yuan et al. PVLDB’12Group and Smooth (GS) – Kellaris & Papadopoulos. PVLDB’13

Page 47

Relationship between user degree and accuracy, due to approx. error (Common Neighbors).