On Link Privacy in Randomizing Social Networks

Xiaowei Ying, Xintao Wu
Univ. of North Carolina at Charlotte
PAKDD-09, April 28, Bangkok, Thailand

Page 1: On Link Privacy in Randomizing Social Networks

Xiaowei Ying, Xintao Wu

Univ. of North Carolina at Charlotte

PAKDD-09 April 28, Bangkok, Thailand

On Link Privacy in Randomizing Social Networks

Page 2: On Link Privacy in Randomizing Social Networks

Motivation

Privacy Preserving Social Network Publishing

node-anonymization cannot guarantee identity/link privacy due to subgraph queries. Backstrom et al. WWW07, Hay et al. UMass TR07

edge randomization: Random Add/Del, Random Switch

K-anonymity: Hay et al. VLDB08, Liu&Terzi SIGMOD08, Zhou&Pei ICDE08

Utility preserving randomization: spectral feature preserving (Ying&Wu SDM08); real space feature preserving (Ying&Wu SDM09)

Page 3: On Link Privacy in Randomizing Social Networks

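Problem Formalization

Original graph: $G(n, m)$ with adjacency matrix $A = (a_{ij})_{n \times n}$

Randomization: add k then del k edges

Randomized graph: $\tilde{G}(n, m)$ with adjacency matrix $\tilde{A} = (\tilde{a}_{ij})_{n \times n}$

Prior belief vs. posterior belief: $P(a_{ij} = 1)$ vs. $P(a_{ij} = 1 \mid \tilde{G})$ (Ying&Wu SDM08)

This paper: $P(a_{ij} = 1 \mid \tilde{a}_{ij}, \tilde{m}_{ij} = x)$ ?

$\tilde{m}_{ij}$: similarity measure value between node i and j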
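
The add-then-delete randomization named on this slide can be sketched as below. This is only an illustration: the function name `rand_add_del` and the edge-set representation are hypothetical, and the sketch assumes the k deleted edges are drawn from the original edges, so that an added edge is never removed again.

```python
import random
from itertools import combinations

def rand_add_del(n, edges, k, seed=None):
    """Add k random non-edges, then delete k random original edges.

    n: number of nodes, labelled 0..n-1
    edges: iterable of undirected edges (i, j)
    Returns the randomized edge set; it has the same size as the input.
    """
    rng = random.Random(seed)
    original = {tuple(sorted(e)) for e in edges}
    # All node pairs that are not linked in the original graph.
    non_edges = [p for p in combinations(range(n), 2) if p not in original]
    if k > len(original) or k > len(non_edges):
        raise ValueError("k is too large for this graph")
    added = set(rng.sample(non_edges, k))
    # Assumption: deletions are drawn from the original edges only.
    deleted = set(rng.sample(sorted(original), k))
    return (original - deleted) | added

# Usage: a 6-node ring, perturbed with k = 2.
ring = [(i, (i + 1) % 6) for i in range(6)]
randomized = rand_add_del(6, ring, k=2, seed=1)
```

The number of edges m is unchanged, which matches the slide's notation of the same (n, m) for both graphs.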

Page 4: On Link Privacy in Randomizing Social Networks

Network of US political books

(105 nodes, 441 edges, r=8%)

Books about US politics sold by Amazon.com. Edges represent frequent co-purchasing of books by the same buyers. Nodes have been given colors of blue, white, or red to indicate whether they are "liberal", "neutral", or "conservative".

http://www-personal.umich.edu/~mejn/netdata/
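
The slide does not define r. One reading that is consistent with the numbers given is that r is the edge density, that is, the fraction of all node pairs that are linked. This is an assumption; the arithmetic is:

```latex
r = \frac{m}{\binom{n}{2}} = \frac{441}{105 \cdot 104 / 2} = \frac{441}{5460} \approx 0.081 \approx 8\%
```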

Polbooks network

Page 5: On Link Privacy in Randomizing Social Networks

Proportion of true edges vs. similarity

After randomly adding and then deleting 200 edges (the network has 441 edges in total)

Page 6: On Link Privacy in Randomizing Social Networks

Similarity measures vs. Link prediction

Similarity measures

The number of common neighbors
Adamic/Adar, the weighted number of common neighbors
Katz, a weighted sum of the number of paths connecting two nodes
Commute time, the expected steps of random walks from node i to j and back to i

Similarity measures have been exploited in the classic link prediction problem. Liben-Nowell&Kleinberg CIKM03
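
The four measures listed on this slide can be sketched with NumPy on a dense adjacency matrix. The formulas used are the standard textbook definitions, not taken from the slides; the function names and the damping value `beta` are assumptions, and the commute-time formula assumes a connected graph.

```python
import numpy as np

def common_neighbors(A):
    # Entry (i, j) counts nodes adjacent to both i and j.
    # Diagonal entries are node degrees and should be ignored.
    return A @ A

def adamic_adar(A):
    # Each common neighbor z contributes 1 / log(deg(z)).
    deg = A.sum(axis=1)
    w = np.zeros(len(deg))
    mask = deg > 1          # a common neighbor always has degree >= 2
    w[mask] = 1.0 / np.log(deg[mask])
    return A @ np.diag(w) @ A

def katz(A, beta=0.05):
    # Sum over l >= 1 of beta^l * (number of length-l walks from i to j).
    # The series converges only if beta < 1 / (largest eigenvalue of A).
    n = A.shape[0]
    lam_max = np.max(np.abs(np.linalg.eigvalsh(A)))
    if beta * lam_max >= 1:
        raise ValueError("beta is too large for the series to converge")
    return np.linalg.inv(np.eye(n) - beta * A) - np.eye(n)

def commute_time(A):
    # Expected steps of a random walk from i to j and back to i:
    # (sum of degrees) * (L+_ii + L+_jj - 2 L+_ij), L+ = pseudoinverse of L.
    deg = A.sum(axis=1)
    L_pinv = np.linalg.pinv(np.diag(deg) - A)
    d = np.diag(L_pinv)
    return deg.sum() * (d[:, None] + d[None, :] - 2 * L_pinv)

# Usage: the path graph 0 - 1 - 2.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
cn, aa, kz, ct = common_neighbors(A), adamic_adar(A), katz(A, 0.1), commute_time(A)
```

Larger values mean "closer" for the first three measures, while a smaller commute time means closer, so a ranking built on commute time has to be reversed.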

Page 7: On Link Privacy in Randomizing Social Networks

Proportion of true edges vs. similarity

After randomly adding and then deleting 200 edges (the network has 441 edges in total)
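
A minimal sketch of how the plotted quantity could be computed. It assumes common neighbors as the similarity measure and assumes that similarity is measured on the randomized graph; the function name and both input matrices are hypothetical.

```python
import numpy as np
from collections import defaultdict

def true_edge_proportion_by_similarity(A_true, A_rand):
    """Map each similarity value x to the fraction of node pairs with
    that value (in the randomized graph) that are edges in the original."""
    n = A_true.shape[0]
    sim = A_rand @ A_rand                 # common-neighbor counts
    iu, ju = np.triu_indices(n, k=1)      # each unordered pair once
    totals, hits = defaultdict(int), defaultdict(int)
    for x, a in zip(sim[iu, ju], A_true[iu, ju]):
        totals[int(x)] += 1
        hits[int(x)] += int(a)
    return {x: hits[x] / totals[x] for x in sorted(totals)}

# Usage: a triangle 0-1-2 plus an isolated node 3, left unperturbed.
A0 = np.array([[0, 1, 1, 0],
               [1, 0, 1, 0],
               [1, 1, 0, 0],
               [0, 0, 0, 0]])
proportions = true_edge_proportion_by_similarity(A0, A0)
```

In this toy case every pair with one common neighbor is a true edge and every pair with none is not, which is the extreme form of the trend the slide's title describes.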

Page 8: On Link Privacy in Randomizing Social Networks

Calculating Posterior belief

The attacker does not know this value. What can he do?

$p_1 = k/m$, $p_2 = k / [C_n^2 - m]$

Applying Bayes theorem
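
The slide's own Bayes formula is not preserved in this transcript. As a hedged illustration of the Bayes step, the sketch below covers the simpler case without similarity measures. It assumes $p_1 = k/m$ is the chance that a true edge is deleted, $p_2 = k/[C_n^2 - m]$ is the chance that a non-edge is added, and the prior is the edge density $m / C_n^2$. The function name is hypothetical.

```python
from math import comb

def posterior_without_similarity(n, m, k):
    """Return (P(a=1 | observed edge), P(a=1 | observed non-edge))."""
    N = comb(n, 2)            # number of node pairs, C_n^2
    prior = m / N             # assumed prior: edge density
    p1 = k / m                # a true edge is deleted
    p2 = k / (N - m)          # a non-edge is added
    # Bayes theorem for an edge observed in the randomized graph.
    post_edge = (1 - p1) * prior / ((1 - p1) * prior + p2 * (1 - prior))
    # Bayes theorem for a non-edge observed in the randomized graph.
    post_non_edge = p1 * prior / (p1 * prior + (1 - p2) * (1 - prior))
    return post_edge, post_non_edge

# Usage with the polbooks sizes from the earlier slide and k = 200.
post_edge, post_non_edge = posterior_without_similarity(105, 441, 200)
```

Under these assumptions the first value simplifies to (m - k)/m and the second to k/(C_n^2 - m), so with k = 200 an observed edge is genuine with probability 241/441.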

Page 9: On Link Privacy in Randomizing Social Networks

MLE estimation

Estimate based on randomized graph

Posterior belief can be calculated by attackers
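
The estimator itself is not preserved in this transcript. One standard way to recover a true-edge rate from randomized data is sketched below. It is an assumption, not the slide's derivation: each pair is treated as flipped independently, with a true edge deleted with probability `p1` and a non-edge added with probability `p2`. Function names are hypothetical.

```python
def estimate_true_edge_rate(observed_rate, p1, p2):
    """Estimate the fraction of true edges among pairs sharing a
    similarity value, from the fraction observed as edges after
    randomization. Model: observed = rho * (1 - p1) + (1 - rho) * p2."""
    denom = 1 - p1 - p2
    if denom <= 0:
        raise ValueError("perturbation too strong to invert")
    rho = (observed_rate - p2) / denom
    return min(1.0, max(0.0, rho))    # keep the estimate a probability

def posterior_given_observed_edge(rho, p1, p2):
    """P(true edge | edge observed), with rho as the prior."""
    denom = (1 - p1) * rho + p2 * (1 - rho)
    return (1 - p1) * rho / denom if denom > 0 else 0.0

# Usage: if 30% of the pairs were true edges, p1 = 0.4 and p2 = 0.05,
# the randomized graph would show 0.3 * 0.6 + 0.7 * 0.05 = 21.5% edges.
rho_hat = estimate_true_edge_rate(0.215, p1=0.4, p2=0.05)
belief = posterior_given_observed_edge(rho_hat, p1=0.4, p2=0.05)
```

Everything on the right-hand side is computable from the randomized graph once k is known, which is the point the slide makes about the attacker.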

Page 10: On Link Privacy in Randomizing Social Networks

Comparison

Page 11: On Link Privacy in Randomizing Social Networks

Comparison

Page 12: On Link Privacy in Randomizing Social Networks

Empirical Evaluation

Attacker's Prediction Strategy

Calculate posterior probability of all node pairs
Choose top t node pairs (with highest post. prob.) as predicted candidate links

For each t, the precision of predictions (k=0.5m)
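
The prediction strategy and its precision can be sketched as follows. The function name and the dictionary of posterior values are hypothetical; precision is taken here as the fraction of the top t predicted pairs that are true links.

```python
def top_t_precision(posterior, true_edges, t):
    """posterior: dict mapping a node pair (i, j) to its posterior probability.
    true_edges: iterable of true links (i, j) in the original graph.
    Returns the fraction of the t highest-scoring pairs that are true links."""
    if t <= 0:
        raise ValueError("t must be positive")
    # Rank all node pairs by posterior probability, highest first.
    ranked = sorted(posterior, key=posterior.get, reverse=True)
    predicted = ranked[:t]
    truth = {tuple(sorted(e)) for e in true_edges}
    hits = sum(1 for pair in predicted if tuple(sorted(pair)) in truth)
    return hits / len(predicted)

# Usage with made-up posterior values for four node pairs.
scores = {(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.1, (2, 3): 0.7}
links = [(0, 1), (2, 3)]
precision_at_2 = top_t_precision(scores, links, t=2)
precision_at_3 = top_t_precision(scores, links, t=3)
```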

Page 13: On Link Privacy in Randomizing Social Networks

Empirical Evaluation

$k = 0.5m$

Posterior beliefs that exploit similarity measures achieve higher precision than those that do not.

The measure that works best on one data set is not necessarily the best on another.

Page 14: On Link Privacy in Randomizing Social Networks

Determining k to guarantee privacy

Data Owner

Page 15: On Link Privacy in Randomizing Social Networks

Conclusion & Future Work

We have shown that node proximity measures can be exploited by attackers to breach link privacy in edge add/del randomized networks.

How about other topological properties?

$P(a_{ij} = 1 \mid \tilde{G})$, $P(a_{ij} = 1 \mid \tilde{a}_{ij}, \tilde{m}_{ij} = x)$, ?

How about other randomization strategies?

Privacy vs. utility tradeoff

Page 16: On Link Privacy in Randomizing Social Networks

Questions?

Acknowledgments

This work was supported in part by U.S. National Science Foundation IIS-0546027 and CNS-0831204.

Thank You!

Page 17: On Link Privacy in Randomizing Social Networks

Utility preserving randomization (Ying&Wu, SDM09)

Graph space: {G : with the given degree seq. & a constraint on $S(G)$ involving $R$}

Examining the proportion of sample graphs in which a link exists between node i and j

samples: $G_1, G_2, \ldots, G_N \in \text{Space}$

$P(a_{ij} = 1 \mid \text{Space}) \approx \frac{1}{N} \sum_{k=1}^{N} G_k(i, j)$

Attacker's confidence on link (i,j)
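
The sampling estimate on this slide can be sketched as below. The sketch is a simplification and not the cited method: it samples graphs with the given degree sequence by random edge switches and does not enforce the additional feature constraint on S(G). Function names and parameter defaults are assumptions.

```python
import random

def switch_sample(edges, n_switches, rng):
    """Return an edge set with the same degree sequence as `edges`,
    obtained by repeatedly switching (a,b),(c,d) to (a,d),(c,b)."""
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_switches):
        i1, i2 = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i1], edge_list[i2]
        if rng.random() < 0.5:            # pick one of the two pairings
            c, d = d, c
        if len({a, b, c, d}) < 4:         # would create a self-loop
            continue
        e1, e2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if e1 in edge_set or e2 in edge_set:   # would create a multi-edge
            continue
        edge_set -= {edge_list[i1], edge_list[i2]}
        edge_set |= {e1, e2}
        edge_list[i1], edge_list[i2] = e1, e2
    return edge_set

def link_confidence(edges, i, j, n_samples=200, n_switches=100, seed=0):
    """Fraction of sampled graphs that contain the link (i, j)."""
    rng = random.Random(seed)
    pair = tuple(sorted((i, j)))
    hits = sum(pair in switch_sample(edges, n_switches, rng)
               for _ in range(n_samples))
    return hits / n_samples

# Usage: a 6-node ring as the released graph.
ring6 = [(i, (i + 1) % 6) for i in range(6)]
one_sample = switch_sample(ring6, 50, random.Random(1))
confidence = link_confidence(ring6, 0, 3)
```

Each sample restarts from the released graph, so `n_switches` should be large enough for the samples to spread over the space; the value used here is illustrative only.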