secgraph:*a uniform and open-source evaluation system for ...secgraph:*a uniform and open-source...
TRANSCRIPT
SecGraph: A Uniform and Open-source Evaluation System for Graph Data
Anonymization and De-anonymization
Shouling Ji, Weiqing Li, and Raheem Beyah Georgia Ins=tute of Technology
Prateek MiDal Princeton University
Xin Hu IBM Research
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 2
Introduc=on • Graph Data
– Social networks data
– Mobility traces
– Medical data
– ……
• Why data sharing/publishing – Academic research
– Business applica=ons
– Government applica=ons
– Healthcare applica=ons
– ……
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 3
Anonymiza=on and De-‐anonymiza=on
Anonymization
De-anonymization
Utility Anonymization vs DA
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 4
Contribu=on • We studied and analyzed
– 11 state-‐of-‐the-‐art graph anonymiza=on techniques
– 15 modern Structure-‐based De-‐Anonymiza=on (SDA) aDacks
• We developed and evaluated SecGraph – 11 anonymiza=on techniques
– 12 graph u=lity metrics
– 7 applica=on u=lity metrics
– 15 de-‐anonymiza=on aDacks
• We released SecGraph – Publicly available at hDp://www.ece.gatech.edu/cap/secgraph/
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 5
Outline
• Introduc=on • Graph Anonymiza=on • Graph De-‐anonymiza=on
• Anonymiza=on vs. De-‐anonymiza=on Analysis • SecGraph
• Conclusion
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 6
Graph Anonymiza=on • Naïve ID Removal
• Edge Edi=ng (EE) based anonymiza=on – Add/Del
– Switch
• K-‐anonymity – K-‐Neighborhood Anonymiza=on (k-‐NA)
– K-‐Degree Anonymiza=on (k-‐DA)
– K-‐automorphism (k-‐auto)
– K-‐isomorphism (k-‐iso)
• Aggrega=on/class/cluster based anonymiza=on
• Differen=al Privacy (DP) based anonymiza=on
• Random Walk (RW) based anonymiza=on
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 7
U=lity Metrics • Graph u=lity (12)
– Degree (Deg.)
– Joint Degree (JD)
– Effec=ve Diameter (ED)
– Path Length (PL)
– Local Clustering Coefficient (LCC)
– Global Clustering Coefficient (GCC)
– Closeness Centrality (CC)
– Betweenness Central=y (BC)
– Eigenvector (EV)
– Network Constraint (NC)
– Network Resilience
– Infec=ousness (Infe.)
• Applica=on u=lity (7) – Role eXtrac=on (RX)
– Influence Maximiza=on (IM)
– Minimum-‐sized Influen=al Node Set (MINS)
– Community Detec=on (CD)
– Secure Rou=ng (SR)
– Sybil Detec=on (SD)
Graph utility captures how the anonymized data preserve fundamental structural
properties of the original graph
How useful the anonymized data are for practical graph applications and mining
tasks
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 8
Anonymiza=on vs U=lity Resilient to DA attacks
Anonymization Techniques
Conditionally preserve the utility
Partially preserve the utility
Not preserve the utility
preserve the utility
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 9
Anonymiza=on vs U=lity Resilient to DA attacks
Anonymization Techniques
Conditionally preserve the utility
Partially preserve the utility
Not preserve the utility
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 10
Outline
• Introduc=on • Graph Anonymiza=on • Graph De-‐anonymiza=on
• Anonymiza=on vs. De-‐anonymiza=on Analysis • SecGraph
• Conclusion
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 11
Graph De-‐Anonymiza=on (DA) • Seed-‐based DA
– Backstrom et al.’s aDack (BDK)
– Narayanan-‐Shma=kov’s aDack (NS)
– Narayanan et al.’s aDack (NSR)
– Nilizadeh et al.’s aDack (NKA)
– Distance Vector (Srivatsa-‐Hicks aDack) (DV)
– Randomized Spanning Trees (Srivatsa-‐Hicks aDack) (RST)
– Recursive Subgraph Matching (Srivatsa-‐Hicks aDack) (RSM)
– Yartseva-‐Grossglauser’s aDack (YG)
– De-‐Anonymiza=on aDack (Ji et al.’s aDack) (DeA)
– Adap=ve De-‐Anonymiza=on aDack (Ji et al.’s aDack) (ADA)
– Korula-‐LaDanzi’s aDack (KL)
• Seed-‐free DA – Pedarsani et al.’s aDack (PFG)
– Ji et al.’s aDack (JLSB)
Anonymized Graph Auxiliary Graph
Seed mappings
Figure source: E. Kazemi et al., Growing a Graph Matching from a Handful of Seeds, VLDB 2015
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 12
DA ADacks
SF: seed-free AGF: auxiliary graph-free SemF: semantics-free A/P: active/passive attack Scal.: scalable Prac.: practical Rob.: robust to noise
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 13
Outline
• Introduc=on • Graph Anonymiza=on • Graph De-‐anonymiza=on
• Anonymiza=on vs. De-‐anonymiza=on Analysis • SecGraph
• Conclusion
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 14
Anonymiza=on vs. DA
Modern DA Attacks
State-of-the-art Anonymization
Techniques
: vulnerable : conditionally vulnerable : invulnerable
DA attacks!
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 15
Outline
• Introduc=on • Graph Anonymiza=on • Graph De-‐anonymiza=on
• Anonymiza=on vs. De-‐anonymiza=on Analysis • SecGraph
• Conclusion
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 16
SecGraph: Overview & Implementa=on
Publishing Data
AnonymizedData
Raw Data
AnonymizationModule (AM)Utility
Module (UM)
DAModule (DM)
15 DA Attacks
19 Utility Metrics
11 Anonymization Techniques
Raw Data -> AM -> Anonymized Data -> UM/DM -> Publishing Data Raw Data -> Publishing Data
Raw Data -> UM/DM -> AM -> Anonymized Data -> Publishing Data
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 17
SecGraph: Evalua=on • Datasets
– Enron (36.7K users, 0.2M edges): an email network dataset
– Facebook – New Orleans (63.7K users, 0.82M edges): a Facebook friendship dataset in the New Orleans area
• Evalua=on senngs – Follow the same/similar senngs in the original papers
– See details in the paper and hDp://www.ece.gatech.edu/cap/secgraph/
• Evalua=on scenarios – Anonymiza=on vs U=lity
– DA performance
– Robustness of DA aDacks
– Anonymiza=on vs DA
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 18
SecGraph: Anonymity vs U=lity
C1: existing anonymization techniques are better at preserving graph utilities
C2: all the anonymization techniques lose one or more application utility
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 19
SecGraph: DA Performance
0.00%
20.00%
40.00%
60.00%
0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95
De-‐anonymize Enron
NS DV JLSB
0.00%
20.00%
40.00%
60.00%
80.00%
100.00%
0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95
De-‐anonymize Facebook
NS DV JLSB
Phase transitional phenomena
C1: the phase transitional (percolation) phenomena is attack-dependent
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 20
SecGraph: Robustness of DA ADacks % of error seeds
0%
10%
20%
30%
40%
4%
8%
12%
16%
20%
24%
28%
32%
36%
40%
44%
48%
% OF SEED ERROR
De-‐anonymize Enron
NS DV ADA
0% 20% 40% 60% 80% 100%
4%
8%
12%
16%
20%
24%
28%
32%
36%
40%
44%
48%
% OF SEED ERROR
De-‐anonymize Facebook
NS DV ADA
C1: global structure-based DA attacks are more robust to seed errors
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 21
SecGraph: Anonymiza=on vs DA
C1: state-of-the-art anonymization techniques are vulnerable to one or several modern DA attacks
C2: no DA attack is general optimal. The DA performance depends on several factors, e.g., similarity between anonymized and auxiliary graphs, graph density, and DA heuristics
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 22
SecGraph: Release & Support • Website
– hDp://www.ece.gatech.edu/cap/secgraph/
– Sopware
– Datasets
– Documents
– Demo
– Q&A
• Modes – GUI
– Command line
• Suppor=ng – Windows
– Linux
– Mac
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 23
Conclusion
– We analyzed exis=ng graph anonymiza=on techniques and de-‐anonymiza=on aDacks
– We implemented and evaluated SecGraph: a uniform and open-‐source evalua=on system from graph data anonymiza=on and de-‐anonymiza=on
Acknowledgement We thank the anonymous reviewers very much for their valuable
comments!
We are grateful to the following researchers in developing SecGraph: Ada Fu, Michael Hay, Davide Proserpio, Qian Xiao, Shirin Nilizadeh,
Jing S. He, Wei Chen, and Stanford SNAP developers
S. Ji, W. Li, P. MiDal, X. Hu, and R. Beyah SecGraph: A Uniform and Open-‐Source Evalua=on System 24
Thank you! Shouling Ji
hDp://www.ece.gatech.edu/cap/secgraph/
hDp://users.ece.gatech.edu/sji/