Mining Privacy Settings to Find Optimal Privacy-Utility Tradeoffs for Social
Network Services
Shumin Guo, Keke Chen
Data Intensive Analysis and Computing (DIAC) Lab, Kno.e.sis Center
Wright State University
Outline
Introduction
  Background
  Research goals
  Contributions
Our modeling methods
  The IRT model
  Our research hypothesis
  Modeling social network privacy and utility
  The weighted/personalized utility model
  Tradeoff between privacy and utility
The experiments
  Social network data from Facebook
  Experimental results
Conclusions
Introduction
Background
Social network services (SNS) are popular
SNS are filled with private information, which creates privacy risks
  Online identity theft
  Insurance discrimination
  …
Protecting SNS privacy is complicated
  Many new, young users
    Do not realize the privacy risks
    Do not know how to protect their privacy
  Privacy settings consist of tens of options and involve an implicit privacy-utility tradeoff
Privacy guidance for new, young users?
Some facts
Privacy settings of Facebook
  27 items
  Each item is set to one of four levels of exposure (“me only”, “friends only”, “friends of friends”, “everyone”)
  By default, most items are set to the highest exposure level
  It is in the SNS provider’s best interest to get people exposed and connected to each other
Research goals
Understand the SNS privacy problem
  The level of “privacy sensitivity” for each personal item
  Quantification of privacy
  The balance between privacy and SNS utility
Enhancement of SNS privacy
  How to help users express their privacy concerns?
  How to help users automate the privacy configuration with utility preference in mind?
Our contributions
Develop a privacy quantification framework that considers both privacy and utility
Understand common users’ privacy concerns
Help users achieve optimal privacy settings based on their utility preferences
We study the framework with real data obtained from Facebook
Modeling SNS Users’ Privacy Concerns
Basic idea
Use the Item Response Theory (IRT) model to understand existing SNS users’ privacy settings
Derive the quantification of privacy concern with the privacy IRT model
Map a new user’s privacy concern to the IRT model to find the best privacy setting
The Item Response Theory (IRT) model
A classic model used in standard test evaluation
Example: estimate the ability level of an examinee based on his/her answers to a number of questions
The two-parametric model
  α: level of discrimination for a certain question
  β: level of difficulty for a certain question
  θ: level of a person’s certain trait
Mapping to the privacy problem
  Question answer → profile item setting
  Ability → level of privacy concern
  Beta → sensitivity of profile item
  Alpha → contribution to overall privacy concern
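The formula for the two-parametric model did not survive in this transcript. As a minimal sketch, the standard two-parameter logistic (2PL) form from the IRT literature can be written as below. The function name and the item parameters are hypothetical and chosen only for illustration; under the mapping above, the returned value would be read as the probability of a given item setting rather than of a correct answer.

```python
import math


def two_pl(theta: float, alpha: float, beta: float) -> float:
    """Standard 2PL response probability: 1 / (1 + exp(-alpha * (theta - beta))).

    theta: level of the person's trait
    alpha: discrimination of the item (slope of the curve)
    beta:  difficulty of the item (point where the probability is 0.5)
    """
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


# Hypothetical item parameters (alpha, beta); not taken from the slides.
example_items = {"item_a": (1.2, -0.5), "item_b": (1.5, 0.8)}

for theta in (-1.0, 0.0, 1.0):
    for name, (alpha, beta) in example_items.items():
        print(f"theta={theta:+.1f} {name}: {two_pl(theta, alpha, beta):.3f}")
```

The curve crosses 0.5 exactly where θ equals β, and a larger α makes the transition around that point steeper.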
What we get…
[Figure: probability of hiding the item versus level of privacy concern; labeled items: network, relationships, Current_city]
Our Research Approach
Observation: users disclose some profile items while hiding others
  If a user believes an item is not too sensitive, he/she will disclose this item
  If a user perceives an item as critical to realizing his/her social utility, he/she may also disclose it
  Otherwise, the user will hide this item
Hypothesis: users have some implicit balance judgment behind their SNS activities
  If utility gain > privacy risk → disclose
  If utility gain < privacy risk → hide
Modeling SNS privacy
Use the two-parametric IRT model
New interpretation of the IRT model
  α: profile-item weight for a user’s overall privacy concern
  β: sensitivity level of the profile item
  θ: level of a user’s privacy concern
The complete result looks like…
Finding optimal settings
Theorem:
  User settings for items: 1: hidden, 0: disclosed
  Probability of hiding the item
  Privacy rating at i
Modeling SNS utility – the same method
  λ: profile-item weight for a user’s SNS utility
  μ: importance level of the profile item
  φ: level of a user’s utility preference
We can derive: λ = α and μ = -β
For utility model, we have:
is the flip of sij
An important result
For a specific privacy setting over θ_i:
  Privacy rating + utility rating ≈ a constant
  Privacy and utility are linearly related
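One way to build intuition for this result is the logistic identity σ(x) + σ(−x) = 1. The sketch below checks it numerically for a single item. It assumes both models use the standard 2PL form, takes λ = α and μ = −β from the slides, and additionally assumes φ = −θ, which the slides do not state. The exact rating definitions are not in this transcript, so only per-item probabilities are compared; all parameter values are hypothetical.

```python
import math


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def p_hide(theta: float, alpha: float, beta: float) -> float:
    """Privacy model: probability of hiding the item (assumed 2PL form)."""
    return logistic(alpha * (theta - beta))


def p_disclose(phi: float, lam: float, mu: float) -> float:
    """Utility model: probability of disclosing the item (assumed 2PL form)."""
    return logistic(lam * (phi - mu))


alpha, beta = 1.4, 0.6      # hypothetical privacy parameters for one item
lam, mu = alpha, -beta      # relation stated on the slide

totals = []
for theta in (-2.0, -0.5, 0.0, 1.0, 2.5):
    phi = -theta            # assumption made for this sketch only
    totals.append(p_hide(theta, alpha, beta) + p_disclose(phi, lam, mu))

print(totals)
```

Under these assumptions the two probabilities are complements of each other at every level of θ, so any rating built by summing them over items moves in opposite directions by the same amount.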
The weighted/personalized utility model
Users often have a clear intention for using social networks but less knowledge of privacy
Users want to put a higher utility weight on certain group(s) of profile items than on others
Users can assign specific weights to profile items to express their preferences
The utility IRT model can be revised into a weighted model (details skipped here)
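The slides skip the details of the weighted model, so the following is only a hypothetical illustration of how user-assigned weights could enter a utility score, not the weighted IRT model itself. It assumes the 1 = hidden, 0 = disclosed encoding used elsewhere in the deck and scores a setting by the weight-normalized share of disclosed items. The function name, item names, and weights are all made up for the example.

```python
def weighted_utility(settings: dict, weights: dict) -> float:
    """Weight-normalized share of disclosed items.

    settings: item name -> 1 (hidden) or 0 (disclosed)
    weights:  item name -> non-negative weight assigned by the user
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    disclosed = sum(w * (1 - settings[item]) for item, w in weights.items())
    return disclosed / total_weight


# Hypothetical user: cares most about the "relationships" item but hides it.
settings = {"current_city": 0, "relationships": 1, "network": 0}
weights = {"current_city": 1.0, "relationships": 3.0, "network": 1.0}

score = weighted_utility(settings, weights)
print(score)
```

Hiding a heavily weighted item lowers the score more than hiding a lightly weighted one, which is the behavior a personalized utility preference is meant to capture.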
Illustration of Tradeoff between privacy and utility
The Experiments
The Real Data from Facebook
Data crawled from Facebook with two accounts
  Account normal: a normal Facebook account, which has a certain number of friends
  Account fake: a fake account with no friends
Data crawling steps
  For the friends and “friends of friends” (FoF) of account normal, crawl the profile item visibility of each user
  For the same group of users, crawl the visibility of their profile items again with account fake
We have the following inference rules
Deriving privacy settings of users
Based on the data crawled with the FoF of the two accounts, we derive the (gross) privacy setting of a user based on the following rules
E: everyone, FoF: friends of friends, O: the account owner, F: friends only
Experimental Results
Validated with 5-fold cross-validation
With p-value < 0.05
[Figure: privacy rating versus real setting]
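For readers unfamiliar with the validation scheme, a generic 5-fold split can be sketched as follows. This is a plain illustration of k-fold cross-validation, not the authors' evaluation code; the function name is hypothetical, and the sketch splits indices in order without shuffling.

```python
def k_fold_indices(n: int, k: int = 5):
    """Split indices 0..n-1 into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs; each index
    appears in exactly one test fold.
    """
    if not 1 < k <= n:
        raise ValueError("k must satisfy 1 < k <= n")
    # The first n % k folds get one extra element.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


for train, test in k_fold_indices(23, k=5):
    print(len(train), len(test))
```

In each round the model is fit on the training indices and evaluated on the held-out fold, so every user contributes to the evaluation exactly once.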
Results of learning weighted utility model
Tradeoff between privacy and utility (unweighted)
[Figure: privacy rating versus utility rating]
Few people have a very high level of privacy concern
More people tend to have lower privacy ratings, or implicitly higher utility ratings
Tradeoff between privacy and weighted utility
Conclusion
A framework to address the tradeoff between privacy and utility
A latent trait model (IRT) is used to model privacy and utility
We develop a personalized utility model and a tradeoff method for users to find the optimal configuration based on their utility preferences
The models are validated with a large dataset crawled from Facebook