Mining Privacy Settings to Find Optimal Privacy-Utility Tradeoffs for Social Network Services

Shumin Guo, Keke Chen

Data Intensive Analysis and Computing (DIAC) Lab, Kno.e.sis Center

Wright State University

Outline

Introduction
  Background
  Research goals
  Contributions

Our modeling methods
  The IRT model
  Our research hypothesis
  Modeling social network privacy and utility
  The weighted/personalized utility model
  Tradeoff between privacy and utility

The experiments
  Social network data from Facebook
  Experimental results

Conclusions

Introduction

Background
  Social network services (SNS) are popular
  SNS are filled with private information → privacy risks
    Online identity theft
    Insurance discrimination
    …
  Protecting SNS privacy is complicated
    Many new young users
      Do not realize the privacy risks
      Do not know how to protect their privacy
    Privacy settings consist of tens of options and involve an implicit privacy-utility tradeoff
  Privacy guidance for new young users?

Some facts
  Privacy settings of Facebook
    27 items
    Each item is set to one of four levels of exposure (“me only”, “friends only”, “friends of friends”, “everyone”)
    By default, most items are set to the highest exposure level
      It is in the SNS provider's best interest to get people exposed and connected to each other

Research goals
  Understand the SNS privacy problem
    The level of “privacy sensitivity” for each personal item
    Quantification of privacy
    The balance between privacy and SNS utility
  Enhancement of SNS privacy
    How to help users express their privacy concerns?
    How to help users automate the privacy configuration with utility preference in mind?

Our contributions
  Develop a privacy quantification framework that considers both privacy and utility
  Understand common users’ privacy concerns
  Help users achieve optimal privacy settings based on their utility preferences
  We study the framework with real data obtained from Facebook

Modeling SNS Users’ Privacy Concerns

Basic idea
  Use the Item Response Theory (IRT) model to understand existing SNS users’ privacy settings
  Derive the quantification of privacy concern with the privacy IRT model
  Map a new user’s privacy concern to the IRT model → find the best privacy setting

The Item Response Theory (IRT) model
  A classic model used in standardized test evaluation
    Example: estimate the ability level of an examinee based on his/her answers to a number of questions
  The two-parameter model
    α: level of discrimination for a certain question
    β: level of difficulty for a certain question
    θ: level of a person’s certain trait
  Mapping to the privacy problem
    Question answer → profile item setting
    Ability → level of privacy concern
    β → sensitivity of the profile item
    α → contribution to overall privacy concern
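As a minimal sketch of the two-parameter model in its privacy reading, assuming the standard logistic form P = 1 / (1 + exp(-α(θ - β))): the function names `irt_2pl` and `estimate_theta`, the grid-search estimator, and the example parameter values are illustrative assumptions and do not come from the slides.

```python
import math

def irt_2pl(theta, alpha, beta):
    """Standard two-parameter logistic IRT curve.

    Privacy reading: probability that a user with privacy concern
    `theta` hides an item with sensitivity `beta` and weight `alpha`.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))

def estimate_theta(settings, alphas, betas, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood estimate of a user's theta.

    settings[j] is 1 if item j is hidden and 0 if it is disclosed.
    The search range [lo, hi] is an assumed, illustrative choice.
    """
    best_theta, best_ll = lo, float("-inf")
    for k in range(steps):
        theta = lo + (hi - lo) * k / (steps - 1)
        ll = 0.0
        for s, a, b in zip(settings, alphas, betas):
            p = irt_2pl(theta, a, b)
            ll += math.log(p) if s == 1 else math.log(1.0 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

# Hypothetical item parameters (not estimated from any real data).
alphas = [1.0, 1.5, 0.8]
betas = [-1.0, 0.0, 1.5]

print(irt_2pl(0.0, 1.5, 0.0))  # 0.5: concern equals item sensitivity
print(estimate_theta([1, 1, 0], alphas, betas))
```

A user who hides every item is pushed to the top of the search range and one who discloses everything to the bottom, which is the usual behavior of a maximum-likelihood estimate with no prior.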

What we get…

[Figure: item characteristic curves for the profile items "network", "relationships", and "Current_city"; axes: level of privacy concern and probability of hiding the item]

Our Research Approach
  Observation: users disclose some profile items while hiding others
    If a user believes an item is not too sensitive, he/she will disclose it
    If a user perceives an item as critical to realizing his/her social utility, he/she may also disclose it
    Otherwise, the user will hide the item
  Hypothesis: users make an implicit balance judgment behind their SNS activities
    If utility gain > privacy risk → disclose
    If utility gain < privacy risk → hide

Modeling SNS privacy
  Use the two-parameter IRT model
  New interpretation of the IRT model
    α: profile-item weight for a user’s overall privacy concern
    β: sensitivity level of the profile item
    θ: level of a user’s privacy concern
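Written out with these symbols, and assuming the standard two-parameter logistic form of the IRT model, the probability that user i hides profile item j would read as follows. The indices i (user) and j (item) are an assumed notation, chosen to match the s_ij that appears later in the slides.

```latex
P_{ij} \;=\; P(s_{ij} = 1 \mid \theta_i)
      \;=\; \frac{1}{1 + e^{-\alpha_j(\theta_i - \beta_j)}}
```

Under this form, a user whose concern θ_i equals the item's sensitivity β_j hides the item with probability 0.5, and α_j controls how sharply the probability rises around that point.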

The complete result looks like…

Finding optimal settings
  Theorem:
    User settings for items: 1: hidden, 0: disclosed
    Probability of hiding the item
    Privacy rating at θ_i

Modeling SNS utility – the same method
  λ: profile-item weight for a user’s SNS utility
  μ: importance level of the profile item
  φ: level of a user’s utility preference
  We can derive: λ = α and μ = -β
  For the utility model, the item setting is the flip of s_ij (1: disclosed, 0: hidden)
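One way to see where λ = α and μ = -β come from is sketched below. It assumes that the privacy model has the standard two-parameter logistic form and that the utility model is the same form applied to the flipped settings. The relation φ_i = -θ_i is a consequence of that assumption and is not stated in the slides.

```latex
\begin{aligned}
P(\text{disclose}_{ij})
  &= 1 - \frac{1}{1 + e^{-\alpha_j(\theta_i - \beta_j)}}
   = \frac{1}{1 + e^{\alpha_j(\theta_i - \beta_j)}} \\
  &= \frac{1}{1 + e^{-\alpha_j\left((-\theta_i) - (-\beta_j)\right)}}
   = \frac{1}{1 + e^{-\lambda_j(\varphi_i - \mu_j)}},
  \qquad \lambda_j = \alpha_j,\quad \mu_j = -\beta_j,\quad \varphi_i = -\theta_i .
\end{aligned}
```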

An important result
  For a specific privacy setting over θ_i:
    Privacy rating + utility rating ≈ a constant
  Privacy and utility are linearly related

The weighted/personalized utility model

Users often have a clear intention for using the SNS but less knowledge about privacy

Users want to put a higher utility weight on certain group(s) of profile items than on others

Users can assign specific weights to profile items to express their preferences

The utility IRT model can be revised into a weighted model (details skipped here)

Illustration of Tradeoff between privacy and utility

The Experiments

The Real Data from Facebook
  Data crawled from Facebook with two accounts
    Account normal: a normal Facebook account with a certain number of friends
    Account fake: a fake account with no friends
  Data crawling steps
    For the friends and “friends of friends” (FoF) of account normal, crawl the profile item visibility of each user
    For the same group of users, crawl the visibility of their profile items again from account fake
    Apply the following inference rules

Deriving privacy settings of users
  Based on the data crawled with the two accounts, we derive the (gross) privacy setting of each user with the following rules
  E: everyone, FoF: friends of friends, F: friends only, O: the account owner
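The rule table itself does not survive in this text. The sketch below is therefore not the authors' table; it only works out what the four exposure levels imply, assuming that account fake is a stranger to every crawled user and that account normal is either a friend or a friend of a friend. The function name `infer_setting` and its arguments are hypothetical.

```python
def infer_setting(relation, seen_by_normal, seen_by_fake):
    """Return the set of exposure levels consistent with what was observed.

    relation       -- "friend" or "fof": how the crawled user relates to
                      account normal (assumed encoding)
    seen_by_normal -- True if account normal could see the item
    seen_by_fake   -- True if account fake (no friends) could see the item
    """
    if seen_by_fake:
        # A stranger can only see items exposed to everyone.
        return {"E"}
    if relation == "friend":
        # A friend sees F and FoF items alike, so these two cannot be
        # told apart; an item hidden from a friend is owner-only.
        return {"F", "FoF"} if seen_by_normal else {"O"}
    if relation == "fof":
        # A friend of a friend sees FoF items but not F or O items.
        return {"FoF"} if seen_by_normal else {"O", "F"}
    raise ValueError("relation must be 'friend' or 'fof'")

print(infer_setting("fof", True, False))
print(infer_setting("friend", False, False))
```

Some observations leave two candidate levels, which is one possible reading of why the derived setting is called a "gross" setting.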

Experimental Results
  Validated with 5-fold cross-validation, with p-value < 0.05

[Figure: privacy rating vs. real setting]
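A minimal sketch of how a 5-fold split over users can be produced is given below. The slides do not say how the folds were formed, so the shuffling, the seed, and the function name `k_fold_splits` are assumptions for illustration only.

```python
import random

def k_fold_splits(n_users, k=5, seed=0):
    """Split user indices 0..n_users-1 into k (train, test) pairs.

    Each user lands in exactly one test fold; the other k-1 folds
    form the training set for that round.
    """
    indices = list(range(n_users))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [u for j, fold in enumerate(folds) if j != i for u in fold]
        splits.append((train, test))
    return splits

# Hypothetical dataset size, chosen only to show uneven fold sizes.
for train, test in k_fold_splits(23):
    print(len(train), len(test))
```

In each round the model parameters would be fitted on the training users and checked against the held-out users' real settings.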

Results of learning weighted utility model

Tradeoff between privacy and utility (unweighted)

[Figure: privacy rating plotted against utility rating]

Few people have a very high level of privacy concern

More people tend to have lower privacy ratings, or implicitly higher utility ratings

Tradeoff between privacy and weighted utility

Conclusion

A framework to address the tradeoff between privacy and utility

A latent trait model (IRT) is used for modeling privacy and utility

We develop a personalized utility model and a tradeoff method for users to find the optimal configuration based on their utility preferences

The models are validated with a large dataset crawled from Facebook
