To Join or Not to Join: The Illusion of Privacy in Social Networks with Mixed Public and Private User Profiles

Paper by Elena Zheleva and Lise Getoor



Introduction

• What do we want to find out about a “private” profile?
– Sensitive information
• What is sensitive information?
– What advertising agencies and companies want to know
– What you do not want others to find out

How can we find out private information?

• If a profile is really private, how can you find out anything?
– What if it were not Facebook? A completely anonymous profile?
– Use whatever public information you have.
• Use tactics that exploit friendship links
• Exploit group affiliations
– Neither Facebook nor Flickr hides group members.

BASIC model

• Guess the sensitive attribute from the overall distribution of its publicly known values.

[Figure: friendship graph of Ana, Bob, Chris, Don, Emma, Fabio, and Gia; three profiles are marked “??” (private). Sensitive info = favorite color: Orange, Blue, Green.]
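A minimal sketch of the BASIC guess, assuming the four public colors that appear in the example tables later in the talk (Chris and Emma: Orange, Don: Green, Fabio: Blue). The names `public_colors` and `basic_guess` are illustrative and not taken from the paper.

```python
from collections import Counter

# Public profiles from the toy example (assumed from the slides' tables).
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}
private_users = ["Ana", "Bob", "Gia"]

def basic_guess(public):
    """Return the most frequent public value and the full distribution."""
    counts = Counter(public.values())
    total = sum(counts.values())
    distribution = {value: n / total for value, n in counts.items()}
    # Break ties alphabetically so the result is deterministic.
    best = min(counts, key=lambda v: (-counts[v], v))
    return best, distribution

guess, distribution = basic_guess(public_colors)
# BASIC ignores links and groups: every private profile gets the same guess.
predictions = {user: guess for user in private_users}
```

Because BASIC uses no relational information at all, it serves as the floor that the link and group attacks have to beat.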

Sensitive-Attribute Inference Models

• We assume the overall distribution of the sensitive attribute is either known or can be estimated from public profiles.

• We treat the BASIC model as the baseline attack.

• A successful attack is one that, given extra knowledge, achieves significantly higher accuracy than the baseline.

Our Model

[Figure: the same friendship graph together with two groups, “True Blue Lovers” (Bob, Gia, Fabio) and “Espresso Lovers” (Bob, Emma, Chris, Don). Sensitive info = favorite color: Orange, Blue, Green.]

“Tell me who your friends are, and I’ll tell you who you are”

Link Based Attacks

• Friend-aggregate model (AGG)
• Collective Classification model (CC)
• Flat-link model (LINK)

Friend-aggregate model (AGG)

[Figure: example friendship graph; private profiles are marked “??”.]

• Given my friends' public values, which value am I most likely to have?

• Score for each value = number of friends whose public profile shows that value / total links
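The friend-aggregate idea can be sketched as follows. The edge list is an assumption read off the example adjacency table on the LINK slide, and the denominator follows the slide's wording ("total links"); counting only friends with public profiles would be an equally plausible reading. Function names are illustrative.

```python
from collections import Counter

# Friendship edges assumed from the example adjacency table.
edges = [("Ana", "Bob"), ("Ana", "Don"), ("Bob", "Chris"), ("Bob", "Emma"),
         ("Bob", "Fabio"), ("Don", "Gia"), ("Emma", "Fabio"), ("Emma", "Gia"),
         ("Fabio", "Gia")]
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def friends_of(user):
    """All users sharing an edge with `user`."""
    return sorted({b if a == user else a for a, b in edges if user in (a, b)})

def agg_scores(user):
    """Share of the user's links that lead to each publicly visible value."""
    friends = friends_of(user)
    counts = Counter(public_colors[f] for f in friends if f in public_colors)
    return {value: n / len(friends) for value, n in counts.items()}

def agg_predict(user):
    """Highest-scoring value; ties are broken alphabetically (an arbitrary choice)."""
    scores = agg_scores(user)
    return min(scores, key=lambda v: (-scores[v], v)) if scores else None
```

In this toy graph Bob's friends lean toward Orange, while Gia's three public friends each show a different color, so her prediction rests entirely on the tie-break.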

Collective Classification model (CC)

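• AGG with re-evaluation: predictions for private profiles are fed back in as evidence, and the guesses are re-estimated until they stop changing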

[Figure: example friendship graph; private profiles are marked “??”.]
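A simplified sketch of the re-evaluation loop, assuming a plain majority vote over friends as the local classifier; the paper's collective classifier may differ in its details. The edge list is assumed from the example adjacency table, and all names are illustrative.

```python
from collections import Counter

edges = [("Ana", "Bob"), ("Ana", "Don"), ("Bob", "Chris"), ("Bob", "Emma"),
         ("Bob", "Fabio"), ("Don", "Gia"), ("Emma", "Fabio"), ("Emma", "Gia"),
         ("Fabio", "Gia")]
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}
users = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]

def friends_of(user):
    return sorted({b if a == user else a for a, b in edges if user in (a, b)})

def majority(labels):
    """Most common label; ties broken alphabetically, None if there are no labels."""
    counts = Counter(labels)
    return min(counts, key=lambda v: (-counts[v], v)) if counts else None

def collective_classify(max_rounds=10):
    private = [u for u in users if u not in public_colors]
    # Bootstrap: one AGG-style pass that sees only public friends.
    guesses = {u: majority(public_colors[f] for f in friends_of(u) if f in public_colors)
               for u in private}
    for _ in range(max_rounds):
        # Treat current guesses as if they were observed, then re-predict.
        current = {**public_colors, **{u: g for u, g in guesses.items() if g is not None}}
        updated = {u: majority(current[f] for f in friends_of(u) if f in current)
                   for u in private}
        if updated == guesses:  # converged
            break
        guesses = updated
    return guesses

result = collective_classify()
```

The difference from AGG is that Ana's guess now also takes Bob's predicted color into account, even though Bob's profile is private.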

Flat-link model (LINK)

[Figure: example friendship graph; private profiles are marked “??”.]

• Flatten the data by taking each row of the graph's adjacency matrix as a feature vector.

• This turns the problem into standard classification.

       Ana  Bob  Chris  Don  Emma  Fabio  Gia  Color
Ana     1    1     0     1    0      0     0   ?
Bob     1    1     1     0    1      1     0   ?
Chris   0    1     1     0    0      0     0   Orange
Don     1    0     0     1    0      0     1   Green
Emma    0    1     0     0    1      1     1   Orange
Fabio   0    1     0     0    1      1     1   Blue
Gia     0    0     0     1    1      1     1   ?
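To illustrate what "flatten, then classify" means, here is a sketch that feeds the adjacency rows above to a 1-nearest-neighbor rule with Hamming distance. The choice of classifier is an assumption made only to keep the example dependency-free; the slides do not say which classifier was used.

```python
names = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]
adjacency = {
    "Ana":   [1, 1, 0, 1, 0, 0, 0],
    "Bob":   [1, 1, 1, 0, 1, 1, 0],
    "Chris": [0, 1, 1, 0, 0, 0, 0],
    "Don":   [1, 0, 0, 1, 0, 0, 1],
    "Emma":  [0, 1, 0, 0, 1, 1, 1],
    "Fabio": [0, 1, 0, 0, 1, 1, 1],
    "Gia":   [0, 0, 0, 1, 1, 1, 1],
}
colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def hamming(a, b):
    """Number of positions where two 0/1 rows differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_label(user):
    # min() keeps the first of several equally close labeled rows.
    nearest = min(colors, key=lambda other: hamming(adjacency[user], adjacency[other]))
    return colors[nearest]

predictions = {u: nearest_label(u) for u in names if u not in colors}
```

Note that Emma and Fabio have identical rows but different colors, so no classifier working on these features alone can separate them; Gia's prediction here depends on which of the two is encountered first.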

Group Based Attacks

• Groupmate-link model (CLIQUE)
– Considers all people in a group as friends
• Group-based classification model (GROUP)
– Considers each group as a feature in a classifier

Groupmate-link model (CLIQUE)

[Figure: two groups, “True Blue Lovers” (Bob, Gia, Fabio) and “Espresso Lovers” (Bob, Emma, Chris, Don); private profiles are marked “??”.]

• Treat everyone who shares a group as a friend

• Then flatten to an adjacency matrix
• Apply the LINK method to the result

       Ana  Bob  Chris  Don  Emma  Fabio  Gia  Color
Ana     0    0     0     0    0      0     0   ?
Bob     0    1     1     1    1      1     1   ?
Chris   0    1     1     1    1      0     0   Orange
Don     0    1     1     1    1      0     0   Green
Emma    0    1     1     1    1      0     0   Orange
Fabio   0    1     0     0    0      1     1   Blue
Gia     0    1     0     0    0      1     1   ?
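The groupmate matrix above can be derived mechanically from group membership. A short sketch, assuming the two groups from the example figure; `groupmate_matrix` is an illustrative name.

```python
names = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]
groups = {
    "True Blue Lovers": ["Bob", "Fabio", "Gia"],
    "Espresso Lovers": ["Bob", "Chris", "Don", "Emma"],
}

def groupmate_matrix(names, groups):
    """Link every pair of users who share at least one group."""
    linked = {name: set() for name in names}
    for members in groups.values():
        for member in members:
            linked[member].update(members)  # includes the member itself
    return {name: [int(other in linked[name]) for other in names] for name in names}

matrix = groupmate_matrix(names, groups)
```

Ana belongs to no group, so her row is all zeros and a CLIQUE attack has nothing to go on for her.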

Group-based classification model (GROUP)

[Figure: two groups, “True Blue Lovers” (Bob, Gia, Fabio) and “Espresso Lovers” (Bob, Emma, Chris, Don); private profiles are marked “??”.]

• Use groups as a feature set
– Prune away less useful groups
• Homogeneity, measured by entropy (h)
• Size: smaller groups might be better

        True Blue Lovers  Espresso Lovers  Color
Ana            0                 0         ?
Bob            1                 1         ?
Chris          0                 1         Orange
Don            0                 1         Green
Emma           0                 1         Orange
Fabio          1                 0         Blue
Gia            1                 0         ?
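A sketch of the pruning step, assuming homogeneity is the Shannon entropy of the public colors inside a group (lower means more homogeneous) and that pruning uses simple thresholds. The thresholds and function names are hypothetical; the slides give no concrete values.

```python
from collections import Counter
from math import log2

groups = {
    "True Blue Lovers": ["Bob", "Fabio", "Gia"],
    "Espresso Lovers": ["Bob", "Chris", "Don", "Emma"],
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def group_entropy(members):
    """Entropy (bits) of the public values in a group; 0 means perfectly homogeneous."""
    counts = Counter(public_colors[m] for m in members if m in public_colors)
    total = sum(counts.values())
    return sum(-(n / total) * log2(n / total) for n in counts.values())

def keep_groups(max_entropy, max_size):
    """Names of groups that are both homogeneous enough and small enough."""
    return [g for g, members in groups.items()
            if group_entropy(members) <= max_entropy and len(members) <= max_size]

def group_features(user, kept):
    """One 0/1 membership feature per retained group."""
    return [int(user in groups[g]) for g in kept]

kept = keep_groups(max_entropy=0.5, max_size=10)  # hypothetical thresholds
```

With these thresholds only "True Blue Lovers" survives: its single public member is Blue (entropy 0), whereas "Espresso Lovers" mixes two Orange and one Green (about 0.92 bits).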

LINK-GROUP

• Use friends and groups as features and then use a traditional classifier

       Ana  Bob  Chris  Don  Emma  Fabio  Gia  True Blue  Espresso  Color
Ana     1    1     0     1    0      0     0       0         0      ?
Bob     1    1     1     0    1      1     0       1         1      ?
Chris   0    1     1     0    0      0     0       0         1      Orange
Don     1    0     0     1    0      0     1       0         1      Green
Emma    0    1     0     0    1      1     1       0         1      Orange
Fabio   0    1     0     0    1      1     1       1         0      Blue
Gia     0    0     0     1    1      1     1       1         0      ?
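The combined table is just the friendship row followed by the group-membership flags. A brief sketch under the same toy-data assumptions; `link_group_row` is an illustrative name.

```python
adjacency = {
    "Ana":   [1, 1, 0, 1, 0, 0, 0],
    "Bob":   [1, 1, 1, 0, 1, 1, 0],
    "Chris": [0, 1, 1, 0, 0, 0, 0],
    "Don":   [1, 0, 0, 1, 0, 0, 1],
    "Emma":  [0, 1, 0, 0, 1, 1, 1],
    "Fabio": [0, 1, 0, 0, 1, 1, 1],
    "Gia":   [0, 0, 0, 1, 1, 1, 1],
}
# Column order: True Blue, then Espresso, matching the table.
groups = {
    "True Blue": ["Bob", "Fabio", "Gia"],
    "Espresso": ["Bob", "Chris", "Don", "Emma"],
}

def link_group_row(user):
    """Friend features followed by one membership flag per group."""
    group_part = [int(user in members) for members in groups.values()]
    return adjacency[user] + group_part

rows = {user: link_group_row(user) for user in adjacency}
```

Adding the group columns is what finally distinguishes Emma from Fabio, whose friendship rows are identical.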

Using Both Groups and Links

• LINK-GROUP
– Uses both the links and the groups as features in a classifier model

Facebook Data

• Link based attacks
– AGG, CC, and BLOCK performed similarly to the baseline
– LINK’s accuracy varied between 65.3% and 73.5%
• Group based attacks
– 73.4% success in determining gender
• Mixed model
– 72.5%, no improvement
– 57.8% on political views, or 1% better than BASIC

How good is this paper?

• How good are their attack methods?

• We can attack in more ways
– Using image recognition
– Using people's names and “googling” them
– The same can be done to their friends
– Searching for keywords in wall posts