To Join or Not to Join: The Illusion of Privacy in Social Networks with
Mixed Public and Private User Profiles
Paper by Elena Zheleva and Lise Getoor
Introduction
• What do we want to find out about a “private” profile?
– Sensitive information
• What is sensitive information?
– What advertising agencies and companies want to know
– What you do not want others to find out
How can we find out private information?
• If a profile is really private, how can you find out anything?
– What if it was not Facebook? A completely anonymous profile?
– Utilize what public info you have.
• Using tactics that exploit friendship links
• Exploiting group affiliations
– Neither Facebook nor Flickr hides group members.
BASIC model
• Guess sensitive attribute based on distribution of known attributes.
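Figure: example network of seven users (Ana, Bob, Chris, Don, Emma, Fabio, Gia); ?? marks a profile whose sensitive attribute is hidden.
Sensitive info = favorite color (Orange, Blue, Green)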
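The BASIC guess can be sketched in a few lines. This is a minimal illustration, assuming the four public favorite colors that appear in the example tables of these slides (Chris and Emma: Orange, Don: Green, Fabio: Blue); the name `basic_guess` is hypothetical and not taken from the paper.

```python
from collections import Counter

# Public profiles and their visible favorite color (from the example tables).
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def basic_guess(known_values):
    """Return the most common known value and the estimated distribution."""
    counts = Counter(known_values)
    total = sum(counts.values())
    distribution = {value: n / total for value, n in counts.items()}
    guess = max(distribution, key=distribution.get)
    return guess, distribution

guess, distribution = basic_guess(public_colors.values())
print(guess, distribution)
```

Every private profile (Ana, Bob, Gia) receives the same guess, since BASIC uses no information about the individual user.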
Sensitive-Attribute Inference Models
• We assume the overall distribution of the sensitive attribute is either known or it can be found using public profiles.
• We will consider the BASIC distribution to be the baseline attack.
• A successful attack is one that, with extra knowledge, achieves significantly higher accuracy than the baseline.
Our Model
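Figure: the seven users (Ana, Bob, Chris, Don, Emma, Fabio, Gia) with friendship links and two groups; ?? marks a profile whose sensitive attribute is hidden.
– True Blue Lovers: Bob, Gia, Fabio
– Espresso Lovers: Bob, Emma, Chris, Don
Sensitive info = favorite color (Orange, Blue, Green)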
“Tell me who your friends are, and I’ll tell you who you are”
Link-based Attacks
• Friend-aggregate model (AGG)
• Collective Classification model (CC)
• Flat-link model (LINK)
Friend-aggregate model (AGG)
Figure: the example friendship graph (Ana, Bob, Chris, Don, Emma, Fabio, Gia); ?? marks a hidden sensitive attribute.
• Given my friends, what am I most likely?
• Score for each value = friends who publicly show that value / total links
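The AGG rule can be sketched as follows. This is a minimal illustration, assuming the friendship links read off the example adjacency matrix in the LINK slide and the four public colors shown there; `agg_scores` and `agg_guess` are hypothetical names, and the denominator follows the slide's wording (total links).

```python
from collections import Counter

# Friendship links from the example adjacency matrix (self-links omitted).
friends = {
    "Ana": ["Bob", "Don"],
    "Bob": ["Ana", "Chris", "Emma", "Fabio"],
    "Chris": ["Bob"],
    "Don": ["Ana", "Gia"],
    "Emma": ["Bob", "Fabio", "Gia"],
    "Fabio": ["Bob", "Emma", "Gia"],
    "Gia": ["Don", "Emma", "Fabio"],
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def agg_scores(user):
    """Share of the user's links that lead to each publicly visible value."""
    counts = Counter(public_colors[f] for f in friends[user] if f in public_colors)
    total_links = len(friends[user])
    return {color: n / total_links for color, n in counts.items()}

def agg_guess(user):
    """Most likely value given the friends; None when no friend is public."""
    scores = agg_scores(user)
    return max(scores, key=scores.get) if scores else None

for user in ("Ana", "Bob", "Gia"):
    print(user, agg_scores(user), agg_guess(user))
```

In this toy graph Gia's three friends each show a different color, so her scores tie and the guess depends only on iteration order; a real attack would need an explicit tie-breaking rule, for example falling back to BASIC.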
Collective Classification model (CC)
• AGG with re-evaluation: values inferred for private profiles are reused when re-estimating their friends
Figure: the example friendship graph (Ana, Bob, Chris, Don, Emma, Fabio, Gia); ?? marks a hidden sensitive attribute.
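One way to picture the re-evaluation step is a simple iterative loop. The sketch below is an assumption about how such a loop might look, not the procedure from the paper: it bootstraps each private profile from its public friends, then repeats the vote with the current guesses of private friends included until nothing changes. The names `vote` and `collective_classify`, the tie-breaking rule, and `max_rounds` are all hypothetical.

```python
from collections import Counter

# Friendship links from the example adjacency matrix (self-links omitted).
friends = {
    "Ana": ["Bob", "Don"],
    "Bob": ["Ana", "Chris", "Emma", "Fabio"],
    "Chris": ["Bob"],
    "Don": ["Ana", "Gia"],
    "Emma": ["Bob", "Fabio", "Gia"],
    "Fabio": ["Bob", "Emma", "Gia"],
    "Gia": ["Don", "Emma", "Fabio"],
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def vote(labels, current):
    """Majority label; on a tie, keep the current label if it is among the leaders."""
    counts = Counter(label for label in labels if label is not None)
    if not counts:
        return current
    best = max(counts.values())
    if counts.get(current) == best:
        return current
    return next(label for label, n in counts.items() if n == best)

def collective_classify(max_rounds=10):
    private = [u for u in friends if u not in public_colors]
    labels = dict(public_colors)
    # Bootstrap: AGG-style guess from public friends only.
    for user in private:
        labels[user] = vote((public_colors.get(f) for f in friends[user]), None)
    # Re-evaluation: private friends now contribute their current guesses.
    for _ in range(max_rounds):
        updated = {u: vote((labels.get(f) for f in friends[u]), labels[u]) for u in private}
        if updated == {u: labels[u] for u in private}:
            break
        labels.update(updated)
    return {u: labels[u] for u in private}

predictions = collective_classify()
print(predictions)
```

On a graph this small the loop settles after one round; the difference from AGG only shows up when private profiles have many private friends.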
Flat-link model (LINK)
Figure: the example friendship graph (Ana, Bob, Chris, Don, Emma, Fabio, Gia); ?? marks a hidden sensitive attribute.
• Flatten the data by considering the adjacency matrix of the graph.
• Classification: each row is a feature vector and the color column is the class label.

       Ana  Bob  Chris  Don  Emma  Fabio  Gia  Color
Ana    1    1    0      1    0     0      0    ?
Bob    1    1    1      0    1     1      0    ?
Chris  0    1    1      0    0     0      0    Orange
Don    1    0    0      1    0     0      1    Green
Emma   0    1    0      0    1     1      1    Orange
Fabio  0    1    0      0    1     1      1    Blue
Gia    0    0    0      1    1     1      1    ?
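The flattening step can be sketched with the matrix above. The slides do not say which classifier is applied, so this example swaps in a deliberately simple stand-in: a 1-nearest-neighbor rule over Hamming distance. The helper names `hamming` and `nearest_label` are hypothetical.

```python
names = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]

# One row per user, copied from the adjacency matrix in the slide.
adjacency = {
    "Ana":   [1, 1, 0, 1, 0, 0, 0],
    "Bob":   [1, 1, 1, 0, 1, 1, 0],
    "Chris": [0, 1, 1, 0, 0, 0, 0],
    "Don":   [1, 0, 0, 1, 0, 0, 1],
    "Emma":  [0, 1, 0, 0, 1, 1, 1],
    "Fabio": [0, 1, 0, 0, 1, 1, 1],
    "Gia":   [0, 0, 0, 1, 1, 1, 1],
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def hamming(a, b):
    """Number of positions where two rows differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_label(features, train_rows):
    """Label of the closest training row; min() keeps the first row on ties."""
    _, label = min(train_rows, key=lambda row: hamming(features, row[0]))
    return label

# Public profiles are the training set, private profiles are the targets.
train = [(adjacency[n], public_colors[n]) for n in names if n in public_colors]
predictions = {n: nearest_label(adjacency[n], train) for n in names if n not in public_colors}
print(predictions)
```

Emma and Fabio have identical rows but different colors, so link features alone cannot separate them, and Gia's prediction is decided by a tie between the two.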
Group Based Attacks
• Groupmate-link model (CLIQUE)
– Considers all people in a group as friends
• Group-based classification model (GROUP)
– Considers each group as a feature in a classifier
Groupmate-link model (CLIQUE)
Figure: group memberships; ?? marks a hidden sensitive attribute.
– True Blue Lovers: Bob, Gia, Fabio
– Espresso Lovers: Bob, Emma, Chris, Don
• Consider everyone in the same group a friend
• Then flatten to an adjacency matrix
• Apply the LINK method to the result

       Ana  Bob  Chris  Don  Emma  Fabio  Gia  Color
Ana    0    0    0      0    0     0      0    ?
Bob    0    1    1      1    1     1      1    ?
Chris  0    1    1      1    1     0      0    Orange
Don    0    1    1      1    1     0      0    Green
Emma   0    1    1      1    1     0      0    Orange
Fabio  0    1    0      0    0     1      1    Blue
Gia    0    1    0      0    0     1      1    ?
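The groupmate matrix above can be derived mechanically from the group memberships. A minimal sketch, assuming the two groups from the example figure; `groupmate_matrix` is a hypothetical helper name.

```python
names = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]

# Group memberships from the example figure.
groups = {
    "True Blue Lovers": {"Bob", "Fabio", "Gia"},
    "Espresso Lovers": {"Bob", "Chris", "Don", "Emma"},
}

def groupmate_matrix(names, groups):
    """Row per user: 1 where the two users share at least one group."""
    return {
        a: [int(any(a in members and b in members for members in groups.values()))
            for b in names]
        for a in names
    }

clique = groupmate_matrix(names, groups)
for name in names:
    print(f"{name:<6}", *clique[name])
```

Ana belongs to no group, so her row is all zeros and CLIQUE has nothing to work with for her profile.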
Group-based classification model (GROUP)
Figure: group memberships; ?? marks a hidden sensitive attribute.
– True Blue Lovers: Bob, Gia, Fabio
– Espresso Lovers: Bob, Emma, Chris, Don
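• Use groups as a feature set
– Prune away less useful groups
• Homogeneity: measured by the entropy (h) of the sensitive attribute within a group; lower entropy means a more homogeneous group
• Size: smaller groups might be better.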
       True Blue Lovers  Espresso Lovers  Color
Ana    0                 0                ?
Bob    1                 1                ?
Chris  0                 1                Orange
Don    0                 1                Green
Emma   0                 1                Orange
Fabio  1                 0                Blue
Gia    1                 0                ?
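The pruning and feature-building steps can be sketched as follows. This is an illustration under stated assumptions: entropy is computed in bits over the public members of each group, and the cutoff `max_entropy` is a hypothetical parameter, since the slides do not give a threshold. The function names are also hypothetical.

```python
import math
from collections import Counter

groups = {
    "True Blue Lovers": {"Bob", "Fabio", "Gia"},
    "Espresso Lovers": {"Bob", "Chris", "Don", "Emma"},
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def group_entropy(members):
    """Entropy (bits) of the public colors in a group; None if no member is public."""
    counts = Counter(public_colors[m] for m in members if m in public_colors)
    total = sum(counts.values())
    if total == 0:
        return None
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def select_groups(max_entropy):
    """Keep groups whose entropy is at or below the (hypothetical) threshold."""
    kept = []
    for name, members in groups.items():
        h = group_entropy(members)
        if h is not None and h <= max_entropy:
            kept.append(name)
    return kept

def group_features(user, kept):
    """One 0/1 feature per kept group, as in the table above."""
    return [int(user in groups[g]) for g in kept]

kept = select_groups(max_entropy=1.0)
for user in ("Ana", "Bob", "Gia"):
    print(user, group_features(user, kept))
```

With a stricter cutoff such as 0.5 only True Blue Lovers would survive, because Espresso Lovers mixes Orange and Green among its public members. The resulting rows would then be passed to a classifier in the same way as in LINK.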
LINK-GROUP
• Use friends and groups as features and then use a traditional classifier

       Ana  Bob  Chris  Don  Emma  Fabio  Gia  True Blue  Espresso  Color
Ana    1    1    0      1    0     0      0    0          0         ?
Bob    1    1    1      0    1     1      0    1          1         ?
Chris  0    1    1      0    0     0      0    0          1         Orange
Don    1    0    0      1    0     0      1    0          1         Green
Emma   0    1    0      0    1     1      1    0          1         Orange
Fabio  0    1    0      0    1     1      1    1          0         Blue
Gia    0    0    0      1    1     1      1    1          0         ?
Using Both: Groups and Links
• LINK-GROUP
– Uses the links and groups as features in a classifier model
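Combining the two feature sets amounts to concatenating each user's link row with the group indicators. The sketch below assumes the example data from the earlier slides and, because the slides do not name a classifier, again uses a simple 1-nearest-neighbor rule over Hamming distance as a stand-in; `link_group_row`, `hamming`, and `nearest_label` are hypothetical names.

```python
names = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]
adjacency = {
    "Ana":   [1, 1, 0, 1, 0, 0, 0],
    "Bob":   [1, 1, 1, 0, 1, 1, 0],
    "Chris": [0, 1, 1, 0, 0, 0, 0],
    "Don":   [1, 0, 0, 1, 0, 0, 1],
    "Emma":  [0, 1, 0, 0, 1, 1, 1],
    "Fabio": [0, 1, 0, 0, 1, 1, 1],
    "Gia":   [0, 0, 0, 1, 1, 1, 1],
}
groups = {
    "True Blue Lovers": {"Bob", "Fabio", "Gia"},
    "Espresso Lovers": {"Bob", "Chris", "Don", "Emma"},
}
group_order = ["True Blue Lovers", "Espresso Lovers"]
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def link_group_row(user):
    """Link features followed by one 0/1 feature per group."""
    return adjacency[user] + [int(user in groups[g]) for g in group_order]

def hamming(a, b):
    """Number of positions where two rows differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_label(features, train_rows):
    """Label of the closest training row; min() keeps the first row on ties."""
    _, label = min(train_rows, key=lambda row: hamming(features, row[0]))
    return label

train = [(link_group_row(n), public_colors[n]) for n in names if n in public_colors]
predictions = {n: nearest_label(link_group_row(n), train)
               for n in names if n not in public_colors}
print(predictions)
```

In this toy example the group columns separate Emma from Fabio, whose link rows are identical, so Gia is now matched to Fabio rather than being decided by a tie.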
Facebook Data
• Link-based attacks
– AGG, CC, BLOCK similar to baseline
– LINK’s accuracy varied between 65.3% and 73.5%
• Group-based attacks
– 73.4% success in determining gender
• Mixed model
– 72.5%, no improvement; 57.8%, or 1% better than BASIC, on political views