information revelation and privacy in online social networks

Information Revelation and Privacy in Online Social Networks Ralph Gross and Alessandro Acquisti [email protected] [email protected] Heinz Seminars, October 3 rd , 2005

Page 1: Information Revelation and Privacy  in Online Social Networks

Information Revelation and Privacy in Online Social Networks

Ralph Gross and Alessandro Acquisti [email protected] [email protected]

Heinz Seminars, October 3rd, 2005

Page 2: Information Revelation and Privacy  in Online Social Networks

Information revelation and privacy in online social networks

• Online social networks (OSN): sites that facilitate interaction between members through their self-published personal profiles

• How much do users of OSN reveal about themselves online?
– A lot

• To whom?
– Friends and strangers

• Why?

Page 3: Information Revelation and Privacy  in Online Social Networks

Why?

• Rationality hypothesis: signaling
• Low privacy sensitivity
• Herding behavior
• Peer pressure
• Myopic discounting
• Incomplete information
• …

Page 4: Information Revelation and Privacy  in Online Social Networks

Privacy, economics, and rationality

1. Incomplete information
2. Bounded rationality
3. Affective processes, psychological/behavioral deviations from pure rationality model

Page 5: Information Revelation and Privacy  in Online Social Networks

Our study

• Starts research on privacy implications of OSN
• Provides first quantification of observed behavior
• Studies actual usage data
• Discusses trade-offs and incentives and advances behavioral hypotheses
– Yet, still preliminary

Implications extend beyond OSN domain

Page 6: Information Revelation and Privacy  in Online Social Networks

Agenda

1. Online social networks
– The Facebook

2. CMU students and the Facebook
i. Usage data
– Patterns of information revelation
– Inferred privacy preferences
Risks and trade-offs
ii. User survey (pilot)
– Users’ knowledge and expectations
Drivers and incentives

3. Next step
– Experiments

Page 7: Information Revelation and Privacy  in Online Social Networks

Online Social Networks

Page 8: Information Revelation and Privacy  in Online Social Networks

What are online social networks?

• Sites that facilitate interaction between members through their self-published personal profiles

• Common core:
– Through the site, individuals offer representations of their sel[ves] to others to peruse, with the intention of contacting or being contacted by others, to meet new friends or dates, find new jobs, receive or provide recommendations, …

• Progressive diversification and sophistication of purposes and usage patterns
– Social Software Weblog groups hundreds of social networking sites in nine categories (business, common interests, dating, face-to-face facilitation, friends, pets, photos, …)

Classifieds <> OSN <> blogs

Page 9: Information Revelation and Privacy  in Online Social Networks

A history of online social networks

• 1960s: Plato (University of Illinois)
• 1997: SixDegrees.com
• After 2002: commercial explosion
– Friendster, Orkut, LinkedIn, …
– Viral growth with participation expanding at rates topping 20% a month
– 7 million Friendster users; 2 million MySpace users; 16 million registered on Tickle to take personality test (Leonard 2004)
– Revenues: advertising, data trading, subscriptions
– Media attention: Salon, NYT, Wired, …

Page 10: Information Revelation and Privacy  in Online Social Networks

Research on online social networks

• boyd (2003): trust and intimacy on OSN
• Donath and boyd (2004): representation of self on OSN
• Liu and Maes (2005): harvesting OSN for recommender systems

• (some additional research uses OSN data for other purposes)

Page 11: Information Revelation and Privacy  in Online Social Networks

From (social) network theory to online networks

• Milgram (1967): the small world problem
– Watts (2003): six degrees
• Granovetter (1973, 1983): weak and strong ties
• Milgram (1977): the familiar stranger

What about the “unknown buddy”?

Page 12: Information Revelation and Privacy  in Online Social Networks

Social network theory and privacy

• Strahilevitz (2005):

Discourse about privacy should be based “on what the parties should have expected to follow the initial disclosure of information by someone other than the defendant”

– Consideration of expected information flows within/outside somebody’s social network should inform that person’s expectations for privacy

• However, application to online social networks reveals challenges

Page 13: Information Revelation and Privacy  in Online Social Networks

Online vs offline social networks

1. Offline: extremely diverse ties. Online: simplistic binary relations (boyd 2004)

2. Number of strong ties not significantly increased, but number of weak ties can increase substantially (Donath and boyd 2004)

– From a dozen intimate ties plus 1000 to 1700 “acquaintances,” to hundreds of direct “friends” and hundreds of thousands of relations

Page 14: Information Revelation and Privacy  in Online Social Networks

Hence:

• Online social networks are vaster and have more weak ties than offline social networks
An imagined community?
• Anderson (1991)

• Intimacy and trust
– Sharing same personal information with a large and potentially unknown number of friends and strangers
• Intimate with everybody? (Gerstein 1984)

Ability to meaningfully interact with others is mildly augmented, while ability of others to access the person is significantly enlarged

Page 15: Information Revelation and Privacy  in Online Social Networks

Online social networks and personal information

1. Pretense of identifiability changes across different types of sites

Anonymous <> Pseudonymous <> Fully identified

2. Type of information revealed or elicited often orbits around hobbies and interests, but can stride from there in different directions

– From classified to journals

3. Visibility of information is highly variable
– Members only
– Everybody

Page 16: Information Revelation and Privacy  in Online Social Networks

Online social networks and privacy

• Privacy implications of OSN depend on the level of identifiability of the information provided, its possible recipients, and its possible uses
– Re-identification
• Two directions: known > additional information; unknown > known
– To whom may identifiable information be made available?
• Site, third parties (hackers, government), users (little control on social network and its expansion)
– Risks
• From identity theft to online and physical stalking; from embarrassment and blackmailing to spam and price discrimination

Page 17: Information Revelation and Privacy  in Online Social Networks

Online social networks and privacy

• And yet:
– OSN can also offer tools to address online privacy problems
– “Social networking has the potential to create an intelligent order in the current chaos by letting you manage how public you make yourself and why and who can contact you.” Tribe.net CEO Mark Pincus

Is that true?

Page 18: Information Revelation and Privacy  in Online Social Networks

The Facebook

Page 19: Information Revelation and Privacy  in Online Social Networks

The Facebook

• www.facebook.com
• Started February 2004
– Attracted Silicon Valley funding
• Has spread to 2000 schools and 4.2 million users
• Typically attracts 80 percent of a school’s undergraduate population
– Also gets graduate students, faculty members, staff, and alumni
• Now targeting high schools
• Growing media attention

Page 20: Information Revelation and Privacy  in Online Social Networks
Page 21: Information Revelation and Privacy  in Online Social Networks

Facebook’s privacy policy

• …is lax, but straightforwardly so:

“Facebook also collects information about you from other sources, such as newspapers and instant messaging services. This information is gathered regardless of your use of the Web Site.”

… “We use the information about you that we have collected from other sources to supplement your profile unless you specify in your privacy settings that you do not want this to be done.”

… “In connection with these offerings and business operations, our service providers may have access to your personal information for use in connection with these business activities.”

Page 22: Information Revelation and Privacy  in Online Social Networks

Facebook and unique privacy issues

• Unique data
– Includes home location, current location (from IP address), etc.

• Uniquely identified
– College email account
– Contact information

• Ostensibly bounded community
– “Shared real space”

…or imagined community?

Page 23: Information Revelation and Privacy  in Online Social Networks

CMU students and the Facebook: usage data

Page 24: Information Revelation and Privacy  in Online Social Networks

Studies

• Gross and Acquisti, Proceedings of WPES 2005
• Acquisti and Gross, Proceedings of PET 2006

Page 25: Information Revelation and Privacy  in Online Social Networks

Data gathering

• In June 2005, we created Facebook profiles with different characteristics
– E.g., degree of connectedness, geographical location, …

• We searched for CMU Facebook members’ profiles using the advanced search feature and extracted profile IDs
– Downloaded profiles
– Inferred additional information not immediately visible from profiles

Page 26: Information Revelation and Privacy  in Online Social Networks

Demographics

Page 27: Information Revelation and Privacy  in Online Social Networks

Demographics

Page 28: Information Revelation and Privacy  in Online Social Networks

Demographics

Page 29: Information Revelation and Privacy  in Online Social Networks

Information revelation

Page 30: Information Revelation and Privacy  in Online Social Networks

Information revelation

• Male users 63% more likely to leave phone number than female users
• Single male users tend to report their phone numbers at even higher frequencies
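A “63% more likely” figure is a relative difference between two disclosure rates. The sketch below shows only that arithmetic. The function name and the two rates are hypothetical placeholders, not the study's measurements.

```python
def relative_increase(rate_a: float, rate_b: float) -> float:
    """Return how much more likely group A is than group B, as a fraction.

    A result of 0.60 means "60% more likely".
    """
    if rate_b <= 0:
        raise ValueError("rate_b must be positive")
    return rate_a / rate_b - 1.0


# Hypothetical disclosure rates, chosen only to illustrate the calculation.
male_rate = 0.40
female_rate = 0.25
print(f"{relative_increase(male_rate, female_rate):.0%} more likely")
```

A relative figure says nothing about the absolute rates themselves, so it is best read alongside the per-group percentages.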

Page 31: Information Revelation and Privacy  in Online Social Networks

Data verifiability

Page 32: Information Revelation and Privacy  in Online Social Networks

Data verifiability

Page 33: Information Revelation and Privacy  in Online Social Networks

Privacy risks

• Stalking
• Re-identification
• Digital dossier

Page 34: Information Revelation and Privacy  in Online Social Networks

Privacy risks: Stalking

• Real-world stalking
– College life centers around class attendance
– Facebook users put home address and class list on their profiles; whereabouts are known for large portions of the day

• Online stalking
– Facebook profiles list AIM screennames
– AIM lets users add “buddies” without notification
– Unless AIM privacy settings have been changed, adversary can track when user is online

Page 35: Information Revelation and Privacy  in Online Social Networks

Privacy risks: Re-identification

• Demographics re-identification
– 87% of US population is uniquely identified by {gender, ZIP, date of birth} (Sweeney, 2001)
– Facebook users that put this information up on their profile could be linked to outside, de-identified data sources

• Face re-identification
– Facebook profiles often show high quality facial images
– Images can be linked to de-identified profiles on e.g. Match.com or Friendster.com using face recognition

• Social Security Number re-identification
– Anatomy of a social security number: xxx yy zzzz
– Based on hometown and date of birth xxx and yy can be narrowed down substantially
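The demographic re-identification risk comes down to how many records are unique on the combination {gender, ZIP, date of birth}. The sketch below is a rough illustration that measures this share on a toy dataset. The field names and records are invented for the example and are not taken from the study.

```python
from collections import Counter


def unique_fraction(records, keys=("gender", "zip", "dob")):
    """Fraction of records whose quasi-identifier combination occurs exactly once."""
    if not records:
        return 0.0
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Made-up toy records (assumption: these field names are illustrative only).
profiles = [
    {"gender": "F", "zip": "15213", "dob": "1985-03-02"},
    {"gender": "F", "zip": "15213", "dob": "1985-03-02"},
    {"gender": "M", "zip": "15213", "dob": "1984-11-17"},
    {"gender": "M", "zip": "15217", "dob": "1986-07-30"},
]
print(unique_fraction(profiles))  # 0.5: two of the four records are unique
```

A record that is unique on these three fields can in principle be matched to any outside dataset that carries the same fields, which is why the share of unique records serves as a simple risk indicator.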

Page 36: Information Revelation and Privacy  in Online Social Networks

Privacy risks: Digital Dossier

• Users reveal sensitive information (e.g. current partners, political views) in profiles

• Simple script programs allow adversaries to continuously retrieve and save all profile information

• Cheap hard drives enable essentially indefinite storage

Page 37: Information Revelation and Privacy  in Online Social Networks

Privacy risks

Page 38: Information Revelation and Privacy  in Online Social Networks

Data accessibility

Page 39: Information Revelation and Privacy  in Online Social Networks

Data accessibility

Page 40: Information Revelation and Privacy  in Online Social Networks

Data accessibility

• Profile searchability
– We measured the percentage of users that changed search default setting away from being searchable to everyone on the Facebook to only being searchable to CMU users
– 1.2% of users (18 female, 45 male) made use of this privacy setting

• Profile visibility
– We evaluated the number of CMU users that changed profile visibility by restricting access from unconnected users
– Only 3 profiles (0.06%) in total fall into this category

• Caveat: We would not detect users who had made themselves both unsearchable and invisible within CMU network (safe to assume their number is very low)

Page 41: Information Revelation and Privacy  in Online Social Networks

Data accessibility

Page 42: Information Revelation and Privacy  in Online Social Networks

Actual data accessibility: An imagined community?

• Extensive, uncontrolled social networks
• Fragile protection:
– Fake email addresses
– Manipulating users
– Geographical location
– Advanced search features

• Using advanced search features various profile information can be searched for, e.g. relationship status, phone number, sexual preferences, political views and (college) residence

• By keeping track of the profile IDs returned in the different searches a significant portion of the previously inaccessible information can be reconstructed

– AIM Facebook profiles are, effectively, public data
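The reconstruction idea above can be pictured as inverting a set of search results: each search for one attribute value returns a set of profile IDs, and merging those sets by ID rebuilds per-profile attributes. The sketch below works on in-memory toy data only. The function name, field names, and IDs are hypothetical, and no real search interface is modeled.

```python
from collections import defaultdict


def reconstruct_profiles(search_results):
    """Merge per-query result sets into one attribute map per profile ID.

    search_results maps (field, value) -> set of profile IDs that a
    search for that value returned.
    """
    profiles = defaultdict(dict)
    for (field, value), ids in search_results.items():
        for pid in ids:
            profiles[pid][field] = value
    return dict(profiles)


# Toy, made-up search results keyed by the searched attribute value.
results = {
    ("relationship_status", "single"): {101, 103},
    ("relationship_status", "in a relationship"): {102},
    ("political_views", "liberal"): {101, 102},
    ("political_views", "conservative"): {103},
}
rebuilt = reconstruct_profiles(results)
print(rebuilt[101])
```

The point of the illustration is that hiding a profile page offers little protection if the same fields remain searchable, since the search results leak them one value at a time.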

Page 43: Information Revelation and Privacy  in Online Social Networks

Actual data accessibility: An imagined community

• “What a great illustration of how things you might not mind being public in one context can cause all sorts of problems when they wind up globally public.”
– CMU student

Page 44: Information Revelation and Privacy  in Online Social Networks

Initial hypotheses

• Default settings (Mackay 1991) / Myopic discounting?
– Less than 2% make their profiles less searchable
– Less than 1% make their profiles less visible
• Peer pressure
• Incomplete information and biased perspectives
– An imagined community

• Or simply:
– Low privacy concerns
– Signaling
• Single males list phone number significantly more frequently than females

Page 45: Information Revelation and Privacy  in Online Social Networks

User survey (pilot)

Page 46: Information Revelation and Privacy  in Online Social Networks

(Pilot) Survey

• Goals
– Understand CMU Facebook users’ degree of awareness about the site and its information revelation patterns; understand their privacy attitudes and expectations

• Thirty-six online questions
• Anonymous, paid
• Pilot
– 50 subjects
– Focused on Facebook users

• Survey link

Page 47: Information Revelation and Privacy  in Online Social Networks

CAVEAT: The following results are based on our pilot test (50 subjects). Hence they must only be considered suggestive trends rather than robust evidence. We are now exploring the same questions in the full survey – please contact us for the most recent results: [email protected].

Page 48: Information Revelation and Privacy  in Online Social Networks

Generic concerns (7-point Likert scale)

[Figure: density histograms of responses (x-axis: rating, 0 to 8; y-axis: density). Panels: State of the economy; Threats to personal privacy; Threats of terrorism; Global warming.]

Page 49: Information Revelation and Privacy  in Online Social Networks

Specific concerns (7-point Likert scale)

[Figure: density histograms of responses (x-axis: rating, 0 to 8; y-axis: density). Panels: Same-sex marriage; Friend of friend knew contact information; Permeable borders; Stranger knows address; US vetoes global warming regulations; Partners info.]

Page 50: Information Revelation and Privacy  in Online Social Networks

Attitudes vs. behavior

• Share of users with high sensitivity (Likert >5) to partner/sexual orientation information who provide it on Facebook: ~70%

• Share of users with high sensitivity (Likert >5) to home location and class schedule information who provide it on Facebook: ~32%

• Share of users with high sensitivity (Likert >5) to contact information who provide it on Facebook: ~42%
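Each share above is a conditional proportion: among respondents who rate a type of information as highly sensitive (Likert > 5), the fraction who nevertheless publish it. The sketch below shows that calculation under assumed field names. The respondent records are invented and do not reproduce the pilot data.

```python
def share_disclosing(responses, threshold=5):
    """Share of respondents with sensitivity above `threshold` who still disclose.

    Returns None when no respondent exceeds the threshold.
    """
    sensitive = [r for r in responses if r["sensitivity"] > threshold]
    if not sensitive:
        return None
    return sum(1 for r in sensitive if r["discloses"]) / len(sensitive)


# Invented respondents: Likert sensitivity (1-7) and whether the item is on the profile.
survey = [
    {"sensitivity": 7, "discloses": True},
    {"sensitivity": 6, "discloses": True},
    {"sensitivity": 6, "discloses": False},
    {"sensitivity": 3, "discloses": True},
    {"sensitivity": 2, "discloses": False},
]
print(f"{share_disclosing(survey):.0%}")
```

Conditioning on the high-sensitivity group is what separates this measure from the overall disclosure rate, and it is what makes the gap between stated attitudes and behavior visible.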

Page 51: Information Revelation and Privacy  in Online Social Networks

Awareness: visibility and searchability

• 21% incorrectly believe only CMU users can search their profiles

• 71% do not realize that everybody at UPitt can search their profiles

• 40% do not realize that anybody on Facebook can search their profiles

• 31% do not realize that everybody at CMU can read their profiles

• On the other side, 23% incorrectly believe that everybody on Facebook can read their profiles

Page 52: Information Revelation and Privacy  in Online Social Networks

Facebook’s privacy policy, revisited

“Facebook also collects information about you from other sources, such as newspapers and instant messaging services. This information is gathered regardless of your use of the Web Site.”

• 85% believe that is not the case

“We use the information about you that we have collected from other sources to supplement your profile unless you specify in your privacy settings that you do not want this to be done.”

• 87% believe that is not the case

“In connection with these offerings and business operations, our service providers may have access to your personal information for use in connection with these business activities.”

• 60% believe that is not the case

• Control: perusal of privacy policy does not improve awareness

Page 53: Information Revelation and Privacy  in Online Social Networks

Privacy concerns

• 69% believe that the information other Facebook users reveal may create privacy risks for those users

• But:

[Figure: density histogram of responses (x-axis: rating, 0 to 8; y-axis: density) to “Are you concerned about your personal privacy on the Facebook?”]

Page 54: Information Revelation and Privacy  in Online Social Networks

Information revelation

• Reasons to provide more personal information (in order of importance):

1. No factor in particular, it's just fun
2. No factor in particular, but the amount of information I reveal is necessary to me and other users to benefit from the FaceBook
3. No factor in particular, rather I am following the norms and habits common on the site
4. Quite simply, expressing myself and defining my online persona
5. Showing more information about me to "advertise" myself
…
– Getting more potential dates

Page 55: Information Revelation and Privacy  in Online Social Networks

Other privacy concerns

• Reasons for low privacy concerns (in order of importance):

1. Control on information
2. Control on access
3. CMU environment
4. Student environment
…

Page 56: Information Revelation and Privacy  in Online Social Networks

Other privacy concerns

• Does your Facebook profile contain information that you might not mind being "public" within your Facebook or CMU network, but that would indeed bother you if other people could access it (e.g., family, interviewers, etc.)?

– 50% answer yes

Page 57: Information Revelation and Privacy  in Online Social Networks

Is it possible/likely?

[Figure: two sets of density histograms (x-axis: rating, 0 to 8; y-axis: density), titled “Possible” and “Likely,” each split into two panels by q31 (0 and 1).]

Page 58: Information Revelation and Privacy  in Online Social Networks

Next steps

Page 59: Information Revelation and Privacy  in Online Social Networks

Next steps

• Full survey
– Users and non-users: different privacy sensitivities?

• Experiments
– Control for initial privacy settings
– Control for perception of other users’ information patterns
– Control for perception of other users’ information revelation

• Other scripts
– Study evolution of a new network
– Study dynamics of information revelation

Page 60: Information Revelation and Privacy  in Online Social Networks

Conclusions

• OSN offer exciting ground for privacy research
– Plenty of information revelation
– Alternative explanations
– Actual usage data

• The unknown buddy?
• An imagined community?

Page 61: Information Revelation and Privacy  in Online Social Networks

Conclusions

• Facebook users claim, in general, to be concerned about their privacy but
– Publish plenty of personal information
– Do not use privacy enhancing features

• However, they are both
– …uninformed about specific information revelation patterns
– …aware of generic possibilities

• Suggestive evidence pointing towards:
– Signaling, but also
– Myopic discounting
– Incomplete information