7th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining Privacy Concerns vs. User Behavior in Community Question Answering Imrul Kayes USF Nicolas Kourtellis Telefonica Research Francesco Bonchi Yahoo Labs Adriana Iamnitchi USF


Post on 13-Jan-2016


7th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Privacy Concerns vs. User Behavior in Community Question Answering

Imrul Kayes, USF
Nicolas Kourtellis, Telefonica Research
Francesco Bonchi, Yahoo Labs
Adriana Iamnitchi, USF


Community Question Answering Sites
• CQA sites: Popular platforms
– Yahoo Answers: 200M users, 5M users/day
– Quora: 1M/month
– Stack Exchange: 4M users (e.g., Stack Overflow)
• Functionalities:
– Q/A posts & comments, social networking, leaderboards, and more.
• Why do we need them?
– Not all web-searches are successful!
– Complicated / intricate questions!


Motivation
• Privacy of user data: important but unresolved issue
• Privacy vs. platform usability:
– Public content is helpful but users would prefer privacy
• Privacy-aware users are more engaged (e.g., on Facebook)
– Is there a sweet spot?

[Figure: privacy vs. platform usability]


Main Idea
• Utilize users’ modifications of privacy settings as a proxy for privacy concerns
• Group users into privacy categories:
1. Public
2. Semi-private
• QA-private, Network-private
3. Private
• Study users’ engagement vs. privacy concerns
– Use activity logs & privacy settings instead of surveys


Outline
• CQA platforms
• Motivation
• Main idea
• Research Questions
• Yahoo Answers Dataset
• Results
• Proposals for CQA platform improvements


Motivating Research Questions

1. Are privacy-concerned users more engaged than public users?
– Retention, social circles
2. Do privacy-concerned users contribute differently to the community than public users?
– Points, best answers
3. Do privacy-concerned users have a different perception of answer quality than public users?
– Thumbs up/down of best answers


Motivating Research Questions

4. Are privacy-concerned users also more abuse-conscious?
– Abuse reporting
5. Are privacy-concerned users more likely to violate community rules?
– Deviance score


Yahoo Answers Dataset
• 1.5M users (2012-2013)
– 2.6M follower-followee ties (SN properties*)
– LCC: 1.1M nodes (74%), 2.4M edges (92%)
• Keep active users with more than 10 Q/A (68%)
• 4 privacy settings:
– All public (84.4%), PU: Public
– Hide content (Questions/Answers) (2.5%), QA: Semi-private
– Hide network (Followers/Followees) (0.9%), N: Semi-private
– Hide content & network (12.2%), PR: Private

* Kayes et al. “The social world of content abusers in Community Question Answering”, WWW’2015
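
The mapping from the four privacy settings to the three privacy categories can be sketched as follows. This is only an illustration: the function name, the boolean flags, and the toy user list are assumptions, not the authors' code or data.

```python
from collections import Counter


def privacy_setting(hides_content: bool, hides_network: bool) -> str:
    """Return the slide's setting label: PU, QA, N, or PR."""
    if hides_content and hides_network:
        return "PR"  # hide content & network
    if hides_content:
        return "QA"  # hide questions/answers only
    if hides_network:
        return "N"   # hide followers/followees only
    return "PU"      # all public


# Grouping of settings into categories, as listed on the slides.
CATEGORY = {"PU": "Public", "QA": "Semi-private", "N": "Semi-private", "PR": "Private"}

# Hypothetical toy users: (hides_content, hides_network)
toy_users = [(False, False), (True, False), (False, True), (True, True), (False, False)]
category_counts = Counter(CATEGORY[privacy_setting(c, n)] for c, n in toy_users)
```

On the toy list, `category_counts` holds two Public users, two Semi-private users, and one Private user.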


Privacy & Retention

• Semi-private users have higher inter-event time
• Private users have lower inter-event time
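
The slide does not define inter-event time. A minimal sketch, assuming it is the gap between a user's consecutive activities (the function name and the toy timestamps are hypothetical):

```python
from datetime import datetime
from statistics import mean


def mean_inter_event_days(timestamps):
    """Average gap, in days, between consecutive events of one user.

    Returns None when there are fewer than two events.
    """
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return None
    gaps = [(later - earlier).total_seconds() / 86400
            for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps)


# Hypothetical activity log for one user: gaps of 2 and 6 days.
events = [datetime(2012, 1, 1), datetime(2012, 1, 3), datetime(2012, 1, 9)]
avg_gap = mean_inter_event_days(events)
```

Under this assumed definition, a lower value means the user returns to the platform more often.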


Privacy & Social Circles

Indegree = #followers, Outdegree = #followees

• Privacy-concerned users have larger social circles
• Followers or followees
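
Indegree and outdegree as defined above can be counted from a follower-followee edge list. A small sketch, assuming each edge is a (follower, followee) pair; the edge direction and the user names are assumptions for illustration:

```python
from collections import Counter


def degrees(edges):
    """Count indegree (#followers) and outdegree (#followees) per user."""
    indegree, outdegree = Counter(), Counter()
    for follower, followee in edges:
        outdegree[follower] += 1  # the follower gains one followee
        indegree[followee] += 1   # the followee gains one follower
    return indegree, outdegree


# Hypothetical toy network.
toy_edges = [("ann", "bob"), ("cat", "bob"), ("bob", "ann")]
indeg, outdeg = degrees(toy_edges)
```

Here `bob` has two followers and one followee, while `cat` has no followers.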


Privacy & Accomplishments

• Private & semi-private users have more points
• Privacy-concerned users contribute more in YA quantitatively


Privacy & Accomplishments

Best answer percentage (BAP)

• Privacy-concerned users contribute more in YA qualitatively
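
The slide names BAP without giving a formula. A plausible reading, stated here as an assumption, is the share of a user's answers that were selected as best answers:

```python
def best_answer_percentage(best_answers: int, total_answers: int) -> float:
    """Percentage of a user's answers chosen as best answers.

    Assumed definition; returns 0.0 for users with no answers.
    """
    if total_answers == 0:
        return 0.0
    return 100.0 * best_answers / total_answers


# Hypothetical user with 12 answers, 3 of them selected as best.
bap = best_answer_percentage(3, 12)
```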


Privacy & Best Answers

• Privacy-concerned users select more Best Answers


Privacy & Best Answer Quality

• Community feedback via thumbs up and down:

• Privacy-concerned users have more average thumbs on Best Answers


Privacy & Abuse Reporting

• 90% of reports submitted by 8% of users

• Privacy-concerned users post more valid reports
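
A concentration figure such as "90% of reports submitted by 8% of users" can be checked with a short computation over per-user report counts. The sketch below uses hypothetical counts, not the Yahoo Answers data, and the function name is an assumption:

```python
import math


def share_from_top_users(report_counts, top_fraction):
    """Fraction of all reports submitted by the most active reporters.

    top_fraction is the share of users to take from the top (e.g. 0.08).
    """
    ordered = sorted(report_counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    k = max(1, math.ceil(top_fraction * len(ordered)))
    return sum(ordered[:k]) / total


# Hypothetical counts for ten users: one heavy reporter, many light ones.
toy_counts = [90, 2, 2, 2, 1, 1, 1, 1, 0, 0]
top_share = share_from_top_users(toy_counts, 0.1)
```

In this toy example the single most active user (the top 10%) accounts for 90 of the 100 reports.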


Privacy & Deviance

• Deviance can indicate user engagement*
• Private > semi-private > public users’ deviance scores

* Kayes et al. “The social world of content abusers in community question answering”, WWW’2015


Summary

1. 87.20% public profiles (default)
2. Privacy-concerned users are more engaged:
– higher retention
– more social network contacts
– contribute more & better content
– higher perception on answer quality
– better citizens in terms of reporting abuses
– higher platform engagement (via deviance)


Suggestions for better CQA platforms

• If privacy settings modified:
– Likely commitment to the platform
– Question recommendation and routing
– Community moderation
• If privacy settings unmodified:
– Incentivize for increased participation & retention
• Privacy settings in CQA sites
– Prediction & recommendation via ML techniques

Privacy Concerns vs. User Behavior in Community Question Answering

Imrul Kayes, USF
Nicolas Kourtellis, Telefonica Research
Francesco Bonchi, Yahoo Labs
Adriana Iamnitchi, USF

Paper: http://arxiv.org/abs/[email protected]

Twitter: @kourtellis