tweetcred: real-time credibility assessment of content on twitter @ socinfo conference 2014
Post on 18-Jun-2015
Real-Time Credibility Assessment of Content on Twitter
Aditi Gupta*, Ponnurangam Kumaraguru*, Carlos Castillo~ and Patrick Meier~
*Indraprastha Institute of Information Technology
~Qatar Computing Research Institute
November 11, SocInfo'14
cerc.iiitd.ac.in
TweetCred
Non-trustworthy Content
FAKE
RUMORS
Research Contributions
• Semi-supervised ranking model for scoring tweets according to their credibility in real-time
• TweetCred
  - Live deployment: 1,400+ Twitter users
  - Credibility score computed for 7+ million tweets
  - Evaluated TweetCred in terms of response time, effectiveness and usability
Methodology
Training Data
• 500 tweets per event
• Used CrowdFlower
Event                                   Tweets        Users
Boston Marathon Blasts (2013)           7,888,374     3,677,531
Typhoon Haiyan / Yolanda (2013)         671,918       368,269
Cyclone Phailin (2013)                  76,136        34,776
Washington Navy Yard shootings (2013)   484,609       257,682
Polar vortex cold wave (2014)           143,959       116,141
Oklahoma Tornadoes (2013)               809,154       542,049
Total                                   10,074,150    4,996,448
Annotation
• Step 1
  - R1. Contains information about the event
  - R2. Is related to the event, but contains no information
  - R3. Not related to the event
  - R4. Skip tweet
  *45% (class R1), 40% (class R2), and 15% (class R3)
• Step 2
  - C1. Definitely credible
  - C2. Seems credible
  - C3. Definitely incredible
  - C4. Skip tweet
  *52% (class C1), 35% (class C2), and 13% (class C3)
Credibility Modeling
Features (45), by feature set:
Tweet meta-data: Number of seconds since the tweet; Source of tweet (mobile / web / etc.); Tweet contains geo-coordinates
Tweet content (simple): Number of characters; Number of words; Number of URLs; Number of hashtags; Number of unique characters; Presence of stock symbol; Presence of happy smiley; Presence of sad smiley; Tweet contains `via'; Presence of colon symbol
Tweet content (linguistic): Presence of swear words; Presence of negative emotion words; Presence of positive emotion words; Presence of pronouns; Mention of self words in tweet (I; my; mine)
Tweet author: Number of followers; friends; time since the user is on Twitter; etc.
Tweet network: Number of retweets; Number of mentions; Tweet is a reply; Tweet is a retweet
Tweet links: WOT score for the URL; Ratio of likes / dislikes for a YouTube video
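A minimal sketch of how some of the simple tweet-content features listed above could be computed from raw tweet text. The function name, the dictionary keys, and the regular expressions are illustrative assumptions; the slides do not show how TweetCred extracts these features.

```python
import re

# Hypothetical patterns for illustration only; not taken from TweetCred.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
STOCK_RE = re.compile(r"\$[A-Za-z]{1,5}\b")   # e.g. $AAPL
SELF_WORDS = {"i", "my", "mine"}


def simple_content_features(text):
    """Return a dict of simple content features for one tweet's text."""
    words = text.split()
    # Lowercased tokens with surrounding punctuation stripped,
    # used for whole-word checks such as 'via' and self words.
    tokens = {w.strip(".,!?;:").lower() for w in words}
    return {
        "num_chars": len(text),
        "num_words": len(words),
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_unique_chars": len(set(text)),
        "has_stock_symbol": bool(STOCK_RE.search(text)),
        "has_happy_smiley": ":)" in text or ":-)" in text,
        "has_sad_smiley": ":(" in text or ":-(" in text,
        "contains_via": "via" in tokens,
        "has_colon": ":" in text,
        "mentions_self": bool(tokens & SELF_WORDS),
    }


if __name__ == "__main__":
    sample = "Breaking: blast near finish line via @news http://t.co/abc #boston :("
    print(simple_content_features(sample))
```

Author, network, and link features (followers, retweets, WOT score) depend on external metadata and are left out of this sketch.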
Ranking Results
                  AdaRank       Coord. Ascent   RankBoost     SVM-rank
NDCG@25           0.6773        0.5358          0.6736        0.3951
NDCG@50           0.6861        0.5194          0.6825        0.4919
NDCG@75           0.6949        0.7521          0.689         0.6188
NDCG@100          0.6669        0.7607          0.6826        0.7219
Time (training)   35-40 secs    1 min           35-40 secs    9-10 secs
Time (testing)    <1 sec        <1 sec          <1 sec        <1 sec
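For readers unfamiliar with the NDCG@k metric reported in the ranking results, the sketch below shows one common formulation (gain 2^rel - 1 with a log2 position discount). The slides do not state which NDCG variant was used, so the formula choice and the function names are assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items of a ranked list.

    `relevances` holds graded relevance labels in ranked order
    (position 0 is the top-ranked item).
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical credibility grades of tweets in the order a ranker returned them.
    ranked_grades = [2, 3, 1, 0, 2]
    print(ndcg_at_k(ranked_grades, 5))
```

A value of 1.0 means the ranker ordered the top k tweets exactly as the ideal ordering would; lower values mean highly credible tweets were placed further down the list.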
TweetCred Demo
Implementation
Feedback by Users
Usage Statistics
Date of launch of TweetCred                 27 Apr, 2014
Credibility score seen by users (total)     5,438,115
Credibility score seen by users (unique)    4,540,618
Unique Twitter users                        1,127
Feedback was given for tweets               1,273
Unique users who gave feedback              263
*at the time of paper submission
Users of TweetCred
• Live deployment
• Sample users:
  - Emergency departments
  - Firefighters
  - Journalists / news media
  - General users
  - Researchers (requested API tokens)
Evaluation
• Response Time
• Usability Evaluation
• User Feedback
Response Time
Usability Survey
• Online survey
  - Participants: 67
• Questions:
  - System Usability Scale (SUS) questions
  - Demographic questions
• SUS Score: 70
  - 74% of participants -> TweetCred easy to use
  - 81% of participants -> Would like to use TweetCred
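As background on the SUS score reported in the usability survey, the sketch below applies the standard SUS scoring rule to one participant's ten answers (each on a 1 to 5 scale): odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the sample answers are hypothetical; the slides do not include per-participant data.

```python
def sus_score(responses):
    """Compute the System Usability Scale score for one participant.

    `responses` is a list of ten integers in 1..5, in questionnaire order
    (item 1 first).
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for index, answer in enumerate(responses):
        if not 1 <= answer <= 5:
            raise ValueError("each response must be between 1 and 5")
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - answer
    return total * 2.5


if __name__ == "__main__":
    # Hypothetical answers from a single participant.
    print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

A survey-level score such as the one on the slide is typically the mean of the per-participant scores.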
User Feedback
                                     Observed (%)   95% Conf. interval
Agreed with score                    40.14          (36.73, 43.77)
Disagreed with score                 59.85          (55.68, 64.26)
Disagreed: score should be higher    48.62          (44.86, 52.61)
Disagreed: score should be lower     11.23          (9.82, 13.65)
Disagreed by 1 point                 8.71           (7.17, 10.50)
Disagreed by 2 points                14.29          (12.29, 16.53)
Disagreed by 3 points                12.8           (10.91, 14.92)
Disagreed by 4 points                10.91          (9.17, 12.89)
Disagreed by 5 points                6.52           (5.19, 8.08)
Crisis Events Bias
Future Work
• Personalization
  - Users trust contacts in their network more (users they follow, retweet or mention)
• Intersection between psychology literature on information credibility & credibility of content on Twitter
Thank you!
http://twitdigest.iiitd.edu.in/TweetCred/