Designing and Evaluating Techniques to Mitigate Misinformation Spread on Micro-blogging Web Services
Aditi Gupta
Under the Supervision of Dr. Ponnurangam Kumaraguru
Indraprastha Institute of Information Technology, Delhi
July 6, 2015
Power of Social Media
• 300 hours of video uploaded every minute
• 500 million tweets posted every day
• 1.44 billion monthly active users
• 60 million photos shared every day
* 2015 statistics
Aim
Designing and Evaluating Techniques to Mitigate Misinformation Spread on Micro-blogging Web Services
Proposed Solution
• Learning-to-rank model for assessing credibility of tweets
• Model based on ground truth data for 20 real-world events and 45 features
• System evaluation using a year-long real-world experiment
• 1,800+ users requested credibility scores for more than 14.2 million tweets
Approach
• Characterizing Misinformation and Fake Content
  - Detecting fake images (Hurricane Sandy)
  - Analyzing rumor propagation (Boston blasts)
  - Detecting user communities (three events)
  - Analyzing rumors spread in India-centric events (Mumbai blasts and Assam riots)
• Ranking Framework to Assess Credibility
  - 14 events data tagging
  - 30% of tweets provide information (17% credible information)
  - Linear logistic regression
  - Present ranking algorithm to assess credibility in tweets using pseudo relevance feedback
• Building and Evaluating a Real-time System
  - 45 features computable for a single tweet
  - Live deployment: 1,800+ Twitter users
  - Credibility score computed for 14+ million tweets
  - Evaluated TweetCred in terms of response time, effectiveness and usability
Data Collection
• Created a 24x7 data collection framework
  - Streaming / REST APIs
  - JSON format
  - MySQL databases
• Collected 2+ billion tweets from 2011-14
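The storage step of such a collection framework can be sketched as follows. This is only an illustration under stated assumptions, not the framework used in the thesis: it uses Python's built-in `sqlite3` as a stand-in for MySQL, the table layout is hypothetical, and the field names (`id_str`, `user`, `created_at`, `text`) assume tweet objects shaped like those returned by the Twitter v1.1 APIs.

```python
import json
import sqlite3


def store_tweets(conn, raw_lines):
    """Parse JSON tweet payloads and store them; return the number of new rows."""
    # Hypothetical table layout; sqlite3 stands in for the MySQL database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id TEXT PRIMARY KEY, user_id TEXT, created_at TEXT, text TEXT)"
    )
    inserted = 0
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue  # streaming connections can deliver blank keep-alive lines
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed payloads rather than stop the collector
        if "id_str" not in tweet:
            continue  # not a tweet object (for example a limit notice)
        cur = conn.execute(
            "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)",
            (
                tweet["id_str"],
                tweet.get("user", {}).get("id_str"),
                tweet.get("created_at"),
                tweet.get("text"),
            ),
        )
        inserted += cur.rowcount  # 0 when the tweet id was already stored
    conn.commit()
    return inserted
```

Keying the table on the tweet id means that tweets seen through both the Streaming and REST APIs are stored only once.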
Background: Hurricane Sandy
• Dates: Oct 22-31, 2012
• Damages worth $75 billion
• Coast of NE America
Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy. Aditi Gupta, Hemank Lamba, Ponnurangam Kumaraguru and Anupam Joshi. Accepted at the 2nd International Workshop on Privacy and Security in Online Social Media (PSOSM), in conjunction with the 22nd International World Wide Web Conference (WWW), Rio De Janeiro, Brazil, 2013. Best Paper Award.
Data Description
Total tweets 1,782,526
Total unique users 1,174,266
Tweets with URLs 622,860
Tweets with fake images 10,350
Users with fake images 10,215
Tweets with real images 5,767
Users with real images 5,678
Network Analysis
[Figure: tweet-retweet graph for the propagation of fake images during the first 2 hours. Node -> User ID; Edge -> Retweet]
Role of Twitter Network
• Analyzed role of follower network in fake image propagation
• Crawled the Twitter network for all users who tweeted the fake image URLs
• Graph 1 - Nodes: Users, Edges: Retweets
• Graph 2 - Nodes: Users, Edges: Follow relationships
Results
Total edges in retweet network 10,508
Total edges in follower-followee network 10,799,122
Common edges 1,215
Percentage overlap 11%
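The overlap figure can be reproduced with a simple set intersection. The sketch below is illustrative only: it assumes both graphs are stored as directed `(source, target)` user-id pairs with the same orientation (retweeter to original poster, follower to followee), which the slides do not specify.

```python
def edge_overlap(retweet_edges, follow_edges):
    """Return (common edge count, share of retweet edges that are also follow edges)."""
    retweets = set(retweet_edges)
    follows = set(follow_edges)
    common = retweets & follows
    share = len(common) / len(retweets) if retweets else 0.0
    return len(common), share


# Toy example with made-up user ids.
toy_retweets = [("a", "b"), ("c", "b"), ("d", "a")]
toy_follows = [("a", "b"), ("d", "a"), ("x", "y")]
print(edge_overlap(toy_retweets, toy_follows))

# With the counts on the slide, the share of common edges is:
print(round(1215 / 10508 * 100, 1))  # 11.6, reported on the slide as 11%
```

A low overlap suggests that most retweets of the fake images did not travel along existing follow relationships.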
Classification
5-fold cross validation
Tweet Features [F2]
- Length of Tweet
- Number of Words
- Contains Question Mark?
- Contains Exclamation Mark?
- Number of Question Marks
- Number of Exclamation Marks
- Contains Happy Emoticon
- Contains Sad Emoticon
- Contains First Order Pronoun
- Contains Second Order Pronoun
- Contains Third Order Pronoun
- Number of uppercase characters
- Number of negative sentiment words
- Number of positive sentiment words
- Number of mentions
- Number of hashtags
- Number of URLs
- Retweet count
User Features [F1]
- Number of Friends
- Number of Followers
- Follower-Friend Ratio
- Number of times listed
- User has a URL
- User is a verified user
- Age of user account
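Most of the tweet features [F2] can be computed from the tweet text alone. The following is a minimal sketch, not the extraction code used in the study: the emoticon and pronoun lists are illustrative assumptions, and the sentiment-word counts and retweet count are left out because they need a sentiment lexicon and tweet metadata that the slides do not describe.

```python
import re

# Illustrative word lists (assumptions, not the lists used in the study).
HAPPY = (":)", ":-)", ":D")
SAD = (":(", ":-(")
FIRST = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND = {"you", "your", "yours"}
THIRD = {"he", "she", "it", "they", "him", "her", "them",
         "his", "hers", "its", "their", "theirs"}


def tweet_features(text):
    """Compute text-only tweet features for a single tweet."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return {
        "length": len(text),
        "num_words": len(text.split()),
        "has_question_mark": "?" in text,
        "has_exclamation_mark": "!" in text,
        "num_question_marks": text.count("?"),
        "num_exclamation_marks": text.count("!"),
        "has_happy_emoticon": any(e in text for e in HAPPY),
        "has_sad_emoticon": any(e in text for e in SAD),
        "has_first_order_pronoun": bool(tokens & FIRST),
        "has_second_order_pronoun": bool(tokens & SECOND),
        "has_third_order_pronoun": bool(tokens & THIRD),
        "num_uppercase": sum(ch.isupper() for ch in text),
        "num_mentions": len(re.findall(r"@\w+", text)),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_urls": len(re.findall(r"https?://\S+", text)),
    }
```

Each feature dictionary can then be turned into one row of the matrix passed to a classifier.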
Classification Results
                F1 (user)   F2 (tweet)   F1+F2
Naïve Bayes     56.32%      91.97%       91.52%
Decision Tree   53.24%      97.65%       96.65%
• The best results came from the Decision Tree classifier, which reached about 97% accuracy in distinguishing fake images from real ones.
• Tweet-based features are very effective in distinguishing fake-image tweets from real ones, while user-based features performed very poorly.
Boston Blasts
• Twin blasts occurred during the Boston Marathon
  - April 15th, 2013 at 18:50 GMT
• 3 people were killed and 264 were injured
• First image on Twitter (within 4 mins)
$1.00 per RT #BostonMarathon #PrayForBoston: Analyzing Fake Content on Twitter. Aditi Gupta, Hemank Lamba and Ponnurangam Kumaraguru. Accepted at IEEE APWG eCrime Research Summit (eCRS), San Francisco, USA, 2013.
Data Description
Total tweets 7,888,374
Total users 3,677,531
Time of the blast Mon Apr 15 18:50 2013
Time of first tweet Mon Apr 15 18:53 2013
Identifying Rumor / True Tweets
• Tagged the content of the 20 most viral tweets
  - Rumor / Fake
  - True
  - Generic (NA)
• Six Rumors
  - 130,690 Tweets / Retweets (29%)
  - R.I.P. to the 8 year-old boy who died in Boston's explosions, while running for the Sandy Hook kids. #prayforboston
• Seven True news
  - 116,454 Tweets / Retweets (20%)
  - Doctors: bombs contained pellets, shrapnel and nails that hit victims #BostonMarathon @NBC6
• Seven Generic
  - 206,816 Tweets / Retweets (51%)
  - #PrayForBoston
Fake Content User Profiles
                        Account 1     Account 2     Account 3     Account 4
No. of Followers        10            297           249           73,657
Profile Creation Date   Mar 24 2013   Apr 15 2013   Feb 07 2013   Dec 04 2008
Total No. of Statuses   2             2             294           7,411
No. of Fake Tweets      2             2             1             1
Current Status          Suspended     Suspended     Suspended     Active
Username: BostonMarathons
Spread of Fake Content
• Using linear regression
• Predict how viral a rumor would get
  - Based on attributes of users who are propagating the rumor
• Based on:
  - Follower
  - Friends
  - Favorited
  - Status
  - Verified
Predicting Spread of Fake Content
Results show that it is possible to predict how viral a rumor will become based on attributes of the users currently propagating it.
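The regression idea can be sketched with ordinary least squares. This is a deliberately simplified, single-feature illustration and not the model from the study, which combined several user attributes; the choice of predictor (combined follower count of users propagating the rumor in one time window) and of target (tweets in the next window) is an assumption, and the numbers are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: follower reach in window t, rumor tweets in window t+1.
followers_now = [1_000, 5_000, 20_000, 80_000]
tweets_next = [12, 40, 150, 610]
slope, intercept = fit_line(followers_now, tweets_next)
predicted = slope * 50_000 + intercept  # forecast for a reach of 50,000 followers
```

Extending this to the full attribute list (followers, friends, favorited, statuses, verified) means fitting one coefficient per attribute instead of a single slope.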
Credibility Ranking of Tweets during High Impact Events. Aditi Gupta and Ponnurangam Kumaraguru, Workshop on Privacy and Security on Online Social Media (PSOSM), co-located with the 21st International World Wide Web Conference (WWW), Lyon, France, 2012.
Tweets about an Event
[Diagram: tweets for an #event split into "Information" and "No information"; tweets with information split into "Credible information" and "Non-credible information". Example labels shown: fake news / rumors; personal opinions / spam; no. of people affected; place of event; pictures / videos.]
Data Statistics
Events                              Tweets    Trending Topics
UK Riots                            542,685   #ukriots, #londonriots, #prayforlondon
Libya Crisis                        389,506   libya, tripoli
Earthquake in Virginia              277,604   #earthquake, Earthquake in SF
JanLokPal Bill Agitation            182,692   Anna Hazare, #janlokpal, #anna
Apple CEO Steve Jobs resigns        158,816   Steve Jobs, Tim Cook, Apple CEO
US Downgrading                      148,047   S&P, AAA to AA
Hurricane Irene                      90,237   Hurricane Irene, Tropical Storm Irene
Google acquires Motorola Mobility    68,527   Google, Motorola Mobility
News of the World Scandal            67,602   Rupert Murdoch, #murdoch
Abercrombie & Fitch stocks drop      54,763   Abercrombie & Fitch, A&F
Muppets Bert and Ernie were gay      52,401   Bert and Ernie
Indiana State Fair Tragedy           49,924   Indiana State Fair
Mumbai Blast, 2011                   32,156   #mumbaiblast, Dadar, #needhelp
New Facebook Messenger               28,206   Facebook Messenger
Annotation
• Step 1
  - R1. Contains information about the event
  - R2. Is related to the event, but contains no information
  - R3. Not related to the event
  - R4. Skip tweet
• Step 2
  - C1. Definitely credible
  - C2. Seems credible
  - C3. Definitely incredible
  - C4. Skip tweet
Annotation Results
• Each tweet annotated by 3 people
• Inter-annotator agreement (Cronbach's alpha) = 0.748
• 30% of tweets provide information (17% credible information) and 14% was spam
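For reference, Cronbach's alpha for k annotators is k / (k - 1) * (1 - sum of per-annotator variances / variance of the per-tweet totals). The sketch below shows the calculation on made-up ratings; mapping the credibility labels to numbers (for example C1 = 3, C2 = 2, C3 = 1) is an assumption, since the slides do not say how labels were coded.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """ratings: one row per tweet, one numeric label per annotator."""
    if len(ratings) < 2 or len(ratings[0]) < 2:
        raise ValueError("need at least two tweets and two annotators")
    k = len(ratings[0])
    # Variance of each annotator's labels across all tweets.
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Variance of the summed labels per tweet.
    total_var = variance([sum(row) for row in ratings])
    if total_var == 0:
        raise ValueError("per-tweet totals have zero variance")
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


# Made-up labels from three annotators for four tweets.
example = [[1, 2, 1], [2, 2, 3], [3, 3, 3], [1, 1, 2]]
print(cronbach_alpha(example))
```

Values closer to 1 indicate that annotators rank tweets more consistently with one another.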
Feature Sets
Message based features
Length of the tweet
Number of words
Number of unique characters
Number of hashtags
Number of retweets
Number of swear language words
Number of positive sentiment words
Number of negative sentiment words
Tweet is a retweet
Number of special symbols [$, !]
Number of emoticons [:-), :-(]
Tweet is a reply
Number of @- mentions
Time lapse since the query
Has URL
Number of URLs
Use of URL shortener service
Source based features
Registration age of the user
Number of statuses
Number of followers
Number of friends
Is a verified account
Length of description
Length of screen name
Has URL
Ratio of followers to followees
Evaluation Metric
NDCG (Normalized Discounted Cumulative Gain) is the standard metric used to evaluate "graded" results.
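A minimal sketch of NDCG@k follows. It uses the common formulation with linear gain and a log2 position discount; the slides do not say which DCG variant was used, so treat that choice as an assumption. The input is the list of graded relevance labels in the order the ranker returned them.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions (linear gain)."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG of the given order divided by the DCG of the ideal (sorted) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Graded credibility labels in ranked order (made-up values).
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=5))
```

A ranking that already places the most credible tweets first scores 1.0; pushing credible tweets further down lowers the score.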
Ranking Results
• Both tweet-based and user-based features contribute to determining credibility: it matters "what you post and who you are"
PRF
• PRF (Pseudo Relevance Feedback)
  - Extract k ranked documents and then re-rank those documents according to a defined score
  - Re-ranking based on 'top words' of an event
  - Top n unigrams based on BM25 ranking function
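The re-ranking step can be sketched as below. This is an illustration under assumptions rather than the PRFRank implementation: the slide does not specify how the top unigrams are selected or how the similarity is computed, so the sketch picks the n unigrams with the highest document frequency in the top k tweets and scores each tweet against them with BM25, using the conventional defaults k1 = 1.2 and b = 0.75. The `tokenize` helper and the function name are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into unigrams, keeping hashtags and mentions."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def prf_rerank(ranked_tweets, k=10, n=5, k1=1.2, b=0.75):
    """Re-rank the top k tweets by BM25 score against the top n unigrams."""
    top = ranked_tweets[:k]
    docs = [tokenize(t) for t in top]
    if not docs:
        return []
    # Document frequency of each unigram within the top-k set.
    df = Counter(term for doc in docs for term in set(doc))
    query = [term for term, _ in df.most_common(n)]
    avg_len = sum(len(d) for d in docs) / len(docs) or 1.0

    def bm25(doc):
        tf = Counter(doc)
        score = 0.0
        for term in query:
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        return score

    # sorted() is stable, so ties keep the base ranker's order.
    order = sorted(range(len(top)), key=lambda i: bm25(docs[i]), reverse=True)
    return [top[i] for i in order]
```

In practice stop words would likely need to be removed before picking the top unigrams; otherwise common function words can dominate the query.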
Algorithm
[Diagram: tweets T1 ... Tn are ranked by SVM-Rank to give T'1 ... T'k ... T'n; top unigrams are extracted per event; PRFRank (similarity metric) re-ranks the top k to give T''1 ... T''k.]
TweetCred: Real-Time Credibility Assessment of Content on Twitter. Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo and Patrick Meier. Proceedings of the 6th International Conference on Social Informatics (SocInfo), Barcelona, Spain, 2014. Honorable Mention for Best Paper.
Features for Real-time Analysis
Feature set | Features (45)
Tweet meta-data | Number of seconds since the tweet; Source of tweet (mobile / web / etc.); Tweet contains geo-coordinates
Tweet content (simple) | Number of characters; Number of words; Number of URLs; Number of hashtags; Number of unique characters; Presence of stock symbol; Presence of happy smiley; Presence of sad smiley; Tweet contains 'via'; Presence of colon symbol
Tweet content (linguistic) | Presence of swear words; Presence of negative emotion words; Presence of positive emotion words; Presence of pronouns; Mention of self words in tweet (I; my; mine)
Tweet author | Number of followers; friends; time since the user is on Twitter; etc.
Tweet network | Number of retweets; Number of mentions; Tweet is a reply; Tweet is a retweet
Tweet links | WOT score for the URL; Ratio of likes / dislikes for a YouTube video
Training Data
• 500 tweets per event
• Used CrowdFlower service
Event                                   Tweets       Users
Boston Marathon Blasts (2013)            7,888,374    3,677,531
Typhoon Haiyan / Yolanda (2013)            671,918      368,269
Cyclone Phailin (2013)                      76,136       34,776
Washington Navy yard shootings (2013)      484,609      257,682
Polar vortex cold wave (2014)              143,959      116,141
Oklahoma Tornadoes (2013)                  809,154      542,049
Total                                   10,074,150    4,996,448
Annotation
• Step 1
  - R1. Contains information about the event
  - R2. Is related to the event, but contains no information
  - R3. Not related to the event
  - R4. Skip tweet
  - Result: 45% (class R1), 40% (class R2), and 15% (class R3)
• Step 2
  - C1. Definitely credible
  - C2. Seems credible
  - C3. Definitely incredible
  - C4. Skip tweet
  - Result: 52% (class C1), 35% (class C2), and 13% (class C3)
Ranking Model Evaluation
                  AdaRank      Coord. Ascent   RankBoost    SVM-rank
NDCG@25           0.6773       0.5358          0.6736       0.3951
NDCG@50           0.6861       0.5194          0.6825       0.4919
NDCG@75           0.6949       0.7521          0.689        0.6188
NDCG@100          0.6669       0.7607          0.6826       0.7219
Time (training)   35-40 secs   1 min           35-40 secs   9-10 secs
Time (testing)    <1 sec       <1 sec          <1 sec       <1 sec
Top Ten Features
• No. of characters in tweet
• Unique characters in tweet
• No. of words in tweet
• User has location in profile
• Number of retweets
• Age of tweet
• Tweet contains URL
• Tweet contains via
• Statuses / Followers
• Friends / Followers
Usage Statistics
Date of launch of TweetCred 27 Apr, 2014
Credibility score requests received 14,234,131
Unique Twitter users 1,808
Feedback was given for tweets 1,654
Unique users who gave feedback 364
* Data as of April 2015
Users of TweetCred
Sample users:
- Emergency responders
- Firefighters
- Journalists / news media
- General users
- Researchers (requested API tokens)
Limitations & Future Work
• Current research focuses on Twitter; we would like to analyze the credibility of content on other social media using a similar framework
• We would like to enhance the current system to indicate tweets that are timely, factual, well-written, etc.
Contributions Summary
• Analyzed how real and fake content is propagated through the Twitter network, with the purpose of assessing the reliability of Twitter as an information source during real-world events.
• Proposed a learning-to-rank framework for assessing credibility of content on Twitter using a combination of content, meta-data, network, user profile and temporal features.
• Evaluated and deployed a novel framework for providing indication of trustworthiness / credibility of tweets posted during events.
Real World Impact
• TweetCred, the real-time system built to assess credibility of content on Twitter, has been used by 1,808 real Twitter users to obtain credibility scores for more than 14.2 million tweets.
• A unique data set of thousands of fake images, rumor tweets and malicious profiles for 25+ real-world events.
Publications
• Peer Reviewed Publications
  - TweetCred: Real-Time Credibility Assessment of Content on Twitter. Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo and Patrick Meier. Proceedings of the 6th International Conference on Social Informatics (SocInfo), Barcelona, Spain, 2014. Honorable Mention for Best Paper.
  - $1.00 per RT #BostonMarathon #PrayForBoston: Analyzing Fake Content on Twitter. Aditi Gupta, Hemank Lamba and Ponnurangam Kumaraguru. Accepted at IEEE APWG eCrime Research Summit (eCRS), San Francisco, USA, 2013.
  - Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy. Aditi Gupta, Hemank Lamba, Ponnurangam Kumaraguru and Anupam Joshi. Accepted at the 2nd International Workshop on Privacy and Security in Online Social Media (PSOSM), in conjunction with the 22nd International World Wide Web Conference (WWW), Rio De Janeiro, Brazil, 2013. Best Paper Award.
  - Identifying and Characterizing User Communities on Twitter during Crisis Events. Aditi Gupta, Anupam Joshi and Ponnurangam Kumaraguru. Workshop on Data-driven User Behavioral Modeling and Mining from Social Media (UMSOCIAL), co-located with the 21st ACM International Conference on Information and Knowledge Management (CIKM), Hawaii, USA, 2012.
  - Credibility Ranking of Tweets during High Impact Events. Aditi Gupta and Ponnurangam Kumaraguru. Workshop on Privacy and Security on Online Social Media (PSOSM), co-located with the 21st International World Wide Web Conference (WWW), Lyon, France, 2012.
  - Beware of What You Share: Inferring Home Location in Social Networks. Tatiana Pontes, Gabriel Magno, Marisa Vasconcelos, Aditi Gupta, Jussara Almeida, Ponnurangam Kumaraguru and Virgilio Almeida. Privacy in Social Data (PinSoda), in conjunction with the International Conference on Data Mining (ICDM), 2012.
Publications
• Peer Reviewed Publications (Posters)
  - Analyzing and Measuring Spread of Fake Content on Twitter during High Impact Events. Aditi Gupta, Hemank Lamba and Ponnurangam Kumaraguru. Security and Privacy Symposium, IIT Kanpur, 2014. Best Poster Winner.
  - Twit-Digest Version 2: An Online Solution for Analyzing and Visualizing Twitter in Real-Time. Aditi Gupta, Mayank Gupta and Ponnurangam Kumaraguru. Security and Privacy Symposium, IIT Kanpur, 2014.
  - Twit-Digest: Real-time Twitter search portal for extracting, tracking and visualizing information. Aditi Gupta, Akshit Chhabra and Ponnurangam Kumaraguru. IBM ICARE 2012. 2nd Runner-up, Best Poster.
  - U2P2: Understanding User Privacy Perceptions. Niharika Sachdeva, Ponnurangam Kumaraguru and Aditi Gupta. Poster at IBM-ICARE, 2011.
• Book Chapter
  - Misinformation on Twitter during Crisis Events. Aditi Gupta and Ponnurangam Kumaraguru. Encyclopedia of Social Network Analysis and Mining (ESNAM), Springer, 2012.