Searching for Credible Information via Social Media Mining

Huan Liu, Data Mining and Machine Learning Lab, Arizona State University


Page 1: Searching for Credible Information via Social Media Mining

Huan Liu

Data Mining and Machine Learning Lab, Arizona State University

Page 2: Thanks to Former and Current PhD Students of DMML

• Reza Zafarani, Asst Prof, Syracuse U
• Xia Hu, Asst Prof, Texas A&M U
• Magdiel Galan, Intel
• Shamanth Kumar, Castlight Health
• Pritam Gundecha, IBM Res Almaden
• Jiliang Tang, Asst Prof, MSU
• Huiji Gao, LinkedIn
• Ali Abbasi, Machine Zone
• Salem Alelyani, Asst Prof, King Khalid U
• Xufei Wang, LinkedIn
• Geoffrey Barbier, AFRL
• Lei Tang, Clari
• Zheng Zhao, Google
• Nitin Agarwal, Chair Prof, UALR
• Sai Moturu, PostDoc, MIT Media Lab
• Lei Yu, Assoc Prof, Binghamton U, NY
• Robert Trevino, AFRL
• Yunzhong Liu, LeEco, US
• Somnath Shahapurkar, FICO
• Fred Morstatter
• Isaac Jones
• Suhas Ranganath
• Suhang Wang
• Tahora Nazer
• Jundong Li
• Liang Wu
• Ghazaleh Beigi
• Kai Shu
• Justin Sampson

Page 3: False, Misleading, and Inaccurate Information

Arizona State University Data Mining and Machine Learning Lab, Searching for Credible Information, BJUT 2016

Misinformation (unintentional) & Disinformation (purposeful)

• Spam
• Fraud
• Fake News
• Rumor
• Urban Legend
• Gossip
• Information can be: true, false, or uncertain
• Big Data: 6th "V" Everyone Should Know About
  – Vulnerability
  – Social media has all 6 V's

Page 4: Spam in Social Media

• Unwanted content generated by spamming users, such as comments, chat messages, and fake requests, that is used to promote products or spread malicious information.
  – Fake reviews
  – Malicious links
  – Fake requests

Page 5: Fraud (Scam) in Social Media

• Social media fraud is the defrauding of, or taking advantage of, social media users through social media services.
  – Swindle money
  – Steal personal information

Page 6: Fake News Websites and Social Media

• Fake news websites deliberately publish hoaxes, propaganda, and disinformation to drive traffic, a problem exacerbated by social media
• Fake news can affect domestic politics, inflamed by social media, because resources to check the veracity of claims are limited
  – It is easy to "like" and "share", but it takes effort to check, albeit just a few clicks away (effort asymmetry)
• Fake news + Social media → Cyberwarfare

Page 7: Fake News Is Rampant in Social Media

• Fake news spreads on social media
  – Spreads rapidly
    • Crossover to other networks
  – Evolves fast
    • Modified content

Page 8: Fake News Can Cause Real Harm

• Pizzagate: fake news stories from Reddit led to a real shooting
  – "Fake News Onslaught Targets Pizzeria as Nest of Child-Trafficking", New York Times, 2016
• A false rumor erased $136 billion in 10 minutes

Page 9: Rumors

• Wikipedia: "A tall tale of explanations circulating from person to person and pertaining to an object, event, or issue in public concern".
• Rumors can be true or false.
  – False rumor

Page 10: Gossip in Social Media

• Gossip is idle chat and rumor about the personal and/or private affairs of others.
• Social media allows faster, larger-scale, and more convenient idle chat.
  – Celebrity: "Obamas moving to Asheville"
  – Friends: People "are much more likely to gossip when a story unites a familiar person with an interesting scenario."

"Familiarity with Interest Breeds Gossip: Contributions of Emotion, Expectation, and Reputation", PLoS ONE, 2014

Page 11: Urban Legend in Social Media

• Fictional stories with macabre elements rooted in local popular culture.
  – On social media, they develop faster and spread wider
  – Example: urban legend of Feng shui
• In summary, it is imperative to study credibility checking

Page 12: On Credibility Checking

• Studying different types of credibility and the need for different data and information sources in credibility checking
  – We don't have to reinvent the wheel in social media mining and can "stand on the shoulders of giants"
  – Machines differ from humans in credibility checking
• About Credibility Checking
  – Types of Credibility (social sciences, psychology, CS)
  – Aspects of Credibility Checking
  – Components of Credibility Checking in Social Media

Page 13: Four Types of Credibility

• Presumed credibility (general assumptions)
  – "Our friends usually tell the truth"
• Reputed credibility (based on third parties' reports)
  – For instance, prestigious awards or official titles
• Surface credibility (simple inspection)
  – "People judge a book by its cover"
• Experienced credibility (first-hand experience)
  – "Time can tell" (路遥知马力,日久见人心: "distance tests a horse's strength; time reveals a person's heart")
• Any new type to explore in social media?

Page 14: Aspects of Credibility Checking (CC)

• Can we turn CC into a problem that is easier for users or AMTurks (without much expertise) to check?
• Issues about Credibility Checking Measures
  – Reputation and History (time)
  – Accuracy and Relevance
  – Transparency and Integrity (consistency)
  – Response from independent sources (consistency)
• Implication or impact assessment
  – Not every piece of fake news is disastrous
  – "Warn or not to warn": how to balance?

Page 15: Components in Credibility Checking in Social Media

News/Post → Fake? Yes / No / Uncertain

• Recipients
  – Expertise, experience
  – Background, occupation
• Senders
  – Reputation
  – Length of online presence
  – Social networks
• Source of information
  – Provenance
  – Reputation, Curation/Editing
  – Length
• Content
  – Writing style
  – Topics
  – URLs
  – Multimedia
• Network context
  – Topic thread (outlier detection)
  – Retweets
  – Replies
  – Comments
• Crowdsourcing (fact-checking sites, e.g., Snopes)
• Ground truth (multifaceted, gold standard)

Page 16: Searching for Credible Information

[Diagram labels: Credible Data; Spam; Bots (automatically generated content); Fake News; Rumor]

• A Unique Challenge
  – Ground truth
• Additional Challenges
  – Credibility verification
  – Dynamic change
  – Timeliness
• Alternative Approaches
  – Rumor Detection
  – Spam Detection
  – Bot Detection
  – Inferring Distrust

Page 17: Using Social Media for Credibility Checking

• Velocity and Volume
  – 6,000 tweets per second, about 500 million per day on Twitter (6,000 × 86,400 s ≈ 518 million)
  – 55 million status updates and 300 million photos per day on FB
• Variety
  – Geo-spatial, textual, pictorial, temporal, social dimensions
  – Cross modality (e.g., geotagged pictures)
• Veracity
  – Truthfulness and accuracy of information
• Use big data, multi-source info, and social networks to compensate for lack of expertise (以其之矛还其之盾: "turn their own spear against their own shield")

Page 18

A decent breakdown of all things real and fake news. http://imgur.com/7xHaUXf

Page 19: Rumor Detection

• Rumor: unverified and relevant information that circulates in the context of ambiguity.
• Goal: detecting emerging rumors with minimum information as early as possible
  – If intervention is not feasible, obtain early warning or be prepared
• Challenges:
  – How to overcome the lack of information in a single tweet?
  – How to detect rumors in their formative stage?

Page 20: Insufficient Information in a Single Tweet

• A single tweet could be damaging, but contains little information w/o context for detection
• Treat batches of tweets as "conversations"
  – Based on keyword similarities
  – Based on reply chains
• Aggregate conversations
  – Shared hashtags
  – Common links
  – Cosine similarity

[Figure: conversation size, 1 to 9 tweets vs. 10+ tweets, with the point of acceptable accuracy marked]
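The aggregation idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the cited work: the greedy single-pass grouping, the whitespace tokenizer, the function names, and the 0.5 threshold are all hypothetical choices.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with counts (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_tweets(tweets, threshold=0.5):
    """Greedily merge tweets into conversations.

    A tweet joins the first conversation that shares a hashtag with it or
    whose pooled text is similar enough; otherwise it starts a new one.
    The threshold is an assumed value, not one taken from the slides.
    """
    conversations = []
    for text in tweets:
        bag = bag_of_words(text)
        hashtags = [token for token in bag if token.startswith("#")]
        for conv in conversations:
            shares_hashtag = any(tag in conv["bag"] for tag in hashtags)
            if shares_hashtag or cosine(bag, conv["bag"]) >= threshold:
                conv["tweets"].append(text)
                conv["bag"].update(bag)  # pool the tokens of the conversation
                break
        else:
            conversations.append({"bag": bag, "tweets": [text]})
    return [conv["tweets"] for conv in conversations]
```

Pooling tokens per conversation is what gives a classifier more context than any single tweet carries.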

Page 21: Detection of Emerging Rumors

• Emergent detection: link the first tweet in a rumor with those already posted
• Standard rumor classifications are not effective for small conversations
  – Lack of network and statistical data
  – Data sparsity issues
• Implicit linking works effectively for detecting small rumor cascades

Page 22: Bot Detection

• Bots
  – Innocuous: relay information from official sources
  – Malicious: spread rumors and false information
• Goal: Remove bots from social media data with high Recall
  – WHY?
• Challenges
  – Acquiring ground truth
  – Increasing Recall without significantly reducing Precision

Page 23: Bots in Social Media

• Bots on Twitter:
  – Twitter claims 5% of 230M users are bots*.
  – One study found 20M bot accounts = 9%**.
  – 24% of all tweets are generated by bots***.
• 5-11% of Facebook accounts are fake****.

* http://blogs.wsj.com/digits/2014/03/21/new-report-spotlights-twitters-retention-problem/
** http://www.nbcnews.com/technology/1-10-twitter-accounts-fake-say-researchers-2D11655362
*** https://sysomos.com/inside-twitter/most-active-twitter-user-data
**** http://thenextweb.com/facebook/2014/02/03/facebook-estimates-5-5-11-2-accounts-fake/

Page 24: Finding Ground Truth

• Three states of a Twitter user:
  – Active
  – Suspended
  – Deleted
• Idea:
  – Use these states as labels
  – Two snapshots of each user are taken

[Diagram: Status on Twitter as a labeling mechanism. Initial crawl: finds seed set of users; crawls profile, network, ... User states: Active, Suspended, Deleted]
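A minimal sketch of how two snapshots could be turned into labels. The slide only says that account states serve as labels; the specific mapping used here (suspended later means "bot", still active means "human", deleted accounts are left unlabeled) and the function name are assumptions made for illustration.

```python
def label_users(first_snapshot, second_snapshot):
    """Derive labels from two crawls of the same users.

    Both arguments map user id -> "active", "suspended", or "deleted".
    Assumed mapping (not stated on the slide): accounts that were active
    at the first crawl and suspended at the second are labeled "bot",
    accounts still active are labeled "human", and deleted accounts are
    skipped because deletion may be the user's own choice.
    """
    labels = {}
    for user, status in first_snapshot.items():
        if status != "active":
            continue  # only accounts observed alive in the first crawl are usable
        later = second_snapshot.get(user, "deleted")  # missing is treated as deleted
        if later == "suspended":
            labels[user] = "bot"
        elif later == "active":
            labels[user] = "human"
    return labels
```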

Page 25: Ground Truth - Honeypots

• Act as obvious bot accounts
• Attract other bot accounts
• Bots are identified when they follow our account
• Assumption: Real users do not follow bots

Page 26: Honeypots - Logic

• Post "Luring" Content
  – Post content that will be seen
  – Trending topics, hashtags, "famous" tweets...
• Maintain Network Connections
  – "Follow back", Retweets
  – Fame begets fame
• Promote Other Honeypots
  – Retweet each other's tweets
  – Mention each other

[Flowchart: Honeypot accounts → choose honeypot h → retweet a random honeypot (10%) or sample a random tweet t (90%) → for a sampled tweet, h retweets t (30%) or h copies t (70%) → record h's new friends → wait 10 s → follow new friends]
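The branch probabilities in the flowchart can be simulated as below. This is a sketch of one reading of the extracted figure, which is an assumption: 10% retweet another honeypot, otherwise a random tweet is sampled and then retweeted 30% of the time or copied 70% of the time. The function names are hypothetical, and no Twitter API calls are made.

```python
import random


def choose_action(rng):
    """Pick one honeypot action for the current step.

    Probabilities follow one reading of the slide's flowchart (assumed):
    10% retweet a random honeypot; of the remaining 90%, 30% retweet a
    sampled tweet and 70% copy it.
    """
    if rng.random() < 0.10:
        return "retweet_random_honeypot"
    if rng.random() < 0.30:
        return "retweet_sampled_tweet"
    return "copy_sampled_tweet"


def simulate(steps, seed=0):
    """Count how often each action is chosen over a number of steps."""
    rng = random.Random(seed)
    counts = {
        "retweet_random_honeypot": 0,
        "retweet_sampled_tweet": 0,
        "copy_sampled_tweet": 0,
    }
    for _ in range(steps):
        counts[choose_action(rng)] += 1
    return counts
```

In a real deployment each step would be followed by recording the honeypot's new followers, waiting, and following them back, as the flowchart indicates.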

Page 27: BoostOR

• Based on AdaBoost
• Try to increase Recall without drastic decrease in Precision
• Iteratively update the weight of instances:
  – Unchanged if correctly classified
  – Decreased if false negative
  – Increased if false positive
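The three update rules listed on this slide can be written as a single pass over the instances. This is a sketch only: the multiplicative `factor` and the function name are hypothetical, the slide does not give the actual update amounts, and no renormalization step is shown.

```python
def update_weights(weights, y_true, y_pred, factor=2.0):
    """Apply the slide's three rules to instance weights.

    Labels are 1 (bot, the positive class) or 0 (human). `factor` is an
    assumed multiplicative step, not a value from the slides.
    """
    updated = []
    for weight, truth, prediction in zip(weights, y_true, y_pred):
        if truth == prediction:
            updated.append(weight)           # correctly classified: unchanged
        elif truth == 1 and prediction == 0:
            updated.append(weight / factor)  # false negative: decreased
        else:
            updated.append(weight * factor)  # false positive: increased
    return updated
```

For example, `update_weights([1.0, 1.0, 1.0, 1.0], [1, 1, 0, 0], [1, 0, 1, 0])` leaves the first and last weights alone, halves the second, and doubles the third.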

Page 28: Trust-Distrust Prediction

• Goal
  – Trust and distrust relations can play an important role in helping online users collect reliable information
  – Finding trustworthy users and reliable information is of significant importance
  – How to predict trust relations between users?
• Challenges
  – Trust relations are extremely sparse
  – Distrust relations are even sparser than trust ones
  – Finding substitute features indicative of trust and distrust

Page 29: Trust and Emotions

• According to psychology, users' emotions can be strong indicators of trust and distrust relations
• Emotional information is more available than that of trust/distrust
• There exists a correlation between emotions and trust/distrust relations

Page 30: Modeling Emotional Information

• Users with positive (negative) emotions are more likely to establish trust (distrust) relations
• Users with high positive (negative) emotion strengths are more likely to establish trust (distrust)
• The Emotional Trust Distrust framework (ETD)
  – Low-rank matrix factorization
  – Emotional information regularization

Page 31: Studying Bias in Social Media Data

• Twitter shares its data
  – "Firehose" feed - 100% - costly
  – "Streaming API" feed - 1% - free
• We usually obtain data via sampling
  – Is the sampled data from the Streaming API representative of the true activity on Twitter's Firehose?
• Challenges
  – How to determine if the sample is biased when we do not have access to the whole data?
  – How to obtain an unbiased sample?

Page 32: Twitter's Streaming API vs. Firehose

• Data from the Firehose and the Streaming API were collected over a specific period of time for analysis
• More than 90% of all geotagged tweets are available via the Streaming API, and there is no significant difference in location distribution
• Based on in-degree centrality and betweenness centrality in user-user retweet networks, the Streaming API finds ~50% of the key users
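One way to measure "how many key users the sample finds" is the overlap of the top-k users by in-degree in the two retweet networks. The sketch below covers in-degree only and assumes each edge is a `(retweeter, original_author)` pair; the function names and the choice of k are illustrative, not taken from the study.

```python
from collections import Counter


def top_k_by_in_degree(edges, k):
    """Return the k users who receive the most retweets.

    Each edge is assumed to be (retweeter, original_author), so a user's
    in-degree is the number of times they were retweeted.
    """
    in_degree = Counter(author for _, author in edges)
    return {user for user, _ in in_degree.most_common(k)}


def key_user_coverage(full_edges, sample_edges, k):
    """Fraction of the full network's top-k users also in the sample's top-k."""
    full_top = top_k_by_in_degree(full_edges, k)
    sample_top = top_k_by_in_degree(sample_edges, k)
    return len(full_top & sample_top) / len(full_top) if full_top else 0.0
```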

Page 33: Mitigating Bias in Twitter's Streaming API

Can we find bias without the Firehose?

Estimating Bias from Streaming API:
  – Obtain trend of hashtag from Sample API and Streaming API
  – Bootstrap Sample API to obtain confidence intervals
  – Mark regions where Streaming API is outside of confidence intervals

Mitigating Bias:
  – Leverage multiple crawlers to maximize data for each query
  – Round Robin Splitting
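The bootstrap step can be sketched as follows, under assumptions that go beyond the slide: each time window of the Sample API is represented as a list of 0/1 indicators (whether a tweet carries the hashtag), the interval is a plain percentile interval, and the names, 1,000 resamples, and 95% level are illustrative defaults.

```python
import random


def bootstrap_ci(indicators, n_boot=1000, alpha=0.05, rng=None):
    """Percentile bootstrap interval for the share of tweets with a hashtag.

    `indicators` holds one 0/1 value per tweet in a Sample API window.
    """
    if rng is None:
        rng = random.Random(0)
    n = len(indicators)
    shares = sorted(sum(rng.choices(indicators, k=n)) / n for _ in range(n_boot))
    lower = shares[int(n_boot * alpha / 2)]
    upper = shares[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


def flag_windows(sample_windows, streaming_shares, seed=0):
    """Mark each window where the Streaming API share leaves the interval."""
    rng = random.Random(seed)
    flags = []
    for indicators, share in zip(sample_windows, streaming_shares):
        lower, upper = bootstrap_ci(indicators, rng=rng)
        flags.append(not (lower <= share <= upper))
    return flags
```

Windows flagged `True` are the regions where the Streaming API trend departs from what the Sample API suggests, without needing the Firehose.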

Page 34: Time-Critical Information in Crisis Response

• Social media is used to request immediate assistance during crises
• Time-critical posts demand immediate attention
• Addressing these queries promptly can help in emergency response
• How can these posts be distinguished from others?
• What Is Required in Finding Time-Critical Responses?
  – Users with expertise or knowledge
  – Fast response
  – Relevant answers

Page 35: Finding Time-Critical Responses

• Many questions asked during a crisis should be attended to immediately
• Many responders are busy
• How can we find a prompt responder who can provide a relevant answer?
• Challenges of Identifying Prompt Responders
  – How do we estimate the reply time of users to identify prompt responders?
  – Timeliness and relevance: how do we integrate timeliness with relevance to rank candidate responders?

Page 36: Information Seeking in Social Media

• Social media is used to request help during crises
• Addressing these queries promptly can help in emergency response

Page 37: Identifying Candidate Responders

• Timeliness
  – The user can respond more quickly if she is available soon after the question is posted. It can be estimated using the previous posting times
  – A user responds to questions faster if she has replied promptly to similar questions in the past
• Relevance
  – Users whose previous content is similar to the question have higher relevance and their response is more likely to be a relevant answer
• Timeliness and relevance are integrated by combining the ranking scores
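A small sketch of combining the two scores into one ranking. The slide does not say how the scores are computed or combined, so everything here is an assumption: relevance is bag-of-words cosine similarity to the question, timeliness is the fastest average reply time divided by the candidate's own, and `alpha` is a hypothetical mixing weight.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_responders(question, candidates, alpha=0.5):
    """Rank candidates by a weighted sum of timeliness and relevance.

    `candidates` maps a user name to (past_text, avg_reply_minutes), with
    reply times greater than zero. `alpha` weights timeliness against
    relevance and is an assumed parameter.
    """
    query = Counter(question.lower().split())
    fastest = min(minutes for _, minutes in candidates.values())
    scored = []
    for name, (past_text, minutes) in candidates.items():
        relevance = cosine(query, Counter(past_text.lower().split()))
        timeliness = fastest / minutes  # 1.0 for the quickest responder
        scored.append((alpha * timeliness + (1 - alpha) * relevance, name))
    return [name for _, name in sorted(scored, reverse=True)]
```

With `alpha` near 1 the ranking favors whoever replies fastest; near 0 it favors whoever has written about the topic before.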

Page 38: Searching for Credible Information

[Diagram labels: Credible Data; Spam; Bots (automatically generated content); Fake News; Rumor]

• A Unique Challenge
  – Ground truth
• Additional Challenges
  – Credibility verification
  – Dynamic change
  – Timeliness
• Alternative Approaches
  – Rumor Detection
  – Spam Detection
  – Bot Detection
  – Inferring Distrust

以其之矛还其之盾 ("turn their own spear against their own shield")

Page 39: Thank You All

• Professor Yang's kind invitation and warm hospitality
• Funding support from ONR, NSF, ARO, among others
• DMML Lab former and current members, and Liang Wu for helping with the preparation of this presentation

Search for "Huan Liu" for more information about DMML

H Liu, F Morstatter, J Tang, and R Zafarani. "The good, the bad, and the ugly: uncovering novel research opportunities in social media mining", in Trends of Data Science, International Journal on Data Science and Analytics, Springer International Publishing Switzerland. September, 2016. DOI 10.1007/s41060-016-0023-0

Page 40: Repositories and Recent Books

• scikit-feature – an open source feature selection repository in Python
• Social Computing Repository

Page 41

http://dmml.asu.edu/smm/

Page 42: References

1. [Beigi SDM'16] Ghazaleh Beigi, Jiliang Tang, Suhang Wang, and Huan Liu. "Exploiting Emotional Information for Trust/Distrust Prediction". SIAM International Conference on Data Mining (SDM16), May 5-7, 2016. Miami, Florida.

2. [Morstatter ASONAM'16] Fred Morstatter, Liang Wu, Tahora H. Nazer, Kathleen M. Carley, and Huan Liu. "A New Approach to Bot Detection: Striking the Balance Between Precision and Recall". IEEE/ACM International Conference on Advances in Social Network Analysis and Mining (ASONAM 2016), August 18-21, San Francisco, CA.

3. [Morstatter WWW'14] Fred Morstatter, Jürgen Pfeffer, and Huan Liu. "When is it Biased? Assessing the Representativeness of Twitter's Streaming API". WWW Web Science 2014.

4. [Morstatter ICWSM'13] Fred Morstatter, Jürgen Pfeffer, Huan Liu, and Kathleen M. Carley. "Is the Sample Good Enough? Comparing Data from Twitter's Streaming API with Twitter's Firehose". ICWSM 2013.

5. [Sampson CIKM'16] Justin Sampson, Fred Morstatter, Liang Wu, and Huan Liu. "Leveraging the Implicit Structure within Social Media for Emergent Rumor Detection". Short paper. ACM International Conference on Information and Knowledge Management (CIKM 2016), October 24-28, 2016. Indianapolis, Indiana.

6. [Sampson ICDM'15] Justin Sampson, Fred Morstatter, Reza Zafarani, and Huan Liu. "Real-Time Crisis Mapping Using Language Distribution". Demo. In Proceedings of IEEE International Conference on Data Mining (ICDM 2015), November 14-17, 2015. Atlantic City, NJ.