vera data week preso and script

Sara Vera Senior Data Analyst at Causes.com [email protected] The Data Science of Collective Action Online Virality vs Real-World Impact

Upload: sara-vera

Post on 13-Apr-2017


TRANSCRIPT

Sara Vera, Senior Data Analyst at Causes.com

[email protected]

The Data Science of Collective Action

Online Virality vs Real-World Impact

http://oxfammonashpartnership.wordpress.com

Sortable: http://sortable.com/blog/rise-of-the-slacktivist/

http://atlanticsentinel.com/2012/

In this presentation

• For whom are we building this product?

• Where does Causes data come from?

• How do we use data to inspire civic action?

• How do we use data to measure impact?

Step One: Who are our users?

Good Ol’ Clustering

• Quantitative Methods

• K-means clustering

• Decision Trees

• Qualitative Methods

• Surveys

• Interviews

K-means Clusters
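The talk only names k-means; it does not show the analysis itself. As a rough illustration of the technique, here is a minimal pure-Python sketch of Lloyd's algorithm. The two features (age, actions per month), the six sample users, and the function name `kmeans` are all hypothetical and are not taken from the Causes data.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random (seeded for repeatability).
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical users described by (age, actions per month).
users = [(24, 30), (26, 28), (25, 35), (67, 4), (70, 6), (66, 5)]
centroids, labels = kmeans(users, k=2)
print(labels)
```

With real data, features on very different scales (age versus income, say) would normally be standardized first so that no single variable dominates the distance.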

Decision Trees

Surveys and Interviews

• Understand how people use Causes.com

• What inspired you to use Causes?

• Learn about campaign activity beyond Causes’ website

• How active are your friends in protests, petitions, boycotts?

• Learn as much as possible about a wide range of topics

Results: Six User Personas

Now, the next iteration of data collection

We know who’s using our site.

Where does our data come from?

• Demographic Profiles

• Campaign Interests

• Supporter Network

• Online Behavior

First step to civic engagement:

• Connect with your ‘friends,’ ‘followers’ and ‘connections’!

• Support this cause you care about!

• Follow this issue you’re interested in!

• Support the people who share your values!

Recommendation Engines!
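The talk does not describe how the recommendation engine works internally. One simple way to surface relevant campaigns from issue tags is to rank them by tag overlap with a user's chosen issues. The sketch below uses Jaccard similarity for that. It is an assumption for illustration, not Causes' actual method, and the campaign names and the functions `jaccard` and `recommend` are hypothetical.

```python
def jaccard(a, b):
    """Share of tags two sets have in common (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user_tags, campaigns, top_n=2):
    """Rank campaigns by tag overlap with the user; drop zero-overlap ones."""
    scored = [(jaccard(user_tags, tags), name) for name, tags in campaigns.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Issue tags a user picked as virtual bumper stickers (hypothetical user).
user_tags = {"conservation", "environment"}

# Hypothetical campaigns and their issue tags.
campaigns = {
    "Protect public lands": {"conservation", "environment", "oil and gas conservation"},
    "Safer sidewalks": {"pedestrian safety", "transportation"},
    "Clean rivers": {"environment", "water"},
}

result = recommend(user_tags, campaigns)
print(result)  # ['Protect public lands', 'Clean rivers']
```

A production system would likely blend this kind of tag signal with behavioral data and the supporter graph; tag overlap alone is only the simplest starting point.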

What about measuring impact?

Theory of Change

Causal logic behind intervention

• Inputs: what goes in

• Activities: what happens

• Outputs: immediate results

• Outcomes: medium-term results

• Impacts: long-term results

Ebrahim & Rangan 2010

Outcome of NBC Petition

• NRA loses an advertising platform

• Animal Conservation awareness

• 107K are ready to mobilize

Data and Impact

1. Data

2. Tools

3. Short-term Results

4. Long-term Impact

... And back to the beginning!

Thank you!

[email protected]

Sara Vera. 10/3/2013. DataWeek Talk

The Data Science of Collective Action: Online Virality versus Real-World Impact

[SLIDE Intro] Hi, my name is Sara Vera and I am a Senior Data Analyst at Causes.com. Causes is the world’s largest campaigning platform. We connect people who support a common cause and empower them to take action together. In September, Causes launched the Supporter Network, an independent network that connects ideologically aligned individuals, celebrities, nonprofit organizations and socially responsible brands around the world to inspire collective action.

Campaigning is dependent on networking and peer-to-peer influence. The Internet could be the greatest organizing tool humanity has devised, connecting people from all parts of the globe and increasing the feeling that an individual is able to mobilize a group to accomplish group-based goals, which, in collective action literature, is called perceived efficacy. We’ve seen glimmers [Arab Spring SLIDE] of the Internet’s potential to facilitate collective action in movements like the Arab Spring. At Causes, the data we collect makes it possible for us to deliver the best campaigning tools suited to organizers’ needs.

Today, I will discuss the data science of collective action and how we use data to inform a product that transforms [SLIDE about slacktivism] “Slacktivism” to Activism. Slacktivism -- for those of you who don’t know -- is a term to describe clicking buttons to feel like you’re making a difference in the world. While the term “slacktivist” makes it sound like online activism is a feeble pretense for making a difference, in reality, leveraging online tools to maximize virality and get a message beyond “the choir” is quite powerful. Further, according to Sortable, “slacktivists” are twice as likely as the general population to volunteer, take part in an event, and ask for donations. What if organizers could better mobilize these online activists? Online networks help us leverage peer-to-peer influence and give power to the collective voice.

[SLIDE Obama campaign] Even if you don’t love Barack Obama, you can’t deny that he squeezed every insight and action out of technology and data to make a successful bid for the presidency.

[CAUSES SLIDE] How can people optimize their campaigns and grow a campaign to scale without the tools and knowledge of content marketers? The answer is: our tools help people achieve this because they are built to leverage back-end data that helps ensure the right campaign is put in front of the right person, at the right time.

[SLIDE In this presentation] In this presentation, I will answer:

1. Who are we building this for?
2. Where does Causes data come from?
3. How do we use data to inspire collective action?
4. And how do we measure impact?

In answering these questions, I will help define how today’s online influencers are using today’s technology to better organize around common causes and campaigns for real-world change.

[SLIDE Who are our users] At Causes, our first mission is to get people civically engaged. We understand that to make a real-world impact, inspiring action is key. These actions are everything from reading and sharing content to starting a grassroots campaign.

[Good Ol’ Clustering SLIDE] But first, to get people involved, we need to understand who they are. When we set out to build the Supporter Network, we started by performing a cluster analysis of our existing users to inform the direction that our product would take. For this research, we subset our data to 200,000 users who used Causes between June and December 2012. From 230 variables, we derived 30 predictive variables by which to start categorizing our users, such as age, income, education, their activity level and topical campaign interests. We ran a k-means clustering algorithm on our entire dataset, which resulted in 6 distinct personality types.

[K-means SLIDE] This is a visualization of our clustering work. One of our groups is excluded because it is essentially a small “other” category of people who don’t really fit into any one cluster. You can see that the bottom two clusters on the slide exhibit drastically different behavior than the top three clusters -- that’s why they’re “farther” away. While it’s hard to gauge what distances like this really mean, it’s an informative visualization of what our user base looks like.

After performing the k-means cluster analysis to figure out who belongs in which cluster, we used these 6 clusters to train the [Random Forest SLIDE] random forest model to find out how predictive each variable is in identifying which cluster someone belongs to. Using the random forest model, we found 30 variables that best predicted user classification.

[Survey and Interview SLIDE] Using the behavioral and demographic trends we saw through our machine learning classification, we dug deeper into the motivations of our users through surveys and interviews. We received almost 1,500 responses to our 20-question survey.

[Results SLIDE] 98% of our users fit into one of these online activist personas.

“The ambitious activist” is in his mid-forties, and he’s passionate, talkative, and enthusiastic about sharing his newfound passion for creating impact in the world with as many people as possible.

“The practical activist” is in his late thirties, he has focused ideas about how he can best effect change in the world and he seeks to find a soapbox where he can share his ideas with an audience. Sites like Causes are supplements to his offline activism, not a substitute.

“The self-assured millennial” is in his mid-twenties. He is self-confident and believes that he can play an integral part in changing the world around him. But it can be difficult for him to pledge allegiance to just one campaign or organization because if he sees injustice, he wants to get involved, no matter the context.

“The organized retiree” has recently retired after a successful career and is not ready to slow down. As a practiced organizer, she approaches her work methodically: she researches legislation, educates herself and stays active through leadership roles in local organizations.

Although “The tenacious veteran activist” is retired and has a couple of health problems that keep her mostly homebound, her enthusiasm for change is as prevalent as ever. She’s been involved in campaigns for social change throughout her life; a lack of mobility has driven her to participate now online.

And while “The casual participant” would by no means identify herself as an activist, she does recognize the flaws and frustrations in the world around her. She visits sites like Causes through invitations from her friends, but does not feel much loyalty to the campaigns in which she participates.

[SLIDE Next iteration of data collection] Once we understood our users, we used the data to inform tools that would help connect users and inspire peer-to-peer sharing. Just like a field office asks people to knock on 100 doors, or make 100 calls, we used our data to build a Supporter Network that would distribute the campaign responsibilities by connecting “the casual participant” with “the tenacious veteran activist” who share a passion for health education, or the self-assured millennial with the organized retiree who are concerned with environmental conservation. So how do we know what these people care about?

[SLIDE] Where does our data come from? There are several components to the Causes website where we gather most of our data. We collect demographic data from profile pages and campaign pages, and we collect a lot of behavioral data when a user clicks or performs an action on our site. We also have a lot of offline data that we get from interviewing and surveying our users. Here are some examples of Profile and Campaign pages to give you a sense of the information from them that we can use.

Profile pages and Campaigns:

[SLIDE of my personal profile] Here is my Personal Profile page. You can see that I have displayed my civic identity by choosing virtual bumper stickers. Each sticker represents an issue that I care about, but is also a piece of data that we use at Causes to make decisions about our product and the best way we can facilitate supporter connections and campaign success. Over here, you can see whom I support and who supports me. Supporting basically means, “Hey, I like what you stand for, let me know how I can help.” Who you support is meant to be very intentional. Using the supporter network, individuals, nonprofits and brands can reach an interested audience and draw attention to specific issues and campaigns. We also use the supporter network to analyze demographics related to campaigning, fundraising, and political affiliations.

[Brand SLIDE] Brand and [Org SLIDE] Organization profile pages are a similar setup. When an organizer creates a campaign, they have the option of “tagging” the campaign with relevant issue categories so people can easily search for it. [Sierra Club Campaign Page] For example, Sierra Club campaigns might be tagged with ‘conservation,’ ‘environment,’ and more descriptive, specific tags depending on the campaign, such as ‘oil and gas conservation.’ This tagging system also allows us at Causes to categorize the campaigns on our website into “issue” pages in order to surface relevant content to our users. It is also possible to post comments, stories and photos, giving people a way to become more engaged and personally involved.

[Tony SLIDE] And finally, we have an example of a personal campaign page, which is a way for a supporter to bite off a smaller piece of a larger campaign to make the overall goal more tangible. Tony is collecting 10 signatures for a larger campaign that is collecting a total of 100 signatures for a petition to make North Beach sidewalks more pedestrian friendly. If Tony invites you to sign this petition, you can click back to the overall campaign to read more about the San Francisco County Transportation Authority, the neighborhood studies being conducted and other specific background information regarding this petition.

So I just walked you through how people set up their civic identity, create a supporter network, and take action on campaigns they care about. From here, we use this information to make this process more efficient and engaging.

[SLIDE Recommendation Engines] Some would say you need heart and passion but here, we need Discovery! Discovery is a huge first step to engaging our users. And with all of the information we see on the profile pages, the virtual bumper stickers, campaign issue tags, and behavioral data, we try to surface relevant content and help users determine who in their existing online networks on Facebook, LinkedIn and Twitter is likely to share their campaign interests.

While users are building their supporter networks and taking action, we continue to refine and suggest supporter-connections, issues you might find interesting, and inspire you to take action. This leads to the obvious question of assessing whether your actions have an impact in the real world.

[SLIDE] Measuring Impact

Measuring impact is a perennial problem for organizations invested in social change. It’s hard enough to define impact, let alone figure out metrics to keep track of. People need to feel that taking action online is a more rewarding experience than sending a tweet or posting on Facebook. If someone starts or participates in a campaign, is there a return on their investment? Can we use our data to show the impact of these campaigns?

[SLIDE Theory of Change] There has been some research from the Harvard Business School about how to measure impact. In this line of research, the authors outline a contingency framework for measuring results based on the causal logic of change that underlies any intervention program. The logic chain includes an organization’s Inputs and Activities that lead to outputs, outcomes, and ultimately, impact.

• Inputs include funds, equipment and supplies, knowledge and technical expertise.

• Activities include basic needs delivered, such as food and shelter; or services delivered, such as training programs.

• Outputs are the immediate results.

• Outcomes are the medium-term results such as improved living conditions and health, increased incomes, and enhanced political voice.

• And Impacts are long-term results that indicate fundamental changes in social norms.

Deciding what to measure depends on where in the logic chain your company or group lands. Given the diverse mission and capacities of companies and organizations, some should be monitoring long-term impacts, while others should stick to reporting immediate results.

At Causes today, [Causal Logic SLIDE] we have data to measure short-term outcomes in the first half of this logic chain. We keep track of money raised, petitions signed and other shorter-term outcomes of campaigns that our site facilitates. Here’s an example.

[Elephant SLIDE] You may have heard the controversy over an NRA-sponsored hunting show on the NBC Sports Network last week. The leader of this campaign was outraged when he saw an NRA lobbyist shoot a threatened bull elephant in the face on “Under Wild Skies”. The leader of this campaign originally created a petition to NBC to stop airing this episode to a national audience. He asked, “Do the values of NBC really fall in line with this type of programming where endangered animals are hunted down by the world’s leading pro-gun lobby?” This leader was able to use our supporter network to identify a large group of people concerned with animal rights, making participation and sharing rates for the petition very high. This is part of getting the right content in front of the right people, at the right time.

[Personal Campaigns SLIDE] The momentum really got going when 639 Personal campaigns were started to help the leader reach his overall goal of 100,000 signatures. Here, the transition from clicking to creating is the difference between slacktivism and online activism. Then, [Kristin Davis SLIDE] glamorous celebrities like Kristin Davis from Sex and the City started re-tweeting the petition, then [Media SLIDES] the LA Times, Huffington Post and other media outlets picked up the story, increasing public pressure.

[Elephant SLIDE] In the end, the collective influence of over 115,000 people helped to pressure NBC to cancel the program. This is how online action can be more effective than actual feet hitting the pavement. Participation is democratized in a way that inspires over 100,000 people to sign a petition within 4 days. As users see the number of people involved increasing, they are even more inspired to act. Digitizing our civic engagement allows everyone to get involved and say, “Hey, I stand behind this cause.”

[Logic Chain SLIDE] So, again, at Causes, here we are in the first part of the logic chain where we see the immediate result of this petition -- cancelling this NRA-sponsored television show. We can leverage and display success by showing how many people have signed the petition and created their own personal campaigns to bite off a piece of the larger goal. But this doesn’t measure the long-term impact.

[Outcomes SLIDE] There are several outcomes of this campaign. Causes was able to connect animal rights advocates concerned with the brutality of the hunt in which the elephant is not immediately killed, as well as gun control advocates who questioned the promoting of an NRA-sponsored program to a national audience. In addition to getting signatures, the campaign pages allowed supporters to collaborate and helped to raise awareness and recruit additional support. But in the future, we’d like to say that this petition contributed to an overall societal change in values.

[Data and Impact SLIDE] These longer-term outcomes are both challenging and exciting for data scientists. The ability to use data about who we are building for to create tools for short-term results is a promising start to measuring long-term impact. We need to figure out indicators of societal change that we can measure, and take note of how many campaigns of what types and volume and frequency it takes before we reach some threshold of “change.” It about makes my head explode just thinking about it! But... The more we can shape and refine strategy based on people’s behavior and practices, the more likely we will have lasting impact.

[THANK YOU SLIDE]