Mapping Big Data Solutions for the Sustainable Development Goals

LIRNEasia | www.lirneasia.net

March 2017

Sriganesh Lokanathan, Thavisha Gomez, Shazna Zuhyle

DRAFT


LIRNEasia is a pro-poor, pro-market think tank whose mission is catalyzing policy change through research to improve people's lives in the emerging Asia Pacific by facilitating their use of hard and soft infrastructures through the use of knowledge, information and technology.

Contact: 12 Balcombe Place, Colombo 00800, Sri Lanka. +94 11 267 1160. www.lirneasia.net

This work was carried out with the aid of a grant from the International Development Research Centre (IDRC), Canada.


Authorship and acknowledgements

The report was prepared by a three-person team from LIRNEasia led by Sriganesh Lokanathan and comprising Thavisha Gomez and Shazna Zuhyle. Comments by Prof. Rohan Samarajiva (Founding Chair, LIRNEasia) are gratefully acknowledged.

This research was carried out with the aid of a grant from the International Development Research Centre (IDRC), Canada.


Table of Contents

Executive Summary
Introduction
Big Applications for the SDGs
    Goal 1: No Poverty
    Goal 2: Zero Hunger
    Goal 3: Good Health and Wellbeing
    Goal 4: Quality Education
    Goal 5: Achieve Gender Equality and Empower all Women and Girls
    Goal 6: Clean Water and Sanitation
    Goal 7: Access to Clean and Affordable Energy
    Goal 8: Decent Work and Economic Growth
    Goal 9: Industry, Innovation and Infrastructure
    Goal 10: Reduced Inequalities
    Goal 11: Sustainable Cities and Communities
    Goal 12: Responsible Consumption and Production
    Goal 13: Climate Action
    Goal 14: Life Below Water
    Goal 15: Life on Land
    Goal 16: Peace, Justice and Strong Institutions
    Goal 17: Partnerships for the Goals
Challenges
    Analytical challenges
    Accessing Data
    Capacity
    Privacy
    Ethics
    Competition
Conclusion
ANNEX I: SDG Targets by Big Data Source
ANNEX II: Big Data Sources by Key Applications
References

List of Tables

TABLE 1: No Poverty
TABLE 02: Zero Hunger
TABLE 03: Good Health and Well-being
TABLE 04: Quality Education
TABLE 05: Gender Equality
TABLE 06: Clean Water and Sanitation
TABLE 07: Affordable and Clean Energy
TABLE 08: Decent Work and Economic Growth
TABLE 09: Industry, Innovation and Infrastructure
TABLE 10: Reduced Inequalities
TABLE 11: Sustainable Cities and Communities
TABLE 13: Climate Action
TABLE 15: Life on Land
TABLE 16: Peace, Justice and Strong Institutions


Executive Summary

The thrust to utilize big data for official statistics underscores the potential for generating new insights and complementing existing measures. Big data can potentially revolutionize current official statistical systems in one of several ways¹: a) entirely replace existing statistical sources such as surveys; b) partially replace existing statistical sources such as surveys; c) provide complementary statistical information in the same statistical domain but from other perspectives; d) improve estimates from statistical sources; and e) provide completely new statistical information in a particular statistical domain. At the moment, complementing existing data is what offers the greatest potential for big data sources.

Research conducted to date has explored the utilization of different data sources for specific developmental applications. For example, the analysis of satellite imagery has typically been used to monitor changes in topography, including crop/yield estimation, drought monitoring, deforestation and carbon stock mapping, among others. Mobile network big data has proven to be useful in understanding mobility patterns of the population, creditworthiness of its users and socioeconomic status of the population, among others, while social media data is well positioned for sentiment analysis. There have been literature reviews (for example, Williams, 2016, UK's Office of National Statistics; Lokanathan and Gunaratne, 2014) that have captured the statistical applications of big data, in particular mobile phone data.

The big data for development (BD4D) landscape hosts a range of players spanning government, industry, academia, and civil society, among others. These players operate as policy actors, researchers, funders as well as intermediaries. Understanding the various actors at play offers greater opportunities for strategic partnerships and collaborations. Thus, Goal 17 has been viewed through the lens of big data for development, stressing the need for collaboration between various big data actors, and providing a snapshot of the BD4D landscape.

It is important to note that while insights derived from big data can be used to measure some SDG indicators, the potential for big data lies in being able to help achieve specific targets. For instance, while big data may not be particularly useful in calculating the death rate due to road traffic injuries (indicator 3.6.1), it can help identify hotspots for traffic accidents, enabling authorities to take preventive action, thus contributing to the achievement of target 3.6, which is “by 2020, halve the number of global deaths and injuries from road traffic accidents.”

Leveraging new as well as existing data sources (from both the public as well as private sectors) for the purposes of monitoring the progress towards the SDGs as well as for achieving them is not without challenges and requires the confluence of several factors. Developing economies in particular have much lower levels of “datafication” than developed economies, which means some of the most interesting and relevant data exists amongst the private sector, as shown in this report. Accessing such data will not be without challenges, not least because in competitive industries such as the telecom sector, there would be competitive implications to sharing data. Even then there are costs and considerations associated with extracting and analyzing the data from private sector industries so as to sufficiently protect aspects related to customer privacy/confidentiality as well as protecting commercially sensitive business intelligence in competitive industries (e.g. the telecom sector). Innovation will have to unleash appropriate business models for leveraging these data sources. But innovation in this space will also require the confluence of a variety of actors such as state, private, academia, and importantly non-governmental researchers and practitioners. New forms of partnerships between these actors are required not just because the true value comes from assembling different types of data from different sources, but also because of the inherent capacity challenges to make full use of the data. This is inherently a multi-disciplinary effort requiring computer scientists, statisticians, and domain/subject-matter experts along with government officials.

1 See for example http://www.q2014.at/fileadmin/user_upload/ESTAT-Q2014-BigDataOS-v1a.pdf

As this report shows, it is important to remember that despite the rich body of literature and applications that already exist, the state of the art in innovative development-focused applications of these new data sources is still very much in its embryonic stages. A new technique successfully applied for a specific context in a specific country or region doesn't necessarily translate across time and space, especially when the underlying data that is being leveraged is behavioral data. A principal concern of the data revolution is about “counting the uncounted” and as such we need to pay particular attention to the “representativity” of these new data sources, i.e. how accurately they reflect the population. Marginalization in the real world can often result in marginalization in the digital world beyond just issues of access to technologies, or being represented in the digitized data. As such, localized testing of these new techniques with local experts (aware of ground truth and local context) is very important.

A consensus is yet to arise on how to address the privacy and ethical dilemmas that may arise from these new applications of data, but these issues will continue to be important. It will be imperative that laws, regulations, and guidelines are rooted in practical examples of harms rather than in imagined ones, so as to ensure that the public benefits of these data innovations are not greatly diminished. As datafication increases in society and digital platforms come to the fore, a related critical concern for the economy is the nature of competition in sectors. This may affect the type of data that is collected and shared and, in particular, the longer-term success of the “data philanthropy” concept.


Introduction

The adoption of the 2030 sustainable development agenda, which seeks to “ensure that no one is left behind”, places a strain on countries to report data that can be compared over time and are available at relatively frequent intervals. The 17 Sustainable Development Goals (SDGs), which will be monitored based on 169 targets, will in turn be measured by more than 230² indicators that have been classified into three tiers based on the methodology, standards and data available to measure/monitor them:

− “Tier 1: Indicator conceptually clear, established methodology and standards available and data regularly produced by countries

− Tier 2: Indicator conceptually clear, established methodology and standards available but data are not regularly produced by countries

− Tier 3: Indicator for which there are no established methodology and standards or methodology/standards are being developed/tested.”

As of 21 December 2016, there were 83 Tier III indicators.³

This lack of data underscores the opportunities for non-traditional data sources to complement traditional statistics in the interval between official surveys, and develop new ways of monitoring the SDG targets. There is opportunity to capitalize on the use of non-traditional data sources such as mobile phone data, satellite-based technology and social media data, among others, to generate insights across a range of development issues.

This report has sought to capture the applications of new data sources, specifically big data sources, to measure the sustainable development goals and relevant targets by reviewing relevant literature (both peer-reviewed and grey literature) and reports. Given the size and scope of research on the big data for development space, the references used in this report represent a sample of the research conducted.

Additionally, the report outlines the current concerns with the use of big data (privacy, marginalization, competition, etc.) and provides a discussion of the interplay of these issues so as to facilitate a conducive and sustainable environment for leveraging big data to achieve the global goals.

2 While the number of indicators listed in the final proposal is 244, there are a few that are repeated across different targets. Hence, the number of unique indicators is 232. http://unstats.un.org/sdgs/indicators/indicators-list/

3 Tier Classification for Global SDG Indicators, 21 December 2016 (PDF), from http://docplayer.net/27317561-Tier-classification-for-global-sdg-indicators-21-december-2016.html


Big Applications for the SDGs

Goal 1: No Poverty

The United Nations' “No Poverty” goal seeks to “eradicate poverty in all its forms” through the achievement of seven key targets, each monitored by various indicators, making it imperative for countries to report poverty data that can be compared over time and are available at relatively frequent intervals.

However, the dearth of poverty data in many countries presents significant challenges; for example, according to Serajuddin et al. (2015), of the 155 countries whose poverty data is monitored by the World Bank, more than 57 countries had either just one or no poverty estimates during the period 2002-2011. Moreover, among countries that did publish two poverty data points, 20 countries had more than a five-year interval between the estimates. Serajuddin et al. (2015) attribute the lack of poverty estimates to the lack of household surveys.

This lack of data underscores the opportunities for non-traditional data sources to complement traditional statistics in the interval between official surveys, and develop new ways of monitoring the SDG targets. Insights derived from big data sources may help to fill the gaps relating to poverty data, rather than replace official surveys. Mobile phone data, for instance, can support this goal by helping to identify the poor (determining the socioeconomic status of the population, identifying pockets of urban poverty, and estimating poverty rates) and by creating opportunities for greater financial inclusion among the poor.

TABLE 1: No Poverty

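| Possible Targets | Theme | Application | Data Source | References |
|---|---|---|---|---|
| 1.1; 1.2 | Socioeconomic status and wellbeing | Human mobility and socioeconomic levels | Mobile Phone Data | Frias-Martinez et al. (2012) |
| 1.1; 1.2 | Socioeconomic status and wellbeing | Estimating poverty and wealth | Mobile Phone Data | Blumenstock et al. (2015) |
| 1.1; 1.2 | Socioeconomic status and wellbeing | Socioeconomic status | Mobile Phone Data | Gutierrez et al. (2013) |
| 1.1; 1.2 | Poverty Mapping | Identifying the poor | Satellite Data | Elvidge et al. (2009); Jean et al. (2016) |
| 1.1; 1.2 | Poverty Mapping | Urban poverty | Satellite Data | Kohli et al. (2012) |
| 1.4 | Financial Inclusion | Creditworthiness of the unbanked | Mobile Phone Data | Kumar and Muhota (2012) |
| 1.5 | Disaster response | Human mobility after disasters | Mobile Phone Data | Lu et al. (2016); Wilson et al. (2016); Lu et al. (2012) |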

Key Big Data Sources


Mobile Phone Data, Satellite Data, Postal Data

Socioeconomic Status and Economic Well-being

Mobile Phone Data

Numerous studies (Frias-Martinez et al. 2012; Blumenstock et al. 2015) have sought to map variables derived from mobile phone data to make observations on the socioeconomic levels of populations. For instance, research conducted by Frias-Martinez et al. (2012) indicated that populations with higher socioeconomic levels had larger ranges of mobility than populations with lower socioeconomic levels. The mathematical model developed by Frias-Martinez et al. (2012) mapped mobility variables derived from Call Detail Records (CDRs) to socioeconomic levels. Among others, the results showed a higher correlation between socioeconomic levels and increases in the radius of gyration.⁴

Instead of identifying wealth and poverty at an aggregate level, Blumenstock et al. (2015) sought to understand how well mobile phone data could identify variables on an individual level by leveraging data from that individual's digital footprint. A composite wealth index was developed from the results of a phone survey with randomly selected subscribers (gathering information on housing characteristics, asset ownership and other indicators), which were combined with their mobile phone transaction data (after obtaining consent). The merged data of the respondents was used to show that it was possible to predict the wealth of a mobile phone subscriber from the individual's historical phone interaction data. This was then leveraged to make out-of-sample predictions for the rest of the subscribers (i.e. those who had not participated in the survey). While the model's out-of-sample predictions at a district level have a strong correlation with data from the demographic and health survey for households owning a mobile phone (r=0.917) as well as for all the households in the survey (r=0.916), a key caveat is the challenge in verifying the accuracy of data in micro regions given the lack of appropriate comparative data.

Moreover, CDR data combined with remote sensing data can also be leveraged to support the modeling of traditional measures of poverty at a more granular level and at more frequent intervals, and provide new insights on the distribution of poverty. For example, Steele et al. (2017) sought to model three traditional indicators of poverty using the two big data sources. CDR data included user mobility metrics, reload patterns as well as patterns of phone usage. Remote sensing data included metrics such as access to roads, weather, nighttime luminosity and other possible welfare-linked metrics. Results showed that models developed using both data sources had stronger predictive power, with the most successful model being the reconstruction of the Demographic and Health Survey Wealth Index to predict poverty (r²=0.76).

In addition to mobility data from CDRs, other research leverages mobile credit data to infer wealth patterns of users. For instance, Gutierrez et al. (2013) leveraged airtime credit purchase records and communication data of subscribers to understand socioeconomic levels based on the hypothesis that mobile users who make larger purchases of airtime credit would be more affluent compared to those who made multiple smaller purchases. However, the researchers emphasized that there was a lack of data to develop a predictive model that leveraged both variables, size and frequency.
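The district-level validation reported above for Blumenstock et al. (2015) can be thought of as two steps: average the individual predictions within each district, then compute Pearson's r against the survey values for the same districts. The following is only a minimal sketch of that idea, not the authors' actual pipeline; the function names, the record layout of (district, value) pairs and the sample figures are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def district_means(records):
    """Average values per district. `records` is an iterable of
    (district, value) pairs (a hypothetical layout for illustration)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for district, value in records:
        totals[district] += value
        counts[district] += 1
    return {d: totals[d] / counts[d] for d in totals}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def district_correlation(predicted, surveyed):
    """Correlate district means of model predictions with district means
    of survey values, using only districts present in both inputs."""
    pred = district_means(predicted)
    surv = district_means(surveyed)
    shared = sorted(set(pred) & set(surv))
    return pearson_r([pred[d] for d in shared], [surv[d] for d in shared])


# Hypothetical data: per-subscriber predicted wealth and per-household
# survey wealth, each tagged with a district label.
predicted = [("A", 1.0), ("A", 3.0), ("B", 5.0), ("C", 9.0)]
surveyed = [("A", 4.0), ("B", 10.0), ("C", 18.0)]
r = district_correlation(predicted, surveyed)
```

Aggregating before correlating is what makes the comparison possible at all, since the survey and the phone data rarely describe the same individuals; it is also why accuracy in very small regions is hard to verify, as the report notes.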

4 The radius of gyration is a measure of how far an object travels from its center of gravity. In the case of humans, the radius of gyration roughly measures the typical range of mobility of a user in space.
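As an illustration of the footnote above, the radius of gyration of a subscriber can be sketched as the root-mean-square distance between the locations at which the subscriber was observed and the centre of those locations. The code below is a simplified sketch under stated assumptions: each observation is taken to be the (latitude, longitude) of a cell tower appearing in a CDR, the centre is approximated by the plain mean of latitudes and longitudes (reasonable only over small areas), and the function names are hypothetical rather than taken from any cited study.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def radius_of_gyration_km(points):
    """Root-mean-square distance (km) of observed locations from their centre.

    `points` is a list of (lat, lon) tuples, one per observation, so a
    frequently visited tower appears several times and weighs more.
    """
    if not points:
        raise ValueError("at least one observation is required")
    n = len(points)
    # Assumption: the arithmetic mean of coordinates stands in for the
    # centre of mass, which is adequate for city- or district-scale data.
    centre_lat = sum(lat for lat, _ in points) / n
    centre_lon = sum(lon for _, lon in points) / n
    mean_sq = sum(
        haversine_km(lat, lon, centre_lat, centre_lon) ** 2
        for lat, lon in points
    ) / n
    return sqrt(mean_sq)


# Hypothetical subscriber seen at two towers one degree of longitude apart.
rg = radius_of_gyration_km([(0.0, -0.5), (0.0, 0.5)])
```

A subscriber who only ever appears at one tower has a radius of gyration of zero, while one who alternates between distant towers has a large value, which is the mobility range that Frias-Martinez et al. (2012) relate to socioeconomic level.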


Satellite Data

It must be noted that the use of nighttime light data proxies pre-dates the current big data phenomenon. Studies that leverage satellite image analysis for poverty estimation cut across key features that include nighttime and daytime imagery, regional and national level data, the time period of observation, as well as the resolution of images, among others. In addition to estimating GDP (Elvidge et al. 1997; Sutton et al. 2007; Ebener et al. 2005), studies have also sought to estimate poverty in general (Elvidge et al. 2009) or urban poverty (Kohli et al. 2012).

For instance, Elvidge et al. (2009) developed a poverty map by utilizing a poverty index calculated using population count and the brightness of nighttime lights as observed through satellite images. The model took into account outlines of human settlement, topography as well as land cover. The results were calibrated with national level poverty data. The estimate derived was a little lower than estimates from the World Development Indicators. The study suggested that stronger reference data for calibration purposes and improvements in satellite observations would enhance the results of the poverty map.

Kohli et al. (2012) developed a generic framework that leverages information across three levels of the constructed environment to support the satellite image classification of slums: the object level (road/building characteristics), the settlement level (planned vs. unplanned) and the environ level (disaster-prone regions).

However, a key caveat of using nighttime satellite imagery to estimate poverty is the challenge of differentiating regions that are already at the lower end of the income distribution, since satellite imagery in these areas would be dark (Jean et al. 2016). Jean et al. (2016) used both daytime and nighttime satellite imagery as well as survey data and machine learning to identify varying economic well-being levels in Malawi, Nigeria, Rwanda, Tanzania and Uganda. The computer model was trained to identify features of daytime satellite images that were predictive of poverty. Imagery from countries that already had survey data was used to validate the findings of the model.

Financial Inclusion

Mobile Phone Data

In 2012, the Consultative Group to Assist the Poor (CGAP) suggested that analyzing mobile phone consumption data (Kumar and Muhota, 2012) could identify the creditworthiness of the unbanked. For example, it was suggested that purchasing airtime credit in a consistent and frequent manner showed income predictability as well as an ability to plan ahead, both positive indicators of loan repayment ability, as opposed to those with inactive prepaid accounts or those that would run out of credit regularly.

Similarly, with a presence in over 20 countries, Tiaxa serves the unbanked in emerging markets, offering 10 million small loans daily in the form of cash (through Mobile Money) and as airtime credit (prepaid mobile phone subscriptions). Tiaxa's Balance Advance feature, which relies on numerous processes including behavior analytics of mobile phone users to determine amounts to advance to each user, is currently being utilized by 8 operators in Latin America.

Other

Big data startup DemystData leverages numerous sources of big data (including telecommunications information, social, ID, fraud, websites, text, news, and logs) to assess the creditworthiness of customers. Its software integrates telecommunication data and social and corporate data to develop risk profiles for small businesses or individuals, enabling banks to get a better sense of lending risks.

Disaster Response

Mobile Phone Data

There is a growing body of research that leverages mobile phone data to understand disaster response, including human mobility patterns after disasters such as cyclones (Lu et al. 2016) and earthquakes (Wilson et al. 2016; Lu et al. 2012). This is described in greater detail under Goal 11.

Insights from Big Data vs. Traditional Indicators

Target 1.1 By 2030, eradicate extreme poverty for all people everywhere, currently measured as people living on less than $1.25 a day.

There is one indicator to measure this target:

− 1.1.1 Proportion of population below the international poverty line, by sex, age, employment status and geographical location (urban/rural).

Target 1.2 By 2030, reduce at least by half the proportion of men, women and children of all ages living in poverty in all its dimensions according to national definitions.

There are two indicators to measure the achievement of this target:

− 1.2.1 Proportion of population living below the national poverty line, by sex and age

− 1.2.2 Proportion of men, women and children of all ages living in poverty in all its dimensions according to national definitions

As discussed above, big data sources such as mobile data and satellite data can be used to infer the socioeconomic status of the population, enabling authorities to identify pockets of poverty at more granular levels and at more frequent intervals than is possible using traditional methods. Thus, insights derived from big data sources can feed into the achievement of targets 1.1 and 1.2.

Target 1.4 By 2030, ensure that all men and women, in particular the poor and the vulnerable, have equal rights to economic resources, as well as access to basic services, ownership and control over land and other forms of property, inheritance, natural resources, appropriate new technology and financial services, including microfinance.

Two indicators measure this target:

− 1.4.1 Proportion of population living in households with access to basic services

− 1.4.2 Proportion of total adult population with secure tenure rights to land, with legally recognized documentation and who perceive their rights to land as secure, by sex and by type of tenure

While the traditional indicators focus on access to basic services and tenure rights to land, big data sources such as mobile phone data have the potential to facilitate financial inclusion by helping to assess the creditworthiness of the unbanked.

Target 1.5 By 2030, build the resilience of the poor and those in vulnerable situations and reduce their exposure and vulnerability to climate-related extreme events and other economic, social and environmental shocks and disasters.

Three indicators are set to measure this target:

− 1.5.1 Number of deaths, missing persons and persons affected by disaster per 100,000 people

− 1.5.2 Direct disaster economic loss in relation to global gross domestic product (GDP)

− 1.5.3 Number of countries with national and local disaster risk reduction strategies


Of these indicators, mobile phone data can contribute towards indicator 1.5.1, which is a multi-purpose indicator with linkages to other SDG targets including 11.5, 13.2, 1.3, 14.2, 15.3, 3.9, and 3.6, among others.⁵ While mobile phone data may not be able to directly measure this indicator, it can contribute towards its measurement by helping identify the movement of people after a disaster. Moreover, satellite data, analyzed quickly, can contribute to the measurement of 1.5.2.

Goal 2: Zero Hunger

With the UN projecting world population to rise to 8.5 billion by 2030 and to 9.7 billion by 2050,⁶ the second SDG aims to “end hunger, achieve food security and improved nutrition and promote sustainable agriculture.” However, agricultural regions are susceptible to extreme weather conditions such as droughts, which severely affect their ability to contribute towards food supply. This makes it crucial for decision makers to be aware of the location and potential of drought conditions to provide targeted assistance. Analysis, both spatial and temporal, of satellite data could help determine both the severity and the extent of the drought in near real-time, as well as support the estimation of crop production.

TABLE 02: Zero Hunger

| Possible Targets | Theme | Application | Data Source | References |
|---|---|---|---|---|
| 2.1 | Expenditure on Food | Proxy indicator for food expenditure | Mobile Phone Data | Decuyper et al. (2014) |
| 2.1 | Drought monitoring | Severity and extent of drought conditions | Satellite Data | Berhan et al. (2011); Tucker & Choudhury (1987); Henricksen & Durkin (1986) |
| 2.4 | Early crop yield assessment | Developing vegetation health indices | Satellite Data | Kogan et al. (2011) |
| 2.c | Price Indexes | Constructing consumer price index | Online Prices | Cavallo & Rigobon (2016) |

BigDataSourcesSatelliteData,MobilePhoneData,WebScrapedDataProxyIndicatorforFoodExpenditureMobilePhoneDataRecentstudieshavesoughttounderstandhowsuitedmobilephonedataderivedindicatorsweretoact a proxy for indicators of food security (Decuyper et al. 2014). Researchers compared mobile

5https://unstats.un.org/sdgs/files/metadata-compilation/Metadata-Goal-13.pdf6UNprojectsworldpopulationtoreach8.5billionby2030,drivenbygrowthindevelopingcountries.(n.d.).Retrievedfromhttp://www.un.org/sustainabledevelopment/blog/2015/07/un-projects-world-population-to-reach-8-5-billion-by-2030-driven-by-growth-in-developing-countries/

Page 14: ANNEX I- Mapping Big Data Solutions for the Sustainable …lirneasia.net/wp-content/uploads/2013/09/Mapping-Big... · 2018-05-21 · various big data actors, and providing a snapshot

DRAFT

14

www.lirneasia.net

phone (activity and purchase of airtime credit) data from an East African country with a nationwide household survey that was conducted by the World Food Programme. The researchers found a high correlation (r > 0.7) between purchases of airtime credit and the survey results for several food items. Given that this high correlation was observed for food items that were mainly market-dependent and not for items grown at home, airtime purchases can serve as a proxy indicator of the food expenditure of “market-dependent” households.

Social Media Data

Similarly, the results of a study conducted by UN Global Pulse (2014) indicated a relationship between past statistics on food inflation and the volume of tweets regarding increases in food prices. Researchers analysed tweets relating to food prices in Indonesia from March 2011 to April 2013 by developing taxonomies relating to the issue and relying on a classification algorithm to categorize the results. A time series analysis was then run to identify the correlation between official food inflation statistics and food-related tweets.

Drought Monitoring

Satellite Data

There are many studies (for example, Unganai & Kogan, 1998; Peters et al. 2002; Wan, Wang & Lee, 2004; Rhee & Carbone, 2010) on the use of remote sensing technology for detecting drought conditions. For instance, a study conducted by Henricksen & Durkin (1986) leveraging radiometer data (Advanced Very High Resolution Radiometer, AVHRR) derived from meteorological satellites found a strong correlation (r=0.99) between the “rates of change of the normalized difference vegetation index (NDVI) derived from the AVHRR data and threshold values of a soil moisture index at the beginning and ends of growing periods”. Tucker & Choudhury (1987) concluded that “multi-temporal satellite data have application in the detection and quantification of drought through the ability of these data to estimate the photosynthetic capacity of the terrestrial surface and record microwave surface brightness.” Moreover, Wang & Qu (2007) proposed a new Normalized Multi-band Drought Index to monitor vegetation and soil moisture based on satellite sensors.

Crop Management / Yield Estimation

Satellite Data

In addition to monitoring topographical changes, satellite data can also be used to estimate crop yields. For example, Unganai & Kogan (1998) leveraged a previously developed vegetation condition index based on AVHRR data and a temperature condition index to assess drought conditions in Southern Africa, with results validated through on-the-ground data. The results suggested that the two indices could be used to develop scenarios of corn yield 6-13 weeks ahead of harvesting.

Similarly, studies such as those conducted by Kogan et al. (2011) used vegetation health indices based on AVHRR sensors to provide cumulative estimates on a weekly basis for a range of indicators including thermal and moisture conditions of the canopy for the relevant season. Calculated for the 1981–2010 period, the indices were compared with yields in 24 countries and demonstrated a strong correlation between the vegetation health indices and corn, wheat, sorghum and soybean yields during the period, indicating their potential to act as a proxy for early crop yield assessment.
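As a rough illustration of the indices discussed above, the sketch below computes the NDVI from near-infrared and red reflectance, and a vegetation condition index that rescales the current NDVI against its historical range. The formulas are the commonly cited ones and are not taken from the studies reviewed here; the function names and the sample reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red) for a single pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat pixels with no signal as neutral
    return (nir - red) / total


def vegetation_condition_index(ndvi_now, ndvi_history):
    """Rescale the current NDVI to a 0-100 range using the historical min and max."""
    lowest, highest = min(ndvi_history), max(ndvi_history)
    if highest == lowest:
        return 50.0  # assumption: no historical variation, report mid-scale
    return 100.0 * (ndvi_now - lowest) / (highest - lowest)


# Hypothetical reflectances for one pixel, same week in three earlier years.
history = [ndvi(0.40, 0.10), ndvi(0.30, 0.15), ndvi(0.45, 0.08)]
current = ndvi(0.32, 0.14)
vci = vegetation_condition_index(current, history)
print(f"NDVI now: {current:.3f}, VCI: {vci:.1f}")
```

A low index value relative to the historical range would flag the pixel for closer inspection; any operational threshold would need to be calibrated against ground data.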


Moreover, the Group on Earth Observations Global Agricultural Monitoring (GEOGLAM)7 uses a combination of ground data and satellite images to provide agricultural production forecasts at global, regional as well as national levels. The Group on Earth Observations (GEO) developed the GEOGLAM initiative, which was endorsed by the G20 in June 2011. GEOGLAM seeks to "coordinate satellite monitoring observation systems in different regions of the world in order to enhance crop production projections and weather forecasting data," and to that end developed a crop monitor to provide the Agricultural Market Information System with an assessment of global crop production conditions for four key crops (wheat, maize, rice, and soy) for the G20 countries plus Spain and 7 other countries. The Earth Observation data is used to assess evapotranspiration, rainfall, soil moisture, temperature and the Normalized Difference Vegetation Index.

Private Sector

US-based Climate Corporation8 offers farmers a mobile software-as-a-service solution that provides weather simulation information and estimates of crop yields based on daily weather data over the past months. Weather measurement information obtained from around 2.5 million locations daily is processed with 150 billion soil observations to provide farmers with 24-hour and seven-day forecasts of rain, wind and temperature for specific areas (200 acres), enabling farmers to make more informed decisions regarding planting and harvesting crops.

Insights from Big Data vs. Traditional Indicators

Target 2.1 By 2030, end hunger and ensure access by all people, in particular the poor and people in vulnerable situations, including infants, to safe, nutritious and sufficient food all year round

Big data sources may not be able to directly measure the two indicators proposed to track progress towards this target:

− 2.1.1 Prevalence of undernourishment
− 2.1.2 Prevalence of moderate or severe food insecurity in the population, based on the Food Insecurity Experience Scale (FIES)

However, the analysis of big data sources can provide insights on food security that could be vital when serving populations in vulnerable situations. For instance, crop yield estimation enables governments to better prepare for any projected shortages.

Goal 3: Good Health and Wellbeing

The third goal of the SDGs seeks to “ensure healthy lives and promote well-being for all at all ages.” In particular there is a focus on ending epidemics such as malaria, neglected tropical diseases and communicable diseases.

Mobile phone data, particularly CDR, has been used in many cases to study the spread of mosquito-borne diseases such as dengue and malaria. Analysis of mobility data for mosquito-borne disease propagation relies on the premise that the disease could be spread in one of three ways: (1) infected parasites moving to a region, (2) infected humans visiting a region and (3) residents visiting a region with the disease, becoming infected and returning to their home region. Given that mosquitoes fly within a small radius, the principal means of epidemic spread is through human carriage. The underlying theory is that social interaction, and thereby mobility, facilitates the spread of diseases. By combining movement patterns with data on reported cases (e.g. malaria), areas of

7 See: https://www.earthobservations.org/geoglam.php
8 The company is a subsidiary of Monsanto Company. See: https://www.forbes.com/sites/bruceupbin/2013/10/02/monsanto-buys-climate-corp-for-930-million/#55db1d1d177a


transmission risk can be prioritized. Such risk maps can be used to aid targeted interventions and to curtail the spread of such diseases.

Numerous studies (Tatem et al. 2009; Bengtsson et al. 2011; Wesolowski et al. 2014; Ruktanonchai et al. 2016) have leveraged mobility variables derived from mobile phone data to understand the spread of diseases such as malaria (Tatem et al. 2009; Tatem et al. 2014; Ruktanonchai et al. 2016), Ebola (Wesolowski et al. 2014), rubella (Wesolowski et al. 2015), dengue (Wesolowski et al. 2015), and cholera (Bengtsson et al. 2011), among others.

TABLE 03: Good Health and Well-being

Potential Target | Theme | Application | Data Source | References
3.3 | Disease Propagation | Mobility from regions of disease outbreak | Mobile Phone Data | Wesolowski et al. (2015); Bengtsson et al. (2015); Wesolowski et al. (2014)
3.3 | Disease Propagation | Sources and sinks for diseases | Mobile Phone Data | Ruktanonchai et al. (2016); Tatem et al. (2014)
3.3 | Disease Propagation | Disease importation rate | Mobile Phone Data | Tatem et al. (2009)
3.3 | Disease Propagation | Seasonal trends of diseases | Search Engine Data | Schuster et al. (2010); Yang et al. (2010); Xu et al. (2010)

Big Data Sources

Social Media Data, Search Engine Query Data, Mobile Phone Data, Sensor Data

Disease Propagation

Mobile Phone Data

Tatem et al. (2009) leveraged anonymized CDR data from mobile operator Zantel to gauge the malaria importation rate in Zanzibar. The researchers studied the mobility patterns of the population in Zanzibar travelling to mainland Tanzania, extracting information such as length of stay and locations visited by analyzing CDRs over a three-month period. This was combined with malaria risk data (based on where and how long travellers stayed) to develop a mathematical model to estimate a malaria importation rate.

Tatem et al. (2014) demonstrated the manner in which human mobility connected areas of malaria transmission risk. Researchers leveraged satellite data, CDRs and surveillance data, integrating movement data with case-based risk maps. They used anonymised call detail records from a leading mobile service operator, aggregated at cell tower level, to estimate the movement of two groups of travellers: “returning residents”, those who had visited an area of malaria risk and returned home, and “visitors”, residents of areas that had malaria risk visiting other locations. The researchers were able to identify areas that were exporters of the disease and other areas that served as sinks.

Wesolowski et al. (2014) further studied the use of CDRs in the Ebola outbreak context. They developed models to estimate travel between locations based on the distance between locations and the size of the population at the locations, using CDRs to understand national mobility patterns in Kenya, Cote d’Ivoire and Senegal (in the absence of mobile data from the countries that were affected by Ebola at the time). The models would help identify the relative amount of traffic between the different populations as well as the more popular routes of travel. Given that mobility of


people is a key driver of the disease, understanding the movement of the population within the country would enable the delivery of targeted control policies by identifying the possible routes taken by infected individuals.

Similarly, Wesolowski et al. (2015) inferred mobility patterns of the population from mobile phone data from almost 15 million subscribers in Kenya, quantifying the patterns of seasonal travel, and combined this with rubella incidence data to predict the transmission of the disease. Moreover, the combination of spatial and seasonal travel data enables the characterization of seasonal risk fluctuations.

Wesolowski et al. (2015) developed an epidemiological model for transmission of dengue in travelers by leveraging climate information and mobile data from around 40 million subscribers to help predict the propagation of dengue in Pakistan. They analysed historical (2013) dengue case data that was spatially explicit and compared the epidemic dynamics with the epidemiological model. The results showed that mobility estimates based on mobile data could be used to predict the timing and the geographic spread of diseases, not just in recently epidemic locations, but also in emerging areas. Moreover, the researchers were able to generate dynamic risk maps that could be leveraged to contain the disease.

Bengtsson et al. (2011) leveraged mobile phone data from Digicel Haiti to assess the movement of mobile phones and thereby estimate the movement of population from regions with a cholera outbreak. This information was shared with relief agencies that were mobilizing in response to the outbreak. The results from the analysis were later compared with actual cases at a district level, showing that mobility patterns derived from mobile phone data were able to predict the spread of the epidemic better than standard population mobility models. Moreover, it was also revealed that there was a correlation between mobile phone data and the magnitude of an outbreak.

Similarly, Ruktanonchai et al. (2016) developed a mathematical model that leveraged call detail records of mobile phone users in Namibia and malaria parasite rate maps to estimate regions of “self-sustaining malaria transmission”. The researchers modified a model developed by Cosner et al. (2009) in a manner that would enable them to parameterize human mobility based on mobile data. The researchers were able to identify areas that acted as sources and others that acted as malaria sinks, enabling authorities to conduct targeted elimination programs.

Search Engine Data

There is also evidence (Schuster et al. 2010; Yang et al. 2010; Xu et al. 2010) to show that search engine queries on symptoms and pharmaceuticals are related to seasonal trends of a vast spectrum of health-related issues, from specific diseases to conditions such as sleep disorders. The studies referenced in this report all used Google Trends as their primary data source, leveraging other data sources such as climate-related data (Yang et al. 2010) and revenue of pharmaceuticals (Schuster et al. 2010) to determine correlations. The studies show that the analysis of search engine data along with other relevant data sources can be used to map the spatial and geographical spread of, for example, communicable diseases, which in turn can be used for better risk management.

Hotspots for Road Traffic Accidents

Target 3.6 focuses on the reduction of deaths and injuries from road traffic accidents. Historical accident data coupled with other variables such as social events and weather can be used to identify potential areas of accidents, enabling authorities to take preventative action. For instance, Tennessee Highway Patrol (THP) leveraged IBM analytics to reduce road traffic accidents


by developing a model to predict future accidents based on historical data: weather data, geo-tagged crash data, geo-tagged data on driving under the influence (DUI), as well as data around special events such as parades and games. The data was used to identify correlations between accidents and DUI arrests, and external influences such as events, weather, location, time of day/year and day of week. Based on this analysis, the model could extrapolate to predict hotspots for future incidents.

Insights from Big Data vs. Traditional Indicators

Target 3.3 By 2030, end the epidemics of AIDS, tuberculosis, malaria and neglected tropical diseases and combat hepatitis, water-borne diseases and other communicable diseases

There are five indicators under this target that measure the incidence of various epidemics, offering a static picture.

− 3.3.1 Number of new HIV infections per 1,000 uninfected population, by sex, age and key populations
− 3.3.2 Tuberculosis incidence per 1,000 population
− 3.3.3 Malaria incidence per 1,000 population
− 3.3.4 Hepatitis B incidence per 100,000 population
− 3.3.5 Number of people requiring interventions against neglected tropical diseases
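The indicators above are static counts. By contrast, the distance- and population-based travel models described earlier for the Ebola context (Wesolowski et al. 2014) are often written as gravity models. The sketch below is a minimal illustration under that assumption and not the authors' implementation; the exponents, the scaling constant and the place data are hypothetical, and in practice the parameters would be fitted to observed trips such as those derived from CDRs.

```python
import math
from itertools import permutations


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def gravity_flow(pop_origin, pop_dest, distance_km, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Modelled travel volume: grows with both populations, decays with distance.

    k, alpha, beta and gamma are placeholder values, not fitted parameters.
    """
    return k * (pop_origin ** alpha) * (pop_dest ** beta) / (distance_km ** gamma)


def relative_flows(places):
    """places maps name -> (population, lat, lon); returns each ordered pair's share of total flow."""
    raw = {}
    for (a, (pop_a, lat_a, lon_a)), (b, (pop_b, lat_b, lon_b)) in permutations(places.items(), 2):
        raw[(a, b)] = gravity_flow(pop_a, pop_b, haversine_km(lat_a, lon_a, lat_b, lon_b))
    total = sum(raw.values())
    return {pair: flow / total for pair, flow in raw.items()}


# Hypothetical settlements: (population, latitude, longitude).
places = {"A": (500_000, 0.0, 0.0), "B": (200_000, 0.0, 1.0), "C": (50_000, 1.0, 1.0)}
shares = relative_flows(places)
for pair, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(pair, round(share, 3))
```

Ranking the pairs by their share gives the kind of "relative amount of traffic" picture that can then be overlaid on case data to prioritise routes for control measures.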

Meanwhile, the analysis of big data such as mobile phone data can contribute to the achievement of target 3.3 by helping to understand the spread of mosquito-borne diseases such as malaria, enabling the provision of targeted care and preventive measures.

Target 3.6 By 2020, halve the number of global deaths and injuries from road traffic accidents

While big data may not be useful to compute the indicator for this target (3.6.1 Death rate due to road traffic injuries), it can help identify hotspots for traffic accidents, enabling authorities to take preventive action, thus contributing to the achievement of target 3.6.

Goal 4: Quality Education

The achievement of Goal 4 is the key to achieving many of the other SDGs.9 While there has been progress towards improving literacy skills and increasing school enrolment rates, universal education goals are yet to be met. For instance, over 100 million youth do not possess basic literacy skills10.

TABLE 04: Quality Education

Possible Targets | Theme | Application | Data Source | References
4.6 | Illiteracy Prediction | Areas of low literacy | Mobile Phone Data | Sundsøy, P. (2016)

Big Data Sources Used

Mobile Phone Data

Literacy Prediction

9 See: QUALITY EDUCATION: WHY IT MATTERS. (n.d.). Retrieved from http://www.un.org/sustainabledevelopment/wpcontent/uploads/2017/02/ENGLISH_Why_it_Matters_Goal_4_QualityEducation.pdf
10 Education - United Nations Sustainable Development. (n.d.). Retrieved from http://www.un.org/sustainabledevelopment/education/


Mobile Phone Data

Sundsøy, P. (2016)11 developed a machine learning algorithm to predict illiteracy by analyzing mobile phone records in a developing country in Asia, deriving indicators from mobile phone data that could reflect the socioeconomic and mobility patterns of users. A survey of 76,000 mobile phone users was conducted to validate the study and gather information on literacy. The model showed 70% accuracy, with the study also showing that individual illiteracy can be aggregated at a cell tower level. The results point to predictors of illiteracy such as social networks (there is less diversity among social contacts for those who were illiterate) and location (areas of poor development could be identified).

Trends in Education

Massive Open Online Courses

The advent of Massive Open Online Courses (MOOCs) offers opportunities to glean insights on the vast network of users registered on their sites, including information on demographics, geography, subjects of engagement and the effectiveness of various modes of learning, that have the potential to inform policy relating to education. HarvardX and MITx released findings (Ho et al. 2014; Ho et al. 2015) from their open courses launched through the edX platform.

Insights from Big Data vs. Traditional Indicators

Target 4.6 By 2030, ensure that all youth and a substantial proportion of adults, both men and women, achieve literacy and numeracy

This target is proposed to be measured by one indicator:

− 4.6.1 Percentage of population in a given age group achieving at least a fixed level of proficiency in functional (a) literacy and (b) numeracy skills, by sex.

The targets for this goal largely revolve around access to education, formal and non-formal, spanning pre-primary to tertiary education. While there have been efforts to predict illiteracy using mobile phone data, there appears to be limited research on the use of big data to measure these targets.

Goal 5: Achieve Gender Equality and Empower all Women and Girls

Gender inequality has significant ramifications for all areas of society, from reducing poverty to improving health and education, among others. In order to ensure that “no one is left behind”, gender-disaggregated data are required so that all population groups are addressed.

TABLE 05: Gender Equality

Possible Targets | Theme | Application | Data Source | References
5.1 | Gender Prediction | Gender prediction | Mobile Phone Data | Sundsøy et al. (2015); Blumenstock & Eagle (2010)


Big Data Sources Used

Mobile Phone Data, Social Media

Gender Prediction

Mobile Phone Data

The development of models that could predict the gender of anonymous mobile phone users presents significant opportunities for decision-makers to leverage existing models that assess socioeconomic status and geography of mobile phone users to conduct targeted gender-specific development initiatives and develop informed, evidence-based policies.

Sundsøy et al. (2015)12 utilized mobile phone data and machine learning to derive mobile phone users’ gender based on their phone usage, relying on indicators that included average delay in response and the radius of gyration. This information, combined with the ability to identify the socio-economic status of phone users, would enable decision-makers to identify key characteristics of women in impoverished areas. Moreover, in their analysis of mobile phone usage and access patterns in Rwanda, Blumenstock & Eagle (2010) noted statistically significant differences by gender; for example, in terms of ownership, women were more likely to use shared phones.

Pilot Projects: Mobile Phone Data; Twitter Data

Data2x has been supporting numerous pilot projects to answer gender questions leveraging big data sources such as social media, call detail records and satellite data. For instance, it is supporting the Flowminder Foundation to improve the spatial resolution of information that already exists from standard surveys (for example, indicators on freedom of movement, access to contraceptives, malnutrition etc.) by leveraging satellite data.

Similarly, its partner UN Global Pulse sought to sex-disaggregate tweets on development topics for the period May 2012 to July 2015. The 350 million development-related tweets were obtained over the course of this period by filtering for 25,000 keywords (in Spanish, Portuguese, French, and English) related to 16 development areas, relying on an algorithm that used a combination of name, URL and Twitter user ID to identify the gender of the user.

Insights from Big Data vs. Traditional Indicators

There appears to be limited research on the generation of gender-disaggregated insights from big data sources. This is possibly explained by the pseudonymization of data that is prompted by concerns over privacy. Usually the ability to identify gender is also removed when personally identifiable information (PII) is masked.

Goal 6: Clean Water and Sanitation

While access to water and sanitation is a human right, according to the UN, over 40 percent of the global population is affected by water scarcity and over 600 million do not have access to improved sources of drinking water.13 In addition to the provision of universal access to drinking water, it is

13 Water and Sanitation - United Nations Sustainable Development. (n.d.). Retrieved from http://www.un.org/sustainabledevelopment/water-and-sanitation


imperative for countries to be proactive in conserving their water-related ecosystems and improving water-use efficiency.

Numerous studies (Haas et al. 2009; Lu et al. 2011; Pekel et al. 2014; Rokni et al. 2014; Mueller et al. 2016) have been conducted that revolve around mapping surface water using remote sensing data. Insights derived from such studies could feed into understanding the changes in the global water-related ecosystem, assessing the level of water stress and helping drive evidence-based policy making to protect and restore ecosystems.

Moreover, in relation to improving water-use efficiency, the advent of smart water meters creates opportunities to regulate the supply of water by leveraging predictive analytics, generating water consumption patterns, detecting suspicious behaviors such as leaks and identifying peak demand.

TABLE 06: Clean Water and Sanitation

Possible Targets | Theme | Application | Data Source | References
6.6 | Changes in water-related ecosystem | Change in surface water | Satellite Data | Mueller et al. (2016); Haas et al. (2009); Rokni et al. (2014); Pekel et al. (2014)

Big Data Sources

Smart Meter Data, Satellite Data

Mapping changes in water-related ecosystem

Satellite Data

There have been numerous studies that have sought to map changes in surface water using various techniques. Rokni et al. (2014) relied on satellite data and applied the Normalized Difference Water Index (NDWI) to model Lake Urmia’s (Iran) spatiotemporal changes over the period between 2000 and 2013. The results of the study showed a trend of decreasing surface water over the period analysed. Similarly, Pekel et al. (2014) conducted an experiment on the African continent and proposed an approach to provide continental dynamic information on water surface detection using a “generic multi-temporal and multi-spectral image analysis method”; the product yielded an accuracy of 91.5%.

Mueller et al. (2016) relied on satellite data spanning 27 years to map surface water in Australia. The researchers used a decision tree classifier-based algorithm for water detection as well as a comparison methodology. A model on a pixel-by-pixel basis was utilized, whereby each pixel was classified as water or not-water based on a regression tree that was trained on a sample of water and non-water tiles that represented the various landscapes across Australia. The results made it possible to differentiate areas where water was persistent from areas where water was collected for a short time (such as areas under flood).

Researchers at the European Joint Research Center are utilizing Google Earth Engine [Include reference] to map 30 years of surface water occurrence globally. The study analyses over 2.8 million Landsat images spanning the period from March 1985 to March 2015 using a sensor-neutral methodology. They are relying on a pixel-based approach with spatiotemporal validation based on 20,000 validation pixels, with overall accuracy above 90 percent.
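To make the index-based approach concrete, the sketch below computes the NDWI in its common green/near-infrared form, thresholds it to flag water pixels, and compares the water share of a scene between two dates. It is an illustration under stated assumptions rather than the procedure used in the studies above: the zero threshold, the function names and the tiny reflectance grids are hypothetical.

```python
def ndwi(green, nir):
    """NDWI = (Green - NIR) / (Green + NIR) for a single pixel."""
    total = green + nir
    if total == 0:
        return 0.0  # assumption: treat pixels with no signal as non-water
    return (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Flag each pixel as water (True) when its NDWI exceeds the threshold."""
    return [
        [ndwi(g, n) > threshold for g, n in zip(green_row, nir_row)]
        for green_row, nir_row in zip(green_band, nir_band)
    ]


def water_share(mask):
    """Fraction of pixels in the scene flagged as water."""
    cells = [cell for row in mask for cell in row]
    return sum(cells) / len(cells)


# Hypothetical 2x2 reflectance grids for the same scene at two dates.
green_2000 = [[0.30, 0.28], [0.25, 0.10]]
nir_2000 = [[0.05, 0.06], [0.08, 0.35]]
green_2013 = [[0.30, 0.12], [0.11, 0.10]]
nir_2013 = [[0.05, 0.30], [0.32, 0.35]]

share_2000 = water_share(water_mask(green_2000, nir_2000))
share_2013 = water_share(water_mask(green_2013, nir_2013))
print(f"Water share: {share_2000:.2f} (2000) vs {share_2013:.2f} (2013)")
```

A real analysis would calibrate the threshold per sensor and scene and validate the mask against reference pixels, as the studies above do.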


Similarly, NASA’s Gravity Recovery and Climate Experiment (GRACE) provides the means to identify locations of freshwater availability via two satellites that identify water gain and loss. It has been used to generate maps that show groundwater loss in California for a particular period of time and the scarcity of freshwater in other regions of the United States.

Insights from Big Data vs. Traditional Indicators

Target 6.6 By 2020, protect and restore water-related ecosystems, including mountains, forests, wetlands, rivers, aquifers and lakes

− 6.6.1 Change in the extent of water-related ecosystems over time

The one indicator for this target proposes the use of earth observation data in its computation. Thus, satellite data can be used alongside ground data to estimate change in the extent of freshwater systems14.

Goal 7: Access to Clean and Affordable Energy

With around 20 per cent of the world’s population lacking access to modern energy services, this goal impacts numerous sectors including business, agriculture, information technology and infrastructure, among others15. This is further complicated by the fact that energy remains the key contributor to global climate change.16 This underscores the need for access to clean energy.

Big data can be particularly useful in understanding electrification (and thus also urban/rural differences). For example, there have been numerous studies regarding the use of nighttime satellite data for various measures of social and economic indicators, including electrification rates (Elvidge et al. 1997). Similarly, the use of big data and machine learning can help address the intermittency problems of renewable energy sources such as wind and solar.

The deployment of smart meters offers utilities many data points that can potentially offer insights into consumer demand, particularly as they capture information such as meter status, quality of power and electricity consumption. In addition to providing demand-side data, smart meters are important in effective demand-side management, which allows for more efficient use of generation capabilities, thereby reducing costs of energy and enabling provision to more people.

TABLE 07: Affordable and Clean Energy

Possible Targets | Theme | Application | Data Source | References
7.1 | Access to electricity | Nighttime luminosity | Satellite Data | Elvidge et al. (1997); Elvidge et al. (2009); Townsend & Bruce (2010); Doll & Pachauri (2010); Chen & Nordhaus (2011)

14 https://unstats.un.org/sdgs/files/metadata-compilation/Metadata-Goal-6.pdf
15 http://www.un.org/sustainabledevelopment/wp-content/uploads/2016/08/7_Why-it-Matters_Goal-7_CleanEnergy_2p.pdf
16 http://www.un.org/sustainabledevelopment/energy/


7.1 | Residential electricity consumption | Determinants of electricity consumption | Smart Meter Data | Kavousian et al. (2013)

Big Data Sources

Smart Meter Data, Satellite Data

Access to Electricity

Grid electricity is supplied by electricity distribution companies who maintain records for billing purposes. Therefore, supply-side data is available. However, in some countries there are illegal connections, which may be estimated by other means. Households may have electricity from non-grid sources, which can be estimated from satellite and other data.

Satellite Data

There have been many examples of leveraging satellite-based technology to understand electrification rates, and these have been discussed in detail under Goal 1. For instance, Townsend & Bruce (2010) sought to use satellite imagery data between 1997 and 2002 to estimate the geographic distribution of electricity usage across Australia. The results yielded a strong correlation (R2=0.9346) between a state’s consumption of electricity and nighttime luminosity. The researchers developed a model to address the overglow effect (i.e. the spread of light to neighbouring areas) by leveraging the relationship between the strength of the light source and the distance of dispersion from the source. This allowed electricity consumption to be estimated at a more granular level, that of statistical local areas (greater than 10 km2).

Moreover, Doll & Pachauri (2010) sought to identify areas in developing countries with no access to electricity using nighttime satellite imagery and population datasets. The study compared access to electricity between 1990 and 2000 to derive estimates of the population without access to electricity based on satellite data.

Electricity Consumption

Smart Meter Data

Kavousian et al. (2013) used smart meter data to examine the determinants of residential consumption of electricity. The researchers collected data from over 1600 households in the US and developed models for daily minimum and maximum electricity consumption. The study found that the key determinants of electricity use included floor area, weather and location, with little correlation found between consumption of electricity and the level of income. An important caveat to note is that the participants of this study were employees of a Silicon Valley based technology company, with over half reporting incomes over USD 150,000.

Private Sector

Big data solutions such as the one offered by IBM (Hybrid Renewable Energy Forecasting solution, HyRef), which leverages sensors, analytics technology, cameras etc. to generate weather forecasts in areas of wind farms even 30 days in advance, could assist in better forecasting the energy from the wind farms that can be directed towards the power grids.

Similarly, US-based startup Autogrid offers a solution that analyses energy data, such as data collected from generators and transformers and electricity consumption, to offer power and utility


companies services such as trends in energy usage, predictions and grid device performance management.

Insights from Big Data vs. Traditional Indicators

Target 7.1 By 2030, ensure universal access to affordable, reliable and modern energy services

There are two indicators to measure this target:

− 7.1.1 Proportion of population with access to electricity
− 7.1.2 Proportion of population with primary reliance on clean fuels and technology

Of these, satellite data can be used to measure indicator 7.1.1.

Goal 8: Decent Work and Economic Growth

The availability of stable jobs is a vital component in helping the nearly 2.2 billion people around the world who live below the poverty line (of USD 2)17. Thus SDG 8 is linked with the overall objective of eradicating poverty and re-aligning work, economic and social policies to increase job creation, economic productivity etc. One of the key objectives is to create the necessary conditions for sustainable economic growth and to increase access to decent working conditions for the whole age range of the working population and for equal opportunities for men and women, persons with disabilities, inter alia.

Big data, particularly mobile phone data, can help assess the socioeconomic status of populations (Frias-Martinez et al. 2012; Blumenstock et al. 2015), enabling governments to provide targeted initiatives for the provision of jobs. Similarly, there have been studies that have leveraged search engine query data to glean unemployment trends (Xu et al. 2013). Access to finance is also a key concern for those at the bottom of the pyramid, and mobile phone data can play a role here in helping to assess the creditworthiness of the unbanked (Kumar and Mohta, 2012).

TABLE 08: Decent Work and Economic Growth

Possible Targets | Theme | Application | Data Source | References
8.1 | GDP | GDP and Human Development | Postal Data | Hristova et al. (2016)
8.5 | Unemployment | Unemployment trends | Search Engine Data | Xu et al. (2013)
8.9 | Tourism | Destination of tourists | Mobile Phone Data | Ahas et al. (2008)
8.9 | Tourism | Seasonal tourism | Mobile Phone Data | Ahas et al. (2007)
8.1 | GDP | Economic development | Satellite Data | Elvidge et al. (1997); Sutton and Costanza (2002); Ebener et al. (2005); Chen and Nordhaus (2011)

17 http://www.un.org/sustainabledevelopment/economic-growth/


Big Data Sources

Mobile Phone Data, Web Scraped Data, Search Engine Data, Postal Data

Price Indexes

Web Scraped Data

The Billion Prices Project, an academic initiative of the Massachusetts Institute of Technology, conducts economic research that leverages price data collected daily from online retailers globally. They have published numerous research papers based on their analyses that could contribute towards measuring/monitoring Goal 8. For instance, Cavallo and Rigobon (2016) work with the Billion Prices Project to “construct daily price indexes” using web-scraping software to collect online prices from large multichannel retailers (with both an online and offline presence), focusing on products that were included in a typical basket for the official consumer price index and had consumer expenditure weights. Similarly, Cavallo (2013) calculated the annual inflation rate for Argentina for the period 2007-2011 using online pricing data from key retailers’ websites.

Economic Development

Satellite Data

Nighttime luminosity, for instance, has been used as a proxy for economic development at a country level (Elvidge et al. 1997), with studies attempting to estimate GDP at a sub-national level (Sutton et al. 2007; Ebener et al. 2005). For instance, a study by Elvidge et al. (1997) found a correlation (R2=0.97) between luminosity and GDP across 21 countries. The study also revealed significant outliers in relation to luminosity and population, indicating that the population distribution across geographies should also factor in economic development at the local level. Similarly, Doll et al. (2000) attempted to develop global maps using the relationship between GDP and area lit, analyzing nighttime satellite imagery for 6 months at a “global 1 km composite.” However, Doll et al. (2000) noted that the spatial resolution of the data obtained limited the effectiveness of the technique. Similarly, Sutton and Costanza (2002) sought to capture conventional GDP as well as non-marketed economic value, with conventional GDP being measured based on nighttime luminosity derived from satellite imagery to develop a global map of economic activity at a 1 km2 resolution.

Ebener et al. (2005) adopted a different approach to that used by Doll et al. (2000) to estimate per-capita income distribution at a sub-national level, trying to identify if other components/combinations of light data (such as mean frequency of observation and number of cells with light) and other parameters could be used to develop a model for country-level prediction of per capita income. Sutton et al. (2007) built on the work by Ebener et al. (2005), contrasting two methods for estimating ‘sub-national GDP levels’ of Turkey, India, China and the United States in 2000. The researchers leveraged population data, GDP at a sub-national level (for the four countries) as well as city light data for the period 1992-93 and 2000 to develop regression models to estimate GDP at a state level for the four countries. The first method, a reproduction of the ideas by Ebener et al. (2005), summed up the values of intensity of light (based on nighttime satellite images), whilst the second used a spatial analytic approach on the imagery patterns as per the density of the population.

Moreover, studies have differed in terms of the period of observations, with some studies focusing on shorter time frames (Doll et al. 2000) whilst others such as Chen and Nordhaus (2011) have looked at a longer time frame when conducting their analysis, examining the use of nighttime lights


as a proxy for country-level output at a 1° latitude × 1° longitude grid-cell level, comparing output and luminosity between 1992 and 2008.

Postal Data

Hristova et al. (2016) sought to examine the use of global postal flows (from 187 countries) to estimate socioeconomic indicators that can be used to gauge national wellbeing. The study leveraged aggregated postal data from 2010 to 2014 that was collected by the Universal Postal Union. The records were used as a proxy indicator and were correlated with fourteen socioeconomic indicators. The study revealed close correlations between weighted postal outflows and Gross Domestic Product (r = 0.79) and between weighted postal outflows and the Human Development Index (r = 0.77).

Mobile Phone Data

Kreindler and Miyauchi (2015) sought to quantify the relationship between commuting flows (as derived from anonymized CDRs) and urban economic activity in Sri Lanka. This proxy indicator for economic activity has a key advantage in that it would enable the quantification of the informal economy, which is missing from official statistics. This is a particular problem in developing economies.

Unemployment

Mobile Phone Data

Creating sustainable economic growth requires a focus on job creation. However, job retention is just as vital. There is research (Toole et al. 2015) on using mobile phone data to identify shocks in the work force such as large-scale job losses, identifying individuals affected by such shocks and predicting changes in aggregate unemployment rates. Such insights are usually hard to capture and highlight the potential use of big data to aid in measuring and enriching progress towards the SDGs. Toole et al. (2015) infer changes in the macro-economy by using mobile phone data for two specific cases in Europe. The authors identify mobile phone users affected by a large-scale lay-off by assigning Bayesian probability weights on the changes of mobile phone activity.
The analysis is carried out at multiple levels: individual (looking at behavioral changes associated with job loss) and thereafter at the community and province level (where the relationship inferred at the individual level can be used for prediction of unemployment).

Search Engine Data

Xu et al. (2013) use neural networks and support vector regressions to model the relationship between unemployment rates and search engine query data, with data mining methods used thereafter to forecast unemployment trends.

Tourism

Target 8.9 deals with promoting sustainable tourism, specifically tourism that protects local culture whilst creating jobs.

Mobile Phone Data

Ahas et al. (2008) evaluated the use of passive mobile phone data to understand tourism in Estonia, leveraging data on the call activities of roaming (i.e. foreign) phones in the network cells. The information captured included the location, country of origin as well as time. For popular tourist regions, the study found a strong correlation (R nearly 0.99) with traditional statistics on accommodation, but lower correlations in regions where tourism infrastructure was not as strong


and where there were many transit tourists. Similarly, a previous study by Ahas et al. (2007) used the social positioning method to analyse mobile phone data to understand the distribution of tourists by country of origin at a network cell level. The study revealed that coastal areas were more popular among tourists in the summer, with inland regions preferred in winter.

The European Commission (Eurostat) conducted a study that, among other things, sought to assess the feasibility of leveraging mobile phone data to generate statistics relating to tourism as well as highlighting the various challenges relating to the use of the data source. In addition, the study highlights the need for a “central framework” that would facilitate the process of legally accessing data for national statistics offices as well as other stakeholders. Moreover, the need for an approved methodology for the generation of tourism statistics using phone data was also underscored. The study concluded that mobile positioning data could supplement official statistics at present.

NSOs have begun exploring the use of big data for tourism statistics, as per the UN big data project inventory. For example, countries such as Belgium and the Czech Republic are exploring the feasibility of leveraging mobile phone data for the generation of tourism statistics. Similarly, Hungary seeks to leverage road sensor data to estimate traffic at its borders to improve the accuracy of its tourism statistics. Moreover, during 2012-2013 Statistics Netherlands carried out research on the use of big data for tourism, which included using web scraped data to identify tourist accommodation and characteristics, and anonymised call detail records to understand inbound tourism.

Insights from Big Data vs. Traditional Indicators

Target 8.1 Sustain per capita economic growth in accordance with national circumstances and, in particular, at least 7 per cent gross domestic product growth per annum in the least developed countries

There is one traditional indicator to measure this:

− 8.1.1 Annual growth rate of real GDP per capita

Satellite data and postal data have both been used to develop proxy indicators for economic development and can complement traditional indicators.

Target 8.5 By 2030, achieve full and productive employment and decent work for all women and men, including for young people and persons with disabilities, and equal pay for work of equal value

There are two indicators to measure this target:

− 8.5.1 Average hourly earnings of female and male employees, by occupation, age and persons with disabilities

− 8.5.2 Unemployment rate, by sex, age and persons with disabilities

By leveraging search engine query data, there is an opportunity to identify unemployment trends, with mobile phone data helping identify shocks in the workforce. These are insights that would be hard to capture using the indicators above.

Target 8.9 By 2030, devise and implement policies to promote sustainable tourism that creates jobs and promotes local culture and products

There are two indicators to measure this target:

− 8.9.1 Tourism direct GDP as a proportion of total GDP and in growth rate

− 8.9.2 Number of jobs in tourism industries as a proportion of total jobs and growth rate of jobs, by sex

While big data sources may not be able to directly measure the two indicators above, they are able to provide valuable insights on tourism as a whole that could influence the policies developed to promote sustainable tourism.


Goal 9: Industry, Innovation and Infrastructure

Many developing countries still grapple with a lack of such basic infrastructure as roads, electrical power and access to water and sanitation. SDG 9 seeks to ensure that there is equitable access to sustainable infrastructure that is not just resilient but is also of high quality and reliability.[18] Satellite-based technology coupled with machine learning can help identify infrastructure such as roads (Jean et al. 2016) and be leveraged to assess rural populations’ access to it. Similarly, mobile phone data, which can be used to estimate mobility patterns, can contribute towards urban planning. Big data sources have greater applicability to target 9.1.

TABLE 09: Industry, Innovation and Infrastructure

Possible Targets | Theme | Application | Data Source | References
9.1 | Predictors of poverty | Road access and rural population | Satellite Data | Mena & Malpica (2005); Jean et al. (2016)
9.1 | Transport Planning | Real-time traffic monitoring | Sensor data | Shi & Abdel-Aty (2015)
9.1 | Transport Planning | Road usage patterns | Mobile Phone Data | Toole et al. (2014)
9.1 | Transport Planning | Origin-destination flows | Mobile Phone Data | Calabrese et al. (2011); Samarajiva et al. (2015)
9.1 | Transport Planning | Traffic monitoring | GPS data | Google Traffic

Big Data Sources

Satellite Data, Sensor Data, Mobile Phone Data

Road Access and Rural Population

Satellite Data

Mena & Malpica (2005) developed a model for automatic road network extraction based on high-resolution satellite and aerial color imagery. The model leverages four components: “data preprocessing; binary segmentation based on three levels of texture statistical evaluation; automatic vectorization by means of skeletal extraction; and finally a module for system evaluation.”

Jean et al. (2016) used both daytime and nighttime satellite imagery as well as survey data and machine learning to identify varying economic well-being levels in Malawi, Nigeria, Rwanda, Tanzania and Uganda. The computer model was trained to identify features of daytime satellite images that were predictive of poverty, including the identification of roads.

Mobility Patterns and Urban Planning

[18] http://www.un.org/sustainabledevelopment/infrastructure-industrialization/


Mobile Phone Data

Calabrese et al. (2011) sought to identify origin-destination flows of a population in order to estimate travel demand by analyzing location data from mobile phone users in the Boston Metropolitan area. The researchers show that the results correlate well with official data at the county level as well as the more granular census tract level. Similarly, Samarajiva et al. (2015) analysed CDRs from multiple mobile phone operators in Sri Lanka to understand mobility patterns in Colombo. This included peak and off-peak travel patterns, traffic as well as congregations of people. Specifically, the researchers sought to measure changes in population density at a particular time relative to midnight, with the findings closely matching the results of a transportation survey conducted by the government.

Also related to mobility, but using a different method, is the work of Hayano & Adachi (2013), who use GPS data (based on a subscription-based service offered by the service provider) to analyze movement to a particular area. The limitation, of course, is the useable sample of users, who had to have subscribed to the GPS tracking service offered by the mobile operator.

Moreover, in order to map out the relationship between people’s mobility and their social networks, Phithakkitnukoon et al. (2012) conducted a study in Portugal analyzing a year’s worth of mobile phone data for over a million mobile users. Among the key findings, the study found that around four-fifths of places visited by users were within a 20 km radius of the location of the closest (in terms of geography) social ties. The study also revealed that population density impacted the radius, with areas that had a denser population having a shorter geo-social radius compared to less dense areas. Similarly, Calabrese et al.
(2010) sought to understand the relationship between social events and the home areas of the attendees, based on the premise that throughout time people tend to follow given preference patterns. The study analysed close to one million cell phone traces to show that there is a strong correlation between the home areas (origins) of the attendees and the type of social event held.

Sensor Data

A study conducted by Shi & Abdel-Aty (2015) deployed a Microwave Vehicle Detection System (MVDS) over 75 miles of the Orlando expressway network to explore the “viability of a proactive real-time traffic monitoring strategy evaluating operation and safety simultaneously.” The study, which showed that traffic congestion tended to be localized and varied with time, further used Bayesian inference and random forest data mining techniques to develop models for crash prediction.

Estimating Traffic Flow

Mobile Phone Data

A study by Toole et al. (2014) developed a model that leveraged call detail records to identify road usage patterns. An algorithm to mine calls and identify the phone users’ location transition probabilities (TP) was developed, and these TPs were then used in conjunction with demographic data to estimate “origin-destination flows of residents between any two intersections of a city”; congestion for each roadway was then estimated with the help of an algorithm.

Satellite Data

Similarly, a study by Necular (2015) sought to identify traffic patterns and traffic flows by utilizing GPS data from about 3,600 drivers, accounting for around 10,000 traces of vehicle GPS, to identify


“contiguous set of road segments and time intervals which have the largest statistically significant relevance in forming traffic patterns.”

Google Traffic

Google Traffic leverages crowd-sourced traffic data from mobile users using Google Maps (on iPhones) or Android users with location turned on to identify traffic speed and congestion. The data provided by these phone users is used to analyse the speed and number of cars on the road. Furthermore, its history of traffic patterns is also leveraged to predict changes in traffic.

Insights from Big Data vs. Traditional Indicators

Target 9.1 Develop quality, reliable, sustainable and resilient infrastructure, including regional and trans-border infrastructure, to support economic development and human well-being, with a focus on affordable and equitable access for all

There are two traditional indicators to measure target 9.1:

− 9.1.1 Proportion of the rural population who live within 2 km of an all-season road

− 9.1.2 Passenger and freight volumes, by mode of transport
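As an illustration of how indicator 9.1.1 might be approximated once road locations have been extracted from imagery, the following is a minimal sketch and not a method taken from any of the studies cited. It assumes that rural settlements are available as points with a population count and that the road network has been sampled into points; the function names and the sample data are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rural_access_share(settlements, road_points, threshold_km=2.0):
    """Share of rural population living within threshold_km of a road point.

    settlements: list of (lat, lon, population) tuples (hypothetical input).
    road_points: list of (lat, lon) points sampled along all-season roads.
    """
    if not road_points:
        raise ValueError("at least one road point is required")
    total = sum(pop for _, _, pop in settlements)
    if total == 0:
        raise ValueError("total population must be positive")
    covered = 0
    for lat, lon, pop in settlements:
        nearest = min(haversine_km(lat, lon, rlat, rlon)
                      for rlat, rlon in road_points)
        if nearest <= threshold_km:
            covered += pop
    return covered / total


# Hypothetical example: three settlements and a single road point.
sample_settlements = [(0.0, 0.0, 100), (0.0, 0.01, 300), (0.0, 0.1, 600)]
sample_roads = [(0.0, 0.0)]
share = rural_access_share(sample_settlements, sample_roads)
```

Measuring distance to sampled road points rather than to road segments slightly overstates distances, so a production estimate would need densely sampled roads or proper line geometry.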

Big data sources can help understand patterns of road usage, pockets of congestion, and generally determine mobility patterns of the population, all key factors in the development of infrastructure.

Goal 10: Reduced Inequalities

According to the UN, income inequality within countries has risen, and while there have been inroads into reducing poverty, there has been rising consensus regarding the need for inclusive development, that is, development that takes into account its economic, social and environmental aspects.[19] With regard to specific targets, mobile phone data, which can be leveraged to assess the socioeconomic status of populations at granular levels, can contribute towards identifying the bottom 40 percent of the population, enabling governments to provide targeted assistance to promote income growth. This helps to achieve target 10.1 and the portion of 10.2 that focuses on economic inclusion.

TABLE 10: Reduced Inequalities

Possible Targets | Theme | Application | Data Source | References
10.1 | Socioeconomic status and wellbeing | Socioeconomic status | Mobile Phone Data | Frias-Martinez et al. (2012); Blumenstock et al. (2015)

Big Data Sources

Mobile Phone Data

Socioeconomic Status and Economic Well-being

Goal 1 deals with this topic in detail.

Insights from Big Data vs. Traditional Indicators

[19] http://www.un.org/sustainabledevelopment/inequality/


Target 10.1 By 2030, progressively achieve and sustain income growth of the bottom 40 per cent of the population at a rate higher than the national average.

This is to be measured using one indicator:

− 10.1.1 Growth rates of household expenditure or income per capita among the bottom 40 per cent of the population and the total population
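The arithmetic behind indicator 10.1.1 can be sketched as follows. This is a simplified illustration, assuming two survey rounds of per capita household expenditure with equal household weights; real computations would apply survey weights, and the function names and figures below are hypothetical.

```python
def mean_of_bottom_share(values, share=0.4):
    """Mean of the lowest `share` of values (default: bottom 40 per cent)."""
    ordered = sorted(values)
    cutoff = max(1, int(len(ordered) * share))
    return sum(ordered[:cutoff]) / cutoff


def annualized_growth(start, end, years):
    """Compound annual growth rate between two levels."""
    return (end / start) ** (1 / years) - 1


def indicator_10_1_1(first_round, second_round, years):
    """Return (bottom-40 growth, total-population growth) per year."""
    bottom = annualized_growth(mean_of_bottom_share(first_round),
                               mean_of_bottom_share(second_round), years)
    total = annualized_growth(sum(first_round) / len(first_round),
                              sum(second_round) / len(second_round), years)
    return bottom, total


# Hypothetical per capita expenditure for ten households, two years apart.
round_one = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
round_two = [12.1, 24.2, 36.3, 48.4, 50, 60, 70, 80, 90, 100]
bottom_growth, total_growth = indicator_10_1_1(round_one, round_two, years=2)
```

Target 10.1 is met in this toy example because the bottom 40 per cent grows faster than the population as a whole.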

While mobile phone data would not be able to provide growth rates per capita, it could contribute towards assessing changes in the socioeconomic status of populations.

Goal 11: Sustainable Cities and Communities

With almost 60 percent[20] of the world’s population projected to live in urban areas by 2030, governments will face increased urban challenges including managing congestion, providing adequate housing, addressing urban poverty, providing sound infrastructure and ensuring access to basic services in a sustainable manner. Among other things, urban planning requires information on the mobility of the population, areas of congregation, areas of residence, social networks as well as general economic conditions. To that extent, understanding the dynamics between mobility, the physical distance of movements and the corresponding social networks (relationships that may or may not result in movement within a geographic space) can feed into many aspects of urban planning. Big data sources such as mobile phone data and satellite-based technology are well positioned to provide insights of relevance to urban planning.

TABLE 11: Sustainable Cities and Communities

Possible Targets | Theme | Application | Data Source | References
11.1 | Urban Poverty | Identifying slums | Satellite Data | Kohli et al. (2012)
11.1 | Poverty Mapping | Identifying Poverty | Satellite Data | Jean et al. (2016)
11.2 | Transport Planning | Geo-social Radius | Mobile Phone Data | Phithakkitnukoon et al. (2012)
11.2 | Transport Planning | Origin-Destination Flows | Mobile Phone Data | Calabrese et al. (2011); Samarajiva et al. (2015)
11.2 | Transport Planning | Population Hotspots | Mobile Phone Data | Louail et al. (2014)
11.2 | Transport Planning | Social events and home locations | Mobile Phone Data | Calabrese et al. (2010)
11.3 | Land use | Land cover/land use changes | Remote Sensing Data | Tso & Mather (2001); Lu & Weng (2007); Thomas et al. (2011)
11.5 | Disaster response | Human mobility after disasters | Mobile Phone Data | Lu et al. (2012); Lu et al. (2016); Wilson et al. (2016)

[20] http://www.un.org/sustainabledevelopment/cities/

Big Data Sources

Satellite Data, Mobile Phone Data

Identifying Urban Poverty

The identification of slums has been discussed under Goal 1.

Mobility Patterns

Mobile Phone Data

The use of mobile phone data to estimate mobility patterns has been described under Goal 9.

Land Use

Satellite Data

Satellite-based technology can be leveraged to identify temporal change that is invaluable for the study and understanding of urban settlements. In particular, previous literature (Lu et al. 2012) has captured numerous studies (including Tso & Mather, 2001; Lu & Weng, 2007; Thomas et al. 2011) that have leveraged remote sensing data to map changes in land cover/land use.

Disaster Response

Mobile Phone Data

Given the impact of extreme weather conditions, knowing the movement of displaced populations will help steer disaster response, facilitating the provision of targeted relief. There is a growing body of research that leverages mobile phone data to understand disaster response, including understanding human mobility patterns after disasters such as cyclones (Lu et al. 2016) and earthquakes (Wilson et al. 2016; Lu et al. 2012).
It is possible to leverage mobile phone data to describe “short-term features (hours–weeks) of human mobility during and after extreme weather events, which are extremely hard to quantify using standard survey based research” (Lu et al. 2016). Using two de-identified datasets of mobile phone users in Bangladesh, researchers conducted analyses on climate-stressed regions in Bangladesh in the periods before and after Cyclone Mahasen. The dataset for this analysis included the location of the mobile phone tower nearest to the caller. Analyses were also conducted on long-term migration patterns at a national level, where the dataset included the location of the phone user’s most frequently used mobile phone tower for each month. The results enabled the researchers to “quantify incidence, direction and duration of migration episodes enabling characterization of previously undocumented features of long-term migration patterns in climate stressed areas.”

Analysis conducted by Wilson et al. (2016) on call detail records of mobile phone users showed the changes in mobility patterns of the population after the 2015 earthquake in Nepal at a granular level of detail, including the movement of an estimated 390,000 people from the valley to surrounding areas. A key feature was the rapid deployment of analytical capacity and computational architecture that enabled the detailed estimation of displacement after the earthquake.

Flowminder, in collaboration with mobile phone operator Digicel, analysed the mobility of two million de-identified mobile phones in Haiti after the 2010 earthquake, showing that results from their analyses closely corresponded to actual movement of the population after the earthquake (Lu et al.,


2012). The analysis showed that an estimated 630,000 individuals left Port-au-Prince after the earthquake. Moreover, the findings also revealed that those who left Port-au-Prince during Christmas and New Year (before the earthquake) went to those same places after the earthquake, indicating that if travel and communication patterns can be used to identify social networks, they could be used to predict movement after crises.

Insights from Big Data vs. Traditional Indicators

Target 11.1 By 2030, ensure access for all to adequate, safe and affordable housing and basic services and upgrade slums.

There is one indicator to measure the progress of this target:

− 11.1.1 Proportion of urban population living in slums, informal settlements or inadequate housing

There have been studies conducted using satellite data that can be leveraged to assess urban poverty by identifying slum dwellings, thereby helping identify households that need access to housing.

Target 11.2 By 2030, provide access to safe, affordable, accessible and sustainable transport systems for all, improving road safety, notably by expanding public transport, with special attention to the needs of those in vulnerable situations, women, children, persons with disabilities and older persons

There is only one indicator to measure the achievement of this target:

− 11.2.1 Proportion of population that has convenient access to public transport, by sex, age and persons with disabilities.

Big data sources can help identify the movement of the population, areas of congestion and population hotspots, all factors that affect the demand for transport systems.

Target 11.5 By 2030, significantly reduce the number of deaths and the number of people affected and substantially decrease the direct economic losses relative to global gross domestic product caused by disasters, including water-related disasters, with a focus on protecting the poor and people in vulnerable situations.

There are two indicators to measure this target:

− 11.5.1 Number of deaths, missing persons and persons affected by disaster per 100,000 people

− 11.5.2 Direct disaster economic loss in relation to global GDP, including disaster damage to critical infrastructure and disruption of basic services

Of the two indicators to measure target 11.5, mobile phone data can contribute towards indicator 11.5.1, which is a multi-purpose indicator with linkages to other SDG targets including 1.5, 13.2, 1.3, 14.2, 15.3, 3.9, and 3.6 among others.[21] While mobile phone data may not be able to directly measure this indicator, it can contribute towards the measurement by helping identify the movement of people after a disaster.

Goal 12: Responsible Consumption and Production

According to the UN, Goal 12 deals with “promoting resource and energy efficiency, sustainable infrastructure, and providing access to basic services, green and decent jobs and a better quality of life for all.”[22] This has ramifications in terms of the world’s consumption of energy, food and water to ensure that resources are used efficiently to gain more with less.

[21] https://unstats.un.org/sdgs/files/metadata-compilation/Metadata-Goal-13.pdf
[22] http://www.un.org/sustainabledevelopment/sustainable-consumption-production/


Reducing Food Waste

Satellite Data

Insights derived from targets in some of the other goals that have been discussed can contribute towards the achievement of Goal 12. For instance, given that only two-thirds of all food produced is eventually consumed,[23] food waste remains a key concern. Goal 2 discusses the role of big data sources in crop yield estimation. The ability to forecast production of agricultural crops will enable decision makers to prepare for potential shortages or surpluses and take corrective action, if needed, ensuring that food waste is minimized. This will contribute towards the achievement of target 12.3 (By 2030, halve per capita global food waste at the retail and consumer levels and reduce food losses along production and supply chains, including post-harvest losses), although it would not be able to measure the proposed indicator for this (12.3.1 food loss index).

Improving Energy Use

Goal 7 deals with the use of smart meters for electricity usage analysis and sensor data for renewable energy use/supply estimation.

Goal 13: Climate Action

SDG 13 is of significant importance to the global community, as evidenced by the Paris Agreement of 2015, where 195 countries adopted a global climate deal that set out a plan to curb global warming to below 2°C.[24] The Paris Agreement, which provides a roadmap to strengthen climate resilience, is crucial to achieving the SDGs. This further strengthens the case for accurate and timely data to measure progress towards the achievement of these targets. Big data, particularly satellite-based technology, is well poised to aid the monitoring of climate change indicators.

TABLE 13: Climate Action

Possible Targets | Theme | Applications | Data Source | References
13.1 | Disaster response | Human mobility after disasters | Mobile Phone Data | Lu et al. (2012); Lu et al. (2016); Wilson et al. (2016)
13.3 | Changes in water-related ecosystem | Change in surface water | Satellite Data | Haas et al. (2009); Rokni et al. (2014); Pekel et al. (2014); Mueller et al. (2016)
13.3 | Drought monitoring | Severity and extent of drought conditions | Satellite Data | Henricksen & Durkin (1986); Tucker & Choudhury (1987); Berhan et al. (2011)

[23] http://www.un.org/sustainabledevelopment/sustainable-consumption-production/
[24] https://ec.europa.eu/clima/policies/international/negotiations/paris_en


Big Data Sources

Mobile Phone Data, Satellite Data

Disaster Response

This has been covered in detail under Goal 11.

Assessing Global Surface Water

This has been covered in detail under Goal 6.

Drought Monitoring

Satellite Data

Many agricultural regions in the world are susceptible to extreme weather conditions such as droughts, which severely affect their ability to contribute towards food supply. This makes it crucial for decision makers to be aware of the location and potential of drought conditions to provide targeted assistance. Analysis, both spatial and temporal, of satellite images could help determine both the severity and the extent of the drought, usually in near real-time.

For example, a study conducted by Berhan et al. (2011) in Ethiopia relied on satellite data to derive a spatial distribution of drought-affected areas in the region of focus. Drought condition data was generated by comparing the Normalized Difference Vegetation Index (NDVI) for the first ten-day period of October 2009 with the long-term mean NDVI. The results derived aligned with the rainfall recorded in the country during this period.

Insights from Big Data vs. Traditional Indicators

Target 13.1 Strengthen resilience and adaptive capacity to climate-related hazards and natural disasters in all countries

− 13.1.1 Number of countries with national and local disaster risk reduction strategies

− 13.1.2 Number of deaths, missing persons and persons affected by disaster per 100,000 people

Of the two indicators to measure target 13.1, mobile phone data can contribute towards indicator 13.1.2 (Number of deaths, missing persons and persons affected by disaster per 100,000 people). Indicator 13.1.2 is a multi-purpose indicator with linkages to other SDG targets including 1.5, 11.5, 1.3, 14.2, 15.3, 3.9, and 3.6 among others.[25] While big data sources may not be able to directly measure this indicator, they can contribute towards the measurement by helping identify the movement of people after a disaster.

Target 13.3 Improve education, awareness-raising and human and institutional capacity on climate change mitigation, adaptation, impact reduction and early warning

− 13.3.1 Number of countries that have integrated mitigation, adaptation, impact reduction and early warning into primary, secondary and tertiary curricula

− 13.3.2 Number of countries that have communicated the strengthening of institutional, systemic and individual capacity-building to implement adaptation, mitigation and technology transfer, and development actions

[25] https://unstats.un.org/sdgs/files/metadata-compilation/Metadata-Goal-13.pdf


Similarly, while the indicators for target 13.3 focus on the number of countries that have implemented select initiatives/policies to strengthen national capacity (institutional, individual and systemic) to combat climate change, big data can be used to provide early warning of the effects of climate change.
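One form such an early-warning signal could take is the NDVI comparison described under Drought Monitoring, where a current ten-day NDVI composite is compared with the long-term mean for the same period. The sketch below illustrates the idea on small grids; it is not the procedure used by Berhan et al. (2011). The -10 per cent threshold, the function names and the sample values are assumptions made for illustration.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are reflectances in the near-infrared and red bands.
    """
    if nir + red == 0:
        return 0.0
    return (nir - red) / (nir + red)


def ndvi_anomaly_percent(current, long_term_mean):
    """Per-pixel percentage deviation of current NDVI from the long-term mean.

    Both inputs are equally sized 2-D lists; pixels whose long-term mean
    is zero are reported as 0.0 to avoid division by zero.
    """
    return [
        [100.0 * (c - m) / m if m else 0.0 for c, m in zip(c_row, m_row)]
        for c_row, m_row in zip(current, long_term_mean)
    ]


def drought_fraction(anomaly, threshold=-10.0):
    """Fraction of pixels whose anomaly is at or below the threshold.

    The default threshold is an assumed cut-off, not a published standard.
    """
    cells = [a for row in anomaly for a in row]
    return sum(1 for a in cells if a <= threshold) / len(cells)


# Hypothetical 2 x 2 grids of NDVI for the same ten-day period.
current_ndvi = [[0.30, 0.45], [0.20, 0.50]]
mean_ndvi = [[0.40, 0.45], [0.40, 0.50]]
anomaly = ndvi_anomaly_percent(current_ndvi, mean_ndvi)
affected = drought_fraction(anomaly)
```

Here the two pixels with NDVI well below their long-term mean are flagged, giving the spatial extent of the anomaly; the size of each negative anomaly indicates severity.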

Goal 14: Life Below Water

The achievement of SDG 14, which aims to “conserve and sustainably use the world’s oceans, seas and marine resources,”[26] would be a key contributor to a sustainable future. However, human activities have severely impacted nearly 40 percent of oceans.[27] The use of satellite-based technology offers significant opportunity to ensure the protection of the marine ecosystem.[28]

For instance, satellite data can be used to track the movement of marine vessels, understand their trajectory, identify illegal fishing and identify if ships cross marine protected areas. In particular, public data from the Automatic Ship Identification Systems (AIS) that are used by marine vessels worldwide can be used to observe marine activity (McCauley et al. 2016). The increasing amounts of AIS data require analysis, which requires the building up of national capacity. Initiatives include Global Fishing Watch, a collaboration between Google, Oceana and SkyTruth, that leverages data generated from AIS to assess fishing activity in near real-time. Other private companies that offer such marine fleet monitoring solutions include Big Ocean Data and Marine Traffic.
Marine Traffic’s service includes the provision of mobile apps that enable users to track vessels on their phones.

Big Data Sources

Satellite Data

Insights from Big Data vs. Traditional Indicators

Although satellite data does not appear to be able to measure specific indicators under this goal, by tracking the movement of marine vessels it can help monitor levels of pollution, better regulate fishing practices, and ensure the protection of the marine ecosystem, contributing towards targets 14.1 (which deals with the reduction of marine pollution of all kinds), 14.2 (which seeks to ensure the sustainable management and protection of marine ecosystems) and 14.4 (“effectively regulate harvesting and end overfishing, illegal, unreported and unregulated fishing and destructive fishing practices”).

Goal 15: Life on Land

Deforestation and desertification create significant ramifications for sustainable development. With an annual forest loss of around 13 million hectares, there is an increasing need to combat deforestation.[29] Remote sensing data has been instrumental in assessing changes in topography over time and space, and there are numerous studies that have sought to measure changes in surface water (Sawaya et al. 2003; Feyisa et al. 2014; Pekel et al. 2016), monitor drought conditions

[26] http://www.un.org/sustainabledevelopment/wp-content/uploads/2016/08/14_Why-it-Matters_Goal-14_Life-Below-Water_3p.pdf
[27] http://www.un.org/sustainabledevelopment/oceans/
[28]
[29] See: http://www.un.org/sustainabledevelopment/biodiversity/


(Henricksen & Durkin, 1986; Tucker & Choudhury, 1987) and assess changes in forest cover (Hansen et al. 2013).

TABLE 15: Life on Land

Possible Targets | Theme | Application | Data Source | References
15.1 | Identify Deforestation | Forest mapping | Satellite Data | Hansen et al. (2014); Ohmann et al. (2014)
15.3 | Combat desertification | Changes in vegetation | Satellite Data | Hutchinson et al. (2015)

Forest Mapping

Satellite Data

Satellite data has generally been used to assess changes in forest cover across regions. One particular study (Hansen et al. 2014) by Google and researchers at the University of Maryland leveraged satellite data (from NASA/USGS Landsat 7) to map the gain and loss of forest cover globally (the analysis excluded Antarctica and select Arctic islands) over the period 2000 to 2012, with Google Earth Engine processing 20 terapixels of Landsat data. The study found that over the 12-year period, 2.3 million square kilometers of forest area was lost globally, with a new forest gain of 800,000 square kilometers. Moreover, Global Forest Watch uses Google Earth Engine to analyze and create datasets including annual tree cover gain/loss data from the University of Maryland.[30]

Ohmann et al. (2014) integrate forest inventory plot data with satellite image data to map forest vegetation at regional levels. Their objective was to quantify the effects of scale-related methods on map accuracy, as methodological choices (scale, resolution etc.) often affect the maps that are developed from satellite and plot data.

Similarly, monitoring land use in terms of vegetation change can be useful in ensuring that land is being used in a sustainable manner. For instance, Hutchinson et al. (2015) analysed 10 years’ worth of MODIS NDVI data (available at 16-day intervals) for army training lands in northeast Kansas, examining NDVI trends to identify changes in vegetation. Potter (2014) explores the use of the Landsat difference index methodology as a low-cost mode of monitoring change caused by climate and biological factors. This method proves to be feasible in comparison with high-resolution aerial images and ground-based survey data.

Insights from Big Data vs. Traditional Indicators

Target 15.1 By 2020, ensure the conservation, restoration and sustainable use of terrestrial and inland freshwater ecosystems and their services, in particular forests, wetlands, mountains and drylands, in line with obligations under international agreements

− 15.1.1 Forest area as a proportion of total land area
− 15.1.2 Proportion of important sites for terrestrial and freshwater biodiversity that are covered by protected areas, by ecosystem type

Target 15.3 By 2030, combat desertification, restore degraded land and soil, including land affected by desertification, drought and floods, and strive to achieve a land degradation-neutral world

− 15.3.1 Proportion of land that is degraded over total land area
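
As a rough illustration of how indicators such as 15.1.1 and 15.3.1 could be derived from classified satellite imagery, the sketch below computes the share of land cells that carry a given label. It is not drawn from any of the studies cited in this section. The function name `share_of_land`, the class labels and the toy grid are hypothetical, and the sketch assumes that every cell covers the same area and that inland water cells are excluded from total land area.

```python
def share_of_land(grid, target_label, water_label="water"):
    """Return the fraction of land cells in `grid` labelled `target_label`.

    `grid` is a list of rows of class labels from a (hypothetical) land-cover
    classification. Cells labelled `water_label` are not counted as land.
    Assumes all cells cover equal area.
    """
    land_cells = 0
    target_cells = 0
    for row in grid:
        for label in row:
            if label == water_label:
                continue  # inland water is excluded from total land area
            land_cells += 1
            if label == target_label:
                target_cells += 1
    if land_cells == 0:
        raise ValueError("grid contains no land cells")
    return target_cells / land_cells


# Toy 2 x 3 classification: 5 land cells, 2 of them forest, 1 degraded.
toy_grid = [
    ["forest", "crop", "water"],
    ["forest", "degraded", "crop"],
]
forest_share = share_of_land(toy_grid, "forest")      # analogue of 15.1.1
degraded_share = share_of_land(toy_grid, "degraded")  # analogue of 15.3.1
```

In practice pixel areas vary with projection and latitude, so a real calculation would weight each cell by its area rather than simply counting cells.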

[30] https://developers.google.com/earth-engine/tutorial_gfw_01


In relation to the targets listed above, satellite data in combination with other sources of data could potentially be leveraged to measure some of the indicators, specifically 15.1.1 (Forest area as a proportion of total land area) and 15.3.1 (Proportion of land that is degraded over total land area).

Goal 16: Peace, Justice and Strong Institutions

One of the targets under this goal deals with the reduction of violence and related deaths, and big data sources can play an important role in fighting crime through the analysis of historic crime data coupled with other data such as mobility patterns and demographics. According to Chen et al. (2012), in addition to crime analysis and prediction, big data can play a role in cyber security, open source intelligence, terrorism informatics and computational criminology, among others. Big data insights have the greatest applicability to target 16.1 (Significantly reduce all forms of violence and related death rates everywhere).

TABLE 16: Peace, Justice and Strong Institutions

Possible Targets | Theme | Application | Data Source | References
16.1 | Predictive policing | Crime prediction | Mobile Phone Data | Bogomolov et al. (2014)
16.1 | Predictive policing | Support crime prediction | Social Media Data | Gerber (2014)

Big Data Sources Used

Mobile Phone Data, Social Media Data

Predictive Policing

Mobile Phone Data

New data sources have the potential to better equip the police force to identify areas of potential crime. For instance, aggregated anonymized mobile phone data, which can be used to derive information on human mobility, can also be used to identify crime hotspots when used in conjunction with demographic data. For example, a study conducted by Bogomolov et al. (2014) in London sought to identify potential locations for future crimes based on aggregated people dynamics features from the analysis of past mobile phone data, as well as open data that included sales of residential property, weather data, transportation data, and data on criminal cases reported. When compared against real crime data, the results showed an accuracy of nearly 70% in predicting whether or not an area would be a crime hotspot in the next month. Similarly, authorities have also been using crime prediction software solutions such as Predpol and Palantir. Predpol, for instance, relies on information on crimes such as time, place and type of crime, and on proprietary algorithms, to generate predictions on when and where crimes are more likely to happen.

Social Media Data

Studies have also investigated the use of social media data for crime incident prediction. For instance, Gerber (2014) conducted linguistic analysis coupled with topic modeling to identify discussion points in a city in the United States, which were then fed into a crime prediction model. The study found that the prediction of crime improved for more than three-fourths of crime types when Twitter data was included in the crime prediction model.


Insights from Big Data vs. Traditional Indicators

Target 16.1 Significantly reduce all forms of violence and related death rates everywhere

The progress of this target would be measured using four indicators:

− 16.1.1 Number of victims of intentional homicide per 100,000 population, by sex and age
− 16.1.2 Conflict-related deaths per 100,000 population, by sex, age and cause
− 16.1.3 Proportion of population subjected to physical, psychological or sexual violence in the previous 12 months
− 16.1.4 Proportion of population that feel safe walking alone around the area they live

While big data sources may not be able to measure the four specific indicators for this target, insights derived from big data sources can help identify hotspots for crime, enabling authorities to allocate resources more effectively and ultimately contributing to the achievement of target 16.1.

Goal 17: Partnerships for the Goals

There are numerous players that are active in the BD4D space, and this section provides a high-level assessment of the key players spanning government, multilateral organizations, civil society, industry, donor organizations and academia, among others. One of the key elements of discourse around the SDGs overall is the need for strategic partnerships, which is mirrored in the BD4D space as well. The need for multi-stakeholder engagement was also emphasized by Mr. Wu Hongbo, United Nations Under-Secretary-General for Economic and Social Affairs, at the recently concluded UN World Data Forum in Cape Town (Jan 15-18, 2017):

“Countries all around the world are mobilizing to carry out the 2030 Agenda for Sustainable Development. To do so, it is essential to have accurate, reliable, timely and disaggregated data…. This will require everyone in the statistics and data community – from governments, the private sector, the scientific and academic communities and civil society -- to find ways to work across different domains and create partnerships and synergies.” [31]

The full potential of big data for development may only be realized if there is collaboration between various actors, including but not limited to government, academia, funding agencies, private sector, multilateral/international institutions, civil society, media and other non-governmental organizations (NGOs).

Multilateral Organizations

United Nations

The Global Working Group on Big Data for Official Statistics

The United Nations has been a key player within the big data for official statistics space. In particular, through the establishment of a Global Working Group (GWG) on Big Data for Official Statistics, the United Nations Statistics Division (UNSD) sought to explore the challenges and benefits involved in utilizing big data for official statistics purposes while taking into consideration issues related to

[31] Briefing on the opening of the UN World Data Forum. (2017, January 15). Retrieved from http://undataforum.org/WorldDataForum/wp-content/uploads/2017/01/Opening-remarks-Mr.-Wu-UNWDF-press-briefing-15-Jan.pdf


privacy, methodology, technology and legislation, among others. One of the six task teams within this GWG is focused on big data and the SDGs, with members of this team including a cross-section of players such as World Bank, United Nations Economic and Social Commission for Asia and the Pacific, International Telecommunication Union, University of Pennsylvania, Orange, Data-Pop Alliance and Paris21 [32]. It was the UNSD, in partnership with CSO Ireland, that organized the third international conference on Big Data for Official Statistics in Dublin in August/September 2016, where issues of access to big data and the need for partnerships, related privacy issues, the need for capacity building in NSOs and the use of big data for measuring SDGs were explored.

Global Platform for Big Data for Official Statistics

Furthermore, the GWG has proposed the development of a Global Platform for Big Data for Official Statistics to bring together “public and private big data networks at the national, regional and global level both held by public and private agencies to make data, services and applications accessible and accelerate their synergies for research and capacity building” [33].

UN Big Data Project Inventory

Moreover, the UN Big Data Project Inventory, a joint effort of the World Bank and the UNSD for the GWG, showcases big data projects (a majority of which are initiatives by National Statistics Offices around the world) that are at various stages of implementation, ranging from exploratory research to projects being implemented, that support SDG measurement and/or relate to official statistics. The Inventory, which lists projects categorized by statistical area and data source, would also provide categorization of projects by SDG goal.

UN Global Pulse

A big data initiative by the United Nations Secretary-General, UN Global Pulse conducts big data for development research through a network of innovation labs. The Pulse labs in Jakarta, Kampala and New York seek to develop new frameworks and methods to leverage big data for development, and they also collaborate with a range of stakeholders including academia, public sector, governments and other UN agencies. Areas of study include climate & resilience, privacy & ethics, economic wellbeing, food & agriculture, and gender, among others. UN Global Pulse has also been active in scoping out big data solutions to address the SDGs, with a list of projects uploaded on the UN Big Data Project Inventory.

World Bank

The World Bank collaborated with SecondMuse Associates to publish a report, “Big Data in Action for Development” [34], that sought to better understand how big data could be utilized in the development

[32] Using Big Data for the Sustainable Development Goals - UN GWG for Big Data. (n.d.). Retrieved from https://unstats.un.org/bigdata/taskteams/sdgs/
[33] Business Model for Global Platform for Big Data for Official Statistics in support of the 2030 Agenda for Sustainable Development. (2016, August 20). Retrieved from https://unstats.un.org/unsd/bigdata/conferences/2016/gwg/Item%203%20-%20Business%20Model%20for%20Global%20Platform%20for%20Big%20Data%20-%2016%20August%202016.pdf; Committee on Global Platform for Data, Services and Applications - UN GWG for Big Data https://unstats.un.org/bigdata/taskteams/globalplatform/
[34] Big Data in Action for Development. (n.d.). Retrieved from http://live.worldbank.org/sites/default/files/Big%20Data%20for%20Development%20Report_final%20version.pdf


sector. Moreover, in September 2016 it launched the World Bank Big Data Innovation Challenge [35], which invited applicants from around the world to provide big data innovations that could address issues related to climate change. Furthermore, the World Bank is involved in numerous big data-related projects around the world, ranging from the prediction of poverty using mobile phone data in Guatemala to tracking poverty using satellite data in Pakistan [36].

World Food Programme

The World Food Programme has also been exploring the use of data innovations to better position itself in the fight against global hunger and the achievement of SDG 2 [37]. For example, it teamed up with UN Global Pulse to develop a new method to estimate household food expenditure in Africa based on the analysis of mobile phone data [38].

Non-Governmental Organizations/Non-Profits/Civil Society

Name of Organization | Description

Data2x | An initiative of the United Nations Foundation, Data2x seeks to improve the quality of gender data along with the availability and use of it and, to that end, has been supporting research pilots to understand how various big data collection and analysis methods could address gender gaps.

Data-Pop Alliance | Data-Pop Alliance, whose core members include Harvard Development Initiative, MIT Media Lab, the Overseas Development Institute and the Flowminder Foundation, draws together various actors in the BD4D space. It focuses on themes around big data and development including politics and governance, climate change and resilience, data ethics and literacy, among others.

Flowminder | Sweden-based non-profit Flowminder seeks to leverage insights derived from the analysis of big data sources such as satellite data and mobile operator data to serve middle and low-income countries through various application areas that include socioeconomic analysis, precision epidemiology and disaster response.

[35] Fay, M. (2016, October 20). The data revolution continues with the latest World Bank Innovation challenge. Retrieved from http://blogs.worldbank.org/voices/data-revolution-continues-latest-world-bank-innovation-challenge
[36] Big Data Project Inventory - UN GWG for Big Data. (n.d.). Retrieved from https://unstats.un.org/bigdata/inventory/
[37] Innovation at the World Food Programme. (2016, May). Retrieved from http://documents.wfp.org/stellent/groups/public/documents/communications/wfp284025.pdf
[38] WFP And UN Global Pulse Show How Big Data Can Save Lives And Fight Hunger. (2015, April 08). Retrieved from https://www.wfp.org/news/news-release/wfp-and-un-global-pulse-show-how-big-data-can-save-lives-and-fight-hunger


LIRNEasia | Sri Lanka-based non-profit LIRNEasia has been engaged in big data for development research since 2012, with solutions that contribute to urban and transportation planning and land use classification, among others, and is also involved in conducting exploratory research on the potential for leveraging big data for other public purposes (such as dengue propagation). LIRNEasia has also been active in raising and addressing issues related to privacy, security and methodology.

Centre for Internet and Society | Since 2015, the Centre for Internet and Society, an India-based non-profit, has been undertaking research into Big Data from the perspectives of both human rights and development. Their big data and rights work looks into the harms and benefits of big data through case studies on digital identity, credit scoring, predictive policing, transportation, and smart grids in the Indian context. Their big data and development work has focused on the ways in which big data can be used to support the achievement of the global goals in the Indian context.

Academia/Research Labs

Academic Researchers

There is a growing body of academic researchers involved in big data research. For instance, some of the researchers captured in this report include:

Mobile Phone Data: Joshua Blumenstock (University of California, Berkeley), Vanessa Frias-Martinez (University of Maryland), Rein Ahas (University of Tartu), Ryousuke Shibasaki (University of Tokyo), Adeline Decuyper (University of Louvain) and Santi Phithakkitnukoon (Chiang Mai University)

Satellite/Remote Sensing Data: Neal Jean (Stanford University), Marshall Burke (Stanford University), Komeil Rokni (Birjand University of Technology), J.M. Shawn Hutchinson (Kansas State University), and Alexander Brooker (University of Ottawa).

Online Data: Alberto Cavallo and Roberto Rigobon (Massachusetts Institute of Technology), Albert C. Yang (Harvard Medical School), Wei Xu (Renmin University of China)

Research Labs

HuMNet Lab - MIT

The Human Mobility and Networks Lab at the Massachusetts Institute of Technology seeks to explore social networks, mobility and cities by leveraging both machine learning and statistical physics.

Shibasaki & Sekimoto Lab

A part of the Center for Spatial Information Science & Institute of Industrial Science, the University of Tokyo, the lab leverages satellite data, sensors, GPS and mobile phone data. Its big data research includes emergency management, people flow, dynamic census, mapping (roads, buildings, economic activities), and urban and regional analysis.


Cambridge Big Data Strategic Research Initiative

The research initiative draws multi-disciplinary expertise from all six schools at the University of Cambridge to conduct big data research across five themes that include “theoretical foundations, imaging, data management and processing, ethics access and impact, and making big data work” [39].

Private Sector

Big Data Analytics/Solutions Providers

Real Impact Analytics

Through its Data for Good segment, Belgium-based Real Impact Analytics seeks to leverage mobile phone data to help meet the SDGs. In addition to the provision of operational apps for development agencies, the company is also involved in conducting analytical research in areas such as food security and financial inclusion.

Google Earth Engine

Google Earth Engine offers a platform for large-scale analysis of geospatial data. Its archive includes satellite imagery that spans over 40 years (with new images added daily) and enables data mining at a global scale. In addition to business users, the platform is also available to government users as well as for public purposes. Numerous developmental initiatives have leveraged Google Earth Engine's infrastructure, including for malaria risk mapping as well as for assessing changes in global surface water [40].

Data Providers

Telecom Operators

Telenor is among the mobile operators whose data has been used for development purposes. For example, in 2013, Telenor Research, Oxford University, Harvard T.H. Chan School of Public Health and the University of Peshawar partnered to understand the impact of human mobility on the spread of dengue. The study leveraged data from 30 million Telenor Pakistan subscribers. Similarly, in 2013, together with Grameenphone and the Flowminder Foundation, the Telenor Group leveraged mobile phone data to understand migration patterns in Bangladesh in response to Cyclone Mahasen.

Orange Group and Sonatel have also shared anonymized mobile phone data in Senegal with research laboratories as part of the Data for Development challenge that focused on the themes of agriculture, energy, health, official statistics and transport/urban planning. Moreover, South Korea's leading telephone company, KT Corporation, sought to understand the spread of avian influenza by leveraging telecommunication big data.

Through its Telefonica Dynamic Insights division, Telefonica provides big data analytics solutions in the areas of transport, cities & regions, retail and media, serving a client base that ranges from private sector operators to governments. Telefonica has also been active in the big data for social good space, having collaborated with MIT, GSMA, Data-Pop Alliance and UN Global Pulse, among others.

[39] http://www.bigdata.cam.ac.uk/research/BIGDATA/directory/research-themes
[40] Case Studies. (n.d.). Retrieved from https://earthengine.google.com/case_studies/


In addition, telecom operators have also provided their data to third-party analytics providers under strict non-disclosure agreements, as in the case of LIRNEasia in Sri Lanka, where non-disclosure agreements restrict the organization from disclosing the names and the number of operators it has partnered with.

Social Media Platforms

Social media platforms have also jumped on the bandwagon to utilize their data, technology and expertise for development purposes. For example, in September 2016, Twitter and UN Global Pulse entered into a strategic data partnership offering UN Global Pulse access to the platform's data tools to generate insights that could help address the SDGs.

Facebook [41] is working with Columbia University and the World Bank to develop population maps of the world. Its Connectivity Lab division teamed up with its data science division, machine learning and Artificial Intelligence groups and the infrastructure unit to analyse satellite imagery across 20 countries by leveraging techniques such as Facebook's image-recognition engine.

Search Engines

Although it no longer publishes Flu Trends and Dengue Trends, Google attempted to make predictions of flu activity based on the analyses of search queries.

Infrastructure Service Providers

Big data analytics brings with it the challenges of data storage, with many organizations lacking the necessary storage capacity to house such large volumes of data. Third-party providers such as Amazon Web Services come into the picture, offering infrastructure-as-a-service solutions that enable companies to house their data outside their physical premises whilst making it accessible remotely. Other providers include Google Cloud Platform and Microsoft Azure.

Government

Governments hold large volumes of data regarding their citizens, and analysis of such information can create insights that have implications for development.

National Statistics Offices

National Statistics Offices (NSOs) have been exploring the use of big data in official statistics. For example, the ONS big data project by the UK's Office for National Statistics aims to develop a strategy to leverage big data in its official statistics. In addition to analytical research using big data sources such as Twitter data and smart electricity meters, the project has also conducted a literature review on mobile phone data's statistical uses. In 2015, the GWG on big data for official statistics conducted a global survey to assess the ground realities of the usage of big data by NSOs. Of the 93 countries that responded, around half indicated at least one big data-related project, reporting a total of 115 big data projects. Satellite data, scanner data, mobile phone data and web-scraping data were among the sources of big data that were of interest to NSOs, with big data being used for various statistical applications including transport, tourism, population and price. Moreover, based on the results of the survey, developing countries were more likely to view meeting new data requirements such as measuring the SDGs as a benefit of

[41] Conditt, J. (2016, November 15). This is how the world looks on Facebook's population maps. Retrieved from https://www.engadget.com/2016/11/15/facebook-population-maps-open-source/


big data, compared to their OECD counterparts [42]. The UN Big Data Project Inventory captures NSO big data initiatives.

Funders and Donor Agencies

Organizations that have provided funding for big data for development initiatives include the International Development Research Centre, the World Bank, the Gates Foundation and the William and Flora Hewlett Foundation [43], among others.

[42] See: https://unstats.un.org/unsd/statcom/47th-session/documents/2016-6-Big-data-for-official-statistics-E.pdf
[43] See: http://datapopalliance.org/; http://www.hewlett.org/grants/united-nations-foundation-for-continuation-of-data2x-including-big-data-pilots-0/


Challenges

Analytical challenges

While there is a considerable body of literature and work that has been built up leveraging these new data sources for developmental purposes, overall these are still the early stages. Much of what has been done can be considered proof-of-concept, and there has been limited scaling up. Hence there are still analytical challenges that have to be overcome. When dealing with large quantities of data, often unstructured and often from multiple sources, there is an implicit assumption that data will be “messy.” The belief is that “what we lose in accuracy at the micro-level, we gain in insight at the macro level” (Mayer-Schönberger & Cukier, 2013, p. 36). This is misleading. Data quality and its provenance do matter, and the question is important in establishing generalizability of the Big Data findings. Knowing such ground context is important not just in understanding the base raw data, but also when interpreting results. For example Nathan Eagle, a pioneer in using big data for development, upon discovering low population mobility following a flood when analyzing CDR data from Rwanda, theorized the cause to be an outbreak of cholera. However a quick ground survey revealed that the real cause was washed-out roads (David, 2013).

Whilst the large data sizes may make questions regarding the sampling rate irrelevant, knowing the representativeness of the data is still important. Even as mobile subscriptions in many developing countries near 100%, it still doesn't mean that every person in the country owns a mobile phone. Depending on the research being pursued, questions such as the extent of coverage of the poor, or the levels of gender representation amongst mobile phone users, can be very important. Street Bump, a mobile app that notifies Boston City Hall whenever app users drive over a pothole on Boston roads, suffers from a selection bias. This is because the app is biased towards the demographics of the app users, who often hail from affluent areas with greater smartphone ownership (Harford, 2014). Theoretically it could be possible that so long as resources are allocated on the basis of insights from such apps, the poorer parts of the city will become further marginalized. Hence even in the big data paradigm, understanding and accounting for measurement bias, ensuring internal and external validity, and understanding inter-dependencies in the data all remain important. These are foundational issues not just for “small data” but also for “big data” (Boyd & Crawford, 2012). Understanding data biases will be important, not least because the taglines for the data revolution include “counting the uncounted” and “leaving no one behind.”
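
One common way of partially accounting for unrepresentative coverage of this kind is post-stratification: group-level averages from the big data source are reweighted by each group's known share of the population (for instance from a census). The sketch below is illustrative only and is not a method described in this report. The function name, the group labels and the numbers are hypothetical. Reweighting also cannot correct for groups that are entirely absent from the data, which is why the function raises an error in that case.

```python
from collections import defaultdict


def poststratified_mean(records, population_shares):
    """Reweight group means by known population shares.

    records: iterable of (group, value) pairs from the observed data.
    population_shares: dict mapping group -> share of the true population
        (assumed to come from an external source and to sum to 1).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in records:
        totals[group] += value
        counts[group] += 1

    estimate = 0.0
    for group, share in population_shares.items():
        if counts[group] == 0:
            # Reweighting cannot invent data for an unobserved group.
            raise ValueError(f"no observations for group {group!r}")
        estimate += share * (totals[group] / counts[group])
    return estimate


# Hypothetical example: affluent users are over-represented in the data
# (3 of 4 records) but make up only a quarter of the population.
observed = [("affluent", 10.0), ("affluent", 12.0), ("affluent", 14.0), ("poor", 2.0)]
shares = {"affluent": 0.25, "poor": 0.75}

naive_mean = sum(value for _, value in observed) / len(observed)
adjusted_mean = poststratified_mean(observed, shares)
```

In this toy case the unweighted mean is pulled towards the over-represented group, while the reweighted estimate reflects the assumed population mix. The adjustment is only as good as the external population shares and the assumption that observed members of a group resemble its unobserved members.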

The confusion of correlation with causation becomes more pronounced in the big data paradigm. By being observational, big data can measure only correlation and not causality. The techniques of data mining and machine learning, which underpin much of big data analytics, are primarily about correlation and prediction. When big data started to gain popularity, evangelists were quick to proclaim the end of theory and hypothesis testing, with correlation being all that mattered (see for example Anderson, 2008 and even Mayer-Schönberger & Cukier, 2013). While it is true that correlations can often be enough to make decisions, the evangelistic proclamations have not been borne out. The noted behavioral economist Sendhil Mullainathan argues that inductive science (i.e. algorithmically mining big data sources) will not drown out traditional deductive science (i.e. hypothesis testing) even in a Big Data paradigm (Mullainathan, 2013). Among the three Vs in the traditional big data definition, volume and variety produce countervailing forces. More volume makes big data induction techniques easier and more effective, while more variety makes them harder and less effective. It is this variety issue that will ensure the need for explaining behavior (i.e. deductive science) rather than just predicting it.

All this is not to say that causal modeling is not possible in the big data paradigm. In fact this is achievable by conducting experiments (Varian, 2013). Telecom network operators themselves use such techniques when rolling out new services or, for that matter, for pricing purposes. But third-party researchers will not easily have access to such experimentation possibilities since these are proprietary systems.

Another misconception is that the rise of big data will mean that surveys will go the way of the dodo. Even when leveraging MNBD for development, surveys and supplemental datasets will remain important to sharpen the analyses and especially to verify the underlying assumptions. For example Blumenstock & Eagle (2010) ran a basic household survey against a randomized set of phone numbers prior to data anonymization to build a training dataset. This allowed them to understand variations in mobility, social networks and consumption amongst men and women, and between different socio-economic groups, that wouldn't have been possible using just the call records.

The broader analytical challenge is that new scientific knowledge from big data, especially when it draws on data from private sources, is difficult to verify. Transparency and replicability are critical if the underlying methods and resultant insights are to be honed and improved. This is particularly important given the still embryonic stage of computational social science. It underscores how important it will be to open up the private data sources (in a manner that addresses potential privacy concerns) so as to be able to avail of the benefits of proper peer review.

Accessing Data

The promise of the data revolution for sustainable development has been premised on the continually increasing levels of “datafication” of the economy and human behavior. The sources of big data are in both the private and public sectors. But in developing economies more comprehensive “datafication” (at least in terms of coverage of the population) has taken place in the private sector.

As this report has shown, a considerable and growing body of work showcases the potential of mobile network big data as one of the most potent sources for insights of relevance to developmental policy. However mobile phone operators are mostly private enterprises and the sector itself is highly competitive, which makes data access difficult. There are several reasons why accessing this data even for developmental purposes is difficult. Firstly, given that operators possess behavioral data about their customers, there are privacy concerns even with datasets being pseudonymized. Secondly, regulations and laws may be in place that prevent or limit data sharing. More often than not the laws have yet to catch up and there is often a regulatory and legal vacuum in developing economies. Operators would often prefer to tread carefully, if at all, for fear of attracting greater regulatory attention, especially if the business case itself isn't very clear. Thirdly, the mobile phone services sector is often highly competitive. As such, operators are wary that competitors could glean commercially sensitive information from data sharing with third parties. With the exception of LIRNEasia in Sri Lanka, nearly all instances of development-focused use of mobile network data involved the use of data from only one operator.

There are several countervailing forces in play. In the early days of the rise of big data in the public discourse, there was some recognition by operators that the data they possessed could have commercial uses that could help them generate new revenue sources. But there was less clarity on the business use cases. This created opportunities for academics and researchers to negotiate data access, which enabled the operators to also identify potential business cases, while allowing researchers and practitioners to explore developmental use cases. This facilitated innovation, with operators making their data available (under agreements) for challenges and competitions [44]. While it can be argued that the state of the art is still in its embryonic stages, by now there are some reasonably fleshed-out development use cases.

[44] For example the D4D challenges initiated in partnership with Orange made historical pseudonymized data available for researchers under agreements for challenges in 2013 (data from Côte d'Ivoire) and 2014 (data from Senegal). For more information refer to http://www.d4d.orange.com/en/Accueil

After MNBD gathered renewed interest in the public discourse during the Ebola crisis in West Africa, GSMA (the global lobbying group for most GSM operators in the world) came up with recommendations that suggested in-house analyses rather than data sharing [45]. The current trend is for custom hardware and software infrastructure that sits behind the operator's firewall. The data are analyzed, results aggregated and then provided to outside parties. This is what happened in Nepal after the recent earthquake. Ncell, the largest mobile operator (based on subscriber numbers) in Nepal, worked with Flowminder to utilize its data to map out the population displacements that had occurred as a result of the earthquake. The GSMA's suggested approach is gaining traction, with Flowminder and the Data-Pop Alliance proposing this method as the way forward. There are some clear benefits. Operators' concerns in relation to privacy, competition, and even regulatory attention would be better assuaged. But for this to work smoothly, there are several things to consider: (a) the developmental use case and the associated techniques and related code should be sufficiently robust to allow for a “plug and play” deployment; (b) there should be minimal need for other third-party “sensitive” data; (c) there must be some knowledge of the level of representativity of the operator's data. It is difficult to see how this approach could cover all the different use cases, since the type and complexity of the development insights needed would vary from context to context.
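
The in-house workflow described above (analysis behind the operator's firewall, with only aggregated results released) can be illustrated with a minimal sketch. It is an assumption-laden example rather than the actual procedure of the GSMA, Flowminder or any operator: the record layout, the function name `aggregate_for_release` and the suppression threshold `min_count` are all hypothetical. The idea shown is that outside parties receive only per-region subscriber counts, and that counts below a threshold are withheld to reduce the risk of re-identification.

```python
def aggregate_for_release(records, min_count=10):
    """Count distinct subscribers per region and suppress small counts.

    records: iterable of (subscriber_id, region) pairs held by the operator.
    Returns a dict mapping region -> count, or None where the count is
    below `min_count` (a hypothetical disclosure threshold).
    """
    subscribers_by_region = {}
    for subscriber_id, region in records:
        subscribers_by_region.setdefault(region, set()).add(subscriber_id)

    released = {}
    for region, subscribers in subscribers_by_region.items():
        count = len(subscribers)
        released[region] = count if count >= min_count else None
    return released


# Hypothetical records; subscriber "a" appears twice in region "north".
sample_records = [
    ("a", "north"), ("a", "north"), ("b", "north"), ("c", "north"),
    ("d", "south"),
]
release = aggregate_for_release(sample_records, min_count=3)
```

Only the `release` dictionary would leave the operator's premises in this sketch; the subscriber-level records never do. A suitable threshold, and whether simple suppression is sufficient at all, would depend on the context and on the applicable privacy rules.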

The question then is how to facilitate greater data sharing by service providers for public purposes, while addressing concerns related to privacy and competition. As a regulated industry, mobile network operators (MNOs) operate under license, which could theoretically be argued to be a form of concessionary contract to deliver a government service, and as such, licenses could theoretically include provisions for data sharing. Organizations such as UN Global Pulse are seeking to popularize the concept of “data philanthropy,” aimed at systematizing the regular and safe sharing of data by building on the precedents being created by the ad hoc activities. While this philosophy is gaining traction, it would have to be built on research that articulates the concrete developmental potential for this data, generates mechanisms to protect both privacy and the business interests of the private sector actor sharing the data, and develops a clear business case for the private sector actor to share the data.

Capacity

Data science is a frontier field and will require broad expertise in a variety of fields. These include a combination of data mining, statistics, domain expertise, and also skills in data preparation, cleaning, and visualization. NSOs may have deep in-house statistical skills, but this is not enough to work with the large volumes of big data, which call for computer science and decision analysis skills that are not emphasized in traditional statistical courses (McAfee & Brynjolfsson, 2012). Currently there is a mismatch between the supply and demand for talent with the needed broader skill-sets, i.e. data scientists. McKinsey predicts that by 2018 the demand for data-savvy managers and analysts in the United States would be 450,000, yet the supply will fall far short at only 160,000 (Manyika et al., 2011). In the short term, the work, especially that for public purposes, will have to be done using collaborative teams, which can draw on a variety of skill-sets to collectively analyze and make sense of the data.

45 The complete GSMA guidelines on the use of mobile data for responding to Ebola can be found at http://www.gsma.com/mobilefordevelopment/wp-content/uploads/2014/11/GSMA-Guidelines-on-protecting-privacy-in-the-use-of-mobile-phone-data-for-responding-to-the-Ebola-outbreak-_October-2014.pdf

But capacity constraints are not just in conducting data science. We do not expect policymakers to necessarily become data scientists, but we need them to be informed consumers of big data outputs. As such, paying particular attention to enlightening policymakers on how big data can be leveraged will be important. People who can connect these different disciplines and domains will have an important role.

Privacy

Privacy issues have taken center stage with the rise of big data. The conversation on privacy involves not just academics, but also the state and private sector, and the general public, who are often the primary producers of such data through their activities. But a consensus on how to address the attendant privacy challenges is yet to be reached. Privacy, as commonly understood, “is a sweeping concept, encompassing (among other things) freedom of thought, control over one's body, solitude in one's home, control over personal information, freedom from surveillance, protection of one's reputation, and protection from searches and interrogations” (Solove, 2008, p. 1). Attempts to define it in terms of boundary control by individuals (e.g., Samarajiva, 1994: 90) are difficult to translate into practical policy. The International Telecommunication Union (ITU), for example, defines individual privacy as “the right of individuals to control or influence what information related to them may be disclosed” (ITU, 2006). But it is difficult to clearly demarcate what an individual has authority over in the case of data generated as a by-product of a transaction, where the data are co-produced and held by one party. Current practices related to data privacy utilize a rights-based approach. This approach is best illustrated by the “inform and consent” policy, whereby companies inform users of what data is being collected about them and how it will be utilized by the companies and potentially also by their affiliates and partners. But as Mayer-Schönberger & Cukier (2013) rightly point out, the “inform and consent” model is impractical. Most current user privacy policies are lengthy and written in legalese that makes understanding them difficult for the layperson. They also do not deal effectively with the secondary uses for the data, which often only manifest long after the original data was collected. It becomes impractical then for companies to know in advance all the potential uses or continuously seek permission for each new use.

The lines between personal and non-personal information are further blurred when data is mashed up. Previously non-personal data, when mixed, could at times reveal insights that can easily be linked to an actual individual (Ohm, 2010). A recent study showed how personal attributes such as ethnicity, religious and political views, and even sexual orientation can be inferred from Facebook likes (Kosinski, Stillwell, & Graepel, 2013).

Even as computational social scientists utilize anonymization techniques to address privacy concerns, the methods themselves are being called into question (Narayanan & Shmatikov, 2008). De Montjoye, Hidalgo, Verleysen, & Blondel (2013) showed that just four spatio-temporal data points were enough to uniquely identify 95% of individuals in an anonymized mobile phone dataset covering 1.5 million people. Even though the data itself did not have any identity information, the authors argued that real-world identities could be found by cross-referencing the data with other public data. The emergent state of the art in technical solutions to limit re-identification may hold promise but it is still in the early stages. One particularly promising approach is differential privacy, which seeks to ensure that the results that are derived from a dataset are virtually the same whether a particular individual was in it or not. This is accomplished in principle by adding noise to the dataset in such a manner that it does not affect the overall statistical robustness of the results within a certain level of sensitivity.46
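As a rough illustration of the differential privacy idea, the sketch below uses the Laplace mechanism, one common formulation in which noise is added to a released statistic (here a count) rather than to the raw records. The function names, the example records and the choice of epsilon are assumptions made for illustration; they do not come from the studies cited above.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-centred Laplace distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy `predicate`."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1, so the
    # sensitivity of this query is 1 and the noise scale is 1 / epsilon.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: one record per subscriber with a home region.
    subscribers = [{"region": "A"}] * 120 + [{"region": "B"}] * 80
    noisy = private_count(
        subscribers, lambda r: r["region"] == "A", epsilon=0.5, rng=rng
    )
    print(round(noisy, 1))
```

A smaller epsilon means more noise and stronger protection for any one individual, at the cost of a less precise count; choosing it is a policy decision as much as a technical one.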

46 For a survey of differential privacy techniques see Ji, Lipton, & Elkan (2014)

One can hope that new privacy-preserving techniques will be sufficiently advanced by the time developing economies become more “datafied.” In the meantime, these new datasets can be leveraged for public purposes under controlled situations backed by legal agreements and approval processes. It is such research that is currently advancing the state of the art in understanding the privacy bounds for specific use cases when data with confidential information are released to third parties and/or to the public.

While the means to address privacy concerns are far from clear, there is general agreement that there must be some safeguards. These safeguards may be technological, conceptual, legal or even a combination of all three.

Ethics

While the broader data for development movement is showing the usefulness of leveraging new data sources for developmental purposes, there are challenging questions around the ethical dilemmas that emerge from using data about people's behaviors. These applications commonly fall under the secondary-use category, with the data being leveraged for purposes other than what it may have originally been intended for. Practical ways of addressing privacy concerns around secondary use, and permissions for secondary use, are difficult, as was mentioned earlier. An appropriate legal framework is still emergent, with little consensus. Hence the onus of understanding and addressing the ethical dilemmas will remain on researchers and practitioners. There will be a potential need for the development of professional standards to ensure that practitioners identify legal and ethical issues and address them. This will have to be done on a context-by-context basis. There is some movement towards this. The UN Office for the Coordination of Humanitarian Affairs (OCHA) recently published a policy brief on how to build data responsibility into the humanitarian data ecosystem (OCHA, 2016a). OCHA also issued a guidance note for data collection and disaggregation based on principles around participation, data disaggregation, self-identification, transparency, privacy, and accountability (OCHA, 2016b). But the challenge will be in addressing ethical dilemmas in a manner that does not reduce the societal and/or individual utility of the innovations that are currently emerging in using data for developmental purposes.

Competition

Competition is seen as a good that ensures a “level playing field” to give all competitors equal opportunities, though not identical or equal outcomes. In terms of competition, the relevant issues to consider in the explosion of data, as well as in the techniques for leveraging them, are the effects of mergers and acquisitions across distinctly different markets on the aggregation of data. For example, it has been argued that Google's acquisition of Nest, a supplier of home thermostats and carbon monoxide detectors operating in a distinctly different market, should still have attracted greater regulatory scrutiny because of the potential for data aggregation (Stucke & Grunes, 2016: 89-92).

Even when the entities controlling data are not regulated monopolies, they may approach monopoly status in certain markets (Rosoff, 2014). In such instances, as well as in cases of mergers or acquisitions that would increase market share, it is likely that competition authorities will pay attention to the effects of data in addition to conventional competition issues. This appears to be at least part of the justification for the attention being paid to Google by European competition authorities. The issue here is whether traditional conceptions of market definitions continue to be relevant.

Whilst this may not initially seem to affect issues related to public uses of private sector data for developmental purposes, how the emergent competition issues get resolved will affect the type of data that is collected and shared and, in particular, the longer-term success of the “data philanthropy” concept.

Conclusion

The thrust to utilize big data for official statistics underscores the potential for generating new insights and complementing existing measures. Big data can potentially revolutionize current official statistical systems in one of several ways47: a) Entirely replace existing statistical sources such as surveys; b) Partially replace existing statistical sources such as surveys; c) Provide complementary statistical information in the same statistical domain but from other perspectives; d) Improve estimates from statistical sources; and e) Provide completely new statistical information in a particular statistical domain. At the moment, complementing existing data is what offers the greatest potential for big data sources.

Research conducted to date has explored the utilization of different data sources for specific developmental applications. For example, the analysis of satellite imagery has typically been used to monitor changes in topography, including crop/yield estimation, drought monitoring, deforestation and carbon stock mapping, among others. Mobile network big data has proven to be useful in understanding mobility patterns of the population, creditworthiness of its users, and socioeconomic status of the population, among others, while social media data is well positioned for sentiment analysis. There have been literature reviews (for example, Williams, 2016, UK's Office of National Statistics; Lokanathan and Gunaratne, 2014) that have captured the statistical applications of big data, in particular mobile phone data.

The big data for development (BD4D) landscape hosts a range of players spanning government, industry, academia, and civil society, among others. These players operate as policy actors, researchers, funders as well as intermediaries. Understanding the various actors at play offers greater opportunities for strategic partnerships and collaborations. Thus, Goal 17 has been viewed through the lens of big data for development, stressing the need for collaboration between various big data actors, and providing a snapshot of the BD4D landscape.

However, it is essential to be cognizant of the fact that the state of the art is still developing and there are analytical and technological challenges to be aware of. We need to ensure that incorrect conclusions are not drawn from a blind application of big data techniques. The current techniques, including surveys, will remain important not only to bootstrap some of the big data techniques with training data but also to fine-tune models to ground realities. Hence big data will not completely replace traditional surveys but rather complement them. Similarly, given the analytical challenges, it will be very important that there is transparency and replicability in the analyses. A principal concern of the data revolution is about “counting the uncounted” and as such we need to pay particular attention to the “representativity” of these new data sources that are being leveraged, i.e. how accurately they reflect the population. Marginalization in the real world can often result in marginalization in the digital world beyond just issues of access to technologies, or being represented in the digitized data. As such, localized testing of these new techniques with local experts (aware of ground truth and local context) is very important. Being mindful of these analytical challenges does not negate the potential of big data, but rather helps refine and improve the overall process of leveraging big data for monitoring and achieving the SDGs.

Developing economies in particular have much lower levels of “datafication” than developed economies, which means some of the most interesting and relevant data exists amongst the private sector, as shown in this report. Accessing such data will not be without challenges, not least because in competitive industries such as the telecom sector, there would be competitive implications to sharing data. Even then there are costs and considerations associated with extracting and analyzing the data from private sector industries so as to sufficiently protect aspects related to customer privacy/confidentiality as well as protecting commercially sensitive business intelligence in competitive industries (e.g. the telecom sector). Innovation will have to unleash appropriate business models for leveraging these data sources. But innovation in this space will also require the confluence of a variety of actors such as state, private, academia, and importantly non-governmental researchers and practitioners. New forms of partnerships between these actors are required not just because the true value comes from assembling different types of data from different sources, but also because of the inherent capacity challenges to make full use of the data. This is inherently a multi-disciplinary effort requiring computer scientists, statisticians, and domain/subject-matter experts along with government officials.

47 See for example http://www.q2014.at/fileadmin/user_upload/ESTAT-Q2014-BigDataOS-v1a.pdf

There are also challenges in understanding and addressing the privacy implications of big data. It can be seen that the sharing (subject to appropriate privacy protocols) of privately held data such as mobile phone records can be mutually beneficial to both government and the private sector. For example, emerging research in Africa shows how reductions in airtime top-ups forecast declines in income among the poor. This can allow for targeted and timely policy actions by government to address the underlying problems, which would not be possible with the lagged, and often limited, insights revealed by traditional statistics. Such a collaborative early warning and early action system shows how such sharing could be considered a business risk mitigation strategy for operators in emerging markets. But such cooperation is predicated on opening up the currently privileged access that a few researchers and organizations have been given to mobile operator datasets.

ANNEX I: SDG Targets by Big Data Source

| Possible Targets | Theme | Statistical application | Data Source | References |
|---|---|---|---|---|
| 16.1 | Predictive policing | Crime prediction | Mobile Phone Data | Bogomolov et al. (2014) |
| 5.1 | Gender Prediction | Gender prediction | Mobile Phone Data | Sundsøy et al. (2015); Blumenstock & Eagle (2010) |
| 11.2 | Transport Planning | Geo-social Radius | Mobile Phone Data | Phithakkitnukoon et al. (2012) |
| 13.1 | Disaster response | Human mobility after disasters | Mobile Phone Data | Lu et al. (2012); Lu et al. (2016); Wilson et al. (2016) |
| 1.1; 1.2 | Socio-economic status and wellbeing | Human mobility and socioeconomic levels | Mobile Phone Data | Frias-Martinez et al. (2012) |
| 1.1; 1.2 | Socio-economic status and wellbeing | Estimating poverty and wealth | Mobile Phone Data | Blumenstock et al. (2015) |
| 1.1; 1.2 | Socio-economic status and wellbeing | Socioeconomic status | Mobile Phone Data | Gutierrez et al. (2013) |
| 1.4 | Financial Inclusion | Creditworthiness of the unbanked | Mobile Phone Data | Kumar and Mohta (2012) |
| 1.5 | Disaster response | Human mobility after disasters | Mobile Phone Data | Lu et al. (2016); Wilson et al. (2016); Lu et al. (2012) |
| 2.1 | Expenditure on Food | Proxy indicator for food expenditure | Mobile Phone Data | Decuyper et al. (2014) |
| 3.3 | Disease Propagation | Mobility from regions of disease outbreak | Mobile Phone Data | Wesolowski et al. (2015); Bengtsson et al. (2015); Wesolowski et al. (2014) |
| 3.3 | Disease Propagation | Sources and sinks for diseases | Mobile Phone Data | Ruktanonchai et al. (2016); Tatem et al. (2014) |
| 3.3 | Disease Propagation | Disease Importation rate | Mobile Phone Data | Tatem et al. (2009) |
| 4.6 | Illiteracy Prediction | Areas of low literacy | Mobile Phone Data | Sundsøy, P. (2016) |
| 8.9 | Tourism | Destination of tourists | Mobile Phone Data | Ahas et al. (2008) |
| 8.9 | Tourism | Seasonal tourism | Mobile Phone Data | Ahas et al. (2007) |
| 9.1 | Transport Planning | Road usage patterns | Mobile Phone Data | Toole et al. (2014) |
| 9.1 | Transport Planning | Origin-destination flows | Mobile Phone Data | Calabrese et al. (2011); Samarajiva et al. (2015) |
| 10.1 | Socio-economic status and wellbeing | Socioeconomic status | Mobile Phone Data | Frias-Martinez et al. (2012); Blumenstock et al. (2015) |
| 11.2 | Transport Planning | Origin-Destination Flows | Mobile Phone Data | Calabrese et al. (2011); Samarajiva et al. (2015) |
| 11.2 | Transport Planning | Population Hotspots | Mobile Phone Data | Louail et al. (2014) |
| 11.2 | Transport Planning | Social events and home locations | Mobile Phone Data | Calabrese et al. (2010) |
| 11.5 | Disaster response | Human mobility after disasters | Mobile Phone Data | Lu et al. (2012); Lu et al. (2016); Wilson et al. (2016) |
| 9.1 | Transport Planning | Traffic monitoring | GPS data | Google Traffic |
| 2.c. | Price Indexes | Constructing consumer price index | Online Prices | Cavallo and Rigobon (2016) |
| 8.1 | GDP | GDP and Human Development | Postal Data | Hristova et al. (2016) |
| 11.3 | Land use | Land cover/land use changes | Remote Sensing Data | Tso & Mather (2001); Lu & Weng (2007); Thomas et al. (2011) |
| 1.1; 1.2 | Poverty Mapping | Identifying the poor | Satellite Data | Elvidge et al. (2009); Jean et al. (2016) |
| 1.1; 1.2 | Poverty Mapping | Urban poverty | Satellite Data | Kohli et al. (2012) |
| 2.1 | Drought monitoring | Severity and extent of drought conditions | Satellite Data | Berhan et al. (2011); Tucker & Choudhury (1987); Henricksen & Durkin (1986) |
| 2.4 | Early crop yield assessment | Developing vegetation health indices | Satellite Data | Kogen et al. (2011) |
| 6.6 | Changes in water-related ecosystem | Change in surface water | Satellite Data | Mueller et al. (2016); Haas et al. (2009); Rokni et al. (2014); Pekel et al. (2014) |
| 7.1 | Access to electricity | Nighttime luminosity | Satellite Data | Elvidge et al. (1997); Elvidge et al. (2009); Townsend & Bruce (2010); Doll & Pachauri (2010); Chen & Nordhaus (2011) |
| 8.1 | GDP | Economic development | Satellite Data | Elvidge et al. (1997); Sutton and Constanza (2002); Ebener et al. (2005); Chen and Nordhaus (2011) |
| 9.1 | Predictors of poverty | Road access and rural population | Satellite Data | Mena & Malpica (2005); Jean et al. (2016) |
| 11.1 | Urban Poverty | Identifying slums | Satellite Data | Kohli et al. (2012) |
| 11.1 | Poverty Mapping | Identifying Poverty | Satellite Data | Jean et al. (2016) |
| 13.3 | Changes in water-related ecosystem | Change in surface water | Satellite Data | Haas et al. (2009); Rokni et al. (2014); Pekel et al. (2014); Mueller et al. (2016) |
| 13.3 | Drought monitoring | Severity and extent of drought conditions | Satellite Data | Henricksen & Durkin (1986); Tucker & Choudhury (1987); Berhan et al. (2011) |
| 15.1 | Identify Deforestation | Forest mapping | Satellite Data | Hansen et al. (2014); Ohmann et al. (2014) |
| 15.3 | Combat desertification | Changes in vegetation | Satellite Data | Hutchinson et al. (2015) |
| 3.3 | Disease Propagation | Seasonal trends of diseases | Search Engine Data | Schuster et al. (2010); Yang et al. (2010); Xu et al. (2010) |
| 8.5 | Unemployment | Unemployment trends | Search Engine Data | Xu et al. (2013) |
| 9.1 | Transport Planning | Real-time traffic monitoring | Sensor data | Shi & Abdel-Aty (2015) |
| 7.1 | Residential electricity consumption | Determinants of electricity consumption | Smart Meter Data | Kavousian et al. (2013) |
| 16.1 | Predictive policing | Support crime prediction | Social Media Data | Gerber (2014) |

ANNEX II: Big Data Sources by Key Applications

| Theme | Possible Targets | Statistical application | Data Source | References |
|---|---|---|---|---|
| Access to electricity | 7.1 | Nighttime luminosity | Satellite Data | Elvidge et al. (1997); Elvidge et al. (2009); Townsend & Bruce (2010); Doll & Pachauri (2010); Chen & Nordhaus (2011) |
| Changes in water-related ecosystem | 6.6 | Change in surface water | Satellite Data | Mueller et al. (2016); Haas et al. (2009); Rokni et al. (2014); Pekel et al. (2014) |
| Changes in water-related ecosystem | 13.3 | Change in surface water | Satellite Data | Haas et al. (2009); Rokni et al. (2014); Pekel et al. (2014); Mueller et al. (2016) |
| Combat desertification | 15.3 | Changes in vegetation | Satellite Data | Hutchinson et al. (2015) |
| Disaster response | 13.1 | Human mobility after disasters | Mobile Phone Data | Lu et al. (2012); Lu et al. (2016); Wilson et al. (2016) |
| Disaster response | 1.5 | Human mobility after disasters | Mobile Phone Data | Lu et al. (2016); Wilson et al. (2016); Lu et al. (2012) |
| Disaster response | 11.5 | Human mobility after disasters | Mobile Phone Data | Lu et al. (2012); Lu et al. (2016); Wilson et al. (2016) |
| Disease Propagation | 3.3 | Disease Importation rate | Mobile Phone Data | Tatem et al. (2009) |
| Disease Propagation | 3.3 | Mobility from regions of disease outbreak | Mobile Phone Data | Wesolowski et al. (2015); Bengtsson et al. (2015); Wesolowski et al. (2014) |
| Disease Propagation | 3.3 | Seasonal trends of diseases | Search Engine Data | Schuster et al. (2010); Yang et al. (2010); Xu et al. (2010) |
| Disease Propagation | 3.3 | Sources and sinks for diseases | Mobile Phone Data | Ruktanonchai et al. (2016); Tatem et al. (2014) |
| Drought monitoring | 2.1 | Severity and extent of drought conditions | Satellite Data | Berhan et al. (2011); Tucker & Choudhury (1987); Henricksen & Durkin (1986) |
| Drought monitoring | 13.3 | Severity and extent of drought conditions | Satellite Data | Henricksen & Durkin (1986); Tucker & Choudhury (1987); Berhan et al. (2011) |
| Early crop yield assessment | 2.4 | Developing vegetation health indices | Satellite Data | Kogen et al. (2011) |
| Expenditure on Food | 2.1 | Proxy indicator for food expenditure | Mobile Phone Data | Decuyper et al. (2014) |
| Financial Inclusion | 1.4 | Creditworthiness of the unbanked | Mobile Phone Data | Kumar and Mohta (2012) |
| GDP | 8.1 | Economic development | Satellite Data | Elvidge et al. (1997); Sutton and Constanza (2002); Ebener et al. (2005); Chen and Nordhaus (2011) |
| GDP | 8.1 | GDP and Human Development | Postal Data | Hristova et al. (2016) |
| Gender Prediction | 5.1 | Gender prediction | Mobile Phone Data | Sundsøy et al. (2015); Blumenstock & Eagle (2010) |
| Identify Deforestation | 15.1 | Forest mapping | Satellite Data | Hansen et al. (2014); Ohmann et al. (2014) |
| Illiteracy Prediction | 4.6 | Areas of low literacy | Mobile Phone Data | Sundsøy, P. (2016) |
| Land use | 11.3 | Land cover/land use changes | Remote Sensing Data | Tso & Mather (2001); Lu & Weng (2007); Thomas et al. (2011) |
| Poverty Mapping | 11.1 | Identifying Poverty | Satellite Data | Jean et al. (2016) |
| Poverty Mapping | 1.1; 1.2 | Identifying the poor | Satellite Data | Elvidge et al. (2009); Jean et al. (2016) |
| Poverty Mapping | 1.1; 1.2 | Urban poverty | Satellite Data | Kohli et al. (2012) |
| Predictive policing | 16.1 | Crime prediction | Mobile Phone Data | Bogomolov et al. (2014) |
| Predictive policing | 16.1 | Support crime prediction | Social Media Data | Gerber (2014) |
| Predictors of poverty | 9.1 | Road access and rural population | Satellite Data | Mena & Malpica (2005); Jean et al. (2016) |
| Price Indexes | 2.c. | Constructing consumer price index | Online Prices | Cavallo and Rigobon (2016) |
| Residential electricity consumption | 7.1 | Determinants of electricity consumption | Smart Meter Data | Kavousian et al. (2013) |
| Socio-economic status and wellbeing | 1.1; 1.2 | Estimating poverty and wealth | Mobile Phone Data | Blumenstock et al. (2015) |
| Socio-economic status and wellbeing | 1.1; 1.2 | Human mobility and socioeconomic levels | Mobile Phone Data | Frias-Martinez et al. (2012) |
| Socio-economic status and wellbeing | 1.1; 1.2 | Socioeconomic status | Mobile Phone Data | Gutierrez et al. (2013) |
| Socio-economic status and wellbeing | 10.1 | Socioeconomic status | Mobile Phone Data | Frias-Martinez et al. (2012); Blumenstock et al. (2015) |
| Tourism | 8.9 | Destination of tourists | Mobile Phone Data | Ahas et al. (2008) |
| Tourism | 8.9 | Seasonal tourism | Mobile Phone Data | Ahas et al. (2007) |
| Transport Planning | 11.2 | Geo-social Radius | Mobile Phone Data | Phithakkitnukoon et al. (2012) |
| Transport Planning | 9.1 | Origin-destination flows | Mobile Phone Data | Calabrese et al. (2011); Samarajiva et al. (2015) |
| Transport Planning | 11.2 | Origin-Destination Flows | Mobile Phone Data | Calabrese et al. (2011); Samarajiva et al. (2015) |
| Transport Planning | 11.2 | Population Hotspots | Mobile Phone Data | Louail et al. (2014) |
| Transport Planning | 9.1 | Real-time traffic monitoring | Sensor data | Shi & Abdel-Aty (2015) |
| Transport Planning | 9.1 | Road usage patterns | Mobile Phone Data | Toole et al. (2014) |
| Transport Planning | 11.2 | Social events and home locations | Mobile Phone Data | Calabrese et al. (2010) |
| Transport Planning | 9.1 | Traffic monitoring | GPS data | Google Traffic |
| Unemployment | 8.5 | Unemployment trends | Search Engine Data | Xu et al. (2013) |
| Urban Poverty | 11.1 | Identifying slums | Satellite Data | Kohli et al. (2012) |

References

Ahas, R., Aasa, A., Mark, Ü., Pae, T., & Kull, A. (2007). Seasonal tourism spaces in Estonia: Case study with mobile positioning data. Tourism Management, 28(3), 898-910.

Ahas, R., Aasa, A., Roose, A., Mark, Ü., & Silm, S. (2008). Evaluating passive mobile positioning data for tourism surveys: An Estonian case study. Tourism Management, 29(3), 469-486.

Anderson, C. (2008). The end of theory: the data deluge makes the scientific method obsolete. Wired Magazine, 16(7).

Balcan, D., Colizza, V., Gonçalves, B., Hu, H., Ramasco, J. J., & Vespignani, A. (2009). Multiscale mobility networks and the spatial spreading of infectious diseases. Proceedings of the National Academy of Sciences, 106(51), 21484-21489.

Bengtsson, L., Gaudart, J., Lu, X., Moore, S., Wetter, E., Sallah, K., ... & Piarroux, R. (2015). Using mobile phone data to predict the spatial spread of cholera. Scientific Reports, 5.

Bengtsson, L., Lu, X., Thorson, A., Garfield, R., & Von Schreeb, J. (2011). Improved response to disasters and outbreaks by tracking population movements with mobile phone network data: a post-earthquake geospatial study in Haiti. PLoS Med, 8(8), e1001083.

Berhan, G., Tadesse, T., Atnafu, S., & Hill, S. (2011). Drought Monitoring in Food-Insecure Areas of Ethiopia by Using Satellite Technologies. In Experiences of Climate Change Adaptation in Africa (pp. 183-200). Springer Berlin Heidelberg.

Blumenstock, J. E. (2016). Fighting poverty with data. Science, 353(6301), 753-754.

Blumenstock, J., & Eagle, N. (2010, December). Mobile divides: gender, socioeconomic status, and mobile phone use in Rwanda. In Proceedings of the 4th ACM/IEEE International Conference on Information and Communication Technologies and Development (p. 6). ACM.

Blumenstock, J., Cadamuro, G., & On, R. (2015). Predicting poverty and wealth from mobile phone metadata. Science, 350(6264), 1073-1076.

Bogomolov, A., Lepri, B., Staiano, J., Letouzé, E., Oliver, N., Pianesi, F., & Pentland, A. (2015). Moves on the street: Classifying crime hotspots using aggregated anonymized data on people dynamics. Big Data, 3(3), 148-158.

Bogomolov, A., Lepri, B., Staiano, J., Oliver, N., Pianesi, F., & Pentland, A. (2014, November). Once upon a crime: towards crime prediction from demographics and mobile data. In Proceedings of the 16th international conference on multimodal interaction (pp. 427-434). ACM.

Boyd, D., & Crawford, K. (2012). Critical Questions for Big Data. Information, Communication & Society, 15(5), 662–679. doi:10.1080/1369118X.2012.678878

Brdar, S., Gavrić, K., Ćulibrk, D., & Crnojević, V. (2016). Unveiling Spatial Epidemiology of HIV with Mobile Phone Data. Scientific Reports, 6.

Brooker, A., Fraser, R. H., Olthof, I., Kokelj, S. V., & Lacelle, D. (2014). Mapping the activity and evolution of retrogressive thaw slumps by tasselled cap trend analysis of a Landsat satellite image stack. Permafrost and Periglacial Processes, 25(4), 243-256.

Calabrese, F., Di Lorenzo, G., Liu, L., & Ratti, C. (2011). Estimating Origin-Destination flows using opportunistically collected mobile phone location data from one million users in Boston Metropolitan Area. IEEE Pervasive Computing, 99.

Calabrese, F., Pereira, F. C., Di Lorenzo, G., Liu, L., & Ratti, C. (2010, May). The geography of taste: analyzing cell-phone mobility and social events. In International Conference on Pervasive Computing (pp. 22-37). Springer Berlin Heidelberg.

Cambridge Big Data. (n.d.). Retrieved from http://www.bigdata.cam.ac.uk/research

Cavallo, A. (2013). Online and official price indexes: Measuring Argentina's inflation. Journal of Monetary Economics, 60(2), 152-165.

Cavallo, A., & Rigobon, R. (2016). The Billion Prices Project: Using online prices for measurement and research. The Journal of Economic Perspectives, 30(2), 151-178.

Chen, H., Chiang, R. H., & Storey, V. C. (2012). Business intelligence and analytics: From big data to big impact. MIS Quarterly, 36(4), 1165-1188.

Chen, X., & Nordhaus, W. D. (2011). Using luminosity data as a proxy for economic statistics. Proceedings of the National Academy of Sciences, 108(21), 8589-8594.

Coombes, P. J., & Barry, M. E. (2014). A systems framework of big data driving policy making – Melbourne's water future. In OzWater2014 Conference. Australian Water Association. Brisbane, Australia.

Cosner, C., Beier, J. C., Cantrell, R. S., Impoinvil, D., Kapitanski, L., Potts, M. D., ... & Ruan, S. (2009). The effects of human movement on the persistence of vector-borne diseases. Journal of Theoretical Biology, 258(4), 550-560.

Costăchioiu, T., & Datcu, M. (2010, June). Land cover dynamics classification using multi-temporal spectral indices from satellite image time series. In Communications (COMM), 2010 8th International Conference on (pp. 157-160). IEEE.

D4D. (n.d.). Orange Data for Development Challenge in Senegal. Retrieved from http://d4d.orange.com/content/download/43330/405662/version/3/file/D4Dchallenge_leaflet_A4_V2Eweblite.pdf

Data2X. (n.d.). Big Data and Gender Data Gaps. Retrieved from http://data2x.org/wp-content/uploads/2014/08/Big-Data-Projects.pdf

DataKind. (n.d.). Retrieved from http://www.datakind.org/

David, T. (2013). Big Data from Cheap Phones. Technology Review, 116(3), 50–54.

Decuyper, A., Rutherford, A., Wadhwa, A., Bauer, J. M., Krings, G., Gutierrez, T., ... & Luengo-Oroz, M. A. (2014). Estimating food consumption and poverty indices with mobile phone data. arXiv preprint arXiv:1412.2595.

De Montjoye, Y.-A., Hidalgo, C. A., Verleysen, M., & Blondel, V. D. (2013). Unique in the Crowd: The privacy bounds of human mobility. Scientific Reports, 3, 1376. doi:10.1038/srep01376

Doll, C. H., Muller, J. P., & Elvidge, C. D. (2000). Night-time imagery as a tool for global mapping of socioeconomic parameters and greenhouse gas emissions. AMBIO: a Journal of the Human Environment, 29(3), 157-162.

Doll, C. N., & Pachauri, S. (2010). Estimating rural populations without access to electricity in developing countries through night-time light satellite imagery. Energy Policy, 38(10), 5661-5670.

Eagle, N., Macy, M., & Claxton, R. (2010). Network diversity and economic development. Science, 328(5981), 1029-1031.

Ebener, S.,Murray, C., Tandon, A., & Elvidge, C. C. (2005). Fromwealth to health:modelling thedistribution of income per capita at the sub-national level using night-time light imagery.internationalJournalofhealthgeographics,4(1),1.

Elvidge, C.D., Baugh, K. E., Kihn, E.A., Kroehl,H.W.,Davis, E. R.,&Davis, C.W. (1997). Relationbetween satellite observed visible-near infrared emissions, population, economic activity andelectricpowerconsumption.InternationalJournalofRemoteSensing,18(6),1373-1379.

Elvidge,C.D.,Safran,J.,Tuttle,B.,Sutton,P.,Cinzano,P.,Pettit,D.,...&Small,C.(2007).Potentialforglobalmappingofdevelopmentviaanightsatmission.GeoJournal,69(1-2),45-53.

Elvidge,C.D.,Sutton,P.C.,Ghosh,T.,Tuttle,B.T.,Baugh,K.E.,Bhaduri,B.,&Bright,E. (2009).Aglobalpovertymapderivedfromsatellitedata.Computers&Geosciences,35(8),1652-1660.

Eurostat. (2014). Feasibility Study on the Use of Mobile Positioning Data for Tourism Statistics. Retrieved from http://ec.europa.eu/eurostat/documents/747990/6225717/MP-Consolidated-report.pdf

FAO. (n.d.). The Global Partnership on Sustainable Development Data. Working together towards the Global Sustainable Development Goals. Retrieved from http://aims.fao.org/activity/blog/global-partnership-sustainable-development-data-working-together-towards-global

Feyisa, G. L., Meilby, H., Fensholt, R., & Proud, S. R. (2014). Automated Water Extraction Index: A new technique for surface water mapping using Landsat imagery. Remote Sensing of Environment, 140, 23-35.

Finger, F., Genolet, T., Mari, L., de Magny, G. C., Manga, N. M., Rinaldo, A., & Bertuzzo, E. (2016). Mobile phone data highlights the role of mass gatherings in the spreading of cholera outbreaks. Proceedings of the National Academy of Sciences, 201522305.

Flowminder Foundation. (n.d.). The First Application of Mobile Operator Data in Low- or Middle-Income Countries: Modelling Malaria Parasite Movements Between Zanzibar and the Tanzanian Mainland. Retrieved from http://www.flowminder.org/case-studies/the-first-application-of-mobile-operator-data-in-low-or-middle-income-countries-modelling-malaria-parasite-movements-between-zanzibar-and-the-tanzanian-mainland

Flowminder Foundation. (n.d.). Haiti Cholera Outbreak 2010. Retrieved from http://www.flowminder.org/case-studies/haiti-cholera-outbreak-2010

Fortuna, C. (2016, December 19). 5 Vertical Integration Opportunities For Smart Water & Smart Cities. Retrieved from https://cleantechnica.com/2016/12/19/five-vertical-integration-opportunities-smart-water-smart-cities/

Frias-Martinez, V., & Virseda, J. (2012, March). On the relationship between socio-economic factors and cell phone usage. In Proceedings of the fifth international conference on information and communication technologies and development (pp. 76-84). ACM.

Frias-Martinez, V., Virseda-Jerez, J., & Frias-Martinez, E. (2012). On the relation between socio-economic status and physical mobility. Information Technology for Development, 18(2), 91-106.

Gerber, M. S. (2014). Predicting crime using Twitter and kernel density estimation. Decision Support Systems, 61, 115-125.

Google Earth Engine. (n.d.). Case Studies. Retrieved from https://earthengine.google.com/case_studies/

Gutierrez, T., Krings, G., & Blondel, V. D. (2013). Evaluating socio-economic state of a country analyzing airtime credit and mobile phone datasets. arXiv preprint arXiv:1309.4496.

Haas, E. M., Bartholomé, E., & Combal, B. (2009). Time series analysis of optical remote sensing data for the mapping of temporary surface water bodies in sub-Saharan western Africa. Journal of Hydrology, 370(1), 52-63.

Hansen, M. C., Potapov, P. V., Moore, R., Hancher, M., Turubanova, S. A., Tyukavina, A., ... & Kommareddy, A. (2013). High-resolution global maps of 21st-century forest cover change. Science, 342(6160), 850-853.

Hansen, M., Potapov, P., Moore, R., & Hancher, M. (2013, November 14). The first detailed maps of global forest change. Retrieved from https://research.googleblog.com/2013/11/the-first-detailed-maps-of-global.html

Hayano, R. S., & Adachi, R. (2013). Estimation of the total population moving into and out of the 20 km evacuation zone during the Fukushima NPP accident as calculated using "Auto-GPS" mobile phone data. Proceedings of the Japan Academy. Series B, Physical and Biological Sciences, 89(5), 196.

Heerschap, N., Ortega, S., Priem, A., & Offermans, M. (2014, May). Innovation of tourism statistics through the use of new big data sources. In 12th Global Forum on Tourism Statistics, Prague, CZ. http://tsf2014prague.cz/assets/downloads/Paper(Vol.201).

Henricksen, B. L., & Durkin, J. W. (1986). Growing period and drought early warning in Africa using satellite data. International Journal of Remote Sensing, 7(11), 1583-1608.

Hess, D., & Savitz, J. (n.d.). Global Fishing Watch. Retrieved from http://oceana.org/sites/default/files/global_fishing_watch_report_final.pdf

Ho, A. D., Reich, J., Nesterko, S. O., Seaton, D. T., Mullaney, T., Waldo, J., & Chuang, I. (2014). HarvardX and MITx: The first year of open online courses, fall 2012-summer 2013 (HarvardX and MITx Working Paper No. 1).

Hristova, D., Rutherford, A., Anson, J., Luengo-Oroz, M., & Mascolo, C. (2016). The International Postal Network and Other Global Flows as Proxies for National Wellbeing. PloS one, 11(6), e0155976.

Humnet Lab. (n.d.). Retrieved from http://humnetlab.mit.edu/wordpress/news/

Hutchinson, J. M. S., Jacquin, A., Hutchinson, S. L., & Verbesselt, J. (2015). Monitoring vegetation change and dynamics on US Army training lands using satellite image time series analysis. Journal of Environmental Management, 150, 355-366.

Ingram, D. G., Matthews, C. K., & Plante, D. T. (2015). Seasonal trends in sleep-disordered breathing: evidence from Internet search engine query data. Sleep and Breathing, 19(1), 79-84.

ITU. (n.d.). Goal 14, Oceans. Retrieved from http://www.itu.int/en/sustainable-world/Pages/goal14.aspx

ITU. (2006). Security in Telecommunications and Information Technology: An overview of issues and the deployment of existing ITU-T Recommendations for secure telecommunications. Retrieved from http://www.itu.int/dms_pub/itu-t/opb/hdb/T-HDB-SEC.03-2006-PDF-E.pdf

Jabari, S., & Zhang, Y. (2014, March). Building detection in very high resolution satellite image using HIS model. In Proceedings of ASPRS 2014 Annual Conference, Louisville, Kentucky.

Jahani, E., Sundsoy, P. R., Bjelland, J., Iqbal, A., Pentland, A., & de Montjoye, Y. A. (2015). Predicting gender from mobile phone metadata. In NetMob Conference in Cambridge, MA.

Janecek, A., Valerio, D., Hummel, K. A., Ricciato, F., & Hlavacs, H. (2015). The Cellular Network as a Sensor: From Mobile Phone Data to Real-Time Road Traffic Monitoring. IEEE Transactions on Intelligent Transportation Systems, 16(5), 2551-2572.

Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790-794.

Ji, Z., Lipton, Z. C., & Elkan, C. (2014). Differential Privacy and Machine Learning: a Survey and Review, 1–30. Learning; Cryptography and Security; Databases. Retrieved from http://arxiv.org/abs/1412.7584

Jiang, S., Ferreira Jr, J., & González, M. C. (2015). Activity-based human mobility patterns inferred from mobile phone data: A case study of Singapore. In Int. Workshop on Urban Computing.

Kavousian, A., Rajagopal, R., & Fischer, M. (2013). Determinants of residential electricity consumption: Using smart meter data to examine the effect of climate, building characteristics, appliance stock, and occupants' behavior. Energy, 55, 184-194.

Kearns, J. (2015, July 08). Satellite Images Show Economies Growing and Shrinking in Real Time. Retrieved from http://www.bloomberg.com/news/features/2015-07-08/satellite-images-show-economies-growing-and-shrinking-in-real-time

Kogan, F., Adamenko, T., & Kulbida, M. (2011). Satellite-based crop production monitoring in Ukraine and regional food security. In Use of Satellite and In-Situ Data to Improve Sustainability (pp. 99-104). Springer Netherlands.

Kohli, D., Sliuzas, R., Kerle, N., & Stein, A. (2012). An ontology of slums for image-based classification. Computers, Environment and Urban Systems, 36(2), 154-163.

Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior, 2–5. doi:10.1073/pnas.1218772110

Kreindler, G., & Miyauchi, Y. (2015). Commuting and Productivity: Quantifying Urban Economic Activity using Cell Phone Data. LIRNEasia.

Krsinich, F. (2015). Implementation of consumer electronics scanner data in the New Zealand CPI.

KT Corp. (2016, June 27). KT CEO Hwang Chang-Gyu Delivers Speech at UN Global Compact: "Join hands with the UN to prevent the spread of global diseases". Retrieved from http://www.prnewswire.com/news-releases/kt-ceo-hwang-chang-gyu-delivers-speech-at-un-global-compact-join-hands-with-the-un-to-prevent-the-spread-of-global-diseases-300290512.html

Kumar, K., & Muhota, K. (2012). Can Digital Footprints Lead to Greater Financial Inclusion? (pp. 1–4). Washington DC.

Lange, M., Alpert, P., & David, N. (2013, April). Utilizing Mobile-Phone-Link Data to Improve Rainfall Monitoring over Cyprus. In EGU General Assembly Conference Abstracts (Vol. 15, p. 7743).

Lokanathan, S., & Lucas Gunaratne, R. (2014). Behavioral insights for development from Mobile Network Big Data: enlightening policy makers on the State of the Art.

Lokanathan, S., de Silva, N., Kreindler, G., Miyauchi, Y., & Dhananjaya, D. (2014). Using Mobile Network Big Data for Informing Transportation and Urban Planning in Colombo. Available at SSRN.

Lokanathan, S., Kreindler, G. E., de Silva, N. N., Miyauchi, Y., Dhananjaya, D., & Samarajiva, R. (2016). The Potential of Mobile Network Big Data as a Tool in Colombo's Transportation and Urban Planning. Information Technologies & International Development, 12(2), pp-63.

Louail, T., Lenormand, M., Cantú, O. G., Picornell, M., Herranz, R., Frias-Martinez, E., ... & Barthelemy, M. (2014). From mobile phone data to the spatial structure of cities. arXiv preprint arXiv:1401.4540.

Louail, T., Lenormand, M., Ros, O. G., Picornell, M., Herranz, R., Frias-Martinez, E., ... & Barthelemy, M. (2014, June 13). From mobile phone data to the spatial structure of cities. Retrieved from http://www.nature.com/articles/srep05276

Lu, D., & Weng, Q. (2007). A survey of image classification methods and techniques for improving classification performance. International Journal of Remote Sensing, 28(5), 823–870.

Lu, D., Hetrick, S., Moran, E., & Li, G. (2012). Application of time series Landsat images to examining land-use/land-cover dynamic change. Photogrammetric Engineering and Remote Sensing, 78(7), 747.

Lu, S., Wu, B., Yan, N., & Wang, H. (2011). Water body mapping method with HJ-1A/B satellite imagery. International Journal of Applied Earth Observation and Geoinformation, 13(3), 428-434.

Lu, X., Bengtsson, L., & Holme, P. (2012). Predictability of population displacement after the 2010 Haiti earthquake. Proceedings of the National Academy of Sciences, 109(29), 11576-11581.

Lu, X., Wrathall, D. J., Sundsøy, P. R., Nadiruzzaman, M., Wetter, E., Iqbal, A., ... & Bengtsson, L. (2016). Unveiling hidden migration and mobility patterns in climate stressed regions: A longitudinal study of six million anonymous mobile phone users in Bangladesh. Global Environmental Change, 38, 1-7.

Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt.

Madhawa, K., Lokanathan, S., Maldeniya, D., & Samarajiva, R. (2015). Using mobile network big data for land use classification.

Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big data: The next frontier for innovation, competition, and productivity. Retrieved from http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation

McAfee, A., & Brynjolfsson, E. (2012). Big data: the management revolution. Harvard Business Review, 90(10), 60–6, 68, 128. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/23074865

McCauley, D. (2016, April 13). How Satellites and Big Data can Help to save the ocean. Retrieved from http://e360.yale.edu/features/how_satellites_and_big_data_can_help_to_save_the_oceans

Mena, J. B., & Malpica, J. A. (2005). An automatic method for road extraction in rural and semi-urban areas starting from high resolution satellite imagery. Pattern Recognition Letters, 26(9), 1201-1220.

Mueller, N., Lewis, A., Roberts, D., Ring, S., Melrose, R., Sixsmith, J., ... & Ip, A. (2016). Water observations from space: Mapping surface water from 25 years of Landsat imagery across Australia. Remote Sensing of Environment, 174, 341-352.

Mullainathan, S. (2013). What Big Data Means For Social Science. HeadCon '13 Part I. Retrieved from http://edge.org/panel/headcon-13-part-i

Narayanan, A., & Shmatikov, V. (2008). Robust De-anonymization of Large Sparse Datasets. In 2008 IEEE Symposium on Security and Privacy (sp 2008) (pp. 111–125). IEEE. doi:10.1109/SP.2008.33

Necula, E. (2015). Analyzing Traffic Patterns on Street Segments Based on GPS Data Using R. Transportation Research Procedia, 10, 276-285.

OECD. (n.d.). Retrieved from https://stats.oecd.org/glossary/detail.asp?ID=4344

Office for National Statistics. (n.d.). Retrieved from http://www.ons.gov.uk/aboutus/whatwedo/programmesandprojects/theonsbigdataproject

Ohmann, J. L., Gregory, M. J., & Roberts, H. M. (2014). Scale considerations for integrating forest inventory plot data and satellite image data for regional forest mapping. Remote Sensing of Environment, 151, 3-15.

Ohm, P. (2010). Broken promises of privacy: Responding to the surprising failure of anonymization. UCLA Law Review, 57(6).

OCHA. (2016a). Building data responsibility into humanitarian action. OCHA Policy and Studies Series, (18). United Nations Office for the Coordination of Humanitarian Affairs. Retrieved from https://docs.unocha.org/sites/dms/Documents/TB18_DataResponsibility_Online.pdf

OCHA. (2016b). A Human Rights-based Approach to Data: Leaving No One Behind in the 2030 Development Agenda. United Nations Office for the Coordination of Humanitarian Affairs. Retrieved from http://www.ohchr.org/Documents/Issues/HRIndicators/GuidanceNoteonApproachtoData.pdf

Pekel, J. F., Cottam, A., Gorelick, N., & Belward, A. S. (2016). High-resolution mapping of global surface water and its long-term changes. Nature.

Pekel, J. F., Vancutsem, C., Bastin, L., Clerici, M., Vanbogaert, E., Bartholomé, E., & Defourny, P. (2014). A near real-time water surface detection method based on HSV transformation of MODIS multi-spectral time series data. Remote Sensing of Environment, 140, 704-716.

Peters, A. J., Walter-Shea, E. A., Ji, L., Vina, A., Hayes, M., & Svoboda, M. D. (2002). Drought monitoring with NDVI-based standardized vegetation index. Photogrammetric Engineering and Remote Sensing, 68(1), 71-75.

Phithakkitnukoon, S., Smoreda, Z., & Olivier, P. (2012). Socio-geography of human mobility: A study using longitudinal mobile phone data. PloS one, 7(6), e39253.

Pinkovskiy, M., & Sala-i-Martin, X. (2014). Lights, Camera,... Income!: Estimating Poverty Using National Accounts, Survey Means, and Lights (No. w19831). National Bureau of Economic Research.

Plante, D. T., & Ingram, D. G. (2015). Seasonal trends in tinnitus symptomatology: evidence from Internet search engine query data. European Archives of Oto-Rhino-Laryngology, 272(10), 2807-2813.

Potter, C. (2014). Ten years of forest cover change in the Sierra Nevada detected using Landsat satellite image analysis. International Journal of Remote Sensing, 35(20), 7136-7153. (http://www.tandfonline.com/doi/full/10.1080/01431161.2014.968687)

Prasad, A. K., Chai, L., Singh, R. P., & Kafatos, M. (2006). Crop yield estimation model for Iowa using remote sensing and surface parameters. International Journal of Applied Earth Observation and Geoinformation, 8(1), 26-33.

PredPol. (2014, March 07). PredPol Partners LAPD-Foothill Records Day Without Crime! Retrieved from http://www.predpol.com/predpol-partners-lapd-foothill-records-day-without-crime/

PredPol. (n.d.). About Us | Predictive Policing. Retrieved from http://www.predpol.com/about/

Rama Rao, N., Kapoor, M., Sharma, N., & Venkateswarlu, K. (2007). Yield prediction and waterlogging assessment for tea plantation land using satellite image-based techniques. International Journal of Remote Sensing, 28(7), 1561-1576. (http://www.tandfonline.com/doi/full/10.1080/01431160600904980)

Real Impact Analytics. (n.d.). Retrieved from https://realimpactanalytics.com/en/data-for-good

Rhee, J., Im, J., & Carbone, G. J. (2010). Monitoring agricultural drought for arid and humid regions using multi-sensor remote sensing data. Remote Sensing of Environment, 114(12), 2875-2887.

Rincon, P. (2016, August 18). Satellite images used to predict poverty. Retrieved from http://www.bbc.com/news/science-environment-37122748

Roberts, P., Shyam, K. C., & Rastogi, C. (2006). Rural Access Index: A Key Development Indicator. Transport Papers TP-10. The World Bank Group, Washington, DC.

Rokni, K., Ahmad, A., Selamat, A., & Hazini, S. (2014). Water feature extraction and change detection using multitemporal Landsat imagery. Remote Sensing, 6(5), 4173-4189.

Rosoff, M. (2014, November 24). Here's how dominant Google is in Europe. Retrieved September 25, 2016, from http://www.businessinsider.com/heres-how-dominant-google-is-in-europe-2014-11

Ruktanonchai, N. W., DeLeenheer, P., Tatem, A. J., Alegana, V. A., Caughlin, T. T., zu Erbach-Schoenberg, E., ... & Smith, D. L. (2016). Identifying malaria transmission foci for elimination using human mobility data. PLoS Comput Biol, 12(4), e1004846.

Samarajiva, R., & Lokanathan, S. (2013). Using behavioral big data for public purposes: Exploring frontier issues of an emerging policy arena.

Samarajiva, R., Lokanathan, S., Madhawa, K., Kreindler, G., & Maldeniya, D. (2015). Big Data to Improve Urban Planning. Economic & Political Weekly, 50(22), 43.

Sannier, C., Gilliams, S., Ham, F., & Fillol, E. (2015). Use of Satellite Image Derived Products for Early Warning and Monitoring of the Impact of Drought on Food Security in Africa. In Time-Sensitive Remote Sensing (pp. 183-198). Springer New York.

Sawaya, K. E., Olmanson, L. G., Heinert, N. J., Brezonik, P. L., & Bauer, M. E. (2003). Extending satellite remote sensing to local scales: land and water resource monitoring using high-resolution imagery. Remote Sensing of Environment, 88(1), 144-156.

Šćepanović, S., Mishkovski, I., Hui, P., Nurminen, J. K., & Ylä-Jääski, A. (2015). Mobile Phone Call Data as a Regional Socio-Economic Proxy Indicator. PLoS ONE, 10(4), e0124160. doi:10.1371/journal.pone.0124160

Schuster, N. M., Rogers, M. A., & McMahon Jr, L. F. (2010). Using search engine query data to track pharmaceutical utilization: a study of statins. The American Journal of Managed Care, 16(8), e215.

Serajuddin, U. (2015, April 30). Much of the world is deprived of poverty data. Let's fix this. Retrieved from http://blogs.worldbank.org/developmenttalk/much-world-deprived-poverty-data-let-s-fix

Serajuddin, U., Uematsu, H., Wieser, C., Yoshida, N., & Dabalen, A. (2015). Data deprivation: another deprivation to end. World Bank Policy Research Working Paper, (7252).

Shi, Q., & Abdel-Aty, M. (2015). Big data applications in real-time traffic operation and safety monitoring and improvement on urban expressways. Transportation Research Part C: Emerging Technologies, 58, 380-394. http://www.sciencedirect.com/science/article/pii/S0968090X15000777

Shiabasaki Lab. (n.d.). Retrieved from http://shiba.iis.u-tokyo.ac.jp/home_en/overview/

Shiabasaki Lab. (n.d.). Retrieved from http://shiba.iis.u-tokyo.ac.jp/home_en/research/

Steele, J. E., Sundsøy, P. R., Pezzulo, C., Alegana, V. A., Bird, T. J., Blumenstock, J., ... & Hadiuzzaman, K. N. (2017). Mapping poverty using mobile phone and satellite data. Journal of The Royal Society Interface, 14(127), 20160690.

Stucke, M. E., & Grunes, A. P. (2016). Big data and competition policy. Oxford: Oxford University Press.

Sundsøy, P. (2016). Can mobile usage predict illiteracy in a developing country? arXiv preprint arXiv:1607.01337.

Sundsøy, P. R., Bjelland, J., Iqbal, A., D, Pentland, A., Jahani, E., & Demontjoye, Y. (2015, January). A cross country study of gender prediction using mobile phone metadata. Retrieved from https://www.researchgate.net/publication/270960813_A_cross_country_study_of_gender_prediction_using_mobile_phone_metadata

Sutton, P. C., & Costanza, R. (2002). Global estimates of market and non-market values derived from nighttime satellite imagery, land cover, and ecosystem service valuation. Ecological Economics, 41(3), 509-527.

Sutton, P. C., Elvidge, C. D., & Ghosh, T. (2007). Estimation of gross domestic product at sub-national scales using nighttime satellite imagery. International Journal of Ecological Economics & Statistics, 8(S07), 5-21.

Tatem, A. J., Huang, Z., Narib, C., Kumar, U., Kandula, D., Pindolia, D. K., ... & Lourenço, C. (2014). Integrating rapid risk mapping and mobile phone call record data for strategic malaria elimination planning. Malaria Journal, 13(1), 1.

Tatem, A. J., Qiu, Y., Smith, D. L., Sabot, O., Ali, A. S., & Moonen, B. (2009). The use of mobile phone data for the estimation of the travel patterns and imported Plasmodium falciparum rates among Zanzibar residents. Malaria Journal, 8(1), 1.

Telefonica. (2014, November 27). Big Data for Social Good: Opportunities and Challenges. Retrieved from https://www.telefonica.com/en/web/public-policy/blog/article/-/blogs/big-data-for-social-good-opportunities-and-challenges/

Telefonica. (n.d.). Retrieved from http://dynamicinsights.telefonica.com/smart-steps/our-sectors/for-transport/

Telenor. (2016, March 08). How to use Big Data for Social Good. Retrieved from https://www.telenor.com/media/articles/2016/how-to-use-big-data-for-social-good/

Telenor. (2016, March 09). Big Demand for Big Data: New Telenor study on Dengue Fever in Pakistan. Retrieved from http://www.gsma.com/mobilefordevelopment/programme/digital-identity/big-demand-for-big-data-new-telenor-study-on-dengue-fever-in-pakistan

Think-Asia. (n.d.). Retrieved from https://think-asia.org/bitstream/handle/11540/5146/Proceedings%20of%20the%20Expert%20Meeting%20on%20Crop%20Monitoring%20for%20Food%20Security.pdf?sequence=1#page=147

Thomas, N. E., Huang, C., Goward, S. N., Powell, S., Rishmawi, K., Schleeweis, K., & Hinds, A. (2011). Validation of North American forest disturbance dynamics derived from Landsat time series stacks. Remote Sensing of Environment, 115, 19–32.

Thompson, K., & Kadiyala, R. (2014). Leveraging Big Data to Improve Water System Operations. Procedia Engineering, 89, 467-472.

Tompkins, A. M., & McCreesh, N. (2016). Migration statistics relevant for malaria transmission in Senegal derived from mobile phone data and used in an agent-based migration model. Geospatial Health, 11(1s).

Toole, J. L., Colak, S., Alhasoun, F., Evsukoff, A., & Gonzalez, M. C. (2014). The path most travelled: mining road usage patterns from massive call data. arXiv preprint arXiv:1403.0636.

Toole, J. L., Lin, Y. R., Muehlegger, E., Shoag, D., González, M. C., & Lazer, D. (2015). Tracking employment shocks using mobile phone data. Journal of The Royal Society Interface, 12(107), 20150185.

Townsend, A. C., & Bruce, D. A. (2010). The use of night-time lights satellite imagery as a measure of Australia's regional electricity consumption and population distribution. International Journal of Remote Sensing, 31(16), 4459-4480.

Transport & ICT. (2016). Measuring Rural Access: Using New Technologies. Washington DC: World Bank. License: Creative Commons Attribution CC BY 3.0.

Tso, B., & Mather, P. M. (2001). Classification Methods for Remotely Sensed Data. London: Taylor & Francis. p. 332.

Tucker, C. J., & Choudhury, B. J. (1987). Satellite remote sensing of drought conditions. Remote Sensing of Environment, 23(2), 243-251.

UN Global Pulse. (n.d.). Retrieved from http://www.unglobalpulse.org/blog/big-data-development-action-global-pulse-project-series

UN Global Pulse. (2014). Mining Indonesian Tweets to understand food price crises. Jakarta: UN Global Pulse.

UN Global Pulse. (2015). Using mobile phone data and airtime credit purchases to estimate food security. Retrieved from www.unglobalpulse.org/sites/default/files/UNGP_ProjectSeries_Airtimecredit_Food_2015.pdf

UN Global Pulse. (n.d.). A look at the gender distribution of tweets about global development.Retrieved from http://www.unglobalpulse.org/news/look-gender-distribution-post-2015-related-twitter-discussions

UN Global Pulse. (n.d.). Retrieved from http://www.unglobalpulse.org/about-new

UN Global Pulse. (n.d.). Retrieved from http://www.unglobalpulse.org/pulse-labs

Unganai, L. S., & Kogan, F. N. (1998). Drought monitoring and corn yield estimation in Southern Africa from AVHRR data. Remote Sensing of Environment, 63(3), 219-232.

United Nations Statistics Division. (2016, August 16). Business Model for Global Platform for Big Data for Official Statistics in support of the 2030 Agenda for Sustainable Development. Retrieved from https://unstats.un.org/unsd/bigdata/conferences/2016/gwg/Item%203%20-%20Business%20Model%20for%20Global%20Platform%20for%20Big%20Data%20-%2016%20August%202016.pdf

United Nations Statistics Division. (n.d.). Big Data for Official Statistics. Retrieved from http://unstats.un.org/unsd/bigdata/

United Nations Statistics Division. (n.d.). Retrieved from http://unstats.un.org/bigdata/inventory

United Nations Statistics Division. (n.d.). Retrieved from http://unstats.un.org/bigdata/inventory/?selectID=WB54

United Nations Statistics Division. (n.d.). Retrieved from http://unstats.un.org/unsd/bigdata/conferences/2016/presentations/day%202/Grant%20Cameron.pdf

United Nations Statistics Division. (n.d.). International Conference on Big Data for Official Statistics. Retrieved from http://unstats.un.org/unsd/bigdata/conferences/2016/default.asp

United Nations Statistics Division. (n.d.). Using Big Data for the Sustainable Development Goals - UN GWG for Big Data. Retrieved from http://unstats.un.org/bigdata/taskteams/sdgs/

United Nations Statistics Division. (n.d.). Expert Group Meeting on data disaggregation - SDG Indicators. Retrieved from http://unstats.un.org/sdgs/meetings/egm-data-dissaggregation

United Nations. (n.d.). UN projects world population to reach 8.5 billion by 2030, driven by growth in developing countries. Retrieved from http://www.un.org/sustainabledevelopment/blog/2015/07/un-projects-world-population-to-reach-8-5-billion-by-2030-driven-by-growth-in-developing-countries/

University of Columbia. (2016, February 22). Working with Facebook to Create Better Population Maps. Retrieved from http://blogs.ei.columbia.edu/2016/02/22/working-with-facebook-to-create-better-population-maps/

Unsalan, C., & Boyer, K. L. (2011). Multispectral Satellite Image Understanding: From Land Classification to Building and Road Detection. Springer Science & Business Media.

Vaitla, B. (2014). The Landscape of Big Data for Development.

Varian, H. R. (2013). Big Data: New Tricks for Econometrics.

Wan, Z., Wang, P., & Li, X. (2004). Using MODIS land surface temperature and normalized difference vegetation index products for monitoring drought in the southern Great Plains, USA. International Journal of Remote Sensing, 25(1), 61-72.

Wang, L., & Qu, J. J. (2007). NMDI: A normalized multi-band drought index for monitoring soil and vegetation moisture with satellite remote sensing. Geophysical Research Letters, 34(20).

Wästfelt, A., Tegenu, T., Nielsen, M. M., & Malmberg, B. (2012). Qualitative satellite image analysis: Mapping spatial distribution of farming types in Ethiopia. Applied Geography, 32(2), 465-476.

Wesolowski, A., Buckee, C. O., Bengtsson, L., Wetter, E., Lu, X., & Tatem, A. J. (2014). Commentary: Containing the Ebola outbreak – the potential and challenge of mobile network data. PLOS Currents Outbreaks.

Wesolowski, A., Metcalf, C. J. E., Eagle, N., Kombich, J., Grenfell, B. T., Bjørnstad, O. N., ... & Buckee, C. O. (2015). Quantifying seasonal population fluxes driving rubella transmission dynamics using mobile phone data. Proceedings of the National Academy of Sciences, 112(35), 11114-11119.

Wesolowski, A., Qureshi, T., Boni, M. F., Sundsøy, P. R., Johansson, M. A., Rasheed, S. B., ... & Buckee, C. O. (2015). Impact of human mobility on the emergence of dengue epidemics in Pakistan. Proceedings of the National Academy of Sciences, 112(38), 11887-11892.

Wicaksono, P., Danoedoro, P., Hartono, H., Nehren, U., & Ribbe, L. (2011, October). Preliminary work of mangrove ecosystem carbon stock mapping in small island using remote sensing: above and below ground carbon stock mapping on medium resolution satellite image. In SPIE Remote Sensing (pp. 81741B-81741B). International Society for Optics and Photonics.

Williams, S. (2016). Statistical uses for mobile phone data: literature review. Retrieved from https://www.ons.gov.uk/file?uri=/methodology/methodologicalpublications/generalmethodology/onsworkingpaperseries/literaturereviewonstatisticalusesofmobilephonedatafinal.pdf

Wilson, R., zu Erbach-Schoenberg, E., Albert, M., Power, D., Tudge, S., Gonzalez, M., ... & Pitonakova, L. (2016). Rapid and near real-time assessments of population displacement using mobile phone data following disasters: the 2015 Nepal earthquake. PLoS Currents, 8.

WorldPop. (n.d.). Retrieved from http://www.worldpop.org.uk/about_our_work/about_worldpop/

Xu, W., Han, Z. W., & Ma, J. (2010, July). A neural network based approach to detect influenza epidemics using search engine query data. In 2010 International Conference on Machine Learning and Cybernetics (Vol. 3, pp. 1408-1412). IEEE.

Xu, W., Li, Z., Cheng, C., & Zheng, T. (2013). Data mining for unemployment rate prediction using search engine query data. Service Oriented Computing and Applications, 7(1), 33-42.

Xu, W., Zheng, T., & Li, Z. (2011, October). A neural network based forecasting method for the unemployment rate prediction using the search engine query data. In e-Business Engineering (ICEBE), 2011 IEEE 8th International Conference on (pp. 9-15). IEEE.

Yang, A. C., Huang, N. E., Peng, C.-K., & Tsai, S.-J. (2010). Do Seasons Have an Influence on the Incidence of Depression? The Use of an Internet Search Engine Query Data as a Proxy of Human Affect. PLoS ONE, 5(10), e13728. doi:10.1371/journal.pone.0013728

Yuan, L., Zhang, J., Shi, Y., Nie, C., Wei, L., & Wang, J. (2014). Damage mapping of powdery mildew in winter wheat with high-resolution satellite image. Remote Sensing, 6(5), 3611-3623.

Zhang, R., Su, H., Tian, J., Chen, S., Zhan, J., Deng, X., ... & Wu, J. (2008, July). Drought monitoring in northern China based on remote sensing data and land surface modeling. In IGARSS 2008-2008 IEEE International Geoscience and Remote Sensing Symposium (Vol. 3, pp. III-860). IEEE.
