Information Quality: A Framework for Evaluating Empirical Studies

36
A Framework for Evaluating Empirical Studies InfoQ SRITNE KNOWLEDGE SEMINAR, ISB, Feb 2016 Galit Shmueli 徐茉莉 Analytics Humanity Responsibility

Upload: galit-shmueli

Post on 14-Feb-2017



eBay dataset with millions of auctions
• All camera auctions in 2008
• All camera auctions in 2016
• Auctions for all items in 1/2016
• Sample from all items in 2016

How much will you pay?

Or maybe Craigslist data?

“statisticians working in a research environment … may well have to explain that the data are inadequate to answer a particular question.”

Statistics: A Very Short Introduction (Hand 2008)

Pre-data, post-data, post-analysis

What is the potential of a dataset to generate knowledge?

g: Analysis goal
X: Available data
f: Data analysis method
U: Utility measure

g: Analysis goal

Domain Goal
What, why, when, where, how

→ Analysis Goal
Explain, predict, describe
Enumerative, analytic
Exploratory, confirmatory

Quality of Goal Specification
• “error of the third kind” - giving the right answer to the wrong question - Kimball
• “Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise” - Tukey

X: Available data

Data Source
• Primary, secondary
• Observational, experiment
• Single, multiple sources
• Collection instrument, protocol

Data Type
• Continuous, categorical, mix
• Structured, un-, semi-structured
• Cross-sectional, time series, panel, network, geographical

Data Quality U(X|g)
• “Zeroth Problem - How do the data relate to the problem, and what other data might be relevant?” - Mallows
• MIS/Database - usefulness of queried data to person querying it.
• Quality of Statistical Data (IMF, OECD) - usefulness of summary statistics for a particular goal (7 dimensions)

Data Size and Dimension
• # observations
• # variables

f: Data analysis method

Analysis Quality
• “poor models and poor analysis techniques, or even analyzing the data in a totally incorrect way.” - Godfrey
• Analyst expertise
• Software availability
• The focus of statistics/econometrics/DM education

Statistical models and methods
• Parametric, semi-, non-parametric
• Classic, Bayesian

Econometric models

Data mining algorithms

Graphical methods

Operations research methods

U: Utility measure

Quality of Utility Measure
• Adequate metric from analysis standpoint (R², holdout data)
• Adequate metric from domain standpoint

Domain goal → Analysis goal

• Predictive accuracy, lift
• Goodness-of-fit
• Statistical power, statistical significance
• Strength-of-fit
• Expected costs, gains
• Bias reduction, bias-variance tradeoff

Analysis utility → Domain utility

InfoQ(f, X, g) = U(f(X|g))

Depends on quality of g, X, f, U and relationship between them

The potential of a particular dataset to achieve a particular goal using a given empirical analysis method
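
A minimal sketch of how the definition composes, assuming a toy setting that is not from the slides: the price list, the `mean_forecaster` method, the `neg_mae` utility, and the `holdout` field of the goal are all hypothetical names chosen for illustration. It only shows that InfoQ is the utility U applied to the output of method f on data X, conditional on goal g.

```python
def info_q(f, X, g, U):
    """InfoQ(f, X, g) = U(f(X | g)): utility of method f applied to data X given goal g."""
    return U(f(X, g))


def mean_forecaster(X, g):
    """Hypothetical f: forecast every holdout value with the training mean."""
    n_holdout = g["holdout"]
    train, holdout = X[:-n_holdout], X[-n_holdout:]
    forecast = sum(train) / len(train)
    # Return (forecast, actual) pairs so the utility measure can score them.
    return [(forecast, actual) for actual in holdout]


def neg_mae(pairs):
    """Hypothetical U: negative mean absolute error (higher is better)."""
    return -sum(abs(pred - actual) for pred, actual in pairs) / len(pairs)


# Hypothetical X and g: six auction prices, goal is to forecast the last two.
prices = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
goal = {"task": "forecast price", "holdout": 2}

score = info_q(mean_forecaster, prices, goal, neg_mae)
print(score)
```

Changing any one of the four components (a different goal, dataset, method, or utility) changes the resulting value, which is the dependence the slide points to.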


Statistical Approaches for Increasing InfoQ

Study Design (Pre-Data)
• DOE
• Clinical trials
• Survey sampling
• Computer experiments

Randomization, Stratification, Blinding, Placebo, Blocking, Replication, Sampling frame, Link data collection protocol with appropriate design

Post-Data-Collection
• Data cleaning and preprocessing
• Re-weighting, bias adjustment
• Meta analysis

Recovering “real data” vs. “cleaning for the goal”
Handling missing values, outlier detection, re-weighting, combining results

Assessing InfoQ

“Quality of Statistical Data” (Eurostat, OECD, NCSES, …)
• Relevance
• Accuracy
• Timeliness and punctuality
• Accessibility
• Interpretability
• Coherence
• Credibility

InfoQ dimensions
1. Data resolution
2. Data structure
3. Data integration
4. Temporal relevance
5. Chronology of data and goal
6. Generalizability
7. Construct operationalization
8. Communication

3 V’s of Big Data
• Volume
• Variety
• Velocity

Marketing Research
• Recency
• Accuracy
• Availability
• Relevance

#1 Data Resolution
Measurement scale and aggregation level

Dutch Flower Auctions
Van Heck, Ketter, Gupta, Koppius, Lu, Kambil

How many flowers? How many auctions? Bidder level?

#2 Data Structure

Data Types
• Time series, cross-sectional, panel
• Geographic, spatial, network
• Text, audio, video, semantic
• Structured, semi-, non-structured
• Discrete, continuous

Data Characteristics
Corrupted and missing values due to study design or data collection mechanism

Social TV: Real-Time Media Response to TV Advertising
Hill, Nalavade & Benton, KDD 2012

Prediction in Economic Networks
Dhar, Geva, Oestreicher-Singer & Sundararajan, ISR 2012

Consumer Surplus in Online Auctions
Bapna, Jank & Shmueli, ISR 2008

#3 Data Integration

Utility of Linkage
Dangers: Privacy
Increase or decrease InfoQ?

Music Blogging, Online Sampling, and the Long Tail
Dewan & Ramprasad, ISR 2013

Song radio play data from Nielsen SoundScan + music blog sampling data from The Hype Machine

#4 Temporal Relevance

[Timeline figure: time points t1 t2 t3 t4 t5 t6; phases Data Collection, Data Analysis, Study Deployment; label "forecast"]

Collection Timeliness (relevance to g)

Analysis Timeliness (solving the right problem too late)

g: Prospective vs. retrospective; longitudinal vs. snapshot
Nature of X, complexity of f

#5 Chronology of Data & Goal
g1: Explain price
g2: Forecast price

Retrospective/prospective
Ex-post availability
Endogeneity

#6 Generalizability

Statistical generalizability

Scientific generalizability

Definition of g
Choice of X, f, U

The Hidden Cost of Accommodating Crowdfunder Privacy Preferences: A Randomized Field Experiment
Burtch, Ghose, Wattal

“We found that our treatment had a large, highly negative effect on hiding behavior … (β = -0.279, p < 0.001)”

“It is possible that our results would not extend to a [offline] purchase context, where issues of social capital, reputation, etc. might be less pronounced”

#7a Construct Operationalization
χ: construct
X = θ(χ): operationalization (measurable)

• Causal explanation vs. prediction, description
• Theory vs. data
• Data: Questionnaire, physical measurement

Online Seller Reputation

Total # ratings
{# pos, # neg} ratings
# recent negative ratings
Text content of feedback

The Digitization of Word-of-Mouth: Promise and Challenges of Online Reputation Systems
Dellarocas, Management Science

#7b Action Operationalization

Derive concrete actions from the information provided by a study

The Management Insights editor will write a Management Insights paragraph for every paper that is accepted for publication.

One Way Mirrors in Online Dating: A Randomized Experiment
Bapna, Ramprasad, Shmueli & Umyarov

Online Reputation Systems: How to Design One That Does What You Need
Dellarocas

Social TV: Real-Time Media Response to TV Advertising
Hill, Nalavade & Benton

#8 Communication
Visual, written, verbal presentations & reports

Knowledge must reach the right person at the right time

• Mentoring
• Manuscript reviewing
• Data made available to others
• EDA and shared visualization dashboards
• Seminars + conferences!

“In the last three years, there has been a concerted effort by those in Washington to reduce government spending and reign in the national debt.

One reason for the budget cuts?

Research by two Harvard economists, Ken Rogoff and Carmen Reinhart. The pair found that when a country owes more than 90 percent of their GDP, it slides into recession.”

… Fixing this Excel error transforms high-debt countries from recession to growth

www.marketplace.org/topics/economy/excel-mistake-heard-round-world

a brief conversation about marriage equality with a canvasser who revealed that he or she was gay had a big, lasting effect on the voters’ views, as measured by separate online surveys administered before and after the conversation.


May 2015: Independent researchers fail to replicate; noted statistical irregularities in LaCour’s data (baseline outcome data statistically indistinguishable from a national survey; over-time changes indistinguishable from perfect normally distributed noise)

High or Low InfoQ?

Assessing InfoQ in Practice

Rating-based assessment
1-5 scale on each dimension:

InfoQ Score = $[d_1(Y_1) \times d_2(Y_2) \times \cdots \times d_8(Y_8)]^{1/8}$
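
The score above is a geometric mean of eight desirability values, one per InfoQ dimension. The sketch below shows one way to compute it, under an assumption the slide does not state: each 1-5 rating Y is mapped linearly to a desirability d(Y) = (Y - 1) / 4 in [0, 1]. The function names are hypothetical.

```python
from math import prod

N_DIMENSIONS = 8  # the eight InfoQ dimensions listed in the deck


def desirability(rating: int) -> float:
    """Assumed linear mapping from a 1-5 rating to a desirability in [0, 1]."""
    if not 1 <= rating <= 5:
        raise ValueError("ratings must be on the 1-5 scale")
    return (rating - 1) / 4


def infoq_score(ratings: list[int]) -> float:
    """Geometric mean of the desirabilities of the eight dimension ratings."""
    if len(ratings) != N_DIMENSIONS:
        raise ValueError("expected one rating per InfoQ dimension (8 in total)")
    return prod(desirability(r) for r in ratings) ** (1 / N_DIMENSIONS)


# Hypothetical ratings for dimensions 1-8 of a single study.
example = infoq_score([4, 5, 3, 4, 5, 3, 4, 4])
print(round(example, 3))
```

Because the mean is geometric, a rating of 1 on any single dimension gives a desirability of 0 under this mapping and drives the whole score to 0; an arithmetic mean would not behave that way.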

Experience from two Research Methods courses

Shmueli and Kenett (2013), “An Information Quality (InfoQ) Framework for Ex-Ante and Ex-Post Evaluation of Empirical Studies”, Proceedings of IDAM 2013, Springer-Verlag.

Assessing Research Proposals: Ex-Ante (Prospective) InfoQ
• 2009 Research methods workshop for PhD students
• Help students develop and present research proposal
• 50 students from OB, OR, marketing, economics
• Each student evaluates his/her proposal on the 8 InfoQ dimensions. Will proposal likely lead to the intended goal?
• Students’ grades derived from an InfoQ score of their proposal submission (presentation + written)

InfoQ Integration (Prospective)

Goal: Make students’ research journey more efficient and more effective

Assessing Research Proposals: Ex-Post (Retrospective) InfoQ

Masters Program in Statistical Practice

Professional masters degree program that emphasizes statistical practice, methods, data analysis and practical workplace skills. The MSP is for students who are interested in professional careers in business, industry, government, or scientific research.

In 2012: 16 students
Students instructed to (60-90 min):
1. Read short description of InfoQ and its dimensions
2. Read reports on five studies
3. Describe goal, data, analysis, utility for each study
4. Evaluate the 8 dimensions for each study

Predicting days with unhealthy air quality in Washington DC

Predicting Changes in Quarterly Corporate Earnings Using Economic Indicators

Quality-of-care factors in U.S. nursing homes

Predicting ZILLOW.com’s Zestimate accuracy

goo.gl/erNPF

Predicting First Day Returns for Japanese IPOs

Feedback:

InfoQ approach helped “sort out all of the information”

Several reported that they will adopt this evaluation approach for future studies

More with InfoQ: Guidelines for Journal Article Writing and Reviewing

InfoQ: Strengths and Challenges

InfoQ approach streamlines questioning of data value
• “Why should we invest in data?” - management
• Compare value of potential datasets, analyses
• Prioritize/rank projects
• Strengthen functional-analytical relationship
• Evaluation study: InfoQ useful for developing an analysis plan (prospective) and evaluating an empirical study (retrospective)

To Do:
• Improve InfoQ assessment (heterogeneity across different raters)
• Alternative InfoQ assessment approaches (pilot study, EDA, other)
• Further dimensions (data privacy, ethics/moral, novelty)
• Effect of technological advances on InfoQ

With discussion

Forthcoming (2016)