Information Quality: A Framework for Evaluating Empirical Studies
TRANSCRIPT
A Framework for Evaluating Empirical Studies
InfoQ
SRITNE Knowledge Seminar, ISB, Feb 2016
Galit Shmueli 徐茉莉
Analytics
Humanity
Responsibility
eBay dataset with millions of auctions
• All camera auctions in 2008
• All camera auctions in 2016
• Auctions for all items in 1/2016
• Sample from all items in 2016
How much will you pay?
Or maybe Craigslist data?
“statisticians working in a research environment… may well have to explain that the data are inadequate to answer a particular question.”
Statistics: A Very Short Introduction (Hand 2008)
Pre-data, post-data, post-analysis
What is the potential of a dataset to generate knowledge?
g: Analysis goal
Domain Goal: What, why, when, where, how
→ Analysis Goal: Explain, predict, describe; Enumerative, analytic; Exploratory, confirmatory
Quality of Goal Specification
• “error of the third kind” - giving the right answer to the wrong question - Kimball
• “Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise” - Tukey
X: Available data
Data Source
• Primary, secondary
• Observational, experiment
• Single, multiple sources
• Collection instrument, protocol
Data Type
• Continuous, categorical, mix
• Structured, un-, semi-structured
• Cross-sectional, time series, panel, network, geographical
Data Quality U(X|g)
• “Zeroth Problem - How do the data relate to the problem, and what other data might be relevant?” - Mallows
• MIS/Database - usefulness of queried data to person querying it.
• Quality of Statistical Data (IMF, OECD) - usefulness of summary statistics for a particular goal (7 dimensions)
Data Size and Dimension
• # observations
• # variables
f: Data analysis method
Analysis Quality
• “poor models and poor analysis techniques, or even analyzing the data in a totally incorrect way.” - Godfrey
• Analyst expertise
• Software availability
• The focus of statistics/econometrics/DM education
Statistical models and methods
• Parametric, semi-, non-parametric
• Classic, Bayesian
Econometric models
Data mining algorithms
Graphical methods
Operations research methods
U: Utility measure
Quality of Utility Measure
• Adequate metric from analysis standpoint (R², holdout data)
• Adequate metric from domain standpoint
Domain goal → Analysis goal
• Predictive accuracy, lift
• Goodness-of-fit
• Statistical power, statistical significance
• Strength-of-fit
• Expected costs, gains
• Bias reduction, bias-variance tradeoff
Analysis utility → Domain utility
InfoQ(f, X, g) = U(f(X|g))
Depends on quality of g, X, f, U and relationship between them
The potential of a particular dataset to achieve a particular goal using a given empirical analysis method
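The definition InfoQ(f, X, g) = U(f(X|g)) can be read as a function composition. The sketch below is only an illustration under assumptions that are not in the slides: the toy data `X`, the least-squares line used for `f`, the goal labels "predict" and "describe", and negative mean absolute error as `U` are all hypothetical choices.

```python
from statistics import mean

def fit_line(points):
    """Ordinary least-squares line through (x, y) pairs; returns (slope, intercept)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def analyze(X, g):
    """f(X | g): the same method is applied differently depending on the goal g.

    Assumption for this sketch: a predictive goal is evaluated on a held-out
    last observation, any other goal is evaluated in-sample.
    """
    if g == "predict":
        train, test = X[:-1], X[-1:]
    else:
        train, test = X, X
    slope, intercept = fit_line(train)
    return [(y, intercept + slope * x) for x, y in test]

def utility(results):
    """U: negative mean absolute error, so that larger values are better."""
    return -mean([abs(y - y_hat) for y, y_hat in results])

def info_q(f, X, g, U):
    """InfoQ(f, X, g) = U(f(X | g))."""
    return U(f(X, g))

# Hypothetical toy data: (x, y) pairs.
X = [(1, 2.0), (2, 4.1), (3, 5.9), (4, 8.2)]
predict_score = info_q(analyze, X, "predict", utility)
describe_score = info_q(analyze, X, "describe", utility)
```

The same data and method yield different scores for the two goals, which is the point of conditioning on g: InfoQ is a property of the combination (f, X, g, U), not of the dataset alone.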
Statistical Approaches for Increasing InfoQ
Study Design (Pre-Data)
• DOE
• Clinical trials
• Survey sampling
• Computer experiments
Post-Data-Collection
• Data cleaning and preprocessing
• Re-weighting, bias adjustment
• Meta analysis
Randomization, Stratification, Blinding, Placebo, Blocking, Replication, Sampling frame, Link data collection protocol with appropriate design
Recovering “real data” vs. “cleaning for the goal”
Handling missing values, outlier detection, re-weighting, combining results
Assessing InfoQ
“Quality of Statistical Data” (Eurostat, OECD, NCSES, …)
• Relevance
• Accuracy
• Timeliness and punctuality
• Accessibility
• Interpretability
• Coherence
• Credibility
InfoQ dimensions
1. Data resolution
2. Data structure
3. Data integration
4. Temporal relevance
5. Chronology of data and goal
6. Generalizability
7. Construct operationalization
8. Communication
3 V’s of Big Data
• Volume
• Variety
• Velocity
Marketing Research
• Recency
• Accuracy
• Availability
• Relevance
#1 Data Resolution
Measurement scale and aggregation level
Dutch Flower Auctions
Van Heck, Ketter, Gupta, Koppius, Lu, Kambil
How many flowers? How many auctions? Bidder level?
#2 Data Structure
Data Types
• Time series, cross-sectional, panel
• Geographic, spatial, network
• Text, audio, video, semantic
• Structured, semi-, non-structured
• Discrete, continuous
Data Characteristics
Corrupted and missing values due to study design or data collection mechanism
Social TV: Real-Time Media Response to TV Advertising, Hill, Nalavade & Benton, KDD 2012
Prediction in Economic Networks, Dhar, Geva, Oestreicher-Singer & Sundararajan, ISR 2012
Consumer Surplus in Online Auctions, Bapna, Jank & Shmueli, ISR 2008
#3 Data Integration
Utility of Linkage
Dangers: Privacy
Increase or decrease InfoQ?
Music Blogging, Online Sampling, and the Long Tail, Dewan & Ramprasad, ISR 2013
Song radio play data from Nielsen SoundScan + music blog sampling data from The Hype Machine
#4 Temporal Relevance
Analysis Timeliness (solving the right problem too late)
Collection Timeliness (relevance to g)
[Timeline figure: Data Collection, Data Analysis, Study Deployment along times t1 to t6, with a "forecast" label]
g: Prospective vs. retrospective; longitudinal vs. snapshot
Nature of X, complexity of f
#5 Chronology of Data & Goal
g1: Explain price
g2: Forecast price
Retrospective/prospective
Ex-post availability
Endogeneity
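The ex-post availability point can be made concrete with a small sketch. Everything named here is a hypothetical illustration, not taken from the slides: the variable names, the two availability labels, and the rule that a forecasting goal may only use variables known when the auction opens, while an explanatory goal may use everything observed by the time it closes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variable:
    name: str
    available_at: str  # "auction_start" or "auction_end" (hypothetical labels)

def usable_inputs(variables, goal):
    """Return the names of variables whose timing is compatible with the goal.

    "forecast": the price must be predicted before the auction closes, so only
                variables known at the start are allowed.
    "explain":  the analysis is retrospective, so all variables are allowed.
    """
    if goal == "forecast":
        return [v.name for v in variables if v.available_at == "auction_start"]
    if goal == "explain":
        return [v.name for v in variables]
    raise ValueError(f"unknown goal: {goal!r}")

# Hypothetical auction variables.
variables = [
    Variable("opening_price", "auction_start"),
    Variable("seller_rating", "auction_start"),
    Variable("number_of_bids", "auction_end"),
]

forecast_inputs = usable_inputs(variables, "forecast")
explain_inputs = usable_inputs(variables, "explain")
```

Under these assumptions, a variable such as the final number of bids is usable for g1 (explain price) but not for g2 (forecast price), because it is not yet known when the forecast is needed.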
#6 Generalizability
Statistical generalizability
Scientific generalizability
Definition of g
Choice of X, f, U
The Hidden Cost of Accommodating Crowdfunder Privacy Preferences: A Randomized Field Experiment, Burtch, Ghose, Wattal
“We found that our treatment had a large, highly negative effect on hiding behavior… (β = -0.279, p < 0.001)”
“It is possible that our results would not extend to a [offline] purchase context, where issues of social capital, reputation, etc. might be less pronounced”
#7a Construct Operationalization
χ: construct
X = θ(χ): operationalization (measurable)
• Causal explanation vs. prediction, description
• Theory vs. data
• Data: Questionnaire, physical measurement
Online Seller Reputation
• Total # ratings
• {# pos, # neg} ratings
• # recent negative ratings
• Text content of feedback
The Digitization of Word-of-Mouth: Promise and Challenges of Online Reputation Systems, Dellarocas, Management Science
#7b Action Operationalization
Derive concrete actions from the information provided by a study
The Management Insights editor will write a Management Insights paragraph for every paper that is accepted for publication.
One Way Mirrors in Online Dating: A Randomized Experiment, Bapna, Ramprasad, Shmueli & Umyarov
Online Reputation Systems: How to Design One That Does What You Need, Dellarocas
Social TV: Real-Time Media Response to TV Advertising, Hill, Nalavade & Benton
#8 Communication
Visual, written, verbal presentations & reports
Knowledge must reach the right person at the right time
• Mentoring
• Manuscript reviewing
• Data made available to others
• EDA and shared visualization dashboards
• Seminars + conferences!
“In the last three years, there has been a concerted effort by those in Washington to reduce government spending and reign in the national debt.
One reason for the budget cuts?
Research by two Harvard economists, Ken Rogoff and Carmen Reinhart. The pair found that when a country owes more than 90 percent of their GDP, it slides into recession.”
…Fixing this Excel error transforms high-debt countries from recession to growth
www.marketplace.org/topics/economy/excel-mistake-heard-round-world
a brief conversation about marriage equality with a canvasser who revealed that he or she was gay had a big, lasting effect on the voters’ views, as measured by separate online surveys administered before and after the conversation.
May 2015: Independent researchers fail to replicate; noted statistical irregularities in LaCour’s data (baseline outcome data statistically indistinguishable from a national survey; over-time changes indistinguishable from perfect normally distributed noise)
High or Low InfoQ?
Assessing InfoQ in Practice
Rating-based assessment
1-5 scale on each dimension:
$\text{InfoQ Score} = [d_1(Y_1) \times d_2(Y_2) \times \cdots \times d_8(Y_8)]^{1/8}$
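The rating-based score above is the geometric mean of eight desirability values, one per InfoQ dimension. The sketch below is a minimal illustration; the slide does not say how a 1-5 rating Y_i is turned into a desirability d_i(Y_i), so the linear mapping (rating - 1) / 4 onto [0, 1] used here is an assumption, and the example ratings are hypothetical.

```python
import math

def desirability(rating: int) -> float:
    """Map a 1-5 rating to [0, 1]. Assumed linear mapping: 1 -> 0.0, 5 -> 1.0."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    return (rating - 1) / 4

def infoq_score(ratings):
    """Geometric mean of the desirabilities of the eight dimension ratings."""
    if len(ratings) != 8:
        raise ValueError("expected one rating per InfoQ dimension (8 in total)")
    d = [desirability(r) for r in ratings]
    return math.prod(d) ** (1 / len(d))

# Hypothetical ratings for dimensions 1 (data resolution) through 8 (communication).
example_ratings = [4, 5, 3, 4, 4, 3, 5, 4]
example_score = infoq_score(example_ratings)
```

Because the score is a product, a single dimension rated 1 drives the whole score to zero under this mapping: a study cannot compensate for one failed dimension with strength in the others.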
Experience from two Research Methods courses
Shmueli and Kenett (2013), “An Information Quality (InfoQ) Framework for Ex-Ante and Ex-Post Evaluation of Empirical Studies”, Proceedings of IDAM 2013, Springer-Verlag.
Assessing Research Proposals: Ex-Ante (Prospective) InfoQ
• 2009 Research methods workshop for PhD students
• Help students develop and present research proposal
• 50 students from OB, OR, marketing, economics
• Each student evaluates his/her proposal on the 8 InfoQ dimensions. Will proposal likely lead to the intended goal?
• Students’ grades derived from an InfoQ score of their proposal submission (presentation + written)
InfoQ Integration (Prospective)
Assessing Research Proposals: Ex-Post (Retrospective) InfoQ
Professional master's degree program that emphasizes statistical practice, methods, data analysis and practical workplace skills. The MSP is for students who are interested in professional careers in business, industry, government, or scientific research.
Masters Program in Statistical Practice
Assessing Research Proposals: Ex-Post (Retrospective) InfoQ
In 2012: 16 students
Students instructed to: (60-90 min)
1. Read short description of InfoQ and its dimensions
2. Read reports on five studies
3. Describe goal, data, analysis, utility for each study
4. Evaluate the 8 dimensions for each study
Predicting days with unhealthy air quality in Washington DC
Predicting Changes in Quarterly Corporate Earnings Using Economic Indicators
Quality-of-care factors in U.S. nursing homes
Predicting ZILLOW.com’s Zestimate accuracy
goo.gl/erNPF
Predicting First Day Returns for Japanese IPOs
Feedback:
InfoQ approach helped “sort out all of the information”
Several reported that they will adopt this evaluation approach for future studies
InfoQ: Strengths and Challenges
InfoQ approach streamlines questioning of data value
• “Why should we invest in data?” - management
• Compare value of potential datasets, analyses
• Prioritize/rank projects
• Strengthen functional-analytical relationship
• Evaluation study: InfoQ useful for developing an analysis plan (prospective) and evaluating an empirical study (retrospective)
To Do:
• Improve InfoQ assessment (heterogeneity across different raters)
• Alternative InfoQ assessment approaches (pilot study, EDA, other)
• Further dimensions (data privacy, ethics/moral, novelty)
• Effect of technological advances on InfoQ