A scientific framework to measure results of research investments
Julia Lane, American Institutes for Research, University of Strasbourg, and University of Melbourne
And many colleagues
Key ideas
• Need a sensible scientific framework which:
– Is theoretically driven
– Uses an appropriate unit of analysis
– Is generalizable and replicable
• Need a sensible empirical framework which:
– Uses 21st-century technology to collect data
– Uses 21st-century technology to link activities
• Need a framework which can be international
Outline
• Motivation
• Conceptual Framework
• Empirical Frameworks
• Next steps
Motivation
The President recently asked his Cabinet to carry out an aggressive management agenda for his second term that delivers a smarter, more innovative, and more accountable government for citizens. An important component of that effort is strengthening agencies' abilities to continually improve program performance by applying existing evidence about what works, generating new knowledge, and using experimentation and innovation to test new approaches to program delivery.
Motivation
How much should a nation spend on science? What kind of science? How much from private versus public sectors? Does demand for funding by potential science performers imply a shortage of funding or a surfeit of performers? … A new “science of science policy” is emerging, and it may offer more compelling guidance for policy decisions and for more credible advocacy.
We spend a lot on research: What’s the impact?
Classic Questions for Measuring Impact
• What is the impact or causal effect of a program on outcome of interest?
• Is a given program effective compared to the absence of the program?
• When a program can be implemented in several ways, which one is the most effective?
Classic Example: Measuring Impact
Illustration of the swan-necked flask experiment used by Louis Pasteur to test the hypothesis of spontaneous generation
Classic Challenge: Theory of Change
Key ideas
• Need a sensible scientific framework which:
– Is theoretically driven (theory of change)
– Uses an appropriate unit of analysis (people)
– Is generalizable and replicable (open)
Outline
• Motivation
• Conceptual Framework
• Empirical Frameworks
• Next steps
The Theory of Change
Classic Challenge: Theory of Change
Writing the Framework Down

(1) Y(1)_it = Y(2)_it·α + X(1)_it·λ + ε_it
(2) Y(2)_it = Z_it·β + X(2)_it·μ + η_it

where the subscripts i and t denote project teams and quarters; ε and η stand for unobserved factors, serendipity, and errors of measurement and specification (and can possibly include unobserved project-team characteristics). The output variables are measured by Y(1) and the research collaboration variables by Y(2). Both are determined by a set of control variables, X(1) and X(2), that can overlap and are truly exogenous or predetermined, and by the variable of key interest, Z (funding).
Source: Jason Owen-Smith
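The two-equation system above can be sketched numerically. This is an illustrative simulation only, not the project's actual code: all data, coefficient values, and sample sizes are invented, and each equation is estimated by plain OLS.

```python
import numpy as np

# Simulated sketch of the framework: eq. (2) determines collaboration Y2
# from funding Z and controls X; eq. (1) determines outputs Y1 from Y2 and X.
rng = np.random.default_rng(0)
n = 500                          # project-team x quarter observations (invented)

Z = rng.normal(size=n)           # funding (key regressor of interest)
X = rng.normal(size=n)           # control variable (shared across equations here)
eta = rng.normal(scale=0.5, size=n)   # eq. (2) disturbance
eps = rng.normal(scale=0.5, size=n)   # eq. (1) disturbance

Y2 = 0.8 * Z + 0.3 * X + eta     # eq. (2): research collaboration
Y1 = 0.6 * Y2 + 0.2 * X + eps    # eq. (1): research outputs

def ols(y, regressors):
    """OLS coefficients with an intercept prepended."""
    A = np.column_stack([np.ones(len(y))] + regressors)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

beta_hat = ols(Y2, [Z, X])       # recovers (const, beta, mu)
alpha_hat = ols(Y1, [Y2, X])     # recovers (const, alpha, lambda)
print(beta_hat, alpha_hat)
```

With simulated data the estimates land close to the generating values (0.8 and 0.6); with real project data, the disturbances in the two equations may be correlated, which is why identification is discussed later in the deck.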
Outline
• Approach: Doing an Evaluation
• Conceptual Framework
• Empirical Framework
• Next steps
STAR METRICS approach
• Level 1: Document the levels and trends in the scientific workforce supported by federal funding.
• Level 2: Develop an open automated data infrastructure and tools that will enable the documentation and analysis of a subset of the inputs, outputs, and outcomes resulting from federal investments in science.
[Diagram: STAR METRICS pilot architecture. Institution-side systems (HR, procurement, subcontracting, and financial systems) record how agency awards, state funding, and endowment funding are disbursed to hire personnel, buy from vendors, and engage contractors on research projects. These records link to outputs (papers, patents, start-ups) and feed direct benefit, intellectual property, innovation, and jobs/purchases/contracts analyses, alongside existing institutional and agency reporting.]
Automated Data Construction
• Most data efforts focus on hand-curated data
• Scalable, low cost/burden: algorithmically link researchers to their support (grants), scientific output (publications and citations), technological products (patents and drug approvals), and impacts (health, economy, productivity)
• Link to linked employee/employer data
• Probabilistic matches
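The "probabilistic matches" bullet can be made concrete with a minimal sketch: score candidate researcher–grant pairs by name similarity and keep the best match above a threshold. Real disambiguation pipelines (such as those referenced later, e.g. Torvik's) use much richer features, including co-authors, affiliations, and topics; the names and threshold below are invented for illustration.

```python
from difflib import SequenceMatcher

def name_score(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized name strings."""
    norm = lambda s: " ".join(s.lower().replace(".", "").split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

# Hypothetical grant PI record and candidate author names from publications.
grant_pi = "Lane, Julia I."
candidates = ["Lane, Julia", "Lane, John", "Laine, Giulia"]

matches = [(c, round(name_score(grant_pi, c), 2)) for c in candidates]
best = max(matches, key=lambda t: t[1])
print(best)  # highest-scoring candidate and its similarity score
```

A production system would treat the score as a match probability (calibrated against labeled pairs) rather than accepting the top candidate outright.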
The Theory of Change
Key ideas
• Need a sensible empirical framework which:
– Uses 21st-century technology to collect data (cybertools and SciELO-like activities)
– Uses 21st-century technology to link activities (disambiguation; ORCID)
Example in practice: Caltech Project
• Funded by the Sloan Foundation
• Goals
– Use STAR METRICS Level 1 data to examine the production of science at the project, PI, and lab level
– Interview Caltech PIs to get qualitative grounding
– Begin to build STAR METRICS Level 2 data linking PEOPLE to results: publications, patents, altmetrics, dissertations, and Census data on student placements, firm start-ups, etc.
– Make source code and database infrastructure available to all STAR METRICS institutions
[Figure: Award funding for one researcher, 2000–2012 — counts of ongoing and new awards per year (0–12).]
[Figure: Lab staffing, 2003–2012 — headcounts (0–120) by role: undergraduates, technicians/staff scientists, research analysts, faculty, post-docs, and graduate students.]
Vendor Expenditures on One Project

Industry                                            Expenditures ($)   Transactions
Other Professional Equipment and Supplies                3,386.36            121
Rail Transportation                                         36.00              1
Scenic and Sightseeing Transportation, Land                896.12              4
Commercial Banking                                       4,616.00              2
Testing Laboratories                                     8,312.92            100
Pharmaceutical Preparation Manufacturing                   629.63             12
Biological Product (except Diagnostic) Manufacturing     2,480.45             37
Electrometallurgical Ferroalloy Product Manufacturing      189.80              8
Electronic Computer Manufacturing                        6,831.41             49
Semiconductor and Related Device Manufacturing           3,672.51             73
Analytical Laboratory Instrument Manufacturing          61,464.87             49
Scheduled Passenger Air Transportation                   5,892.79             19
Passenger Car Rental                                     1,015.28              8
Research and Development in the Physical Sciences        1,654.88             38
Colleges, Universities, and Professional Schools          -110.88              1
[Figure: Publications of the same researcher per year, 2000–2012 (0–12).]
[Figure: PhD theses supervised — number of theses per year (0–6).]
Patents for the same researcher
[Figure: USPTO patents per year, 2000–2012 (0–3.5).]
[Figure: EPO patents per year, 2000–2012 (0–3.5).]
New research: Exploratory regressions
Y (outputs) can be expanded
• Currently Y is just publications, patents, and PhD students
• Census interest suggests we can develop additional economic outcomes:
– Wages and career trajectories for postdocs/graduate students
– Firm start-ups, growth, and productivity
• And substantial competence in the SciSIP community in building out science and social outcomes
VARIABLES                 (1) Pubs   (2) Patents  (3) PhDs    (4) Pubs   (5) Patents  (6) PhDs
Award expenditures            –           –           –        0.057***     0.0018     0.0093**
Labor inputs               0.19***     0.056***    0.10***     0.12***      0.053***   0.089***
Share post-doc             0.43**     -0.071      -0.078       0.23        -0.077     -0.11
Share PhD                  0.072      -0.023       0.27***    -0.14        -0.030      0.23***
Equipment                  0.010       0.00055     0.0029     -0.015       -0.00024   -0.0011
Share computer            -0.36       -0.042      -0.25       -0.41        -0.044     -0.26
Share optics              -0.21        0.68**      0.22        0.016        0.68**     0.26
Seniority                 -0.0098***  -0.00081     0.00014    -0.010***    -0.00083    0.000030
Full Prof.                 0.081       0.027       0.072**     0.054        0.026      0.068**
Share ARRA                 0.94***    -0.018      -0.10        0.71**      -0.026     -0.14
Harvard                   -0.026      -0.041      -0.0024     -0.069       -0.042     -0.0095
MIT                        0.065       0.092      -0.00068     0.051        0.091     -0.0030
Caltech                    0.23**      0.028       0.046       0.21**       0.027      0.043
Physics                    0.26***    -0.047       0.0047      0.22***     -0.048     -0.0017
Chemistry                  0.40***     0.064       0.17**      0.38***      0.063      0.17**
Engineering                0.60***     0.030       0.22***     0.59***      0.030      0.22***
Calendar year dummies      yes         yes         yes         yes          yes        yes
Constant                   0.11       -0.021      -0.16***     0.018       -0.024     -0.17***
Observations               2,590       2,590       2,590       2,590        2,590      2,590
R-squared                  0.321       0.084       0.205       0.365        0.084      0.210
Robust standard errors in parentheses
Use data to estimate production functions at the project level
Note: This is the same approach as that used to derive the widely accepted result that R&D generated more than half of US productivity growth in the 1990s; these data are preliminary and not to be cited.
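A project-level production function of this kind is typically estimated in log-log (Cobb-Douglas) form, so coefficients read as elasticities. Below is a hedged sketch with simulated data: the sample size mirrors the table, but the generating coefficients and all variable values are invented, and the actual analysis uses many more controls.

```python
import numpy as np

# Simulate a project-level panel and recover elasticities of publication
# output with respect to labor inputs and award expenditures (both in logs).
rng = np.random.default_rng(42)
n = 2590                                  # observations, as in the table

log_labor = rng.normal(2.0, 0.5, n)       # log labor inputs (invented scale)
log_expend = rng.normal(11.0, 0.8, n)     # log award expenditures (invented scale)
noise = rng.normal(0.0, 0.3, n)

# "True" elasticities used to generate synthetic log publication output
log_pubs = 0.12 * log_labor + 0.057 * log_expend + noise

A = np.column_stack([np.ones(n), log_labor, log_expend])
coef, *_ = np.linalg.lstsq(A, log_pubs, rcond=None)
print(coef)  # [intercept, labor elasticity, expenditure elasticity]
```

An elasticity of 0.057 on expenditures would say that a 10% increase in award spending is associated with roughly a 0.6% increase in publications, holding labor fixed.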
Next example: CIC Activity
Now building out across multiple universities and frames
Bruce Weinberg, OSU
The CIC
• University of Chicago
• University of Illinois
• Indiana University
• University of Iowa
• University of Maryland
• University of Michigan
• Michigan State University
• University of Minnesota
• University of Nebraska-Lincoln
• Northwestern University
• Ohio State University
• Pennsylvania State University
• Purdue University
• Rutgers University
• University of Wisconsin-Madison
STEM Workforce Training: A Quasi-Experimental Approach Using the Effects of Research Funding
Joint with Bruce Weinberg, Vetle Torvik, Lee Giles, and Chris Morphew
Overview and Goals
• The impact of research environment and funding structures on the training and outcomes of graduate students and postdocs
• Build automated, extensible data infrastructure
• Pilot for the international community
Data Structure
[Diagram: CIC STAR METRICS data (grants/labs/teams; sample) linked to the SED (characteristics, initial outcomes); to web and algorithmic disambiguation sources such as Microsoft Academic (publications, patents, citations, grants); and to the LEHD (employment and wages within the US).]
Econometric Models
[Slide showed econometric models (1)–(3).]
Identification
• Relate outcomes to length of training, team, and funding structure
• ARRA funding as an “experiment” to shift length of training:
– Lightly reviewed grants
– Supplements to existing grants
– Payline extension grants
• Also, presumably, shift teams toward postdocs
• Get returns to time in training under different team and funding structures
[Figure 2. Research design for the payline extension: probability of funding plotted against proposed project “quality,” with the non-ARRA payline and the ARRA-extended payline dividing proposals into those likely funded even without ARRA, those likely funded only under ARRA, and those unlikely to be funded even with ARRA.]
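The payline-extension design can be sketched as a simple classification of proposals by peer-review score relative to the two cutoffs. The scores and payline values below are entirely hypothetical; the point is only to show how the comparison groups are formed (the middle group being the quasi-experimental "treated" one).

```python
# Hypothetical paylines (lower score = better review outcome).
REGULAR_PAYLINE = 15.0   # funded under normal budgets
ARRA_PAYLINE = 22.0      # funded only because ARRA extended the payline

def arra_group(score: float) -> str:
    """Assign a proposal to a comparison group based on its review score."""
    if score <= REGULAR_PAYLINE:
        return "likely funded even without ARRA"
    if score <= ARRA_PAYLINE:
        return "likely funded only under ARRA"   # the 'treated' group
    return "unlikely to be funded even with ARRA"

# Three illustrative proposals, one per region of Figure 2.
groups = [arra_group(s) for s in (10.0, 18.0, 30.0)]
print(groups)
```

Outcomes for trainees on grants in the middle group can then be compared with those just above and just below the cutoffs, in the spirit of a regression discontinuity design.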
Possible Analyses
• Estimate how the training environment affects retention in the US, sector of employment, and wages
• Estimate how flows of trainees to companies affect productivity
• Measure impact on innovation by linking the text of patents to the research done in the labs where people trained
• Open the knowledge-transfer black box and estimate returns to training
What are the results of research (internationally)?
• ASTRA (Australia)
• HELIOS (France)
• CAELIS (Czech Republic)
• NORDSJTERNEN (Norway)
• STELLAR (Germany)
• TRICS (UK)
• SOLES (Spain)
Building new tools
We spend a lot on research: What’s the impact?
Key ideas
• Need a sensible scientific framework which:
– Is theoretically driven (theory of change)
– Uses an appropriate unit of analysis (people)
– Is generalizable and replicable (open)
• Need a sensible empirical framework which:
– Uses 21st-century technology to collect data (cybertools and SciELO-like activities)
– Uses 21st-century technology to link activities (disambiguation; ORCID)
• Need a framework which can be international (develop a community of practice with common interests)
Thank you!
Julia Lane
www.julialane.org
www.cssip.org