Can Traditional R&D Evaluation Methods Be Used for Evaluating High-Risk, High-Reward Research Programs?

Mary Beth Hughes, Ph.D.
IDA Science and Technology Policy Institute

American Evaluation Association Annual Conference
Anaheim, CA
November 4, 2011
Overview
• Introduction to the Problem
• A Typology of HRHR Research Programs
• Applicability of Traditional R&D Evaluation Methods to HRHR Research Programs
  – Example: NIH Director’s Pioneer Award
• Some Concluding Thoughts
A Call for High-Risk, High-Reward Research Programs…
…And a Desire to Understand Program Effectiveness
What are High-Risk, High-Reward Research Programs?
• HRHR research programs are not well defined
• Generally, HRHR research programs aim to support unconventional, innovative, creative, transformative, “outside the box” research that will have a larger impact than status-quo research
• HRHR research programs may differ from traditional R&D programs in terms of: funding amount and duration, mechanism of funding, target recipients, target research fields, selection criteria, and selection/review processes
The Problem
• Given the differences between HRHR programs and traditional R&D programs (and the hypothesized difference between “normal” research and “HRHR” research), it is not clear that applying traditional R&D evaluation methods to HRHR programs is valid.

What do we know about when and how traditional methods can apply to evaluations of HRHR research programs?
Very Few Evaluations of, and Little Research on, U.S. HRHR Programs*
• NIH’s Director’s Pioneer Award – IDA Science & Technology Policy Institute
• NSF’s SGER – SRI
• HHMI vs. R01 – Pierre Azoulay, MIT
• NIH’s Director’s New Innovator Award – IDA Science & Technology Policy Institute
• NSF’s Emerging Frontiers in Research and Innovation – IDA Science & Technology Policy Institute

*Many more studies of ‘creative’ research at the individual-scientist or organizational level – but no commonly agreed-upon indicators of creative research
[Figure: the studies above, grouped into summative and formative studies]
A Typology of HRHR Programs (Heinze, 2008; Hughes and Lal, 2009)
• People Programs
  – Aimed at funding an individual scientist to undertake (almost) any research project; longer-duration funding and higher funding amounts
• Synergy Programs
  – Aimed at moving a field forward through projects based on teams or an inter- or multi-disciplinary approach
• Challenge Programs
  – Aimed at funding projects based on a technological challenge or critical national need; funding may be milestone-based
• Seed Programs
  – Aimed at jump-starting a project (often unconventional); shorter-duration funding and lower funding amounts
“People” Programs Present Greatest Challenge to Translation of Methods
• Synergy Programs
  – Emerging bibliometrics to measure interdisciplinarity
  – Social network analysis to measure collaborations (see the sketch after this list)
• Challenge Programs
  – Specific challenges and milestones identified a priori serve as measures of desired outcomes – did the research meet its milestones, or is it contributing to the outcome?
• Seed Programs
  – Intended to incorporate new projects into traditional funding streams – did projects go on to receive funding?
• People Programs
  – Research projects not well defined; funding used for multiple ideas; many different types of risks possible
  – Longer-term funding means a need for early indicators
  – Concept of failure less clear
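The social network analysis mentioned under Synergy Programs can be made concrete. Below is a minimal sketch, not drawn from any of the evaluations discussed here: it builds a co-authorship network from hypothetical publication records using networkx and computes simple collaboration indicators. All names and records are invented.

```python
# Sketch: co-authorship network analysis for a "Synergy" program.
# Hypothetical data; illustrates the kind of collaboration indicators
# (density, centrality) that social network analysis can provide.
from itertools import combinations
import networkx as nx

publications = [
    {"title": "Paper A", "authors": ["Lee", "Kim", "Patel"]},
    {"title": "Paper B", "authors": ["Lee", "Garcia"]},
    {"title": "Paper C", "authors": ["Kim", "Patel", "Garcia"]},
]

G = nx.Graph()
for pub in publications:
    # Every pair of co-authors gets an edge; weight counts joint papers
    for a, b in combinations(pub["authors"], 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

print("Density:", nx.density(G))  # how interconnected the group is
print("Degree centrality:", nx.degree_centrality(G))
print("Betweenness:", nx.betweenness_centrality(G))
```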
One Example: The NIH Director’s Pioneer Award
• Key Features
  – Managed out of NIH’s Office of the Director
  – 5-page application reviewed externally
  – Interview by external panel
  – Three high-level criteria (Years 2+):
    • Scientific problem to be addressed
    • Investigator
    • Suitability for NDPA mechanism
  – 5 years of funding, $2.5M
  – 51% effort commitment
  – Flexibility in how funds are used
• Evaluation Request
  – NDPA represented several “firsts” for NIH
  – Viewed as an “experiment” in how to fund biomedical and behavioral research
  – IDA/STPI was asked to evaluate short-term outcomes of the first 22 awardees
NDPA Evaluation Used Common R&D Evaluation Methods
1. Bibliometrics
2. Case Studies – descriptive
3. Expert Review
These methods are all ex post, but our understanding of ex ante evaluation of HRHR research (e.g., of proposals) is also weak.

How well does traditional peer review apply to HRHR research?
How successful are various alternatives (shortened applications, interviews, the “sandpit process”, etc.) at identifying HRHR research?

Many opportunities for further study…
1. Bibliometrics for Traditional R&D Evaluations
• Description of Method
  – Some examples of bibliometrics:
    • Traditional: Counts (and variations thereof), Citations (and variants thereof), Content Analysis
    • Emerging: Interdisciplinarity, Burstness, Centrality
• Uses in traditional R&D evaluations
  – Counts used as measures of productivity
  – Citations used as measures of utility and dissemination (a minimal sketch of these traditional indicators follows below)
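For concreteness, here is a minimal sketch of the traditional count- and citation-based indicators listed above, computed over an invented publication list for one awardee. The "highly cited" threshold of 100 is arbitrary; real studies use field-normalized cutoffs.

```python
# Sketch: traditional bibliometric indicators over invented data.
from statistics import mean

pubs = [
    {"year": 2005, "citations": 12},
    {"year": 2006, "citations": 140},
    {"year": 2007, "citations": 3},
]

pub_count = len(pubs)                            # productivity: counts
total_cites = sum(p["citations"] for p in pubs)  # utility/dissemination
mean_cites = mean(p["citations"] for p in pubs)
# A common variant: share of publications above a "highly cited" cutoff
highly_cited_share = sum(p["citations"] >= 100 for p in pubs) / pub_count

print(f"{pub_count} papers, {total_cites} citations "
      f"(mean {mean_cites:.1f}), {highly_cited_share:.0%} highly cited")
```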
Bibliometrics for HRHR Research Evaluation
• Prior Work
  – Many studies proposing relationships between bibliometric-based indicators and creative research outcomes: Productivity (Simonton, 2004); Interdisciplinarity (Heinze, 2007); Brokerage (Burt, 1992); Burstness and Centrality (Chen, 2009)
  – Azoulay (2010) found HHMI researchers had a higher level of productivity both post-award and compared to a control group (and higher levels of highly cited publications)
• NDPA Findings
  – No apparent correlation between measures in the literature and expert review of the research
  – Interpretation unclear – the short-term nature of the evaluation? Small-number statistics?
  – Counts cannot distinguish between a researcher trying something new and failing and an unproductive scientist
• Conclusion for Use of Method for Evaluation of “People” Programs
  – Currently insufficient as a sole method; further work needed to understand HRHR-specific bibliometric indicators (e.g., transformative outcomes may not be readily accepted by scientific tradition (Polanyi, 1966)); one candidate indicator is sketched after this list
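As one example of an HRHR-specific direction, interdisciplinarity indicators of the kind cited above (Heinze, 2007) are often operationalized in the bibliometrics literature as Rao-Stirling diversity over the fields a paper's references fall into. The sketch below uses invented field proportions and distances; it is not the measure used in the NDPA study.

```python
# Sketch: Rao-Stirling diversity as an interdisciplinarity indicator.
# D = sum over field pairs (i, j) of p_i * p_j * d_ij, where p_i is the
# share of a paper's references in field i and d_ij is a field distance.
import numpy as np

# p[i]: share of references in field i (hypothetical)
p = np.array([0.5, 0.3, 0.2])
# d[i][j]: dissimilarity between fields (e.g., 1 - cosine similarity of
# their citation profiles); symmetric with a zero diagonal (hypothetical)
d = np.array([
    [0.0, 0.8, 0.6],
    [0.8, 0.0, 0.4],
    [0.6, 0.4, 0.0],
])

diversity = float(p @ d @ p)  # off-diagonal terms only, since diag(d) = 0
print("Rao-Stirling diversity:", diversity)
```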
Bibliometrics: Initial Comparisons Across HRHR Evaluations
[Charts: NDPA data (Source: STPI) alongside HHMI data (Source: Azoulay, 2010)]

Bibliometrics show potential for cross-evaluation comparisons and synthesis of HRHR evaluations (one such comparison is sketched below)
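One way such a comparison could work, in the spirit of Azoulay (2010), is to contrast pre- and post-award publication rates for awardees against a control group. The numbers below are invented, and the simple difference-in-differences style contrast is illustrative only.

```python
# Sketch: pre/post-award comparison of awardees vs. controls.
from statistics import mean

# Annual publication counts before and after the award (invented)
awardees = [{"pre": 4.0, "post": 6.5}, {"pre": 3.0, "post": 5.0}]
controls = [{"pre": 4.2, "post": 4.8}, {"pre": 3.1, "post": 3.5}]

def mean_change(group):
    """Average within-researcher change in annual output."""
    return mean(r["post"] - r["pre"] for r in group)

# Awardee change minus control change: a difference-in-differences
# style estimate of the award's association with output
effect = mean_change(awardees) - mean_change(controls)
print("Estimated award-associated change in annual output:", effect)
```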
2. Case Studies for Traditional R&D Evaluations
• Description of Method
  – “In-depth investigations into a program, project, facility, or phenomenon, usually to examine what happened, to describe the context in which it happened, to explore how and why, and to consider what would have happened otherwise.” (Yin, 1994)
• Uses in traditional R&D evaluations
  – Typically used in exploratory phases of a program (Ruegg and Feller, 2003)
  – Helpful for understanding key relationships and variables in a complex phenomenon (Shadish, Cook, and Leviton, 1991)
Case Studies for HRHR Research Evaluations
• Prior Work
  – No HRHR research program evaluations relying on case studies; the literature on creative research includes case studies of individual scientists OR the overall work environment, but there is less understanding at the level of groups or projects
• NDPA Findings
  – Case studies allowed for tracking of the research trajectory and gave awardees the opportunity to state what made their research pioneering
  – Great diversity across awardees in terms of use of funding, other funding, research approach, research trajectory, group size, and group composition → no consistent markers of pioneering research
• Conclusion for Use of Method for Evaluation of “People” Programs
  – Most suitable method for understanding HRHR research trajectory, especially in the near term and for small samples
  – But a common framework is needed for what information to collect, to enable better understanding of HRHR research (sociology of science?); one possible structure is sketched after this list
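To make the "common framework" idea concrete, a case-study protocol could be encoded as a structured record so that every case captures the same dimensions noted above. The field names below are illustrative and are not taken from the STPI instrument.

```python
# Sketch: a structured record for case-study data collection.
# Hypothetical schema; ensures consistent dimensions across cases.
from dataclasses import dataclass, field

@dataclass
class CaseStudyRecord:
    awardee_id: str
    funding_uses: list = field(default_factory=list)       # how award was spent
    other_funding: list = field(default_factory=list)      # concurrent grants
    research_approach: str = ""                            # methods / strategy
    trajectory_notes: str = ""                             # shifts in direction
    group_size: int = 0
    group_composition: list = field(default_factory=list)  # roles / disciplines
    self_reported_pioneering: str = ""  # awardee's own account of novelty

record = CaseStudyRecord(
    awardee_id="NDPA-EX-01",
    funding_uses=["new postdoc line", "exploratory pilot experiments"],
    group_size=6,
)
print(record)
```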
3. Expert Review for Traditional R&D Evaluations
• Description of Method
  – Using informed judgments to make assessments
  – Many variants on implementation of review; effects of review-process variants on review outcomes are not well understood
• Uses in traditional R&D evaluations
  – Most widely used method for research evaluations
  – Recognized challenges, but often viewed as the most robust method (Garfield, 2006; NAS, 1999; Nature, 2009)
Expert Review for HRHR Research Evaluations
• Prior Work
  – Experts widely used for evaluating “high rewards” (e.g., Nobel Prize)
  – Little research done on how experts identify HRHR research, although Amabile (1982) suggests a technique for determining creative outcomes
• NDPA Findings
  – Experts in the field of each awardee independently evaluated 3 publications and the case study of the awardee and were asked to determine the level of “pioneeringness”
  – When asked how they made their determinations, most experts included some variant of “you know it when you see it”…but there was still disagreement between experts (consistent with non-HRHR research evaluation findings; one way to quantify such disagreement is sketched after this list)
• Conclusion for Use of Method for Evaluation of “People” Programs
  – Currently necessary for understanding scientific contributions
  – Need better understanding of what experts are looking for in their assessments
  – Need robust data collection to enable cross-evaluation comparisons
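Disagreement between expert reviewers can also be quantified. As a sketch, the code below computes Cohen's kappa over two hypothetical reviewers' "pioneeringness" ratings; the ratings and labels are invented, and this is not an analysis reported in the NDPA evaluation.

```python
# Sketch: Cohen's kappa for inter-rater agreement between two experts.
from collections import Counter

reviewer_a = ["high", "high", "low", "medium", "high", "low"]
reviewer_b = ["high", "medium", "low", "low", "high", "low"]

n = len(reviewer_a)
observed = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n

# Chance agreement from each reviewer's marginal rating frequencies
counts_a, counts_b = Counter(reviewer_a), Counter(reviewer_b)
labels = set(reviewer_a) | set(reviewer_b)
expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)

kappa = (observed - expected) / (1 - expected)
print(f"Observed agreement {observed:.2f}, Cohen's kappa {kappa:.2f}")
```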
Some Concluding Thoughts
• To return to the original question…can these methods be applied to HRHR research programs?
  – Sort of, BUT we need a clearer understanding of the program theory of HRHR programs, need to tailor methods appropriately, and need to use multiple methods
• Need more evaluations of HRHR research programs to enable synthesis of evaluation results
  – Studies are in progress (EURECIA, CREA, STPI, …)
• As evaluators, we need to balance our roles as auditors and researchers and caveat findings appropriately
Thank You!
• Acknowledgments
  – Bhavya Lal
  – Stephanie Shipp
  – Elizabeth Lee
  – Amy Marshall
  – Brian Zuckerman
NDPA FY 2004–2005 Outcome Evaluation report available at: https://commonfund.nih.gov/pdf/Pioneer_Award_Outcome%20Evaluation_FY2004-2005.pdf