Department of Defense

Experimentation Guidebook

Office of the Under Secretary of Defense for Research and Engineering

Prototypes and Experiments

August 2019 (Version 1.0)

Table of Contents

1 Foreword
2 Introduction
3 Purpose and Scope
4 Experimentation Basics
  4.1 Experimentation Fundamentals
  4.2 Types of Experiments
  4.3 Why Experiment in the DoD?
  4.4 Differentiating Experimentation From Prototyping, Testing, and Demonstration
  4.5 Experimentation Methods
  4.6 Cultural Implications for Experimentation
5 Experimentation Activities
  5.1 Formulating Experiments
  5.2 Planning Experiments
  5.3 Soliciting Proposed Solutions for Experiments
  5.4 Selecting Potential Solutions for Experiments
  5.5 Preparing For and Conducting Experiments
  5.6 Data Analysis and Interpretation
  5.7 Results of Experimentation
6 Summary
Appendix 1: Acronyms
Appendix 2: Definitions
Appendix 3: References

Table of Tables

Table 1: Primary Scenario Factors
Table 2: Risks Common to Experimentation
Table 3: Examples of Selection Criteria for Navy's TnTE2 Methodology

1 Foreword

Military Departments and Defense Agencies have used experimentation for decades to support innovation and develop solutions to vexing military problems. Many of these organizations and their subject matter experts (SME) have developed processes, methods, and tools that have helped them succeed in their efforts. The Office of the Under Secretary of Defense for Research & Engineering’s Prototypes and Experiments Office (P&E) was tasked to capture and consolidate these approaches, best practices, and recommendations into a single reference document for the Department of Defense (DoD). This guidebook is designed to complement DoD, Military Service, and Defense Agency policy pertaining to experimentation, providing the reader with discretionary best practices that should be tailored to the circumstances of each experiment. It is a living document and will be updated periodically to ensure that direction captured from governing documents is current and that best practices are fresh.

To draft this guidebook, P&E conducted an extensive literature review, gleaning information from legal, congressional, academic, and regulatory documents and reports. This information was refined through interviews with research and acquisition professionals across the DoD who provided insights into proven experimentation programs and processes, as well as documented best practices and lessons learned from previous defense experimentation efforts. This approach to developing the guidebook resulted in a product with broad applicability to the defense experimentation community.

2 Introduction

United States technological superiority has sustained U.S. military dominance for over 70 years. However, the explosion of technological gains in the defense and commercial industries over the past several decades and their broad availability to nation-states and non-state actors has resulted in a dramatic increase in the technical prowess of U.S. adversaries and the erosion of the U.S. competitive military advantage.2 This erosion is exacerbated by the rate at which U.S. adversaries are making technological advances. Dr. Griffin, the Under Secretary of Defense for Research and Engineering (USD(R&E)), stated that “the pace at which we develop advanced warfighting capability is being eclipsed by those nations that pose the greatest threat to our security.”3

1. James N. Mattis, Secretary of Defense, Summary of the 2018 National Defense Strategy of the United States of America: Sharpening the American Military's Competitive Edge (Washington, DC: Department of Defense, 2018), 10, https://dod.defense.gov/Portals/1/Documents/pubs/2018-National-Defense-Strategy-Summary.pdf.
2. Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 1.
3. Report to Congress: Restructuring the Department of Defense Acquisition, Technology and Logistics Organization and Chief Management Officer Organization, In Response to Section 901 of the National Defense Authorization Act for Fiscal Year 2017 (Public Law 114 - 328) (Washington, DC: Department of Defense, 2014), 3, https://dod.defense.gov/Portals/1/Documents/pubs/Section-901-FY-2017-NDAA-Report.pdf.

“Deliver performance at the speed of relevance.”1

2018 National Defense Strategy

U.S. adversaries are not just embracing advanced technology; they are also studying U.S. strategies and tactics and are rapidly innovating novel applications of new and existing technologies to maximize their effectiveness against those strategies and tactics. This integrated approach to capability development is endorsed by the 2018 National Defense Strategy (NDS): “Success no longer goes to the country that develops a new technology first, but rather to the one that better integrates it and adapts its way of fighting.”4 Unfortunately, U.S. emphasis on this integrated approach to capability development has also atrophied over time.

The NDS’ challenge to “[d]eliver performance at the speed of relevance”5 requires a new (or renewed) approach to capability development—one that uses experimentation, rapid concept exploration, and prototyping to integrate materiel and non-materiel solutions in ways that most effectively address warfighter capability gaps. The NDS explains this further:

“Modernization is not defined solely by hardware; it requires change in the ways we organize and employ forces. We must anticipate the implications of new technologies on the battlefield, rigorously define the military problems anticipated in future conflict, and foster a culture of experimentation and calculated risk-taking. We must anticipate how competitors and adversaries will employ new operational concepts to sharpen our competitive advantages and enhance our lethality.”6

This guidebook explores the topic of DoD experimentation and its role in rapid capability development.

In general terms, experimentation answers the question, "If I do this, what will happen?" Defense experimentation extends that question to the military domain, providing decision makers with the information they need to make decisions. Defense experiments provide opportunities for technologists and warfighters to evaluate potential solutions to existing or emerging warfighter capability gaps and to probe the integration of technology development and concept exploration in order to maximize the synergies that exist. Experimentation also enables rapid evaluation of a military problem, increasing the speed at which knowledge and understanding are gained and decisions can be made. According to the Defense Science Board (DSB), "experimentation fuels the discovery and creation of knowledge and leads to the development and improvement of products, processes, systems, and organizations."7

True experimentation does not exist without the risk that the experiment will not yield the expected or desired results. In fact, experiments that result in the greatest benefits are often accompanied by substantial risk. These high-risk experiments give the Department the greatest opportunity to find transformative solutions to capability gaps and other warfighter needs. Historically, however, DoD's risk-averse acquisition culture contributed to the failure of past experimentation efforts. In 2013, the DSB confirmed that "experimentation in the Department [became] synonymous with scripted demonstrations, testing, and training in an environment and culture that is arguably much more risk-averse today than it was just 20 years ago."8

4. Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 10.
5. Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 10.
6. Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 7.
7. Defense Science Board, The Defense Science Board Report on Technology and Innovation Enablers for Superiority in 2030 (Washington, DC: Department of Defense, 2013), 85, https://www.acq.osd.mil/DSB/reports/2010s/DSB2030.pdf.

The environment in DoD has slowly begun to change in recent years, however, evolving into one that increasingly fosters innovation and risk taking and that promotes exploration through experimentation and prototyping. Congress recognized the need to further encourage this evolution, stating in the Fiscal Year (FY) 2017 National Defense Authorization Act (NDAA) Conference Report that they "expect that the [USD(R&E)] would take risks, press the technology envelope, test and experiment, and have the latitude to fail, as appropriate."9 This type of experimentation—risk-tolerant experimentation—is necessary for DoD, and it is key to restoring the U.S. defense technology overmatch.

3 Purpose and Scope

This guidebook contains overarching guidance on the application of experimentation in DoD. It provides a basic introduction to experimentation and details on specific defense experimentation activities. This guidebook is primarily intended to be used by DoD personnel who plan to use experimentation to explore solutions to existing and emerging military capability problems. It is also intended to be used as an introductory and reference document by staff officers and senior leaders seeking to increase their knowledge of experimentation.

This guidebook is not policy or directive in nature, and it does not supersede DoD, Military Service, or Defense Agency policy pertaining to acquisitions or experimentation. It is not a substitute for Defense Acquisition University (DAU) training, nor does it describe every activity necessary for effective experimentation.

4 Experimentation Basics

So, what is "experimentation"? In its purest sense, experimentation is the application of the scientific method (the processes used since the 17th century to explore natural science) to determine cause-and-effect relationships—manipulating one or more inputs, recording the effects on an output while controlling the environment and other potential influencers, and analyzing the data to validate the relationships.10 People conduct experiments all the time, sometimes formally (like a child in science class who enthusiastically watches to see what happens when baking soda and vinegar are combined), but more often informally in our day-to-day lives (e.g., if I take another route to work that is longer but avoids traffic lights, will I get to work more quickly?).

8. Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 78.
9. U.S. Congress, House, Conference Report: National Defense Authorization Act for Fiscal Year 2017, S. 2943, 114th Cong., 2d sess. (2016), 1130, https://www.congress.gov/114/crpt/hrpt840/CRPT-114hrpt840.pdf.
10. The Technical Cooperation Program, Pocketbook Version of GUIDEx (Slim-Ex): Guide for Understanding and Implementing Defense Experimentation (Ottawa, Canada: Canadian Forces Experimentation Centre, 2006), 5-6, https://www.acq.osd.mil/ttcp/guidance/documents/GUIDExPocketbookMar2006.pdf.

Defense experimentation is the extension of this type of thinking and activity into the military domain. From the beginning of warfare, militaries have experimented with capabilities and concepts to develop and identify better ways of conducting war and closing warfighting capability gaps (e.g., using gunpowder to propel projectiles, using airpower to sink ships, and targeting terrorists using armed drones). To ensure clarity regarding the term, this guidebook uses the following definition for defense experimentation:

Defense Experimentation: Testing a hypothesis, under measured conditions, to explore unknown effects of manipulating proposed warfighting concepts, technologies, or conditions.

Experimentation is not an end in itself nor is it a research, acquisition, or doctrine process. Instead, experimentation is a tool that can be used in any of those processes to explore unknown relationships and outcomes that result from new disruptive technologies and concepts, new applications of existing capabilities, and emerging threats.11

4.1 Experimentation Fundamentals. Before exploring defense experimentation further, it is important to explain some fundamental principles regarding classic experimentation. Classic experiments are built around a hypothesis that clearly states the proposed causal relationship, typically in an if-then statement. For example, a hypothesis might read:

If a Hellfire missile is mounted on and fired from a reconnaissance drone,

Then the kill-chain will be shortened.

This hypothesis is composed of an independent variable in the "If" statement, "a Hellfire missile is mounted on and fired from a reconnaissance drone," and a dependent variable in the "Then" statement, "the kill-chain will be shortened." In addition to the independent and dependent variables, there are intervening variables that affect the relationship between them; examples might include the level of training of the participants, the skill of the pilots, and the weather. Experiments then manipulate the independent variable to see whether, and how, the dependent variable is affected. Classic experiments are conducted systematically, following scenarios under tightly controlled conditions, to increase confidence that the relationship is valid. In ideal experiments, only one independent variable is manipulated at a time, while all intervening variables are controlled.12
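To make the mechanics concrete, the sketch below simulates this classic design in Python. It is illustrative only and not from the guidebook: the kill-chain timing model, the function names, and every number are invented assumptions. The point is the discipline it demonstrates: manipulating only the independent variable (armed versus unarmed reconnaissance drone) while holding the intervening variables (weather, crew skill) fixed across trials.

    # Illustrative sketch (not from the guidebook): a toy model of the classic
    # experimental design described above. All names and numbers are invented.
    import random
    import statistics

    def kill_chain_seconds(armed_drone: bool, weather: str, crew_skill: float) -> float:
        """Toy model: time from target detection to weapons on target."""
        base = 240.0 if armed_drone else 600.0                 # armed drone skips the strike hand-off
        weather_penalty = {"clear": 0.0, "overcast": 60.0}[weather]
        noise = random.gauss(0.0, 30.0) * (2.0 - crew_skill)   # skilled crews vary less
        return base + weather_penalty + noise

    def run_trials(armed_drone: bool, n: int = 30) -> list[float]:
        # Control the intervening variables: identical weather and crew skill in every trial.
        return [kill_chain_seconds(armed_drone, weather="clear", crew_skill=1.0)
                for _ in range(n)]

    random.seed(1)                                   # repeatable trials
    baseline = run_trials(armed_drone=False)         # reconnaissance drone only
    treatment = run_trials(armed_drone=True)         # Hellfire mounted and fired
    print(f"baseline mean kill chain:  {statistics.mean(baseline):6.1f} s")
    print(f"treatment mean kill chain: {statistics.mean(treatment):6.1f} s")

Because only armed_drone changes between the two trial sets, any difference in the dependent variable (mean kill-chain time) can be attributed to the independent variable rather than to weather or crew proficiency.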

11. Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 79-80.
12. David S. Alberts and Richard E. Hayes, Code of Best Practice: Experimentation (Washington, DC: Command and Control Research Program, 2002), 142, http://dodccrp.org/files/Alberts_Experimentation.pdf.

It is important at this point to introduce four experimentation criteria—validity, reliability, precision, and credibility. Validity addresses how well the experiment measures what it intends to measure.13 Reliability pertains to the objectivity of the experiment and whether the same values would be measured for the same observations every time. Precision addresses whether or not instrumentation is calibrated to tolerances that enable detection of meaningful differences. Credibility pertains to whether or not the measures are understood and respected. Given the trades and assumptions that must naturally be made during an experiment, it is the experimentation team's responsibility to design the experiment to balance validity, reliability, precision, and credibility so that the experiment is as useful as possible within its known limitations and constraints.

4.2 Types of Experiments. The way practitioners categorize experiments depends on the prisms through which they view the subject. Often the prisms reflect the environment or the specific disciplines within which the experimenters operate. Even within a specific discipline, several categorizations may exist. Defense experimentation is no different. For example, some experimenters group defense experiments according to whether they assess materiel solutions (e.g., emerging technology) or non-materiel solutions (e.g., a transformational concept, doctrine, concepts of operations (CONOPS), etc.). Others categorize experiments by the level of realism inherent in the experiment (i.e., technological experiments conducted in a controlled setting versus operational experiments that are typically conducted in the field). One of the more prominent categorizations of defense experiments found in literature addresses the maturity of the solution being assessed—discovery experiments, hypothesis-testing experiments, and demonstration experiments.14 Regardless of how an experiment is categorized, the fundamental activities associated with conducting an experiment are typically consistent across all types of experiments. As a result, rather than attempt to address each of these different types of experiments individually, the guidebook describes the activities common to most (if not all) types of experiments.

4.2.1 Classic Experimentation vs. Free Play. It is, however, important to note the difference between classic experimentation, as described in Section 4.1, and the free play experiments often conducted by DoD organizations. While the high level of scientific rigor associated with classic experimentation enhances the validity and reliability of the experiment, it requires significant control of the experiment variables, participants, and environment, increasing the time and cost of experimentation. Instead, DoD experimenters will often introduce new technologies, new applications of existing systems, and new concepts at experimentation events and during exercises, where operators can use them to simply explore the question, "What happens if I do 'X'?" This more informal approach to experimentation, referred to as free play, affords experimenters significant flexibility in designing experiments vice a rigid scenario-based activity. The experimentation team must weigh the pros and cons of enhanced scientific rigor against the experiment's objective and determine the appropriate balance between free play and scientific rigor in the experiment's design. While free play experiments are typically less formal than classic experimentation, it is important that experimenters document their hypotheses before designing and planning the experiment to ensure the experiment tests what is intended to be tested and so that analysis of the results can clearly determine the validity of the hypotheses.

13. Validity comes in two forms—internal validity and external validity. Internal validity suggests that the experiment has been designed and conducted in a way that ensures that no alternative explanations exist for the experiment results. External validity suggests that the results of the experiment can be generalized to other environments. In defense experiments, external validity relates to the operational realism of the experiment and whether the results can be generalized to the combat environment.
14. Alberts, Code of Best Practice: Experimentation, 4.

4.3 Why Experiment in the DoD? The ultimate purpose for all experimentation is to enrich the understanding of a particular issue or domain, providing knowledge to better inform decision makers. At the end of each experiment, experimenters should be able to answer the questions that compelled the experiment, identify additional information necessary for further research on the topic, and provide decision makers with the information they need to make decisions. Defense experimentation includes the additional purpose of accelerating the development and deployment of concepts and capabilities to the warfighter. The following sections comprise a non-exhaustive list of purposes and benefits associated with defense experimentation.

4.3.1 Identify and Refine Capability Gaps and Requirements. Defense experimentation can be used to identify and help clarify current and future warfighting problems. Bringing together operators, intelligence experts, and technologists to discuss and explore current and future warfighting environments and the impact of existing and emerging technologies enables the development, refinement, prioritization, and validation of capability gaps and requirements. Independent teams that imitate anticipated adversary actions and responses, known as red teams, can also be used in experiments to identify how adversaries might use emerging technologies to create new threats or modify existing threats. Results from these types of experiments can be used in the Capabilities Based Assessment process to help guide development of alternative materiel and non-materiel solutions.

4.3.2 Explore Innovative Technology Solutions. Experimentation can be used to explore, identify, and enhance technological solutions that address capability gaps and requirements and identify opportunities that emerging technologies afford. These solutions may be developed in DoD and National laboratories or by commercial innovators (many of whom DoD may not otherwise have access to). Experiments are often used to facilitate the exploration of numerous potential emerging technology solutions by warfighters in order to identify the most promising solutions to pursue and their associated technical and integration risks. Experiments are also used to explore new ways of applying existing technologies to obtain a military advantage. Just as important, experiments can also help decision makers identify innovations or current research and development (R&D) efforts that will not close a capability gap, enabling them to terminate the efforts before they become programs of record (PoR) and redirect funding to more promising solutions.

4.3.3 Explore Non-Materiel Solutions. Experimentation can also be used to investigate the full range of possible non-materiel innovations across the Doctrine, Organization, Training, Materiel, Leadership and Education, Personnel, Facilities, and Policy (DOTMLPF-P) spectrum. These experiments help warfighters investigate the impact of changes to organizational structure; CONOPS; tactics, techniques, and procedures; training; etc. before operationalizing the changes. Larger, more complex transformational concepts can also be developed, explored, and refined through experimentation.

4.3.4 Evaluate Operational Value. Experiments that place capabilities in the hands of the warfighter in an operationally realistic environment enable operators and technologists to explore the operational utility and limitations of the capabilities, sometimes facilitating discovery of unexpected applications in an operational environment.

4.3.5 Rapidly Learn. Experimentation can be used to enable programs to take advantage of the “fail fast/fail cheap” philosophy (referred to by some in the Department as “learn fast/learn cheap”). This philosophy seeks to use the simplest and least expensive representative model possible (rather than an expensive final development article) to quickly determine the value of a concept or technology solution through incremental development and evaluation. When the experiment reveals something isn’t working as expected or desired (i.e., a “failure”), the concept or technology can either be modified and reevaluated, or decision makers can pivot to a different approach. The faster the solution “fails,” the faster learning can occur, and the faster decisions can be made regarding the next appropriate step in the development or innovation process. In his April 18, 2018 statement to Congress, Dr. Griffin echoed this philosophy: “So let us make our mistakes. Let us learn our developmental lessons when it does not cost much…Let’s not learn it when the system is in production. Let’s learn it while people are still experimenting.”15

4.3.6 Strengthen and Expand the Technology Base. Defense experiments are also used to reach nontraditional defense contractors that might otherwise have little interest in working through the arduous federal acquisition and contracting process. These nontraditional partners are often the sources of disruptive innovation critical to the U.S. military. Experiments allow both industry and operators to understand how novel technologies can provide value to operations (and in some cases direction on how the technology needs to be evolved) in a far less obstructive environment.

4.4 Differentiating Experimentation From Prototyping, Testing, and Demonstration. To help improve communication and reduce misunderstanding, this section explains how this guidebook differentiates DoD experimentation from prototyping, testing, and demonstration.

4.4.1 Experimentation vs. Prototyping. Among DoD personnel, the terms experimentation and prototyping are often mentioned in the same sentence and sometimes used synonymously. The terms, however, are quite different. Experimentation is used to address uncertainty when analysis is insufficient to draw conclusions. Experimentation focuses on developing and evaluating a hypothesis to determine if a causal relationship exists between two variables (i.e., to answer the question, "Does 'A' cause 'B'?"). It also applies to informal experiments in which the question asked may be as simple as, "What happens if I do X?" In contrast, for purposes of this guidebook, prototyping has two meanings. First, prototyping is the act of designing and creating a representative model for use in experiments or demonstrations. For instance, X-planes were prototypes used to conduct experiments in supersonic flight, variable-geometry aerostructures, etc. In this context, the prototypes developed are used to inform decisions and answer a broad spectrum of questions (e.g., Are the requirements technically feasible? Can the end item be manufactured affordably? Is the CONOPS valid?16). The second meaning of prototyping pertains to actions typically taken prior to mass producing a solution. When an experiment identifies a promising design, prototyping develops and evaluates a representation of that design to ensure it fully satisfies the need.

15. Accelerating New Technologies to Meet Emerging Threats: Testimony before the U.S. Senate Subcommittee on Emerging Threats and Capabilities of the Committee on Armed Services, 115th Cong. (2018) (testimony of Michael D. Griffin, Under Secretary of Defense for Research and Engineering (USD(R&E))), 23, https://www.armed-services.senate.gov/imo/media/doc/18-40_04-18-18.pdf.

4.4.2 Experimentation vs. Testing. Experimentation and testing are also closely linked and actually follow many of the same processes. The key difference between experimentation and testing is that experiments typically seek out “unknowns” in an attempt to uncover knowledge and confirm a cause-and-effect relationship between variables. Experiments also often seek to identify and characterize performance limitations in order to determine the point at which an item will fail. Testing, on the other hand, verifies and validates that a capability meets user-defined requirements to successfully accomplish a mission or mission thread, usually using pass-fail criteria.

4.4.3 Experimentation vs. Demonstration. The key difference between experimentation and demonstration is that experimentation increases knowledge in a specific domain, while demonstrations simply present and confirm what is already known. Experimentation identifies specific areas of uncertainty and custom-designs and conducts experiments to address that uncertainty. With demonstrations, however, the uncertainty has already been resolved; demonstrations simply recreate that knowledge to reveal the relationships between variables. DoD demonstrations are typically scripted and orchestrated activities that minimize the risk that the solution demonstrated will fail. They are primarily intended to display a solution’s military utility in specific operational environments to people unfamiliar with the technology or concept or to senior leaders responsible for making decisions regarding its employment, deployment, or acquisition in order to garner support for the technology or concept.

4.5 Experimentation Methods. When considering experimentation, what often comes to mind are experiments conducted in laboratories. While some defense experiments do occur in laboratories, they often take place outside the lab in a variety of settings using a variety of methods. The following are brief summaries of several of the most common methods used in defense experimentation.

16. Please refer to the "DoD Prototyping Guidebook" for additional examples. Department of Defense, Department of Defense Prototyping Guidebook, Version 1.1 (Washington, DC: Department of Defense, 2019), https://www.dau.mil/tools/t/DoD-Prototyping-Guidebook.

4.5.1 Workshops. Workshops bring together a diverse group of warfighters, policy makers, requirements writers, threat analysts, and technologists to explore threats, technologies, and concepts. They are often used to identify and refine capability gaps, establish requirements, identify and determine the feasibility of new technologies, discover and generate concepts, and develop CONOPS. Workshops can be conducted as informal brainstorming or idea-generation sessions or as structured deliberations of the merits and weaknesses of the topic being discussed.

4.5.2 Wargames. Wargames are simulations of warfare where technology, concepts, and CONOPS can be evaluated without the dangers of military conflict. Wargames seek to enhance the physical and psychological realism of a military problem to the extent possible by using warfighters as the players and evaluating their actions using models or rulesets. Wargames are often conducted using tabletop exercises and/or virtual environments.

4.5.2.1 Tabletop Exercises. Tabletop exercises are typically structured wargames in which warfighters work through scenarios to discover and define capability gaps and their boundaries, and in which initial insights into the value of proposed solutions to those gaps, across the full DOTMLPF-P spectrum, are discussed.

4.5.2.2 Virtual Wargames. Wargames can also be played virtually. Modeling and simulation (M&S) can be used to create virtual scenarios that simulate the interaction of two or more opposing forces. These simulations can then be used by the warfighter to evaluate alternative technologies and concepts, refine concepts, and help design future experiments. Types of simulations are differentiated by the amount of human involvement in the simulation, from no human involvement in constructive simulations to a great degree of human involvement in human-in-the-loop (HITL) simulations. The following subsections provide brief summaries of three types of virtual wargames.

In constructive simulations, the experiment designer chooses the input parameters of a force-on-force simulation and initiates the simulation. No human intervention occurs once the simulation begins. Results are then recorded and analyzed. This type of simulation enables participants to replay the same battle under identical conditions while systematically changing the input parameters (e.g., different technological solutions), enabling a side-by-side comparison of the parameters. The downside to constructive simulation is that the control afforded by this method reduces the applicability of the results to the operational environment.17
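The replay discipline described above can be sketched in a few lines. The following is illustrative only, with a deliberately toy force-on-force model; the function, its parameters, and the probabilities are invented assumptions. Seeding the random number generator identically for each replay recreates the same battle conditions, so only the input parameter differs between runs.

    # Illustrative sketch (not from the guidebook): replaying one battle under
    # identical seeded conditions while varying a single input parameter.
    import random

    def battle(blue_sensor_range_km: float, seed: int) -> bool:
        """Toy engagement: returns True if Blue wins this replay."""
        rng = random.Random(seed)          # same seed -> identical battle conditions
        detect_prob = min(blue_sensor_range_km / 50.0, 1.0)
        detections = sum(rng.random() < detect_prob for _ in range(20))
        return detections >= 12            # Blue needs 12 of 20 detections to win

    # Side-by-side comparison: the same 100 seeded replays, two sensor ranges.
    for sensor_range in (20.0, 35.0):
        wins = sum(battle(sensor_range, seed) for seed in range(100))
        print(f"sensor range {sensor_range:4.1f} km -> Blue wins {wins}/100 replays")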

17. Richard A. Kass, The Logic of Warfighting Experiments (Washington, DC: Command and Control Research Program, 2006), 116, http://www.dodccrp.org/files/Kass_Logic.pdf.

Analytic wargames employ military participants organized in Blue, White, and Red Cells to plan and execute a military operation. In a typical engagement, the Blue Cell provides its course of action to the White Cell, which communicates that action to the Red Cell. The Red Cell then communicates its counter move to the White Cell, which then runs the simulation using these inputs. The simulation generates the outcome of the fight. Analytic wargames allow warfighters to compare the operational values of multiple inputs by enabling the participants to fight the same battle multiple times using different inputs.18
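As a sketch of the message flow just described, the loop below models one engagement turn. It is a hypothetical illustration, not an actual wargaming system; the cell functions and the adjudication rule are invented.

    # Illustrative sketch (not from the guidebook): the Blue -> White -> Red ->
    # White message flow of an analytic wargame, with a stand-in adjudicator.
    def blue_cell() -> str:
        return "main thrust north with amphibious feint south"   # Blue's course of action

    def red_cell(blue_action: str) -> str:
        return "mass air defenses north"                          # Red's counter move

    def white_cell_adjudicate(blue_action: str, red_counter: str) -> str:
        """White Cell runs the simulation with both inputs and reports the outcome."""
        if "north" in blue_action and "north" in red_counter:
            return "Blue thrust blunted; heavy attrition on both sides"
        return "Blue achieves local surprise"

    coa = blue_cell()                            # Blue provides its course of action to White
    counter = red_cell(coa)                      # White relays it to Red, which counters
    print(white_cell_adjudicate(coa, counter))   # White runs the simulation with both inputs

Re-running the loop with different courses of action lets participants fight the same battle multiple times with different inputs, as the text notes.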

Of the virtual wargames, HITL simulations provide the greatest amount of operational realism. HITL simulations are real-time simulations with a great degree of human-machine interaction in which military participants receive real-time inputs from the simulation, make real-time decisions, and direct simulated forces or platforms against simulated threat forces. A good example of a HITL simulation is a flight simulator. HITL simulations reflect warfighting decision making better than constructive simulations and analytic wargames, but the human involvement increases variability, making changes more difficult to detect and cause-and-effect relationships more difficult to determine.19

4.5.3 Field Experiments. Field experiments are the most realistic experimentation method and the most applicable to an operational environment. Conducted in the anticipated operational environment using military personnel and equipment, field experiments best emulate the conditions that warfighters will likely face in combat. The scope of field experiments varies widely from small-scale experiments where operators are invited to simply try out new technologies and concepts to large-scale experiments and exercises that emulate a battle scenario.

4.5.3.1 Small-Scale Field Experiments. Small-scale field experimentation provides warfighters the opportunity to explore the effects of a proposed technology or concept solution in an operationally representative environment and confirm whether the capability demonstrates military utility or meets particular performance objectives. These small-scale experiments enable the warfighter to conduct multiple trials with a single solution, collaborate with technologists to refine their solutions and observe the effect of the changes in real-time, and compare the impact of multiple solutions in an operational environment. Small-scale field experiments often set the stage for participation in a large-scale field experiment.

4.5.3.2 Large-Scale Field Experiments. Large-scale field experiments, conducted at large experimentation venues or as part of major military exercises, provide the most realistic assessment of the effectiveness and utility of a technology or concept at scale in combat operations. Large-scale field experiments that include operational environment stresses can be used to validate technology solutions, obtain greater insight into a solution's endurance and reliability, and demonstrate the safety characteristics of a proposed solution. While highly applicable to combat operations, because of their scale, multiple trials are seldom conducted in the field, making it difficult to observe changes and determine true cause-and-effect relationships.20

18. The Technical Cooperation Program, Pocketbook Version of GUIDEx (Slim-Ex), 28.
19. Kass, The Logic of Warfighting Experiments, 116.

4.6 Cultural Implications for Experimentation. At the heart of good experimentation is the possibility that the experiment will fail. In fact, the most successful experiment designs ensure that failure is a possibility by stressing the object of the experiment beyond known or expected limits. This provides both an understanding of whether the proposed solution will solve the capability need and knowledge of whether (and when) it might fail to deliver the expected performance.21 It also allows solution developers to modify and retest failed capabilities or pivot away from them and explore other opportunities.

Designing the possibility of failure into their experiments, however, is difficult for most DoD experimenters. The typical DoD practice of evaluating experimenters based on the success of their experiments has caused experimenters to become increasingly risk averse over time. In its October 2013 report, “Technology and Innovation Enablers for Superiority in 2030,” the DSB states that “experimentation in the Department has become synonymous with scripted demonstrations, testing, and training in an environment and culture that is arguably much more risk-averse than it was just 20 years ago.”23 This heightened risk aversion often results in experiment designs that have a low probability of failure, diminishing the quality and usefulness of the experiment.

One way to mitigate this risk-averse culture is to institutionalize in the Department a new understanding of what experimentation "success" and "failure" mean. As mentioned earlier, the ultimate purpose of all experimentation is to advance knowledge, providing decision makers with the information they need to make decisions. As a result, an experiment "succeeds" if it produces sufficient evidence to conclude whether a cause-and-effect relationship exists between two variables—even if the experiment does not produce the expected results. In other words, experiments that establish the ineffectiveness of proposed solutions are not failures; rather, they are successful learning activities. On the other hand, an experiment that does not advance knowledge by producing sufficient evidence is a "failed" experiment: it fails to increase knowledge about the effectiveness of a proposed solution. The litmus test of "success" and "failure" in experimentation has less to do with the expected results of the experiment and more to do with the data that the experiment generates.

20. Kass, The Logic of Warfighting Experiments, 117.
21. Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 103.
22. Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 7.
23. Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 78.

“We must…foster a culture of experimentation and calculated risk-taking.”22

2018 National Defense Strategy

Congress recognized this in its FY17 NDAA Conference Report: “The conferees expect that the [USD(R&E)] would take risks, press the technology envelope, test and experiment, and have the latitude to fail, as appropriate.”24 The Department agreed with Congress in its August 2017 report to Congress on “Restructuring the Department of Defense Acquisition, Technology and Logistics Organization and Chief Management Officer Organizations,” emphasizing: “This requires a culture change and the re-education of our workforce. This is a significant cultural shift that must be continually reinforced with risk tolerance and the move away from a perceived ‘zero risk’ mentality.”25

To show that experiments that fail to produce expected results actually succeed in their intended purpose, experimenters must clearly identify up front the purpose of the experiment, the information to be learned, and the value of that information. That way, even if the experiment fails to produce the expected results, the developer can point to the metrics of success identified during the planning process to justify the expenditure and demonstrate that the experiment itself succeeded.

Institutionalizing new definitions of what constitutes experiment “success” and “failure” is critical to “foster[ing] [the] culture of experimentation and calculated risk-taking” called for in the NDS.26 Faster and less expensive “failures” in experimentation lead to more rapid, iterative system development that will reduce cost and technical risk.

5 Experimentation Activities

Even though each experiment is unique, several key activities are universally applicable and should be considered for all experiments:

• Formulating experiments
• Planning experiments
• Soliciting proposed solutions for experiments
• Selecting proposed solutions for experiments
• Preparing for and conducting experiments
• Data analysis and interpretation
• Results of experimentation

Depending on the specific experiment, experiment type, experiment scope, and the venue selected, experimenters may determine that some of these activities are unnecessary or they may discover that some activities are performed by the experimentation venue. Experimenters should tailor their activities to address their specific experiment. This section describes each of these activities and provides recommendations for each based on best practices captured from literature and from SMEs in the experimentation community.

24. U.S. Congress, House, Conference Report: National Defense Authorization Act for Fiscal Year 2017, 1130.
25. Report to Congress, 30.
26. Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 7.

5.1 Formulating Experiments. Experimentation should start with a clear articulation of why the experiment is being conducted and a well-defined description of how the conclusions will be applied. This involves several iterative activities explained in this section: generating the problem statement, establishing the experimentation team, and developing the hypothesis. These activities are so closely aligned, and their products so iteratively refined throughout the experiment, that it is often difficult to identify which activity occurs first. The results of this formulation activity are a clear, unambiguous problem statement and hypothesis and a robust experimentation team.

5.1.1 Generate the Problem Statement. Generating and refining the problem statement is one of the most critical activities in experimentation. The problem statement helps to identify appropriate team members and keeps the team focused on the experiment’s purpose throughout the experiment. The problem statement should address the complete issue being studied, not just the specific hypothesis being analyzed,27 and include the following components:

• A clear articulation of the specific capability gap, need, opportunity, condition, or obstacle to be overcome;
• Identification of affected stakeholders; and
• The specific capability needed.28

The robustness of the problem statement is a function of the formality of the experiment. For informal experiments that allow significant free play, the problem statement should not be overly restrictive, allowing sufficient flexibility for operators and technologists to pursue ideas. Problem statements for more formal experiments, on the other hand, should be very specific, enabling experimenters to adequately design and control the experiment in order to generate the information needed for decision makers.

Experimenters can use numerous sources of information—both formal and informal—to identify the core capability needs and develop the problem statements to be explored in their experiments. In his April 18, 2018 hearing before the Senate Armed Services Subcommittee on Emerging Threats and Capabilities, Dr. Griffin stated:

"it is critically important for the DoD to utilize intelligence products, technology forecasting, and our own analysis to inform decisions on where we will invest, what we will prototype, what experiments we will do, and what emerging capabilities and concepts of operation will help us to succeed."29

27. Alberts, Code of Best Practice: Experimentation, 129.
28. Experiment Planning Guide (Norfolk, VA: Navy Warfare Development Command, 2013), 14.

The most obvious sources of capability needs are validated requirements that are documented through formal processes. Examples include requirements listed in approved Joint Capabilities Integration and Development System (JCIDS) documents and strategic needs recorded in the following documents:

• The 2018 National Defense Strategy;
• USD(R&E)'s Road to Dominance modernization priorities;
• The Chairman's Risk Assessment; and
• The Joint Requirements Oversight Council-led Capability Gap Assessment.

Formal requirements also include capability gaps that have been validated by Components or the Joint Staff and documented by Joint or Military Services’ requirements processes, such as Integrated Priority Lists (IPL) and Initial Capability Documents. In addition, urgent needs are often documented in Components’ urgent needs documents or in the Joint Staff’s Joint Urgent Operational Needs Statements and Joint Emergent Operational Needs Statements.

Unlike formal acquisition programs, however, experimentation is not bound by traditional Joint or Military Service requirements processes. In fact, the NDS encourages the use of experimentation prior to defining requirements.30 Instead, experimenters can design and conduct experiments to address military capability gaps identified and provided by the warfighter, outside of Joint and Military Services’ requirements processes. Sources for these gaps include the following:

• Critical intelligence parameter breaches;
• Emerging needs and opportunities that are identified through threat, intelligence, and risk assessments; and
• Offsetting or disruptive needs that are identified through ongoing operations, other experiments, demonstrations, and exercises.

5.1.2 Establish the Experimentation Team. Membership on the experimentation team is not static, and active participation will ebb and flow throughout the lifecycle of the experiment. That said, the core members of the team should include the experiment lead, innovative operational experts, logistics representatives, financial process experts (contracting, acquisition, etc.), and technologists (scientists, coders, and engineers proficient in the experimentation domain). These members must be identified and actively engaged in problem statement and hypothesis development at the start of the project and throughout the design, planning, execution, and analysis of the experiment. In addition to core members, experiment leads should consider including planners, requirements experts, vendors, red teams, experiment designers, trainers, knowledge management experts, scenario developers, M&S experts, and data analysts on their teams.

29. Accelerating New Technologies to Meet Emerging Threats, 8.
30. Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 11.

5.1.3 Develop the Hypothesis. A hypothesis is a formal statement of the problem being evaluated and a proposed solution to that problem. Hypotheses are not formal conclusions based on proven theory; rather, they are educated guesses about expected outcomes, intended to guide the experiment.31 Hypotheses are often written in an if-then format that describes a proposed causal relationship between the proposed solution and the problem, where the "If" part of the statement represents the proposed solution (the independent variable) and the operational constraints to be controlled (intervening variables), and the "Then" part of the statement addresses the possible outcome to the problem (the dependent variable). For example, a hypothesis might read:

If proposed solution (A) is deployed under operational conditions (C),

Then operational capability gap (B) will be resolved.

Similar to the problem statement, the robustness of the hypothesis is a function of the formality of the experiment being conducted. Formal experiments need to be designed and adequately controlled to ensure the data that is produced generates the information needed by decision makers. The hypothesis that guides these types of experiments should be very precise. Less formal experiments, on the other hand, require flexibility for the warfighter and technologist to explore possibilities and trade spaces. As a result, hypotheses developed for these types of experiments will be less rigorous and more general.

Some DoD organizations have determined that the complexity of their field experiments makes it nearly impossible to clearly produce a state of independence between variables. The intervening variables are too numerous to control or account for appropriately. Instead, these organizations develop robust objective statements that state the intent of the experiment, questions to be answered, conditions required, measures and metrics to be taken, and the data to be collected. For these organizations, the objective statement drives the design, planning, and execution of the experiment.32
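One lightweight way to keep a hypothesis or objective statement explicit throughout design, planning, and analysis is to record it as structured data. The sketch below is a hypothetical convention, not a DoD-prescribed schema; the field names simply mirror the elements this section calls for.

    # Illustrative sketch (not from the guidebook): capturing the elements named
    # in this section as structured records. Field names are one possible choice.
    from dataclasses import dataclass, field

    @dataclass
    class Hypothesis:
        independent_variable: str               # the "If" clause: the proposed solution
        dependent_variable: str                 # the "Then" clause: the expected outcome
        intervening_variables: list[str] = field(default_factory=list)

        def if_then(self) -> str:
            return f"If {self.independent_variable}, then {self.dependent_variable}."

    @dataclass
    class ObjectiveStatement:                   # for experiments too complex to isolate variables
        intent: str                             # why the experiment is being run
        questions: list[str]                    # questions to be answered
        conditions: list[str]                   # conditions required
        measures: list[str]                     # measures and metrics to be taken
        data_to_collect: list[str]              # data to be collected

    h = Hypothesis(
        independent_variable="proposed solution (A) is deployed under operational conditions (C)",
        dependent_variable="operational capability gap (B) will be resolved",
        intervening_variables=["participant training level", "weather"],
    )
    print(h.if_then())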

31. Kass, The Logic of Warfighting Experiments, 35-36.
32. Dr. Shelley P. Gallup, email message to author, June 20, 2019.

Best Practices for Formulating the Experiment

• Key principles for developing effective problem statements:
  o Be as precise as possible in specifying the issue;
  o Formulate the problem statement as a comparison against a baseline, if possible; and
  o Sufficiently research the problem to ensure the hypothesis includes all known factors.33
• A problem statement should be just that…a clear statement of the problem. In order to minimize bias, problem statements should avoid assigning blame or proposing a cause for the problem, and they should also avoid taking a position or suggesting a solution.34
• Experimenters should consider reviewing DoD databases that catalog reports, studies, and lessons from prior experiments.35
• Operations security (OPSEC) should be addressed early in the experimentation process and emphasized throughout the project. Experimenters should consider adding an OPSEC-trained SME to the experimentation team to assess experimentation planning, execution, and reporting.
• In order to clearly articulate the problem or hypothesis, experimenters should consider diagramming the problem or the relationships between the hypothesis variables. These diagrams are often referred to as conceptual models.36

33. Alberts, Code of Best Practice: Experimentation, 128.
34. Experiment Planning Guide, D-1.
35. Databases available to DoD experimenters include the Defense Technical Information Center databases (https://discover.dtic.mil), the Joint Staff's Joint Lessons Learned Information System (https://www.jcs.mil/Doctrine/Joint-Lessons-Learned), and the Center for Army Lessons Learned database (https://call2.army.mil).
36. Experiment Planning Guide, 19.

5.2 Planning Experiments. Successful experimentation begins with effective planning. Experiment plans should be constructed as living documents that act as roadmaps for their experiments, modified and added to, as appropriate, from the start of experiment formulation through the execution of the experiment. When execution starts, the plan should provide a comprehensive summary of all aspects of the experiment and a compilation of the individual functional area plans in a single location. At a minimum, experimenters should consider providing or discussing the following topics in their plans:

• Clear, unambiguous problem statement for the experiment;
• Clear hypothesis (or set of hypotheses) to assess;
• Contracting strategy;
• Funding strategy;
• General approach to the experiment and experiment design;
• Schedule of events (to include experiment set-up and dry-run);
• Organization (Blue force, Red force, and experiment team);
• Scenarios and plans for free play;
• Control plan;
• Data collection and analysis;
• Personnel;
• Logistics and infrastructure;
• Training and training materials;
• Risk management;
• OPSEC;
• Communications;37
• Safety considerations;38 and
• Forecast for the next steps, given success or failure.

The type and scope of the experiment and the experiment venue used will determine the topics to be included in the plan and the level of detail required in each section. While it is impossible to plan a perfect experiment, it is the experimenter's responsibility to make the experiment as useful as possible given its assumptions, limitations, and constraints and to caveat the results of the experiment appropriately. The following subsections further develop some of the more critical topics that the plan needs to address.

5.2.1 Selecting the Contracting Strategy. The NDS identifies experimentation as a tool that can help streamline the process of developing capabilities and delivering them to the warfighter. The speed at which experimentation can support the warfighter is governed in large part by the tools experimenters have at their disposal to get these efforts on contract. Experimenters have a number of expedited Federal Acquisition Regulation (FAR)-based contract vehicles and non-FAR-based (non-contract) vehicles available for experimentation; these are, in large part, the same strategies available for prototyping. Additional information pertaining to contracting strategies can be found in Section 6 of the DoD Prototyping Guidebook.39 Experimenters should express the urgency of their project to their contracting authority and work with them to structure an appropriate contracting strategy for their effort.

5.2.2 Securing Funding for the Experiment. One of the biggest obstacles to experimentation is securing funding to conduct the experiment or to apply the recommendations resulting from the experiment. Specific challenges that experimenters face when securing funding include:

• DoD’s rigid funding structure that regulates the type of technology development that an organization can pursue;

37 Experiment Planning Guide, 53 & H-1-1.
38 Experimenters should consider producing a document that describes the specific hazards of the experiment and indicates the capability is safe for use and maintenance by typical troops. See discussion of “System Safety” at https://www.dau.mil/acquipedia/pages/articledetails.aspx#!483.
39 Department of Defense, Department of Defense Prototyping Guidebook, Version 1.1, 21.


• The length of time it takes DoD’s Planning, Programming, Budgeting, and Execution process to make funding available (nearly two years from the time a funding need is identified); and

• Limitations Congress places on the specific use of funding in the NDAA.

Obtaining appropriate funding for experimentation is a challenge inherent in prototyping as well and is discussed in the DoD Prototyping Guidebook. For a summary of funding vehicles and DoD offices that can be pursued as potential funding sources for experimentation, please refer to Section 7 of the DoD Prototyping Guidebook.40

5.2.3 Experiment Design. Second in importance only to a well-defined problem statement is the experiment design. The design must ensure that, at the conclusion of the experiment, a determination can be made regarding the causal relationship in the hypothesis and that decision makers have confidence in both the results of the experiment and the information they need to make their decision.41

40 Department of Defense, Department of Defense Prototyping Guidebook, Version 1.1, 24.
41 Kass, The Logic of Warfighting Experiments, 19.
42 Alberts, Code of Best Practice: Experimentation, 74.
43 For an example and further information regarding two-level factorial experiments, please refer to section R5.3.7 at the following link: http://umich.edu/~elements/05chap/html/05prof2.htm.
44 The Technical Cooperation Program, Pocketbook Version of GUIDEx (Slim-Ex), 52.

Best Practices for Experiment Design

• When designing experiments, designers should consider several important topics:
o Ensure all relevant variables and associated ranges are identified;
o Determine how each variable will be measured;
o Identify the factors that are believed to influence the relationships between variables;
o Determine how these variables will be controlled when needed;
o Identify the baseline that will be used for comparison;
o Select the sample size needed to achieve the statistical relevance desired;
o Establish the number of trials that will be run;
o Determine the amount and type of data that will be needed; and
o Select the appropriate analytic strategy.42

• Experimenters should consider using two-level factorial experiments43 to help focus subsequent experiments on the independent variables and their settings that have the greatest impact on the dependent variable(s). This enables experimenters to use their time and available resources on experiments that are most beneficial. (A worked sketch follows this list.)

• Include stakeholders early in the design process to ensure the experiment satisfies stakeholders’ objectives and intent.

• Encourage early, firm decision-making on scenarios, participants, funding, technical environment, and study issues. The longer it takes to make decisions on these topics, the more difficult it will be to control the variables.44

• Address safety of personnel and equipment early in the planning process and throughout the planning and execution of the experiment.
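To make the two-level factorial approach concrete, the following Python sketch computes the main effect of each factor from a full 2^3 design. The factors, settings, and response values are purely illustrative assumptions, not data from any DoD experiment; a real design would draw its responses from the experiment’s trials.

# A minimal sketch of a two-level (2^3) factorial design with three
# hypothetical factors; all names and values are illustrative only.
from itertools import product

# Each factor is tested at a "low" (-1) and "high" (+1) setting.
factors = ["sensor_range", "network_bandwidth", "operator_training"]

# Full 2^3 design: every combination of low/high settings (8 runs).
design = list(product([-1, +1], repeat=len(factors)))

# Hypothetical responses (e.g., targets detected per trial), one per run,
# in the same order as `design`; real values would come from the trials.
responses = [12, 14, 18, 21, 13, 15, 22, 27]

# Main effect of a factor = mean response at its high setting minus
# mean response at its low setting.
for i, name in enumerate(factors):
    high = [r for run, r in zip(design, responses) if run[i] == +1]
    low = [r for run, r in zip(design, responses) if run[i] == -1]
    effect = sum(high) / len(high) - sum(low) / len(low)
    print(f"{name}: main effect = {effect:+.2f}")

Factors showing the largest main effects are the strongest candidates for closer study in subsequent experiments.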


Confidence in the results of an experiment is measured by the experiment’s validity, reliability, precision, and credibility. Unfortunately, it is impossible to design an experiment that fully satisfies all of these measures, and emphasizing one criterion often results in a decrease in another. For example, maximizing precision in an experiment will increase the internal validity of the experiment, but it will also decrease the experiment’s external validity. The challenge for experiment designers, then, is to design the experiment in a way that emphasizes the validity, reliability, precision, and credibility desired for that particular experiment within the funding and schedule constraints provided.

5.2.4 Scenario Development. To ensure their experiments generate the data that decision makers need, many experimenters rely on scenarios (scripted sequences of events) that focus the experiment on the problem being evaluated and provide boundaries for the experiment. Table 1 identifies the four primary factors that comprise scenarios and provides examples of each.45

Table 1: Primary Scenario Factors

Factor         Examples
Context        Objectives being pursued, the geopolitical situation, and other background information pertinent to the problem (e.g., timeframe)
Participants   Numbers, types, intentions, and capabilities of Blue forces, Red forces, and other players
Environment    Physical location of the problem, including manmade and non-manmade obstacles and considerations (e.g., climate, weather, landmines)
Events         Scenario injects, their purposes, and the activities to be observed

Scenarios are composed of pre-planned events, called scenario events or injects, that are intended to drive the actions of experiment participants. A chronological listing of these events and actions is often recorded in the master scenario event list (MSEL). Each entry in the MSEL includes important information regarding the scenario event, such as:

• A designated time for delivering the inject;
• An event synopsis;
• The name of the experiment controller responsible for delivering the inject;
• Special delivery instructions;
• The task and objective to be demonstrated;
• The expected action; and
• The intended player receiving the inject.46

45 NATO Code of Best Practice for C2 Assessment (Washington, DC: Command and Control Research Program, 2002), 164-165, http://dodccrp.org/files/NATO_COBP.pdf.
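As an illustration only, the MSEL fields listed above could be captured in a simple structured record, as in the following Python sketch; the field names and sample entry are hypothetical, not a prescribed DoD schema.

# A minimal sketch of one MSEL entry as a structured record; field names
# and values are illustrative assumptions, not an official format.
from dataclasses import dataclass

@dataclass
class MselEntry:
    inject_time: str            # designated time for delivering the inject
    synopsis: str               # event synopsis
    controller: str             # controller responsible for delivery
    delivery_instructions: str  # special delivery instructions
    task_objective: str         # task and objective to be demonstrated
    expected_action: str        # expected participant action
    recipient: str              # intended player receiving the inject

entry = MselEntry(
    inject_time="Day 1, 0930",
    synopsis="Simulated communications outage at forward node",
    controller="Controller 2",
    delivery_instructions="Deliver via white-cell radio call",
    task_objective="Assess degraded-communications battle drill",
    expected_action="Blue force reverts to backup network",
    recipient="Blue force operations center",
)
print(entry)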

The type of experiment conducted drives the level of specificity and control included in the scenario. Typically, the more formal the experiment, the more specific and controlled the scenario. Scenario developers should be careful to scope the scenario appropriately for the type of experiment being conducted. For more formal experiments, scenarios that are too general may fail to generate the data needed to support the analysis. Likewise, for less formal experiments, overly specific scenarios may inadvertently eliminate examination of some relevant factors and relationships. The bottom line: scenarios need to be valid, reliable, and credible and must be developed or adapted in a way that supports the objectives of the experiment.

5.2.5 Data Collection and Analysis Plans. Planning for data collection and data analysis are critical efforts that need to begin early in the experiment planning process and be coordinated with other aspects of the plan (e.g., scenario development) to ensure valid, reliable, precise, and credible data are captured and that the analysis will generate the information needed to address the issue being evaluated. Closely linked, the data collection plan and the data analysis plan will be developed iteratively and will be updated throughout the experiment’s lifecycle. Typically developed first, the data analysis plan contains a description of the analysis tools that will be used to evaluate the experiment data and a discussion of potential bias and risk in the experiment design. The data collection plan, on the other hand, describes the data to be collected to support the analysis plan and provides the structure that ensures the scenarios, participants, and environment will generate the data needed.

46 Department of Defense, DoD Participation in the National Exercise Program (NEP), DoD Instruction 3020.47, (Washington DC: Department of Defense, 2019), 17, https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/302047p.pdf?ver=2019-01-29-080914-067.
47 Alberts, Code of Best Practice: Experimentation, 200.
48 Alberts, Code of Best Practice: Experimentation, 92.
49 DoD has used games for wargaming purposes for decades. Improvements in computer technology, especially with commercial personal gaming, fueled the modification and re-use of commercial entertainment computer games for military wargaming purposes. For example, ‘“America's Army,’ a modification of Unreal Tournament; ‘DARWARS Ambush,’ and [sic] adaptation of ‘Operation Flashpoint;’ and X-Box's ‘Full Spectrum Warrior’ have all been used by the military. ‘Marine Doom’ was…an early modification of idSoftware's ‘Doom II.’”50
50 Carrie McLeroy, “History of Military gaming,” U.S. Army, August 27, 2008, https://www.army.mil/article/11936/history_of_military_gaming.
51 Alberts, Code of Best Practice: Experimentation, 93 & 222.

Best Practices for Scenario Development

• Develop and use multiple scenarios in an experiment. Using only a single scenario encourages suboptimization and decreases how broadly the findings can be applied.47

• Include three echelons of command in the scenario—one above and one below the focus of the experiment.48

• To reduce experiment costs associated with scenario development, re-use or modify existing scenarios as appropriate when doing so doesn’t compromise the experiment. Consider using commercial games49 if appropriate.51


Challenges associated with data collection will be assessed and integrated into a revised data analysis plan. This iterative process will continue through the life of the experiment.

The data analysis plan will usually include several types of analyses depending on the purpose and focus of the experiment, the information required, and the data collection means available. The type of experiment conducted will influence the tools selected to conduct the analysis. For example, because less formal experiments offer significant opportunity for unscripted free play, they require more open-ended analysis tools and techniques (e.g., histograms, scatter plots, mean values, etc.). However, more formal experiments are rigidly planned, requiring rigorous tools and techniques that enable statistical control (e.g., t-test, regression analysis, correlation analysis, etc.).54
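As a minimal illustration of the kind of statistical tool a formal experiment’s analysis plan might specify, the following Python sketch applies a two-sample t-test to compare a baseline capability with a proposed solution. The measurements are hypothetical, and the sketch assumes the SciPy library is available.

# A minimal sketch of a two-sample t-test; requires scipy.
from scipy import stats

# Hypothetical trial measurements, e.g., minutes to complete a task.
baseline = [42.1, 39.5, 44.0, 41.2, 40.8, 43.3, 38.9, 42.6]
proposed = [35.2, 37.1, 33.8, 36.4, 34.9, 38.0, 33.1, 36.7]

t_stat, p_value = stats.ttest_ind(baseline, proposed)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# With a pre-registered significance level (e.g., alpha = 0.05), a small
# p-value supports rejecting the hypothesis of no difference in means.
alpha = 0.05
print("Reject null (no difference)" if p_value < alpha else "Fail to reject null")

The choice of test, sample sizes, and significance level would be documented in the data analysis plan before the experiment is executed, not selected after the data are in hand.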

Data collection plans should include descriptions of the content of the data, collection methods, and data handling and storage procedures. In developing the data collection plan, experiment teams should consider the myriad ways that data collection can be accomplished. The most reliable form of data collection is automated collection, in which the systems used to drive the experiment or the operators’ systems collect and store the data. Care must be taken to ensure that the data collection systems’ clocks are synchronized and that use of automated data collection tools will not impact the functionality of the systems under test. Other means of data collection include screen captures, email archives, snapshots of databases, audio and/or video recording, survey instruments, proficiency testing of subjects, and human observation.

52 Alberts, Code of Best Practice: Experimentation, 224-225.
53 Experiment Planning Guide, H-2-2.
54 Alberts, Code of Best Practice: Experimentation, 113-114.

Best Practices for Data Analysis and Collection Plans

• Experimenters should understand what data decision makers and stakeholders consider the most useful to collect.

• Address protection of vendor intellectual property rights, as appropriate.

• Keep in mind that the only reason for collecting data is to support the data analysis. Losing sight of this can result in simply collecting data that is easy to collect as opposed to collecting data needed for the experiment.52

• The Navy Warfare Development Command developed a data collection plan template that includes the following topics:
o Data collection methodologies: sensor system electronic data, communications network data, observer manual collections, surveys (form or electronic), interviews, etc.
o Collection plan specifics: battle rhythm, type, periodicity, format, location, timeframe, method, etc.
o Data collection personnel: instructions, training requirements, location, timeframe, transportation and billeting requirements
o Collection form templates
o Collection equipment
o Observer logs
o External collection requirements: related data that cannot be captured during the execution event, e.g., surveys, interviews53


Key steps to developing a data collection plan include the following:

• Specify the variables to be measured;
• Prioritize the variables to be measured;
• Identify the collection method for each variable;
• Ensure access for collecting data for each variable;
• Specify the number of observations needed for each variable and confirm the expectation to collect all observations;
• Identify required training;
• Specify the mechanisms that will be used to capture and store the data; and
• Define the processes needed for data reduction and assembly.55
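The key steps above lend themselves to a simple, checkable structure. The following Python sketch is one hypothetical way to record each variable’s priority, collection method, and required observations and to verify that no entry is incomplete; the variable names and methods are illustrative assumptions.

# A minimal sketch of a data collection plan as a checkable structure;
# entries are illustrative assumptions, not a prescribed format.
plan = [
    # (variable, priority, collection method, observations needed)
    ("time_to_detect", 1, "automated system log", 30),
    ("operator_workload", 2, "post-trial survey", 30),
    ("network_latency", 3, "automated packet capture", 30),
]

# Simple completeness check: priorities are unique, and every variable
# has a collection method and a positive target observation count.
priorities = [p for _, p, _, _ in plan]
assert len(set(priorities)) == len(priorities), "duplicate priorities"
for name, _, method, n_obs in plan:
    assert method and n_obs > 0, f"incomplete entry for {name}"
print("Data collection plan entries are complete.")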

5.2.6 Risk Management. As with any acquisition project, experimenters must analyze, mitigate, and monitor risks to their experiments. Table 2 summarizes the risks common to experimentation.56

Table 2: Risks Common to Experimentation

Experiment Risks: internal activities that could affect the success of the experiment. Examples:
• Failure of the concept or technology to perform as advertised
• Insufficient participants with the correct skills and experience
• Unrealistic timeline
• Safety hazards
• System security challenges

Programmatic Risks: risks that are imposed externally. Examples:
• Insufficient funding
• Schedule constraints
• Increases in scope

Operational Risks: risks associated with a solution’s ability to perform in an operational environment. Examples:
• Non-ruggedized equipment
• Inappropriate operational environment
• “Acts of God”

Not all risks can be eliminated, but they should be identified, catalogued, prioritized, and managed to minimize their impact on the experiment. Experimenters can find additional information on risk management practices in DAU’s “Defense Acquisition Guidebook”57 or the “DoD Risk, Issue, and Opportunity Management Guide for Defense Acquisition Programs.”58
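As one illustrative way to catalogue and prioritize risks, the following Python sketch scores each risk by likelihood and consequence (a common 5x5 convention in DoD risk management) and ranks the register; the entries and scales are assumptions for illustration only.

# A minimal sketch of a prioritized risk register; entries and the 1-5
# likelihood/consequence scales are illustrative assumptions.
risks = [
    # (description, category, likelihood 1-5, consequence 1-5)
    ("Prototype fails to perform as advertised", "Experiment", 3, 4),
    ("Insufficient trained participants", "Experiment", 2, 4),
    ("Funding shortfall", "Programmatic", 2, 5),
    ("Severe weather at field venue", "Operational", 3, 3),
]

# Rank risks by a simple likelihood x consequence score, highest first,
# so mitigation effort goes to the highest-scoring risks.
for desc, cat, lik, con in sorted(risks, key=lambda r: r[2] * r[3], reverse=True):
    print(f"[{cat}] {desc}: score {lik * con}")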

55 Alberts, Code of Best Practice: Experimentation, 242.
56 Experiment Planning Guide, 22.
57 Defense Acquisition University, Defense Acquisition Guidebook (2018), https://www.dau.mil/tools/dag.
58 Department of Defense, Department of Defense Risk, Issue, and Opportunity Management Guide for Defense Acquisition Programs (Washington, DC: Department of Defense, 2017), https://www.dau.mil/tools/Lists/DAUTools/Attachments/140/RIO-Guide-January2017.pdf.


5.2.7 Selecting the Experimentation Venue. DoD holds numerous events each year where experiments are conducted. These venues may be physical or virtual, depending on the type of experiment being conducted and the objectives of the experiment. According to the U.S. Air Force Scientific Advisory Board, as long as a venue facilitates the exploration of ideas and insights, the venue (whether physical or virtual) can be used for experimentation.59 A critical component of the selection decision is the infrastructure that the venue offers. Physical venues should include appropriate infrastructure to support data collection, enable capturing the locations of relevant entities over time, and permit or provide adequate communication for the experiment team.

5.2.7.1 Relevant Environment. Experimenters should select a venue that maximizes the relevance of the environment to the problem the experiment is intended to inform. Not all relevant environments need to be operational environments. Depending on the problem, the relevant environment could be a virtual crowdsourcing environment, a laboratory bench, a seminar or workshop, a wind tunnel, a test and evaluation facility, a simulated environment, a defense experimentation venue, or a training exercise—to name just a few. The key is to ensure the venue environment is relevant to the problem statement and allows the experimentation team to implement the experiment as designed. For example, a large exercise or wargame may seem like an ideal venue for an experiment because of the opportunity for hands-on warfighter involvement with the proposed solution. However, because of the cost and scope of these venues, it is unlikely that multiple trials (necessary for many experiments) will be conducted, which could impact the ability of the decision makers to make an informed decision. On the other hand, if the ultimate objective is to deploy the solution for operational use, the relevant environment must include hands-on experimenting with the solution by the warfighter in an operationally representative environment.

5.2.7.2 Examples of DoD Experimentation Venues. This subsection provides a representative sample of DoD experimentation venues. Participants in these events are typically responsible for covering their own costs. For additional information regarding each event, the names of the events are hyperlinked to their applicable online presence (as of the date of this publication).

The Naval Undersea Warfare Center Division Newport conducts the annual ANTX, which provides a maritime demonstration and experimentation environment that targets specific technology focus areas or emerging warfighting concepts with a goal of getting potential capabilities out to the warfighter in 12 to 18 months. ANTXs are low-barrier-to-entry, loosely scripted experimentation events where technologists and warfighters are encouraged to explore alternate tactics and technology pairings in a field or simulated environment. Participants receive feedback from government technologists and operational SMEs. ANTXs are hosted by labs and warfare centers from across the naval R&D establishment.

59 United States Air Force Scientific Advisory Board, United States Air Force Scientific Advisory Board Report on System-Level Experimentation: Executive Summary and Annotated Brief, SAB-TR-06-02 (Washington DC: United States Air Force Scientific Advisory Board, 2006), 10, https://apps.dtic.mil/dtic/tr/fulltext/u2/a463950.pdf.

The Army Maneuver Center of Excellence conducts an annual AEWE campaign of experimentation to identify concepts and capabilities that enhance the effectiveness of the current and future forces by putting new technology in the hands of Soldiers. AEWE is executed in three phases—live fire, non-networked, and force-on-force—providing participants the opportunity to examine emerging technologies of promise, experiment with small unit concepts and capabilities, and help determine DOTMLPF-P implications of new capabilities.

CBOAs are scenario-based events that support vulnerability and system limitation analysis of emerging capabilities in chemically- and biologically-contested environments. These live field experiments, conducted at operationally relevant venues, provide an opportunity for technology developers to interact with operational personnel and determine how their efforts might fill military capability gaps and meet high priority mission deficiencies. CBOAs are sponsored by the Defense Threat Reduction Agency’s Research and Development-Chemical and Biological Warfighter Integration Division.

The JIFX program conducts quarterly collaborative experimentation in an operational field environment using established infrastructure at Camp Roberts and San Clemente Island. JIFX experiments provide an environment where DoD and other organizations can conduct concept experimentation using surrogate systems, demonstrate and evaluate new technologies, and incorporate emerging technologies into their operations. JIFX is run by the Naval Postgraduate School.

Sea Dragon 2025 is a series of real-world experiments intended to refine the U.S. Marine Corps (USMC) of the future. Sea Dragon experiments are conducted in several phases that span a number of years. The first phase concentrated on the future makeup of the USMC infantry battalion. The second phase is an ongoing three-year campaign focusing on hybrid logistics, operations in the information environment, and expeditionary advanced base operations. Sea Dragon 2025 is run by the Marine Corps Warfighting Laboratory.

USSOCOM conducts TE events throughout the U.S. with Government, academia, and private industry representation. TE events are typically held in austere, remote outdoor locations under various weather and environmental conditions, creating a setting where technology developers can interact with the Special Operations Forces (SOF) community in a collaborative manner. TE events are conducted by USSOCOM’s SOF Acquisition, Technology, and Logistics Center.

5.3 Soliciting Proposed Solutions for Experiments. The need to solicit proposals depends on the dynamics of the experiment. When a technology or concept solution is already known and an experiment is planned to further refine or determine the operational utility of the solution, this step is not needed. However, as is often the case, the problem statement is drafted without a specific solution in mind. In these cases, once the problem statement is clearly drafted and the experiment plan is developed, the next major activity is soliciting potential solutions that meet the stated need. Potential solutions can be obtained from a number of sources. DoD Project/Program Managers or Program Executive Officers may recognize and offer legacy or new capabilities as potential solutions to the problem. National laboratories, defense laboratories, centers of excellence, and other DoD organizations are also great sources of new capability and prototypes that should be considered. Another approach is reaching out to Federally Funded Research and Development Centers and University Affiliated Research Centers that develop technology solutions. Finally, industry and other academic institutions are also great sources of innovative solutions.

When seeking non-Government, non-sole-source solutions, the FAR requires the use of the Federal Business Opportunities website (https://www.fbo.gov) for opportunities greater than $25,000. This website is a great resource for reaching traditional partners. However, experimenters who want to expand their target audience to include nontraditional suppliers of potential solutions will need to exploit alternative solicitation strategies. Additional information and best practices regarding soliciting potential solutions from traditional and nontraditional suppliers can be found in Section 5.3 of the DoD Prototyping Guidebook.60

5.4 Selecting Potential Solutions for Experiments. Determining which of the proposed solutions to include in the experiment is the next step in the process. To identify the most promising, innovative, and cost-effective solutions, experimenters should establish selection criteria that clearly address the purpose or objective of the experiment. These criteria will often be weighted to emphasize specific attributes of the solution over others. Selection criteria and their weighting should be developed to address the problem statement directly, the future decision to be made, and the data needed to make that decision. Additional information and best practices associated with selecting potential solutions can be found in Section 5.4 of the DoD Prototyping Guidebook.61
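To illustrate how weighted selection criteria work in practice, the following Python sketch computes a weighted score for each proposed solution; the criteria, weights, and scores are hypothetical assumptions, not values prescribed by this guidebook or the DoD Prototyping Guidebook.

# A minimal sketch of weighted selection scoring; criteria, weights
# (summing to 1.0), and candidate scores are illustrative assumptions.
criteria_weights = {
    "technical_maturity": 0.30,
    "operational_relevance": 0.30,
    "cost": 0.25,
    "schedule": 0.15,
}

# Candidate solutions scored 1-5 against each criterion by evaluators.
candidates = {
    "Solution A": {"technical_maturity": 4, "operational_relevance": 5, "cost": 3, "schedule": 4},
    "Solution B": {"technical_maturity": 5, "operational_relevance": 3, "cost": 4, "schedule": 5},
}

for name, scores in candidates.items():
    weighted = sum(criteria_weights[c] * s for c, s in scores.items())
    print(f"{name}: weighted score = {weighted:.2f}")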

60 Department of Defense, Department of Defense Prototyping Guidebook, Version 1.1, 12. 61 Department of Defense, Department of Defense Prototyping Guidebook, Version 1.1, 15.


5.5 Preparing For and Conducting Experiments. Preparing for an experiment starts as soon as the venue is selected, long before the actual event occurs, and proceeds in an iterative fashion throughout the planning and execution process. As the evolving plan identifies new requirements for the experiment, experimenters begin the effort to satisfy those requirements. If the venue is unable to meet an experiment requirement, experimenters will need to revise the plan. This iterative process continues through the experiment execution. Experiment planning and preparation culminate in an event that typically runs for three days to two weeks.63

While the scope and complexity of experiments differ significantly depending on the type of experiment conducted, the following subsections address major topics of consideration that are fairly universal for all experiments. Naturally, the activities associated with each of these topics will vary greatly depending on the experimentation method employed. For example, the activities associated with field experiments are nearly always more substantial and complex than the activities associated with workshops or simulations. The following subsections are written to address the activities typically required for more rigorous field experiments. Regardless of the scope and complexity of the experiment, however, experimenters should consider each of these topics as they plan for, prepare, and execute their experiments. (For additional information regarding setting up and executing tabletop exercises64 and wargames,65 please refer to the footnoted references.)

62 Carly Jackson et al., “Application of Set-Based Decision Methods to Accelerate Acquisition through Tactics and Technology Exploration and Experimentation (TnTE2)” in Proceedings of the Fifteenth Annual Acquisition Research Symposium (May 2018): 350, https://calhoun.nps.edu/bitstream/handle/10945/58779/SYM-AM-18-095-024_Jackson.pdf?sequence=1&isAllowed=y.
63 Experiment Planning Guide, 83.

Best Practices for Selecting Potential Solutions for Experimentation

• The Warfighting Lab Incentive Fund office employs members of the Joint Staff and the Office of Cost Assessment and Program Evaluation to evaluate submissions using the following criteria:
o Potential for disruptive innovation
o Potential contribution to off-set key US vulnerabilities
o Potential for cost imposition/enhancements to US national interest across the conflict continuum
o Potential cost/benefit for the Department
o Amount of funding requested
o Time required to execute and generate results
o Potential for advancing US national interests
o Past performance of proposing organization

• The Navy’s Tactics and Technology Exploration and Experimentation (TnTE2) methodology uses two categories of criteria to select solutions—technical ability and potential operational utility.62 Table 3 provides examples of criteria considered under each of these categories.

Table 3: Examples of Selection Criteria for Navy's TnTE2 Methodology

Technical Ability:
• Technical maturity
• Readiness to integrate with other systems
• Reliability
• Standardization

Potential Operational Utility:
• Operational relevance
• Personnel burden
• Environmental constraints



5.5.1 Logistics and Set Up. The set-up schedule is dictated by the scope and complexity of the experiment. The greater the scope and complexity and the higher the levels of validity, reliability, precision, and/or credibility required, the longer the lead-time needed to prepare for the experiment. Physical set-up at the venue typically occurs 2-4 weeks before the experiment begins;66 however, many logistics activities must begin long before the physical set-up. For example, to ensure availability when needed, experimenters must begin the effort early to secure specific requirements like frequency spectrum, airspace clearance, and military training ranges. Likewise, experiment participants must be notified with sufficient lead-time to secure travel and billeting.

The following list contains examples of logistics activities that experimenters should address during the 2-4 week set-up time prior to the experiment, ensuring that:

• Necessary infrastructure is available and operable;
• Systems operate correctly and interoperate, as appropriate, with other systems;
• All nodes are sufficiently challenged and present an adequate representation;
• Communications methods function effectively;
• Instrumentation is calibrated and synchronized; and
• Contingency plans have been prepared.67

5.5.2 Training. All experiment participants must be adequately trained to ensure they are able to effectively perform their functions. Inadequately trained participants create a significant risk to an otherwise well-constructed experiment. Training will usually occur during the 2-4 week experiment set-up period, but preparations must begin long before. A well-planned training program, including training materials, is key to successful participant involvement. Training should focus on four groups of participants: subjects, data collectors, experiment controllers, and the support team.

64 Eugene A. Razzetti, “Tabletop Exercises: for Added Value in Affordable Acquisition,” Defense AT&L Vol XLVI, No. 6, DAU 259 (November - December 2017): 26-31, https://www.dau.mil/library/defense-atl/_layouts/15/WopiFrame.aspx?sourcedoc=/library/defense-atl/DATLFiles/Nov-Dec_2017/DATL_Nov_Dec2017.pdf&action=default.
65 United States Army War College, Strategic Wargaming Series: Handbook (Carlisle, PA: United States Army War College, 2015), https://ssi.armywarcollege.edu/PDFfiles/PCorner/WargameHandbook.pdf.
66 Experiment Planning Guide, 83.
67 Experiment Planning Guide, 83.


5.5.2.1 Subjects. Training for experiment subjects should focus on the purpose of the experiment, the background and scenario(s), processes subjects will use during the experiment, and technical skills needed to operate the systems being evaluated as well as infrastructure equipment necessary for the experiment. If the experiment includes a comparison of multiple systems, subjects will need to be proficient in all systems being evaluated, including hands-on training when possible. Experimenters should consider requiring subjects to pass a proficiency exam prior to the start of the experiment.

5.5.2.2 Data Collectors. While providing a thorough overview of the basics of the experiment (e.g., purpose, context, problem statement, hypotheses, scenario(s), and major events), training for data collectors should focus on techniques for observation and data collection as well as the timing and location of the data to be collected. In addition, data collectors and experiment controllers must be clear on the data that the analysts expect to receive and be prepared to identify and record anomalies so that the analysts know how to treat the affected data. Experimenters should consider evaluating the data collectors’ proficiency with collection methodologies, tools, and processes through a written exam as well as a dry run of their data collection tasks.

5.5.2.3 Experiment Controllers. Experiment controllers require training on experiment basics with an emphasis on the scenarios and MSELs. Responsible for the successful execution of the scenarios, controllers must have a thorough understanding of the timing and application of scenario injects and be proficient in other appropriate controller responsibilities.

5.5.2.4 Support Team. The support team must be well trained on the experiment basics as well as the systems they will be expected to operate. Often, the support team ends up training other participants on their roles or on the use of technical systems.

5.5.3 Pre-Experiment Dry Run. Successful experiments are typically preceded by a full run-through of every aspect of the experiment. This run-through includes conducting pretests of individual systems and experiment components (e.g., workstations, communications networks, databases, etc.) in stand-alone mode to ensure their functionality, as well as exercising them in an integrated system-of-systems approach to confirm their interoperability. Experimenters should also run full trials of each scenario using fully-trained subjects, data collectors, controllers, and support team staff, stressing the system to at least the same level expected during the experiment. Finally, the dry run should produce the same data expected during the experiment, and analysts should reduce and analyze the data as planned during the experiment.

5.5.4 Execution. Experiments typically run from three days to two weeks, with the duration being a function of the scope and complexity of the experiment or the experiment venue. Each day should begin with a review of planned activities for the day and should end with a review of experiment activities conducted that day and a discussion of changes that should be made to increase the effectiveness of the next day’s activities. The experiment management team typically performs the control function during the experiment and is responsible for executing the MSEL events at the time and in the manner that they are scheduled to occur. As with any event, flexibility and ingenuity are required to address complications experienced during execution that challenge the schedule or effectiveness of the experiment.

5.5.5 Data Collection and Management. Data collectors should follow the collection, handling, and storage procedures contained in the data collection plan. Experiment leads should evaluate data collection activities daily to ensure the correct data are being collected in the format needed for analysis and that they are handled and stored according to plans. Instrumentation used to collect data must be calibrated and operated in a way that minimizes any disruption to the operational realism experienced by the participants. The data collectors should be monitored and critiqued continuously to ensure that data are being collected consistently across collectors and in the manner specified in the collection plan. Raw data should be reduced as soon as possible, per the collection plan, and both the raw data and reduced data must be archived for analysis purposes.

5.6 Data Analysis and Interpretation. The final step in the experimentation process is analyzing and interpreting the data collected during the experiment. Data analysis should be conducted using the tools and techniques detailed in the data analysis plan. While being sure to complete the analysis contained in the plan, analysts should also be encouraged to pursue excursions with data of interest outside of the analysis plan. Technologists and operational SMEs should then interpret the results of the analysis, validating or invalidating the hypothesis, and provide decision makers the information they need to inform the decision that initiated the experiment.

5.7 Results of Experimentation. The measure of a successful experiment is whether it produces sufficient evidence to conclude whether a cause-and-effect relationship exists between two variables—even if the experiment does not produce the expected results. If the experiment does not produce the necessary evidence, experimenters can choose to conduct the experiment again (if schedule and funding permit) or they can terminate the effort to investigate the relationship between the variables. However, experiments that do produce sufficient evidence typically result in one or more of the following actions.

Best Practices for Data Analysis and Interpretation

• For experiments to have maximum effect, rather than simply tabulating the data, experimenters should interpret the data and draw applicable conclusions.

• At the conclusion of the experiment, after the data has been analyzed and interpreted, experimenters should revisit the purpose/hypothesis for the experiment and, as explicitly as possible, state what was learned through the data and what was not.

Best Practices for Results of Experimentation

• Experimenters should consider institutionalizing the results and valuable lessons learned during their experiment in available databases so other stakeholders across the defense community can benefit from their work.


5.7.1 Data are Used to Create or Update Models. Experimentation data can be used to either create new models or validate and refine existing models. In some cases, experimenters will create a conceptual model at the start of the experimentation process to assist in developing the problem statement, hypothesis, or the experiment design. Experimenters can then use the data from the experiment for sensitivity analysis to validate the model, revealing how stable the model is in light of the intervening variables, or they can use the data to modify the model so it better reflects the results of the experiment.

5.7.2 Results Generate or Refine Requirements for New Experiments. A single experiment may not generate the information needed for senior leaders to conclude whether the proposed solution will solve the problem. Sometimes, decisions require a series of experiments testing different facets of the solution. This is especially true when experimenters initiate the process to solve a complex problem and recognize that they will need numerous experiments to generate the type of information decision makers will need. Such series are often referred to as campaigns of experimentation: experimenters apply a systematic approach to planning and conducting related serial and parallel experiments in order to methodically move a solution from a vague idea to a fielded system or approach.68 In these cases, the results of one experiment can generate or refine hypotheses for subsequent experiments.

5.7.3 Results Generate Changes to the Proposed Solution. Sometimes experimentation helps to mature the proposed solution. In the case of a non-materiel DOTMLPF-P solution, results of the experiment may reveal changes that need to be made to the proposed solution or another DOTMLPF-P element to make it more effective. In the case of a materiel solution, the results of the experiment may support the transition of the technology further along DoD’s technology readiness level continuum or identify changes that a technologist will want to make to the design of a prototype to improve its effectiveness or reduce its lifecycle cost.

5.7.4 Failed Solutions are Filtered Out. Successful experiments will sometimes identify potential solutions that fail to solve the problem being studied. Identifying failed solutions is as important as identifying successful solutions because it provides decision makers with information they need to terminate R&D activities associated with failed solutions and reallocate R&D resources to other promising capabilities.

5.7.5 Successful Solutions Transition to Operations. In some cases, at the conclusion of the experiment, the solutions will transition to operational use to address an existing critical warfighter capability gap. These solutions can exist along the entire DOTMLPF-P spectrum. Experiments evaluating non-materiel solutions may result in recommendations to operationalize one or more of the non-materiel DOTMLPF-P solutions

68 For additional information regarding campaigns of experimentation, please refer to the following source: David S. Alberts and Richard E. Hayes, Code of Best Practice: Campaigns of Experimentation (Washington, DC: Command and Control Research Program, 2005), http://www.dodccrp.org/files/Alberts_Campaigns.pdf.


evaluated. Experiments evaluating materiel solutions may result in a fielded materiel operational capability.

For experiments where operationalizing the solution is an objective, it is critical for the innovator, program manager, and the operational unit to begin collaborating early in the planning phase and continue interacting throughout the project. This collaboration will enable the stakeholders to:

• Clearly understand the operational need;
• Establish the criteria that define a successful experiment in an operational environment;
• Develop an appropriate sustainment package (e.g., standard operating procedures, training requirements, etc.); and
• Ensure appropriate system safety, security, and technical certifications are delivered with the capability.

5.7.6 Successful Solutions Transition to Rapid Fielding. In Section 804 of the FY16 NDAA, Congress provided an expedited acquisition pathway to rapidly field successful technical solutions.69 This Middle Tier Acquisition pathway is available to decision makers for solutions that meet the following criteria:

• Existing products and proven technology (with minimal development required) that meet needs communicated by the warfighter;

• Selected using a merit-based process;
• Performance was successfully demonstrated and evaluated for current operational purposes;
• Lifecycle costs and issues of logistics support and system integration are addressed; and
• Production must begin within six (6) months and complete fielding within five (5) years of an approved requirement.

5.7.7 Successful Solutions Integrate Into Existing Programs of Record (PoRs) or Initiate New Acquisition Programs. Decision makers may choose to initiate new FAR-based acquisition programs for successful solutions or integrate the solutions into an existing PoR through traditional acquisition pathways pursuant to DoD Instruction 5000.02, “Implementation of the Defense Acquisition System.” If this pathway is expected from the outset of experiment planning, early collaboration with appropriate DoD and Military Service process owners and the receiving PoR should be initiated to ensure integration and interoperability success.

69 National Defense Authorization Act for Fiscal Year 2016, Pub. L. No. 114-92 § 804, 129 Stat. 883 (2015), https://www.gpo.gov/fdsys/pkg/PLAW-114publ92/pdf/PLAW-114publ92.pdf.


6 Summary

The NDS highlights that U.S. national security is affected by the rapid development of technological advancements that are accessible to both state and non-state actors and by novel applications of technologies integrated with emerging concepts. This has eroded the technological overmatch the U.S. military has enjoyed for decades. Current bureaucratic processes that emphasize exceptional performance, thoroughness, and minimizing risk at the expense of speed have directly contributed to this erosion. To combat this, the NDS challenges the Department to

“Streamline rapid, iterative approaches from development to fielding. A rapid, iterative approach to capability development will reduce costs, technological obsolescence, and acquisition risk…increase speed of delivery, enable design tradeoffs in the requirements process, expand the role of warfighters and intelligence analysis throughout the acquisition process, and utilize non-traditional suppliers.”70

Experimentation is a tool that enables much of what the NDS calls for—speed, an iterative approach, tradeoffs, and expanded roles of warfighters and intelligence analysis. The information and best practices provided in this guidebook are designed to help senior leaders, decision makers, staff officers, and experimenters most effectively use experimentation to inform decisions, supporting the ultimate goal of delivering capabilities to the warfighter at the speed of relevance.

70 Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 11.


Appendix 1: Acronyms

AEWE Army Expeditionary Warrior Experiment

ANTX Advanced Naval Technology Exercise

CBOA Chemical Biological Operational Analysis

CONOPS Concept of Operations

DAU Defense Acquisition University

DoD Department of Defense

DOTMLPF-P Doctrine, Organization, Training, Materiel, Leadership and Education, Personnel, Facilities, and Policy

DSB Defense Science Board

FAR Federal Acquisition Regulation

FY Fiscal Year

HITL Human-in-the-Loop

IPL Integrated Priority List

JCIDS Joint Capabilities Integration and Development System

JIFX Joint Interagency Field Experimentation

M&S Modeling and Simulation

MSEL Master Scenario Event List

NDAA National Defense Authorization Act

NDS National Defense Strategy

OPSEC Operations Security

P&E Prototypes and Experiments

PoR Program of Record

R&D Research and Development

SME Subject Matter Expert


SOF Special Operations Forces

TE Technical Experimentation

USD(R&E) Under Secretary of Defense for Research and Engineering

USMC U.S. Marine Corps

USSOCOM U.S. Special Operations Command


Appendix 2: Definitions

Credibility. Measure of understanding, respect, and acceptance of the results by the professional communities participating in the experiment.

Defense Experimentation. Testing a hypothesis, under measured conditions, to explore unknown effects of manipulating proposed warfighting concepts, technologies, or conditions.

Dependent Variable. Feature or attribute of the subject of an experiment that is expected to change as a result of the introduction or manipulation of other influencing factors.

External Validity. Experimental design and conduct that ensures the results of the experiment can be generalized to other environments.

Hypothesis. A formal statement of the problem being evaluated and a proposed solution to that problem. Often written in an if-then format that describes a proposed causal relationship between the proposed solution and the problem, the “if” part of the statement represents the proposed solution (the independent variable) and the operational constraints to be controlled (intervening variables), and the “then” part of the statement addresses the possible outcome to the problem (the dependent variable).

Independent Variable. An influencing factor in an experiment that is not changed by other factors in the experiment and is introduced or manipulated in order to observe the impact on the subject of the experiment.

Internal Validity. Experimental design and conduct that ensures that no alternative explanations exist for the experiment results.

Intervening Variable. Feature of an experiment that, unless controlled, could affect the results of the experiment.

Master Scenario Event List (MSEL). A document that lists all of the scenario events/injects for an experiment.

Middle Tier Acquisition Pathway. Acquisition pathway that uses the authorities in Section 804 of the FY16 NDAA to fill the gap between traditional PoRs and urgent operational needs. Rapid prototyping must be completed within a period of five years. Rapid fielding must begin production within six months of initiation and be completed within another five years.71

Military Capability Gap. Needs or capability gaps in meeting national defense strategies that are generated by the user or user-representative to address mission area deficiencies, evolving threats, emerging technologies, or weapon system cost improvements. For the purposes of prototyping and rapid fielding, military capability gaps include both formal requirements listed in approved JCIDS documents as well as other needs identified through the Combatant Command IPL accepted into the Chairman’s Capability Gap Assessment process, critical intelligence parameter breaches, and emerging needs identified through formal threat, intelligence, and risk assessments.

71 National Defense Authorization Act for Fiscal Year 2016, § 804, 129 Stat. 882-883.

Nontraditional Defense Contractor. An entity that is not currently performing and has not performed, for at least the one-year period preceding the solicitation of sources by the DoD for the procurement or transaction, any contract or subcontract for the DoD that is subject to full coverage under the cost accounting standards prescribed pursuant to section 1502 of title 41 and the regulations implementing such section.72

Precision. Measure of whether or not the instrumentation is calibrated to tolerances that enable detection of meaningful differences.

Prototype. A physical or virtual model that is used to evaluate feasibility and usefulness.

Reliability. Measure of the objectivity of the experiment. Experimental design and conduct that ensures repeatability of the results of the experiment when conducted under similar conditions by other experimenters.

Rapid Prototyping. A prototyping pathway using nontraditional acquisition processes to rapidly develop and deploy prototypes of innovative technologies. It is the intent that these technologies provide new capabilities to meet emerging military needs, are demonstrated in an operational environment, and provide a residual operational capability within five years of project approval.

Scenario Events/Injects. Pre-planned events intended to drive the actions of experiment participants.

Technologists. Scientists and engineers proficient in the experiment domain.

Technology Base. The development efforts in basic and applied research.

Validity. Measure of how well the experiment measures what it intends to measure.

72 National Defense Authorization Act for Fiscal Year 2016, 10 U.S.C. § 2302(9) (2015), https://www.law.cornell.edu/uscode/text/10/2302.


Appendix 3: References

Accelerating New Technologies to Meet Emerging Threats: Testimony before the U.S. Senate Subcommittee on Emerging Threats and Capabilities of the Committee on Armed Services. 115th Cong., 2018 (testimony of Michael D. Griffin, Under Secretary of Defense for Research and Engineering (USD(R&E)). https://www.armed-services.senate.gov/imo/media/doc/18-40_04-18-18.pdf.

Alberts, David S. and Richard E. Hayes. Code of Best Practice: Experimentation. Washington, DC: Command and Control Research Program, 2002. http://dodccrp.org/files/Alberts_Experimentation.pdf.

Defense Acquisition University. Defense Acquisition Guidebook. 2018. https://www.dau.mil/tools/dag.

Defense Science Board. The Defense Science Board Report on Technology and Innovation Enablers for Superiority in 2030. Washington DC: Department of Defense, 2013. https://www.acq.osd.mil/DSB/reports/2010s/DSB2030.pdf.

Department of Defense. Department of Defense Prototyping Guidebook, Version 1.1. Washington, DC: Department of Defense, 2019. https://www.dau.mil/tools/t/DoD-Prototyping-Guidebook.

Department of Defense. Department of Defense Risk, Issue, and Opportunity Management Guide for Defense Acquisition Programs. Washington, DC: Department of Defense, 2017. https://www.dau.mil/tools/Lists/DAUTools/Attachments/140/RIO-Guide-January2017.pdf.

Department of Defense. DoD Participation in the National Exercise Program (NEP). DoD Instruction 3020.47. Washington DC: Department of Defense, 2019. https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/302047p.pdf?ver=2019-01-29-080914-067.

Experiment Planning Guide. Norfolk, VA: Navy Warfare Development Command, 2013.

Jackson, Carly, Aileen Sansone, Christopher Mercer, and Douglas King. “Application of Set-Based Decision Methods to Accelerate Acquisition through Tactics and Technology Exploration and Experimentation (TnTE2).” In Proceedings of the Fifteenth Annual Acquisition Research Symposium (May 2018): 335-361. https://calhoun.nps.edu/bitstream/handle/10945/58779/SYM-AM-18-095-024_Jackson.pdf?sequence=1&isAllowed=y.

Kass, Richard A. The Logic of Warfighting Experiments. Washington, DC: Command and Control Research Program, 2006. http://www.dodccrp.org/files/Kass_Logic.pdf.

Mattis, James N., Secretary of Defense. Summary of the 2018 National Defense Strategy of the United States of America: Sharpening the American Military’s Competitive Edge. Washington, DC: Department of Defense, 2018. https://dod.defense.gov/Portals/1/Documents/pubs/2018-National-Defense-Strategy-Summary.pdf.


McLeroy, Carrie. “History of Military gaming.” U.S. Army. August 27, 2008. https://www.army.mil/article/11936/history_of_military_gaming.

National Defense Authorization Act for Fiscal Year 2016. 10 U.S.C. § 2302(9). Washington, DC, 2015. https://www.law.cornell.edu/uscode/text/10/2302.

National Defense Authorization Act for Fiscal Year 2016. Pub. L. No. 114-92 § 804, 129 Stat. 882. Washington, DC, 2015. https://www.gpo.gov/fdsys/pkg/PLAW-114publ92/pdf/PLAW-114publ92.pdf.

NATO Code of Best Practice for C2 Assessment. Washington, DC: Command and Control Research Program, 2002. http://dodccrp.org/files/NATO_COBP.pdf.

Razzetti, Eugene A. “Tabletop Exercises: for Added Value in Affordable Acquisition.” Defense AT&L Vol XLVI, No. 6, DAU 259 (November - December 2017): 26-31. https://www.dau.mil/library/defense-atl/_layouts/15/WopiFrame.aspx?sourcedoc=/library/defense-atl/DATLFiles/Nov-Dec_2017/DATL_Nov_Dec2017.pdf&action=default.

Report to Congress: Restructuring the Department of Defense Acquisition, Technology and Logistics Organization and Chief Management Officer Organization, In Response to Section 901 of the National Defense Authorization Act for Fiscal Year 2017 (Public Law 114 - 328). Washington DC: Department of Defense, 2014. https://dod.defense.gov/Portals/1/Documents/pubs/Section-901-FY-2017-NDAA-Report.pdf.

The Technical Cooperation Program. Pocketbook Version of GUIDEx (Slim-Ex): Guide for Understanding and Implementing Defense Experimentation. Ottawa, Canada: Canadian Forces Experimentation Centre, 2006. https://www.acq.osd.mil/ttcp/guidance/documents/GUIDExPocketbookMar2006.pdf.

United States Air Force Scientific Advisory Board. United States Air Force Scientific Advisory Board Report on System-Level Experimentation: Executive Summary and Annotated Brief. SAB-TR-06-02. Washington DC: United States Air Force Scientific Advisory Board, 2006. https://apps.dtic.mil/dtic/tr/fulltext/u2/a463950.pdf.

United States Army War College. Strategic Wargaming Series: Handbook. Carlisle, PA: United States Army War College, 2015. https://ssi.armywarcollege.edu/PDFfiles/PCorner/WargameHandbook.pdf.

U.S. Congress, House. Conference Report: National Defense Authorization Act for Fiscal Year 2017. S. 2943. 114th Cong., 2d sess. 2016. https://www.congress.gov/114/crpt/hrpt840/CRPT-114hrpt840.pdf.