
Department of Systems Engineering and Engineering Management, Stevens Institute of Technology

© 2005 Stevens Institute of Technology, ISBN 0-615-12843-2 PROCEEDINGS CSER 2005, March 23-25, Hoboken, NJ, USA

Systemics of NASA’s Faster-Better-Cheaper Systems Management And Systems Engineering: A Framework For Agile Systems Engineering and Systems Management For Mission Success

Michael J. DiMario Lockheed Martin & Stevens Institute of Technology

23 Badger Drive, Skillman, NJ 08558 [email protected]

Abstract

“The largest obstacle to low-cost innovation is the belief that it cannot be done.” (McCurdy 2001) This statement is a cornerstone of achieving agile systems development and producing reliable products with complex technologies at reasonable cost and in reasonable time. NASA’s Faster-Better-Cheaper (FBC) vision is important for agile systems engineering and systems management. Several FBC programs failed because they neglected to adopt necessary practices in systems management. Systems management originally evolved to manage risky programs and technologies. Risk management techniques, robust systems engineering, robust trade studies, configuration management, and tailored systems management processes are required to manage FBC environments and markets. The principles exhibited by the Pathfinder program align with the values and principles of the Agile Manifesto. The principles of the Agile Manifesto may be the framework required for flexible and fast development of non-software engineering programs, provided they include vital systems management and systems engineering processes.

1.0 Introduction and Problem

Complex systems possess characteristics that will cause them to fail, as discussed in (Perrow 1984). Many studies cite how engineers architect systems to counter single points of failure in complex systems but have difficulty with component interactions, as discussed in (McCurdy 2001, Perrow 1984, Petroski 1992). For example, the Mars Polar Lander (MPL) was destroyed due to the unpredictable nature of component interactions. Failure is inevitable and inherent. Management techniques can only influence how likely failure is to occur. Engineering is a human endeavor, and because of this we cannot achieve a risk-free society as we become more dependent on ever more complex technologies.

For traditional NASA programs as well as large-scale non-NASA programs, economy and systems reliability are polar opposites. In systems engineering, trade studies are performed whereby cost is traded for reliability as increased reliability increases cost. Shortened schedules are accomplished via increased budgets while reducing risk increases cost and schedule.

The importance of cost is heightened as government programs come under increasing scrutiny. The government is becoming more responsive to costs, as revealed by (Osborne 1992), and the commercial markets face unprecedented international competition from cheaper educated labor markets. NASA programs have differed from other programs because they traditionally had highly reliable organizations and processes, since the product of their programs cannot be retrieved and placed back on the bench for modifications and further testing.

Achieving high reliability organizations is difficult at best. (Perrow 1984) suggests that achieving high reliability organizations is not possible. One reason is that trial-and-error risk taking is preferable to risk aversion because one learns from adversity, not from avoiding adversity, as suggested by (Wildavsky 1989). Nevertheless, the very large scale programs of the Apollo moon landings, US Navy Aegis fleet defense weapons systems, and Lucent’s 5ESS™ telecommunications switch are examples of highly reliable programs and products. Even these programs, however, are costly because of their high product reliability.

A culture of high reliability at low cost is possible if the proper techniques are adopted as suggested by the principles of the Agile Manifesto. Engineering systems mirror the social values and abilities of the designers and builders and the operation of complex technologies is not merely a technical issue as it involves social elements as well. The organizational social values define the culture and thus define the systems management of the organization and program. For engineering domains, this culture shall be referred to as “tribal knowledge” and will be more fully defined in the following sections.

High reliability organizations adopt a culture of high reliability, making extensive use of redundancy and engaging in constant learning through risk taking, trial, and analysis of error. Historically, spacecraft engineers have used formal systems engineering and management to control risk and avoid failure.

Historically, systems analysis and systems engineering were created by engineers to analyze and coordinate large-scale development projects. These same engineers communicated through ad hoc committees and channels that maintained their control and power and were unique to each individual program. However, those who controlled the funding sought ways to better control a seemingly uncontrollable development process. A solution was found through configuration management and change control board activities that linked managerial processes with the technical processes, as described by (Johnson 2002). Stakeholders emphasized a specific set of processes to meet their needs. These processes delivered a service for the stakeholder concerning the program and became for them their “systems-of-interest”.

A traditional NASA systemic is that reducing risk drives increasing cost and schedule. This systemic is a reinforcing loop whereby a small change in reliability drives significant increases in cost and schedule. The FBC failures discussed here result from a need to change traditional systems management and systems engineering processes because of increased cost and schedule, without a thorough understanding of the extent to which processes should be changed for mission success.

2.0 FBC Approach

The disciplines of systems engineering and management continue to evolve as programs and the business of engineering continue to evolve. NASA’s FBC philosophy is considered by many to be a success, as described in (NASAb 2000), and is applicable to other large-scale programs of the federal and commercial sectors, as discussed by (Muirhead 1999). NASA as well as the commercial sector established systems engineering and systems management processes to solve critical problems. The systems grew from artisan knowledge processes that are non-repeatable and are known only by their executors. These processes have evolved into what is now referred to as “formal systems engineering” processes that are repeatable and reliable. Formal systems management and systems engineering methods were the primary factors in the improved dependability of ballistic missiles and spacecraft. Missile reliability in the Air Force and JPL missile programs improved from 50% to 95% (Johnson 2002).

However, the pendulum continues to swing between formal and artisan processes and may settle somewhere in between, given the forces of business and government economic dynamics. The set of systems engineering and management processes on which the pendulum will center is also evolving.

Codified and Tribal Knowledge. Codified knowledge is knowledge placed into a repeatable form or process. This process is meant to pass on scientific, engineering, or managerial practices. Documenting lessons learned from success and failure is a means to capture this knowledge. Documenting successful practices and standardizing them creates consistency. Unlike artisan knowledge, tribal knowledge is known by the members of an organization; the knowledge and culture of the organization are intertwined and have direct effects on the program. Artisan and tribal knowledge become codified knowledge only if they are captured and repeated.

Tribal knowledge teams create self-learning organizations whereby domain expertise and teamwork principles substitute for the formal systems engineering and management disciplines that were traditionally used for space programs. The organization devises a set of heuristics to solve its own problems and meet its own objectives. The limiting factor is that this knowledge is not repeatable since it has not been codified. Tribal knowledge embraces the five disciplines of a learning organization described by (Senge 1994): shared vision, personal mastery, mental models, team learning, and systems thinking. This is also similar to the principles of the Agile Manifesto.

NASA Faster-Better-Cheaper Vision. In 1992, NASA’s chief administrator Dan Goldin proposed the FBC initiative to produce spacecraft that were inexpensive and yet reliable. An FBC program is one that NASA designates as FBC and that significantly reduces the development cost of a spacecraft on a compressed schedule. The average FBC program cost $145M and took less than 3 years to develop, versus an average of 7 years for a traditional spacecraft. The FBC philosophy is that smaller and less expensive spacecraft are not inferior to larger spacecraft. Producing smaller and less complex spacecraft would reduce development cost, reduce weight and thus launch costs, reduce operational costs, and reduce risk through lower complexity. Mission loss due to catastrophic failure is reduced by being able to produce several spacecraft for different aspects of a mission. Mission failure is avoided since the entire mission is not encapsulated in one spacecraft that is subject to total loss.

The scientific community played an important part in establishing the FBC vision. During the 1980s, the NASA planetary missions experienced long development cycles, extensive cost growth, and few flight opportunities. However, the scientific community is also at fault for large-scale missions as they placed extensive requirements on each mission for precedent science. Small spacecraft meant small science. (Roy 1998)

Dan Goldin believed that losing 2 out of 10 spacecraft was an acceptable risk for going faster and testing new technologies. For non-manned spacecraft, he stated, “A project that’s 20 for 20 isn’t successful. It’s proof that we’re playing it safe.” In this approach, risk was acceptable as new technologies would be introduced and more frequent launches would absorb some failures. (McCurdy 2001)


However, (Muirhead 1999) describes that even for Mars Pathfinder, one of the first FBC programs, which began in late 1992, the direction was to take risks but not to fail.

This approach avoids the problem of increasing mission requirements because there are more missions. In a traditional program with a single spacecraft and mission, there are more mission requirements, which increases the weight and complexity of the spacecraft. This in turn increases the cost of the spacecraft and its launcher. FBC redundancy is achieved by launching many spacecraft and taking greater risk with each one. Larger spacecraft would still be necessary for specific scientific reasons such as larger aperture, stopping power, or simultaneous measurements.
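The redundancy argument above can be illustrated with a small probability sketch. It assumes that spacecraft failures are statistically independent, which the text does not claim, and it uses a hypothetical 5% loss probability for a single large spacecraft; the 20% figure reflects the 2-in-10 loss rate that Goldin considered acceptable. The function name is illustrative only.

```python
def prob_total_mission_loss(p_fail: float, n_spacecraft: int) -> float:
    """Probability that every spacecraft in a set is lost.

    Assumes independent failures, which is a simplifying assumption
    made for this sketch and not a claim from the paper.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if n_spacecraft < 1:
        raise ValueError("n_spacecraft must be at least 1")
    return p_fail ** n_spacecraft


# Hypothetical single large spacecraft with a 5% chance of loss.
single_large_loss = prob_total_mission_loss(0.05, 1)

# Three small spacecraft, each accepting a 20% chance of loss.
three_small_loss = prob_total_mission_loss(0.20, 3)

print(f"single large spacecraft: {single_large_loss:.3f}")
print(f"three small spacecraft:  {three_small_loss:.3f}")
```

Under these assumptions the chance of losing the whole mission set is lower with three riskier spacecraft than with one safer spacecraft, although the chance of losing at least one spacecraft is higher.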

The FBC technology approach was to design for simplicity, use common spacecraft components, and reuse components from other spacecraft. For example, the cameras used on the very successful Mars Pathfinder were originally developed for the Cassini spacecraft.

Goldin modified his three-element approach of cost, performance, and schedule after the MPL failure. According to my interviews with Lockheed Martin Space Company personnel, he began to emphasize reliability and chastised Lockheed Martin, the prime contractor, for the failures of MPL and Mars Climate Orbiter (MCO). He added risk management as his fourth element subsequent to the failures. (McCurdy 2001)

In the years between 1992 and 2000, there were 16 FBC missions, of which 6 failed, leaving a 63% mission success rate. The year 1999 was extremely troublesome: 4 out of 5 spacecraft failed, following 9 successes out of 10 in the previous years. In 1997, the Lewis observation satellite spun out of control, as discussed in (Anderson 1998), and NASA cancelled the Clark satellite as costs exceeded NASA’s threshold. In 1999, the Wide Field Infrared Explorer, MCO, MPL, and the twin Deep Space 2 (DS2) microprobes failed. (Bearden 2001) estimates a 75% success rate that includes small satellites.

Achieving reliability with reduced relative cost is difficult. It requires techniques not used before in the relationships between cost, schedule, and technological complexity. To achieve mission success, these critical aspects must be combined as a holistic system. Speed, technology, and cost all conspire to determine reliability, capability, and mission success. For example, (Perrow 1984) discusses earthquakes caused by water dams that were not considered as part of the holistic dam system until recently.
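The mission counts quoted above can be checked with a short calculation. The sketch below uses only the counts stated in the text; the function name is illustrative. Note that the per-period counts (10 missions before 1999 and 5 in 1999) are computed separately from the overall figure of 16 missions, since the text gives them as separate statements.

```python
def success_rate(total_missions: int, failed_missions: int) -> float:
    """Fraction of missions that did not fail."""
    if total_missions <= 0:
        raise ValueError("total_missions must be positive")
    if not 0 <= failed_missions <= total_missions:
        raise ValueError("failed_missions must be between 0 and total_missions")
    return (total_missions - failed_missions) / total_missions


overall = success_rate(16, 6)      # all FBC missions, 1992 to 2000
before_1999 = success_rate(10, 1)  # 9 successes out of 10
during_1999 = success_rate(5, 4)   # 4 failures out of 5

print(f"overall:     {overall:.1%}")
print(f"before 1999: {before_1999:.1%}")
print(f"during 1999: {during_1999:.1%}")
```

The overall figure evaluates to 62.5%, which the text rounds to 63%.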

In the aerospace industry, a basic tenet is that cost, schedule, and reliability are interrelated in a manner that one adversely affects the other. A program can only manage two of the three simultaneously. Reduced schedules drive up costs, assuming reliability is constant. Risk and reliability add another dimension and efforts to reduce risk increase cost and schedule. Thus, faster and cheaper cannot be done simultaneously. Complex spacecraft are inherently more expensive and prone to interactive failures, and thus they are less reliable. The FBC vision was to build smaller and less complex spacecraft.

In FBC programs, this has evolved to reducing costs and time without appreciable loss in reliability or capability. The “better” in FBC means delivering more capability per dollar. FBC programs were able to do this by using micro-technologies that reduced weight and size and thus reduced cost. A mission that fulfills half of its mission requirements at one-tenth the cost is defined as “better.” Less weight also reduces spacecraft complexity, which in turn simplifies project management and adds to the savings. The cost of spacecraft weight can be the deciding factor in funding a NASA program, since it costs about $8300 per pound to reach low earth orbit aboard a Titan IVB or $8500 per pound on board the space shuttle. The Cassini spacecraft, which has recently entered Saturn orbit, cost $35,000 per pound when launched in 1997. The cost of spacecraft has been decreasing with advances in technology and process. In 1969, the Mariner 6 and 7 spacecraft that flew past Mars cost $848,000 per pound, and in 1976 the Viking spacecraft, with greater capabilities than Mariner including landing and digging, cost $480,000 per pound in year 2000 dollars.
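The definition of “better” and the per-pound launch figures above reduce to simple ratios. The following sketch works them through; the function names and the 1,000 lb spacecraft mass are hypothetical and chosen only for illustration, while the per-pound rates are the ones quoted in the text.

```python
def capability_per_dollar(capability_fraction: float, cost_fraction: float) -> float:
    """Relative capability delivered per unit of relative cost.

    Both arguments are fractions of a baseline mission (1.0 = baseline).
    """
    if cost_fraction <= 0:
        raise ValueError("cost_fraction must be positive")
    return capability_fraction / cost_fraction


def launch_cost_usd(mass_lb: int, usd_per_lb: int) -> int:
    """Launch cost for a given spacecraft mass at a per-pound rate."""
    return mass_lb * usd_per_lb


# Half the requirements at one-tenth the cost: about five times the
# capability per dollar of the baseline mission.
better_ratio = capability_per_dollar(0.5, 0.1)

# Hypothetical 1,000 lb spacecraft at the low-earth-orbit rates quoted above.
HYPOTHETICAL_MASS_LB = 1_000
titan_cost = launch_cost_usd(HYPOTHETICAL_MASS_LB, 8_300)
shuttle_cost = launch_cost_usd(HYPOTHETICAL_MASS_LB, 8_500)

print(f"capability per dollar vs. baseline: {better_ratio:.1f}x")
print(f"Titan IVB:     ${titan_cost:,}")
print(f"Space shuttle: ${shuttle_cost:,}")
```

Because launch cost scales linearly with mass in this simple model, every pound removed from the design saves the full per-pound rate.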

In FBC, formal systems management and systems engineering are relaxed and replaced by teamwork, as discussed by (McCurdy 2001). This is where the pendulum swings from formal process to tribal knowledge systems. It is believed that team-based management can assure reliability and mission success. These programs avoid extensive systems engineering and systems management as the main means for avoiding risk and guaranteeing mission success. Born from the early manned space ventures and the Air Force’s ICBM program, systems engineering and systems management are viewed as too cumbersome and expensive. FBC programs forgo many of the processes and much of the discipline of systems engineering and systems management. The question becomes: what are the appropriate processes for mission success?

At first observation, the FBC failures in 1999 may be seen as due to complexity. (Bearden 2001) cites the following:

Average complexity of failed small spacecraft is higher than that of successful missions,
Complexity of small planetary missions is higher than that of earth orbiting missions, and their development is generally faster, and
Dependence of success rate on complexity as a function of development time and spacecraft cost.

To date, large complex programs used expensive formal systems management processes. The management teams for FBC, when faced with tight schedules and low budgets, abandoned the processes associated with formal systems management and fatally neglected to institute teamwork methodologies.

Traditionally, NASA program managers countered risk with technical complexity. Under FBC, technical simplicity became the mandate rather than chasing reliability with complexity. In the early FBC programs, it became evident that small project teams could solve issues of reliability without resorting to formal systems management approaches. The teams were small and thus could solve reliability problems through informal communications.

3.0 FBC Failures

MPL was a robotic spacecraft set to land on the surface of Mars on December 3, 1999 to study volatiles and climate history. DS2 consisted of twin basketball-sized microprobes dispatched with MPL to act as projectiles and embed themselves one meter into the Mars surface in order to test 10 new spacecraft microprobe technologies and to determine the presence of water. The DS2 microprobes were to be released by MPL 5 minutes prior to Mars atmosphere entry.

It is presumed that the lander crashed on the surface of Mars due to premature engine shutoff at 130 feet above the surface. The engines were designed to shut off within 50 microseconds of leg contact with the surface. The leg sensors were designed to send an electrical signal to the onboard software upon contact, whereupon the software would recognize the voltage signal and shut down the engines.

The sensors had a tendency to generate a voltage when the three legs deployed. Software programmers programmed the flight software to ignore this signal. The software was complex because it also had to account for the absence of a signal from a failed sensor. One line of code contained a defect: a spurious signal generated at 5,000 feet could be recognized as a valid shutdown signal at 130 feet, when the radar altimeter informed the computer to prepare for engine shutdown, and this would cause premature engine shutdown.

During development, a leg deployment test was performed with flight software. The test initially found that the leg sensors did not work, and the problem was corrected by a change in wiring. Tests were repeated for touchdown, but not for leg deployment. It is possible this error would have been found if leg deployment had been retested. This explanation is a sound theory, but only a theory, because MPL did not have telemetry capability in order to reduce costs. MPL was designed to land alone, with no human communications, on the vast and desolate plains of Mars.

The failure took place in approximately 60 seconds in a way the designers had not anticipated. Three major systems were involved: flight software, landing legs, and touchdown sensors. The leg designers were aware of the spurious signals that would be generated, but failed to communicate to the systems engineers how they would occur. Software requirements to clear spurious signals were absent, and the situation was not realized during testing. (NASA 1999)
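The presumed failure mechanism described above can be sketched as a latched flag that is never cleared. The code below is a simplified illustration and not the MPL flight software; the class, its methods, and the `clear_on_enable` switch are hypothetical names introduced for this sketch. It contrasts logic that keeps a stale leg signal with logic that clears it when the shutdown logic is armed.

```python
class TouchdownMonitor:
    """Toy model of touchdown detection (illustrative, not flight code)."""

    def __init__(self, clear_on_enable: bool) -> None:
        self.clear_on_enable = clear_on_enable
        self.latched = False  # True once any leg sensor signal has been seen
        self.enabled = False  # True once the altimeter arms shutdown logic

    def leg_signal(self) -> None:
        """Record a leg sensor pulse, whether real contact or a transient."""
        self.latched = True

    def enable(self) -> None:
        """Arm the shutdown logic (about 130 feet in the account above)."""
        if self.clear_on_enable:
            self.latched = False  # discard anything seen before arming
        self.enabled = True

    def engine_should_shut_down(self) -> bool:
        return self.enabled and self.latched


def simulate_descent(clear_on_enable: bool) -> tuple[bool, bool]:
    """Return (shutdown at arming, shutdown after real contact)."""
    monitor = TouchdownMonitor(clear_on_enable)
    monitor.leg_signal()  # spurious pulse during leg deployment
    monitor.enable()      # radar altimeter arms the shutdown logic
    premature = monitor.engine_should_shut_down()
    monitor.leg_signal()  # genuine surface contact
    at_contact = monitor.engine_should_shut_down()
    return premature, at_contact


print("stale signal kept:   ", simulate_descent(clear_on_enable=False))
print("stale signal cleared:", simulate_descent(clear_on_enable=True))
```

In this toy model, keeping the stale signal commands shutdown as soon as the logic is armed, while clearing it delays shutdown until genuine contact. A retest of leg deployment with flight software would exercise exactly the first path.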

(NASA 2000d) and my interviews found that the development and test teams worked more than 60 hours per week, and a few worked 80 hours per week, to meet very tight schedules. As a result of the tight budgets, key technical areas were staffed by a single individual. The data collected in my interviews reveal that the primary root cause of the MPL failure was the lack of sufficient funds. Leg deployment testing was not performed due to lack of engineers. Technology was not a direct cause, nor were untried space technologies or poor technology cost-risk trades.

(Perrow 1984) implies that writing better requirements, installing more sensors, and creating redundant systems would have made MPL more complex and thus increased its tendency to fail. The MPL review board (NASA 2000d) found that the risk management processes were inadequate, especially in cost-risk tradeoffs. The systems engineering trade studies may not have adequately addressed this dynamic process, as the systems engineering processes were found to be too lean. (NASA 2000) Faster in FBC does not mean arbitrarily reducing development and test time. It means reducing the cycle time by eliminating inefficient or redundant processes.

The DS2 spacecraft did not undergo a system level qualification test due to budget and schedule constraints. The risk of not performing the test was assessed and not performing the test was approved by JPL and NASA. The exact failure of DS2 is not known as there was no telemetry. There are many plausible explanations, but what is clear is that the DS2 microprobes were inadequately tested and not ready for launch. (NASA 2000d)

MCO, built by Lockheed Martin Space Company and supervised by JPL personnel, was lost on September 23, 1999 as it swept behind the planet. MCO was to map Mars’ surface, profile the atmosphere, detect surface ice, and dig for traces of water beneath the surface. The mission was a complete success until Mars approach.

The failure investigation revealed a navigation error whereby velocity changes were low by a factor of 4.45. One pound of force in English measurements converts to 4.45 Newtons. Lockheed Martin used English units, whereas the JPL navigation group in California used metric units. Both groups failed to recognize the constant discrepancy, and MCO swept too low into the Martian atmosphere and was destroyed. (NASA 1999)
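The factor of 4.45 follows directly from the pound-force to newton conversion. The sketch below shows how a value produced in one unit and read in the other is off by that factor, and how carrying the unit with the value makes the mismatch harder to overlook. The `Force` type, its constructors, and the sample value of 100 are hypothetical and are not drawn from either organization's software.

```python
from dataclasses import dataclass

# One pound-force in newtons (exact by definition of the pound-force).
LBF_TO_NEWTON = 4.4482216152605


@dataclass(frozen=True)
class Force:
    """A force stored internally in newtons (illustrative type)."""

    newtons: float

    @classmethod
    def from_lbf(cls, value: float) -> "Force":
        return cls(value * LBF_TO_NEWTON)

    @classmethod
    def from_newtons(cls, value: float) -> "Force":
        return cls(value)


def force_for_navigation(force: Force) -> float:
    """Accept only unit-tagged values, so a bare number is rejected."""
    if not isinstance(force, Force):
        raise TypeError("expected a Force, not a bare number")
    return force.newtons


reported_value = 100.0  # hypothetical number produced in pounds of force

misread = Force.from_newtons(reported_value)  # treated as if already metric
correct = Force.from_lbf(reported_value)      # converted as intended

ratio = force_for_navigation(correct) / force_for_navigation(misread)
print(f"misread value is low by a factor of {ratio:.2f}")
```

A typed interface of this kind is one possible safeguard; the reviews and cross-team communication discussed in the text address the same gap at the process level.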

JPL employees were responsible for systems engineering, navigation, mission operations, and project management. Lockheed Martin was responsible for design, development, and systems integration. The JPL engineers working on navigation did not attend critical design reviews at Lockheed Martin. JPL had assumed that MCO would operate the same as the Mars Global Surveyor, another FBC mission that was underway. The basic root cause was understaffing of the JPL navigation team. Similar to the MPL failure, budget was the critical element and not technology or schedule.

FBC Failure Recurring Themes. The most prominent recurring themes of FBC failures are the lack of reviews, lack of risk management and assessment, testing, and communications. Budget and thus staffing may have been a prominent finding as shown in (NASA 2000b, 2000c). The post-MPL NASA findings on FBC failures can be grouped into systems management and systems engineering:

Systems Management

Experienced project management is essential.
Project manager must be responsible and accountable for all aspects of mission success.
Effective risk identification and management.
Institutional management must be accountable for policies and procedures and assure appropriate implementation.
If not ready, do not launch.
Communicate objectives, requirements, constraints, and risk assessment.
Senior management must be receptive to communications of problems and risks.
A dedicated single interface at NASA headquarters for the Mars Program that is responsible for all requirements and funds.
Contractor responsibilities must include notification to the customer of project risk and deviations from acceptable practice.

Systems Engineering

Unique constraints of deep space missions demand adequate margins.
Thorough test and verification program.
Telemetry coverage of critical events is necessary for analysis and ability to incorporate information for lessons learned.

As these findings reveal, technical engineering complexity is not a major factor. Rather, it is the concert of cost, schedule, and technical complexity that is at the heart of the failures. My own analysis and interviews produced the following findings:

Planetary missions have a narrow window to launch, placing schedule as the primary driver; for a Mars launch it is 1 month or wait 26 months.
Mission failure as a result of new or untried technologies is rare in FBC.
Complexity may have consumed cost and schedule and may be indirectly linked.
Lack of formal or too lean risk management practices.
Systems engineering process too lean.
Communication breakdown.
Trade studies inadequate for the relationship of cost, schedule, and technology.
Inadequate software design and systems testing due to cost constraint.
Lessons learned not captured.

(Bearden 2001) acknowledges that “few spacecraft design methodologies include technology-related cost and schedule risk.” The (US General Accounting Office (GAO) 2002) states that the FBC concept was mostly successful with a few notable failures. Planetary mission failures, although few, are very visible to the general public, embarrassing NASA and the general contractor. With regard to learning from successes as well as failures, the GAO has cited the lack of lessons learned processes in NASA. The failures of MCO and MPL, according to the GAO, could have been avoided if lessons learned from previous programs had been applied.


4.0 Mars Pathfinder - FBC Success

The most prominent FBC success story was the Mars Pathfinder mission. Pathfinder, true to its name, was the first spacecraft program for a new generation of low-cost spacecraft. Pathfinder was launched on December 4, 1996 and landed on the surface of Mars seven months later on July 4, 1997. Its mission objectives were to demonstrate a reliable low-cost system for placing a science payload and rover onto the Mars surface, demonstrate the mobility and usefulness of a micro-rover named Sojourner, and return new engineering and scientific data on the nature of the Martian surface environment.

The Mars Pathfinder had three years to be developed and launched with a fixed budget of $171M and $25M for the rover. Total mission costs were $265M in comparison to the 1976 Mars Viking mission of $3.9B in equivalent dollars. The costs are reflective of the theme of FBC as quoted by the Pathfinder flight system manager – “go there at a fraction of the cost, do it in a fraction of the time, take risks but don’t fail.” (Muirhead 1999) In order to accomplish this new mission and a new way of doing engineering, a different way of thinking about engineering was required.

The project scope was based on a capability driven design instead of a requirements driven design. The design used available technology, hardware, and proven capabilities. The requirements were predetermined. The Pathfinder program used a design-to-cost (DTC) systems engineering methodology. Management pursued mission requirements while keeping costs within a specified amount. Design-to-requirements (DTR) and DTC processes differ. In DTC, goals and cost are inputs for design and capabilities. In DTR, requirements are the only input for design, cost, and capabilities.
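The difference between the two processes can be sketched in terms of inputs and outputs. The example below is a deliberately simple illustration, not the Pathfinder team's method: it treats DTC as selecting capabilities in priority order under a fixed cost cap, and DTR as accepting every required capability and reporting the resulting cost. All capability names, costs, priorities, and the cost cap are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    cost_musd: float  # hypothetical cost in millions of dollars
    priority: int     # lower number means more important


def design_to_cost(candidates, cost_cap_musd):
    """Cost is an input: keep capabilities, by priority, that fit the cap."""
    selected, spent = [], 0.0
    for cap in sorted(candidates, key=lambda c: c.priority):
        if spent + cap.cost_musd <= cost_cap_musd:
            selected.append(cap)
            spent += cap.cost_musd
    return selected, spent


def design_to_requirements(required):
    """Requirements are the only input: cost is whatever they add up to."""
    return list(required), sum(cap.cost_musd for cap in required)


# Entirely hypothetical candidate capabilities.
candidates = [
    Capability("baseline_lander", 90.0, 1),
    Capability("mobility", 40.0, 2),
    Capability("second_instrument", 50.0, 3),
    Capability("camera", 15.0, 4),
]

dtc_selected, dtc_cost = design_to_cost(candidates, cost_cap_musd=150.0)
dtr_selected, dtr_cost = design_to_requirements(candidates)

print("DTC:", [c.name for c in dtc_selected], dtc_cost)
print("DTR:", [c.name for c in dtr_selected], dtr_cost)
```

In the DTC case the mission is modified to stay within the cap, so capability becomes the output; in the DTR case the capability set is fixed and cost becomes the output.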

Every element was reviewed for reliability and risk, asking “what is the probability of failure and what is the risk of losing the mission?” Each element was reviewed versus numerical reliability analysis techniques. Another technique employed was the use of lessons learned from the Viking mission. Given the amount of time that had passed since the Viking mission and the difficulties encountered in collecting archived material, the Pathfinder team created its own lessons learned process to codify its knowledge and experience for the programs that would follow.

Using existing technology did not mean taking significant risk. The integration of the technology and its use in space were untried. Producing a free roving Sojourner proved to be a challenge, but technological advances in miniaturization and micro-electronics proved to be the solution. The adoption of existing technology and the use of common space components kept costs low. However, the integration and modification of the technology were costly.

Mars Pathfinder performed 83 days of surface operation before going silent on September 27, 1997. Pathfinder was designed to operate on the Martian surface for 30 days and sent four times the amount of data expected.

5.0 FBC Agile Systemics and Summary

Success of Pathfinder is largely due to the following in a tribal knowledge culture:

Domain expertise
Freedom to cross organizational boundaries to solve problems
Modify mission to remain within DTC with robust engineering
Good communication
Risk management process
Visibility through reviews
Improved up-front planning
Clear requirement definition
Technology use within readiness level
Verification and validation


Pathfinder’s tribal knowledge culture proved to be highly successful because they used specific systems management and systems engineering practices. The informal learning networks and domain based exchange of ideas and problem resolution facilitated common values and found solutions to their problems. An FBC operations solution systemic shown in Figure 1 uses many of the Agile Manifesto values.

[Figure 1 diagram elements: NASA Mission Requirements; Cost & Schedule; Implementation Requirements; Complexity and Design to Cost & Schedule; Tailor Processes; Must Meet Launch Window; Tailor Hardware, Software & Mission Profile; Hardware, Software Design & Mission Profile Fixed; Risk Assessment & Communications]

Figure 1. FBC Systemic

FBC agile development is about performing highly reliable engineering with emphasis on scope, resource, and degrees of freedom. It is a way of operating and executing fundamental processes scaled to the program at hand. Critical practices are associated with FBC, and the degree of mission success is measured by how well these practices are executed. Agile systems engineering and systems management are required to create innovative methods for development and support without compromising innovation, quality, and performance. Innovative product and organizational strategies are required to achieve mission success in an FBC environment. The principles of the Agile Manifesto may be applied to highly reliable engineering tasks. The 12 principles are exhibited in the Pathfinder program and the FBC vision. Principles 1, 3 and 7 are modified by replacing “software” with “product”. The principles and the FBC vision align, albeit with the spacecraft replacing software as the product. The modified Agile Manifesto principles are:

1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable product.

2. Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.

3. Deliver working product frequently.

4. Business people and developers must work together daily throughout the project.

5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.

6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

7. Working product is the primary measure of progress.

8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.

9. Continuous attention to technical excellence and good design enhances agility.

10. Simplicity – the art of maximizing the amount of work not done – is essential.

11. The best architectures, requirements, and designs emerge from self-organizing teams.

12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

The FBC concept is important for agile systems engineering and systems management. Several FBC programs failed because they neglected to adopt necessary practices in systems management. To avoid the problems incurred in reducing infrastructure, systems management techniques become vital to assuring mission success. Systems management originally evolved to manage risky programs and technologies. How these techniques are employed becomes important. The Agile Manifesto principles exhibited by the Pathfinder program are an important example showing that agile development may be applied to non-software engineering programs. The Pathfinder FBC program used many of the principles of agile development in concert with risk management techniques and certain systems management and systems engineering disciplines, which led to its success and fulfilled the promise of the FBC philosophy.

References

Bearden, David A. “When Is A Satellite Mission Too Fast and Too Cheap?” 2001 MAPLD International Conference, September 11, 2001.

Johnson, Stephen B. The Secret of Apollo: Systems Management in American and European Space Programs. Baltimore: The Johns Hopkins University Press, 2002.

McCurdy, Howard E. Faster, Better, Cheaper: Low-Cost Innovation in the U.S. Space Program. Baltimore: The Johns Hopkins University Press, 2001.

Muirhead, Brian K., and William L. Simon. High Velocity Leadership. New York: HarperCollins, 1999.

NASA. Mars Climate Orbiter Mishap Investigation Board, Phase 1 Report. November 10, 1999.

NASA. Mars Program Independent Assessment Team: Summary Report. March 14, 2000.

NASA. Mars Climate Orbiter Mishap Investigation Board: Report on Project Management in NASA. Mars Climate Orbiter Mishap Investigation Board, March 13, 2000.

NASA. JPL Special Review Board: Report on the Loss of the Mars Polar Lander and Deep Space 2 Missions. Pasadena: Jet Propulsion Laboratory, March 22, 2000.

Osborne, David, and Ted Gaebler. Reinventing Government. Reading: Addison-Wesley Publishing Company, 1992.

Perrow, Charles. Normal Accidents: Living with High-Risk Technologies. New York: Basic Books, Inc., 1984.

Petroski, Henry. To Engineer Is Human. New York: Vintage Books, 1992.

Roy, Stephanie A. “The Origin of the Smaller, Better, Cheaper Approach in NASA’s Solar System Exploration Program.” Space Policy, 14 (August 1998): 153-71.

Senge, Peter M., et al. The Fifth Discipline Fieldbook: Strategies and Tools for Building a Learning Organization. New York: Doubleday, 1994.

U.S. General Accounting Office. NASA: Better Mechanisms Needed for Sharing Lessons Learned. Washington, D.C.: U.S. General Accounting Office, January 30, 2002.

Wildavsky, Aaron. Searching For Safety. New Brunswick: Social Philosophy and Policy Center, 1989.

Biography

Michael DiMario is a senior program manager at Lockheed Martin in Moorestown, New Jersey, where he has also served as Director of Systems and Software Quality Engineering since 2001. He spent 19 years with AT&T and Lucent Bell Laboratories in engineering leadership roles for software development, systems engineering, and quality. Michael has an MS degree in computer science from the University of Wisconsin and an MBA in technology management from the Illinois Institute of Technology in Chicago, and is in the PhD systems engineering program at Stevens Institute of Technology. He is an avid astronomer who works with the University of Chicago Yerkes Observatory on near-Earth asteroids and collaborated on the earliest pre-discovery of Pluto.
