TRANSCRIPT
Risk Informed Design and Test on NASA’s Constellation Program
John V. Turner, PhD
Constellation Program Risk Manager
Used with permission
NASA CxP John V. Turner, PMC 2009, Page 2
Program Goals
• NASA identified goals for the CxP related to ISS Support and Lunar Exploration
– Intent is to lay groundwork for Mars exploration as well
• Exploration Systems Architecture Study conducted to develop exploration systems architecture to support these missions
• Constellation program chartered to develop and field this architecture
The Challenge
• Develop an architecture that optimally meets goals and objectives, within cost and schedule, and with acceptable safety and mission success risk
Risk Informed Design: aims to support design activities in identifying acceptable and optimal safety
Risk Informed Test: aims to support test activities in identifying ways to best reduce uncertainty and risk (uncover defects in design, manufacturing, and processing prior to IOC)
Risk Timeline – ISS Mission
[Figure: risk intensity plotted against Mission Elapsed Time. Phases shown: Ground Ops, Crew Ingress, First Stage Ignition, Staging, Second Stage, MECO, Orbit Ops, Docked at ISS, Entry / Landing / Rescue. Annotations: “A Leading Risk!”, “8-10 Minutes”, “180-210 Days!”]
Timeframes and intensity are illustrative – not to actual scale.
• Risk changes in character, intensity, and source over time
• Risk prevention and mitigation must be considered in every system and activity across all mission phases
• Understanding the integrated implications of system risks is critical to success
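The timeline's point that risk varies in intensity and duration by phase can be sketched numerically. This is a minimal illustration, not the program's model: it assumes a constant hazard rate within each phase and independence between phases. The phase names follow the timeline, but every rate and duration below is a made-up placeholder.

```python
import math

# Hypothetical (phase, hazard rate per hour, duration in hours).
# All numbers are placeholders for illustration, not CxP data.
PHASES = [
    ("Ascent", 1.0e-2, 9 / 60),            # about 9 minutes, high intensity
    ("Docked at ISS", 1.0e-6, 195 * 24),   # about 195 days, low intensity
    ("Entry / Landing", 2.0e-3, 1.0),
]


def phase_risk(rate_per_hour, hours):
    """Probability of at least one failure in a phase with a constant hazard rate."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def mission_risk(phases):
    """Combine independent phase risks: 1 minus the product of phase survival probabilities."""
    survival = 1.0
    for _, rate, hours in phases:
        survival *= 1.0 - phase_risk(rate, hours)
    return 1.0 - survival


if __name__ == "__main__":
    for name, rate, hours in PHASES:
        print(f"{name:16s} {phase_risk(rate, hours):.2e}")
    print(f"{'Mission':16s} {mission_risk(PHASES):.2e}")
```

With these placeholder values the long docked phase contributes more risk than ascent even though its assumed hazard rate is four orders of magnitude lower, which is the kind of integrated effect the timeline highlights.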
Sources of Failure
• Where do defects enter into the flight equipment and operations that result in failure?
• Defects can arise through
– Actual system design flaws
– Inadequate testing to uncover defects
– Manufacturing errors
– Integration or processing errors
– Bad decisions during real time
• Note: History indicates that manufacturing, integration and processing are very significant defect sources
• Goal: Put in place processes that identify and eliminate defects leading to failure
Defect Source: Design | Test | Manufacturing, Integration / Processing | Operations
Mitigation: RI Design | RI Test | Robust Quality Assurance | Mission Operations
Risk Informed Design (RID)
• Probabilistic Requirements to drive risk performance in the design
• Loss of Crew (LOC) and Loss of Mission (LOM) risk factored into significant design and planning trades
• Risk assessment embedded in Integrated Design Analysis Cycles to inform all key analysis tasks
• “Zero Based Design”
• Risk Informed Test Plans
• Focus additional analysis and test resources on High risk / High Uncertainty areas
RID Approach
• Premise: Risk is a design commodity like mass or power
• Qualitative and Quantitative risk analyses expose dominant risk contributors and support design and planning trades to assign critical design commodities (mass, volume, power, cost, etc.)
• Iterative systems engineering design cycles incorporate risk in trade space and identify design solutions that are risk informed
• Risk analysis considers all significant failure types, including: functional, phenomenological, software, human reliability, common cause, and external or environmental events
• Complexity and fidelity of analysis consistent with the available data and information during each design cycle
“Zero Based Design”
1. Early design concepts are defined with minimally required functionality to perform the mission and no redundancy
– Focus on implementing “Key Driving Requirements” vs establishing a fully functional, acceptably safe, or highly reliable design.
– Risk analyses are performed during this phase to understand the risk vulnerabilities of this “zero based design” (ZBD).
“Zero Based Design”
2. Prioritize design enhancements with a focus on enhanced functionality and LOC risk.
– Focus: “Make the design work”, “Make the design safe”
– Identify optimal use of design commodities, cost, and schedule to reduce risk – with priority on diversity vs simple redundancy.
– Major Premise: Simple redundancy is one option to improve safety and reliability. It is not the only option. It is not always the safest or most cost effective option.
– Compare different investment portfolios using FOMs derived from key risk commodities, including LOC risk
– Goal: Spend scarce risk mitigation resources (mass, power, volume, cost) most effectively to maximally address risk
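The portfolio comparison in step 2 can be sketched as a ranking of candidate enhancements by a figure of merit under a commodity budget. The slides do not define the FOM, so the one used here (LOC risk reduction per kilogram) is an assumption, as are the enhancement names, the numbers, and the simplification that risk reductions add.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enhancement:
    name: str
    mass_kg: float        # design commodity spent
    loc_reduction: float  # assumed reduction in P(LOC), treated as additive

    @property
    def fom(self) -> float:
        """Hypothetical figure of merit: LOC risk bought down per kg."""
        return self.loc_reduction / self.mass_kg


def select_portfolio(candidates, mass_budget_kg):
    """Greedy selection: take the best FOM first while mass remains."""
    chosen, remaining = [], mass_budget_kg
    for item in sorted(candidates, key=lambda e: e.fom, reverse=True):
        if item.mass_kg <= remaining:
            chosen.append(item)
            remaining -= item.mass_kg
    return chosen


# Placeholder candidates, not real design options.
CANDIDATES = [
    Enhancement("redundant string (same design)", 40.0, 2.0e-4),
    Enhancement("dissimilar backup", 25.0, 3.0e-4),
    Enhancement("manual override", 5.0, 1.0e-4),
    Enhancement("extra shielding", 60.0, 1.5e-4),
]

if __name__ == "__main__":
    for e in select_portfolio(CANDIDATES, mass_budget_kg=70.0):
        print(f"{e.name}: {e.mass_kg} kg, FOM {e.fom:.1e} per kg")
```

A greedy ranking is only a heuristic; it does not guarantee the best portfolio for a given budget, and a real trade would also weigh power, volume, cost, and schedule.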
“Zero Based Design”
3. Finally, additional enhancements are considered which more fully address functional requirements and focus on reliability and loss of mission (LOM) risk.
– A portfolio approach to comparing investments is again used
– Ensures that the final design iteration produces a vehicle that better meets functional requirements, safely, reliably, and within budget.
Zero Based Design Summary
• “Build-Up” approach from the zero based design assures that affirmative rationale is used for the risk balanced system design, its complexity, and the existence of each system element.
– Rationale exists to justify resource allocations such as: mass, power, and cost.
– Build up approach lessens the likelihood of having to make dramatic design changes later in the design cycle to resolve critical commodity shortfalls and get back “in the box.”
• This approach is described in detail in two NESC reports:
– “Crew Exploration Vehicle Smart Buyer Design Team Final Report”
– “DDT&E Considerations for Safe and Reliable Human Rated Spacecraft Systems”
Results
• Program
– Original ESAS Loss of Crew (LOC) and Loss of Mission (LOM) requirements were derived using an initial architecture trade study that included conceptual design concepts and underestimated certain significant risk drivers
– Requirements have been adjusted based on current design and environments, improved analysis, and a better understanding of what is challenging yet achievable
– CxP architecture currently meeting mission level LOC and LOM requirements
• Orion Project
– Orion early design conducted prior to inauguration of RID activities
– Began RID design cycles in late 2007
– Significant design changes (4X improvement in LOC, 3X in LOM)
– Implemented Apollo 13 Low Power Emergency Return Capability
– Improvements in safety and mission success while resolving mass challenges
Results
• Ares Project
– Ares conducted RID design trades early in the DDT&E process and incorporated design changes in multiple subsystems
– Ares I risk analysis currently projects significant improvement over the reliability of previous manned launch systems
• Altair Project
– Conducted ZBD approach from project initiation, completed LOM and LOC risk buyback design iterations from Zero Based Design configuration
– Significantly improved safety and mission success and developed stronger design concept to enter next stage of design
Some Lessons Learned
• RID brings designers and analysts together early to evaluate sources of risk, the integrated implications of risks, and the efficacy of different design implementations in maximizing safety and mission success
• RID drives designers toward dissimilar or functional redundancy vs the traditional redundant system approach
– Reduces the weight penalty incurred by the traditional method
• Requirements are met more effectively with respect to use of design commodities
– Design features can be prioritized to determine where reductions are best applied in the event of mass issues
• Risk Informed Campaign Analysis provides insight into “program” vs “mission” success as a function of system design issues
• Evaluating DRM LOM requires strong understanding of operational flexibility – forces early operations criteria development and operations driven design
• Current methods for evaluating Maturity Growth require improvement
– Assumed maturity for design analysis
– Need better way to address maturity growth and determine early mission risk
Some Lessons Learned
• The tools used to model LOC and LOM should evolve from early concept development to verification phase
– Simple, historical data driven models early
– Conventional Linked Fault Tree / Event Tree models later
– Models increase in complexity and fidelity with the design
• Application of Qualitative Top Down Functional Modeling to identify significant hazards that should drive both the Integrated Hazard Analysis and PRA Master Logic Diagrams
• Consistency and Visibility are Critical!
– Models
– Data
– Methods
– Tools
• Three types of risk analysis to support RID
– Mission Risk Models
– Hazard Quantification
– Focused assessments and trades
– Different methods potentially used for each
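As one possible reading of “simple, historical data driven models early”, the sketch below estimates a failure probability from a historical record with a Bayesian update. The choice of a Jeffreys Beta(0.5, 0.5) prior, the function names, and the counts in the example are all assumptions for illustration; the slides do not say which estimator the program used.

```python
import random


def jeffreys_posterior(failures, trials):
    """Beta posterior parameters under an assumed Jeffreys Beta(0.5, 0.5) prior."""
    return failures + 0.5, trials - failures + 0.5


def posterior_mean(failures, trials):
    """Mean of the Beta posterior for the per-flight failure probability."""
    a, b = jeffreys_posterior(failures, trials)
    return a / (a + b)


def credible_interval(failures, trials, level=0.90, samples=20000, seed=1):
    """Equal-tailed credible interval estimated by sampling the posterior."""
    a, b = jeffreys_posterior(failures, trials)
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(samples))
    lo_idx = int((1 - level) / 2 * samples)
    hi_idx = int((1 + level) / 2 * samples) - 1
    return draws[lo_idx], draws[hi_idx]


if __name__ == "__main__":
    # Hypothetical heritage record: 2 failures in 100 flights.
    failures, trials = 2, 100
    low, high = credible_interval(failures, trials)
    print(f"mean {posterior_mean(failures, trials):.4f}, 90% interval {low:.4f} to {high:.4f}")
```

A model like this carries wide uncertainty, which is consistent with the slide's point that later design cycles replace it with linked fault tree / event tree models as design detail and test data become available.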
Risk Informed Design Continuum
[Figure: Design Fidelity and Risk Analysis Fidelity increasing over successive Design Analysis Cycles, with milestones SRR, SDR, PDR, CDR. Stages shown: Early Concept Exploration, Define initial mission architecture, Define Requirements, Early Design, Preliminary Design, Detailed Design, TBD, Verification. A “CxP” marker appears on the figure.]
Simple models ... Complex models
Heritage and surrogate data ... Test / Demonstrated data
Architecture trades ... Design Improvement ... Verification
Architecture Trade Studies
[Figure: Mars Mission Architecture Risk Assessment. Risk FOM (axis 0.00 to 3.00) for reference missions Architecture 1 through Architecture 10, broken down by contributor: Systems Reliability, Entry / Landing, Mars Orbit Insertion, Launch / Integration, Trans Mars Injection, Mars Ascent, Trans Earth Injection, Other Hazards.]
Example Only – Not Real Data
Architecture and System Level Assessments
Example Only – Not Real Data
LOC Uncertainty Results
Example Only – Not Real Data
Prioritizing Design Mitigation
Example Only – Not Real Data
Mission Success Depends Upon a Combination of Many Variables
Launch:
• Time increment between launches
• Launch Availability
• Launch Probability
• Order of Launches
LEO Loiter:
• LEO Loiter Duration
• Ascent Rendezvous Opportunities
• TLI Windows
Vehicle Reliability:
• LOM/LOC
Target Characteristics:
• Redundant Landing Sites
• Multiple opportunities to access a select landing site
• Lighting constraints at target
Launch Strategy:
• Two launch
• Single Launch
Vehicle Performance:
• Orbital Mechanics Variation Tolerance
• Additional Propulsive Capability
• Vehicle Life
• Launch Mass Constraints
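How a few of these variables combine can be sketched for a two-launch strategy. The model below is an assumption made for illustration, not the campaign analysis described in the talk: the first vehicle is taken to be waiting in LEO, the second must launch before the loiter duration expires, launch attempts are independent with a fixed probability, and a single reliability number stands in for everything after rendezvous. All parameter names and values are hypothetical.

```python
def prob_launch_within(attempts, p_per_attempt):
    """Probability that at least one of `attempts` independent launch attempts succeeds."""
    return 1.0 - (1.0 - p_per_attempt) ** attempts


def two_launch_mission_success(loiter_days, attempts_per_day, p_per_attempt,
                               vehicle_reliability):
    """Assumed model: second launch must occur within the loiter window,
    then the vehicles must complete the mission."""
    attempts = loiter_days * attempts_per_day
    return prob_launch_within(attempts, p_per_attempt) * vehicle_reliability


if __name__ == "__main__":
    # Placeholder inputs: one attempt per day, 60% launch probability per attempt.
    for days in (1, 2, 4, 8):
        p = two_launch_mission_success(days, 1, 0.6, 0.98)
        print(f"loiter {days} d -> mission success {p:.3f}")
```

Even this toy model shows the coupling the slide lists: extending LEO loiter duration raises the chance of completing the second launch, but only up to the ceiling set by vehicle reliability.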
RITOS Overview
• RITOS Objective: Elicit expert opinion and historical data related to top program flight risk drivers in order to:
1. Better understand the risks and associated uncertainties
2. Identify potential mitigations and/or controls and effective test and verification strategies
3. Qualitatively assess the adequacy of the currently planned mitigations/controls and test and verification activities
• RITOS Approach:
– Identify top program risk drivers based on SR&QA products, history, and judgment
– Elicit expert opinion and historical data related to the risk driver
– Assess currently planned mitigation/control and test and verification strategies based on elicitation results, historical data, and judgment
– Provide recommendations to T&V for enhancing currently planned approach to risk driver mitigation/control and test and verification
SR&QA Scope
♦ Are planned analysis and test (ground/flight) adequate to characterize and burn down risk?
• Type, scope, and fidelity of tests
• Frequency of tests
♦ Is the plan executable?
• Budget
− Enough $
• Schedule
− Fabrication / Integration / Need dates
− Test, Fix, Fly
• Analysis and Reaction Time
• Facilities
− Do we have the right facilities
− Availability
• Test articles
− Availability, fidelity, re-use issues, timing
Callout labels: SR&QA Focus; SE&I Focus
Risk Topic Selection
• Risk topics are chosen based on their priority in the various SR&QA risk product results, historical data, and SR&QA judgment
• Initial topic list:
– MMOD Impact to Orion for ISS DRM
– First Stage/Upper Stage Separation
– Orion descent and landing
– Upper Stage Engine
– Launch Abort System
– Upper Stage/Orion Separation
– Thermal Protection System
• List can be further expanded as new risk topics are identified
Expert Elicitation
• Each risk topic is researched in order to understand the mechanisms and/or phenomena that drive the risk
• Attempt to identify an expert from each discipline area related to the risk
– External candidates
– Historical failure experts
– CxP Internal subject matter experts
– NESC panelists from applicable studies
• Elicitation is a structured one-on-one discussion with the candidate in which various topics related to the risk are discussed, in the context of:
– Risk calculation, characterization, and uncertainty
– Test and Verification
– Mitigations and Controls
• Following elicitation, results are combined into themes and organized such that they are useful to the assessment
Results Format
• RITOS objective is to provide results that are beneficial to CxP IT&V and SR&QA
• RITOS approach can be modified for each risk topic to accommodate IT&V needs
• Results are qualitative, but can provide “sanity check” of T&V plans
• Results could be useful in prioritizing test objectives
• Results will be presented in two formats:
1. Bulleted form as elicitation result conclusions
2. Swimlane chart depiction of currently planned T&V activities with RITOS recommendations mapped into process flow
RITOS Progress to Date
[Figure: status chart of RITOS steps (Initial Research; Candidate ID / Question Dev; Schedule and Conduct Elicitations; Compile Results; Assess Existing Test Plan Status) for each risk topic (MMOD, FS/US Sep, US Engine, ED&L, TPS, LAS, US/Orion Sep). Legend: On-hold, Reduced progress, Normal progress.]
RITOS Lessons Learned
• Obtaining access to CxP Internal Subject Matter Experts has proven to be challenging. Working through Level II representatives has helped but has not fully solved the problem.
• Coordination with Projects is challenging and time consuming. Where study progress was delayed, we moved ahead to later topics to keep progressing.
• Obtaining test plans is challenging and in some cases test plans do not exist. Where test plans are not available, we package results in a way that can be used during test plan development. The test plan will be assessed once it becomes available.
• Typical RITOS elicitation results are qualitative, but can provide a sanity check of test plans and insight into test prioritization when reductions are being considered. Conclusions obtained from elicitation themes are provided to Cx IT&V.
Conclusions
• Risk Informed Design provides a methodology to incorporate risk information early in the design process and obtain a more optimal balance of design commodities and risk than traditional rule of thumb safety design criteria
• Risk Informed Test utilizes risk information to identify areas where test can be used more effectively to reduce uncertainty and risk prior to transition to operations.
• Experience to date in the Constellation program indicates the value of RID and RIT, but additional work is needed to develop more consistent methods and tools to accomplish RID and RIT
• In order to eliminate defects and thus reduce actual failures, programs and projects need to proactively address defect sources in Design, Test, Mission Assurance, and Operations
– This presentation only addresses two of these four “buckets”
Backup
COMPONENTS AND FLOW OF A TYPICAL PRA MODEL
[Figure: flow diagram of a PRA model built in SAPHIRE. Elements shown:]
• Inputs to MLD Development: Phase I Results, FMEAs/CILs, Hazard Reports, Functional Analyses, Previous Risk Assessments
• MLD Development produces the List of Initiating Events
• Inputs to Event Trees and Fault Trees: Flight Rules, Training Manuals, System Architecture, Engineering Expertise
• Inputs to Data Analyses: MADS, PRACA, Industry databases, Other assessments (e.g. off-line simulation models)
• End States: list of consequences of interest. For Shuttle: LOCV (Loss of Crew & Vehicle)
• Cut Sets (example):
– CCF A,B,C: 1E-3
– Gas Explosion: 2E-4
– A fails, B fails, C fails: 1.5E-4
– Etc.
• Outputs: Relative risk drivers; Risk Levels for selected end states
• Reviewed by Program Organizations
• Something that this graphic does not display is the necessary engineering analysis that must be done to support success criteria and capacity
• A large number of pages of detailed documentation are required
• Assumptions
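The three example cut sets listed in the graphic can be used to show how cut-set probabilities are commonly rolled up into an end-state risk level and a ranking of risk drivers. This is a generic textbook sketch, not output from SAPHIRE: it uses the rare-event approximation (a simple sum) and the min-cut upper bound, and it covers only the three listed cut sets, since the slide's “Etc.” leaves the rest unspecified.

```python
import math

# Cut-set probabilities as listed in the slide's example graphic.
CUT_SETS = {
    "CCF A,B,C": 1e-3,
    "Gas Explosion": 2e-4,
    "A fails, B fails, C fails": 1.5e-4,
}


def rare_event_sum(probs):
    """Rare-event approximation: end-state probability as the sum of cut-set probabilities."""
    return sum(probs)


def min_cut_upper_bound(probs):
    """Min-cut upper bound: 1 minus the product of (1 - p) over the cut sets."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def contributions(cut_sets):
    """Fraction of the rare-event total attributed to each cut set, largest first."""
    total = rare_event_sum(cut_sets.values())
    ranked = ((name, p / total) for name, p in cut_sets.items())
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    probs = list(CUT_SETS.values())
    print(f"rare-event sum      {rare_event_sum(probs):.3e}")
    print(f"min-cut upper bound {min_cut_upper_bound(probs):.3e}")
    for name, share in contributions(CUT_SETS):
        print(f"{name:28s} {share:.1%}")
```

In this example the common cause failure cut set dominates the independent triple failure, which illustrates why the deck lists common cause among the failure types a risk analysis has to consider.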
RITOS Process
RITOS Overview, 10/19/2009