Alternatives to MTBF
DESCRIPTION
MTBF is a common metric among practitioners and users of reliability prediction, safety assurance, and maintenance planning. However, there are a number of significant flaws and limitations with this approach. This presentation goes through those limitations and uses that information to suggest alternatives that may provide much greater insight into product performance.
TRANSCRIPT
© 2004 – 2010
Reliability Communication:
MTBF. Is There a Better Way?
2014 Avionics Maintenance Conference
Craig Hillman, CEO
© 2004 - 2010 | 9000 Virginia Manor Rd Ste 290, Beltsville MD 20705 | 301-474-0607 | www.dfrsolutions.com
Who is DfR Solutions?
The Industry Leader in Quality-Reliability-Durability of Electronics
50 Fastest Growing Companies in the Electronics Industry - Inc Magazine
2012 Global Technology Award Winner
Best Design Verification Tool - Printed Circuit Design
Key Facts
• Founded in 2005
• 30+ Employees, Multiple worldwide locations
• Software, Consulting, Research, Lab Services
Over 600 Customers
Most Major Avionic OEMs and Suppliers
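o $\mathrm{MTBF} = \dfrac{\text{Hours of Operation} \times \text{Samples}}{\text{Number of Failures}}$
o $\mathrm{MTBF} = \dfrac{1}{\text{Failure Rate } (\lambda)}$
o $\mathrm{MTBF} = \mathrm{MTTF} + \text{Mean Time to Repair (MTTR)}$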
What is MTBF?
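The first two definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the fleet numbers (100 units, 2,000 hours each, 4 failures) are hypothetical and do not come from the presentation.

```python
def mtbf_from_field_data(hours_per_unit: float, units: int, failures: int) -> float:
    """Point estimate: total operating hours divided by observed failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return hours_per_unit * units / failures


def failure_rate(mtbf_hours: float) -> float:
    """Failure rate (lambda) as the reciprocal of MTBF."""
    return 1.0 / mtbf_hours


# Hypothetical fleet: 100 units, 2,000 operating hours each, 4 failures.
mtbf = mtbf_from_field_data(hours_per_unit=2_000, units=100, failures=4)
lam = failure_rate(mtbf)
print(mtbf, lam)  # 50000.0 2e-05
```

Note that the point estimate is undefined with zero failures, which is one reason a confidence bound is often reported instead.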
A well-manufactured and screened product
(‘box’) will have no defects
AND
A well-designed product will not experience
wearout during its operational lifetime
Why MTBF?
BIG numbers are easier to remember
than small numbers
153,000 hours vs. 0.00065%/hour?
(of course, why not 5.7%/year?)
Why MTBF? (cont.)
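The three ways of quoting the same number on this slide can be checked with a short sketch. It assumes continuous operation (8,760 hours per year) and simple linear scaling of a constant failure rate; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760  # assumption: the unit operates continuously


def mtbf_to_percent_per_hour(mtbf_hours: float) -> float:
    """Constant failure rate expressed as percent of units failing per hour."""
    return 100.0 / mtbf_hours


def mtbf_to_percent_per_year(mtbf_hours: float,
                             hours_per_year: float = HOURS_PER_YEAR) -> float:
    """Same rate scaled linearly to a year of operation."""
    return 100.0 * hours_per_year / mtbf_hours


per_hour = mtbf_to_percent_per_hour(153_000)
per_year = mtbf_to_percent_per_year(153_000)
print(f"{per_hour:.5f} %/hour, {per_year:.1f} %/year")  # 0.00065 %/hour, 5.7 %/year
```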
Only One Number
Why MTBF? (cont.)
o Airworthiness Requirements are Business Critical
o Safety Assessments demonstrate compliance with
Airworthiness Requirements
o Reliability Prediction ‘feeds’ Safety Assessments
o FAA encourages MTBF for Reliability Prediction
Why MTBF? (cont.)
[Figure labels: MTBF / MAINTENANCE, MTBF / SAFETY]
o Misunderstandings are common among non-reliability experts
o Must assume a constant failure rate
o Assumes failure must occur
o Encourages use of empirical handbooks
Why NOT MTBF?
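One common misunderstanding is reading MTBF as a guaranteed life. Under the constant failure rate assumption listed above, reliability follows R(t) = exp(-t / MTBF), so only about 37% of units are expected to survive to the MTBF itself. A minimal sketch, reusing the 153,000 hour figure from the earlier slide (the function name is illustrative):

```python
import math


def reliability_exponential(t_hours: float, mtbf_hours: float) -> float:
    """Survival probability at time t under a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)


mtbf = 153_000
r_at_mtbf = reliability_exponential(mtbf, mtbf)    # exp(-1), about 0.368
r_one_year = reliability_exponential(8_760, mtbf)  # one year of continuous use
print(f"{r_at_mtbf:.3f} {r_one_year:.3f}")
```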
o MTBF can be used for predicting reliability
at the design/concept stage
o MTBF can also be used for extrapolating
reliability from existing events
Key Reminder
o Failure Rate
o Reliability with a Confidence Interval / B10
o Failure Free Operating Period (FFOP) / Maintenance
Free Operating Period (MFOP)
o Mean Cumulative Function (MCF)
o Rate of Occurrence of Failure (ROCOF)
What Are the Alternatives?
o Simply invert MTBF
o Advantages
  - More intuitive
  - No assumptions regarding constant failure rate
o Disadvantages
  - More challenging to incorporate time to repair
  - Can just invert MTBF (has anything really changed?)
Failure Rate
Common among part manufacturers
o 99% Reliability with 95% Confidence
o Advantages
  - More intuitive
  - No assumptions regarding constant failure rate
  - Forces a discussion on confidence levels (moves away from empirical handbooks)
o Disadvantages
  - More challenging to incorporate time to repair
Reliability with a Confidence Interval
Common among industrial controls, auto manufacturers
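To give a feel for what a statement like "99% reliability with 95% confidence" demands, the sketch below uses the zero-failure (success-run) binomial relation, n >= ln(1 - C) / ln(R). This is a standard textbook relation brought in for illustration; the presentation does not say which demonstration method is intended, and the function name is hypothetical.

```python
import math


def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Smallest n such that n units tested with zero failures demonstrates
    the given reliability at the given confidence (reliability**n <= 1 - C)."""
    n = math.log(1.0 - confidence) / math.log(reliability)
    return math.ceil(n)


n = zero_failure_sample_size(reliability=0.99, confidence=0.95)
print(n)  # 299
```

The sample size grows quickly as the reliability target rises, which is part of why this metric forces a discussion about confidence levels.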
o Time to 10% probability of failure
o Often thought of as the beginning of wearout
o A variant of reliability with a confidence level
B10
Common among moving parts that wear out (fans, motors, etc.)
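As an illustration, B10 can be computed in closed form if the life distribution is assumed to be a two-parameter Weibull, which is a common choice for wearout but is not stated in the presentation. The characteristic life (eta) and shape (beta) values below are hypothetical.

```python
import math


def b_life(percent_failed: float, eta: float, beta: float) -> float:
    """Time by which `percent_failed` percent of units are expected to fail,
    for a two-parameter Weibull with scale eta and shape beta."""
    p = percent_failed / 100.0
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)


# Hypothetical fan: eta = 50,000 h, beta = 3 (wearout-dominated).
b10 = b_life(10, eta=50_000, beta=3.0)

# With beta = 1 (constant failure rate) B10 is only about 10.5% of the MTBF.
b10_exponential = b_life(10, eta=50_000, beta=1.0)
print(round(b10), round(b10_exponential))
```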
o Initially proposed in the early 1980s
o Incorporated into MIL-STD-781D (failure-free period life tests)
o The idea was to take concepts from mechanical parts and apply them to electronic boxes
o Two approaches
  o The constant failure rate is low enough that the probability of failure stays below a specified value over a given period of time (possibly brings us right back to MTBF)
  o Replacement of exponential distributions with a three-parameter Weibull
Failure Free Operating Period (FFOP)
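The first approach can be sketched as follows: with a constant failure rate, the probability of failure by time t is 1 - exp(-t / MTBF), so the longest period that keeps this probability under a chosen limit follows directly. The 1% limit and the function name are assumptions made for illustration only.

```python
import math


def ffop_hours(mtbf_hours: float, max_failure_probability: float) -> float:
    """Longest period for which P(failure) stays at or below the limit,
    assuming an exponential (constant failure rate) life distribution."""
    return -mtbf_hours * math.log(1.0 - max_failure_probability)


ffop = ffop_hours(mtbf_hours=153_000, max_failure_probability=0.01)
print(round(ffop))  # roughly 1% of the MTBF
```

Because the result is just a rescaling of MTBF, this version of FFOP carries the same information, which is the "right back to MTBF" caveat on the slide.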
o Potentially valid concept for some mechanisms
o Major challenge is understanding gamma (γ)
o Requires large number of samples
o Need to characterize change as a function of stress
o Major benefit is changing the default conversation from ‘will fail’ to ‘will not fail’
Three Parameter Weibull
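A minimal sketch of the three-parameter Weibull reliability function shows why gamma (γ) matters: it is the location parameter, and no failures are predicted before t = γ. The parameter values are hypothetical.

```python
import math


def weibull3_reliability(t: float, beta: float, eta: float, gamma: float) -> float:
    """R(t) = exp(-((t - gamma) / eta) ** beta) for t > gamma, else 1.0."""
    if t <= gamma:
        return 1.0  # inside the failure-free period
    return math.exp(-(((t - gamma) / eta) ** beta))


# Hypothetical mechanism: failure-free for 1,000 h, then Weibull wearout.
inside_ffop = weibull3_reliability(500, beta=2.0, eta=10_000, gamma=1_000)
after_ffop = weibull3_reliability(11_000, beta=2.0, eta=10_000, gamma=1_000)
print(inside_ffop, round(after_ffop, 3))  # 1.0 0.368
```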
o Proposed by the UK Ministry of Defence in the mid-1990s
o Defined as a period of time, typically starting from initial use, during which the equipment performs its function without any unscheduled maintenance
o Concept is driven by the manufacturer taking some responsibility for maintenance (similar to performance-based logistics)
o Calculating MFOP requires an estimate of the survivability of the system during the maintenance-free period (MFOPS)
Maintenance Free Operating Period (MFOP)
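One way to picture MFOPS is as a conditional survival probability: the chance of completing the next maintenance-free period given survival to its start. The sketch below assumes a two-parameter Weibull life distribution and no renewal between periods; both are illustrative assumptions, and the presentation does not give a formula.

```python
import math


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    """Two-parameter Weibull survival function."""
    return math.exp(-((t / eta) ** beta))


def mfops(start_age: float, mfop_length: float, beta: float, eta: float) -> float:
    """P(survive the next MFOP | survived to start_age)."""
    return (weibull_reliability(start_age + mfop_length, beta, eta)
            / weibull_reliability(start_age, beta, eta))


# Hypothetical system: 500 h maintenance-free periods, beta = 2, eta = 5,000 h.
first_period = mfops(0, 500, beta=2.0, eta=5_000)
fifth_period = mfops(2_000, 500, beta=2.0, eta=5_000)
print(round(first_period, 3), round(fifth_period, 3))  # 0.99 0.914
```

With wearout (beta > 1) survivability drops for later periods, something a single MTBF value cannot express.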
o Different organizations have taken different approaches to MFOP
  o Some have applied FFOP to MFOP
  o Others have overlaid MFOP over MTBF metrics (MFOPS)
  o Many have used it to justify greater fault detection and fault tolerance (beyond safety)
o Benefits
  o Changes the conversation, does not assume constant failure rate, better for repairable systems, provides stronger financial motivation behind any reliability prediction
MFOP (cont.)
o Designed to replace the use of MTBF/MTBUR in
extrapolating field events
Mean Cumulative Function
Heavlin, 2005
o MTBF assumes independent and identically distributed lifetimes (iid)
o Recurrence data vs. life data (repairable systems are typically not iid). The order and duration can be critical
o Plot of cumulative failures vs system age (analogous to
cumulative hazard functions for non-repairable
systems)
MCF (cont.)
Heavlin, 2005
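A simple nonparametric MCF estimate can be sketched as follows: at each failure age, the curve steps up by one over the number of systems still under observation at that age. This follows the usual textbook estimator rather than anything specified in the presentation, and the three-system fleet is hypothetical.

```python
def mean_cumulative_function(systems):
    """systems: list of (failure_ages, end_of_observation_age) tuples.
    Returns a list of (age, estimated mean cumulative failures per system)."""
    events = sorted(age for failures, _ in systems for age in failures)
    mcf, points = 0.0, []
    for age in events:
        # Systems whose observation window still covers this age.
        at_risk = sum(1 for _, end in systems if end >= age)
        mcf += 1.0 / at_risk
        points.append((age, mcf))
    return points


# Hypothetical fleet: failure ages in hours, then the age at end of observation.
fleet = [
    ([100, 400], 500),
    ([250], 300),
    ([], 500),
]
points = mean_cumulative_function(fleet)
for age, value in points:
    print(age, round(value, 3))
```

Plotting these points against age gives the cumulative-failures curve described on the slide.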
o Once MCF is calculated and plotted, a number of statistical techniques are available
  o Cochran-Mantel-Haenszel (CMH) to identify outliers
  o Archetypal analysis to separate out groups of systems, detect trends, identify outliers
o Rate of Occurrence of Failure (ROCOF)
  o Derivative of MCF
MCF (cont.)
o Why ROCOF? Computing MTBF as operating hours / total failures only works if the failure rate is constant (exponential distribution)
o MCF is the expected value of the number of failures
over some time interval
o ROCOF is the instantaneous rate of change in the
expected number of failures
o Designed to measure the in-service performance of
repairable units
Rate of Occurrence of Failure (ROCOF)
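Since ROCOF is the derivative of the MCF, a rough estimate is the slope between consecutive MCF points. The sketch below uses hypothetical MCF values and a plain finite difference; it assumes the ages are strictly increasing.

```python
def rocof(mcf_points):
    """Finite-difference slope of the MCF between consecutive (age, mcf) points.
    Returns a list of (interval midpoint, failures per system per hour)."""
    rates = []
    for (t0, m0), (t1, m1) in zip(mcf_points, mcf_points[1:]):
        rates.append(((t0 + t1) / 2.0, (m1 - m0) / (t1 - t0)))
    return rates


# Hypothetical MCF: failures per system accumulate faster as the fleet ages.
mcf_points = [(0, 0.0), (1_000, 0.1), (2_000, 0.3), (3_000, 0.7)]
rates = rocof(mcf_points)

# A single averaged rate (the reciprocal of an MTBF-style figure) hides the trend.
average_rate = mcf_points[-1][1] / mcf_points[-1][0]
print(rates, average_rate)
```

In this made-up data the rate in the last interval is about four times the rate in the first, while the averaged figure sits in between.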
Plotting MCF
Hogge, 2012
Plotting ROCOF
Hogge, 2012
ROCOF vs. MTBF
ROCOF Insight
o There needs to be a discussion to determine if average
MTBF captures the true pain of failures
Where Does This Leave Us?
o There are a number of advantages in moving away
from the use of MTXX to predict and track reliability
o Use of other methodologies will improve maintenance
prediction and performance
o However, in a regulated industry, change is difficult
without the express backing of the regulator
o Look at DoD and MIL-HDBK-217!
Conclusion
Questions?