8/2/2019 Design for Reliability by Adesh
ADESH KUMAR
M.TECH-1ST YEAR
(MACHINE DESIGN)
JAMIA MILLIA ISLAMIA
NEW DELHI
DESIGN FOR RELIABILITY
-
Chapter Objectives
Introduce the need for design for reliability
List the main causes of reliability failures
How do failures relate to their mechanisms
Describe each failure
Propose design guidelines against the failure
-
What is Reliability?
Reliability is:
The ability of an item to perform its required
function under defined customer operating
conditions for a stated period of time.
The probability that no (system) failure will
occur in a given time interval
In research, the term reliability means "repeatability" or "consistency". A measure is
considered reliable if it would give us the same
result over and over again
-
Other Names of DFR
DFR has many aliases:
Design for Durability
Design for Robustness
Design for Useful Life
-
What do Reliability Engineers Do?
Implement Reliability Engineering Programs
across all functions
Engineering
Research
Manufacturing
Testing
Packaging
Field service
-
What is Probability?
Probability is:
A measure that describes the chance or likelihood that an event will occur.
The probability that event (A) occurs is represented by a number between 0 (zero) and 1.
When P(A) = 0, the event cannot occur.
When P(A) = 1, the event is certain to occur.
When P(A) = 0.5, the event is as likely to occur as it is not.
-
Cost-Reliability Functions
-
What are Noise Factors?
Noise Factors are sources of disturbing
influences that can disrupt the ideal function, causing error states which lead
to quality problems.
-
Reliability Terms
Mean Time To Failure (MTTF) for non-repairable systems
Mean Time Between Failures (MTBF) for repairable systems
Reliability (survival probability) $R(t)$
Failure probability (cumulative distribution function) $F(t) = 1 - R(t)$
Failure probability density $f(t)$
Failure rate (hazard rate) $\lambda(t)$
-
MTBF & MTTF
Mean Time Between Failures: applies to repairable items.
Mean Time To Failure: applies to non-repairable items.
Both of these terms indicate the average time an item
is expected to function before failure.
-
Reliability Function
Probability density function of failures:
$f(t) = \lambda e^{-\lambda t}$ for $t > 0$
Probability of failure from 0 to $T$:
$F(T) = 1 - e^{-\lambda T}$
Reliability function:
$R(T) = 1 - F(T) = e^{-\lambda T}$
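The three functions on this slide can be sketched in a few lines, assuming a constant failure rate. The function names and the sample rate of 1e-4 failures per hour are illustrative assumptions, not values from the slides.

```python
import math


def failure_density(lam: float, t: float) -> float:
    """Failure probability density f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)


def failure_probability(lam: float, T: float) -> float:
    """Probability of failure between 0 and T: F(T) = 1 - exp(-lam * T)."""
    return 1.0 - math.exp(-lam * T)


def reliability(lam: float, T: float) -> float:
    """Probability of surviving to T: R(T) = 1 - F(T) = exp(-lam * T)."""
    return math.exp(-lam * T)


lam = 1e-4  # assumed failure rate in failures per hour (illustrative only)
for T in (100, 1000, 10000):
    print(f"T = {T:>5} h: R = {reliability(lam, T):.4f}, F = {failure_probability(lam, T):.4f}")
```

At T = 1/lam (here 10000 hours, which equals the MTBF for this model) the reliability has already dropped to about 37%, so the MTBF is not a time by which most items are still working.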
-
Series Systems
$R_S = R_1 R_2 \cdots R_n$
[Figure: blocks 1, 2, ..., n connected in series]
-
Serial reliability
Series systems are also referred to as
weakest link or chain systems.
System failure is caused by the failure of
any one component.
Therefore, for a series system, the reliability
of the system is the product of the individual
component reliabilities
More components = less reliability
$\text{serial reliability} = \prod_{i=1}^{n} x_i$, where $x_i$ is the reliability of component $i$
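As a minimal sketch of the product rule for series systems, the helper below multiplies component reliabilities together. The function name and the 0.99 component reliability are assumptions chosen for illustration.

```python
from math import prod


def series_reliability(component_reliabilities):
    """Reliability of a series system: every component must work,
    so the system reliability is the product of the component reliabilities."""
    return prod(component_reliabilities)


# Illustrative values (not from the slides): identical components, each 99% reliable.
for n in (1, 3, 10, 50):
    print(n, "components:", round(series_reliability([0.99] * n), 4))
```

The printed values fall as n grows, which is the "more components = less reliability" point made above.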
-
Parallel reliability
$\text{parallel reliability} = 1 - \prod_{i=1}^{n} (1 - x_i)$, where $x_i$ is the reliability of component $i$
Parallel systems are also referred to as redundant.
The system fails only if all of the components fail.
Therefore, for a parallel system, the system probability of failure is the product of the individual component failure probabilities.
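The parallel rule can be sketched the same way: multiply the component failure probabilities, then subtract from one. The function name and the 0.90 component reliability are illustrative assumptions.

```python
from math import prod


def parallel_reliability(component_reliabilities):
    """Reliability of a parallel (redundant) system: the system fails only if
    every component fails, so multiply the failure probabilities (1 - r)."""
    return 1.0 - prod(1.0 - r for r in component_reliabilities)


# Illustrative values (not from the slides): components that are each 90% reliable.
for n in (1, 2, 3):
    print(n, "in parallel:", round(parallel_reliability([0.90] * n), 4))
```

This assumes the components fail independently of one another; a shared cause such as a common power supply would make the real figure lower.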
-
Series-Parallel Systems
Convert to equivalent series system
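[Figure: blocks A and B in series, followed by two C blocks in parallel, followed by block D. The parallel pair is replaced by one equivalent block C', giving the series chain A, B, C', D with reliabilities $R_A$, $R_B$, $R_{C'}$, $R_D$.]
$R_{C'} = 1 - (1 - R_C)(1 - R_C)$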
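As a worked sketch of this conversion, assume $R_A = R_B = R_D = 0.95$ and $R_C = 0.90$. These values are illustrative and do not come from the slides. Writing $R_{C'}$ for the single block that replaces the parallel pair of C components:

```latex
\begin{align*}
R_{C'} &= 1 - (1 - R_C)(1 - R_C) = 1 - (0.10)(0.10) = 0.99 \\
R_S    &= R_A \, R_B \, R_{C'} \, R_D = 0.95 \times 0.95 \times 0.99 \times 0.95 \approx 0.849
\end{align*}
```

With only one C block, the same chain would give $0.95^3 \times 0.90 \approx 0.772$, so duplicating the weakest component raises the system reliability by roughly eight percentage points under these assumed values.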
-
A Simple Example
A system has 4000 components, each with a failure rate of 0.02% per 1000 hours. Calculate $\lambda$ and the MTBF.
$\lambda = (0.02 / 100) \times (1 / 1000) \times 4000 = 8 \times 10^{-4}$ failures/hour
$\text{MTBF} = 1 / (8 \times 10^{-4}) = 1250$ hours
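The same arithmetic can be checked with a short script. The function name is an assumption; the inputs are the figures given in the example, and the calculation assumes all components are in series so that their failure rates add.

```python
def system_failure_rate(n_components: int, percent_per_1000_hours: float) -> float:
    """System failure rate in failures per hour for n identical components in series.

    The per-component rate is given as a percentage per 1000 hours, so it is
    divided by 100 (percent to fraction) and by 1000 (per 1000 hours to per hour).
    """
    per_component_per_hour = (percent_per_1000_hours / 100.0) / 1000.0
    return n_components * per_component_per_hour


lam = system_failure_rate(4000, 0.02)
mtbf_hours = 1.0 / lam
print(f"lambda = {lam:.1e} failures/hour, MTBF = {mtbf_hours:.0f} hours")
```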
-
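An Example
A first-generation computer contains 10000 components, each with $\lambda = 0.5\%$ per 1000 hours. What is the period of 99% reliability?
For $t$ much smaller than the MTBF, $1 - R(t) \approx t / \text{MTBF}$, so
$\text{MTBF} = t / (1 - R(t)) = t / (1 - 0.99)$
$t = \text{MTBF} \times 0.01 = 0.01 / \lambda_{av}$
where $\lambda_{av}$ is the average failure rate of the system,
$N$ = number of components = 10000, and
$\lambda$ = failure rate of a component = 0.5% / (1000 hours) = 0.005 / 1000 = $5 \times 10^{-6}$ per hour.
Therefore, $\lambda_{av} = N\lambda = 10000 \times 5 \times 10^{-6} = 5 \times 10^{-2}$ per hour.
Therefore, $t = 0.01 / (5 \times 10^{-2}) = 0.2$ hours = 12 minutes.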
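The 12-minute result relies on a linear approximation of the exponential reliability function. The sketch below compares it with the exact solution of R(t) = exp(-lambda t) = 0.99. The function names are assumptions; the inputs are the figures from the example.

```python
import math


def time_to_reliability_exact(lam: float, target: float) -> float:
    """Solve exp(-lam * t) = target for t."""
    return -math.log(target) / lam


def time_to_reliability_approx(lam: float, target: float) -> float:
    """Linear approximation used on the slide: 1 - R(t) is roughly lam * t."""
    return (1.0 - target) / lam


lam_av = 10000 * 5e-6  # system failure rate per hour, from the example
approx_minutes = time_to_reliability_approx(lam_av, 0.99) * 60
exact_minutes = time_to_reliability_exact(lam_av, 0.99) * 60
print(f"approximate: {approx_minutes:.2f} min, exact: {exact_minutes:.2f} min")
```

The two results differ by only a few seconds here, because 1% unreliability is well inside the range where the linear approximation holds.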
-
Reliability Failure Modes
Failures may be SUDDEN (non-predictable) or GRADUAL (predictable). They may also be PARTIAL or COMPLETE.
A Catastrophic failure is both sudden and complete.
A Degradation failure is both gradual and partial.
Two root causes:
1. Lack of robustness
2. Mistakes
-
Causes of Failure
Misuse: failures attributable to the application of
stresses beyond the stated capabilities of the item.
Inherent Weakness: failures attributable to
weakness inherent in the item itself when subjected
to stresses within the stated capabilities of the item.
-
Classifications of Reliability Failure
Early stage failure: causes of this type of failure are inadequate design, poor manufacturing, and inappropriate usage. These failures can be catastrophic to human life.
Overstress Mechanisms: these occur due to an insufficient safety factor in design, higher than expected random loads, human errors, and misapplication.
Wearout Mechanisms: these occur late in life and then increase with age. Causes include corrosion, material fatigue, poor maintenance, creep, and degradation in strength.
-
Common Measures of Unreliability
% Failure - % of failures in a total population
MTTF (Mean Time To Failure) - the average time of
operation to first failure.
MTBF (Mean Time Between Failures) - the average time
between product failures.
Repairs Per Thousand (R/1000)
Bq Life - the life at which q% of the population will have failed
-
Cumulative Failure Rate Curve
-
The Bathtub Curve
Reliability specialists often describe the lifetime of a population of products using a graphical
representation called the bathtub curve. The
bathtub curve consists of three periods: an infant
mortality period with a decreasing failure rate
followed by a normal life period (also known as
"useful life") with a low, relatively constant failure
rate and concluding with a wear-out period that exhibits an increasing failure rate.
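One common way to model the three periods, which is not taken from the slides, is the Weibull hazard function: its shape parameter determines whether the failure rate falls, stays flat, or rises with age. A minimal sketch with assumed, illustrative parameter values:

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Assumed shape values, one per bathtub period (illustrative only).
PHASES = {
    "infant mortality (shape < 1)": 0.5,
    "useful life (shape = 1)": 1.0,
    "wear-out (shape > 1)": 3.0,
}
SCALE = 1000.0  # assumed characteristic life in hours

for label, shape in PHASES.items():
    early = weibull_hazard(100.0, shape, SCALE)
    late = weibull_hazard(900.0, shape, SCALE)
    print(f"{label}: h(100) = {early:.6f}, h(900) = {late:.6f}")
```

A shape of exactly 1 reduces the Weibull model to the constant failure rate of the exponential model, which is why the exponential model is normally applied only to the useful-life period.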
-
[Figure: Reliability. Probability of dying in the next year (deaths/1000) plotted against age, 0 to 90 years.]
From the Statistical Bulletin 79, no 1, Jan-Mar 1998
-
Steps in Designing for Reliability
1. Develop a Reliability Plan
Determine Which Reliability Tools are
Needed
2. Analyze Noise Factors
3. Tests for Reliability
4. Track Failures and Determine Corrective
Actions
-
Develop a Reliability Plan
Planning for reliability is just as important as planning for design and manufacturing.
Why?
To determine:
the useful life of the product
what accelerated life testing is to be used
Reliability must be as close to perfect as possible for the product's useful life.
You MUST know where your product's major points of failure are!
-
Tools for testing
Stress Analysis
Reliability Predictions (MTBF)
FMEA (Failure Mode and Effects Analysis)
Fault Tree Analysis
Reliability Block Diagrams
-
Why Do Reliability Calculations?
Reliability calculations help make the product
more reliable, which the marketing department can use as a selling
feature. This also
adds to the company's reputation and can
be used for comparisons with the competition.
-
Stress Analysis
It establishes the presence of a safety margin
thus enhancing system life. Stress analysis
provides input data for reliability prediction. It is based on customer requirements.
-
Reliability Predictions (MTBF)
MTBF (Mean Time between Failures) for an
existing product can be found by studying field
failure data. For a new product however, or if
significant changes are made to the design, it may be required to estimate or calculate MTBF before
any field data is available.
-
Failure Modes and Effects Analysis
Failure modes and effects analysis (FMEA) is a qualitative technique for understanding the
behaviour of components in an engineered system.
The objective is to determine the influence of
component failure on other components, and on
the system as a whole.
FMEA can also be used as a stand-alone procedure
for relative ranking of failure modes that screens them according to risk.
-
Failure mode and effects analysis
(FMEA)
Failure Mode: consider each component or functional block and how it can fail.
Determine the Effect of each failure mode, and the severity on system function.
Determine the likelihood of occurrence and of detecting the failure.
Calculate the Risk Priority Number (RPN = Severity x Occurrence x Detection).
Consider corrective actions (these may reduce severity or occurrence, or increase the probability of detection).
Start with the higher RPN values (most severe problems) and work down.
Recalculate the RPN after the corrective actions have been determined; the aim is to minimize RPN.
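The RPN calculation and ranking step can be sketched as follows. The failure modes and scores are hypothetical, and the 1 to 10 rating scale (with a higher detection score meaning the failure is harder to detect) is a common convention assumed here rather than something the slides specify.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # assumed scale: 1 (negligible) to 10 (hazardous)
    occurrence: int  # assumed scale: 1 (rare) to 10 (almost certain)
    detection: int   # assumed scale: 1 (always detected) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        """Risk Priority Number = Severity x Occurrence x Detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes, for illustration only.
modes = [
    FailureMode("seal leak", severity=7, occurrence=4, detection=3),
    FailureMode("connector corrosion", severity=5, occurrence=6, detection=6),
    FailureMode("bearing seizure", severity=9, occurrence=2, detection=4),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.name}: RPN = {mode.rpn}")
```

In this made-up data the most severe failure mode has the lowest RPN, which is why many teams also review high-severity items separately instead of relying on the RPN ranking alone.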
-
Reliability Block Diagrams
Most systems are defined through a combination of both series and parallel connections of subsystems.
Reliability block diagrams (RBD) represent a system using interconnected blocks arranged in combinations of series and/or parallel configurations.
They can be used to analyze the reliability of a system quantitatively.
Reliability block diagrams can consider active and stand-by states to get estimates of reliability, and availability (or unavailability) of the system.
Reliability block diagrams may be difficult to construct for very complex systems.
-
CASE STUDY: Network Storage Evaluations Using Reliability Calculations
This section uses a case study to introduce
concepts and calculations for systematically
comparing redundancy and reliability factors as they apply to network storage configurations. We
will determine a reliability figure on three very
basic architectures. The starting point of our study
is the network storage requirements.
-
Network Storage Requirements
We want networked storage that has access to one
server. Later, this storage will be accessible to other
servers. The server is already in place, and has been
designed to sustain single component hardware failures
(with dual host bus adapters (HBAs), for example).
Data on this storage must be mirrored, and the storage
access must also stand up to hardware failures. The
cost of the storage system must be reasonable, while
still providing good performance.
-
Architecture 1
Architecture 1 provides the basic storage necessities we are looking for, with the following advantages and disadvantages:
Advantages:
Storage is accessible if one of the links is down.
Storage A is mirrored onto B.
Other servers can be connected to the concentrator to access the storage.
Disadvantages:
If the concentrator fails, we have no more access to the storage. This concentrator is a single point of failure (SPOF).
-
Architecture 2
Architecture 2 has been improved to take into account the previous SPOF. A concentrator has been added.
Advantages:
If any links or components go down, storage is still accessible (resilient to hardware failures).
Data is mirrored (Disk A to Disk B).
Other servers can be connected to both concentrators to access the storage.
-
Architecture 3
The main difference is that Disk A and Disk B have only one data path. Disk A is still mirrored to Disk B, as required.
This architecture has all the advantages of the previous architectures with the following differences:
Disk A can only be accessed through Link C, and Disk B only through Link D.
There is no data multipathing software layer, which results in easier administration and easier troubleshooting.
-
Determining Reliability
Using the reliability formulas, we can determine which architecture has the highest reliability value. For the purpose of this case study, we will use sample MTBF values (as obtained from the manufacturer) and AFR* (Annual Failure Rate) values shown in the table below:
*(The AFR for each component was calculated from the MTBF as AFR = 8760/MTBF, where 8760 is the number of hours in a year.) The example MTBF values were taken from real network storage component statistics. However, such values vary greatly, and these numbers are given here purely for illustration.
-
Determining Reliability
Component        AFR Variable   Sample MTBF Values (hours)   AFR
HBA 1            H              800,000                      0.011
HBA 2            H              800,000                      0.011
Link A           L              400,000                      0.022
Link B           L              400,000                      0.022
Concentrator 1   C              580,000                      0.0151
Concentrator 2   C              580,000                      0.0151
Link C           L              400,000                      0.022
Link D           L              400,000                      0.022
Disk A           D              1,000,000                    0.0088
Disk B           D              1,000,000                    0.0088
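The AFR column can be reproduced from the MTBF column with the 8760/MTBF rule stated in the case study. The dictionary keys and function name below are illustrative; the MTBF figures are the sample values from the table.

```python
HOURS_PER_YEAR = 8760


def annual_failure_rate(mtbf_hours: float) -> float:
    """AFR = hours in a year divided by MTBF in hours."""
    return HOURS_PER_YEAR / mtbf_hours


# Sample MTBF values (hours) from the table, one entry per component type.
sample_mtbf = {
    "HBA": 800_000,
    "Link": 400_000,
    "Concentrator": 580_000,
    "Disk": 1_000_000,
}

afr = {name: annual_failure_rate(mtbf) for name, mtbf in sample_mtbf.items()}
for name, value in afr.items():
    print(f"{name}: AFR = {value:.5f}")
```

The unrounded results sit slightly below or above the tabulated figures, which appear to be rounded to two significant figures (three for the concentrator).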
-
Determining Reliability
Having the rate of failure of each individual component, we can obtain the system's annual failure rate (AFR) and consequently the system reliability (R) and system MTBF values. The AFR of a set of redundant components is raised to the power equal to the number of redundant components. The AFR value of a non-redundant component is multiplied by the number of those components in series (that is, the AFRs of components in series add).
-
Calculation
In the case of Architecture 1, the concentrator (C) is the only non-redundant component.
$\mathrm{AFR}_1 = (H + L)^2 + C + L^2 + D^2$
$\mathrm{AFR}_1 = (0.011 + 0.022)^2 + 0.0151 + (0.022)^2 + (0.0088)^2 = 0.0167$
$R_1 = 1 - \mathrm{AFR}_1 = 1 - 0.0167 = 0.9833$, or 98.33%
$\mathrm{MTBF}_1 = 8760 / \mathrm{AFR}_1 = 8760 / 0.0167 = 524{,}551$ hours.
-
Calculation
Architecture 2 has a different configuration with no non-redundant components.
$\mathrm{AFR}_2 = (H + L + C + L)^2 + D^2$
$\mathrm{AFR}_2 = (0.011 + 0.022 + 0.0151 + 0.022)^2 + (0.0088)^2 = 0.0050$
$R_2 = 1 - \mathrm{AFR}_2 = 1 - 0.0050 = 0.9950$, or 99.50%
$\mathrm{MTBF}_2 = 8760 / \mathrm{AFR}_2 = 8760 / 0.0050 = 1{,}752{,}000$ hours.
-
Calculation
Architecture 3 has yet another configuration and has no non-redundant components.
$\mathrm{AFR}_3 = (H + L + C + L + D)^2$
$\mathrm{AFR}_3 = (0.011 + 0.022 + 0.0151 + 0.022 + 0.0088)^2 = 0.0062$
$R_3 = 1 - \mathrm{AFR}_3 = 1 - 0.0062 = 0.9938$, or 99.38%
$\mathrm{MTBF}_3 = 8760 / \mathrm{AFR}_3 = 8760 / 0.0062 = 1{,}412{,}903$ hours.
-
Conclusion
When the calculations are complete, we compare the data:
Architecture 1 = 98.33%, or a system MTBF of 524,551 hours
Architecture 2 = 99.50%, or a system MTBF of 1,752,000 hours
Architecture 3 = 99.38%, or a system MTBF of 1,412,903 hours
The MTBF figures are the most revealing, and indicate that architecture 2 is statistically the most reliable of all.
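The comparison can be reproduced with a short script that applies the case study's own AFR expressions to the tabulated component values. The variable names are illustrative. Because the script keeps the unrounded system AFR, its MTBF figures differ slightly from the slide values, which divide 8760 by an AFR rounded to four decimal places.

```python
HOURS_PER_YEAR = 8760

# Component AFR values as given in the case study's table.
H = 0.011   # host bus adapter
L = 0.022   # link
C = 0.0151  # concentrator
D = 0.0088  # disk

# System AFR expressions as written on the calculation slides.
system_afr = {
    "Architecture 1": (H + L) ** 2 + C + L ** 2 + D ** 2,
    "Architecture 2": (H + L + C + L) ** 2 + D ** 2,
    "Architecture 3": (H + L + C + L + D) ** 2,
}

for name, afr in system_afr.items():
    reliability = 1 - afr
    mtbf_hours = HOURS_PER_YEAR / afr
    print(f"{name}: AFR = {afr:.4f}, R = {reliability:.2%}, MTBF = {mtbf_hours:,.0f} hours")

most_reliable = min(system_afr, key=system_afr.get)
print("Lowest system AFR:", most_reliable)
```

The ranking is unaffected by the rounding: Architecture 2 has the lowest system AFR, followed by Architecture 3 and then Architecture 1.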
-
Failure Effects
(What customer experiences)
Noise
Inoperability
Instability
Intermittent operation
Roughness
Excessive effort requirements
Unpleasant or unusual odor
Poor appearance
-
Factors Affecting Reliability
Design & Manufacture:
Pre-Production Design
Control of Production
Working Tolerances
Material Quality
Component Quality
Component Stress
Installation & Environmental:
Temperature
Humidity
Vibration
Chemical Attack
Interconnections
-
Design against failure
It is important to understand the failure (why, where, how long, application, etc.).
Two methods for design against failure:
1. By reducing the stresses that cause the failure.
2. By increasing the strength of the component.
Either one can be achieved by:
Selecting materials
Changing the package geometry
Changing the dimensions
Protection
-
Fatigue Failure?
Fatigue is the most common mechanism of failure and is responsible for 90% of all structural and
electrical failures.
Occurs in metals, polymers, and ceramics.
Metal paper clip example
Bend in both directions
Repeat the process
-
Design Against Fatigue Failure
Increase fatigue strength.
Reduce the amplitude of cyclic loading.
Avoid regions of stress concentration.
-
Design Against Brittle Fracture
Brittle fracture is an overstress failure mechanism that occurs rapidly with little or no warning when the induced stress in the component exceeds the fracture strength of the material.
Occurs in brittle materials (ceramics, glasses
and silicon).
Applied stress and work could break the atomic bonds.
-
Design Guidelines to Reduce
Brittle Fracture
Designs with materials and processing
conditions that would produce the least
stress in brittle materials should be created.
The brittle material should be polished to
remove surface flaws to enhance reliability.
-
Design Against Creep Failure
What is Creep?
A time-dependent deformation process under load.
Thermally-activated process: the rate of deformation for a given stress level increases significantly with temperature.
Deformation depends on:
1. The applied load.
2. The duration through which the load is applied.
3. Elevated temperature.
-
Design Against Creep Failure
Creep can occur at any stress level.
Creep is most important at elevated temperatures.
-
Design Guidelines to Reduce Creep-
Induced Failure.
Use materials with high melting point if the
application calls for harsh temperature conditions.
Reduction of mechanical stress will reduce creep
deformation.
Creep is a time controlled phenomenon.
-
Design Against Plastic Deformation
What is Plastic Deformation?
Plastic deformation occurs when the applied mechanical stress exceeds the elastic limit or yield point of a material.
It is permanent.
Excessive deformation and continued accumulation of plastic strain due to cyclic loading
will eventually lead to cracking of the component and make it unusable.
-
Design Guidelines Against Plastic
Deformation
Limit the design stresses in the packaging structure
below the yield strength of the materials used. If
possible, use materials that have high yield strength.
Design and control the local plastic deformation at
regions of stress concentrations.
-
Chemically Induced Failures
What are Chemically Induced Failures?
Chemical processes such as electrochemical
reactions can result in cracking of components
leading to electrical failures.
Two Types
Corrosion
Intermetallic Diffusion
-
Design Against Corrosion-Induced
Failure
What is Chemical Corrosion?
The chemical or electrochemical reaction between a material, usually a metal, and its environment that produces a deterioration of the material and its properties.
-
Design Guidelines to Reduce
Corrosion
Metals with a high oxidation potential tend to corrode faster.
Use hermetic packages to prevent moisture absorption.
Ensure that no moisture or contaminants are trapped during the processing and assembly of the packages.
-
Design Against Intermetallic
Diffusion
What is Intermetallic Diffusion?
During wirebonding and solder reflow, the
joining process generates intermetallic layers
which are byproducts of the joining process.
-
Design Guidelines Against
Intermetallic Diffusion
Limit the process temperatures and control the time exposed to high temperatures during the joining process.
Control the temperature range and cycles of exposure at the high temperature period.
Apply a nickel/gold coating on the bare copper pad surfaces.
-
Achieving reliability growth
Detect failure causes
Feedback
Redesign
Improved fabrication
Verification of redesign
-