design for reliability by adesh

Upload: delhiites-adesh

Post on 05-Apr-2018


  • 8/2/2019 Design for Reliability by Adesh

    1/68

    ADESH KUMAR

    M.TECH-1ST YEAR

    (MACHINE DESIGN)

    JAMIA MILLIA ISLAMIA

    NEW DELHI

    DESIGN FOR RELIABILITY


    Chapter Objectives

    Introduce the need for design for reliability

    List the main causes of reliability failures

    Explain how failures relate to their mechanisms

    Describe each failure

    Propose design guidelines against the failure


    What is Reliability?

    Reliability is:

    The ability of an item to perform its required

    function under defined customer operating

    conditions for a stated period of time.

    The probability that no (system) failure will

    occur in a given time interval

    In research, the term reliability means "repeatability" or "consistency". A measure is considered reliable if it would give us the same result over and over again.


    Other Names of DFR

    DFR has many aliases:

    Design for Durability

    Design for Robustness

    Design for Useful Life


    What do Reliability Engineers Do?

    Implement Reliability Engineering Programs

    across all functions

    Engineering

    Research

    Manufacturing

    Testing

    Packaging

    Field service


    What is Probability?

    Probability is:

    A measure that describes the chance or likelihood that an event will occur.

    The probability that event (A) occurs is represented by a number between 0 (zero) and 1.

    When P(A) = 0, the event cannot occur.

    When P(A) = 1, the event is certain to occur.

    When P(A) = 0.5, the event is as likely to occur as it is not.


    Cost-Reliability Functions


    What are Noise Factors?

    Noise Factors are sources of disturbing influences that can disrupt the ideal function, causing error states which lead to quality problems.


    Reliability Terms

    Mean Time To Failure (MTTF), for non-repairable systems

    Mean Time Between Failures (MTBF), for repairable systems

    Reliability (survival probability): $R(t)$

    Failure probability (cumulative distribution function): $F(t) = 1 - R(t)$

    Failure probability density: $f(t)$

    Failure rate (hazard rate): $\lambda(t)$


    MTBF & MTTF

    Mean Time Between Failures: applies to repairable items.

    Mean Time To Failure: applies to non-repairable items.

    Both of these terms indicate the average time an item is expected to function before failure.


    Reliability Function

    Probability density function of failures: $f(t) = \lambda e^{-\lambda t}$ for $t > 0$

    Probability of failure from 0 to $T$: $F(T) = 1 - e^{-\lambda T}$

    Reliability function: $R(T) = 1 - F(T) = e^{-\lambda T}$
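
    The three functions on this slide can be evaluated directly. The following is a minimal sketch assuming a constant failure rate; the function names and the sample value of lambda are illustrative and do not come from the slides.

```python
import math

def failure_density(lam, t):
    """f(t) = lambda * exp(-lambda * t), valid for t > 0."""
    return lam * math.exp(-lam * t)

def failure_probability(lam, t):
    """F(T) = 1 - exp(-lambda * T): probability of failure in (0, T)."""
    return 1.0 - math.exp(-lam * t)

def reliability(lam, t):
    """R(T) = 1 - F(T) = exp(-lambda * T)."""
    return math.exp(-lam * t)

lam = 1e-3        # assumed failure rate, failures per hour
t = 1000.0        # hours; equal to 1/lambda for this sample value
print(f"R({t:.0f} h) = {reliability(lam, t):.4f}")
print(f"F({t:.0f} h) = {failure_probability(lam, t):.4f}")
```

    With these sample values the operating time equals 1/lambda, so the reliability is exp(-1), a little under 37%.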


    Series Systems

    $R_S = R_1 R_2 \cdots R_n$

    [Diagram: blocks 1, 2, ..., n connected in series.]


    Serial reliability

    Series systems are also referred to as

    weakest link or chain systems.

    System failure is caused by the failure of

    any one component.

    Therefore, for a series system, the reliability

    of the system is the product of the individual

    component reliabilities

    More components = less reliability

    $\text{serial reliability} = \prod_{i=1}^{n} x_i$


    Parallel reliability

    $\text{parallel reliability} = 1 - \prod_{i=1}^{n} (1 - x_i)$

    Parallel systems are also referred to as redundant.

    The system fails only if all of the components fail.

    Therefore, for a parallel system, the system probability of failure is the product of the individual component probabilities of failure.
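
    The serial and parallel formulas can be compared side by side. This is a small sketch, assuming independent components; the function names and the 0.9 component reliability are illustrative.

```python
from math import prod

def series_reliability(component_reliabilities):
    """Product of the component reliabilities (any single failure fails the system)."""
    return prod(component_reliabilities)

def parallel_reliability(component_reliabilities):
    """1 minus the product of the component failure probabilities
    (the system fails only if every component fails)."""
    return 1.0 - prod(1.0 - x for x in component_reliabilities)

components = [0.9, 0.9, 0.9]   # assumed sample values
print(f"series:   {series_reliability(components):.3f}")
print(f"parallel: {parallel_reliability(components):.3f}")
```

    Three components at 0.9 each give 0.729 in series and 0.999 in parallel, which illustrates why adding series components lowers reliability while redundancy raises it.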


    Series-Parallel Systems

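    Convert to an equivalent series system:

    [Diagram: A and B in series, followed by two C blocks in parallel, followed by D. The parallel pair is replaced by a single equivalent block C', giving the series chain A, B, C', D.]

    $R_{C'} = 1 - (1 - R_C)(1 - R_C)$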
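
    As a sketch of the reduction, the parallel pair of C blocks is collapsed first and the result is then treated as one more block in the series chain. The component reliabilities below are assumed values chosen only for illustration, and independent failures are assumed.

```python
# Assumed sample reliabilities for blocks A, B, C and D (not from the slides)
r_a, r_b, r_c, r_d = 0.95, 0.98, 0.90, 0.99

# Step 1: replace the two parallel C blocks by an equivalent block C'
r_c_equiv = 1.0 - (1.0 - r_c) * (1.0 - r_c)

# Step 2: the system is now a pure series chain A, B, C', D
r_system = r_a * r_b * r_c_equiv * r_d

print(f"R_C'     = {r_c_equiv:.4f}")
print(f"R_system = {r_system:.4f}")
```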


    A Simple Example

    A system has 4000 components, each with a failure rate of 0.02% per 1000 hours. Calculate $\lambda$ and the MTBF.

    $\lambda = (0.02 / 100) \times (1 / 1000) \times 4000 = 8 \times 10^{-4}$ failures/hour

    $\text{MTBF} = 1 / (8 \times 10^{-4}) = 1250$ hours
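
    The same arithmetic as a short script. It assumes, as the example implicitly does, that the components are in series with constant failure rates, so the component rates simply add. Variable names are illustrative.

```python
n_components = 4000
rate_percent_per_1000h = 0.02          # failure rate of one component

# Convert "% per 1000 hours" to failures per hour for one component
lam_component = (rate_percent_per_1000h / 100) / 1000

# Series system with constant rates: the system rate is the sum of component rates
lam_system = n_components * lam_component
mtbf_hours = 1 / lam_system

print(f"lambda = {lam_system:.1e} failures/hour")
print(f"MTBF   = {mtbf_hours:.0f} hours")
```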


    An Example

    A first-generation computer contains 10000 components, each with $\lambda = 0.5\%/(1000 \text{ hours})$. What is the period of 99% reliability?

    For small $t$, $R(t) \approx 1 - t/\text{MTBF}$, so $\text{MTBF} = t / (1 - R(t)) = t / (1 - 0.99)$

    $t = \text{MTBF} \times 0.01 = 0.01 / \lambda_{av}$, where $\lambda_{av}$ is the average failure rate

    $N$ = number of components = 10000

    $\lambda$ = failure rate of a component = 0.5% / (1000 hours) = 0.005/1000 = $5 \times 10^{-6}$ per hour

    Therefore, $\lambda_{av} = N\lambda = 10000 \times 5 \times 10^{-6} = 5 \times 10^{-2}$ per hour

    Therefore, $t = 0.01 / (5 \times 10^{-2}) = 0.2$ hours $= 12$ minutes
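
    The 12-minute figure relies on a linear approximation of the reliability function. The sketch below repeats that calculation and, for comparison, solves the exponential reliability function exactly; the comparison is an addition for illustration and assumes a constant failure rate.

```python
import math

n_components = 10_000
lam_component = 0.005 / 1000            # 0.5% per 1000 hours, in failures per hour
lam_av = n_components * lam_component   # system failure rate, per hour
target_reliability = 0.99

# Linear approximation used in the example: R(t) ~ 1 - lam_av * t
t_approx_h = (1 - target_reliability) / lam_av

# Exact solution of R(t) = exp(-lam_av * t)
t_exact_h = -math.log(target_reliability) / lam_av

print(f"approximate: {t_approx_h * 60:.2f} minutes")
print(f"exact:       {t_exact_h * 60:.2f} minutes")
```

    The two results differ by well under a minute, so the approximation is adequate at this reliability level.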


    Reliability Failure Modes

    Failures may be SUDDEN (non-predictable) or GRADUAL (predictable). They may also be PARTIAL or COMPLETE.

    A Catastrophic failure is both sudden and complete.

    A Degradation failure is both gradual and partial.

    Two root causes:
    1. Lack of robustness
    2. Mistakes


    Causes of Failure

    Misuse: failures attributable to the application of stresses beyond the stated capabilities of the item.

    Inherent Weakness: failures attributable to weakness inherent in the item itself when subjected to stresses within the stated capabilities of the item.


    Classifications of Reliability Failure

    Early-stage failure: causes of this type of failure are inadequate design, poor manufacturing, and inappropriate usage. These failures can be catastrophic to human life.

    Overstress mechanisms: these occur due to an insufficient safety factor in design, higher-than-expected random loads, human error, and misapplication.

    Wearout mechanisms: these occur late in life and then increase with age. Causes include corrosion, material fatigue, poor maintenance, creep, and degradation in strength.


    Common Measures of Unreliability

    % Failure - % of failures in a total population

    MTTF (Mean Time To Failure) - the average time of

    operation to first failure.

    MTBF (Mean Time Between Failures) - the average time between product failures.

    Repairs Per Thousand (R/1000)

    Bq Life - the life at which q% of the population will have failed
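
    As an illustration of the Bq measure, the sketch below computes it under the constant-failure-rate model, where F(t) = 1 - exp(-lambda t). That model choice is an assumption made here; other life distributions give different Bq values. The function name is hypothetical, and the rate is the 8 x 10^-4 failures/hour from the simple example.

```python
import math

def bq_life(q_percent, lam):
    """Time by which q% of the population has failed,
    assuming F(t) = 1 - exp(-lam * t)."""
    return -math.log(1.0 - q_percent / 100.0) / lam

lam = 8e-4                       # failures per hour
b10 = bq_life(10, lam)
print(f"B10 life = {b10:.1f} hours")
```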


    Cumulative Failure Rate Curve


    The Bathtub Curve

    Reliability specialists often describe the lifetime of a population of products using a graphical representation called the bathtub curve. The bathtub curve consists of three periods: an infant mortality period with a decreasing failure rate, followed by a normal life period (also known as "useful life") with a low, relatively constant failure rate, and concluding with a wear-out period that exhibits an increasing failure rate.
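
    One common way to reproduce the three regions numerically is the Weibull hazard function, whose shape parameter controls whether the failure rate falls, stays constant, or rises. The Weibull model is not mentioned in the slides; it is used here only as an illustrative assumption, and the parameter values are arbitrary.

```python
def weibull_hazard(t, beta, eta):
    """Weibull hazard rate h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

eta = 1.0                        # assumed scale parameter
regions = {
    "infant mortality (beta = 0.5)": 0.5,   # decreasing failure rate
    "useful life (beta = 1.0)": 1.0,        # constant failure rate
    "wear-out (beta = 3.0)": 3.0,           # increasing failure rate
}

for label, beta in regions.items():
    early = weibull_hazard(1.0, beta, eta)
    late = weibull_hazard(2.0, beta, eta)
    print(f"{label}: h(1) = {early:.3f}, h(2) = {late:.3f}")
```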


    Reliability

    [Figure: probability of dying in the next year (deaths/1000) versus age.]

    From the Statistical Bulletin 79, no 1, Jan-Mar 1998


    Steps in Designing for Reliability

    1. Develop a Reliability Plan

    Determine Which Reliability Tools are

    Needed

    2. Analyze Noise Factors

    3. Tests for Reliability

    4. Track Failures and Determine Corrective

    Actions


    Develop a Reliability Plan

    Planning for reliability is just as important as planning for design and manufacturing.

    Why?

    To determine:

    the useful life of the product

    what accelerated life testing is to be used

    Reliability must be as close to perfect as possible for the product's useful life.

    You MUST know where your product's major points of failure are!


    Tools for testing

    Stress Analysis

    Reliability Predictions (MTBF)

    FMEA (Failure Mode and Effects Analysis)

    Fault Tree Analysis

    Reliability Block Diagrams


    Why Do Reliability Calculations?

    Reliability calculations help make the product more reliable, which the marketing department can use as a selling feature. This also adds to the company's reputation and supports comparisons with competitors.


    Stress Analysis

    It establishes the presence of a safety margin, thus enhancing system life. Stress analysis provides input data for reliability prediction. It is based on customer requirements.


    Reliability Predictions (MTBF)

    MTBF (Mean Time Between Failures) for an existing product can be found by studying field failure data. For a new product, however, or if significant changes are made to the design, it may be required to estimate or calculate MTBF before any field data is available.


    Failure Modes and Effects Analysis

    Failure modes and effects analysis (FMEA) is a qualitative technique for understanding the behaviour of components in an engineered system.

    The objective is to determine the influence of component failure on other components, and on the system as a whole.

    FMEA can also be used as a stand-alone procedure for relative ranking of failure modes, screening them according to risk.


    Failure mode and effects analysis (FMEA)

    Failure Mode: consider each component or functional block and how it can fail.

    Determine the Effect of each failure mode, and its severity on system function.

    Determine the likelihood of occurrence and of detecting the failure.

    Calculate the Risk Priority Number (RPN = Severity × Occurrence × Detection).

    Consider corrective actions (these may reduce severity or occurrence, or increase the probability of detection).

    Start with the highest RPN values (most severe problems) and work down.

    Recalculate RPN after the corrective actions have been determined; the aim is to minimize RPN.
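
    The ranking step can be sketched in a few lines. The failure modes and scores below are invented for illustration, and the 1 to 10 scoring scale (with a higher detection score meaning the failure is harder to detect) is a common convention assumed here rather than something the slides specify.

```python
# Hypothetical failure modes with assumed 1-10 scores
failure_modes = [
    {"mode": "seal leak", "severity": 7, "occurrence": 4, "detection": 3},
    {"mode": "bearing wear", "severity": 5, "occurrence": 6, "detection": 2},
    {"mode": "connector corrosion", "severity": 8, "occurrence": 2, "detection": 7},
]

def rpn(failure_mode):
    """Risk Priority Number = Severity x Occurrence x Detection."""
    return (failure_mode["severity"]
            * failure_mode["occurrence"]
            * failure_mode["detection"])

# Work from the highest RPN down, as the procedure suggests
ranked = sorted(failure_modes, key=rpn, reverse=True)
for fm in ranked:
    print(f"{fm['mode']}: RPN = {rpn(fm)}")
```

    After corrective actions are chosen, the scores would be updated and the same function rerun to confirm that the RPN values have dropped.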


    Reliability Block Diagrams

    Most systems are defined through a combination of both series and parallel connections of subsystems.

    Reliability block diagrams (RBD) represent a system using interconnected blocks arranged in combinations of series and/or parallel configurations.

    They can be used to analyze the reliability of a system quantitatively.

    Reliability block diagrams can consider active and stand-by states to get estimates of reliability, and availability (or unavailability) of the system.

    Reliability block diagrams may be difficult to construct for very complex systems.


    CASE STUDY: Network Storage Evaluations Using Reliability Calculations

    This section uses a case study to introduce concepts and calculations for systematically comparing redundancy and reliability factors as they apply to network storage configurations. We will determine a reliability figure for three very basic architectures. The starting point of our study is the network storage requirements.


    Network Storage Requirements

    We want networked storage that has access to one

    server. Later, this storage will be accessible to other

    servers. The server is already in place, and has been

    designed to sustain single component hardware failures

    (with dual host bus adapters (HBAs), for example).

    Data on this storage must be mirrored, and the storage

    access must also stand up to hardware failures. The

    cost of the storage system must be reasonable, while

    still providing good performance.


    Architecture 1

    Architecture 1 provides the basic storage necessities we are looking for, with the following advantages and disadvantages:

    Advantages:

    Storage is accessible if one of the links is down.

    Storage A is mirrored onto B.

    Other servers can be connected to the concentrator to access the storage.

    Disadvantages:

    If the concentrator fails, we have no more access to the storage. This concentrator is a single point of failure (SPOF).


    Architecture 2

    Architecture 2 has been improved to take into account the previous SPOF. A concentrator has been added.

    Advantages:

    If any links or components go down, storage is still accessible (resilient to hardware failures).

    Data is mirrored (Disk A to Disk B).

    Other servers can be connected to both concentrators to access the storage.


    Architecture 3

    The main difference is that Disk A and Disk B have only one data path. Disk A is still mirrored to Disk B, as required.

    This architecture has all the advantages of the previous architectures, with the following differences:

    Disk A can only be accessed through Link C, and Disk B only through Link D.

    There is no data multipathing software layer, which results in easier administration and easier troubleshooting.


    Determining Reliability

    Using the reliability formulas, we can determine which architecture has the highest reliability value. For the purpose of this article, we will use sample MTBF values (as obtained from the manufacturer) and AFR* (Annual Failure Rate) values shown in the table below:

    * The AFR for each component was calculated from the MTBF, where AFR = 8760/MTBF. The example MTBF values were taken from real network storage component statistics. However, such values vary greatly, and these numbers are given here purely for illustration.


    Determining Reliability

    Component                        AFR variable   Sample MTBF (hours)   AFR
    HBA 1, HBA 2                     H                800,000             0.011
    Link A, Link B                   L                400,000             0.022
    Concentrator 1, Concentrator 2   C                580,000             0.0151
    Link C, Link D                   L                400,000             0.022
    Disk A, Disk B                   D              1,000,000             0.0088
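
    The AFR column can be reproduced from the MTBF column with the relation AFR = 8760/MTBF stated earlier (8760 being the number of hours in a year). This is a minimal sketch; the dictionary keys are shorthand labels chosen here.

```python
HOURS_PER_YEAR = 8760

# Sample MTBF values (hours) from the table
mtbf_hours = {
    "HBA": 800_000,
    "Link": 400_000,
    "Concentrator": 580_000,
    "Disk": 1_000_000,
}

# Annual failure rate of each component type
afr = {name: HOURS_PER_YEAR / mtbf for name, mtbf in mtbf_hours.items()}

for name, value in afr.items():
    print(f"{name}: AFR = {value:.4f}")
```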


    Determining Reliability

    Having the rate of failure of each individual component, we can obtain the system's annual failure rate (AFR) and consequently the system reliability (R) and system MTBF values. The AFR values of redundant components are raised to the power equal to the number of redundant components. The AFR values of non-redundant components are multiplied by the number of those components in series.


    Calculation

    In the case of Architecture 1, the concentrator (C) is the only non-redundant component.

    $AFR_1 = (H+L)^2 + C + L^2 + D^2$

    $AFR_1 = (0.011+0.022)^2 + 0.0151 + (0.022)^2 + (0.0088)^2 = 0.0167$

    $R_1 = 1 - AFR_1 = 1 - 0.0167 = 0.9833$, or 98.33%

    $MTBF_1 = 8760/AFR_1 = 8760/0.0167 = 524{,}551$ hours


    Calculation

    Architecture 2 has a different configuration, with no non-redundant components.

    $AFR_2 = (H+L+C+L)^2 + D^2$

    $AFR_2 = (0.011+0.022+0.0151+0.022)^2 + (0.0088)^2 = 0.0050$

    $R_2 = 1 - AFR_2 = 1 - 0.0050 = 0.9950$, or 99.50%

    $MTBF_2 = 8760/AFR_2 = 8760/0.0050 = 1{,}752{,}000$ hours


    Calculation

    Architecture 3 has yet another configuration and has no non-redundant components.

    $AFR_3 = (H+L+C+L+D)^2$

    $AFR_3 = (0.011+0.022+0.0151+0.022+0.0088)^2 = 0.0062$

    $R_3 = 1 - AFR_3 = 1 - 0.0062 = 0.9938$, or 99.38%

    $MTBF_3 = 8760/AFR_3 = 8760/0.0062 = 1{,}412{,}903$ hours


    Conclusion

    When the calculations are complete, we compare the data:

    Architecture 1 = 98.33%, or a system MTBF of 524,551 hours

    Architecture 2 = 99.50%, or a system MTBF of 1,752,000 hours

    Architecture 3 = 99.38%, or a system MTBF of 1,412,903 hours

    The MTBF figures are the most revealing, and indicate that Architecture 2 is statistically the most reliable of all.
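
    The comparison can be reproduced with a short script that applies the three AFR expressions from the calculation slides to the component AFR values in the table. Because the script does not round the system AFR before dividing, its MTBF figures differ slightly from the hand-calculated ones, but the ranking is the same. Variable names are illustrative.

```python
# Component AFR values from the sample table
H, L, C, D = 0.011, 0.022, 0.0151, 0.0088
HOURS_PER_YEAR = 8760

# System AFR expressions as given for each architecture
system_afr = {
    "Architecture 1": (H + L) ** 2 + C + L ** 2 + D ** 2,
    "Architecture 2": (H + L + C + L) ** 2 + D ** 2,
    "Architecture 3": (H + L + C + L + D) ** 2,
}

for name, afr in system_afr.items():
    reliability = 1 - afr
    mtbf = HOURS_PER_YEAR / afr
    print(f"{name}: AFR = {afr:.4f}, R = {reliability:.2%}, MTBF = {mtbf:,.0f} hours")

# The lowest system AFR corresponds to the highest reliability and MTBF
most_reliable = min(system_afr, key=system_afr.get)
print("Most reliable:", most_reliable)
```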


    Failure Effects

    (What customer experiences)

    Noise

    Inoperability

    Instability

    Intermittent operation

    Roughness

    Excessive effort requirements

    Unpleasant or unusual odor

    Poor appearance


    Factors Affecting Reliability

    Design & Manufacture:

    Pre-Production Design

    Control of Production

    Working Tolerances

    Material Quality

    Component Quality

    Component Stress

    Installation & Environmental:

    Temperature

    Humidity

    Vibration

    Chemical Attack

    Interconnections


    Design against failure

    It is important to understand the failure (why, where, how long, application, etc.).

    Two methods for design against failure:
    1. Reduce the stress that causes the failure.
    2. Increase the strength of the component.

    Either one can be achieved by:

    Selecting materials

    Changing the package geometry

    Changing the dimensions

    Protection


    Fatigue Failure?

    Fatigue is the most common mechanism of failure and is responsible for 90% of all structural and electrical failures.

    Occurs in metals, polymers, and ceramics.

    Metal paper clip example

    Bend in both directions

    Repeat the process


    Design Against Fatigue Failure

    Increase fatigue strength.

    Reduce the amplitude of cyclic loading.

    Avoid regions of stress concentration.


    Design Against Brittle Fracture

    Brittle fracture is an overstress failure mechanism that occurs rapidly, with little or no warning, when the induced stress in the component exceeds the fracture strength of the material.

    Occurs in brittle materials (ceramics, glasses and silicon).

    Applied stress and work could break the atomic bonds.


    Design Guidelines to Reduce

    Brittle Fracture

    Designs with materials and processing

    conditions that would produce the least

    stress in brittle materials should be created.

    The brittle material should be polished to

    remove surface flaws to enhance reliability.


    Design Against Creep Failure

    What is Creep?

    A time-dependent deformation process under load.

    Thermally activated process: the rate of deformation for a given stress level increases significantly with temperature.

    Deformation depends on:
    1. The applied load
    2. The duration for which the load is applied
    3. Elevated temperature


    Design Against Creep Failure

    Creep can occur at any stress level.

    Creep is most important at elevated temperatures.


    Design Guidelines to Reduce Creep-Induced Failure

    Use materials with high melting point if the

    application calls for harsh temperature conditions.

    Reduction of mechanical stress will reduce creep

    deformation.

    Creep is a time controlled phenomenon.


    Design Against Plastic Deformation

    What is Plastic Deformation?

    Plastic deformation occurs when the applied mechanical stress exceeds the elastic limit or yield point of a material.

    It is permanent.

    Excessive deformation and continued accumulation of plastic strain due to cyclic loading will eventually lead to cracking of the component and make it unusable.


    Design Guidelines Against Plastic

    Deformation

    Limit the design stresses in the packaging structure to below the yield strength of the materials used. If possible, use materials that have high yield strength.

    Design and control the local plastic deformation at

    regions of stress concentrations.


    Chemically Induced Failures

    What are Chemically Induced Failures?

    Chemical processes such as electrochemical reactions can result in cracking of components, leading to electrical failures.

    Two Types

    Corrosion

    Intermetallic Diffusion


    Design Against Corrosion-Induced

    Failure

    What is Chemical Corrosion?

    The chemical or electrochemical reaction between a material, usually a metal, and its environment that produces a deterioration of the material and its properties.


    Design Guidelines to Reduce

    Corrosion

    Metals with a high oxidation potential tend to corrode faster.

    Use hermetic packages to prevent moisture absorption.

    Ensure there is no trapped moisture or contaminants during the processing and assembly of the packages.


    Design Against Intermetallic

    Diffusion

    What is Intermetallic Diffusion?

    During wirebonding and solder reflow, the

    joining process generates intermetallic layers

    which are byproducts of the joining process.


    Design Guidelines Against

    Intermetallic Diffusion

    Limit the process temperatures and control the time exposed to high temperatures during the joining process.

    Control the temperature range and cycles of exposure at the high temperature period.

    Apply a nickel/gold coating on the bare copper pad surfaces.


    Achieving reliability growth

    Detect failure causes

    Feedback

    Redesign

    Improved fabrication

    Verification of redesign
