six sigma basic stats module
TRANSCRIPT
Six Sigma
Basic Statistics GB Module
Continuous Improvement Road Map
Define
• Define CTQ
• Determine Current State
Measure
• Determine Key Input / Output Variables
• Perform MSA
• Calculate initial process capabilities
Analyze
• Evaluate Existing Control Plan
• Use statistical methods to determine potential key inputs
• Prioritize key input variables
Improve
• Verify Effects of Key inputs with DOEs
• Determine Optimum Settings
Control
• Update Control Plan
• Verify Improvements
Customers and Variation
• Customers complain when they believe the product or service they receive differs from their expectations; there is variation
• Variation has many faces:
– Missing functionality/actions
– Defects and faults
– Delays etc.
• All variation is caused
• Six Sigma is about reducing and controlling variation
• We need to understand variation and the causes of variation
Causes of Variation
Y: Dependent, Output, Effect, Symptom, Monitor
X1 . . . XN: Independent, Input-Process, Cause, Problem, Control
The variation in Y is caused by variation of the Xs
Therefore we need to understand the Xs and improve and control the ones with most influence on Y
[Figure: Input (Xs) and Process (Xs) leading to Output (Ys)]
Long Term & Short Term Variation
Short-Term includes: common cause variation only
Long-Term includes: common cause & (some) special cause variation
EXAMPLE: I drive to work. It takes me 35 +/- 3 minutes. This is the common cause variation. One day it takes me 50 minutes due to roadworks - this is a special cause.
[Figure: process response (Y) against time, showing short-term variation due to common causes and long-term variation that also includes special causes]
Examples of Special Causes
Special causes are assignable and can include:
• Weather (season, time of day)
• Lighting conditions
• Machine type
• Machine age
• Maintenance
• Supplier
• Operator
• etc.
Exercise – Special Causes
• Consider the process in your project
• Make a list of the potential special causes
• Be prepared to share your list with the rest of the group
• Time: 10 Minutes
Measuring Variation
• Variation is not simple to measure because it is RANDOM
• Random does not mean erratic! While it may not be possible to predict what an individual process output is, there is usually a pattern if we measure a number of outputs
• Process outputs will group together and we are interested in their central value, the value they group around, and their spread.
• This grouping forms a pattern that is often predictable
Example
• If a coin is tossed we cannot predict whether it will be a head or tail
• If we tossed the coin, say, 100 times we would expect about 50 heads and about 50 tails
• So there is a pattern - but we cannot predict any individual toss
• We relate the expectations to chance (probability), there is a 50% chance it will be a head
• Randomness is about chance
Coin Toss Exercise
• Everyone needs a coin of some type
• Flip the coin 25 times and record the number of “heads”
• Report the total number of “heads” obtained and create a dot plot
• Repeat the experiment
• How do the dot plots compare?
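The exercise above can also be simulated. The following is a minimal Python sketch, assuming a group of 20 participants (a made-up number) who each toss a fair coin 25 times; the function names `count_heads` and `dot_plot` are illustrative and not part of the course material.

```python
import random

def count_heads(n_tosses, rng):
    """Toss a fair coin n_tosses times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_tosses))

def dot_plot(counts):
    """Return a text dot plot: one row per value, one X per occurrence."""
    rows = []
    for value in range(min(counts), max(counts) + 1):
        rows.append(f"{value:2d} " + "X" * counts.count(value))
    return "\n".join(rows)

rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumption: 20 participants, each recording the heads seen in 25 tosses.
first_run = [count_heads(25, rng) for _ in range(20)]
second_run = [count_heads(25, rng) for _ in range(20)]

print(dot_plot(first_run))
print()
print(dot_plot(second_run))
```

The two plots will not match dot for dot, but both should group around 12 or 13 heads, which is the pattern the exercise is meant to show.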
Randomness and Distributions
• Outputs group together to form a pattern
• This pattern describes the distribution of the variation
• We cannot predict where an individual value will fall, but we can predict the overall pattern
[Figure: dot plot of time to deliver, on an axis from 10 to 18, showing the distribution of the values]
Real Life and Distributions
• Distributions can be modelled mathematically
• If we collect data from a process or product we can “match” it to a distribution and use the properties of the distribution for analysis and predictions
[Figure: real situation, modelled by a distribution, then analysis, giving a real solution]
Probability Distributions
• Probability distributions (normally just called distributions) are a way of being able to make predictions about random events
• There are many “standard” distributions which enable us to model real world variation
• Standard distributions
– Attribute data
  • Binomial
  • Poisson
– Variable data
  • Normal (Gaussian)
  • Lognormal (skewed)
  • Student t
  • F-distribution
  • Exponential
Key Properties of Distributions
Central Tendency: the value the data groups around
Spread or dispersion of the values
Measures of Central Tendency
– Mean (mu): $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$
– Median: middle value of ranked data
– Mode: most frequently occurring value
If the distribution is symmetric, the mean, median and mode have the same value
If the distribution is NOT symmetric, the mean, median and mode have different values
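As a small illustration of the three measures, here is a sketch using Python's standard `statistics` module. Both data sets are invented for the example (they are not course data): one is symmetric, the other has a long right-hand tail.

```python
from statistics import mean, median, mode

# Hypothetical delivery times; these numbers are made up for illustration.
symmetric = [12, 13, 13, 14, 14, 14, 15, 15, 16]
skewed = [10, 11, 11, 11, 12, 13, 15, 18, 25]

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```

For the symmetric set all three measures agree at 14. For the skewed set the mean is still 14, but the median is 12 and the mode is 11, because the few large values pull the mean towards the tail.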
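Measures of Spread
• Range R = Biggest value – smallest value
• The range is susceptible to outlying values, as a result we need a better measure
• One approach is to calculate the average deviation from the mean:
$\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)$
• Positive and negative deviations cancel, so this average is always zero; squaring the deviations avoids the cancellation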
Variance and Standard Deviation
• The average of the deviations squared is called the variance and is a measure of spread
$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2$
• Its units are the square of the units of the mean. To overcome this we take the square root to give the standard deviation, which has the symbol σ
$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2}$
• We use standard deviation as a measure of spread
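The measures of spread can be checked by hand on a small data set. The sketch below uses invented numbers and divides by n, matching the "average of the deviations squared" wording; note that `statistics.stdev` (and the StDev reported by most statistics packages for a sample) divides by n - 1 instead, so it comes out slightly larger.

```python
from statistics import mean, pstdev, stdev

# Hypothetical measurements, made up for illustration.
times = [12, 13, 13, 14, 14, 14, 15, 15, 16]

mu = mean(times)
value_range = max(times) - min(times)

# Average deviation from the mean: positive and negative deviations cancel.
average_deviation = sum(x - mu for x in times) / len(times)

# Variance: average of the squared deviations (divide by n).
variance = sum((x - mu) ** 2 for x in times) / len(times)

# Standard deviation: square root of the variance, back in the original units.
sigma = variance ** 0.5

print("range:", value_range)
print("average deviation:", average_deviation)
print("variance:", round(variance, 4))
print("standard deviation (n):", round(sigma, 4))
print("standard deviation (n - 1):", round(stdev(times), 4))
```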
Descriptive Statistics
• Can be calculated using Minitab or Excel
• Gives information about a data set’s central tendency, spread and shape
Descriptive Statistics
Descriptive Statistics: Data1

Variable    N     Mean   Median   TrMean   StDev   SE Mean
Data1     500   56.421   56.355   56.486   5.563     0.249

Variable   Minimum   Maximum       Q1       Q3
Data1       38.260    69.801   52.693   60.227
Central Tendency
Shape
Spread
Computer Exercise!
• Open the file 3L54 Stone.mpj
• Calculate the Descriptive Statistics for the data set
• Additionally, create a graphical descriptive statistics output
Descriptive Statistics Result
Descriptive Statistics – Graphical Summary
Variable: MPH
[Figure: histogram of MPH on an axis from 15 to 65, with interval plots of the 95% confidence intervals for Mu and for the Median on an axis from 41 to 44]

Anderson-Darling Normality Test
A-Squared:      0.399
P-Value:        0.360

Mean            42.2730
StDev            9.5282
Variance        90.7857
Skewness        8.53E-02
Kurtosis       -4.0E-02
N               126

Minimum         16.9000
1st Quartile    36.6750
Median          42.4500
3rd Quartile    48.0500
Maximum         65.2000

95% Confidence Interval for Mu:      40.5931 to 43.9530
95% Confidence Interval for Sigma:    8.4792 to 10.8755
95% Confidence Interval for Median:  40.5544 to 43.5000
Mean and standard deviation tell us a great deal about a process
Off-Target: measured by the mean
Spread: measured by the standard deviation
[Figure: dot plots contrasting an off-target process with an on-target process whose spread should be reduced]
Distributions and Variation
[Figure: distribution curves labelled Off-Target, Spread, On Target and Reduce the Spread]
Normal Distribution
[Figure: Normal curve centred on the mean]
• Frequently occurs in practice
• Models random behaviour
• Shows that the variation groups around the mean and tails off
• Symmetric about the mean with a 50% chance of falling either side of the mean
• Is the basis for Six Sigma and many Six Sigma tools
Area and Probability
Area under Normal curve = probability or chance of being in that region
[Figure: three Normal curves on an axis from -4 to 4 standard deviations, each with a shaded area]
• Whole curve: Area = 1.0, probability = 1.0 or 100%
• One side of the mean: Area = 0.5, probability = 0.5 or 50%
• Beyond one standard deviation on one side: Area = 0.159, probability = 0.159 or 15.9%
Area, Probability & Standard Deviation
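The area under the Normal distribution relates to probability

Region (standard deviations from the mean)   Area
Below -3σ                                    0.135%
-3σ to -2σ                                   2.14%
-2σ to -1σ                                   13.59%
-1σ to mean                                  34.13%
Mean to +1σ                                  34.13%
+1σ to +2σ                                   13.59%
+2σ to +3σ                                   2.14%
Above +3σ                                    0.135%
Total area                                   100%

99.73% lies between -3σ and +3σ
0.27% lies outside -3σ and +3σ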
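The areas above can be reproduced with Python's standard library. This is a minimal sketch using `statistics.NormalDist` for the Standard Normal curve; the helper name `area_between` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # Standard Normal: mean 0, standard deviation 1

def area_between(lower, upper):
    """Area under the Standard Normal curve between two Z values."""
    return std_normal.cdf(upper) - std_normal.cdf(lower)

mean_to_1 = area_between(0, 1)
one_to_2 = area_between(1, 2)
two_to_3 = area_between(2, 3)
beyond_3 = 1 - std_normal.cdf(3)
within_3 = area_between(-3, 3)

print(f"mean to 1 sigma:  {mean_to_1:.2%}")
print(f"1 to 2 sigma:     {one_to_2:.2%}")
print(f"2 to 3 sigma:     {two_to_3:.2%}")
print(f"beyond 3 sigma:   {beyond_3:.3%}")
print(f"within +/-3 sigma: {within_3:.2%}")
```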
A Common Situation
Given a product characteristic that is normally distributed with a mean μ and standard deviation σ, what is the probability that the characteristic is greater than or equal to X?
If we calculate the number of standard deviations between the mean and X, we can use the area results to determine the probability.
[Figure: Normal curve with the mean marked and the area beyond X shaded]
Standard Normal Distribution
Tables exist for the probability vs. number of standard deviations for the case of a “Standard Normal distribution”, which has a mean = 0 and a standard deviation σ = 1.
The tables give the probability of being greater than or equal to a point of interest X that lies Z standard deviations above the mean.

Z value or number of standard deviations (the row gives Z to one decimal place, the column gives the second decimal place):

z     .00    .01    .02    .03    .04    .05
0.0  .5000  .4960  .4920  .4880  .4840  .4801
0.1  .4602  .4562  .4522  .4483  .4443  .4404
0.2  .4207  .4168  .4129  .4090  .4052  .4013
0.3  .3821  .3783  .3745  .3707  .3669  .3632
0.4  .3446  .3409  .3372  .3336  .3300  .3264
0.5  .3085  .3050  .3015  .2981  .2946  .2912
0.6  .2743  .2709  .2676  .2643  .2611  .2578
0.7  .2420  .2389  .2358  .2327  .2296  .2266
0.8  .2119  .2090  .2061  .2033  .2005  .1977
0.9  .1841  .1814  .1788  .1762  .1736  .1711
1.0  .1587  .1562  .1539  .1515  .1492  .1469
1.1  .1357  .1335  .1314  .1292  .1271  .1251
P-values are Probabilities of Interest
• The probability values are often called p-values
– p-value = tail area
– Area under curve beyond point or value of interest
– Probability of being at value of interest or beyond
• A small p-value (0 to 0.05) indicates
– The probability is small that the value of interest comes from that distribution by chance
– Something else is going on
[Figure: three Normal curves on an axis from -4 to 4, each marking a value of interest and the tail area beyond it]
Z Values
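Normal distributions with different means and standard deviations (for example mean = 10, σ = 3; mean = 45, σ = 6.1; mean = 7.98, σ = 1.2) can all be converted to the Standard Normal distribution with mean = 0 and σ = 1.
Z is calculated using the equation
$Z = \frac{X - \mu}{\sigma}$
The tail area beyond Z is the p-value.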
Using Z Values
A customer demands that a product’s specification is 1.75 ± 0.1, so the limits are 1.65 and 1.85. We collect data from the manufacturing process and find the mean to be 1.735 and σ = 0.0855.

$Z_{upper} = \frac{1.85 - 1.735}{0.0855} = 1.345$ (from Standard Normal tables, area 0.0885)
$Z_{lower} = \frac{1.65 - 1.735}{0.0855} = -0.994$ (from Standard Normal tables, area 0.1611)

Total area in tails: 0.1611 + 0.0885 = 0.2496
Area of interest = 1 - 0.2496 = 0.75 = yield

Note: The sign of the Z value simply indicates direction from the mean.
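The same calculation can be sketched in Python, using `statistics.NormalDist` in place of the printed Standard Normal tables. The variable names are illustrative; the numbers are the ones from the example above.

```python
from statistics import NormalDist

mean, sigma = 1.735, 0.0855   # process mean and standard deviation
lsl, usl = 1.65, 1.85         # specification limits: 1.75 +/- 0.1

# Number of standard deviations from the mean to each limit.
z_upper = (usl - mean) / sigma
z_lower = (lsl - mean) / sigma

std_normal = NormalDist()
upper_tail = 1 - std_normal.cdf(z_upper)   # area above the upper limit
lower_tail = std_normal.cdf(z_lower)       # area below the lower limit
process_yield = 1 - (upper_tail + lower_tail)

print(f"Z upper = {z_upper:.3f}, tail area = {upper_tail:.4f}")
print(f"Z lower = {z_lower:.3f}, tail area = {lower_tail:.4f}")
print(f"yield   = {process_yield:.2%}")
```

The tail areas differ slightly in the third decimal place from the table values, because printed tables round Z to two decimal places; the yield still comes out at about 75%.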
Mini Exercise
Mean = 1.5, σ = 0.25
• What is the Z value for 1.75?
• What is the probability that an item is greater than 1.75? (From Standard Normal tables: area/probability?)
Variation and 6-Sigma
[Figure: Normal curve centred on the Target, with the customer critical requirements LSL and USL each 6 standard deviations from the mean (Z = 6.0)]
99.9999998% of output falls within the specification limits
Tail area beyond each limit: number of defects 1 in a billion
This is a 6-Sigma (Process or Product)
Long-Term & Short-Term Variation
The presence of special causes will act to increase the variation seen by the customer. A gross assumption is a 1.5 sigma shift.
[Figure: characteristic or response against time, showing short-term variation within each period and the larger long-term variation, which is what the customer sees]
1.5 Sigma Shift
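[Figure: Normal curve shifted 1.5 standard deviations away from the Target, between the customer critical requirements LSL and USL, so that the nearer limit is at Z = 4.5]
99.99966% of output falls within the specification limits
Tail area beyond the farther limit: number of defects approximately 0
Tail area beyond the nearer limit: number of defects 3.4 in a million
This is a Six Sigma (Process or Product)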
1.5 Sigma Shift Demonstration
• Open Minitab file Glass Strength.mpj
• Calculate the subgroup standard deviations using Descriptive Statistics
• Calculate the average standard deviation across the subgroups
• Stack the subgroups
• Calculate the combined Standard Deviation of the stacked data
• Divide the standard deviation of the stacked data by the average standard deviation of the subgroups.
What did you find?
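The steps of this demonstration can be sketched outside Minitab as well. The Python below follows the same sequence on three invented subgroups; the values are hypothetical and are not taken from Glass Strength.mpj. The subgroup means are deliberately shifted relative to each other to mimic special causes acting over time.

```python
from statistics import mean, stdev

# Hypothetical subgroups (made-up values, not the Glass Strength data).
subgroups = [
    [10.1, 9.8, 10.0, 10.2, 9.9],
    [10.6, 10.4, 10.5, 10.7, 10.3],
    [9.6, 9.4, 9.5, 9.7, 9.3],
]

# Short-term: average of the within-subgroup standard deviations.
subgroup_sds = [stdev(group) for group in subgroups]
average_sd = mean(subgroup_sds)

# Long-term: standard deviation of all the data stacked into one column.
stacked = [value for group in subgroups for value in group]
stacked_sd = stdev(stacked)

ratio = stacked_sd / average_sd
print(f"average subgroup StDev: {average_sd:.4f}")
print(f"stacked StDev:          {stacked_sd:.4f}")
print(f"ratio:                  {ratio:.2f}")
```

A ratio above 1 shows that the stacked (long-term) variation is larger than the short-term variation inside the subgroups. The size of the ratio here depends entirely on the invented numbers.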
Summary
• In the short term we need a Zst = 6.0 to “guarantee” a long term Zlt = 4.5
• Note to achieve 3.4 defects per million requires Zlt = 4.5 - we should not strive to achieve Zst = 6.0 if Zshift = Zst - Zlt < 1.5
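The link between Z values and defect rates can be checked numerically. This is a minimal sketch that computes the one-sided tail area of the Standard Normal curve with `math.erfc`; the function names `tail_area` and `dpmo` are illustrative, and the 1.5 shift is the assumed value discussed above.

```python
from math import erfc, sqrt

def tail_area(z):
    """One-sided area of the Standard Normal curve beyond z."""
    return 0.5 * erfc(z / sqrt(2))

def dpmo(z):
    """Defects per million opportunities for a one-sided tail at z."""
    return tail_area(z) * 1_000_000

SHIFT = 1.5            # assumed long-term shift
z_st = 6.0             # short-term Z
z_lt = z_st - SHIFT    # long-term Z

print(f"Zst = {z_st}: {tail_area(z_st) * 1e9:.2f} defects per billion")
print(f"Zlt = {z_lt}: {dpmo(z_lt):.1f} defects per million")
```

With Zst = 6.0 the tail holds roughly one defect in a billion; after the assumed 1.5 shift, Zlt = 4.5 gives about 3.4 defects per million.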
Z values and Sigma levels
• Zst values are related to Sigma levels
• In 6-Sigma we look at short and long term values: Zst and Zlt
• In 6-Sigma if we cannot calculate long term variation we assume a 1.5 sigma shift
Zlt = Zst - 1.5
• Note Z-tables generally do not have the 1.5 sigma shift and give Zlt
• Sigma/DPMO tables do have the 1.5 sigma shift and give Zst
Testing for Normality
• The Normal distribution is important to 6-Sigma since many of the tools and techniques are affected by Non Normal data

Tool                        Consequence
Process sigma               Incorrect process sigma
Individuals control chart   False detection of special causes
Hypothesis testing          Incorrect conclusions
Regression                  False identification of important factors; poor predictive properties
DOE                         Incorrect conclusions about important factors; poor prediction abilities
Effect of not checking Normality
• Example: Effect of skewed distribution on calculating the process Sigma Level
– Process Sigma Level is determined by finding the area beyond the specification limits using Z-tables
– If the data is not Normal, the area will be incorrectly estimated from the Z-tables and therefore give a misleading Process Sigma Level
[Figure: a skewed distribution and a Normal curve, each with the average and the USL marked. What percentage falls beyond the USL? The percentage is different for the Normal curve]
Exercise – Normal?
– Look at each histogram on the following pages and decide which data sets come from a Normal distribution
• Circle or mark the ones you think are Normal.
– Work in pairs to confirm your answers
– Be prepared to share your answers with the whole group
– Time 10 minutes
Assess Data for Normality
• 25 Data Points
Mark the histograms that you think come from a Normal distribution
Assess Data for Normality, cont.
• 50 Data Points
Mark the histograms that you think come from a Normal distribution
Assess Data for Normality, cont.
• 100 Data Points
Mark the histograms that you think come from a Normal distribution
Exercise: Answers
– Just looking at histograms can be deceptive
• Each of the histograms on the previous pages was randomly generated in Minitab from a Normal distribution with a mean = 50 and a standard deviation = 10: they are all Normal.
– It is difficult to tell if data is Normal by looking at histograms of n = 25, n = 50, and sometimes even n = 100
– Plotting the data is very good practice, but do not be misled by small amounts of data
Other Distributions
[Figure: histograms of Exponential, Poisson and Uniform data at sample sizes of 25, 50 and 100]
Data Not Normal
• If the data is not Normal there may be reasons which can be corrected
– Extreme values, typographical errors - correct them
– Multiple modes - separate them
– Data rounded - increase precision
– Not enough data - collect more
– Special causes present - remove them
• Always check these first, before concluding that the underlying distribution is not normal
Check for Normal Distribution
• Both variable and discrete data (if there is enough) can often be modelled by a Normal distribution
• Many Statistical tools are based upon a normal distribution. However, many of the statistical tools will produce outputs even if the data is not normal. These outputs could be misleading
• Hence one of the first steps having collected data is to check for normality
• There is a test in Minitab for this
Normality Test
In a normality test the data is plotted on “normal probability paper” - if the data follows a straight line it is normal.
The test also includes a hypothesis test (see Week 2). This test provides a quantitative value as to whether the data is normal through the p-value.
If p > 0.05 there is no evidence against normality, so we treat the data as normally distributed
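The "straight line on normal probability paper" idea can be sketched numerically. The Python below is not the Anderson-Darling test that Minitab reports; it only measures how straight the probability plot is, using the correlation between the sorted data and the Normal quantiles they are plotted against. The plotting positions (i - 0.5)/n are one common convention and are an assumption here, and both data sets are idealised (generated from exact quantiles) rather than real measurements.

```python
from math import log
from statistics import NormalDist, mean

def probability_plot_correlation(data):
    """Correlation between sorted data and Standard Normal quantiles.

    A value close to 1.0 means the points lie close to a straight line
    on a normal probability plot.
    """
    n = len(data)
    ordered = sorted(data)
    std_normal = NormalDist()
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mq, md = mean(quantiles), mean(ordered)
    sxy = sum((q - mq) * (d - md) for q, d in zip(quantiles, ordered))
    sxx = sum((q - mq) ** 2 for q in quantiles)
    syy = sum((d - md) ** 2 for d in ordered)
    return sxy / (sxx * syy) ** 0.5

n = 100
positions = [(i - 0.5) / n for i in range(1, n + 1)]

# Idealised Normal data (mean 50, standard deviation 10) and skewed data
# (exponential shape); both built from exact quantiles for illustration.
normal_data = [NormalDist(50, 10).inv_cdf(p) for p in positions]
skewed_data = [-10 * log(1 - p) for p in positions]

r_normal = probability_plot_correlation(normal_data)
r_skewed = probability_plot_correlation(skewed_data)
print(f"Normal data: r = {r_normal:.4f}")
print(f"Skewed data: r = {r_skewed:.4f}")
```

The idealised Normal data gives a correlation of essentially 1, while the skewed data bends away from the line and scores lower. A formal test, such as the one in Minitab, is still needed to turn this into a p-value.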
P-values
• The p-value can be read as the risk of making the wrong decision if we reject normality: in this case, concluding that the data is not normally distributed when it is.
• In this case the p-value is 54%, so rejecting normality would carry a 54% risk of being wrong.
• This risk is too high, so we conclude the data is normally distributed
• We can never be risk free or 100% certain. Hence we need to set a decision level. By convention this is 5% (or 95% confidence)
• Hence we test to see if p > 0.05; if it is, we treat the data as normally distributed
Using Minitab to Check for Normality
Normality Test Exercise
• Using the data from Glass Strength.mpj, check for normality for each of the subgroups
• Then check for normality on the combined data
• Be prepared to report your findings
Non Normal Distributions
• Having checked the data for typographical errors etc. and concluded that the data is not normally distributed, progress can still be made
– In some cases of non-normal distributions (typically skewed distributions) it is possible to transform the data to make it normal
– In some cases the data may be “close enough” to a normal distribution to use the statistical tools with care
– In some cases it does not matter that the data is not normally distributed
Summary
• This ppt has covered
– Variation
  • Common and special causes, long and short term data
– Distributions
  • Central value (tendency): mean, median and mode
  • Spread or dispersion: range, variance and standard deviation
– Normal Distribution
  • Z-values and p-values, Six Sigma, 1.5 Sigma shift and Z-values
  • Checking for normality and dealing with non-normal data