GE Company Proprietary, Version 2.0
Minitab Primer
Introduction to Statistical Data Analysis With Minitab
Introduction
This primer is designed to provide the skills necessary to use Minitab effectively within the Six Sigma framework. It begins with an introduction to selected file, data manipulation, and help functions, followed by a series of demonstrations related to transaction and service quality. The student is encouraged to work through each demonstration, following the lead of the instructor. Each demonstration begins with a page highlighting the Minitab functions applied in working through the examples. These pages reflect the commands that the user would see on the screen while using Minitab (the hierarchical structure of Minitab is preserved).
It is assumed that the user is familiar with basic statistics, e.g., hypothesis testing and regression analysis. A companion primer, entitled Statistics Primer - Introduction to Statistics Through Graphical Analysis, is available on the World-Wide-Web (GE Corporate) for those requiring a review of fundamental statistics.
Augie Stagliano, Pittsfield, MA, October 1996
Takeaways
After Completing This Training, You Will Be Able To:
• Import Data Files
• Perform Basic Data Manipulation Techniques
• Use Functions to Perform Calculations
• Construct and Interpret Various Graph Types
• Generate and Interpret Basic Statistical Information
• Apply One and Two Sample Hypothesis Tests
• Perform Simple Linear Regression
• Apply Chi-Square Tests and One-Way ANOVA
An Overview of Applied Statistical Techniques Stressing Interpretation of Analytical Results
File Commands
• New Worksheet
• Open Worksheet
• Merge Worksheet
• Save Worksheet
• Print Window
• Get Worksheet Information
• Display Data
• Restart Minitab
• Exit
Help and Manip Commands
Help Commands:
• Contents
• Getting Started...
• How Do I...
• Search for Help On...
Manip Commands:
• Sort
• Stack
• Unstack
Demonstration One
Basic Statistical Analysis
• STAT
  • Basic Statistics
    • Descriptive Statistics
    • 1-Sample t
    • 2-Sample t
    • Correlation
  • Tables
    • Tally
    • Chisquare Test
• CALC
  • Probability Distributions
    • Normal
• STAT
  • ANOVA
    • ONEWAY
Graphical Analysis
• GRAPH
  • Plot
  • Time Series Plot
  • Histogram
  • Boxplot
  • Character Graphs
    • Dotplot
• STAT
  • SPC
    • Run Chart
Regression Analysis
• STAT
  • Regression
    • Regression
    • Fitted Line Plot
Choice of Tool Depends Upon the Requirements of the Analysis
Example: Receivables “Days-to-Collection”
Data File: days.xls
Variable: Days
Collection terms for receivables are 60 days. Payments are entered into a data collection system in the same time-order as they are received. Characterize this process and determine its long term z-value and sigma. Also, test whether the average days-to-collection is equal to 50 days (Business Target).
Receivables Process Characterization
Descriptive Statistics
Variable    N    N*   Mean    Median   TrMean   StDev   SEMean
Days        50   0    63.80   64.00    63.75    8.45    1.19
Variable    Min     Max     Q1      Q3
Days        45.00   87.00   58.75   68.25
Histogram...
[Figure: histogram of Days (x-axis: Days; y-axis: Frequency)]
Time Series Plot...
[Figure: time series plot of Days (x-axis: Index; y-axis: Days)]
[Measure] - Analyze - Improve - Control
Receivables Process - Yield & Sigma Values
Days   Count
45     1
48     2
49     1
53     1
54     3
55     1
58     3
59     3
60     1
61     2
62     1
63     4
64     4
65     1
66     2
67     6
68     2
69     1
70     2
71     2
72     1
74     2
77     1
78     1
79     1
87     1
N= 50
16 Items Within Spec (60 days)
34 Items Outside of Spec
Yield = ???
Inverse Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1.00000
P( X <= x)      x
   ???         ???
z-value (LT) = ???
Sigma = ???
[Measure] - Analyze - Improve - Control
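The blanks on this slide are left for the class exercise. As a rough cross-check outside Minitab, the following Python sketch rebuilds the 50 observations from the tally, computes the yield against the 60-day terms, and converts it to a long-term z-value with the standard normal inverse CDF. The 1.5 shift used to go from the long-term z-value to a sigma value is the customary Six Sigma convention and is an assumption here, since the slide does not state it.

```python
from statistics import NormalDist, mean, stdev

# Tally of Days -> Count, copied from the slide.
tally = {45: 1, 48: 2, 49: 1, 53: 1, 54: 3, 55: 1, 58: 3, 59: 3, 60: 1,
         61: 2, 62: 1, 63: 4, 64: 4, 65: 1, 66: 2, 67: 6, 68: 2, 69: 1,
         70: 2, 71: 2, 72: 1, 74: 2, 77: 1, 78: 1, 79: 1, 87: 1}

# Expand the tally into one value per payment (time order is not recoverable).
days = [d for d, count in sorted(tally.items()) for _ in range(count)]

SPEC_LIMIT = 60          # collection terms, in days
SHIFT = 1.5              # assumed long-term to short-term shift (convention)

within_spec = sum(1 for d in days if d <= SPEC_LIMIT)
yield_fraction = within_spec / len(days)

# Long-term z-value: the standard normal quantile at the observed yield.
z_lt = NormalDist().inv_cdf(yield_fraction)
sigma_value = z_lt + SHIFT

print(f"N = {len(days)}, mean = {mean(days):.2f}, StDev = {stdev(days):.2f}")
print(f"within spec = {within_spec}, yield = {yield_fraction:.2f}")
print(f"z (LT) = {z_lt:.3f}, sigma = {sigma_value:.2f}")
```

The inverse CDF call plays the role of Minitab's CALC > Probability Distributions > Normal with the inverse cumulative option.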
Is the Average Days-to-Pay On Target?
Hypothesis Test of the Mean.....
Business Target: 50 Days
α = 0.05
Test for Mean > 50 Days
Conf. Level = 95.0%
Results.....
T-Test of the Mean
Test of mu = 50.00 vs mu > 50.00
Variable    N    Mean    StDev   SE Mean   T       P-Value
Days        50   63.80   8.45    1.19      11.55   0.0000
[Measure] - Analyze - Improve - Control
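The T value in this output can be reproduced from the summary statistics alone. The sketch below recomputes the standard error and the one-sample t statistic for the test of mu = 50 versus mu > 50. The critical value is an approximate table value for 49 degrees of freedom and is an assumption of this sketch, not part of the Minitab output.

```python
from math import sqrt

# Summary statistics from the Minitab output for Days.
n, sample_mean, sample_sd = 50, 63.80, 8.45
target = 50.0            # business target (hypothesized mean)

std_error = sample_sd / sqrt(n)
t_stat = (sample_mean - target) / std_error

# Approximate one-sided 5% critical value of t with n - 1 = 49 degrees of
# freedom, taken from a standard t table (assumption).
T_CRITICAL = 1.677
reject_h0 = t_stat > T_CRITICAL

print(f"SE Mean = {std_error:.2f}, T = {t_stat:.2f}, reject Ho: {reject_h0}")
```

With T far above the critical value, the sketch agrees with the reported P-Value of 0.0000: the average days-to-collection exceeds the 50-day target.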
Demonstration Two
Example: GE Stock Data
Data File: price.xls
Variable: Price
Description: This data set contains actual daily price data for a time period of approximately two years. The data are ordered in their original time sequence. Characterize the data and check for stability over time.
[Measure] - Analyze - Improve - Control
Results of GE Stock Price Demonstration
Descriptive Statistics
Variable: price
[Figure: histogram of price with 95% confidence intervals for Mu and Median]
Anderson-Darling Normality Test
  A-Squared:       51.523
  p-value:         0.000
Mean               63.674
Std Dev            19.583
Variance           383.499
Skewness           1.338
Kurtosis           0.296
n of data          509.000
Minimum            45.500
1st Quartile       49.625
Median             56.750
3rd Quartile       65.812
Maximum            109.750
95% Confidence Interval for Mu:       61.969   65.380
95% Confidence Interval for Sigma:    18.449   20.866
95% Confidence Interval for Median:   55.000   57.500
Run Chart for price
[Figure: run chart of price (x-axis: Observation; y-axis: price)]
Number of runs about median:      13.000
Expected number of runs:          255.475
Longest run about median:         245.000
Approx p-value for Clustering:    0.000
Approx p-value for Mixtures:      1.000
Number of runs up or down:        261.000
Expected number of runs:          339.000
Longest run up or down:           8.000
Approx p-value for Trends:        0.000
Approx p-value for Oscillation:   1.000
[Measure] - Analyze - Improve - Control
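The run chart statistics compare the observed number of runs about the median with the number expected from a stable process. The full 509-point series is not reproduced in this primer, so the sketch below uses only the first 15 prices listed in Appendix 1b to illustrate the counting. Dropping points that fall exactly on the median is an assumption of this sketch and may differ from Minitab's tie handling.

```python
from statistics import median

# First 15 daily prices from the data set listing in Appendix 1b.
prices = [104.00, 103.63, 103.00, 103.88, 104.38, 105.00, 105.63, 106.00,
          105.25, 106.88, 107.13, 108.38, 108.75, 108.38, 108.25]


def runs_about_median(values):
    """Return (observed runs, expected runs) about the median."""
    center = median(values)
    # True = above the median, False = below; ties are dropped (assumption).
    above = [v > center for v in values if v != center]
    # A new run starts whenever consecutive points switch sides.
    observed = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    n_above = sum(above)
    n_below = len(above) - n_above
    expected = 2 * n_above * n_below / (n_above + n_below) + 1
    return observed, expected


observed_runs, expected_runs = runs_about_median(prices)
print(f"runs about median = {observed_runs}, expected = {expected_runs:.1f}")
```

Far fewer runs than expected, as in the Minitab output (13 observed against about 255 expected), points to clustering, which is consistent with the 0.000 p-value reported for clustering.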
Demonstration Three
Example: Comparing Two Different Business Regions
Data File: receive.xls
Variables: region1, region2 (t-test)
           region1, reg1$$$ (scatter diagram, correlation, and regression)
Evaluate the relative performance of these two business regions using hypothesis testing. Also, prepare a scatter diagram and regression model (calculate the correlation coefficient) using reg1$$$ as the response variable and region1 as the predictor.
Measure - [Analyze] - Improve - Control
Is There a Difference in the Average Level of Receivables Ages Between Regions 1 & 2?
Hypothesis Test Results...
Two Sample T-Test and Confidence Interval
Twosample T for region1 vs region2
           N     Mean    StDev   SE Mean
region1    100   46.10   10.1    1.01
region2    100   44.48   9.84    0.98
95% C.I. for mu region1 - mu region2: ( -1.2, 4.40)
T-Test mu region1 = mu region2 (vs not =): T= 1.14 P=0.26 DF= 197
Measure - [Analyze] - Improve - Control
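The T, DF, and confidence interval in this output are consistent with a two-sample t-test that does not pool the variances. A minimal sketch from the printed summary statistics follows. The critical value is an approximate table value and an assumption of this sketch, and because the inputs are rounded, the computed T comes out near 1.15 rather than exactly the printed 1.14.

```python
from math import sqrt, floor

# Summary statistics from the Minitab output.
n1, mean1, sd1 = 100, 46.10, 10.1     # region1
n2, mean2, sd2 = 100, 44.48, 9.84     # region2

var_of_mean1 = sd1 ** 2 / n1
var_of_mean2 = sd2 ** 2 / n2
std_error = sqrt(var_of_mean1 + var_of_mean2)

diff = mean1 - mean2
t_stat = diff / std_error

# Welch-Satterthwaite degrees of freedom; the output shows the whole-number part.
df = (var_of_mean1 + var_of_mean2) ** 2 / (
    var_of_mean1 ** 2 / (n1 - 1) + var_of_mean2 ** 2 / (n2 - 1)
)

# Approximate two-sided 5% critical value of t near 197 df (assumption).
T_CRITICAL = 1.972
ci_low = diff - T_CRITICAL * std_error
ci_high = diff + T_CRITICAL * std_error

print(f"T = {t_stat:.2f}, DF = {floor(df)}")
print(f"95% C.I. = ({ci_low:.2f}, {ci_high:.2f})")
```

Because the interval contains zero, the sketch agrees with the reported P=0.26: there is no evidence of a difference between the two regions.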
Establish a Relationship Between Response and Predictor Before Building the Model
Correlation Coefficient (r)...
Correlations (Pearson)
Correlation of region1 and reg1$$$ = 0.930
Scatter Plot...
[Figure: scatter plot of reg1$$$ (y-axis) versus region1 (x-axis)]
Measure - [Analyze] - Improve - Control
Fitted Line Plot...
Regression Plot
[Figure: reg1$$$ (y-axis) versus region1 (x-axis) with fitted regression line]
Y = 29.0826 + 9.65584X
R-Squared = 0.865
Measure - [Analyze] - Improve - Control
Regression Analysis
Regression Results...
The regression equation is
reg1$$$ = 29.1 + 9.66 region1
Predictor    Coef     Stdev    t-ratio   p
Constant     29.08    18.22    1.60      0.114
region1      9.6558   0.3861   25.01     0.000
s = 38.95   R-sq = 86.5%   R-sq(adj) = 86.3%
Analysis of Variance
SOURCE       DF   SS        MS       F        p
Regression   1    948965    948965   625.40   0.000
Error        98   148702    1517
Total        99   1097667
Unusual Observations
Obs.   region1   reg1$$$   Fit      Stdev.Fit   Residual   St.Resid
10     45.0      381.00    463.60   3.92        -82.60     -2.13R
31     41.0      525.00    424.97   4.36        100.03     2.58R
53     75.0      739.00    753.27   11.82       -14.27     -0.38 X
64     59.0      513.00    598.78   6.33        -85.78     -2.23R
70     47.0      404.00    482.91   3.91        -78.91     -2.04R
76     23.0      251.00    251.17   9.73        -0.17      -0.00 X
78     69.0      648.00    695.34   9.67        -47.34     -1.25 X
92     20.0      176.00    222.20   10.80       -46.20     -1.23 X
95     50.0      598.00    511.87   4.18        86.13      2.22R
98     45.0      558.00    463.60   3.92        94.40      2.44R
R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it large influence.
Measure - [Analyze] - Improve - Control
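Several figures in this output are tied together by simple identities, which can help when interpreting it. The sketch below recomputes R-sq, R-sq(adj), F, s, the slope t-ratio, and the correlation coefficient from the printed sums of squares and coefficients. It only checks the arithmetic of the output and does not refit the model, since the raw data are not reproduced here.

```python
from math import sqrt

# Values copied from the regression output.
ss_regression, ss_error, ss_total = 948965, 148702, 1097667
df_regression, df_error = 1, 98
slope, slope_stdev = 9.6558, 0.3861

ms_error = ss_error / df_error
ms_total = ss_total / (df_regression + df_error)

r_sq = ss_regression / ss_total                 # share of variation explained
r_sq_adj = 1 - ms_error / ms_total              # adjusted for degrees of freedom
f_stat = (ss_regression / df_regression) / ms_error
s = sqrt(ms_error)                              # residual standard deviation
t_ratio = slope / slope_stdev
r = sqrt(r_sq)                                  # positive because the slope is positive

print(f"R-sq = {r_sq:.1%}, R-sq(adj) = {r_sq_adj:.1%}")
print(f"F = {f_stat:.1f}, s = {s:.2f}, t-ratio = {t_ratio:.2f}, r = {r:.3f}")
```

For simple linear regression the slope t-ratio squared equals F (25.01 squared is about 625.5), and the square root of R-sq recovers the 0.930 correlation reported earlier.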
Demonstration Four
Example: Comparing Many Different Business Regions
Data File: aging.xls
Variables: Country1 - Country5
Evaluate the relative performance of five different business regions using boxplots and dotplots.
Boxplot Results...
[Figure: boxplots of AGING (y-axis) by COUNTRY (x-axis, 1 through 5)]
Measure - Analyze - [Improve] - Control
Dotplot Results...
Character Dotplot
[Figure: character dotplots of AGING (1) through AGING (5) on a common axis from -40 to 160]
ANOVA Results...
One-Way Analysis of Variance
Analysis of Variance on AGING
Source    DF    SS       MS       F        p
COUNTRY   4     424064   106016   246.30   0.000
Error     295   126978   430
Total     299   551042
Individual 95% CIs For Mean Based on Pooled StDev
Level   N    Mean     StDev
1       60   25.93    4.87
2       60   13.97    16.54
3       60   46.00    16.20
4       60   118.25   27.76
5       60   25.53    28.66
Pooled StDev = 20.75
[Figure: character plot of the individual 95% confidence intervals, axis marked at 35, 70, and 105]
Measure - Analyze - [Improve] - Control
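The one-way ANOVA table can be approximately rebuilt from the per-level summary (N, Mean, StDev) printed with it. The sketch below does this to show where the sums of squares, the F ratio, and the pooled standard deviation come from. Because the level means and standard deviations are rounded, the results match the Minitab figures only to within rounding.

```python
from math import sqrt

# (N, Mean, StDev) for each COUNTRY level, copied from the output.
levels = [
    (60, 25.93, 4.87),
    (60, 13.97, 16.54),
    (60, 46.00, 16.20),
    (60, 118.25, 27.76),
    (60, 25.53, 28.66),
]

n_total = sum(n for n, _, _ in levels)
grand_mean = sum(n * m for n, m, _ in levels) / n_total

# Between-level variation: how far each level mean sits from the grand mean.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in levels)
# Within-level variation: pooled from the level standard deviations.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in levels)

df_between = len(levels) - 1
df_within = n_total - len(levels)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
pooled_sd = sqrt(ms_within)

print(f"SS between = {ss_between:.0f}, SS within = {ss_within:.0f}")
print(f"F = {f_stat:.1f}, Pooled StDev = {pooled_sd:.2f}")
```

The very large F is driven mainly by level 4, whose mean of 118.25 sits far from the other four.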
Demonstration Five
Basic Statistical Analysis
• STAT
  • Basic Statistics
    • Descriptive Statistics
    • 1-Sample t
    • 2-Sample t
    • Correlation
  • Tables
    • Tally
    • Chisquare Test
• CALC
  • Probability Distributions
    • Normal
• STAT
  • ANOVA
    • ONEWAY
Graphical Analysis
• GRAPH
  • Plot
  • Time Series Plot
  • Histogram
  • Boxplot
  • Character Graphs
    • Dotplot
• STAT
  • Control Charts
    • Xbar-S
  • SPC
    • Run Chart
Regression Analysis
• STAT
  • Regression
    • Regression
    • Fitted Line Plot
Choice of Tool Depends Upon the Requirements of the Analysis
Example: Invoice Disputes
Data File: chisq.xls
Variables: Process, Invoices, and Disputes
Description: This data set contains the number of invoices issued to customers using six different processes. Invoices is the number issued and Disputes is the number of customer issues pending problem resolution. Determine whether the results of this test indicate a difference among the six processes.
Measure - Analyze - [Improve] - Control
Is the Result Significant at the 0.05 Alpha Level?
Results of the Six Trials...
process   invoices   disputes
1         54         16
2         47         13
3         52         15
4         53         8
5         49         15
6         52         2
Hypothesis Test
Ho: (O-E)² = 0
Ha: (O-E)² > 0
α: 0.05
df: (n-1) = 5
Decision Rule: If p < α, Reject Ho
Results of the Chi-Square Test...
Expected counts are printed below observed counts
         invoices   disputes   Total
1        54         16         70
         57.15      12.85
2        47         13         60
         48.99      11.01
3        52         15         67
         54.70      12.30
4        53         8          61
         49.81      11.19
5        49         15         64
         52.26      11.74
6        52         2          54
         44.09      9.91
Total    307        69         376
ChiSq = 0.174 + 0.775 + 0.081 + 0.359 + 0.134 + 0.595 + 0.205 + 0.911 + 0.203 + 0.902 + 1.419 + 6.313 = 12.071
df = 5, p = 0.035
Measure - Analyze - [Improve] - Control
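The expected counts and the ChiSq total in this output follow directly from the observed table: each expected count is the row total times the column total divided by the grand total. The sketch below reproduces that arithmetic. The 5% critical value for 5 degrees of freedom is an approximate table value and an assumption of this sketch, used in place of the p-value that Minitab reports.

```python
# Observed counts per process: (invoices, disputes), copied from the slide.
observed = [(54, 16), (47, 13), (52, 15), (53, 8), (49, 15), (52, 2)]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_sq = 0.0
for row, row_total in zip(observed, row_totals):
    for count, col_total in zip(row, col_totals):
        expected = row_total * col_total / grand_total
        chi_sq += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)

# Approximate 5% critical value of chi-square with 5 df (assumption, from a table).
CHI_SQ_CRITICAL = 11.07
significant = chi_sq > CHI_SQ_CRITICAL

print(f"ChiSq = {chi_sq:.3f}, df = {df}, significant at 0.05: {significant}")
```

Process 6 contributes more than half of the total (1.419 + 6.313), so it is the natural place to look first for what differs.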
Example: Receivables Process Control
Data File: days.xls
Variables: Days
Description: Collection terms are 60 days. Payments are entered into a data collection system in the same time-order as they are received. Determine whether or not the process is in control and capable of satisfying the terms.
Measure - Analyze - Improve - [Control]
Is the Process in Control and Capable?
Results of Analysis...
Xbar and S Chart for: Days
[Figure: Xbar chart (y-axis: Means) and S chart (y-axis: Standard Deviations) versus Subgroup]
Means:                 MU=63.80   UCL=75.18   LCL=52.42
Standard Deviations:   S=7.970    UCL=16.65   LCL=0.000
Measure - Analyze - Improve - [Control]
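The control limits on this chart can be approximated from the two center lines. The sketch below uses the standard three-sigma Xbar-S formulas with the unbiasing constant c4. The subgroup size of 5 is an assumption inferred from 50 observations plotted as 10 subgroups; it is not stated on the slide.

```python
from math import gamma, sqrt


def c4(n):
    """Unbiasing constant for the sample standard deviation of n normal values."""
    return sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)


SUBGROUP_SIZE = 5                   # assumed: 50 observations in 10 subgroups
grand_mean, s_bar = 63.80, 7.970    # center lines read from the chart

c = c4(SUBGROUP_SIZE)
sigma_hat = s_bar / c               # estimated process standard deviation

# Xbar chart: three standard errors of a subgroup mean on either side.
half_width = 3 * sigma_hat / sqrt(SUBGROUP_SIZE)
ucl_xbar = grand_mean + half_width
lcl_xbar = grand_mean - half_width

# S chart: three standard deviations of S on either side, floored at zero.
k = 3 * sqrt(1 - c ** 2) / c
ucl_s = s_bar * (1 + k)
lcl_s = max(0.0, s_bar * (1 - k))

print(f"Xbar chart: UCL = {ucl_xbar:.2f}, LCL = {lcl_xbar:.2f}")
print(f"S chart:    UCL = {ucl_s:.2f}, LCL = {lcl_s:.2f}")
```

Note that the 60-day terms fall inside the Xbar limits but below the center line of 63.80, so a process can be in control and still not be capable of meeting the terms.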
Conclusion
• State the Goal of Your Work
• Identify the Desired Output
• Collect the “Right” Data…Don’t Use Data “Just Because It’s Available”
• Select the Tool(s) that Will Deliver the Desired Results
Avoid “Over Analysis” ... Identify Your Needs Up-front and Focus on Results
Appendices
1) Solutions to Problems Using Excel
   a) Demonstration One
   b) Demonstration Two
2) Formulae for Calculating Sample Size
   a) Attributes Tests
   b) Variables Tests
3) Minitab Tools and the Breakthrough Strategy
Appendix 1a - Excel
Results of Receivables Demonstration Using Excel
Data Set....
Payment   Days
1         55
2         72
3         69
4         66
5         77
6         70
7         79
8         65
9         64
10        63
11        67
.         .
.         .
Descriptive Statistics....
Days
Mean                         63.8
Standard Error               1.19
Median                       64
Mode                         67
Standard Deviation           8.4
Sample Variance              71.3
Kurtosis                     0.44
Skewness                     0.07
Range                        42
Minimum                      45
Maximum                      87
Sum                          3,190
Count                        50
Confidence Level (95.000%)   2.34
Histogram....
[Figure: histogram of Days (x-axis: Days, 40 to 90; y-axis: Frequency, 0 to 25) with a cumulative percentage axis]
Run Chart....
[Figure: Run Chart: Days-to-Collection (x-axis: Payment, 1 to 49; y-axis: Days, 40 to 90) with the average line]
Observations....
• Normally Distributed Data
• Data Stable Over Time
• Average 64 Days-to-Collection
• About 68% of the Payments Occur Between 56 and 72 Days
• About 50% of the Payments Exceed 64 Days
Appendix 1b - Excel
Graphical Analysis Reveals Unusual Events
Data Set....
day   price
1     104.00
2     103.63
3     103.00
4     103.88
5     104.38
6     105.00
7     105.63
8     106.00
9     105.25
10    106.88
11    107.13
12    108.38
13    108.75
14    108.38
15    108.25
.     .
.     .
.     .
Descriptive Statistics....
price
Mean                         63.67
Standard Error               0.87
Median                       56.75
Mode                         #NUM!
Standard Deviation           19.58
Sample Variance              383.50
Kurtosis                     0.32
Skewness                     1.35
Range                        64.25
Minimum                      45.50
Maximum                      109.75
Sum                          32410.24
Count                        509
Confidence Level (95.000%)   1.70
Histogram....
[Figure: histogram of price (x-axis: Bin, 45.50 to 106.21; y-axis: Frequency, 0 to 25)]
Run Chart....
[Figure: run chart of price (x-axis: day, 0 to 600; y-axis: price, 0.00 to 120.00)]
Observations....
• Bimodal Data - Two Different Groups
• Data Unstable Over Time
• Descriptive Statistics Unreliable Due to Data Distribution & Instability
• Significant Event Occurred at Time “100”
• Data is Upward Trending After Time “100”
• The Data Set is GE Stock Price and the Significant Event is a Stock Split
Appendix 2a - Attributes Sample Size
Estimating Sample Size for Attributes
$$ n = \left(\frac{Z}{\Delta}\right)^{2} \cdot p \cdot q $$
n - Sample Size
Z - Z-value for Desired Confidence Level
Δ - Desired Precision Width
p - Population Proportion (Use 0.5 if Unknown)
q - Complement of p, i.e., (1 - p)
Example: How large a sample size is required to estimate the proportion of unpaid invoices with a margin of error of +/- 4% at a 95% confidence level?
$$ n = \left(\frac{1.96}{0.04}\right)^{2} \cdot 0.5 \cdot 0.5 = 600.25 \rightarrow 601 $$
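The attributes example can be checked with a few lines of Python. This is a minimal sketch of the sample size calculation, assuming the usual Z-value of 1.96 for 95% confidence and rounding up to the next whole unit. The function name is hypothetical and introduced only for illustration.

```python
from math import ceil


def attribute_sample_size(z_value, precision, p=0.5):
    """Sample size needed to estimate a proportion to within +/- precision.

    p defaults to 0.5, which gives the largest (most conservative) sample.
    """
    q = 1 - p
    raw = (z_value / precision) ** 2 * p * q
    return raw, ceil(raw)


# Unpaid invoices example: +/- 4% margin of error at 95% confidence.
raw_n, n_required = attribute_sample_size(z_value=1.96, precision=0.04)
print(f"n = {raw_n:.2f}, round up to {n_required}")
```

Halving the margin of error to +/- 2% roughly quadruples the required sample, because the precision enters the formula squared.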
Appendix 2b - Variables Sample Size
Estimating Sample Size for Variables
$$ n = \left(\frac{Z \cdot \sigma}{\Delta}\right)^{2} $$
n - Sample Size
Z - Z-value for Desired Confidence Level
Δ - Desired Precision Width
σ - Standard Deviation
Example: How large a sample size is required to estimate the average value of unpaid invoices with a standard deviation of $3.50 within a margin of error of +/- $1.00 at a 90% confidence level?
$$ n = \left(\frac{1.65 \cdot \$3.50}{\$1.00}\right)^{2} = 33.35 \rightarrow 34 $$
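The variables example works the same way. In this minimal sketch the Z-value of 1.65 for 90% confidence is inferred from the 33.35 result, since a more precise 1.645 would give about 33.15 (the rounded-up answer is 34 either way). The function name is hypothetical.

```python
from math import ceil


def variable_sample_size(z_value, std_dev, precision):
    """Sample size needed to estimate a mean to within +/- precision."""
    raw = (z_value * std_dev / precision) ** 2
    return raw, ceil(raw)


# Unpaid invoices example: sigma = $3.50, +/- $1.00 at 90% confidence.
raw_n, n_required = variable_sample_size(z_value=1.65, std_dev=3.50, precision=1.00)
print(f"n = {raw_n:.2f}, round up to {n_required}")
```

The standard deviation is usually not known in advance, so in practice it is estimated from historical data or a small pilot sample before this calculation is made.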
Appendix 3 - MAIC Tools
A Sampling of Statistical Tools to Apply With the Breakthrough Strategy...
Measure Analyze Improve Control
• Histograms
• Run Charts
• Descriptive Statistics
• Dotplots
• Boxplots
• Hypothesis Tests
• Boxplots
• Dotplots
• Scatter Plots
• Correlation Analysis
• Regression Analysis
• ANOVA
• Hypothesis Tests
• Regression Analysis
• DOE
• Run Charts
• Control Charts
• Confidence Intervals
Use Tools Creatively....But Avoid “Force Fitting”