1
Statistical Analysis – Descriptive Statistics
Dr. Jerrell T. Stracener, SAE Fellow
Leadership in Engineering
EMIS 7370/5370 STAT 5340 : PROBABILITY AND STATISTICS FOR SCIENTISTS AND ENGINEERS
Systems Engineering Program
Department of Engineering Management, Information and Systems
2
• Basic Concepts
• Analysis of Location, or Central Tendency
• Analysis of Variability
• Analysis of Shape
3
Population: the total of all possible values (measurements, counts, etc.) of a particular characteristic for a specific group of objects.
Sample: a part of a population selected according to some rule or plan.
Why sample?
- Population does not exist
- Sampling and testing is destructive
Population vs. Sample
4
Characteristics that distinguish one type of sample from another:
• the manner in which the sample was obtained
• the purpose for which the sample was obtained
Sampling
5
• Simple Random Sample
The sample X1, X2, ... , Xn is a random sample if X1, X2, ... , Xn are independent, identically distributed random variables.
Remark: Each value in the population has an equal and independent chance of being included in the sample.
• Stratified Random Sample
The population is first subdivided into sub-populations, or strata, and a simple random sample is drawn from each stratum.
Types of Samples
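The two sampling plans above can be sketched in Python as follows. This is only an illustration: the population of 100 serial numbers and the two supplier strata are hypothetical, and drawing without replacement from a finite list is used as a practical stand-in for independent draws (a reasonable approximation when the population is large relative to the sample).

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; every item is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of fixed size from each stratum.

    `strata` maps a stratum label to the list of items in that stratum.
    """
    rng = random.Random(seed)
    return {label: rng.sample(items, n_per_stratum)
            for label, items in strata.items()}

# Hypothetical population: 100 part serial numbers from two suppliers.
population = list(range(100))
strata = {"supplier_A": population[:60], "supplier_B": population[60:]}

srs = simple_random_sample(population, 10, seed=1)
strat = stratified_random_sample(strata, 5, seed=1)
print(srs)
print(strat)
```

The stratified version guarantees that each stratum is represented, which a simple random sample of the whole population does not.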
6
• Censored Samples
Type I Censoring - Sampling is terminated at a fixed time, t0. The sample consists of k times to failure plus the information that n-k items survived the fixed time of truncation.
Type II Censoring - Sampling is terminated upon the kth failure. The sample consists of k times to failure plus the information that n-k items survived the random time of truncation, tk.
Progressive Censoring - Sampling is reduced in stages.
Types of Samples - Continued
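A minimal sketch of how Type I and Type II censoring shape the recorded sample. The failure times are hypothetical, and the full list of times is assumed known only so that the effect of censoring can be simulated; in a real test the censored times are never observed.

```python
def type_i_censor(failure_times, t0):
    """Type I: stop at the fixed time t0 and keep the failures seen by then."""
    observed = sorted(t for t in failure_times if t <= t0)
    survivors = len(failure_times) - len(observed)
    return observed, survivors, t0

def type_ii_censor(failure_times, k):
    """Type II: stop at the k-th failure; the stopping time t_k is random.

    Assumes 1 <= k <= len(failure_times).
    """
    ordered = sorted(failure_times)
    observed = ordered[:k]
    survivors = len(failure_times) - k
    return observed, survivors, observed[-1]

# Hypothetical times to failure (hours) for n = 8 items.
times = [120.0, 45.0, 300.0, 80.0, 510.0, 95.0, 260.0, 700.0]

print(type_i_censor(times, t0=250.0))  # ([45.0, 80.0, 95.0, 120.0], 4, 250.0)
print(type_ii_censor(times, k=3))      # ([45.0, 80.0, 95.0], 5, 95.0)
```

In Type I the number of failures is random and the stopping time is fixed; in Type II it is the other way around.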
7
• Systematic Random Sample
The N items in the population are arranged in some order.
Select an item at random from the first K = N/n items, where n is the sample size.
Select every Kth item thereafter.
Types of Samples - Continued
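The three steps of systematic sampling can be sketched as below. The ordered list of 100 items is hypothetical, and rounding K = N/n down to an integer when N is not a multiple of n is an assumption, since the slide does not cover that case.

```python
import random

def systematic_sample(items, n, seed=None):
    """Pick a random start among the first K = N // n items, then every K-th item."""
    N = len(items)
    K = N // n                 # sampling interval; assumes N >= n
    rng = random.Random(seed)
    start = rng.randrange(K)   # random index among the first K items
    return [items[start + j * K] for j in range(n)]

# Hypothetical ordered population of N = 100 items, sample size n = 10.
ordered_items = list(range(1, 101))
sample = systematic_sample(ordered_items, n=10, seed=0)
print(sample)
```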
8
• Data represents the entire population
Statistical analysis is primarily descriptive.
• Data represents sample from population
Statistical analysis
- describes the sample
- provides information about the population
Statistical Analysis Objective
9
• Sample (Arithmetic) Mean
• Sample Midrange
• Sample Mode
• Sample Median
• Sample Percentiles
Analysis of Location or Central Tendency
10
• Formula: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
• Remarks:
Most frequently used statistic
Easy to understand
May be misleading due to extreme values
Sample Mean
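A short illustration of the sample mean, the sum of the values divided by their count, and of how one extreme value can make it misleading. The numbers are made up for the example.

```python
def sample_mean(xs):
    """Arithmetic mean: sum of the values divided by the number of values."""
    return sum(xs) / len(xs)

values = [52.0, 53.0, 54.0, 55.0, 56.0]
with_outlier = values + [150.0]    # one extreme value added

print(sample_mean(values))         # 54.0
print(sample_mean(with_outlier))   # 70.0
```

With the extreme value included, the mean is larger than five of the six observations.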
11
• Definition:
Most frequently occurring value in the sample
• Remarks:
A sample may have more than one mode
The mode may not be a central value
Not well understood, nor frequently used
Sample Mode
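A small sketch of finding the mode, using made-up samples. It returns every value that ties for the highest frequency, which shows that a sample may have more than one mode and that the mode need not be a central value.

```python
from collections import Counter

def sample_modes(xs):
    """Return all values that occur with the highest frequency, in sorted order."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(sample_modes([2, 3, 3, 5, 7, 7, 9]))    # [3, 7]  two modes
print(sample_modes([1, 1, 1, 6, 8, 9, 10]))   # [1]     mode at the low end
```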
12
• Formula:
$x_{0.5} = x_K$, if n is odd & K = (n+1)/2
$x_{0.5} = \frac{x_K + x_{K+1}}{2}$, if n is even & K = n/2
where the sample values x1, x2, ... , xn are arranged in numerical order
• Remarks:
Not well understood, nor accepted
All sample data does not appear to be utilized
Not affected by extreme values
Sample Median
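The median rule (middle value for odd n, average of the two middle values for even n) can be sketched as follows. The example values are hypothetical; the last call shows that an extreme value leaves the median unchanged.

```python
def sample_median(xs):
    """Median of a sample using the odd/even rule on the sorted values."""
    ordered = sorted(xs)
    n = len(ordered)
    if n % 2 == 1:
        k = (n + 1) // 2
        return ordered[k - 1]                   # x_K, with K counted from 1
    k = n // 2
    return (ordered[k - 1] + ordered[k]) / 2    # (x_K + x_{K+1}) / 2

print(sample_median([7, 1, 5]))         # 5
print(sample_median([7, 1, 5, 3]))      # 4.0
print(sample_median([1, 3, 5, 700]))    # 4.0
```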
13
• Sample Range
• Sample Variance
• Sample Standard Deviation
• Sample Coefficient of Variation
Analysis of Variability
14
• Formula: R = Xmax - Xmin
where Xmax is the largest value in the sample and Xmin is the smallest value in the sample
• Remarks:
Easy to determine
Easily understood
Determined by extreme values
Does not use all sample data
Sample Range
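A brief illustration of the range and of the remark that it does not use all of the sample data. The two samples are made up: they are spread quite differently but share the same extremes, so the range cannot tell them apart.

```python
def sample_range(xs):
    """Range: largest value minus smallest value."""
    return max(xs) - min(xs)

evenly_spread = [50, 51, 52, 53, 54]
mostly_equal = [50, 50, 50, 50, 54]

print(sample_range(evenly_spread))   # 4
print(sample_range(mostly_equal))    # 4
```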
15
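• Sample Variance
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n(n-1)}$
• Sample Standard Deviation
$s = (\text{sample variance})^{1/2}$
• Remarks
Most frequently used measure of variability
Not well understood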
Sample Variance & Standard Deviation
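A sketch of the sample variance computed two ways: from the squared deviations about the mean with divisor n-1, and from the algebraically equivalent computational form that uses only the sum of the values and the sum of their squares. The data are hypothetical.

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Equivalent form: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2 = sample_variance(data)
s2_shortcut = sample_variance_shortcut(data)
s = math.sqrt(s2)    # sample standard deviation

print(s2, s2_shortcut, s)
```

The shortcut form avoids a second pass over the data, but it can lose precision when the values are large relative to their spread.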
16
• Formula: $CV = s / \bar{x}$
• Remarks
Relative measure of variation
Used for comparing the variation in two samples of data that are measured in two different units
Sample Coefficient of Variation
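A small sketch of the coefficient of variation, taken here as the sample standard deviation divided by the sample mean. The measurements are hypothetical. Changing the units of the lengths leaves the ratio unchanged, which is what makes it usable for comparing samples measured in different units.

```python
import statistics

def coefficient_of_variation(xs):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(xs) / statistics.mean(xs)

lengths_mm = [100.0, 102.0, 98.0, 101.0, 99.0]
lengths_in = [x / 25.4 for x in lengths_mm]    # same lengths in inches
weights_kg = [10.0, 12.0, 8.0, 11.0, 9.0]

cv_mm = coefficient_of_variation(lengths_mm)
cv_in = coefficient_of_variation(lengths_in)
cv_kg = coefficient_of_variation(weights_kg)
print(cv_mm, cv_in, cv_kg)
```

The lengths in millimetres and the weights have the same standard deviation, yet the weights vary about ten times more relative to their mean.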
17
• Skewness
• Kurtosis
Analysis of Shape
18
For a unimodal distribution, $x_r = \bar{x} / x_{0.5}$, where $x_{0.5}$ is the sample median, is an indicator of distribution shape
$x_r < 1$ , indicates skewed to the left
$x_r = 1$ , indicates symmetric
$x_r > 1$ , indicates skewed to the right
Estimate of Skewness
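A minimal sketch of this shape indicator, taking it as the ratio of the sample mean to the sample median. The three samples are made up, and the ratio is only meaningful as written for positive-valued data.

```python
import statistics

def mean_median_ratio(xs):
    """Ratio of sample mean to sample median (assumes a positive median)."""
    return statistics.mean(xs) / statistics.median(xs)

left_tail = [1, 7, 8, 9, 10]     # mean 7, median 8
symmetric = [6, 7, 8, 9, 10]     # mean 8, median 8
right_tail = [6, 7, 8, 9, 30]    # mean 12, median 8

print(mean_median_ratio(left_tail))    # 0.875
print(mean_median_ratio(symmetric))    # 1.0
print(mean_median_ratio(right_tail))   # 1.5
```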
19
• The third moment about the mean, $\mu_3 = E(X - \mu)^3$, is related to the asymmetry or skewness of a distribution
• For a unimodal (i.e., single-peaked) distribution
$\mu_3 < 0$ , distribution is skewed to the left
$\mu_3 = 0$ , distribution is symmetric
$\mu_3 > 0$ , distribution is skewed to the right
• Measure of skewness relative to degree of spread
$\sqrt{\beta_1} = \mu_3 / \mu_2^{3/2}$
where $\mu_2 = E(X - \mu)^2$
Measure of Skewness
20
• Normal: $\beta_1 = 0$
• Exponential: $\beta_1 = 4$ (that is, $\sqrt{\beta_1} = 2$)
Comparison of Distribution Skewness
21
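• Estimate of skewness of a distribution from a random sample
$\sqrt{\hat{\beta}_1} = m_3 / m_2^{3/2}$
where
$m_2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
$m_3 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3$
and
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$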
Estimation of Skewness
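A sketch of a moment-based skewness estimate: the third sample moment about the mean divided by the second sample moment raised to the power 3/2, with both moments using divisor n. The helper name `central_moment` and the example data are assumptions made for illustration.

```python
def central_moment(xs, r):
    """r-th sample moment about the mean, with divisor n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** r for x in xs) / n

def sample_skewness(xs):
    """Moment estimate of skewness: m3 / m2^(3/2)."""
    m2 = central_moment(xs, 2)
    m3 = central_moment(xs, 3)
    return m3 / m2 ** 1.5

print(sample_skewness([1, 2, 3, 4, 5]))     # symmetric sample: 0.0
print(sample_skewness([1, 1, 1, 2, 10]))    # long right tail: positive
print(sample_skewness([1, 9, 10, 10, 10]))  # long left tail: negative
```

Some software applies a small-sample adjustment to this estimate, so values from other tools may differ slightly.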
22
• The fourth moment about the mean, $\mu_4 = E(X - \mu)^4$, is related to the peakedness, called kurtosis, of a distribution
• Relative measure of kurtosis
$\beta_2 = \mu_4 / \mu_2^2$
where $\mu_2 = E(X - \mu)^2$
Measurement of Kurtosis
23
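• Estimate of kurtosis of a distribution ($\beta_2$) from a random sample
$b_2 = \hat{\beta}_2 = m_4 / m_2^2$
where
$m_2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
$m_4 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4$
and
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$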
Estimation of Kurtosis
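A sketch of a moment-based kurtosis estimate: the fourth sample moment about the mean divided by the square of the second, both with divisor n. The samples are hypothetical. For reference, a normal distribution has a kurtosis of 3 on this scale.

```python
def sample_kurtosis(xs):
    """Moment estimate of kurtosis: m4 / m2^2, with divisor-n moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]                          # evenly spread values
heavy_tails = [-10, -1, 0, 0, 0, 0, 1, 10]      # peaked centre, long tails

print(sample_kurtosis(flat))          # about 1.7, below 3
print(sample_kurtosis(heavy_tails))   # about 3.9, above 3
```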
24
Comparison of Kurtosis
25
Presentation of Data
26
Forty specimens were cut from a plate for tensile tests. The tests gave the following tensile strengths, x:
 i     x      i     x      i     x      i     x
 1   48.5    11   55.0    21   53.1    31   54.6
 2   54.7    12   55.7    22   49.1    32   49.9
 3   47.8    13   49.9    23   55.6    33   44.5
 4   56.9    14   54.8    24   46.2    34   52.9
 5   54.8    15   49.7    25   52.0    35   54.4
 6   57.9    16   58.9    26   56.6    36   60.2
 7   44.9    17   52.7    27   52.9    37   50.2
 8   53.0    18   57.8    28   52.2    38   57.4
 9   54.7    19   46.8    29   54.1    39   54.8
10   46.7    20   49.2    30   42.3    40   61.2
Perform a statistical analysis of the tensile strength data.
40 Specimens
27
40 Specimens
The following descriptive statistics were calculated from the data:
Descriptive Statistics
Count                 40
Minimum               42.35
Maximum               61.18
Range                 18.84
Sum                   2104.82
Mean                  52.62
Median                53.03
Sample Variance       19.83
Standard Deviation    4.45
Kurtosis              2.51
Skewness              -0.34
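The statistics can be recomputed approximately from the 40 tabulated tensile strengths, as sketched below. The listed minimum (42.35) and maximum (61.18) carry two decimals while the table carries one, so the listed values were presumably computed from unrounded data, and results from the tabulated values will differ slightly. The slides do not say which estimators produced the listed skewness and kurtosis; the moment ratios m3/m2^(3/2) and m4/m2^2 with divisor-n moments are assumed here.

```python
import statistics

# Tensile strengths as tabulated (one decimal place), specimens 1 to 40.
tensile = [
    48.5, 54.7, 47.8, 56.9, 54.8, 57.9, 44.9, 53.0, 54.7, 46.7,   # i = 1..10
    55.0, 55.7, 49.9, 54.8, 49.7, 58.9, 52.7, 57.8, 46.8, 49.2,   # i = 11..20
    53.1, 49.1, 55.6, 46.2, 52.0, 56.6, 52.9, 52.2, 54.1, 42.3,   # i = 21..30
    54.6, 49.9, 44.5, 52.9, 54.4, 60.2, 50.2, 57.4, 54.8, 61.2,   # i = 31..40
]

n = len(tensile)
mean = statistics.mean(tensile)
median = statistics.median(tensile)
variance = statistics.variance(tensile)    # divisor n - 1
std_dev = statistics.stdev(tensile)
data_range = max(tensile) - min(tensile)

# Divisor-n moments about the mean for the shape measures (assumed estimators).
m2 = sum((x - mean) ** 2 for x in tensile) / n
m3 = sum((x - mean) ** 3 for x in tensile) / n
m4 = sum((x - mean) ** 4 for x in tensile) / n
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2

for name, value in [
    ("Count", n), ("Minimum", min(tensile)), ("Maximum", max(tensile)),
    ("Range", data_range), ("Sum", sum(tensile)), ("Mean", mean),
    ("Median", median), ("Sample Variance", variance),
    ("Standard Deviation", std_dev), ("Kurtosis", kurtosis),
    ("Skewness", skewness),
]:
    print(f"{name:<20}{value:>10.2f}")
```

The negative skewness agrees with the mean falling slightly below the median, and a kurtosis below 3 suggests a distribution somewhat flatter than the normal.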