GTECH 201 Lecture 12
![Page 1: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/1.jpg)
GTECH 201 Lecture 12
Intro to Descriptive Statistics
![Page 2: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/2.jpg)
Topics for Today
- Measures of Central Tendency
  - Mean, Median, Mode
  - Sample and Population Mean
  - Weighted Means
  - Selecting Appropriate Measures of Central Tendency
- Measures of Dispersion
  - Variance
  - Standard Deviation
![Page 3: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/3.jpg)
Descriptive vs. Inferential
- Descriptive Statistics
  - Methods for organizing and summarizing information
- Inferential Statistics
  - Methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population
![Page 4: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/4.jpg)
Looking at This Data Set…
Student Performance in Class Tests
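| | ID | Test 1 | Test 2 | Test 3 | Test 4 |
|---|---|---|---|---|---|
| 1 | 2463 | B+ | A | 95 | 10 |
| 2 | 4140 | A- | A | 90 | 9.5 |
| 3 | 1210 | D | F | 0 | 0 |
| 4 | O649 | D+ | B+ | 80.5 | 9 |
| 5 | 2925 | B | ? | 86 | 8.5 |
| 6 | 4194 | A- | A | 86.5 | 9 |
| 7 | 4266 | B+ | F | 90.5 | 8.5 |
| 8 | 2517 | A- | A | 83.5 | 10 |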
![Page 5: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/5.jpg)
Overview
- Mean
- Median
- Mode
- Sample and Population Mean
- Weighted Means
- Selecting Appropriate Measures of Central Tendency
- Applying these measures
![Page 6: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/6.jpg)
Mean
The mean of a set of $n$ observations is the arithmetic average

The mean of $n$ observations $x_1, x_2, x_3, \ldots, x_n$ is

$$\bar{x} = \frac{\sum x_i}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

In Excel, =AVERAGE(insert range)
![Page 7: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/7.jpg)
Median
The data value that is exactly in the middle of an ordered list if the number of pieces of data is odd
The mean of the two middle pieces of data in an ordered list if the number of pieces of data is even
The median is a typical value; it is the midpoint of observations when they are arranged in an ascending or descending order
![Page 8: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/8.jpg)
Mode
- The most frequent data value; i.e., any value having the highest frequency among the observations
- In Excel, you use the functions
  - =MEDIAN(insert range)
  - =MODE(insert range)
- Unimodal, bimodal, multimodal data sets
- Outliers
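As a rough Python counterpart to the Excel functions named here, the sketch below uses the standard-library `statistics` module on the twelve observations that appear in the lecture's percentile example. The variable names are illustrative only.

```python
import statistics

# The twelve observations used in the lecture's percentile example.
data = [2, 4, 7, 11, 11, 11, 11, 14, 16, 16, 24, 29]

mean = statistics.mean(data)        # arithmetic average, like =AVERAGE
median = statistics.median(data)    # middle of the ordered list, like =MEDIAN
modes = statistics.multimode(data)  # every value tied for the highest frequency

print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
```

`multimode` returns a list, so a bimodal or multimodal data set would show up as more than one entry. For this data set the mean is 13, while the median and the single mode are both 11.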
![Page 9: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/9.jpg)
Sample and Population Means
Mean of a data set
- Population mean, if the data set includes the entire population

$$\mu = \frac{\sum X_i}{N}$$

- Sample mean, if the data set is only a sample of the population

$$\bar{x} = \frac{\sum x_i}{n}$$
![Page 10: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/10.jpg)
Weighted Means
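To calculate the mean when your information is available only in the form of summary data

| Class Interval | Freq |
|---|---|
| 25 – 29.9 | 4 |
| 30 – 34.9 | 5 |
| 35 – 39.9 | 12 |

$$\bar{x} = \frac{\sum x_j f_j}{n}$$

Here $x_j$ is the value representing class $j$ (typically its midpoint), $f_j$ is the frequency of that class, and $n = \sum f_j$.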
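A minimal sketch of the weighted-mean calculation for the frequency table on this slide. The slide does not say which value represents each class, so the code assumes the midpoint of the stated class limits; the function name `weighted_mean` is hypothetical.

```python
# Class limits and frequencies from the summary table on this slide.
classes = [((25.0, 29.9), 4), ((30.0, 34.9), 5), ((35.0, 39.9), 12)]


def weighted_mean(classes):
    """Return sum(x_j * f_j) / n, taking x_j as the class midpoint (an assumption)."""
    n = sum(freq for _, freq in classes)
    total = sum(((low + high) / 2) * freq for (low, high), freq in classes)
    return total / n


print(round(weighted_mean(classes), 2))
```

With midpoints 27.45, 32.45 and 37.45 and n = 21, the result is about 34.35. Treating the classes as 25 to under 30, and so on, would shift each midpoint slightly and change the answer accordingly.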
![Page 11: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/11.jpg)
Skewed Distributions
![Page 12: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/12.jpg)
Skewed Distributions
- When there is one mode and the distribution is symmetric
  - mean, median, mode are the same
- Positive skew
  - mean moves towards the positive tail
  - median also pulls towards the positive tail
- Negative skew
  - mean moves towards the negative tail
  - median also moves towards the negative tail
![Page 13: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/13.jpg)
Selecting Appropriate Measures
- Mean
  - affected by extreme values
  - includes all observations, therefore comprehensive (useful for interval/ratio data)
- Median
  - not affected by the number of observations
  - reveals typical situations (used for ordinal data)
- Mode
  - useful for nominal variables
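The sensitivity of the mean to extreme values can be seen in a small sketch. The numbers below are hypothetical and are not from the lecture; they only illustrate how one extreme observation moves the mean far more than the median.

```python
import statistics

# Hypothetical values (not from the lecture): five similar incomes, in $1000s.
incomes = [30, 32, 35, 36, 37]
with_outlier = incomes + [400]  # the same data plus one extreme value

for label, values in [("without outlier", incomes), ("with outlier", with_outlier)]:
    print(label, "mean:", statistics.mean(values), "median:", statistics.median(values))
```

The mean jumps from 34 to 95 when the extreme value is added, while the median only moves from 35 to 35.5.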
![Page 14: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/14.jpg)
Other Useful Calculations
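In addition to the sum of data, $\sum x$, we need to be able to calculate:

$$\sum (x - \bar{x}); \quad \sum (x - \bar{x})^2; \quad \sum x^2$$

$\sum x^2$ and $\left(\sum x\right)^2$

$\sum xy$ and $\sum x \sum y$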
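A short sketch of these sums in Python, using the five heights from the lecture's basketball example as `x`. The second variable `y` is hypothetical and is included only so that the sum of products can be shown.

```python
x = [67, 72, 76, 76, 84]  # heights from the lecture's basketball example
y = [1, 2, 3, 4, 5]       # hypothetical second variable, for illustration only

sum_x = sum(x)                              # sum of the data
sum_x_sq = sum(v * v for v in x)            # sum of the squares
sq_sum_x = sum_x ** 2                       # square of the sum
sum_xy = sum(a * b for a, b in zip(x, y))   # sum of the products
sum_x_sum_y = sum_x * sum(y)                # product of the sums

print(sum_x, sum_x_sq, sq_sum_x)
print(sum_xy, sum_x_sum_y)
```

The sum of the squares (28281) differs from the square of the sum (140625), and the sum of the products (1163) differs from the product of the sums (5625), so the order of squaring, multiplying and summing matters.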
![Page 15: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/15.jpg)
Variability or Spread
- Mean and the median - limits
- Range: coarse measure of variability
- Percentiles
  - kth percentile is the point at which k percent of the numbers fall below it and the rest fall above it
  - 25th percentile (lower quartile)
  - 50th percentile (median)
  - 75th percentile (upper quartile)
  - Interquartile range (difference between the 75th percentile value and the 25th percentile value)
![Page 16: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/16.jpg)
Describing the Spread
- A five number summary
  - Median
  - Quartiles
  - Extremes
- Variance and Standard Deviation
  - Measures spread about the mean
  - Standard deviation cannot be discussed without the mean
![Page 17: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/17.jpg)
Calculating Percentiles

In the list of twelve observations

2 4 7 11 11 11 11 14 16 16 24 29

compute the median and the 25th and 75th percentiles.

$$\text{Median} = \frac{11 + 11}{2} = 11$$

The lower quartile is the median of the 6 observations that fall below the median:

$$\frac{7 + 11}{2} = 9$$

The upper quartile is the median of the 6 observations that fall above the median:

$$\frac{16 + 16}{2} = 16$$
![Page 18: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/18.jpg)
Five Number Summary
- Median = 11
- Lower Quartile = 9
- Upper Quartile = 16
- Extremes are 2 and 29
- Can compute the range = 27
- In a symmetric distribution, the lower and upper quartiles are equally distant from the median
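The five number summary for the twelve observations can be reproduced with a short sketch that follows the lecture's rule: each quartile is the median of the half of the ordered data lying below or above the median. The function name `five_number_summary` is hypothetical, and the sketch assumes at least two observations.

```python
import statistics


def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max).

    Quartiles are the medians of the lower and upper halves of the sorted
    data; with an odd count the middle value belongs to neither half.
    Assumes at least two observations.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # observations below the median position
    upper = data[-half:]  # observations above the median position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])


obs = [2, 4, 7, 11, 11, 11, 11, 14, 16, 16, 24, 29]
minimum, q1, median, q3, maximum = five_number_summary(obs)
print(minimum, q1, median, q3, maximum)
print("range:", maximum - minimum, "IQR:", q3 - q1)
```

Other tools, including spreadsheet quartile functions, may interpolate between observations and so can report slightly different quartiles for the same data.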
![Page 19: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/19.jpg)
Variance
- Is the mean of the squares of the deviations of the observations from their mean
- Population variance

$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$

- Sample variance

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
![Page 20: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/20.jpg)
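Example

The heights, in inches, of the five starting players on a men’s college basketball team are:

67 72 76 76 84

Compute the mean and standard deviation.

| $x$ | $x - \bar{x}$ | $(x - \bar{x})^2$ |
|---|---|---|
| 67 | -8 | 64 |
| 72 | -3 | 9 |
| 76 | 1 | 1 |
| 76 | 1 | 1 |
| 84 | 9 | 81 |
| 375 | 0 | 156 |

$$\bar{x} = \frac{\sum x}{n} = \frac{375}{5} = 75$$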
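The deviation table for the five heights can be rebuilt with a few lines of Python. This is only a sketch of the arithmetic on the slide; the variable names are illustrative.

```python
heights = [67, 72, 76, 76, 84]

mean = sum(heights) / len(heights)
deviations = [x - mean for x in heights]  # x minus the mean
squared = [d ** 2 for d in deviations]    # squared deviations

# One row per player, then a totals row.
for x, d, sq in zip(heights, deviations, squared):
    print(f"{x:>4} {d:>6.0f} {sq:>6.0f}")
print(f"{sum(heights):>4} {sum(deviations):>6.0f} {sum(squared):>6.0f}")
```

The totals row shows the two facts the rest of the example relies on: the deviations sum to 0, and the squared deviations sum to 156.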
![Page 21: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/21.jpg)
Standard Deviation
- Standard deviation is the positive square root of the variance
- Variance in our basketball example:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{156}{4} = 39$$
![Page 22: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/22.jpg)
Formulas – Standard Deviation
Standard deviation of a sample

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

Standard deviation of a population

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$
![Page 23: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/23.jpg)
Example (Continued)
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{39} \approx 6.24$$
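The sample and population formulas can be compared on the basketball heights with Python's standard-library `statistics` module, where `variance` and `stdev` divide by n - 1 and `pvariance` and `pstdev` divide by N. This is a sketch for checking the hand calculation, not part of the original lecture.

```python
import math
import statistics

heights = [67, 72, 76, 76, 84]

s2 = statistics.variance(heights)       # sample variance, divides by n - 1
s = statistics.stdev(heights)           # sample standard deviation
sigma2 = statistics.pvariance(heights)  # population variance, divides by N
sigma = statistics.pstdev(heights)      # population standard deviation

print("sample:    ", s2, round(s, 2))
print("population:", sigma2, round(sigma, 2))
print("s squared matches s2:", math.isclose(s ** 2, s2))
```

The sample variance is 39 and the sample standard deviation is about 6.24, matching the slide. Treating the same five values as a whole population gives a smaller spread, because the sum of squares is divided by 5 rather than 4.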
![Page 24: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/24.jpg)
Short Cut – Simpler Formula
$$s = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$$

Standard deviation of a sample

- $\sum x^2$: sum of the squares of data values, i.e., you square each data value and then sum those squared values
- $\left(\sum x\right)^2$: square of the sum of data values, i.e., you sum all the data values and then square that sum
![Page 25: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/25.jpg)
Example (using the short cut)
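| $x$ | $x^2$ |
|---|---|
| 67 | 4489 |
| 72 | 5184 |
| 76 | 5776 |
| 76 | 5776 |
| 84 | 7056 |
| 375 | 28281 |

$$\left(\sum x\right)^2 = 375^2 = 140625$$

$$s = \sqrt{\frac{5(28281) - 375^2}{5 \cdot 4}} = \sqrt{\frac{780}{20}} = \sqrt{39} \approx 6.24$$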
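A small sketch can confirm that the short-cut formula and the definitional formula agree on the basketball data. Both function names are hypothetical.

```python
import math


def shortcut_stdev(values):
    """Sample standard deviation via sqrt((n*sum(x^2) - (sum x)^2) / (n*(n-1)))."""
    n = len(values)
    sum_x = sum(values)
    sum_x_sq = sum(v * v for v in values)
    return math.sqrt((n * sum_x_sq - sum_x ** 2) / (n * (n - 1)))


def definition_stdev(values):
    """Sample standard deviation from squared deviations about the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


heights = [67, 72, 76, 76, 84]
print(round(shortcut_stdev(heights), 2), round(definition_stdev(heights), 2))
```

The short cut is convenient by hand because it needs only two running totals. In floating-point arithmetic it subtracts two large, nearly equal numbers, so for large values with little spread the definitional form is usually the safer choice.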
![Page 26: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/26.jpg)
Interpreting Std. Deviation
- $s$ and $s^2$ will be small when all the data are close together
- The deviations from the mean
  - Will be both positive and negative
  - Sum will always be 0
- $s$ is always 0 or a positive number
- $s = 0$ means no spread; as the value of $s$ increases, the spread of the data increases
- The units of $s$ are the same as the original observations
- $s$ is heavily influenced by outliers
![Page 27: GTECH 201 Lecture 12](https://reader036.vdocuments.mx/reader036/viewer/2022062304/568145ce550346895db2d654/html5/thumbnails/27.jpg)
Coefficient of Variation
CV is the standard deviation described as a percent of the mean
$$CV = \frac{100 \, s}{\bar{x}}$$
CV is useful when comparing different sets of data where sample size and standard deviation are different
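As an illustration, the coefficient of variation for the basketball heights can be computed as below. The sketch assumes that $s$ is the sample standard deviation, as in the earlier slides; the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV = 100 * s / mean, with s taken as the sample standard deviation."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


heights = [67, 72, 76, 76, 84]
print(round(coefficient_of_variation(heights), 1))
```

For the five heights the standard deviation is about 8.3 percent of the mean. Because CV is unit-free, the same function can be applied to data sets measured on different scales before comparing them.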