Source: biostatcourse.fiu.edu/PDFSlides/Module10.pdf
10 - 1
Module 10: Summarizing Numbers
This module presents the standard summarizing numbers, also often called sample statistics or point estimates, that are essential to using and understanding data. Module 10 focuses on measures of central tendency, including means, medians, modes, midranges and geometric means. A discussion of percentiles and box plots is also included.
Reviewed 05 May 05/ MODULE 10
10 - 2
Measures of Central Tendency
• Mean: average
• Median: middle
• Mode: most frequent
• Midrange: halfway between smallest and largest
• Geometric mean: uses logarithms
10 - 3
Sample 1

| Person | x_i |
| --- | --- |
| 1 | 18 |
| 2 | 19 |
| 3 | 20 |
| 4 | 21 |
| 5 | 22 |
10 - 4
Mean or Average
• The mean or average is obtained by adding up the values for all the observations and then dividing by the number of observations
• In general, the mean is the best measure of central tendency to use, but there are exceptions
10 - 5
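Calculating the Mean

| Sample 1 (x_1) | Sample 2 (x_2) |
| --- | --- |
| 18 | 90 |
| 19 | 4 |
| 20 | 3 |
| 21 | 2 |
| 22 | 1 |

$$\text{Sum}(x_1) = \sum_{i=1}^{n} x_{1i}, \qquad \text{Sum}(x_2) = \sum_{i=1}^{n} x_{2i}$$

$$\bar{x}_1 = \sum_{i=1}^{5} x_{1i} / 5, \qquad \bar{x}_2 = \sum_{i=1}^{5} x_{2i} / 5$$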
10 - 6
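Mean for Sample 1

$$\text{Sum}(x_i) = \sum_{i=1}^{n} x_i$$

| Person | x_i |
| --- | --- |
| 1 | 18 |
| 2 | 19 |
| 3 | 20 |
| 4 | 21 |
| 5 | 22 |
| Sum(x_i) | 100 |
| Mean | 20.0 |

$$\text{Mean} = \bar{X} = \sum_{i=1}^{n} x_i / n$$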
10 - 7
Mean for Sample 2

| Person | x_i |
| --- | --- |
| 1 | 90 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 1 |
| Sum(x_i) | 100 |
| Mean | 20 |
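The calculation on the last two slides can be sketched in a few lines of Python. The function name `mean` and the variable names are assumptions made for illustration; only the data values come from the slides.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    if not values:
        raise ValueError("mean requires at least one observation")
    return sum(values) / len(values)


# Data from the slides
sample_1 = [18, 19, 20, 21, 22]
sample_2 = [90, 4, 3, 2, 1]

mean_1 = mean(sample_1)
mean_2 = mean(sample_2)

# Both samples sum to 100, so they share the same mean
# even though their values are very different.
print(mean_1, mean_2)
```

The two samples are a reminder that equal means do not imply similar data.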
10 - 8
Median
• The median is the “middle” observation when the complete list of observations is sorted in order.
• When there is an odd number of observations, the value of the middle one is the median.
• When there is an even number of observations, the average of the two “middle” observations is used as the median.
• The median may be a better indication of the center of a group of numbers if some values are considerably more extreme than others.
• Median income is often used for this reason.
10 - 9
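Median for Sample 1

| Person | x_i |
| --- | --- |
| 1 | 18 |
| 2 | 19 |
| 3 | 20 |
| 4 | 21 |
| 5 | 22 |
| Sum(x_i) | 100 |
| Mean | 20 |

Median = 20 (the middle observation, person 3)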
10 - 10
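Median for Sample 2

| Person | x_i |
| --- | --- |
| 1 | 90 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 1 |
| Sum(x_i) | 100 |
| Mean | 20 |

Median = 3 (the middle observation, person 3)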
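The odd and even rules for the median can be sketched as follows. The helper name `median` is an assumption for illustration, and the four-value list is a made-up example of the even case that does not appear in the slides.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    if not values:
        raise ValueError("median requires at least one observation")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


sample_1 = [18, 19, 20, 21, 22]   # data from the slides
sample_2 = [90, 4, 3, 2, 1]       # data from the slides
even_example = [1, 2, 3, 4]       # hypothetical even-sized sample

median_1 = median(sample_1)
median_2 = median(sample_2)
median_even = median(even_example)
print(median_1, median_2, median_even)
```

For Sample 2 the single extreme value 90 pulls the mean up to 20, while the median stays at 3.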
10 - 11
Midrange
• The value of the point halfway between the smallest and the largest observations.
• Easily calculated as the average of the smallest and largest observations.
• Note that the midrange need not equal the value of any of the observations.
10 - 12
Midrange for Sample 1

| Person | x_i |
| --- | --- |
| 1 | 18 |
| 2 | 19 |
| 3 | 20 |
| 4 | 21 |
| 5 | 22 |
| Sum(x_i) | 100 |
| Mean | 20.0 |

Midrange = (18 + 22)/2 = 20
10 - 13
Midrange for Sample 2

| Person | x_i |
| --- | --- |
| 1 | 90 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 1 |
| Sum(x_i) | 100 |
| Mean | 20 |

Midrange = (90 + 1)/2 = 45.5
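A short sketch of the midrange calculation for both samples. The function name `midrange` is an assumption for illustration.

```python
def midrange(values):
    """Point halfway between the smallest and the largest observation."""
    if not values:
        raise ValueError("midrange requires at least one observation")
    return (min(values) + max(values)) / 2


sample_1 = [18, 19, 20, 21, 22]   # data from the slides
sample_2 = [90, 4, 3, 2, 1]       # data from the slides

midrange_1 = midrange(sample_1)
midrange_2 = midrange(sample_2)
print(midrange_1, midrange_2)
```

Because it depends only on the two most extreme values, the midrange of Sample 2 (45.5) is not a value that occurs in the sample.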
10 - 14
Mode
• The value of the observation that occurs most frequently.
• Represents a number that does occur in the observations.
• Not always well-defined, since there may not be one value that occurs most frequently.
10 - 15
Mode for Sample 1

| Person | x_i |
| --- | --- |
| 1 | 18 |
| 2 | 19 |
| 3 | 20 |
| 4 | 21 |
| 5 | 22 |
| Sum(x_i) | 100 |
| Mean | 20.0 |

No mode since all values occur equally frequently
10 - 16
Mode for Sample 2

| Person | x_i |
| --- | --- |
| 1 | 90 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 1 |
| Sum(x_i) | 100 |
| Mean | 20 |

No mode since all values occur equally frequently
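The slides treat a sample in which every value occurs equally often as having no mode. A minimal sketch of that convention, assuming a hypothetical helper `modes` (not from the slides) that returns every value tied for the highest count, and an empty list when all distinct values are tied:

```python
from collections import Counter


def modes(values):
    """Values with the highest count; empty list if all distinct values tie."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Convention used in the slides: if every value is equally frequent
    # (and there is more than one distinct value), report no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


sample_1 = [18, 19, 20, 21, 22]   # data from the slides
sample_2 = [90, 4, 3, 2, 1]       # data from the slides

modes_1 = modes(sample_1)
modes_2 = modes(sample_2)
print(modes_1, modes_2)

# Hypothetical samples, not from the slides
one_mode = modes([1, 2, 2, 3])
two_modes = modes([1, 1, 2, 2, 3])
print(one_mode, two_modes)
```

Returning a list makes the "not always well-defined" case explicit: a sample can have zero, one, or several modes.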
10 - 17
Geometric Mean
• First, take the log of each sample point
• Second, calculate the mean of the log values
• Third, convert the mean of the log values back to the original scale
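Written as a formula, the three steps amount to the following, where all observations are assumed to be positive. The symbol b for the base of the logarithm is an assumption of this sketch and is not used in the slides; the result does not depend on which base is chosen.

$$\mathrm{GM} = b^{\frac{1}{n}\sum_{i=1}^{n} \log_b x_i} = \left(\prod_{i=1}^{n} x_i\right)^{1/n}, \qquad x_i > 0$$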
10 - 18
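Geometric Mean for Sample 1

| Person | Age | log10(age) | loge(age) |
| --- | --- | --- | --- |
| 1 | 18 | 1.26 | 2.89 |
| 2 | 19 | 1.28 | 2.94 |
| 3 | 20 | 1.30 | 3.00 |
| 4 | 21 | 1.32 | 3.04 |
| 5 | 22 | 1.34 | 3.09 |
| ∑ | 100 | 6.50 | 14.97 |
| x̄ | 20 | 1.30 | 2.99 |
| GM | - | 19.95 | 19.89 |

$$10^{1.30} = 19.95, \qquad e^{2.99} = 19.89$$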
10 - 19
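Geometric Mean for Sample 2

| Person | Age | log10(age) | loge(age) |
| --- | --- | --- | --- |
| 1 | 90 | 1.95 | 4.50 |
| 2 | 4 | 0.60 | 1.39 |
| 3 | 3 | 0.48 | 1.10 |
| 4 | 2 | 0.30 | 0.69 |
| 5 | 1 | 0.00 | 0.00 |
| ∑ | 100 | 3.33 | 7.68 |
| x̄ | 20 | 0.67 | 1.54 |
| GM | - | 4.68 | 4.64 |

$$10^{0.67} = 4.68, \qquad e^{1.54} = 4.64$$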
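The log, average, back-transform procedure can be sketched in Python as below. The function name `geometric_mean` and its `base` parameter are assumptions for illustration. Without intermediate rounding, both log bases give the same answer (roughly 19.95 for Sample 1 and 4.64 for Sample 2); the small differences between the two columns on the slides come from rounding the mean log to two decimals before converting back.

```python
import math


def geometric_mean(values, base=math.e):
    """Geometric mean via logs: mean of log values, converted back to the original scale."""
    if not values:
        raise ValueError("geometric mean requires at least one observation")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    mean_log = sum(math.log(v, base) for v in values) / len(values)
    return base ** mean_log


sample_1 = [18, 19, 20, 21, 22]   # data from the slides
sample_2 = [90, 4, 3, 2, 1]       # data from the slides

gm_1 = geometric_mean(sample_1)
gm_2 = geometric_mean(sample_2)
gm_2_base10 = geometric_mean(sample_2, base=10)

print(round(gm_1, 2), round(gm_2, 2), round(gm_2_base10, 2))
```

The positivity check is there because the logarithm is undefined for zero and negative values.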
10 - 20
Measures of Central Tendency

| Measure | Sample 1 | Sample 2 |
| --- | --- | --- |
| Mean | 20.0 years | 20.0 years |
| Median | 20 years | 3 years |
| Mode | none | none |
| Midrange | 20 years | 45.5 years |
| Geometric Mean | 19.95 years | 4.68 years |
10 - 21
Knowing the Mean is not Enough
• What else would it be useful to know?
• A key issue is how alike or “unlike” each other the individual observations are
• How can we measure “unlikeness”?
10 - 22
Percentiles
Percentiles are numbers that divide the data into 100 equal parts. For a set of observations arranged in order of magnitude, the pth percentile is the value that has p percent of the observations below it and (100 - p) percent above it. The most commonly used percentiles are the 25th, 50th and 75th percentiles.
The 50th percentile is the observation or number that has 50% of the observations below it and the other 50% above it; this is simply the “middle” observation when the set of observations is arranged in order of magnitude. The 50th percentile is usually referred to as the median.
10 - 23
Example: Age Distribution from Module 9
The pth percentile is the $(n \times p / 100)$th observation when the set of observations is arranged in order of magnitude, where n is the sample size.
For the age distribution, n = 121. The 75th percentile for the age distribution is the (75 × 121)/100 = 90.75 ≈ 91st observation when the ages are arranged in increasing order of magnitude. The 75th percentile of the ages is therefore 31 years. The 25th, 50th and 80th percentiles are the 31st, 61st and 97th observations respectively, as shown on the next slide.
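The positions quoted for the age distribution can be checked with a small sketch. The helper name `percentile_position` is hypothetical, and rounding up to the next whole observation is an assumption inferred from the slide's 90.75 → 91 and 30.25 → 31; other textbooks and software packages use different conventions.

```python
import math


def percentile_position(p, n):
    """1-based position of the p-th percentile in sorted data: n*p/100, rounded up."""
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    return math.ceil(n * p / 100)


n = 121  # sample size of the age distribution
positions = {p: percentile_position(p, n) for p in (25, 50, 75, 80)}
print(positions)
```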
10 - 24
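| Age | Frequency | Cumulative Frequency | Note |
| --- | --- | --- | --- |
| 21 | 6 | 6 | |
| 22 | 16 | 22 | |
| 23 | 11 | 33 | 25th percentile: the 31st observation falls in this group |
| 24 | 9 | 42 | |
| 25 | 17 | 59 | |
| 26 | 13 | 72 | 50th percentile: the 61st observation falls in this group |
| 27 | 6 | 78 | |
| 28 | 5 | 83 | |
| 29 | 4 | 87 | |
| 30 | 3 | 90 | |
| 31 | 1 | 91 | 75th percentile |
| 32 | 4 | 95 | |
| 33 | 3 | 98 | 80th percentile: the 97th observation falls in this group |
| 34 | 2 | 100 | |
| 35+ | 21 | 121 | |
| Total | 121 | | |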
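Locating a percentile in a frequency table means finding the first age group whose cumulative frequency reaches the required position. A minimal sketch using the age distribution above; the function name `percentile_group` is hypothetical, and the round-up rule for the position is an assumption inferred from the worked example.

```python
import math
from itertools import accumulate

# Age groups and frequencies from the slide's frequency table
ages = ["21", "22", "23", "24", "25", "26", "27", "28",
        "29", "30", "31", "32", "33", "34", "35+"]
frequencies = [6, 16, 11, 9, 17, 13, 6, 5, 4, 3, 1, 4, 3, 2, 21]

# Running totals: 6, 22, 33, ..., 121
cumulative = list(accumulate(frequencies))


def percentile_group(p):
    """Age group containing the p-th percentile (position n*p/100, rounded up)."""
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    position = math.ceil(cumulative[-1] * p / 100)
    for age, running_total in zip(ages, cumulative):
        if position <= running_total:
            return age
    raise RuntimeError("position exceeds the number of observations")


groups = {p: percentile_group(p) for p in (25, 50, 75, 80)}
print(cumulative[-1], groups)
```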
10 - 25
Box Plot
An individual box symbol summarizes the distribution of data within a data set. By using a box symbol, in addition to the average value, other information about the distribution of the measurements can also be displayed. As shown on the next slide, the 25th, 50th (median), and 75th percentiles of the distribution can be displayed along with the average (mean) value of the distribution.
10 - 26
Box Plot
[Figure: individual box symbol drawn against a quantitative scale. Labels, from top to bottom: 75th percentile + 1.5 IQR; upper line, referred to as whisker; 75th percentile; average/mean; 50th percentile/median; 25th percentile; lower line, referred to as whisker; 25th percentile - 1.5 IQR.]
IQR: interquartile range, which is calculated by subtracting the 25th percentile of the data from the 75th percentile; consequently, it contains the middle 50% of the observations.
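The quantities labeled on the box symbol follow directly from the 25th and 75th percentiles. A brief sketch, using the quartiles of the age distribution from this module (25th percentile = 23, 75th percentile = 31); the function and key names are assumptions for illustration.

```python
def box_plot_limits(q1, q3):
    """IQR and the 1.5 * IQR whisker limits shown on the box symbol."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "lower_limit": q1 - 1.5 * iqr,   # 25th percentile - 1.5 IQR
        "upper_limit": q3 + 1.5 * iqr,   # 75th percentile + 1.5 IQR
    }


# Quartiles of the age distribution example
limits = box_plot_limits(q1=23, q3=31)
print(limits)
```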
10 - 27
Box Plot for Age distribution
[Figure: SAS generated box plot. Axis: Age, with ticks at 25, 30, 35 and 40. 25th percentile = 23; 50th percentile (median) = 26; mean age = 27; 75th percentile = 31.]
10 - 28
Box plot from SPSS
[Figure: box plot of variable Y, N = 129, axis from 0 to 300; individual points labeled 108, 20, 96 and 127.]
10 - 29
Box plot from ViSta (The Visual Statistics System)