Copyright (C) 2002 Houghton Mifflin Company. All rights reserved. 1
Averages and Variation
Today
• Check in
  – Quiz next Tuesday
  – Will NOT NOT NOT include today’s material
    • So you can focus on other lectures
  – Proposal draft assigned
    • It’s a draft of the full proposal
    • Due Tuesday, November 22nd
Measures of Central Tendency
• Mode
• Median
• Mean
The Mode
the value or property that occurs most frequently in the data
Find the mode:
6, 7, 2, 3, 4, 6, 2, 6
The mode is 6.
Find the mode:
6, 7, 2, 3, 4, 5, 9, 8
There is no mode for this data.
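The two mode examples can be checked with a short Python sketch. It assumes the convention used on these slides that data with no repeated value has no mode; the function name `modes` is illustrative and not part of the lecture.

```python
from collections import Counter

def modes(data):
    """Return the most frequent value(s), or [] when no value repeats."""
    if not data:
        return []
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs once, so by the slide's convention there is no mode.
        return []
    return [value for value, count in counts.items() if count == highest]

print(modes([6, 7, 2, 3, 4, 6, 2, 6]))  # [6]
print(modes([6, 7, 2, 3, 4, 5, 9, 8]))  # []
```

Returning a list lets the same function report ties when two values share the highest count.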
The Median
the central value of an ordered distribution
To find the median of raw data:
• Order the data from smallest to largest.
• For an odd number of data values, the
median is the middle value.
• For an even number of data values, the
median is found by dividing the sum of
the two middle values by two.
Find the median:
Data: 5, 2, 7, 1, 4, 3, 2
Rearrange: 1, 2, 2, 3, 4, 5, 7
The median is 3.
Find the median:
Data: 31, 57, 12, 22, 43, 50
Rearrange: 12, 22, 31, 43, 50, 57
The median is the average of the middle two values: $\frac{31 + 43}{2} = 37$
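The odd and even cases of the median rule can be sketched in Python as follows. The function name `median` is chosen for illustration; the data are the two examples from the slides.

```python
def median(data):
    """Median of raw data: middle value, or average of the two middle values."""
    ordered = sorted(data)          # order from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]      # odd count: the single middle value
    # even count: sum of the two middle values divided by two
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([5, 2, 7, 1, 4, 3, 2]))       # 3
print(median([31, 57, 12, 22, 43, 50]))    # 37.0
```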
The Mean
The mean of a collection of data is found by:
• summing all the entries
• dividing by the number of entries
$\text{mean} = \frac{\text{sum of all entries}}{\text{number of entries}}$
Find the mean:
6, 7, 2, 3, 4, 5, 2, 8
$\text{mean} = \frac{6 + 7 + 2 + 3 + 4 + 5 + 2 + 8}{8} = \frac{37}{8} = 4.625 \approx 4.6$
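As a quick check of this arithmetic, here is a minimal Python sketch; the helper name `mean` is illustrative only.

```python
def mean(data):
    """Sum all the entries, then divide by the number of entries."""
    return sum(data) / len(data)

entries = [6, 7, 2, 3, 4, 5, 2, 8]
print(sum(entries))    # 37
print(mean(entries))   # 4.625
```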
Sigma Notation
• The symbol $\Sigma$ means “sum the following.”
• $\Sigma$ is the Greek letter (capital) sigma.
Notations for mean
Sample mean: $\bar{x}$ (“x bar”)
Population mean: $\mu$ (Greek letter mu)
Number of entries in a set of data
• If the data represents a sample, the
number of entries = n.
• If the data represents an entire
population, the number of entries = N.
Sample mean
$\bar{x} = \frac{\sum x}{n}$
Population mean
$\mu = \frac{\sum x}{N}$
Resistant Measure
a measure that is not influenced by extremely high or low data values
Which is less resistant?
• Mean
• Median
The mean is less resistant. It can be made arbitrarily large by increasing the size of one value.
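A small Python sketch illustrates the point using the standard `statistics` module. The data set is the one from the mean example; replacing 8 with 800 is a hypothetical change made only for this illustration.

```python
import statistics

data = [6, 7, 2, 3, 4, 5, 2, 8]
inflated = [6, 7, 2, 3, 4, 5, 2, 800]   # hypothetical: one value made very large

# The mean follows the extreme value; the median does not move.
print(statistics.mean(data), statistics.median(data))           # 4.625 4.5
print(statistics.mean(inflated), statistics.median(inflated))   # 103.625 4.5
```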
Weighted Average
Average calculated where some of the numbers are assigned more
importance or weight
Weighted Average
$\text{Weighted Average} = \frac{\sum xw}{\sum w}$
where w = the weight of the data value x.
Compute the Weighted Average:
• Midterm grade = 92
• Term Paper grade = 80
• Final exam grade = 88
• Midterm weight = 25%
• Term paper weight = 25%
• Final exam weight = 50%
Compute the Weighted Average:
               x      w     xw
Midterm       92    .25     23
Term Paper    80    .25     20
Final exam    88    .50     44
Sum                1.00     87

$\text{Weighted Average} = \frac{\sum xw}{\sum w} = \frac{87}{1.00} = 87$
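The same computation as a minimal Python sketch. The function name `weighted_average` is illustrative; the grades and weights are the ones from the example.

```python
def weighted_average(values, weights):
    """Sum of x*w divided by the sum of the weights w."""
    total_xw = sum(x * w for x, w in zip(values, weights))
    return total_xw / sum(weights)

grades = [92, 80, 88]            # midterm, term paper, final exam
weights = [0.25, 0.25, 0.50]
print(weighted_average(grades, weights))   # 87.0
```

Dividing by the sum of the weights means the weights do not have to add up to 1; counts or points would work as well.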
Percentiles
For any whole number P (between 1 and 99), the Pth percentile of a distribution is a value such that P% of the data fall at or below it.
The percent falling above the Pth percentile will be (100 – P)%.
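One way to see the definition is to count the share of data at or below a given value. The sketch below uses hypothetical data (the whole numbers 1 through 10) and an illustrative function name; it checks the definition rather than prescribing a method for locating percentiles.

```python
def percent_at_or_below(data, value):
    """Percentage of entries in data that are less than or equal to value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = list(range(1, 11))               # hypothetical data: 1, 2, ..., 10
below = percent_at_or_below(scores, 4)
print(below, 100 - below)                 # 40.0 60.0
```

Here 4 behaves like the 40th percentile of the hypothetical data: 40% of the entries fall at or below it and (100 – 40)% = 60% fall above it.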
Percentiles
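[Figure: number line running from the lowest value to the highest value, marked at P40. 40% of the data fall at or below P40 and 60% of the data fall above it.]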
Quartiles
• Percentiles that divide the data into fourths
• Q1 = 25th percentile
• Q2 = the median
• Q3 = 75th percentile
Computing Quartiles
• Order the data from smallest to largest.
• Find the median, the second quartile.
• Find the median of the data falling below Q2. This is the first quartile.
• Find the median of the data falling above Q2. This is the third quartile.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
The data has been ordered.
The median is 24.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
For the data below the median, the median is 17.
17 is the first quartile.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
For the data above the median, the median is 33.
33 is the third quartile.
Find the interquartile range:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
IQR = Q3 – Q1 = 33 – 17 = 16
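The quartile procedure from these slides (median first, then the median of each half) can be sketched in Python as below. The function names are illustrative. This sketch assumes that, for an odd number of values, the median itself is left out of both halves, which matches the worked example; statistical software often uses other conventions and may report slightly different quartiles.

```python
def median(ordered):
    """Median of an already ordered list."""
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                   # data below Q2
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]  # data above Q2
    return median(lower), median(ordered), median(upper)

data = [12, 15, 16, 16, 17, 18, 22, 22, 23, 24,
        25, 30, 32, 33, 33, 34, 41, 45, 51]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)   # 17 24 33
print(q3 - q1)      # 16 (interquartile range)
```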
Measures of Variation
• Range
• Standard Deviation
• Variance—but we won’t talk about this
The Range
the difference between the largest and smallest values of a
distribution
Find the range:
10, 13, 17, 17, 18
The range = largest minus smallest
= 18 minus 10 = 8
The standard deviation
a measure of the average variation of the data entries from the mean
Standard Deviation
• Tells us how much data entries differ from the mean
• Why do we care? Can’t we just calculate the mean and the range?
Standard Deviation—why?
• Suppose 2 data sets:
• 1, 4, 4, 5, 6, 7, 8, 9, 10; range = 10 - 1 = 9
• Mean = 54/9 = 6
• Or
• 1, 2, 5, 6, 7, 7, 7, 9, 10; range = 10 - 1 = 9
• Mean = 54/9 = 6
• Data sets are different, but the mean and range are the same.
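To make the comparison concrete, the sketch below computes the mean, range, and sample standard deviation of both data sets with Python's standard `statistics` module. The labels A and B are added here for illustration; `statistics.stdev` divides by n - 1, the sample version.

```python
import statistics

set_a = [1, 4, 4, 5, 6, 7, 8, 9, 10]
set_b = [1, 2, 5, 6, 7, 7, 7, 9, 10]

for name, data in (("A", set_a), ("B", set_b)):
    mean = statistics.mean(data)
    data_range = max(data) - min(data)      # largest minus smallest
    s = statistics.stdev(data)              # sample standard deviation
    print(f"Set {name}: mean={mean}, range={data_range}, s={s:.2f}")
```

Both sets report a mean of 6 and a range of 9, but the standard deviations differ (about 2.83 for A and about 2.96 for B), so the standard deviation picks up a difference that the mean and range miss.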
Standard Deviation
• Knowing HOW the data are arranged (distributed) tells us more than the mean and range.
• A lot of variability, or not very much variability?
• Especially important in large data sets where it may be impossible to ‘eyeball’ the variability.
Standard deviation of a sample
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
n = sample size
$\bar{x}$ = mean of the sample
To calculate standard deviation of a sample
• Calculate the mean of the sample.
• Find the difference between each entry (x) and the mean. These differences will add up to zero.
• Square the deviations from the mean.
• Sum the squares of the deviations from the mean.
• Divide the sum by (n - 1) to get the variance.
• Take the square root of the variance to get the standard deviation.
Find the standard deviation
        x    $x - \bar{x}$    $(x - \bar{x})^2$
       30          4                 16
       26          0                  0
       22         -4                 16
Sum    78          0                 32

mean = 78/3 = 26
Sum of the deviations = 0
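The steps for the sample standard deviation can be followed one at a time in a short Python sketch. The function name is illustrative; the data are the three values 30, 26, 22 from this example, and the final result is computed by the sketch rather than taken from the slides.

```python
import math

def sample_standard_deviation(data):
    """Follow the listed steps: mean, deviations, squares, sum, / (n - 1), sqrt."""
    n = len(data)
    mean = sum(data) / n                         # step 1: mean of the sample
    deviations = [x - mean for x in data]        # step 2: these add up to zero
    squares = [d ** 2 for d in deviations]       # step 3: square the deviations
    sum_of_squares = sum(squares)                # step 4: sum the squares
    variance = sum_of_squares / (n - 1)          # step 5: divide by (n - 1)
    return math.sqrt(variance)                   # step 6: square root

print(sample_standard_deviation([30, 26, 22]))   # 4.0
```

For these values the sum of squares is 32, so the variance is 32 / (3 - 1) = 16 and the standard deviation is 4.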