Making Sense of Statistics
-
8/13/2019 Making Sense of Statistics
1/39
by
Jason Samuels, CUNY-BMCC
AMATYC 39, 2013-11-2
-
Students Don't Get Statistics
After years of Algebra courses, Statistics requires a very different way of thinking
What's the formula?
Some steps require a formula
e.g. find the z-score
Some steps don't
e.g. find the z-score
wait, what?
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}}$
-
Which Topics Can Be Unified?
Doing calculations with standard data distributions
Find the data value, z-score, probability
Normal distribution, t-distribution, etc.
Confidence intervals
Hypothesis tests
Some ideas so these topics make sense to students
-
Key Idea #1: Describe the distribution
Orients the students toward the values they will use in the problem and in their calculations
Describe the distribution of the data:
Center (mean)
Spread (standard deviation)
Shape (which distribution: normal, t, etc.)
-
Describe the distribution: an example
Ex) A college has an average of 23.7 students in each class, with a standard deviation of 5.6. What is the probability that a sample of 35 classes has an average of more than 25?
Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$; want $P(\bar{x} > 25)$
Describe the distribution of $\bar{x}$
Mean: $\mu_{\bar{x}} = \mu = 23.7$
Standard deviation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{5.6}{\sqrt{35}} = 0.95$
Shape: n>30 so it's normal
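A minimal Python sketch of this "describe the distribution" step, assuming the facts from the class-size example. The function name `describe_sample_mean` is made up for illustration and is not part of the talk.

```python
import math

def describe_sample_mean(mu, sigma, n):
    """Describe the distribution of the sample mean (hypothetical helper)."""
    center = mu                    # mean of x-bar equals the population mean
    spread = sigma / math.sqrt(n)  # standard deviation of x-bar
    # Rule of thumb used on the slide: n > 30 means the shape is treated as normal
    shape = "normal" if n > 30 else "check the population shape"
    return center, spread, shape

center, spread, shape = describe_sample_mean(mu=23.7, sigma=5.6, n=35)
print(center, round(spread, 2), shape)
```

The three returned values are the ones carried into the graph and the later calculations.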
-
Key Idea #2: Draw the Graph
All values can be organized and connected using one graph:
-
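Draw the graph: example continued
From before
Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$; want $P(\bar{x} > 25)$
Describe the distribution of $\bar{x}$: normal, $\mu_{\bar{x}} = 23.7$, $\sigma_{\bar{x}} = 0.95$
Now draw the graph:
[Graph: normal curve with an $\bar{x}$ axis marked at 23.7 and 25, and a z axis alongside it]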
-
Key Idea #3: The Flow Chart
Almost every calculation students will do with standard distributions is guided by this flow chart:
-
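Key Idea #4: The Formula
There is only one formula students need to know:
$\text{test statistic} = \frac{(\text{data value}) - (\text{mean})}{\text{standard deviation}}$
Or, equivalently:
data value = (mean) + (test statistic)(standard deviation)
For a single data value: $z = \frac{x - \mu}{\sigma}$ ...or... $x = \mu + z\sigma$
For a sample mean: $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$ ...or... $\bar{x} = \mu_{\bar{x}} + z\sigma_{\bar{x}}$
For a sample proportion: $z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}}$ ...or... $\hat{p} = \mu_{\hat{p}} + z\sigma_{\hat{p}}$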
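The single formula and its inverse can be sketched in Python as two small functions. The names `standardize` and `unstandardize` are illustrative assumptions, not terms from the talk. The same pair works for a single value, a sample mean, or a sample proportion, as long as the matching mean and standard deviation are passed in.

```python
def standardize(data_value, mean, std_dev):
    """test statistic = (data value - mean) / standard deviation"""
    return (data_value - mean) / std_dev

def unstandardize(test_statistic, mean, std_dev):
    """data value = mean + (test statistic)(standard deviation)"""
    return mean + test_statistic * std_dev

# Sample-mean case from the class-size example: mean 23.7, standard deviation 0.95
z = standardize(25, mean=23.7, std_dev=0.95)
x_bar = unstandardize(z, mean=23.7, std_dev=0.95)
print(round(z, 2), round(x_bar, 1))
```

Going forward and then back returns the original data value, which is the sense in which the two forms are equivalent.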
-
Benefit
Students learn that z has one meaning (the number of standard deviations from the mean), so z has one formula
Never again will students use these varied, complex formulas:
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}}$
Students make fewer order-of-operation calculation errors
-
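The formula: example continued
From before
Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$; want $P(\bar{x} > 25)$
Describe the distribution of $\bar{x}$: normal, $\mu_{\bar{x}} = 23.7$, $\sigma_{\bar{x}} = \frac{5.6}{\sqrt{35}} = 0.95$
Now:
Find the z-score:
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{25 - 23.7}{0.95} = 1.37$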
-
Flow Chart & Graph - together
Probability
Z-score
Data value
-
Flow Chart & Graph - example continued
Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$; want $P(\bar{x} > 25)$
Describe the distribution of $\bar{x}$: normal, $\mu_{\bar{x}} = 23.7$, $\sigma_{\bar{x}} = 0.95$
Now, fill in the graph following the flowchart:
Probability: .9147 (left), .0853 (right)
Z-score: 1.37
Data value: 25
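The flow chart's three levels (data value, z-score, probability) can be walked in either direction. The sketch below assumes Python's standard-library `statistics.NormalDist` stands in for the table or technology. The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def probability_from_data_value(x, mean, sd):
    """Data value -> z-score -> probability (area to the left)."""
    z = (x - mean) / sd
    return z, STANDARD_NORMAL.cdf(z)

def data_value_from_probability(area_left, mean, sd):
    """Probability (area to the left) -> z-score -> data value."""
    z = STANDARD_NORMAL.inv_cdf(area_left)
    return z, mean + z * sd

# Class-size example: data value 25, mean 23.7, standard deviation 0.95
z, left = probability_from_data_value(25, mean=23.7, sd=0.95)
right = 1 - left
z_back, x_back = data_value_from_probability(left, mean=23.7, sd=0.95)
```

Because the code keeps the unrounded z-score, the areas differ from the slide's .9147 and .0853 in the later decimal places. The slide rounds z to 1.37 before looking up the area.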
-
Putting it together: an exercise
The mean time for all flight delays is 21 minutes with a standard deviation of 12 minutes. What is the probability that a sample of 36 flights has a mean delay of more than 26 minutes?
-
Putting it together: an exercise
Step 1: Get the facts
$\mu = 21$
$\sigma = 12$
$n = 36$
(1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$; find $P(\bar{x} > 26)$
(2) Describe the distribution:
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
Putting it together: an exercise
Step 2: Describe the distribution
Center:
mean $\mu_{\bar{x}} = \mu = 21$
Spread:
standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{36}} = \frac{12}{6} = 2$
Shape:
n>30, so the distribution is normal
(1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$; find $P(\bar{x} > 26)$
(2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, Normal
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
Putting it together: an exercise
Step 3: Draw the graph
(1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$; find $P(\bar{x} > 26)$
(2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, Normal
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
Putting it together: an exercise
Step 4: Do the calculations
z-score:
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{26 - 21}{2} = 2.5$
Areas: using technology
area to the left = .9937
area to the right = .0063
(1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$; find $P(\bar{x} > 26)$
(2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, Normal
(3) Draw the graph:
(4) Do the calculations: z = 2.5, areas: .9937 & .0063
(5) Conclusion:
-
Putting it together: an exercise
Step 5: Write the conclusion
The probability is .0063
(1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$; find $P(\bar{x} > 26)$
(2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, Normal
(3) Draw the graph:
(4) Do the calculations: z = 2.5, areas: .9937 & .0063
(5) Conclusion: The probability is .0063
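For readers who want to check the exercise with software, here is a rough Python sketch of the five steps (the graph step is skipped). It assumes the standard-library `statistics.NormalDist` as the "technology", and the variable names are illustrative only.

```python
import math
from statistics import NormalDist

# (1) Get the facts
mu, sigma, n, cutoff = 21, 12, 36, 26

# (2) Describe the distribution of the sample mean
mean_xbar = mu
sd_xbar = sigma / math.sqrt(n)      # 12 / 6 = 2

# (4) Do the calculations
z = (cutoff - mean_xbar) / sd_xbar  # (26 - 21) / 2 = 2.5
area_left = NormalDist().cdf(z)
area_right = 1 - area_left

# (5) Conclusion
print(f"z = {z}, P(x-bar > {cutoff}) is about {area_right:.4f}")
```

The computed right-hand area agrees with the slide's .0063 to three decimal places. Any difference in the fourth decimal comes from rounding.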
-
A harder exercise (that's not harder)
Ex) For United, the mean delay time is 18 minutes, st.dev.=11 minutes. For Delta, the mean delay time is 22 minutes, st.dev.=14 minutes. Find the probability that, for a sample of 32 United flights and 34 Delta flights, Delta has a higher mean delay time by over 2 minutes.
-
A (not) harder exercise
Step 1: Get the facts
Delta:
$\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$
United:
$\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$
Find $P(\bar{x}_1 - \bar{x}_2 > 2)$
(1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
(2) Describe the distribution:
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
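A (not) harder exercise
Step 2: Describe the distribution
Center
Mean: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 = 22 - 18 = 4$
Spread
Standard deviation: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ or $\sqrt{(\sigma_{\bar{x}_1})^2 + (\sigma_{\bar{x}_2})^2}$
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{14^2}{34} + \frac{11^2}{32}} = 3.09$
Shape
n1, n2 > 30 so it's normal
(1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
(2) Describe the distribution: $\bar{x}_1 - \bar{x}_2$ is Normal, $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion: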
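The two equivalent forms of the spread for a difference of sample means can be compared numerically. This is only a sketch with made-up function names, using the Delta and United facts from the exercise.

```python
import math

def sd_of_difference(sigma1, n1, sigma2, n2):
    """sqrt(sigma1^2/n1 + sigma2^2/n2)"""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def sd_of_difference_from_parts(sd_xbar1, sd_xbar2):
    """sqrt(sd_xbar1^2 + sd_xbar2^2), built from each sample mean's own spread"""
    return math.sqrt(sd_xbar1**2 + sd_xbar2**2)

direct = sd_of_difference(14, 34, 11, 32)
from_parts = sd_of_difference_from_parts(14 / math.sqrt(34), 11 / math.sqrt(32))
print(round(direct, 2), round(from_parts, 2))
```

Both routes give the same spread, so students can first describe each sample mean separately and then combine them.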
-
A (not) harder exercise
Step 3: Draw the graph
(1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
(2) Describe the distribution: $\bar{x}_1 - \bar{x}_2$ is Normal, $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
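A (not) harder exercise
Step 4: Do the calculations
z-score:
$z = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_{\bar{x}_1 - \bar{x}_2}}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{2 - 4}{3.09} = -0.65$
Areas:
area to the left = .2587
area to the right = .7413
(1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
(2) Describe the distribution: $\bar{x}_1 - \bar{x}_2$ is Normal, $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$
(3) Draw the graph:
(4) Do the calculations: z = -0.65, areas: .2587 & .7413
(5) Conclusion: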
-
A (not) harder exercise
Step 5: Write the conclusion
The probability is .7413
(1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
(2) Describe the distribution: $\bar{x}_1 - \bar{x}_2$ is Normal, $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$
(3) Draw the graph:
(4) Do the calculations: z = -0.65, areas: .2587 & .7413
(5) Conclusion: The probability is .7413
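The calculation step of this exercise uses the same one formula as before, with only the center and spread changed. A short Python check follows, assuming the rounded center 4 and spread 3.09 from the slides and the standard-library `statistics.NormalDist` as the technology.

```python
from statistics import NormalDist

mean_diff = 4    # center of the distribution of x-bar1 - x-bar2
sd_diff = 3.09   # spread, rounded as on the slide
cutoff = 2       # want P(x-bar1 - x-bar2 > 2)

# Same formula as always: (data value - mean) / standard deviation
z = (cutoff - mean_diff) / sd_diff
area_left = NormalDist().cdf(z)
area_right = 1 - area_left
print(round(z, 2), round(area_left, 4), round(area_right, 4))
```

The z-score is negative because the cutoff of 2 lies below the center of 4, which is why the area to the right is the larger one.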
-
A Handy Tool: StatDisk
Does all basic statistics calculations with a simple graphical interface and one or two clicks
-
The Issue of the Center
First students learn that they know $\mu$, this defines the center of the distribution, and x (the value from the data) exists relative to that
In the case of inference (confidence intervals and hypothesis tests), $\mu$ (or p) is not known. Rather, we know $\bar{x}$ (or $\hat{p}$) and make an inference about $\mu$ (or p).
What does this mean for the distribution, and the graph?
-
The Issue of the Center: Confidence Interval
Formula: $(\bar{x} - z\sigma_{\bar{x}},\ \bar{x} + z\sigma_{\bar{x}})$
What does this imply for the graph?
The center is $\bar{x}$, not $\mu$! We are calculating values for $\mu$, not $\bar{x}$
With confidence intervals we just use the formula and ignore it
With hypothesis tests, the issue does not go away
[Graph: axis marked at $\bar{x} - z\sigma_{\bar{x}}$, $\bar{x}$, and $\bar{x} + z\sigma_{\bar{x}}$]
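To make the "center is the sample mean" point concrete, here is a hedged Python sketch of the confidence interval formula. The function name is hypothetical, the 95% level is an assumption chosen for illustration, and the numbers (sample mean 1.92, standard deviation 0.9, n = 500) are borrowed from the family-size exercise later in this talk.

```python
import math
from statistics import NormalDist

def confidence_interval(x_bar, sd, n, confidence=0.95):
    """Return (x_bar - z*sd_xbar, x_bar + z*sd_xbar), centered on the sample mean."""
    sd_xbar = sd / math.sqrt(n)
    # z that leaves (1 - confidence)/2 in each tail of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return x_bar - z * sd_xbar, x_bar + z * sd_xbar

low, high = confidence_interval(x_bar=1.92, sd=0.9, n=500)
print(round(low, 3), round(high, 3))
```

The interval is symmetric about the sample mean, not about the unknown population mean.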
-
The Issue of the Center: Hypothesis Test
Old way:
Ho: $\mu = \mu_0$
H1: $\mu > \mu_0$
and you spend all this time explaining why, even though the hypothesis says $\mu > \mu_0$, you shade to the right of $\bar{x}$
(and I think students still don't understand, they just do it)
[Graph: axis labels $\mu_0$ and $\bar{x}$]
-
Recognizing a Different Center: Hypothesis Test
New way:
Ho: $\mu = \mu_0$
H1: $\mu > \mu_0$
and now you shade where the claim tells you to shade, and that area is your confidence level
[Graph: axis labels $\bar{x}$ and $\mu_0$]
-
Why This Makes Sense
Shaded area matches the claim
Hypothesis tests and confidence intervals are both inferences about the population, and they should agree
We are using a distribution of values for $\mu$
The center is $\bar{x}$
What does confidence mean? It's a type of probabilistic statement
95% of the time, a conclusion made in this way will be correct
-
Different center: an exercise
Ex) We want to find out if the average American family has more than 1.8 kids (because that places a strain on municipal services). From a survey of 500 families, the mean is 1.92 and st.dev.=0.9.
-
Different center: an exercise
Step 1: Get the facts
$\bar{x} = 1.92$, st.dev.=0.9, n=500
claim: $\mu > 1.8$
(1) Get the facts: $\bar{x} = 1.92$, st.dev.=0.9, n=500; test claim: $\mu > 1.8$
(2) Describe the distribution:
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
Different center: an exercise
Step 2: Describe the distribution of $\mu$
Center:
mean = $\bar{x}$ = 1.92
Spread:
$\text{st.dev.} = \frac{0.9}{\sqrt{n}} = \frac{0.9}{\sqrt{500}} = .0402$
Shape:
n>30 so it's normal
(1) Get the facts: $\bar{x} = 1.92$, st.dev.=0.9, n=500; test claim: $\mu > 1.8$
(2) Describe the distribution: mean=1.92, stdev=.0402, normal
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
Different center: an exercise
Step 3: Draw the graph
(1) Get the facts: $\bar{x} = 1.92$, st.dev.=0.9, n=500; test claim: $\mu > 1.8$
(2) Describe the distribution: mean=1.92, stdev=.0402, normal
(3) Draw the graph:
(4) Do the calculations:
(5) Conclusion:
-
Different center: an exercise
Step 4: Do the calculations
z-score:
$z = \frac{1.8 - 1.92}{.0402} = -2.99$
Areas:
area to the left = .0014
area to the right = .9986
(1) Get the facts: $\bar{x} = 1.92$, st.dev.=0.9, n=500; test claim: $\mu > 1.8$
(2) Describe the distribution: mean=1.92, stdev=.0402, normal
(3) Draw the graph:
(4) Do the calculations: z = -2.99, areas: .0014 & .9986
(5) Conclusion:
-
Different center: an exercise
Step 5: Write the conclusion
We are .9986 confident in the claim (that the average American family has more than 1.8 children)
(1) Get the facts: $\bar{x} = 1.92$, st.dev.=0.9, n=500; test claim: $\mu > 1.8$
(2) Describe the distribution: mean=1.92, stdev=.0402, normal
(3) Draw the graph:
(4) Do the calculations: z = -2.99, areas: .0014 & .9986
(5) Conclusion: We have .9986 confidence that $\mu > 1.8$
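The "different center" calculation can be sketched in Python as follows. This assumes the approach described in this talk (center at the sample mean, shade where the claim points) and uses the standard-library `statistics.NormalDist`. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# (1) Get the facts
x_bar, sd, n = 1.92, 0.9, 500
claimed_value = 1.8             # claim: mu > 1.8

# (2) Describe the distribution: centered at the sample mean
center = x_bar
spread = sd / math.sqrt(n)      # about .0402

# (4) Do the calculations: the data value is the claimed value
z = (claimed_value - center) / spread
area_left = NormalDist().cdf(z)
area_right = 1 - area_left      # area where mu > 1.8, matching the claim

# (5) Conclusion
print(f"z = {z:.2f}, confidence that mu > {claimed_value}: {area_right:.4f}")
```

Because the code keeps the unrounded spread, z comes out near -2.98 rather than the slide's -2.99, while the areas agree with the slide to four decimal places.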
-
Big Changes
All the formulas for the test statistic flip
For means
the center is $\bar{x}$
The formula for z is: $z = \frac{\mu_0 - \bar{x}}{\text{s.d.}}$
For proportions
the center is $\hat{p}$
The formula for z is: $z = \frac{p_0 - \hat{p}}{\text{s.d.}}$
These are equivalent to the confidence interval formulas (just solve for $\mu_0$) so we already used them without knowing it
The formulas for x & z (given population info) were inverses;
Now the formulas for $\mu$ and z from inference (confidence intervals & hypothesis tests) are inverses, as they should be
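The inverse relationship between the flipped test statistic and the confidence interval endpoints can be checked numerically. This is a sketch with hypothetical helper names, reusing the family-size numbers from the exercise, and assuming a 95% interval for illustration.

```python
import math
from statistics import NormalDist

x_bar = 1.92
sd_xbar = 0.9 / math.sqrt(500)

def z_from_mu0(mu0):
    """Flipped test statistic: z = (mu0 - x_bar) / s.d."""
    return (mu0 - x_bar) / sd_xbar

def mu0_from_z(z):
    """Solve the same formula for mu0: mu0 = x_bar + z * s.d."""
    return x_bar + z * sd_xbar

# Critical z for an assumed 95% confidence interval
z_star = NormalDist().inv_cdf(0.975)
low, high = mu0_from_z(-z_star), mu0_from_z(z_star)

# Feeding an endpoint back into the test statistic returns the critical z
print(round(z_from_mu0(low), 2), round(z_from_mu0(high), 2))
```

The interval endpoints are exactly the hypothesized values whose z-scores equal the critical values, which is the agreement between the two kinds of inference that the slide points to.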
-
Jason [email protected]