Stat 20: Intro to Probability and Statistics
Lecture 16: More Box Models
Tessa L. Childers-Day
UC Berkeley
22 July 2014
Today’s Goals EV and SE Normal Curve Classifying and Counting
By the end of this lecture...
You will be able to:
Determine what we expect the sum of draws from a box to be, and how far off we will likely be
Quickly calculate the SD of a list with only two kinds of numbers
Easily calculate probabilities for sums of draws
Use a box model to address more kinds of problems, e.g. counting the number of “6”s shown in a series of throws
Recap: Box Models
Box models are useful in analyzing games of chance
Draw a box
Indicate the number and kind of tickets
Indicate the number and kind of draws
Indicate what is done with each ticket
Examined minimum and maximum of sum of draws
Example 1: Box Model
Have a box with three tickets: a “1”, a “2”, and a “3”
Draw 5 times, with replacement
Add together the values seen on each ticket
What is the sum of the draws?
How much does each draw contribute to the sum?
What can we reasonably expect the sum to be?
The Expected Value
The Expected Value (EV) for the sum of the draws from the box is
# of draws × average of the box
The sum of draws from a box (with replacement) should be somewhere around the expected value.
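As a small illustration of the formula above, the following Python sketch computes the EV for the box from Example 1 (tickets “1”, “2”, “3”; 5 draws with replacement). The function name `expected_value_of_sum` is an illustrative choice, not something from the lecture.

```python
def expected_value_of_sum(box, num_draws):
    """EV for the sum of draws with replacement: # of draws x average of the box."""
    box_average = sum(box) / len(box)
    return num_draws * box_average


# Box from Example 1: one "1", one "2", one "3"; 5 draws with replacement.
ev = expected_value_of_sum([1, 2, 3], 5)
print(ev)  # 10.0
```

Each draw contributes about 2 (the box average) to the sum, so 5 draws should total somewhere around 10.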
Example 2: Rolling Dice
You are playing a dice game. It costs $1 per play. You roll a die, and if it shows an even number, you win $3. If it is odd, you win nothing. About how much do you expect to win or lose in 50 plays?
1 Draw a box model, indicating the number and kind of tickets
2 Indicate the number and kind of draws
3 Indicate what is done with each ticket
4 Answer the question above
Example 3: Coin Flipping
You are playing a coin flipping game. It costs $1 to play. You flip two coins, and if at least one head is showing, you win $2. Otherwise, you win nothing.
True or False, and Explain: If you play 30 times, you will definitely win $10.
Chance Error
Variation around expected value is due to chance error
chance error = # observed – # expected
If I actually win $5, what is my chance error? What if I lose $15?
How big is my chance error likely to be?
Standard Error
The Standard Error (SE) for the sum of the draws from the box is
\[ \sqrt{\text{\# of draws}} \times \text{SD of the box} \]
The sum of draws from a box (with replacement) should be somewhere around the expected value, give or take a SE.
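A minimal Python sketch of the SE formula, again using the Example 1 box. It assumes the SD of the box is the "divide by the number of tickets" SD, which is what `statistics.pstdev` from the standard library computes; the function name `standard_error_of_sum` is hypothetical.

```python
import math
import statistics


def standard_error_of_sum(box, num_draws):
    """SE for the sum of draws with replacement: sqrt(# of draws) x SD of the box."""
    # pstdev divides by the number of tickets (population SD), not by n - 1.
    return math.sqrt(num_draws) * statistics.pstdev(box)


# Box from Example 1: tickets 1, 2, 3; 5 draws with replacement.
se = standard_error_of_sum([1, 2, 3], 5)
print(round(se, 2))  # 1.83
```

Note that only the number of draws goes under the square root; the SD of the box is multiplied in afterwards.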
Example 4: Two Boxes
What kind of variability do we expect from the sum of 5 draws, with replacement, from a box with:
1 A single “1” and a single “3”?
2 A single “1” and a single “10”?
Calculate the EV and SE of both of these situations.
SD Shortcut
We will obviously be calculating a lot of SDs. The usual procedure:
1 Find average
2 Find the deviations from average
3 Square the deviations from average
4 Average the squared deviations
5 Take the square root
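The five steps above can be sketched directly in Python. The function name `sd_by_steps` is an illustrative choice; the two boxes used are the ones from Example 4.

```python
import math


def sd_by_steps(values):
    """SD of a list, following the five steps one at a time."""
    average = sum(values) / len(values)             # 1. find the average
    deviations = [v - average for v in values]      # 2. deviations from average
    squared = [d ** 2 for d in deviations]          # 3. square the deviations
    mean_square = sum(squared) / len(squared)       # 4. average the squared deviations
    return math.sqrt(mean_square)                   # 5. take the square root


print(sd_by_steps([1, 3]))   # 1.0
print(sd_by_steps([1, 10]))  # 4.5
```

The second box has the larger SD, so sums of draws from it will vary more.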
SD Shortcut (cont.)
If there are only two types of tickets in the box (or only two types of numbers in the list):
1 Call the larger number the “big #” and the smaller number the “small #”
2 Call the fraction of larger numbers “b.f.” and the fraction of smaller numbers “s.f.”
3 Calculate
\[ \text{SD} = (\text{big \#} - \text{small \#}) \times \sqrt{\text{b.f.} \times \text{s.f.}} \]
SD Shortcut (cont.)
Let’s look at a box with three “2”s and two “1”s
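avg = 1.6
The long way:
\[ \text{SD} = \sqrt{\frac{(2-1.6)^2 + (2-1.6)^2 + (2-1.6)^2 + (1-1.6)^2 + (1-1.6)^2}{5}} = \sqrt{\frac{(0.4)^2 + (0.4)^2 + (0.4)^2 + (-0.6)^2 + (-0.6)^2}{5}} \approx 0.49 \]
The shortcut:
\[ \text{SD} = (2-1) \times \sqrt{\frac{3}{5} \times \frac{2}{5}} \approx 0.49 \]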
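To check that the shortcut and the five-step calculation agree for this box, here is a short Python sketch. Both function names are hypothetical, and `sd_shortcut` assumes the list contains exactly two distinct values.

```python
import math


def sd_long_way(values):
    """Root of the average squared deviation from the average."""
    average = sum(values) / len(values)
    return math.sqrt(sum((v - average) ** 2 for v in values) / len(values))


def sd_shortcut(values):
    """(big # - small #) x sqrt(b.f. x s.f.); valid only for two kinds of numbers."""
    big, small = max(values), min(values)
    big_fraction = values.count(big) / len(values)
    small_fraction = values.count(small) / len(values)
    return (big - small) * math.sqrt(big_fraction * small_fraction)


box = [2, 2, 2, 1, 1]  # three "2"s and two "1"s
long_way = sd_long_way(box)
shortcut = sd_shortcut(box)
print(round(long_way, 2), round(shortcut, 2))  # 0.49 0.49
```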
Lists vs. Chance Processes
List of numbers (tickets in a box), all values are known
mean = average = sum of values, divided by number of values; the typical size of an entry/ticket
SD = standard deviation = square root of the average of the squared deviations from the mean; the typical size of the deviation from the mean in a single entry
The typical entry in a list is around average, give or take a standard deviation or so.
Lists vs. Chance Processes (cont.)
Chance process (draws from a box), values are unknown
EV for sum of draws with replacement = number of draws times average of box; typical size of the sum of draws with replacement
SE for sum of draws with replacement = standard error = (square root of the number of draws) times SD of box; typical size of deviation from EV in a single sum of draws
The sum of draws with replacement is around expected value, give or take a standard error or so.
Example 5: Drawing from a Box
50 draws are taken, with replacement, from a box with 1 each of the following: “1”, “2”, “3”, “6”, “8”
1 Calculate the expected value and standard error for the sum of the draws.
2 The sum of the draws will be around ______, give or take ______ or so.
3 Someone actually makes 50 draws with replacement. You are asked to guess what the sum is. Do you think your guess is off by about 2, 12, or 20?
4 You are told that 175 is the sum. Fill in the following:
(a) expected value =
(b) observed value =
(c) chance error =
(d) standard error =
Interesting Question:
What is the chance that using the box above (1 each of: “1”, “2”, “3”, “6”, “8”), the sum of 1000 draws is between 3900 and 4100?
Could you find a similar probability for a much smaller number of draws?
Recalling the frequency definition of probability, could you find this probability?
Interesting Question: (cont.)
Draw 1000 tickets with replacement, calculate the sum of the tickets
Do this 10 times, record the proportion of sums (out of 10) that are between 3900 and 4100
Do this 100 times, record the proportion of sums (out of 100) that are between 3900 and 4100
[Figure: Relative Proportion of Observed Sums of 1000 Draws Between 3900 and 4100. Horizontal axis: Number of Observed Sums (0 to 10000). Vertical axis: Relative Proportion Between 3900 and 4100 (0.75 to 0.90).]
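The repeated-sampling experiment described on this slide can be sketched in Python with the standard `random` module. This is a simulation under stated assumptions, not the code used to produce the lecture's figure: the seed is arbitrary, "between 3900 and 4100" is taken to include both endpoints, and the function names are hypothetical.

```python
import random

BOX = [1, 2, 3, 6, 8]


def sum_of_draws(box, num_draws, rng):
    """Draw num_draws tickets with replacement and add them up."""
    return sum(rng.choices(box, k=num_draws))


def proportion_in_range(num_sums, low, high, rng, box=BOX, num_draws=1000):
    """Fraction of num_sums observed sums that land in [low, high] (inclusive)."""
    hits = 0
    for _ in range(num_sums):
        if low <= sum_of_draws(box, num_draws, rng) <= high:
            hits += 1
    return hits / num_sums


rng = random.Random(20)  # arbitrary seed, only so that reruns give the same numbers
for num_sums in (10, 100, 1000):
    print(num_sums, proportion_in_range(num_sums, 3900, 4100, rng))
```

With few observed sums the proportion jumps around; as the number of observed sums grows it settles down, which is the frequency definition of probability at work.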
Interesting Question: (cont.)
Draw 1000 tickets with replacement, calculate the sum of the tickets
Do this 200 times, record the proportion of sums (out of 200) that are between 3900 and 4100
Do this ______ times, record the proportion of sums that are between 3900 and 4100
[Figure: Relative Proportion of Observed Sums of 1000 Draws Between 3900 and 4100. Horizontal axis: Number of Observed Sums (0 to 10000). Vertical axis: Relative Proportion Between 3900 and 4100 (0.75 to 0.90).]
Interesting Question: (cont.)
Draw 1000 tickets with replacement, calculate the sum of the tickets
Do this 10,000 times, make a histogram of the sum
The histogram looks normal
[Figure: Histogram of 10,000 Observed Sums of 1,000 Draws From The Box. Horizontal axis: Sum of 1,000 Draws (3700 to 4300). Vertical axis: Density (0.000 to 0.004).]
Interesting Question: (cont.)
We can use the approximate normality of this curve to calculate the chance that the sum of 1,000 draws is between 3900 and 4100.
\[ z = \frac{\text{value of sum} - \text{expected value of sum}}{\text{standard error of sum}} \]
Use the normal table to find the chance that the sum of 1000 draws from the box (“1”, “2”, “3”, “6”, “8”) is between 3900 and 4100.
In general, the normal curve can be used to calculate probabilities for sums of random draws with replacement from a box, when the number of draws is “large”
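The same calculation can be sketched in Python. Note the substitution: instead of looking up areas in the normal table, this sketch uses `statistics.NormalDist` from the standard library to get areas under the standard normal curve, so the answer may differ slightly from a table lookup with rounded z values. Variable names are illustrative.

```python
import math
from statistics import NormalDist, pstdev

box = [1, 2, 3, 6, 8]
num_draws = 1000

# EV and SE for the sum of 1000 draws with replacement.
ev = num_draws * sum(box) / len(box)
se = math.sqrt(num_draws) * pstdev(box)  # pstdev divides by the number of tickets

# Convert both endpoints to standard units.
z_low = (3900 - ev) / se
z_high = (4100 - ev) / se

# Area under the standard normal curve between the two z values.
standard_normal = NormalDist()
chance = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(round(ev), round(se, 1), round(z_high, 2), round(chance, 2))
# 4000 82.5 1.21 0.77
```

So the chance is roughly 77%, which is consistent with the range of proportions shown on the simulation plots.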
Example 6: Using the Normal Curve
A fair die is thrown 200 times.
1 Calculate the expected value and standard error for the sum of the throws
2 The sum of the throws will be around ______, give or take ______ or so.
3 Find the probability that the sum of the throws is greater than 647.
Example 7: Counting the Evens
A fair die is thrown 600 times.
1 The sum of the throws will be around 2100, give or take 42 or so.
2 The sum of evens thrown will be around ______, give or take ______ or so.
3 The number of evens thrown will be around ______, give or take ______ or so.
Example 7: Counting the Evens (cont.)
Let’s simplify, and assume a fair die is thrown 5 times. I roll
6 2 4 5 3
If I am making a sum of throws, I will add:
6 + 2 + 4 + 5 + 3 = 20
If I am making a sum of evens thrown, I will add:
6 + 2 + 4 + 0 + 0 = 12
If I am counting the number of evens thrown, I will add:
1 + 1 + 1 + 0 + 0 = 3
Each process can be written as a sum!
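The three sums above can be written as one pattern in Python: relabel each roll, then add. This is only an illustration of the slide's arithmetic; the helper name `is_even` is an assumption.

```python
rolls = [6, 2, 4, 5, 3]


def is_even(face):
    return face % 2 == 0


# Sum of throws: every roll contributes its own value.
sum_of_throws = sum(rolls)

# Sum of evens: an even roll contributes its value, an odd roll contributes 0.
sum_of_evens = sum(face if is_even(face) else 0 for face in rolls)

# Number of evens: an even roll contributes 1, an odd roll contributes 0.
number_of_evens = sum(1 if is_even(face) else 0 for face in rolls)

print(sum_of_throws, sum_of_evens, number_of_evens)  # 20 12 3
```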
Example 7: Counting the Evens (cont.)
Strategy for adding only certain things or counting number of things:
1 Make the box describing the basic chance process
2 Formulate your desired quantity as a sum
3 Change the value of the tickets (but not the number of tickets!) to add as appropriate
Example 7: Counting the Evens (cont.)
A fair die is thrown 600 times. Box is (“1”, “2”, “3”, “4”, “5”, “6”). Have 600 throws, with replacement
1 Sum of throws is like sum of draws from box above, usual EV and SE formulas apply
2 Sum of evens: each even drawn adds itself to the sum. Each odd drawn adds 0 to the sum. Box becomes (“0”, “2”, “0”, “4”, “0”, “6”). Sum of evens is like sum of draws from this box, usual EV and SE formulas apply
3 Number of evens: each even drawn adds 1 to the count (sum). Each odd drawn adds 0 to the count (sum). Box becomes (“0”, “1”, “0”, “1”, “0”, “1”). Number of evens is like sum of draws from this box, usual EV and SE formulas apply.
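Since all three quantities are sums of draws from a (relabeled) box, one function can handle them. The sketch below applies the usual EV and SE formulas to the three boxes listed above; the names `ev_and_se`, `boxes`, and `results` are hypothetical.

```python
import math
from statistics import pstdev

NUM_THROWS = 600

# Same six tickets each time; only the values written on them change.
boxes = {
    "sum of throws": [1, 2, 3, 4, 5, 6],
    "sum of evens": [0, 2, 0, 4, 0, 6],
    "number of evens": [0, 1, 0, 1, 0, 1],
}


def ev_and_se(box, num_draws):
    """Usual formulas for the sum of draws with replacement."""
    ev = num_draws * sum(box) / len(box)
    se = math.sqrt(num_draws) * pstdev(box)  # SD with divisor = number of tickets
    return ev, se


results = {name: ev_and_se(box, NUM_THROWS) for name, box in boxes.items()}
for name, (ev, se) in results.items():
    print(f"{name}: around {ev:.0f}, give or take {se:.0f} or so")
```

The first line of output reproduces the "around 2100, give or take 42 or so" stated earlier for the sum of the throws.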
In a 0-1 Box
The “1”s represent the event(s) that we wish to count, the“0”s represent the event(s) that we do not wish to count
The EV for the sum = number of draws × average of box.But what is the average of the box?
The SE for the sum = \(\sqrt{\text{number of draws}} \times \text{SD of box}\). But what is the SD of the box?
The normal curve can be used to calculate chances for sumsof draws: new avg = EV for sum, new SD = SE for sum
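One way to answer the two questions above is to apply the SD shortcut to a 0-1 box. The symbols \(n\) (number of draws) and \(p\) (fraction of “1”s in the box) are notation introduced here for illustration, not taken from the slides.

```latex
\begin{align*}
\text{average of box} &= \frac{\text{number of 1s}}{\text{number of tickets}} = p \\
\text{SD of box} &= (1 - 0) \times \sqrt{p \times (1 - p)} = \sqrt{p\,(1 - p)} \\
\text{EV for sum} &= n \times p \\
\text{SE for sum} &= \sqrt{n} \times \sqrt{p\,(1 - p)}
\end{align*}
```

For the number of evens in 600 throws of a fair die, \(p = 1/2\), so the box average and the box SD are both \(1/2\).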
Important Takeaways
The EV is the sum that we expect to see
The SE is the amount we expect a particular sum to be “off” from the EV, due to chance error
There is a shortcut for calculating the SD of a list with only two kinds of numbers
The normal curve can be used to calculate chances for sums of draws: new avg = EV for sum, new SD = SE for sum
Changing the box helps address a lot of problems, either to add only certain kinds of tickets or to count the number of times a certain event occurs
Next time: Why use the normal curve to calculate probabilities?