CELEBRATING THE BIRTHDAY PROBLEM
Author(s): NEVILLE SPENCER
Source: The Mathematics Teacher, Vol. 70, No. 4 (APRIL 1977), pp. 348-353
Published by: National Council of Teachers of Mathematics
Stable URL: http://www.jstor.org/stable/27960843
Accessed: 07/12/2014 14:34
This content downloaded from 138.251.14.35 on Sun, 7 Dec 2014 14:34:22 PM. All use subject to JSTOR Terms and Conditions.
CELEBRATING THE BIRTHDAY PROBLEM
Do you have an intuitive guess of the probability that two people in a group have the same birthday? This article will prove that your guess is probably a large underestimate.

by NEVILLE SPENCER
California State College, San Bernardino
San Bernardino, CA 92407
ONE of the most startling paradoxes in elementary mathematics is the birthday problem. It can be stated as follows:

In a gathering of n people, what is the probability that at least two of them have the same birthday (that is, were born on the same day of the same month)?

What makes the problem so amazing is that the actual probabilities don't correspond with one's intuition. For example, with n = 30, a person might intuitively reason as follows: If there are 30 people with birthdays randomly distributed throughout the year, then the probability of two or more birthdays falling on the same date must be 30 out of 365, or roughly one in twelve. Some people, allowing for some error in their calculations and intuition, would not be surprised at 20 percent probability. But the actual probability is about 71 percent, as seen in table 1. With n = 60 people, the probability is 99 percent, a virtual certainty.

The paradox results from an error in the intuitive approach. The figure of 30 out of 365 is obtained by calculating the probability of matching someone's birthday in the gathering to a date selected at random from the calendar (assuming that everyone has a different birthday). That is not the probability that some pair of people within the gathering share a date.

TABLE 1
The Probability P That Among n People at Least Two Will Have the Same Birthday

  n     P
  5   0.027
 10   0.117
 15   0.253
 20   0.411
 25   0.569
 30   0.706
 35   0.814
 40   0.891
 45   0.941
 50   0.970
 55   0.986
 60   0.994
 65   0.998
 70   0.999
 75   0.999+

The paradoxical nature and the usefulness of the birthday problem can be better explored by changing its emphasis to something other than birthdays and restating it as follows (Kleber 1969): If n people each choose a number randomly and independently from the set of whole numbers from 1 to N, what is the probability that two or more people choose the same number?
The advantage of this version is that any desired probability of success P can be selected in advance, and then, for a given value of n, a value of N can be chosen that will produce an approximation of the desired probability.

Table 2 gives the value of N for selected values of n and P. For example, to have a 75 percent probability of success with a group of n = 25, choose N = 224. If each person in the group chose two numbers at random (in effect making n = 50), the group could choose from N = 900 numbers with the same 75 percent chance of success. Interested readers should explore Kleber's article for more discussion and a more complete table.
TABLE 2
Values of N for Selected n and P

  n    P = 0.10   P = 0.25   P = 0.50   P = 0.75   P = 0.90
  5        96         36         16                     6
 10       430        159         68         35         22
 15     1,001        369        156         80         50
 20     1,809        666        280        143         89
 25     2,855      1,051        441        224        138
 30     4,138      1,521        637        323        199
 35     5,658      2,079        869        440        270
 40     7,416      2,724      1,138        575        352
 45     9,411      3,456      1,443        729        445
 50    11,643      4,274      1,783        900        548
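The choice of N for a target probability can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the probability is computed as one minus the chance that all n picks differ, and the selection rule (take the largest N whose probability is still at least the target) is an assumption rather than a rule stated in the article, although it agrees with the n = 5 entries of table 2.

```python
def match_probability(n: int, pool: int) -> float:
    """Chance that at least two of n independent picks from 1..pool coincide."""
    no_match = 1.0
    for m in range(1, n + 1):
        # The m-th pick must avoid the m - 1 distinct numbers already taken.
        no_match *= max(pool - (m - 1), 0) / pool
    return 1.0 - no_match


def largest_pool(n: int, target: float) -> int:
    """Largest N for which n picks still match with probability >= target.

    Assumed selection rule (not taken from the article). Expects n >= 2
    and 0 < target <= 1.
    """
    pool = 1
    # The match probability falls as the pool grows, so step up until it dips
    # below the target.
    while match_probability(n, pool + 1) >= target:
        pool += 1
    return pool


for target in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(target, [largest_pool(n, target) for n in (5, 10, 25)])
```

A linear search is used because the values involved are small; for much larger n a bisection over N would be the natural replacement.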
The computation of the actual probability of successful pairing yields another bonus for the teacher of introductory probability, since it makes cogent use of the concepts of complementary and conditional probabilities. To compute the probability P of two or more numbers being the same (if n numbers are selected at random from the first N natural numbers), we first calculate the probability Q that the selection brings about no matching numbers, and then we use the fact that P = 1 - Q. If $P_m$ is the probability that the mth number selected differs from the previous m - 1 numbers selected, then $P_m$ is a conditional probability and

\[ Q = P_1 P_2 P_3 \cdots P_n. \]

In order for the mth number to differ from those previously selected, it must be one of the N - (m - 1) choices not yet made. Since there are N possible numbers,

\[ P_m = \frac{N - m + 1}{N} = 1 - \frac{m - 1}{N}. \]

Hence

\[ Q = \prod_{m=1}^{n} \left( 1 - \frac{m - 1}{N} \right) \]

and P = 1 - Q.
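The product for Q translates directly into a short program. The sketch below (function name hypothetical) sets N = 365 by default, which is the original birthday problem, and prints P = 1 - Q for the group sizes listed in table 1.

```python
def match_probability(n: int, pool: int = 365) -> float:
    """P = 1 - Q, where Q is the product of P_m = 1 - (m - 1)/N for m = 1..n."""
    q = 1.0
    for m in range(1, n + 1):
        q *= 1 - (m - 1) / pool  # P_m: the m-th choice avoids the earlier m - 1
    return 1 - q


# Group sizes 5, 10, ..., 75, as in table 1.
for n in range(5, 80, 5):
    print(f"{n:3d}  {match_probability(n):.3f}")
```

Passing a different `pool` value gives the restated version of the problem with numbers drawn from 1 to N.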
In what follows, I would like to propose some modifications of the birthday problem (in either form) that (1) give the teacher some ideas for supplemental problems to augment the original one and (2) give students the probabilities they need to experiment with the problem on their own. It should also be noted that for the computation of the tables of probabilities and other values involved, a computer is a great, if not indispensable, tool. Furthermore, the computer programs needed for the computations are short and simple; so further use may be made of the original problem and modifications by having the students write the necessary programs for computation.
In the performance of the experiment with a given (possibly small) number of people, the revised form of the problem makes the experiment more flexible and more remarkable than it is in its original form. However, it relies heavily on the use of a random number selection process by the people involved in the experiment. In informal settings no table of random numbers (or other device) is available to assure that the numbers are randomly selected. It is not likely that numbers selected by people "off the top of their heads" will be random, and therefore, to assure a uniform distribution of choices, it is perhaps better to return to the original birthday problem. However, we may still be faced with the difficulty of having too few people providing too few dates to perform the experiment satisfactorily. We remedy this situation by having each person choose more than one date. The question then becomes the following:

In a group of n members, each person gives k dates. For different values of k > 1, what is the probability that two or more dates are the same, and how do these probabilities compare with the corresponding ones for k = 1?
To guarantee randomness, all the students could select their own birthday and one or more of their relatives' birthdays. It is reasonable to assume that no single individual will select two dates that are the same. Therefore, if we assume that the k dates given by each person will be different from one another, the probability that two or more dates (from all dates given) are the same will be less. But how much less?

The calculation is done in the same way as the original birthday problem. First we compute the probability that all the dates will be different and then subtract this result from 1. If each student gives k dates, then the probability that the first student chooses k different dates is 1. The probability that the second chooses k dates different from one another and different from those previously chosen is
\[ P_{2,k} = \frac{365 - k}{365} \cdot \frac{365 - k - 1}{364} \cdot \frac{365 - k - 2}{363} \cdots \frac{365 - 2k + 1}{365 - k + 1}, \]
where each numerator represents the number of available unchosen dates up to that point and each denominator represents the number of dates from which it is possible to select. The denominators diminish because each succeeding date chosen by the student must differ from the others they have chosen, thus reducing the number of allowable dates from which they may choose. The probability that the mth student chooses k dates differing from all that have been chosen is
\[ P_{m,k} = \frac{365 - (m - 1)k}{365} \cdot \frac{365 - (m - 1)k - 1}{364} \cdots \frac{365 - mk + 1}{365 - k + 1}. \qquad (1) \]
So the probability that n students, each giving k dates (a total of N = n · k dates), will have chosen them all different is
\[ Q = \prod_{m=1}^{n} P_{m,k} = \frac{365 \cdot 364 \cdot 363 \cdots (365 - N + 1)}{(365)^n (364)^n \cdots (365 - k + 1)^n}, \]
and the probability of two or more dates being the same is P = 1 - Q. The computations for P to three places are given in table 3, with k ranging from 1 to 10 across the top and the number of participants listed in the left-hand column. The probabilities listed under k = 1 correspond to those of table 1 for the appropriate number of students. Blanks in the table represent a probability of better than 0.9995.
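A minimal sketch of this calculation, with hypothetical function names, multiplies the per-student factors one at a time: each of a student's k dates must avoid the dates given earlier, with the denominator shrinking because that student's own dates are assumed distinct. It assumes k is no larger than 365.

```python
DAYS = 365


def member_no_match(m: int, k: int) -> float:
    """P_{m,k}: the m-th student's k distinct dates avoid the (m - 1)k earlier ones."""
    p = 1.0
    for j in range(k):
        unchosen = max(DAYS - (m - 1) * k - j, 0)  # dates nobody has given yet
        allowed = DAYS - j                         # dates this student may still name
        p *= unchosen / allowed
    return p


def match_probability(n: int, k: int) -> float:
    """P = 1 - Q for n students who each give k different dates."""
    q = 1.0
    for m in range(1, n + 1):
        q *= member_no_match(m, k)
    return 1 - q


# Rows n = 5, 10, 15 with k = 1..4, for comparison with table 3.
for n in (5, 10, 15):
    print(n, [round(match_probability(n, k), 3) for k in range(1, 5)])
```

With k = 1 the function reduces to the original birthday problem, so the first value in each printed row can be checked against table 1.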
Notice that for a fixed number of dates, N = n · k, the probability of success when each person chooses more than one date is not significantly different from the probability when each chooses only a single date, especially in cases that exclude very small numbers of people and low probability of success. In fact, if we restrict ourselves to ten or more people and hold the total number of dates N = n · k fixed, we can examine the difference between the one-date-per-person case and the multiple-date-per-person case. The second row of table 4 lists the probabilities of success for one-date-per-person (k = 1) with the number of people varying in increments of ten from 10 to 70. The third row lists the probabilities of success for n = 10 people, each choosing from k = 1 to k = 7 dates. (The choice of n = 10 was made because this seemed to show the greatest variation from the original case. See table 3.)

TABLE 3
Probability of Success for n Students Each Choosing k Different Dates

  n    k=1    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
  5   0.027  0.105  0.221  0.361  0.505  0.639  0.752  0.840  0.903  0.945
 10   0.117  0.395  0.681  0.872  0.961  0.991  0.999
 15   0.253  0.694  0.933  0.992
 20   0.411  0.885  0.993
 25   0.569  0.968
 30   0.706  0.994
 35   0.814  0.999
 40   0.891
 45   0.941
 50   0.970
 55   0.986
 60   0.994
 65   0.998
 70   0.999

TABLE 4
Probability of Success for Each of n People Choosing k Dates

N (total number of dates)                     10     20     30     40     50     60     70
Probability for k = 1                       0.117  0.411  0.706  0.891  0.970  0.994  0.999
Probability for k = 1, 2, ..., 7 (n = 10)   0.117  0.395  0.681  0.872  0.961  0.991  0.999
Difference                                  0.000  0.016  0.025  0.019  0.009  0.003  0.000
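The comparison in table 4 can be reproduced from the closed form of Q, whose numerator depends only on the total number of dates N while the denominator depends on how the dates are grouped. The sketch below is one possible way to do it, not the author's program: it evaluates the quotient exactly with Python's `math.perm` and `fractions.Fraction` before converting to a float, and the function name is hypothetical.

```python
from fractions import Fraction
from math import perm

DAYS = 365


def match_probability(n: int, k: int) -> float:
    """P = 1 - Q, with Q = [365 * 364 * ... * (365 - nk + 1)] / [365 * ... * (365 - k + 1)]^n."""
    total = n * k
    # perm(DAYS, r) is the falling product 365 * 364 * ... * (365 - r + 1),
    # and is zero when r exceeds 365.
    no_match = Fraction(perm(DAYS, total), perm(DAYS, k) ** n)
    return float(1 - no_match)


# Hold N = 10k fixed: N people giving one date each versus 10 people giving k dates each.
for k in range(1, 8):
    total = 10 * k
    single = match_probability(total, 1)
    multiple = match_probability(10, k)
    print(f"{total:3d}  {single:.3f}  {multiple:.3f}  {single - multiple:.3f}")
```

Exact rational arithmetic is a design choice here to sidestep rounding in the long products; plain floating-point multiplication would serve equally well for three-place results.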
The computations in table 4 show that even in a situation in which only a few people are present, it is possible to make the "birthday wager" with a similarly high probability of success, provided each person supplies enough dates to make the total number of dates reasonably large (say 35 or more).
When the experiment is performed this way, it is necessary that the choices of dates (or numbers) be made not only randomly but independently. In a formal situation, each selection or selections should be made in the same way one would take a secret ballot. But in an informal situation, it is more likely that choices will be made known verbally, therefore giving people the opportunity to alter the experiment and change a date if they hear it before their turn. The calculations for this situation are a little more complicated and involve the concept of dependence, not yet used:

In a group of n members, all members select k dates and present their selection orally. If r percent of the group will change a selected date(s) if it is presented before their turns, what is the probability that two or more dates will be the same, for different values of k > 1, and how does it compare with the corresponding probability for k = 1?
For the sake of calculation, let us assume that r = 50 percent and compute the probability of success for different values of k and n as before. Let H represent the event that a person is honest, that is, does not change his or her date; D, that a person is dishonest, that is, does change his or her date; and $P_{m,k}(S)$, the probability that all k dates given by the mth member, who may be either honest or dishonest, are different from each other and all preceding dates. Letting $P(A \mid B)$ be the conditional probability of event A given the occurrence of event B, we have

\[ P_{m,k}(S) = P(H) \cdot P_{m,k}(S \mid H) + P(D) \cdot P_{m,k}(S \mid D). \]

But P(H) = P(D) = 0.5 and $P_{m,k}(S \mid D) = 1$, since a person changing his or her date will certainly choose dates different from all preceding ones. So

\[ P_{m,k}(S) = 0.5\,P_{m,k}(S \mid H) + 0.5(1) = 0.5\,\bigl(P_{m,k}(S \mid H) + 1\bigr), \]

and we need only determine $P_{m,k}(S \mid H)$. This is the probability that an honest person will choose k dates differing from all preceding dates, which is the calculation we made in (1) previously. Thus, $P_{m,k}(S \mid H) = P_{m,k}$ and we therefore have
\[ P_{m,k} = P_{m,k}(S \mid H) = \frac{365 - (m - 1)k}{365} \cdot \frac{365 - (m - 1)k - 1}{364} \cdots \frac{365 - mk + 1}{365 - k + 1} = \prod_{j=0}^{k-1} \frac{365 - (m - 1)k - j}{365 - j}. \]
TABLE 5
Probability of Success for n People 50 Percent Honest Each Choosing k Dates

  n    k=1    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
  5   0.014  0.053  0.116  0.194  0.283  0.375  0.465  0.547  0.619  0.681
 10   0.060  0.219  0.422  0.614  0.762  0.861  0.920  0.954  0.972  0.982
 15   0.135  0.438  0.719  0.886  0.959  0.986  0.995  0.998  0.999  0.999
 20   0.231  0.647  0.897  0.978  0.996  0.999
 25   0.340  0.807  0.971  0.997
 30   0.453  0.908  0.994
 35   0.563  0.961  0.999
 40   0.663  0.986
 45   0.750  0.995
 50   0.820  0.999
 55   0.876
 60   0.917
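Entries of table 5 can be checked with a small program built on the per-member probability 0.5(P_{m,k} + 1) derived above, multiplied over all n members. In this sketch the function names are hypothetical, and the `honest` parameter (the fraction of members who never change a date) is a generalization added for illustration; the article itself works only with 50 percent.

```python
DAYS = 365


def honest_no_match(m: int, k: int) -> float:
    """P_{m,k}: an honest m-th member's k distinct dates avoid all (m - 1)k earlier ones."""
    p = 1.0
    for j in range(k):
        p *= max(DAYS - (m - 1) * k - j, 0) / (DAYS - j)
    return p


def match_probability(n: int, k: int, honest: float = 0.5) -> float:
    """P = 1 - Q when a fraction `honest` of members never change a date."""
    q = 1.0
    for m in range(1, n + 1):
        # A member who changes dates avoids every earlier date with probability 1.
        q *= honest * honest_no_match(m, k) + (1 - honest) * 1.0
    return 1 - q


# Rows n = 5, 10, 15 with k = 1..4, for comparison with table 5.
for n in (5, 10, 15):
    print(n, [round(match_probability(n, k), 3) for k in range(1, 5)])
```

Setting `honest=1.0` recovers the fully honest case, which gives a quick consistency check against table 3.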
The probability that N = n · k dates are different is

\[ Q = \prod_{m=1}^{n} P_{m,k}(S) = \prod_{m=1}^{n} 0.5\,(P_{m,k} + 1), \]

and the probability that at least two dates are the same is P = 1 - Q. Some of these results are given in table 5. Blanks represent a probability better than 0.9995.

The results are surprising! There isn't the drastic difference in probabilities one might expect. Notice that with sixty dates from ten or more people, the probability of success is still 86 to 92 percent. And to guarantee close to 75 percent success, we need only fifty dates, instead of thirty-five with a completely honest group.

The calculations involved in these modifications of the original problem are reasonably simple and involve basic concepts of introductory probability. It is hoped they will give the teacher some interesting follow-up problems and give the student some motivation for making the computations. Here are some other hypothetical problems that may be included.
1. In a gathering of n people, each is asked his or her birthday, or more dates if needed. Each person hears j dates before his or hers and 50 percent of the people are dishonest and change their date(s) if they hear it (them) given. What is the probability that two or more dates are the same?

2. In a gathering of n people, what is the probability that two or more people were born within a week of one another? Within k days of one another?

3. If it was possible to identify the instant a person was born and assign a number on a continuous scale from 0 to 365, then in a gathering of n different people what would be the probability that at least two were born within a time interval t apart? This is a continuous analog of problem 2 above, and for t = 1 day should give the results of the original problem.

4. If in a gathering of n people, m of them refuse to give their birthdates, how should an even wager be divided? The original even wager is that two or more of the people were born on the same day. If m of them refuse to give their birthdays and the remaining dates are all different, the wager is neither lost nor won. This problem is similar to the problem worked out by Pascal and Fermat in the beginnings of formal probability theory.

5. In one of the variations of the birthday problem with n people, what is the expectation of winning an even wager? A two-to-one wager? For any fixed n, what wager should be given if you wish to break even over the long run?
This discussion should help broaden the application and usefulness of the birthday problem in teaching introductory probability. Besides this, there are other uses for this paradox in the classroom. For example, a statistical experiment could be run by having students first guess what probability is reasonable for success in the original problem (or its restatement in terms of random selection of numbers) and then use that guess as a hypothesis to be tested by random sampling. The birthday problem paradox is rich in its potential for use in teaching.
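One way such a sampling experiment might look on a computer is sketched below. Everything in it is an illustrative assumption rather than a procedure from the article: the function names, the number of trials, the fixed seed, and the use of a normal-approximation z-score to compare a student's guess (here 20 percent, for n = 30) with the simulated frequency.

```python
import math
import random


def simulate(n: int, trials: int = 20_000, days: int = 365, seed: int = 1977) -> float:
    """Fraction of simulated gatherings of n people that contain a shared birthday."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:  # fewer distinct dates than people: a match
            hits += 1
    return hits / trials


def z_score(estimate: float, guess: float, trials: int) -> float:
    """Number of standard errors separating the estimate from the guessed probability."""
    return (estimate - guess) / math.sqrt(guess * (1 - guess) / trials)


estimate = simulate(30)
print(f"simulated P for n = 30: {estimate:.3f}")
print(f"z-score against a guess of 0.20: {z_score(estimate, 0.20, 20_000):.1f}")
```

A z-score far above 2 or 3 indicates that the guessed probability is not compatible with the sampled frequency, which is the conclusion the classroom experiment is meant to reach.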
REFERENCE

Kleber, Richard S. "A Classroom Illustration of a Nonintuitive Probability." Mathematics Teacher 62 (May 1969): 361-62.