Overview
Own research
Counts and Proportions
Binomial Setting
Binomial Distributions
Finding binomial probabilities
Sign test
Book: Moore and McCabe, Introduction to thepractice of statistics
Binomial tests – p.2/23
Question Answering
Who is the president of France?When did Marilyn Monroe die?Where was Adolf Hitler born?Who is Silvio Berlusconi?
Binomial tests – p.3/23
Patterns
[Person] is the president of [Country][Person], the president of [Country][Person] was born in [Country]
[Person] died in [Year]
Binomial tests – p.4/23
Answering Questions
Name Birth place
Wim Kok Bergambacht
Gerard Reve Amsterdam
... ...
Name Function
M. Jackson popstar
Elisabeth II queen of Eng-land
... ...
Binomial tests – p.5/23
Counts and Proportions (1)
Random sample of questions which have tobe answered (n)
Answers are correct or incorrect
Count: number of correct answers (X)
Sample proportion: p̂ = Xn
In my case: n = 220, X = 125, p̂ = 125220 = 0.57
Binomial tests – p.6/23
Sampling distribution
A statistic from a random sample or randomizedexperiment is a random variable. The samplingdistribution of this variable is the distribution of itsvalues for all possible samples.
The probability distribution of the statistic is its
sampling distribution
Binomial tests – p.7/23
Population distribution
The population distribution of a variable is thedistribution of its values for all members of thepopulation.
The population distribution is also the probabil-
ity distribution of the variable when we randomly
choose one individual from the population.
Binomial tests – p.8/23
Example
Length of women between ages 18 and 24
Distribution is normal, mean = 64.5 inchesand standard deviation = 2.5 inches.
Select a woman at random. Her height is X.
Repeated sampling: N(64.5, 2.5).
Binomial tests – p.9/23
Binomial setting
Binomial setting
There are a fixed number n of observations
The n observations are all independent
Each observation falls into one of just twocategories.
The probability of success is the same foreach observation
Binomial tests – p.10/23
Binomial distribution for sample counts
The distribution of the count X of successes in the
binomial setting is called the binomial distribution
with parameters n and p. X is B(n, p).
Binomial tests – p.11/23
Counts and proportions (2)
X is a count. It takes a value between 0 andn. It has a binomial distribution.
p̂ is the sample proportion. It takes a valuebetween 0 and 1. It does not have a binomialdistribution. To do probability calculations,restate p̂ in X.
Binomial tests – p.12/23
Recognizing binomial settings
Tossing a coin 10 times. How many times dowe see heads?
Dealing 10 cards. How many times do we seea red card?
Answering questions. How many questionsare answered correctly?
Binomial tests – p.13/23
Binomial Mean and Standard deviation
The mean and standard deviation of a binomialcount X and a sample proportion of successesp̂ = X
nare:
µX = np µp̂ = p
σX =√
np(1− p) σp̂ =√
p(1−p)n
The sample proportion p̂ is unbiased estimator of
the population proportion p.
Binomial tests – p.14/23
Binomial formula
Example
Each child born to a particular set of parents has
probability 0.25 of having blood type O. If these
parents have 5 children, what is the probability
that exactly 2 of them have type O blood?
Binomial tests – p.15/23
Binomial formula
n = 5, X = 2, p = 0.25.We want P (X = 2).
P(OOØØØ)=P(O)P(O)P(Ø)P(Ø)P(Ø)= (0.25)(0.25)(0.75)(0.75)(0.75)= (0.25)2(0.75)3
Binomial tests – p.16/23
Binomial formula
n = 5, X = 2, p = 0.25.We want P (X = 2).
P(OOØØØ)=P(O)P(O)P(Ø)P(Ø)P(Ø)= (0.25)(0.25)(0.75)(0.75)(0.75)= (0.25)2(0.75)3
OOØØØ OØOØØ OØØOØ OØØØO ØOOØØØOØOØ ØOØØO ØØOOØ ØØOØO ØØØOO
Binomial tests – p.17/23
Binomial formula
n = 5, X = 2, p = 0.25.We want P (X = 2).
P(OOØØØ)=P(O)P(O)P(Ø)P(Ø)P(Ø)= (0.25)(0.25)(0.75)(0.75)(0.75)= (0.25)2(0.75)3
OOØØØ OØOØØ OØØOØ OØØØO ØOOØØØOØOØ ØOØØO ØØOOØ ØØOØO ØØØOO
P (X = 2) = 10(0.25)2(0.75)3 = 0.26
Binomial tests – p.18/23
Binomial formula
k: number of successesP(OOØØØ)= (0.25)2(0.75)3
pk(1− p)n−k
(
n
k
)
= n!k!(n−k)! n! = nx(n− 1)x(n− 2)x...x2x1
P (X = k) =(
n
k
)
pk(1− p)n−k
Binomial tests – p.19/23
Binomial formula on own data
[Pronoun] is the president of [Country][Pronoun] was born in [Country]The [Definite Noun] died in [Year]
Binomial tests – p.20/23
Binomial formula on own data
Remember: The sample proportion p̂ is unbiasedestimator of the population proportion p.Same set of questions: n = 220Number of correct questions: X = 131For p we take p̂: p = p̂ = 0.57
P (X = k) =(
n
k
)
pk(1− p)n−k
P (X = 131) =(
220131
)
0.57131(1− 0.57)220−131 = 0.041The probability to answer 131 question correcthaving the B(220, 0.57) distribution is 4.1%.
Binomial tests – p.21/23
Paired sign test on own data
Ignore pairs with difference 0; the number of trials n is thecount of the remaining pairs. The test statistic is the countX of pairs with a positive difference. P -values for X arebased on the binomial distribution B(n, 1/2).
simple pattens: 125 correctAdded coreference patterns: 131 correct8 differences, 7 improved, 1 did more poorly.
Is this evidence for an improved result?
Binomial tests – p.22/23
Paired sign test on own data
H0 : p = 0.5 no effectHa : p > 0.5 positive effect
n∑
k=X
(
n
k
)
pk(1− p)n−k
(
87
)
0.57(1− 0.5)8−7 +(
88
)
0.58(1− 0.5)8−8 =
8x0.57x0.5 + (0.5)8 = 9(0.5)8 = 0.035
The probability to get this result assuming thatthere was no difference in the performance of thesystem is 3.5%. That means we can reject H0.
Binomial tests – p.23/23