a short course on probability and sampling
Post on 24-Nov-2015
A short Course on Probability Theory and Sampling, originally prepared as lecture notes for M.Sc.
(Geography) students of Vidyasagar Univ, WB, India. Compiled by Dr. A. Kar Gupta,
kg.abhi@gmail.com, Physics Deptt, Panskura B. College, WB, India
PROBABILITY and SAMPLING
Concept of Probability, the Probability Rules, Probability Distributions and Applications
For randomly occurring events, we would like to know how many times we get a desired result
out of all trials. That is, we would like to know the fraction of favourable events or trials.
Suppose we flip a coin a number of times. We know there is a 50-50 chance of getting a
Head or a Tail. We may count how many times a Head or a Tail occurs out of all the flips.
Let
$n$ = No. of favourable events and $N$ = Total no. of events.
$f = n/N$ = fraction of favourable events. We can also call this the relative frequency in
the usual language of Statistics.
Now, if we repeat the trials a large number of times, this fraction tends to some fixed value
specific to the event. The limiting value of the fraction is what we call probability.
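The idea of a relative frequency settling down to a fixed value can be tried out numerically. The following is only a small sketch: the function name `relative_frequency`, the fixed seed and the chosen numbers of flips are assumptions made for illustration, not part of the original notes.

```python
import random

def relative_frequency(num_flips, seed=0):
    """Flip a fair coin num_flips times and return the fraction of Heads."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

# The fraction of Heads should drift towards 0.5 as the number of flips grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With only 10 flips the fraction can be far from 0.5; with a very large number of flips it should stay close to 0.5.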
Note:
The total no. of trials is the sample size when we are drawing samples out of a total
population. (The term sample space refers to the set of all possible outcomes, e.g. Head and
Tail for a coin.) As the no. of trials is increased, the sample becomes bigger.
Definition of Probability:
Probability is the ratio of the number of favourable events to the total number of events, provided
the total number of events is very large (strictly, tending to infinity).
$P = \frac{n}{N}$, when $N \to \infty$ (infinity),
where $n$ is the number of favourable events and $N$ is the total number of events.
So by definition, $P$ is a fraction between 0 and 1: $0 \le P \le 1$.
$P = 0$: No favourable outcome.
$P = 1$: All the outcomes are in favour.
We can also think in the following way: $p$ = probability of an event occurring, $q$ = probability of
the event not occurring. Since the event will either occur or not occur, we must write:
$p + q = 1$
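Therefore, we have $q = 1 - p$, where $p$ and $q$ are the probabilities of the event occurring and not occurring.
Example #1:
In a coin toss, we know from our experience, $P(H) = \frac{1}{2}$ and $P(T) = 1 - P(H) = \frac{1}{2}$. So,
$P(H) + P(T) = 1$.
Example #2:
In a throw of a die, we know that the probability of the die facing 1 up, 2 up, 3 up etc.
will be $P(1) = \frac{1}{6}$, $P(2) = \frac{1}{6}$, and so on.
Here, $P(1) + P(2) + \dots + P(6) = 1$.
Probability of not getting 1 is
$1 - P(1) = 1 - \frac{1}{6} = \frac{5}{6}$.
Note:
The condition that the total probability of all the events has to be 1 is called normalization of
probabilities: $\sum_i P_i = 1$.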
Rules of Probability:
When more than one event takes place, we need to calculate the joint probability for all the
events.
Mutually Exclusive Events
Two events are mutually exclusive (or disjoint) when they cannot occur at the same time.
Suppose the two events are A and B and the individual probabilities for them are designated as
$P(A)$ and $P(B)$. Mutually exclusive means
$P(A \cap B) = 0$, where $A \cap B$ means that A and B both occur.
Addition Rule: $P(A \cup B) = P(A) + P(B)$, where $A \cup B$ means that either A or B occurs.
Example#1: The probability of getting either Head or Tail in a coin toss,
$P(H \cup T) = P(H) + P(T) = \frac{1}{2} + \frac{1}{2} = 1$.
Example#2: The probability of getting either 1 or 6 in a throw of a die,
$P(1 \cup 6) = P(1) + P(6) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$.
Independent Events
When the occurrence of one event does not influence the other, although they can occur at the same
time, they are called independent. For example, rainfall today and Manchester United
winning a match.
Multiplication Rule: $P(A \cap B) = P(A) \times P(B)$
Example #1:
What is the probability that two Heads will occur when we toss two coins together?
$P(H) = \frac{1}{2}$ for the first coin and $P(H) = \frac{1}{2}$ for the second coin.
$P(HH) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.
Note that if we flip a single coin two times and ask for the probability of getting Heads twice, we
would get the same answer.
Example #2:
Now we ask the question: what is the probability of getting one Head and one Tail when
two coins are flipped together?
Consider the probability of obtaining Head on the first coin and Tail on the second coin:
$P(HT) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.
And the probability of obtaining Tail on the first and Head on the second:
$P(TH) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.
Now the total probability of the above two events (they are mutually exclusive, so only one of them can occur):
$P(HT \cup TH) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$.
Note that in the flipping of two coins together, there are 4 types of events: HH, HT, TH, TT. Out
of these, the relative occurrence of one Head and one Tail is 2/4 = 1/2.
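The two-coin results can also be checked by simply listing the four equally likely outcomes and counting. This is a minimal sketch; the variable names are chosen here only for illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))
p_each = Fraction(1, len(outcomes))

# Probability of two Heads (multiplication rule gives 1/2 * 1/2).
p_two_heads = sum(p_each for o in outcomes if o == ("H", "H"))

# Probability of one Head and one Tail in any order (HT or TH, added).
p_one_each = sum(p_each for o in outcomes if sorted(o) == ["H", "T"])

print(p_two_heads, p_one_each)
```

Counting outcomes this way agrees with the multiplication rule for HH and with the addition rule for HT or TH.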
When Events are NOT Mutually Exclusive:
If the events are not mutually exclusive, there is some overlap. Suppose we designate
an area A corresponding to the probability of some event A and an area B to the probability
of another event B. The overlap between the two areas then represents the joint
probability, $P(A \cap B)$. Note that for two mutually exclusive events the overlap would be zero.
Addition Rule in this case: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
When Events are NOT Independent:
Multiplication rule: $P(A \cap B) = P(A)\,P(B|A) = P(B)\,P(A|B)$
$P(B|A)$: The probability of B given A. This is a conditional probability, i.e., the probability of
B occurring provided A occurs first.
Similarly, $P(A|B)$: The probability of A, given B.
Note here that
$P(B|A) = P(B)$, when B does not depend on A, which means A and B are independent.
$P(A|B) = P(A)$, when A does not depend on B, which means A and B are independent.
So, we can write the formula for conditional probability:
$P(B|A) = \frac{P(A \cap B)}{P(A)}$
Let us consider the following table and use the probability rules.
In a survey of 100 people, each person was asked whether they are a graduate or not.
            Graduate   Non-graduate   Total
Male           40          20           60
Female         10          30           40
Total          50          50          100
Q.1 What is the probability that a randomly selected person is a male?
Ans. $P(M) = \frac{60}{100} = 0.6$
Q.2 What is the probability that a randomly selected person is a female?
Ans. $P(F) = \frac{40}{100} = 0.4$
Q.3 What is the probability that a randomly selected person is a male who is a graduate?
Ans. $P(M \cap G) = \frac{40}{100} = 0.4$
[Also we can think,
$P(M \cap G) = P(M)\,P(G|M) = \frac{60}{100} \times \frac{40}{60} = 0.4$]
Q.4 What is the probability that a randomly selected person is a female who is a non-graduate?
Ans. $P(F \cap NG) = \frac{30}{100} = 0.3$
[Also,
$P(F \cap NG) = P(F)\,P(NG|F) = \frac{40}{100} \times \frac{30}{40} = 0.3$]
Q.5 What is the probability that the randomly selected person is either a male graduate or a
female non-graduate?
Ans. These two events are mutually exclusive and by the law of addition,
$P = 0.4 + 0.3 = 0.7$.
Q.6 If we now select two persons, what is the probability that one of them is a male graduate
and another is a female non-graduate?
Ans. Two independent events are occurring together (taking the first person to be the male graduate
and the second to be the female non-graduate). So by the law of multiplication of
probabilities,
$P = 0.4 \times 0.3 = 0.12$.
Q.7 What is the probability that a randomly selected non-graduate is a female? [Prob. of
female among non-graduates]
Ans. This is the no. of females out of total non-graduates, $\frac{30}{50} = 0.6$.
Q.8 What is the probability that a randomly selected graduate is a male?
Ans. This is the no. of males out of total graduates,
$\frac{40}{50} = 0.8$.
Note: In Q.7 & 8, each probability is a conditional probability. However, we gave the answers by
looking at the table directly. Now we answer them in terms of the law of conditional
probability.
Ans. to Q.8: Suppose A = graduate, B = male, $P(B|A)$ = probability of male given that they are
graduates.
We use the formula: $P(B|A) = \frac{P(A \cap B)}{P(A)}$
Here, $P(A \cap B)$ = Prob. of male graduates = $\frac{40}{100} = 0.4$,
$P(A)$ = prob. of graduates = $\frac{50}{100} = 0.5$.
So, $P(B|A) = \frac{0.4}{0.5} = 0.8$.
Exercise: Q.7 can also be answered in terms of the conditional probability formula. Do this and check
yourself.
Q.9 What is the probability that the selected person is either male or graduate?
Ans. Here the two events can happen together, so they are not mutually exclusive. So we
use the formula:
$P(M \cup G) = P(M) + P(G) - P(M \cap G) = 0.6 + 0.5 - 0.4 = 0.7$.
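The answers drawn from the survey table can be reproduced by counting directly from the four cells. The sketch below is only an illustration; the dictionary `counts` and the helper `prob` are names introduced here, not something given in the notes.

```python
from fractions import Fraction

# The four cells of the survey table (100 people in total).
counts = {
    ("male", "graduate"): 40,
    ("male", "non-graduate"): 20,
    ("female", "graduate"): 10,
    ("female", "non-graduate"): 30,
}
total = sum(counts.values())

def prob(sex=None, edu=None):
    """Probability that a random person matches the given sex and/or education."""
    n = sum(c for (s, e), c in counts.items()
            if sex in (None, s) and edu in (None, e))
    return Fraction(n, total)

p_male = prob(sex="male")                        # Q.1
p_grad = prob(edu="graduate")
p_male_grad = prob("male", "graduate")           # Q.3
p_male_given_grad = p_male_grad / p_grad         # Q.8, conditional probability
p_male_or_grad = p_male + p_grad - p_male_grad   # Q.9, general addition rule

print(p_male, p_male_grad, p_male_given_grad, p_male_or_grad)
```

Using exact fractions avoids rounding, so the results can be compared directly with the ratios read off the table.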
Probability Distributions
Let us think of the probabilities $P_1, P_2, P_3, \dots$ for a number of events marked 1, 2, 3, and so on.
For each event we have $0 \le P_i \le 1$, and for all the events together,
$\sum_i P_i = 1$ (normalization).
So, we have a set of probabilities corresponding to a set of events. This collection of
probabilities is a probability distribution for all those discrete events.
Imagine, instead of discrete events, we have $x$ as a variable which can take continuous values.
Also, there is the probability $P(x)$ for each value of $x$. Now if we plot $P(x)$ against $x$, we get a
continuous curve which is the continuous probability distribution curve (commonly referred to as
the probability distribution curve).
Fig. 3.1
Area under the curve (above the x-axis) can be obtained by summing up the areas of the approximate
rectangular bars (which we may easily find by plotting this on graph paper). The approximate area
of one such bar of width $\Delta x$ and height $P(x)$ is $P(x)\,\Delta x$. So, the approximate total area
between the two end points $x = a$ and $x = b$ is
$\sum_i P(x_i)\,\Delta x$.
To calculate it exactly, we need the help of Integral Calculus, which essentially sums up the areas
of rectangles (bars) of infinitesimally small width (smaller than the smallest you can think of).
Those not familiar with Calculus do not have to worry, as the following
explanation and symbols can be understood qualitatively, which may serve the purpose for now.
The area under the curve (between the two extreme points shown in the above figure) is the
following definite integral:
Area = $\int_a^b P(x)\,dx$.
This area is the total probability for all the values of $x$ between the two limits. That is why $P(x)$ is often
referred to as the probability density. So, $P(x)\,dx$ is the actual probability in between $x$ and
$x + dx$, where $dx$ is the infinitesimally small (smaller than you can think) range! Note that the
area of the bar of height $P(x)$ and width $dx$ at some position $x$ is $P(x)\,dx$.
As in the discrete case, the area is the sum over all the mutually exclusive events.
[The sum $\sum$ (called sigma) in the discrete case becomes $\int$ (called integral) for the continuous
case.]
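For readers who prefer to see the bar-summing idea at work, here is a small numerical sketch. The bell-shaped density used as the sample $P(x)$, the end points and the number of bars are all assumptions chosen for illustration; any other density could be substituted.

```python
import math

def density(x):
    """A sample probability density P(x): the standard bell-shaped (Normal) curve."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def area(p, a, b, bars=1000):
    """Approximate the area under p between a and b by summing thin bars.

    Each bar has width dx and height p(x) taken at the middle of the bar.
    """
    dx = (b - a) / bars
    return sum(p(a + (i + 0.5) * dx) * dx for i in range(bars))

print(area(density, -1.0, 1.0))   # probability of x lying between -1 and 1
print(area(density, -8.0, 8.0))   # practically the whole curve
```

Making the bars thinner (a larger `bars`) brings the sum closer to the exact integral; taken over practically the whole curve, the sum should come out very close to 1.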
Also,
$\int_{-\infty}^{+\infty} P(x)\,dx = 1$ (Normalization)
Normalization means that the total area under the curve (extended from negative infinity to
positive infinity, that is, over the entire stretch of the curve) is unity. This holds because in the
discrete case we know that the sum of all the probabilities for all the events should be 1.
For discrete events, we calculated the relative frequencies and then drew the bar diagram from them.
Here, for the continuous case, the bars merge together to form a continuous spectrum and that
is the probability distribution. The relative frequencies tend to the probabilities for the
corresponding values of the variable for a large number of events.
Now, given the probability distribution curve, we would like to know about the shape and size of
the curve, and some specific quantities that are representative of the character of the event.
From a discrete data set to a continuous Prob. distribution:
For any discrete set of data, we measure the central tendency of the data set. We
commonly calculate the mean, the mean of square and the variance.
Mean: $\bar{x} = \frac{\sum_i f_i x_i}{N} = \sum_i \left(\frac{f_i}{N}\right) x_i$,
where $f_i$ is the frequency of occurrence for event $x_i$ and we have total frequency, $N = \sum_i f_i$.
[Note: $\frac{f_i}{N}$ = relative frequency]
Mean of Square: $\overline{x^2} = \sum_i \left(\frac{f_i}{N}\right) x_i^2$
Variance: Var($x$) = $\sigma^2 = \sum_i \left(\frac{f_i}{N}\right) (x_i - \bar{x})^2 = \sum_i \left(\frac{f_i}{N}\right) x_i^2 - \left[\sum_i \left(\frac{f_i}{N}\right) x_i\right]^2$
Standard deviation is the square root of the variance.
Now for a large number of events, each of the ratios $\frac{f_i}{N}$ in the above formulas becomes the
corresponding probability $P_i$:
$\frac{f_i}{N} \to P_i$ as $N$ tends to very large.
Therefore, we write the above quantities in terms of probabilities:
Mean, $\bar{x} = \sum_i P_i x_i$
Mean of Square, $\overline{x^2} = \sum_i P_i x_i^2$
Variance, $\sigma^2 = \sum_i P_i x_i^2 - \left(\sum_i P_i x_i\right)^2 = \overline{x^2} - \bar{x}^2$
Standard deviation, $\sigma = \sqrt{\text{Var}(x)}$
If the probabilities $P_1$, $P_2$, etc. are known for the values $x_1$, $x_2$, and so on, we can say
that we have a discrete probability distribution. When the values are so
closely spaced that we can have probabilities for all possible continuous values of the variable
$x$, we can say that there is a function $P(x)$ of $x$, which is called the continuous probability
distribution function.
[Note: However, in a practical calculation, when instead of probabilities we are given the frequencies
$f_1$, $f_2$, ... for the quantities $x_1$, $x_2$, ... that appear in a data set, we calculate the mean or average: $\bar{x} = \sum_i \left(\frac{f_i}{N}\right) x_i$.]
Expectation Values:
As the probability distribution (no matter discrete or continuous) for some event or some
population is known, we may expect what its mean value would be, either through
mathematical calculations or through our experience.
[In Statistics, population means the entire set of all possible data. Taking a few data (which we
call a sample) from the population, we often try to estimate the mean, which in general
differs from the population mean. But we know that, with larger and larger sample size, this
mean (which we call the sample mean) should tend towards the population mean. This means we
expect the population mean. More on this aspect will be discussed in the chapter on
Sampling.]
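So, the expectation value $E(x)$ is the mean of $x$. Likewise, we can have the expectation value of any power of $x$.
$E(x) = \int x\,P(x)\,dx$ [Continuous case]
$E(x^2) = \int x^2\,P(x)\,dx$ [Continuous case]
Var($x$) = $E(x^2) - [E(x)]^2$
Combination Rules: When we scale a variable, that is, multiply it by a number or add a number to it, we need to know
how the scaled variable behaves. Does it have the same statistical measures? Does it follow the same kind
of distribution? We also ask the same questions for two or more variables that are scaled and added
together to form a combined variable.
When $y = ax$
Mean: $E(y) = a\,E(x)$
Variance: Var($y$) = $a^2\,$Var($x$)
When $y = ax + b$
Mean: $E(y) = a\,E(x) + b$
Variance: Var($y$) = $a^2\,$Var($x$)
If $x$ has a Normal distribution, $y$ is also a Normal distribution.
When $z = ax + by$ (with $x$ and $y$ independent)
Mean: $E(z) = a\,E(x) + b\,E(y)$
Variance: Var($z$) = $a^2\,$Var($x$) + $b^2\,$Var($y$)
If $x$ and $y$ are separately Normal distributions, $z$ is then also a Normal
distribution.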
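The standard rules for the mean and variance of a scaled variable $ax + b$ and of a combination $ax_1 + bx_2$ of two independent variables can be verified exactly on a small discrete distribution. The sample distribution `dist`, the constants `a` and `b` and the helper names below are assumptions chosen purely for illustration.

```python
from fractions import Fraction
from itertools import product

def mean(d):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(p * x for x, p in d.items())

def variance(d):
    """Variance of a discrete distribution given as {value: probability}."""
    m = mean(d)
    return sum(p * (x - m) ** 2 for x, p in d.items())

# A small, deliberately lopsided sample distribution (probabilities add to 1).
dist = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}
a, b = 3, 2

# y = a*x + b: every value is scaled and shifted, probabilities are unchanged.
scaled = {a * x + b: p for x, p in dist.items()}

# z = a*x1 + b*x2 for two independent draws: multiply probabilities, merge ties.
combined = {}
for (x1, p1), (x2, p2) in product(dist.items(), repeat=2):
    z = a * x1 + b * x2
    combined[z] = combined.get(z, 0) + p1 * p2

print(mean(scaled), a * mean(dist) + b)
print(variance(scaled), a**2 * variance(dist))
print(variance(combined), (a**2 + b**2) * variance(dist))
```

Each printed pair should agree exactly, since the calculation is done with fractions rather than rounded decimals.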
Following the combination rules in the above box, we can solve the following problem.
Example:
The weight of individual people follows a Normal distribution, $N(\mu, \sigma^2)$. What will be the
probability distribution of the weight of 10 people taken together?
Ans. Here, the mean is $\mu$ and the variance is $\sigma^2$ for each person.
Mean weight of 10 people, $\mu + \mu + \dots + \mu = 10\mu = 40$
Variance, $\sigma^2 + \sigma^2 + \dots + \sigma^2 = 10\sigma^2 = 500$
The probability distribution of the weight of 10 people taken together, $N(40, 500)$.
Normal Distribution:
For many naturally occurring events, and for random measurements of a value in an
experiment, the distribution that occurs is (at least approximately) a Normal distribution. The bell shaped
symmetric curve is called the Normal curve. If we calculate the height or age distribution
or the distribution of IQ level among a population, the probability distribution
typically turns out to be close to Normal. The name normal is given as it occurs normally. In Mathematics or
Physics literature, it is also called the Gaussian distribution after the great mathematician, Carl
Friedrich Gauss.
Properties of Normal Distribution:
A bell shaped symmetric distribution with the peak at the middle. The distribution curve
extends from $-\infty$ to $+\infty$ [from minus infinity to plus infinity].
Mean, Mode and Median are at the same position (at the peak).
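Area under the curve:
Total area under the curve = 100%
A = 68%, for $\mu - \sigma \le x \le \mu + \sigma$
[Area within one standard deviation ($\sigma$) from the mean ($\mu$) on both sides]
A = 95%, for $\mu - 2\sigma \le x \le \mu + 2\sigma$
A = 99.7%, for $\mu - 3\sigma \le x \le \mu + 3\sigma$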
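These three areas can be checked with the error function available in Python's standard library. This is only a sketch: the relation used, area within $k$ standard deviations = erf($k/\sqrt{2}$), is a standard property of the Normal curve that is not derived in these notes, and the function name `area_within` is chosen here for illustration.

```python
import math

def area_within(k):
    """Area under a Normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the area as a percentage for one, two and three standard deviations.
for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 1))
```

The printed percentages should round to the familiar 68%, 95% and 99.7%.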
Normal distribution is the most commonly observed and the most widely used and discussed distribution. There are
various other kinds of distributions, which can be identified by their shapes and
mathematical expressions.
NOTE:
If we combine a set of Normal distributions, we get a Normal distribution as a result. Consider
some $n$ numbers ($x_1, x_2, \dots, x_n$), each of which is drawn independently from a Normal
distribution. Calculate the mean of the numbers:
$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n}$. If we draw $n$ numbers again
and again, the mean would be different each time, but the means would follow a Normal
distribution. More interestingly, the individual
distributions from which the numbers are drawn do not matter (as long as their variance is finite): provided
the number $n$ is sufficiently large, the distribution of the mean always turns
out to be approximately Normal. This is the Central Limit Theorem.
Experiment with rolling dice:
So, here we roll dice, calculate the probabilities of the numbers that turn up and try to establish some
truth!
Example #1 Throwing of a single die:
The chance of turning up of any side is equal, which is 1 out of 6. We take these as the a priori
probabilities for each case and find the mean and variance from the following table.
$x$           1     2     3     4      5      6      Total
$P(x)$        1/6   1/6   1/6   1/6    1/6    1/6    1
$x\,P(x)$     1/6   2/6   3/6   4/6    5/6    6/6    21/6
$x^2 P(x)$    1/6   4/6   9/6   16/6   25/6   36/6   91/6
From the table, we can calculate the mean, $\bar{x} = \sum x\,P(x) = \frac{21}{6} = 3.5$
and
variance, $\sigma^2 = \sum x^2 P(x) - \bar{x}^2 = \frac{91}{6} - (3.5)^2 = \frac{35}{12} \approx 2.92$
If we plot $P(x)$ against $x$, we obtain the probability distribution for this case. This distribution is
uninteresting, as we can check that the probabilities for all values of $x$ are the same! The curve
obtained by joining the points will be a horizontal straight line.
Fig.
Now we do a similar experiment taking two dice together.
Example #2 (Two Dice)
We look for the value of $x$, which is the sum of the two numbers on the top faces of the two dice as
rolled.
Here we shall have $6 \times 6 = 36$ possible combinations of events, and $x$ can have a minimum
value, $x_{min} = 1 + 1 = 2$, and a maximum value, $x_{max} = 6 + 6 = 12$.
$x$          2      3      4      5       6       7       8       9       10      11      12      Total
$P(x)$       1/36   2/36   3/36   4/36    5/36    6/36    5/36    4/36    3/36    2/36    1/36    1
$x\,P(x)$    2/36   6/36   12/36  20/36   30/36   42/36   40/36   36/36   30/36   22/36   12/36   252/36
$x^2 P(x)$   4/36   18/36  48/36  100/36  180/36  294/36  320/36  324/36  300/36  242/36  144/36  1974/36
Mean, $\bar{x} = \frac{252}{36} = 7$, Variance, $\sigma^2 = \frac{1974}{36} - 7^2 = \frac{35}{6} \approx 5.83$
Now if we plot $P(x)$ against $x$, taking the values from the
above table, we get an interesting
symmetric distribution around a peak! The
peak is at $x = 7$ (mean value).
The distribution shows a peak at the
middle and it is symmetric!
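The two-dice table can be generated by listing all 36 equally likely combinations. The sketch below is an illustration only; the variable names are chosen here and exact fractions are used so that nothing is lost to rounding.

```python
from fractions import Fraction
from itertools import product

# Probability of each possible sum x of two fair dice (36 equally likely pairs).
dist = {}
for d1, d2 in product(range(1, 7), repeat=2):
    dist[d1 + d2] = dist.get(d1 + d2, 0) + Fraction(1, 36)

mean = sum(x * p for x, p in dist.items())         # sum of x P(x)
mean_sq = sum(x * x * p for x, p in dist.items())  # sum of x^2 P(x)
variance = mean_sq - mean ** 2

for x in sorted(dist):
    print(x, dist[x])
print(mean, variance)
```

The probabilities rise from the sum 2 up to the sum 7 and fall symmetrically afterwards, which is the peaked shape described in the text.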
We can go on doing such experiments, taking 3 or more dice together and asking for the sum of the
values and the corresponding probabilities as above. The distribution becomes smoother and
smoother, tending towards a definite shape while retaining
the peak at the centre.
[In fact, the envelope of the probability values at different $x$ (joining the tops of the bars)
of the discrete distribution will slowly assume a continuous symmetric curve!]
In the limit of a large number of events, obtained from a large number of dice thrown
together, we tend to get a continuous bell shaped symmetric distribution.
This is the Normal Distribution.
For a large number of independent random observations, the probability distribution for the
mean of the observations can be shown to be a Normal distribution. This is called the Central Limit
Theorem.
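The statement about the mean of many observations can be explored with a small simulation. Everything specific below (30 dice per trial, 5000 repeats, the fixed seed and the names used) is an assumption made for illustration; the sketch only checks that the means cluster around 3.5 with roughly the share expected of a Normal curve lying within one standard deviation.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so that the run is repeatable

def mean_of_dice(n):
    """Roll n fair dice and return the mean of the numbers shown."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

n_dice, repeats = 30, 5000
means = [mean_of_dice(n_dice) for _ in range(repeats)]

centre = statistics.mean(means)     # should be close to 3.5
spread = statistics.pstdev(means)   # should be close to sqrt((35/12) / 30)

# Fraction of the means lying within one standard deviation of the centre.
within_one_sd = sum(abs(m - centre) <= spread for m in means) / repeats

print(centre, spread, within_one_sd)
```

A single die has a flat distribution, yet the means of 30 dice already pile up around 3.5, with roughly two thirds of them within one standard deviation.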
Shape of a Distribution: Symmetry, Skewness, Kurtosis
Skewness:
A Normal distribution is symmetric around its peak. The peak corresponds to the most probable
value, that is, the value for which the probability is the maximum. An interesting thing about a
symmetric distribution with a single peak is that the mean, median and mode are at the same position.
The skewness is any deviation from symmetry or, we can say, lack of symmetry. For a symmetric
distribution, skewness is zero.
Coefficient of skewness = $\frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$
The following mathematical definition is often used to measure the skewness:
Skew = $\frac{1}{N} \sum_{i=1}^{N} \left( \frac{x_i - \bar{x}}{\sigma} \right)^3$,
where $\sigma$ is the standard deviation of the distribution. So, we see that the skewness is a
dimensionless quantity.
Skewness can be positive or negative. A distribution with a positive value of skewness is called
positively skewed, which means the tail of the distribution is more extended towards the more
positive values of $x$. On the other hand, a distribution with a negative value of skewness is
called negatively skewed, which means the tail is more extended towards more negative values
(or lower values) of $x$.
Below are the two figures demonstrating negative and positive skewness: the distributions
are correspondingly called negatively skewed and positively skewed distributions.
(Negative Skewness: Mean < Mode) (Positive Skewness: Mean > Mode)
Kurtosis:
Kurtosis is another measure of the shape of the distribution. It tells us about the
peakedness (what the peak looks like) or flatness of the probability distribution.
A Normal distribution is considered as a standard (or benchmark) in this regard. So, any change
of shape of the peak of a distribution (peakedness or flatness) compared to a Normal
distribution is measured.
The mathematical expression for kurtosis:
Kurt = $\frac{1}{N} \sum_{i=1}^{N} \left( \frac{x_i - \bar{x}}{\sigma} \right)^4 - 3$
Note that the number 3 is subtracted from the expression so as to make the value of kurtosis
for a Normal distribution equal to zero. It can be shown that
$\frac{1}{N} \sum_{i=1}^{N} \left( \frac{x_i - \bar{x}}{\sigma} \right)^4 - 3 = 0$ for a Normal
distribution.
When kurtosis is positive, the peak of the distribution appears sharper relative to a Normal
distribution. The distribution is then called leptokurtic. On the other hand, when the kurtosis
is negative, the distribution looks flatter
compared to a Normal distribution. As the distribution looks almost flat on top, it is called
platykurtic. A distribution with zero kurtosis, such as the Normal distribution itself, is called mesokurtic.
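As a quick illustration of these shape measures, the sketch below computes skewness and kurtosis for two small data sets. It assumes the moment definitions Skew = $\frac{1}{N}\sum \left(\frac{x_i - \bar{x}}{\sigma}\right)^3$ and Kurt = $\frac{1}{N}\sum \left(\frac{x_i - \bar{x}}{\sigma}\right)^4 - 3$; the function name `shape` and the two data sets are made up for illustration.

```python
def shape(values):
    """Return (skewness, kurtosis) from the third and fourth standardised moments."""
    n = len(values)
    mean = sum(values) / n
    sigma = (sum((x - mean) ** 2 for x in values) / n) ** 0.5
    skew = sum(((x - mean) / sigma) ** 3 for x in values) / n
    kurt = sum(((x - mean) / sigma) ** 4 for x in values) / n - 3
    return skew, kurt

symmetric = [1, 2, 3, 4, 5, 6]         # flat and symmetric, like one die
right_tail = [1, 1, 1, 2, 2, 3, 10]    # long tail towards large values

print(shape(symmetric))
print(shape(right_tail))
```

The flat symmetric set gives a skewness of practically zero and a negative kurtosis (platykurtic), while the set with one large value gives a positive skewness.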
Fig. [Platykurtic: negative kurtosis]
If a distribution has more than one peak
The distribution we discussed (and shall consider throughout) is a unimodal distribution, that is, a
distribution which has a single mode or one peak. But in many practical cases, we can have a
distribution with many peaks or many modes. For example, a distribution with two peaks (in the
fig. below) is called a bimodal distribution.
Z-Distribution
What is a Z-distribution?
A Z-distribution is nothing but a Normal distribution with the peak (mean) at zero and a standard deviation of one.
The peak of a Normal distribution is generally at a finite value $\mu$ with a standard deviation $\sigma$
(say). If we consider a new variable
$z = \frac{x - \mu}{\sigma}$,
the given Normal distribution (of variable $x$) becomes another Normal distribution (of variable $z$) with the peak value at $z = 0$, and this is
then called the Z-distribution.
[The derivation of the Z-distribution is given in the appendix for those who are interested.]
For solving problems with a Normal distribution, it is often advantageous to obtain a
Z-distribution and then to consult a Z-table.
In the following, we demonstrate with some examples how that is done.
Consider the following typical situations where we have to calculate the areas from the
Z-distribution:
Fig.
(Total area under the curve = 1)
Fig.
(Area between $z = 0$ and $z = +\infty$ is 0.5, or area between $z = -\infty$
and $z = 0$ is 0.5, because of symmetry)
Fig.
(Area between $z = 0$ and any other value of $z$)
Fig.
(Area between two positive values of $z$ or between two negative values)
Fig.
(Area between a negative value and a positive value)
Fig.
(Area less than a negative value or greater than a
positive value)
Important:
In the z-score table we always look for the area between zero and any other value of $z$ (as the
integral is actually done that way). So, zero is always the reference point.
Finally, the area between any two values of $z$ is obtained by adding or subtracting the areas
measured from zero. This will be clear from the following examples.
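The bookkeeping of areas measured from zero can be written out in a few lines. This is a sketch, not the way the printed table was produced: it assumes the standard relation that the area between 0 and $z$ equals $\frac{1}{2}\,$erf($|z|/\sqrt{2}$), and the function names are chosen here for illustration.

```python
import math

def area_zero_to(z):
    """Area under the Z-distribution between 0 and z (a table entry for |z|)."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

def signed_area(z):
    """The same area, counted as negative when z lies to the left of zero."""
    return math.copysign(area_zero_to(z), z)

def area_between(z1, z2):
    """Area between two z values, built from the two areas measured from zero.

    Opposite sides of zero: the two areas add. Same side: they subtract.
    """
    return abs(signed_area(z2) - signed_area(z1))

print(round(area_zero_to(1.2), 4))
print(round(area_between(-0.46, 2.21), 4))
print(round(area_between(0.81, 1.94), 4))
```

The same two rules, add across zero and subtract on one side, are what we apply by hand with the printed table.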
Examples:
(Some typical problems are discussed, consult the z-score table given in the appendix.)
#1. In the Geography examination, the marks distribution is known to be Normal where the
mean is 52 and the standard deviation is 15. Determine the z-scores of students receiving
marks: (i) 40, (ii) 95, (iii) 52.
Solution: Here, $\mu = 52$, $\sigma = 15$ and $z = \frac{x - \mu}{\sigma}$.
(i) $z = \frac{40 - 52}{15} = -0.8$
(ii) $z = \frac{95 - 52}{15} = 2.87$
(iii) $z = \frac{52 - 52}{15} = 0$
So, we see the z-scores can be negative, positive or zero.
#2. Find the area under the normal curve in each of the following cases:
(i) Between $z = 0$ and $z = 1.2$
Area = 0.3849 from table.
(ii) Between $z = -0.68$ and $z = 0$
Area = 0.2518
(Note: The area is equal to the area between $z = 0$ and $z = 0.68$ as the curve is symmetric.)
(iii) Area between $z = -0.46$ and $z = 2.21$
Area = (area between $z = 0$ and 2.21) + (area between $z = 0$ and -0.46)
= 0.4864 + 0.1772 = 0.6636
(Note: The areas are added as they are on both sides of $z = 0$.)
(iv) Area between $z = 0.81$ and $z = 1.94$
Required area = (area between $z = 0$ and 1.94) - (area between $z = 0$ and 0.81)
= 0.4738 - 0.2910 = 0.1828
(Note: There is a subtraction as the two areas are on the same side of $z = 0$.)
(v) To the left of $z = -0.6$
Required area = 0.5 - (area between $z = 0$ and $z = -0.6$)
= 0.5 - 0.2257 = 0.2743
(vi) To the right of $z = -1.28$
Required area = (area between $z = -1.28$ and $z = 0$) + 0.5
= (area between $z = 0$ and $z = 1.28$) + 0.5
= 0.3997 + 0.5 = 0.8997
#3. Among 1000 students, the mean score in the final examination is 25 and the standard
deviation is 4.0. Assume the distribution is Normal. Find the following.
(a) How many students score between 22 and 27?
$\mu = 25$, $\sigma = 4.0$
$z_1 = \frac{22 - 25}{4.0} = -0.75$, $z_2 = \frac{27 - 25}{4.0} = 0.5$
So the probability is the area under the curve between -0.75 and 0.5
= (area between 0 and -0.75) + (area between 0 and 0.5)
= 0.2734 + 0.1915 = 0.4649
The number of students in this marks range = $1000 \times 0.4649 \approx 465$
(b) How many students score above 30?
$z = \frac{30 - 25}{4.0} = 1.25$
Probability = area to the right of $z = 1.25$
= 0.5 - (area between 0 and 1.25)
= 0.5 - 0.3944 = 0.1056
The number of students = $1000 \times 0.1056 \approx 106$
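(c) How many students score below 15?
$z = \frac{15 - 25}{4.0} = -2.5$
Area = 0.5 - (area between $z = 0$ and -2.5) = 0.5 - 0.4938 = 0.0062
The number of students = $1000 \times 0.0062 \approx 6$
(d) How many score 24?
Here we have to calculate the area between 23.5 and 24.5.
$z_1 = \frac{23.5 - 25}{4.0} = -0.375 \approx -0.38$, $z_2 = \frac{24.5 - 25}{4.0} = -0.125 \approx -0.13$
Area between $z = -0.38$ and $z = -0.13$
= (area between 0 and -0.38) - (area between 0 and -0.13)
= 0.1480 - 0.0517 = 0.0963
The number of students = $1000 \times 0.0963 \approx 96$.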
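The four parts of this examination problem can be cross-checked without a table. The sketch below assumes the standard relation between the cumulative Normal area and the error function; the helper names `phi` and `count_between` are introduced here only for illustration.

```python
import math

def phi(z):
    """Area under the Z-distribution to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma, students = 25, 4.0, 1000

def count_between(low, high):
    """Expected number of students with marks between low and high."""
    z_low, z_high = (low - mu) / sigma, (high - mu) / sigma
    return round(students * (phi(z_high) - phi(z_low)))

between_22_27 = count_between(22, 27)
above_30 = count_between(30, math.inf)
below_15 = count_between(-math.inf, 15)
exactly_24 = count_between(23.5, 24.5)   # marks of 24 treated as 23.5 to 24.5

print(between_22_27, above_30, below_15, exactly_24)
```

Small differences from a hand calculation can arise because the table is read at z values rounded to two decimal places.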
Binomial Distribution
Before we discuss the Binomial distribution, we should know certain basic mathematical
operations. Those who are not familiar with some of the mathematical notations and rules may
consult the necessary introduction given in the Box that follows.
Binomial Probability:
Suppose the probability of a certain event occurring is $p$ and of the event not occurring is
$q = 1 - p$. In a total of $n$ trials, the particular event occurs $r$ times, each with probability $p$, and
does not occur $n - r$ times, each with probability $q$. Also, we have to know which $r$ events
will occur out of the total $n$ events. The number of ways we can do that is the number of
combinations = $\binom{n}{r}$. Consider a variable $x$ which is equal to the relative frequency,
$x = \frac{r}{n}$.
As the events are considered independent, the joint probability will be
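$P(r) = \binom{n}{r}\, p^r q^{n-r}$
The above probability is called binomial probability.
Now consider the following table based on the binomial probability:
$r$        0       1                           2                             ...   $n$
$P(r)$     $q^n$   $\binom{n}{1} p\,q^{n-1}$   $\binom{n}{2} p^2 q^{n-2}$   ...   $p^n$
[Box]
Factorial: $n! = n \times (n-1) \times (n-2) \times \dots \times 2 \times 1$
For example, $4! = 4 \times 3 \times 2 \times 1 = 24$
Note that the factorial of a negative integer has no meaning, and $0! = 1$.
Note that we can write $n! = n \times (n-1)!$
Permutation: In how many different ways can $n$ objects be arranged among themselves? The
answer is the permutation of $n$ objects, $n!$
For example, for three objects A, B, C, the different arrangements are ABC, ACB, BCA,
BAC, CAB, CBA: total 6 ways = $3!$
Combination: $\binom{n}{r}$ or $^nC_r = \frac{n!}{r!\,(n-r)!}$
This is the number of ways $r$ objects can be selected from $n$ objects.
For example, if we want to know in how many ways 2 students can be selected from a total of 3 students,
the answer is $\binom{3}{2} = \frac{3!}{2!\,(3-2)!} = \frac{3!}{2!\,1!} = 3$.
Also note for quick calculations, $\binom{n}{0} = \frac{n!}{0!\,n!} = 1$, $\binom{n}{1} = \frac{n!}{1!\,(n-1)!} = n$ and
$\binom{n}{n} = \frac{n!}{n!\,0!} = 1$.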
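The counting rules in the Box can be checked by actually listing the arrangements and selections. This is a small sketch using Python's standard library; the labels `S1`, `S2`, `S3` for the three students are made up for illustration.

```python
import math
from itertools import combinations, permutations

# All arrangements of three objects A, B, C: there should be 3! of them.
arrangements = ["".join(p) for p in permutations("ABC")]

# All ways of selecting 2 students out of 3: there should be C(3, 2) of them.
pairs = list(combinations(["S1", "S2", "S3"], 2))

print(arrangements, math.factorial(3))
print(pairs, math.comb(3, 2))
```

Here `math.factorial` and `math.comb` (available in Python 3.8 and later) give the same numbers as the formulas $n!$ and $\frac{n!}{r!\,(n-r)!}$.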
If we add all the terms of the second row of the above table, we get the following binomial expansion:
$(q + p)^n = q^n + \binom{n}{1} q^{n-1} p + \binom{n}{2} q^{n-2} p^2 + \dots + p^n$ (1)
From the expression (1) above, we can easily check the following known algebraic formulas:
$(q + p)^2 = q^2 + 2qp + p^2$.
$(q + p)^3 = q^3 + 3q^2 p + 3q p^2 + p^3$, and so on.
The coefficients of the terms on the right of the above can be arranged in the following
triangular form, which is called Pascal's triangle:
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
The Rule:
A number in a row (except the rightmost and leftmost ones, which are always 1) is the sum of the two
numbers to its left and right in the preceding row.
So, from the 8th row of Pascal's triangle we can easily write the binomial expansion:
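$(q + p)^8 = q^8 + 8q^7 p + 28q^6 p^2 + 56q^5 p^3 + 70q^4 p^4 + 56q^3 p^5 + 28q^2 p^6 + 8q p^7 + p^8$
Remember that each term represents a binomial probability. A binomial distribution is a
collection of these discrete binomial probabilities. Note: since $p + q = 1$, the probabilities add up to 1.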
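The rule for building Pascal's triangle is easy to turn into a few lines of code. The following is only a sketch; the function name `pascal_row` and the choice $p = q = 0.5$ are assumptions made for illustration.

```python
def pascal_row(n):
    """Return row n of Pascal's triangle, built from the row above by addition."""
    row = [1]
    for _ in range(n):
        # Each inner number is the sum of the two neighbours in the previous row.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row

row8 = pascal_row(8)

# Each term of (q + p)^8 is a binomial probability; together they should add to 1.
p = q = 0.5
probabilities = [c * p**r * q**(8 - r) for r, c in enumerate(row8)]

print(row8)
print(sum(probabilities))
```

The coefficients come out symmetric, and with $p + q = 1$ the nine probabilities add up to 1.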
Example #1:
Five independent shots are fired at a target. The probability of a hit from each shot is 0.4.
Q. What is the probability that two shots will hit the target?
Ans. Here , , ,
( )
!
! !
Q. What is the probability that there will be more than two hits?
Ans. Prob. = ( ) (
) (
)
= !
! !
!
! !
!
! !
= !
!
!
!
=
Q. What is the expectation value of the hits (that is the mean value of hitting the targets out of
all five shots)?
Ans. For this we have to calculate the probabilities $P(0)$, $P(1)$, $P(2)$, ... for the corresponding number
of hits 0, 1, 2, ...
The expectation value,
$E = \sum_{r=0}^{5} r\,P(r)$
$= 0 + 1 \binom{5}{1} (0.4)(0.6)^4 + 2 \binom{5}{2} (0.4)^2 (0.6)^3 + 3 \binom{5}{3} (0.4)^3 (0.6)^2 + 4 \binom{5}{4} (0.4)^4 (0.6) + 5 \binom{5}{5} (0.4)^5$
$= 0.2592 + 0.6912 + 0.6912 + 0.3072 + 0.0512 = 2.0$
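The three answers of Example #1 can be checked numerically. The following is a small sketch using only the Python standard library; the helper name `binomial_pmf` is chosen here for illustration.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


n, p = 5, 0.4  # five shots, probability of a hit 0.4 (Example #1)
probs = [binomial_pmf(r, n, p) for r in range(n + 1)]

p_two = probs[2]                      # exactly two hits
p_more_than_two = sum(probs[3:])      # three, four or five hits
expected_hits = sum(r * pr for r, pr in enumerate(probs))

print("P(2 hits)        =", round(p_two, 5))
print("P(more than 2)   =", round(p_more_than_two, 5))
print("expectation value =", round(expected_hits, 5))
```

The expectation value found this way agrees with the product $np = 5 \times 0.4$.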
Example #2:
Now, imagine a situation where we toss 8 coins together or we toss one coin 8 times
consecutively. We measure the relative occurrence of Head in 8 trials. Let us attach values,
Head = 1 and Tail = 0. So, we can think of a variable $x$ which can take values 0, 1/8, 2/8, 3/8,
4/8, and so on up to 1. Thus we can associate probabilities with the values of $x$ directly from
Pascal's triangle (or by using the formula). Note that the probability of occurring Head is
$p = 1/2$ and of not occurring Head is $q = 1/2$.
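$P(0) = \left(\frac{1}{2}\right)^8 = \frac{1}{256}$, $P(1/8) = \binom{8}{1} \left(\frac{1}{2}\right)^8 = \frac{8}{256}$
$P(2/8) = \binom{8}{2} \left(\frac{1}{2}\right)^8 = \frac{28}{256}$, $P(3/8) = \binom{8}{3} \left(\frac{1}{2}\right)^8 = \frac{56}{256}$
$P(4/8) = \binom{8}{4} \left(\frac{1}{2}\right)^8 = \frac{70}{256}$, $P(5/8) = \binom{8}{5} \left(\frac{1}{2}\right)^8 = \frac{56}{256}$
$P(6/8) = \binom{8}{6} \left(\frac{1}{2}\right)^8 = \frac{28}{256}$, $P(7/8) = \binom{8}{7} \left(\frac{1}{2}\right)^8 = \frac{8}{256}$
$P(1) = \left(\frac{1}{2}\right)^8 = \frac{1}{256}$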
If we now plot $P(x)$ against $x$, we get the following symmetric discrete distribution with the
peak value at $x = 4/8 = 1/2$.
[Fig.: plot of $P(x)$ against $x$ for 8 tosses]
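As a quick check of Example #2, the probabilities can be tabulated exactly with fractions. This is a sketch only; the variable names are chosen for this illustration.

```python
from fractions import Fraction
from math import comb

n = 8  # number of tosses
# x = r/n is the relative occurrence of Head; P(x) = C(n, r) * (1/2)^n
distribution = {Fraction(r, n): Fraction(comb(n, r), 2**n) for r in range(n + 1)}

for x, prob in distribution.items():
    r = int(x * n)  # number of Heads corresponding to this value of x
    print(f"x = {str(x):>3}   P(x) = {comb(n, r)}/{2**n} = {float(prob):.4f}")

# value of x at which the probability is largest
peak_x = max(distribution, key=distribution.get)
print("peak at x =", peak_x)
```

The numerators printed are exactly the entries of the 8th row of Pascal's triangle.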
For a large number of trials, this distribution approaches the Normal distribution. Therefore, we
can say the following:
The Binomial probability distribution for a random variable approaches the Normal distribution
for a large number of trials.
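The Z-Table
(Each entry is the area under the standard Normal curve between 0 and $z$. The row gives $z$ up to
the first decimal place and the column gives the second decimal place.)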
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.1 0.03983 0.04380 0.04776 0.05172 0.05567 0.05962 0.06356 0.06749 0.07142 0.07535
0.2 0.07926 0.08317 0.08706 0.09095 0.09483 0.09871 0.10257 0.10642 0.11026 0.11409
0.3 0.11791 0.12172 0.12552 0.12930 0.13307 0.13683 0.14058 0.14431 0.14803 0.15173
0.4 0.15542 0.15910 0.16276 0.16640 0.17003 0.17364 0.17724 0.18082 0.18439 0.18793
0.5 0.19146 0.19497 0.19847 0.20194 0.20540 0.20884 0.21226 0.21566 0.21904 0.22240
0.6 0.22575 0.22907 0.23237 0.23565 0.23891 0.24215 0.24537 0.24857 0.25175 0.25490
0.7 0.25804 0.26115 0.26424 0.26730 0.27035 0.27337 0.27637 0.27935 0.28230 0.28524
0.8 0.28814 0.29103 0.29389 0.29673 0.29955 0.30234 0.30511 0.30785 0.31057 0.31327
0.9 0.31594 0.31859 0.32121 0.32381 0.32639 0.32894 0.33147 0.33398 0.33646 0.33891
1.0 0.34134 0.34375 0.34614 0.34849 0.35083 0.35314 0.35543 0.35769 0.35993 0.36214
1.1 0.36433 0.36650 0.36864 0.37076 0.37286 0.37493 0.37698 0.37900 0.38100 0.38298
1.2 0.38493 0.38686 0.38877 0.39065 0.39251 0.39435 0.39617 0.39796 0.39973 0.40147
1.3 0.40320 0.40490 0.40658 0.40824 0.40988 0.41149 0.41308 0.41466 0.41621 0.41774
1.4 0.41924 0.42073 0.42220 0.42364 0.42507 0.42647 0.42785 0.42922 0.43056 0.43189
1.5 0.43319 0.43448 0.43574 0.43699 0.43822 0.43943 0.44062 0.44179 0.44295 0.44408
1.6 0.44520 0.44630 0.44738 0.44845 0.44950 0.45053 0.45154 0.45254 0.45352 0.45449
1.7 0.45543 0.45637 0.45728 0.45818 0.45907 0.45994 0.46080 0.46164 0.46246 0.46327
1.8 0.46407 0.46485 0.46562 0.46638 0.46712 0.46784 0.46856 0.46926 0.46995 0.47062
1.9 0.47128 0.47193 0.47257 0.47320 0.47381 0.47441 0.47500 0.47558 0.47615 0.47670
2.0 0.47725 0.47778 0.47831 0.47882 0.47932 0.47982 0.48030 0.48077 0.48124 0.48169
2.1 0.48214 0.48257 0.48300 0.48341 0.48382 0.48422 0.48461 0.48500 0.48537 0.48574
2.2 0.48610 0.48645 0.48679 0.48713 0.48745 0.48778 0.48809 0.48840 0.48870 0.48899
2.3 0.48928 0.48956 0.48983 0.49010 0.49036 0.49061 0.49086 0.49111 0.49134 0.49158
2.4 0.49180 0.49202 0.49224 0.49245 0.49266 0.49286 0.49305 0.49324 0.49343 0.49361
2.5 0.49379 0.49396 0.49413 0.49430 0.49446 0.49461 0.49477 0.49492 0.49506 0.49520
2.6 0.49534 0.49547 0.49560 0.49573 0.49585 0.49598 0.49609 0.49621 0.49632 0.49643
2.7 0.49653 0.49664 0.49674 0.49683 0.49693 0.49702 0.49711 0.49720 0.49728 0.49736
2.8 0.49744 0.49752 0.49760 0.49767 0.49774 0.49781 0.49788 0.49795 0.49801 0.49807
2.9 0.49813 0.49819 0.49825 0.49831 0.49836 0.49841 0.49846 0.49851 0.49856 0.49861
3.0 0.49865 0.49869 0.49874 0.49878 0.49882 0.49886 0.49889 0.49893 0.49896 0.49900
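The entries of the table can be reproduced from the error function, since the area under the standard Normal curve between 0 and $z$ equals $\frac{1}{2}\,\mathrm{erf}(z/\sqrt{2})$. A minimal sketch (the function name `area_0_to_z` is made up here):

```python
from math import erf, sqrt


def area_0_to_z(z):
    """Area under the standard Normal curve between 0 and z."""
    return 0.5 * erf(z / sqrt(2))


# compare a few values with the table: row 1.0 / column 0.00, row 1.9 / column 0.06, ...
for z in (1.00, 1.645, 1.96, 2.33, 3.00):
    print(f"z = {z:<5}  area = {area_0_to_z(z):.5f}")
```

For example, the row 1.9 and the column 0.06 of the table correspond to `area_0_to_z(1.96)`.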
Sampling
Basic Concept:
What is sampling?
Sampling means taking a subset of the population for a particular study. The aim is to
select the data sample so that it represents the total data set.
In statistics, population means the total collection of data. When the population, that is the
entire collection of data, is studied, it is called a census.
In short, the population is the total set and the sample is a subset of it.
Why is sampling done?
When the number of elements in a population is large, it is often not possible to
investigate the population completely due to lack of time, money and resources. This is
why sampling is necessary.
Sampling is done in such a way that the subset of data represents the entire set.
Example:
If a TV channel wants to know the popularity of a program, it would be expensive to ask
everybody's opinion. Instead, a subset of viewers is interviewed and the data are
collected.
Methods of Sampling:
A sample of size $n$ means there are $n$ data points in the collection. A sample of size $n$ is
collected from a population of size $N$ in such a way that all the features of the population are
well represented by it.
If a sampling method over-represents or under-represents a feature of the population, it is
said to be biased. The aim of any selection method is to reduce the chance of bias as far as
possible.
There are several methods of sampling; the most common among them is random
sampling.
Random sampling:
For a sample of size $n$, we collect $n$ data from the population. We collect many such
samples for our evaluation. If this is done randomly so that each group of size $n$ taken
from the population has an equal chance of getting selected, we call this random sampling.
Sometimes, it is called simple random sampling.
For random sampling, the successive drawings have to be independent.
Let us suppose we want to select a sample of size 100 from a population of size 10000.
In random sampling, we decide which elements are to be picked with the help of random
numbers (generated in a computer), by consulting a random number table, or by some kind
of dice throwing.
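The selection of 100 elements out of 10000 with computer-generated random numbers may be sketched as below. It is assumed here that the population is simply numbered 1 to 10000 and that the elements are drawn without replacement; the seed value is arbitrary and only makes the run repeatable.

```python
import random

population = list(range(1, 10001))   # serial numbers 1 .. 10000 (assumed labelling)
rng = random.Random(0)               # fixed seed, an arbitrary choice for repeatability

# draw 100 distinct elements; every subset of size 100 is equally likely
sample = rng.sample(population, 100)

print("first ten selected serial numbers:", sorted(sample)[:10])
```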
Systematic Sampling:
If simple random sampling from the population is not possible, systematic sampling may
be done. First, the population is enumerated from 1 onwards. If a sample of size $n$ from a
population of size $N$ is to be obtained, every $k$-th item is selected, where $k = N/n$.
First a random number between 1 and $k$ is selected and the item with that number is taken as
the 1st element. After this every $k$-th element is taken.
Example:
Follow the table given below.
Sl no. value
1 20
2 27
3 33
4 21
5 15
6 22
7 45
8 13
9 32
10 29
11 10
12 16
For a sample of size $n = 4$, $k = N/n = 12/4 = 3$.
Select a random number between 1 and 3: choose 2, for example.
Start with #2 and then take the data numbered 5, 8 and 11.
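The example can be written as a short Python sketch. The function name `systematic_sample` is made up for this illustration, and the starting serial number is passed in by hand (2, as in the example) instead of being drawn at random.

```python
values = [20, 27, 33, 21, 15, 22, 45, 13, 32, 29, 10, 16]  # the table above, Sl no. 1..12


def systematic_sample(data, n, start):
    """Take every k-th item (k = N // n), beginning with serial number `start` (1-based)."""
    k = len(data) // n
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    # serial number s corresponds to list index s - 1
    return [data[i] for i in range(start - 1, len(data), k)][:n]


picked = systematic_sample(values, n=4, start=2)
print(picked)  # values at serial numbers 2, 5, 8, 11
```

In practice the starting number would be drawn at random between 1 and $k$, for instance with `random.randint(1, k)`.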
Stratified Sampling:
In this method, the population is first divided into groups (strata). Each element of the
sample belongs to one such group.
Divide the population into $k$ non-overlapping groups containing $N_1$, $N_2$, ..., $N_k$ data such
that $N_1 + N_2 + \dots + N_k = N$. Next do the simple random sampling to collect one or
a few elements from each group.
Suppose, a population is classified into several groups according to age or something
like that. Then from each group random samples are collected.
Note: This is also called restricted random sampling.
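A sketch of stratified sampling is given below. The age groups, their sizes and the number of elements taken per group are all hypothetical, chosen only to show the two steps (divide into strata, then sample randomly within each).

```python
import random


def stratified_sample(strata, per_group, seed=0):
    """Draw `per_group` elements at random (without replacement) from every stratum."""
    rng = random.Random(seed)  # fixed seed only to make the run repeatable
    return {name: rng.sample(members, per_group) for name, members in strata.items()}


# hypothetical population of 100 people, labelled 0..99 and grouped by age
strata = {
    "young": list(range(0, 40)),     # N_1 = 40
    "middle": list(range(40, 70)),   # N_2 = 30
    "old": list(range(70, 100)),     # N_3 = 30
}

picked = stratified_sample(strata, per_group=3)
for name, members in picked.items():
    print(name, members)
```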
Cluster Sampling:
In this method, like before, the population is divided into groups called clusters. Then
clusters are taken randomly and the elements are collected from them as sample.
Probability sampling
Any method of sampling that uses (probabilistically) random selection is in general
called probability sampling.
Sampling variation:
When sampling from a population is done, we may take not one sample but different sets of
samples having the same size. The differences among these samples are called sampling variation.
In practice, we usually draw only one sample or one set of data from a population.
But we may not be sure what may happen in case we draw several other samples. Will
we get the same result? The answer is No. If we look at the mean value, we see that the
mean is not the same for all the samples that we are able to draw. We then get a
distribution of the sample means.
$N$ = population size, $n$ = sample size, $n/N$ = the sample fraction.
Many samples of the same size yield a sampling distribution.
The sampling distributions are usually assumed to follow some well-known probability
distribution.
We look for various properties from the distribution curves.
It is seen how the variation of sample size can affect the properties.
From experience and theory, we can say that the variability of sampling
distributions decreases as the sample size increases.
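This statement can be illustrated by a small simulation. Everything in the sketch below is an assumption made for the illustration: a synthetic population of 10000 Normal values (mean 50, standard deviation 15), 500 repeated samples, and the sample sizes 10 and 100.

```python
import random
import statistics


def spread_of_sample_means(population, n, repeats, rng):
    """Standard deviation of the means of `repeats` random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


rng = random.Random(1)  # arbitrary seed, for repeatability
population = [rng.gauss(50, 15) for _ in range(10000)]  # synthetic population

spread_small = spread_of_sample_means(population, n=10, repeats=500, rng=rng)
spread_large = spread_of_sample_means(population, n=100, repeats=500, rng=rng)

print("spread of sample means, n = 10 :", round(spread_small, 2))
print("spread of sample means, n = 100:", round(spread_large, 2))
```

The spread of the sample means comes out clearly smaller for the larger sample size.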
SAMPLING DISTRIBUTIONS
What do you do after the sample is collected?
The first thing one can do with a set of data is to measure the central tendency of it. Usually, we
calculate the mean and variance.
The calculation of the mean (or variance) is done over many samples of the same size. Let us
suppose we have collected $k$ samples of the same size. The mean values $\bar{x}_1$, $\bar{x}_2$, ..., $\bar{x}_k$
of the various samples are calculated. It is assumed that the grand mean of all these mean values
is the actual sample mean, $\bar{x}$.
The mean of the sample means is the estimate of the population mean. Similarly, the variance
of the mean values calculated from the set of samples (of equal size $n$) is an estimate of
$\sigma^2/n$, that is, of the population variance $\sigma^2$ divided by the sample size.
It can be shown:
The sample mean $\bar{x}$ is the unbiased estimate of the population mean, $\mu$.
For the population variance, the unbiased estimate is $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$.
Hypothesis Testing
What is Hypothesis?
On the basis of sample information, we make certain decisions about the population. In taking
such decisions we make certain assumptions. These assumptions are known as statistical
hypotheses.
[ Note: A collected set of data points which is a part of the population (a small number of data)
is called a sample. The process of selection is called sampling. The collection of all the data
considered for a study is called the population.]
How to test Hypothesis?
Assuming the hypothesis is correct, we calculate the probability of getting the observed sample.
If this probability is less than a certain assigned value, the hypothesis is rejected.
The hypothesis that there is no significant difference between the observed value and the
expected value is called the Null Hypothesis.
Test of significance:
The tests which enable us to decide whether to accept or to reject the null hypothesis are called
tests of significance. If the differences between the sample values and the population
values are significantly large, the null hypothesis is rejected.
It is known that the mean of a sample, $\bar{x}$, is an unbiased estimate of the population mean $\mu$.
It is called a point estimate. But we know that if we collect different samples, the mean ($\bar{x}$)
varies from sample to sample. The means of the samples form a distribution which we call the
sampling distribution. Note that the sampling distribution is Normal if the variable in the
population is normally distributed.
Now the question is, how close is a calculated mean to the population mean? We have to
estimate that with some level of accuracy.
Confidence Interval:
Confidence interval is a range of values over which we can trap the population mean with some
probability. So, we consider the probability distribution of sample means in order to find that
probability of trapping.
Suppose we have a sample mean $\bar{x}$ and we consider a symmetric interval around this:
$[\bar{x} - c,\ \bar{x} + c]$, where $c$ is a value that we shall determine.
If $|\bar{x} - \mu| \le c$, the confidence interval traps the population mean $\mu$.
How to calculate confidence interval?
Suppose the variable $X$ follows a Normal distribution, with mean $\mu$ and standard deviation $\sigma$.
Symbolically, $X \sim N(\mu, \sigma^2)$.
So, for a sample of size $n$, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
This means that the distribution of the mean ($\bar{x}$) of samples of size $n$ follows a Normal
distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
If the confidence interval is 95%, the interval has a probability 0.95 to trap the population
mean: $P(|\bar{x} - \mu| \le c) = 0.95$, where $c$ is the half-width of the interval.
Now as an example, consider a sampling distribution with $\mu = 0$, $\sigma/\sqrt{n} = 1$.
Here $\bar{X}$ follows the z-distribution, $Z \sim N(0, 1)$. [Normal distribution with mean = 0, std. dev. = 1]
Now let us look up the z-table. The total area under the curve is 100% which gives us the total
probability = 1. The shaded area (as in the fig.) is 95% of the total area which corresponds to
probability = 0.95.
Half of the shaded area = 0.95/2 = 0.475, as it is symmetric around zero.
In the z-distribution [$Z \sim N(0, 1)$], we now find the value of $z$ from the z-table, where the
area from $0$ to $z$ is 0.475.
This gives $z = 1.96$, which we take as the critical value, $c = 1.96$.
Thus 95% confidence interval: $-1.96 \le z \le 1.96$.
If the sample mean is $\bar{x}$, the confidence interval: $\left[\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right]$
So we can say with 95% confidence level that the population mean lies in this interval.
Let us now calculate the width of the confidence interval for 95% confidence: width $= 2 \times 1.96\,\frac{\sigma}{\sqrt{n}}$
So, we can see that the interval narrows as the sample size increases. That is, we can
narrow down the search for the population mean as we take a larger sample size. Then we can say
with more accuracy that our measured mean is closer to the population mean.
For example, for ,
For ,
,
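For 98% Confidence interval:
Shaded area = 0.98. Half the shaded area = 0.98/2 = 0.49, which is between $z = 0$ and $z = 2.326$.
Thus 98% confidence interval is $\left[\bar{x} - 2.326\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2.326\,\frac{\sigma}{\sqrt{n}}\right]$.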
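The dependence of the interval on the sample size and on the confidence level can be seen in a short sketch. The numbers used below (sample mean 50, $\sigma = 15$, sample sizes 25 and 100) are hypothetical and are not taken from the notes; the z values are the standard critical values for the listed confidence levels.

```python
from math import sqrt

# critical z values for some common confidence levels (in percent)
Z_VALUES = {90: 1.645, 95: 1.96, 98: 2.326, 99: 2.576}


def z_confidence_interval(sample_mean, sigma, n, level=95):
    """Interval sample_mean -/+ z * sigma / sqrt(n), for a known population sigma."""
    half_width = Z_VALUES[level] * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# hypothetical values, only for illustration
for n in (25, 100):
    low, high = z_confidence_interval(50.0, 15.0, n, level=95)
    print(f"n = {n:3d}: [{low:.2f}, {high:.2f}]  width = {high - low:.2f}")
```

Taking four times as many data halves the width of the interval, because the width is proportional to $1/\sqrt{n}$.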
NOTE #1:
Symbolically, it is often said that the confidence level is $100(1 - \alpha)\%$, where $0 < \alpha < 1$.
This $\alpha$ is also the significance level.
For example, for $\alpha = 0.05$: confidence level = 95%, significance level = 5%. Confidence level
and significance level are complementary.
NOTE: For a sample of size $n$, with population variance $\sigma^2$, a 95% confidence
interval means $\left[\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right]$
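NOTE #2:
When we are not sure if the population is Normal and we do not know the population variance
$\sigma^2$, we can still use the method of calculating the confidence interval by considering the
variance of a large sample (usually $n \ge 30$).
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
Then we consider the interval, $\left[\bar{x} - z\,\frac{s}{\sqrt{n}},\ \bar{x} + z\,\frac{s}{\sqrt{n}}\right]$.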
Student's t-test:
This is applied to find the confidence interval for a small sample. The population is Normal.
Consider the variable $t$ defined as $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
[Note here, we use $s$, calculated for the sample, instead of $\sigma$.]
The value of the variable $t$ varies from sample to sample and thus it forms a distribution
looking very similar to the Normal distribution. This is the t-distribution. As we take larger and
larger samples, the t-distributions become closer and closer to a Normal distribution, $N(0, 1)$,
which is nothing but the z-distribution.
Confidence Level    z
90%                 1.645
95%                 1.96
98%                 2.326
99%                 2.576
Now, instead of the sample size, the family of distributions is characterized by a parameter called
degrees of freedom (df), usually denoted by $\nu$ [Nu].
Degrees of freedom = No. of independent values used for the calculation of $s$.
For example, if $n$ is the sample size, we use $n$ data points but they are related by their mean,
$\bar{x} = \frac{1}{n} \sum x_i$ or $\sum (x_i - \bar{x}) = 0$. Such a condition in the form of a relation or equation is called a
constraint. Thus we have $n - 1$ independent quantities and this is the degrees of freedom here.
Degrees of freedom, $\nu$ = Number of values $-$ Number of constraints
In this case, $\nu = n - 1$.
The t-distributions are now designated as $t_\nu$-distributions. As $\nu$ becomes higher, the
$t_\nu$-distributions tend more and more towards the z-distribution.
Like the z-table, we now have a t-table to consult, from where we have the area under the curve
within some t-range.
So, for a Normal distribution, for a sample of size $n$, we have confidence interval:
$\left[\bar{x} - t\,\frac{s}{\sqrt{n}},\ \bar{x} + t\,\frac{s}{\sqrt{n}}\right]$, where $t$ is read from the t-table for the chosen
confidence level and for $\nu = n - 1$ degrees of freedom.
EXAMPLE #1:
Consider the following 10 measurements of some variable. The hypothesis is that the
population mean is $\mu = 0$. We have to verify that. Assume that the readings follow a Normal
Distribution.
No. of Obs.   1      2      3     4      5     6     7      8      9    10
Values      0.13  -0.09   0.06  0.15  -0.02  0.03  0.01  -0.02  -0.07  0.05
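Sample mean, $\bar{x} = \frac{1}{n} \sum x = \frac{0.23}{10} = 0.023$
$s^2 = \frac{1}{n-1} \left( \sum x^2 - n\,\bar{x}^2 \right) = \frac{1}{9} \left( 0.0603 - 10 \times 0.023^2 \right) = 0.00611$, so $s = 0.0782$
Degrees of freedom, $\nu = n - 1 = 9$
From the t-table for $\nu = 9$ with 95% confidence level, we have $t = 2.262$.
Confidence interval: $\left[\bar{x} - t\,\frac{s}{\sqrt{n}},\ \bar{x} + t\,\frac{s}{\sqrt{n}}\right] = [0.023 - 0.056,\ 0.023 + 0.056] = [-0.033,\ 0.079]$
The hypothesized mean is trapped inside the above interval. So the hypothesis is right. Null
hypothesis.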
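The arithmetic of this example can be repeated in Python. In the sketch below the critical value 2.262 is typed in from a t-table (9 degrees of freedom, 95% confidence), and the hypothesized mean `mu0 = 0.0` is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, variance

values = [0.13, -0.09, 0.06, 0.15, -0.02, 0.03, 0.01, -0.02, -0.07, 0.05]

n = len(values)
x_bar = mean(values)
s2 = variance(values)   # unbiased estimate, divides by n - 1
t_crit = 2.262          # from a t-table: 9 degrees of freedom, 95% confidence

half_width = t_crit * sqrt(s2 / n)
interval = (x_bar - half_width, x_bar + half_width)

mu0 = 0.0               # hypothesized population mean (assumed in this sketch)
print("mean =", round(x_bar, 4), " s^2 =", round(s2, 5))
print("95% confidence interval:", tuple(round(v, 4) for v in interval))
print("hypothesized mean inside interval:", interval[0] <= mu0 <= interval[1])
```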
EXAMPLE #2:
The mean life time (in Hours) of an electric bulb is measured to be 10.4. Now a technology is
introduced to increase the life time. The experimental data collected from a random sample of
size , , . Test whether there is any evidence at the 10%
significance level that the new technology has actually increased the life time.
[Note that it is not asked if there is any decrease in life time. The question is to ask whether
there is any increase or it remains the same.]
Ans.
Null hypothesis, $H_0$: $\mu = 10.4$, Alternate hypothesis, $H_1$: $\mu > 10.4$
Here we consider a one-tail t-test as we are to look for the increase only.
Sample mean, $\bar{x} = \frac{\sum x}{n}$
Unbiased estimate of the population variance (from the sample),
$s^2 = \frac{1}{n-1} \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right]$
For the t-test, $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
Here, degrees of freedom, $\nu = n - 1$. So we look for the area under the curve for the $t_\nu$
distribution.
For 10% significance, i.e. for 90% confidence level, we find . Thus our
observed value lies in the rejection region. That means that the mean life time is increased.
Alternate hypothesis.
EXAMPLE #3:
You are measuring some length which is 10 cm. Five measurements by you are 9.88, 10.18,
10.23, 10.39, 10.25 cm. Assume that the measurements follow a Normal distribution. Test at
the 5% significance level whether these results support the claim or the measurement is biased.
Ans.
Since the bias can be in either direction (positive or negative), we consider two tail test.
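The Hypothesis, Null $H_0$: $\mu = 10$ cm, Alternate $H_1$: $\mu \ne 10$ cm.
Sample mean, $\bar{x} = \frac{\sum x}{n} = \frac{50.93}{5} = 10.186$ cm
Variance, $s^2 = \frac{1}{n-1} \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right] = \frac{1}{4} \left[ 518.9143 - \frac{(50.93)^2}{5} \right] = 0.0353$; this is an unbiased
estimate of the population variance.
For the t-test, $t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{10.186 - 10}{\sqrt{0.0353/5}} \approx 2.21$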
For 5% significance level, we consider the area of 0.95 (shaded area in the fig.) around the
centre, and an area of 0.025 on both sides (at both the tails).
We consider the $t_4$ distribution as the degrees of freedom, $\nu = n - 1 = 4$. The rejection region on
either side corresponds to $|t| > 2.776$, from the table.
Here we find that the t-value is below the rejection region, that is, in the acceptance region. Thus
the hypothesis ($\mu = 10$ cm) is accepted. Null hypothesis.
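The two-tail test of Example #3 can be reproduced with the following sketch. The critical value 2.776 is typed in from a t-table (4 degrees of freedom, 0.025 in each tail) and is not computed by the code.

```python
from math import sqrt
from statistics import mean, variance

lengths = [9.88, 10.18, 10.23, 10.39, 10.25]  # the five measurements, in cm
mu0 = 10.0                                    # claimed length under the null hypothesis

n = len(lengths)
x_bar = mean(lengths)
s2 = variance(lengths)          # unbiased estimate of the population variance
t_value = (x_bar - mu0) / sqrt(s2 / n)

t_crit = 2.776                  # t-table: nu = 4, 5% significance, two tails
reject_null = abs(t_value) > t_crit

print("mean =", round(x_bar, 3), " s^2 =", round(s2, 4), " t =", round(t_value, 3))
print("reject the null hypothesis:", reject_null)
```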
Chi-Squared Test: ($\chi^2$-Test)
In some measurements, we obtain the frequencies of some events. We call them observed
frequencies ($O_i$). We have to test whether the observed frequencies are consistent with the
expected frequencies ($E_i$) according to some given distribution or hypothesis.
The measure of discrepancy between the observed and expected frequencies is defined by the
following quantity:
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
NOTE:
In one tail t-test, we consider only one side of the t-distribution, either on the right side (for
increase or positive values) or on the left side (for decrease or negative values). For two tail t-
test, we consider both sides of the distribution (as we have done before) considering the fact that
the value of the variable can increase or decrease from the mean value.
Note: $\chi^2$ (Chi-square) is a non-negative quantity; the lower its value, the better is the
agreement between the observed and expected frequencies. In other words, it gives a goodness of
fit of the model or hypothesis. For $\chi^2 = 0$, the agreement is absolute.
Like the t-distribution, we also have the $\chi^2$-distribution. We measure the $\chi^2$ values for
different samples of the same size and obtain a distribution. The distribution, here also, is
characterized by the degrees of freedom $\nu$. So for $\nu$ degrees of freedom we write $\chi^2_\nu$,
for example $\chi^2_5$ for $\nu = 5$.
EXAMPLE #1:
In a dice throw experiment, we obtain the following frequencies, where the dice was thrown 600 times.
Score    1    2    3    4    5    6
Freq.   90  108  110   95  100   97
Let us check the above with the $\chi^2$-test. Our hypothesis is that for each score, the
probability = 1/6 (for a fair dice). So the expected frequency = $600 \times \frac{1}{6} = 100$.
Hypothesis, $H_0$: the dice is fair,
$H_1$: the dice is not fair.
In this example, the degrees of freedom, $\nu = 6 - 1 = 5$.
So after we calculate $\chi^2$ from the following table, we have to look for the $\chi^2$-table.
Score   Observed (O)   Expected (E)   O - E   (O - E)^2/E
1        90            100            -10     1
2       108            100              8     0.64
3       110            100             10     1
4        95            100             -5     0.25
5       100            100              0     0
6        97            100             -3     0.09
Total   600            600              0     2.98
To calculate $\chi^2$: $\chi^2 = \sum \frac{(O - E)^2}{E}$
From the table, we see $\chi^2 = 2.98$. If we consider 90% confidence level, we have $\chi^2_5(90\%) = 9.236$.
Our obtained value for $\chi^2$ is below this. So it falls within the acceptance region. The dice is
fair; the null hypothesis is accepted.
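The $\chi^2$ value of the dice example can be checked with a few lines of Python. The critical value 9.236 is typed in from a $\chi^2$-table (5 degrees of freedom, 10% in the upper tail), not computed.

```python
observed = [90, 108, 110, 95, 100, 97]   # frequencies of the scores 1..6

total = sum(observed)
expected = [total / 6] * 6               # fair dice: each score has probability 1/6

# chi-square: sum of (O - E)^2 / E over all the scores
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

chi2_crit = 9.236                        # chi-square table: nu = 5, 90% confidence
print("chi-square =", round(chi2, 2))
print("inside the acceptance region:", chi2 < chi2_crit)
```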
EXAMPLE #2:
In a genetic study, it is predicted that the children with both parents of blood group AB will fall
into blood groups AB, A and B in the ratio 2:1:1. Out of a random sample of 100, we find 55
children have blood group AB, 27 have blood group A and 18 blood group B. Test at 10%
significance level whether the observed results agree with the theoretical prediction.
Ans.
Hypothesis $H_0$: The children's blood group is in ratio 2:1:1
$H_1$: The children's blood group is NOT in ratio 2:1:1
The ratio of probabilities AB, A, B is 2:1:1 = $\frac{1}{2} : \frac{1}{4} : \frac{1}{4}$, so the expected frequencies out of 100 are
50, 25 and 25.
Blood group   Observed (O)   Expected (E)   O - E   (O - E)^2/E
AB             55             50              5     0.5
A              27             25              2     0.16
B              18             25             -7     1.96
Total         100            100              0     2.62
Degrees of freedom $\nu = 3 - 1 = 2$
For 10% significance level we look for the $\chi^2$-distribution table: $\chi^2_2(90\%)$
We find $\chi^2_2(90\%) = 4.605$
The rejection region is thus above 4.605.
Here the obtained value $\chi^2 = 2.62$ is below the rejection region, so it falls in the acceptance
region. The hypothesis is correct. Null Hypothesis!
EXAMPLE #3
The rainfall ($x$) at some place is measured in cm in the following table. We assume that $x$ is a
random variable and it follows a Normal distribution with mean $\mu = 50$ and standard deviation
$\sigma = 15$.
Rainfall     x < 35   35 - 45   45 - 55   55 - 65   x > 65
Obs. Freq.     10       18        28        18        12
(i) Calculate the expected frequencies of the different classes
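(ii) Carry out a goodness of fit analysis at the 5% level of significance to test the
hypothesis that the random variable actually follows the Normal distribution
$N(50, 15^2)$.
Ans.
(i) $z = \frac{x - \mu}{\sigma} = \frac{x - 50}{15}$
For $x =$ 35, 45, 55, 65 we have $z =$ -1, -0.333,
0.333, 1 respectively.
Now follow the z-table.
For $x < 35$, we have
$P(z < -1) = 0.5 - 0.3413 = 0.1587$,
Expected frequency = $86 \times 0.1587 = 13.65$
Here, total frequency = $10 + 18 + 28 + 18 + 12 = 86$
For $35 < x < 45$,
$P(-1 < z < -0.333) = 0.3413 - 0.1304 = 0.2109$,
Expected frequency = $86 \times 0.2109 = 18.14$
For $45 < x < 55$,
$P(-0.333 < z < 0.333) = 2 \times 0.1304 = 0.2608$,
Expected frequency = $86 \times 0.2608 = 22.43$
By symmetry, the expected frequencies for the 4th and 5th groups are 18.14 and 13.65
respectively.
To carry out the $\chi^2$-test we prepare the following table.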
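Class      Observed (O)   Expected (E)   O - E   (O - E)^2/E
x < 35      10             13.65          -3.65   0.98
35 - 45     18             18.14          -0.14   0.00
45 - 55     28             22.43           5.57   1.38
55 - 65     18             18.14          -0.14   0.00
x > 65      12             13.65          -1.65   0.2
Total       86             86.01          -0.01   2.56
Here, $\chi^2 = 2.56$ and the degrees of freedom, $\nu = 5 - 1 = 4$.
From the $\chi^2$-distribution table, $\chi^2_4 = 9.488$, for 5% significance level.
Since 2.56 is not in the rejection region, the data follow the Normal distribution, $N(50, 15^2)$.
Null hypothesis.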
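The whole goodness of fit calculation can be sketched in Python. The mean 50 and standard deviation 15 used below are the values implied by the z-values quoted in the answer ($x = 35 \to z = -1$, $x = 45 \to z = -0.333$); the critical value 9.488 is typed in from a $\chi^2$-table. Because the code uses the exact Normal areas instead of rounded table entries, its $\chi^2$ differs slightly from the hand-calculated value.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Probability that a Normal(mu, sigma) variable is below x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


mu, sigma = 50.0, 15.0             # inferred from the z-values in the worked answer
edges = [35, 45, 55, 65]           # class boundaries, in cm
observed = [10, 18, 28, 18, 12]    # x<35, 35-45, 45-55, 55-65, x>65

total = sum(observed)
# cumulative probabilities at -infinity, each class boundary, and +infinity
cdf = [0.0] + [normal_cdf(e, mu, sigma) for e in edges] + [1.0]
expected = [total * (cdf[i + 1] - cdf[i]) for i in range(len(observed))]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
chi2_crit = 9.488                  # chi-square table: nu = 4, 5% significance

print("expected frequencies:", [round(e, 2) for e in expected])
print("chi-square =", round(chi2, 2), " accept Normal hypothesis:", chi2 < chi2_crit)
```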
Additional Information
Type I and Type II errors:
In case of Hypothesis testing, we call
Type I Error -> When we incorrectly reject the true Null Hypothesis.
Type II Error -> When we fail to reject the false Null Hypothesis.
Probability Density Function:
In probability theory, the probability density function (P.D.F.) of a continuous random variable
gives the probability per unit interval around a certain value. The P.D.F., when integrated
over a finite interval, gives the cumulative probability over that interval.
$P(a \le x \le b) = \int_a^b f(x)\,dx$, where $f(x)$ is the P.D.F.
*The lecture notes are for private circulation only. Some of the ideas and examples are
taken from some books and numerous materials available on the internet.
Books and Websites:
1. Advanced Level Mathematics: Statistics 1 & 2, Steve Dobbs and Jane Miller
(Pub: Cambridge International Examinations)
2. The Analysis of Time Series: An Introduction (fifth edition), C. Chatfield (Pub:
Chapman & Hall)
3. Basic & Clinical Biostatistics (fourth edition), Beth Dawson, Robert G. Trapp (Pub: Lange
Medical Books/McGraw-Hill)
4. Numerical Recipes (2nd Ed, Vol I FORTRAN), William H. Press, Saul A. Teukolsky, William
T. Vetterling, Brian P. Flannery (Pub: Cambridge University Press)
5. Mathematical Physics, H.K. Dass
6. Website: people.richland.edu