a short course on probability and sampling


Upload: abhijit-kar-gupta

Post on 24-Nov-2015


DESCRIPTION

Probability and Sampling: a simple approach, with minimum use of mathematics. This short course is intended to introduce the concepts and applications of the subject. I hope this will be useful for students of Geography, Biology, the Social Sciences, etc. [other than those of mathematics and mathematically oriented subjects].

TRANSCRIPT

  • A short Course on Probability Theory and Sampling, originally prepared as lecture notes for M.Sc.

    (Geography) students of Vidyasagar Univ, WB, India. Compiled by Dr. A. Kar Gupta,

    [email protected], Physics Deptt, Panskura B. College, WB, India

    PROBABILITY and SAMPLING

    Concept of Probability, the Probability Rules, Probability Distributions and Applications

    For randomly occurring events, we would like to know how many times we get a desired result out of all trials. This means we would like to know the fraction of favourable events or trials.

    Suppose we flip a coin a number of times. We know there is a 50-50 chance of getting a Head or a Tail. We may count how many times a Head or a Tail occurs out of all the flips.

    Let $n$ = number of favourable events and $N$ = total number of events.

    $n/N$ = fraction of favourable events. In the usual language of Statistics, this is the relative frequency.

    Now, if we repeat the trials a large number of times, this fraction tends to some fixed value specific to the event. This limiting value of the fraction is what we call probability.
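The idea that the relative frequency settles down as the number of trials grows can be tried out numerically. The following is a minimal sketch, not part of the original notes: it assumes a fair coin simulated with Python's pseudo-random generator, and the function name `relative_frequency` and the fixed seed are illustrative choices.

```python
import random

def relative_frequency(num_flips, seed=0):
    """Flip a simulated fair coin num_flips times; return the fraction of Heads."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

if __name__ == "__main__":
    # The fraction of Heads should drift towards 0.5 as the number of flips grows.
    for flips in (10, 100, 10_000, 1_000_000):
        print(flips, relative_frequency(flips))
```

For a small number of flips the fraction can be far from 0.5; only for a large number of flips does it stay close to the limiting value.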

    Note:

    Here the total number of trials is loosely called the "sample space" (more precisely, the sample size) when we are drawing samples out of the total "population". As the number of trials is increased, the sample becomes bigger.

    Definition of Probability:

    Probability is the ratio of the number of favourable events to the total number of events, provided the total number of events is very large (ideally infinite):

    $P = n/N$, when $N \to \infty$ (infinity).

    So by definition, $P$ is a fraction between 0 and 1: $0 \le P \le 1$.

    $P = 0$: No favourable outcome.

    $P = 1$: All the outcomes are in favour.

    We can also think in the following way: $P$ = probability of an event occurring, $Q$ = probability of the event not occurring. Since the event will either occur or not occur, we must write:

    $P + Q = 1$

    Therefore, we have $Q = 1 - P$.

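    Example #1:

    In tossing a coin, we know from experience that $P(\text{Head}) = 1/2$ and $P(\text{Tail}) = 1 - 1/2 = 1/2$. So,

    $P(\text{Head}) + P(\text{Tail}) = 1$.

    Example #2:

    In a throw of a die, we know that the probability of the die showing 1, 2, 3, etc. will be $P(1) = 1/6$, $P(2) = 1/6$, and so on.

    Here, $P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 1$.

    The probability of not getting 1 is

    $1 - P(1) = 1 - 1/6 = 5/6$.

    Note:

    The condition that the total probability of all the events has to be 1 is called normalization of probabilities: $\sum_i P_i = 1$.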

    Rules of Probability:

    When more than one event takes place, we need to calculate the joint probability for all the events.

    Mutually Exclusive Events

    Two events are mutually exclusive (or disjoint) when they cannot occur at the same time. Suppose the two events are A and B and their individual probabilities are designated as $P(A)$ and $P(B)$. Mutually exclusive means

    $P(A \text{ and } B) = 0$.

    Addition Rule: $P(A \text{ or } B) = P(A) + P(B)$

    Example #1: The probability of getting either Head or Tail in a coin toss,

    $P(\text{Head or Tail}) = P(\text{Head}) + P(\text{Tail}) = 1/2 + 1/2 = 1$.

    Example #2: The probability of getting either 1 or 6 in a throw of a die,

    $P(1 \text{ or } 6) = P(1) + P(6) = 1/6 + 1/6 = 1/3$.

    Independent Events

    When the occurrence of one event does not influence the other but they can occur at the same time, they are called independent. For example, rainfall today and Manchester United winning a match.

    Multiplication Rule: $P(A \text{ and } B) = P(A) \times P(B)$

    Example #1:

    What is the probability that two Heads will occur when we toss two coins together?

    $P(\text{Head}) = 1/2$ for the first coin and $P(\text{Head}) = 1/2$ for the second coin.

    $P(\text{Head and Head}) = 1/2 \times 1/2 = 1/4$.

    Note that if we flip a single coin two times and ask for the probability of getting Heads twice, we get the same answer.

    Example #2:

    Now we ask the question: what is the probability of getting one Head and one Tail when flipping two coins together?

    Consider the probability of obtaining Head on the first coin and Tail on the second coin:

    $P(\text{Head and Tail}) = 1/2 \times 1/2 = 1/4$.

    And the probability of obtaining Tail on the first and Head on the second:

    $P(\text{Tail and Head}) = 1/2 \times 1/2 = 1/4$.

    Now the total probability of the above two events (they are mutually exclusive, so the probabilities add):

    $1/4 + 1/4 = 1/2$.

    Note that in flipping two coins together there are 4 equally likely outcomes: HH, HT, TH, TT. Of these, the relative occurrence of one Head and one Tail is 2/4 = 1/2.
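The counting argument for two coins can be checked by listing every outcome. This is a small illustrative sketch, assuming two fair coins so that all four outcomes are equally likely; the function name `two_coin_probabilities` is made up for this example.

```python
from fractions import Fraction
from itertools import product

def two_coin_probabilities():
    """Return P(number of Heads) for two fair coins by listing all outcomes."""
    outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT
    weight = Fraction(1, len(outcomes))       # every outcome is equally likely
    probs = {}
    for outcome in outcomes:
        heads = outcome.count("H")
        probs[heads] = probs.get(heads, 0) + weight
    return probs

if __name__ == "__main__":
    for heads, p in sorted(two_coin_probabilities().items()):
        print(heads, "Head(s):", p)
```

Exact fractions are used instead of decimals so that the results can be compared directly with 1/4 and 1/2.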


    When Events are NOT Mutually Exclusive:

    If the events are not mutually exclusive, there is some overlap. Suppose we let an area A correspond to the probability of some event A and an area B to the probability of another event B. The overlap between the two areas then represents the joint probability, $P(A \text{ and } B)$. Note that for two mutually exclusive events the overlap would be zero.

    Addition Rule in this case: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

    When Events are NOT Independent:

    Multiplication rule: $P(A \text{ and } B) = P(A) \times P(B|A)$

    $P(B|A)$: The probability of B given A. This is a conditional probability, i.e., the probability of B occurring provided A occurs first.

    Similarly, $P(A|B)$: The probability of A, given B.

    Note here that

    $P(B|A) = P(B)$, when B does not depend on A, which means A and B are independent.

    $P(A|B) = P(A)$, when A does not depend on B, which means A and B are independent.

    So, we can write the formula for conditional probability:

    $P(B|A) = P(A \text{ and } B) / P(A)$


    Let us consider the following table and use the probability rules.

    In a survey of 100 people, each person was asked whether they are a graduate or not.

                Graduate    Non-graduate    Total
    Male        40          20              60
    Female      10          30              40
    Total       50          50              100

    Q.1 What is the probability that a randomly selected person is a male?

    Ans. $P(\text{male}) = 60/100 = 0.6$

    Q.2 What is the probability that a randomly selected person is a female?

    Ans. $P(\text{female}) = 40/100 = 0.4$

    Q.3 What is the probability that a randomly selected person is a male who is a graduate?

    Ans. $P(\text{male and graduate}) = 40/100 = 0.4$

    [Also we can think, $P(\text{male and graduate}) = P(\text{male}) \times P(\text{graduate}|\text{male}) = (60/100) \times (40/60) = 0.4$]

    Q.4 What is the probability that a randomly selected person is a female who is a non-graduate?

    Ans. $P(\text{female and non-graduate}) = 30/100 = 0.3$

    [Also, $P(\text{female and non-graduate}) = P(\text{female}) \times P(\text{non-graduate}|\text{female}) = (40/100) \times (30/40) = 0.3$]

    Q.5 What is the probability that the randomly selected person is either a male graduate or a female non-graduate?

    Ans. These two events are mutually exclusive, so by the law of addition,

    $0.4 + 0.3 = 0.7$.

    Q.6 If we now select two persons, what is the probability that one of them is a male graduate and another is a female non-graduate?

    Ans. Two independent events are occurring together (taking the first person to be the male graduate and the second the female non-graduate, and treating the two selections as independent). So by the law of multiplication of probabilities,

    $0.4 \times 0.3 = 0.12$.

    Q.7 What is the probability that a randomly selected non-graduate is a female? [Prob. of female among non-graduates]

    Ans. This is the number of females out of the total non-graduates, $30/50 = 0.6$.

    Q.8 What is the probability that a randomly selected graduate is a male?

    Ans. This is the number of males out of the total graduates, $40/50 = 0.8$.

    Note: In Q.7 and Q.8, each probability is a conditional probability. However, we gave the answers by looking at the table directly. Now we answer them in terms of the law of conditional probability.

    Ans. to Q.8: Suppose A = graduate, B = male, and $P(B|A)$ = probability of being male given that the person is a graduate.

    We use the formula: $P(B|A) = P(A \text{ and } B) / P(A)$

    Here, $P(A \text{ and } B)$ = probability of male graduates = 40/100 = 0.4, and $P(A)$ = probability of graduates = 50/100 = 0.5.

    So, $P(B|A) = 0.4 / 0.5 = 0.8$.

    Exercise: Q.7 can also be answered with the conditional probability formula. Do this and check the answer yourself.

    Q.9 What is the probability that the selected person is either male or graduate?

    Ans. Here the two events are not mutually exclusive, since a person can be both male and a graduate. So we use the formula:

    $P(\text{male or graduate}) = P(\text{male}) + P(\text{graduate}) - P(\text{male and graduate}) = 0.6 + 0.5 - 0.4 = 0.7$.
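The answers read off the survey table can be reproduced with a few lines of code. The sketch below is only an illustration: the counts come from the table, while the dictionary layout and the helper names `prob`, `conditional` and `either` are assumptions made for this example.

```python
from fractions import Fraction

# Counts from the survey table (100 people in total).
counts = {
    ("male", "graduate"): 40,
    ("male", "non-graduate"): 20,
    ("female", "graduate"): 10,
    ("female", "non-graduate"): 30,
}
total = sum(counts.values())

def prob(sex=None, degree=None):
    """Probability that a random person matches the given sex and/or degree."""
    matching = sum(
        n for (s, d), n in counts.items()
        if (sex is None or s == sex) and (degree is None or d == degree)
    )
    return Fraction(matching, total)

def conditional(sex, given_degree):
    """P(sex | degree) = P(sex and degree) / P(degree)."""
    return prob(sex, given_degree) / prob(degree=given_degree)

def either(sex, degree):
    """P(sex or degree) = P(sex) + P(degree) - P(sex and degree)."""
    return prob(sex=sex) + prob(degree=degree) - prob(sex, degree)

if __name__ == "__main__":
    print("P(male) =", prob(sex="male"))
    print("P(male and graduate) =", prob("male", "graduate"))
    print("P(male | graduate) =", conditional("male", "graduate"))
    print("P(female | non-graduate) =", conditional("female", "non-graduate"))
    print("P(male or graduate) =", either("male", "graduate"))
```

The conditional probabilities are obtained from the formula rather than by reading the table rows, which is the check suggested in the exercise.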

    Probability Distributions

    Let us think of the probabilities $P_1, P_2, P_3, \dots$ for a number of events marked 1, 2, 3, and so on.

    For each event we have $0 \le P_i \le 1$, and for all the events together, $\sum_i P_i = 1$ (normalization).

    So, we have a set of probabilities corresponding to a set of events. This collection of probabilities is a probability distribution for all those discrete events.

    Imagine that, instead of discrete events, we have $x$ as a variable which can take continuous values. Also, there is the probability $P(x)$ for each value of $x$. Now if we plot $P(x)$ against $x$, we get a continuous curve, which is the continuous probability distribution curve (commonly referred to as the probability distribution curve).

    Fig. 3.1

    The area under the curve (above the x-axis) can be obtained by summing up the areas of approximate rectangular bars (which we may easily find by plotting the curve on graph paper). The approximate area of one such bar of width $\Delta x$ and height $P(x)$ is $P(x)\,\Delta x$. So, the approximate total area between the two end points $x = a$ and $x = b$ is $\sum P(x)\,\Delta x$.

    To calculate the area exactly, we need the help of Integral Calculus, which essentially sums up the areas of rectangles (bars) of infinitesimally small width. Those not familiar with Calculus need not worry: the following explanation and symbols can be understood qualitatively, which serves the purpose for now.

    The area under the curve (between the two extreme points shown in the above figure) is the following definite integral:

    Area = $\int_a^b P(x)\,dx$ = $A$.

    $A$ is the total probability for all the values between the two limits. That is why $P(x)$ is often referred to as the probability density. So, $P(x)\,dx$ is the actual probability in between $x$ and $x + dx$, where $dx$ is an infinitesimally small range. Note that the area of the bar of height $P(x)$ and width $dx$ at some position $x$ is $P(x)\,dx$.

    As in the discrete case, the area is the sum of the probabilities of all the mutually exclusive events.

    [The sum $\sum$ (called sigma) in the discrete case becomes $\int$ (called integral) for the continuous case.]
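The passage from a sum of bars to an integral can be illustrated numerically. The sketch below is not from the original notes: it assumes a hypothetical density $P(x) = 2x$ on the interval from 0 to 1 (chosen only because its exact areas are easy to check by hand), and the function names are illustrative.

```python
def density(x):
    """Hypothetical probability density: P(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def area_under_curve(a, b, num_bars):
    """Approximate the area between a and b by summing rectangular bars."""
    width = (b - a) / num_bars
    # Each bar takes its height from the curve at the bar's left edge.
    return sum(density(a + i * width) * width for i in range(num_bars))

if __name__ == "__main__":
    # The exact area between 0 and 0.5 is 0.25; narrower bars get closer to it.
    for bars in (10, 100, 1000):
        print(bars, area_under_curve(0.0, 0.5, bars))
    # Over the whole range the area approaches 1 (normalization).
    print(area_under_curve(0.0, 1.0, 100_000))
```

With only 10 bars the sum is visibly short of the exact area; as the bars become narrower the sum approaches the value of the integral.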


    Also, $\int_{-\infty}^{+\infty} P(x)\,dx = 1$ (Normalization)

    Normalization means that the total area under the curve (extended from negative infinity to positive infinity, that is, over the entire stretch of the curve) is unity. This mirrors the discrete case, where we know that the sum of the probabilities for all the events should be 1.

    For discrete events, we calculated the relative frequencies and drew the bar diagram from them. Here, for the continuous case, the bars merge together to form a continuous curve, and that is the probability distribution. For a large number of events, the relative frequencies tend to the probabilities for the corresponding values of the variable.

    Now, given the probability distribution curve, we would like to know about the shape and size of the curve, that is, some specific quantities that are representative of the character of the event.

    From a discrete data set to a continuous probability distribution:

    For any discrete data set, we measure the central tendency and the spread of the data. We commonly calculate the mean, the mean of squares and the variance.

    Mean: $\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \dots}{N} = \sum_i \left(\frac{f_i}{N}\right) x_i$,

    where $f_i$ is the frequency of occurrence for event $x_i$ and we have total frequency $N = \sum_i f_i$.

    [Note: $f_i/N$ = relative frequency]

    Mean of Square: $\overline{x^2} = \sum_i \left(\frac{f_i}{N}\right) x_i^2$

    Variance: Var($x$) = $\sigma^2 = \sum_i \left(\frac{f_i}{N}\right) (x_i - \bar{x})^2 = \sum_i \left(\frac{f_i}{N}\right) x_i^2 - \left[\sum_i \left(\frac{f_i}{N}\right) x_i\right]^2 = \overline{x^2} - \bar{x}^2$

    Standard deviation is the square root of the variance.

    Now for a large number of events, each of the ratios $f_i/N$ in the above formulas becomes the corresponding probability $P_i$:

    $f_i/N \to P_i$ as $N$ becomes very large.

    Therefore, we write the above quantities in terms of probabilities:

    Mean, $\bar{x} = \sum_i P_i x_i$

    Mean of Square, $\overline{x^2} = \sum_i P_i x_i^2$

    Variance, $\sigma^2 = \sum_i P_i (x_i - \bar{x})^2 = \overline{x^2} - \bar{x}^2$

    Standard deviation, $\sigma = \sqrt{\overline{x^2} - \bar{x}^2}$

    If the probabilities $P_1$, $P_2$, etc. are known for the values $x_1$, $x_2$, and so on, we can say that we have a discrete probability distribution. When the values are so closely spaced that we have probabilities for all possible continuous values of the variable $x$, we can say that there is a function $P(x)$ of $x$, which is called the continuous probability distribution function.

    [Note: However, in a practical calculation, when instead of probabilities we are given the frequencies $f_1$, $f_2$, ... for the quantities $x_1$, $x_2$, ... that appear in a data set, we calculate the mean or average as $\bar{x} = \sum_i (f_i/N)\, x_i$.]

    Expectation Values:

    Once the probability distribution (discrete or continuous) for some event or some population is known, we may expect what its mean value would be, either through mathematical calculations or through our experience.

    [In Statistics, "population" means the entire set of all possible data. Taking a few data (which we call a "sample") from the population, we often try to estimate the mean, which in general differs from the population mean. But we know that, with larger and larger sample size, this mean (which we call the sample mean) should tend towards the population mean. This means we expect the population mean. More on this aspect will be discussed in the chapter on Sampling.]

    So, the expectation value $E(x)$ is the mean of $x$. Likewise, we can have the expectation value of any power of $x$.

    $E(x) = \sum_i P_i x_i$, or $E(x) = \int x\,P(x)\,dx$ [Continuous case]

    $E(x^2) = \sum_i P_i x_i^2$, or $E(x^2) = \int x^2 P(x)\,dx$ [Continuous case]

    Combination Rules:

    When we scale a variable, that is, when we multiply it by a number or add a number to it, we need to know how the scaled variable behaves. Does it have the same statistical measures? Does it follow the same kind of distribution? We also ask the same questions when two or more variables are scaled and added together to form a combined variable.

    When $y = a x$:

    Mean: $\bar{y} = a \bar{x}$

    Variance: $\sigma_y^2 = a^2 \sigma_x^2$

    When $y = a x + b$:

    Mean: $\bar{y} = a \bar{x} + b$

    Variance: $\sigma_y^2 = a^2 \sigma_x^2$

    If $x$ has a Normal distribution, $y$ also has a Normal distribution.

    When $z = a x + b y$ (with $x$ and $y$ independent):

    Mean: $\bar{z} = a \bar{x} + b \bar{y}$

    Variance: $\sigma_z^2 = a^2 \sigma_x^2 + b^2 \sigma_y^2$

    If $x$ and $y$ separately have Normal distributions, $z$ then also has a Normal distribution.


    Following the combination rules given above, we can solve the following problem.

    Example:

    The weight of individual people follows a Normal distribution, $N(\mu, \sigma^2)$. What will be the probability distribution of the weight of 10 people taken together?

    Ans. Here, the mean is $\mu$ and the variance is $\sigma^2$ for each person.

    Mean weight of 10 people, $\mu + \mu + \dots = 10\mu = 40$

    Variance, $\sigma^2 + \sigma^2 + \dots = 10\sigma^2 = 500$

    The probability distribution of the weight of 10 people taken together is $N(40, 500)$.
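The rule that means and variances add for independent variables can be checked by simulation. The sketch below is only an illustration under stated assumptions: the individual mean of 4 and variance of 50 are assumed values, chosen so that ten people give the totals 40 and 500 quoted above, and the function name `total_weights` is made up for this example.

```python
import random
import statistics

def total_weights(mu, variance, group_size, num_groups, seed=1):
    """Simulate the total weight of many groups of independent people."""
    rng = random.Random(seed)
    sigma = variance ** 0.5
    return [
        sum(rng.gauss(mu, sigma) for _ in range(group_size))
        for _ in range(num_groups)
    ]

if __name__ == "__main__":
    # Assumed individual distribution: mean 4, variance 50 (illustrative only).
    totals = total_weights(mu=4.0, variance=50.0, group_size=10, num_groups=20_000)
    print("mean of totals:", statistics.fmean(totals))          # close to 10 * 4
    print("variance of totals:", statistics.pvariance(totals))  # close to 10 * 50
```

The simulated mean and variance only approximate the predicted values, since the number of simulated groups is finite.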

    Normal Distribution:

    For many naturally occurring events, and for random measurements of a value in an experiment, the distribution that occurs is approximately a Normal distribution. The bell-shaped symmetric curve is called the Normal curve. If we calculate the distribution of height, age or IQ level among a population, the probability distribution turns out to be close to Normal. The name "normal" is given as it occurs normally. In the Mathematics or Physics literature, it is also called the Gaussian distribution, after the great mathematician Carl Friedrich Gauss.

    Properties of Normal Distribution:

    A bell-shaped symmetric distribution with the peak at the middle. The distribution curve extends from $-\infty$ to $+\infty$ [from minus infinity to plus infinity].

    Mean, Mode and Median are at the same position (at the peak).

    Area under the curve:

    Total area under the curve = 100%

    A = 68%, for $\mu - \sigma \le x \le \mu + \sigma$ [Area within one standard deviation ($\sigma$) from the mean ($\mu$) on both sides]

    A = 95%, for $\mu - 2\sigma \le x \le \mu + 2\sigma$

    A = 99.7%, for $\mu - 3\sigma \le x \le \mu + 3\sigma$

    The Normal distribution is the most commonly observed and the most widely used and discussed. There are various other kinds of distributions, which can be identified by their shapes and mathematical expressions.
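The three areas quoted above can be computed rather than looked up. The following sketch uses the standard identity that the area of a Normal curve within $k$ standard deviations of the mean equals $\mathrm{erf}(k/\sqrt{2})$; the function name `area_within` is an illustrative choice.

```python
import math

def area_within(k):
    """Area under a Normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"within {k} standard deviation(s): {100 * area_within(k):.1f}%")
```

The result does not depend on the particular values of the mean and the standard deviation, only on how many standard deviations are taken on either side.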

    NOTE:

    If we combine a set of Normal distributions, we get a Normal distribution as a result. Consider $n$ numbers ($x_1, x_2, \dots, x_n$), each of which is drawn independently from a Normal distribution. Calculate the mean of the numbers: $\bar{x} = (x_1 + x_2 + \dots + x_n)/n$. If we draw $n$ numbers again and again, the mean would be different each time, but the means would follow a Normal distribution. More interestingly, the individual distributions from which the numbers are drawn do not matter (as long as they have a finite variance): provided $n$ is sufficiently large, the distribution of the mean always turns out to be approximately Normal. This is the Central Limit Theorem.
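The Central Limit Theorem can be probed with a small simulation. The sketch below is an illustration under assumptions, not the author's procedure: the numbers are drawn from a flat (uniform) distribution between 0 and 1, whose mean is 1/2 and variance is 1/12, and we check how often the sample mean falls within one standard deviation of its expected value. For a Normal distribution this fraction should be about 68%.

```python
import random

def fraction_of_means_within_one_sd(sample_size=30, num_samples=20_000, seed=2):
    """Fraction of sample means lying within one standard deviation of 0.5."""
    rng = random.Random(seed)
    mu = 0.5                                       # mean of uniform(0, 1)
    sd_of_mean = (1.0 / 12.0 / sample_size) ** 0.5  # sqrt(variance / n)
    inside = 0
    for _ in range(num_samples):
        mean = sum(rng.random() for _ in range(sample_size)) / sample_size
        if abs(mean - mu) <= sd_of_mean:
            inside += 1
    return inside / num_samples

if __name__ == "__main__":
    print(fraction_of_means_within_one_sd())
```

Although the individual numbers come from a flat distribution, the fraction comes out near the Normal value of about 0.68.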


    Experiment with rolling dice:

    So, here we roll dice, calculate the probabilities of the numbers that turn up and try to establish some truth!


    Example #1 Throwing of a single die:

    The chance of any side turning up is equal, namely 1 out of 6. We take these as the a priori probabilities for each case and find the mean and variance from the following table.

    x           1       2       3       4       5       6       Total
    P(x)        1/6     1/6     1/6     1/6     1/6     1/6     1
    x P(x)      1/6     2/6     3/6     4/6     5/6     6/6     21/6
    x^2 P(x)    1/6     4/6     9/6     16/6    25/6    36/6    91/6

    From the table, we can calculate the mean, $\bar{x} = \sum x P(x) = 21/6 = 3.5$, and the variance, $\sigma^2 = \sum x^2 P(x) - \bar{x}^2 = 91/6 - (3.5)^2 \approx 2.92$.

    If we plot $P(x)$ against $x$, we obtain the probability distribution for this case. This distribution is uninteresting, as the probabilities for all values of $x$ are the same! The curve obtained by joining the points will be a horizontal straight line.

    Fig.

    Now we do a similar experiment taking two dice together.

    Example #2 (Two Dice)

    We look for the value of $x$, which is the sum of the two numbers on the top faces of the two dice as rolled.

    Here we have $6 \times 6 = 36$ possible combinations of events, and $x$ can have a minimum value $x_{min} = 1 + 1 = 2$ and a maximum value $x_{max} = 6 + 6 = 12$.

    x           2       3       4       5       6       7       8       9       10      11      12      Total
    P(x)        1/36    2/36    3/36    4/36    5/36    6/36    5/36    4/36    3/36    2/36    1/36    1
    x P(x)      2/36    6/36    12/36   20/36   30/36   42/36   40/36   36/36   30/36   22/36   12/36   252/36
    x^2 P(x)    4/36    18/36   48/36   100/36  180/36  294/36  320/36  324/36  300/36  242/36  144/36  1974/36

    Mean, $\bar{x} = 252/36 = 7$; Variance, $\sigma^2 = 1974/36 - 7^2 \approx 5.83$

    Now if we plot $P(x)$ against $x$, taking the values from the above table, we get an interesting distribution: it shows a peak at the middle and it is symmetric around the peak! The peak is at $x = 7$ (the mean value).

    We can go on doing such experiments, taking 3 or more dice together and asking for the sum of the values and the corresponding probabilities as above. The distribution becomes smoother and smoother, tending towards a definite shape while retaining the peak at the centre.

    [In fact, the envelope of the probability values at different $x$ (joining the tops of the bars) of the discrete distribution will slowly assume a continuous symmetric curve!]

    In the limit of a large number of dice thrown together, we tend to get a continuous bell-shaped symmetric distribution.

    This is the Normal Distribution.

    For a large number of independent random observations, the probability distribution of the mean of the observations can be shown to be a Normal distribution. This is called the Central Limit Theorem.
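The dice tables can be generated for any number of dice by listing all the rolls. This is a minimal sketch assuming fair six-sided dice; the function names `sum_distribution` and `mean_and_variance` are illustrative.

```python
from fractions import Fraction
from itertools import product

def sum_distribution(num_dice):
    """Exact distribution of the sum of the top faces of num_dice fair dice."""
    rolls = list(product(range(1, 7), repeat=num_dice))
    weight = Fraction(1, len(rolls))  # every roll is equally likely
    dist = {}
    for roll in rolls:
        total = sum(roll)
        dist[total] = dist.get(total, 0) + weight
    return dist

def mean_and_variance(dist):
    """Mean and variance from the sums of x P(x) and x^2 P(x)."""
    mean = sum(x * p for x, p in dist.items())
    mean_sq = sum(x * x * p for x, p in dist.items())
    return mean, mean_sq - mean ** 2

if __name__ == "__main__":
    for dice in (1, 2, 3):
        dist = sum_distribution(dice)
        print(dice, "dice:", mean_and_variance(dist))
        for x in sorted(dist):
            print("   ", x, dist[x])
```

Printing the distribution for 3 or more dice shows the probabilities rising to a central peak and falling off symmetrically, as described above.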


    Shape of a Distribution: Symmetry, Skewness, Kurtosis

    Skewness:

    A Normal distribution is symmetric around its peak. The peak corresponds to the most probable value, that is, the value for which the probability is the maximum. An interesting thing about a symmetric distribution with a single peak is that the mean, median and mode are at the same position.

    Skewness is any deviation from symmetry or, we can say, lack of symmetry. For a symmetric distribution, skewness is zero.

    Coefficient of skewness = (Mean - Mode) / Standard deviation

    The following mathematical definition is often used to measure the skewness:

    Skew = $E\left[\left(\frac{x - \bar{x}}{\sigma}\right)^3\right]$,

    where $\sigma$ is the standard deviation of the distribution. So, we see that the skewness is a dimensionless quantity.

    Skewness can be positive or negative. A distribution with a positive value of skewness is called positively skewed, which means the tail of the distribution is more extended towards the more positive values of $x$. On the other hand, a distribution with a negative value of skewness is called negatively skewed, which means the tail is more extended towards more negative values (or lower values) of $x$.

    Below are the two figures demonstrating negative and positive skewness: the distributions are correspondingly called negatively skewed and positively skewed distributions.

    (Negative Skewness: Mean < Mode) (Positive Skewness: Mean > Mode)

    Kurtosis:

    Kurtosis is another measure of the shape of a distribution. It tells us about the peakedness (what the peak looks like) or flatness of the probability distribution.

    A Normal distribution is taken as the standard (or benchmark) in this regard. So, any change in the shape of the peak of a distribution (peakedness or flatness) is measured relative to a Normal distribution.

    The mathematical expression for kurtosis:

    Kurt = $E\left[\left(\frac{x - \bar{x}}{\sigma}\right)^4\right] - 3$

    Note that the number 3 is subtracted in the expression so as to make the value of kurtosis for a Normal distribution equal to zero. It can be shown that $E\left[\left(\frac{x - \bar{x}}{\sigma}\right)^4\right] - 3 = 0$ for a Normal distribution.

    When kurtosis is positive, the peak of the distribution appears sharper relative to a Normal distribution. The distribution is then called leptokurtic. On the other hand, when the kurtosis is negative, the distribution looks flatter compared to a Normal distribution. As the distribution looks almost flat on top, it is called platykurtic. A distribution with zero kurtosis, such as the Normal distribution, is called mesokurtic.
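The skewness and kurtosis expressions can be applied directly to a list of numbers by replacing each expectation with an average over the data. The sketch below is only an illustration: the two small data sets are invented, and the function name `skewness_and_kurtosis` is an assumption for this example.

```python
def skewness_and_kurtosis(values):
    """Return (Skew, Kurt) with expectations replaced by averages over the data."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    sigma = variance ** 0.5
    skew = sum(((v - mean) / sigma) ** 3 for v in values) / n
    kurt = sum(((v - mean) / sigma) ** 4 for v in values) / n - 3.0
    return skew, kurt

if __name__ == "__main__":
    symmetric = [1, 2, 3, 4, 5]       # invented data, symmetric about 3
    right_tailed = [1, 1, 1, 2, 10]   # invented data with a long right tail
    print("symmetric:", skewness_and_kurtosis(symmetric))
    print("right-tailed:", skewness_and_kurtosis(right_tailed))
```

The symmetric set gives zero skewness and a negative kurtosis (flat, platykurtic), while the set with a long right tail gives a positive skewness.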

    Fig. [Platykurtic distribution (negative kurtosis)]

    If a distribution has more than one peak

    The distributions we have discussed (and shall consider throughout) are unimodal, that is, distributions with a single mode or one peak. But in many practical cases, we can have a distribution with many peaks or many modes. For example, a distribution with two peaks (in the fig. below) is called a bimodal distribution.


    Z-Distribution

    What is a Z-distribution?

    A Z-distribution is nothing but a Normal distribution with the peak (mean) at zero and a standard deviation of one.

    The peak of a Normal distribution is generally at a finite value $\mu$ with a standard deviation $\sigma$ (say). If we consider a new variable

    $z = (x - \mu)/\sigma$,

    the given Normal distribution (of the variable $x$) becomes another Normal distribution (of the variable $z$) with the peak at $z = 0$, and this is then called the Z-distribution.

    [The derivation of the Z-distribution is given in the appendix for those who are interested.]

    For solving problems with the Normal distribution, it is often advantageous to convert to the Z-distribution and then consult a Z-table.

    In the following, we demonstrate with some examples how that is done.

    Consider the following typical situations where we have to calculate areas from the Z-distribution:

    Fig. (Total area under the curve = 1)

    Fig. (Area between $z = -\infty$ and $z = 0$ is 0.5, or area between $z = 0$ and $z = +\infty$ is 0.5, because of symmetry)

    Fig. (Area between $z = 0$ and any other value of $z$)

    Fig. (Area between two positive values of $z$ or between two negative values)

    Fig. (Area between a negative value and a positive value)

    Fig. (Area less than a negative value or greater than a positive value)

    Important:

    In the z-score table we always look for the area between zero and any other value of $z$ (as the integral is actually done that way). So, zero is always the reference point.

    Finally, the area between any two values of $z$ is obtained by adding or subtracting the areas measured from zero. This will be clear from the following examples.

    Examples:

    (Some typical problems are discussed; consult the z-score table given in the appendix.)

    #1. In the Geography examination, the marks distribution is known to be Normal, where the mean is 52 and the standard deviation is 15. Determine the z-scores of students receiving marks: (i) 40, (ii) 95, (iii) 52.

    Solution: Here, $\mu = 52$, $\sigma = 15$.

    (i) $z = (40 - 52)/15 = -0.8$

    (ii) $z = (95 - 52)/15 \approx 2.87$

    (iii) $z = (52 - 52)/15 = 0$

    So, we see the z-scores can be negative, positive or zero.
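The z-score calculation above is a single formula, so it is easy to script. A minimal sketch, with the function name `z_score` chosen for illustration:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

if __name__ == "__main__":
    # Marks from example #1: mean 52, standard deviation 15.
    for marks in (40, 95, 52):
        print(marks, round(z_score(marks, mean=52, std_dev=15), 2))
```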


    #2. Find the area under the Normal curve in each of the following cases:

    (i) Between $z = 0$ and $z = 1.2$

    Area = 0.3849 from the table.

    (ii) Between $z = -0.68$ and $z = 0$

    Area = 0.2518

    (Note: The area is equal to the area between $z = 0$ and $z = 0.68$, as the curve is symmetric.)

    (iii) Area between $z = -0.46$ and $z = 2.21$

    Area = (area between $z = 0$ and 2.21) + (area between $z = 0$ and -0.46)

    = 0.4864 + 0.1772 = 0.6636

    (Note: The areas are added as they are on both sides of $z = 0$.)

    (iv) Area between $z = 0.81$ and $z = 1.94$

    Required area = (area between $z = 0$ and 1.94) - (area between $z = 0$ and 0.81)

    = 0.4738 - 0.2910 = 0.1828

    (Note: There is a subtraction as the two areas are on the same side of $z = 0$.)

    (v) To the left of $z = -0.6$

    Required area = 0.5 - (area between $z = 0$ and $z = -0.6$)

    = 0.5 - 0.2257 = 0.2743

    (vi) To the right of $z = -1.28$

    Required area = (area between $z = -1.28$ and $z = 0$) + 0.5

    = (area between $z = 0$ and $z = 1.28$) + 0.5

    = 0.3997 + 0.5 = 0.8997

    #3. Among 1000 students, the mean score in the final examination is 25 and the standard deviation is 4.0. Assume the distribution is Normal. Find the following.

    (a) How many students score between 22 and 27?

    $\mu = 25$, $\sigma = 4.0$

    $z_1 = (22 - 25)/4 = -0.75$, $z_2 = (27 - 25)/4 = 0.5$

    So the probability is the area under the curve between -0.75 and 0.5

    = (area between 0 and -0.75) + (area between 0 and 0.5)

    = 0.2734 + 0.1915 = 0.4649

    The number of students in this marks range = $1000 \times 0.4649 \approx 465$

    (b) How many students score above 30?

    $z = (30 - 25)/4 = 1.25$

    Probability = area to the right of $z = 1.25$

    = 0.5 - (area between 0 and 1.25)

    = 0.5 - 0.3944 = 0.1056

    The number of students = $1000 \times 0.1056 \approx 106$


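    (c) How many students score below 15?

    $z = (15 - 25)/4 = -2.5$

    Area = 0.5 - (area between 0 and -2.5) = 0.5 - 0.4938 = 0.0062

    The number of students = $1000 \times 0.0062 \approx 6$

    (d) How many score 24?

    As the score is a discrete number while the Normal curve is continuous, we have to calculate the area between 23.5 and 24.5.

    $z_1 = (23.5 - 25)/4 = -0.375 \approx -0.38$, $z_2 = (24.5 - 25)/4 = -0.125 \approx -0.13$

    Area between $z_1$ and $z_2$

    = (area between 0 and -0.38) - (area between 0 and -0.13)

    = 0.1480 - 0.0517 = 0.0963

    The number of students = $1000 \times 0.0963 \approx 96$.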
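The four answers of Problem #3 can be reproduced without a table. This is a sketch under the same assumptions as the problem (1000 students, Normal marks with mean 25 and standard deviation 4.0); `normal_cdf` is an illustrative helper built on `math.erf`.

```python
import math

def normal_cdf(x, mean, std):
    """Probability that a Normal(mean, std) variable is below x."""
    return 0.5 * (1 + math.erf((x - mean) / (std * math.sqrt(2))))

N, MEAN, STD = 1000, 25, 4.0

between_22_27 = N * (normal_cdf(27, MEAN, STD) - normal_cdf(22, MEAN, STD))
above_30 = N * (1 - normal_cdf(30, MEAN, STD))
below_15 = N * normal_cdf(15, MEAN, STD)
# A score of exactly 24 is treated as the interval 23.5 to 24.5.
score_24 = N * (normal_cdf(24.5, MEAN, STD) - normal_cdf(23.5, MEAN, STD))

for name, value in [("22 to 27", between_22_27), ("above 30", above_30),
                    ("below 15", below_15), ("score 24", score_24)]:
    print(name, round(value))
```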

    Binomial Distribution

    Before we discuss the Binomial distribution, we should know certain basic mathematical operations. Those who are not familiar with the mathematical notations and rules may consult the introduction given in the Box below.

    Binomial Probability:

    Suppose the probability of a certain event occurring is $p$ and the probability of it not occurring is $q = 1 - p$. In a total of $N$ trials, the particular event occurs $n$ times, each with probability $p$, and does not occur $(N - n)$ times, each with probability $q$. Also, we have to know which $n$ events will occur out of the total $N$ events. The number of ways we can do that is the number of combinations, $\binom{N}{n}$. Consider a variable $x$ which is equal to the relative frequency, $x = n/N$.

    As the events are considered independent, the joint probability will be

    $P(n) = \binom{N}{n} p^n q^{N-n}$


    The above probability is called binomial probability.

    Now consider the following table based on the binomial probability:

    $n$      0       1                       2                         ...   $N$
    $P(n)$   $q^N$   $\binom{N}{1} p q^{N-1}$   $\binom{N}{2} p^2 q^{N-2}$   ...   $p^N$

    Box:

    Factorial: $n! = n (n-1)(n-2) \cdots 3 \cdot 2 \cdot 1$

    For example, $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$.

    Note that the factorial of a negative integer has no meaning, and $0! = 1$.

    Note that we can write $n! = n \, (n-1)!$

    Permutation: In how many different ways can $n$ objects be arranged among themselves? The answer is the permutation of $n$ objects, $n!$

    For example, for three objects A, B, C, the different arrangements are ABC, ACB, BCA, BAC, CAB, CBA: total 6 ways = $3!$

    Combination: $\binom{N}{n}$ or ${}^{N}C_{n} = \frac{N!}{n!\,(N-n)!}$

    This is the number of ways $n$ objects can be selected from $N$ objects.

    For example, if we want to know in how many ways 2 students can be selected from a total of 3 students, the answer is $\binom{3}{2} = \frac{3!}{2!\,(3-2)!} = \frac{3!}{2!\,1!} = 3$.

    Also note, for quick calculations, $\binom{N}{0} = \frac{N!}{0!\,N!} = 1$, $\binom{N}{1} = \frac{N!}{1!\,(N-1)!} = N$ and $\binom{N}{N} = \frac{N!}{N!\,0!} = 1$.


    If we add all the terms of the second row above, we get the following binomial expansion:

    $(q + p)^N = q^N + \binom{N}{1} p q^{N-1} + \binom{N}{2} p^2 q^{N-2} + \cdots + p^N$    (1)

    From the expression (1) above, we can easily check the following known algebraic formulas:

    $(q + p)^2 = q^2 + 2qp + p^2$

    $(q + p)^3 = q^3 + 3q^2 p + 3q p^2 + p^3$, and so on.

    The coefficients of the terms on the right-hand side can be arranged in the following triangular form, which is called Pascal's triangle:

                    1   1
                  1   2   1
                1   3   3   1
              1   4   6   4   1
            1   5  10  10   5   1
          1   6  15  20  15   6   1
        1   7  21  35  35  21   7   1
      1   8  28  56  70  56  28   8   1

    The Rule:

    A number in a row (except the leftmost and rightmost ones) is the sum of the two numbers lying just to its left and right in the preceding row.

    So, from the 8th row in Pascal's triangle we can easily write the binomial expansion:

    $(q + p)^8 = q^8 + 8q^7 p + 28q^6 p^2 + 56q^5 p^3 + 70q^4 p^4 + 56q^3 p^5 + 28q^2 p^6 + 8q p^7 + p^8$


    Remember that each term of the expansion represents a binomial probability. A binomial distribution is a collection of these discrete binomial probabilities. Note: since $p + q = 1$, all the probabilities add up to $(q + p)^N = 1$.
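The rule of Pascal's triangle is easy to turn into a short program. The sketch below builds each row from the preceding one and compares the 8th row with the combination formula; the function name `pascal_row` is chosen here only for illustration.

```python
import math

def pascal_row(prev):
    """Next row: each inner entry is the sum of the two neighbouring entries above it."""
    return [1] + [a + b for a, b in zip(prev, prev[1:])] + [1]

row = [1, 1]          # 1st row: coefficients of (q + p)^1
rows = [row]
for _ in range(7):    # build rows 2 to 8
    row = pascal_row(row)
    rows.append(row)

for r in rows:
    print(r)

# The 8th row should agree with the combinations C(8, n), n = 0..8.
print(rows[-1] == [math.comb(8, n) for n in range(9)])
```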

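    Example #1:

    Five independent shots are fired at a target. The probability of a hit from each shot is 0.4.

    Q. What is the probability that two shots will hit the target?

    Ans. Here $N = 5$, $n = 2$, $p = 0.4$, $q = 0.6$

    $P(2) = \binom{5}{2} (0.4)^2 (0.6)^3 = \frac{5!}{2!\,3!} \times 0.16 \times 0.216 = 0.3456$

    Q. What is the probability that there will be more than two hits?

    Ans. Prob. = $P(3) + P(4) + P(5)$

    = $\binom{5}{3} (0.4)^3 (0.6)^2 + \binom{5}{4} (0.4)^4 (0.6) + \binom{5}{5} (0.4)^5$

    = $\frac{5!}{3!\,2!} (0.064)(0.36) + \frac{5!}{4!\,1!} (0.0256)(0.6) + \frac{5!}{5!\,0!} (0.01024)$

    = 0.2304 + 0.0768 + 0.01024 = 0.31744

    Q. What is the expectation value of the hits (that is, the mean number of hits out of all five shots)?

    Ans. For this we have to calculate the probabilities $P(0)$, $P(1)$, $P(2)$, ... for the corresponding number of hits 0, 1, 2, ...

    The expectation value, $\langle n \rangle = \sum_n n \, P(n)$

    = $0 + 1 \times \binom{5}{1} (0.4)(0.6)^4 + 2 \times \binom{5}{2} (0.4)^2 (0.6)^3 + 3 \times \binom{5}{3} (0.4)^3 (0.6)^2 + 4 \times \binom{5}{4} (0.4)^4 (0.6) + 5 \times \binom{5}{5} (0.4)^5$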


    = 0.2592 + 0.6912 + 0.6912 + 0.3072 + 0.0512 = 2.0
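The three answers of Example #1 can be verified with the binomial formula. The following is a minimal sketch; `binomial_pmf` is an illustrative name, and `math.comb` supplies the number of combinations.

```python
import math

def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N independent trials, each with probability p."""
    return math.comb(N, n) * p**n * (1 - p)**(N - n)

N, p = 5, 0.4
probs = [binomial_pmf(n, N, p) for n in range(N + 1)]

p_two = probs[2]                          # exactly two hits
p_more_than_two = sum(probs[3:])          # three, four or five hits
expected_hits = sum(n * P for n, P in enumerate(probs))

print(round(p_two, 4), round(p_more_than_two, 5), round(expected_hits, 4))
```

The expectation value comes out equal to $Np = 5 \times 0.4$, which is a general property of the binomial distribution.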

    Example #2:

    Now, imagine a situation where we toss 8 coins together or we toss one coin 8 times consecutively. We measure the relative occurrence of Head in 8 trials. Let us attach values, Head = 1 and Tail = 0. So, we can think of a variable $x$ which can take values 0, 1/8, 2/8, 3/8, 4/8, ... and so on up to 1. Thus we can associate probabilities $P(x)$ with the values of $x$ directly from Pascal's triangle (or by using the formula). Note that the probability of occurring Head is $p = 1/2$ and of not occurring Head is $q = 1/2$.

    $P(0) = \binom{8}{0} (1/2)^8 = 1/256$, $P(1/8) = \binom{8}{1} (1/2)^8 = 8/256$

    $P(2/8) = \binom{8}{2} (1/2)^8 = 28/256$, $P(3/8) = \binom{8}{3} (1/2)^8 = 56/256$

    $P(4/8) = \binom{8}{4} (1/2)^8 = 70/256$, $P(5/8) = \binom{8}{5} (1/2)^8 = 56/256$

    $P(6/8) = \binom{8}{6} (1/2)^8 = 28/256$, $P(7/8) = \binom{8}{7} (1/2)^8 = 8/256$

    $P(1) = \binom{8}{8} (1/2)^8 = 1/256$

    If we now plot $P(x)$ against $x$, we get a symmetric discrete distribution with the peak value at $x = 1/2$.

    [Fig.: plot of $P(x)$ against $x$]

    For a large number of trials, this distribution approaches the Normal distribution. Therefore, we can say the following:

    The Binomial probability distribution for a random variable becomes the Normal distribution for a large number of trials.


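    The Z-Table

    (Each entry is the area under the standard Normal curve between 0 and z. The row gives z to one decimal place and the column gives the second decimal place.)

    z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

    0.0 0.00000 0.00399 0.00798 0.01197 0.01595 0.01994 0.02392 0.02790 0.03188 0.03586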

    0.1 0.03983 0.04380 0.04776 0.05172 0.05567 0.05962 0.06356 0.06749 0.07142 0.07535

    0.2 0.07926 0.08317 0.08706 0.09095 0.09483 0.09871 0.10257 0.10642 0.11026 0.11409

    0.3 0.11791 0.12172 0.12552 0.12930 0.13307 0.13683 0.14058 0.14431 0.14803 0.15173

    0.4 0.15542 0.15910 0.16276 0.16640 0.17003 0.17364 0.17724 0.18082 0.18439 0.18793

    0.5 0.19146 0.19497 0.19847 0.20194 0.20540 0.20884 0.21226 0.21566 0.21904 0.22240

    0.6 0.22575 0.22907 0.23237 0.23565 0.23891 0.24215 0.24537 0.24857 0.25175 0.25490

    0.7 0.25804 0.26115 0.26424 0.26730 0.27035 0.27337 0.27637 0.27935 0.28230 0.28524

    0.8 0.28814 0.29103 0.29389 0.29673 0.29955 0.30234 0.30511 0.30785 0.31057 0.31327

    0.9 0.31594 0.31859 0.32121 0.32381 0.32639 0.32894 0.33147 0.33398 0.33646 0.33891

    1.0 0.34134 0.34375 0.34614 0.34849 0.35083 0.35314 0.35543 0.35769 0.35993 0.36214

    1.1 0.36433 0.36650 0.36864 0.37076 0.37286 0.37493 0.37698 0.37900 0.38100 0.38298

    1.2 0.38493 0.38686 0.38877 0.39065 0.39251 0.39435 0.39617 0.39796 0.39973 0.40147

    1.3 0.40320 0.40490 0.40658 0.40824 0.40988 0.41149 0.41308 0.41466 0.41621 0.41774

    1.4 0.41924 0.42073 0.42220 0.42364 0.42507 0.42647 0.42785 0.42922 0.43056 0.43189

    1.5 0.43319 0.43448 0.43574 0.43699 0.43822 0.43943 0.44062 0.44179 0.44295 0.44408

    1.6 0.44520 0.44630 0.44738 0.44845 0.44950 0.45053 0.45154 0.45254 0.45352 0.45449

    1.7 0.45543 0.45637 0.45728 0.45818 0.45907 0.45994 0.46080 0.46164 0.46246 0.46327

    1.8 0.46407 0.46485 0.46562 0.46638 0.46712 0.46784 0.46856 0.46926 0.46995 0.47062

    1.9 0.47128 0.47193 0.47257 0.47320 0.47381 0.47441 0.47500 0.47558 0.47615 0.47670

    2.0 0.47725 0.47778 0.47831 0.47882 0.47932 0.47982 0.48030 0.48077 0.48124 0.48169

    2.1 0.48214 0.48257 0.48300 0.48341 0.48382 0.48422 0.48461 0.48500 0.48537 0.48574

    2.2 0.48610 0.48645 0.48679 0.48713 0.48745 0.48778 0.48809 0.48840 0.48870 0.48899

    2.3 0.48928 0.48956 0.48983 0.49010 0.49036 0.49061 0.49086 0.49111 0.49134 0.49158

    2.4 0.49180 0.49202 0.49224 0.49245 0.49266 0.49286 0.49305 0.49324 0.49343 0.49361

    2.5 0.49379 0.49396 0.49413 0.49430 0.49446 0.49461 0.49477 0.49492 0.49506 0.49520

    2.6 0.49534 0.49547 0.49560 0.49573 0.49585 0.49598 0.49609 0.49621 0.49632 0.49643

    2.7 0.49653 0.49664 0.49674 0.49683 0.49693 0.49702 0.49711 0.49720 0.49728 0.49736

    2.8 0.49744 0.49752 0.49760 0.49767 0.49774 0.49781 0.49788 0.49795 0.49801 0.49807

    2.9 0.49813 0.49819 0.49825 0.49831 0.49836 0.49841 0.49846 0.49851 0.49856 0.49861

    3.0 0.49865 0.49869 0.49874 0.49878 0.49882 0.49886 0.49889 0.49893 0.49896 0.49900


    Sampling

    Basic Concept:

    What is sampling?

    Sampling means taking a subset of the population for a particular study. The aim is to select the data sample so that it represents the total data set.

    In statistics, population means the total collection of data. When the population, that is the entire collection of data, is studied, it is called a census.

    In short, the population is the total set and the sample is a subset of it.

    Why is sampling done?

    When the number of elements in a population is large, it is often not possible to investigate the population completely due to lack of time, money and resources. This is why sampling is necessary.

    Sampling is done in such a way that the subset of data represents the entire set.

    Example:

    If a TV channel wants to know the popularity of a program, it would be expensive to ask everybody's opinion. Instead, a subset of viewers is interviewed and the data are collected.

    Methods of Sampling:

    A sample of size $n$ means there are $n$ data points in the collection. A sample of size $n$ is collected from a population of size $N$ in such a way that all the features of the population are well represented by it.

    If a sampling method over-represents or under-represents a feature of the population, it is said to be biased. The aim of any selection method is to reduce the chance of bias as far as possible.

    There are several methods of sampling; the most common among them is random sampling.

    Random sampling:

    For a sample of size $n$, we collect $n$ data from the population. We collect many such samples for our evaluation. If this is done randomly so that each group of size $n$ taken


    from the population has an equal chance of getting selected, we call this random sampling. Sometimes, it is called simple random sampling.

    For random sampling, the successive drawings have to be independent.

    Let us suppose we want to select a sample of size 100 from a population of size 10000. In the case of random sampling, we decide which elements are to be picked with the help of random numbers (generated in a computer), or by consulting a random number table, or by some kind of dice throwing.
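On a computer, the selection described above takes only a few lines. The sketch below uses Python's standard `random` module; the population is simply the labels 1 to 10000, and the seed value is an arbitrary choice made here so that the run can be repeated.

```python
import random

population = list(range(1, 10001))       # elements labelled 1..10000 (illustrative)
rng = random.Random(42)                  # fixed seed only to make the run repeatable

# Draw 100 distinct elements; every group of 100 is equally likely to be chosen.
sample = rng.sample(population, 100)
print(sorted(sample)[:10])
```

Note that `sample` draws without replacement, so no element is picked twice.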

    Systematic Sampling:

    If simple random sampling from the population is not possible, systematic sampling may be done. First, the population is enumerated from 1 onwards. If a sample of size $n$ from a population of size $N$ is to be obtained, every $k$-th item is selected, where $k = N/n$. First a random number between 1 and $k$ is selected and the corresponding item is taken as the 1st element. After this every $k$-th element is taken.

    Example:

    Follow the table given below.

    Sl no.   value
    1        20
    2        27
    3        33
    4        21
    5        15
    6        22
    7        45
    8        13
    9        32
    10       29
    11       10
    12       16

    For a sample of size $n = 4$ from these $N = 12$ data, $k = 12/4 = 3$. Select a random number between 1 and 3: choose 2, for example. Start with #2 and then take data number 5, 8 and 11.
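The systematic selection in the example above can be sketched as follows. The function name `systematic_sample` and its arguments are illustrative; the sketch assumes that the population size is an exact multiple of the sample size.

```python
import random

def systematic_sample(values, n, start=None, rng=random):
    """Take every k-th item (k = N/n), beginning at a random serial number between 1 and k."""
    k = len(values) // n                     # sampling interval
    if start is None:
        start = rng.randint(1, k)            # random starting serial number
    serials = range(start, len(values) + 1, k)
    return [values[s - 1] for s in serials][:n]

values = [20, 27, 33, 21, 15, 22, 45, 13, 32, 29, 10, 16]
print(systematic_sample(values, 4, start=2))   # serial numbers 2, 5, 8, 11
```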


    Stratified Sampling:

    In this method, the population is first divided into groups (strata). Each element of the population belongs to one such group.

    Divide the population into $k$ non-overlapping groups containing $N_1, N_2, \ldots, N_k$ data such that $N_1 + N_2 + \cdots + N_k = N$. Next do simple random sampling to collect one or a few elements from each group.

    Suppose a population is classified into several groups according to age or something like that. Then from each group random samples are collected.

    Note: This is also called restricted random sampling.
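Stratified sampling amounts to doing a simple random sampling inside every group. In the sketch below the three age groups, their sizes and the number picked per group are invented purely for illustration.

```python
import random

def stratified_sample(strata, per_group, rng):
    """Pick `per_group` elements at random from every stratum (group)."""
    return {name: rng.sample(members, per_group) for name, members in strata.items()}

# Hypothetical population of 100 persons (labelled 0..99) split into three age groups.
strata = {
    "under 30": list(range(0, 40)),
    "30 to 59": list(range(40, 90)),
    "60 and over": list(range(90, 100)),
}
picked = stratified_sample(strata, 2, random.Random(1))
for group, members in picked.items():
    print(group, members)
```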

    Cluster Sampling:

    In this method, like before, the population is divided into groups called clusters. Then

    clusters are taken randomly and the elements are collected from them as sample.

    Probability sampling

    Any method of sampling that uses (probabilistically) random selection is in general

    called probability sampling.

    Sampling variation:

    When sampling from a population is done, we may take not one sample but several samples of the same size. The samples, and the values calculated from them, differ from one another; we call this sampling variation.

    In practice, we often draw only one sample or one set of data from a population. But we may not be sure what would happen in case we draw several other samples. Will we get the same result? The answer is no. If we look at the mean value, we see that the mean is not the same for all the samples that we are able to draw. We then get a distribution of the sample means.

    $N$ = population size, $n$ = sample size, $n/N$ = the sampling fraction.

    Many samples of the same size yield a sampling distribution.

    The sampling distributions are usually assumed to follow a well-known probability distribution.

    We look for various properties from the distribution curves.

    We then see how the variation of sample size affects these properties.


    From the experience and theory, we can say that the variability of sampling

    distributions decreases with sample size.
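The statement that the variability decreases with sample size can be seen in a small simulation. Everything in the sketch below is illustrative: the population is artificial (10000 Normal values with mean 50 and standard deviation 10), and the seed and the number of repeats are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(10000)]   # artificial population

def spread_of_sample_means(n, repeats=500):
    """Standard deviation of the means of `repeats` random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

small = spread_of_sample_means(10)
large = spread_of_sample_means(100)
print(round(small, 2), round(large, 2))
```

The spread for samples of size 100 comes out clearly smaller than that for samples of size 10.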

    SAMPLING DISTRIBUTIONS

    What do you do after the sample is collected?

    The first thing one can do with a set of data is to measure its central tendency and spread. Usually, we calculate the mean and variance.

    The calculation of mean (or variance) is done over many samples of the same size. Let us suppose we have collected $k$ samples of the same size $n$. The mean values $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ of the various samples are calculated. The grand mean of all these mean values is taken as the sample mean, $\bar{x}$.

    The mean of the sample means is the estimate of the population mean. Similarly, the variance of the mean values calculated from the set of samples (of equal size $n$) gives an estimate of the population variance divided by $n$, as the variance of the sample mean is $\sigma^2/n$.

    It can be shown:

    Sample mean $\bar{x}$ is the unbiased estimate of population mean, $\mu$.

    For the population variance, the unbiased estimate is $\hat{\sigma}^2 = \frac{n}{n-1} s^2 = \frac{1}{n-1} \sum_i (x_i - \bar{x})^2$, where $s^2 = \frac{1}{n} \sum_i (x_i - \bar{x})^2$ is the variance of the sample.

    Hypothesis Testing

    What is Hypothesis?

    On the basis of sample information, we make certain decisions about the population. In taking such decisions we make certain assumptions. These assumptions are known as statistical hypotheses.

    [Note: A collected set of data points which is a part of the population (a small number of data) is called a sample. The process of selection is called sampling. The complete set of all the data is called the population.]


    How to test Hypothesis?

    Assuming the hypothesis to be correct, we calculate the probability of getting the observed sample. If this probability is less than a certain assigned value, the hypothesis is rejected.

    The hypothesis that there is no significant difference between the observed value and the expected value is called the Null Hypothesis.

    Test of significance:

    The tests which enable us to decide whether to accept or to reject the null hypothesis are called the tests of significance. If the differences between the sample values and the population values are significantly large, the null hypothesis is to be rejected.

    It is known that the mean $\bar{x}$ of a sample is an unbiased estimate of the population mean $\mu$. It is called a point estimate. But we know that if we collect different samples, the mean ($\bar{x}$) varies from sample to sample. The means of the samples form a distribution which we call the sampling distribution. Note that the sampling distribution is Normal if the variable in the population is normally distributed.

    Now the question is, how close is a calculated mean to the population mean? We have to estimate that with some level of accuracy.

    Confidence Interval:

    A confidence interval is a range of values within which we can trap the population mean with some probability. So, we consider the probability distribution of sample means in order to find that probability of trapping.

    Suppose we have a sample mean $\bar{x}$ and we consider a symmetric interval around this: $[\bar{x} - c, \ \bar{x} + c]$, where $c$ is a value that we shall determine.

    If $|\bar{x} - \mu| \le c$, the confidence interval traps the population mean $\mu$.

    How to calculate confidence interval?


    Suppose the variable $X$ follows a Normal distribution, with mean $\mu$ and standard deviation $\sigma$. Symbolically, $X \sim N(\mu, \sigma^2)$.

    So, for a sample of size $n$, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

    This means that the distribution of the mean ($\bar{x}$) of samples of size $n$ follows a Normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

    If the confidence interval is 95%, the interval has a probability 0.95 to trap the population mean: $P(|\bar{x} - \mu| \le c) = 0.95$.

    Now as an example, consider a sampling distribution with $\mu = 0$, $\sigma/\sqrt{n} = 1$. Here $\bar{x}$ follows the z-distribution, $Z \sim N(0, 1)$. [Normal distribution with mean = 0, standard deviation = 1]

    Now let us look up the z-table. The total area under the curve is 100%, which gives us the total probability = 1. The shaded area (as in the fig.) is 95% of the total area, which corresponds to probability = 0.95.

    Half of the shaded area = 0.95/2 = 0.475, as it is symmetric around zero.

    In the z-distribution [$Z \sim N(0, 1)$], we now find the value of $z$ from the z-table, where the area from $0$ to $z$ is 0.475.

    We find $z = 1.96$, which we consider as the critical value: $c = 1.96$ in units of $\sigma/\sqrt{n}$.


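    Thus for the 95% confidence interval: $c = 1.96 \, \sigma/\sqrt{n}$.

    If the sample mean is $\bar{x}$, the confidence interval is $\left[ \bar{x} - 1.96 \frac{\sigma}{\sqrt{n}}, \ \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}} \right]$.

    So we can say with 95% confidence level that the population mean lies in this interval.

    Let us now calculate the width of the confidence interval for 95% confidence:

    Width $= 2 \times 1.96 \frac{\sigma}{\sqrt{n}} = 3.92 \frac{\sigma}{\sqrt{n}}$

    So, we can see that the interval decreases with the increase of sample size. That is, we can narrow down the search for the population mean as we take a larger sample size. Then we can say with more accuracy that our measured mean is closer to the population mean.

    For example, if the sample size is increased four times, from $n$ to $4n$, the width is halved, as $\sqrt{4n} = 2\sqrt{n}$.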
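The dependence of the interval on the sample size can be tried out numerically. In the sketch below the sample mean (50) and the population standard deviation (10) are made-up numbers used only for illustration; the critical value 1.96 is the one found above for 95% confidence.

```python
import math

def confidence_interval(mean, sigma, n, z=1.96):
    """Interval mean -/+ z*sigma/sqrt(n) for a known population standard deviation."""
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical numbers: sample mean 50, population standard deviation 10.
for n in (25, 100, 400):
    low, high = confidence_interval(50, 10, n)
    print(n, round(low, 2), round(high, 2), "width =", round(high - low, 2))
```

Each time the sample size is made four times larger, the width of the interval is halved.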

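    For 98% Confidence interval:

    Shaded area = 0.98. Half the shaded area = 0.98/2 = 0.49, which is between $z = 0$ and $z = 2.33$ (more precisely 2.326).

    Thus the 98% confidence interval is $\left[ \bar{x} - 2.33 \frac{\sigma}{\sqrt{n}}, \ \bar{x} + 2.33 \frac{\sigma}{\sqrt{n}} \right]$.

    NOTE #1:

    Symbolically, it is often said that the confidence level is $(1 - \alpha)$, where $0 < \alpha < 1$. This also means $\alpha$ = significance level.

    NOTE: For a sample of size = $n$, with population variance $\sigma^2$, a 95% confidence interval means $\left[ \bar{x} - 1.96 \frac{\sigma}{\sqrt{n}}, \ \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}} \right]$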


    For example, for $\alpha = 0.05$: Confidence level = 95%, Significance level = 5%. Confidence level and significance level are complementary.

    NOTE #2:

    When we are not sure if the population is Normal and we do not know the population variance $\sigma^2$, we can still use the method of calculating the confidence interval by considering the variance of a large sample (usually taken to mean $n$ of about 30 or more):

    $\hat{\sigma}^2 = \left( \frac{n}{n-1} \right) s^2$

    Then we consider the interval $\left[ \bar{x} - 1.96 \frac{\hat{\sigma}}{\sqrt{n}}, \ \bar{x} + 1.96 \frac{\hat{\sigma}}{\sqrt{n}} \right]$.

    Critical values of z for some common confidence levels:

    Confidence Level    z
    90%                 1.645
    95%                 1.96
    98%                 2.326
    99%                 2.576

    Student's t-test:

    This is applied to find the confidence interval for a small sample. The population is Normal.

    Consider the variable $t$ defined as $t = \frac{\bar{x} - \mu}{\hat{\sigma}/\sqrt{n}}$

    [Note here, we use $\hat{\sigma}$, calculated from the sample, instead of $\sigma$.]

    The value of the variable $t$ varies from sample to sample and thus it forms a distribution looking very similar to the Normal distribution. This is the t-distribution. As we take larger and larger samples, the t-distributions become closer and closer to the Normal distribution $N(0, 1)$, which is nothing but the z-distribution.


    Now, instead of the sample size, the family of distributions is characterized by a parameter called degrees of freedom (df), usually denoted by $\nu$ [Nu].

    Degrees of freedom = No. of independent values used for the calculation of $\hat{\sigma}$.

    For example, if $n$ is the sample size, we use $n$ data points but they are related by their mean, $\bar{x} = \frac{1}{n} \sum_i x_i$ or $\sum_i (x_i - \bar{x}) = 0$. Such a condition in the form of a relation or equation is called a constraint. Thus we have $(n - 1)$ independent quantities and this is the degrees of freedom here.

    Degrees of freedom, $\nu$ = Number of values - Number of constraints

    In this case, $\nu = n - 1$

    The t-distributions are now designated as $t(\nu)$-distributions. As $\nu$ becomes higher, the $t(\nu)$-distributions tend more and more towards the z-distribution.

    Like the z-table, we now have a t-table to consult, from where we have the area under the curve within some t-range.

    So, for a Normal distribution, for a sample of size $n$, we have the confidence interval:

    $\left[ \bar{x} - t \frac{\hat{\sigma}}{\sqrt{n}}, \ \bar{x} + t \frac{\hat{\sigma}}{\sqrt{n}} \right]$ for a confidence level $(1 - \alpha)$, where $t$ is the critical value for $\nu = n - 1$ degrees of freedom.


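    EXAMPLE #1:

    Consider the following 10 measurements of some variable. The hypothesis is that the population mean is $\mu = 0$. We have to verify that. Assume that the readings follow a Normal Distribution.

    No. of Obs.   1      2      3      4      5      6      7      8      9      10
    Values        0.13   -0.09  0.06   0.15   -0.02  0.03   0.01   -0.02  -0.07  0.05

    $\bar{x} = \frac{\sum x}{n} = \frac{0.23}{10} = 0.023$

    $\hat{\sigma}^2 = \left( \frac{n}{n-1} \right) \left[ \frac{\sum x^2}{n} - \bar{x}^2 \right] = \left( \frac{10}{9} \right) \left[ \frac{0.0603}{10} - (0.023)^2 \right] = 0.00611$, so $\hat{\sigma} = 0.0782$

    Degrees of freedom, $\nu = n - 1 = 9$

    From the t-table for $t(9)$ with 95% confidence level, we have $t = 2.262$.

    Confidence interval: $\left[ 0.023 - 2.262 \times \frac{0.0782}{\sqrt{10}}, \ 0.023 + 2.262 \times \frac{0.0782}{\sqrt{10}} \right] = [-0.033, \ 0.079]$

    The mean $\mu = 0$ is trapped inside the above interval. So the hypothesis is not rejected: the null hypothesis is accepted.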
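The confidence interval for the ten measurements can be recomputed as a check. In the sketch below the critical value 2.262 is the standard t-table entry for 9 degrees of freedom at 95% confidence (two tails), and the hypothesised mean is taken as 0, which is an assumption made here.

```python
import math

data = [0.13, -0.09, 0.06, 0.15, -0.02, 0.03, 0.01, -0.02, -0.07, 0.05]
n = len(data)

mean = sum(data) / n
# Unbiased estimate of the population variance (divide by n - 1).
var_hat = sum((x - mean) ** 2 for x in data) / (n - 1)

T_CRIT = 2.262                 # t-table value: 9 degrees of freedom, 95% two-tailed
half_width = T_CRIT * math.sqrt(var_hat / n)
low, high = mean - half_width, mean + half_width

MU_0 = 0.0                     # hypothesised population mean (assumed here)
print(round(mean, 3), round(low, 3), round(high, 3), low <= MU_0 <= high)
```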

    EXAMPLE #2:

    The mean life time (in hours) of an electric bulb is measured to be 10.4. Now a technology is introduced to increase the life time. Experimental data are collected from a random sample of size $n$, for which $\sum x$ and $\sum x^2$ are given. Test whether there is any evidence at the 10% significance level that the new technology has actually increased the life time.

    [Note that it is not asked if there is any decrease in life time. The question is whether there is any increase or it remains the same.]

    Ans.

    Null hypothesis, $H_0: \mu = 10.4$; Alternate hypothesis, $H_1: \mu > 10.4$

    Here we consider a one tail t-test as we are to look for the increase only.

    Sample mean, $\bar{x} = \frac{\sum x}{n}$

    Unbiased estimate of the population variance (from the sample),

    $\hat{\sigma}^2 = \left( \frac{n}{n-1} \right) \left[ \frac{\sum x^2}{n} - \bar{x}^2 \right]$


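    For the t-test, $t = \frac{\bar{x} - \mu}{\hat{\sigma}/\sqrt{n}}$, with $\mu = 10.4$.

    Here, degrees of freedom, $\nu = n - 1$. So we look for the area under the curve for the $t(\nu)$ distribution.

    For 10% significance, i.e. for 90% confidence level, we find the critical value of $t$ from the t-table (one tail). Our observed value is larger than this and so lies in the rejection region. That means that the mean life time is increased: the null hypothesis is rejected in favour of the alternate hypothesis.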

    EXAMPLE #3:

    You are measuring some length which is claimed to be 10 cm. Five measurements by you are 9.88, 10.18, 10.23, 10.39, 10.25 cm. Assume that the measurements follow a Normal distribution. Test at the 5% significance level whether these results support the claim or the measurement is biased.

    Ans.

    Since the bias can be in either direction (positive or negative), we consider a two tail test.

    The Hypothesis: Null $H_0: \mu = 10$ cm, Alternate $H_1: \mu \ne 10$ cm.

    Sample mean, $\bar{x} = \frac{\sum x}{n} = \frac{50.93}{5} = 10.186$

    Variance, $\hat{\sigma}^2 = \left( \frac{n}{n-1} \right) \left[ \frac{\sum x^2}{n} - \bar{x}^2 \right] = \left( \frac{5}{4} \right) \left[ \frac{518.9143}{5} - (10.186)^2 \right] = 0.0353$; this is an unbiased estimate of the population variance.

    For 5% significance level, we consider the area of 0.95 (shaded area in the fig.) around the centre, and an area of 0.025 on each side (at both the tails).


    We consider the $t(4)$ distribution as the degrees of freedom, $\nu = n - 1 = 4$. The rejection region on either side corresponds to $|t| > 2.776$, from the table.

    The observed value is $t = \frac{10.186 - 10}{0.188/\sqrt{5}} = 2.21$. Here we find that the t-value is below the rejection region, that is, in the acceptance region. Thus the hypothesis ($\mu = 10$ cm) is accepted: the null hypothesis holds.
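The t-value of Example #3 can be computed directly from the five readings. This is a sketch using Python's `statistics` module, whose `stdev` divides by $(n-1)$ and therefore gives the unbiased estimate; the critical value 2.776 is the standard t-table entry for 4 degrees of freedom at 5% significance (two tails).

```python
import math
import statistics

data = [9.88, 10.18, 10.23, 10.39, 10.25]
MU_0 = 10.0                       # claimed length in cm
n = len(data)

mean = statistics.mean(data)
sigma_hat = statistics.stdev(data)            # uses n - 1 in the denominator
t = (mean - MU_0) / (sigma_hat / math.sqrt(n))

T_CRIT = 2.776                    # t-table value: 4 degrees of freedom, 5% two-tailed
reject = abs(t) > T_CRIT
print(round(mean, 3), round(sigma_hat ** 2, 4), round(t, 2), "reject" if reject else "accept")
```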

    NOTE:

    In a one tail t-test, we consider only one side of the t-distribution, either the right side (for increase or positive values) or the left side (for decrease or negative values). For a two tail t-test, we consider both sides of the distribution (as we have done before), considering the fact that the value of the variable can increase or decrease from the mean value.

    Chi-Squared Test: ($\chi^2$-Test)

    In some measurement, we obtain the frequencies of some events. We call them observed frequencies ($O$). We have to test whether the observed frequencies are consistent with the expected frequencies ($E$) according to some given distribution or hypothesis.

    The measure of discrepancy between the observed and expected frequencies is defined by the following quantity:

    $\chi^2 = \sum \frac{(O - E)^2}{E}$


    Note: $\chi^2$ (Chi-square) is a positive quantity; the lower its value, the better is the agreement between the observed and expected frequencies. In other words, it gives a goodness of fit of the model or hypothesis. For $\chi^2 = 0$, the agreement is absolute.

    Like the t-distribution, we also have the $\chi^2$-distribution. We measure the $\chi^2$ values for different samples of the same size and obtain a distribution. The distribution, here also, is characterized by the degrees of freedom $\nu$. So for $\nu$ degrees of freedom we write $\chi^2(\nu)$.

    EXAMPLE #1:

    In a dice throw experiment, we obtain the following frequencies, where the dice was thrown 600 times.

    Score   1    2     3     4    5     6
    Freq.   90   108   110   95   100   97

    Let us check the above with respect to the $\chi^2$-test. Our hypothesis is that for each score, the probability = 1/6 (for a fair dice). So the expected frequency = $600 \times \frac{1}{6} = 100$.

    Hypothesis, $H_0$: the dice is fair, $H_1$: the dice is not fair.

    In this example, the degrees of freedom, $\nu = 6 - 1 = 5$.

    So after we calculate $\chi^2$ from the following table, we have to look up the $\chi^2$-table.

    Score   $O$   $E$   $O - E$   $(O - E)^2/E$
    1       90    100   -10       1
    2       108   100   8         0.64
    3       110   100   10        1
    4       95    100   -5        0.25
    5       100   100   0         0
    6       97    100   -3        0.09
    Total   600   600   0         2.98


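    To calculate $\chi^2$: $\chi^2 = \sum \frac{(O - E)^2}{E}$

    From the table, we see $\chi^2 = 2.98$. If we consider 90% confidence level (10% significance level), we have the critical value $\chi^2(5) = 9.236$. Our obtained value of $\chi^2$ is below this. So it falls within the acceptance region. The dice is fair: the null hypothesis is accepted.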
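The $\chi^2$ value for the dice experiment can be checked with a short function. The name `chi_squared` is illustrative; the critical value 9.236 is the standard table entry for 5 degrees of freedom at the 10% significance level.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [90, 108, 110, 95, 100, 97]
total = sum(observed)                      # 600 throws
expected = [total / 6] * 6                 # a fair dice gives 100 for each score

chi2 = chi_squared(observed, expected)
CHI2_CRIT = 9.236                          # chi-square table: 5 degrees of freedom, 10% level
print(round(chi2, 2), "accept" if chi2 < CHI2_CRIT else "reject")
```

The same function can be used for the blood group data: observed 55, 27, 18 against expected 50, 25, 25.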

    EXAMPLE #2:

    In a genetic study, it is predicted that the children with both parents of blood group AB will fall into blood groups AB, A and B in the ratio 2:1:1. Out of a random sample of 100, we find 55 children have blood group AB, 27 have blood group A and 18 have blood group B. Test at the 10% significance level whether the observed results agree with the theoretical prediction.

    Ans.

    Hypothesis $H_0$: The children's blood group is in ratio 2:1:1

    $H_1$: The children's blood group is NOT in ratio 2:1:1

    The ratio of probabilities AB, A, B is 2:1:1 = $\frac{1}{2} : \frac{1}{4} : \frac{1}{4}$, so the expected frequencies are 50, 25 and 25.

    Blood group   $O$   $E$   $O - E$   $(O - E)^2/E$
    AB            55    50    5         0.5
    A             27    25    2         0.16
    B             18    25    -7        1.96
    Total         100   100   0         2.62

    Degrees of freedom $\nu = 3 - 1 = 2$

    For the 10% significance level we look up the $\chi^2$-distribution table for $\chi^2(2)$.

    We find $\chi^2(2) = 4.605$.

    The rejection region is thus above $\chi^2 = 4.605$.

    Here the obtained value of $\chi^2 = 2.62$ is below the rejection region, so it falls in the acceptance region. The hypothesis is accepted: the null hypothesis holds.

    EXAMPLE #3

    The rain fall ($x$) at some place is measured in cm in the following table. We assume that $x$ is a random variable and it follows a Normal distribution with mean $\mu = 50$ and standard deviation $\sigma = 15$.

    (i) Calculate the expected frequencies of the different classes

    Class        $x < 35$   $35 - 45$   $45 - 55$   $55 - 65$   $x > 65$
    Obs. Freq.   10         18          28          18          12


    (ii) Carry out a goodness-of-fit analysis at the 5% level of significance to test the hypothesis that the random variable actually follows the Normal distribution $N(50, 15^2)$.

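    Ans.

    (i) The standardized variable is $z = \frac{x - \mu}{\sigma} = \frac{x - 50}{15}$.

    For $x =$ 35, 45, 55, 65 we have $z =$ -1, -0.333, 0.333, 1 respectively.

    Now follow the z-table. Here, total frequency $= 10 + 18 + 28 + 18 + 12 = 86$.

    For $x < 35$, we have $P(z < -1) \approx 0.1587$,
    Expected frequency $= 0.1587 \times 86 \approx 13.65$.

    For $35 \le x < 45$, $P(-1 < z < -0.333) \approx 0.2109$,
    Expected frequency $= 0.2109 \times 86 \approx 18.14$.

    For $45 \le x < 55$, $P(-0.333 < z < 0.333) \approx 0.2608$,
    Expected frequency $= 0.2608 \times 86 \approx 22.43$.

    By symmetry, the expected frequencies for the 4th and 5th groups are 18.14 and 13.65 respectively.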
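    The z-table lookups can also be done with the Normal distribution in Python's standard library. This sketch assumes mean 50 cm and standard deviation 15 cm, the values implied by the z-values -1, -0.333, 0.333, 1 quoted for 35, 45, 55, 65, and class boundaries at those four points. The results may differ in the second decimal place from values read off a printed z-table, because the table is rounded.

```python
from statistics import NormalDist

rain = NormalDist(mu=50, sigma=15)   # assumed model for the rainfall in cm
edges = [35, 45, 55, 65]             # class boundaries in cm
total = 86                           # total observed frequency

# The cumulative probability is 0 far to the left and 1 far to the right.
# The probability of a class is the difference of the cumulative
# probabilities at its two ends.
cdf = [0.0] + [rain.cdf(edge) for edge in edges] + [1.0]
probabilities = [high - low for low, high in zip(cdf, cdf[1:])]
expected = [total * p for p in probabilities]

for p, e in zip(probabilities, expected):
    print(f"P = {p:.4f}  expected frequency = {e:.2f}")
```

    Because the class probabilities add up to 1, the expected frequencies add up to the total frequency of 86.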

    To carry out the $\chi^2$-test we prepare the following table.


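    Class (cm)          Observed ($O$)   Expected ($E$)   $O - E$   $(O - E)^2 / E$
    $x < 35$                 10              13.65          -3.65        0.98
    $35 \le x < 45$          18              18.14          -0.14        0.00
    $45 \le x < 55$          28              22.43           5.57        1.38
    $55 \le x < 65$          18              18.14          -0.14        0.00
    $x \ge 65$               12              13.65          -1.65        0.20
    Total                    86              86.01          ≈ 0          2.56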

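    Here, $\chi^2 = 2.56$ and the degrees of freedom $= 5 - 1 = 4$.

    From the $\chi^2$-distribution table, $\chi^2_{0.05}(4) = 9.488$ for the 5% significance level.

    Since 2.56 is not in the rejection region (above 9.488), we do not reject the null hypothesis: the data are consistent with the Normal distribution $N(50, 15^2)$.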
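    The test statistic and its tail probability can be reproduced as follows. This is a sketch under stated assumptions: the expected frequencies are the rounded values 13.65, 18.14, 22.43, 18.14, 13.65 (the middle one follows from the table total 86.01), the degrees of freedom are taken as 5 - 1 = 4 because the mean and standard deviation are given rather than estimated, and the helper `chi2_upper_tail_even_dof` is an illustrative name. It uses the closed form of the $\chi^2$ upper tail that holds for an even number of degrees of freedom.

```python
import math

observed = [10, 18, 28, 18, 12]
expected = [13.65, 18.14, 22.43, 18.14, 13.65]   # rounded values from part (i)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1   # assumption: mean and s.d. are given, not estimated


def chi2_upper_tail_even_dof(x, dof):
    """P(chi-square > x) for a positive even number of degrees of freedom.

    Uses exp(-x/2) * sum_{k < dof/2} (x/2)^k / k!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive even dof")
    half = x / 2
    terms = (half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * sum(terms)


p_value = chi2_upper_tail_even_dof(chi2, dof)
print(f"chi-square = {chi2:.2f}, dof = {dof}, p-value = {p_value:.2f}")
```

    A p-value well above 0.05 agrees with the decision reached from the critical value.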

    Additional Information

    Type I and Type II errors:

    In hypothesis testing, we call

    Type I Error -> When we incorrectly reject a true Null Hypothesis.

    Type II Error -> When we fail to reject a false Null Hypothesis.
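    A Type I error can be illustrated by simulation. If a die really is fair and we test it at the 10% significance level, we should wrongly reject the fair-die hypothesis in roughly 10% of experiments. The sketch below repeats the 600-roll experiment many times; the number of trials, the seed and the function names are illustrative choices, and the critical value 9.236 (5 degrees of freedom, 10% level) is taken from a standard $\chi^2$-table.

```python
import random


def chi_square_for_fair_die(counts):
    """Chi-square statistic against equal expected counts for each score."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def type_one_error_rate(trials=2000, rolls=600, critical=9.236, seed=1):
    """Fraction of simulated fair dice that the test wrongly rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        counts = [0] * 6
        for _ in range(rolls):
            counts[rng.randrange(6)] += 1   # one roll of a fair die
        if chi_square_for_fair_die(counts) > critical:
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(f"fraction of fair dice wrongly rejected: {rate:.3f}")
```

    Estimating the Type II error rate would need the same loop with a biased die, counting how often the test fails to reject.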

    Probability Density Function:

    In probability theory, the probability density function (P.D.F.) of a continuous random variable gives the probability per unit interval around a given value. The P.D.F., when integrated over a finite interval, gives the probability that the variable lies in that interval:

    $P(a \le x \le b) = \int_a^b f(x)\,dx$, where $f(x)$ is the P.D.F.
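    To make the idea concrete, the P.D.F. of the standard Normal distribution can be integrated numerically over an interval. This is a minimal sketch using the trapezoid rule; the function names and the number of steps are illustrative choices.

```python
import math


def standard_normal_pdf(x):
    """Density of the standard Normal distribution (mean 0, s.d. 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def integrate_trapezoid(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b with the trapezoid rule."""
    width = (b - a) / steps
    area = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        area += f(a + i * width)
    return area * width


# Probability that z lies within one standard deviation of the mean.
prob_within_one_sd = integrate_trapezoid(standard_normal_pdf, -1.0, 1.0)
print(f"P(-1 <= z <= 1) = {prob_within_one_sd:.4f}")
```

    The density at a single point, such as `standard_normal_pdf(0.0)`, is not itself a probability; only the area under the curve over an interval is.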


    *The lecture notes are for private circulation only. Some of the ideas and examples are taken from books and from numerous materials available on the internet.

    Books and Websites:

    1. Advanced Level Mathematics: Statistics 1 & 2, Steve Dobbs and Jane Miller (Pub: Cambridge International Examinations)

    2. The Analysis of Time Series: An Introduction (fifth edition), C. Chatfield (Pub: Chapman & Hall)

    3. Basic & Clinical Biostatistics (fourth edition), Beth Dawson, Robert G. Trapp (Pub: Lange Medical Books/McGraw-Hill)

    4. Numerical Recipes (2nd Ed, Vol I FORTRAN), William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery (Pub: Cambridge University Press)

    5. Mathematical Physics, H.K. Dass

    6. Website: people.richland.edu