5 Joint Probability Distributions and Random Samples


Page 1: 5 Joint Probability Distributions and Random Samples

5 Joint Probability Distributions and Random Samples

Page 2: 5 Joint Probability Distributions and Random Samples

5.1 Jointly Distributed Random Variables

Page 3: 5 Joint Probability Distributions and Random Samples

3

Two Discrete Random Variables

Page 4: 5 Joint Probability Distributions and Random Samples

4

Two Discrete Random Variables

The probability mass function (pmf) of a single discrete rv X specifies how much probability mass is placed on each possible X value.

The joint pmf of two discrete rv’s X and Y describes how much probability mass is placed on each possible pair of values (x, y).

Definition

Let X and Y be two discrete rv’s defined on the sample space of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by

p (x, y) = P(X = x and Y = y)

Page 5: 5 Joint Probability Distributions and Random Samples

5

Two Discrete Random Variables

It must be the case that p(x, y) ≥ 0 and Σ_x Σ_y p(x, y) = 1.

Now let A be any set consisting of pairs of (x, y) values (e.g., A = {(x, y): x + y = 5} or {(x, y): max(x, y) ≤ 3}).

Then the probability P[(X, Y) ∈ A] is obtained by summing the joint pmf over pairs in A:

P[(X, Y) ∈ A] = Σ_{(x, y) ∈ A} p(x, y)

Page 6: 5 Joint Probability Distributions and Random Samples

6

Example 1

A large insurance agency services a number of customers who have purchased both a homeowner’s policy and an automobile policy from the agency. For each type of policy, a deductible amount must be specified.

For an automobile policy, the choices are $100 and $250, whereas for a homeowner’s policy, the choices are 0, $100, and $200.

Suppose an individual with both types of policy is selected at random from the agency’s files. Let X = the deductible amount on the auto policy and Y = the deductible amount on the homeowner’s policy.

Page 7: 5 Joint Probability Distributions and Random Samples

7

Example 1

Possible (X, Y) pairs are then (100, 0), (100, 100), (100, 200), (250, 0), (250, 100), and (250, 200); the joint pmf specifies the probability associated with each one of these pairs, with any other pair having probability zero.

Suppose the joint pmf is given in the accompanying joint probability table:

                  y
  p(x, y)     0      100     200
  x   100    .20     .10     .20
      250    .05     .15     .30

cont’d

Page 8: 5 Joint Probability Distributions and Random Samples

8

Example 1

Then p(100, 100) = P(X = 100 and Y = 100) = P($100 deductible on both policies) = .10.

The probability P(Y ≥ 100) is computed by summing probabilities of all (x, y) pairs for which y ≥ 100:

P(Y ≥ 100) = p(100, 100) + p(250, 100) + p(100, 200) + p(250, 200)

= .75

cont’d

Page 9: 5 Joint Probability Distributions and Random Samples

9

Two Discrete Random Variables

Definition

The marginal probability mass function of X, denoted by pX(x), is given by

pX(x) = Σ_y p(x, y)   for each possible value x

Similarly, the marginal probability mass function of Y is

pY(y) = Σ_x p(x, y)   for each possible value y.

Page 10: 5 Joint Probability Distributions and Random Samples

10

Example 2

Example 1 continued… The possible X values are x = 100 and x = 250, so computing row totals in the joint probability table yields

pX(100) = p(100, 0) + p(100, 100) + p(100, 200) = .50

and

pX(250) = p(250, 0) + p(250, 100) + p(250, 200) = .50

The marginal pmf of X is then pX(100) = pX(250) = .50, with pX(x) = 0 for any other value of x.

Page 11: 5 Joint Probability Distributions and Random Samples

11

Example 2

Similarly, the marginal pmf of Y is obtained from column totals as pY(0) = .25, pY(100) = .25, and pY(200) = .50,

so P(Y ≥ 100) = pY(100) + pY(200) = .75 as before.

cont’d
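These joint and marginal pmf calculations are easy to reproduce in code. Below is a minimal Python sketch (illustrative only, not part of the original slides) that stores the joint probability table from Example 1 as a dictionary, derives both marginal pmf’s, and recomputes P(Y ≥ 100) two ways.

```python
# Joint pmf from Example 1: keys are (x, y) deductible pairs, values are probabilities.
joint_pmf = {
    (100, 0): .20, (100, 100): .10, (100, 200): .20,
    (250, 0): .05, (250, 100): .15, (250, 200): .30,
}

# Sanity checks: p(x, y) >= 0 and the probabilities sum to 1.
assert all(p >= 0 for p in joint_pmf.values())
assert abs(sum(joint_pmf.values()) - 1.0) < 1e-9

# Marginal pmf's: sum the joint pmf over the other variable (row and column totals).
p_X, p_Y = {}, {}
for (x, y), p in joint_pmf.items():
    p_X[x] = p_X.get(x, 0) + p
    p_Y[y] = p_Y.get(y, 0) + p

# P(Y >= 100) directly from the joint pmf, and again from the marginal pmf of Y.
prob_joint = sum(p for (x, y), p in joint_pmf.items() if y >= 100)
prob_marginal = p_Y[100] + p_Y[200]

print({x: round(p, 2) for x, p in p_X.items()})        # {100: 0.5, 250: 0.5}
print({y: round(p, 2) for y, p in p_Y.items()})        # {0: 0.25, 100: 0.25, 200: 0.5}
print(round(prob_joint, 2), round(prob_marginal, 2))   # 0.75 0.75
```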

Page 12: 5 Joint Probability Distributions and Random Samples

12

Two Continuous Random Variables

Page 13: 5 Joint Probability Distributions and Random Samples

13

Two Continuous Random Variables

The probability that the observed value of a continuous rv X lies in a one-dimensional set A (such as an interval) is obtained by integrating the pdf f (x) over the set A.

Similarly, the probability that the pair (X, Y) of continuous rv’s falls in a two-dimensional set A (such as a rectangle) is obtained by integrating a function called the joint density function.

Page 14: 5 Joint Probability Distributions and Random Samples

14

Two Continuous Random Variables

Definition

Let X and Y be continuous rv’s. A joint probability density function f(x, y) for these two variables is a function satisfying f(x, y) ≥ 0 and ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.

Then for any two-dimensional set A,

P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy

Page 15: 5 Joint Probability Distributions and Random Samples

15

Two Continuous Random Variables

In particular, if A is the two-dimensional rectangle {(x, y): a ≤ x ≤ b, c ≤ y ≤ d}, then

P[(X, Y) ∈ A] = P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_a^b ∫_c^d f(x, y) dy dx

We can think of f(x, y) as specifying a surface at height f(x, y) above the point (x, y) in a three-dimensional coordinate system.

Then P[(X, Y) ∈ A] is the volume underneath this surface and above the region A, analogous to the area under a curve in the case of a single rv.

Page 16: 5 Joint Probability Distributions and Random Samples

16

Two Continuous Random Variables

This is illustrated in Figure 5.1.

Figure 5.1

P[(X, Y) ∈ A] = volume under density surface above A

Page 17: 5 Joint Probability Distributions and Random Samples

17

Example 3

A bank operates both a drive-up facility and a walk-up window. On a randomly selected day, let X = the proportion of time that the drive-up facility is in use and Y = the proportion of time that the walk-up window is in use.

Then the set of possible values for (X, Y) is the rectangle

D = {(x, y): 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}.

Page 18: 5 Joint Probability Distributions and Random Samples

18

Example 3

Suppose the joint pdf of (X, Y) is given by

To verify that this is a legitimate pdf, note that f(x, y) ≥ 0 and ∫∫_D f(x, y) dx dy = 1.

cont’d

Page 19: 5 Joint Probability Distributions and Random Samples

19

Example 3

The probability that neither facility is busy more than one-quarter of the time is

P(0 ≤ X ≤ 1/4, 0 ≤ Y ≤ 1/4) = ∫_0^{1/4} ∫_0^{1/4} f(x, y) dx dy

cont’d
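The specific formula for f(x, y) is not reproduced above, so the sketch below assumes a hypothetical density of a form commonly used for this kind of example, f(x, y) = 1.2(x + y²) on the unit square (an assumption for illustration, not necessarily the slides’ function), and checks both calculations numerically with scipy.integrate.dblquad.

```python
from scipy.integrate import dblquad

# Assumed joint pdf on D = [0, 1] x [0, 1] (hypothetical; for illustration only).
def f(y, x):                    # dblquad expects the integrand as f(y, x)
    return 1.2 * (x + y ** 2)

# Legitimacy check: the density should integrate to 1 over D.
total, _ = dblquad(f, 0, 1, lambda x: 0, lambda x: 1)
print(round(total, 6))          # 1.0

# P(0 <= X <= 1/4, 0 <= Y <= 1/4): integrate over the small corner square.
prob, _ = dblquad(f, 0, 0.25, lambda x: 0, lambda x: 0.25)
print(round(prob, 4))           # about 0.0109 for this assumed density
```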

Page 20: 5 Joint Probability Distributions and Random Samples

20

Example 3

cont’d

Page 21: 5 Joint Probability Distributions and Random Samples

21

Two Continuous Random Variables

The marginal pdf of each variable can be obtained in a manner analogous to what we did in the case of two discrete variables.

The marginal pdf of X at the value x results from holding x fixed in the pair (x, y) and integrating the joint pdf over y. Integrating the joint pdf with respect to x gives the marginal pdf of Y.

Page 22: 5 Joint Probability Distributions and Random Samples

22

Two Continuous Random Variables

Definition

The marginal probability density functions of X and Y, denoted by fX(x) and fY(y), respectively, are given by

fX(x) = ∫_{−∞}^{∞} f(x, y) dy   for −∞ < x < ∞

fY(y) = ∫_{−∞}^{∞} f(x, y) dx   for −∞ < y < ∞

Page 23: 5 Joint Probability Distributions and Random Samples

23

Independent Random Variables

Page 24: 5 Joint Probability Distributions and Random Samples

24

Independent Random Variables

In many situations, information about the observed value of one of the two variables X and Y gives information about the value of the other variable.

In Example 1, the marginal probability of X at x = 250 was .5, as was the probability that X = 100. If, however, we are told that the selected individual had Y = 0, then X = 100 is four times as likely as X = 250.

Thus there is a dependence between the two variables. Earlier, we pointed out that one way of defining independence of two events is via the condition P(A ∩ B) = P(A) · P(B).

Page 25: 5 Joint Probability Distributions and Random Samples

25

Independent Random Variables

Here is an analogous definition for the independence of two rv’s.

Definition

Two random variables X and Y are said to be independent if for every pair of x and y values

p (x, y) = pX (x) pY (y) when X and Y are discrete

or

f (x, y) = fX (x) fY (y) when X and Y are continuous

If (5.1) is not satisfied for all (x, y), then X and Y are said to be dependent.

(5.1)

Page 26: 5 Joint Probability Distributions and Random Samples

26

Independent Random Variables

The definition says that two variables are independent if their joint pmf or pdf is the product of the two marginal pmf’s or pdf’s.

Intuitively, independence says that knowing the value of one of the variables does not provide additional information about what the value of the other variable might be.

Page 27: 5 Joint Probability Distributions and Random Samples

27

Example 6

In the insurance situation of Examples 1 and 2,

p(100, 100) = .10 ≠ (.5)(.25) = pX(100) · pY(100)

so X and Y are not independent.

Independence of X and Y requires that every entry in the joint probability table be the product of the corresponding row and column marginal probabilities.
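A quick way to see this dependence is to compare every cell of the joint table with the product of its marginals; the short Python sketch below (illustrative only, not from the slides) does exactly that.

```python
# Joint pmf of the insurance example; p_X and p_Y are its row and column totals.
joint_pmf = {
    (100, 0): .20, (100, 100): .10, (100, 200): .20,
    (250, 0): .05, (250, 100): .15, (250, 200): .30,
}
p_X = {x: sum(p for (xx, _), p in joint_pmf.items() if xx == x) for x in (100, 250)}
p_Y = {y: sum(p for (_, yy), p in joint_pmf.items() if yy == y) for y in (0, 100, 200)}

# Independent only if every entry equals the product of the corresponding marginals.
independent = all(abs(joint_pmf[(x, y)] - p_X[x] * p_Y[y]) < 1e-9 for (x, y) in joint_pmf)
print(independent)   # False: e.g. p(100, 100) = .10 but pX(100)*pY(100) = .125
```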

Page 28: 5 Joint Probability Distributions and Random Samples

28

Independent Random Variables

Independence of two random variables is most useful when the description of the experiment under study suggests that X and Y have no effect on one another.

Then once the marginal pmf’s or pdf’s have been specified, the joint pmf or pdf is simply the product of the two marginal functions. It follows that

P(a ≤ X ≤ b, c ≤ Y ≤ d) = P(a ≤ X ≤ b) · P(c ≤ Y ≤ d)

Page 29: 5 Joint Probability Distributions and Random Samples

29

5.2 Expected Values, Covariance, and Correlation

Page 30: 5 Joint Probability Distributions and Random Samples

30

Expected Values, Covariance, and Correlation

Proposition

Let X and Y be jointly distributed rv’s with pmf p(x, y) or pdf f (x, y) according to whether the variables are discrete or continuous.

Then the expected value of a function h(X, Y), denoted by E[h(X, Y)] or μ_{h(X, Y)}, is given by

E[h(X, Y)] = Σ_x Σ_y h(x, y) · p(x, y)   if X and Y are discrete

E[h(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x, y) · f(x, y) dx dy   if X and Y are continuous

Page 31: 5 Joint Probability Distributions and Random Samples

31

Example 13

Five friends have purchased tickets to a certain concert. If the tickets are for seats 1–5 in a particular row and the tickets are randomly distributed among the five, what is the expected number of seats separating any particular two of the five?

Let X and Y denote the seat numbers of the first and second individuals, respectively. Possible (X, Y) pairs are {(1, 2), (1, 3), . . . , (5, 4)}, and the joint pmf of (X, Y) is

p(x, y) = 1/20   x = 1, . . . , 5; y = 1, . . . , 5; x ≠ y

p(x, y) = 0      otherwise

Page 32: 5 Joint Probability Distributions and Random Samples

32

Example 13

The number of seats separating the two individuals is h(X, Y) = | X – Y | – 1.

The accompanying table gives h(x, y) for each possible (x, y) pair.

cont’d

Page 33: 5 Joint Probability Distributions and Random Samples

33

Example 13

Thus

E[h(X, Y)] = Σ_{(x, y)} h(x, y) · p(x, y) = Σ_{x ≠ y} (|x – y| – 1) · (1/20) = 1

cont’d
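Because each of the 20 equally likely ordered seat pairs has probability 1/20, E[h(X, Y)] can also be found by brute-force enumeration; here is a small illustrative Python sketch (not from the slides).

```python
from itertools import permutations

# All ordered (x, y) seat pairs with x != y; each pair has probability 1/20.
pairs = list(permutations(range(1, 6), 2))
p = 1 / len(pairs)                              # 1/20

# h(x, y) = |x - y| - 1 = number of seats separating the two individuals.
expected_h = sum((abs(x - y) - 1) * p for x, y in pairs)
print(expected_h)                               # 1.0, i.e. one seat on average
```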

Page 34: 5 Joint Probability Distributions and Random Samples

34

Covariance

Page 35: 5 Joint Probability Distributions and Random Samples

35

Covariance

When two random variables X and Y are not independent, it is frequently of interest to assess how strongly they are related to one another.

Definition

The covariance between two rv’s X and Y is

Cov(X, Y) = E[(X – μX)(Y – μY)]

= Σ_x Σ_y (x – μX)(y – μY) p(x, y)   X, Y discrete

= ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x – μX)(y – μY) f(x, y) dx dy   X, Y continuous

Page 36: 5 Joint Probability Distributions and Random Samples

36

Covariance

That is, since X – μX and Y – μY are the deviations of the two variables from their respective mean values, the covariance is the expected product of deviations. Note that Cov(X, X) = E[(X – μX)²] = V(X).

The rationale for the definition is as follows.

Suppose X and Y have a strong positive relationship to one another, by which we mean that large values of X tend to occur with large values of Y and small values of X with small values of Y.

Page 37: 5 Joint Probability Distributions and Random Samples

37

Covariance

Then most of the probability mass or density will be associated with (x – μX) and (y – μY), either both positive (both X and Y above their respective means) or both negative, so the product (x – μX)(y – μY) will tend to be positive.

Thus for a strong positive relationship, Cov(X, Y) should be quite positive.

For a strong negative relationship, the signs of (x – μX) and (y – μY) will tend to be opposite, yielding a negative product.

Page 38: 5 Joint Probability Distributions and Random Samples

38

Covariance

Thus for a strong negative relationship, Cov(X, Y) should be quite negative.

If X and Y are not strongly related, positive and negative products will tend to cancel one another, yielding a covariance near 0.

Page 39: 5 Joint Probability Distributions and Random Samples

39

Covariance

Figure 5.4 illustrates the different possibilities. The covariance depends on both the set of possible pairs and the probabilities. In Figure 5.4, the probabilities could be changed without altering the set of possible pairs, and this could drastically change the value of Cov(X, Y).

p(x, y) = 1/10 for each of ten pairs corresponding to indicated points:

Figure 5.4

(a) positive covariance; (b) negative covariance; (c) covariance near zero

Page 40: 5 Joint Probability Distributions and Random Samples

40

Example 15

The joint and marginal pmf’s for

X = automobile policy deductible amount and

Y = homeowner policy deductible amount in Example 1 were given in the joint probability table shown earlier,

from which μX = Σ x · pX(x) = 175 and μY = Σ y · pY(y) = 125.

Page 41: 5 Joint Probability Distributions and Random Samples

41

Example 15

Therefore,

Cov(X, Y) = Σ_{(x, y)} (x – 175)(y – 125) p(x, y)

= (100 – 175)(0 – 125)(.20) + . . . + (250 – 175)(200 – 125)(.30)

= 1875

cont’d
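The arithmetic in Example 15 can be verified with a few lines of code. This illustrative sketch (not from the slides) computes Cov(X, Y) both from the definition and from the shortcut formula E(XY) – μX · μY stated on the next slide.

```python
joint_pmf = {
    (100, 0): .20, (100, 100): .10, (100, 200): .20,
    (250, 0): .05, (250, 100): .15, (250, 200): .30,
}

# Means of X and Y taken directly from the joint pmf.
mu_X = sum(x * p for (x, y), p in joint_pmf.items())
mu_Y = sum(y * p for (x, y), p in joint_pmf.items())

# Covariance from the definition E[(X - mu_X)(Y - mu_Y)].
cov_def = sum((x - mu_X) * (y - mu_Y) * p for (x, y), p in joint_pmf.items())

# Covariance from the shortcut formula E(XY) - mu_X * mu_Y.
e_xy = sum(x * y * p for (x, y), p in joint_pmf.items())
cov_short = e_xy - mu_X * mu_Y

print(round(mu_X), round(mu_Y))                    # 175 125
print(round(cov_def, 1), round(cov_short, 1))      # 1875.0 1875.0
```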

Page 42: 5 Joint Probability Distributions and Random Samples

42

Covariance

The following shortcut formula for Cov(X, Y) simplifies the computations.

Proposition

Cov(X, Y) = E(XY) – μX · μY

According to this formula, no intermediate subtractions are necessary; only at the end of the computation is μX · μY subtracted from E(XY). The proof involves expanding (X – μX)(Y – μY) and then taking the expected value of each term separately.

Page 43: 5 Joint Probability Distributions and Random Samples

43

Correlation

Page 44: 5 Joint Probability Distributions and Random Samples

44

Correlation

Definition

The correlation coefficient of X and Y, denoted by Corr(X, Y), ρ_{X,Y}, or just ρ, is defined by

ρ_{X,Y} = Cov(X, Y) / (σX · σY)

Page 45: 5 Joint Probability Distributions and Random Samples

45

Example 17

It is easily verified that in the insurance scenario of Example 15, E(X²) = 36,250,

σX² = 36,250 – (175)² = 5625,

σX = 75, E(Y²) = 22,500,

σY² = 22,500 – (125)² = 6875, and σY = 82.92.

This gives

ρ = Corr(X, Y) = 1875 / [(75)(82.92)] = .301
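Continuing the same sketch, the standard deviations and the correlation coefficient follow directly (illustrative code, not from the slides; the small discrepancy with .301 comes from the rounded standard deviations used above).

```python
import math

joint_pmf = {
    (100, 0): .20, (100, 100): .10, (100, 200): .20,
    (250, 0): .05, (250, 100): .15, (250, 200): .30,
}
mu_X = sum(x * p for (x, y), p in joint_pmf.items())          # 175
mu_Y = sum(y * p for (x, y), p in joint_pmf.items())          # 125

# Variances via E(X^2) - mu_X^2 and E(Y^2) - mu_Y^2, then standard deviations.
var_X = sum(x**2 * p for (x, y), p in joint_pmf.items()) - mu_X**2    # 5625
var_Y = sum(y**2 * p for (x, y), p in joint_pmf.items()) - mu_Y**2    # 6875
sd_X, sd_Y = math.sqrt(var_X), math.sqrt(var_Y)               # 75 and about 82.92

cov = sum((x - mu_X) * (y - mu_Y) * p for (x, y), p in joint_pmf.items())
rho = cov / (sd_X * sd_Y)
print(round(rho, 4))                                          # about 0.3015
```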

Page 46: 5 Joint Probability Distributions and Random Samples

46

Correlation

The following proposition shows that ρ remedies the defect of Cov(X, Y) and also suggests how to recognize the existence of a strong (linear) relationship.

Proposition

1. If a and c are either both positive or both negative,

Corr(aX + b, cY + d) = Corr(X, Y)

2. For any two rv’s X and Y, –1 ≤ Corr(X, Y) ≤ 1.

Page 47: 5 Joint Probability Distributions and Random Samples

47

Correlation

If we think of p(x, y) or f(x, y) as prescribing a mathematical model for how the two numerical variables X and Y are distributed in some population (height and weight, verbal SAT score and quantitative SAT score, etc.), then ρ is a population characteristic or parameter that measures how strongly X and Y are related in the population.

We will consider taking a sample of pairs (x1, y1), . . . , (xn, yn) from the population.

The sample correlation coefficient r will then be defined and used to make inferences about ρ.

Page 48: 5 Joint Probability Distributions and Random Samples

48

Correlation

The correlation coefficient is actually not a completely general measure of the strength of a relationship.

Proposition

1. If X and Y are independent, then ρ = 0, but ρ = 0 does not imply independence.

2. ρ = 1 or –1 iff Y = aX + b for some numbers a and b with a ≠ 0.

Page 49: 5 Joint Probability Distributions and Random Samples

49

Correlation

This proposition says that ρ is a measure of the degree of linear relationship between X and Y, and only when the two variables are perfectly related in a linear manner will ρ be as positive or negative as it can be.

A ρ less than 1 in absolute value indicates only that the relationship is not completely linear, but there may still be a very strong nonlinear relation.

Page 50: 5 Joint Probability Distributions and Random Samples

50

Correlation

Also, ρ = 0 does not imply that X and Y are independent, but only that there is a complete absence of a linear relationship. When ρ = 0, X and Y are said to be uncorrelated.

Two variables could be uncorrelated yet highly dependent because there is a strong nonlinear relationship, so be careful not to conclude too much from knowing that ρ = 0.

Page 51: 5 Joint Probability Distributions and Random Samples

51

Correlation

A value of ρ near 1 does not necessarily imply that increasing the value of X causes Y to increase. It implies only that large X values are associated with large Y values.

For example, in the population of children, vocabulary size and number of cavities are quite positively correlated, but it is certainly not true that cavities cause vocabulary to grow.

Instead, the values of both these variables tend to increase as the value of age, a third variable, increases.

Page 52: 5 Joint Probability Distributions and Random Samples

52

5.3 Statistics and Their Distributions

Page 53: 5 Joint Probability Distributions and Random Samples

53

Statistics and Their Distributions

Definition

A statistic is any quantity whose value can be calculated from sample data. Prior to obtaining data, there is uncertainty as to what value of any particular statistic will result. Therefore, a statistic is a random variable and will be denoted by an uppercase letter; a lowercase letter is used to represent the calculated or observed value of the statistic.

Page 54: 5 Joint Probability Distributions and Random Samples

54

Statistics and Their Distributions

Thus the sample mean, regarded as a statistic (before a sample has been selected or an experiment carried out), is denoted by X̄; the calculated value of this statistic is x̄.

Similarly, S represents the sample standard deviation thought of as a statistic, and its computed value is s.

If samples of two different types of bricks are selected and the individual compressive strengths are denoted by X1, . . . , Xm and Y1, . . . , Yn, respectively, then the statistic X̄ – Ȳ, the difference between the two sample mean compressive strengths, is often of great interest.

Page 55: 5 Joint Probability Distributions and Random Samples

55

Statistics and Their Distributions

The probability distribution of a statistic is sometimes referred to as its sampling distribution to emphasize that it describes how the statistic varies in value across all samples that might be selected.

Page 56: 5 Joint Probability Distributions and Random Samples

56

Random Samples

Page 57: 5 Joint Probability Distributions and Random Samples

57

Random Samples

Definition

The rv’s X1, X2, . . . , Xn are said to form a (simple) random sample of size n if

1. The Xi’s are independent rv’s.

2. Every Xi has the same probability distribution.

Page 58: 5 Joint Probability Distributions and Random Samples

58

Random Samples

Conditions 1 and 2 can be paraphrased by saying that the Xi’s are independent and identically distributed (iid).

If sampling is either with replacement or from an infinite (conceptual) population, Conditions 1 and 2 are satisfied exactly.

These conditions will be approximately satisfied if sampling is without replacement, yet the sample size n is much smaller than the population size N.

Page 59: 5 Joint Probability Distributions and Random Samples

59

Random Samples

In practice, if n/N ≤ .05 (at most 5% of the population is sampled), we can proceed as if the Xi’s form a random sample.

The virtue of this sampling method is that the probability distribution of any statistic can be more easily obtained than for any other sampling method.

There are two general methods for obtaining information about a statistic’s sampling distribution. One method involves calculations based on probability rules, and the other involves carrying out a simulation experiment.

Page 60: 5 Joint Probability Distributions and Random Samples

60

Simulation Experiments

Page 61: 5 Joint Probability Distributions and Random Samples

61

Simulation Experiments

The following characteristics of an experiment must be specified:

1. The statistic of interest (X̄, S, a particular trimmed mean, etc.)

2. The population distribution (normal with μ = 100 and σ = 15, uniform with lower limit A = 5 and upper limit B = 10, etc.)

3. The sample size n (e.g., n = 10 or n = 50)

4. The number of replications k (number of samples to be obtained)

Page 62: 5 Joint Probability Distributions and Random Samples

62

Simulation Experiments

Then use appropriate software to obtain k different random samples, each of size n, from the designated population distribution.

For each sample, calculate the value of the statistic and construct a histogram of the k values. This histogram gives the approximate sampling distribution of the statistic.

The larger the value of k, the better the approximation will tend to be (the actual sampling distribution emerges as k → ∞). In practice, k = 500 or 1000 is usually sufficient if the statistic is “fairly simple.”
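As a concrete illustration of steps 1–4, the sketch below (not from the slides; the population, n, and k are arbitrary choices) draws k = 1000 samples of size n = 10 from a normal population with μ = 100 and σ = 15 and histograms the resulting sample means.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=1)

# 1. statistic: the sample mean   2. population: normal(mu=100, sigma=15)
# 3. sample size n = 10           4. number of replications k = 1000
mu, sigma, n, k = 100, 15, 10, 1000

samples = rng.normal(mu, sigma, size=(k, n))   # k samples, each of size n
xbars = samples.mean(axis=1)                   # one sample mean per replication

# The histogram of the k sample means approximates the sampling distribution of X-bar.
plt.hist(xbars, bins=30, density=True, edgecolor="black")
plt.xlabel("sample mean")
plt.title("Approximate sampling distribution of the sample mean (n = 10, k = 1000)")
plt.show()

# The k values should center near mu = 100 with spread near sigma/sqrt(n), about 4.74.
print(round(xbars.mean(), 2), round(xbars.std(ddof=1), 2))
```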

Page 63: 5 Joint Probability Distributions and Random Samples

63

Simulation Experiments

The final aspect of the histograms to note is their spread relative to one another.

The larger the value of n, the more concentrated is the sampling distribution about the mean value. This is why the histograms for n = 20 and n = 30 are based on narrower class intervals than those for the two smaller sample sizes.

For the larger sample sizes, most of the x̄ values are quite close to 8.25. This is the effect of averaging. When n is small, a single unusual x value can result in an x̄ value far from the center.

Page 64: 5 Joint Probability Distributions and Random Samples

64

Simulation Experiments

With a larger sample size, any unusual x values, when averaged in with the other sample values, still tend to yield an x̄ value close to μ.

Combining these insights yields a result that should appeal to your intuition:

X̄ based on a large n tends to be closer to μ than does X̄ based on a small n.

Page 65: 5 Joint Probability Distributions and Random Samples

65

5.4 The Distribution of the Sample Mean

Page 66: 5 Joint Probability Distributions and Random Samples

66

The Distribution of the Sample Mean

The importance of the sample mean X̄ springs from its use in drawing conclusions about the population mean μ. Some of the most frequently used inferential procedures are based on properties of the sampling distribution of X̄.

A preview of these properties appeared in the calculations and simulation experiments of the previous section, where we noted relationships between E(X̄) and μ and also among V(X̄), σ², and n.

Page 67: 5 Joint Probability Distributions and Random Samples

67

The Distribution of the Sample Mean

Proposition

Let X1, X2, . . . , Xn be a random sample from a distribution with mean value μ and standard deviation σ. Then

1. E(X̄) = μX̄ = μ

2. V(X̄) = σX̄² = σ²/n and σX̄ = σ/√n

In addition, with To = X1 + . . . + Xn (the sample total), E(To) = nμ, V(To) = nσ², and σTo = √n · σ.

Page 68: 5 Joint Probability Distributions and Random Samples

68

The Distribution of the Sample Mean

The sampling distribution of X̄ is centered precisely at the mean of the population.

The X̄ distribution becomes more concentrated about μ as the sample size n increases.

The distribution of To becomes more spread out as n increases. Averaging moves probability in toward the middle, whereas totaling spreads probability out over a wider and wider range of values.

The standard deviation σX̄ = σ/√n is often called the standard error of the mean.

Page 69: 5 Joint Probability Distributions and Random Samples

69

Example 24

In a notched tensile fatigue test on a titanium specimen, the expected number of cycles to first acoustic emission (used to indicate crack initiation) is μ = 28,000, and the standard deviation of the number of cycles is σ = 5000.

Let X1, X2, . . . , X25 be a random sample of size 25, where each Xi is the number of cycles on a different randomly selected specimen.

Then the expected value of the sample mean number of cycles until first emission is E(X̄) = μ = 28,000, and the expected total number of cycles for the 25 specimens is E(To) = nμ = 25(28,000) = 700,000.

Page 70: 5 Joint Probability Distributions and Random Samples

70

Example 24

The standard deviation of X̄ (the standard error of the mean) and of To are

σX̄ = σ/√n = 5000/√25 = 1000        σTo = √n · σ = √25 (5000) = 25,000

If the sample size increases to n = 100, E(X̄) is unchanged, but σX̄ = 500, half of its previous value (the sample size must be quadrupled to halve the standard deviation of X̄).

cont’d

Page 71: 5 Joint Probability Distributions and Random Samples

71

The Case of a Normal Population Distribution

Page 72: 5 Joint Probability Distributions and Random Samples

72

The Case of a Normal Population Distribution

Proposition

Let X1, X2, . . . , Xn be a random sample from a normal distribution with mean μ and standard deviation σ. Then for any n, X̄ is normally distributed (with mean μ and standard deviation σ/√n), as is To (with mean nμ and standard deviation √n · σ).

We know everything there is to know about the X̄ and To distributions when the population distribution is normal. In particular, probabilities such as P(a ≤ X̄ ≤ b) and P(c ≤ To ≤ d) can be obtained simply by standardizing.

Page 73: 5 Joint Probability Distributions and Random Samples

73

The Case of a Normal Population Distribution

Figure 5.14 illustrates the proposition.

A normal population distribution and sampling distributions

Figure 5.14

Page 74: 5 Joint Probability Distributions and Random Samples

74

Example 25

The time that it takes a randomly selected rat of a certain subspecies to find its way through a maze is a normally distributed rv with μ = 1.5 min and σ = .35 min. Suppose five rats are selected.

Let X1, . . . , X5 denote their times in the maze. Assuming the Xi’s to be a random sample from this normal distribution, what is the probability that the total time To = X1 + . . . + X5 for the five is between 6 and 8 min?

Page 75: 5 Joint Probability Distributions and Random Samples

75

Example 25

By the proposition, To has a normal distribution with mean μTo = nμ = 5(1.5) = 7.5

and

variance σTo² = nσ² = 5(.1225) = .6125, so σTo = .783.

To standardize To, subtract μTo and divide by σTo:

P(6 ≤ To ≤ 8) = P((6 – 7.5)/.783 ≤ Z ≤ (8 – 7.5)/.783)

cont’d

Page 76: 5 Joint Probability Distributions and Random Samples

76

Example 25

Determination of the probability that the sample average time X̄ (a normally distributed variable) is at most 2.0 min requires μX̄ = μ = 1.5 and σX̄ = σ/√n = .35/√5 = .1565.

Then

cont’d
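Both probabilities in Example 25 can be evaluated by standardizing or, equivalently, directly from a normal cdf; the following illustrative sketch uses scipy.stats.norm (not part of the original slides).

```python
import math
from scipy.stats import norm

mu, sigma, n = 1.5, 0.35, 5

# Total time To is normal with mean n*mu = 7.5 and sd sqrt(n)*sigma, about 0.783.
mu_T = n * mu
sd_T = math.sqrt(n) * sigma
p_total = norm.cdf(8, mu_T, sd_T) - norm.cdf(6, mu_T, sd_T)
print(round(p_total, 4))            # about 0.711

# Sample mean X-bar is normal with mean mu = 1.5 and sd sigma/sqrt(n), about 0.1565.
sd_xbar = sigma / math.sqrt(n)
p_mean = norm.cdf(2.0, mu, sd_xbar)
print(round(p_mean, 4))             # about 0.9993
```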

Page 77: 5 Joint Probability Distributions and Random Samples

77

The Central Limit Theorem

Page 78: 5 Joint Probability Distributions and Random Samples

78

The Central Limit Theorem

When the Xi’s are normally distributed, so is X̄ for every sample size n.

Even when the population distribution is highly nonnormal, averaging produces a distribution more bell-shaped than the one being sampled.

A reasonable conjecture is that if n is large, a suitable normal curve will approximate the actual distribution of X̄. The formal statement of this result is the most important theorem of probability.

Page 79: 5 Joint Probability Distributions and Random Samples

79

The Central Limit Theorem

Theorem

The Central Limit Theorem (CLT)

Let X1, X2, . . . , Xn be a random sample from a distribution with mean μ and variance σ². Then if n is sufficiently large, X̄ has approximately a normal distribution with μX̄ = μ and σX̄² = σ²/n, and To also has approximately a normal distribution with μTo = nμ and σTo² = nσ². The larger the value of n, the better the approximation.

Page 80: 5 Joint Probability Distributions and Random Samples

80

The Central Limit Theorem

Figure 5.15 illustrates the Central Limit Theorem.

The Central Limit Theorem illustrated

Figure 5.15

Page 81: 5 Joint Probability Distributions and Random Samples

81

Example 26

The amount of a particular impurity in a batch of a certain chemical product is a random variable with mean value 4.0 g and standard deviation 1.5 g.

If 50 batches are independently prepared, what is the (approximate) probability that the sample average amount of impurity is between 3.5 and 3.8 g?

According to the rule of thumb to be stated shortly, n = 50 is large enough for the CLT to be applicable.

Page 82: 5 Joint Probability Distributions and Random Samples

82

Example 26

X̄ then has approximately a normal distribution with mean value μX̄ = 4.0 and σX̄ = 1.5/√50 = .2121,

so

cont’d
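A short illustrative sketch of the CLT calculation in Example 26 (not from the slides), again using scipy.stats.norm:

```python
import math
from scipy.stats import norm

mu, sigma, n = 4.0, 1.5, 50

# By the CLT, X-bar is approximately normal with mean mu and sd sigma/sqrt(n), about 0.2121.
sd_xbar = sigma / math.sqrt(n)
p = norm.cdf(3.8, mu, sd_xbar) - norm.cdf(3.5, mu, sd_xbar)
print(round(p, 4))                  # about 0.164
```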

Page 83: 5 Joint Probability Distributions and Random Samples

83

The Central Limit Theorem

The CLT provides insight into why many random variables have probability distributions that are approximately normal.

For example, the measurement error in a scientific experiment can be thought of as the sum of a number of underlying perturbations and errors of small magnitude.

A practical difficulty in applying the CLT is in knowing when n is sufficiently large. The problem is that the accuracy of the approximation for a particular n depends on the shape of the original underlying distribution being sampled.

Page 84: 5 Joint Probability Distributions and Random Samples

84

The Central Limit Theorem

If the underlying distribution is close to a normal density curve, then the approximation will be good even for a small n, whereas if it is far from being normal, then a large n will be required.

Rule of Thumb

If n > 30, the Central Limit Theorem can be used.

There are population distributions for which even an n of 40 or 50 does not suffice, but such distributions are rarely encountered in practice.

Page 85: 5 Joint Probability Distributions and Random Samples

85

The Central Limit Theorem

On the other hand, the rule of thumb is often conservative; for many population distributions, an n much less than 30 would suffice.

For example, in the case of a uniform population distribution, the CLT gives a good approximation for n ≥ 12.

Page 86: 5 Joint Probability Distributions and Random Samples

86

5.5 The Distribution of a Linear Combination

Page 87: 5 Joint Probability Distributions and Random Samples

87

The Distribution of a Linear Combination

The sample mean X̄ and sample total To are special cases of a type of random variable that arises very frequently in statistical applications.

Definition

Given a collection of n random variables X1, . . . , Xn and n numerical constants a1, . . . , an, the rv

Y = a1X1 + a2X2 + . . . + anXn = Σ_{i=1}^{n} aiXi

is called a linear combination of the Xi’s.

(5.7)

Page 88: 5 Joint Probability Distributions and Random Samples

88

The Distribution of a Linear Combination

For example, 4X1 – 5X2 + 8X3 is a linear combination of X1, X2, and X3 with a1 = 4, a2 = –5, and a3 = 8.

Taking a1 = a2 = . . . = an = 1 gives Y = X1 + . . . + Xn = To,

and a1 = a2 = . . . = an = 1/n yields Y = X̄.

Page 89: 5 Joint Probability Distributions and Random Samples

89

The Distribution of a Linear Combination

Proposition

Let X1, X2, . . . , Xn have mean values μ1, . . . , μn, respectively, and variances σ1², . . . , σn², respectively.

1. Whether or not the Xi’s are independent,

E(a1X1 + a2X2 + . . . + anXn) = a1E(X1) + a2E(X2) + . . . + anE(Xn)

= a1μ1 + . . . + anμn

(5.8)

2. If X1, . . . , Xn are independent,

V(a1X1 + a2X2 + . . . + anXn) = a1²V(X1) + a2²V(X2) + . . . + an²V(Xn) = a1²σ1² + . . . + an²σn²

(5.9)

Page 90: 5 Joint Probability Distributions and Random Samples

90

The Distribution of a Linear Combination

And

σ_{a1X1 + . . . + anXn} = √(a1²σ1² + . . . + an²σn²)

(5.10)

3. For any X1, . . . , Xn,

V(a1X1 + . . . + anXn) = Σ_{i=1}^{n} Σ_{j=1}^{n} ai aj Cov(Xi, Xj)

(5.11)

Page 91: 5 Joint Probability Distributions and Random Samples

91

Example 29

A gas station sells three grades of gasoline: regular, extra, and super.

These are priced at $3.00, $3.20, and $3.40 per gallon, respectively.

Let X1, X2, and X3 denote the amounts of these grades purchased (gallons) on a particular day.

Suppose the Xi’s are independent with μ1 = 1000, μ2 = 500, μ3 = 300, σ1 = 100, σ2 = 80, and σ3 = 50.

Page 92: 5 Joint Probability Distributions and Random Samples

92

Example 29

The revenue from sales is Y = 3.0X1 + 3.2X2 + 3.4X3, and

E(Y) = 3.0μ1 + 3.2μ2 + 3.4μ3

= $5620

cont’d
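Using equation (5.9), the variance and standard deviation of the revenue follow from the same ingredients; a brief illustrative sketch (not from the slides):

```python
import math

prices = [3.00, 3.20, 3.40]     # a1, a2, a3: price per gallon of each grade
means  = [1000, 500, 300]       # mu1, mu2, mu3: expected gallons sold
sds    = [100, 80, 50]          # sigma1, sigma2, sigma3

# E(Y) = sum of a_i * mu_i (holds whether or not the X_i are independent).
e_rev = sum(a * m for a, m in zip(prices, means))

# V(Y) = sum of a_i^2 * sigma_i^2, valid because the X_i are assumed independent.
v_rev = sum(a**2 * s**2 for a, s in zip(prices, sds))

print(round(e_rev, 2))                              # 5620.0 dollars
print(round(v_rev, 2), round(math.sqrt(v_rev), 2))  # 184436.0 and about 429.46
```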

Page 93: 5 Joint Probability Distributions and Random Samples

93

The Difference Between Two Random Variables

Page 94: 5 Joint Probability Distributions and Random Samples

94

The Difference Between Two Random Variables

An important special case of a linear combination results from taking n = 2, a1 = 1, and a2 = –1:

Y = a1X1 + a2X2 = X1 – X2

We then have the following corollary to the proposition.

Corollary

E(X1 – X2) = E(X1) – E(X2) for any two rv’s X1 and X2.

V(X1 – X2) = V(X1) + V(X2) if X1 and X2 are independent rv’s.

Page 95: 5 Joint Probability Distributions and Random Samples

95

Example 30

A certain automobile manufacturer equips a particular model with either a six-cylinder engine or a four-cylinder engine.

Let X1 and X2 be fuel efficiencies for independently and randomly selected six-cylinder and four-cylinder cars, respectively. With μ1 = 22, μ2 = 26, σ1 = 1.2, and σ2 = 1.5,

E(X1 – X2) = μ1 – μ2

= 22 – 26

= –4

V(X1 – X2) = σ1² + σ2² = (1.2)² + (1.5)² = 3.69

σ_{X1 – X2} = √3.69 = 1.92

Page 96: 5 Joint Probability Distributions and Random Samples

96

Example 30

If we relabel so that X1 refers to the four-cylinder car, then E(X1 – X2) = 4, but the variance of the difference is still 3.69.

cont’d
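A tiny illustrative sketch of the corollary applied to Example 30 (not from the slides):

```python
import math

mu1, mu2 = 22, 26        # mean fuel efficiencies: six-cylinder, four-cylinder
sd1, sd2 = 1.2, 1.5

# Expected value of a difference, and variance of a difference of independent rv's.
e_diff = mu1 - mu2                   # -4
v_diff = sd1**2 + sd2**2             # 1.44 + 2.25 = 3.69
print(e_diff, round(v_diff, 2), round(math.sqrt(v_diff), 2))   # -4 3.69 1.92
```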

Page 97: 5 Joint Probability Distributions and Random Samples

97

The Case of Normal Random Variables

Page 98: 5 Joint Probability Distributions and Random Samples

98

The Case of Normal Random Variables

When the Xi’s form a random sample from a normal distribution, X̄ and To are both normally distributed. Here is a more general result concerning linear combinations.

Proposition

If X1, X2, . . . , Xn are independent, normally distributed rv’s (with possibly different means and/or variances), then any linear combination of the Xi’s also has a normal distribution. In particular, the difference X1 – X2 between two independent, normally distributed variables is itself normally distributed.

Page 99: 5 Joint Probability Distributions and Random Samples

99

The Case of Normal Random Variables

The CLT can also be generalized so it applies to certain linear combinations. Roughly speaking, if n is large and no individual term is likely to contribute too much to the overall value, then Y has approximately a normal distribution.