chapter3 statistic
Chapter 3:Sampling Distribution
CHAPTER 3 : SAMPLING DISTRIBUTIONS
Sub-Topic
Sampling error.
Introduction to sampling distribution.
Sampling distribution of single mean.
Sampling distribution of the difference between two means.
Sampling distribution test : t-distribution, $\chi^2$-distribution and F-distribution.
Chapter Learning Outcome
Solve problems involving the sampling distributions for single and two
population means.
Learning Objective
By the end of this chapter, students should be able to
Understand the concept of sampling error.
Determine the mean and standard deviation for the sampling distribution of
the sample mean.
Understand the importance of the Central Limit Theorem.
Apply the sampling distributions for mean and difference between two means.
Key Term (English to Bahasa Melayu)
English Bahasa Melayu
1. Parameter → Parameter
2. Statistic → Statistik
3. Sampling error → Ralat pensampelan
4. Central Limit Theorem → Teorem had memusat
5. Sampling distribution → Taburan pensampelan
6. Simple random sample → Sampel rawak mudah
3.1 Sampling error
Definition 1
The sampling error of a single mean is the difference between a value (a statistic)
computed from a sample and the corresponding value (a parameter) computed from the
population.
Theory 1
Formula for the sampling error of a single mean : $e = \bar{x} - \mu$
where
$\bar{x}$ = sample mean
$\mu$ = population mean.
Definition 2
A parameter is a measure computed from the entire population.
Definition 3
A statistic is a measure computed from a sample that has been selected from a
population.
Example 1
Given that the population mean is 158972 square feet and a sample of five
shopping centres yields a sample mean of 155072 square feet, find the sampling
error.
Answer Example 1
We know that $\mu = 158972$ and $\bar{x} = 155072$.
The sampling error is
$e = \bar{x} - \mu = 155072 - 158972 = -3900$ square feet.
Theory 2
Formula for the population mean : $\mu = \dfrac{\sum x}{N}$
where
$\mu$ = population mean
$x$ = values in the population
$N$ = population size.
Theory 3
Fundamental statistical concepts are
The size of the sampling error depends on which sample is taken.
The sampling error may be positive or negative.
There is potentially a different value for each possible sample mean.
Definition 4
A simple random sample is a sample selected in such a manner that each possible
sample of a given size has an equal chance of being selected.
Theory 4
Formula for the sample mean : $\bar{x} = \dfrac{\sum x}{n}$
where
$\bar{x}$ = sample mean
$x$ = sample value selected from the population
$n$ = sample size.
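The formulas for the population mean, the sample mean and the sampling error can be tried out with a short sketch. The ten-value population below is hypothetical (it does not come from the text) and the function names are chosen only for illustration. The sketch draws several simple random samples and shows that the sampling error changes from sample to sample and can be positive or negative.

```python
import random

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def sampling_error(sample_mean, population_mean):
    """Sampling error e = sample mean - population mean."""
    return sample_mean - population_mean

# Example 1 from the text: sample mean 155072, population mean 158972.
e_example1 = sampling_error(155072, 158972)

# Hypothetical population of N = 10 values (an assumption, not from the text).
population = [12, 15, 9, 20, 18, 11, 14, 17, 10, 16]
mu = mean(population)

rng = random.Random(0)  # fixed seed so the run is repeatable
errors = []
for _ in range(5):
    sample = rng.sample(population, 4)  # simple random sample of size n = 4
    errors.append(sampling_error(mean(sample), mu))

print("Example 1 sampling error:", e_example1)
print("population mean:", mu)
print("sampling errors of five samples:", errors)
```

Each pass through the loop is one possible sample, so the printed list illustrates the three points of Theory 3 on a small scale.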
3.2 Introduction to sampling distribution
In the inferential statistics process, a researcher selects a random sample from the
population, computes a statistic on the sample and reaches conclusions about the
population parameter from that statistic. In this chapter, we will explore the sample
mean, $\bar{x}$, as the statistic. The sample mean is one of the more common statistics
used in the inferential process. To compute and assign the probability of occurrence
of a particular value of a sample mean, the researcher must know the distribution of
the sample means. One way to examine the distribution possibilities is to take a
population with a particular distribution, randomly select samples of a given size,
compute the sample means and attempt to determine how the means are distributed.
Suppose 23 women between the ages of 20 and 40 years are selected randomly from the
population of women in Ayer Hitam Pahat and we compute the mean height of
the sample. We would not expect our sample mean to be equal to the mean of all
women in Ayer Hitam. It might be somewhat lower or it might be somewhat higher,
but it would not equal the population mean exactly. Similarly, if we took a second
sample of 23 women from the same population, we would not expect the mean of this
second sample to equal the mean of the first sample. Inferential statistics concerns
generalizing from sample to population. A critical part of inferential statistics
involves determining how far sample statistics are likely to vary from each other and
from the population parameter.
Why do we sample the population instead of studying the whole population ? The
reasons are the physical impossibility of checking all items in the population, the
cost of studying all the items in a population, the fact that sample results are usually
adequate, the time it would take to contact the whole population, and the destructive
nature of certain tests, such as a study of light bulb life.
Definition 5
If samples of size n are drawn randomly from a population that has a mean of $\mu$ and
a standard deviation of $\sigma$, the sample means, $\bar{x}$, are approximately normally
distributed for sufficiently large sample size ($n \geq 30$) regardless of the shape of the
population distribution. If the population is normally distributed, the sample means
are normally distributed for any sample size. It can be shown that the mean of the
sample means is the population mean, $\mu_{\bar{x}} = \mu$, and the standard deviation of the
sample means (called the standard error of the mean) is the standard deviation of the
population divided by the square root of the sample size, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
Definition 6
A sampling distribution is a distribution of the possible values of a statistic for a
given sample size selected from a population.
Definition 7
Sampling distribution of the mean : for random samples of n observations taken
from a population with mean $\mu$ and standard deviation $\sigma$, regardless of the
population's distribution, provided the sample size is sufficiently large, the
distribution of the sample mean $\bar{x}$ will be approximately normal with a mean equal to the
population mean, $\mu_{\bar{x}} = \mu$. Further, the standard deviation will equal the population
standard deviation divided by the square root of the sample size, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$. The
larger the sample size, the better the approximation to the normal distribution.
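The statement above can be checked empirically with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with rate 1 (a strongly skewed distribution with mean 1 and standard deviation 1, chosen for this example and not taken from the text), the sample size is 36, and 5000 samples are drawn with a fixed seed.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 36                    # sample size (assumed for this illustration)
num_samples = 5000        # number of repeated samples

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# Draw many samples of size n and record each sample mean.
sample_means = []
for _ in range(num_samples):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)   # should be close to mu
sd_of_means = statistics.stdev(sample_means)     # should be close to sigma / sqrt(n)
theoretical_se = sigma / n ** 0.5

print("mean of sample means:", round(mean_of_means, 4), "vs mu =", mu)
print("sd of sample means:", round(sd_of_means, 4), "vs sigma/sqrt(n) =", round(theoretical_se, 4))
```

A histogram of `sample_means` would look roughly bell-shaped even though the assumed population is skewed, which is the point of the Central Limit Theorem.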
3.3 Sampling distribution of the single mean
Theory 5
Z-value for the sampling distribution of $\bar{x}$ is $Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
where
$\bar{x}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
$n$ = sample size
Theory 6 ( Calculation probability of single mean )
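In order to calculate the probability of a single mean, we need to follow the four steps
below.
Step 1 : Write the mean of the sample mean $\bar{x}$, which is $\mu_{\bar{x}} = \mu$.
Step 2 : Write the standard deviation of the sample mean $\bar{x}$, which is $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
Step 3 : Write the distribution in normal distribution form, which is $\bar{x} \sim N\left(\mu_{\bar{x}}, \sigma_{\bar{x}}^2\right)$.
Step 4 : Find the probability of the sample mean, $P(\bar{x} > r) = P\left(Z > \dfrac{r - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right)$, with the inequality replaced by $<$ where the question requires.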
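The four steps can be sketched as a small Python function. This is a minimal illustration, not part of the original text: the function name `prob_sample_mean` and its `tail` argument are made up for this example, and the normal probabilities come from `statistics.NormalDist` in the standard library instead of a printed table, so results may differ from table look-ups in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean(r, mu, sigma, n, tail="less"):
    """P(x_bar < r) or P(x_bar > r) for a sample of size n."""
    mean_xbar = mu                      # Step 1: mean of the sample mean
    se = sigma / sqrt(n)                # Step 2: standard error
    dist = NormalDist(mean_xbar, se)    # Step 3: x_bar ~ N(mu, se^2)
    p_less = dist.cdf(r)                # Step 4: area to the left of r
    return p_less if tail == "less" else 1.0 - p_less

# Light bulb question (mu = 800, sigma = 40, n = 16): P(x_bar < 775)
p_bulbs = prob_sample_mean(775, 800, 40, 16, tail="less")

# Student age question (mu = 22.3, sigma = 4, n = 64): P(x_bar > 23)
p_ages = prob_sample_mean(23, 22.3, 4, 64, tail="greater")

print(round(p_bulbs, 4), round(p_ages, 4))
```

The two calls reuse the figures of Example 4 and Example 5 further on, so the printed values can be compared with the worked answers there.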
Example 2
What is the probability that a sample of 100 automobile insurance claim files will
yield an average claim of RM4527.77 or less if the average claim for the population is
RM4560 with standard deviation of RM600 ?
Answer Example 2
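Given $n = 100$, $\bar{x} = 4527.77$, $\mu = 4560$ and $\sigma = 600$.
Step 1 : $\mu_{\bar{x}} = \mu = 4560$
Step 2 : $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{600}{\sqrt{100}} = 60$
Step 3 : $\bar{x} \sim N(4560, 60^2)$
Step 4 : $P(\bar{x} \leq 4527.77) = P\left(Z \leq \dfrac{4527.77 - 4560}{60}\right)$
$= P(Z \leq -0.54)$
$= P(Z \geq 0.54)$
$= 0.2946$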
Example 3
The random variable X, representing the number of boxes in a container, has the following
probability distribution.
X 4 5 6 7
P(x) 0.2 0.4 0.3 0.1
(a) Find the population mean and variance.
(b) Find the mean and variance of the sample mean for random samples of 36 containers.
(c) Calculate the probability that the average number of boxes in 36 containers will be
less than 5.5.
Answer Example 3
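(a) $E(X) = \sum x \cdot P(x)$
$= 4(0.2) + 5(0.4) + 6(0.3) + 7(0.1)$
$= 0.8 + 2.0 + 1.8 + 0.7$
$= 5.3$
$E(X^2) = 4^2(0.2) + 5^2(0.4) + 6^2(0.3) + 7^2(0.1)$
$= 3.2 + 10 + 10.8 + 4.9$
$= 28.9$
$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$
$= 28.9 - (5.3)^2$
$= 0.81$
(b) Mean of the sample mean, $\mu_{\bar{x}} = \mu = 5.3$
Variance of the sample mean, $\sigma_{\bar{x}}^2 = \dfrac{\sigma^2}{n} = \dfrac{0.81}{36} = 0.0225$
(c) Step 1 : $\mu_{\bar{x}} = \mu = 5.3$
Step 2 : $\sigma_{\bar{x}} = \sqrt{\dfrac{\sigma^2}{n}} = \sqrt{\dfrac{0.81}{36}} = \sqrt{0.0225} = 0.15$
Step 3 : $\bar{x} \sim N(5.3, 0.15^2)$
Step 4 : $P(\bar{x} < 5.5) = P\left(Z < \dfrac{5.5 - 5.3}{0.15}\right)$
$= P(Z < 1.33)$
$= 1 - P(Z > 1.33)$
$= 1 - 0.09176$
$= 0.90824$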
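The arithmetic of parts (a) and (b) can be reproduced with a few lines of Python. This is a sketch for checking the hand calculation; the variable names are chosen for this illustration only.

```python
from math import sqrt

# Probability distribution of X from Example 3.
xs = [4, 5, 6, 7]
ps = [0.2, 0.4, 0.3, 0.1]
n = 36  # number of containers in each sample

mean_x = sum(x * p for x, p in zip(xs, ps))        # E(X)
mean_x2 = sum(x ** 2 * p for x, p in zip(xs, ps))  # E(X^2)
var_x = mean_x2 - mean_x ** 2                      # Var(X) = E(X^2) - [E(X)]^2

var_xbar = var_x / n       # variance of the sample mean
se_xbar = sqrt(var_xbar)   # standard error of the sample mean

print("E(X) =", round(mean_x, 4))
print("Var(X) =", round(var_x, 4))
print("variance of sample mean =", round(var_xbar, 4))
print("standard error =", round(se_xbar, 4))
```

Rounding is applied only when printing, because floating-point sums such as 0.8 + 2.0 + 1.8 + 0.7 may differ from 5.3 in the last binary digit.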
Example 4
An electrical firm manufactures light bulbs that have a length of life that is
approximately normally distributed, with mean equal to 800 hours and a standard
deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have
an average life of less than 775 hours.
Answer Example 4
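Step 1 : $\mu_{\bar{x}} = \mu = 800$
Step 2 : $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{40}{\sqrt{16}} = 10$
Step 3 : $\bar{x} \sim N(800, 100)$
Step 4 : $P(\bar{x} < 775) = P\left(Z < \dfrac{775 - 800}{10}\right)$
$= P(Z < -2.5)$
$= 0.0062$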
Example 5
At a large university, the mean age of the students is 22.3 years and the standard
deviation is 4 years. A random sample of 64 students is drawn. What is the
probability that the average age of these students is greater than 23 years ?
Answer Example 5
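Step 1 : $\mu_{\bar{x}} = \mu = 22.3$
Step 2 : $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{4}{\sqrt{64}} = 0.5$
Step 3 : $\bar{x} \sim N(22.3, 0.5^2)$
Step 4 : $P(\bar{x} > 23) = P\left(Z > \dfrac{23 - 22.3}{0.5}\right)$
$= P(Z > 1.4)$
$= 0.0808$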
Example 6
The breaking strength (in kg/mm) for a certain type of fabric has mean 1.86 and
standard deviation 0.27. A random sample of 80 pieces of fabric is drawn. What is the
probability that the sample mean breaking strength is less than 1.8 kg/mm ?
Answer Example 6
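Given : $\mu = 1.86$, $\sigma = 0.27$ and $n = 80$
Step 1 : $\mu_{\bar{x}} = \mu = 1.86$
Step 2 : $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{0.27}{\sqrt{80}} = 0.03018$
Step 3 : $\bar{x} \sim N(1.86, 0.03018^2)$
Step 4 : $P(\bar{x} < 1.8) = P\left(Z < \dfrac{1.8 - 1.86}{0.03018}\right)$
$= P(Z < -1.99)$
$= 0.0233$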
Example 7
Taking random samples of size n from an infinite population that has a standard
deviation of two, show that $\bar{x}$ would be a more precise estimator of $\mu$ if the sample size
were increased from four to sixteen. Interpret the result.
Answer Example 7
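The precision of $\bar{x}$ as an estimator of $\mu$ is measured by the standard deviation,
$\sigma_{\bar{x}}$.
For sampling from an infinite population, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
Therefore, for $\sigma = 2$ and $n = 4$ : $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{2}{\sqrt{4}} = 1$
Increasing n from 4 to 16 : $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{2}{\sqrt{16}} = 0.5$
Thus, when the sample size increases from 4 to 16 with $\sigma$ constant, $\sigma_{\bar{x}}$
decreases by 50%.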
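The effect of the sample size on the standard error can also be tabulated in code. The sketch below uses a helper named `standard_error` (a name chosen for this illustration) and adds the sample size 64 as an extra, assumed case to show that the pattern continues.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean for an infinite population."""
    return sigma / sqrt(n)

sigma = 2
se_4 = standard_error(sigma, 4)     # n = 4
se_16 = standard_error(sigma, 16)   # n = 16
reduction = (se_4 - se_16) / se_4   # relative decrease in the standard error

for n in (4, 16, 64):               # 64 is an extra case added for illustration
    print("n =", n, "standard error =", standard_error(sigma, n))
print("relative decrease from n = 4 to n = 16:", reduction)
```

Because the standard error depends on the square root of n, the sample size has to be multiplied by four to halve it.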
Exercise 3.3
1. Labeled bags of concrete mix have a population mean weight of 100 kg and a
population standard deviation of 0.5 kg.
(a) What is the probability that the mean weight of a random sample of 50
bags is less than 99.9 kg ?
(b) If the population mean weight is increased to 100.15 kg, what is the
probability that the mean weight of a sample of size 50 will be less
than 100 kg ?
2. A report stated that the average time spent watching movies per week by
children aged between two and six years is 22 hours. Assume the variable
is normally distributed and the standard deviation is five hours. A sample of
33 children aged between two and six years is randomly selected. Find
the probability that the average time they spend watching movies per week will be
greater than 23.5 hours.
3. The systolic blood pressures (in mm Hg) of women aged 18 to 24 are
normally distributed with a mean of 114.4 and a standard deviation of 13.1.
(a) If five women between the ages of 18 and 24 are randomly selected, find
the probability that their mean systolic blood pressure is greater than 120.
(b) If twelve women between the ages of 18 and 24 are randomly selected,
find the probability that their mean systolic blood pressure is
less than 115.
4. A random sample of one hundred is taken from a normally distributed population
with mean 20 and standard deviation equal to one. What is the probability that
$\bar{x}$ will take on a value between 20 and 20.2 inclusive ?
5. Engineers must consider the breadths of male heads when designing
motorcycle helmets. Men have head breadths that are normally distributed
with a mean of 15.24 cm and a standard deviation of 2.54 cm.
(a) If one male is randomly selected, find the probability that his head
breadth is less than 15.75 cm.
(b) Find the probability that 100 randomly selected men have a mean head
breadth of at least 16.00 cm.
6. The lifetime of a particular type of battery is normally distributed with a mean
of 1100 days and a standard deviation of 80 days. The manufacturer randomly
selects 400 batteries of this type and ships them to a departmental store.
(a) What are the mean and standard deviation of the sampling distribution
of $\bar{x}$ ?
(b) What is the probability that the average lifetime of these 400 batteries
is between 1097 and 1104 days ?
7. The time required to assemble an electronic component is normally distributed
with a mean of 25 minutes and a standard deviation of 3.5 minutes. Find the
probability that the average time required to assemble all 19 components is at
least 23 minutes.
8. The amount of sulfur in the daily emissions from a power plant has a normal
distribution with a mean of 94 and a standard deviation of 22. For a random
sample of 5 days, find the probability that the average amount of sulfur
emissions will exceed 80.
9. According to the growth chart that doctors use as a reference, the heights of
two-year-old boys are normally distributed with mean 34.5 inches and
standard deviation 1.3 inches. If six two-year-old boys are selected, what is
the probability that their average height will be between 34.1 and 35.2 inches.
10. Casual workers in a certain industry are paid on average RM5.10 per hour
which is normally distributed with standard deviation of RM2.20. A sample of
35 casual workers from the industry was selected to be respondents for the
underpaid issue questionnaires. Find the probability that the average payment
for those casual workers is
(a) at least RM6.00 per hour.
(b) greater than RM4.80 per hour.
11. Intelligence quotients (IQ) in the general population are normally distributed
with a mean of 100 and a standard deviation of 15. A random sample of 40
students was taken in a certain university. Find the probability that the mean
IQ of the sample is
(a) greater than 105 and less than 107.
(b) not more than 109.
12. Given a random sample $X_1, X_2, X_3, \ldots, X_{40}$ which is drawn from a
population with a Poisson distribution with mean 3.5, find the probability that the sample
mean is between 3.4 and 4.3.
13. PVC pipe is manufactured with a mean diameter of 3.2 cm and a standard
deviation of 1.6 cm. The distribution of the diameter is normal. Find the
probability that a random sample of 64 pipes will have a sample mean
diameter of less than three centimetres.
14. Consider the PVC pipe in the previous question. How does the standard
deviation of the sample mean change when the sample size is decreased from
64 to 9 ? Explain.
Answer Exercise 3.3
1. (a) 0.0793 (b) 0.0170
2. 0.0427
3. (a) 0.1685 (b) 0.5636
4. 0.1359
5. (a) 0.5793 (b) 0.0014
6. (a) 1100, 4 (b) 0.6147
7. 0.9936
8. 0.9222
9. 0.6799
10. (a) 0.0078 (b) 0.7910
11. (a) 0.0159 (b) 0.9826
12. 0.6296
13. 0.1587
3.4 Sampling distribution of the difference between two means
Theory 7
Statistical analyses are very often concerned with the difference between means. A
typical example is an experiment designed to compare the mean of a control group
with the mean of an experimental group. Inferential statistics used in the analysis of
this type of experiment depend on the sampling distribution of the difference between
means.
Theory 8 ( Calculation probability of two means)
In order to calculate the probability of two means, we need to follow four steps
below.
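Step 1 : Write the mean of the sampling distribution, which is $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$.
Step 2 : Write the standard deviation of the difference between the sample means, which is $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
Step 3 : Write the distribution in normal form, $\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_{\bar{x}_1 - \bar{x}_2}, \sigma_{\bar{x}_1 - \bar{x}_2}^2\right)$.
Step 4 : Find the probability of the difference between the sample means, $P(\bar{x}_1 - \bar{x}_2 > r) = P\left(Z > \dfrac{r - \mu_{\bar{x}_1 - \bar{x}_2}}{\sigma_{\bar{x}_1 - \bar{x}_2}}\right)$, with the inequality replaced by $<$ where the question requires.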
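The same four steps can be sketched in Python for the difference between two means. The function name `prob_diff_means` and its `tail` argument are made up for this illustration, and the probabilities come from `statistics.NormalDist` without rounding Z to two decimals, so they can differ slightly from answers read from a Z table.

```python
from math import sqrt
from statistics import NormalDist

def prob_diff_means(r, mu1, sigma1, n1, mu2, sigma2, n2, tail="less"):
    """P(x1_bar - x2_bar < r) or P(x1_bar - x2_bar > r)."""
    mean_diff = mu1 - mu2                               # Step 1
    se_diff = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2) # Step 2
    dist = NormalDist(mean_diff, se_diff)               # Step 3
    p_less = dist.cdf(r)                                # Step 4
    return p_less if tail == "less" else 1.0 - p_less

# Citrus trees: type A (14.8, 1.2, n = 12) and type B (12.9, 1.5, n = 15),
# probability that the difference of sample means exceeds 2 feet.
p_trees = prob_diff_means(2, 14.8, 1.2, 12, 12.9, 1.5, 15, tail="greater")

# Test scores: Section 2 ~ N(64, 2^2), n = 12; Section 1 ~ N(60, 4^2), n = 9,
# probability that the Section 2 mean is below the Section 1 mean.
p_scores = prob_diff_means(0, 64, 2, 12, 60, 4, 9, tail="less")

print(round(p_trees, 3), round(p_scores, 3))
```

The two calls use the figures of Example 8(c) and Example 9 below, where the table-based answers are 0.4247 and 0.0030.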
Example 8
The mature citrus trees of type A have a mean height of 14.8 feet with a standard
deviation of 1.2 feet. The mature citrus trees of type B have a mean height of 12.9
feet with a standard deviation of 1.5 feet. Two samples of size 12 and 15 are
randomly selected from mature citrus trees of type A and B respectively. Find the
probability that
(a) the sample mean of type A is more than 14 feet.
(b) the sample mean of type B is between 12 and 14 feet.
(c) the sample mean of type A is more than two feet greater than the sample mean of type B.
Answer Example 8
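(a)
A B
Population mean 14.8 12.9
Population standard deviation 1.2 1.5
Sample size 12 15
Step 1 : $\mu_{\bar{x}_A} = \mu_A = 14.8$
Step 2 : $\sigma_{\bar{x}_A} = \dfrac{\sigma_A}{\sqrt{n_A}} = \dfrac{1.2}{\sqrt{12}} = 0.34641$
Step 3 : $\bar{x}_A \sim N(14.8, 0.34641^2)$
Step 4 : $P(\bar{x}_A > 14) = P\left(Z > \dfrac{14 - 14.8}{0.34641}\right)$
$= P(Z > -2.31)$
$= 1 - P(Z > 2.31)$
$= 1 - 0.01044$
$= 0.9896$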
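(b) Step 1 : $\mu_{\bar{x}_B} = \mu_B = 12.9$
Step 2 : $\sigma_{\bar{x}_B} = \dfrac{\sigma_B}{\sqrt{n_B}} = \dfrac{1.5}{\sqrt{15}} = 0.38729$
Step 3 : $\bar{x}_B \sim N(12.9, 0.38729^2)$
Step 4 : $P(12 < \bar{x}_B < 14) = P\left(\dfrac{12 - 12.9}{0.38729} < Z < \dfrac{14 - 12.9}{0.38729}\right)$
$= P(-2.32 < Z < 2.84)$
$= 1 - P(Z > 2.32) - P(Z > 2.84)$
$= 1 - 0.01017 - 0.00226$
$= 0.9876$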
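(c) Step 1 : $\mu_{\bar{x}_A - \bar{x}_B} = \mu_A - \mu_B = 14.8 - 12.9 = 1.9$
Step 2 : $\sigma_{\bar{x}_A - \bar{x}_B} = \sqrt{\dfrac{\sigma_A^2}{n_A} + \dfrac{\sigma_B^2}{n_B}} = \sqrt{\dfrac{1.2^2}{12} + \dfrac{1.5^2}{15}} = 0.51961$
Step 3 : $\bar{x}_A - \bar{x}_B \sim N(1.9, 0.51961^2)$
Step 4 : $P(\bar{x}_A - \bar{x}_B > 2) = P\left(Z > \dfrac{2 - 1.9}{0.51961}\right)$
$= P(Z > 0.19)$
$= 0.4247$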
Example 9
The results of Statistics Test 1 for two groups of management students, Section 1 and
Section 2, are normally distributed with $N(60, 4^2)$ and $N(64, 2^2)$ respectively. Two
samples of size 9 and 12 are randomly selected from Section 1 and Section 2
respectively. Find the probability that the sample mean of Section 2 is lower than the
sample mean of Section 1.
Answer Example 9
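Section 1 Section 2
Population mean 60 64
Population variance 16 4
Sample size 9 12
Step 1 : $\mu_{\bar{x}_2 - \bar{x}_1} = \mu_2 - \mu_1 = 64 - 60 = 4$
Step 2 : $\sigma_{\bar{x}_2 - \bar{x}_1} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{16}{9} + \dfrac{4}{12}} = 1.4529$
Step 3 : $\bar{x}_2 - \bar{x}_1 \sim N(4, 1.4529^2)$
Step 4 : $P(\bar{x}_2 < \bar{x}_1) = P(\bar{x}_2 - \bar{x}_1 < 0)$
$= P\left(Z < \dfrac{0 - 4}{1.4529}\right)$
$= P(Z < -2.75)$
$= P(Z > 2.75)$
$= 0.0030$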
Example 10
Consider two populations of students who participate in a reading programme prior
to taking a Japanese course. The populations are those who earn an A grade and those
who earn a B grade. Let X be the number of books read by the students who
participate in the programme. Find the probability that the mean number of books
read by the students who earn an A grade is greater than that of the students who earn
a B grade, given the data below.
Grade A Grade B
Sample mean 37 25
Sample standard deviation 8.7014 8.5264
Sample size 8 6
Answer Example 10
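Step 1 : $\mu_{\bar{x}_A - \bar{x}_B} = \mu_A - \mu_B = 37 - 25 = 12$
Step 2 : $\sigma_{\bar{x}_A - \bar{x}_B} = \sqrt{\dfrac{\sigma_A^2}{n_A} + \dfrac{\sigma_B^2}{n_B}} = \sqrt{\dfrac{8.7014^2}{8} + \dfrac{8.5264^2}{6}} = 4.6455$
Step 3 : $\bar{x}_A - \bar{x}_B \sim N(12, 4.6455^2)$
Step 4 : $P(\bar{x}_A > \bar{x}_B) = P(\bar{x}_A - \bar{x}_B > 0)$
$= P\left(Z > \dfrac{0 - 12}{4.6455}\right)$
$= P(Z > -2.58)$
$= 1 - P(Z > 2.58)$
$= 1 - 0.00494$
$= 0.99506$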
Example 11
The length of a computer desk is approximately normally distributed. Two
factories produce this kind of desk. The summary statistics are given below.
Factory A Factory B
Sample mean 60.5 58.3
Sample standard deviation 3 4
Sample size 35 40
Find the probability that the sample mean length of computer desks produced by
Factory B is greater than the sample mean length of computer desks produced by
Factory A.
Answer Example 11
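Step 1 : $\mu_{\bar{x}_B - \bar{x}_A} = \mu_B - \mu_A = 58.3 - 60.5 = -2.2$
Step 2 : $\sigma_{\bar{x}_B - \bar{x}_A} = \sqrt{\dfrac{\sigma_A^2}{n_A} + \dfrac{\sigma_B^2}{n_B}} = \sqrt{\dfrac{3^2}{35} + \dfrac{4^2}{40}} = 0.81064$
Step 3 : $\bar{x}_B - \bar{x}_A \sim N(-2.2, 0.81064^2)$
Step 4 : $P(\bar{x}_B > \bar{x}_A) = P(\bar{x}_B - \bar{x}_A > 0)$
$= P\left(Z > \dfrac{0 - (-2.2)}{0.81064}\right)$
$= P(Z > 2.71)$
$= 0.0034$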
Exercise 3.4
1. The usage of electricity at residential area A is normally distributed with mean
of 156 kilowatt per hour and standard deviation of 43 kilowatt per hour.
Meanwhile the usage of electricity at residential area B is also normally
distributed with mean of 161 kilowatt per hour and its standard deviation is 48
kilowatt per hour. Two samples of size 20 and 25 residences are randomly
selected from residential area A and residential area B, respectively. Find the
probability that the mean of usage at residential area A is lower than the mean
of usage at residential area B.
2. A company manufactures two types of cables, brand A and brand B that have
mean breaking strengths of 4000 kg and 4500 kg and standard deviations of
300 kg and 200 kg, respectively. If 100 cables of brand A and 50 cables of
brand B are tested, what is the probability that the mean breaking strengths of
brand B will be at least 600 kg more than brand A ?
3. The effective life of a component used in a jet-turbine aircraft engine is a
random variable with mean 3465 hours and standard deviation 25 hours. The
distribution of effective life is fairly close to a normal distribution. The engine
manufacturer introduces an improvement into the manufacturing process for
this component that increases the mean life to 4050 hours and decreases the
standard deviation to 15 hours. A random sample of 20 components is
selected from the old process and 35 components are selected from the
improved process. What is the probability that the difference between the two
sample means (improved process minus old process) is at most 23 hours ?
4. The running times of films produced by Company A have a mean of 98.4 minutes
with a standard deviation of 7.8 minutes. Films from Company B have a mean running
time of 110.7 minutes with a standard deviation of 29.8 minutes. Assume the
populations are approximately normally distributed. What is the probability
that a random sample of 36 films from Company B will have a mean running
time that is at least 13 minutes more than the mean running time of a random
sample of 49 films from Company A ?
5. The elasticity of polymer is affected by the concentration of a reactant. When
low concentration is used, the true mean elasticity is 55 and when high
concentration is used the mean elasticity is 60. The standard deviation of
elasticity is 4, regardless of concentration. The distribution of elasticity is
normally distributed. Two random samples of size 16 are taken. Find the
probability that the difference between the sample means for high concentration and low
concentration is more than two.
6. Consider two populations of UTHM students who participate in a French-language
debate competition. The populations are those who get a Score A
and those who get a Score B. Let X be the number of questions answered by the students who
participate in the competition. Find the probability that the mean number of
questions answered by the students who get Score B is at most that of the students
who get Score A, given the data below.
Score A Score B
Sample mean 83 91
Sample standard deviation 12 8
Sample size 15 14
7. The mean age at death in Malaysia is 55.5 years and Singapore is 57 years.
The standard deviation is approximately 4.6 years and 5 years for each
country respectively. Samples of 130 deaths from the Malaysia Hospital and
120 from Singapore Hospital were selected. Find the probability that
(a) the mean age at death in Malaysia is greater than the mean age at
death in Singapore.
(b) the mean age at death in Singapore is three less than the mean age at
death in Malaysia.
8. The average life of a hand phone is 8 years for a female and 6 years for a
male, with standard deviations of 1 and 2 years respectively. Assuming that
the lives of these hand phones follow approximately a normal distribution,
find the probability that
(a) the life of a randomly selected male hand phone falls between 6.6 and 7.7 years.
(b) the mean life of a sample of 44 female hand phones exceeds the mean life of a
sample of 55 male hand phones by not less than 2.5 years.
9. A company manufactures two types of polystyrenes, type A and type B that
have mean breaking strengths of 400 g and 450 g and standard deviations of
30 g and 20 g, respectively. If 80 polystyrenes of type A and 45 polystyrenes
of type B are tested, what is the probability that the mean breaking strengths
of type B will be at most 53 g more than type A ?
10. The average running times of disks produced by Company X is 88.1 minutes
and a standard deviation of 6.1 minutes, while those of Company Y have a
mean running times of 99.3 minutes with standard deviation of 13.6 minutes.
Assume the populations are approximately normally distributed. What is the
probability that a random sample of 41 disks from Company Y will have
mean running times that at most 15 minutes more than the mean running times
of a random sample of 32 disks from Company X ?
11. A random sample of size sixteen (sample A) is selected from a normal population with a
mean of 75 and a standard deviation of eight. A second sample (sample B)
of size nine is selected from another normal population with a mean of 70 and
a standard deviation of twelve. Let $\bar{X}_A$ and $\bar{X}_B$ be the two
sample means. Find
(a) the probability that the difference between the means of sample A and sample B
will exceed four.
(b) the probability that the difference between the means of sample A and sample B
will be between 3.5 and 5.5.
12. The elasticity of polymer is affected by the concentration of a reactant. When
low concentration is used, the true mean elasticity is 55 and when high
concentration is used the mean elasticity is 60. The standard deviation of
elasticity is 4, regardless of concentration. The distribution of elasticity is
normally distributed. Two random samples of size 16 are taken. Find the
probability that the difference between the sample means for high concentration and low
concentration is more than two.
13. The running times of films produced by Company A have a mean of 98.4 minutes
and a standard deviation of 7.8 minutes, while those of Company B have a
mean running time of 110.7 minutes with a standard deviation of 29.8 minutes.
Assume the populations are approximately normally distributed. What is the
probability that a random sample of 36 films from Company B will have a mean
running time that is at least 13 minutes more than the mean running time of a
random sample of 49 films from Company A ?
14. The weight of a computer chair is approximately normally distributed. Two
companies produce this kind of chair. The data are shown in the table
below.
Company 1 Company 2
Sample mean 20.1 23.1
Sample standard deviation 4.6 3.1
Sample size 38 29
Find the probability that the mean weight of computer chairs produced by
Company 2 is greater than the mean weight of computer chairs produced by Company 1.
15. A study was designed to estimate the difference in diastolic blood pressure
readings between men and women. The mean and standard deviation for
sixteen men are 77.37 and 8.35, while for thirteen women they are 71.08 and 9.22
respectively. Assuming that the readings are normally distributed, find
(a) the sampling distribution of the difference between the mean diastolic blood
pressure readings for men and women.
(b) the probability that the mean diastolic blood pressure
reading for men is greater than that for women.
(c) the probability that the difference between the mean diastolic blood pressure
readings for men and women is less than five.
Answer Exercise 3.4
1. 0.6443 2. 0.0076
3. 0.0073 4. 0.4443
5. 0.9830 6. 0.0166
7. (a) 0.0070 (b) 0.9931
8. (a) 0.1844 (b) 0.0526
9. 0.7486 10. 0.9452
11. (a) 0.5871 (b) 0.1769
12. 0.983 13. 0.44433
14. 0.9993
15. (b) 0.9719 (c) 0.3483
3.5 Sampling distribution test
Theory 9
The t-distribution was introduced by W. S. Gosset (1876 - 1937). He adopted the
pen name "Student"; therefore, the distribution is known as Student's t-distribution.
It is used to establish confidence limits and to test hypotheses when the population
variance is not known and the sample size is small (less than 30). If a random sample $x_1$,
$x_2$, …, $x_n$ of n values is drawn from a normal population with mean $\mu$ and standard
deviation $\sigma$, then the mean of the sample is $\bar{x} = \dfrac{\sum x_i}{n}$. Let $s^2$
be the estimate of the variance from the sample; then $s^2$ is given by
$s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$, where $(n - 1)$ is used as the denominator in place of $n$. The statistic $t$
is defined as $t = \dfrac{\bar{x} - \mu}{\sqrt{s^2 / n}}$, where $\bar{x}$ is the sample mean, $\mu$ is the actual mean of the
population, $n$ is the sample size and $s$ is the standard deviation of the sample, given by
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$.
Note :
$t$ is distributed as the Student distribution with $(n - 1)$ degrees of freedom
(df).
The variable $t$ ranges from minus infinity to plus infinity.
Like the standard normal distribution, it is symmetrical and has mean zero.
The variance $\sigma^2$ of the t-distribution is greater than 1, but approaches 1 as df increases,
that is, as the sample size becomes large.
The t-distribution is lower at the mean and higher at the tails than the normal
distribution.
The t-distribution has proportionally greater area in its tails than the normal
distribution.
The t-distribution is similar in shape to the standard normal distribution:
both are symmetric about zero, uni-modal and bell-shaped.
The spread of a t-distribution is larger than that of a standard normal
distribution. That is, there is more probability in the tails of a t-distribution.
This makes sense because the t-statistic should have more variability than the
test statistic Z that we used before. There is added variability in the t-statistic
since it uses $s$, an estimate of $\sigma$, rather than a known, fixed value of $\sigma$.
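To make the definition of the t-statistic concrete, here is a minimal sketch that computes $\bar{x}$, $s$ and $t$ for a small sample. The six data values and the hypothesised population mean of 10 are invented for illustration and do not come from the text. `statistics.stdev` is used because it divides by $(n - 1)$, matching the formula for $s$ in Theory 9.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical small sample (n < 30) and hypothesised population mean.
sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3]
mu0 = 10.0

n = len(sample)
x_bar = mean(sample)   # sample mean
s = stdev(sample)      # sample standard deviation, denominator (n - 1)
t = (x_bar - mu0) / (s / sqrt(n))   # t-statistic
df = n - 1             # degrees of freedom

print("sample mean =", round(x_bar, 4))
print("s =", round(s, 4))
print("t =", round(t, 4), "with", df, "degrees of freedom")
```

The resulting value of t would then be compared with a value from the t-distribution table for the same degrees of freedom.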
Theory 10 (Finding areas under the t-distributions)
We use the t-distribution table to find areas under the t-distributions. The table gives the
value of $t_{\alpha, v}$, which is the $100\alpha$ percentage point of the t-distribution for $v$ degrees of
freedom. The numbers in the middle of the table are values from t-distributions. Each
row corresponds to a t-distribution with the degrees of freedom given at the beginning
of the row. The numbers in the top row are right tail areas.
Example 12
Find the value of t-distribution, if given nine degrees of freedom with alpha equal to
0.05.
Answer Example 12
Referring to the t-distribution table, if we go across the row for nine degrees of freedom and
down the column for an area of 0.05, we get the t value of 1.833. That means, for the $t_9$
distribution, the area under the curve to the right of 1.833 is 0.05.
Example 13
Find the value of t-distribution, if given twenty degrees of freedom with alpha equal
to 0.001.
Answer Example 13
Referring to the t-distribution table, if we go across the row for twenty degrees of freedom
and down the column for an area of 0.001, we get the t value of 3.552. That means,
for the $t_{20}$ distribution, the area under the curve to the right of 3.552 is 0.001.
Example 14
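By using the statistical table, find the value of $t_{\alpha, v}$.
(a) $P(T > t_{\alpha, 14}) = 0.025$
(b) $P(T < -t_{\alpha, 24}) = 0.005$
Answer Example 14
(a) Given : $P(T > t_{\alpha, 14}) = 0.025$
$t_{\alpha, v} = t_{0.025, 14} = 2.145$, then $P(T > 2.145) = 0.025$
(b) Given : $P(T < -t_{\alpha, 24}) = 0.005$
$t_{\alpha, v} = t_{0.005, 24} = 2.797$, then $P(T < -t_{\alpha, 24}) = P(T > t_{\alpha, 24}) = 0.005$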
Theory 11
Tests such as the Z, t, and F tests assume that the samples were drawn from normally
distributed populations or, more precisely, that the sample means are normally
distributed. Because these tests require assumptions about the type of population or
its parameters, they are known as 'parametric tests'. In many situations it is
impossible to make any rigid assumption about the distribution of the population
from which the samples are drawn. This limitation led to the search for
non-parametric tests. The chi-square (read as 'ki-square') test of independence and
goodness of fit is a prominent example of a non-parametric test. The chi-square
($\chi^2$) test can be used to evaluate a relationship between two nominal or ordinal
variables. The $\chi^2$ statistic measures the actual divergence between the observed
and expected frequencies. In sampling studies we never expect a perfect coincidence
between the expected and observed frequencies, and the question we have to tackle is
the degree to which the difference between them can be ignored as arising from
sampling fluctuation. If there is no difference between the expected and observed
frequencies, then $\chi^2 = 0$. If there is a difference, then $\chi^2$ is greater
than 0. But the difference may also be due to sampling fluctuation, in which case the
value of $\chi^2$ should be ignored in drawing the inference. Critical values of
$\chi^2$ under different conditions are given in tables. If the calculated value is
greater than the table value, the difference is not solely due to sampling
fluctuation and there is some other cause. On the other hand, if the calculated
$\chi^2$ is less than the table value, the difference may have arisen from chance
fluctuations and can be ignored. Thus $\chi^2$ tests enable us to find out whether
the divergence between theory and fact, or between expected and observed frequencies,
is significant. If the calculated value of $\chi^2$ is very small compared with the
table value, the divergence between the expected and observed frequencies is very
small and the fit is good. If the calculated value of $\chi^2$ is very large compared
with the table value, the divergence between the expected and observed frequencies is
very big and the fit is poor.
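To make the comparison with the table value concrete, here is a minimal Python sketch. The die-rolling frequencies are hypothetical and were invented only for illustration; the function name is likewise an assumption. The value 11.070 is the tabulated $\chi^2$ value for 5 degrees of freedom and a right-tail area of 0.05.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 rolls of a die, 10 expected per face if the die is fair.
observed = [8, 12, 9, 11, 13, 7]
expected = [10] * 6

chi_sq = chi_square_statistic(observed, expected)
table_value = 11.070  # chi-square table: 5 degrees of freedom, right-tail area 0.05

if chi_sq > table_value:
    verdict = "divergence is significant; the fit is poor"
else:
    verdict = "divergence can be attributed to sampling fluctuation"
print(round(chi_sq, 3), verdict)
```

With these made-up counts the calculated statistic is well below the table value, so the divergence would be ignored.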
Theory 12 (Finding areas under the $\chi^2$-distributions)
We use the $\chi^2$-distribution table to find areas under the $\chi^2$-distributions.
The table gives the value of $\chi^2_{\alpha,v}$, which is the $100\alpha$ percentage
point of the $\chi^2$-distribution with $v$ degrees of freedom. The numbers in the
body of the table are values of $\chi^2$. Each row corresponds to a
$\chi^2$-distribution with the degrees of freedom given at the beginning of the row.
The numbers in the top row are right-tail areas.
Example 15
Find the value of $\chi^2$ for seventeen degrees of freedom with $\alpha$ equal to
0.95.
Answer Example 15
Referring to the $\chi^2$-distribution table, if we go across the row for seventeen
degrees of freedom and down the column for an area of 0.95, we get the $\chi^2$ value
of 8.672. That means, for the $\chi^2_{17}$ distribution, the area under the curve to
the right of 8.672 is 0.95.
Example 16
Find the value of $\chi^2$ for twelve degrees of freedom with $\alpha$ equal to 0.02.
Answer Example 16
Referring to the $\chi^2$-distribution table, if we go across the row for twelve
degrees of freedom and down the column for an area of 0.02, we get the $\chi^2$ value
of 24.054. That means, for the $\chi^2_{12}$ distribution, the area under the curve
to the right of 24.054 is 0.02.
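The table look-ups in Examples 15 and 16 can be checked numerically. The sketch below is not part of the original module: it assumes the standard series expansion of the regularized lower incomplete gamma function, and the helper name `chi2_right_tail` is made up for illustration.

```python
import math

def chi2_right_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    a = df / 2.0
    half_x = x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, z) = exp(-z) * z**a * sum_{n>=0} z**n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= half_x / (a + n)
        total += term
        n += 1
    # Divide by Gamma(a) (via lgamma) to get the regularized left-tail area.
    left_area = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - left_area

print(chi2_right_tail(8.672, 17))   # Example 15: should be close to 0.95
print(chi2_right_tail(24.054, 12))  # Example 16: should be close to 0.02
```

The series converges for any positive `x`, but it is only intended for table-sized values like the ones above.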
Theory 13
In probability theory and statistics, the F-distribution is a continuous probability
distribution. It is also known as Snedecor's F distribution or the Fisher-Snedecor
distribution (after R.A. Fisher and George W. Snedecor). The F-distribution becomes
relevant when we calculate ratios of variances of normally distributed statistics.
Suppose we have two independent samples with $n_1$ and $n_2$ observations drawn from
normal populations with equal variances. Then the ratio
$F = \dfrac{s_1^2}{s_2^2}$
is distributed according to an F-distribution with $v_1 = n_1 - 1$ numerator degrees
of freedom and $v_2 = n_2 - 1$ denominator degrees of freedom. The F-distribution is
skewed to the right, and the F-values can only be positive.
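As an illustration of the ratio and its degrees of freedom, here is a minimal Python sketch. The two samples are hypothetical values invented for this example, and `f_ratio` is an illustrative helper name, not something defined in the module.

```python
from statistics import variance

def f_ratio(sample_1, sample_2):
    """Return (F, v1, v2): the ratio s1^2 / s2^2 and its degrees of freedom."""
    # statistics.variance is the sample variance (divisor n - 1).
    f_value = variance(sample_1) / variance(sample_2)
    return f_value, len(sample_1) - 1, len(sample_2) - 1

# Hypothetical measurements, invented for illustration.
sample_1 = [1, 2, 3, 4, 5]
sample_2 = [2, 4, 6, 8]

f_value, v1, v2 = f_ratio(sample_1, sample_2)
print(f_value, v1, v2)
```

The numerator degrees of freedom come from the first sample and the denominator degrees of freedom from the second, so swapping the samples inverts the ratio and swaps `v1` and `v2`.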
Theory 14 (Finding areas under the F-distributions)
We use the F-distribution table to find areas under the F-distributions. The table
gives the values of $F_{\alpha,v_1,v_2}$, which is the $100\alpha$ percentage point of
the F-distribution having $v_1$ degrees of freedom in the numerator and $v_2$ degrees
of freedom in the denominator. For each pair of values of $v_1$ and $v_2$,
$F_{\alpha,v_1,v_2}$ is tabulated for $\alpha = 0.05, 0.025, 0.01, 0.001$, the
$\alpha = 0.025$ values being bracketed. The lower percentage points of the
distribution may be obtained from the relation
$F_{1-\alpha,v_1,v_2} = \dfrac{1}{F_{\alpha,v_2,v_1}}$,
for example, $F_{0.95,12,8} = \dfrac{1}{F_{0.05,8,12}} = 0.351$.
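The reason the relation holds can be seen in a few lines. The sketch below relies only on the fact that if $F$ follows an F-distribution with $(v_1, v_2)$ degrees of freedom, then $1/F$ follows an F-distribution with $(v_2, v_1)$ degrees of freedom; the symbol $F'$ for that reciprocal is introduced here for illustration.

```latex
\begin{aligned}
P\left(F > F_{1-\alpha,v_1,v_2}\right) &= 1-\alpha
  && \text{(definition of the percentage point)} \\
P\left(\frac{1}{F} < \frac{1}{F_{1-\alpha,v_1,v_2}}\right) &= 1-\alpha
  && \text{(both sides are positive)} \\
P\left(F' > \frac{1}{F_{1-\alpha,v_1,v_2}}\right) &= \alpha
  && \text{(where } F' = 1/F \sim F_{v_2,v_1}\text{)} \\
\Rightarrow \quad \frac{1}{F_{1-\alpha,v_1,v_2}} &= F_{\alpha,v_2,v_1}
\end{aligned}
```

With the tabulated value $F_{0.05,8,12} = 2.85$, this gives $1/2.85 \approx 0.351$.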
Example 17
If $s_1^2$ and $s_2^2$ are the variances of independent random samples of sizes
$n_1 = 25$ and $n_2 = 13$ from normal populations with equal variances, find
$P\left(\dfrac{s_1^2}{s_2^2} < 6.25\right)$.
Answer Example 17
The variances of the two normal populations are equal,
$\sigma_1^2 = \sigma_2^2 = \sigma^2$.
$n_1 = 25$, $v_1 = n_1 - 1 = 25 - 1 = 24$
$n_2 = 13$, $v_2 = n_2 - 1 = 13 - 1 = 12$
From the statistical table:
$P\left(\dfrac{s_1^2}{s_2^2} < 6.25\right) = P(F < 6.25) = 1 - P(F > 6.25) = 1 - 0.001 = 0.999$
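The answer to Example 17 can also be checked by simulation rather than by table. The following is a rough Monte Carlo sketch, not the method used in the module: it assumes both populations are standard normal (any common variance gives the same ratio distribution), and the function names, number of trials and seed are arbitrary choices made for illustration.

```python
import random

def sample_variance(values):
    """Sample variance with divisor n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def simulate_ratio_below(limit, n1, n2, trials=5000, seed=1):
    """Estimate P(s1^2 / s2^2 < limit) for two normal samples with equal variances."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s1_sq = sample_variance([rng.gauss(0.0, 1.0) for _ in range(n1)])
        s2_sq = sample_variance([rng.gauss(0.0, 1.0) for _ in range(n2)])
        if s1_sq / s2_sq < limit:
            hits += 1
    return hits / trials

estimate = simulate_ratio_below(6.25, n1=25, n2=13)
print(estimate)  # should be close to the table answer 0.999
```

Because the event is so likely, a few thousand trials only show that the estimate is near 0.999; the table remains the precise source.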
Exercise 3.4
By using the statistical table, find the probability
1. $P(T > 2.898)$, $v = 17$
2. $P(T < 1.415)$, $v = 7$
3. $P(T < 2.042)$, $v = 30$
By using the statistical table, find the value of $T$, if given
4. $P(T > \_\_\_\_\_) = 0.005$, $v = 26$
5. $P(T > \_\_\_\_\_) = 0.995$, $v = 21$
6. $P(T < \_\_\_\_\_) = 0.9995$, $v = 14$
In each of the following parts, find $\chi^2_{0.95}$. Assume a chi-square distribution
with
7. 14 degrees of freedom
8. 29 degrees of freedom
Assume a chi-square distribution with 17 degrees of freedom. Fill in the blanks.
9. $P(\chi^2 > \_\_\_\_\_) = 0.05$
10. $P(\chi^2 < \_\_\_\_\_) = 0.005$
11. Assume a $\chi^2$ distribution with 7 degrees of freedom, find
$P(2.167 < \chi^2 < 6.346)$.
12. Assume a $\chi^2$ distribution with 17 degrees of freedom, find
$P(\chi^2 < 19.511)$.
13. If $S_1^2$ and $S_2^2$ are the variances of independent random samples of sizes
$n_1 = 9$ and $n_2 = 11$ from normal populations with equal variances, find
$P\left(\dfrac{S_1^2}{S_2^2} > 5.06\right)$.
14. If $s_1^2$ and $s_2^2$ are the variances of independent random samples of sizes
$n_1 = 9$ and $n_2 = 13$ from normal populations with equal variances, find
$P\left(\dfrac{s_1^2}{s_2^2} < 4.50\right)$.
15. If $s_1^2$ and $s_2^2$ are the variances of independent random samples of sizes
$n_1 = 7$ and $n_2 = 7$ from normal populations with equal variances, find
$P\left(\dfrac{s_1^2}{s_2^2} < 4.28\right)$.
Answer Exercise 3.4
1. 0.005 2. 0.90
3. 0.975 4. 2.779
5. -2.831 6. 4.140
7. 6.571 8. 17.708
9. 27.587 10. 5.697
11. 0.45 12. 0.7
13. 0.01 14. 0.99
15. 0.95
EXERCISE CHAPTER 3
1. A simple random sample of 100 men is chosen from a population with a mean
height of 70 inches and a standard deviation of 2.5 inches. What is the probability
that the average height of the sampled men is greater than 69.5 inches?
2. A group of ball bearings has a mean weight of 5.02 grams and a standard
deviation of 0.30 grams. If a random sample of 100 ball bearings is chosen from
this group, find
(a) the probability that the average weight of the sampled ball bearings is
between 4.96 and 5.00 grams.
(b) the probability that the average weight of the sampled ball bearings is
more than 5.10 grams.
3. Given the population 5, 5, 5, 7, 7, 8, 8, 8, 9, 9. Find the mean and standard
deviation of the sampling distribution of the sample mean $\bar{x}$
(a) if a random sample of size 40 is drawn with replacement from that
population.
(b) if a random sample of size 3 is drawn with replacement from that
population.
4. The mean height of 250 UPM staff members is 158 m and the standard deviation is
5 m. Find the mean and standard deviation of the sampling distribution of the
mean height for a sample of 38 staff members.
5. A chemical engineer calculates that the population mean yield of a batch is 518
grams per milliliter with a standard deviation of 40 grams. Assume that the
distribution of yield is approximately normal. What is the probability that, in a
certain month, he gets a mean yield of less than 515 grams for 36 batches?
6. The viscosity of a fluid can be measured in an experiment by dropping a small
ball into a calibrated tube containing the fluid and observing the random
variable X, the time it takes for the ball to drop the measured distance. Assume
that X is normally distributed with a mean of 20 seconds and a standard
deviation of 0.5 seconds for a particular type of liquid.
(a) What is the standard deviation of the average time of 40 experiments?
(b) What is the probability that the average time of 40 experiments will
exceed 20.1 seconds?
(c) Suppose the experiment is repeated only 20 times. What is the
probability that the average value of X is less than 20.1 seconds?
7. Two independent experiments are being run in which two different types of
paint are compared. Twenty specimens are painted using type A and the
drying time in hours is recorded for each. The same is done with type B.
Assume that the drying times of the two populations are normal,
$X_A \sim N(\mu, 1)$ and $X_B \sim N(\mu, 1)$, for the two types of paint.
(a) Write down the distribution of the sample mean $\bar{X}_A$.
(b) Calculate $P(\bar{X}_A - \bar{X}_B > 0.3)$.
8. The photoresist thickness in semiconductor manufacturing has a mean of 10
micrometers and a standard deviation of 3 micrometers. Assume that the
thickness is normally distributed and that the thicknesses of different
photoresists are independent.
(a) Determine the probability that the average thickness of 10 photoresists
is either greater than 11 or less than 9 micrometers.
(b) Determine the number of photoresists that need to be measured such that
the probability that the average thickness exceeds 11 micrometers is 0.01.
9. The populations of paper usage (in sheets) for the old and new versions of a
certain product are distributed $N_1(2000, 60)$ and $N_2(2500, 40)$ respectively.
Random samples of sizes $n_1$ and $n_2$ are taken from the two populations.
(a) Write down the sampling distribution of the difference between the means
of the new and old products.
(b) For samples of size $n_1 = n_2 = 30$, find the probability that the mean
usage of the new product is at least 500.5 sheets more than that of the old
product.
(c) For samples of size $n_1 = 20$ and $n_2 = 25$, find the probability that the
mean usage of the new product is at most 502 sheets more than that of the old
product.
10. Line Clear Manufacturing Sdn. Bhd. manufactures two types of cable, A and
B, that have mean breaking strengths of 2500 lb and 2400 lb with standard
deviations of 150 lb and 100 lb respectively. If 50 cables of brand A and 25
cables of brand B are tested, what is the probability that the mean breaking
strength of brand A will be
(a) at least 150 lb more than that of brand B?
(b) at least 110 lb more than that of brand B?
11. A light bulb manufacturer claims that the lifetime of its light bulbs has a
mean of 54 months and a standard deviation of 6 months. Your consumer
advocacy group tests 50 of them. What is the probability that it finds a mean
lifetime of less than 52 months?
12. A manufacturer of video display units is testing two microcircuit designs to
determine whether they produce equivalent mean current flow. The current flow is
normally distributed with the parameters given below. Find the probability that
the sample mean of design A is lower than the sample mean of design B.
              Design A   Design B
Mean            24.2       23.9
Variance        10         20
Sample size     15         10
13. The amount of time that a drive-through restaurant counter spends on a
customer is normally distributed with a mean of 3.2 minutes and a standard
deviation of 1.6 minutes. If a random sample of 64 customers is observed, find
the probability that the mean time at the counter is
(a) less than 2.7 minutes.
(b) more than 3 minutes.
(c) at least 3.2 minutes but less than 3.4 minutes.
14. Students may choose between a 3-semester course in physics without labs and
a 4-semester course with labs. The final written examination is the same for
each section. The section with labs made an average examination grade of 84
with a standard deviation of 4, and the section without labs made an average
grade of 77 with a standard deviation of 6. Assume the populations are
approximately normally distributed. Find the probability that the sample mean
for a random sample of scores of 12 students with labs exceeds the sample
mean for a random sample of scores of 18 students without labs by at most 5.
15. Casual workers in a certain industry are paid on average RM5.10 per hour;
the hourly pay is normally distributed with a standard deviation of RM2.20. A
sample of 35 casual workers from the industry was selected to respond to a
questionnaire on the issue of underpayment. Find the probability that the average
payment for those casual workers is
(a) at least RM6.00 per hour.
(b) greater than RM4.80 per hour.
16. The mean age at death is 55.5 years in Malaysia and 57 years in Singapore.
The standard deviations are approximately 4.6 years and 5 years respectively.
Samples of 130 deaths from the Malaysia Hospital and 120 deaths from the
Singapore Hospital were selected. Find the probability that
(a) the mean age at death in Malaysia is greater than the mean age at death
in Singapore.
(b) the mean age at death in Singapore is three less than the mean age at
death in Malaysia.
17. Intelligence quotients (IQ) in the general population are normally distributed
with a mean of 100 and a standard deviation of 15. A random sample of 40
students was taken in a certain university. Find the probability that
(a) the mean IQ of the sample is greater than 105 and less than 107.
(b) the mean IQ of the sample is not more than 109.
18. The average life of a hand phone is 8 years for females and 6 years for males,
with standard deviations of 1 and 2 years respectively. Assuming that the
lives of these hand phones follow approximately a normal distribution, find
(a) the probability that the life of a randomly selected male's hand phone
falls between 6.6 and 7.7 years.
(b) the probability that the mean life of a random sample of 44 females' hand
phones exceeds that of a random sample of 55 males' hand phones by at least
2.5 years.
19. A company manufactures two types of polystyrene, type A and type B, that
have mean breaking strengths of 400 g and 450 g with standard deviations of
30 g and 20 g respectively. If 80 polystyrenes of type A and 45 polystyrenes
of type B are tested, what is the probability that the mean breaking strength
of type B will be at most 53 g more than that of type A?
20. The average running time of disks produced by Company X is 88.1 minutes
with a standard deviation of 6.1 minutes, while disks from Company Y have a
mean running time of 99.3 minutes with a standard deviation of 13.6 minutes.
Assume the populations are approximately normally distributed. What is the
probability that a random sample of 41 disks from Company Y will have a
mean running time that is at most 15 minutes more than the mean running time
of a random sample of 32 disks from Company X?
ANSWER EXERCISE CHAPTER 3
1. 0.97725
2. (a) 0.22867 (b) 0.00379
3. (a) 7.1, 0.2392 (b) 7.1, 0.87368
4. 158m, 0.8111 5. 0.32636
6. (a) 0.0791 (b) 0.1038 (c) 0.8133
7. (b) 0.00135
8. (a) 0.2937 (b) 49
9. (b) 0.3936 (c) 0.82381
10. (a) 0.0432 (b) 0.3658
11. 0.00914 12. 0.45620
13. (a) 0.00621 (b) 0.84134 (c) 0.34134
14. 0.13567
15. (a) 0.00776 (b) 0.791
16. (a) 0.00695 (b) 0.99305
17. (a) 0.01593 (b) 0.98257
18. (a) 0.18443 (b) 0.0526
19. 0.7486 20. 0.9452
SUMMARY CHAPTER 3
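Sampling error of single mean : $e = \bar{x} - \mu$.
Population mean, $\mu = \dfrac{\sum x}{N}$.
Sample mean, $\bar{x} = \dfrac{\sum x}{n}$.
Z-value for the sampling distribution of $\bar{x}$ is
$Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
Calculate the probability of single mean :
Step 1 : Mean of sample mean, $\mu_{\bar{x}}$, which is $\mu_{\bar{x}} = \mu$.
Step 2 : Standard deviation of sample mean, $\sigma_{\bar{x}}$, which is
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
Step 3 : Distribution in normal distribution form, which is
$\bar{x} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}^2)$.
Step 4 : Probability of sample mean,
$P(\bar{x} > r) = P\left(Z > \dfrac{r - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right)$.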
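The four steps for a single mean can be sketched in Python as follows. This is an illustrative sketch only: the function name is an assumption, and it uses the standard library's `NormalDist` in place of the Z-table. The numbers are those of question 1 in Exercise Chapter 3.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_greater(r, mu, sigma, n):
    """P(sample mean > r), assuming the sample mean is normally distributed."""
    mean_xbar = mu                          # Step 1: mean of the sample mean
    sd_xbar = sigma / sqrt(n)               # Step 2: standard deviation of the sample mean
    xbar = NormalDist(mean_xbar, sd_xbar)   # Step 3: distribution of the sample mean
    return 1.0 - xbar.cdf(r)                # Step 4: right-tail probability

# Question 1: n = 100, mu = 70 inches, sigma = 2.5 inches, P(mean > 69.5).
p = prob_sample_mean_greater(69.5, mu=70, sigma=2.5, n=100)
print(round(p, 5))  # the answer key gives 0.97725
```

For a left-tail probability, `xbar.cdf(r)` on its own gives $P(\bar{x} < r)$.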
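Calculate the probability of two means :
Step 1 : Mean of sampling distribution, which is
$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$.
Step 2 : Standard deviation of sampling distribution, which is
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
Step 3 : Distribution in normal form,
$\bar{x}_1 - \bar{x}_2 \sim N(\mu_{\bar{x}_1 - \bar{x}_2}, \sigma_{\bar{x}_1 - \bar{x}_2}^2)$.
Step 4 : Probability of the difference between sample means,
$P(\bar{x}_1 - \bar{x}_2 > r) = P\left(Z > \dfrac{r - \mu_{\bar{x}_1 - \bar{x}_2}}{\sigma_{\bar{x}_1 - \bar{x}_2}}\right)$.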
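The steps for two means can be sketched the same way. Again this is only an illustration under the assumption that the difference of sample means is normally distributed; the function name is made up, and `NormalDist` stands in for the Z-table. The numbers are those of question 10 in Exercise Chapter 3.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_difference_greater(r, mu1, sigma1, n1, mu2, sigma2, n2):
    """P(mean of sample 1 - mean of sample 2 > r) under a normal approximation."""
    mean_diff = mu1 - mu2                                # Step 1
    sd_diff = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # Step 2
    diff = NormalDist(mean_diff, sd_diff)                # Step 3
    return 1.0 - diff.cdf(r)                             # Step 4

# Question 10: cable A (mean 2500 lb, sd 150 lb, n = 50) against
# cable B (mean 2400 lb, sd 100 lb, n = 25).
p_a = prob_mean_difference_greater(150, 2500, 150, 50, 2400, 100, 25)
p_b = prob_mean_difference_greater(110, 2500, 150, 50, 2400, 100, 25)
print(round(p_a, 4), round(p_b, 4))  # the answer key gives 0.0432 and 0.3658
```

Small differences from the answer key in the last digit can arise because the key rounds the Z-value before using the table.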