1 1 slide chapter 7 (b) – point estimation and sampling distributions point estimation is a form...
TRANSCRIPT
1 1 Slide Slide
Chapter 7 (b) – Point Estimation and Chapter 7 (b) – Point Estimation and Sampling DistributionsSampling Distributions
Point estimationPoint estimation is a form of statistical inference. is a form of statistical inference. Point estimationPoint estimation is a form of statistical inference. is a form of statistical inference.
In In point estimationpoint estimation we use the data from the sample we use the data from the sample to compute a value of a sample statistic that servesto compute a value of a sample statistic that serves as an estimate of a population parameter.as an estimate of a population parameter.
In In point estimationpoint estimation we use the data from the sample we use the data from the sample to compute a value of a sample statistic that servesto compute a value of a sample statistic that serves as an estimate of a population parameter.as an estimate of a population parameter.
We refer to We refer to as the as the point estimatorpoint estimator of the population of the population mean mean .. We refer to We refer to as the as the point estimatorpoint estimator of the population of the population mean mean ..
xx
ss is the is the point estimatorpoint estimator of the population standard of the population standard deviation deviation .. ss is the is the point estimatorpoint estimator of the population standard of the population standard deviation deviation ..
is the is the point estimatorpoint estimator of the population proportion of the population proportion pp.. is the is the point estimatorpoint estimator of the population proportion of the population proportion pp..pp
2 2 Slide Slide
• Example: St. Andrew’s CollegeExample: St. Andrew’s College
Point EstimationPoint Estimation
St. Andrew’s College received 900 applications from St. Andrew’s College received 900 applications from prospective students. The application form contains a prospective students. The application form contains a variety of information including the individual’s Scholastic variety of information including the individual’s Scholastic Aptitude Test (SAT) score and whether or not the individual Aptitude Test (SAT) score and whether or not the individual desires on-campus housing.desires on-campus housing.
At a meeting in a few hours, the Director of Admissions At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score and the would like to announce the average SAT score and the proportion of applicants that want to live on campus, for the proportion of applicants that want to live on campus, for the population of 900 applicants.population of 900 applicants.
However, the necessary data on the applicants have not yet However, the necessary data on the applicants have not yet been entered in the college’s computerized database. So, been entered in the college’s computerized database. So, the Director decides to estimate the values of the the Director decides to estimate the values of the population parameters of interest based on sample population parameters of interest based on sample statistics. The sample of 30 applicants selected earlier with statistics. The sample of 30 applicants selected earlier with Excel’s Excel’s RANDRAND function will be used. function will be used.
3 3 Slide Slide
Excel Value Worksheet (Sorted)
Note: Rows 10-31 are not shown.
Point Estimation Using Excel
A B
1Applicant Number
23456789
Random Number0.000270.001920.003030.004810.005380.005830.006490.00667
1277340858116185510394
C D
SAT Score
On-Campus Housing
1207 No1143 Yes1091 Yes1108 No1227 Yes982 Yes
1363 Yes1108 No
4 4 Slide Slide
Point EstimationPoint Estimation
Note: Different random numbers would haveidentified a different sample which would haveresulted in different point estimates.
• s as Point Estimator of
5 5 Slide Slide
• Population Mean SAT Score
• Population Standard Deviation for SAT Score
Population Proportion Wanting On-Campus Housing
Once all the data for the 900 applicants were enteredin the college’s database, the values of the populationparameters of interest were calculated.
Point Estimation
6 6 Slide Slide
PopulationParameter
PointEstimator
PointEstimate
ParameterValue
= Population mean SAT score
1697 1684
= Population std. deviation for SAT score
87.4 s = Sample stan- dard deviation for SAT score
85.2
p = Population pro- portion wanting campus housing
.72 .67
Summary of Point EstimatesObtained from a Simple Random Sample
7 7 Slide Slide
Practical AdvicePractical Advice
The The target populationtarget population is the population we want to is the population we want to make inferences about.make inferences about. The The target populationtarget population is the population we want to is the population we want to make inferences about.make inferences about.
Whenever a sample is used to make inferences aboutWhenever a sample is used to make inferences about a population, we should make sure that the targeteda population, we should make sure that the targeted population and the sampled population are in closepopulation and the sampled population are in close agreement.agreement.
Whenever a sample is used to make inferences aboutWhenever a sample is used to make inferences about a population, we should make sure that the targeteda population, we should make sure that the targeted population and the sampled population are in closepopulation and the sampled population are in close agreement.agreement.
The The sampled populationsampled population is the population from is the population from which the sample is actually taken.which the sample is actually taken. The The sampled populationsampled population is the population from is the population from which the sample is actually taken.which the sample is actually taken.
8 8 Slide Slide
Process of Statistical InferenceProcess of Statistical Inference
The value of is used toThe value of is used tomake inferences aboutmake inferences about
the value of the value of ..
xx The sample data The sample data provide a value forprovide a value for
the sample meanthe sample mean . .xx
A simple random sampleA simple random sampleof of nn elements is selected elements is selected
from the population.from the population.
Population Population with meanwith mean
= ?= ?
Sampling Distribution Sampling Distribution of of
xx
9 9 Slide Slide
where: = the population meanwhere: = the population mean
When the expected value of the point estimatorWhen the expected value of the point estimator
equals the population parameter, we say the pointequals the population parameter, we say the point
estimator is unbiased.estimator is unbiased.
Sampling Distribution of Sampling Distribution of xx
10 10 Slide Slide
Sampling Distribution of Sampling Distribution of xx
We will use the following notation to define We will use the following notation to define thethe
standard deviation of the sampling distribution standard deviation of the sampling distribution of .of .
xx
= the standard deviation of = the standard deviation of xx xx
= the standard deviation of the population = the standard deviation of the population
nn = the sample size = the sample size
NN = the population size = the population size
• Standard Deviation ofStandard Deviation of xx
11 11 Slide Slide
Sampling Distribution of Sampling Distribution of xx
Finite PopulationFinite Population Infinite PopulationInfinite Population
)(1 nN
nNx
)(1 nN
nNx
x n
x n
• is referred to as the is referred to as the standard standard errorerror of the of the mean.mean.
x x
• A finite population is treated as beingA finite population is treated as being infinite if infinite if nn//NN << .05. .05.
• is the is the finite populationfinite population correction factorcorrection factor..
( ) / ( )N n N 1( ) / ( )N n N 1
xx• Standard Deviation ofStandard Deviation of
12 12 Slide Slide
In cases where the population is highly skewed or outliers are present, samples of size 50 may be needed.
13 13 Slide Slide
Central Limit TheoremCentral Limit Theorem
14 14 Slide Slide
Example: St. Andrew’s College
15 15 Slide Slide
Step 1: Calculate the Step 1: Calculate the zz-value at the -value at the upperupper endpoint of endpoint of the interval.the interval.
z = (1707 - 1697)/15.96 = .63z = (1707 - 1697)/15.96 = .63
PP((zz << .63) = .7357 .63) = .7357
Step 2: Find the area under the curve to the left of theStep 2: Find the area under the curve to the left of the upperupper endpoint. endpoint.
Sampling Distribution of Sampling Distribution of xx Example: St. Andrew’s CollegeExample: St. Andrew’s College
What is the probability that a simple random sample of 30 What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean applicants will provide an estimate of the population mean SAT score that is within +/SAT score that is within +/10 of the actual population mean 10 of the actual population mean ? ?
Be between 1687 and 1707?Be between 1687 and 1707?
16 16 Slide Slide
Cumulative Probabilities for the Standard Normal Distribution
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
. . . . . . . . . . .
• Example: St. Andrew’s College
17 17 Slide Slide
1697 1707
Area = .7357
Example: St. Andrew’s College
18 18 Slide Slide
1687 1697
Area = .2643
Step 3: Calculate the Step 3: Calculate the zz-value at the -value at the lowerlower endpoint of endpoint of the interval.the interval. zz = (1687 - 1697)/15.96 = - .63 = (1687 - 1697)/15.96 = - .63
Step 4: Find the area under the curve to the left of theStep 4: Find the area under the curve to the left of the lowerlower endpoint. endpoint. PP((zz << -.63) = .2643 -.63) = .2643
19 19 Slide Slide
Step 5: Calculate the area under the curve betweenStep 5: Calculate the area under the curve between the lower and upper endpoints of the interval.the lower and upper endpoints of the interval.
PP(-.68 (-.68 << zz << .68) = .68) = PP((zz << .68) - .68) - PP((zz << -.68) -.68)
= .7357 - .2643= .7357 - .2643= .4714= .4714
The probability that the sample mean SAT score willThe probability that the sample mean SAT score willbe between 1687 and 1707 is:be between 1687 and 1707 is:
Example: St. Andrew’s CollegeExample: St. Andrew’s College
20 20 Slide Slide
17071687 1697
Area = .4714
Example: St. Andrew’s College
21 21 Slide Slide
• Suppose we select a simple random sample of 100Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.applicants instead of the 30 originally considered.
Example: St. Andrew’s CollegeExample: St. Andrew’s College
22 22 Slide Slide
Example: St. Andrew’s CollegeExample: St. Andrew’s College
23 23 Slide Slide
• Example: St. Andrew’s CollegeExample: St. Andrew’s College
24 24 Slide Slide
17071687 1697
Area = .7776
Example: St. Andrew’s College
25 25 Slide Slide
E p p( ) E p p( )
Sampling Distribution ofSampling Distribution ofpp
where:where:pp = the population proportion = the population proportion
The The sampling distribution of sampling distribution of is the probability is the probabilitydistribution of all possible values of the sampledistribution of all possible values of the sampleproportion .proportion .pp
pp
• Expected Value ofExpected Value of pp
26 26 Slide Slide
n
pp
N
nNp
)1(
1
n
pp
N
nNp
)1(
1
pp pn
( )1 pp pn
( )1
• is referred to as the is referred to as the standard standard error oferror of the proportionthe proportion..
p p
Sampling Distribution ofSampling Distribution of
Finite PopulationFinite Population Infinite PopulationInfinite Population
• Standard Deviation ofStandard Deviation of
• is the finite populationis the finite population correction factor.correction factor.
( ) / ( )N n N 1( ) / ( )N n N 1
pp
pp
27 27 Slide Slide
Recall that 72% of the prospective students Recall that 72% of the prospective students applyingapplying
to St. Andrew’s College desire on-campus to St. Andrew’s College desire on-campus housing.housing.
Example: St. Andrew’s CollegeExample: St. Andrew’s College
Sampling Distribution ofSampling Distribution of pp
What is the probability that a simple random What is the probability that a simple random sample of 30 applicants will provide an estimate of sample of 30 applicants will provide an estimate of the population proportion of applicant desiring on-the population proportion of applicant desiring on-campus housing that is within plus or minus .05 of campus housing that is within plus or minus .05 of the actual population proportion?the actual population proportion?
For our example, with n = 30 and p = .72, thenormal distribution is an acceptable approximationbecause:npnp = 30(.72) = 21.6 = 30(.72) = 21.6 >> 5 5nn(1 - (1 - pp) = 30(.28) = 8.4 ) = 30(.28) = 8.4 >> 5 5
28 28 Slide Slide
.72(1 .72).082
30p
.72(1 .72).082
30p
( ) .72E p ( ) .72E ppp
SamplingSamplingDistributionDistributionof of pp
Prospective Students Applying to St. Andrew’s Prospective Students Applying to St. Andrew’s College Desiring On-Campus Housing.College Desiring On-Campus Housing.
29 29 Slide Slide
The sampling distribution of can be approximated The sampling distribution of can be approximated by a normal distribution whenever the sample size by a normal distribution whenever the sample size is large enough to satisfy the two conditions:is large enough to satisfy the two conditions:
npnp >> 5 and 5 and nn(1 – (1 – pp) ) >> 5 5
. . . because when these conditions are satisfied, the. . . because when these conditions are satisfied, the probability distribution of probability distribution of xx in the sample proportion, in the sample proportion, = = xx//nn, can be approximated by normal distribution, can be approximated by normal distribution (and because (and because nn is a constant). is a constant).
Form of the Sampling Distribution of Form of the Sampling Distribution of pp
pp
pp
30 30 Slide Slide
Step 1Step 1: : Calculate the Calculate the zz-value at the -value at the upperupper endpointendpoint of the interval.of the interval.zz = (.77 = (.77 .72)/.082 = .61 .72)/.082 = .61
PP((zz << .61) = .7291 .61) = .7291
Step 2Step 2:: Find the area under the curve to the left Find the area under the curve to the left ofof the the upperupper endpoint. endpoint.
Step 3Step 3: : Calculate the Calculate the zz-value at the -value at the lowerlower endpoint ofendpoint of the interval.the interval.zz = (.67 = (.67 .72)/.082 = - .61 .72)/.082 = - .61
Step 4Step 4:: Find the area under the curve to the left of the Find the area under the curve to the left of the lowerlower endpoint. endpoint.
PP((zz << -.61) = .2709 -.61) = .2709
31 31 Slide Slide
PP(.67 (.67 << << .77) = .4582 .77) = .4582pp
Step 5Step 5: : Calculate the area under the curve betweenCalculate the area under the curve between the lower and upper endpoints of the interval.the lower and upper endpoints of the interval.
PP(-.61 (-.61 << zz << .61) = .61) = PP((zz << .61) .61) PP((zz << -.61) -.61)
= .7291 = .7291 .2709 .2709
= .4582= .4582The probability that the sample proportion of applicantsThe probability that the sample proportion of applicantswanting on-campus housing will be within +/-.05 of thewanting on-campus housing will be within +/-.05 of theactual population proportion :actual population proportion :
32 32 Slide Slide
.77.77.67.67 .72.72
Area = .4582Area = .4582
pp
SamplingSamplingDistributionDistributionof of pp
.082p .082p
Sampling Distribution ofSampling Distribution ofpp
Example: St. Andrew’s CollegeExample: St. Andrew’s College
33 33 Slide Slide
The population is first divided into groups ofThe population is first divided into groups of elements called elements called stratastrata.. The population is first divided into groups ofThe population is first divided into groups of elements called elements called stratastrata..
Stratified Random SamplingStratified Random Sampling
Each element in the population belongs to one andEach element in the population belongs to one and only one stratum.only one stratum. Each element in the population belongs to one andEach element in the population belongs to one and only one stratum.only one stratum.
Best results are obtained when the elements withinBest results are obtained when the elements within each stratum are as much alike as possibleeach stratum are as much alike as possible (i.e. a (i.e. a homogeneous grouphomogeneous group).).
Best results are obtained when the elements withinBest results are obtained when the elements within each stratum are as much alike as possibleeach stratum are as much alike as possible (i.e. a (i.e. a homogeneous grouphomogeneous group).).
Other Types of Sampling
34 34 Slide Slide
Stratified Random SamplingStratified Random Sampling
A simple random sample is taken from each stratum.A simple random sample is taken from each stratum. A simple random sample is taken from each stratum.A simple random sample is taken from each stratum.
Formulas are available for combining the stratumFormulas are available for combining the stratum sample results into one population parametersample results into one population parameter estimate.estimate.
Formulas are available for combining the stratumFormulas are available for combining the stratum sample results into one population parametersample results into one population parameter estimate.estimate.
AdvantageAdvantage: If strata are homogeneous, this method: If strata are homogeneous, this method is as “precise” as simple random sampling but withis as “precise” as simple random sampling but with a smaller total sample size.a smaller total sample size.
AdvantageAdvantage: If strata are homogeneous, this method: If strata are homogeneous, this method is as “precise” as simple random sampling but withis as “precise” as simple random sampling but with a smaller total sample size.a smaller total sample size.
ExampleExample: The basis for forming the strata might be: The basis for forming the strata might be department, location, age, industry type, and so on.department, location, age, industry type, and so on. ExampleExample: The basis for forming the strata might be: The basis for forming the strata might be department, location, age, industry type, and so on.department, location, age, industry type, and so on.
35 35 Slide Slide
Cluster SamplingCluster Sampling
The population is first divided into separate groupsThe population is first divided into separate groups of elements called of elements called clustersclusters.. The population is first divided into separate groupsThe population is first divided into separate groups of elements called of elements called clustersclusters..
Ideally, each cluster is a representative small-scaleIdeally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group).version of the population (i.e. heterogeneous group). Ideally, each cluster is a representative small-scaleIdeally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group).version of the population (i.e. heterogeneous group).
A simple random sample of the clusters is then taken.A simple random sample of the clusters is then taken. A simple random sample of the clusters is then taken.A simple random sample of the clusters is then taken.
All elements within each sampled (chosen) clusterAll elements within each sampled (chosen) cluster form the sample.form the sample. All elements within each sampled (chosen) clusterAll elements within each sampled (chosen) cluster form the sample.form the sample.
36 36 Slide Slide
Cluster SamplingCluster Sampling
AdvantageAdvantage: The close proximity of elements can be: The close proximity of elements can be cost effective (i.e. many sample observations can becost effective (i.e. many sample observations can be obtained in a short time).obtained in a short time).
AdvantageAdvantage: The close proximity of elements can be: The close proximity of elements can be cost effective (i.e. many sample observations can becost effective (i.e. many sample observations can be obtained in a short time).obtained in a short time).
DisadvantageDisadvantage: This method generally requires a: This method generally requires a larger total sample size than simple or stratifiedlarger total sample size than simple or stratified random sampling.random sampling.
DisadvantageDisadvantage: This method generally requires a: This method generally requires a larger total sample size than simple or stratifiedlarger total sample size than simple or stratified random sampling.random sampling.
ExampleExample: A primary application is area sampling,: A primary application is area sampling, where clusters are city blocks or other well-definedwhere clusters are city blocks or other well-defined areas.areas.
ExampleExample: A primary application is area sampling,: A primary application is area sampling, where clusters are city blocks or other well-definedwhere clusters are city blocks or other well-defined areas.areas.
37 37 Slide Slide
Systematic SamplingSystematic Sampling
This method has the properties of a simple random sample,This method has the properties of a simple random sample,especially if the list of the population elements is a random ordering.especially if the list of the population elements is a random ordering. This method has the properties of a simple random sample,This method has the properties of a simple random sample,especially if the list of the population elements is a random ordering.especially if the list of the population elements is a random ordering.
AdvantageAdvantage: The sample usually will be easier to identify than it: The sample usually will be easier to identify than itwould be if simple random sampling were used.would be if simple random sampling were used. AdvantageAdvantage: The sample usually will be easier to identify than it: The sample usually will be easier to identify than itwould be if simple random sampling were used.would be if simple random sampling were used.
ExampleExample: Selecting every 100: Selecting every 100thth listing in a telephone book after the listing in a telephone book after thefirst randomly selected listingfirst randomly selected listingExampleExample: Selecting every 100: Selecting every 100thth listing in a telephone book after the listing in a telephone book after thefirst randomly selected listingfirst randomly selected listing
If a sample size of If a sample size of nn is desired from a population containing is desired from a population containing NN elements, elements,we might sample one element for every we might sample one element for every nn//NN elements in the population. elements in the population.If a sample size of If a sample size of nn is desired from a population containing is desired from a population containing NN elements, elements,we might sample one element for every we might sample one element for every nn//NN elements in the population. elements in the population.
We randomly select one of the first We randomly select one of the first nn//NN elements from the population elements from the populationlist.list.We randomly select one of the first We randomly select one of the first nn//NN elements from the population elements from the populationlist.list.
We then select every We then select every nn//NNth element that follows in the population list.th element that follows in the population list.We then select every We then select every nn//NNth element that follows in the population list.th element that follows in the population list.
38 38 Slide Slide
Convenience SamplingConvenience Sampling
It is a It is a nonprobability sampling techniquenonprobability sampling technique. Items are. Items are included in the sample without known probabilitiesincluded in the sample without known probabilities of being selected.of being selected.
It is a It is a nonprobability sampling techniquenonprobability sampling technique. Items are. Items are included in the sample without known probabilitiesincluded in the sample without known probabilities of being selected.of being selected.
ExampleExample: A professor conducting research might use: A professor conducting research might use student volunteers to constitute a sample.student volunteers to constitute a sample. ExampleExample: A professor conducting research might use: A professor conducting research might use student volunteers to constitute a sample.student volunteers to constitute a sample.
The sample is identified primarily by The sample is identified primarily by convenienceconvenience.. The sample is identified primarily by The sample is identified primarily by convenienceconvenience..
AdvantageAdvantage: Sample selection and data collection are: Sample selection and data collection are relatively easy.relatively easy. AdvantageAdvantage: Sample selection and data collection are: Sample selection and data collection are relatively easy.relatively easy.
DisadvantageDisadvantage: It is impossible to determine how: It is impossible to determine how representative of the population the sample is.representative of the population the sample is. DisadvantageDisadvantage: It is impossible to determine how: It is impossible to determine how representative of the population the sample is.representative of the population the sample is.
39 39 Slide Slide
Judgment SamplingJudgment Sampling
The person most knowledgeable on the subject of theThe person most knowledgeable on the subject of the study selects elements of the population that he orstudy selects elements of the population that he or she feels are most representative of the population.she feels are most representative of the population.
The person most knowledgeable on the subject of theThe person most knowledgeable on the subject of the study selects elements of the population that he orstudy selects elements of the population that he or she feels are most representative of the population.she feels are most representative of the population.
It is a It is a nonprobability sampling techniquenonprobability sampling technique.. It is a It is a nonprobability sampling techniquenonprobability sampling technique..
ExampleExample: A reporter might sample three or four senators,: A reporter might sample three or four senators, judging them as reflecting the general opinion of the senate.judging them as reflecting the general opinion of the senate. ExampleExample: A reporter might sample three or four senators,: A reporter might sample three or four senators, judging them as reflecting the general opinion of the senate.judging them as reflecting the general opinion of the senate.
AdvantageAdvantage: It is a relatively easy way of selecting a sample.: It is a relatively easy way of selecting a sample. AdvantageAdvantage: It is a relatively easy way of selecting a sample.: It is a relatively easy way of selecting a sample.
DisadvantageDisadvantage: The quality of the sample results depends: The quality of the sample results depends on the judgment of the person selecting the sample.on the judgment of the person selecting the sample. DisadvantageDisadvantage: The quality of the sample results depends: The quality of the sample results depends on the judgment of the person selecting the sample.on the judgment of the person selecting the sample.
40 40 Slide Slide
RecommendationRecommendation
It is recommended that probability sampling methodsIt is recommended that probability sampling methods (simple random, stratified, cluster, or systematic) be(simple random, stratified, cluster, or systematic) be used.used.
It is recommended that probability sampling methodsIt is recommended that probability sampling methods (simple random, stratified, cluster, or systematic) be(simple random, stratified, cluster, or systematic) be used.used.
For these methods, formulas are available for For these methods, formulas are available for evaluating the “goodness” of the sample results inevaluating the “goodness” of the sample results in terms of the closeness of the results to the populationterms of the closeness of the results to the population parameters being estimated.parameters being estimated.
For these methods, formulas are available for For these methods, formulas are available for evaluating the “goodness” of the sample results inevaluating the “goodness” of the sample results in terms of the closeness of the results to the populationterms of the closeness of the results to the population parameters being estimated.parameters being estimated.
An evaluation of the goodness cannot be made withAn evaluation of the goodness cannot be made with non-probability (convenience or judgment) samplingnon-probability (convenience or judgment) sampling methods.methods.
An evaluation of the goodness cannot be made withAn evaluation of the goodness cannot be made with non-probability (convenience or judgment) samplingnon-probability (convenience or judgment) sampling methods.methods.