Sampling Design and Analysis MTH 494 Lecture-30 Ossam Chohan Assistant Professor CIIT Abbottabad

Upload: eric-greer

Post on 03-Jan-2016

Sampling Design and Analysis, MTH 494

Lecture-30

Ossam Chohan, Assistant Professor

CIIT Abbottabad

Sampling with unequal probabilities

• Up to now, we have only discussed sampling schemes in which the probabilities of choosing sampling units are equal.

• Equal probabilities give schemes that are often easy to design and explain. Such schemes are not, however, always possible or, if practicable, as efficient as schemes using unequal probabilities.

• A cluster sample with equal probabilities may result in a large variance for the design-unbiased estimators of the population mean and total.

Primary Sampling Units (PSUs)

• In sample surveys, a primary sampling unit (commonly abbreviated as PSU) arises in samples in which population elements are grouped into aggregates and the aggregates become units in sample selection. The aggregates are, due to their intended usage, called "sampling units." Primary sampling unit refers to sampling units that are selected in the first (primary) stage of a multi-stage sample ultimately aimed at selecting individual elements. In selecting a sample, one may choose elements directly; in such a design, the elements are the only sampling units. One may also choose to group the elements into aggregates and choose the aggregates in a first stage of selection and then elements at a later stage of selection. The aggregates and the elements are both sampling units in such a design. For example, if a survey is selecting households as elements, then counties may serve as the primary sampling unit, with blocks and households ...

Sampling one primary unit

• As a special case, suppose we select just one (n=1) of N psus to be in the sample.

• The total for psu i is denoted by ti, and we want to estimate the population total, t.

• Sampling one psu will demonstrate the ideas of unequal-probability sampling without introducing the complications.

Understanding example

• Let us start out by looking at what happens for a situation in which we know the whole population.

• A town has four supermarkets, ranging in size from 100 square meters (m²) to 1000 m².

• We want to estimate the total amount of sales in the four stores for last month by sampling just one of the stores.

• Of course, this is just an illustration; if we really had only four supermarkets, we would probably take a census.

• You might expect that a larger store would have more sales than a smaller store, and that the variability in total sales among several 1000 m² stores will be greater than the variability in total sales among several 100 m² stores.

• Since we sample only one store, the probability that a store is selected on the first draw (ϒi) is the same as the probability that the store is included in the sample (πi).

• For this example, take πi = ϒi = Pr(store i selected).

• This probability is proportional to the size of the store. Since store A accounts for 1/16 of the total floor area of the four stores, it is sampled with probability 1/16. For illustrative purposes, we know the values of ti for the whole population.

• Values are given on next slide.

Store   Size (m²)   ϒi      ti (in thousands)
A       100         1/16    11
B       200         2/16    20
C       300         3/16    24
D       1000        10/16   245
Total   1600        1       300
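
The selection probabilities in the table can be reproduced from the floor areas alone. The following is a minimal sketch, assuming each store's probability is simply its share of the total floor area; the variable names are illustrative and not part of the lecture.

```python
from fractions import Fraction

# Floor areas in square meters, taken from the population table.
sizes = {"A": 100, "B": 200, "C": 300, "D": 1000}

total_size = sum(sizes.values())  # 1600

# Probability of selecting each store: its share of the total floor area.
psi = {store: Fraction(size, total_size) for store, size in sizes.items()}

for store, prob in psi.items():
    # Fractions print in lowest terms, e.g. 2/16 is shown as 1/8.
    print(store, prob)

# The probabilities of a valid one-draw design must add up to 1.
print("sum:", sum(psi.values()))
```

Exact fractions are used instead of floats so that the sum is exactly 1 and the values can be compared directly with the table.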

• We could select a probability sample of size 1 with probabilities given above by shuffling cards numbered 1 through 16 and choosing one card.

• If the card’s number is 1, choose store A; if 2 or 3, choose B; if 4, 5 or 6, choose C; and if 7 through 16, choose D. Or we could spin once on a spinner like this:

[Figure: spinner used to select one store]
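
The card-drawing procedure can be simulated in a few lines. This is a sketch only: the function names and the seed are hypothetical, and `random.Random.randint` stands in for shuffling the deck and choosing one card.

```python
import random

def store_for_card(card):
    """Map a card number (1..16) to a store, using the rule described above."""
    if card == 1:
        return "A"
    if card <= 3:       # cards 2 and 3
        return "B"
    if card <= 6:       # cards 4, 5 and 6
        return "C"
    return "D"          # cards 7 through 16

def draw_store(rng):
    """Draw one card uniformly at random and return the selected store."""
    card = rng.randint(1, 16)  # both endpoints are included
    return store_for_card(card)

rng = random.Random(494)  # arbitrary seed, only for reproducibility
counts = {store: 0 for store in "ABCD"}
for _ in range(16000):
    counts[draw_store(rng)] += 1

# With 16000 draws the counts should be roughly 1000, 2000, 3000 and 10000.
print(counts)
```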

• We compensate for the unequal probabilities of selection by also using ϒi in the estimator.

• We have already seen such compensation for unequal probabilities in stratified sampling: if we select 10% of the units in stratum 1 and 20% of the units in stratum 2, the sampling weight is 10 for each unit in stratum 1 and 5 for each unit in stratum 2.

• Here, we select store A with probability 1/16, so store A’s sampling weight is 16.

• If the size of store is roughly proportional to the total sales for that store, we would expect that store A also has about 1/16 of the total sales and that multiplying store A’s sales by 16 would estimate the total sales for all four stores.

• As always, the sampling weight of unit i is the reciprocal of the probability of selection: wi = 1/πi
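
To make the weighting concrete, the sketch below computes the weight and the weighted estimate of the population total for each of the four possible one-store samples, using the values from the population table. It then checks that the estimator is unbiased and, as an added comparison that is not in the slides, computes the variance that would result from selecting one store with equal probability 1/4. All variable names are illustrative.

```python
from fractions import Fraction

# Selection probabilities and store totals (in thousands) from the table.
psi = {"A": Fraction(1, 16), "B": Fraction(2, 16),
       "C": Fraction(3, 16), "D": Fraction(10, 16)}
t = {"A": 11, "B": 20, "C": 24, "D": 245}
true_total = sum(t.values())  # 300

# Sampling weight: reciprocal of the selection probability.
weights = {s: 1 / psi[s] for s in psi}

# Estimate of the population total if store s is the one sampled.
estimates = {s: weights[s] * t[s] for s in psi}
for s in psi:
    print(s, "weight", weights[s], "estimate", estimates[s])
# A weight 16 estimate 176
# B weight 8 estimate 160
# C weight 16/3 estimate 128
# D weight 8/5 estimate 392

# Expected value and variance over the four possible samples.
expected = sum(psi[s] * estimates[s] for s in psi)
variance = sum(psi[s] * (estimates[s] - true_total) ** 2 for s in psi)
print(expected, variance)  # 300 14248

# Comparison: equal probabilities (1/4 each), so every weight is 4.
equal_variance = sum(Fraction(1, 4) * (4 * t[s] - true_total) ** 2 for s in t)
print(equal_variance)  # 154488
```

Both designs are unbiased for the total of 300, but in this population the size-proportional design gives a much smaller variance because store size tracks sales closely.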

Unequal-probability sampling without replacement

• Generally, sampling with replacement is less efficient than sampling without replacement; with-replacement sampling is introduced first because of the ease in selecting and analyzing samples.

• Nevertheless, in large surveys with many small strata, the inefficiencies may wipe out the gains in convenience. Much research has been done on unequal-probability sampling without replacement; the theory is more complicated because the probability that a unit is selected is different for the first unit chosen than for the second, third and subsequent units. When you understand the probabilistic arguments involved, however, you can find the properties of any sampling scheme.

Example

• The supermarket example discussed above can be used to illustrate some of the features of unequal-probability sampling without replacement. Here is the population again.

Store   Size (m²)   ϒi      ti (in thousands)
A       100         1/16    11
B       200         2/16    20
C       300         3/16    24
D       1000        10/16   245
Total   1600        1       300

• Let’s select two psus without replacement and with unequal probabilities. As we discussed above, ϒi = P(select unit i on first draw).

• Since we are sampling without replacement, though, the probability that unit j is selected on the second draw depends on which unit was selected on the first draw.

• One way to select the units with unequal probabilities is to use ϒi as the probability of selecting unit i on the first draw, and then adjust the probabilities of selecting the other stores on the second draw. If store A was chosen on the first draw, then to select the second store we would spin the wheel while blocking out the section for store A, or shuffle the deck and redeal without card 1. Thus, for example, the probability that store B is chosen on the second draw, given that store A was chosen on the first, is (2/16)/(1 − 1/16) = 2/15.

Rest of the solution
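
The slides carrying the remaining calculations are not reproduced in this transcript. As a hedged sketch of how they could proceed, the code below assumes that the second-draw probabilities are the first-draw probabilities rescaled over the stores that remain, which is what redealing without the chosen store's cards implies. The function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# First-draw selection probabilities from the population table.
psi = {"A": Fraction(1, 16), "B": Fraction(2, 16),
       "C": Fraction(3, 16), "D": Fraction(10, 16)}

def second_draw_prob(first, second):
    """P(second chosen on draw 2 | first chosen on draw 1), assuming the
    remaining stores keep their relative chances."""
    return psi[second] / (1 - psi[first])

# Probability of each ordered pair (store on draw 1, store on draw 2).
ordered = {(i, j): psi[i] * second_draw_prob(i, j)
           for i, j in permutations(psi, 2)}

# Inclusion probability of store i: chance that it appears in the sample
# of two, on either draw.
inclusion = {i: sum(p for pair, p in ordered.items() if i in pair)
             for i in psi}

print(second_draw_prob("A", "B"))  # 2/15
for store, prob in inclusion.items():
    print(store, float(prob))

# Sanity checks: ordered-pair probabilities sum to 1, and inclusion
# probabilities sum to the sample size, 2.
print(sum(ordered.values()), sum(inclusion.values()))
```

Note that under this scheme the inclusion probabilities are no longer exactly proportional to store size, which is one reason the without-replacement theory is more complicated.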

Summary

• Summary of the Designs and Methods

• You will recall that the objective of statistics is to make inferences about a population from information contained in a sample.

• This unit discusses the design of sample surveys and associated methods of inference for populations containing a finite number of elements.

• Practical examples have been selected primarily from the fields of business and the social sciences where finite populations of human responses are frequently the target of surveys.

• Natural resource management examples are also included.

Summary

• The method of inference employed for most sample surveys is estimation. Thus we consider appropriate estimators for population parameters and the associated two-standard-deviation bound on the error of estimation.

• In repeated sampling the error of estimation will be less than its bound, with probability approximately equal to 0.95.

• Equivalently, we construct confidence intervals that, in repeated sampling, enclose the true population parameter approximately 95 times out of 100.
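
As an illustration of the two-standard-deviation bound, the sketch below estimates a population mean from a simple random sample and computes the bound as twice the estimated standard error, including the finite population correction. The data values, the population size, and the function name are hypothetical.

```python
import math
import statistics

def srs_mean_with_bound(sample, population_size):
    """Return the sample mean and the two-standard-deviation bound on the
    error of estimation for a simple random sample without replacement."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample)   # sample variance, divisor n - 1
    fpc = 1 - n / population_size      # finite population correction
    bound = 2 * math.sqrt(fpc * s2 / n)
    return mean, bound

# Hypothetical measurements on n = 5 elements drawn from N = 100.
sample = [12.0, 15.0, 11.0, 14.0, 13.0]
mean, bound = srs_mean_with_bound(sample, population_size=100)

# The interval mean ± bound is the approximate 95% confidence interval.
print(f"estimate: {mean:.2f}, bound: {bound:.2f}")
print(f"interval: ({mean - bound:.2f}, {mean + bound:.2f})")
```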

• The quantity of information pertinent to a given parameter is measured by the bound on the error of estimation.

• The first segment, presented in initial discussions, reviews the objective of statistics and points to the peculiarities of problems arising in the social sciences, business and natural resource management that make them different from the traditional type of experiment conducted in the laboratory.

• The basic sample survey design, simple random sampling, is presented first.

• For this design the sample is selected so that every sample of size n in the population has an equal chance of being chosen. The design does not make a specific attempt to reduce the cost of the desired quantity of information. It is the most basic type of sample survey design, and all other designs are compared with it.

• The second type of design, stratified random sampling, divides the population into homogeneous groups called strata. This procedure usually produces an estimator that possesses a smaller variance than can be acquired by simple random sampling.

• Thus the cost of the survey can be reduced by selecting fewer elements to achieve an equivalent bound on the error of estimation.

• The third type of sample survey design is systematic sampling, which is usually applied to population elements that are available in a list or line, such as names on file cards in a drawer or people coming out of a factory. A random starting point is selected and then every kth element thereafter is sampled.

• Systematic sampling is frequently conducted when collecting a simple random or a stratified random sample is extremely costly or impossible.

• Once again, the reduction in survey cost is primarily associated with the cost of collecting the sample.
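
A 1-in-k systematic sample is straightforward to sketch in code. The example below assumes the frame is an ordered list and that the random start is chosen uniformly among the first k positions; the frame contents and function name are hypothetical.

```python
import random

def systematic_sample(frame, k, rng=random):
    """Select a random start among the first k elements, then take every
    kth element of the ordered frame after it."""
    start = rng.randrange(k)   # integer in 0, 1, ..., k - 1
    return frame[start::k]

# Hypothetical frame: 100 file cards labeled 1..100, sampled 1-in-10.
frame = list(range(1, 101))
sample = systematic_sample(frame, k=10, rng=random.Random(30))
print(sample)
```

Only one random number is needed, which is part of why the method is cheap to carry out in the field.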

• The fourth type of sample survey design is cluster sampling.

• Cluster sampling may reduce cost because each sampling unit is a collection of elements usually selected so as to be physically close together.

• Cluster sampling is most often used when a frame that lists all population elements is not available or when travel costs from element to element are considerable.

• Cluster sampling reduces the cost of the survey primarily by reducing the cost of collecting the data.

• A discussion of ratio, regression, and difference estimators, which utilize information on an auxiliary variable, is covered in the third segment of the material.

• The ratio estimator illustrates how additional information, frequently acquired at little cost, can be used to reduce the variance of the estimator and, consequently, reduce the overall cost of a survey.

• It also suggests the possibility of acquiring more sophisticated estimators by using information on more than one auxiliary variable.
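
The ratio estimator of a population total can be sketched as follows, assuming a simple random sample in which both the response y and an auxiliary variable x are measured, and assuming the population total of x is known. The numbers and the function name are hypothetical.

```python
def ratio_estimate_total(y, x, x_total):
    """Ratio estimate of the population total of y: the sample ratio
    sum(y) / sum(x) multiplied by the known population total of x."""
    if len(y) != len(x):
        raise ValueError("y and x must be measured on the same elements")
    r = sum(y) / sum(x)   # estimated ratio of y to x
    return r * x_total

# Hypothetical sample of three elements; x_total is assumed known
# from an external source such as a previous census.
y = [10.0, 20.0, 30.0]
x = [5.0, 10.0, 15.0]
print(ratio_estimate_total(y, x, x_total=100.0))  # 200.0
```

The estimator gains precision over the simple expansion estimator when y is roughly proportional to x, which is the same idea that motivates selecting psus with probability proportional to size.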

• This unit on ratio estimation naturally follows the discussion of simple random sampling (SRS) in the previous unit. That is, you can take a measurement of y, the response of interest, for each element of the SRS and utilize the traditional estimators.

• To summarize, we have presented various elementary sample survey designs along with their associated methods of inference.

• Treatment of the topics has been directed towards practical applications so that you can see how sample survey design can be employed to make inferences at minimum cost when sampling from finite social, business or natural resource populations.