sampling techniques dr. shaik shaffi ahamed ph.d., assistant professor department of family &...

Post on 06-Jan-2018



Sampling Techniques

Dr. Shaik Shaffi Ahamed, Ph.D., Assistant Professor, Department of Family & Community Medicine, College of Medicine, King Saud University

Why should we take a sample? Can't we study the whole population?

It is possible, depending on the objective:
– to know how many people live in a country
– age and sex categories
– the changing pattern of the age structure
– when planning for a country: CENSUS
– deaths in a hospital: record all the deaths

It is not possible:
– to test the life of bulbs: every bulb would have to burn until it fails
– to count RBCs in blood: all the blood would have to be drawn and counted
– to count the stars in the sky

It is not necessary:
– to estimate Hb% in blood: a drop of blood is enough, because blood from any part of the body gives the same result

Populations and Sampling

Reasons for using samples

There are many good reasons for studying a sample instead of an entire population:

– Samples can be studied more quickly than populations. Speed can be important if a physician needs to determine something quickly, such as a vaccine or treatment for a new disease.
– A study of a sample is less expensive than a study of an entire population because a smaller number of items or subjects are examined. This consideration is especially important in the design of large studies that require a long follow-up.
– A study of the entire population is impossible in most situations.
– Sample results are often more accurate than results based on a population, because measurements can be made more carefully on a smaller number of subjects.

Sampling in Epidemiology

Why sample?
– Unable to study all members of a population
– Reduce bias
– Save time and money
– Measurements may be better in a sample than in the entire population
– Feasibility

Sampling

Sampling is the process or technique of selecting a sample of appropriate characteristics and adequate size.


Terminology

Study Population

• A population may be defined as an aggregate of all things / units possessing a common trait or characteristic.

• The whole collection of units (“the universe”).


Terminology – Cont.

Target (Study) Population

• The population that possesses a characteristic (parameter) which we wish to estimate or about which we wish to draw conclusions.

• The population to which you expect the eventual results of the research to apply (target of inference).

• It may be real or hypothetical.


Sample
• A selected subset of the study population.
• Chosen by some process (e.g. sampling) with the objective of investigating a particular characteristic (parameter) of the study population.

Sampling
• Process of obtaining a sample from the target population.

Terminology – Cont.

Sampling Frame
• This is the complete list of sampling units in the target population to be subjected to the sampling procedure.
• Completeness and accuracy of this list are essential for the success of the study.

Sampling Units
• These are the individual units / entities that make up the frame, just as elements are the entities that make up the population.

Terminology – Cont.

Sampling Error
• This arises out of random sampling and is the discrepancy between sample values and the population value.

Sampling Variation
• Due to infinite variations among individuals and their surrounding conditions.
• Produces differences among samples from the same population and is due to chance.

Terminology – Cont.

If we repeat the same study under exactly similar conditions, we will not necessarily get identical results.

Example: In a clinical trial of 200 patients we find that the efficacy of a particular drug is 75%.

If we repeat the study using the same drug in another group of 200 similar patients, we will not necessarily get the same efficacy of 75%. It could be 78% or 71%.

“Different results from different trials, though all of them are conducted under the same conditions”
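The point can be illustrated with a small simulation. The sketch below assumes the true efficacy really is 75% and repeats a 200-patient trial five times; the function name `observed_efficacy` and the seed are illustrative choices, not part of the original slides.

```python
import random

def observed_efficacy(true_efficacy, n_patients, rng):
    """Simulate one trial and return the observed proportion of responders."""
    responders = sum(rng.random() < true_efficacy for _ in range(n_patients))
    return responders / n_patients

rng = random.Random(2018)  # fixed seed so the run is repeatable (arbitrary choice)

# Five repetitions of the "same" trial: 200 similar patients, assumed true efficacy 75%.
results = [observed_efficacy(0.75, 200, rng) for _ in range(5)]
print([f"{r:.1%}" for r in results])
```

Each run of the trial gives a slightly different observed efficacy even though nothing about the drug or the conditions has changed, which is what sampling variation means.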

Example: If two drugs have the same efficacy, then the difference between the cure rates of these two drugs should be zero.

But in practice we may not get a difference of zero.

If we find the difference is small, say 2%, 3%, or 5%, we may accept the hypothesis that the two drugs are equally effective.

On the other hand, if we find the difference to be large, say 25%, we would infer that it is too large to be due to chance and conclude that the drugs are not equally effective.

Example: Suppose we are testing the claim of a pharmaceutical company that the efficacy of a particular drug is 80%.

We may accept the company's claim if we observe the efficacy in the trial to be 78%, 81%, 83% or 77%.

But if the efficacy in the trial happens to be 50%, we would have good cause to feel that the true efficacy cannot be 80%.

If the true efficacy were 80%, the chance of observing a result as low as 50% would be very small. We then tend to dismiss the claim that the efficacy of the drug is 80%.
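How small that chance is can be worked out from the binomial distribution. The sketch below is only an illustration: the slide does not give a trial size for this example, so 200 patients is an assumed figure, and `prob_at_most` is a helper written for this note.

```python
from math import comb

def prob_at_most(k, n, p):
    """Exact binomial probability of observing k or fewer successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

N_PATIENTS = 200   # assumed trial size, not stated in the slide for this example
CLAIMED = 0.80     # the company's claimed efficacy

p_50 = prob_at_most(100, N_PATIENTS, CLAIMED)  # 50% or fewer patients respond
p_78 = prob_at_most(156, N_PATIENTS, CLAIMED)  # 78% or fewer patients respond
print(f"P(observed <= 50% | true 80%) = {p_50:.3g}")
print(f"P(observed <= 78% | true 80%) = {p_78:.3g}")
```

Under these assumptions an observed efficacy of 78% or lower is quite plausible, while one of 50% or lower is vanishingly unlikely, which matches the reasoning in the example.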

THEREFORE, “WHILE TAKING DECISIONS BASED ON EXPERIMENTAL DATA WE MUST GIVE SOME ALLOWANCE FOR SAMPLING VARIATION”.

“VARIATION BETWEEN ONE SAMPLE AND ANOTHER SAMPLE IS KNOWN AS SAMPLING VARIATION”.

Study Participants

Subjects that are actually participating in the study.

Subset of study population that were contactable and consented / agreed to participate.

Terminology – Cont.

Study Participants - Cont.

Study participants may still not be representative of the target population, even with random sampling, because of:
– An out-of-date sampling frame.
– Failure to recruit eligible subjects.
– Non-consent or non-response.
– Drop-out / withdrawal.

Decisions required for selecting a sample

1. Specify what the target population is. This is entirely determined by the research objective.

2. Specify what the study population is (e.g. who is eligible for inclusion in the study).

3. Select a sampling design for obtaining a sample for study.

4. Devise a strategy to ensure a high response or participation rate; otherwise inference must take account of non-responses.

These decisions will have considerable impact on study validity (soundness of the conclusion or inference made).

Study populations and sampling summarized schematically

[Figure: Target population (real or hypothetical) → select based on judgment and accessibility → Study population → probability sampling → Sample → consent or respond → Participants in study]

How to sample?

In general, there are 2 requirements:

1. A sampling frame must be available; otherwise construct one or use special sampling techniques. Frame construction may not be easy.

2. Choose an appropriate sampling method to draw a sample from the frame.

The Sampling Design Process

Define the Population

Determine the Sampling Frame

Select Sampling Technique(s)

Determine the Sample Size

Execute the Sampling Process

Classification of Sampling Techniques

Sampling techniques
– Non-probability sampling techniques: convenience sampling, judgmental sampling, quota sampling, snowball sampling
– Probability sampling techniques: simple random sampling, systematic sampling, stratified sampling, cluster sampling, other sampling techniques

Simple Random Sampling

A sample may be defined as random if every sampling unit in the study population has an equal chance of being selected.

Selection of an SRS may be done by:
– Drawing the number or name from a hat or box.
– Using a random number table.
– Using a computer to generate the numbers.

SRS Methods

– Lottery method
– Random number table method

Tables of random numbers are used after numbers have been assigned to the members of the study population. Use the random number table to select subjects. Start anywhere. Continue selecting until the desired sample size is reached.

Random number table

1      2      3      4      5
49486  93775  88744  80091  92732
94860  36746  04571  13150  65383
10169  95685  47585  53247  60900
12018  45351  15671  23026  55344
45611  71585  61487  87434  07498
89137  30984  18842  69619  53872
94541  12057  30771  19598  96069
89920  28843  87599  30181  26839
32472  32796  15255  39636  90819

How to select a simple random sample

1. Define the population
2. Determine the desired sample size
3. List all members of the population or the potential subjects

For example:
– 4th grade boys who have demonstrated problem behaviors
– Let's select 10

Potential Subject Pool

1. Robert 2. Ralph 3. John 4. Andy 5. Joel 6. Thomas 7. Cooper 8. Maurice 9. Terry 10. Carl

11. Ken 12. Wilmer 13. Alan 14. Kevin 15. James 16. Henry 17. Don 18. Walt 19. Doug 20. George

21. Steve 22. Larry 23. Rick 24. Bruce 25. Clyde 26. Sam 27. Kent 28. Travis 29. Woody 30. Brian

So our selected subjects are numbers 10, 22, 24, 15, 6, 1, 25, 11, 13, & 16.

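The "use a computer to generate the numbers" option can be sketched in a few lines. This is a minimal illustration using Python's standard `random` module; the seed is arbitrary, so the subjects it picks will not match the ten numbers read from the random number table above.

```python
import random

# The potential subject pool from the slide, in list order (position 1 = Robert).
pool = [
    "Robert", "Ralph", "John", "Andy", "Joel", "Thomas", "Cooper", "Maurice",
    "Terry", "Carl", "Ken", "Wilmer", "Alan", "Kevin", "James", "Henry", "Don",
    "Walt", "Doug", "George", "Steve", "Larry", "Rick", "Bruce", "Clyde", "Sam",
    "Kent", "Travis", "Woody", "Brian",
]

rng = random.Random(42)  # arbitrary seed, only to make the draw repeatable

# Draw 10 distinct subject numbers; every subject has the same chance of selection.
selected_numbers = rng.sample(range(1, len(pool) + 1), k=10)
selected = [(number, pool[number - 1]) for number in selected_numbers]

for number, name in selected:
    print(number, name)
```

`random.sample` draws without replacement, so no boy can be selected twice.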

Simple random sampling
– Estimate hemoglobin levels in patients with sickle cell anemia
1. Determine sample size
2. Obtain a list of all patients with sickle cell anemia in a hospital or clinic
3. Patient is the sampling unit
4. Use the lottery method or a table of random numbers to select units from the sampling frame
5. Measure hemoglobin in all selected patients
6. Calculate mean and standard deviation of the sample

Simple random sampling
– Advantages
  Simple process and easy to understand
  Easy calculation of means and variance
– Disadvantages
  Not the most efficient method, that is, not the most precise estimate for the cost
  Requires knowledge of the complete sampling frame
  Cannot always be certain that there is an equal chance of selection
  Non-respondents or refusals

Systematic sampling
– The sampling units are spaced regularly throughout the sampling frame, e.g., every 3rd unit would be selected
– May be used as either a probability sample or not
  Not a probability sample unless the starting point is randomly selected
  Non-random sample if the starting point is determined by some mechanism other than chance

Systematic Sampling

The sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame.

The sampling interval, i, is determined by dividing the population size N by the sample size n and rounding to the nearest integer.

For example, there are 100,000 elements in the population and a sample of 1,000 is desired. In this case the sampling interval, i, is 100. A random number between 1 and 100 is selected. If, for example, this number is 23, the sample consists of elements 23, 123, 223, 323, 423, 523, and so on.

Example

If a systematic sample of 500 students were to be carried out in a university with an enrolled population of 10,000, the sampling interval would be:

I = N/n = 10,000/500 = 20

All students would be assigned sequential numbers. The starting point would be chosen by selecting a random number between 1 and 20. If this number was 9, then the 9th student on the list of students would be selected along with every following 20th student. The sample of students would be those corresponding to student numbers 9, 29, 49, 69, ........ 9929, 9949, 9969 and 9989.
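The two worked examples can be reproduced with a short sketch. The function name `systematic_sample` and its parameters are illustrative; the start is passed in explicitly here so the output matches the numbers in the text, and is drawn at random when omitted.

```python
import random

def systematic_sample(population_size, sample_size, start=None, rng=random):
    """Return the selected unit numbers (1-based) for a systematic sample."""
    interval = round(population_size / sample_size)  # sampling interval i = N / n
    if start is None:
        start = rng.randint(1, interval)  # random start between 1 and i
    selected = range(start, population_size + 1, interval)
    return list(selected)[:sample_size]

# University example: N = 10,000, n = 500, random start happened to be 9.
students = systematic_sample(10_000, 500, start=9)
print(students[:4], "...", students[-4:])

# Earlier example: N = 100,000, n = 1,000, random start happened to be 23.
elements = systematic_sample(100_000, 1_000, start=23)
print(elements[:6])
```

Only the starting point is random; once it is fixed, every other selected unit is determined by the interval.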

Systematic Sampling

• Decide on sample size: n

• Divide the population of N individuals into groups of k individuals: k = N/n

• Randomly select one individual from the 1st group.

• Select every k-th individual thereafter.

[Figure: N = 64, n = 8, k = 8; one individual is selected at random from the first group of 8]

Systematic sampling
– Advantages
  Sampling frame does not need to be defined in advance
  Easier to implement in the field
  If there are unrecognized trends in the sampling frame, a systematic sample ensures coverage of the spectrum of units
– Disadvantages
  Variance cannot be estimated unless assumptions are made

Stratified Sampling

A two-step process in which the population is partitioned into subpopulations, or strata.

The strata should be mutually exclusive and collectively exhaustive, in that every population element should be assigned to one and only one stratum and no population elements should be omitted.

Next, elements are selected from each stratum by a random procedure, usually SRS.

A major objective of stratified sampling is to increase precision without increasing cost.

Stratified random sample
– The sampling frame comprises groups, or strata, with certain characteristics
– A sample of units is selected from each group or stratum

Stratified random sample
– Assess dietary intake in adolescents
1. Define three age groups: 11-13, 14-16, 17-19
2. Stratify age groups by sex
3. Obtain a list of children in this age range from schools
4. Randomly select children from each of the 6 strata until the sample size is obtained
5. Measure dietary intake
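The steps above can be sketched as follows. The school roster is invented for illustration (30 children per age and sex combination), as are the names `stratified_sample` and `age_group` and the choice of 10 children per stratum.

```python
import random
from collections import defaultdict

def age_group(age):
    """Map an age in years to one of the three age groups from step 1."""
    if 11 <= age <= 13:
        return "11-13"
    if 14 <= age <= 16:
        return "14-16"
    return "17-19"

def stratified_sample(units, stratum_of, per_stratum, rng):
    """Group units into strata, then take a simple random sample within each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {s: rng.sample(members, per_stratum) for s, members in strata.items()}

# Hypothetical roster obtained from schools: 30 children per age and sex.
roster = [
    {"id": f"{age}{sex}{i:02d}", "age": age, "sex": sex}
    for age in range(11, 20) for sex in "MF" for i in range(30)
]

rng = random.Random(7)  # arbitrary seed for a repeatable draw
sample = stratified_sample(
    roster, lambda c: (age_group(c["age"]), c["sex"]), per_stratum=10, rng=rng
)
for stratum, children in sorted(sample.items()):
    print(stratum, len(children))
```

Sampling the same number from every stratum is one possible design; allocation proportional to stratum size is another.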

[Figure: Stratified random selection for a drug trial in hypertension, with strata Mild, Moderate and Severe]

Stratified random sample
– Advantages
  Assures that certain subgroups are represented in a sample
  Allows the investigator to estimate parameters in different strata
  More precise estimates of the parameters because strata are more homogeneous, e.g., smaller variance within strata
  Strata of interest can be sampled most intensively, e.g., groups with greatest variance
  Administrative advantages
– Disadvantages
  Loss of precision if a small number of units is sampled from strata

Cluster Sampling

The population is first divided into mutually exclusive groups of elements called clusters.

Ideally, each cluster is a representative small-scale version of the population (i.e. a heterogeneous group).

A simple random sample of the clusters is then taken.

All elements within each sampled (chosen) cluster form the sample. Elements within a cluster should be as heterogeneous as possible, but the clusters themselves should be as similar to one another as possible.

Cluster sampling
– Estimate the prevalence of dental caries in school children
1. Among the schools in the catchment area, list all of the classrooms in each school
2. Take a simple random sample of classrooms, or clusters of children
3. Examine all children in a cluster for dental caries
4. Estimate the prevalence of caries within clusters, then combine into an overall estimate, with variance
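A minimal sketch of these four steps, on invented data: 40 hypothetical classrooms of 25 children each, of which 8 are sampled. The variance line uses the standard one-stage formula for equal-sized clusters with a finite population correction; that choice is an assumption of this sketch, since the slide does not say which estimator is meant.

```python
import random
from statistics import variance

rng = random.Random(11)  # arbitrary seed for a repeatable example

# Hypothetical frame: 40 classrooms of 25 children; True marks a child with caries.
classrooms = {
    f"class_{i:02d}": [rng.random() < 0.3 for _ in range(25)] for i in range(40)
}

# Step 2: simple random sample of classrooms (the clusters).
sampled = rng.sample(sorted(classrooms), k=8)

# Steps 3 and 4: examine every child in each sampled classroom.
per_cluster = {c: sum(classrooms[c]) / len(classrooms[c]) for c in sampled}
examined = [child for c in sampled for child in classrooms[c]]
overall = sum(examined) / len(examined)

# Variance of the overall estimate, assuming equal-sized clusters:
# (1 - m/M) * (between-cluster variance of the cluster prevalences) / m
m, M = len(sampled), len(classrooms)
var_overall = (1 - m / M) * variance(list(per_cluster.values())) / m

print(f"overall prevalence = {overall:.3f}, standard error = {var_overall ** 0.5:.3f}")
```

Only the list of classrooms has to be known in advance; children are enumerated only inside the sampled classrooms.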

Cluster sampling
– Advantages
  The entire sampling frame need not be enumerated in advance, just the clusters once identified
  More economical in terms of resources than simple random sampling
– Disadvantages
  Loss of precision, i.e., larger variance, but this can be offset by sampling a larger number of clusters

Multistage Sampling

Similar to cluster sampling except that there are two sampling events, instead of one:
– Primary units are randomly selected
– Individual units within primary units are randomly selected for measurement

Multi–Stage Sampling

This sampling method is actually a combination of the basic sampling methods carried out in stages.

The aim is to subdivide the population into progressively smaller units by random sampling at each stage.

Multistage sampling
– Estimate the prevalence of dental caries in school children
1. Among the schools in the catchment area, list all of the classrooms in each school
2. Take a simple random sample of classrooms, or clusters of children
3. Enumerate the children in each selected classroom
4. Take a simple random sample of children within the classroom
5. Examine the selected children in each classroom for dental caries
6. Estimate the prevalence of caries within clusters, then combine into an overall estimate, with variance
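The two sampling events can be sketched as below. The classroom lists are invented for illustration, and `two_stage_sample` is a name chosen for this note.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: SRS of clusters. Stage 2: SRS of units within each chosen cluster."""
    chosen = rng.sample(sorted(clusters), k=n_clusters)
    return {
        c: rng.sample(clusters[c], k=min(n_per_cluster, len(clusters[c])))
        for c in chosen
    }

# Hypothetical frame: 40 classrooms, each with an enumerated list of 25 children.
classrooms = {
    f"class_{i:02d}": [f"child_{i:02d}_{j:02d}" for j in range(25)]
    for i in range(40)
}

rng = random.Random(3)  # arbitrary seed for a repeatable draw
sample = two_stage_sample(classrooms, n_clusters=8, n_per_cluster=10, rng=rng)
for classroom, children in sample.items():
    print(classroom, len(children))
```

Compared with one-stage cluster sampling, only a subset of children in each selected classroom is examined.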


Sampling Methods: Non-probability samples

Convenience Sampling

Convenience sampling attempts to obtain a sample of convenient elements. Often, respondents are selected because they happen to be in the right place at the right time.

– use of students, and members of social organizations
– mall intercept interviews without qualifying the respondents
– department stores using charge account lists
– “people on the street” interviews

Convenience sample
– Case series of patients with a particular condition at a certain hospital
– “Normal” graduate students walking down the hall are asked to donate blood for a study
– Children with febrile seizures reporting to an emergency room

Investigator decides who is enrolled in a study

Judgmental Sampling

Judgmental sampling is a form of convenience sampling in which the population elements are selected based on the judgment of the researcher.
– It involves hand-picking from the accessible population those individuals judged most appropriate for the study.

Quota Sampling

Quota sampling may be viewed as two-stage restricted judgmental sampling.
– The first stage consists of developing control categories, or quotas, of population elements.
– In the second stage, sample elements are selected based on convenience or judgment.

Control characteristic | Population composition (%) | Sample composition (%) | Sample composition (number)
Sex: Male              | 48                         | 48                     | 480
Sex: Female            | 52                         | 52                     | 520
Total                  | 100                        | 100                    | 1000
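The two stages can be sketched in code. The first function reproduces the numbers in the table (48% and 52% of a sample of 1000); the second fills the quotas from whoever turns up, which stands in for selection by convenience. Both function names and the tiny respondent stream are invented for illustration.

```python
def quota_counts(population_pct, sample_size):
    """Stage 1: turn population percentages into the number needed per category."""
    return {cat: round(pct * sample_size / 100) for cat, pct in population_pct.items()}

def fill_quotas(respondents, quotas):
    """Stage 2: accept respondents in arrival order until each quota is full."""
    accepted = {cat: [] for cat in quotas}
    for person, cat in respondents:
        if cat in accepted and len(accepted[cat]) < quotas[cat]:
            accepted[cat].append(person)
    return accepted

quotas = quota_counts({"Male": 48, "Female": 52}, sample_size=1000)
print(quotas)

# Tiny hypothetical stream of passers-by with quotas of 2 men and 1 woman.
stream = [("a", "Male"), ("b", "Male"), ("c", "Female"), ("d", "Male")]
print(fill_quotas(stream, {"Male": 2, "Female": 1}))
```

The sample matches the population on the control characteristic, but within each category nothing is random, which is why quota sampling remains a non-probability method.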

Snowball Sampling

In snowball sampling, an initial group of respondents is selected, usually at random.

– After being interviewed, these respondents are asked to identify others who belong to the target population of interest.

– Subsequent respondents are selected based on the referrals.
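The referral process can be sketched as a walk over a referral network. The network below is made up, and `snowball_sample` is an illustrative name; in a real study the referrals are only learned as each respondent is interviewed.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Interview seeds first, then the people they name, wave by wave."""
    sample, queue, seen = [], deque(seeds), set(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)                  # interview this respondent
        for named in referrals.get(person, []):
            if named not in seen:              # skip people already referred
                seen.add(named)
                queue.append(named)
    return sample

# Hypothetical referrals: who each respondent names when interviewed.
referrals = {"A": ["C", "D"], "B": ["D", "E"], "C": ["F"], "E": ["A", "G"]}

wave_sample = snowball_sample(referrals, seeds=["A", "B"], max_size=6)
print(wave_sample)  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Who ends up in the sample depends heavily on the initial respondents and on whom they happen to know.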

Consecutive sample

– A case series of consecutive patients with a condition of interest
– Consecutive series means ALL patients with the condition within the hospital or clinic, not just the patients the investigators happen to know about

Examples:
– Outcome of 1000 consecutive patients presenting to the emergency room with chest pain
– Natural history of all 125 patients with HIV-associated TB during a 5-year period

Explicit efforts must be made to identify and recruit ALL persons with the condition of interest

Sampling Methods: Non-probability samples

Selection depends on the expert's opinion; probabilities of selection are not considered.

Advantages: convenience, speed, and lower cost.

Disadvantages:
– Lack of accuracy
– Lack of generalizability of results

Availability sampling: selecting on the basis of convenience.

Random sampling: every combination of a given size has an equal chance of being chosen.

Cluster sampling: dividing the population into clusters, typically on the basis of geography, and taking a sample of the clusters.

Snowball sampling: asking individuals studied to provide references to others.

Multi-stage sampling: sampling subunits within sampled units.

Stratified sampling: dividing the population into groups on the basis of some characteristic and then sampling each group.

Quota sampling: selecting fixed numbers of units in each of a number of categories.

Systematic sampling: choosing every nth item from a list, beginning at a random point.

Strengths and Weaknesses of Basic Sampling Techniques

Technique | Strengths | Weaknesses

Nonprobability sampling
Convenience sampling | Least expensive, least time-consuming, most convenient | Selection bias, sample not representative, not recommended for descriptive or causal research
Judgmental sampling | Low cost, convenient, not time-consuming | Does not allow generalization, subjective
Quota sampling | Sample can be controlled for certain characteristics | Selection bias, no assurance of representativeness
Snowball sampling | Can estimate rare characteristics | Time-consuming

Probability sampling
Simple random sampling (SRS) | Easily understood, results projectable | Difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness
Systematic sampling | Can increase representativeness, easier to implement than SRS, sampling frame not necessary | Can decrease representativeness
Stratified sampling | Includes all important subpopulations, precision | Difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive
Cluster sampling | Easy to implement, cost effective | Imprecise, difficult to compute and interpret results

Random Selection vs. Random Assignment

– Random Selection = every member of the population has an equal chance of being selected for the sample.

– Random Assignment = every member of the sample (however chosen) has an equal chance of being placed in the experimental group or the control group.

Random assignment allows for individual differences among test participants to be averaged out.

Subject Selection (Random Selection)

Choosing which potential subjects will actually participate in the study

Subject Assignment (Random Assignment)

Deciding which group or condition each subject will be part of

[Figure: Population of 200 8th graders, divided into 40 high IQ students, 120 average IQ students and 40 low IQ students. 30 students are selected from each of the three strata, and each set of 30 is split into Group A (15 students) and Group B (15 students).]
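The figure combines both ideas: random selection within strata, then random assignment to groups. A minimal sketch with made-up student identifiers follows; the seed is arbitrary.

```python
import random

rng = random.Random(5)  # arbitrary seed for a repeatable example

# The population of 200 8th graders from the figure, by IQ stratum.
strata = {
    "high": [f"H{i:03d}" for i in range(40)],
    "average": [f"M{i:03d}" for i in range(120)],
    "low": [f"L{i:03d}" for i in range(40)],
}

groups = {"A": [], "B": []}
for name, students in strata.items():
    chosen = rng.sample(students, k=30)  # random selection: 30 from each stratum
    rng.shuffle(chosen)                  # random assignment: split the 30 by chance
    groups["A"] += chosen[:15]
    groups["B"] += chosen[15:]

print(len(groups["A"]), len(groups["B"]))  # 45 45
```

Selection determines who takes part in the study at all; assignment determines which condition each participant receives.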

Randomization (Random assignment to two treatments)

Randomization:
– tends to produce study groups comparable with respect to known and unknown risk factors,
– removes investigator bias in the allocation of participants,
– and guarantees that statistical tests will have valid significance levels.

The trialist's most powerful weapon against bias

Randomization (Cont)

Simple randomization: toss a coin
– AAABBAAAAABABABBAAAABAA…

Random permuted blocks (block randomization)
– AABB-ABBA-BBAA-BAAB-ABAB-AABB-

Block Randomization

Each block contains all conditions of the experiment in a randomized order.

Example with three blocks of four (E = experimental, C = control):

E, C, C, E
C, E, C, E
E, E, C, C

Experimental group: N = 6
Control group: N = 6
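A permuted-block schedule like the one above can be generated with a short sketch. The function name, the block size of four and the seed are illustrative choices; the actual order produced depends on the seed and will generally differ from the three blocks shown.

```python
import random

def block_randomize(n_blocks, arms=("E", "C"), per_arm=2, rng=random):
    """Build a schedule of blocks, each holding every arm per_arm times in random order."""
    schedule = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm  # e.g. ['E', 'C', 'E', 'C']
        rng.shuffle(block)            # random order within the block
        schedule.append(block)
    return schedule

schedule = block_randomize(3, rng=random.Random(1))
for block in schedule:
    print(", ".join(block))

flat = [arm for block in schedule for arm in block]
print("E:", flat.count("E"), "C:", flat.count("C"))
```

Because each block is balanced, the two groups are of equal size after every completed block, which simple coin tossing does not guarantee.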

Prevalence and risk factors of HIV 1 and HIV 2 infection in Urban and rural areas in TN. Int. J. of STD & AIDS 1998;9:98-103

Objective: Find prevalence and risk factors.
Setting: Centers in metropolitan city & municipality.
Subjects: Individuals in Tamil Nadu.

Sampling Procedure:

“Health camps were organized in 5 urban and 5 rural centers to cover the entire state geographically”

“Every third person screened, in the active reproductive age group, was recruited as a subject. At each camp the inclusion of subjects continued until 200 persons were recruited”

Sex differences in the use of asthma drugs: Cross-sectional study. BMJ 1998; 317: 1434-7

Objective: To assess the use of asthma drugs.
Design: Cross-sectional study.
Setting: Six general practices in East Anglia.
Subjects: Adults aged 20-54 with asthma

Sampling method

Cases with asthma who had received drugs in the preceding year were identified through the database of each participating practice. The sample was stratified into three categories of severity corresponding to the prescribed drugs:

– Bronchodilator alone (mild): 38%
– Steroids (moderate): 57%
– Nebulizer treatment (severe): 5%

SRS was used to select subjects in each practice, based on the proportion of use of each type of drug within the practice.
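Allocating a sample in proportion to stratum shares is simple arithmetic, sketched below. The sample size of 200 is an assumed figure, and the overall percentages from the slide stand in for the shares within a single practice, which the slide does not give; `proportional_allocation` is a name chosen for this note.

```python
def proportional_allocation(shares, total):
    """Split `total` across strata in proportion to `shares` (largest-remainder rounding)."""
    whole = sum(shares.values())
    exact = {k: total * v / whole for k, v in shares.items()}
    alloc = {k: int(x) for k, x in exact.items()}  # round every stratum down first
    leftover = total - sum(alloc.values())
    # Hand the remaining units to the strata with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_fraction[:leftover]:
        alloc[k] += 1
    return alloc

shares = {"mild": 38, "moderate": 57, "severe": 5}  # percentages from the slide
alloc = proportional_allocation(shares, total=200)  # 200 is an assumed sample size
print(alloc)  # {'mild': 76, 'moderate': 114, 'severe': 10}
```

Subjects would then be drawn by SRS within each severity stratum until its allocated number is reached.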

Genital ulcer disease and acquisition of HIV infection. Indian J Med Microbiol 1992; 10(4):265-269

Objective: To find out the association of HIV infection with genital ulcer disease.

Setting: Dept. of STD, GGH, Chennai.

Subjects: Individuals attending the STD dept.

Sampling procedure: “Blood samples from first 20 patients were taken for analysis once a week for 40 weeks”.

Prevalence of serious eye disease and visual impairment in a north London population: Population based, cross sectional study. BMJ 1998; 316:1643-48.

Objective: To estimate the prevalence of eye disorders and of visual impairment.
Design: Cross-sectional survey.
Setting: General practices in a metropolitan area in England.
Subjects: People aged 65 or older and registered with a general practice.

Sampling Procedure
1. 17 general practice groups
2. Random sampling: 7 were selected
3. People aged 65 or older registered with the general practices: 750-850 in total in each general practice
4. One third in each practice were selected to form the survey sample, using SRS to select eligible people in each practice

Example

A medical student in a city in South Africa conducted a survey to measure the prevalence of HIV in his village. He used simple random sampling to select the subjects. At the end of his study, he was able to estimate the prevalence in the general population of the village. However, he was not able to calculate the prevalence of HIV in some subgroups, such as homosexuals, due to the absence of this subgroup from his sample. So, to guarantee the presence of such a rare group, what kind of sampling should he have used?

A. Systematic random sample.
B. Cluster sample.
C. Multi-staged sample.
D. Stratified random sample.
E. None of the above.

Example

A post-graduate trainee of family medicine was assigned a project to evaluate the effect of teachers' smoking on students' behavior. He presented the following scenario as an explanation of his method of subject selection:

“Out of 400 schools in Riyadh 30 schools were selected randomly and then all subjects (teachers) in each selected school will be included in the study”

The type of sampling method is:
A. Multi-staged sample
B. Cluster sample
C. Simple random sample
D. Stratified random sample
E. None of the above

Example

Stratified random sample:

A. Makes use of random number tables
B. Is one type of non-random sample
C. Divides the population into groups or clusters according to a characteristic of interest
D. Takes all units in some clusters
E. Increases precision
