chapter 6: cluster sampling design 1 - iowa state...

Chapter 6: Cluster sampling design 1

Jae-Kwang Kim

Iowa State University

Fall, 2014

Kim (ISU) Ch. 6: Cluster sampling design 1 Fall, 2014 1 / 26

Introduction

1 Introduction

2 Single-Stage Cluster Sampling : Equal size case

3 Single-stege cluster sampling: General case


Introduction

Setup:1 Frame list clusters (disjoint groups of population elements)2 Select clusters by a probability sampling.

Why cluster sampling?1 Frame inadequacy: No way to get a list of elements (very expensive)

but relatively easy or cheap to list clusters2 Convenience: By grouping elements into “close” subgroups, you can

save time or money


Introduction

Single stage cluster sampling: Sample clusters and observe allelements in each selected cluster.

Two-stage sampling:

[Stage 1] Population is divided into clusters, called Primary SamplingUnits (PSU’s). A probability sample of PSU’s is drawn.[Stage 2] Each selected PSU is divided into clusters or elements, calledSecondary Sampling Units (SSU’s). A probability sample of SSU’s isdrawn in each selected PSU.

If SSU=cluster, it is called two-stage cluster sampling.If SSU=element, it is called two-stage element sampling

Multi-stage sampling: PSU, SSU, ..., USU (Ultimate Sampling Unit)If USU=cluster, it is called multi-stage cluster sampling.If USU=element, it is called multi-stage element sampling.Probably, don’t know N. So, estimation of yU is more difficult.


Introduction

Notation (Population level)

UI = {1, · · · ,NI}: index set of clusters in the population

Ui : the set of elements in the i-th cluster of size Mi ,(i = 1, 2, · · · ,NI )

yij : measurement of item y at the j-th element (j = 1, 2, · · · ,Mi ) incluster i , i = 1, 2, · · · ,NI .

Population total: Y =∑NI

i=1

∑Mij=1 yij =

∑NIi=1 Yi =

∑NIi=1Mi Yi where

Yi =∑Mi

j=1 yij = Mi Yi

Population size: N =∑NI

i=1Mi


Introduction

Notation (Sample level)

AI : index set of clusters in the sample

nI = |AI |: the number of sampled clusters

A = ∪i∈AIUi : index set of elements in the sample

nA = |A| =∑

i∈AIMi : the number of sampled elements

Usually, nA is not fixed even if nI is fixed.


Single-Stage Cluster Sampling : Equal size case

1 Introduction






Mi are all equal (denoted by M = Mi ).

Sampling design: Simple random cluster sampling1 Simple random sampling of nI clusters from NI clusters.2 Observe all the elements in the selected clusters.

Estimation of mean:

ˆYU =YHT

NIM=

1

nI

1

M

∑i∈AI

M∑j=1

yij =1

nI

∑i∈AI

Yi

Variance

Var( ˆYU) =1

nI

(1− nI

NI

)1

NI − 1

NI∑i=1

(Yi − YU

)2where Yi =

∑Mj=1 yij/M and YU = Y /(NIM) =

∑NIi=1 Yi/NI .



ANOVA

Source D.F. Sum of Squares Mean S.S.Between cluster NI − 1 SSB S2

b

Within cluster NI (M − 1) SSW S2w

Total NIM − 1 SST S2

where

SSB =

NI∑i=1

M(Yi − YU

)2SSW =

NI∑i=1

M∑j=1

(yij − Yi

)2SST =

NI∑i=1

M∑j=1

(yij − YU

)2and MSS=SS/(d.f.).



Note that

S2 =(NI − 1)S2

b + NI (M − 1)S2w

NIM − 1∼=

S2b + (M − 1) S2

w

M.

The variance is

Var(

ˆYU

)=

1

nIM

(1− nI

NI

)S2b .

Note that, under simple random sampling,

VarSRS

(ˆYU

)=

1

nIM

(1− nIM

NIM

)S2,

as n = nIM is the size of the sampled elements.



Design effect

Want to compare the current sampling design p (·) with SRS of equalsample size

Kish (1965) introduced design effect

deff(p, YHT

)=

Vp

(YHT

)VSRS

(YHT

)Usage 1: Compare designs

If deff > 1, then p (·) is less efficient than SRS.If deff < 1, then p (·) is more efficient than SRS.



Design effect (Cont’d)

Usage 2: Determine sample size:1 Have some desired variance V ∗

2 Under SRS, you can easily find required sample size n∗

3 Choose n∗p = deff · n∗.Then,

Vp

(YHT | n∗p

)= V ∗.

n∗ is often called effective sample size. It is the sample size requiredfor the given V ∗ if the sample design is SRS.



Intracluster correlation coefficient

A measure of within cluster homogeneity

Assume Mi = M

Intracluster correlation coefficient

ρ =Cov [yij , yik |j 6= k]√

V (yij)√

V (yik)

=

∑NIi=1

∑j 6=k

∑(yij − Y )(yik − Y )/

∑NIi=1M(M − 1)∑NI

i=1

∑Mj=1(yij − Y )2/

∑NIi=1M



Intracluster correlation coefficient (Cont’d)

Properties1

ρ = 1− M

M − 1

SSW

SST.

= 1− S2w

S2

2 Since 0 ≤ SSW ≤ SST ,

− 1

M − 1≤ ρ ≤ 1.

For SSW = 0, ρ = 1: perfect homogeneity in cluster.For SSW = SST , ρ = −1/(M − 1): perfect heterogeneity in cluster.Each cluster is like the whole population in terms of variability.



Intracluster correlation coefficient (Cont’d)

Variance under simple random cluster (SRC) sampling satisfies

VSRC ( ˆY ).

= VSRS( ˆY )[1 + (M − 1)ρ]

Thus,deff = 1 + (M − 1) ρ.

Effective sample size

n∗ =nA

1 + (M − 1)ρ.

For example, if ρ = 0.1 and M = 11, deff = 2. Thus, even if ρ is low,design effect can be large if M is large.



Estimation of intracluster correlation coefficient

Table: ANOVA table in the sample level (under SRC sampling)

Source D.F. Sum of Squares Mean SSBetween nI − 1 Sample SSB (SSSB) s2b = SSSB/(nI − 1)Within nI (M − 1) Sample SSW (SSSW) s2w = SSSW /{nI (M − 1)}Total nIM − 1 Sample SST (SSST)

Sum of Squares: Sample SST = Sample SSB + Sample SSW

Sample SSB =∑i∈AI

M

Yi − (n−1I

∑k∈AI

Yk)

2

Sample SSW =∑i∈AI

M∑j=1

(yij − Yi )2



Results

E (s2b) =M

NI − 1

NI∑i=1

(Yi − Y

)2= S2

b

E (s2w ) =1

NI (M − 1)

NI∑i=1

M∑j=1

(yij − Yi

)2= S2

w

Estimated intracluster correlation

ρ = 1− s2wσ2y

where

σ2y =(NI − 1)s2b + NI (M − 1)s2w

NIM

.=

1

Ms2b +

(1− 1

M

)s2w


Single-stege cluster sampling: General case

1 Introduction





Single-stage cluster sampling: General case

Meaning1 Draw a probability sample AI from UI via pI (·).2 Observe every elements in each selected clusters.

Cluster inclusion probability

πIi = Pr (i ∈ AI ) =∑

AI ;i∈AI

pI (AI )

πIij = Pr (i , j ∈ AI ) =∑

AI ;i ,j∈AI

pI (AI )

Element inclusion probability

πik = Pr {(ik) ∈ A} = Pr (i ∈ AI ) = πIi where k ∈ Ui

πik,jl = Pr {(ik) ∈ A & (jl) ∈ A} =

{Pr (i ∈ AI ) = πIi if i = jPr (i , j ∈ AI ) = πIij if i 6= j



Single-stage cluster sampling

HT estimation1 Point estimator:

YHT =∑i∈AI

Yi

πIi

2 Variance

Var(YHT

)=∑i∈UI

∑j∈UI

∆IijYi

πIi

Yj

πIj

where ∆Iij = πIij − πIiπIj .3 Variance estimation

V(YHT

)=∑i∈AI

∑j∈AI

Yi

πIi

Yj

πIj

∆Iij

πIij

provided πIij > 0.



Remark

For fixed size design (nAI= nI ),

Var(YHT

)= −1

2

∑i∈UI

∑j∈UI

∆Iij

(Yi

πIi−

Yj

πIj

)2

1 If πIi ∝ Yi , then YHT = Y

2 If πIi ∝ Mi and if Yi is constant, then YHT = Y .

3 Equal probability sampling design is generally inefficient.



Optimal design

Assume the following model

yij = µ+ ai + eij

where µ is an unknown parameter, ai ∼ (0, σ2a), and eij ∼ (0, σ2e ).

Total variance of YHT :

V (YHT ) = V (∑i∈AI

π−1Ii Miµ) + E{∑i∈AI

π−2Ii γi}

= V (∑i∈AI

π−1Ii Miµ) + E{∑i∈UI

π−1Ii γi}

where γi = V (Yi ) = M2i σ

2a + Miσ

2e .



Optimal design (Cont’d)

The second term of the total variance is minimized when

πIi ∝ γ1/2i = Miσa

(1 +

σ2eσ2a· 1

Mi

)1/2

.

If σ2e/σ

2a is small, then πIi ∝ Mi lead to an optimal sampling design.

If σ2e/σ

2a is large, then πIi ∝ M

1/2i can be a better design.



Mean estimation

Population size N is generally unknown.

If the parameter of interest is population mean, we may have twodifferent concepts:

µ1 =1

NI

NI∑i=1

Yi

Mi=

1

NI

NI∑i=1

Yi

µ2 =1

N

NI∑i=1

Yi =

∑NIi=1

∑Mij=1 yij∑NI

i=1Mi

If Mi = M, then µ1 = µ2. Otherwise, the two parameters havedifferent meanings.



Mean estimation

Suppose that we are interested in estimating YU = µ2.

From each sampled cluster i , we observe(Mi , Yi

).

Ratio estimation

ˆYR =

∑i∈AI

Yi/πIi∑i∈AI

Mi/πIi:=

YHT

NHT

Variance

V(

ˆYR

).

=1

N2

{V (YHT )− 2µCov(YHT , NHT ) + µ2V (NHT )

}



Mean estimation

Under SRC sampling, for example, the variance is

V(

ˆYR

).

=N2I

nIN2

(1− nI

NI

)1

NI − 1

NI∑i=1

(Yi −Mi YU

)2=

1

nI M2

(1− nI

NI

)1

NI − 1

NI∑i=1

M2i

(Yi − YU

)2with M = N−1I

∑NIi=1Mi = N/NI .


chapter 6: cluster sampling design 1 - iowa state...

Documents