standard errors in systematic sampling chris … · random 8900 8900 143 35000 we could pretend...

17
STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris Glasbey & Stijn Bierman Biomathematics & Statistics Scotland

Upload: duongdang

Post on 13-Apr-2019

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

STANDARD ERRORS IN SYSTEMATIC SAMPLING

Chris Glasbey & Stijn Bierman

Biomathematics & Statistics Scotland

Page 2: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

How many acorns are under an oak tree?

2

Page 3: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

Standard estimator of total is

T = Mn∑

i=1yi where n = 144, M = 9

(NB We are only interested in the total, not in interpolating y’s)

But what is its variance?

Two frameworks for inference:

1) Design based: Assume unobserved array of y’s is fixed, and only stochas-ticity is due to sampling

2) Model based: Assume y’s are a single realisation from a super-populationof possible arrays

But, for systematic sampling, there is no design-based, unbiased estimatorof var(T − T )

3

Page 4: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

1. Design-based inference

If instead we had sampled at random:

var(T − T ) =

1 −1

M

M2nσ2, where σ2 =1

n − 1

n∑

i=1(yi − y)2,

is an unbiased estimator with (n − 1) d.f., where σ2 is variance of y array

4

Page 5: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

From randomisation, var(T − T ) = yTQy, with y arranged as vector

For random sampling

Q =

8, −ε, −ε, −ε, −ε, . . .−ε, 8, −ε, −ε, −ε, . . .−ε, −ε, 8, −ε, −ε, . . .−ε, −ε, −ε, 8, −ε, . . .

... ... ... . . .

ε = 0.0062

Whereas for systematic sampling

Q =

8, −1, −1, 8, −1, . . .−1, 8, −1, −1, 8, . . .−1, −1, 8, −1, −1, . . .

8, −1, −1, 8, −1, . . .... ... ... . . .

In the absence of trend, and if “covariances” decay with distance, systematicsampling is most efficient (Bellman, 1977)

5

Page 6: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

For this problem we know the truth, a complete census:

T = 85306 acorns

(Approximated from Aubry and Debouzie, Ecology, 2000.)

So we can conduct a simulation study

6

Page 7: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

sampling design rmsesystematic 2000random 8900

7

Page 8: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

sampling design rmse√

var d.f. 95% c.widthsystematic 2000random 8900 8900 143 35000

8

Page 9: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

Two other types of design with unbiased estimators of var(T − T ):

replicated systematic stratified

d.f. = 3 d.f. ≤ n2 (Satterthwaite, 1946)

9

Page 10: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

sampling design rmse√

var d.f. 95% c.widthsystematic 20004-rep systematic 2600 2600 3 150002-sample strata 3600 3600 12 16000random 8900 8900 143 35000

10

Page 11: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

sampling design rmse√

var d.f. 95% c.widthsystematic 20004-rep systematic 2600 2600 3 150002-sample strata 3600 3600 12 16000random 8900 8900 143 35000

We could pretend systematic design was one of other three, to obtain varif we believe the systematic design is the most efficient (Bellman, 1977)

But if there are trends it may not be!

11

Page 12: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

For example, if the data were elevations, such as:

sampling design rmse√

var d.f. 95% c.widthsystematic 3704-rep systematic 380 380 3 23002-sample strata 110 110 10 500random 410 410 143 1640

12

Page 13: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

2. Model-based inference

Model for trend:

fij = β1 exp

−1

2

log rij − β4

β5

6

where rij =√

√(i − β2)2 + (j − β3)2

0 5 10 15 20 25

050

100

150

200

250

r

y

Fit by maximising Poisson quasi-likelihood, ∑(y log f − f ), and Tf = ∑ fij

13

Page 14: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

sampling design rmse T rmse(∑ f)systematic 2000 20004-rep systematic 2600 20002-sample strata 3600 2800random 8900 3600

14

Page 15: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

Isotropic exponential model for autocorrelation of errors, estimated from

standardised residuals (e = (y − f )/√

f)

0 2 4 6 8 10

−0.

50.

00.

51.

0

distance

corr

elat

ion

Obtain y, using best linear predictor of unobserved e’s, and Tc = ∑ yij

15

Page 16: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

sampling design rmse T rmse(∑ f) rmse(∑ y)systematic 2000 2000 21004-rep systematic 2600 2000 20002-sample strata 3600 2800 2700random 8900 3600 3400

Estimate var(Tc − T ) by simulation, using a parametric bootstrap

16

Page 17: STANDARD ERRORS IN SYSTEMATIC SAMPLING Chris … · random 8900 8900 143 35000 We could pretend systematic design was one of other three, to obtain va^r if we believe the systematic

3. Summary

For design-based inference with acorn data:

• Systematic sampling is unknowably most precise. In absence of trend,other designs give conservative estimates

• 4-rep systematic and 2-sample stratified designs produce shortest confi-dence intervals

Model-based inference reduces design effects, but at the cost of subjectiveassumptions

17