
FAO Statistics Working Paper Series / 21-22

OPERATIONAL PROCEDURES FOR SELECTING SAMPLES

FOR REPEATED AGRICULTURAL SURVEYS

WITH A ROTATION DESIGN

Dramane Bako

Food and Agriculture Organization of the United Nations

Rome, 2021

Required citation:

Bako, D. 2021. Operational procedures for selecting samples for repeated agricultural surveys with a rotation design. Rome, FAO. https://doi.org/10.4060/cb4074en

The designations employed and the presentation of material in this information product do not imply the expression of any opinion

whatsoever on the part of the Food and Agriculture Organization of the United Nations (FAO) concerning the legal or development

status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. The mention

of specific companies or products of manufacturers, whether or not these have been patented, does not imply that these have been

endorsed or recommended by FAO in preference to others of a similar nature that are not mentioned.

The views expressed in this information product are those of the author(s) and do not necessarily reflect the views or policies of FAO.

ISBN 978-92-5-134194-0

© FAO, 2021

Some rights reserved. This work is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 IGO

licence (CC BY-NC-SA 3.0 IGO; https://creativecommons.org/licenses/by-nc-sa/3.0/igo/legalcode).

Under the terms of this licence, this work may be copied, redistributed and adapted for non-commercial purposes, provided that the

work is appropriately cited. In any use of this work, there should be no suggestion that FAO endorses any specific organization,

products or services. The use of the FAO logo is not permitted. If the work is adapted, then it must be licensed under the same or

equivalent Creative Commons licence. If a translation of this work is created, it must include the following disclaimer along with the

required citation: “This translation was not created by the Food and Agriculture Organization of the United Nations (FAO). FAO is not

responsible for the content or accuracy of this translation. The original [Language] edition shall be the authoritative edition.”

Disputes arising under the licence that cannot be settled amicably will be resolved by mediation and arbitration as described in Article

8 of the licence except as otherwise provided herein. The applicable mediation rules will be the mediation rules of the World Intellectual

Property Organization http://www.wipo.int/amc/en/mediation/rules and any arbitration will be conducted in accordance with the

Arbitration Rules of the United Nations Commission on International Trade Law (UNCITRAL).

Third-party materials. Users wishing to reuse material from this work that is attributed to a third party, such as tables, figures or

images, are responsible for determining whether permission is needed for that reuse and for obtaining permission from the copyright

holder. The risk of claims resulting from infringement of any third-party-owned component in the work rests solely with the user.

Sales, rights and licensing. FAO information products are available on the FAO website (www.fao.org/publications) and can be

purchased through [email protected]. Requests for commercial use should be submitted via:

www.fao.org/contact-us/licence-request. Queries regarding rights and licensing should be submitted to: [email protected].


Contents

Acknowledgements

Introduction

1 Rotation in single-stage and multistage sampling

2 Use of permanent random numbers (PRN)

2.1 Overview

2.2 Application

2.2.1 Selection of all samples during the initial year

2.2.2 Sample update procedure

2.2.3 Use of existing statistical software for PRN sampling

3 Repeated collocated sampling

3.1 Overview

3.2 Application

4 Rotation group sampling

4.1 Overview

4.2 Application

5 Rotation design and PPS sampling

5.1 Pareto PPS sampling

5.2 Application

References

Annex: Overview of composite estimators


Acknowledgements

This document is part of the methodological works of the Survey Team of the Food and Agriculture

Organization of the United Nations (FAO)’s Statistics Division to provide operational guidance on selected

areas of agricultural survey methodology with an overall objective to promote cost effective practices in

agricultural surveys implementation. The methodological works are conducted under the overall

coordination of Christophe Duhamel and the technical supervision of Flavio Bolliger and Neli Georgieva.

This publication was prepared by Dramane Bako. The document benefited from valuable comments and

inputs from Pedro Luis do Nascimento Silva, Jacques Delince, Flavio Bolliger, Silvia Missiroli and Oleg Cara.

The author thanks Pierre Lavallée for his availability for discussions on the topic in the framework of the

development of the methodological note.


Introduction

In agricultural surveys, the target populations are agricultural holdings1 that are, broadly speaking,

independent producers of agricultural products. Agricultural holdings are generally classified into two

categories: (i) holdings in the household sector (operated by households) and (ii) holdings in the non-

household sector (operated by other structures like corporations and government institutions). The FAO

sampling strategy for agricultural surveys recommends a two-stage sampling design for the first category

and a single stage sampling design for the second population (FAO, 2017).

The agricultural survey programs proposed by FAO to countries recommend the implementation of a

census of agriculture at least once every ten years and regular annual agricultural surveys. From one year

to another, there are three alternatives regarding the samples for such repeated surveys: (i) selecting a

new sample every year (often called “repeated cross-section”), (ii) using the same sample during a number

of years (panel) or (iii) changing a proportion of the sample from one year to another (partial rotation).

The first option would substantially increase operational costs of the survey program as selecting a sample

every year requires updating the sampling frame and locating the new sampling units for survey

implementation. In addition, it does not guarantee enough overlapping units between the samples of two

survey rounds for longitudinal analyses and reliable estimates of change. For developing countries, FAO

recommends either the panel or the partial rotation designs as cost effective options for annual survey

programs. The panel design allows both cross sectional and longitudinal analyses with, in theory, all

sample units. It is less costly and offers some operational advantages, since the enumerators

interview the same holdings every year. However, the panel sample can suffer from attrition and

obsolescence, which erode its representativeness and increase sampling errors and the operational costs

of tracking missing units. The partial rotation scheme is a good alternative: it addresses sample

attrition by renewing part of the sample while still allowing longitudinal analyses over two different

survey occasions.

The main objective of this note is to present how to perform sample selection with partial rotation over

the survey cycle. A number of methods recommended in the literature are proposed here considering

their suitability, cost effectiveness and ease of implementation in the context of agricultural surveys in

developing countries.

1 The World Programme for the Census of Agriculture 2020 (WCA 2020) defines the agricultural holding as “economic units of agricultural production under single management comprising all livestock kept and all land used wholly or partly for agricultural production purposes, without regard to title, legal form or size. Single management may be exercised by an individual or household, jointly by two or more individuals or households, by a clan or tribe, or by a juridical person such as a corporation, cooperative or government agency. The holding’s land may consist of one or more parcels, located in one or more separate areas or in one or more territorial or administrative divisions, providing the parcels share the same production means, such as labor, farm buildings, machinery or draught animals” (FAO, 2015).


1 Rotation in single-stage and multistage sampling

Single-stage and multistage sampling are the most common sampling designs for agricultural surveys

(FAO, 2017). In case of a single-stage sampling (as recommended for non-household holdings), the

procedures proposed in this document could be used to select rotating samples in the population or in

each stratum if a stratification is performed.

In the framework of a multistage sampling, rotation is advised in the final selection phase. For instance in

two-stage sampling, it would be recommended to rotate the secondary sampling units (SSU) rather than

the primary sampling units (PSU). Graham (1963, p. 108) recognises the cost advantages of

maintaining a fixed set of PSU, although higher variability between them may be observed in some cases;

the paper clearly recommends rotating the higher-stage sampling units. In fact, rotating the PSU

would be more expensive as it would imply updating more populations (populations of SSU in more PSU

and the population of PSU in each survey occasion). In addition, rotating SSU is likely to produce smoother

estimates than rotating PSU. Therefore, with a two-stage sampling, rotation procedures should be

performed for the SSU in each sampled PSU.


2 Use of permanent random numbers (PRN)

2.1 Overview

The PRN technique is a method for selecting a simple random sample without

replacement (SRSWOR). It belongs to the family of sequential sample selection methods, which

differ from conventional methods in the way random numbers are used to determine

the sample (Chromy 1979; Fan et al. 1962). PRN sampling (as labelled by Ernest et al. (2000)) consists

of the following steps:

(a). Independently assign a random number 𝑢𝑖 from the uniform distribution 𝑈[0,1] to each unit in

the population;

(b). Sort the frame in ascending order of the 𝑢𝑖;

(c). Starting at any point 𝑢0 (starting point), the sample is composed of the first 𝑛 units with 𝑢𝑖 > 𝑢0.

The frame is treated as a circular list. If 𝑛 units are not obtained in the interval [𝑢0, 1], then wrap

around to 0 and continue.

Ohlsson (1992) presents a formal proof that this technique produces a simple random sample without replacement.

The starting point 𝑢0 (referred to in step (c) above) may be fixed or may correspond to the PRN of a unit

selected with equal probability in the population (see Ernest et al. 2000).
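Steps (a) to (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `prn_sample` and the six-unit frame with its PRN values are hypothetical and do not come from the paper.

```python
from typing import Dict, List


def prn_sample(prn: Dict[int, float], n: int, u0: float = 0.0) -> List[int]:
    """Return the first n units with PRN > u0, wrapping around to 0 if needed."""
    if n > len(prn):
        raise ValueError("sample size exceeds population size")
    ordered = sorted(prn, key=prn.get)             # step (b): sort by PRN
    after = [u for u in ordered if prn[u] > u0]    # units to the right of u0
    before = [u for u in ordered if prn[u] <= u0]  # reached only after wrapping
    return (after + before)[:n]                    # step (c): circular list


# Hypothetical frame of six units; in practice each PRN would be drawn once
# from U[0,1] (step (a)) and then stored with the unit.
frame = {1: 0.72, 2: 0.56, 3: 0.05, 4: 0.80, 5: 0.31, 6: 0.97}

print(prn_sample(frame, 3))           # [3, 5, 2]
print(prn_sample(frame, 3, u0=0.75))  # [4, 6, 3]
```

With `u0=0.75` only two units lie to the right of the starting point, so the selection wraps around to the unit with the smallest PRN, as described in step (c).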

In the framework of rotation sampling, the PRN offers two important advantages: (i) facilitation of the

frame’s update on each survey occasion and (ii) facilitation of the selection of overlapping samples.

Updating the frame under the PRN technique proceeds as follows: new units that appeared in the

population (births) are added to the frame and a new PRN is generated for each of them; units that left

the population (deaths) are removed from the frame. Clear procedures are needed for major

changes to units, in particular mergers and splits:

If a unit splits into two or more units, one of the new units (e.g. the largest one) may keep the PRN

of the original unit, and new PRN are generated for the other new units. However, if the

change is considered substantial, it may instead be treated as the death of the initial unit and

the births of new units. In that case, new PRN should be generated for all new units.

If two or more units merge, the new unit may keep the PRN of one of the merged units

(e.g. that of the largest unit).

Updating the frame would require a new listing operation unless births, deaths and other changes in

the population are tracked systematically. In developing countries, budget constraints may not

allow annual listing operations to update the frame. If removing all deaths

from the whole frame is too costly on a given survey occasion, deaths can be removed only

from the previous sample and then, sequentially, from the units that follow the last sampled

unit in the previous frame, until the expected sample size is achieved.
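The sequential removal of deaths described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function `refresh_sample`, the callback `is_alive` (standing in for a field check of a unit's status) and the 12-unit frame are all hypothetical.

```python
from typing import Callable, List


def refresh_sample(sorted_ids: List[int], start: int, n: int,
                   is_alive: Callable[[int], bool]) -> List[int]:
    """Walk the PRN-sorted frame from index `start` (circularly) and keep
    the first n units that are still alive; only visited units are checked."""
    sample: List[int] = []
    size = len(sorted_ids)
    for step in range(size):
        unit = sorted_ids[(start + step) % size]
        if is_alive(unit):  # hypothetical status check done in the field
            sample.append(unit)
        if len(sample) == n:
            break
    return sample


# Hypothetical frame of 12 units, already sorted by PRN; units 9 and 2 have died.
sorted_ids = [7, 3, 11, 1, 9, 5, 12, 2, 8, 4, 10, 6]
deaths = {9, 2}
new_sample = refresh_sample(sorted_ids, start=2, n=4,
                            is_alive=lambda u: u not in deaths)
print(new_sample)  # [11, 1, 5, 12]
```

In this sketch only five units are visited before the sample size of four is reached, so the status of the remaining units never has to be verified.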

In addition, the PRN technique allows different samples to be selected without overlap (negative coordination)

or with overlap (positive coordination), simply by choosing the starting points accordingly.

Selecting samples with a partial rotation scheme therefore becomes straightforward with the PRN technique:

it amounts to selecting positively coordinated samples with the overlap required by the

survey design.


Ohlsson (1995) reports that PRN sampling is used for coordinating samples with partial rotation

designs in the national statistical offices of many countries, including Sweden, Australia, New Zealand and

France. The method is also used in the Brazilian Labour Force Survey (see Antonaci and Silva, 2007).

2.2 Application

Let’s consider a five-year survey plan with a partial rotation scheme (20 percent of the sample rotated

from one year to the next) and a two-stage sampling design. Suppose we want to

select five samples of 10 secondary sampling units (agricultural holdings), with 20 percent rotation (80

percent sample overlap), in each selected primary sampling unit (say, an enumeration area).

Let’s use the PRN technique for an enumeration area with a population of 30 agricultural holdings. We

can either select all five samples during the first year or, if enough resources are available, update the

population at some point in time and then update the sample.


2.2.1 Selection of all samples during the initial year

Figure 1. Operational steps for sample selection with the PRN sampling method

Step1: generate PRN for each unit of the population

Step2: Sort the frame in ascending order of the PRN

Step3: Select the rotating samples

Source: Author's own elaboration, 2021.

ID PRN

1 0.71591221

2 0.55655816

3 0.75315888

4 0.80058705

5 0.62235617

6 0.33324844

7 0.80201168

8 0.88476671

9 0.31148278

10 0.89016459

11 0.35002149

12 0.47981483

13 0.06673275

14 0.70801775

15 0.83569431

16 0.96405383

17 0.21768859

18 0.97995322

19 0.02679974

20 0.58936127

21 0.98458031

22 0.67577825

23 0.71421294

24 0.22740058

25 0.76220788

26 0.54986539

27 0.39809190

28 0.63592297

29 0.85170579

30 0.68497418

ID PRN

19 0.02679974

13 0.06673275

17 0.21768859

24 0.22740058

9 0.31148278

6 0.33324844

11 0.35002149

27 0.39809190

12 0.47981483

26 0.54986539

2 0.55655816

20 0.58936127

5 0.62235617

28 0.63592297

22 0.67577825

30 0.68497418

14 0.70801775

23 0.71421294

1 0.71591221

3 0.75315888

25 0.76220788

4 0.80058705

7 0.80201168

15 0.83569431

29 0.85170579

8 0.88476671

10 0.89016459

16 0.96405383

18 0.97995322

21 0.98458031

[Step 3 panel: the sorted frame of Step 2, with brackets marking sample 1 (rows 1 to 10), sample 2 (rows 3 to 12), sample 3 (rows 5 to 14), sample 4 (rows 7 to 16) and sample 5 (rows 9 to 18).]


To rotate 20 percent of the sample (two units here), Step 3 above simply moves the starting point

by skipping two units from the previous starting point.

Figure 2. Samples selected with the PRN sampling method

Sample1 Sample2 Sample3 Sample4 Sample5

19 17 9 11 12

13 24 6 27 26

17 9 11 12 2

24 6 27 26 20

9 11 12 2 5

6 27 26 20 28

11 12 2 5 22

27 26 20 28 30

12 2 5 22 14

26 20 28 30 23
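The five samples of Figure 2 can be reproduced with a short Python sketch. The list `SORTED_IDS` is the sorted frame of Step 2 in Figure 1; the function name `rotating_samples` and its parameters are illustrative assumptions, not part of the paper.

```python
# IDs of the 30 holdings in ascending order of PRN (Step 2 of Figure 1).
SORTED_IDS = [19, 13, 17, 24, 9, 6, 11, 27, 12, 26, 2, 20, 5, 28, 22,
              30, 14, 23, 1, 3, 25, 4, 7, 15, 29, 8, 10, 16, 18, 21]


def rotating_samples(sorted_ids, n, rotation_rate, occasions):
    """One sample per occasion; each starting point is shifted by the number
    of units to rotate (n * rotation_rate), treating the frame as circular."""
    shift = round(n * rotation_rate)  # units replaced on each occasion
    size = len(sorted_ids)
    return [[sorted_ids[(t * shift + k) % size] for k in range(n)]
            for t in range(occasions)]


samples = rotating_samples(SORTED_IDS, n=10, rotation_rate=0.2, occasions=5)
for t, s in enumerate(samples, start=1):
    print(f"Sample{t}:", s)

# Overlap between consecutive samples: 8 of 10 units (80 percent).
overlaps = [len(set(a) & set(b)) for a, b in zip(samples, samples[1:])]
print(overlaps)  # [8, 8, 8, 8]
```

Each printed sample corresponds to one column of Figure 2.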


2.2.2 Sample update procedure

Let’s suppose that in year 3 there are enough resources to update the populations of all enumeration

areas. Suppose that, after the update of the enumeration area considered above,

agricultural holdings 5 and 27 have disappeared from the population and three new holdings (labelled 31, 32

and 33) have appeared.

Updating the sampling frame is straightforward: (i) remove units 5 and

27 from the population, (ii) add the new units 31 to 33 and (iii) generate new PRN for these new units.

New samples can then be selected for year 3 following the procedure described above.

However, the new starting point for year 3 should be chosen carefully to keep the number of units

overlapping with the year 2 sample as close as possible to the planned number.
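The update can be sketched in Python with the PRN of the worked example. The starting rule used here (skip two units from the year 2 starting point, counted in the updated frame) is an assumption that happens to reproduce the year 3 sample of the example; the paper does not state it as a formal rule. The function names are hypothetical, and the PRN of the births are those shown in Figure 3 (in practice they would be newly generated).

```python
# PRN of the 30 original holdings (Step 1 of Figure 1).
PRN = {
    1: 0.71591221, 2: 0.55655816, 3: 0.75315888, 4: 0.80058705, 5: 0.62235617,
    6: 0.33324844, 7: 0.80201168, 8: 0.88476671, 9: 0.31148278, 10: 0.89016459,
    11: 0.35002149, 12: 0.47981483, 13: 0.06673275, 14: 0.70801775, 15: 0.83569431,
    16: 0.96405383, 17: 0.21768859, 18: 0.97995322, 19: 0.02679974, 20: 0.58936127,
    21: 0.98458031, 22: 0.67577825, 23: 0.71421294, 24: 0.22740058, 25: 0.76220788,
    26: 0.54986539, 27: 0.39809190, 28: 0.63592297, 29: 0.85170579, 30: 0.68497418,
}


def update_frame(prn, deaths, births):
    """Drop dead units and add births together with their new PRN."""
    updated = {u: x for u, x in prn.items() if u not in deaths}
    updated.update(births)
    return updated


def next_sample(prn, prev_start_prn, shift, n):
    """Skip `shift` units from the previous starting point, then take n units,
    treating the sorted frame as circular (assumed starting rule)."""
    ordered = sorted(prn, key=prn.get)
    after = [u for u in ordered if prn[u] >= prev_start_prn]
    before = [u for u in ordered if prn[u] < prev_start_prn]
    return (after + before)[shift:shift + n]


frame_y3 = update_frame(PRN, deaths={5, 27},
                        births={31: 0.21095076, 32: 0.09093491, 33: 0.23833398})
# Sample 2 started at unit 17, so its PRN is the previous starting point.
sample3 = next_sample(frame_y3, prev_start_prn=PRN[17], shift=2, n=10)
print(sample3)  # [33, 9, 6, 11, 12, 26, 2, 20, 28, 22]
```

Because unit 27 has died and unit 33 is a birth, the year 3 sample shares seven units with the year 2 sample instead of the planned eight, which illustrates why the starting point needs care.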


Figure 3. Operational steps for updating samples with the PRN sampling method

Frame with the new units 31 to 33 added, before sorting:

ID PRN

1 0.71591221

2 0.55655816

3 0.75315888

4 0.80058705

5 0.62235617

6 0.33324844

7 0.80201168

8 0.88476671

9 0.31148278

10 0.89016459

11 0.35002149

12 0.47981483

13 0.06673275

14 0.70801775

15 0.83569431

16 0.96405383

17 0.21768859

18 0.97995322

19 0.02679974

20 0.58936127

21 0.98458031

22 0.67577825

23 0.71421294

24 0.22740058

25 0.76220788

26 0.54986539

27 0.39809190

28 0.63592297

29 0.85170579

30 0.68497418

31 0.21095076

32 0.09093491

33 0.23833398

Frame sorted in ascending order of the PRN:

ID PRN

19 0.02679974

13 0.06673275

32 0.09093491

31 0.21095076

17 0.21768859

24 0.22740058

33 0.23833398

9 0.31148278

6 0.33324844

11 0.35002149

27 0.39809190

12 0.47981483

26 0.54986539

2 0.55655816

20 0.58936127

5 0.62235617

28 0.63592297

22 0.67577825

30 0.68497418

14 0.70801775

23 0.71421294

1 0.71591221

3 0.75315888

25 0.76220788

4 0.80058705

7 0.80201168

15 0.83569431

29 0.85170579

8 0.88476671

10 0.89016459

16 0.96405383

18 0.97995322

21 0.98458031

Sorted frame after removing units 5 and 27:

ID PRN

19 0.02679974

13 0.06673275

32 0.09093491

31 0.21095076

17 0.21768859

24 0.22740058

33 0.23833398

9 0.31148278

6 0.33324844

11 0.35002149

12 0.47981483

26 0.54986539

2 0.55655816

20 0.58936127

28 0.63592297

22 0.67577825

30 0.68497418

14 0.70801775

23 0.71421294

1 0.71591221

3 0.75315888

25 0.76220788

4 0.80058705

7 0.80201168

15 0.83569431

29 0.85170579

8 0.88476671

10 0.89016459

16 0.96405383

18 0.97995322

21 0.98458031

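[Bracket labels Sample1 to Sample5 in the original figure. Sample3, Sample4 and Sample5 correspond to rows 7 to 16, 9 to 18 and 11 to 20 of the final sorted frame; Sample1 and Sample2 are unchanged from Figure 2.]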



2.2.3 Use of existing statistical software for PRN sampling

PRN sample selection can be performed with any statistical software that has functions to (i)

generate random numbers from a uniform distribution, (ii) sort a database in ascending or descending

order of a specific variable and (iii) select observations based on an identification variable.

Almost all statistical software packages have these basic functions. Below are applications with the most popular

software in developing countries: R, SPSS and Stata.

R software

R is particularly well suited to PRN sampling because it has a dedicated package for

this sampling approach, called ‘prnsamplr’. The package also performs probability-proportional-to-size

(PPS) sampling using permanent random numbers.

However, the user can also easily perform PRN sampling by following the steps described above.

Table 1. PRN sampling with R software

Generating PRN: To create a variable ‘prn’ of uniform random numbers in a database ‘frame’ (one number per unit):

frame$prn <- runif(nrow(frame))

Sorting: To sort the database ‘frame’ in ascending order of the variable ‘prn’, either use the package ‘plyr’:

library(plyr)

frame <- arrange(frame, prn)

or base R:

frame <- frame[order(frame$prn), ]

Selection

sample1 <- frame[1:10, ]

sample2 <- frame[3:12, ]

sample3 <- frame[5:14, ]

sample4 <- frame[7:16, ]

sample5 <- frame[9:18, ]

Source: Author's own elaboration, 2021.

Samples after the year 3 update (Sample3 to Sample5 are selected from the updated frame):

Sample1 Sample2 Sample3 Sample4 Sample5

19 17 33 6 12

13 24 9 11 26

17 9 6 12 2

24 6 11 26 20

9 11 12 2 28

6 27 26 20 22

11 12 2 28 30

27 26 20 22 14

12 2 28 30 23

26 20 22 14 1


Table 2. PRN sampling with SPSS software

SPSS

Generating PRN: To create a variable ‘prn’ of uniform random numbers in a database:

COMPUTE prn=RV.UNIFORM(0,1).

EXECUTE.

Sorting: To sort the database in ascending order of the variable ‘prn’:

SORT CASES BY prn(A).

Selection

DATASET COPY sample1.

DATASET ACTIVATE sample1.

FILTER OFF.

USE 1 thru 10 /permanent.

EXECUTE.

DATASET COPY sample2.

DATASET ACTIVATE sample2.

FILTER OFF.

USE 3 thru 12 /permanent.

EXECUTE.

And so on until sample 5.

Source: Author's own elaboration, 2021.

Table 3. PRN sampling with Stata software

Stata

Generating PRN: To create a variable ‘prn’ of uniform random numbers in a database:

gen prn=runiform()

Sorting: To sort the database in ascending order of the variable ‘prn’:

sort prn

Selection

sample1: (Using the frame database)

keep in 1/10

sample2: (Using the frame database)

keep in 3/12

sample3: (Using the frame database)

keep in 5/14

sample4: (Using the frame database)

keep in 7/16

sample5: (Using the frame database)

keep in 9/18

Source: Author's own elaboration, 2021.


3 Repeated collocated sampling

3.1 Overview

Srinath and Carpenter (1995) point out that the PRN technique may lead to over- or

underrepresentation of births in the sample because the new PRNs generated for births are not equally

spaced on the interval [0, 1]. The authors suggested a new procedure called repeated collocated

sampling that facilitates a better handling of births and could be used in the context of agricultural

surveys. The procedure, which corresponds to a SRSWOR, consists in the following steps:

(i). sort in a random order all units of the target population (e.g. in a domain, stratum, PSU etc.)

(ii). assign a Sample Selection Number 𝑆𝑆𝑁(𝑖) to each unit 𝑖 as follows:

$$SSN(i) = \frac{R_i - \varepsilon}{N} \qquad (3.1.1)$$

Where 𝑅𝑖 is the rank of the unit 𝑖 after the random sorting; 𝜀 is a random number from the uniform

distribution 𝑈[0,1] and 𝑁 is the size of the population

(iii). the sample is composed of all units whose sample selection number does not exceed the desired

sampling fraction 𝑓 (𝑆𝑆𝑁(𝑖) ≤ 𝑓)

(iv). if a proportion 𝑟 of the sample is to be rotated, then on each survey occasion 𝑡, all

units whose 𝑆𝑆𝑁 lies within the interval [(𝑡 − 1)𝑟𝑓, (𝑡 − 1)𝑟𝑓 + 𝑓] constitute the sample

(𝑡 = 1 corresponding to the first survey occasion).

In each survey occasion, if there are 𝑄 new births since the last sampling occasion, they are also

randomly ordered and assigned new sample selection numbers 𝑆𝑆𝑁(𝑖) as below:

$$SSN(i) = \frac{R_i - \varepsilon}{Q} \qquad (3.1.2)$$

Where 𝑅𝑖 and 𝜀 are defined as above. However, in each survey occasion, either the same 𝜀 could be

used or a new random number could be selected from the uniform distribution.

3.2 Application

Let’s consider the same enumeration area (EA) of 30 units as in section 2.2 and a 20 percent

partial rotation design with a sample of 10 agricultural holdings.

We have:

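Sampling fraction: 𝑓 = 10/30 ≈ 0.333

𝑟𝑓 = 0.2 × 10/30 ≈ 0.0667

Random number: 𝜀 = 0.017537133

(The unrounded values of 𝑓 and 𝑟𝑓 are the ones that reproduce the sample membership shown in Figure 5.)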


Figure 5. Operational steps for selecting samples with the repeated collocated sampling method


Step1: generate random numbers for each unit in the EA

Step2: Sort the EA e.g. in ascending order of random numbers (random sorting)

Step3: Define the 𝑅𝑖 and calculate the SSN

Step4: identify the samples

ID Rand

1 0.87012650

2 0.28088389

3 0.71732738

4 0.70422743

5 0.36168256

6 0.74144240

7 0.17386889

8 0.13231919

9 0.45544714

10 0.88902360

11 0.05395863

12 0.37404500

13 0.31871866

14 0.40329879

15 0.24708789

16 0.91238972

17 0.64396404

18 0.39094439

19 0.64129422

20 0.22257691

21 0.61204142

22 0.52843557

23 0.45859939

24 0.75891911

25 0.41555944

26 0.08445414

27 0.95686136

28 0.71136379

29 0.95392510

30 0.08210462

ID Rand

11 0.0540

30 0.0821

26 0.0845

8 0.1323

7 0.1739

20 0.2226

15 0.2471

2 0.2809

13 0.3187

5 0.3617

12 0.3740

18 0.3909

14 0.4033

25 0.4156

9 0.4554

23 0.4586

22 0.5284

21 0.6120

19 0.6413

17 0.6440

4 0.7042

28 0.7114

3 0.7173

6 0.7414

24 0.7589

1 0.8701

10 0.8890

16 0.9124

29 0.9539

27 0.9569

ID Rand 𝑅𝑖 SSN

11 0.0540 1 0.0327

30 0.0821 2 0.0661

26 0.0845 3 0.0994

8 0.1323 4 0.1327

7 0.1739 5 0.1661

20 0.2226 6 0.1994

15 0.2471 7 0.2327

2 0.2809 8 0.2661

13 0.3187 9 0.2994

5 0.3617 10 0.3327

12 0.3740 11 0.3661

18 0.3909 12 0.3994

14 0.4033 13 0.4327

25 0.4156 14 0.4661

9 0.4554 15 0.4994

23 0.4586 16 0.5327

22 0.5284 17 0.5661

21 0.6120 18 0.5994

19 0.6413 19 0.6327

17 0.6440 20 0.6661

4 0.7042 21 0.6994

28 0.7114 22 0.7327

3 0.7173 23 0.7661

6 0.7414 24 0.7994

24 0.7589 25 0.8327

1 0.8701 26 0.8661

10 0.8890 27 0.8994

16 0.9124 28 0.9327

29 0.9539 29 0.9661

27 0.9569 30 0.9994

ID Rand 𝑅𝑖 SSN [0, f] [rf, rf+f] [2rf, 2rf+f] [3rf, 3rf+f] [4rf, 4rf+f]

11 0.0540 1 0.0327 1 0 0 0 0

30 0.0821 2 0.0661 1 0 0 0 0

26 0.0845 3 0.0994 1 1 0 0 0

8 0.1323 4 0.1327 1 1 0 0 0

7 0.1739 5 0.1661 1 1 1 0 0

20 0.2226 6 0.1994 1 1 1 0 0

15 0.2471 7 0.2327 1 1 1 1 0

2 0.2809 8 0.2661 1 1 1 1 0

13 0.3187 9 0.2994 1 1 1 1 1

5 0.3617 10 0.3327 1 1 1 1 1

12 0.3740 11 0.3661 0 1 1 1 1

18 0.3909 12 0.3994 0 1 1 1 1

14 0.4033 13 0.4327 0 0 1 1 1

25 0.4156 14 0.4661 0 0 1 1 1

9 0.4554 15 0.4994 0 0 0 1 1

23 0.4586 16 0.5327 0 0 0 1 1

22 0.5284 17 0.5661 0 0 0 0 1

21 0.6120 18 0.5994 0 0 0 0 1

19 0.6413 19 0.6327 0 0 0 0 0

17 0.6440 20 0.6661 0 0 0 0 0

4 0.7042 21 0.6994 0 0 0 0 0

28 0.7114 22 0.7327 0 0 0 0 0

3 0.7173 23 0.7661 0 0 0 0 0

6 0.7414 24 0.7994 0 0 0 0 0

24 0.7589 25 0.8327 0 0 0 0 0

1 0.8701 26 0.8661 0 0 0 0 0

10 0.8890 27 0.8994 0 0 0 0 0

16 0.9124 28 0.9327 0 0 0 0 0

29 0.9539 29 0.9661 0 0 0 0 0

27 0.9569 30 0.9994 0 0 0 0 0

sample1 sample2 sample3 sample4 sample5

11 26 7 15 13

30 8 20 2 5

26 7 15 13 12

8 20 2 5 18

7 15 13 12 14

20 2 5 18 25

15 13 12 14 9

2 5 18 25 23

13 12 14 9 22

5 18 25 23 21
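The samples of Figure 5 can be reproduced with the following Python sketch. `ORDER` is the random ordering of Step 2 and `EPS` the random number 𝜀 of the example; the function names `ssn` and `collocated_sample` are illustrative assumptions. The sketch uses the unrounded sampling fraction 10/30.

```python
# Unit IDs in their random order (Step 2 of Figure 5), so the rank R_i is the
# position in this list, and the random number of the example.
ORDER = [11, 30, 26, 8, 7, 20, 15, 2, 13, 5, 12, 18, 14, 25, 9,
         23, 22, 21, 19, 17, 4, 28, 3, 6, 24, 1, 10, 16, 29, 27]
EPS = 0.017537133


def ssn(order, eps):
    """Sample selection numbers SSN(i) = (R_i - eps) / N (equation 3.1.1)."""
    size = len(order)
    return {unit: (rank - eps) / size for rank, unit in enumerate(order, start=1)}


def collocated_sample(ssn_by_unit, f, r, t):
    """Units whose SSN lies in [(t-1)rf, (t-1)rf + f] on survey occasion t."""
    lower = (t - 1) * r * f
    upper = lower + f
    return [u for u, s in ssn_by_unit.items() if lower <= s <= upper]


numbers = ssn(ORDER, EPS)
f, r = 10 / 30, 0.2
samples = [collocated_sample(numbers, f, r, t) for t in range(1, 6)]
for t, s in enumerate(samples, start=1):
    print(f"sample{t}:", s)
```

Each printed list matches one column of the sample table above, and consecutive samples share eight units.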


Sample update procedure

As in section 2.2.2, let’s suppose that an update of the sample is planned during year 3 and that it was

noticed that holdings 5 and 27 disappeared and three new holdings (labelled 31, 32 and 33) appeared in

the population.

The new population size of the EA is therefore 31 (28 surviving units and 3 births). To update the sample, it is necessary to update the

sampling fraction 𝑓 and the Sample Selection Numbers 𝑆𝑆𝑁 of the units that did not disappear (recomputed in Figure 6 from their ranks among the 28 surviving units), as

these are functions of the population size. New SSN should then be calculated for the new units 31, 32 and

33 using equation 3.1.2 (here Q=3). We will consider the same value for 𝜀 for both old and new SSN.

Sampling fraction: 𝑓 = 10/31 ≈ 0.3226

𝑟𝑓 = 0.2 × 10/31 ≈ 0.0645

Random number: 𝜀 = 0.017537133
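The update can be checked with a short Python sketch. It assumes, as the SSN values of Figure 6 suggest, that the surviving units keep their relative order and are re-ranked from 1 to 28, while the births are ranked separately (equation 3.1.2 with Q = 3). The variable and function names are illustrative.

```python
EPS = 0.017537133
# The 28 surviving units in their original random order (units 5 and 27 removed).
SURVIVORS = [11, 30, 26, 8, 7, 20, 15, 2, 13, 12, 18, 14, 25, 9, 23,
             22, 21, 19, 17, 4, 28, 3, 6, 24, 1, 10, 16, 29]
BIRTHS = [31, 33, 32]  # births in their own random order (Figure 6)


def ssn(order, eps):
    """SSN = (rank - eps) / group size, applied separately to each group."""
    size = len(order)
    return {u: (rank - eps) / size for rank, u in enumerate(order, start=1)}


# Equation 3.1.1 for the survivors (N = 28) and 3.1.2 for the births (Q = 3).
ssn_all = {**ssn(SURVIVORS, EPS), **ssn(BIRTHS, EPS)}

f = 10 / (len(SURVIVORS) + len(BIRTHS))  # 10/31
r, t = 0.2, 3
lower = (t - 1) * r * f
sample3 = [u for u, s in ssn_all.items() if lower <= s <= lower + f]
print(sample3)  # [8, 7, 20, 15, 2, 13, 12, 18, 14, 31]
```

The year 3 sample contains nine surviving units and the birth 31, in line with the updated samples of the example.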

Figure 6. Samples update procedure with the repeated collocated sampling method


ID Rand 𝑅𝑖 SSN [0, f] [rf, rf+f] [2rf, 2rf+f] [3rf, 3rf+f] [4rf, 4rf+f]

11 0.0540 1 0.0351 1 0 0 0 0

30 0.0821 2 0.0708 1 1 0 0 0

26 0.0845 3 0.1065 1 1 0 0 0

8 0.1323 4 0.1422 1 1 1 0 0

7 0.1739 5 0.1779 1 1 1 0 0

20 0.2226 6 0.2137 1 1 1 1 0

15 0.2471 7 0.2494 1 1 1 1 0

2 0.2809 8 0.2851 1 1 1 1 1

13 0.3187 9 0.3208 1 1 1 1 1

12 0.3740 10 0.3565 0 1 1 1 1

18 0.3909 11 0.3922 0 0 1 1 1

14 0.4033 12 0.4279 0 0 1 1 1

25 0.4156 13 0.4637 0 0 0 1 1

9 0.4554 14 0.4994 0 0 0 1 1

23 0.4586 15 0.5351 0 0 0 0 1

22 0.5284 16 0.5708 0 0 0 0 1

21 0.6120 17 0.6065 0 0 0 0 0

19 0.6413 18 0.6422 0 0 0 0 0

17 0.6440 19 0.6779 0 0 0 0 0

4 0.7042 20 0.7137 0 0 0 0 0

28 0.7114 21 0.7494 0 0 0 0 0

3 0.7173 22 0.7851 0 0 0 0 0

6 0.7414 23 0.8208 0 0 0 0 0

24 0.7589 24 0.8565 0 0 0 0 0

1 0.8701 25 0.8922 0 0 0 0 0

10 0.8890 26 0.9279 0 0 0 0 0

16 0.9124 27 0.9637 0 0 0 0 0

29 0.9539 28 0.9994 0 0 0 0 0

31 0.5100 1 0.3275 0 1 1 1 1

33 0.7324 2 0.6608 0 0 0 0 0

32 0.8703 3 0.9942 0 0 0 0 0


sample1 sample2 sample3 sample4 sample5

11 26 8 20 2

30 8 7 15 13

26 7 20 2 12

8 20 15 13 18

7 15 2 12 14

20 2 13 18 25

15 13 12 14 9

2 5 18 25 23

13 12 14 9 22

5 18 31 31 31

Samples 3 to 5 are those selected after the update.


4 Rotation group sampling

4.1 Overview

This method basically consists of randomly dividing the population into groups and then selecting

rotating samples of groups according to the sample size required by the survey design. This approach has

some advantages over PRN sampling: it ensures that no unit is sampled too often by

chance and it facilitates the handling of births in the population. The method is described by

Srinath and Carpenter (1995) as follows:

First, the population is randomly divided in 𝑃 groups. For that, a random permutation of the

numbers 1,2,… , 𝑃 is first performed (assign ordering). Then, the first unit of the population is

assigned to the first rotation group in the assigned ordering, the second population unit is

assigned to the second rotation group in the assigned ordering and so on to the 𝑃th population

unit, which is assigned to the 𝑃th rotation group in the assigned ordering. The process begins

again with the (𝑃 + 1)th population unit assigned to the first rotation group, the (𝑃 + 2)th

population unit assigned to the second rotation group, and so on.

Secondly, suppose we want to select 𝑝 rotation groups in the sample. The original numbering of

the groups (1,2,… , 𝑃) before the assigned ordering, called the rotation ordering, is used for the

selection. Rotation groups numbered 1,2,… , 𝑝 in the rotation ordering are included in the

sample on the first survey occasion; on the second occasion, rotation group 1 rotates out of the

sample while the (𝑝 + 1)th rotates into the sample, and so on.

Let 𝑛 and 𝑁 denote, respectively, the sample size and the population size of the domain or stratum. Dividing the population into 𝑃 groups means that each group will have about 𝑁/𝑃 units; if 𝑁/𝑃 is not an integer, the 𝑃 groups will not all have the same size. If 𝑝 groups are to be selected for the sample, then 𝑛 ≅ 𝑝 × (𝑁/𝑃) and

$$P \cong p \times \frac{N}{n} \qquad (4.1.1)$$
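The two steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's own implementation: the function names `assign_rotation_groups` and `select_rotating_sample` are hypothetical, the seed is arbitrary, and the sketch stops at the last occasion for which groups are available, since the description does not say what happens once group 𝑃 has rotated in.

```python
import random


def assign_rotation_groups(unit_ids, P, rng):
    """Randomly divide the units into P rotation groups.

    Returns (assign_ordering, group_of): the random permutation of the
    group labels 1..P, and a dict mapping each unit ID to its group label.
    """
    assign_ordering = list(range(1, P + 1))
    rng.shuffle(assign_ordering)  # random permutation of 1..P
    # The k-th unit (0-based) goes to the group in position k mod P of the
    # assign ordering, so the cycle restarts with unit P + 1.
    group_of = {uid: assign_ordering[k % P] for k, uid in enumerate(unit_ids)}
    return assign_ordering, group_of


def select_rotating_sample(group_of, P, p, occasion):
    """Units of rotation groups occasion, ..., occasion + p - 1."""
    if occasion < 1 or occasion + p - 1 > P:
        # Assumption: wrap-around after group P is left out of this sketch.
        raise ValueError("occasion outside the range covered by this sketch")
    wanted = range(occasion, occasion + p)
    return sorted(uid for uid, g in group_of.items() if g in wanted)


# Example: N = 30 units, n = 10, p = 5, hence P = p * N / n = 15.
rng = random.Random(2021)  # arbitrary seed, for reproducibility only
assign_ordering, group_of = assign_rotation_groups(range(1, 31), 15, rng)
sample_1 = select_rotating_sample(group_of, P=15, p=5, occasion=1)
sample_2 = select_rotating_sample(group_of, P=15, p=5, occasion=2)
overlap = set(sample_1) & set(sample_2)
print(len(sample_1), len(sample_2), len(overlap))  # 10 10 8
```

Whatever the seed, each of the 15 groups holds exactly two of the 30 units here, so every sample has 10 units and two consecutive samples share four groups, that is 8 units.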

4.2 Application

Consider the same population of 30 holdings in an enumeration area and a sample of 10 units. Formula 4.1.1 gives 𝑃 = 3𝑝. We can then opt to divide the population into 15 rotation groups and select 5 rotation groups for each sample.
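The arithmetic behind this choice can be written out as follows. The last line assumes, as in the description of section 4.1, that exactly one rotation group leaves and one enters the sample at each occasion.

```latex
\begin{align*}
P &\cong p \times \frac{N}{n} = p \times \frac{30}{10} = 3p, \\
p = 5 &\;\Rightarrow\; P = 15, \qquad \frac{N}{P} = \frac{30}{15} = 2 \text{ units per group}, \\
n &= p \times \frac{N}{P} = 5 \times 2 = 10, \\
\text{units shared by consecutive samples} &= (p - 1) \times \frac{N}{P} = 4 \times 2 = 8 .
\end{align*}
```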


Figure 7. Procedures for selecting samples with the rotation group sampling method

Step 1: Assign ordering of the rotation groups

Rotation ordering  Random numbers    Assign ordering  Random numbers
1                  0.839269467       4                0.037332926
2                  0.291737035       5                0.064879519
3                  0.751828209       12               0.184000017
4                  0.037332926       2                0.291737035
5                  0.064879519       9                0.561580771
6                  0.886087912       15               0.570630813
7                  0.805773866       11               0.583437453
8                  0.884244074       13               0.619856794
9                  0.561580771       14               0.69216479
10                 0.803065413       3                0.751828209
11                 0.583437453       10               0.803065413
12                 0.184000017       7                0.805773866
13                 0.619856794       1                0.839269467
14                 0.69216479        8                0.884244074
15                 0.570630813       6                0.886087912

Step 2: Assign population units to the rotation groups

ID  Assign ordering    ID  Rotation ordering
1   4                  13  1
2   5                  28  1
3   12                 4   2
4   2                  19  2
5   9                  10  3
6   15                 25  3
7   11                 1   4
8   13                 16  4
9   14                 2   5
10  3                  17  5
11  10                 15  6
12  7                  30  6
13  1                  12  7
14  8                  27  7
15  6                  14  8
16  4                  29  8
17  5                  5   9
18  12                 20  9
19  2                  11  10
20  9                  26  10
21  15                 7   11
22  11                 22  11
23  13                 3   12
24  14                 18  12
25  3                  8   13
26  10                 23  13
27  7                  9   14
28  1                  24  14
29  8                  6   15
30  6                  21  15


Step 3: Sample selection

ID  Rotation ordering  Sample 1  Sample 2  Sample 3  Sample 4  Sample 5
13  1                  13        4         10        1         2
28  1                  28        19        25        16        17
4   2                  4         10        1         2         15
19  2                  19        25        16        17        30
10  3                  10        1         2         15        12
25  3                  25        16        17        30        27
1   4                  1         2         15        12        14
16  4                  16        17        30        27        29
2   5                  2         15        12        14        5
17  5                  17        30        27        29        20
15  6
30  6
12  7
27  7
14  8
29  8
5   9
20  9
11  10
26  10
7   11
22  11
3   12
18  12
8   13
23  13
9   14
24  14
6   15
21  15

Sample update procedure

In rotation group sampling, updating the sample consists of updating the population by removing the deaths (units that have disappeared) and including the births (new units), assigning the births randomly to rotation groups, and then selecting a new sample.

To illustrate, let us keep the assumption of a sample update in year 3, with two deaths (units 5 and 27) and three births (labelled 31 to 33) in that year.
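The update described above can be sketched in Python with the assign ordering, deaths and births of the example. This is an illustrative sketch under stated assumptions: the function name `update_rotation_groups` is hypothetical, births are assumed to continue the cyclic assignment where the original population stopped, and the year-3 sample is assumed to consist of whole rotation groups 3 to 7 as in section 4.1, which may differ from the convention used in the illustration.

```python
# Assign ordering of the 15 rotation groups used in the example.
ASSIGN_ORDERING = [4, 5, 12, 2, 9, 15, 11, 13, 14, 3, 10, 7, 1, 8, 6]
P = len(ASSIGN_ORDERING)

# Original population: unit k (1..30) goes to the group at position
# (k - 1) mod P of the assign ordering.
group_of = {uid: ASSIGN_ORDERING[(uid - 1) % P] for uid in range(1, 31)}


def update_rotation_groups(group_of, assign_ordering, deaths, births):
    """Remove the deaths and assign the births to rotation groups."""
    dead = set(deaths)
    updated = {uid: g for uid, g in group_of.items() if uid not in dead}
    start = len(group_of)  # assumption: births continue the assignment cycle
    for k, uid in enumerate(births):
        updated[uid] = assign_ordering[(start + k) % len(assign_ordering)]
    return updated


updated = update_rotation_groups(group_of, ASSIGN_ORDERING,
                                 deaths=[5, 27], births=[31, 32, 33])

# Year 3 with p = 5: rotation groups 3 to 7 (assumed whole-group selection).
sample_3 = sorted(uid for uid, g in updated.items() if 3 <= g <= 7)
print(sample_3)  # [1, 2, 10, 12, 15, 16, 17, 25, 30, 31, 32]
```

With these inputs the births 31, 32 and 33 fall in rotation groups 4, 5 and 12. Under the whole-group assumption the year-3 sample holds 11 units rather than the targeted 10, because groups 4 and 5 each gained a birth while group 7 lost a unit.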


Figure 8. Sample update procedure with the rotation group sampling method


As the illustrations show, the updating procedure may change the number of units that overlap from one year to the next relative to the target. This is a drawback of the sample update, although it improves the representativeness of the sample.

ID  Assign ordering    ID  Rotation ordering
1   4                  13  1
2   5                  28  1
3   12                 4   2
4   2                  19  2
5   9                  10  3
6   15                 25  3
7   11                 1   4
8   13                 16  4
9   14                 31  4
10  3                  2   5
11  10                 17  5
12  7                  32  5
13  1                  15  6
14  8                  30  6
15  6                  12  7
16  4                  27  7
17  5                  14  8
18  12                 29  8
19  2                  5   9
20  9                  20  9
21  15                 11  10
22  11                 26  10
23  13                 7   11
24  14                 22  11
25  3                  3   12
26  10                 18  12
27  7                  33  12
28  1                  8   13
29  8                  23  13
30  6                  9   14
31  4                  24  14
32  5                  6   15
33  12                 21  15

ID  Rotation ordering  Sample 1  Sample 2  Sample 3  Sample 4  Sample 5
13  1                  13        4         10        1         31
28  1                  28        19        25        16        2
4   2                  4         10        1         31        17
19  2                  19        25        16        2         32
10  3                  10        1         31        17        15
25  3                  25        16        2         32        30
1   4                  1         2         17        15        12
16  4                  16        17        32        30        -
31  4                  2         15        15        12        14
2   5                  17        30        30        -         29
17  5
32  5
15  6
30  6
12  7
27  7
14  8
29  8
5   9
20  9
11  10
26  10
7   11
22  11
3   12
18  12
33  12
8   13
23  13
9   14
24  14
6   15
21  15

-: no unit in this position

Update


5 Rotation design and PPS sampling

Sampling with probability proportional to size (PPS) can significantly improve estimates of totals when sampling units differ in size and the sampling frame includes a measure of size that is positively correlated with the survey's variables of interest.

In the FAO sampling methodology for agricultural surveys, PPS sampling is recommended for selecting primary sampling units (PSUs), and simple random sampling is suggested for selecting secondary sampling units (SSUs, the agricultural holdings) within a two-stage design. As mentioned in section 1, with two-stage sampling the rotation procedures should be applied to the SSUs in each sampled PSU, and the methods presented in the previous sections can be used for that purpose.

However, FAO recommends a single-stage sampling design for selecting samples of non-household agricultural holdings and special holdings (large farms, commercial holdings, etc.). Such holdings usually have very heterogeneous sizes, and PPS sampling could be used to select samples with a rotation design.

5.1 Pareto PPS sampling

An efficient approach used by Statistics Sweden (Lindblom and Teterukovsky, 2007) for selecting overlapping samples with PPS sampling is Pareto PPS sampling. We recommend this method here because it is easy to implement and has proven efficient. Rosén (1997) showed that Pareto PPS sampling yields lower sampling errors than other PPS sampling methods, including systematic PPS and Sunter PPS (Sunter, 1977).

Consider a stratified sampling design without replacement in which units are selected with PPS and, for a given stratum, let:

𝑛: sample size in the stratum

𝑥𝑖: measure of size of unit 𝑖

𝑝𝑖: probability of selection of unit 𝑖

$$p_i = \frac{n\,x_i}{\sum_{j} x_j} \qquad (5.1.1)$$

where the sum runs over all units of the stratum. The Pareto PPS sampling approach is simply as follows (Rosén, 1997):

(i). Generate independent standard uniform random variables 𝑈𝑖 for all sampling units 𝑖.

(ii). Compute the ranking variables 𝑄𝑖 using the formula:

$$Q_i = \frac{U_i\,(1 - p_i)}{p_i\,(1 - U_i)} \qquad (5.1.2)$$


The sample consists of the 𝑛 units with the smallest values of the ranking variable 𝑄𝑖. Note that, for a given 𝑝𝑖, the ranking variable 𝑄𝑖 is an increasing function of the PRN 𝑈𝑖. The stratified Pareto PPS scheme ensures that the actual inclusion probabilities satisfy 𝜋𝑖 ≈ 𝑝𝑖 to a good approximation for all units (Rosén, 1997; Lindblom and Teterukovsky, 2007).
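Steps (i) and (ii) and the final selection can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `pareto_pps_sample` and the toy sizes and random numbers are made up for illustration, and rejecting units with 𝑝𝑖 ≥ 1 is an assumption, since the text does not say how such units are handled.

```python
def pareto_pps_sample(sizes, prn, n):
    """Return the IDs of the n units with the smallest ranking values Q_i.

    sizes: dict mapping unit ID -> measure of size x_i
    prn:   dict mapping unit ID -> random number U_i, with 0 <= U_i < 1
    """
    total = sum(sizes.values())
    q = {}
    for uid, x in sizes.items():
        p = n * x / total  # formula (5.1.1)
        if not 0 < p < 1:
            # Assumption: units with p_i >= 1 need separate handling.
            raise ValueError(f"p_i outside (0, 1) for unit {uid}")
        u = prn[uid]
        q[uid] = u * (1 - p) / (p * (1 - u))  # formula (5.1.2)
    return sorted(q, key=q.get)[:n]


# Toy stratum of four units (made-up sizes and random numbers).
sizes = {1: 10, 2: 20, 3: 30, 4: 40}
prn = {1: 0.5, 2: 0.1, 3: 0.9, 4: 0.3}
sample = pareto_pps_sample(sizes, prn, n=2)
print(sample)  # [4, 2]
```

In practice the 𝑈𝑖 would be drawn once, for example with `random.random()`, and stored with the frame so that they can serve as permanent random numbers.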

To select overlapping samples with Pareto PPS, Statistics Sweden temporarily transforms the PRN 𝑈𝑖 into new random variables 𝑍𝑖 as a function of a given starting point 𝑆 (Lindblom and Teterukovsky, 2007):

$$Z_i = \begin{cases} U_i - S & \text{if } U_i \geq S \\ 1 + U_i - S & \text{if } U_i < S \end{cases} \qquad (5.1.3)$$

New ranking variables 𝑄𝑖∗ are then computed using the 𝑍𝑖 instead of the 𝑈𝑖:

$$Q_i^{*} = \frac{Z_i\,(1 - p_i)}{p_i\,(1 - Z_i)} \qquad (5.1.4)$$

The value of this transformation is that, as in PRN sampling, suitable choices of the starting point 𝑆 make it possible to select samples with the desired number of overlapping units.
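The effect of the starting point can be sketched in Python with the measures of size and PRNs of Table 4 (section 5.2) and the first two starting points used there, 0.08 and 0.14. This is an illustrative sketch: the helper names `shifted_prn` and `coordinated_sample` are hypothetical, and the PRNs are taken as printed, that is rounded to five decimals.

```python
# Measures of size X_i and permanent random numbers U_i for units 1..30,
# copied from Table 4.
X = [88, 32, 286, 228, 248, 65, 59, 141, 272, 220, 141, 103, 268, 240, 27,
     303, 362, 72, 93, 39, 307, 265, 120, 233, 122, 179, 74, 317, 53, 134]
U = [0.64985, 0.88124, 0.75384, 0.78103, 0.14281, 0.16858, 0.21072, 0.80233,
     0.14279, 0.43041, 0.39755, 0.86876, 0.16116, 0.08902, 0.57740, 0.82068,
     0.39322, 0.57949, 0.21135, 0.66833, 0.60101, 0.97080, 0.30023, 0.25735,
     0.12511, 0.84064, 0.47639, 0.35911, 0.49048, 0.35317]
N_SAMPLE = 10


def shifted_prn(u, s):
    """Formula (5.1.3): shift the PRN by the starting point s, modulo 1."""
    return u - s if u >= s else 1 + u - s


def coordinated_sample(start):
    """IDs of the N_SAMPLE units with the smallest Q_i* for this start."""
    total = sum(X)
    q_star = {}
    for uid, (x, u) in enumerate(zip(X, U), start=1):
        p = N_SAMPLE * x / total              # formula (5.1.1)
        z = shifted_prn(u, start)             # formula (5.1.3)
        q_star[uid] = z * (1 - p) / (p * (1 - z))  # formula (5.1.4)
    return sorted(q_star, key=q_star.get)[:N_SAMPLE]


sample_1 = coordinated_sample(0.08)
sample_2 = coordinated_sample(0.14)
print(sorted(sample_1))  # [5, 6, 9, 13, 14, 17, 19, 24, 25, 28]
print(sorted(sample_2))  # [5, 6, 9, 10, 13, 17, 19, 21, 24, 28]
print(len(set(sample_1) & set(sample_2)))  # 8
```

Moving the starting point from 0.08 to 0.14 pushes units 14 and 25 to the end of the ranking and lets units 10 and 21 in, so eight of the ten units are kept.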

5.2 Application

Coordinated samples can be selected with Pareto PPS, following the Statistics Sweden approach described above, using the R package ‘prnsamplr’. Below we propose a simple illustration of the approach using Excel. Suppose that overlapping samples of 10 units, with two units (20 percent) rotated between consecutive occasions, are to be selected from a population of 30 units with measures of size Xi and PRN Ui. The probability of selection pi is calculated using formula 5.1.1.


Table 4. Sample selection with the Pareto PPS sampling method

ID Xi Pi Ui

1 88 0.17285 0.64985

2 32 0.06286 0.88124

3 286 0.56178 0.75384

4 228 0.44785 0.78103

5 248 0.48713 0.14281

6 65 0.12768 0.16858

7 59 0.11589 0.21072

8 141 0.27696 0.80233

9 272 0.53428 0.14279

10 220 0.43214 0.43041

11 141 0.27696 0.39755

12 103 0.20232 0.86876

13 268 0.52642 0.16116

14 240 0.47142 0.08902

15 27 0.05303 0.57740

16 303 0.59517 0.82068

17 362 0.71106 0.39322

18 72 0.14143 0.57949

19 93 0.18268 0.21135

20 39 0.07661 0.66833

21 307 0.60302 0.60101

22 265 0.52053 0.97080

23 120 0.23571 0.30023

24 233 0.45767 0.25735

25 122 0.23964 0.12511

26 179 0.35160 0.84064

27 74 0.14535 0.47639

28 317 0.62267 0.35911

29 53 0.10411 0.49048

30 134 0.26321 0.35317

Total 5091


Setting the starting points in line with the overlap objective: two units are rotated from one year to the next.

ID Xi Pi Ui S

14 240 0.47142 0.08902 0.08 S1

25 122 0.23964 0.12511

9 272 0.53428 0.14279 0.14 S2

5 248 0.48713 0.14281

13 268 0.52642 0.16116 0.16 S3

6 65 0.12768 0.16858

7 59 0.11589 0.21072 0.21 S4

19 93 0.18268 0.21135

24 233 0.45767 0.25735 0.25 S5

23 120 0.23571 0.30023

30 134 0.26321 0.35317

28 317 0.62267 0.35911

17 362 0.71106 0.39322

11 141 0.27696 0.39755

10 220 0.43214 0.43041

27 74 0.14535 0.47639

29 53 0.10411 0.49048

15 27 0.05303 0.57740

18 72 0.14143 0.57949

21 307 0.60302 0.60101

1 88 0.17285 0.64985

20 39 0.07661 0.66833

3 286 0.56178 0.75384

4 228 0.44785 0.78103

8 141 0.27696 0.80233

16 303 0.59517 0.82068

26 179 0.35160 0.84064

12 103 0.20232 0.86876

2 32 0.06286 0.88124

22 265 0.52053 0.97080

Total 5091


Calculating 𝑍1, …, 𝑍5 and 𝑄1∗, …, 𝑄5∗ using the starting points 𝑆1, …, 𝑆5 and formulas 5.1.3 and 5.1.4

ID Z1 Z2 Z3 Z4 Z5 Q1* Q2* Q3* Q4* Q5*

1 0.570 0.510 0.490 0.440 0.400 6.339347 4.977569 4.594826 3.757545 3.188168

2 0.801 0.741 0.721 0.671 0.631 60.10442 42.71026 38.57619 30.44163 25.52227

3 0.674 0.614 0.594 0.544 0.504 1.6116 1.239993 1.140522 0.930005 0.792141

4 0.701 0.641 0.621 0.571 0.531 2.890954 2.201668 2.020409 1.641207 1.396064

5 0.063 0.003 0.983 0.933 0.893 0.070559 0.002966 60.19023 14.6163 8.769118

6 0.089 0.029 0.009 0.959 0.919 0.66404 0.201026 0.059141 158.1269 77.08396

7 0.131 0.071 0.051 0.001 0.961 1.147223 0.580587 0.407626 0.005514 186.5983

8 0.722 0.662 0.642 0.592 0.552 6.791425 5.120772 4.688447 3.793219 3.22102

9 0.063 0.003 0.983 0.933 0.893 0.058401 0.002439 49.7796 12.09804 7.259024

10 0.350 0.290 0.270 0.220 0.180 0.708862 0.537809 0.487044 0.371526 0.289259

11 0.318 0.258 0.238 0.188 0.148 1.214747 0.905605 0.813369 0.602648 0.45187

12 0.789 0.729 0.709 0.659 0.619 14.72185 10.59314 9.594941 7.611347 6.399088

13 0.081 0.021 0.001 0.951 0.911 0.079459 0.019445 0.001042 17.51903 9.226377

14 0.009 0.949 0.929 0.879 0.839 0.010211 20.87474 14.67652 8.147179 5.844109

15 0.497 0.437 0.417 0.367 0.327 17.67081 13.88201 12.79249 10.37011 8.691506

16 0.741 0.681 0.661 0.611 0.571 1.942842 1.449969 1.324413 1.06696 0.904175

17 0.313 0.253 0.233 0.183 0.143 0.185326 0.137788 0.123595 0.091154 0.067927

18 0.499 0.439 0.419 0.369 0.329 6.05845 4.760068 4.386917 3.557608 2.983214

19 0.131 0.071 0.051 0.001 0.961 0.676542 0.343754 0.242179 0.006042 111.2835

20 0.588 0.528 0.508 0.458 0.418 17.22618 13.50158 12.46205 10.19908 8.668813

21 0.521 0.461 0.441 0.391 0.351 0.716058 0.563064 0.519365 0.422675 0.356049

22 0.891 0.831 0.811 0.761 0.721 7.514301 4.522976 3.947483 2.929787 2.378076

23 0.220 0.160 0.140 0.090 0.050 0.91576 0.618662 0.528843 0.321575 0.171473

24 0.177 0.117 0.097 0.047 0.007 0.25546 0.157543 0.127797 0.058896 0.008772

25 0.045 0.985 0.965 0.915 0.875 0.149884 209.88 87.76136 34.20303 22.23245

26 0.761 0.701 0.681 0.631 0.591 5.860179 4.316042 3.930264 3.148596 2.660744

27 0.396 0.336 0.316 0.266 0.226 3.861206 2.980484 2.721266 2.135056 1.720646

28 0.279 0.219 0.199 0.149 0.109 0.234627 0.170038 0.150658 0.106196 0.074219

29 0.410 0.350 0.330 0.280 0.240 5.992201 4.6437 4.24791 3.3547 2.724805

30 0.273 0.213 0.193 0.143 0.103 1.052074 0.758389 0.670201 0.467742 0.322028


Selecting the samples


Sample 1 Sample 2 Sample 3 Sample 4 Sample 5

ID Q1* ID Q2* ID Q3* ID Q4* ID Q5*

14 0.010211 9 0.002439 13 0.001042 7 0.005514 24 0.008772

9 0.058401 5 0.002966 6 0.059141 19 0.006042 17 0.067927

5 0.070559 13 0.019445 17 0.123595 24 0.058896 28 0.074219

13 0.079459 17 0.137788 24 0.127797 17 0.091154 23 0.171473

25 0.149884 24 0.157543 28 0.150658 28 0.106196 10 0.289259

17 0.185326 28 0.170038 19 0.242179 23 0.321575 30 0.322028

28 0.234627 6 0.201026 7 0.407626 10 0.371526 21 0.356049

24 0.25546 19 0.343754 10 0.487044 21 0.422675 11 0.45187

6 0.66404 10 0.537809 21 0.519365 30 0.467742 3 0.792141

19 0.676542 21 0.563064 23 0.528843 11 0.602648 16 0.904175

10 0.708862 7 0.580587 30 0.670201 3 0.930005 4 1.396064

21 0.716058 23 0.618662 11 0.813369 16 1.06696 27 1.720646

23 0.91576 30 0.758389 3 1.140522 4 1.641207 22 2.378076

30 1.052074 11 0.905605 16 1.324413 27 2.135056 26 2.660744

7 1.147223 3 1.239993 4 2.020409 22 2.929787 29 2.724805

11 1.214747 16 1.449969 27 2.721266 26 3.148596 18 2.983214

3 1.6116 4 2.201668 26 3.930264 29 3.3547 1 3.188168

16 1.942842 27 2.980484 22 3.947483 18 3.557608 8 3.22102

4 2.890954 26 4.316042 29 4.24791 1 3.757545 14 5.844109

27 3.861206 22 4.522976 18 4.386917 8 3.793219 12 6.399088

26 5.860179 29 4.6437 1 4.594826 12 7.611347 9 7.259024

29 5.992201 18 4.760068 8 4.688447 14 8.147179 20 8.668813

18 6.05845 1 4.977569 12 9.594941 20 10.19908 15 8.691506

1 6.339347 8 5.120772 20 12.46205 15 10.37011 5 8.769118

8 6.791425 12 10.59314 15 12.79249 9 12.09804 13 9.226377

22 7.514301 20 13.50158 14 14.67652 5 14.6163 25 22.23245

12 14.72185 15 13.88201 2 38.57619 13 17.51903 2 25.52227

20 17.22618 14 20.87474 9 49.7796 2 30.44163 6 77.08396

15 17.67081 2 42.71026 5 60.19023 25 34.20303 19 111.2835

2 60.10442 25 209.88 25 87.76136 6 158.1269 7 186.5983

Sample 1 Sample 2 Sample 3 Sample 4 Sample 5

14 9 13 7 24

9 5 6 19 17

5 13 17 24 28

13 17 24 17 23

25 24 28 28 10

17 28 19 23 30

28 6 7 10 21

24 19 10 21 11

6 10 21 30 3

19 21 23 11 16


References

Antonaci, G. d. A. & Silva, D. B. d. N. 2007. Analysis of alternative rotation patterns for the Brazilian

system of integrated household surveys. In Proceedings of the 56th Session of the International

Statistical Institute (ISI).

Chromy, J. R. 1979. Sequential sample selection methods. Proceedings of the Section on Survey Research Methods, American Statistical Association, 401-406.

Davies, C. 2009. Area Frame Design for Agricultural Surveys. RDD Research Report, Research and

Development Division, USDA-NASS, Fairfax, VA.

Ernst, L. R., Valliant, R. & Casady, R. J. 2000. Permanent and collocated random number sampling

and the coverage of births and deaths. Journal of Official Statistics 16.3: 211-228

Fan, C. T., Muller, M. E. & Rezucha, I. 1962. Development of Sampling Plans by Using Sequential

(Item by Item) Selection Techniques and Digital Computers. Journal of the American Statistical

Association, 57, 387-402.

FAO. 2015. World Census of Agriculture 2020. Volume 1: Programme, concepts and definitions. Rome.

FAO. 2017. Handbook on the Agricultural Integrated Survey (AGRIS). GSARS. Rome

Graham, J. E. 1963. Rotation designs for sampling on successive occasions. Retrospective Theses and

Dissertations. Paper 2384.

Gurney, M. & Daley, J.F. 1965. A Multivariate Approach to Estimation in Periodic Sample Surveys.

Proceedings of the Social Statistics Section, American Statistical Association, 242-257

Koop, J.C. 1988. The Technique of Replicated or Interpenetrating Samples, in Handbook to Statistics:

Sampling, Vol. 6, New York: Elsevier Science Publishers B. V., 333-368

Lindblom, A. & Teterukovsky, A. 2007. Coordination of Stratified Pareto pps Samples and Stratified

Simple Random Samples at Statistics Sweden. Paper presented at the ICES-III, June 18-21, 2007,

Montreal, Quebec, Canada

Ohlsson, E. 1992. SAMU, The system for Co-ordination of Samples from the Business Register at

Statistics Sweden-A methodological description, R&D Report 1992: 18, Stockholm: Statistics Sweden

Ohlsson, E. 1995. Coordination of Samples Using Permanent Random Numbers. In Business Survey

Methods, edited by Brenda Cox et al., pp 153-169. Wiley, New York.

Rao, J.N.K. & Graham, J.E. 1964. Rotation designs for sampling on repeated occasions. Journal of the American Statistical Association, 59, 492-509.

Srinath, K.P. & Carpenter, R.M. 1995. Sampling methods for repeated business surveys. In Business

Survey Methods, edited by Brenda Cox et al., pp 171-183. Wiley, New York.

Sunter, A.B. 1977. List sequential sampling with equal or unequal probabilities without replacement.

Appl. Statist. 26, 261-268.


Annex: Overview of composite estimators

This note does not aim to cover estimation procedures for partially overlapping repeated surveys. There are many such methods, which could be covered in a separate document. However, we give below an overview of a popular class of estimators: the composite estimators.

When rotating samples are selected for different survey occasions, each sample is valid for reliable cross-sectional estimates on the corresponding survey occasion. However, more efficient alternative estimators have been proposed in the literature for both cross-sectional and longitudinal estimation. Among them, the composite estimators are certainly the most popular. Composite estimation combines estimates from the current survey occasion with those from previous occasions in an efficient manner, producing estimates that are generally more accurate for most characteristics, and in particular for estimates of change (Rao and Graham, 1964; Steel and McLaren, 2008).

Simple Composite Estimator

Gurney and Daley (1965) discuss a number of composite estimators in the specific framework of the US Current Population Survey. A very basic form of composite estimator of a population mean, called the Simple Composite Estimator, is:

$$\bar{y}_t^{K} = (1 - K)\,\bar{y}_t + K\left(\bar{y}_{t-1}^{K} + \bar{y}_t^{M} - \bar{y}_{t-1}^{M}\right) \qquad (1)$$

The simple composite estimator of the average change is:

$$d_t^{K} = \bar{y}_t^{K} - \bar{y}_{t-1}^{K} \qquad (2)$$

Where:

$\bar{y}_t^{K}$ and $\bar{y}_{t-1}^{K}$ are respectively the composite estimators for the current period $t$ and the previous period $t-1$

$\bar{y}_t$ is the simple unbiased estimate of the mean for the current period $t$

$\bar{y}_t^{M}$ and $\bar{y}_{t-1}^{M}$ are the estimates, respectively for the current period $t$ and the previous period $t-1$, from the units of the sample at period $t$ that were also in the sample at period $t-1$

$K$ is a constant weight factor between 0 and 1

Compared with the estimators $\bar{y}_t$ and $(\bar{y}_t - \bar{y}_{t-1})$ of, respectively, the average and the change for the current period $t$, Rao and Graham (1964) show that the simple composite estimators $\bar{y}_t^{K}$ and $d_t^{K}$ have lower variances, and they discuss optimum values for $K$.
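Equations (1) and (2) define a recursion over survey occasions, which can be sketched in Python as follows. This is a minimal sketch with made-up numbers: the function name `simple_composite` is hypothetical, and starting the recursion at the direct estimate of the first occasion is an assumption, since the text does not specify an initial value.

```python
def simple_composite(y_bar, y_matched_curr, y_matched_prev, K):
    """Simple composite estimates for occasions 0..T, following equation (1).

    y_bar[t]:          direct estimate of the mean at occasion t
    y_matched_curr[t]: estimate at t from units also in the sample at t - 1
    y_matched_prev[t]: estimate at t - 1 from those same matched units
    Entry 0 of the two matched lists is unused.
    """
    if not 0 <= K <= 1:
        raise ValueError("K must lie between 0 and 1")
    # Assumption: the recursion starts at the direct estimate.
    composite = [y_bar[0]]
    for t in range(1, len(y_bar)):
        carried = composite[t - 1] + y_matched_curr[t] - y_matched_prev[t]
        composite.append((1 - K) * y_bar[t] + K * carried)
    return composite


# Made-up estimates for two occasions.
y_bar = [10.0, 12.0]
y_matched_curr = [None, 11.0]
y_matched_prev = [None, 10.0]

composite = simple_composite(y_bar, y_matched_curr, y_matched_prev, K=0.5)
change = composite[1] - composite[0]  # equation (2)
print(composite, change)  # [10.0, 11.5] 1.5
```

With 𝐾 = 0 the function returns the direct estimates unchanged, which gives a quick check of the recursion.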

AK composite estimator

The simple composite estimator can be improved with an additional term that further reduces the variance and lessens the impact of the time-in-survey effect (Steel and McLaren, 2008). This new composite estimator, called the AK composite estimator, is particularly efficient when, in equation (1), $1 - K$ is relatively higher than $K$ (Gurney and Daley, 1965).

$$\bar{y}_t^{AK} = (1 - K)\,\bar{y}_t + K\left(\bar{y}_{t-1}^{AK} + \bar{y}_t^{M} - \bar{y}_{t-1}^{M}\right) + A\left(\bar{y}_t^{N} - \bar{y}_t^{M}\right) \qquad (3)$$

where the new term $\bar{y}_t^{N}$ is the estimate for the current period $t$ from the new units rotated into the sample in period $t$.


Optimal values of the parameters 𝐾 and 𝐴 depend on the variable 𝑌 and the time period.
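One step of equation (3) can be written as a small Python function. This is a sketch with made-up inputs; the function name `ak_composite_step` and the values chosen for 𝐾 and 𝐴 are illustrative assumptions, not recommended settings.

```python
def ak_composite_step(prev_ak, y_t, y_t_matched, y_prev_matched, y_t_new, K, A):
    """AK composite estimate at occasion t, following equation (3).

    prev_ak:        AK composite estimate at occasion t - 1
    y_t:            direct estimate of the mean at occasion t
    y_t_matched:    estimate at t from units also in the sample at t - 1
    y_prev_matched: estimate at t - 1 from those same matched units
    y_t_new:        estimate at t from the units newly rotated in at t
    """
    carried = prev_ak + y_t_matched - y_prev_matched
    return (1 - K) * y_t + K * carried + A * (y_t_new - y_t_matched)


# Made-up inputs; K and A are illustrative values only.
estimate = ak_composite_step(prev_ak=10.0, y_t=12.0, y_t_matched=11.0,
                             y_prev_matched=10.0, y_t_new=13.0, K=0.4, A=0.2)
print(round(estimate, 6))  # 12.0
```

Setting 𝐴 = 0 reduces the step to the simple composite estimator of equation (1).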

Contact:

Statistics Division – Economic and Social Development

http://www.fao.org/food-agriculture-statistics/resources/publications/working-papers/en/

[email protected]

Food and Agriculture Organization of the United Nations

Rome, Italy


