FAO Statistics Working Paper Series / 21-22
OPERATIONAL PROCEDURES FOR SELECTING SAMPLES
FOR REPEATED AGRICULTURAL SURVEYS
WITH A ROTATION DESIGN
Dramane Bako
Food and Agriculture Organization of the United Nations
Rome, 2021
Required citation:
Bako, D. 2021. Operational procedures for selecting samples for repeated agricultural surveys with a rotation design. Rome, FAO. https://doi.org/10.4060/cb4074en
The designations employed and the presentation of material in this information product do not imply the expression of any opinion
whatsoever on the part of the Food and Agriculture Organization of the United Nations (FAO) concerning the legal or development
status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. The mention
of specific companies or products of manufacturers, whether or not these have been patented, does not imply that these have been
endorsed or recommended by FAO in preference to others of a similar nature that are not mentioned.
The views expressed in this information product are those of the author(s) and do not necessarily reflect the views or policies of FAO.
ISBN 978-92-5-134194-0
© FAO, 2021
Some rights reserved. This work is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 IGO
licence (CC BY-NC-SA 3.0 IGO; https://creativecommons.org/licenses/by-nc-sa/3.0/igo/legalcode).
Under the terms of this licence, this work may be copied, redistributed and adapted for non-commercial purposes, provided that the
work is appropriately cited. In any use of this work, there should be no suggestion that FAO endorses any specific organization,
products or services. The use of the FAO logo is not permitted. If the work is adapted, then it must be licensed under the same or
equivalent Creative Commons licence. If a translation of this work is created, it must include the following disclaimer along with the
required citation: “This translation was not created by the Food and Agriculture Organization of the United Nations (FAO). FAO is not
responsible for the content or accuracy of this translation. The original [Language] edition shall be the authoritative edition.”
Disputes arising under the licence that cannot be settled amicably will be resolved by mediation and arbitration as described in Article
8 of the licence except as otherwise provided herein. The applicable mediation rules will be the mediation rules of the World Intellectual
Property Organization http://www.wipo.int/amc/en/mediation/rules and any arbitration will be conducted in accordance with the
Arbitration Rules of the United Nations Commission on International Trade Law (UNCITRAL).
Third-party materials. Users wishing to reuse material from this work that is attributed to a third party, such as tables, figures or
images, are responsible for determining whether permission is needed for that reuse and for obtaining permission from the copyright
holder. The risk of claims resulting from infringement of any third-party-owned component in the work rests solely with the user.
Sales, rights and licensing. FAO information products are available on the FAO website (www.fao.org/publications) and can be
purchased through [email protected]. Requests for commercial use should be submitted via:
www.fao.org/contact-us/licence-request. Queries regarding rights and licensing should be submitted to: [email protected].
Contents
Acknowledgements
Introduction
1 Rotation in single-stage and multistage sampling
2 Use of permanent random numbers (PRN)
2.1 Overview
2.2 Application
2.2.1 Selection of all samples during the initial year
2.2.2 Sample update procedure
2.2.3 Use of existing statistical software for PRN sampling
3 Repeated collocated sampling
3.1 Overview
3.2 Application
4 Rotation group sampling
4.1 Overview
4.2 Application
5 Rotation design and PPS sampling
5.1 Pareto PPS sampling
5.2 Application
References
Annex: Overview of composite estimators
Acknowledgements
This document is part of the methodological work of the Survey Team of the Statistics Division of the Food and Agriculture Organization of the United Nations (FAO), which provides operational guidance on selected areas of agricultural survey methodology with the overall objective of promoting cost-effective practices in agricultural survey implementation. This work is conducted under the overall coordination of Christophe Duhamel and the technical supervision of Flavio Bolliger and Neli Georgieva.
This publication was prepared by Dramane Bako. The document benefited from valuable comments and
inputs from Pedro Luis do Nascimento Silva, Jacques Delince, Flavio Bolliger, Silvia Missiroli and Oleg Cara.
The author thanks Pierre Lavallée for his availability for discussions on the topic in the framework of the
development of the methodological note.
Introduction
In agricultural surveys, the target populations are agricultural holdings¹ that are, broadly speaking, independent producers of agricultural products. Agricultural holdings are generally classified into two categories: (i) holdings in the household sector (operated by households) and (ii) holdings in the non-household sector (operated by other structures such as corporations and government institutions). The FAO sampling strategy for agricultural surveys recommends a two-stage sampling design for the first category and a single-stage sampling design for the second (FAO, 2017).
The agricultural survey programs proposed by FAO to countries recommend the implementation of a census of agriculture at least once every ten years and regular annual agricultural surveys. From one year to another, there are three alternatives regarding the samples for such repeated surveys: (i) selecting a new sample every year (often called “repeated cross-section”), (ii) using the same sample for a number of years (panel) or (iii) changing a proportion of the sample from one year to another (partial rotation).
The first option would substantially increase the operational costs of the survey program, as selecting a sample every year requires updating the sampling frame and locating the new sampling units for survey implementation. In addition, it does not guarantee enough overlapping units between the samples of two survey rounds for longitudinal analyses and reliable estimates of change. For developing countries, FAO recommends either the panel or the partial rotation design as cost-effective options for annual survey programs. The panel design allows both cross-sectional and longitudinal analyses with, in theory, all sample units. It is less costly and presents some operational advantages, as the enumerators interview the same holdings every year. However, the panel sample could suffer from attrition and obsolescence, which would erode its representativeness and increase sampling errors and the operational costs of tracking missing units. The partial rotation scheme is a good alternative: it addresses sample attrition by renewing part of the sample while still allowing longitudinal analyses over two different survey occasions.
The main objective of this note is to present how to perform sample selection with partial rotation over
the survey cycle. A number of methods recommended in the literature are proposed here considering
their suitability, cost effectiveness and ease of implementation in the context of agricultural surveys in
developing countries.
1 The World Programme for the Census of Agriculture 2020 (WCA 2020) defines the agricultural holding as “economic units of agricultural production under single management comprising all livestock kept and all land used wholly or partly for agricultural production purposes, without regard to title, legal form or size. Single management may be exercised by an individual or household, jointly by two or more individuals or households, by a clan or tribe, or by a juridical person such as a corporation, cooperative or government agency. The holding’s land may consist of one or more parcels, located in one or more separate areas or in one or more territorial or administrative divisions, providing the parcels share the same production means, such as labor, farm buildings, machinery or draught animals” (FAO, 2015).
1 Rotation in single-stage and multistage sampling
Single-stage and multistage sampling are the most common sampling designs for agricultural surveys (FAO, 2017). In the case of single-stage sampling (as recommended for non-household holdings), the procedures proposed in this document can be used to select rotating samples in the population, or in each stratum if stratification is performed.
In multistage sampling, rotation is advised at the final selection stage. For instance, in two-stage sampling, it is recommended to rotate the secondary sampling units (SSU) rather than the primary sampling units (PSU). Graham (1963, p. 108) recognises the cost advantages of maintaining a fixed set of PSU, although higher variability between them may be observed in some cases; the paper ultimately recommends rotating the higher-stage sampling units. Indeed, rotating the PSU would be more expensive, as it would imply updating more populations (the populations of SSU in more PSU, and the population of PSU on each survey occasion). In addition, rotating SSU is likely to produce smoother estimates than rotating PSU. Therefore, with two-stage sampling, the rotation procedures should be applied to the SSU within each sampled PSU.
2 Use of permanent random numbers (PRN)
2.1 Overview
The PRN technique is a method for selecting a sample using a simple random sampling without replacement (SRSWOR) design. The technique belongs to the family of sequential sample selection methods, which differ from conventional methods in the way random numbers are used to determine the sample (Chromy 1979; Fan et al. 1962). PRN sampling (as labelled by Ernest et al. (2000)) consists of the following steps:
(a). Independently assign a random number 𝑢𝑖 from the uniform distribution 𝑈[0,1] to each unit in the population;
(b). Sort the frame in ascending order of the 𝑢𝑖;
(c). Starting at any point 𝑢0 (the starting point), take as the sample the first 𝑛 units with 𝑢𝑖 > 𝑢0. The frame is treated as a circular list: if 𝑛 units are not obtained in the interval [𝑢0, 1], wrap around to 0 and continue.
Ohlsson (1992) presents a formal proof that this technique produces an SRSWOR.
The starting point 𝑢0 (referred to in step (c) above) may be fixed or may correspond to the PRN of a unit selected with equal probability in the population (see Ernest et al. 2000).
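Steps (a) to (c) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `prn_sample`, the dictionary layout of the frame and the seed are assumptions made for the example, not part of the method as published.

```python
import random

def prn_sample(prn, n, u0=0.0):
    """Return the first n unit ids with PRN > u0, wrapping around at 1.

    prn: dict mapping unit id -> permanent random number in [0, 1).
    """
    if n > len(prn):
        raise ValueError("sample size exceeds population size")
    ordered = sorted(prn, key=prn.get)               # step (b): ascending PRN
    after = [i for i in ordered if prn[i] > u0]      # units to the right of u0
    before = [i for i in ordered if prn[i] <= u0]    # reached only after wrapping
    return (after + before)[:n]                      # step (c): circular list

rng = random.Random(2021)                            # hypothetical seed
frame = {unit: rng.random() for unit in range(1, 31)}  # step (a)
sample = prn_sample(frame, n=10, u0=0.9)
print(sample)
```

With a starting point close to 1, as in the call above, the selection runs out of units before reaching 𝑛 and continues from the smallest PRN, which is the circular-list behaviour described in step (c).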
In the framework of rotation sampling, the PRN technique offers two important advantages: (i) it facilitates the update of the frame on each survey occasion and (ii) it facilitates the selection of overlapping samples.
Updating the frame with the PRN technique proceeds as follows: new units that appeared in the population (births) are added to the frame and a new PRN is generated for each of them; units that left the population (deaths) are removed from the frame. There should be clear procedures for important changes to units, in particular mergers and splits:
• If a unit splits into two or more units, one of the new units (e.g. the largest one) may keep the PRN of the unit that split, and new PRN are generated for the other new units. However, if the change is considered substantial, it may be treated as the death of the initial unit and the births of new units. In that case, new PRN should be generated for all new units.
• If two or more units merge, the new unit may keep the PRN of one of the merged units (e.g. that of the largest unit).
The update of the frame would require a new listing operation unless births, deaths and other changes in the population are tracked systematically. For developing countries, budget constraints may not allow annual listing operations to update the frame. If removing all deaths from the whole frame is too costly on a given survey occasion, the deaths can be removed from the previous sample only and then, sequentially, from the units that follow the last sampled unit in the previous frame, until the expected sample size is achieved.
In addition, the PRN technique allows selecting different samples without overlap (negative coordination) or with overlap (positive coordination), thanks to the flexibility in the choice of the starting points. Accordingly, selecting samples with a partial rotation scheme becomes quite easy with the PRN technique, as it simply consists of selecting samples with positive coordination and the overlap required by the survey design.
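The frame update rules above (births, deaths, splits and merges) can be illustrated with a small Python sketch. The helper names, the unit identifiers such as "3a" and the size values are hypothetical and serve only to show which units keep their PRN; the "largest unit inherits" choice is the example given in the text, not the only admissible rule.

```python
import random

def add_births(prn, births, rng):
    """Births enter the frame with a freshly generated PRN."""
    for unit in births:
        prn[unit] = rng.random()

def remove_deaths(prn, deaths):
    """Deaths are dropped from the frame."""
    for unit in deaths:
        prn.pop(unit, None)

def split_unit(prn, old, new_sizes, rng):
    """`old` splits into the units of new_sizes (id -> size).

    The largest new unit inherits the PRN of `old`; the others get new PRN.
    """
    inherited = prn.pop(old)
    heir = max(new_sizes, key=new_sizes.get)
    for unit in new_sizes:
        prn[unit] = inherited if unit == heir else rng.random()

def merge_units(prn, old_sizes, new):
    """Units of old_sizes (id -> size) merge into `new`, which keeps the PRN of the largest."""
    kept = prn[max(old_sizes, key=old_sizes.get)]
    for unit in old_sizes:
        del prn[unit]
    prn[new] = kept

rng = random.Random(7)                                # hypothetical seed
frame = {unit: rng.random() for unit in range(1, 11)}
before = dict(frame)                                  # copy kept for comparison

remove_deaths(frame, [5])
add_births(frame, [11], rng)
split_unit(frame, 3, {"3a": 4.0, "3b": 1.5}, rng)     # hypothetical sizes
merge_units(frame, {1: 2.0, 2: 6.0}, "1+2")           # hypothetical sizes
```

Units that are unaffected keep their PRN unchanged, which is what makes the samples of successive occasions overlap.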
Ohlsson (1995) reports that PRN sampling is used for coordinating samples with partial rotation designs in the national statistical offices of many countries, including Sweden, Australia, New Zealand and France. The method is also used in the Brazilian Labour Force Survey (see Antonaci and Silva, 2007).
2.2 Application
Let’s consider a five-year survey plan with a partial sample rotation scheme (20 percent sample rotation from one year to another) and suppose we are using a two-stage sampling design. Suppose we want to select five samples of 10 secondary sampling units (agricultural holdings) with 20 percent rotation (80 percent sample overlap) in each selected primary sampling unit (say, an enumeration area).
Let’s use the PRN technique for an enumeration area with a population of 30 agricultural holdings. We can either select all five samples during the first year or, if there are enough resources, update the population at some point in time and update the sample accordingly.
2.2.1 Selection of all samples during the initial year
Figure 1. Operational steps for sample selection with the PRN sampling method
Step 1: Generate a PRN for each unit of the population
Step 2: Sort the frame in ascending order of the PRN
Step 3: Select the rotating samples
Source: Author's own elaboration, 2021.
Step 1 (frame with the generated PRN):
ID PRN
1 0.71591221
2 0.55655816
3 0.75315888
4 0.80058705
5 0.62235617
6 0.33324844
7 0.80201168
8 0.88476671
9 0.31148278
10 0.89016459
11 0.35002149
12 0.47981483
13 0.06673275
14 0.70801775
15 0.83569431
16 0.96405383
17 0.21768859
18 0.97995322
19 0.02679974
20 0.58936127
21 0.98458031
22 0.67577825
23 0.71421294
24 0.22740058
25 0.76220788
26 0.54986539
27 0.39809190
28 0.63592297
29 0.85170579
30 0.68497418
Step 2 (frame sorted in ascending order of the PRN):
ID PRN
19 0.02679974
13 0.06673275
17 0.21768859
24 0.22740058
9 0.31148278
6 0.33324844
11 0.35002149
27 0.39809190
12 0.47981483
26 0.54986539
2 0.55655816
20 0.58936127
5 0.62235617
28 0.63592297
22 0.67577825
30 0.68497418
14 0.70801775
23 0.71421294
1 0.71591221
3 0.75315888
25 0.76220788
4 0.80058705
7 0.80201168
15 0.83569431
29 0.85170579
8 0.88476671
10 0.89016459
16 0.96405383
18 0.97995322
21 0.98458031
Step 3 (sorted frame as in Step 2, with the five samples marked as blocks of 10 consecutive rows): sample 1 = rows 1 to 10, sample 2 = rows 3 to 12, sample 3 = rows 5 to 14, sample 4 = rows 7 to 16, sample 5 = rows 9 to 18.
To achieve the rotation of 20 percent of the sample (two units here), step 3 above simply consists of moving the starting point forward by two units from the previous starting point.
Figure 2. Samples selected with the PRN sampling method
Sample1 Sample2 Sample3 Sample4 Sample5
19 17 9 11 12
13 24 6 27 26
17 9 11 12 2
24 6 27 26 20
9 11 12 2 5
6 27 26 20 28
11 12 2 5 22
27 26 20 28 30
12 2 5 22 14
26 20 28 30 23
Source: Author's own elaboration, 2021.
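The selection in Figures 1 and 2 can be reproduced with the short Python sketch below. The PRN values are those listed in Figure 1; the variable names are illustrative only.

```python
# PRN of the 30 holdings in ID order (ID 1 first), copied from Figure 1
values = [
    0.71591221, 0.55655816, 0.75315888, 0.80058705, 0.62235617, 0.33324844,
    0.80201168, 0.88476671, 0.31148278, 0.89016459, 0.35002149, 0.47981483,
    0.06673275, 0.70801775, 0.83569431, 0.96405383, 0.21768859, 0.97995322,
    0.02679974, 0.58936127, 0.98458031, 0.67577825, 0.71421294, 0.22740058,
    0.76220788, 0.54986539, 0.39809190, 0.63592297, 0.85170579, 0.68497418,
]
prn = {unit_id: u for unit_id, u in enumerate(values, start=1)}

n = 10        # sample size
rotated = 2   # 20 percent of the sample is replaced each year

ordered = sorted(prn, key=prn.get)   # Step 2: sort the frame by PRN

# Step 3: each year starts two units after the previous year's starting point
samples = [ordered[t * rotated : t * rotated + n] for t in range(5)]

for year, sample in enumerate(samples, start=1):
    print(f"Sample{year}:", sample)

# Number of units shared by each pair of consecutive samples
overlaps = [len(set(a) & set(b)) for a, b in zip(samples, samples[1:])]
print("Overlap between consecutive samples:", overlaps)
```

Each consecutive pair of samples shares eight units, i.e. the 80 percent overlap required by the design.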
2.2.2 Sample update procedure
Let’s suppose that in year 3 there are enough resources to update the populations of all enumeration areas. After the update of the enumeration area considered above, suppose that agricultural holdings 5 and 27 have disappeared from the population and three new holdings (labelled 31, 32 and 33) have appeared.
The update of the sampling frame is straightforward. It simply consists of (i) removing units 5 and 27 from the population, (ii) adding the new units 31-33 and (iii) generating new PRN for these new units.
New samples can then be selected for year 3 following the procedure described above. However, the new starting point for year 3 should be chosen carefully so as to keep, as far as possible, the planned number of overlapping units with the sample of year 2.
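The update can be sketched in Python as follows. The PRN values (including those of units 31 to 33) are the ones listed in the document's figures. The rule used here to pick the year 3 starting point, namely skipping two units from the first unit of sample 2 in the updated frame, is an assumption made for this sketch; it yields the updated samples reported in the document.

```python
# PRN of holdings 1..30 in ID order, as in Figure 1
values = [
    0.71591221, 0.55655816, 0.75315888, 0.80058705, 0.62235617, 0.33324844,
    0.80201168, 0.88476671, 0.31148278, 0.89016459, 0.35002149, 0.47981483,
    0.06673275, 0.70801775, 0.83569431, 0.96405383, 0.21768859, 0.97995322,
    0.02679974, 0.58936127, 0.98458031, 0.67577825, 0.71421294, 0.22740058,
    0.76220788, 0.54986539, 0.39809190, 0.63592297, 0.85170579, 0.68497418,
]
prn = {unit_id: u for unit_id, u in enumerate(values, start=1)}
n, rotated = 10, 2

# Sample of year 2, selected on the frame before the update
sample2 = sorted(prn, key=prn.get)[rotated : rotated + n]

# Frame update in year 3: remove deaths, add births with their new PRN
for dead in (5, 27):
    del prn[dead]
prn.update({31: 0.21095076, 32: 0.09093491, 33: 0.23833398})
ordered = sorted(prn, key=prn.get)

# Assumed rule: start two units after the first unit of sample 2
start = ordered.index(sample2[0]) + rotated
updated = {
    year: ordered[start + k * rotated : start + k * rotated + n]
    for k, year in enumerate((3, 4, 5))
}
for year, sample in updated.items():
    print(f"Sample{year}:", sample)

print("Units of sample 2 kept in sample 3:", len(set(sample2) & set(updated[3])))
```

Seven units of sample 2 remain in sample 3 rather than eight, because unit 27 belonged to the planned overlap and has left the population.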
Figure 3. Operational steps for updating samples with the PRN sampling method
Frame with the births added (units 31 to 33 with their new PRN):
ID PRN
1 0.71591221
2 0.55655816
3 0.75315888
4 0.80058705
5 0.62235617
6 0.33324844
7 0.80201168
8 0.88476671
9 0.31148278
10 0.89016459
11 0.35002149
12 0.47981483
13 0.06673275
14 0.70801775
15 0.83569431
16 0.96405383
17 0.21768859
18 0.97995322
19 0.02679974
20 0.58936127
21 0.98458031
22 0.67577825
23 0.71421294
24 0.22740058
25 0.76220788
26 0.54986539
27 0.39809190
28 0.63592297
29 0.85170579
30 0.68497418
31 0.21095076
32 0.09093491
33 0.23833398
Frame sorted in ascending order of the PRN (deaths 5 and 27 still listed):
ID PRN
19 0.02679974
13 0.06673275
32 0.09093491
31 0.21095076
17 0.21768859
24 0.22740058
33 0.23833398
9 0.31148278
6 0.33324844
11 0.35002149
27 0.39809190
12 0.47981483
26 0.54986539
2 0.55655816
20 0.58936127
5 0.62235617
28 0.63592297
22 0.67577825
30 0.68497418
14 0.70801775
23 0.71421294
1 0.71591221
3 0.75315888
25 0.76220788
4 0.80058705
7 0.80201168
15 0.83569431
29 0.85170579
8 0.88476671
10 0.89016459
16 0.96405383
18 0.97995322
21 0.98458031
Sorted frame after removing the deaths (units 5 and 27):
ID PRN
19 0.02679974
13 0.06673275
32 0.09093491
31 0.21095076
17 0.21768859
24 0.22740058
33 0.23833398
9 0.31148278
6 0.33324844
11 0.35002149
12 0.47981483
26 0.54986539
2 0.55655816
20 0.58936127
28 0.63592297
22 0.67577825
30 0.68497418
14 0.70801775
23 0.71421294
1 0.71591221
3 0.75315888
25 0.76220788
4 0.80058705
7 0.80201168
15 0.83569431
29 0.85170579
8 0.88476671
10 0.89016459
16 0.96405383
18 0.97995322
21 0.98458031
Source: Author's own elaboration, 2021.
2.2.3 Use of existing statistical software for PRN sampling
PRN sample selection can be performed with any statistical software that has functions to (i) generate random numbers from a uniform distribution, (ii) sort a database in ascending/descending order of a specific variable and (iii) select observations based on an identification variable.
Almost all statistical software packages have these basic functions. Below are applications with the software most commonly used in developing countries: R, SPSS and Stata.
R software
R is particularly suitable for PRN sampling because it has a dedicated package for this sampling approach called ‘prnsamplr’. The package also performs probability-proportional-to-size (PPS) sampling using permanent random numbers.
However, the user can also easily perform PRN sampling by following the steps described above.
Table 1. PRN sampling with R software
Generating PRN
To create a variable ‘prn’ of uniform random numbers in a database ‘frame’ (one random number per unit):
    frame$prn <- runif(nrow(frame))
Sorting
Sorting a database ‘frame’ in ascending order of the variable prn is straightforward, either with the package ‘plyr’:
    library(plyr)
    frame <- arrange(frame, prn)
or with base R:
    frame <- frame[order(frame$prn), ]
Selection
    sample1 <- frame[1:10, ]
    sample2 <- frame[3:12, ]
    sample3 <- frame[5:14, ]
    sample4 <- frame[7:16, ]
    sample5 <- frame[9:18, ]
Source: Author's own elaboration, 2021.
Samples after the frame update of section 2.2.2 (Sample3 to Sample5 are the updated samples):
Sample1 Sample2 Sample3 Sample4 Sample5
19 17 33 6 12
13 24 9 11 26
17 9 6 12 2
24 6 11 26 20
9 11 12 2 28
6 27 26 20 22
11 12 2 28 30
27 26 20 22 14
12 2 28 30 23
26 20 22 14 1
Table 2. PRN sampling with SPSS software
Generating PRN
To create a variable ‘prn’ of uniform random numbers in a database:
    COMPUTE prn=RV.UNIFORM(0,1).
    EXECUTE.
Sorting
Sorting the database in ascending order of the variable prn:
    SORT CASES BY prn(A).
Selection
The sorted database is first given a name (here ‘frame’) so that each sample is copied from the full frame and not from the previous sample:
    DATASET NAME frame.
    DATASET COPY sample1.
    DATASET ACTIVATE sample1.
    FILTER OFF.
    USE 1 thru 10 /permanent.
    EXECUTE.
    DATASET ACTIVATE frame.
    DATASET COPY sample2.
    DATASET ACTIVATE sample2.
    FILTER OFF.
    USE 3 thru 12 /permanent.
    EXECUTE.
And so on until sample 5.
Source: Author's own elaboration, 2021.
Table 3. PRN sampling with Stata software
Generating PRN
To create a variable ‘prn’ of uniform random numbers in a database:
    gen prn=runiform()
Sorting
Sorting the database in ascending order of the variable prn:
    sort prn
Selection
sample1 (using the frame database):
    keep in 1/10
sample2 (using the frame database):
    keep in 3/12
sample3 (using the frame database):
    keep in 5/14
sample4 (using the frame database):
    keep in 7/16
sample5 (using the frame database):
    keep in 9/18
Source: Author's own elaboration, 2021.
3 Repeated collocated sampling
3.1 Overview
Srinath and Carpenter (1995) point out that the PRN technique may lead to over- or underrepresentation of births in the sample because the new PRNs generated for births are not equally spaced on the interval [0, 1]. The authors suggested a new procedure called repeated collocated sampling that facilitates a better handling of births and could be used in the context of agricultural surveys. The procedure, which corresponds to an SRSWOR, consists of the following steps:
(i). sort all units of the target population (e.g. in a domain, stratum, PSU etc.) in a random order;
(ii). assign a Sample Selection Number 𝑆𝑆𝑁(𝑖) to each unit 𝑖 as follows:
$$SSN(i) = \frac{R_i - \varepsilon}{N} \qquad (3.1.1)$$
where 𝑅𝑖 is the rank of unit 𝑖 after the random sorting, 𝜀 is a random number from the uniform distribution 𝑈[0,1] and 𝑁 is the size of the population;
(iii). the sample is composed of all units with a sample selection number no greater than the desired sampling fraction 𝑓 (𝑆𝑆𝑁(𝑖) ≤ 𝑓);
(iv). if a proportion 𝑟 of the sample is expected to be rotated, then on each survey occasion 𝑡, all units whose 𝑆𝑆𝑁 lies within the interval [(𝑡 − 1)𝑟𝑓, (𝑡 − 1)𝑟𝑓 + 𝑓] constitute the sample (𝑡 = 1 corresponding to the first survey occasion).
On each survey occasion, if there are 𝑄 new births since the last sampling occasion, they are also randomly ordered and assigned new sample selection numbers 𝑆𝑆𝑁(𝑖) as below:
$$SSN(i) = \frac{R_i - \varepsilon}{Q} \qquad (3.1.2)$$
where 𝑅𝑖 is now the rank of birth 𝑖 among the 𝑄 births after the random sorting and 𝜀 is defined as above. On each survey occasion, either the same 𝜀 can be reused or a new random number can be drawn from the uniform distribution.
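Steps (i) to (iv) can be sketched in Python as below. The function names and the seed are assumptions made for the illustration. Applying `collocated_ssn` to a list of births alone gives equation 3.1.2, since the divisor is then the number of births 𝑄.

```python
import random

def collocated_ssn(units, eps, rng):
    """Steps (i)-(ii): shuffle the units and return {unit: (rank - eps) / N}."""
    shuffled = list(units)
    rng.shuffle(shuffled)                       # random sorting
    size = len(shuffled)                        # N (or Q when units are births)
    return {u: (rank - eps) / size for rank, u in enumerate(shuffled, start=1)}

def rotation_sample(ssn, f, r, t):
    """Step (iv): units whose SSN lies in [(t-1)rf, (t-1)rf + f]."""
    low = (t - 1) * r * f
    return [u for u, s in ssn.items() if low <= s <= low + f]

rng = random.Random(3)                          # hypothetical seed
eps = rng.random()                              # epsilon drawn from U[0, 1)
ssn = collocated_ssn(range(1, 31), eps, rng)    # population of 30 units

# Five yearly samples with f = 10/30 and 20 percent rotation
samples = [rotation_sample(ssn, f=10 / 30, r=0.2, t=t) for t in range(1, 6)]
for t, s in enumerate(samples, start=1):
    print(f"occasion {t}: {len(s)} units", sorted(s))
```

Because the SSN are equally spaced by 1/𝑁, shifting the selection interval by 𝑟𝑓 replaces the same number of units on every occasion.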
3.2 Application
Let’s consider the same enumeration area (EA) of 30 units as in section 2.2 and a 20 percent partial rotation design with a sample of 10 agricultural holdings.
We have:
𝑓 (sampling fraction): 10/30 ≈ 0.3333
𝑟𝑓: 0.2 × 0.3333 ≈ 0.0667
𝜀 (random number): 0.017537133
Figure 5. Operational steps for selecting samples with the repeated collocated sampling method
Source: Author's own elaboration, 2021.
Step1: generate random numbers for each unit in the EA
Step2: Sort the EA e.g. in ascending order of random numbers (random sorting)
Step3: Define the 𝑅𝑖 and calculate the SSN
Step4: identify the samples
ID Rand
1 0.87012650
2 0.28088389
3 0.71732738
4 0.70422743
5 0.36168256
6 0.74144240
7 0.17386889
8 0.13231919
9 0.45544714
10 0.88902360
11 0.05395863
12 0.37404500
13 0.31871866
14 0.40329879
15 0.24708789
16 0.91238972
17 0.64396404
18 0.39094439
19 0.64129422
20 0.22257691
21 0.61204142
22 0.52843557
23 0.45859939
24 0.75891911
25 0.41555944
26 0.08445414
27 0.95686136
28 0.71136379
29 0.95392510
30 0.08210462
ID Rand
11 0.0540
30 0.0821
26 0.0845
8 0.1323
7 0.1739
20 0.2226
15 0.2471
2 0.2809
13 0.3187
5 0.3617
12 0.3740
18 0.3909
14 0.4033
25 0.4156
9 0.4554
23 0.4586
22 0.5284
21 0.6120
19 0.6413
17 0.6440
4 0.7042
28 0.7114
3 0.7173
6 0.7414
24 0.7589
1 0.8701
10 0.8890
16 0.9124
29 0.9539
27 0.9569
ID Rand 𝑅𝑖 SSN
11 0.0540 1 0.0327
30 0.0821 2 0.0661
26 0.0845 3 0.0994
8 0.1323 4 0.1327
7 0.1739 5 0.1661
20 0.2226 6 0.1994
15 0.2471 7 0.2327
2 0.2809 8 0.2661
13 0.3187 9 0.2994
5 0.3617 10 0.3327
12 0.3740 11 0.3661
18 0.3909 12 0.3994
14 0.4033 13 0.4327
25 0.4156 14 0.4661
9 0.4554 15 0.4994
23 0.4586 16 0.5327
22 0.5284 17 0.5661
21 0.6120 18 0.5994
19 0.6413 19 0.6327
17 0.6440 20 0.6661
4 0.7042 21 0.6994
28 0.7114 22 0.7327
3 0.7173 23 0.7661
6 0.7414 24 0.7994
24 0.7589 25 0.8327
1 0.8701 26 0.8661
10 0.8890 27 0.8994
16 0.9124 28 0.9327
29 0.9539 29 0.9661
27 0.9569 30 0.9994
ID Rand 𝑅𝑖 SSN [0, f] [rf, rf+f] [2rf, 2rf+f] [3rf, 3rf+f] [4rf, 4rf+f]
11 0.0540 1 0.0327 1 0 0 0 0
30 0.0821 2 0.0661 1 0 0 0 0
26 0.0845 3 0.0994 1 1 0 0 0
8 0.1323 4 0.1327 1 1 0 0 0
7 0.1739 5 0.1661 1 1 1 0 0
20 0.2226 6 0.1994 1 1 1 0 0
15 0.2471 7 0.2327 1 1 1 1 0
2 0.2809 8 0.2661 1 1 1 1 0
13 0.3187 9 0.2994 1 1 1 1 1
5 0.3617 10 0.3327 1 1 1 1 1
12 0.3740 11 0.3661 0 1 1 1 1
18 0.3909 12 0.3994 0 1 1 1 1
14 0.4033 13 0.4327 0 0 1 1 1
25 0.4156 14 0.4661 0 0 1 1 1
9 0.4554 15 0.4994 0 0 0 1 1
23 0.4586 16 0.5327 0 0 0 1 1
22 0.5284 17 0.5661 0 0 0 0 1
21 0.6120 18 0.5994 0 0 0 0 1
19 0.6413 19 0.6327 0 0 0 0 0
17 0.6440 20 0.6661 0 0 0 0 0
4 0.7042 21 0.6994 0 0 0 0 0
28 0.7114 22 0.7327 0 0 0 0 0
3 0.7173 23 0.7661 0 0 0 0 0
6 0.7414 24 0.7994 0 0 0 0 0
24 0.7589 25 0.8327 0 0 0 0 0
1 0.8701 26 0.8661 0 0 0 0 0
10 0.8890 27 0.8994 0 0 0 0 0
16 0.9124 28 0.9327 0 0 0 0 0
29 0.9539 29 0.9661 0 0 0 0 0
27 0.9569 30 0.9994 0 0 0 0 0
sample1 sample2 sample3 sample4 sample5
11 26 7 15 13
30 8 20 2 5
26 7 15 13 12
8 20 2 5 18
7 15 13 12 14
20 2 5 18 25
15 13 12 14 9
2 5 18 25 23
13 12 14 9 22
5 18 25 23 21
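The figures above can be checked with a few lines of Python. The random ordering of the units and the value of 𝜀 are taken from Figure 5; the variable and function names are illustrative only.

```python
# Units in their random order (rank 1 first), as in Step 2 of Figure 5
order = [11, 30, 26, 8, 7, 20, 15, 2, 13, 5, 12, 18, 14, 25, 9,
         23, 22, 21, 19, 17, 4, 28, 3, 6, 24, 1, 10, 16, 29, 27]
eps = 0.017537133
N, n, r = len(order), 10, 0.2
f = n / N                                     # sampling fraction 10/30

# Equation 3.1.1: SSN(i) = (R_i - eps) / N
ssn = {u: (rank - eps) / N for rank, u in enumerate(order, start=1)}

def sample_for(t):
    """Units whose SSN lies in [(t-1)rf, (t-1)rf + f], listed in rank order."""
    low = (t - 1) * r * f
    return [u for u in order if low <= ssn[u] <= low + f]

samples = [sample_for(t) for t in range(1, 6)]
for t, s in enumerate(samples, start=1):
    print(f"sample{t}:", s)
```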
Sample update procedure
As in section 2.2.2, let’s suppose that an update of the sample is planned during year 3 and that holdings 5 and 27 have disappeared and three new holdings (labelled 31, 32 and 33) have appeared in the population.
The new population size of the EA is therefore 31 (28 surviving units plus 3 births). To update the sample, it is necessary to update the sampling fraction 𝑓 and the Sample Selection Numbers 𝑆𝑆𝑁 of the sampling units that did not disappear, as these are functions of the population size: the 28 surviving units are re-ranked from 1 to 28 and their SSN are recalculated with equation 3.1.1 (here N=28). Then new SSN are calculated for the new units 31, 32 and 33 using equation 3.1.2 (here Q=3). We use the same value of 𝜀 for both old and new SSN.
𝑓 (sampling fraction): 10/31 ≈ 0.3226
𝑟𝑓: 0.2 × 0.3226 ≈ 0.0645
𝜀 (random number): 0.017537133
Figure 6. Samples update procedure with the repeated collocated sampling method
Source: Author's own elaboration, 2021.
ID Rand 𝑅𝑖 SSN [0, f] [rf, rf+f] [2rf, 2rf+f] [3rf, 3rf+f] [4rf, 4rf+f]
11 0.0540 1 0.0351 1 0 0 0 0
30 0.0821 2 0.0708 1 1 0 0 0
26 0.0845 3 0.1065 1 1 0 0 0
8 0.1323 4 0.1422 1 1 1 0 0
7 0.1739 5 0.1779 1 1 1 0 0
20 0.2226 6 0.2137 1 1 1 1 0
15 0.2471 7 0.2494 1 1 1 1 0
2 0.2809 8 0.2851 1 1 1 1 1
13 0.3187 9 0.3208 1 1 1 1 1
12 0.3740 10 0.3565 0 1 1 1 1
18 0.3909 11 0.3922 0 0 1 1 1
14 0.4033 12 0.4279 0 0 1 1 1
25 0.4156 13 0.4637 0 0 0 1 1
9 0.4554 14 0.4994 0 0 0 1 1
23 0.4586 15 0.5351 0 0 0 0 1
22 0.5284 16 0.5708 0 0 0 0 1
21 0.6120 17 0.6065 0 0 0 0 0
19 0.6413 18 0.6422 0 0 0 0 0
17 0.6440 19 0.6779 0 0 0 0 0
4 0.7042 20 0.7137 0 0 0 0 0
28 0.7114 21 0.7494 0 0 0 0 0
3 0.7173 22 0.7851 0 0 0 0 0
6 0.7414 23 0.8208 0 0 0 0 0
24 0.7589 24 0.8565 0 0 0 0 0
1 0.8701 25 0.8922 0 0 0 0 0
10 0.8890 26 0.9279 0 0 0 0 0
16 0.9124 27 0.9637 0 0 0 0 0
29 0.9539 28 0.9994 0 0 0 0 0
31 0.5100 1 0.3275 0 1 1 1 1
33 0.7324 2 0.6608 0 0 0 0 0
32 0.8703 3 0.9942 0 0 0 0 0
Sample1 Sample2 Sample3 Sample4 Sample5
11 26 8 20 2
30 8 7 15 13
26 7 20 2 12
8 20 15 13 18
7 15 2 12 14
20 2 13 18 25
15 13 12 14 9
2 5 18 25 23
13 12 14 9 22
5 18 31 31 31
Note: Sample3 to Sample5 are the samples selected after the update.
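The updated samples can be recomputed with the Python sketch below. It assumes, as the SSN values in Figure 6 indicate, that the 28 surviving units are re-ranked with a divisor of 28, that the births are ranked in the order 31, 33, 32 with a divisor of 3, and that the sampling fraction uses the new population size of 31. Names are illustrative only.

```python
eps = 0.017537133

# Surviving units in rank order (units 5 and 27 removed) and births in rank order
survivors = [11, 30, 26, 8, 7, 20, 15, 2, 13, 12, 18, 14, 25, 9,
             23, 22, 21, 19, 17, 4, 28, 3, 6, 24, 1, 10, 16, 29]
births = [31, 33, 32]

def ssn_of(ranked_units):
    """(rank - eps) / number of units: eq. 3.1.1 for survivors, 3.1.2 for births."""
    size = len(ranked_units)
    return {u: (rank - eps) / size for rank, u in enumerate(ranked_units, start=1)}

ssn = {**ssn_of(survivors), **ssn_of(births)}

n, r = 10, 0.2
f = n / (len(survivors) + len(births))        # updated sampling fraction 10/31

def sample_for(t):
    """Units whose SSN lies in [(t-1)rf, (t-1)rf + f]; survivors first, then births."""
    low = (t - 1) * r * f
    return [u for u in survivors + births if low <= ssn[u] <= low + f]

for t in (3, 4, 5):
    print(f"Sample{t}:", sample_for(t))
```

In this example only birth 31 has an SSN that falls in the selection intervals of years 3 to 5, so it enters all three updated samples.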
4 Rotation group sampling
4.1 Overview
This method consist basically in dividing randomly the population in groups and then select randomly
rotating samples of groups considering the sample size required by the survey design. This approach has
some advantages compared to the PRN sampling: it ensures that no unit is sampled too often due to
random chance and facilitates the handling of births in the populations. The method is described by
Srinath and Carpenter (1995) as follows:
First, the population is randomly divided in 𝑃 groups. For that, a random permutation of the
numbers 1,2,… , 𝑃 is first performed (assign ordering). Then, the first unit of the population is
assigned to the first rotation group in the assigned ordering, the second population unit is
assigned to the second rotation group in the assigned ordering and so on to the 𝑃th population
unit, which is assigned to the 𝑃th rotation group in the assigned ordering. The process begins
again with the (𝑃 + 1)th population unit assigned to the first rotation group, the (𝑃 + 2)th
population unit assigned to the second rotation group, and so on.
Secondly, suppose we want to select 𝑝 rotation groups in the sample. The original numbers of
the groups (1,2,… , 𝑃) before the assigned ordering, called rotation ordering is used for the
selection. Rotation groups numbered 1,2,… , 𝑝 in the rotation ordering are included in the
sample on the first survey occasion; on the second occasion, rotation group 1 rotates out of the
sample while the (𝑝 + 1)th rotates into the sample, and so on.
Let $n$ and $N$ denote, respectively, the sample size and the population size of the domain/stratum. Dividing the population into $P$ groups means that each group will have about $N/P$ units; if $N/P$ is not an integer, the $P$ groups will not all have the same size. If $p$ groups are to be selected for the sample, then $n \approx p \times (N/P)$ and
$$P \approx p \times \frac{N}{n} \qquad (4.1.1)$$
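As a quick numerical check of formula 4.1.1, the following minimal Python sketch computes the number of rotation groups and the group sizes that result from the cyclic assignment described above. The function name `rotation_group_plan` and the rounding of $P$ to the nearest integer are illustrative assumptions, not part of the method as published.

```python
def rotation_group_plan(N, n, p):
    """Return (P, sizes): number of rotation groups and their sizes.

    N: population size, n: target sample size, p: rotation groups per sample.
    P follows formula 4.1.1, P ~ p * N / n (rounding is an assumption here).
    """
    if not (0 < n <= N and p > 0):
        raise ValueError("need 0 < n <= N and p > 0")
    P = max(p, round(p * N / n))
    # With cyclic assignment, the first (N mod P) groups of the assign
    # ordering receive one unit more than the others.
    base, extra = divmod(N, P)
    sizes = [base + 1 if g < extra else base for g in range(P)]
    return P, sizes


P, sizes = rotation_group_plan(N=30, n=10, p=5)
print(P, sizes)  # 15 groups of 2 units each
```

When $N$ is not a multiple of $P$, the realised sample size varies slightly from one occasion to the next, depending on which groups are in the sample.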
4.2 Application
Let’s consider the same population of 30 holdings in an enumeration area and a sample of 10 units. Using formula 4.1.1 with 𝑁 = 30 and 𝑛 = 10, we have 𝑃 = 3𝑝. We can then opt to divide the population into 15 rotation groups and select 5 rotation groups in each sample.
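The two steps of the method can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function names are hypothetical, and wrapping around to group 1 after group 𝑃 is an assumption. To make the result comparable with Figure 7, the example passes the assign ordering shown in Step 1 of that figure instead of drawing a new permutation.

```python
import random


def assign_rotation_groups(unit_ids, P, assign_ordering=None, seed=None):
    """Map each rotation group (1..P) to the list of units assigned to it."""
    if assign_ordering is None:
        rng = random.Random(seed)
        assign_ordering = list(range(1, P + 1))
        rng.shuffle(assign_ordering)  # random permutation of 1..P
    groups = {g: [] for g in range(1, P + 1)}
    for k, unit in enumerate(unit_ids):
        # k-th unit goes to the k-th group of the assign ordering, cyclically
        groups[assign_ordering[k % P]].append(unit)
    return groups


def rotating_sample(groups, p, occasion):
    """Units of rotation groups occasion .. occasion + p - 1.

    Wrapping around after the last group is an assumption of this sketch.
    """
    P = len(groups)
    selected = [((occasion - 1 + j) % P) + 1 for j in range(p)]
    return [u for g in selected for u in groups[g]]


# Assign ordering taken from Step 1 of Figure 7
figure7_ordering = [4, 5, 12, 2, 9, 15, 11, 13, 14, 3, 10, 7, 1, 8, 6]
groups = assign_rotation_groups(range(1, 31), P=15,
                                assign_ordering=figure7_ordering)
sample1 = rotating_sample(groups, p=5, occasion=1)
sample2 = rotating_sample(groups, p=5, occasion=2)
print(sample1)
print(sample2)
```

With 𝑝 = 5 groups of two units each, consecutive samples share four groups, that is eight of the ten units.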
Figure 7. Procedures for selecting samples with the rotation group sampling method
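Step 1: Assign ordering of the rotation groups
(Left pair of columns: rotation groups in their original order with the random numbers drawn for them. Right pair: the same groups sorted by random number, which gives the assign ordering.)
Rotation ordering Random number Assign ordering Random number (sorted)
1 0.839269467 4 0.037332926
2 0.291737035 5 0.064879519
3 0.751828209 12 0.184000017
4 0.037332926 2 0.291737035
5 0.064879519 9 0.561580771
6 0.886087912 15 0.570630813
7 0.805773866 11 0.583437453
8 0.884244074 13 0.619856794
9 0.561580771 14 0.69216479
10 0.803065413 3 0.751828209
11 0.583437453 10 0.803065413
12 0.184000017 7 0.805773866
13 0.619856794 1 0.839269467
14 0.69216479 8 0.884244074
15 0.570630813 6 0.886087912
Step 2: Assign population units to the rotation groups
(Left pair of columns: units in population order with the rotation group assigned to each. Right pair: the same units sorted by rotation group.)
ID Assign ordering ID Rotation ordering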
1 4 13 1
2 5 28 1
3 12 4 2
4 2 19 2
5 9 10 3
6 15 25 3
7 11 1 4
8 13 16 4
9 14 2 5
10 3 17 5
11 10 15 6
12 7 30 6
13 1 12 7
14 8 27 7
15 6 14 8
16 4 29 8
17 5 5 9
18 12 20 9
19 2 11 10
20 9 26 10
21 15 7 11
22 11 22 11
23 13 3 12
24 14 18 12
25 3 8 13
26 10 23 13
27 7 9 14
28 1 24 14
29 8 6 15
30 6 21 15
Step 3: Sample selection
Source: Author's own elaboration, 2021.
Sample update procedure
In rotation group sampling, a sample update consists in updating the population by removing the deaths (units that have disappeared), including the births (new units) and assigning them randomly to rotation groups, and then selecting a new sample.
To illustrate, let’s keep the same assumption of a sample update in year 3, with two deaths (units 5 and 27) and three births (labelled 31-33) in that year.
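The update step can be sketched as below. This is a hedged illustration with hypothetical names: it assumes that births are assigned by simply continuing the cyclic assignment where the original population stopped, which is what the assign ordering column of Figure 8 appears to show for units 31 to 33. Other random assignments of births to groups would also be compatible with the text.

```python
def update_rotation_groups(groups, assign_ordering, n_assigned, deaths, births):
    """Return the rotation groups after removing deaths and adding births.

    groups: dict {rotation group: [unit ids]}
    n_assigned: number of units already assigned (position in the cycle)
    """
    dead = set(deaths)
    updated = {g: [u for u in units if u not in dead]
               for g, units in groups.items()}
    P = len(assign_ordering)
    for k, unit in enumerate(births, start=n_assigned):
        # Assumption: births continue the cyclic assignment
        updated[assign_ordering[k % P]].append(unit)
    return updated


# Initial assignment of the 30 units, using the assign ordering of Figure 7
ordering = [4, 5, 12, 2, 9, 15, 11, 13, 14, 3, 10, 7, 1, 8, 6]
groups = {g: [] for g in range(1, 16)}
for k, unit in enumerate(range(1, 31)):
    groups[ordering[k % 15]].append(unit)

updated = update_rotation_groups(groups, ordering, n_assigned=30,
                                 deaths=[5, 27], births=[31, 32, 33])
print(updated[4], updated[5], updated[12])  # groups that received a birth
print(updated[7], updated[9])               # groups that lost a unit
```

After the update the groups no longer have equal sizes, so a sample of five groups may contain more or fewer than ten units.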
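Step 3 table of Figure 7 (rotation groups 1 to 5 form Sample1, groups 2 to 6 form Sample2, and so on):
ID Rotation ordering Sample1 Sample2 Sample3 Sample4 Sample5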
13 1 13 4 10 1 2
28 1 28 19 25 16 17
4 2 4 10 1 2 15
19 2 19 25 16 17 30
10 3 10 1 2 15 12
25 3 25 16 17 30 27
1 4 1 2 15 12 14
16 4 16 17 30 27 29
2 5 2 15 12 14 5
17 5 17 30 27 29 20
15 6
30 6
12 7
27 7
14 8
29 8
5 9
20 9
11 10
26 10
7 11
22 11
3 12
18 12
8 13
23 13
9 14
24 14
6 15
21 15
Figure 8. Sample update procedure with the rotation group sampling method
Source: Author's own elaboration, 2021.
As can be seen from Figure 8, the updating procedure may affect the target number of annual overlapping units. This is a drawback of the sample update, although it improves the representativeness of the sample.
ID Assign ordering
1 4
2 5
3 12
4 2
5 9
6 15
7 11
8 13
9 14
10 3
11 10
12 7
13 1
14 8
15 6
16 4
17 5
18 12
19 2
20 9
21 15
22 11
23 13
24 14
25 3
26 10
27 7
28 1
29 8
30 6
31 4
32 5
33 12
ID Rotation ordering
13 1
28 1
4 2
19 2
10 3
25 3
1 4
16 4
31 4
2 5
17 5
32 5
15 6
30 6
12 7
27 7
14 8
29 8
5 9
20 9
11 10
26 10
7 11
22 11
3 12
18 12
33 12
8 13
23 13
9 14
24 14
6 15
21 15
ID Rotation ordering
13 1
28 1
4 2
19 2
10 3
25 3
1 4
16 4
31 4
2 5
17 5
32 5
15 6
30 6
12 7
27 7
14 8
29 8
5 9
20 9
11 10
26 10
7 11
22 11
3 12
18 12
33 12
8 13
23 13
9 14
24 14
6 15
21 15
Sample1 Sample2 Sample3 Sample4 Sample5
13 4 10 1 31
28 19 25 16 2
4 10 1 31 17
19 25 16 2 32
10 1 31 17 15
25 16 2 32 30
1 2 17 15 12
16 17 32 30
2 15 15 12 14
17 30 30 29
Update
5 Rotation design and PPS sampling
Sampling with probability proportional to size (PPS) can significantly improve estimates of totals when sampling units differ in size and the sampling frame includes a measure of size that is positively correlated with the survey's variables of interest.
In the FAO sampling methodology for agricultural surveys, PPS sampling is recommended for the selection of primary sampling units (PSUs) and simple random sampling is suggested for selecting the secondary sampling units (SSUs, that is agricultural holdings) in the framework of two-stage sampling. As mentioned in section 1, with two-stage sampling, rotation procedures should be performed for the SSUs in each sampled PSU, and the methods presented in the previous sections can be used for that purpose.
However, FAO recommends a single-stage sampling design for selecting samples of non-household agricultural holdings and special holdings (large farms, commercial holdings, etc.). Such agricultural holdings usually have very heterogeneous sizes, and PPS sampling could be used for selecting samples with a rotation design.
5.1 Pareto PPS sampling
An efficient approach used by Statistics Sweden (Lindblom and Teterukovsky, 2007) for selecting overlapping samples with PPS sampling is the Pareto PPS sampling approach. We recommend this method here because it is easy to implement and has proven efficient. Rosén (1997) showed that Pareto PPS sampling yields lower sampling errors than other PPS sampling methods, including systematic PPS and Sunter PPS (Sunter, 1977).
Let’s consider a stratified sampling design without replacement with PPS sample selection and, for a given stratum, let:
$n$: sample size in the stratum
$x_i$: measure of size of unit $i$
$p_i$: probability of selection of unit $i$
$$p_i = \frac{n\,x_i}{\sum_i x_i} \qquad (5.1.1)$$
The Pareto PPS sampling approach simply consists of the following steps (Rosén, 1997):
(i). Generate independent standard uniform random variables $U_i$ for all sampling units $i$.
(ii). Compute the ranking variables $Q_i$ using the formula:
$$Q_i = \frac{U_i\,(1 - p_i)}{p_i\,(1 - U_i)} \qquad (5.1.2)$$
The sample consists of the units with the $n$ smallest values of the ranking variables $Q_i$. It is easy to notice that, for a given $p_i$, the ranking variable $Q_i$ is an increasing function of the PRN $U_i$.
The stratified Pareto PPS scheme ensures that the actual inclusion probabilities satisfy $\pi_i \approx p_i$ to a good approximation for all units (Rosén, 1997; Lindblom and Teterukovsky, 2007).
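Steps (i) and (ii) and the final selection can be sketched in Python as follows. The function name, the dictionary-based interface and the toy data are hypothetical; rejecting units with $p_i \geq 1$ is an assumption added here because formula 5.1.2 is not usable for them.

```python
import random


def pareto_pps_sample(sizes, n, prns=None, seed=None):
    """Select n units by Pareto PPS; return (sample ids, {id: p_i}).

    sizes: dict {unit id: measure of size x_i}
    prns:  dict {unit id: U_i in [0, 1)}; drawn here if not supplied
    """
    total = sum(sizes.values())
    p = {i: n * x / total for i, x in sizes.items()}  # formula 5.1.1
    if any(pi >= 1 for pi in p.values()):
        # Assumption of this sketch: such units must be handled separately
        raise ValueError("some p_i >= 1")
    if prns is None:
        rng = random.Random(seed)
        prns = {i: rng.random() for i in sizes}
    # Ranking variables, formula 5.1.2
    q = {i: prns[i] * (1 - p[i]) / (p[i] * (1 - prns[i])) for i in sizes}
    sample = sorted(sizes, key=q.get)[:n]  # the n smallest Q_i
    return sample, p


# Toy data (hypothetical): four units, two to be selected
sizes = {"A": 40, "B": 30, "C": 20, "D": 10}
prns = {"A": 0.90, "B": 0.20, "C": 0.30, "D": 0.05}
sample, p = pareto_pps_sample(sizes, n=2, prns=prns)
print(sample, p)
```

Keeping the $U_i$ as permanent random numbers, rather than redrawing them on each occasion, is what makes coordination between successive samples possible.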
In order to select overlapping samples with Pareto PPS, Statistics Sweden performs a temporary transformation of the PRNs $U_i$ into new random variables $Z_i$ as a function of a given starting point $S$ (Lindblom and Teterukovsky, 2007):
$$Z_i = \begin{cases} U_i - S & \text{if } U_i \geq S \\ 1 + U_i - S & \text{if } U_i < S \end{cases} \qquad (5.1.3)$$
Then new ranking variables $Q_i^*$ are computed using the $Z_i$ instead of the $U_i$:
$$Q_i^* = \frac{Z_i\,(1 - p_i)}{p_i\,(1 - Z_i)} \qquad (5.1.4)$$
The importance of this transformation is that it allows the selection of samples with the desired number of overlapping units through suitable choices of the starting points $S$, as in PRN sampling.
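As a worked illustration of (5.1.3) and (5.1.4), take unit 5 of the application in section 5.2, for which $U_5 = 0.14281$ and $p_5 = 0.48713$, and two of the starting points used there. The rounded results below are recomputed from these published values.

```latex
\begin{align*}
S = 0.14:\quad & U_5 \geq S \;\Rightarrow\; Z_5 = 0.14281 - 0.14 = 0.00281,
  & Q_5^* &= \frac{0.00281 \times 0.51287}{0.48713 \times 0.99719} \approx 0.003 \\
S = 0.16:\quad & U_5 < S \;\Rightarrow\; Z_5 = 1 + 0.14281 - 0.16 = 0.98281,
  & Q_5^* &= \frac{0.98281 \times 0.51287}{0.48713 \times 0.01719} \approx 60.2
\end{align*}
```

Moving the starting point just past $U_5$ therefore sends the unit from the front of the ranking to the back, which is how units are rotated out of the sample.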
5.2 Application
The selection of coordinated samples with Pareto PPS, following the Statistics Sweden approach described above, can be performed with the R package ‘prnsamplr’. We propose below a simple illustration of the approach using Excel. Let’s suppose that overlapping samples of 10 units, with two units (20 percent of the sample) rotated from one occasion to the next, should be selected from a population of 30 units with measures of size Xi and PRNs Ui. The probability of selection Pi is calculated using formula 5.1.1.
Table 4. Sample selection with the Pareto PPS sampling method
ID Xi Pi Ui
1 88 0.17285 0.64985
2 32 0.06286 0.88124
3 286 0.56178 0.75384
4 228 0.44785 0.78103
5 248 0.48713 0.14281
6 65 0.12768 0.16858
7 59 0.11589 0.21072
8 141 0.27696 0.80233
9 272 0.53428 0.14279
10 220 0.43214 0.43041
11 141 0.27696 0.39755
12 103 0.20232 0.86876
13 268 0.52642 0.16116
14 240 0.47142 0.08902
15 27 0.05303 0.57740
16 303 0.59517 0.82068
17 362 0.71106 0.39322
18 72 0.14143 0.57949
19 93 0.18268 0.21135
20 39 0.07661 0.66833
21 307 0.60302 0.60101
22 265 0.52053 0.97080
23 120 0.23571 0.30023
24 233 0.45767 0.25735
25 122 0.23964 0.12511
26 179 0.35160 0.84064
27 74 0.14535 0.47639
28 317 0.62267 0.35911
29 53 0.10411 0.49048
30 134 0.26321 0.35317
Total (Xi) 5091
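Setting the starting points in line with our overlap objective (two units rotated from one year to the next). The units are sorted by Ui, and each successive starting point S1 … S5 is moved past two more units: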
ID Xi Pi Ui S
14 240 0.47142 0.08902 0.08 S1
25 122 0.23964 0.12511
9 272 0.53428 0.14279 0.14 S2
5 248 0.48713 0.14281
13 268 0.52642 0.16116 0.16 S3
6 65 0.12768 0.16858
7 59 0.11589 0.21072 0.21 S4
19 93 0.18268 0.21135
24 233 0.45767 0.25735 0.25 S5
23 120 0.23571 0.30023
30 134 0.26321 0.35317
28 317 0.62267 0.35911
17 362 0.71106 0.39322
11 141 0.27696 0.39755
10 220 0.43214 0.43041
27 74 0.14535 0.47639
29 53 0.10411 0.49048
15 27 0.05303 0.57740
18 72 0.14143 0.57949
21 307 0.60302 0.60101
1 88 0.17285 0.64985
20 39 0.07661 0.66833
3 286 0.56178 0.75384
4 228 0.44785 0.78103
8 141 0.27696 0.80233
16 303 0.59517 0.82068
26 179 0.35160 0.84064
12 103 0.20232 0.86876
2 32 0.06286 0.88124
22 265 0.52053 0.97080
Total (Xi) 5091
Calculating Z1 … Z5 and Q1* … Q5* using the starting points S1 … S5 and formulas 5.1.3 and 5.1.4
ID Z1 Z2 Z3 Z4 Z5 Q1* Q2* Q3* Q4* Q5*
1 0.570 0.510 0.490 0.440 0.400 6.339347 4.977569 4.594826 3.757545 3.188168
2 0.801 0.741 0.721 0.671 0.631 60.10442 42.71026 38.57619 30.44163 25.52227
3 0.674 0.614 0.594 0.544 0.504 1.6116 1.239993 1.140522 0.930005 0.792141
4 0.701 0.641 0.621 0.571 0.531 2.890954 2.201668 2.020409 1.641207 1.396064
5 0.063 0.003 0.983 0.933 0.893 0.070559 0.002966 60.19023 14.6163 8.769118
6 0.089 0.029 0.009 0.959 0.919 0.66404 0.201026 0.059141 158.1269 77.08396
7 0.131 0.071 0.051 0.001 0.961 1.147223 0.580587 0.407626 0.005514 186.5983
8 0.722 0.662 0.642 0.592 0.552 6.791425 5.120772 4.688447 3.793219 3.22102
9 0.063 0.003 0.983 0.933 0.893 0.058401 0.002439 49.7796 12.09804 7.259024
10 0.350 0.290 0.270 0.220 0.180 0.708862 0.537809 0.487044 0.371526 0.289259
11 0.318 0.258 0.238 0.188 0.148 1.214747 0.905605 0.813369 0.602648 0.45187
12 0.789 0.729 0.709 0.659 0.619 14.72185 10.59314 9.594941 7.611347 6.399088
13 0.081 0.021 0.001 0.951 0.911 0.079459 0.019445 0.001042 17.51903 9.226377
14 0.009 0.949 0.929 0.879 0.839 0.010211 20.87474 14.67652 8.147179 5.844109
15 0.497 0.437 0.417 0.367 0.327 17.67081 13.88201 12.79249 10.37011 8.691506
16 0.741 0.681 0.661 0.611 0.571 1.942842 1.449969 1.324413 1.06696 0.904175
17 0.313 0.253 0.233 0.183 0.143 0.185326 0.137788 0.123595 0.091154 0.067927
18 0.499 0.439 0.419 0.369 0.329 6.05845 4.760068 4.386917 3.557608 2.983214
19 0.131 0.071 0.051 0.001 0.961 0.676542 0.343754 0.242179 0.006042 111.2835
20 0.588 0.528 0.508 0.458 0.418 17.22618 13.50158 12.46205 10.19908 8.668813
21 0.521 0.461 0.441 0.391 0.351 0.716058 0.563064 0.519365 0.422675 0.356049
22 0.891 0.831 0.811 0.761 0.721 7.514301 4.522976 3.947483 2.929787 2.378076
23 0.220 0.160 0.140 0.090 0.050 0.91576 0.618662 0.528843 0.321575 0.171473
24 0.177 0.117 0.097 0.047 0.007 0.25546 0.157543 0.127797 0.058896 0.008772
25 0.045 0.985 0.965 0.915 0.875 0.149884 209.88 87.76136 34.20303 22.23245
26 0.761 0.701 0.681 0.631 0.591 5.860179 4.316042 3.930264 3.148596 2.660744
27 0.396 0.336 0.316 0.266 0.226 3.861206 2.980484 2.721266 2.135056 1.720646
28 0.279 0.219 0.199 0.149 0.109 0.234627 0.170038 0.150658 0.106196 0.074219
29 0.410 0.350 0.330 0.280 0.240 5.992201 4.6437 4.24791 3.3547 2.724805
30 0.273 0.213 0.193 0.143 0.103 1.052074 0.758389 0.670201 0.467742 0.322028
Selecting the samples
Source: Author's own elaboration, 2021.
Sample 1 Sample 2 Sample 3 Sample 4 Sample 5
ID Q1* ID Q2* ID Q3* ID Q4* ID Q5*
14 0.010211 9 0.002439 13 0.001042 7 0.005514 24 0.008772
9 0.058401 5 0.002966 6 0.059141 19 0.006042 17 0.067927
5 0.070559 13 0.019445 17 0.123595 24 0.058896 28 0.074219
13 0.079459 17 0.137788 24 0.127797 17 0.091154 23 0.171473
25 0.149884 24 0.157543 28 0.150658 28 0.106196 10 0.289259
17 0.185326 28 0.170038 19 0.242179 23 0.321575 30 0.322028
28 0.234627 6 0.201026 7 0.407626 10 0.371526 21 0.356049
24 0.25546 19 0.343754 10 0.487044 21 0.422675 11 0.45187
6 0.66404 10 0.537809 21 0.519365 30 0.467742 3 0.792141
19 0.676542 21 0.563064 23 0.528843 11 0.602648 16 0.904175
10 0.708862 7 0.580587 30 0.670201 3 0.930005 4 1.396064
21 0.716058 23 0.618662 11 0.813369 16 1.06696 27 1.720646
23 0.91576 30 0.758389 3 1.140522 4 1.641207 22 2.378076
30 1.052074 11 0.905605 16 1.324413 27 2.135056 26 2.660744
7 1.147223 3 1.239993 4 2.020409 22 2.929787 29 2.724805
11 1.214747 16 1.449969 27 2.721266 26 3.148596 18 2.983214
3 1.6116 4 2.201668 26 3.930264 29 3.3547 1 3.188168
16 1.942842 27 2.980484 22 3.947483 18 3.557608 8 3.22102
4 2.890954 26 4.316042 29 4.24791 1 3.757545 14 5.844109
27 3.861206 22 4.522976 18 4.386917 8 3.793219 12 6.399088
26 5.860179 29 4.6437 1 4.594826 12 7.611347 9 7.259024
29 5.992201 18 4.760068 8 4.688447 14 8.147179 20 8.668813
18 6.05845 1 4.977569 12 9.594941 20 10.19908 15 8.691506
1 6.339347 8 5.120772 20 12.46205 15 10.37011 5 8.769118
8 6.791425 12 10.59314 15 12.79249 9 12.09804 13 9.226377
22 7.514301 20 13.50158 14 14.67652 5 14.6163 25 22.23245
12 14.72185 15 13.88201 2 38.57619 13 17.51903 2 25.52227
20 17.22618 14 20.87474 9 49.7796 2 30.44163 6 77.08396
15 17.67081 2 42.71026 5 60.19023 25 34.20303 19 111.2835
2 60.10442 25 209.88 25 87.76136 6 158.1269 7 186.5983
Sample 1 Sample 2 Sample 3 Sample 4 Sample 5
14 9 13 7 24
9 5 6 19 17
5 13 17 24 28
13 17 24 17 23
25 24 28 28 10
17 28 19 23 30
28 6 7 10 21
24 19 10 21 11
6 10 21 30 3
19 21 23 11 16
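The five samples above can be re-derived programmatically. The sketch below is a Python transcription of formulas 5.1.1, 5.1.3 and 5.1.4 applied to the Xi and Ui of Table 4 and the starting points 0.08, 0.14, 0.16, 0.21 and 0.25. It is neither the author's Excel workbook nor the ‘prnsamplr’ interface, and the function name is hypothetical. Because the published Ui are rounded to five decimals, the recomputed Q* values can differ from the tabulated ones in the last digits.

```python
# Measures of size and PRNs for units 1..30, copied from Table 4
X = [88, 32, 286, 228, 248, 65, 59, 141, 272, 220,
     141, 103, 268, 240, 27, 303, 362, 72, 93, 39,
     307, 265, 120, 233, 122, 179, 74, 317, 53, 134]
U = [0.64985, 0.88124, 0.75384, 0.78103, 0.14281, 0.16858, 0.21072, 0.80233,
     0.14279, 0.43041, 0.39755, 0.86876, 0.16116, 0.08902, 0.57740, 0.82068,
     0.39322, 0.57949, 0.21135, 0.66833, 0.60101, 0.97080, 0.30023, 0.25735,
     0.12511, 0.84064, 0.47639, 0.35911, 0.49048, 0.35317]
STARTS = [0.08, 0.14, 0.16, 0.21, 0.25]  # S1 .. S5
n = 10


def coordinated_pareto_sample(sizes, prns, n, start):
    """IDs (1-based) of the n units with the smallest Q* for starting point S."""
    total = sum(sizes)
    ranked = []
    for unit_id, (x, u) in enumerate(zip(sizes, prns), start=1):
        p = n * x / total                                 # formula 5.1.1
        z = u - start if u >= start else 1 + u - start    # formula 5.1.3
        q_star = z * (1 - p) / (p * (1 - z))              # formula 5.1.4
        ranked.append((q_star, unit_id))
    ranked.sort()
    return [unit_id for _, unit_id in ranked[:n]]


samples = [coordinated_pareto_sample(X, U, n, s) for s in STARTS]
for k, s in enumerate(samples, start=1):
    print(f"Sample {k}: {s}")

# Number of units shared by consecutive samples
overlaps = [len(set(a) & set(b)) for a, b in zip(samples, samples[1:])]
print(overlaps)
```

With these starting points, each pair of consecutive samples shares eight units, which corresponds to the target of two rotated units per occasion.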
References
Antonaci, G. d. A. & Silva, D. B. d. N. 2007. Analysis of alternative rotation patterns for the Brazilian
system of integrated household surveys. In Proceedings of the 56th Session of the International
Statistical Institute (ISI).
Chromy, J. R. 1979. Sequential sample selection methods. Proceedings of the Section on Survey Research Methods, American Statistical Association, 401-406.
Davies, C. 2009. Area Frame Design for Agricultural Surveys. RDD Research Report, Research and
Development Division, USDA-NASS, Fairfax, VA.
Ernst, L. R., Valliant, R. & Casady, R. J. 2000. Permanent and collocated random number sampling
and the coverage of births and deaths. Journal of Official Statistics 16.3: 211-228
Fan, C. T., Muller, M. E. & Rezucha, I. 1962. Development of Sampling Plans by Using Sequential
(Item by Item) Selection Techniques and Digital Computers. Journal of the American Statistical
Association, 57, 387-402.
FAO. 2015. World Census of Agriculture 2020. Volume 1: Programme, concepts and definitions. Rome.
FAO. 2017. Handbook on the Agricultural Integrated Survey (AGRIS). GSARS. Rome
Graham, J. E. 1963. Rotation designs for sampling on successive occasions. Retrospective Theses and
Dissertations. Paper 2384.
Gurney, M. & Daley, J.F. 1965. A Multivariate Approach to Estimation in Periodic Sample Surveys.
Proceedings of the Social Statistics Section, American Statistical Association, 242-257.
Koop, J.C. 1988. The Technique of Replicated or Interpenetrating Samples, in Handbook to Statistics:
Sampling, Vol. 6, New York: Elsevier Science Publishers B. V., 333-368
Lindblom, A. & Teterukovsky, A. 2007. Coordination of Stratified Pareto pps Samples and Stratified
Simple Random Samples at Statistics Sweden. Paper presented at the ICES-III, June 18-21, 2007,
Montreal, Quebec, Canada
Ohlsson, E. 1992. SAMU, The system for Co-ordination of Samples from the Business Register at
Statistics Sweden-A methodological description, R&D Report 1992: 18, Stockholm: Statistics Sweden
Ohlsson, E. 1995. Coordination of Samples Using Permanent Random Numbers. In Business Survey
Methods, edited by Brenda Cox et al., pp 153-169. Wiley, New York.
Rao, J.N.K. & Graham, J.E. 1964. Rotation designs for sampling on repeated occasions. Journal of the
American Statistical Association, 59, 492-509.
Srinath, K.P. & Carpenter, R.M. 1995. Sampling methods for repeated business surveys. In Business
Survey Methods, edited by Brenda Cox et al., pp 171-183. Wiley, New York.
Sunter, A.B. 1977. List sequential sampling with equal or unequal probabilities without replacement.
Appl. Statist. 26, 261-268.
Annex: Overview of composite estimators
This note does not aim to cover estimation procedures for partially overlapping repeated surveys. There are many such methods, which could be covered in a separate document. However, we propose below an overview of a popular class of estimators: the composite estimators.
When rotating samples are selected for different survey occasions, each sample is valid for reliable cross-sectional estimates on the corresponding survey occasion. However, alternative, more efficient estimators are proposed in the literature for both cross-sectional and longitudinal estimation. Among them, the composite estimators are certainly the most popular. Composite estimation combines estimates from the current survey occasion with those from previous occasions in an efficient manner to produce estimates that are in general more accurate for most characteristics, and in particular for estimates of change (Rao and Graham, 1964; Steel and McLaren, 2008).
Simple Composite Estimator
Gurney and Daley (1965) discuss a number of composite estimators in the specific framework of the US Current Population Survey. A very basic form of composite estimator of a population mean ($\bar{y}_t^K$), called the Simple Composite Estimator, is:
$$\bar{y}_t^K = (1 - K)\,\bar{y}_t + K\left(\bar{y}_{t-1}^K + \bar{y}_t^M - \bar{y}_{t-1}^M\right) \qquad (1)$$
The simple composite estimator of the average change is:
$$d_t^K = \bar{y}_t^K - \bar{y}_{t-1}^K \qquad (2)$$
Where:
$\bar{y}_t^K$ and $\bar{y}_{t-1}^K$ are respectively the composite estimators for the current period $t$ and the previous period $t-1$
$\bar{y}_t$ is the simple unbiased estimate of the mean for the current period $t$
$\bar{y}_t^M$ and $\bar{y}_{t-1}^M$ are the estimates respectively for the current period $t$ and the previous period $t-1$ from the units of the sample at period $t$ that were also in the sample at period $t-1$
$K$ is a constant weight factor between 0 and 1
Compared to the estimators $\bar{y}_t$ and $(\bar{y}_t - \bar{y}_{t-1})$ of, respectively, the average and the change for the current period $t$, Rao and Graham (1964) show that the simple composite estimators $\bar{y}_t^K$ and $d_t^K$ have lower variances and discuss optimum values for $K$.
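The recursion in equations (1) and (2) can be sketched in Python as below. The function name and the input series are hypothetical, and starting the recursion from the direct estimate of the first period is an assumption of this sketch, since the first occasion has no previous composite estimate.

```python
def simple_composite(y, y_m_cur, y_m_prev, K):
    """Simple composite estimates and changes for periods 0 .. T-1.

    y[t]        : direct estimate of the mean at period t
    y_m_cur[t]  : period-t estimate from units also sampled at t-1
    y_m_prev[t] : period-(t-1) estimate from those same matched units
    Entries at t = 0 of the two matched series are ignored.
    """
    if not 0 <= K <= 1:
        raise ValueError("K must lie between 0 and 1")
    comp = [y[0]]  # assumption: initialise with the direct estimate
    for t in range(1, len(y)):
        comp.append((1 - K) * y[t]
                    + K * (comp[t - 1] + y_m_cur[t] - y_m_prev[t]))  # eq. (1)
    change = [None] + [comp[t] - comp[t - 1]
                       for t in range(1, len(comp))]                 # eq. (2)
    return comp, change


# Hypothetical estimates for three survey occasions
y = [10.0, 12.0, 11.0]
y_m_cur = [None, 12.5, 10.5]
y_m_prev = [None, 10.5, 12.0]
comp, change = simple_composite(y, y_m_cur, y_m_prev, K=0.5)
print(comp, change)
```

Setting K = 0 returns the direct estimates unchanged, while larger values of K give more weight to the change measured on the matched units.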
AK composite estimator
The simple composite estimator can be improved with an additional term that further reduces the variance and mitigates the impact of time-in-survey effects (Steel and McLaren, 2008). This new composite estimator, called the AK composite estimator, is particularly efficient when, in equation (1), $1 - K$ is relatively higher than $K$ (Gurney and Daley, 1965).
$$\bar{y}_t^{AK} = (1 - K)\,\bar{y}_t + K\left(\bar{y}_{t-1}^{AK} + \bar{y}_t^M - \bar{y}_{t-1}^M\right) + A\left(\bar{y}_t^N - \bar{y}_t^M\right) \qquad (3)$$
Where the new term $\bar{y}_t^N$ is the estimate for the current period $t$ from the new units being rotated into the sample in period $t$.
Optimal values of parameters 𝐾 and 𝐴 depend on the variable 𝑌 and time period.
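A one-period update following equation (3) might be written as below. The function name and all numerical inputs are hypothetical; the values of 𝐾 and 𝐴 are chosen only to keep the arithmetic simple and are not recommended settings.

```python
def ak_composite_step(prev_ak, y_t, y_m_t, y_m_prev, y_n_t, K, A):
    """One-period update of the AK composite estimator, equation (3).

    prev_ak  : AK composite estimate for period t-1
    y_t      : direct estimate for period t
    y_m_t    : period-t estimate from units also sampled at t-1
    y_m_prev : period-(t-1) estimate from those same matched units
    y_n_t    : period-t estimate from units newly rotated in
    """
    return ((1 - K) * y_t
            + K * (prev_ak + y_m_t - y_m_prev)
            + A * (y_n_t - y_m_t))


est_ak = ak_composite_step(prev_ak=10.0, y_t=12.0, y_m_t=12.5,
                           y_m_prev=10.5, y_n_t=11.5, K=0.5, A=0.25)
est_simple = ak_composite_step(prev_ak=10.0, y_t=12.0, y_m_t=12.5,
                               y_m_prev=10.5, y_n_t=11.5, K=0.5, A=0.0)
print(est_ak, est_simple)
```

With 𝐴 = 0 the update reduces to the simple composite estimator of equation (1); the extra term shifts the estimate according to the difference between newly rotated-in units and matched units.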
Contact:
Statistics Division – Economic and Social Development
http://www.fao.org/food-agriculture-statistics/resources/publications/working-papers/en/
Food and Agriculture Organization of the United Nations
Rome, Italy