The Role of the Tukey algorithm in validation procedures for prices data in a consumer price
index: the UK experience on this and more general aspects of data editing
David Fenwick
Editing of CPI prices data: Central Premise
1. Editing is a non-trivial issue. It can have a systematic numerical impact on measured inflation which can lead to bias
2. Automated editing can improve the quality of the index and increase operational efficiency
3. Both of the above statements apply to the Tukey algorithm
Auditing and editing need to be in “real time” because shop prices can change quickly
Some background on ONS editing procedures – salient points
• Most prices are collected by handheld computer with the facility for interactive editing in real time
• Two distinct algorithms operate at HQ to identify outliers amongst price quotes, often operating in parallel
– Scrutiny, two tests by reference to “average” price of same/similar items (2k from 100k)
• The minimum-maximum test
• The percentage change test
– Tukey, in essence a more sophisticated version of scrutiny. This is applied to price quotes not identified by scrutiny as outliers (4k from 100k)
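The two “scrutiny” tests named above can be pictured with a minimal sketch. The slides do not give the ONS rules or thresholds, so everything here is an assumption made for illustration: the function names, the band of 0.5 to 2 times the average price, the comparison with the previous price, and the 50 per cent limit.

```python
def min_max_test(price, average_price, low=0.5, high=2.0):
    """Flag a quote outside a band around the item's average price.

    The band multipliers are hypothetical, not ONS parameters.
    """
    return not (low * average_price <= price <= high * average_price)


def percentage_change_test(price, previous_price, max_change_pct=50.0):
    """Flag a quote whose change on the previous price is too large.

    Comparing with the previous price and the 50% limit are assumptions.
    """
    change_pct = abs(price - previous_price) / previous_price * 100.0
    return change_pct > max_change_pct


def scrutiny_flag(price, previous_price, average_price):
    """A quote is a "scrutiny" outlier here if either test flags it."""
    return (min_max_test(price, average_price)
            or percentage_change_test(price, previous_price))
```

A flagged quote is only a candidate for checking; as the slides stress, most flagged quotes turned out to be correct.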
Some background on ONS editing procedures – salient points
• At the time of study the presumption was that an outlier was incorrect, and therefore declared invalid, unless positively verified by reference to metadata from the price collector or by checking the quote with the shop keeper
– All “scrutiny” & half of Tukey outliers were subject to positive verification (most were correct).
– The rest, that is, those not positively verified, were assumed incorrect.
– The number explicitly accepted after verification was 100 times the number explicitly rejected (indicates imbalance & potential bias)
Some background on ONS editing procedures – salient points
• Automated filtering mechanisms avoid manual examination of large numbers of prices over a short time
• But automated filtering mechanisms need to be supported by “well-informed” manual editing
– Especially when there can be unpredictable variations in prices (e.g. seasonal goods, sales)
• Two main issues for ONS & motivation for the research
– The efficiency of its editing procedures
– The impact on the accuracy of the RPI/CPI.
The Tukey Algorithm
• Price quotes are ordered by the corresponding price ratios
• Highest and lowest 5 per cent are flagged for further investigation and excluded
– Price ratios equal to 1 are excluded (i.e. no price change)
• Arithmetic mean (AM) of the remaining price ratios is calculated and used to divide them into a lower and an upper group; the mean of each group gives the lower and upper trimmed means
• The upper and lower Tukey limits used to flag those price observations which warrant attention are then calculated as follows:
– TU = AM + 2.5 (AMU – AM)
– TL = AM – 2.5 (AM – AML)
where AML is the lower trimmed mean and AMU is the upper trimmed mean.
• Tukey maximises use of immediate price history
– Can be used for monthly or annual change
– AML & AMU can be regularly recalculated
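The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ONS implementation: the function and parameter names are invented here, the 5 per cent tails are trimmed before ratios equal to 1 are dropped (the slide does not fix the order), and AML and AMU are taken as the means of the ratios below and above AM. Only quotes outside TL and TU are returned; the trimmed tails, which the slide also flags, are not listed separately.

```python
from statistics import mean


def tukey_limits(ratios, trim_pct=5, k=2.5):
    """Return (TL, TU), the lower and upper Tukey limits, for price ratios."""
    ordered = sorted(ratios)
    n_trim = len(ordered) * trim_pct // 100    # quotes cut from each tail
    core = ordered[n_trim:len(ordered) - n_trim]
    core = [r for r in core if r != 1]         # drop "no price change" ratios
    if not core:
        raise ValueError("no price ratios left after trimming")
    am = mean(core)                            # AM
    lower = [r for r in core if r < am]
    upper = [r for r in core if r > am]
    if not lower or not upper:
        raise ValueError("need ratios on both sides of the mean")
    aml, amu = mean(lower), mean(upper)        # AML and AMU
    return am - k * (am - aml), am + k * (amu - am)


def flag_outliers(ratios, trim_pct=5, k=2.5):
    """Return the ratios that fall outside the Tukey limits."""
    tl, tu = tukey_limits(ratios, trim_pct, k)
    return [r for r in ratios if r < tl or r > tu]
```

Raising or lowering `k` (2.5 on the slide) widens or narrows the limits, which is one way to restrict attention to extreme outliers.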
Issue 1: efficiency of editing procedures
• Considerable overlap between
– Interactive editing in field
• Min max test, metadata (S=sale; R=recovery from sale), logistical checks (R must follow S)
– “Scrutiny”
• Not real time & less sophisticated than Tukey but quick to identify extreme outliers & can be run without a large body of prices data
– Tukey
• Sophisticated but not interactive
• More efficient than “scrutiny” which identifies many “outliers” which are valid price quotes
• “Scrutiny” reduces efficiency of Tukey by prior exclusion of “scrutiny” outliers from Tukey
– “Scrutiny” retained because quick & simple but the presumption of an outlier being “wrong until proven right” challenged
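The logistical check mentioned for field editing (R must follow S) can be illustrated with a small sketch. The representation is an assumption: one item's indicator codes are held as a list in collection order, an empty string stands for any other code, and a sale is treated as still open until an R closes it. The function name is hypothetical.

```python
def recovery_without_sale(codes):
    """Return the positions of R codes that do not follow an open S."""
    on_sale = False
    violations = []
    for position, code in enumerate(codes):
        if code == "S":
            on_sale = True                  # sale price recorded
        elif code == "R":
            if not on_sale:
                violations.append(position) # recovery with no sale before it
            on_sale = False                 # the sale is closed by the recovery
    return violations
```

In the field such a check would run as the collector keys the code, so that a query can be raised while the collector is still in the shop.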
Issue 2: impact on index & potential bias
• Set of outliers may not necessarily adequately overlap set of “incorrect” prices
– Initial investigations showed number of Tukey (& “Scrutiny”) outliers which were incorrect was small.
• Better to focus on
– Extreme outliers
– Prices that have not changed for many months (editing ignores)
• Study undertaken of clothing sub-index of RPI
Clothing: underlying analysis
• Were clothes prices really lower than 15 years ago?
[Chart 1: Clothing Group index (base Jan 87=100), Jan 1987 to Jul 2002, with 13 point moving average]
Clothing: underlying analysis
[Chart 2: US Apparel Group index (rebased to Jan 87=100), Jan 1987 to Jul 2002, with 13 point moving average]
Clothing: underlying analysis
• Deeper sales & shallower recoveries?
– But not uniform across clothing
[Chart 5: Clothing Sections index (base Jan 87=100), Jan 1987 to Jul 2002; series: Men's outerwear, Women's outerwear, Children's outerwear, Footwear, Other clothing]
The research questions
• Is the data editor at HQ more reliable than the price collector?
– Automated editing automatically re-codes price quotes for replacement items from “non-comparable” to “comparable” where the price difference is small
– “Scrutiny” can overturn in a few seconds a considered decision by price collector
• “Odd” results: only 10% as many “recoveries” as sales
• Correct categorisation of replacement item as comparable or non-comparable is important.
– Tukey doesn’t test this
– “Scrutiny”
• An examination of original & final indicator codes shows over half of “non-comparable” replacements were reclassified at HQ (a half to “comparable”) & 10% of “C”s were changed to “N”s
• Analysis of price relatives indicates latter might have an undue influence on “Scrutiny” with a cumulative effect on index
Research question: what has been the numerical impact of edited changes in “indicator” codes?
• ONS records original and final (after all editing) “indicator” codes
– Original indicator codes gave higher index
[Chart: Clothing group index (2002), by month, Jan to the following Jan; series: Final indicator, Original indicator]
Research question: what has been the numerical impact of edited changes in “indicator” codes?
[Chart: Clothing group index (Jan 99 base), by month, Jan 1999 to Jan 2003; series: Final indicator, Original indicator]
Research question: is the price collector’s judgement better than HQ’s?
• Auditor back check of indicator codes
– Collectors’ decisions generally accurate & better than HQ’s
[Chart: Indicator code decisions, 0% to 100%, for March, April, May and Total; legend: scrutiny/validation, auditor, both, unchanged]
Further thoughts on Tukey
• Tukey is robust
– The implicit thresholds defined by Tukey are not subject to substantial revision on receipt of new data
• Tukey could be used more effectively
– Explicit parameters can be adjusted to identify extreme outliers
– Double application
• First apply Tukey to initial prices data and then to full set
• Validate but suspend validation decisions from first application until confirmed by second
• Tukey cannot be applied to centrally collected prices or centrally calculated indices
– Too few quotes
– Rely on scrutiny
– But “centrals” represent the biggest inherent risk to the index (small number of quotes, large weight)
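One possible reading of the double application suggested above is sketched below. It is an interpretation, not a documented ONS procedure: flags from the first run on the initial data are held in suspense and only stand if the second run on the full set flags the same quote. All names are hypothetical, and `median_band_flags` is a simple stand-in rule used only to keep the example short; in practice the flagging function would be the Tukey test.

```python
from statistics import median


def double_application(initial, full, flag_fn):
    """Run flag_fn on the initial and the full data; compare the two sets of flags.

    initial, full: dicts mapping a quote id to its price ratio.
    Returns (confirmed, released, new) as sets of quote ids.
    """
    first = flag_fn(initial)       # provisional flags, decisions suspended
    second = flag_fn(full)         # flags once the full set is available
    confirmed = first & second     # flagged both times
    released = first - second      # flagged at first, not confirmed
    new = second - first           # flagged only on the full set
    return confirmed, released, new


def median_band_flags(ratios_by_id, width=0.35):
    """Stand-in rule (NOT Tukey): flag ratios far from the median ratio."""
    centre = median(ratios_by_id.values())
    return {q for q, r in ratios_by_id.items() if abs(r - centre) > width}
```

Quotes in the `released` set are those where acting on the first run alone would have rejected a price that the full data no longer treats as unusual.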
Conclusions
• Data editing “filtering” mechanisms
– Can increase efficiency. But
• can exclude “correct” prices
• do not guarantee a better index
• Tukey can be undermined
– By other editing procedures
– By the presumption that an outlier is an invalid price unless subsequently positively validated by editing
– By automatic re-coding of non-comparable “replacements” as comparable “replacements” e.g. where the price difference is small and there is no reference to the indicator code
• The judgement of the price collector is generally to be preferred
• Centrally collected prices and centrally calculated indices account for 40% of the basket and Tukey doesn’t help these
END OF PRESENTATION