
The Role of the Tukey algorithm in validation procedures for prices data in a consumer price index: the UK experience on this and more general aspects of data editing

David Fenwick

Editing of CPI prices data: Central Premise

1. Editing is a non-trivial issue. It can have a systematic numerical impact on measured inflation which can lead to bias

2. Automated editing can improve the quality of the index and increase operational efficiency

3. Both of the above statements apply to the Tukey algorithm

Auditing and editing need to be in “real time” because shop prices can change quickly

Some background on ONS editing procedures – salient points

• Most prices are collected by handheld computer with the facility for interactive editing in real time

• Two distinct algorithms operate at HQ to identify outliers amongst price quotes, often operating in parallel

– Scrutiny, two tests by reference to “average” price of same/similar items (2k from 100k)

• The minimum-maximum test

• The percentage change test

– Tukey, in essence a more sophisticated version of scrutiny. This is applied to price quotes not identified by scrutiny as outliers (4k from 100k)


• At the time of study the presumption was that an outlier was incorrect, and therefore declared invalid, unless positively verified by reference to metadata from the price collector or by checking the quote with the shop keeper

– All “scrutiny” & half of Tukey outliers were subject to positive verification (most were correct).

– The rest, that is those not positively verified were assumed incorrect.

– The number explicitly accepted after verification was 100 times the number explicitly rejected (indicates imbalance & potential bias)


• Automated filtering mechanisms avoid manual examination of large numbers of prices over a short time

• But automated filtering mechanisms need to be supported by “well-informed” manual editing

– Especially when there can be unpredictable variations in prices (e.g. seasonal goods, sales)

• Two main issues for ONS & motivation for the research

– The efficiency of its editing procedures

– The impact on the accuracy of the RPI/CPI.

The Tukey Algorithm

• Price quotes are ordered by the corresponding price ratios

• Highest and lowest 5 per cent are flagged for further investigation and excluded

– Price ratios equal to 1 are excluded (i.e. no price change)

• Arithmetic mean (AM) of the remaining price ratios is used to divide them into a lower and an upper group, and the mean of each group (the lower and upper trimmed means) is calculated

• The upper and lower Tukey limits used to flag those price observations which warrant attention are then calculated as follows:

– TU = AM + 2.5 (AMU – AM)

– TL = AM – 2.5 (AM – AML)

where AML is the lower trimmed mean and AMU is the upper trimmed mean.

• Tukey maximises use of immediate price history

– Can be used for monthly or annual change

– AML & AMU can be regularly recalculated
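The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ONS implementation: the function names are hypothetical, the 5 per cent trim is rounded down to a whole number of quotes, "divide" is read as splitting the ratios into those below and those above AM, and ratios exactly equal to AM are left out of both groups.

```python
from statistics import mean


def tukey_limits(ratios, trim_percent=5, k=2.5):
    """Return (TL, TU) for a list of price ratios.

    Assumptions (not stated on the slide): the trim count is rounded
    down, and ratios equal to AM belong to neither group.
    """
    ordered = sorted(ratios)
    n_trim = len(ordered) * trim_percent // 100
    # Exclude the highest and lowest 5 per cent of the ordered ratios.
    kept = ordered[n_trim:len(ordered) - n_trim]
    # Exclude ratios equal to 1 (no price change).
    kept = [r for r in kept if r != 1.0]
    if not kept:
        raise ValueError("no price changes left after trimming")
    am = mean(kept)
    lower = [r for r in kept if r < am]
    upper = [r for r in kept if r > am]
    if not lower or not upper:
        raise ValueError("need ratios on both sides of the mean")
    aml = mean(lower)  # lower trimmed mean
    amu = mean(upper)  # upper trimmed mean
    tu = am + k * (amu - am)
    tl = am - k * (am - aml)
    return tl, tu


def flag_outliers(ratios, **kwargs):
    """Return the ratios that fall outside the Tukey limits."""
    tl, tu = tukey_limits(ratios, **kwargs)
    return [r for r in ratios if r < tl or r > tu]
```

Because the limits are built from the two group means rather than from a single spread measure, they need not be symmetric about AM: a distribution with a long upper tail gets a wider upper limit than lower limit.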

Issue 1: efficiency of editing procedures

• Considerable overlap between

– Interactive editing in field

• Min max test, metadata (S=sale; R=recovery from sale), logistical checks (R must follow S)

– “Scrutiny”

• Not real time & less sophisticated than Tukey but quick to identify extreme outliers & can be run without a large body of prices data

– Tukey

• Sophisticated but not interactive

• More efficient than “scrutiny”, which identifies many “outliers” which are valid price quotes

• “Scrutiny” reduces efficiency of Tukey by prior exclusion of “scrutiny” outliers from Tukey

– “Scrutiny” retained because quick & simple but the presumption of an outlier being “wrong until proven right” challenged
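The field check that R must follow S can be illustrated with a short Python sketch. It assumes one indicator code per collection period for a single item, and reads "follow" as "the immediately preceding quote was marked S"; the function name and the use of an empty string for "no code" are hypothetical.

```python
def invalid_recoveries(codes):
    """Return the positions of 'R' codes not directly preceded by 'S'.

    `codes` is the sequence of indicator codes for one item, one entry
    per collection period ('' where no code was recorded). This is an
    assumed data layout, not the ONS handheld format.
    """
    flagged = []
    previous = None
    for period, code in enumerate(codes):
        if code == "R" and previous != "S":
            flagged.append(period)
        previous = code
    return flagged
```

For example, `invalid_recoveries(["", "S", "R", "", "R"])` returns `[4]`: the first recovery follows a sale, the second does not.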

Issue 2: impact on index & potential bias

• Set of outliers may not necessarily adequately overlap set of “incorrect” prices

– Initial investigations showed number of Tukey (& “Scrutiny”) outliers which were incorrect was small.

• Better to focus on

– Extreme outliers

– Prices that have not changed for many months (editing ignores)

• Study undertaken of clothing sub-index of RPI

Clothing: underlying analysis

• Were clothes prices really lower than 15 years ago?

Chart 1: Clothing Group index (base Jan 87=100) and its 13 point moving average, Jan 1987 to Jul 2002 (vertical axis 95 to 125)


Chart 2: US Apparel Group index (rebased to Jan 87=100) and its 13 point moving average, Jan 87 to Jul 02 (vertical axis 100 to 130)


• Deeper sales & shallower recoveries?

– But not uniform across clothing

Chart 5: Clothing Sections index (base Jan 87=100), Jan 1987 to Jul 2002 (vertical axis 70 to 170). Series: Men's outerwear, Women's outerwear, Children's outerwear, Footwear, Other clothing

The research questions

• Is the data editor at HQ more reliable than the price collector?

– Automated editing automatically over-riding price quotes for replacement items from “non-comparable” to “comparable” where small price difference

– “Scrutiny” can overturn in a few seconds a considered decision by price collector

• “Odd” results: only 10% as many “recoveries” as sales

• Correct categorisation of replacement item as comparable or non-comparable is important.

– Tukey doesn’t test this

– “Scrutiny”

• An examination of original & final indicator codes shows over half of “non-comparable” replacements were reclassified at HQ (a half to “comparable”) & 10% of “C”s were changed to “N”s

• Analysis of price relatives indicates the latter might have an undue influence on “Scrutiny” with a cumulative effect on index
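The automatic over-riding of “non-comparable” to “comparable” described above can be pictured as a rule of the following kind. This is only a sketch: the slides do not give the threshold ONS used, so the 5% tolerance and the function name are hypothetical. The point it illustrates is that the rule looks only at the price difference and never at the collector's reasons for choosing the code.

```python
def hq_recode(indicator, old_price, new_price, tolerance=0.05):
    """Recode a replacement from 'N' to 'C' when the price change is small.

    `tolerance` is an assumed value for illustration; the slides only
    say the recoding happens "where small price difference".
    """
    relative_change = abs(new_price / old_price - 1)
    if indicator == "N" and relative_change <= tolerance:
        return "C"  # the collector's judgement is overridden
    return indicator
```

With these assumed settings, `hq_recode("N", 20.0, 20.5)` returns `"C"`, while `hq_recode("N", 20.0, 25.0)` leaves the code as `"N"`.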

Research question: what has been the numerical impact of edited changes in “indicator” codes?

• ONS records original and final (after all editing) “indicator” codes

– Original indicator codes gave higher index

Chart: Clothing group index (2002), by month, Jan to Jan (vertical axis 94 to 110). Series: Final indicator, Original indicator


Chart: Clothing group index (Jan 99 base), by month, Jan-99 to Jan-03 (vertical axis 86 to 110). Series: Final indicator, Original indicator

Research question: is the price collector’s judgement better than HQ’s?

• Auditor back check of indicator codes

– Collectors’ decisions generally accurate & better than HQ’s

Chart: Indicator code decisions, March, April, May and Total (vertical axis 0% to 100%). Legend: scrutiny/validation, auditor, both, unchanged

Further thoughts on Tukey

• Tukey is robust

– The implicit thresholds defined by Tukey are not subject to substantial revision on receipt of new data

• Tukey could be used more effectively

– Explicit parameters can be adjusted to identify extreme outliers

– Double application

• First apply Tukey to initial prices data and then to full set

• Validate but suspend validation decisions from first application until confirmed by second

• Tukey cannot be applied to centrally collected prices or centrally calculated indices

– Too few quotes

– Rely on scrutiny

– But “centrals” represent the biggest inherent risk to the index (small number of quotes, large weight)
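The double application suggested above might look like the following Python sketch. The slides do not spell out the mechanics, so several things here are assumptions: quotes are keyed by a hypothetical quote ID, the full set contains every quote in the initial set, and a provisional flag counts as "confirmed" only if the quote is still outside the limits recalculated on the full set. The helper `tukey_limits` is a compact restatement of the limits defined earlier in the presentation, with the 5 per cent trim rounded down.

```python
from statistics import mean


def tukey_limits(ratios, trim_percent=5, k=2.5):
    """Lower and upper Tukey limits (TL, TU) for a collection of ratios."""
    ordered = sorted(ratios)
    n_trim = len(ordered) * trim_percent // 100
    kept = [r for r in ordered[n_trim:len(ordered) - n_trim] if r != 1.0]
    am = mean(kept)
    aml = mean([r for r in kept if r < am])  # raises if the group is empty
    amu = mean([r for r in kept if r > am])
    return am - k * (am - aml), am + k * (amu - am)


def outside(ratio, limits):
    """True if the ratio lies outside the (TL, TU) pair."""
    tl, tu = limits
    return ratio < tl or ratio > tu


def double_application(initial, full):
    """Apply Tukey to the initial quotes, then to the full set.

    `initial` and `full` map a quote ID to its price ratio. Returns
    (confirmed, released): provisional outliers from the first pass that
    the second pass does or does not confirm.
    """
    first = tukey_limits(initial.values())
    second = tukey_limits(full.values())
    provisional = [q for q, r in initial.items() if outside(r, first)]
    confirmed = [q for q in provisional if outside(full[q], second)]
    released = [q for q in provisional if q not in confirmed]
    return confirmed, released
```

The design choice is that the first pass only queues quotes for validation; nothing is rejected until the second pass, on the larger body of data, agrees. A quote flagged early in the collection period is released if later quotes show the same movement to be widespread.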

Conclusions

• Data editing “filtering” mechanisms

– Can increase efficiency. But

• can exclude “correct” prices

• do not guarantee a better index

• Tukey can be undermined

– by other editing procedures

– by the presumption that an outlier is an invalid price unless subsequently positively validated by editing

– by automatic re-coding of non-comparable “replacements” as comparable “replacements” e.g. where the price difference is small and there is no reference to the indicator code

• The judgement of the price collector is generally to be preferred

• Centrally collected prices and centrally calculated indices account for 40% of the basket and Tukey doesn’t help these

END OF PRESENTATION
