it never rains on sunday: the prevalence and implications of untagged multi-day rainfall...



INTERNATIONAL JOURNAL OF CLIMATOLOGY

Int. J. Climatol. 24: 1171–1192 (2004)

Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/joc.1053

IT NEVER RAINS ON SUNDAY: THE PREVALENCE AND IMPLICATIONS OF UNTAGGED MULTI-DAY RAINFALL ACCUMULATIONS IN THE AUSTRALIAN HIGH QUALITY DATA SET

NEIL R. VINEY* and BRYSON C. BATES
CSIRO Land and Water, Wembley, WA, Australia

Received 13 November 2003
Revised 18 March 2004
Accepted 30 March 2004

ABSTRACT

The perception prevalent in the literature that Australian rainfall records are reasonably uncontaminated by untagged weekend accumulations is reassessed. An objective probabilistic test for untagged accumulations is developed and applied to 181 gauges that have previously been identified as having high-quality data suitable for long-term analyses of climate change. As many as 102 of these gauges are found to contain hidden, untagged accumulations, and the overall prevalence of untagged accumulations in the high-quality data set is shown to be only slightly less than that of tagged accumulations. A simple study simulating the effects of accumulations in the records of the high-quality data set shows that, in records (or parts of records) with frequent accumulations, rainfall probability, mean wet-spell length and mean dry-spell length can be underestimated by as much as 24%, 34% and 18% respectively, and that the magnitude of the potential prediction error in these variables (and also in indices of rainfall intensity extremes) at a site shows strong dependence on the rainfall probability. Selected published studies on climate change are reanalysed to account for the presence of untagged accumulations and to show that significant changes in long-term trends can be obtained for individual locations. Copyright 2004 Royal Meteorological Society.

KEY WORDS: Australia; rain gauges; accumulated rainfall; untagged accumulations; day of week; data quality; climate change analysis

1. INTRODUCTION

In order to assess and quantify climate change and variability, consistent, long-term data sets of meteorological variables are required. To provide such a data set, Lavery et al. (1992) scrutinized records for 6600 daily rainfall observation sites within Australia and selected those with records extending back to 1910 and with consistency in observing practices, gauge exposure, gauge location and gauge type. Much of the information they based their assessments upon was taken from station documentation. This was complemented by a suite of statistical tests to illuminate inconsistencies and non-stationarity in exposure and observer diligence. This screening process resulted in a benchmark data set of 191 gauges. In the time that has elapsed since the publication of that data set, a review of updated station documentation has resulted in a further 10 gauges being removed from the list (Haylock and Nicholls, 2000). This data set has come to be known as the ‘high-quality data set’ and has been used in several published analyses of climate change (e.g. Suppiah and Hennessy, 1996, 1998; Groisman et al., 1999; Hennessy et al., 1999; Haylock and Nicholls, 2000; Manton et al., 2001).

In Australia, rain gauges are read at 9 a.m. each day and the rainfall amount is recorded against the date of observation. Throughout this paper, any reference to, for example, Sunday rainfall, means that amount of rain that fell between 9 a.m. Saturday and 9 a.m. Sunday. Thus, the expectation is that most of the Sunday rainfall would in fact have fallen on Saturday.

* Correspondence to: Neil R. Viney, CSIRO Land and Water, Private Bag No. 5, Wembley, WA 6913, Australia; e-mail: [email protected]

Copyright 2004 Royal Meteorological Society

2. OCCURRENCE OF ACCUMULATED DATA IN THE HIGH-QUALITY DATA SET

One data quality issue that was not considered as a selection criterion by Lavery et al. (1992) was the presence and prevalence of accumulated rainfall totals. Such data elements arise when a gauge remains unread for 2 days or more, thus leading to an accumulation of rainfall that has possibly fallen on more than 1 day being present in the gauge when it is eventually read. In these cases, observers are instructed to note in their records the accumulated rainfall amount and the number of days since the gauge was last read. In the published rainfall records, all days that form part of a multi-day accumulation period are flagged with a quality tag that distinguishes them from days with 24 h accumulations. Throughout this paper, we will refer to such records as ‘tagged accumulations’ and will distinguish them from what we call ‘untagged accumulations’ (which we introduce in Section 4).

These tagged accumulations are present, at varying degrees of prevalence, in almost all of the 181 station records. They are particularly prevalent at stations located in public work places, such as post offices, where observers may be absent on weekends and public holidays. At least one-third of the 181 stations are located at post offices, and although some have very few accumulations, several have more than 2500 days with accumulation tags since 1910.

Figure 1 shows the occurrence of accumulations at one typical station, in this case a gauge located at a post office. This site has more than 2300 days with accumulated data in the 91 year period from 1910 to 2000. Immediately evident in Figure 1 is the change of slope in 1974. The number of days associated with accumulation periods is reasonably constant at a rate of about 11 per year up until 1973. However, since then they have occurred at a rate of about 60 per year. The change in slope coincides with the year in which post offices in Australia switched from a 6 day week to a 5 day week. That is, they no longer opened for business on Saturdays. As a consequence, the rainfall observer was often not on-site to read the gauge on either Saturday or Sunday, so Monday’s reading comprised a 3 day accumulation. At the post office in Figure 1, where rain falls on about 80 days per year on average, the fivefold increase in accumulated data after 1974 cannot be explained in full by a change from 2-day weekend accumulations to 3-day weekend accumulations.

Figure 1. Cumulative occurrence of accumulated rainfall totals for a typical rainfall station, 1910–2000 [horizontal axis: year, 1910–2000; vertical axis: number of accumulations, 0–2500]


Clearly, the increase in the reporting rate of accumulated data in 1974 must also have coincided with some other type of cultural change in the gauge reading practices at this station. The conversion from imperial to metric measurement units for rainfall observations, which also occurred in 1974, is unlikely to have had any impact on accumulation reporting rates.

Figure 2(a) shows that almost all the days with accumulations at the same station were Saturdays, Sundays or Mondays. If we remove all accumulated data from the record, then the number of rainy days (defined here as days with non-zero 24 h rainfall totals) is substantially less on the three weekend days (Saturday to Monday) than on the four midweek days (Tuesday to Friday; Figure 2(b)). Interestingly, although Figure 2(a) suggests that accumulations affect Sundays and Mondays with roughly equal frequency, there remain far more rainy Mondays than rainy Sundays in Figure 2(b). Sundays and Mondays are, however, approximately equal in rainfall amount when the accumulations are neglected (Figure 2(c)), but Sundays have substantially heavier mean events (Figure 2(d)). This suggests that although the observers may not have been officially at work on Sunday mornings, they may have been more inclined to make a special effort to read the gauge on Sundays if they knew that there had been substantial rainfall in the previous 24 h. This is highlighted further by the post-1974 record, which shows just 10 rainy Saturdays and three rainy Sundays in 27 years. However, the mean intensity of those Saturday events is twice the midweek average, and that of the Sunday events is three times the midweek average.

Figure 2. Rainfall statistics for the 94 year period 1907–2000, apportioned by day of the week for station 025015 showing (a) the number of days without 24 h rainfall observations, (b) the number of rainy days (excluding days with accumulations), (c) the total rainfall amount, and (d) the mean rainfall amount per rainy day [panel titles: (a) Missing days, (b) Rainy days, (c) Rain amount, (d) Intensity; horizontal axis: day of week, Sun to Sat]

Another aspect of observer bias is shown in Figure 3, for a gauge with about 450 accumulated days in a 51 year period, almost all of which occurred on Sundays and Mondays. If we again neglect days with accumulated data, then we find that both the number of rainy days (Figure 3(b)) and the total rainfall amount (Figure 3(c)) for Sundays are comparable to those on other days. In contrast, both occurrence and amount are substantially reduced on Mondays. In fact, the shortfall in rainy days on Mondays is approximately equal to the number of Monday accumulations. It would appear that at this site the observers read the gauge on Sundays only if there had been rain in the previous 24 h. On other weekends, including those with rain on Monday, the (empty) gauge was not read on Sunday, so Monday’s rain was recorded as a 2 day accumulation. This hypothesis is further supported by the observation that of the 183 Mondays with 1 day accumulations in Figure 3(b), no fewer than 173 directly followed rainy Sundays. Although the observers appear to have acted entirely in accordance with the Bureau of Meteorology’s instructions (and, indeed, have shown extra diligence in observing the gauge so frequently on rainy Sundays when they might have been excused from duty), they have introduced a selection bias into the Sunday observations. Similar observation patterns are evident at other stations prior to the 1970s, particularly those in rainfall districts 16–21 in northern and western South Australia. This observation practice has implications for the development of strategies for coping with accumulated data and will be discussed further in Section 3.

Figure 3. As for Figure 2, but for station 016022 over the period 1920–70

The occurrence of days with accumulated data for the 181 stations of the high-quality data set is shown in Figure 4. The increase in prevalence of accumulated data in 1974 is clearly shown. Groisman et al. (1999) and Hennessy et al. (1999) attribute this increase to the cessation of Saturday trading at post offices in February 1974. There is also evidence of a decline in accumulations since 1992, which Hennessy et al. (1999) attribute to the increasing deployment of automated rain gauges. In the early part of the 20th century, the occurrence of accumulations was reasonably constant at around 0.5% until 1945, with a slightly increasing trend thereafter. Despite presenting some evidence to suggest that Sunday rainfall totals may be about 10% less than those on other days of the week, Hennessy et al. (1999) conclude that, since Monday rainfall totals appear satisfactory, this relative absence of accumulations prior to 1974 is probably due to greater observer diligence, rather than to failed reporting of accumulations.


Figure 4. Occurrence of accumulated data (mean number of days per year per operational station) in the 181 stations of the high-quality data set [horizontal axis: year, 1890–2000; vertical axis: accumulation prevalence, 0–14]

3. APPROACHES TO COPING WITH ACCUMULATED DATA

Despite the obvious potential for observer selectivity to bias some of the rainfall observations, particularly at those stations with large numbers of accumulations, it does not necessarily follow that Lavery et al. (1992) erred in including those stations in the high-quality data set. Firstly, it is not unreasonable to expect that total rainfall will be more or less preserved in such records. Secondly, given that the accumulations are tagged, it is relatively easy for the user to find them and devise simple ways of dealing with them.

The treatment of accumulated data depends on the application they are being used for. For example, Suppiah and Hennessy (1996), in a study of the intensity and frequency of heavy summer rainfall in tropical Australia, distributed accumulated rainfall evenly over the accumulation period. Despite the obvious tendency to overestimate slightly the frequency of rainy days and to underestimate their intensity, this is a justifiable strategy for tropical stations, where there is a high conditional probability of a rainy day given rain on the previous day. As part of their study, Suppiah and Hennessy (1996) tested the effects of several other distribution patterns and found little evidence of significant differences in the intensity of heavy rainfall. Suppiah and Hennessy (1998) and Hennessy et al. (1999) later cited this test as justification for adopting the same distribution strategy in Australia-wide studies of rainfall frequency and intensity, although it is likely to be less valid for temperate regions, where conditional probabilities may be significantly less than in the tropics. In contrast, Haylock and Nicholls (2000) used two separate distribution strategies. One was to treat all accumulations as missing data. However, in their analysis of the number of days with rainfall greater than 1 mm, they appear to have treated each accumulation period as comprising one rainy day amongst a sequence of dry days. Clearly, this latter strategy would lead to an underprediction of the number of rainy days. However, it should be noted that Suppiah and Hennessy (1998), Hennessy et al. (1999) and Haylock and Nicholls (2000) all strived to minimize the potential for problems associated with the use of accumulated data by objective elimination from their analyses of stations with large numbers of accumulations.

An alternative strategy is to use information from neighbouring stations to distribute accumulations. Potential techniques are similar to those used for spatial interpolation, and include Thiessen polygons, distance-weighted interpolation and geostatistical methods such as splining and kriging (Goovaerts, 2000; Xia et al., 2001). However, the accuracy of these techniques decreases with decreasing station density, a problem that is particularly acute in the relatively sparse Australian observational network. Spatial interpolation techniques can also be affected if the quality of observations at neighbouring stations is compromised.

A non-distributary treatment was tested by Suppiah and Hennessy (1996). In their study of heavy tropical rainfall, the use of stations with large numbers of accumulations was unavoidable. Recognizing the potential implications, they compared long-term trends in 90th and 95th percentile rainfall intensities calculated from midweek data only (i.e. Tuesday to Friday) with those calculated using a 7 day week. They found some notable changes in trends for some stations with large numbers of accumulations; but, since similar changes were observed for stations with few accumulations, they concluded that the differences could not necessarily be attributed to the removal of weekend accumulations. They further noted that the reduction in sample size of 43% reduces the significance levels of any observed trends.

In the light of Figure 3, it is worth noting that, as well as depending on the type of application, the treatment of accumulated data, especially the distributary treatments, should also be predicated on the observational practices that lead to the accumulations and on the biases inherent in those practices. For examples like Figure 3, a distribution strategy that apportions all the accumulated rainfall to Monday and leaves Sunday dry is likely to be more appropriate than a strategy that distributes rainfall evenly among the accumulation days.
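The two distributary treatments contrasted in this section can be sketched as follows. This is a minimal illustration rather than code from any of the cited studies: the function names `distribute_evenly` and `assign_to_last_day` are hypothetical, and an accumulation is assumed to be given simply as a total depth in millimetres and a length in days.

```python
def distribute_evenly(total_mm, n_days):
    """Spread an accumulated total equally over every day of the period.

    This is the even-distribution treatment; it tends to raise the
    rainy-day count and lower the per-day intensity.
    """
    return [total_mm / n_days] * n_days


def assign_to_last_day(total_mm, n_days):
    """Put the whole total on the day the gauge was read (e.g. Monday).

    Earlier days in the period (e.g. Sunday) are left dry, which suits
    gauges where the observer skipped only dry Sundays.
    """
    return [0.0] * (n_days - 1) + [total_mm]


# A 12 mm total read on Monday after an unread Sunday (2 day accumulation).
even = distribute_evenly(12.0, 2)
last = assign_to_last_day(12.0, 2)
```

Both treatments preserve the total depth; they differ only in how many rainy days, and of what intensity, they imply.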

4. UNTAGGED ACCUMULATIONS

The statement by Hennessy et al. (1999), that the relatively low occurrence of rainfall accumulations in the first half of the 20th century was due to greater observer diligence, merits closer investigation. Consider Figure 5. This station, which is part of the high-quality data set, has fewer than 200 days with accumulated data in the period 1907–62, almost all of them on Sunday and Monday. However, after removing those days with accumulated data from the analysis, there are only 87 rainy Sundays (Figure 5(b)), compared with 1083 rainy Mondays and an average of 889 rainy days for the remaining days of the week. Clearly, even if all 96 Sundays with accumulated data (Figure 5(a)) are added to this total, rainy Sundays are still vastly underrepresented. The discrepancy is too large to have been caused by chance. The number of rainy Mondays is substantially higher than the Tuesday–Saturday average, despite having 97 Mondays with accumulated data removed from the analysis. Clearly, rainy Mondays are overrepresented in Figure 5(b). In Figure 5(c), the rainfall totals on Sundays and Mondays are respectively significantly less and significantly greater than the average of the other days. The mean intensities for both Sunday and Monday are slightly greater than the Tuesday–Saturday average (Figure 5(d)).

Figure 5. As for Figure 2, but for station 010505 over the period 1907–62

Clearly, the gauge depicted in Figure 5 includes a very large number of untagged weekend accumulations. In other words, while recording the rainfall depth of weekend accumulations as Monday rainfall, the observers have failed to flag most of the accumulations in their records.

For a station where the untagged accumulations are overwhelmingly 2 day accumulations (Sunday and Monday), we can obtain a first approximation of the number of untagged accumulations from the difference between the number of rainy Mondays and the number of rainy Sundays. In the case depicted in Figure 5, we may surmise that about 1000 weekends with untagged accumulations are concealed within the 56 year record.
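The first approximation described above amounts to a day-of-week tally followed by a subtraction. The sketch below is illustrative only: the record layout (date, rainfall in mm, tagged flag) and the function names are assumptions, not the format of the published records.

```python
from datetime import date

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def rainy_day_counts(records):
    """Count rainy days per day of week, skipping tagged accumulations.

    `records` is an iterable of (date, rain_mm, tagged) tuples; this
    layout is hypothetical.
    """
    counts = {name: 0 for name in DAY_NAMES}
    for day, rain_mm, tagged in records:
        if tagged:
            continue  # days within tagged accumulations are excluded
        if rain_mm > 0:
            counts[DAY_NAMES[day.weekday()]] += 1  # Monday is 0 in Python
    return counts


def estimate_untagged_weekends(counts):
    """First approximation: rainy Mondays minus rainy Sundays."""
    return max(0, counts["Mon"] - counts["Sun"])


# Rainy-day counts quoted in the text for the Figure 5 gauge.
figure5_estimate = estimate_untagged_weekends({"Mon": 1083, "Sun": 87})
```

With the Figure 5 counts the difference is 996, in line with the "about 1000 weekends" quoted in the text.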

Another example from the high-quality data set of a gauge with untagged accumulations is shown in Figure 6. Here, during the 7 year period 1956–62 there were no tagged accumulations and no missing data. However, rainfall was recorded on only one Sunday during the period out of a total of 740 rainy days. That one Sunday event was, however, a significant fall of nearly 12 mm. As was the case with the data in Figure 5, Monday rainfall also appears to be over-represented here, both in frequency and amount. Based on the difference between the recorded occurrences of rainy Mondays and rainy Sundays in Figure 6, we may estimate that the data from this gauge contains about 170 weekends with untagged accumulations during the 7 year period.

Figure 6. As for Figure 2, but for station 009557 over the period 1956–62


5. AN OBJECTIVE TEST FOR UNTAGGED ACCUMULATIONS

In this section we develop an objective test for the presence of untagged accumulations and apply it to each of the 181 high-quality gauges for the 111 year period, 1890–2000. We first make the assumption that untagged accumulations overwhelmingly include one Sunday on which the rain gauge remains unobserved and that data recorded between Tuesday and Friday (hereafter referred to as ‘weekdays’) are relatively uncontaminated by untagged accumulations. In making this assumption we note that, apart from weekends, the other times that may be prone to accumulations are public holidays. In Australia, most public holidays fall on Monday, although some (e.g. Anzac Day (25 April) and the Christmas and New Year public holidays in late December and early January) occur on fixed dates and can, therefore, fall on weekdays. We also assume that for any accumulation, tagged or untagged, the probabilities that part of the rain fell during any particular day of the accumulation period are equal. This assumption will obviously not hold for gauges like that in Figure 3, but the effect of such observational biases will act conservatively on the prediction of untagged accumulations.

The test involves calculating the probability that the number of rainy Sundays in any given year could be as few as was observed. In order to do this we need an unbiased estimate of the probability p1 of rain on any given Sunday in a particular year. For this we used the weekday rainfall probability, i.e. the number of rainy weekdays divided by the total number of weekdays during the year. Weekdays that form part of tagged accumulations were ignored. In some years, some of the gauges, particularly those in drier areas or those that were closed for part of the year, had few non-zero weekday rainfall observations. This leads to considerable uncertainty in the estimation of annual rainfall probabilities. In order to reduce this uncertainty, these annual rainfall probabilities were smoothed using a 5-year weighted average. Five years was deemed large enough to increase sample sizes sufficiently, yet was small enough to retain information about short- to medium-term changes in rainfall climate or in observer practice.
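A rough sketch of the estimate of p1 is given below. The text does not state the weights of the 5-year weighted average, so the triangular weights (1, 2, 3, 2, 1), the renormalisation at the ends of the record and the (date, rainfall in mm, tagged flag) record layout are all assumptions made for illustration.

```python
from datetime import date

WEEKDAYS = (1, 2, 3, 4)  # Tuesday to Friday in Python's Monday=0 numbering


def weekday_rain_probability(records):
    """Rainy weekdays divided by all observed weekdays for one year.

    `records` is an iterable of (date, rain_mm, tagged) tuples
    (hypothetical layout). Tagged days are ignored. Returns None when
    the year has no usable weekday observations.
    """
    wet = total = 0
    for day, rain_mm, tagged in records:
        if tagged or day.weekday() not in WEEKDAYS:
            continue
        total += 1
        if rain_mm > 0:
            wet += 1
    return wet / total if total else None


def smooth_five_year(probs, weights=(1, 2, 3, 2, 1)):
    """Centred weighted average of annual probabilities.

    The weights are an assumption. Missing years (None) and years
    beyond the ends of the record are dropped and the remaining
    weights renormalised.
    """
    half = len(weights) // 2
    smoothed = []
    for i in range(len(probs)):
        num = den = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < len(probs) and probs[j] is not None:
                num += w * probs[j]
                den += w
        smoothed.append(num / den if den else None)
    return smoothed
```

A wider window would stabilise the estimate further in dry years, at the cost of blurring the changes in rainfall climate or observer practice that the 5-year choice is meant to retain.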

There is some evidence of systematic variations in rainfall amount on different days of the week, particularly in large cities. For example, Simmonds and Keay (1997) found that in Melbourne, Australia, during the period 1964–90, average daily rainfall totals on weekdays (which they defined as Monday to Friday) were 10% greater than on Saturdays and Sundays. They attributed this difference to the greater anthropogenic heat emissions on weekdays. In developing the high-quality data set, Lavery et al. (1992) were cognizant of the potential for urbanization to mask climatic trends and, consequently, eliminated stations in urban areas from the data set. As a result, effects such as those described by Simmonds and Keay (1997) are unlikely to have significant impact on the high-quality data set or to compromise the assumption made here, that weekday rainfall probability is an adequate predictor of Sunday rainfall probability.

As already noted, the midweek rainfall probability can potentially be affected by untagged accumulations associated with public holidays. For example, many public holidays in Australia occur on Mondays, so some rain gauges may remain unread until Tuesday. If these accumulations are not tagged, then it is likely that the number of rainy Tuesdays will be over-observed, since some of these records will include rain that fell on one or more days prior to a dry Tuesday. However, this effect is compensated by the possibility that during other untagged public-holiday accumulations, some weekday rainfall events will be unobserved. Analysis of the potential overall effect of untagged public-holiday accumulations for a typical cycle of public holidays (10 days per year falling on various days of the week) shows that the expected number of rainy Sundays can be overpredicted slightly for sites with p1 < 0.39 and underpredicted slightly for sites with p1 > 0.39. The maximum magnitude of this overprediction is 0.25 rainy Sundays per year in some of the drier sites. This amount is unlikely to have a significant impact on the detection of untagged Sunday accumulations. Furthermore, since it is highly likely that any gauge with a preponderance of untagged public-holiday accumulations will also have significant numbers of weekend accumulations, the possibility of false identification of a gauge with weekend accumulations is remote. On this basis, p1 was accepted without modification as the predictor of the Sunday rainfall probability.

One problem remains: to take account of any tagged accumulations involving Sundays. We note that, for an accumulation of duration a days, the probability pS that part of this rain fell on Sunday is equal to the probability of Sunday rainfall divided by the probability of rain during the accumulation period. Then, assuming the rainfall sequence may be approximated as a first-order Markov process,

$$p_S = \frac{p_1}{1 - (1 - p_1)(1 - p_{01})^{a-1}}$$

where p01 is the conditional probability of a wet day occurring immediately after a dry day and, like p1, is calculated from weekday data. Thus, using this equation for each tagged accumulation during the year, we may obtain an estimate of the expected number of rainy Sundays associated with the tagged accumulations. This is rounded upwards and added to the observed number of rainy Sundays to give the total number of rainy Sundays during the year No.
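The adjustment for tagged accumulations can be sketched in a few lines. The function names are hypothetical, and the input `accumulation_lengths` is assumed to hold the length in days of each tagged accumulation in the year that includes a Sunday.

```python
import math


def prob_sunday_rain(p1, p01, a):
    """Probability that part of an a-day accumulation fell on the Sunday.

    The denominator is the first-order Markov probability of at least
    one wet day in a days: 1 - (1 - p1) * (1 - p01) ** (a - 1).
    """
    p_rain_in_period = 1.0 - (1.0 - p1) * (1.0 - p01) ** (a - 1)
    return p1 / p_rain_in_period


def total_rainy_sundays(observed, accumulation_lengths, p1, p01):
    """Observed rainy Sundays plus the rounded-up expected number hidden
    in tagged accumulations, giving the quantity called No in the text."""
    expected = sum(prob_sunday_rain(p1, p01, a) for a in accumulation_lengths)
    return observed + math.ceil(expected)


# Example with made-up values: p1 = 0.3, p01 = 0.2, ten observed rainy
# Sundays and three tagged accumulations of 2, 2 and 3 days.
n_o = total_rainy_sundays(10, [2, 2, 3], p1=0.3, p01=0.2)
```

Rounding upwards makes No slightly generous, so the test that follows is less likely to flag a year because of the tagged accumulations alone.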

The probability that the true number of rainy Sundays N does not exceed No in any given year is

$$p(N \le N_o) = \sum_{i=0}^{N_o} \frac{n!}{i!\,(n-i)!}\, p_1^{\,i}\,(1 - p_1)^{n-i} \qquad (1)$$

where n is the number of Sundays in the record. This number includes those Sundays with rainfall observations (whether zero or non-zero) and those Sundays that are part of tagged accumulations, but it does not include periods during which the gauge was not operating at all. In most years n = 52.

Whenever p(N ≤ No) is less than some critical threshold, we may conclude that the number of rainy Sundays appearing in the record for that year is too few to have been caused by chance. These are the years for which it is reasonable to suspect the presence of untagged weekend accumulations.
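Equation (1) is a cumulative binomial probability, so the annual test can be sketched in a few lines. The function names and the default threshold argument are illustrative assumptions; the value 0.0008 is the critical probability reported in the text.

```python
from math import comb

P_CRITICAL = 0.0008  # critical probability pc reported in the text


def non_exceedance_probability(n_obs: int, n: int, p1: float) -> float:
    """Equation (1): probability that the true number of rainy Sundays N
    does not exceed n_obs, given n Sundays and rainfall probability p1."""
    return sum(comb(n, i) * p1**i * (1.0 - p1) ** (n - i)
               for i in range(n_obs + 1))


def year_is_suspect(n_obs: int, n: int, p1: float,
                    pc: float = P_CRITICAL) -> bool:
    """True when the year has too few rainy Sundays to be due to chance."""
    return non_exceedance_probability(n_obs, n, p1) < pc
```

As an illustration, with n = 52 and p1 = 0.25 (about 13 rainy Sundays expected), a year with only 3 rainy Sundays gives a probability of about 0.0003 and is flagged, whereas a year with 4 gives about 0.0014 and is not.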

A Monte Carlo analysis was used to establish the threshold value, which was set at the level that we would expect to be breached by chance in just 1 year of a 100 year period by no more than 5% of gauges. This threshold probability pc was found to be equal to 0.0008 and to be approximately invariant with n and with p1.
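The text does not give the details of the Monte Carlo analysis, so the following is only one plausible reading of it. Synthetic gauges with no accumulations are simulated for 100 years, the lowest annual value of Equation (1) is recorded for each gauge, and the threshold is chosen so that no more than 5% of gauges fall below it by chance. All names and default parameter values are assumptions made for illustration.

```python
import random
from math import comb


def binomial_cdf_table(n: int, p: float) -> list:
    """Cumulative binomial probabilities P(N <= k) for k = 0..n."""
    table, running = [], 0.0
    for k in range(n + 1):
        running += comb(n, k) * p**k * (1.0 - p) ** (n - k)
        table.append(running)
    return table


def simulate_threshold(p1: float = 0.25, n: int = 52, years: int = 100,
                       gauges: int = 400, frac: float = 0.05,
                       seed: int = 1) -> float:
    """Threshold breached in at least 1 of `years` years by at most `frac`
    of simulated gauges whose Sundays are all read correctly."""
    rng = random.Random(seed)
    cdf = binomial_cdf_table(n, p1)
    minima = []
    for _ in range(gauges):
        lowest = 1.0
        for _ in range(years):
            # Number of rainy Sundays in one year, drawn day by day.
            rainy = sum(rng.random() < p1 for _ in range(n))
            lowest = min(lowest, cdf[rainy])
        minima.append(lowest)
    minima.sort()
    # At most int(frac * gauges) gauges lie strictly below this value.
    return minima[int(frac * gauges)]
```

With fixed n and p1 the annual probabilities take only a handful of discrete values, so a crude run like this returns one of those levels rather than a smooth estimate. It should be read as an order-of-magnitude check on the reported pc of 0.0008, not as a reproduction of it.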

Figure 7 shows annual values of p(N ≤ No) for one of the gauges in the high-quality data set. It clearly shows a preponderance of untagged weekend accumulations in 1948 and between 1957 and 1965. In the years up to about 1945, the annual non-exceedence probabilities are scattered evenly about 0.5 (the expected value) with a perfectly acceptable minimum of 0.05. However, from the late 1940s to the early 1970s, there is a noticeable decrease in non-exceedence probabilities, with 9 years falling below pc. Non-exceedence probabilities recover somewhat from the early 1970s onwards, although there appear to be periods in the late 1980s and the late 1990s with sustained low values.

Figure 7. Annual non-exceedence probability for rainy Sundays at station 010525, 1890–2000. The solid horizontal lines show the expected mean (0.5) and the critical probability (0.0008). Years with non-exceedence probabilities of less than pc are marked with a solid circle. [Axes: year (horizontal); p(N ≤ Nobs) (vertical).]

During the periods 1949–56 and 1966–71, while the gauge consistently has probability values of less than 0.5, these values never drop below pc. Nonetheless, the persistence of these low values of p(N ≤ No) is suggestive of a systematic under-observation of Sunday rainfall during these periods. For example, the likelihood of p(N ≤ No) dipping below 0.05 for 3 years in a row (e.g. 1966–68) must clearly be much less than 0.05, and may well have been less than we could reasonably expect by chance. To test this, the calculations of Equation (1) were repeated using 3 year and 5 year sequences of Sundays. Given that pc is approximately invariant with n, we may plot these probabilities on the same graph (Figure 8). It is now evident that the entire sequence of years between 1947 and 1973 contains untagged accumulations. Furthermore, the period from 1997 to 2000 is now also identified as having too few Sunday rainfall events, but the questionable period in the late 1980s remains credible.

By way of contrast, the probabilities shown in Figure 9 do not indicate the presence of untagged accumulations at any time during that gauge’s record. Annual non-exceedence probabilities remain evenly scattered about 0.5 throughout the record, with no sustained periods of low probabilities.

The analysis shown in Figures 8 and 9 was carried out for all 181 gauges in the high-quality data set. Untagged accumulations were assumed to occur in years where the 1 year non-exceedence probability was less than pc, or either of the 3 year or 5 year non-exceedence probabilities centred on that year were less than pc. Untagged accumulations were also assumed where both the 3 year probabilities on either side of a particular year were less than pc or where at least two of the 5 year probabilities on either side of a particular year were less than pc.
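The decision rules above can be sketched as a small function. The wording of the flanking rules leaves some room for interpretation, so this sketch makes two labelled assumptions: the flanking 3 year probabilities are taken to be those centred on the years immediately before and after, and the flanking 5 year probabilities are taken to be those centred on the two years before and the two years after. The function name and input layout are hypothetical.

```python
def flag_years(p1yr, p3yr, p5yr, pc=0.0008):
    """Flag years with suspected untagged accumulations.

    Each input is a list aligned by year; p3yr[i] and p5yr[i] are the
    3 year and 5 year non-exceedence probabilities centred on year i.
    Use None where a probability is unavailable.
    """
    n_years = len(p1yr)

    def low(series, i):
        # True when year i exists, has a value, and is subcritical.
        return 0 <= i < n_years and series[i] is not None and series[i] < pc

    flags = []
    for i in range(n_years):
        direct = low(p1yr, i) or low(p3yr, i) or low(p5yr, i)
        # Assumption: both neighbouring 3 year windows are subcritical.
        flanked3 = low(p3yr, i - 1) and low(p3yr, i + 1)
        # Assumption: at least two of the four nearby 5 year windows
        # (centred on i-2, i-1, i+1, i+2) are subcritical.
        flanked5 = sum(low(p5yr, j)
                       for j in (i - 2, i - 1, i + 1, i + 2)) >= 2
        flags.append(direct or flanked3 or flanked5)
    return flags
```

Under these assumptions, a year whose own probabilities are unremarkable is still flagged when it sits between two subcritical 3 year windows, which corresponds to the open markers described for Figure 8.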

Figure 10 indicates that 102 of the 181 gauges show evidence of untagged accumulations during the period 1890 to 2000. Of these, 63 have sequences of at least five successive years of untagged accumulations, with one gauge having as many as 84 years of untagged accumulations. The occurrence of untagged accumulations appears to have declined significantly since the mid-1970s.

A significant proportion of the incidences of untagged accumulations since 1980 occur around 1983, particularly in southeastern Australia. Examination of non-exceedence probability plots for many of the sites in Victoria and Tasmania and some sites in southern New South Wales and eastern South Australia suggests that Sunday rainfall was consistently underrecorded in the early 1980s throughout the region. Figure 9 is an example of a gauge in Victoria with a low (but not subcritical) non-exceedence probability in 1982. Given the widespread prevalence of low non-exceedence probabilities during this period, it is possible that this reflects a natural phenomenon (i.e. that Sunday rainfall was unusually infrequent that year in comparison with other days of the week) and may be associated with the record low values of the southern oscillation index that were recorded at the same time. Further analysis suggests that the low non-exceedence probabilities during this period are not statistical artefacts associated with the use of 5 year weighted averages for annual rainfall probabilities. Thus, if this phenomenon is indeed natural, we might reprieve many of the gauges from the suspicion of having untagged accumulations in the early 1980s, especially those which do not show signs of untagged accumulations in other years.

Figure 8. As for Figure 7, and for the same station, but with the 3 year and 5 year non-exceedence probabilities included. Years with 3 year or 5 year non-exceedence probabilities of less than pc are marked with solid diamonds and triangles respectively. Other years that are flanked on both sides by solid markers are indicated with an open diamond or triangle. [Legend: one-year, three-year, five-year.]

Figure 9. As for Figure 8, but for a gauge (station 080065) without any apparent untagged accumulations.

For each station, we may estimate the number of untagged accumulations by the difference between the expected and observed numbers of rainy Sundays. Using this procedure, but only for years that have been identified by their non-exceedence probabilities as deficient in rainy Sundays, we may compare the prevalence of untagged accumulations with the prevalence of tagged accumulations. Figure 11 shows that the occurrence of untagged accumulations had a slightly increasing trend of 1.0 to 2.5 Sundays per year per station between 1900 and 1961, before declining abruptly in 1962 and again in 1974. In contrast, the occurrence of tagged accumulations increased during the 1940s and 1950s from a steady base of about 0.75 to 1.50, stabilized briefly and then increased abruptly in 1974. There also appears to have been a reduction in the number of tagged accumulations since 1992.
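The estimate described here is a simple difference, which can be sketched as below. The input layout is a hypothetical convenience, the expected count is taken to be n × p1, and negative differences are clipped to zero as an added assumption.

```python
def estimated_untagged_sundays(years):
    """Estimate the number of Sundays hidden in untagged accumulations.

    `years` is an iterable of (n, n_obs, p1, flagged) tuples: the number of
    Sundays in the year, the number of rainy Sundays (observed plus the
    tagged-accumulation correction), the weekday rainfall probability, and
    whether the year was identified as deficient in rainy Sundays.
    """
    total = 0.0
    for n, n_obs, p1, flagged in years:
        if flagged:
            # Expected minus observed rainy Sundays, never negative.
            total += max(0.0, n * p1 - n_obs)
    return total
```

For a flagged year with n = 52, p1 = 0.25 and 3 rainy Sundays, the estimate is 13 − 3 = 10 untagged Sunday accumulations. Years that are not flagged contribute nothing.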

It should be noted that this analysis is likely to underestimate the number of untagged accumulations for two reasons. Firstly, it is applied only to years that are bad enough to be identified as having subcritical probabilities. For many of the gauges, there are other years with suspected untagged accumulations, but where the prevalence of those accumulations is insufficient to reduce p(N ≤ No) below pc. Secondly, the analysis presented here only seeks to identify untagged accumulations that involve an underrecording of Sunday rainfall. It is likely that any station with untagged weekend accumulations will also contain untagged midweek accumulations corresponding to midweek public holidays.

When the numbers of tagged and untagged accumulations are taken together (Figure 11), it is seen that the abrupt changes in each in 1974 caused only a barely perceptible change in the total number of Sunday accumulations, with the decrease in untagged accumulations being offset by the coincident increase in tagged accumulations. To a lesser extent, the decrease in untagged accumulations in 1962 is also offset by an accompanying increase in tagged accumulations. Over the period 1910 to 1992 there is a reasonably consistent increasing trend in total Sunday accumulations. Of course, given the increasing prevalence of 3 day accumulations after 1974, the total number of days involved in accumulations is likely to have increased significantly since then.

Figure 10. Occurrence of untagged accumulations (circles) for each of the 181 gauges of the high-quality data set for the period 1890–2000. Each horizontal line represents one station, with the thin lines indicating the years of operation for each station. Stations are grouped by state and territory (WA: Western Australia; NT: Northern Territory; SA: South Australia; Qld: Queensland; NSW: New South Wales; Vic: Victoria; Tas: Tasmania). There are no stations from the Australian Capital Territory or from offshore territories in the high-quality data set.

Figure 11. Approximate number of Sundays with tagged and untagged accumulations (mean number per year per operational station). [Axes: year (horizontal); days with Sunday accumulations (vertical). Series: untagged, tagged, total.]

Table I. Mean values of rainfall probability and the mean number of tagged and untagged Sunday accumulations per year of record for each state and territory

| State or territory | No. of stations | Rainfall probability | Tagged Sunday accumulations | Untagged Sunday accumulations |
|---|---|---|---|---|
| Western Australia | 45 | 0.22 | 0.75 | 0.96 |
| Northern Territory | 3 | 0.17 | 0.57 | 0.52 |
| South Australia | 38 | 0.22 | 3.00 | 1.19 |
| Queensland | 27 | 0.21 | 0.64 | 0.47 |
| New South Wales | 28 | 0.24 | 0.99 | 1.75 |
| Victoria | 29 | 0.34 | 1.31 | 1.59 |
| Tasmania | 11 | 0.42 | 1.15 | 1.01 |
| All | 181 | 0.25 | 1.36 | 1.15 |

The total number of Sundays with untagged accumulations across the entire 181 high-quality stations is about 22 800, an average of 126 per station. This total is of similar magnitude to the total number of Sundays with tagged accumulations, i.e. 26 800, or 148 per station. The greatest number of Sundays with untagged accumulations at any one station is 1140, which compares with the maximum number of tagged Sunday accumulations (1292). The maximum combined total of Sunday accumulations at any station is 2036, and 16 stations have more than 1000 Sunday accumulations.

Table I lists the mean number of tagged and untagged Sunday accumulations per year of record for each state and territory. Comparisons between states are complicated by the likelihood that regions with higher rainfall probability (and, therefore, more potential Sundays with rainfall) will have greater numbers of accumulations. Despite this, it would appear from Table I that stations in South Australia have significantly more tagged accumulations than the national average, whereas those in New South Wales have significantly more untagged accumulations. Three states (Western Australia, New South Wales and Victoria) have more untagged than tagged accumulations. The large disparity between tagged and untagged accumulations in South Australia reflects, in part, the preponderance of gauges with observational characteristics like those of Figure 3, where Monday rainfall is underrepresented. An objective test similar to Equation (1) could easily be developed to detect such stations, but this has not been pursued here.

6. IMPLICATIONS OF UNTAGGED ACCUMULATIONS: POTENTIAL IMPACT ON CLIMATE INDICES

The magnitude of the potential impact of untagged accumulations on a variety of commonly used climatic indices may be evaluated by simulating the occurrence of weekend accumulations in the rainfall records. For each of the 181 stations the following long-term climatic averages were calculated: rainfall probability p1, 90th (P90) and 95th (P95) percentiles of rainfall intensity, mean wet-spell length tw and mean dry-spell length td. This procedure was repeated after accumulating rainfall totals from two successive days of the week into a single event. In this case, all Thursday rainfall was transferred to Friday’s total. In this way, the effects of 2 day accumulations could be assessed. The procedure was repeated again after all Wednesday and Thursday rainfall was transferred to Friday’s total in order to test the effects of 3 day accumulations. Wednesday, Thursday and Friday were used in this simulation because, in general, they contain far fewer existing accumulations than the weekend days.
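The simulation described above can be sketched as follows for a single station. This is an illustrative outline only: the record format (a date-ordered list of `(date, rain_mm)` pairs with no gaps), the function names and the omission of the percentile indices are all assumptions made to keep the example short.

```python
from datetime import date, timedelta
from itertools import groupby

WED, THU, FRI = 2, 3, 4  # date.weekday() codes (Monday is 0)


def accumulate(records, sources, target=FRI):
    """Move rain recorded on `sources` weekdays onto the next `target` day.

    Rain carried past the end of the record (no later target day) is lost.
    """
    out, carry = [], 0.0
    for day, rain in records:
        weekday = day.weekday()
        if weekday in sources:
            carry += rain
            out.append((day, 0.0))
        elif weekday == target:
            out.append((day, rain + carry))
            carry = 0.0
        else:
            out.append((day, rain))
    return out


def indices(records):
    """Rainfall probability p1 and mean wet- and dry-spell lengths."""
    wet = [rain > 0 for _, rain in records]
    runs = [(is_wet, sum(1 for _ in run)) for is_wet, run in groupby(wet)]
    wet_runs = [length for is_wet, length in runs if is_wet]
    dry_runs = [length for is_wet, length in runs if not is_wet]

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return {"p1": mean(wet), "tw": mean(wet_runs), "td": mean(dry_runs)}
```

Calling `accumulate(records, {THU})` gives the 2 day scheme and `accumulate(records, {WED, THU})` the 3 day scheme; comparing `indices` before and after reproduces the kind of ratios discussed in this section.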

The impacts on the various climatic indices, averaged over all 181 sites, are shown in Table II. The presence of accumulated data leads to decreases in p1, P90, tw and td and to increases in P95. Interestingly, the impact on td is greater for 2 day accumulations than for 3 day accumulations.

The information in Table II, however, because it is averaged over all 181 sites, tells only part of the story. For each of the climatic indices, the magnitude of the response is strongly dependent on the frequency of rainy days (Figures 12–14). The proportional impact on rainfall probability becomes greater as the probability increases (Figure 12). The 3 day accumulations can decrease p1, and hence the observed number of rainy days, by more than 20% at sites with high rainfall probability. Further detailed analysis for selected gauges indicates that, in addition to the gross ratios depicted in Figure 12, both seasonal and interannual ratios of rainfall probability follow the same trend.

Figure 13 shows that, for both 2 day and 3 day accumulations, the 90th and 95th percentiles of daily rainfall totals are slightly greater than for the unaccumulated case for sites with high rainfall probability, but are substantially less for sites with low rainfall probability. For both percentiles, the departure from the unaccumulated case is greater for the 3 day accumulations. The point at which both 90th percentiles cross the 1 : 1 line is at a rainfall probability of about 0.4, whereas the 95th percentiles cross at about 0.2. This explains the observation in Table II of the average P90 decreasing as the level of accumulation increases, while the average P95 increases. Accumulations affect the cumulative distribution function of rainfall amounts in two ways. Firstly, any accumulation of rainfall days will decrease the number of rainy days and, therefore, increase the level of the highest percentile corresponding to zero rain and decrease the intensity of adjacent percentiles. Secondly, accumulation will generally increase the peak intensities and, therefore, the very highest percentile values. Sites with low rainfall probability are likely to have their P90 and P95 values dominated by the first effect, because those percentiles are close to the highest zero-valued percentile. On the other hand, sites with high rainfall probabilities are likely to have their P90 and P95 values dominated by the second effect, thus leading to increases in intensity.

Table II. Mean values of various climatic indices under three simulated accumulation schemes

| Climatic index | None | 2 day | 3 day |
|---|---|---|---|
| Rainfall probability p1 | 0.247 | 0.228 | 0.205 |
| 90th percentile P90 (mm) | 4.15 | 4.03 | 3.79 |
| 95th percentile P95 (mm) | 9.11 | 9.21 | 9.29 |
| Mean wet-spell length tw (days) | 2.02 | 1.65 | 1.56 |
| Mean dry-spell length td (days) | 7.45 | 6.89 | 7.33 |

Figure 12. Ratios of long-term daily rainfall probabilities for data with simulated 2 day (triangles) and 3 day (circles) accumulations to those of the raw rainfall series, plotted as a function of the raw rainfall probabilities. The linear trend lines for the 2 day (solid line) and 3 day (broken line) accumulations are significant at the 1% level. [Axes: rainfall probability (horizontal); proportional rainfall probability (vertical).]

Ratios of average wet- and dry-spell length are shown in Figure 14. Average wet-spell lengths decrease substantially for both accumulation strategies, and the decrease is greatest for gauges with high rainfall probabilities. Average dry-spell length decreases for all 2 day accumulations and for some 3 day accumulations, but it increases for other 3 day accumulations, particularly for sites with high rainfall probability. Detailed analysis of wet and dry spells shows that the fragmentation of spells (i.e. the total number of spells) increases for both accumulation strategies, but that the increase is greater for 2 day accumulations than for 3 day accumulations. This is because many 3 day accumulations actually result in decreased fragmentation, especially in wet locations. For example, the 5 day sequence dry–wet–dry–dry–wet becomes dry–dry–dry–wet–wet when the middle 3 days are accumulated, and this results in a reduction in the total number of spells. In contrast, only one potential 4 day sequence (dry–wet–dry–wet) results in a 2 day accumulation with decreased fragmentation. However, its occurrence probability is so low that its impact on moderating the increased fragmentation caused by other sequences is negligible. As shown in Figure 12, the total number of dry days increases as the level of accumulation increases, especially for sites with high rainfall probability. For 3 day accumulations at these wet sites, this increase in dry days more than offsets the moderate increase in the total number of spells, and thus leads to an increase in average dry-spell length.
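The two fragmentation examples in this paragraph can be checked with a short sketch. The helper names are hypothetical; an accumulation is modelled by marking every day in the window as dry except the last, which is wet if any day in the window was wet.

```python
from itertools import groupby


def count_spells(sequence):
    """Number of consecutive runs of identical states (wet or dry spells)."""
    return sum(1 for _ in groupby(sequence))


def accumulate_window(sequence, start, length):
    """Accumulate `length` days beginning at index `start` onto the last day."""
    window = sequence[start:start + length]
    last = "wet" if "wet" in window else "dry"
    merged = ["dry"] * (length - 1) + [last]
    return sequence[:start] + merged + sequence[start + length:]


five = ["dry", "wet", "dry", "dry", "wet"]
four = ["dry", "wet", "dry", "wet"]

# 3 day accumulation of the middle days: 4 spells become 2.
print(count_spells(five), count_spells(accumulate_window(five, 1, 3)))
# 2 day accumulation of the middle days: 4 spells become 2.
print(count_spells(four), count_spells(accumulate_window(four, 1, 2)))
```

Both cases reduce the number of spells from four to two, which is the decreased fragmentation referred to in the text.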

Figures 12–14 indicate the responses that can be expected in the various climatic indices if the gauges are never read on one or two particular days of the week. As such, they represent the maximum potential impact of accumulated rainfall totals, regardless of whether the accumulations are tagged or not. However, since interannual trends are similar in pattern and magnitude, we could reasonably expect similar responses to individual annual climatic indices as a result of sustained accumulations. This, coupled with the results shown in Figure 10, in which many stations suffer from untagged accumulations in some, but not all, years, suggests that any assessment of long-term trends in annual values of these indices could be severely compromised by the presence of accumulated data.

Figure 13. As for Figure 12, but showing ratios of (a) the long-term 90th percentile and (b) the long-term 95th percentile of daily rainfall totals for the two accumulation strategies to the respective percentiles of the raw rainfall series. The non-linear trend curves for the 2 day (solid line) and 3 day (broken line) accumulations are significant at the 1% level. [Axes: rainfall probability (horizontal); (a) proportional 90th percentiles and (b) proportional 95th percentiles (vertical).]

7. IMPLICATIONS OF UNTAGGED ACCUMULATIONS: REANALYSIS OF CLIMATE CHANGE STUDIES

7.1. Nicholls and Kariko (1993)

Nicholls and Kariko (1993) assessed the number, average length (in days) and average intensity (rain amount per rainy day) of rain events at five stations in eastern Australia between 1910 and 1988. Two of the main selection criteria were that the stations have few days with missing data and few tagged accumulations. The five stations have a maximum of 61 missing days in the 79 year record and a maximum of 130 days with tagged accumulations. For tagged accumulations, Nicholls and Kariko (1993) appear to have assumed the entire accumulation to have fallen on the final day of the accumulation period. This strategy is likely to result in an underestimation of average length and an overestimation of average intensity for sites with a large number of accumulations.

Figure 14. As for Figure 12, but showing ratios of (a) the long-term mean wet-spell length and (b) the long-term mean dry-spell length for the two accumulation strategies to the respective spell lengths of the raw rainfall series. The linear trend lines for the 2 day (solid line) and 3 day (broken line) accumulations are significant at the 1% level. [Axes: rainfall probability (horizontal); (a) proportional wet-spell length and (b) proportional dry-spell length (vertical).]

Three of the five stations that Nicholls and Kariko (1993) used also contain untagged weekend rainfall accumulations. One record (Peak Hill) contains about 300 Sundays as part of untagged accumulations and is also the record with the most tagged accumulations. Using the technique of Suppiah and Hennessy (1996) to sample only midweek days (Tuesday to Friday), it is straightforward to reanalyse average intensities. Furthermore, given that the average wet-spell length is equal to the inverse of the conditional probability that a dry day follows a wet day, it is also possible to reanalyse the number and duration of events. This reanalysis, when compared with Nicholls and Kariko (1993: table 4), indicates that the correlation between number and length of events changes from being positive and significant at the 5% level to being negative and non-significant. Comparison with Nicholls and Kariko (1993: table 7) shows that the correlations with annual total, number and intensity have increased slightly, whereas that for length has decreased substantially and is no longer significant.
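The midweek estimate of wet-spell length can be sketched as follows. The source states only that the mean wet-spell length is the inverse of the conditional probability that a dry day follows a wet day; the use of Tuesday–Wednesday, Wednesday–Thursday and Thursday–Friday pairs, the record format and the function name are assumptions made for this illustration.

```python
from datetime import date, timedelta

MIDWEEK_STARTS = (1, 2, 3)  # Tuesday, Wednesday, Thursday (Monday is 0)


def mean_wet_spell_midweek(records):
    """Mean wet-spell length from midweek day pairs only.

    `records` is a list of (date, rain_mm) pairs. Only pairs of consecutive
    days lying wholly within Tuesday to Friday are used, so weekend
    accumulations cannot affect the estimate. Returns None when no
    wet-to-dry transition is observed.
    """
    rain_on = dict(records)
    wet_days = wet_then_dry = 0
    for day, rain in records:
        if day.weekday() in MIDWEEK_STARTS and rain > 0:
            following = rain_on.get(day + timedelta(days=1))
            if following is None:
                continue  # next day missing from the record
            wet_days += 1
            if following == 0:
                wet_then_dry += 1
    if wet_then_dry == 0:
        return None
    # 1 / P(dry | previous day wet)
    return wet_days / wet_then_dry
```

The number of events in a period then follows as the estimated number of wet days divided by this mean spell length.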

At Peak Hill there is a strong declining trend in the occurrence of accumulated data (tagged and untagged) over the period 1910 to 1988. As a result, we might expect that an analysis that includes weekend data would have a stronger tendency to overestimate average intensity and underestimate event duration earlier in the period compared with later. Consequently, any reanalysis using weekday data only is likely to result in increased (more positive) trends in average intensity and decreased (less positive) trends in event length.

Peak Hill’s midweek rainfall probability is about 0.19. According to Figures 12 and 14, we could expect 2 day and 3 day accumulations to lead to maximum underestimations of the occurrence of wet days of 6% and 14% respectively, and to maximum underestimations in average wet-spell length of 15% and 19% respectively. These combine to yield maximum overpredictions in the respective number of events (spells) of 10% and 6%. The actual impact on these variables would have been a little less than these maximum estimates, because not all Sundays were unread. During the period 1918–53, where the untagged accumulations occur, about 80% of the expected number of rainy Sundays are missing. On the assumption that these untagged accumulations involve only 2 days each, the actual underestimation of wet-spell length during this period might, therefore, be about 12%, with an overprediction in the number of events of about 8%. Again assuming a declining trend in accumulations, Nicholls and Kariko’s (1993) analysis would be expected to have overestimated the magnitude of the positive trend in wet-spell length, but to have underestimated the magnitude of the positive trend in the number of events.
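The arithmetic behind these figures can be retraced with a short sketch, on the assumption that the number of events equals the number of wet days divided by the mean wet-spell length, and that the partial (80%) impact scales linearly. Both assumptions, and the function name, are illustrative rather than taken from the source.

```python
def event_count_bias(wet_day_underestimate, spell_length_underestimate):
    """Relative overprediction in the number of events.

    Events = wet days / mean wet-spell length, so the bias in the count
    is the ratio of the two biased quantities minus one.
    """
    return ((1.0 - wet_day_underestimate)
            / (1.0 - spell_length_underestimate) - 1.0)


two_day = event_count_bias(0.06, 0.15)    # about 0.106
three_day = event_count_bias(0.14, 0.19)  # about 0.062

# Assumed linear scaling when about 80% of rainy Sundays are missing.
fraction_missing = 0.8
spell_length_bias = fraction_missing * 0.15  # about 0.12
event_bias = fraction_missing * 0.10         # about 0.08
```

The rounded inputs give overpredictions of roughly 10.6% and 6.2%, close to the 10% and 6% quoted in the text, which were presumably derived from unrounded values read from the figures.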

7.2. Suppiah and Hennessy (1998)

Suppiah and Hennessy (1998) analysed trends in heavy rainfall, total rainfall and number of dry days during the period 1910–90 for 125 stations across Australia. The stations analysed were those in the high-quality data set with few missing data values and few (less than 584) accumulated days. Rainfall observations for the remaining accumulation days were replaced by mean rainfall in each accumulation period.

Peak Hill is presented (Suppiah and Hennessy, 1998: figure 3(b)) as an example of a station with a significant increase in the 90th percentile of summer rainfall intensity. The slope of the linear regression of 90th percentile rainfall against time is about 3.1 mm/day/century. This site, however, shows evidence of a sustained period of untagged accumulations between 1918 and 1953. During this period, the observed number of Sundays with non-zero rainfall is about 300 fewer than expected.

Figure 15 shows the 90th percentiles of summer rainfall for Peak Hill for each year and for each day of the week. Tagged accumulations have been distributed in the way described by Suppiah and Hennessy (1998). In total there are about 110 days with tagged accumulations between 1954 and 1968, but none thereafter. During the period of untagged accumulations, it is clear that the observed rainfall was more intense on Mondays and less intense on Sundays than for the remaining days of the week. However, during the period of tagged accumulations, the distribution strategy adopted by Suppiah and Hennessy (1998) appears to have produced Sunday and Monday rainfall intensities that are commensurate with the rest of the week. The reason for the elevated intensities for Tuesdays between 1967 and 1975 is not clear, but between 1969 and 1971 there was about 46% more rainfall recorded on Tuesdays than on any other day, despite the number of rainy days being the same. Such a preponderance of Tuesday rainfall is not evident at any of the nearby high-quality rain gauges.

When the 90th percentiles of summer rainfall are reanalysed using only weekday data (Tuesday to Friday), the strength of the increasing trend is reduced (Figure 16). The slope of the regression line is now 2.4 mm/day/century and is no longer significant at the 5% level.

This reanalysis is qualitatively consistent with the expectations of Figure 13, in which a maximum underprediction of 6% is indicated for 2 day accumulations at a station with p1 = 0.19. That the underprediction indicated by Figure 16 is much more substantial in some years may be due either to the scatter in Figure 13 or to the interannual variability in p1.

Figure 15. The 11 year running mean of 90th percentiles of daily November–April rainfall intensity at Peak Hill for each day of the week. [Axes: year (horizontal); rainfall in mm (vertical). Legend: Sunday to Saturday, with the periods of untagged and tagged accumulations marked.]

Figure 16. The 11 year running mean of 90th percentiles of daily November–April rainfall intensity at Peak Hill using data from every day of the week (thick solid line) and Tuesday–Friday only (thick broken line). The trend lines (thin lines) for the respective series are obtained from linear regression of annual intensities. [Axes: year (horizontal); rainfall in mm (vertical).]

It should be noted that there are several other data quality issues associated with this station. For example, it has a poor histogram of daily rainfall totals. In 26 of the 80 years analysed by Suppiah and Hennessy (1998) there are more days with daily rainfall observations R (mm) in the range 1 < R ≤ 2 than there are with observations in the range 0 < R ≤ 1. These 26 years include every year from 1980 to 1990, and the trend continues until 1997. Indeed, between 1980 and 1997, only 3 years had more observations in the lowest millimetre range than in the range 2 < R ≤ 3. Clearly, there has been, until very recent times, a consistent underreporting of small rainfall totals at this site. Consideration should be given to withdrawing this site from the high-quality data set.
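The histogram check described here is easy to automate. The sketch below is illustrative: the function name and the input layout (a mapping from year to that year's daily rainfall totals in millimetres) are assumptions.

```python
def years_with_deficient_small_totals(daily_by_year):
    """Years with more days in 1 < R <= 2 mm than in 0 < R <= 1 mm.

    `daily_by_year` maps each year to a list of daily rainfall totals (mm).
    A well-observed record would normally have the most non-zero
    observations in the lowest millimetre range.
    """
    deficient = []
    for year, totals in sorted(daily_by_year.items()):
        lowest = sum(1 for r in totals if 0 < r <= 1)
        second = sum(1 for r in totals if 1 < r <= 2)
        if second > lowest:
            deficient.append(year)
    return deficient
```

Applied to the 80 years analysed by Suppiah and Hennessy (1998), a check of this kind is what identifies the 26 years mentioned above.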

Using the probabilities given by Equation (1), we can estimate the expected number of rainy Sundays and, by comparison with the observed record, estimate the number of untagged Sunday accumulations in any year. Had Suppiah and Hennessy (1998) used this criterion in the same way as they did for tagged accumulations to eliminate stations with more than 584 accumulations of either type, a further 18 stations, at least, would have been rejected. One of them, Clarence Town, appears in Suppiah and Hennessy (1998: table II). Reanalysis of these data using weekday (Tuesday–Friday) rainfall only results in the sign of the trends in both 90th and 95th percentile rainfall changing from negative to positive. The trends, however, remain non-significant at the 5% level. Nonetheless, this change of sign brings this station into line with all other stations in New South Wales that are shown to have positive trends in the 90th percentile daily rainfall in Suppiah and Hennessy (1998: figure 2).

7.3. Haylock and Nicholls (2000)

Haylock and Nicholls (2000) analysed long-term trends in extreme rainfall at 91 of the stations in the high-quality data set. In contrasting the annual frequency of rainy days at two stations, Goomalling and Collarenebri, they concluded that while the annual number of rainy days (defined as days with at least 1 mm of rain) at Goomalling had declined by a highly significant 23 days per century (p < 0.001) those at Collarenebri had increased by 8 days per century (p < 0.065).

The rainfall records for Goomalling show no evidence of untagged accumulations, although they do include some missing months in 1919 and 1954 that were ignored by Haylock and Nicholls (2000). However, the records for Collarenebri indicate that as many as 152 untagged weekend accumulations are present in the years 1926–45, 1950, 1966–68 and 1979–81. Indeed, there is no Sunday rainfall recorded at all in 10 of the 15 years between 1932 and 1946. There are a further 117 Sundays with tagged accumulations. When this record is reanalysed using a 4 day week (Tuesday to Friday; scaled by 7/4), the trend increases to 10 days per century and the probability that this trend is not greater than zero falls to 0.052. In other words, the arguments of Haylock and Nicholls (2000) about the contrast in trends between the two stations are enhanced by eliminating weekend accumulations.
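The 4 day week reanalysis can be sketched as follows. This is an illustrative reconstruction, not the code used for the reanalysis: the function names and the dictionary input are hypothetical, and the ordinary least-squares slope is an assumption, since the trend estimator is not specified in this section.

```python
from datetime import date, timedelta


def midweek_rainy_days_per_year(daily, threshold_mm=1.0):
    """Annual rainy-day counts from Tuesday-Friday only, scaled by 7/4.

    daily: dict mapping datetime.date -> daily rainfall (mm).
    """
    counts = {}
    for d, r in daily.items():
        counts.setdefault(d.year, 0)  # keep years with no rainy midweek days
        # weekday(): Tuesday = 1, ..., Friday = 4
        if d.weekday() in (1, 2, 3, 4) and r >= threshold_mm:
            counts[d.year] += 1
    # Scale the 4 day count up to a 7 day week equivalent
    return {year: n * 7 / 4 for year, n in sorted(counts.items())}


def trend_per_century(annual):
    """Least-squares slope of annual values, in units per 100 years.

    ASSUMPTION: a simple linear fit; the original trend method may differ.
    """
    years = list(annual)
    if len(years) < 2:
        raise ValueError("need at least two years")
    mean_x = sum(years) / len(years)
    mean_y = sum(annual.values()) / len(years)
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (annual[x] - mean_y) for x in years)
    return 100 * sxy / sxx
```

Weekend and Monday readings never enter the count, so neither tagged nor untagged weekend accumulations can affect the resulting series.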

Those years since 1946 that are not identified as having untagged accumulations tend to be the years with substantial numbers of tagged accumulations. As a consequence, the only extended period of record without accumulations is between 1910 and 1925, whereas for the remainder of the record we would expect the number of rainy days to have been underpredicted. With Collarenebri having a daily rainfall probability of 0.15, Figure 12 suggests that this underprediction could have been by as much as 6% for 2 day accumulations (which were prevalent between 1926 and 1973) and 13% for 3 day accumulations (since 1973). It is clear that these underpredictions of the number of rainy days later in the time series are likely to have led to an underprediction in the magnitude of the positive trend in rainy days, regardless of the threshold used to define a rainy day.

8. DISCUSSION

The presence of untagged accumulations in rainfall records of the high-quality data set presents a far more insidious problem than the presence of tagged accumulations. By definition, the locations (in time) of tagged accumulations are known. Therefore, data users can easily adopt a suitable strategy for dealing with them. However, in a record with untagged accumulations, even though we may be able to estimate their prevalence, we will usually be unable to pinpoint the weekends on which they occur. For example, from the data in Figure 5, we know that about 1000 of the 1083 rainy Mondays are part of untagged weekend accumulations, but we do not know which 1000. From further analysis of the raw data we can discover that 21 of the rainy Mondays directly follow a rainy Sunday, and so cannot be part of a weekend accumulation. However, we still do not know which 1000 of the remaining 1062 rainy Mondays should have been tagged. We are forced, therefore, to treat all 1062 Mondays with suspicion. Furthermore, even if we could identify each specific instance of an untagged accumulation, we still have no way of knowing on which day or days the rain fell or how to distribute the rainfall total among the accumulation days.
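The screening of rainy Mondays described above (1083 rainy Mondays, less the 21 that follow a rainy Sunday, leaving 1062 candidates) can be expressed as a short sketch. The function name and input format are hypothetical, and treating a Monday with no preceding Sunday observation as a candidate is an assumption of the sketch.

```python
from datetime import date, timedelta


def classify_rainy_mondays(daily):
    """Count rainy Mondays and those that could hide a weekend accumulation.

    daily: dict mapping datetime.date -> daily rainfall (mm).
    Returns (rainy_mondays, after_rainy_sunday, candidates).
    """
    rainy = 0
    after_rainy_sunday = 0
    for d, r in daily.items():
        if d.weekday() != 0 or r <= 0:  # weekday() == 0 is Monday
            continue
        rainy += 1
        sunday = daily.get(d - timedelta(days=1))
        # A rainy Sunday means the gauge was read on Sunday, so the
        # Monday total cannot be a weekend accumulation.
        # ASSUMPTION: a missing Sunday leaves the Monday as a candidate.
        if sunday is not None and sunday > 0:
            after_rainy_sunday += 1
    return rainy, after_rainy_sunday, rainy - after_rainy_sunday
```

The third value is only an upper bound on the number of untagged accumulations; the sketch cannot single out which of the candidate Mondays are genuine single-day totals.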

The analysis presented in this paper has shown the potential for misleading or inaccurate inferences about climate change when studies involve rainfall data with untagged accumulations. Studies focusing on the occurrence of rainy days and on the lengths of wet and dry spells are particularly vulnerable to error. Although none of the reanalyses presented in this paper is likely to have substantially altered the broad conclusions of the respective studies, the detail at individual stations can be affected appreciably.

Apart from climate change studies, the presence of untagged accumulations can also have a significant impact on at least three other aspects of rainfall data application. One such area is spatial interpolation of rainfall fields. Jeffrey et al. (2001) used rainfall data from 4600 rainfall stations in Australia to construct interpolated daily rainfall fields on a regular 0.05° grid covering the entire continent. In constructing their interpolated fields, any tagged accumulations were eliminated. Clearly, it would not have been possible to eliminate untagged accumulations, which presumably will have caused some degree of distortion to the interpolated fields.

The second area of concern is rainfall-runoff models, many of which operate on a daily time step, with the primary input being one or more time series of daily rainfall. By various pathways conceptualized in the model, some of this rainfall is converted to streamflow. It is clear that model calibration (i.e. comparison of observed and predicted streamflows to condition model parameters) could be compromised by the use of data with untagged accumulations. How, for instance, is the calibration algorithm expected to cope with the problem of matching model predictions to an observed streamflow event that occurs before the rain that caused it has been recorded? Similar problems might arise during model validation or simulation.

A third area of concern for the presence of untagged accumulations is in stochastic modelling of rainfall series. Stochastic modelling of point rainfall is normally conditioned on observed temporal patterns of rainfall occurrence and observed distributions of rainfall intensity. Where such patterns and distributions are distorted by hidden accumulations, the outcome is likely to be a model that reflects the observed record rather than reality. Of even greater concern is the impact of accumulations on stochastic modelling of spatial rainfall (e.g. Charles et al., 1999; Hughes et al., 1999). As well as relying on temporal patterns of rainfall occurrence, spatial modelling is also strongly dependent on the presence of good-quality patterns of spatial rainfall occurrence to identify and classify daily weather states.

For each of the three examples, appropriate data-handling strategies may be put in place for processing records with tagged accumulations. No such strategies are available for handling untagged accumulations. Indeed, until now, the presence and impacts of untagged accumulations have not been fully recognized. The only reliable data-handling strategy would appear to be to eliminate large amounts of data (either particular years or particular days of the week), or to reject affected stations altogether.

As was the case with tagged accumulations, it does not necessarily follow that Lavery et al. (1992) erred in including so many stations with untagged accumulations in the high-quality data set. Such stations can still provide long-term trends in rainfall amount and, if weekend observations are eliminated, trends in occurrence, intensity and spell length. However, it is debatable whether some of the statistical criteria used by Lavery et al. (1992) to reject potential stations from the high-quality data set were as important as the presence of untagged accumulations. It would be interesting, for example, to compare the potential impacts of rounding errors or metrication with those of untagged accumulations. Perhaps of greater concern in the context of providing a baseline data set for climate change analysis is that most of the statistical screening performed by Lavery et al. (1992) was done on the entire time series and, therefore, did not necessarily identify shorter periods of poor data quality within the series. An example is the underreporting of small rainfall events at Peak Hill, particularly during the 1980s. There may be many other stations with similar short-term data quality issues that can significantly affect apparent climate trends.

9. CONCLUSIONS

This paper has highlighted and quantified the previously unacknowledged presence of untagged accumulations in the Australian high-quality rainfall data set. An objective probabilistic test for untagged accumulations has been developed, and this shows that as many as 102 out of the 181 stations in the data set contain hidden, untagged multi-day rainfall accumulations. Overall, the prevalence of untagged accumulations rivals that of tagged accumulations in the data set. Most of the untagged accumulations occur prior to 1974, but at some stations they have continued at least until 2000. Some other aspects of data quality at particular stations have also been highlighted.

The potential impact of accumulated data, whether tagged or not, on a suite of climatic indices that are commonly used to quantify climate change has been assessed by simulating midweek accumulations in the records of the high-quality data set. These simulations show that, in records (or parts of records) with frequent accumulations, rainfall probability, mean wet-spell length and mean dry-spell length can be underestimated by as much as 24%, 34% and 18% respectively. The magnitude of the potential prediction error at a site shows strong dependence on the rainfall probability. For rainfall probability itself, together with wet-spell length, the magnitude of the underprediction increases with increasing station wetness and with the number of days in the accumulations. Underestimations of the mean dry-spell length also increase with rainfall probability for 2 day accumulations, but they decrease (and may become overpredictions) for 3 day accumulations as rainfall probability increases. Estimates of extreme rainfall intensity (90th and 95th percentiles) for records with accumulations tend to be slightly too large for stations with high rainfall probability and too small for stations with low rainfall probability.
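The idea of imposing artificial midweek accumulations on a clean record and recomputing occurrence indices can be illustrated with a minimal sketch. It is not the simulation procedure used in this paper: the choice of merging days into a Thursday reading, the function names and the list-based input are all assumptions made for illustration.

```python
from datetime import date, timedelta


def simulate_accumulation(series, start, n_days=2, end_weekday=3):
    """Merge, in every week, the n_days ending on end_weekday.

    series: list of daily rainfall (mm); series[0] falls on date `start`.
    Earlier days in each block are set to zero and their rain is added
    to the final day (ASSUMPTION: Thursday, weekday() == 3, by default).
    """
    out = list(series)
    for i in range(len(out)):
        day = start + timedelta(days=i)
        if day.weekday() == end_weekday and i >= n_days - 1:
            for j in range(i - n_days + 1, i):
                out[i] += out[j]
                out[j] = 0.0
    return out


def rain_probability(series):
    """Fraction of days with non-zero rainfall."""
    return sum(r > 0 for r in series) / len(series)


def mean_spell_length(series, wet=True):
    """Mean length of runs of consecutive wet (or dry) days."""
    spells, run = [], 0
    for r in series:
        if (r > 0) == wet:
            run += 1
        elif run:
            spells.append(run)
            run = 0
    if run:
        spells.append(run)
    return sum(spells) / len(spells) if spells else 0.0
```

Comparing `rain_probability` and `mean_spell_length` before and after `simulate_accumulation` gives the kind of relative error discussed above; the total rainfall is unchanged by construction.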

Reanalysis of published studies on climate change that have made use of gauges with substantial numbers of untagged accumulations confirms the errors expected by the simulation data. This reanalysis was conducted using midweek rainfall records only, since those records are less likely to involve either tagged or untagged accumulations than weekend data. Short of rejecting altogether those gauges with large numbers of untagged accumulation data, this method of midweek sampling is recommended for climate change analyses using data with suspected untagged accumulations. Such a strategy is not relevant, however, for other rainfall data applications (such as spatial interpolation, rainfall-runoff modelling or stochastic rainfall modelling) that rely on continuous data of rigorous integrity.

Unwitting use of accumulated rainfall data can lead to mistaken or misleading results, with potentially serious consequences in areas as diverse as climate change research, spatial interpolation, stochastic modelling and hydrological modelling.

ACKNOWLEDGEMENTS

Rainfall data were obtained from the Silo database developed by the Queensland Department of Natural Resources and the Bureau of Meteorology. This work was partly funded by the Cooperative Research Centre for Catchment Hydrology and the Australian Greenhouse Science Program.

REFERENCES

Charles SP, Bates BC, Hughes JP. 1999. A spatiotemporal model for downscaling precipitation occurrence and amounts. Journal of Geophysical Research 104D: 31 657–31 669.

Goovaerts P. 2000. Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. Journal of Hydrology 228: 113–129.

Groisman PY, Karl TR, Easterling DR, Knight RW, Jamason PF, Hennessy KJ, Suppiah R, Page CM, Wibig J, Fortuniak K, Razuvaev VN, Douglas A, Førland E, Zhai P. 1999. Changes in the probability of heavy precipitation: important indicators of climatic change. Climatic Change 42: 243–283.

Haylock M, Nicholls N. 2000. Trends in extreme rainfall indices for an updated high quality data set for Australia, 1910–1998. International Journal of Climatology 20: 1533–1541.

Hennessy KJ, Suppiah R, Page CM. 1999. Australian rainfall changes, 1910–1995. Australian Meteorological Magazine 48: 1–13.

Hughes JP, Guttorp P, Charles SP. 1999. A non-homogeneous hidden Markov model for precipitation occurrence. Applied Statistics 48: 15–30.

Jeffrey SJ, Carter JO, Moodie KB, Beswick AR. 2001. Using spatial interpolation to construct a comprehensive archive of Australian climate data. Environmental Modelling and Software 16: 309–330.

Lavery B, Kariko A, Nicholls N. 1992. A historical rainfall data set for Australia. Australian Meteorological Magazine 40: 33–39.

Manton MJ, Della-Marta PM, Haylock MR, Hennessy KJ, Nicholls N, Chambers LE, Collins DA, Daw G, Finet A, Gunawan D, Inape K, Isobe H, Kestin TS, Lefale P, Leyu CH, Lwin T, Maitrepierre L, Ouprasitwong N, Page CM, Pahalad J, Plummer N, Salinger MJ, Suppiah R, Tran VL, Trewin B, Tibig I, Yee D. 2001. Trends in extreme daily rainfall and temperature in Southeast Asia and the South Pacific: 1961–1998. International Journal of Climatology 21: 269–284.

Nicholls N, Kariko A. 1993. East Australian rainfall events: interannual variations, trends, and relationships with the southern oscillation. Journal of Climate 6: 1141–1152.

Simmonds I, Keay K. 1997. Weekly cycle of meteorological variations in Melbourne and the role of pollution and anthropogenic heat release. Atmospheric Environment 31: 1589–1603.

Suppiah R, Hennessy KJ. 1996. Trends in the intensity and frequency of heavy rainfall in tropical Australia and links with the southern oscillation. Australian Meteorological Magazine 45: 1–17.

Suppiah R, Hennessy KJ. 1998. Trends in total rainfall, heavy rain events and number of dry days in Australia, 1910–1990. International Journal of Climatology 18: 1141–1164.

Xia Y, Fabian P, Winterhalter M, Zhao M. 2001. Forest climatology: estimation and use of daily climatological data for Bavaria, Germany. Agricultural and Forest Meteorology 106: 87–103.
