Quantifying efficiency of homogenisation methods

Post on 31-Jan-2016


Dr. Peter Domonkos

dopeter@t-online.hu

COST HOME ES0601

Measuring efficiency: our expectations

• Gaining the real climatic trends,

• Gaining the real trends and fluctuations,

• Identifying large inhomogeneity-shifts one-by-one,

• Identifying as many shifts as we can

Measuring efficiency: general practice

• Usually the rate of correct detection is examined (Ducré-Robitaille, Mestre, Menne and Williams, etc.)

• Menne and Williams (2005) apply the hit rate (or power, = H), false detection rate (F), false alarm rate (FAR), bias of detection frequency (B), and the improvement in skill compared to random forecasts (HSS).

$$H = \frac{\text{correctly detected shifts}}{\text{all factual shifts}}$$

$$F = \frac{\text{false detected shifts}}{\text{number of ``no shift'' events}}$$

$$\mathrm{FAR} = \frac{\text{false detected shifts}}{\text{all detected shifts}}$$

$$B = \frac{\text{all detected shifts}}{\text{all factual shifts}}$$
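As a rough illustration, the four ratios can be computed from raw counts. This is a minimal sketch, not Menne and Williams' code: the function name, argument names and example counts are hypothetical, and it assumes that every detected shift is either correctly or falsely detected.

```python
def detection_scores(hits, false_detections, factual_shifts, no_shift_events):
    """Return H, F, FAR and B from raw counts (hypothetical helper).

    Assumption: all detected shifts = correct detections + false detections.
    """
    detected = hits + false_detections
    if factual_shifts == 0 or no_shift_events == 0 or detected == 0:
        raise ValueError("scores are undefined when a denominator is zero")
    return {
        "H": hits / factual_shifts,               # hit rate (power)
        "F": false_detections / no_shift_events,  # false detection rate
        "FAR": false_detections / detected,       # false alarm rate
        "B": detected / factual_shifts,           # bias of detection frequency
    }


# Illustrative counts only, not results from the presentation.
scores = detection_scores(hits=40, false_detections=10,
                          factual_shifts=50, no_shift_events=950)
```

With these made-up counts the detection frequency is unbiased (B = 1) even though a fifth of the detections are false, which is why B is reported alongside H and FAR rather than instead of them.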

Measuring efficiency: this presentation

• Arbitrary, but reasonable choices:

• Unit of magnitude: 1 = standard deviation of the estimated noise.

• Factual shift: a shift with magnitude M ≥ M0 between two adjacent 3-year-long periods. M0 = 2 or M0 = 3 here.

• Right detection: a shift with M ≥ 1.5 for M0 = 2 (M ≥ 2 for M0 = 3) is detected with a maximum lapse of 1 year.

• False detection: a shift with M ≥ 1.5 for M0 = 2 (M ≥ 2 for M0 = 3) is detected at year j, but there is no shift in the same direction as the detected one with M > 0 within the (j−2, j+2) period.
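The definitions above can be sketched as a small counting routine. This is only an illustration under stated assumptions: the function name and data layout are hypothetical, the detection thresholds are read as M ≥ 1.5 (for M0 = 2) and M ≥ 2 (for M0 = 3), a right detection is assumed to require the same direction as the factual shift, and the (j−2, j+2) period is treated as inclusive.

```python
def classify_detections(true_shifts, detections, m0=2.0):
    """Count factual shifts (k), right detections (D_R) and false detections (D_F).

    true_shifts, detections: lists of (year, signed magnitude), with the
    magnitude in units of the noise standard deviation.
    """
    m1 = {2.0: 1.5, 3.0: 2.0}[m0]  # detection threshold paired with M0
    factual = [(y, m) for y, m in true_shifts if abs(m) >= m0]
    large = [(y, m) for y, m in detections if abs(m) >= m1]

    # Right detection: a factual shift matched within 1 year
    # (same direction is an assumption of this sketch).
    right = sum(
        1 for fy, fm in factual
        if any(abs(dy - fy) <= 1 and dm * fm > 0 for dy, dm in large)
    )

    # False detection: no true shift of any size in the same direction
    # within j-2 .. j+2 (inclusive bounds are an assumption).
    false = sum(
        1 for dy, dm in large
        if not any(abs(ty - dy) <= 2 and tm * dm > 0 for ty, tm in true_shifts)
    )
    return len(factual), right, false


# Illustrative input only.
true_shifts = [(30, 2.4), (60, -0.8)]
detections = [(31, 2.1), (80, 1.7), (61, -1.6)]
k, d_r, d_f = classify_detections(true_shifts, detections, m0=2.0)
```

In this example the detection at year 61 is neither right nor false: it matches a real but small shift, so it is not penalised, yet it does not count as a hit.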

• Let the number of time series be m, the total number of factual shifts be k, the number of right detections be D_R and the number of false detections be D_F. Then

$$H = \frac{D_R}{k}, \qquad F' = \frac{D_F}{k}$$

$$I_A = H - F' = \frac{D_R - D_F}{k}$$

$$I_B = \frac{D_R - D_F + m}{k + m}$$
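A short sketch of the two identification indices, assuming they read I_A = (D_R − D_F)/k and I_B = (D_R − D_F + m)/(k + m). The reading of I_B, the function name and the example numbers are assumptions. Under this reading I_B stays defined when a dataset contains no factual shifts (k = 0), where I_A cannot be computed.

```python
def identification_efficiency(m, k, d_r, d_f):
    """Return (I_A, I_B) for m series, k factual shifts, d_r right and
    d_f false detections. I_A is None when k == 0 (undefined)."""
    i_a = (d_r - d_f) / k if k > 0 else None
    i_b = (d_r - d_f + m) / (k + m)  # assumed form of I_B
    return i_a, i_b


# Illustrative numbers only, not results from the presentation.
i_a, i_b = identification_efficiency(m=10_000, k=8_000, d_r=6_000, d_f=400)

# Dataset without large shifts: only false detections can lower I_B.
_, i_b_no_shifts = identification_efficiency(m=10_000, k=0, d_r=0, d_f=500)
```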

• Reliability of trends!?

• Let the mean bias of trend slopes caused by inhomogeneities be t_0 before the homogenisation and t after the homogenisation. Then the improvement in trend reliability is indicated by

$$I_T = \frac{t_0 - t}{t_0}$$

• General (combined) efficiency (Domonkos, 5th Seminar, 2006)
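The trend index I_T = (t_0 − t)/t_0 can be illustrated with a toy calculation. This sketch assumes that the "bias of trend slopes" is the mean absolute difference between the least-squares slope of a series and that of the true signal, which the slides do not specify. All names and the step-function data are hypothetical.

```python
def ols_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx


def trend_efficiency(truths, raws, homogenised):
    """I_T = (t0 - t) / t0, with the bias taken as the mean absolute
    slope error (an assumption of this sketch)."""
    def mean_bias(candidates):
        errors = [abs(ols_slope(c) - ols_slope(tr))
                  for c, tr in zip(candidates, truths)]
        return sum(errors) / len(errors)

    t0 = mean_bias(raws)
    t = mean_bias(homogenised)
    return (t0 - t) / t0


# Toy data: no true trend, a shift of 1.0 at year 50 in the raw series,
# and a residual shift of 0.2 after a hypothetical homogenisation.
truth = [0.0] * 100
raw = [0.0] * 50 + [1.0] * 50
adjusted = [0.0] * 50 + [0.2] * 50
i_t = trend_efficiency([truth], [raw], [adjusted])
```

Because the slope of a step function scales with the step size, removing 80% of the shift removes 80% of the trend bias in this toy case.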

Properties of time series

• Five versions of simulated datasets are examined here. Each dataset contains 10,000 time series, each one hundred years long. Their properties range widely, from a single inhomogeneity per time series to very complex inhomogeneity structures (the "Hungarian standard"; Domonkos, 5th Seminar, 2006).

• (1) 1 shift with M = 3;

(2) 1 shift with M = 3 and 4 shifts with M = 1.5;

(3) and (4) shifts with a frequency of 1 per decade, an exponential distribution of M above 1 and a uniform distribution of M below 1, with (3) Mmax < 2 and (4) Mmax < 3;

(5) Hungarian standard
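For orientation, a series of the kind used in datasets (1) and (2) could be generated roughly as follows. This is a hedged sketch rather than the generator used in the study: it assumes white noise with unit standard deviation, uniformly random shift years and random shift directions, and the function name is hypothetical.

```python
import random


def simulate_series(shift_magnitudes, length=100, rng=None):
    """Return (series, shifts): white noise (sd = 1) plus step-like shifts.

    shift_magnitudes: magnitudes in noise-sd units, e.g. [3.0] for
    dataset (1) or [3.0, 1.5, 1.5, 1.5, 1.5] for dataset (2).
    Shift years and signs are drawn at random (an assumption).
    """
    rng = rng or random.Random()
    years = sorted(rng.sample(range(1, length), len(shift_magnitudes)))
    shifts = [(year, mag * rng.choice((-1, 1)))
              for year, mag in zip(years, shift_magnitudes)]
    shift_at = dict(shifts)

    series, level = [], 0.0
    for year in range(length):
        level += shift_at.get(year, 0.0)  # step change persists afterwards
        series.append(level + rng.gauss(0.0, 1.0))
    return series, shifts


# One series in the style of dataset (2); a fixed seed keeps it reproducible.
series, shifts = simulate_series([3.0, 1.5, 1.5, 1.5, 1.5],
                                 rng=random.Random(601))
```

Returning the true shift positions alongside the series is what makes the later scoring possible, since right and false detections are defined relative to the known shifts.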

[Figure: Distribution of the difference (percentage) between the detected inhomogeneity properties of simulated and real climatic time series for the Hungarian standard. k: simple; w∙k: weighted with sample size. Horizontal axis: −25% to 25%.]

Homogenisation methods

• 15 objective homogenisation methods: two versions each of the Bayes test [Bay, Ba1], Buishand test [Bu1, Bu2], SNHT [SNH, SNT] and t-test [tt1, tt2]; Caussinus-Mestre test [C-M], Easterling-Peterson test [E-P], Mann-Kendall test [M-K], MASH [MAS], Multiple Linear Regression [MLR], Pettitt test [Pet] and Wilcoxon Rank Sum test [WRS].

Method parameterisation

• With original parameterisations the chance of detecting at least 1 inhomogeneity is ~5% in pure white noise.

• Minimum length of the subperiods for which separate statistical properties are calculated: usually 5 years, but 1 year in C-M and MAS, and 3 years in E-P.

• Outliers are prefiltered. For multiple inhomogeneities, the semi-hierarchic algorithm of Moberg and Alexandersson (1997) is included in Bay, Ba1, Bu1, Bu2, M-K, MLR, Pet, SNH, SNT and WRS.

• In a few experiments optimised parameterisation is applied (its use is indicated).

[Figure: Power (%) against false rate (%). Red = C-M, Blue = MASH, Green = E-P, Black = t-test (tt1), Brown = SNHT for shifts, Purple = MLR.]

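[Figure: Identification A, 1 shift (M = 3). Vertical axis: %. Methods in plotted order: MLR, Bay, SNH, Ba1, WRS, Bu2, Bu1, tt2, E-P, C-M, MAS, SNT, Pet, tt1, M-K.]

[Figure: Identification A, 1 shift (M = 3) + 4 small shifts. Vertical axis: %. Methods in plotted order: E-P, tt2, Bay, SNH, Ba1, C-M, MAS, Bu1, WRS, Bu2, Pet, MLR, SNT, tt1, M-K.]

[Figure: Identification A of M ≥ 3, Exp. M < 6. Vertical axis: %. Methods in plotted order: MAS, E-P, C-M, Ba1, Bay, SNH, tt2, tt1, Bu2, Bu1, WRS, MLR, Pet, SNT, M-K.]

[Figure: Identification A of M ≥ 2, Hungarian standard. Vertical axis: %. Methods in plotted order: C-M, MAS, E-P, Ba1, MLR, Bay, SNH, SNT, Bu2, Bu1, tt2, tt1, WRS, Pet, M-K.]

[Figure: Identification A of M ≥ 3, Hungarian standard. Vertical axis: %. Methods in plotted order: MAS, C-M, E-P, Ba1, MLR, Bay, SNH, SNT, Bu1, tt1, Bu2, tt2, WRS, Pet, M-K.]

[Figure: Identification B of M ≥ 2, Exp. M < 2. Vertical axis: %. Methods in plotted order: tt1, SNT, E-P, tt2, Bu1, Pet, MLR, SNH, Bay, WRS, Bu2, Ba1, M-K, MAS, C-M.]

[Figure: Identification B of M ≥ 3, Exp. M < 2. Vertical axis: %. Methods in plotted order: tt1, SNT, Bu1, Pet, E-P, tt2, SNH, Bu2, Bay, WRS, Ba1, MLR, MAS, M-K, C-M.]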

Absence of large shifts

Number of kinds: 7; best: tt1, C-M, Bay

[Figure: vertical axis from −100 to 100. Categories: All breaks, Large breaks, General eff.]

[Figure: Trends, 1 shift (M = 3); filled columns = optimised parameters. Vertical axis: %.]

[Figure: Trends, 1 shift + 4 small shifts. Vertical axis: %.]

[Figure: Trends, Exp. M < 2. Vertical axis: %.]

[Figure: Trends, Exp. M < 6. Vertical axis: %.]

[Figure: Trends, Hungarian standard. Vertical axis: %.]

[Figure: Identification A, 1 shift. Vertical axis: %.]

[Figure: Identification A, 1 shift + 4 small shifts. Vertical axis: %.]

[Figure: Identification B of M ≥ 2, Exp. M < 2. Vertical axis: %.]

[Figure: Identification B of M ≥ 3, Exp. M < 2. Vertical axis: %.]

[Figure: Identification A of M ≥ 2, Hungarian standard. Vertical axis: %.]

[Figure: Identification A of M ≥ 3, Hungarian standard. Vertical axis: %.]

[Figure: Identification A of M ≥ 3, Exp. M < 6. Vertical axis: %.]

Discussion

• MASH is best at identifying shifts of M>3, but its reproduction of climatic trends is not among the best. This drawback of MASH can be reduced with parameter optimisation.

• Many C-M results are at the top, except when the rate of large inhomogeneities is very low. If evaluations of sections shorter than 3 years are excluded and detection results with M<2 are not considered, the possible disadvantages of C-M can be avoided, and its skill in detecting shifts of M>3 even exceeds that of MASH.

Conclusions

• The efficiency order of homogenisation methods strongly depends on the properties of the time series, on the purposes and priorities of the homogenisation, and on the way efficiency is evaluated.

• Direct methods for identifying multiple inhomogeneities (C-M and MASH) usually perform better than the other methods. When avoiding false detections is especially important, the t-test and E-P methods are also competitive.

• Parameter optimisation may yield improved results.

Thank you for your attention!
