
Session 5. Bayesian inference for extremes

5.1 General theory

5.2 Markov chain Monte Carlo (MCMC)

5.2.1 The Gibbs sampler

5.2.2 Metropolis–Hastings sampling

5.2.3 Hybrid methods

5.3 Bayesian inference for extremes

5.3.1 Example: Port Pirie, South Australia

5.3.2 More complex structures: A random effects model

Lee Fawcett and Dave Walshaw

Modelling Environmental Extremes


5. Bayesian inference for extremes

Throughout this short course, the method of maximum likelihood has provided a general and flexible technique for parameter estimation.

In fact, there are numerous other inferential procedures available, such as:

Method of moments;

Probability weighted moments;

L–moments;

Ranked set estimation; and, of course,

Bayesian inference


5.1 General theory

Bayesian techniques offer an alternative way to draw inferences from the likelihood function, which many practitioners prefer.

As in the non–Bayesian setting, we assume data $x = (x_1, \ldots, x_n)$ to be realisations of a random variable whose density falls within a parametric family $\mathcal{F} = \{f(x;\psi) : \psi \in \Psi\}$.

However, parameters of a distribution are now treated as random variables, for which we specify prior distributions.


This often provides the main argument for – and against (!) – the use of Bayesian methods:

The specification of these prior distributions enables us to supplement the information provided by the data – which, in extreme value analyses, is often very limited – with other sources of information.

But since different analysts might specify different priors, conclusions become subjective.


Suppose we model our observed data $x$ using the probability density function $f(x;\psi)$.

The likelihood function for $\psi$ is therefore $L(\psi|x) = f(x;\psi)$.

Also, suppose our prior beliefs about likely values of $\psi$ are expressed by the probability density function $\pi(\psi)$.


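We can combine both pieces of information using Bayes' Theorem, which states that

$$\pi(\psi|x) = \frac{\pi(\psi)\,L(\psi|x)}{f(x)}, \tag{16}$$

where

$$f(x) = \begin{cases} \int_{\Psi} \pi(\psi)\,L(\psi|x)\,d\psi & \text{if } \psi \text{ is continuous,} \\ \sum_{\Psi} \pi(\psi)\,L(\psi|x) & \text{if } \psi \text{ is discrete.} \end{cases}$$

Since $f(x)$ is not a function of $\psi$, Bayes' Theorem can be written as

$$\pi(\psi|x) \propto \pi(\psi) \times L(\psi|x),$$

i.e. posterior ∝ prior × likelihood.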


In equation (16), $\pi(\psi|x)$ is the posterior distribution of the parameter vector $\psi \in \Psi$, i.e. the distribution of $\psi$ after the inclusion of the data.

This posterior distribution is often of great interest, since the prior–posterior changes represent the changes in our beliefs after the data have been included in the analysis.

However, computation of the denominator in (16) can be problematic, and is usually analytically intractable.

Stochastic simulation is one possible solution.


5.2 Markov chain Monte Carlo

The recent explosion in Markov chain Monte Carlo (MCMC) techniques is largely due to their application in Bayesian inference.

The idea is to produce simulated values from the posterior distribution – not exactly, as this is usually unachievable, but through an appropriate MCMC technique.


5.2.1 The Gibbs sampler

The Gibbs sampler is a way of simulating from multivariate distributions based only on the ability to simulate from conditional distributions.

Suppose the density of interest (usually the posterior density) is $\pi(\psi)$, where $\psi = (\psi_1, \ldots, \psi_d)'$, and that the full conditionals

$$\pi(\psi_i \mid \psi_1, \ldots, \psi_{i-1}, \psi_{i+1}, \ldots, \psi_d) = \pi(\psi_i \mid \psi_{-i}) = \pi_i(\psi_i), \quad i = 1, \ldots, d$$

can be simulated from.


The Gibbs sampler uses the following algorithm:

1. Initialise the iteration counter to $k = 1$. Initialise the state of the chain to $\psi^{(0)} = (\psi_1^{(0)}, \ldots, \psi_d^{(0)})'$;

2. Obtain a new value $\psi^{(k)}$ from $\psi^{(k-1)}$ by successive generation of values

$$\psi_1^{(k)} \sim \pi(\psi_1 \mid \psi_2^{(k-1)}, \ldots, \psi_d^{(k-1)})$$

$$\psi_2^{(k)} \sim \pi(\psi_2 \mid \psi_1^{(k)}, \psi_3^{(k-1)}, \ldots, \psi_d^{(k-1)})$$

$$\vdots$$

$$\psi_d^{(k)} \sim \pi(\psi_d \mid \psi_1^{(k)}, \ldots, \psi_{d-1}^{(k)});$$

3. Change counter $k$ to $k + 1$, and return to step 2.
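The three steps above can be sketched for a case where both full conditionals are known exactly. The target here is an assumption chosen purely for illustration (it is not from the course): a standard bivariate normal with correlation rho, for which each full conditional is N(rho × other component, 1 − rho²). The function names are hypothetical.

```python
import math
import random


def gibbs_bivariate_normal(n_iter, rho, start=(0.0, 0.0), seed=1):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)  # sd of each full conditional
    psi1, psi2 = start                   # step 1: initial state psi^(0)
    chain = []
    for _ in range(n_iter):
        # Step 2: draw each component from its full conditional, always
        # conditioning on the most recent value of the other component.
        psi1 = rng.gauss(rho * psi2, cond_sd)  # uses psi2 from iteration k-1
        psi2 = rng.gauss(rho * psi1, cond_sd)  # uses psi1 from iteration k
        chain.append((psi1, psi2))             # step 3: move on to k+1
    return chain


def sample_correlation(pairs):
    """Sample correlation of a list of (x, y) pairs."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    s11 = sum((p[0] - m1) ** 2 for p in pairs)
    s22 = sum((p[1] - m2) ** 2 for p in pairs)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in pairs)
    return s12 / math.sqrt(s11 * s22)


chain = gibbs_bivariate_normal(20000, rho=0.8)
kept = chain[1000:]  # discard an initial stretch of the chain
print(round(sample_correlation(kept), 2))
```

The sample correlation of the retained draws should be close to the value of rho used in the conditionals, which gives a quick check that the chain is targeting the intended joint distribution.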


Each simulated value depends only on the previous simulated value, and not any other previous values.

The Gibbs sampler can be used in isolation if we can readily simulate from the full conditional distributions.

However, this is not always the case. Fortunately, the Gibbs sampler can be combined with Metropolis–Hastings schemes when the full conditionals are difficult to simulate from.


5.2.2 Metropolis–Hastings sampling

Suppose again that π(ψ) is the density of interest.

Further, suppose that we have some arbitrary transition kernel $p(\psi_i, \psi_{i+1})$ for iterative simulation of successive values. Then consider the following algorithm:

1. Initialise the iteration counter to $k = 1$, and initialise the chain to $\psi^{(0)}$;

2. Generate a proposed value $\psi'$ using the kernel $p(\psi^{(k-1)}, \psi')$;


3. Evaluate the acceptance probability $A(\psi^{(k-1)}, \psi')$ of the proposed move, where

$$A(\psi, \psi') = \min\left\{1, \frac{\pi(\psi')\,L(\psi'|x)\,p(\psi', \psi)}{\pi(\psi)\,L(\psi|x)\,p(\psi, \psi')}\right\};$$

4. Put $\psi^{(k)} = \psi'$ with probability $A(\psi^{(k-1)}, \psi')$, and put $\psi^{(k)} = \psi^{(k-1)}$ otherwise;

5. Change the counter from $k$ to $k + 1$ and return to step 2.
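Steps 1 to 5 can be sketched for a one-dimensional target as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the proposal is a symmetric normal random walk (so the kernel terms cancel in the acceptance ratio), the target is a standard normal known only up to a constant, and all function and variable names are hypothetical. The ratio is computed on the log scale for numerical stability.

```python
import math
import random


def random_walk_metropolis(log_target, start, step_sd, n_iter, seed=1):
    """Random-walk Metropolis sampler for a 1-D unnormalised log density."""
    rng = random.Random(seed)
    current = start                       # step 1: psi^(0)
    current_lp = log_target(current)
    chain, n_accepted = [], 0
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step_sd)   # step 2: propose psi'
        proposal_lp = log_target(proposal)
        # Step 3: the kernel is symmetric, so p(psi', psi) / p(psi, psi')
        # cancels and only the ratio of target densities remains.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:                 # step 4: accept...
            current, current_lp = proposal, proposal_lp
            n_accepted += 1
        chain.append(current)                          # ...or stay put; step 5
    return chain, n_accepted / n_iter


def log_target(psi):
    """Standard normal log density, up to an additive constant (assumed target)."""
    return -0.5 * psi ** 2


chain, acc_rate = random_walk_metropolis(log_target, start=3.0,
                                         step_sd=2.4, n_iter=20000)
kept = chain[2000:]  # drop the early part of the chain
post_mean = sum(kept) / len(kept)
post_var = sum((v - post_mean) ** 2 for v in kept) / len(kept)
print(round(post_mean, 2), round(post_var, 2), round(acc_rate, 2))
```

Because only differences of log densities enter the acceptance probability, the normalising constant $f(x)$ in Bayes' Theorem is never needed.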


So at each stage, a new value is generated from the proposal distribution.

This is either accepted, in which case the chain moves, or rejected, in which case the chain stays where it is.

Whether the move is accepted depends on the acceptance probability, which itself depends on the relationship between the density of interest and the proposal distribution.


Common choices for the proposal distribution include:

symmetric chains, where

p(ψ,ψ′) = p(ψ′,ψ)

random walk chains, where the proposal ψ′ at iteration k is

ψ′ = ψ + εk ,

where the εk are IID random variables.


According to Coles (2001),

“...it’s almost like magic... regardless of the choice of p, the rejection steps involved ensure that the simulated values have, in a limiting sense, the desired marginal distribution”

In reality, there is a lot more to the story, since choosing p so as to ensure a short “settling–in” period and low dependence between successive values can be difficult.


5.2.3 Hybrid methods

Here, we combine Gibbs sampling and Metropolis–Hastings schemes to form hybrid Markov chains whose stationary distribution is the distribution of interest.

For example, given a multivariate distribution whose full conditionals are awkward to simulate from directly, we can:

Define a Metropolis–Hastings scheme for each full conditional

Apply them to each component in turn for each iteration

Another scheme, known as “Metropolis within Gibbs”, goes through each full conditional in turn, simulating directly from the full conditionals wherever possible, and carrying out a Metropolis–Hastings update elsewhere.


5.3 Bayesian inference for extremes

There are various (compelling?) reasons for preferring a Bayesian analysis of extremes over the more traditional likelihood approach.

Extreme data are (by their very nature) scarce, so the ability to incorporate other sources of information through a prior distribution has obvious appeal

Bayes’ Theorem leads to an inference that comprises a complete distribution

It is not dependent on the regularity assumptions required by the theory of maximum likelihood

Implicit in the Bayesian framework is the concept of the predictive distribution


This predictive distribution describes how likely different outcomes of a future experiment are.

The predictive probability density function is given by

$$f(y|x) = \int_{\Psi} f(y|\psi)\,\pi(\psi|x)\,d\psi$$

We can see that the predictive distribution is formed by weighting the possible values for $\psi$ in the future experiment $f(y|\psi)$ by how likely we believe they are to occur after seeing the data.


For example, a suitable model for the threshold excess $Y$ of a process is $Y \sim \text{GPD}(\sigma, \xi)$.

Estimation of $\psi = (\sigma, \xi)$ could be made on the basis of previous observations $x = (x_1, \ldots, x_n)$.

Thus, in the Bayesian framework, we would have

$$\Pr\{Y \le y \mid x_1, \ldots, x_n\} = \int_{\Psi} \Pr\{Y \le y \mid \psi\}\,\pi(\psi|x)\,d\psi.$$

This equation gives the distribution of a future threshold excess, allowing for both parameter uncertainty and randomness in future observations.


Solving

$$\Pr\{Y \le q_{r,\text{pred}} \mid x_1, \ldots, x_n\} = 1 - \frac{1}{r}$$

for $q_{r,\text{pred}}$ therefore gives an estimate of the $r$–year return level that incorporates uncertainty due to model estimation.

After removal of the “burn–in” period, the MCMC procedure gives a sample $\psi_1, \ldots, \psi_B$. Thus

$$\Pr\{Y \le q_{r,\text{pred}} \mid x_1, \ldots, x_n\} \approx \frac{1}{B}\sum_{i=1}^{B} \Pr\{Y \le q_{r,\text{pred}} \mid \psi_i\},$$

which we can solve for $q_{r,\text{pred}}$ using a numerical solver.
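The numerical solution can be sketched as follows: average the GPD distribution function over the posterior draws, then find the value at which that average equals 1 − 1/r. Everything specific here is an assumption for illustration: the "posterior sample" is a stand-in generated from normal distributions rather than real MCMC output, bisection is just one possible solver, and the function names are hypothetical.

```python
import math
import random


def gpd_cdf(y, sigma, xi):
    """Pr{Y <= y} for a threshold excess Y ~ GPD(sigma, xi)."""
    if y <= 0.0:
        return 0.0
    if abs(xi) < 1e-9:                 # exponential limit as xi -> 0
        return 1.0 - math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    if t <= 0.0:                       # beyond the upper end point (xi < 0)
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


def predictive_cdf(q, draws):
    """Monte Carlo average of the GPD cdf over posterior draws (sigma_i, xi_i)."""
    return sum(gpd_cdf(q, s, x) for s, x in draws) / len(draws)


def predictive_return_level(r, draws, tol=1e-8):
    """Solve predictive_cdf(q) = 1 - 1/r for q by bisection."""
    target = 1.0 - 1.0 / r
    lo, hi = 0.0, 1.0
    while predictive_cdf(hi, draws) < target:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predictive_cdf(mid, draws) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Stand-in for a post burn-in MCMC sample; NOT real posterior output.
rng = random.Random(42)
draws = [(max(1e-6, rng.gauss(1.0, 0.1)), rng.gauss(0.1, 0.05))
         for _ in range(2000)]

q10 = predictive_return_level(10, draws)
q100 = predictive_return_level(100, draws)
print(round(q10, 3), round(q100, 3))
```

Bisection is used because the averaged distribution function is monotone in q, so a bracketing method cannot fail once the root is enclosed.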


5.3.1 Example: Port Pirie, South Australia

Figure 18 shows a time series plot of annual maximum sea levels at another Australian location – Port Pirie, in South Australia.

Notice that, unlike the corresponding data from Fremantle in Western Australia, there doesn’t appear to be any trend in this series.

We use the GEV as a model for the annual maximum sea level $Z_i$ at Port Pirie in year $i$, i.e.

$$Z_i \sim \text{GEV}(\mu, \sigma, \xi), \quad i = 1, \ldots, 65.$$


[Figure 18: Time series plot of annual maximum sea level (metres) against year at Port Pirie.]


When employing MCMC methods it is common to re–parameterise the GEV scale parameter and work with $\eta = \log(\sigma)$ to retain the positivity of this parameter.

In the absence of any expert prior information regarding the three parameters of the GEV distribution, we adopt a ‘naive’ approach and use largely non–informative, independent priors for these, namely

π(µ) ∼ N(0, 10000),

π(η) ∼ N(0, 10000) and

π(ξ) ∼ N(0, 100),

the large variances of these distributions imposing near–flat priors.


We use a Metropolis–Hastings MCMC sampling scheme since the full conditionals (for Gibbs sampling) are unobtainable.

After setting initial starting values for $\psi = (\mu, \eta, \xi)$, we use a random walk update procedure to generate future values in the chain, i.e.

$$\mu' = \mu_i + \epsilon_\mu,$$

$$\eta' = \eta_i + \epsilon_\eta \quad \text{and}$$

$$\xi' = \xi_i + \epsilon_\xi,$$

with the $\epsilon$ being normally distributed with zero mean and variances $v_\mu$, $v_\eta$ and $v_\xi$ respectively.
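A sketch of this scheme is given below, under several loudly stated assumptions. The data are synthetic GEV draws, not the Port Pirie record; the three parameters are proposed jointly in one move, which is only one reading of the update described above; the random walk standard deviations are arbitrary tuning choices; and all function names are hypothetical. The priors are the independent normals quoted earlier, with the second argument read as a variance.

```python
import math
import random


def gev_log_likelihood(z, mu, eta, xi):
    """GEV log-likelihood with scale sigma = exp(eta)."""
    sigma = math.exp(eta)
    total = -len(z) * eta                    # -n * log(sigma)
    for zi in z:
        s = (zi - mu) / sigma
        if abs(xi) < 1e-8:                   # Gumbel limit as xi -> 0
            total += -s - math.exp(-s)
        else:
            t = 1.0 + xi * s
            if t <= 0.0:                     # observation outside the support
                return -math.inf
            total += -(1.0 + 1.0 / xi) * math.log(t) - t ** (-1.0 / xi)
    return total


def log_posterior(z, mu, eta, xi):
    """Log prior (N(0, 1e4), N(0, 1e4), N(0, 100) variances) plus log-likelihood."""
    log_prior = -mu ** 2 / 2e4 - eta ** 2 / 2e4 - xi ** 2 / 200.0
    try:
        return log_prior + gev_log_likelihood(z, mu, eta, xi)
    except OverflowError:                    # treat numerical blow-up as impossible
        return -math.inf


def run_chain(z, start, step_sds, n_iter, seed=1):
    """Random-walk Metropolis on (mu, eta, xi), proposing all three at once."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_posterior(z, *current)
    chain = []
    for _ in range(n_iter):
        proposal = [c + rng.gauss(0.0, s) for c, s in zip(current, step_sds)]
        proposal_lp = log_posterior(z, *proposal)
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(tuple(current))
    return chain


# Synthetic annual maxima (illustrative parameter values, not real data).
data_rng = random.Random(2024)
true_mu, true_sigma, true_xi = 3.9, 0.2, -0.05
z = [true_mu + true_sigma
     * ((-math.log(1.0 - data_rng.random())) ** (-true_xi) - 1.0) / true_xi
     for _ in range(65)]

z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / len(z))
chain = run_chain(z, start=(z_mean, math.log(z_sd), 0.0),
                  step_sds=(0.03, 0.1, 0.1), n_iter=10000)
kept = chain[2000:]                          # discard burn-in
mu_mean = sum(p[0] for p in kept) / len(kept)
sigma_mean = sum(math.exp(p[1]) for p in kept) / len(kept)
print(round(mu_mean, 3), round(sigma_mean, 3))
```

Working with eta rather than sigma means the random walk can never propose a negative scale, which is the point of the re-parameterisation.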


[Figure: MCMC trace plots of µ, σ and ξ against iteration (0 to 5000).]


[Figure: Posterior densities for µ, σ, ξ and the 100–year return level.]

| | | µ | σ | ξ | q100 |
| Posterior distribution | mean (st. dev.) | 3.874 (0.028) | 0.203 (0.021) | -0.024 (0.098) | 4.788 (0.255) |
| Posterior distribution | 95% CI | (3.819, 3.932) | (0.166, 0.249) | (-0.196, 0.182) | (4.516, 5.375) |
| Maximum likelihood | m.l.e. (s.e.) | 3.872 (0.028) | 0.198 (0.020) | -0.040 (0.098) | 4.692 (0.158) |
| Maximum likelihood | 95% CI | (3.821, 3.930) | (0.158, 0.238) | (-0.242, 0.142) | (4.501, 5.270) |


5.3.2 More complex structures

In this section we briefly discuss the work which will be presented at next week’s TIES conference.

In this work, we:

Develop a hierarchical model for hourly maximum wind speeds over a region of central and northern England.

Construct a model which is based on a standard limiting extreme value distribution, but

– incorporates random effects for site variation
– incorporates random effects for seasonal variation
– explicitly models the temporal dependence inherent in the data


Figure 22 illustrates an exploratory analysis of data from two contrasting sites, Nottingham and Bradfield.

Shown are time series plots of the hourly maxima, histograms, and a plot of the time series against the version at lag 1.

The first three years of data only are used in each case, to best illustrate the relevant data characteristics. We now (very briefly) outline the model structure used.


[Figure 22: Exploratory plots for the two sites, January 1975 to January 1977: time series of wind speed (knots), histograms of frequency against wind speed (knots), and plots of $X_t$ against $X_{t-1}$.]


We will start with the Generalised Pareto Distribution as a model for threshold excesses.

By doing so, we can incorporate more extreme data in our analysis than if we were to select “block maxima”, and so increase the precision of our analysis.

Thus, wind speed excesses over a high threshold will be modelled with a GPD$(\sigma, \xi)$.


5.3.2 More complex structures: Seasonal variation

Possible solution: Restrict an extreme value analysis to the ‘season’ which contains the ‘most extreme’ extremes (e.g. Coles and Tawn, 1991)

We want our model to take account of seasonal variability and identify all gusts which are large given the time of year as extreme!

Our solution: Fit a seasonally–varying GPD!

For wind speed data, there is no natural partition into separate seasons (in the UK)

We partition the annual cycle into 12 ‘seasons’ (we use calendar months)

– reflects well the continuous nature of seasonal climate changes
– still enough data within each season for analysis!


5.3.2 More complex structures: Site variability

We take the same approach to allow for site variation.

Thus, our model will yield parameter pairs

$$(\sigma_{m,j}, \xi_{m,j}), \quad \text{for } m, j = 1, \ldots, 12,$$

where $m$ and $j$ are indices of season and site (respectively).

We also need our threshold $u$ to vary, since different criteria for what constitutes an extreme will be in play for each combination of season and site.

We denote by $u_{m,j}$ the threshold for identifying extremes in month $m$ at site $j$.


5.3.2 More complex structures: Temporal dependence

The plot of the time series against the version at lag 1, for each site, shows the presence of substantial serial correlation between successive extremes.

What can be done?

1 ‘Remove’ it – use “Peaks over threshold” (Davison and Smith, 1990)

2 ‘Ignore’ it – initially, but then adjust standard errors post–analysis (Smith, 1991)

3 Model it


We have already seen – in Part 2 of this short course – the shortcomings of declustering (and using POT).

That leaves us with:

(a) Adjust inference post–analysis to account for temporal dependence

(b) Explicitly model the dependence present

We opt for (b), because:

Our intention in this work is to investigate complex structures in the data, not ignore or remove them!

Exploratory analyses support a simple first–order Markov model for the serial dependence (Fawcett and Walshaw, 2006)


The logistic model is one of the most flexible and accessible of these models (for example, see Tawn, 1988).

For consecutive threshold exceedances, the appropriate form of this model is given by:

$$F(x_i, x_{i+1}) = 1 - \left(Z(x_i)^{-1/\alpha} + Z(x_{i+1})^{-1/\alpha}\right)^{\alpha}, \quad x_i, x_{i+1} > u,$$

where the transformation $Z$ is given by

$$Z(x) = \lambda^{-1}\{1 + \xi(x - u)/\sigma\}_{+}^{1/\xi},$$

and ensures that the margins are of GPD form.

Independence and complete dependence are obtained when $\alpha = 1$ and $\alpha \searrow 0$ respectively.
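The two limiting cases can be checked numerically with a short sketch. The parameter values are hypothetical, and λ is assumed here to play the role of the threshold exceedance probability, which the text above does not state explicitly. The code also assumes ξ ≠ 0 and arguments inside the support, so that 1 + ξ(x − u)/σ > 0.

```python
def z_transform(x, u, sigma, xi, lam):
    """Z(x) = (1/lam) * {1 + xi (x - u) / sigma}^(1/xi), for x inside the support."""
    return (1.0 + xi * (x - u) / sigma) ** (1.0 / xi) / lam


def logistic_joint_cdf(x1, x2, u, sigma, xi, lam, alpha):
    """Logistic-model joint distribution function for two consecutive exceedances."""
    z1 = z_transform(x1, u, sigma, xi, lam)
    z2 = z_transform(x2, u, sigma, xi, lam)
    return 1.0 - (z1 ** (-1.0 / alpha) + z2 ** (-1.0 / alpha)) ** alpha


# Hypothetical values, for illustration only.
u, sigma, xi, lam = 30.0, 5.0, 0.1, 0.05
x1, x2 = 35.0, 40.0
tail1 = 1.0 / z_transform(x1, u, sigma, xi, lam)   # marginal Pr{X > x1}
tail2 = 1.0 / z_transform(x2, u, sigma, xi, lam)   # marginal Pr{X > x2}

f_indep = logistic_joint_cdf(x1, x2, u, sigma, xi, lam, alpha=1.0)
f_near_dep = logistic_joint_cdf(x1, x2, u, sigma, xi, lam, alpha=0.05)
print(f_indep, 1.0 - tail1 - tail2)       # alpha = 1: tail probabilities add
print(f_near_dep, 1.0 - max(tail1, tail2))  # alpha near 0: larger tail dominates
```

With α = 1 the joint tail is the sum of the two marginal tails, and as α decreases towards 0 it approaches the larger of the two, which is the behaviour expected under complete dependence. Very small α values can underflow in floating point, so the sketch stops at 0.05.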


5.3.2 More complex structures: Threshold stability

Recall that we denote by $(\sigma_{m,j}, \xi_{m,j})$ the parameters of the GPD assumed to be valid for threshold excesses in season $m$ and site $j$.

To ensure threshold stability in our models, we now use

$$\tilde{\sigma}_{m,j} = \sigma_{m,j} - \xi_{m,j}\,u_{m,j}$$


With this parameterisation, if $X - u^{*}_{m,j}$ follows a GPD$(\tilde{\sigma}_{m,j}, \xi_{m,j})$, where $u_{m,j} > u^{*}_{m,j}$, then

$X - u_{m,j}$ also follows the same GPD,

which is useful for comparisons across different sites and seasons.

It also allows us to specify prior information about both parameters without having to worry about threshold dependency.
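The stability property can be illustrated numerically. The sketch below relies on the standard result that if excesses over a threshold $u^*$ are GPD$(\sigma^*, \xi)$, then excesses over a higher threshold $u$ are GPD$(\sigma^* + \xi(u - u^*), \xi)$, so that the modified scale σ − ξu is the same at both thresholds. The numbers are hypothetical, and the helper assumes ξ ≠ 0 with arguments inside the support.

```python
import math


def gpd_survival(y, sigma, xi):
    """Pr{Y > y} for Y ~ GPD(sigma, xi); assumes xi != 0 and 1 + xi*y/sigma > 0."""
    return (1.0 + xi * y / sigma) ** (-1.0 / xi)


# Hypothetical lower threshold, its GPD parameters, and a higher threshold.
u_star, sigma_star, xi = 20.0, 6.0, 0.15
u = 30.0
sigma_u = sigma_star + xi * (u - u_star)   # ordinary GPD scale at threshold u

# Excess over u, given X > u, computed from the model fitted at u_star...
conditional = [gpd_survival(u - u_star + y, sigma_star, xi)
               / gpd_survival(u - u_star, sigma_star, xi)
               for y in (1.0, 5.0, 20.0)]
# ...and directly from a GPD with the shifted scale.
direct = [gpd_survival(y, sigma_u, xi) for y in (1.0, 5.0, 20.0)]

# The modified scale sigma - xi * u does not depend on the threshold used.
modified_star = sigma_star - xi * u_star
modified_u = sigma_u - xi * u
print(conditional, direct)
print(modified_star, modified_u)
```

The ordinary scale changes from 6.0 to 7.5 as the threshold moves, while the modified scale stays fixed, which is why priors can be placed on it without reference to the threshold.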


5.3.2 More complex structures: The model

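With these assumptions in mind, we build the following random effects model:

$$\log(\tilde{\sigma}_{m,j}) = \gamma_{\tilde{\sigma}}^{(m)} + \epsilon_{\tilde{\sigma}}^{(j)},$$

$$\xi_{m,j} = \gamma_{\xi}^{(m)} + \epsilon_{\xi}^{(j)} \quad \text{and}$$

$$\alpha_j = \epsilon_{\alpha}^{(j)},$$

where $\gamma$ and $\epsilon$ represent seasonal and site effects respectively.

We work with $\log(\tilde{\sigma}_{m,j})$ for computational convenience, and to retain the positivity of the scale parameter $\tilde{\sigma}_{m,j}$.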


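All random effects for $\log(\tilde{\sigma}_{m,j})$ and $\xi_{m,j}$ are taken to be normally and independently distributed:

$$\gamma_{\tilde{\sigma}}^{(m)} \sim N_0(0, \tau_{\tilde{\sigma}}) \quad \text{and}$$

$$\gamma_{\xi}^{(m)} \sim N_0(0, \tau_{\xi}), \quad m = 1, \ldots, 12,$$

for the seasonal effects, and

$$\epsilon_{\tilde{\sigma}}^{(j)} \sim N_0(a_{\tilde{\sigma}}, \zeta_{\tilde{\sigma}}) \quad \text{and}$$

$$\epsilon_{\xi}^{(j)} \sim N_0(a_{\xi}, \zeta_{\xi}), \quad j = 1, \ldots, 12,$$

for the site effects.

In the absence of any prior knowledge about $\alpha_j$, we set

$$\epsilon_{\alpha}^{(j)} \sim U(0, 1).$$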


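The final layer of the model is the specification of prior distributions for the random effect distribution parameters.

Here we adopt conjugacy wherever possible to simplify computations, specifying:

$$a_{\tilde{\sigma}} \sim N_0(b_{\tilde{\sigma}}, c_{\tilde{\sigma}}), \quad a_{\xi} \sim N_0(b_{\xi}, c_{\xi});$$

$$\tau_{\tilde{\sigma}} \sim Ga(d_{\tilde{\sigma}}, e_{\tilde{\sigma}}), \quad \tau_{\xi} \sim Ga(d_{\xi}, e_{\xi});$$

$$\zeta_{\tilde{\sigma}} \sim Ga(f_{\tilde{\sigma}}, g_{\tilde{\sigma}}), \quad \zeta_{\xi} \sim Ga(f_{\xi}, g_{\xi}).$$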


5.3.2 More complex structures: MCMC

Estimation of the model is made via a Metropolis within Gibbs algorithm

Here, we update each component singly using a Gibbs sampler where conjugacy allows;

Elsewhere, we adopt a Metropolis step


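The full conditionals for the Gibbs sampling are:

$$a_{\bullet} \mid \ldots \sim N\left(\frac{b_{\bullet}c_{\bullet} + \zeta_{\bullet}\sum_{j}\epsilon_{\bullet}^{(j)}}{c_{\bullet} + n_s\zeta_{\bullet}},\; c_{\bullet} + n_s\zeta_{\bullet}\right);$$

$$\zeta_{\bullet} \mid \ldots \sim Ga\left(f_{\bullet} + \frac{n_s}{2},\; g_{\bullet} + \frac{1}{2}\sum_{j}\left(\epsilon_{\bullet}^{(j)} - a_{\bullet}\right)^2\right);$$

$$\tau_{\bullet} \mid \ldots \sim Ga\left(d_{\bullet} + \frac{n_m}{2},\; e_{\bullet} + \frac{1}{2}\sum_{m}\left(\gamma_{\bullet}^{(m)}\right)^2\right);$$

where $n_m = 12$ and $n_s = 12$, and the subscript $\bullet$ stands for either $\tilde{\sigma}$ or $\xi$.

The complexity of the GPD likelihood means that conjugacy is unattainable for the random effects parameters, and a Metropolis step is used here.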
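The conjugate Gibbs updates can be sketched as follows, under two assumptions that should be checked against the original slides: the second argument of each normal distribution is read as a precision (inverse variance), and Ga(shape, rate) is read with a rate as its second argument. The function names and the example site effects are hypothetical. Python's `random.gammavariate` takes a scale, so the rate is inverted.

```python
import math
import random


def a_conditional_params(eps, zeta, b, c):
    """Mean and precision of the normal full conditional for a site-effect mean a."""
    n_s = len(eps)
    precision = c + n_s * zeta
    mean = (b * c + zeta * sum(eps)) / precision
    return mean, precision


def draw_a(eps, zeta, b, c, rng):
    """Gibbs draw of a; converts the precision to a standard deviation."""
    mean, precision = a_conditional_params(eps, zeta, b, c)
    return rng.gauss(mean, 1.0 / math.sqrt(precision))


def draw_precision(effects, centre, shape0, rate0, rng):
    """Gibbs draw of a precision (zeta or tau) from its gamma full conditional."""
    shape = shape0 + len(effects) / 2.0
    rate = rate0 + 0.5 * sum((e - centre) ** 2 for e in effects)
    return rng.gammavariate(shape, 1.0 / rate)   # gammavariate expects a scale


rng = random.Random(7)
site_effects = [1.0, 2.0, 3.0]   # hypothetical current values of epsilon^(j)
mean_a, prec_a = a_conditional_params(site_effects, zeta=2.0, b=0.0, c=1e-6)
a_new = draw_a(site_effects, zeta=2.0, b=0.0, c=1e-6, rng=rng)
zeta_new = draw_precision(site_effects, centre=a_new,
                          shape0=1e-2, rate0=1e-2, rng=rng)
# tau uses the same update with the seasonal effects centred at zero.
tau_new = draw_precision([0.1, -0.2, 0.05], centre=0.0,
                         shape0=1e-2, rate0=1e-2, rng=rng)
print(mean_a, prec_a, a_new, zeta_new, tau_new)
```

With a prior precision as small as $10^{-6}$, the conditional mean of a is essentially the average of the current site effects, which is what a near-flat prior should deliver.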


Obviously, we first need to specify appropriate hyper–parameters. In the absence of any expert prior knowledge, we use:

$$b_{\bullet} = 0, \quad c_{\bullet} = 10^{-6}, \quad d_{\bullet} = e_{\bullet} = f_{\bullet} = g_{\bullet} = 10^{-2}.$$

The implementation of the MCMC scheme then yields samples from the approximate posterior distributions for

the 12 site effect parameters for each of $\log(\tilde{\sigma}_{m,j})$ and $\xi_{m,j}$;

the 12 seasonal effect parameters for each of $\log(\tilde{\sigma}_{m,j})$ and $\xi_{m,j}$, and

the 12 site effect parameters for the dependence parameter $\alpha_j$.


5.3.2 More complex structures: Some results

[Figure: trace plots (iterations 0 to 1000) of the site effects $\epsilon^{(1)}_{\tilde{\sigma}}, \ldots, \epsilon^{(12)}_{\tilde{\sigma}}$, one panel per site: Sheffield, Bradfield, RAF Leeming, Linton on Ouse, Scampton, Cranwell, Bingley, Nottingham, Finningley, Keele, Aigburth, Ringway.]


[Figure: MCMC output over 5000 iterations. Trace plots: seasonal effect for $\log(\tilde{\sigma})$ (Jan); seasonal effect for $\xi$ (Jan); site effect for $\log(\tilde{\sigma})$ (Bradfield); site effect for $\xi$ (Bradfield); site effect for $\alpha$ (Bradfield); $\tilde{\sigma}$ (January, Bradfield); $\xi$ (January, Bradfield); $\alpha$ (Bradfield). Density plots: posterior density for $\tilde{\sigma}_{1,2}$; posterior density for $\xi_{1,2}$; posterior density for $\alpha_2$.]


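| Parameter | Bradfield, January: Mean (st. dev.) | Bradfield, January: MLE (s.e.) | Nottingham, July: Mean (st. dev.) | Nottingham, July: MLE (s.e.) |
|---|---|---|---|---|
| $\gamma^{(m)}_{\tilde{\sigma}}$ | 1.891 (0.042) | | 1.294 (0.042) | |
| $\gamma^{(m)}_{\xi}$ | 0.021 (0.018) | | 0.002 (0.018) | |
| $\epsilon^{(j)}_{\tilde{\sigma}}$ | 0.367 (0.044) | | -0.121 (0.041) | |
| $\epsilon^{(j)}_{\xi}$ | -0.105 (0.020) | | -0.059 (0.017) | |
| $\epsilon^{(j)}_{\alpha}$ | 0.385 (0.009) | | 0.300 (0.011) | |
| $\tilde{\sigma}_{m,j}$ | 7.267 (0.211) | 8.149 (0.633) | 3.234 (0.061) | 2.914 (0.163) |
| $\xi_{m,j}$ | -0.084 (0.015) | -0.102 (0.055) | -0.057 (0.013) | 0.018 (0.044) |
| $\alpha_j$ | 0.385 (0.009) | 0.368 (0.012) | 0.400 (0.011) | 0.412 (0.020) |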
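One quick way to read the table is to compare each maximum likelihood standard error with the corresponding posterior standard deviation. The short Python sketch below does this using only the values reported above; the dictionary layout and key names are illustrative.

```python
# (posterior st. dev., MLE s.e.) pairs transcribed from the table
estimates = {
    ("Bradfield, January", "sigma"): (0.211, 0.633),
    ("Bradfield, January", "xi"): (0.015, 0.055),
    ("Bradfield, January", "alpha"): (0.009, 0.012),
    ("Nottingham, July", "sigma"): (0.061, 0.163),
    ("Nottingham, July", "xi"): (0.013, 0.044),
    ("Nottingham, July", "alpha"): (0.011, 0.020),
}

# Ratio of MLE standard error to posterior standard deviation
ratios = {key: se / sd for key, (sd, se) in estimates.items()}

for (case, param), ratio in ratios.items():
    print(f"{case:20s} {param:6s} s.e. / st. dev. = {ratio:.2f}")
```

For $\tilde{\sigma}$ and $\xi$ the standard errors are roughly three times the posterior standard deviations; for $\alpha$ the ratio is smaller, between about 1.3 and 1.8.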


[Figure: four panels plotting estimates under the hierarchical model (vertical axis) against the corresponding maximum likelihood estimates (horizontal axis), for $\tilde{\sigma}$, $\xi$, $\alpha$ and $q_{1000}$.]


[Figure: predictive return level (knots) plotted against $-\log(-\log(1 - 1/r))$.]


The main take-home points are:

A reduction in sampling variation under the Bayesian hierarchical model

– posterior standard deviations substantially smaller than the corresponding standard errors...

– ... probably due to the pooling of information across sites and seasons

– This is also evident in the “shrinkage plots”

Estimates of return levels using maximum likelihood estimation can be very unstable; the hierarchical model achieves a greater degree of stability

The Bayesian paradigm offers an extension to predictive return levels, which cannot be achieved under the classical approach to inference
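A predictive return level $z_r$ solves $\Pr(Y \le z_r \mid x) = 1 - 1/r$, where the predictive distribution averages the model distribution function over the posterior; with MCMC output, that average is approximated by a mean over the posterior draws. The Python sketch below illustrates the calculation under simplifying assumptions: it uses a GEV model for annual maxima rather than the hierarchical GPD model of this section, the posterior draws are synthetic stand-ins rather than real MCMC output, and all function names are hypothetical.

```python
import math
import random

def _exp_neg_exp(e):
    """Return exp(-exp(e)), avoiding overflow when e is large."""
    return 0.0 if e > 700.0 else math.exp(-math.exp(e))

def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function G(z; mu, sigma, xi)."""
    s = (z - mu) / sigma
    if abs(xi) < 1e-8:                     # Gumbel limit as xi -> 0
        return _exp_neg_exp(-s)
    t = 1.0 + xi * s
    if t <= 0.0:                           # z lies outside the support
        return 0.0 if xi > 0.0 else 1.0
    return _exp_neg_exp(-math.log(t) / xi)

def predictive_cdf(z, draws):
    """Monte Carlo estimate of Pr(Y <= z | x): mean of G over the draws."""
    return sum(gev_cdf(z, mu, sigma, xi) for mu, sigma, xi in draws) / len(draws)

def predictive_return_level(r, draws, lo, hi, tol=1e-8):
    """Solve predictive_cdf(z) = 1 - 1/r for z by bisection on [lo, hi]."""
    target = 1.0 - 1.0 / r
    if not predictive_cdf(lo, draws) < target <= predictive_cdf(hi, draws):
        raise ValueError("[lo, hi] does not bracket the return level")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predictive_cdf(mid, draws) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic draws standing in for real MCMC output (hypothetical values).
rng = random.Random(0)
draws = [(rng.gauss(100.0, 2.0),
          math.exp(rng.gauss(math.log(10.0), 0.1)),
          rng.gauss(-0.05, 0.05)) for _ in range(500)]

for r in (10, 100, 1000):
    print(r, round(predictive_return_level(r, draws, 0.0, 1000.0), 1))
```

Bisection is used because the predictive distribution function is non-decreasing in $z$ but has no closed-form inverse once it is averaged over the draws.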
