

Quarterly Journal of the Royal Meteorological Society Q. J. R. Meteorol. Soc. (2012)

Multivariate probabilistic forecasting using ensemble Bayesian model averaging and copulas

Annette Möller, Alex Lenkoski* and Thordis L. Thorarinsdottir
Institute of Applied Mathematics, Heidelberg University, Germany

*Correspondence to: A. Lenkoski, Institute of Applied Mathematics, Heidelberg University, Im Neuenheimer Feld 294, 69120 Heidelberg, Germany. E-mail: [email protected]

We propose a method for post-processing an ensemble of multivariate forecasts in order to obtain a joint predictive distribution of weather. Our method utilizes existing univariate post-processing techniques, in this case ensemble Bayesian model averaging (BMA), to obtain estimated marginal distributions. However, implementing these methods individually offers no information regarding the joint distribution. To correct this, we propose the use of a Gaussian copula, which offers a simple procedure for recovering the dependence that is lost in the estimation of the ensemble BMA marginals. Our method is applied to 48 h forecasts of a set of five weather quantities using the eight-member University of Washington mesoscale ensemble. We show that our method recovers many well-understood dependencies between weather quantities and subsequently improves calibration and sharpness over both the raw ensemble and a method which does not incorporate joint distributional information. Copyright © 2012 Royal Meteorological Society

Key Words: ensemble post-processing; joint predictive distributions; copula methods

Received 16 February 2012; Revised 21 June 2012; Accepted 25 June 2012; Published online in Wiley Online Library

Citation: Möller A, Lenkoski A, Thorarinsdottir TL. Multivariate probabilistic forecasting using ensemble Bayesian model averaging and copulas. Q. J. R. Meteorol. Soc. DOI:10.1002/qj.2009

1. Introduction

Mesoscale weather forecasting is usually conducted via a forecast ensemble where the ensemble members differ by the boundary conditions and/or the parametrization of the model physics in the numerical weather prediction (NWP) model (Leutbecher and Palmer, 2008). While ensemble systems aim to reflect and quantify sources of uncertainty in the forecast, they tend to be biased, and they are typically underdispersed (Hamill and Colucci, 1997). The predictive performance of an ensemble forecast can thus be substantially improved by applying statistical post-processing to the model output, in which the forecasts are corrected in coherence with recently observed forecast errors (Wilks and Hamill, 2007). It has further been argued that the statistical post-processing techniques should be probabilistic in nature and return full predictive distributions (see, for example, Gneiting and Raftery, 2005). State-of-the-art approaches of this type include ensemble model output statistics or non-homogeneous Gaussian regression (Gneiting et al., 2005; Thorarinsdottir and Gneiting, 2010; Thorarinsdottir and Johnson, 2011) and kernel dressing or ensemble Bayesian model averaging (Raftery et al., 2005; Fortin et al., 2006; Sloughter et al., 2007, 2010; Wilson et al., 2007).

However, many of these techniques consider only a single weather quantity at a fixed look-ahead time, without taking potential spatial dependencies into account. The post-processed forecasts may thus violate the multivariate correlation structure of the original ensemble forecasts and the observations. For low-dimensional multivariate settings, the correlation structure can be modelled directly with a parametric model. Parametric post-processing techniques that model the spatial dependencies explicitly through a geostatistical model have, for example, been proposed for temperature (Berrocal et al., 2007) and precipitation (Berrocal et al., 2008). Similarly, Pinson (2012), Sloughter et al. (2011) and Schuhen et al. (2012) consider bivariate probabilistic forecasts for wind vectors.


In higher dimensions, joint parametric modelling becomes cumbersome, especially when the marginal distributions are assumed to be of different types, as is the case, for example, for temperature, precipitation and wind speed. For such data, copula methods are advantageous as they allow for independent modelling of the marginal distributions and the multivariate dependence structure of the rank statistics (see, for example, Genest and Favre, 2007). We propose a multivariate post-processing framework where, in a first step, established Bayesian model averaging (BMA) methodology is applied independently to each weather variable to obtain calibrated and sharp marginal predictive distributions. In a second step, the marginal distributions are conjoined in a multivariate framework using a Gaussian copula model (see, for example, Hoff, 2007).

We apply our method to 48 h ahead forecasts of five weather variables obtained from the eight-member University of Washington mesoscale ensemble (Eckel and Mass, 2005) at 60 observation locations in the North American Pacific Northwest in 2008. The variables we consider are daily maximum and minimum temperature, sea-level pressure, precipitation accumulation and maximum wind speed. The marginal predictive distributions of temperature and pressure are defined on the entire real axis, while wind speed takes values on the positive real axis only. Furthermore, the marginal predictive distribution for precipitation accumulation takes values on the non-negative real axis with an additional point mass at zero. While estimation of the marginal distribution requires incorporation of these features, the Gaussian copula remains largely agnostic to such factors, highlighting its flexibility when working with distributions with mixed marginals (Hoff, 2007). An example of the resulting multivariate predictive distribution is given in Figure 1.

Copula methods are widely used for prediction problems in hydrology (see, for example, Genest and Favre, 2007; Schölzel and Friederichs, 2008; Kao and Govindaraju, 2010;

[Figure 1 appears here: pairwise panels for WindSp, Precip, MinTemp, MaxTemp and Pressure.]

Figure 1. Estimated joint predictive distribution for 1 January 2008 at the KSEA observation station, along with ensemble predictions (circles) and verifying observation (square). In the pairwise plots, lighter areas correspond to regions of higher probability mass. The diagonal shows the marginal predictive distribution for each quantity. Wind speed is given in metres per second, precipitation in millimetres, temperature in degrees Celsius and pressure in millibars.


and references therein). In the context of multivariate statistical post-processing, two discrete copula approaches have been proposed where the multivariate rank structure is inherited from either past observations or the ensemble forecasts. The Schaake Shuffle (Clark et al., 2004) reorders a post-processed ensemble based on past observations in order to recover the multivariate rank structure in the observed data. Ensemble copula coupling (ECC) (Schefzik, 2011), on the other hand, constructs multivariate samples from the marginal predictive distributions in such a way that the multivariate rank structure of the original NWP ensemble is preserved.

Our work, by contrast, uses a Gaussian model for dependence, which requires estimation of a single model component (the multivariate residual correlation matrix). We feel this approach has the potential to scale considerably better than approaches based on estimation of multivariate ranks, which would require a considerable number of observations in high dimensions to estimate properly.

Gaussian copulas have previously been applied to probabilistic forecasting in short-term river stage modelling in a series of articles by Krzysztofowicz (1999), Krzysztofowicz and Kelly (2000) and Krzysztofowicz and Herr (2001). Wilks (2002) proposes a multivariate Gaussian framework to smooth ensemble forecasts, in which he applies a Gaussian copula to include the non-Gaussian variables wind speed and cloud cover. Gaussian copulas have also been used for various settings in wind energy modelling, such as for predicting multiple lead times simultaneously (Pinson et al., 2009; Pinson and Girard, 2012) and in multivariate uncertainty analysis (Hagspiel et al., 2011). AghaKouchak et al. (2010) use Gaussian and t-copulas for the simulation of rainfall error fields. Note that the method proposed by Pinson (2012) may also be considered a copula approach.

The paper is organized as follows. The data and the methods are described in section 2: we briefly review the univariate BMA approaches and the multivariate verification methods that we apply, and give a detailed description of the Gaussian copula approach. The results of the case study are presented in section 3 and the paper ends with conclusions in section 4.

2. Data and methods

2.1. Data

In our case study, we employ daily 48 h forecasts based on the University of Washington mesoscale ensemble (UWME; Eckel and Mass, 2005) with valid dates in the calendar year 2008. The UWME is an eight-member multi-analysis ensemble which at that time was based on the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5), with initial and lateral boundary conditions obtained from operational centres around the world. Currently, the UWME uses the WRF mesoscale model. Further information as well as real-time forecasts and observations can be found on the website http://www.atmos.washington.edu/~ens/uwme.cgi.

The forecasts are made on a 12 km grid over the Pacific Northwest region of western North America. To obtain a forecast at a given observation location, the forecasts at the four surrounding grid points are bilinearly interpolated to that location. We consider observation locations in the US states of Washington, Oregon, Idaho, California and Nevada (see Figure 3). The daily observations are provided by weather observation stations in the Automated Surface Observing Network (National Weather Service, 1998). We consider 2 m maximum and minimum temperature, sea-level pressure, 10 m maximum wind speed and 24 h precipitation accumulation. Forecasts and observations are initialized at 0000 UTC, which is 5 pm local time when daylight saving time operates and 4 pm local time otherwise. Quality control procedures as described by Baars (2005) were applied to the entire dataset, removing dates and locations with any missing forecasts or observations.

For the calendar year 2008, we consider 60 distinct observation locations that have between 95 and 271 days in which all ensemble forecasts and verifying observations were available. Additional data from 2006 and 2007 were used to provide an appropriate rolling training period for all days in 2008 and to estimate the multivariate residual correlation structure.

2.2. Ensemble Bayesian model averaging

BMA was originally developed as a method to combine predictions and inferences from multiple statistical models (Leamer, 1978). Raftery et al. (2005) extended the use of BMA to statistical post-processing for forecast ensembles. In this context, the method is a kernel-dressing approach where each ensemble member x_k is associated with a kernel density g_k(y | x_k). The ensemble BMA predictive density is then given by a mixture of the individual kernel densities:

f(y \mid x_1, \ldots, x_K) = \sum_{k=1}^{K} \omega_k \, g_k(y \mid x_k),   (1)

where the weights \omega_k are assumed to be non-negative with \sum_{k=1}^{K} \omega_k = 1. The choice of kernel g_k depends heavily on the weather variable of interest: Raftery et al. (2005) consider temperature and pressure, for which Gaussian kernels seem appropriate, while Sloughter et al. (2010) apply gamma kernels to wind speed forecasts.
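To make the mixture in Eq. (1) concrete, the following is a minimal sketch of evaluating an ensemble BMA predictive density with Gaussian kernels. The member values, weights and variance are illustrative stand-ins, not fitted values; in the paper itself, estimation is done with the R package ensembleBMA.

```python
import numpy as np
from scipy.stats import norm

def bma_density(y, members, weights, sigma, b0=0.0, b1=1.0):
    """Ensemble BMA mixture density of Eq. (1) with Gaussian kernels.
    Each member x_k is dressed with a N(b0 + b1 * x_k, sigma^2) kernel."""
    return sum(w * norm.pdf(y, loc=b0 + b1 * x, scale=sigma)
               for w, x in zip(weights, members))

# Hypothetical 8-member temperature ensemble (degrees Celsius)
members = np.array([4.1, 5.0, 3.8, 4.6, 5.3, 4.0, 4.9, 4.4])
weights = np.full(8, 1.0 / 8)  # equal weights, for illustration only
print(bma_density(4.5, members, weights, sigma=1.2))
```

Because the weights sum to one and each kernel is a density, the mixture integrates to one, so it is itself a valid predictive density.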

Precipitation has to be treated slightly differently, as precipitation observations are non-negative with a large number of zero observations. Sloughter et al. (2007) propose a solution to this where the kernel density g_k is modelled in two parts:

g_k(y \mid x_k) = P(y = 0 \mid x_k) \, \mathbf{1}\{y = 0\} + P(y > 0 \mid x_k) \, h_k(y^{1/3} \mid x_k) \, \mathbf{1}\{y > 0\},

where h_k is a gamma density, \mathbf{1} denotes the indicator function, and

P(y = 0 \mid x_k) = \frac{\exp(a_{0k} + a_{1k} x_k^{1/3} + a_{2k} \delta_k)}{1 + \exp(a_{0k} + a_{1k} x_k^{1/3} + a_{2k} \delta_k)},

with \delta_k = 1 if x_k = 0 and \delta_k = 0 otherwise. Note that h_k is a predictive density for the cube root of the precipitation amount. However, the resulting probabilistic forecast can easily be expressed in terms of the original amounts.
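A sketch of this two-part kernel in terms of its cdf is given below. The logistic coefficients a_0, a_1, a_2 and the gamma parameters are hypothetical placeholders; in practice they are estimated from training data.

```python
import numpy as np
from scipy.stats import gamma

def p_zero(xk, a0=-1.0, a1=-0.5, a2=1.5):
    """Logistic probability of zero precipitation given member xk.
    Coefficients a0, a1, a2 are illustrative, not fitted values."""
    delta = 1.0 if xk == 0 else 0.0
    eta = a0 + a1 * xk ** (1.0 / 3.0) + a2 * delta
    return np.exp(eta) / (1.0 + np.exp(eta))

def precip_kernel_cdf(y, xk, shape=2.0, scale=0.5):
    """Cdf of the two-part kernel g_k: a point mass at zero plus a
    gamma density for the cube root of positive amounts."""
    if y < 0:
        return 0.0
    p0 = p_zero(xk)
    return p0 + (1.0 - p0) * gamma.cdf(y ** (1.0 / 3.0), shape, scale=scale)

print(precip_kernel_cdf(0.0, xk=1.2))  # probability of no precipitation
print(precip_kernel_cdf(1.0, xk=1.2))
```

Note how a zero forecast (x_k = 0) raises the probability of a zero observation through the \delta_k term, while larger forecasts lower it.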

Table 1 gives an overview of the different BMA models we consider, as well as the associated link functions for the mean value and the variance of each kernel density. Further variants of the ensemble BMA method which will not be considered here include work by Roquelaure and


Table 1. The ensemble BMA kernel functions for the different weather variables and the associated link functions for each mean value and variance. The gamma distribution is parametrized in terms of shape and scale, with mean αβ and variance αβ².

Variable               Range          Kernel           Mean                      Variance
Temperature            y ∈ ℝ          N(μ_k, σ_k²)     b_{0k} + b_{1k} x_k       σ²
Pressure               y ∈ ℝ          N(μ_k, σ_k²)     b_{0k} + b_{1k} x_k       σ²
Wind speed             y ∈ ℝ⁺         Γ(α_k, β_k)      b_{0k} + b_{1k} x_k       c_0 + c_1 x_k
Precipitation amount   y^{1/3} ∈ ℝ⁺   Γ(α_k, β_k)      b_{0k} + b_{1k} x_k^{1/3} c_0 + c_1 x_k

Bergot (2008), Bao et al. (2010) and Chmielecki and Raftery (2010). For the parameter estimation, we apply the R package ensembleBMA, which provides estimation methods for all the models listed in Table 1 (R Development Core Team, 2011; Fraley et al., 2011).

2.3. Gaussian copulas

The ensemble BMA methods discussed in section 2.2 have proven to work well for post-processing ensemble forecasts of univariate quantities. However, by using these methods for each quantity individually, no attention has been paid to the joint distribution of weather quantities. In this section, we outline a Gaussian copula approach that allows us to recover the dependence between weather quantities and construct a post-processed joint distribution.

Suppose that we have p weather quantities of interest, with marginal distributions F_1, \ldots, F_p, where

F_j(y) = \int_{-\infty}^{y} f_j(u \mid x_{1j}, \ldots, x_{Kj}) \, du.   (2)

In Eq. (2), f_j(u \mid x_{1j}, \ldots, x_{Kj}) represents the ensemble BMA density discussed in Eq. (1) for variable j, evaluated at u and depending on the ensemble members x_{1j}, \ldots, x_{Kj}. Now let C be a p \times p correlation matrix, i.e. a positive definite matrix with unit diagonal. Under a Gaussian copula, the joint distribution F of the weather quantities takes the following form:

F(y_1, \ldots, y_p \mid C) = \Phi_p\bigl(\Phi^{-1}(F_1(y_1)), \ldots, \Phi^{-1}(F_p(y_p)) \mid C\bigr),   (3)

where \Phi^{-1}(\cdot) is the inverse cumulative distribution function (cdf) of a standard Gaussian distribution and \Phi_p(\cdot \mid \Sigma) is the cdf of a p-variate Gaussian distribution with covariance matrix \Sigma. The Gaussian copula is a particularly tractable type of copula model, as it requires only the marginal distributions F_1, \ldots, F_p and the correlation matrix C to be fully defined (see Nelsen, 2006, for a more detailed account of general copula models).

The Gaussian copula lends itself to a useful construction. Let

Z \sim N_p(0, C),

where N_p(0, C) denotes a p-dimensional normal distribution with mean vector 0 and correlation matrix C. Then, for j = 1, \ldots, p set

Y_j = F_j^{-1}(\Phi(Z_j)),

where

F_j^{-1}(u) = \max\{y : F_j(y) \le u\}

denotes the pseudo-inverse of the marginal F_j. Then we have that Y = (Y_1, \ldots, Y_p) \sim F. The construction also highlights that each Y_j is marginally distributed according to F_j. Thus, by using the Gaussian copula with the ensemble BMA marginal distributions, we are able to obtain a joint distribution whose marginals are the original univariate distributions estimated by ensemble BMA.
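This construction can be sketched directly. The three-variable correlation matrix and the marginals below are illustrative placeholders; in the paper the marginals are the fitted ensemble BMA predictive distributions, whereas here we use plain Gaussian quantile functions for simplicity.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Illustrative 3-variable residual correlation matrix (not fitted values)
C = np.array([[ 1.0,  0.3, -0.2],
              [ 0.3,  1.0, -0.1],
              [-0.2, -0.1,  1.0]])

# Stand-in marginal quantile functions F_j^{-1}
marginal_ppfs = [
    lambda u: norm.ppf(u, loc=5.0, scale=2.0),    # e.g. max temperature
    lambda u: norm.ppf(u, loc=2.0, scale=1.5),    # e.g. min temperature
    lambda u: norm.ppf(u, loc=1010.0, scale=3.0), # e.g. pressure
]

def sample_joint(n):
    """Draw Z ~ N_p(0, C), then set Y_j = F_j^{-1}(Phi(Z_j))."""
    Z = rng.multivariate_normal(np.zeros(3), C, size=n)
    U = norm.cdf(Z)
    return np.column_stack([ppf(U[:, j]) for j, ppf in enumerate(marginal_ppfs)])

Y = sample_joint(20000)
print(np.corrcoef(Y, rowvar=False).round(2))
```

With Gaussian marginals, the sample correlation of Y recovers C almost exactly; with non-Gaussian marginals, the rank dependence structure of C is still preserved.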

The construction above thus creates a link between a realization Y sampled from F and a latent Gaussian factor Z. In particular, if F_j is a continuous marginal distribution, we can immediately see that

Z_j = \Phi^{-1}(F_j(Y_j)).

This indicates that for the majority of weather quantities, which have fully continuous distributions, given F_j and an observed y_j we can directly infer a latent z_j. In the case of precipitation, the situation is slightly more nuanced. In general, suppose that Y_j \in [0, +\infty), where F_j(0) = \alpha with 0 < \alpha \le 1 and F_j is otherwise continuously increasing on (0, +\infty). Then we have that

-\infty < Z_j \le \Phi^{-1}(\alpha)

when Y_j = 0, and

Z_j = \Phi^{-1}(F_j(Y_j))

when Y_j > 0. If we therefore collect several observations y^{(1)}, \ldots, y^{(T)}, we may infer latent Gaussian observations z^{(1)}, \ldots, z^{(T)} and thereby estimate the matrix C.
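The mapping from observations to latent Gaussian values, and the subsequent estimation of C, might be sketched as follows. The midpoint imputation for a censored zero-precipitation value is our own simplification for illustration; the paper only constrains z to the interval (-\infty, \Phi^{-1}(\alpha)] in that case.

```python
import numpy as np
from scipy.stats import norm

def latent_z(y, marginal_cdf, alpha=None):
    """z = Phi^{-1}(F(y)) for a continuous marginal F. For a mixed
    marginal with mass alpha at zero, a zero observation only pins z
    to (-inf, Phi^{-1}(alpha)]; as a crude stand-in we impute the
    midpoint quantile alpha/2 (an assumption, not the paper's method)."""
    if alpha is not None and y == 0:
        return norm.ppf(alpha / 2.0)
    return norm.ppf(marginal_cdf(y))

# Hypothetical continuous marginal, e.g. a fitted temperature cdf
cdf = lambda y: norm.cdf(y, loc=5.0, scale=2.0)
print(latent_z(6.3, cdf))

# Given latent vectors z^(1), ..., z^(T), estimate C by, for example,
# the sample correlation matrix (here with synthetic latent data):
Z = np.random.default_rng(0).standard_normal((200, 5))
C_hat = np.corrcoef(Z, rowvar=False)
print(np.round(C_hat, 2))
```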

Our process for forming the ensemble post-processed joint distribution of weather quantities builds on the logic above. Suppose that we have a collection y^{(1)}, \ldots, y^{(T)} of observations over T days. We assume that each

y^{(t)} \sim F^{(t)}(\cdot \mid C),

where

F^{(t)}(y_1^{(t)}, \ldots, y_p^{(t)} \mid C) = \Phi_p\bigl(\Phi^{-1}(F_1^{(t)}(y_1^{(t)})), \ldots, \Phi^{-1}(F_p^{(t)}(y_p^{(t)})) \mid C\bigr),

and F_j^{(t)} denotes the ensemble BMA marginal distribution for weather quantity j at time-point t using the K ensemble members x_1^{(t)}, \ldots, x_K^{(t)} at time-point t. This framework associates each observation y^{(t)} with its own Gaussian copula F^{(t)}, but all T copulas share one residual correlation matrix C. The assumption of a common residual correlation matrix C can be relaxed, as we outline in section 4.

Using y^{(t)} and the marginal distributions F_1^{(t)}, \ldots, F_p^{(t)}, we may then infer a latent Gaussian z^{(t)} as discussed above. While each y^{(t)} was given a separate distribution F^{(t)}, we note that the associated z^{(t)} are all distributed N_p(0, C). We therefore use these latent Gaussian observations to estimate C, for instance by taking the sample correlation matrix. Hoff (2007) discusses more involved methods for estimating C; we discuss the inclusion of these methods in section 4.


Now consider forming a predictive distribution for the time-point s (coming sometime after T), based on the K ensemble members x_1^{(s)}, \ldots, x_K^{(s)} and the estimate of C. Our method proceeds by first forming the ensemble BMA predictive marginals F_1^{(s)}, \ldots, F_p^{(s)} and then setting

Y^{(s)} \sim F^{(s)}(\cdot \mid C).

While this joint predictive distribution may not have an easy analytic structure, a sample Y may be obtained by first sampling

Z \sim N_p(0, C)

and then setting

Y_j = (F_j^{(s)})^{-1}(\Phi(Z_j)).

By sampling a large number of Y in this manner, we are able to effectively describe the entire joint predictive distribution. As noted above, the marginal distributions for each individual quantity of this sample remain the ensemble BMA marginals F_1^{(s)}, \ldots, F_p^{(s)}.

We briefly comment on the interpretation of the values in the matrix C. The residual correlation matrix C is the correlation between the quantiles of the predictive distribution after post-processing the ensemble. Thus C does not model the direct physical relationship between weather quantities; this is often largely accounted for directly by the ensemble. It instead models any subsequent residual correlation between these quantities after post-processing has been performed.

2.4. Multivariate forecast verification

To assess the quality of the multivariate forecasts, we apply the methods described in Gneiting et al. (2008). For inspecting calibration, we use the multivariate rank histogram (MRH), which is a direct generalization of the univariate verification rank histogram or Talagrand diagram (Anderson, 1996; Hamill and Colucci, 1997; Talagrand et al., 1997). The only challenge lies in defining a multivariate rank order, as no natural ordering exists for multivariate vectors. Here, we use the multivariate ordering described in Gneiting et al. (2008).

A forecast is said to be calibrated if the resulting MRH is close to being uniform. To quantify the deviation from uniformity, we use the discrepancy or reliability index \Delta:

\Delta = \sum_{j=1}^{m+1} \left| \zeta_j - \frac{1}{m+1} \right|,

where \zeta_j is the observed relative frequency of rank j (Delle Monache et al., 2006).
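Computing \Delta from a set of verification ranks can be sketched as follows; the rank values below are synthetic, for illustration only.

```python
import numpy as np

def reliability_index(ranks, m):
    """Reliability index: sum of absolute deviations of the observed
    rank frequencies from the uniform frequency 1/(m+1), for ranks
    taking values 1, ..., m+1 within an m-member ensemble."""
    counts = np.bincount(np.asarray(ranks) - 1, minlength=m + 1)
    zeta = counts / len(ranks)
    return np.abs(zeta - 1.0 / (m + 1)).sum()

# Hypothetical ranks of observations within an 8-member ensemble (1..9)
rng = np.random.default_rng(0)
ranks = rng.integers(1, 10, size=500)
print(reliability_index(ranks, m=8))  # small for a calibrated forecast
```

A perfectly uniform rank histogram gives \Delta = 0, while a fully degenerate one (all observations in one bin) gives the maximum 2m/(m+1).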

The sharpness of a univariate predictive distribution or an ensemble forecast can easily be assessed by the corresponding standard deviation. In the multivariate case, we employ a generalization of the standard deviation, the determinant sharpness (DS):

\mathrm{DS} = (\det \Sigma)^{1/(2d)},

where \Sigma is the covariance matrix of an ensemble or a multivariate predictive distribution for a d-dimensional quantity. For ensemble forecasts, the matrix is generated from the empirical variances and correlations of the ensemble.
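For a predictive distribution represented by a large sample, DS can be estimated from the sample covariance; a sketch (the diagonal covariance below is illustrative):

```python
import numpy as np

def determinant_sharpness(samples):
    """DS = (det Sigma)^(1/(2d)) for draws from a d-dimensional
    predictive distribution (rows = draws, columns = variables)."""
    sigma = np.cov(samples, rowvar=False)
    d = samples.shape[1]
    return np.linalg.det(sigma) ** (1.0 / (2 * d))

rng = np.random.default_rng(0)
draws = rng.multivariate_normal(np.zeros(3), np.diag([1.0, 4.0, 9.0]), size=5000)
print(determinant_sharpness(draws))  # roughly (1 * 4 * 9)^(1/6)
```

For independent components, DS reduces to the geometric mean of the marginal standard deviations, which motivates it as a multivariate sharpness measure.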

Scoring rules for the verification of deterministic or probabilistic forecasts are well known and have been widely used in forecast assessment. We consider multivariate extensions of the absolute error and the continuous ranked probability score (Matheson and Winkler, 1976; Hersbach, 2000). The absolute error generalizes to the Euclidean error (EE):

\mathrm{EE}(F, y) = \| \mu - y \|,

where \mu is the median of F. For an ensemble or a sample from a continuous distribution, \mu is defined as the vector that minimizes the sum of the Euclidean distances to the individual forecast vectors:

\min_{\mu} \left\{ \sum_{i=1}^{m} \| \mu - x_i \| \right\}.

The vector \mu can be determined numerically using the algorithm described in Vardi and Zhang (2000), as implemented in the R package ICSNP.
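Vardi and Zhang (2000) give a fast, provably convergent algorithm for this minimization; the plain Weiszfeld iteration below conveys the same idea of re-weighting by inverse distances, and is a simplified sketch rather than the ICSNP implementation.

```python
import numpy as np

def geometric_median(X, iters=200, eps=1e-9):
    """Plain Weiszfeld iteration for the spatial median minimizing the
    sum of Euclidean distances to the rows of X. (A simplified stand-in
    for the Vardi-Zhang algorithm used via the ICSNP package.)"""
    mu = X.mean(axis=0)  # start from the componentwise mean
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(X - mu, axis=1), eps)  # avoid /0
        w = 1.0 / d
        mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(mu_new - mu) < eps:
            break
        mu = mu_new
    return mu

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
print(geometric_median(X))
```

Each iteration is a weighted average of the points, with weights inversely proportional to their distance from the current estimate, so the objective decreases monotonically from the starting mean.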

For a generalization of the continuous ranked probability score, Gneiting and Raftery (2007) introduce the energy score (ES):

\mathrm{ES}(F, y) = \mathrm{E}_F \| X - y \| - \tfrac{1}{2} \mathrm{E}_F \| X - X' \|,

where \| \cdot \| denotes the Euclidean norm and X and X' are independent random vectors with distribution F. If F is the cumulative distribution function associated with a forecast ensemble of size m, the energy score can be computed as

\mathrm{ES}(F, y) = \frac{1}{m} \sum_{j=1}^{m} \| x_j - y \| - \frac{1}{2m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} \| x_i - x_j \|.

Generally, the energy score may be approximated by

\mathrm{ES}(F, y) \approx \frac{1}{n} \sum_{j=1}^{n} \| x_j - y \| - \frac{1}{2n} \sum_{j=1}^{n} \| x_j - x_j' \|,

where \{x_j\}_{j=1}^{n} and \{x_j'\}_{j=1}^{n} are two independent samples from F.

We assign the forecasting methods a score by averaging the scoring rules over all locations and time-points in the test set. Both the energy score and the Euclidean error are negatively oriented, such that a smaller score indicates a better predictive performance. The values of the weather variables we consider are given on scales which vary by several orders of magnitude. For this reason, before we calculate the scores we normalize the components using the observed mean values and standard deviations over the test set.
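The Monte Carlo approximation of the energy score translates directly into code; the forecast samples below are synthetic, for illustration only.

```python
import numpy as np

def energy_score(sample_a, sample_b, y):
    """Monte Carlo approximation of the energy score from two
    independent samples {x_j} and {x'_j} drawn from the predictive
    distribution F (rows = draws, columns = variables)."""
    term1 = np.linalg.norm(sample_a - y, axis=1).mean()
    term2 = 0.5 * np.linalg.norm(sample_a - sample_b, axis=1).mean()
    return term1 - term2

rng = np.random.default_rng(0)
obs = np.array([0.0, 0.0])
xa = rng.standard_normal((10000, 2))  # synthetic predictive sample
xb = rng.standard_normal((10000, 2))  # second, independent sample
print(energy_score(xa, xb, obs))
```

The second term rewards spread: an overly sharp forecast centred away from the observation pays fully in the first term without the offsetting spread term.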


3. Results

We now give the results of applying the established BMA framework together with our multivariate Gaussian copula approach to maximum and minimum temperatures, sea-level pressure, maximum wind speed and precipitation over the North American Pacific Northwest in 2008. See section 2.1 for a detailed description of the data. The BMA univariate post-processing is applied at each observation location separately. Based on an exploratory analysis using a subset of the dataset, we use a 40-day sliding training period for the parameter estimation. That is, the training period consists of the 40 most recent days prior to the forecast for which ensemble output and verifying observations were available. In terms of calendar days, this period typically corresponds to more than 40 days (see, for example, Wilson et al., 2007, for similar settings).

3.1. Results at Sea-Tac Airport

To show the behaviour of our methodology in depth, we first focus on the KSEA observation station, located at Sea-Tac Airport, a major transportation hub in the area. Using all available data from 2007, we run the ensembleBMA methodology for each of the five variables as described in section 2.2. We then use the observations for these data and the estimated marginal distributions to infer a latent vector z as described in section 2.3. This is performed separately for each day in 2007 and the resulting latent data are then used to estimate a single correlation matrix. Table 2 shows the entries of this correlation matrix.

The correlations estimated in Table 2 show several clear patterns of interaction. We see a strong negative correlation between forecast errors of pressure and both maximum and minimum temperatures. This is in line with the understood inverse relationship between temperature and pressure systems. We also see an intuitive positive error correlation between the minimum and maximum temperatures.

This estimated correlation matrix is then carried forward into an analysis of 2008, which we use for verification. For each day with observations in 2008, we again run ensembleBMA individually for each weather quantity to obtain estimated marginal predictive distributions. We then use the estimated correlation matrix and these marginal distributions to obtain 20 000 samples from the joint predictive distribution for each day, as described in section 2.3.

Figure 1 shows a pairwise plot of this joint predictive distribution for the date 1 January 2008. For each pair of variables, the figure shows a heat map indicating regions of significant probability mass (with lighter regions being higher values), as well as points showing the eight ensemble members (circles) and the observed level (a square). The marginal predictive distribution, given by the ensembleBMA methodology, is shown for each variable along the diagonal.

In this figure we can see that the correlation structure in Table 2 has been carried over to the predictive distribution: the positive correlation between maximum and minimum temperatures is evident, as well as the negative correlation of each of these quantities with pressure. Further, we can see the effect of post-processing, as the predictive distributions are often centred away from the ensemble members (indicating the bias correction properties) and display greater spread in the distribution than is evident in the ensemble. The diagonal elements, by construction, show that the marginal distributions remain unchanged. This means that the marginal probability of, for instance, precipitation over 1 mm (12% in this example) is the same in the copula distribution as in the original ensembleBMA distribution.

Table 2. Estimated correlation matrix at the station KSEA, Sea-Tac Airport, based on data and ensemble BMA predictive distributions from the calendar year 2007.

           maxwsp   precip   mintemp  maxtemp  pressure
maxwsp      1       -0.016    0.032    0.139   -0.123
precip     -0.016    1       -0.001   -0.174   -0.015
mintemp     0.032   -0.001    1        0.239   -0.110
maxtemp     0.139   -0.174    0.239    1       -0.203
pressure   -0.123   -0.015   -0.110   -0.203    1

Table 3. Predictive performance of the copula methodology, the independence approach, and the raw UWME ensemble at the KSEA observation station.

               ES      EE      Δ       DS
UWME          0.938   1.081   0.185   0.566
Independence  0.637   0.982   0.047   7.516
Copula        0.636   0.982   0.019   6.971

Results are averaged over the 271 days in 2008 for which forecasts and verifying observations were available. The performance is measured by the energy score (ES), Euclidean error (EE), reliability index (Δ) and determinant sharpness (DS).

Table 3 shows that, according to a number of verification metrics, the Gaussian copula approach yields an improvement in predictive performance. The table compares the Gaussian copula approach to an alternative in which dependence is not modelled, which we call the independence approach, as well as to the raw UWME ensemble, averaged over 2008 for the KSEA observation station. Note that the Gaussian copula and independence approaches have the same marginal distributions, and thus differ only in the manner in which the joint distribution is constructed.

Both the copula and independence approaches improve considerably on the raw ensemble in all metrics except the determinant sharpness (DS). This is not surprising, since the eight members of the raw ensemble will impart greater sharpness at the expense of calibration. We also see that the copula approach improves on the independence approach in all metrics. The values of the reliability index and the DS show that the predictive distribution for the copula approach is both better calibrated and somewhat sharper than that for the independence approach, a consequence of using a non-diagonal correlation matrix. The Euclidean errors are essentially the same for the two approaches, since they can largely be expected to return similar median values. The combination of a similar median and improved sharpness leads the copula approach to have a lower energy score.
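As a rough illustration of how the energy score in Table 3 can be computed for a sample-based forecast, the following Python sketch implements the standard Monte Carlo estimator ES = E||X − y|| − (1/2) E||X − X′|| of Gneiting et al. (2008); the function name is ours.

```python
import numpy as np

def energy_score(samples, obs):
    """Sample-based estimate of the energy score
    ES = E||X - y|| - 0.5 E||X - X'||,
    where X, X' are independent draws from the forecast distribution
    (rows of `samples`, shape m x d) and y is the verifying observation."""
    samples = np.asarray(samples, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # First term: mean distance from forecast draws to the observation
    term1 = np.linalg.norm(samples - obs, axis=1).mean()
    # Second term: mean pairwise distance among the draws themselves
    pair_dists = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    return term1 - 0.5 * pair_dists.mean()
```

The same estimator applies both to the eight-member raw ensemble and to a large sample drawn from the copula or independence predictive distributions.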

Figure 2 shows the multivariate rank histograms for the two approaches, as well as for the raw ensemble. The figure shows that both methods improve calibration considerably over the raw ensemble. However, as shown in the figure, the final bins are somewhat less filled in the independence approach than under the copula approach, though neither returns a perfectly uniform rank histogram.
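A multivariate rank of the kind underlying Figure 2 can be computed from pre-ranks, following the construction of Gneiting et al. (2008). The Python sketch below is a simplified illustration (function name ours), with ties broken at random:

```python
import numpy as np

def multivariate_rank(obs, ens, rng=None):
    """Pre-rank-based multivariate rank of `obs` among the m ensemble
    members in `ens` (m x d), in the spirit of Gneiting et al. (2008)."""
    rng = np.random.default_rng(rng)
    pool = np.vstack([obs, ens])  # (m+1) x d, observation first
    # Pre-rank: number of pooled vectors componentwise <= each vector
    le = np.all(pool[:, None, :] <= pool[None, :, :], axis=2)
    prerank = le.sum(axis=0)      # includes the self-comparison
    # Rank of the observation's pre-rank, with ties resolved at random
    obs_pr = prerank[0]
    below = int(np.sum(prerank < obs_pr))
    ties = int(np.sum(prerank == obs_pr))
    return below + int(rng.integers(1, ties + 1))
```

Collecting these ranks over all forecast cases and binning them produces a histogram like Figure 2; uniformity over the m + 1 possible ranks indicates multivariate calibration.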

Copyright © 2012 Royal Meteorological Society. Q. J. R. Meteorol. Soc. (2012)



(a) UWME (b) Independence (c) Copula

Figure 2. Multivariate rank histograms for the copula and independence approaches as well as the UWME ensemble for the KSEA observation station over the 271 days in 2008 for which forecasts and verifying observations were available.


Figure 3. Estimated correlation between (a) minimum and maximum temperature, and (b) minimum temperature and pressure at 60 observation stations in the Northwest USA, using 2007 data and the Gaussian copula approach.

3.2. Results over the Northwest USA

A similar analysis to that above was run for 60 separate observation stations in the Northwest USA. First, ensembleBMA was run individually for each station, day and weather quantity during the period of 2007. Verifying observations were then used to estimate a correlation matrix separately for each observation station. While these estimates were performed locally, there is considerable agreement in estimated correlations between individual stations. Figure 3(a) shows the pairwise estimated correlation between minimum and maximum temperature at each observation station. This plot shows that the majority of estimates are positive, as expected, with correlations as high as 0.51.
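The per-station correlation estimation can be illustrated with a simple sketch: push each verifying observation through its day-specific marginal predictive CDF, map the resulting PIT values to latent Gaussian factors, and take their empirical correlation. This Python fragment is our hedged approximation of that procedure, not the exact algorithm of section 2.3:

```python
import numpy as np
from scipy.stats import norm

def latent_correlation(pit_values):
    """Estimate the Gaussian-copula correlation matrix from PIT values.

    `pit_values` is a (T x d) array whose columns hold F_{t,j}(y_{t,j}):
    the verifying observations pushed through their own day-specific
    marginal predictive CDFs. Under calibrated marginals these are
    uniform, so their normal quantiles act as latent Gaussian factors."""
    # Clip to guard against infinite quantiles at exact 0 or 1
    z = norm.ppf(np.clip(pit_values, 1e-6, 1 - 1e-6))
    return np.corrcoef(z, rowvar=False)
```

Applying this separately at each station yields per-station matrices like those summarized in Figure 3.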

Figure 3(b) shows a similar plot, but with the correlation between minimum temperature and pressure plotted for each observation station. While the previous figure showed some similarity in estimated correlations, this figure shows considerable agreement between observation stations. We see that all values are quite negative, with the majority of estimates between −0.2 and −0.4. Furthermore, there appears to be a spatial component to the estimates. The values near the top of the Puget Sound are all roughly between −0.1 and −0.2, while those closer to the Seattle/Tacoma area are grouped between −0.22 and −0.26. A tight group of observation stations in the Columbia River Valley, on the border of Washington and Oregon, all have correlations between −0.31 and −0.35, and finally those in eastern Washington and eastern Oregon exhibit stronger correlations, typically below −0.4. Figure 3 therefore suggests that the Gaussian copula methodology captures an important feature of the joint distribution that is ignored in the independence approach.



(a) UWME (b) Independence (c) Copula

Figure 4. Multivariate rank histograms for the copula and independence approaches as well as the raw UWME ensemble, taken over all available observations at 60 observation stations in the Northwest USA over the year 2008.

Table 4. Predictive performance of the copula and independence approaches as well as the raw UWME ensemble.

              ES     EE     Δ      DS
UWME          0.943  1.061  0.161  0.811
Independence  0.646  0.914  0.071  1.945
Copula        0.644  0.914  0.066  1.905

Results are averaged over 60 observation stations in the Northwest USA and all days in 2008 for which forecasts and verifying observations were available. The predictive performance is measured by the energy score (ES), Euclidean error (EE), reliability index (Δ) and determinant sharpness (DS).

Figure 4 compares the multivariate rank histograms (averaged over all observation stations) for the copula and independence approaches as well as the raw UWME ensemble. We see a similar result across all stations as for the KSEA station above: namely, that the highest bin is under-occupied in the independence approach in comparison to the copula approach. Furthermore, both methods improve the calibration considerably compared to the raw ensemble.

Table 4 presents verification scores averaged over all 60 stations and all available days in 2008. The results here are broadly consistent with those reported for KSEA. We see that the determinant sharpness is improved in the copula versus the independence approach, while the Euclidean error is essentially the same. These two factors lead to an improvement in the energy score for the copula approach. Calibration also appears to be improved, as shown through the lower energy score and reliability index for the copula approach.

3.3. Assessing significance

The results presented in Table 4 show a small improvement in the energy score when moving from the independence to the copula approach, which is considerably less than the improvement in moving from the raw UWME distribution to the independence approach. In this section we show that, while the magnitude of the difference is small, it is not purely the result of sampling variability.

In order to make this comparison we conduct a permutation test. The permutation test works by treating the two populations of energy scores as interchangeable under a null hypothesis that the scores come from the same distribution (see Good, 1995, for a detailed discussion of the permutation test and its properties).


Figure 5. Permutation distribution of the difference between energy scores for the copula and independence approaches, along with the true value (solid vertical line) and the 0.025 and 0.975 quantiles (dotted vertical lines).

The permutation test proceeds by constructing a large number of synthetic datasets under this assumption of exchangeability. Thus, for every day and station combination in which the two energy scores were calculated, we randomly assign one score to the copula group and the other score to the independence group. This pairwise reassignment is performed since the magnitude of the energy scores can vary dramatically throughout the year, while the differences vary to a much smaller degree. Once a synthetic dataset is constructed in this manner, the difference in the means of the two groups is computed and retained. The entire process is repeated a large number of times; in our case we construct 10 000 synthetic datasets.
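This paired permutation scheme amounts to randomly flipping the sign of each day/station score difference. A minimal Python sketch (our own naming, assuming per-case score arrays for the two methods):

```python
import numpy as np

def paired_permutation_test(scores_a, scores_b, n_perm=10_000, rng=None):
    """Permutation distribution of the mean score difference, treating the
    two methods' scores as exchangeable within each day/station pair
    (implemented as random sign flips of the paired differences)."""
    rng = np.random.default_rng(rng)
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    # Each row of `signs` defines one synthetic dataset
    signs = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    perm_means = (signs * d).mean(axis=1)
    observed = d.mean()
    # Two-sided p-value: fraction of permuted means at least as extreme
    p_value = float(np.mean(np.abs(perm_means) >= abs(observed)))
    return observed, perm_means, p_value
```

Comparing the observed mean difference to the quantiles of `perm_means` reproduces the style of assessment shown in Figure 5.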

Figure 5 shows the distribution of these permutation scores, along with the true difference of −0.00145. As would be expected, the permutation distribution is centred about 0, with 0.025 and 0.975 quantiles of −0.000243 and 0.000245, respectively. As we can see, none of the 10 000 sampled values from the permutation distribution approaches the true value.

This shows the statistical significance of the results presented in Table 4. While the magnitude of the difference is not large, especially in comparison to the initial


improvement over the raw ensemble, the result could not have come about purely from sampling variability.

We finally note that, intuitively, it is reasonable that the improvement from using the copula would be orders of magnitude smaller than the initial improvement over the raw ensemble, especially for the dataset under consideration. The raw ensemble consists of only eight forecasts, and thus the UWME results are computed over a five-dimensional predictive distribution with only eight points to fill the entire space. By returning a full distribution, as opposed to such a sparse, discrete distribution, ensemble BMA will undoubtedly yield a substantial improvement in the performance of the predictive distribution.

By contrast, the improvement in moving from the independence approach to the copula approach is understandably less dramatic, as both procedures return full predictive distributions and also have the same marginal distributions. However, this section has shown that the small improvement shown in Table 4 is not due to sampling variability, and instead shows that the copula approach yields a small edge in predictive performance.

4. Conclusions

We have proposed a method for constructing a joint predictive distribution based on post-processing an ensemble of forecasts for multiple weather quantities. Our method utilizes existing techniques for post-processing univariate weather quantities and then leverages a Gaussian copula to tie these individual marginal distributions together. The method is relatively simple; it requires little additional computational effort after the univariate marginals are formed and it retains the marginal distributions learned from ensemble BMA. We have then shown that the method yields a calibrated and sharp distribution, using data from the Pacific Northwest.

In this paper we focused on using the ensemble BMA methodology to construct marginal predictive distributions. This was done since the methodology has already been established for several interesting weather quantities, and software exists to implement these methods. However, in practice any method for forming marginal distributions could replace the ensemble BMA framework and essentially nothing of the overall methodology would be affected. This highlights the flexibility of the copula approach. Note, however, that the copula approach models the multivariate correlation structure only, leaving the marginals unchanged. It should thus only be used in connection with other statistical post-processing methods that yield calibrated and sharp marginals.

Alternative approaches to constructing joint distributions, such as the Schaake shuffle and the ECC approach, use a discrete copula to construct the joint distribution. These methods learn a multivariate rank structure by investigating ranks in historical observations (Schaake) or the ensemble (ECC). We feel our approach offers a useful alternative to these methods for two reasons. First, the joint distribution is modelled using a Gaussian distribution. This does not require subsequent samples from the predictive distribution to strictly obey any observed rank structure and, in our opinion, may scale better to high dimensions. The statistics literature is currently investigating these questions (see Hoff et al., 2011) and subsequent work should be undertaken to compare these methods in the ensemble post-processing

context. The second feature of our method is that it aims to model the joint distribution after post-processing. By running the ensemble BMA approach and then learning the latent Gaussian factor using a verifying observation, we are implicitly modelling the joint residual structure implied by the ensemble BMA method itself. This approach, in our opinion, more appropriately reflects the modelling process that is being performed.

Section 2.3 outlined the basics of Gaussian copula theory and provided a simple algorithm for calculating latent Gaussian factors and the subsequent correlation matrix. We presented this approach as it is relatively straightforward and captures the main features of interest. More involved estimation strategies exist, for instance the Bayesian approach of Hoff (2007), and subsequent work should consider these frameworks to assess whether they offer improvements in predictive performance.

As noted in section 2.3, we assume a constant correlation factor for all observations when estimating the Gaussian copula. Certainly, such an assumption may be an over-simplification. More complicated methods that include time-varying correlation factors are possible, but these methods are more difficult both to describe and to estimate. Further, investigations on our part revealed essentially no added benefit to considering time-varying correlation models in the example discussed in section 3. However, it may be useful to reconsider the estimation strategy in the future and for different datasets, especially in higher-dimensional situations.

The Gaussian copula has received occasional criticism (see Mikosch, 2006) for not accurately capturing dependence in the tails of multivariate distributions. This criticism is germane, but not a central concern for the modelling task undertaken in this paper. Our goal has been to construct a multivariate distribution after post-processing an ensemble of forecasts. If extreme weather events were possible, it is probable that this information would already be incorporated to some extent in the ensemble itself, and the copula would simply indicate the variability about these extreme values. Furthermore, the direct modelling task has focused on day-to-day weather prediction, as opposed to focusing on extreme events. If the joint modelling of extreme weather is the central concern of the forecaster, it is likely that an alternative copula model may be preferable.

While we have shown promising results in modelling five weather quantities simultaneously, the long-term goal is undoubtedly to model weather jointly in the spatial domain and for multiple variables. Once spatial factors are added, the dimensionality of the model can increase rapidly, and issues related to spatial covariation estimation need to be solved. There have been several recent advances in using fast computational methods for Gaussian Markov random fields (Lindgren et al., 2011), which could prove useful in constructing high-dimensional joint distributions based on post-processed ensemble forecasts. Our current steps in advancing the methods discussed above have been to merge this literature with that of ensemble post-processing.

Acknowledgements

We would like to thank Tilmann Gneiting and Michael Scheuerer for helpful discussions, Jeff Baars for providing the data, and Chris Fraley for invaluable assistance regarding the ensembleBMA package. Annette Möller and Alex Lenkoski


gratefully acknowledge support by the German Research Foundation (DFG) within the programme ‘Spatio-Temporal Graphical Models and Applications in Image Analysis’, grant GRK 1653. We would finally like to thank two anonymous referees for their helpful comments.

References

AghaKouchak A, Bardossy A, Habib E. 2010. Copula-based uncertainty modelling: application to multisensor precipitation estimates. Hydrol. Process. 24: 2111–2124.

Anderson JL. 1996. A method for producing and evaluating probabilistic forecasts from ensemble model integrations. J. Climate 9: 1518–1530.

Baars J. 2005. Observations QC documentation. http://www.atmos.washington.edu/mm5rt/qc obs/qc doc.html

Bao L, Gneiting T, Grimit EP, Guttorp P, Raftery AE. 2010. Bias correction and Bayesian model averaging for ensemble forecasts of surface wind direction. Mon. Weather Rev. 138: 1811–1821.

Berrocal VJ, Raftery AE, Gneiting T. 2007. Combining spatial statistical and ensemble information in probabilistic weather forecasts. Mon. Weather Rev. 135: 1386–1402.

Berrocal VJ, Raftery AE, Gneiting T. 2008. Probabilistic quantitative precipitation field forecasting using a two-stage spatial model. Ann. Appl. Statist. 2: 1170–1193.

Chmielecki RM, Raftery AE. 2010. Probabilistic visibility forecasting using Bayesian model averaging. Mon. Weather Rev. 139: 1626–1636.

Clark MP, Gangopadhyay S, Hay LE, Rajagopalan B, Wilby RL. 2004. The Schaake Shuffle: a method for reconstructing space–time variability in forecasted precipitation and temperature fields. J. Hydrometeorol. 5: 243–262.

Delle Monache L, Hacker JP, Zhou Y, Deng X, Stull RB. 2006. Probabilistic aspects of meteorological and ozone regional ensemble forecasts. J. Geophys. Res. 111: D24307, DOI: 10.1029/2005JD006917.

Eckel FA, Mass CF. 2005. Aspects of effective mesoscale, short-range ensemble forecasting. Weather Forecast. 20: 328–350.

Fortin V, Favre AC, Saïd M. 2006. Probabilistic forecasting from ensemble prediction systems: improving upon the best-member method by using a different weight and dressing kernel for each member. Q. J. R. Meteorol. Soc. 132: 1349–1369.

Fraley C, Raftery AE, Gneiting T, Sloughter JM, Berrocal VJ. 2011. Probabilistic weather forecasting in R. The R Journal 3: 55–63.

Genest C, Favre AE. 2007. Everything you always wanted to know about copula modeling but were afraid to ask. J. Hydrol. Eng. 12: 347–368.

Gneiting T, Raftery AE. 2005. Weather forecasting with ensemble methods. Science 310: 248–249.

Gneiting T, Raftery AE. 2007. Strictly proper scoring rules, prediction, and estimation. J. Am. Statist. Assoc. 102: 359–378.

Gneiting T, Raftery AE, Westveld AH, Goldman T. 2005. Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Weather Rev. 133: 1098–1118.

Gneiting T, Stanberry LI, Grimit EP, Held L, Johnson NA. 2008. Assessing probabilistic forecasts of multivariate quantities, with applications to ensemble predictions of surface winds (with discussion and rejoinder). Test 17: 211–264.

Good PI. 1995. Permutation Tests. Springer: Berlin.

Hagspiel S, Papaemannouil A, Schmid M, Andersson G. 2011. Copula-based modeling of stochastic wind power in Europe and implications for the Swiss power grid. Appl. Energy 96: 33–44.

Hamill TM, Colucci SJ. 1997. Verification of Eta-RSM short-range ensemble forecasts. Mon. Weather Rev. 125: 1312–1327.

Hersbach H. 2000. Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather Forecast. 15: 559–570.

Hoff PD. 2007. Extending the rank likelihood for semiparametric copula estimation. Ann. Appl. Statist. 1: 265–283.

Hoff PD, Niu X, Wellner JA. 2011. Information bounds for Gaussian copulas. http://arxiv.org/abs/1110.3572.

Kao SC, Govindaraju RS. 2010. A copula-based joint deficit index for droughts. J. Hydrol. 380: 121–134.

Krzysztofowicz R. 1999. Bayesian theory of probabilistic forecasting via deterministic hydrologic model. Water Resour. Res. 35: 2739–2750.

Krzysztofowicz R, Herr HD. 2001. Hydrologic uncertainty processor for probabilistic river stage forecasting: precipitation-dependent model. J. Hydrol. 249: 46–68.

Krzysztofowicz R, Kelly KS. 2000. Hydrologic uncertainty processor for probabilistic river stage forecasting. Water Resour. Res. 36: 3265–3277.

Leamer EE. 1978. Specification Searches. Wiley: New York.

Leutbecher M, Palmer TN. 2008. Ensemble forecasting. J. Comput. Phys. 227: 3515–3539.

Lindgren F, Rue H, Lindstrom J. 2011. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J. R. Statist. Soc. B 73: 423–498.

Matheson JE, Winkler RL. 1976. Scoring rules for continuous probability distributions. Manage. Sci. 22: 1087–1096.

Mikosch T. 2006. Copulas: tales and facts. Extremes 9: 3–20.

National Weather Service. 1998. Automated Surface Observing System (ASOS) User's Guide. http://www.weather.gov/asos/aum-toc.pdf.

Nelsen RB. 2006. An Introduction to Copulas (2nd edn). Springer: New York, NY.

Pinson P. 2012. Adaptive calibration of (u, v)-wind ensemble forecasts. Q. J. R. Meteorol. Soc. 138: 1273–1284, DOI: 10.1002/qj.1873.

Pinson P, Girard R. 2012. Evaluating the quality of scenarios of short-term wind power generation. Appl. Energy 96: 12–20.

Pinson P, Papaefthymiou G, Klockl B, Nielsen HA, Madsen H. 2009. From probabilistic forecasts to statistical scenarios of short-term wind power production. Wind Energy 12: 51–62.

R Development Core Team. 2011. R: A language and environment for statistical computing. http://www.R-project.org.

Raftery AE, Gneiting T, Balabdaoui F, Polakowski M. 2005. Using Bayesian model averaging to calibrate forecast ensembles. Mon. Weather Rev. 133: 1155–1174.

Roquelaure S, Bergot T. 2008. A local ensemble prediction system for fog and low clouds: construction, Bayesian model averaging calibration, and validation. J. Appl. Meteorol. Climatol. 47: 3072–3088.

Schefzik R. 2011. Ensemble copula coupling. Diploma thesis, Faculty of Mathematics and Informatics, University of Heidelberg.

Scholzel C, Friederichs P. 2008. Multivariate non-normally distributed random variables in climate research: introduction to the copula approach. Nonlinear Proc. Geophys. 15: 761–772.

Schuhen N, Thorarinsdottir TL, Gneiting T. 2012. Ensemble model output statistics for wind vectors. ArXiv:1201.2612.

Sloughter JM, Raftery AE, Gneiting T, Fraley C. 2007. Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Mon. Weather Rev. 135: 3209–3220.

Sloughter JM, Gneiting T, Raftery AE. 2010. Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. J. Am. Statist. Assoc. 105: 25–35.

Sloughter JM, Gneiting T, Raftery AE. 2011. Probabilistic wind vector forecasting using ensembles and Bayesian model averaging. Mon. Weather Rev. (submitted).

Talagrand O, Vautard R, Strauss B. 1997. Evaluation of probabilistic prediction systems. In Proceedings of the Workshop on Predictability, European Centre for Medium-Range Weather Forecasts, Reading, UK; 1–25.

Thorarinsdottir TL, Gneiting T. 2010. Probabilistic forecasts of wind speed: ensemble model output statistics using heteroskedastic censored regression. J. R. Statist. Soc. A 173: 371–388.

Thorarinsdottir TL, Johnson MS. 2011. Probabilistic wind gust forecasting using non-homogeneous Gaussian regression. Mon. Weather Rev. 140: 889–897.

Vardi Y, Zhang CH. 2000. The multivariate L1-median and associated data depth. Proc. Natl Acad. Sci. USA 97: 1423–1426.

Wilks DS. 2002. Smoothing forecast ensembles with fitted probability distributions. Q. J. R. Meteorol. Soc. 128: 2821–2836.

Wilks DS, Hamill TM. 2007. Comparison of ensemble-MOS methods using GFS reforecasts. Mon. Weather Rev. 135: 2379–2390.

Wilson LJ, Beauregard S, Raftery AE, Verret R. 2007. Calibrated surface temperature forecasts from the Canadian ensemble prediction system using Bayesian model averaging. Mon. Weather Rev. 135: 1364–1385.
