
* Corresponding author. Tel.: +90-212-2856577; fax: +90-212-2856587.

E-mail address: haksoy@itu.edu.tr (H. Aksoy).

0960-1481/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.

doi:10.1016/j.renene.2004.03.011

Renewable Energy 29 (2004) 2111–2131

www.elsevier.com/locate/renene

Stochastic generation of hourly mean wind speed data

Hafzullah Aksoy *, Z. Fuat Toprak, Ali Aytek, N. Erdem Unal

Department of Civil Engineering, Civil Engineering Faculty, Istanbul Technical University,

Hydraulics Division, Maslak, 34469 Istanbul, Turkey

Received 19 September 2003; accepted 23 March 2004

Abstract

Use of wind speed data is of great importance in civil engineering, especially in structural and coastal engineering applications. Synthetic data generation techniques are used in practice for cases where long wind speed data are required. In this study, a new wind speed data generation scheme based upon wavelet transformation is introduced and compared to the existing wind speed generation methods, namely normal and Weibull distributed independent random numbers, the first- and second-order autoregressive models, and the first-order Markov chain. The results suggest the wavelet-based approach as an alternative to the existing methods of wind speed data generation.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: Normal distribution; Weibull distribution; Autoregressive models; Markov chain; Wavelet; Hourly mean wind speed

1. Introduction and existing literature

Climatology is defined as a set of probabilistic statements on long-term weather conditions [1], and wind climatology as that branch of climatology that specialises in the study of winds, from which information on extreme winds is provided to structural designers. Such information is also needed for wind energy producers and engineers who design coastal civil structures, for example breakwaters. From a structural engineering point of view, forecasting the maximum wind speed that is expected to affect a structure during its lifetime is important to the designer. On the other hand, in coastal engineering practices, not only the magnitude but also the directionality of wind becomes important. The duration of wind, in addition to its magnitude and direction, is also required in wind energy production systems, and the amount of energy that can be produced depends upon it.

The information required by either structural and coastal engineers or wind energy producers is related to wind speed data, and is a matter of quality and quantity. The quality of the wind speed data refers to whether the data set is reliable and micrometeorologically homogeneous. A data set is reliable if (i) the measurement instrument performs adequately, (ii) the instrument is not influenced by obstructions and (iii) the atmospheric stratification is neutral. A set of wind speed data is considered micrometeorologically homogeneous if the data set is obtained under identical micrometeorological conditions [1]. The size of the data set (quantity) is related to the time period during which the wind speed data are recorded. The time period over which wind speed data are recorded is usually shorter than the lifetime of civil engineering structures. Therefore, the worst case of wind load that the structural designer expects that the structure will face during its lifetime is determined by modelling the wind speed data record in hand. For this, climatological and physical modelling techniques are available. Additionally, probabilistic and stochastic models have been developed, for which the existing literature is reviewed in brief below. The main aim in those techniques is to determine minimum design loads due to wind [2].

Short records of daily, weekly, and monthly highest wind speeds taken at 36 weather stations in the US were empirically analyzed [3] in order to determine design wind speeds. Short records of hourly mean wind speed data from normal regions in the US were used by Cheng and Chiu [4] for determination of the transition probabilities of the Markov chain upon which the methodology in that study was based. This methodology was extended later to tropical cyclone-prone regions [5]. Also, a knowledge-based expert system, principally similar to the mentioned methodologies, was made available [6,7]. Alternative approaches used in the generation of simulated wind speed time series were compared by Kaminsky et al. [8]. Sfetsos [9] examined adaptive neuro-fuzzy inference systems and neural logic networks and compared them to the traditional autoregressive moving average (ARMA) models. Dukes and Palutikof [10] employed the Markov chain in order to estimate hourly mean wind speed with very long return periods. Another Markov chain based study was conducted by Sahin and Sen [11]. Castino et al. [12] coupled autoregressive processes to the Markov chain and simulated both wind speed and direction. A recent study [13] presents a wavelet-based method to generate artificial wind data. The Weibull distribution has commonly been fitted to hourly mean wind speed data [14,15]. The peaks-over-threshold approach has also been commonly used in the estimation of extreme quantiles of wind speed data [16–19].

2. Methods

In this study, a number of probabilistic and stochastic methods are used in order to compare their ability to reproduce long series of hourly mean wind speed data with the same statistical behaviour as that of the observations in hand. The normal and Weibull probability distribution functions are chosen in order to generate independent and identically distributed random numbers. Autoregressive processes are useful tools in generating data sets in cases where persistency exists. Persistency means that large values tend to be followed by large values, and small values by small values, so that runs of values of similar magnitude tend to persist throughout the sequence. First- and second-order autoregressive processes are chosen in this study. Another concept commonly employed in wind speed data generation studies is the first-order Markov chain. The results of these methods are compared to those obtained from a newly developed wavelet-based approach.

The methods are described below. Only the wavelet-based approach will be detailed, whereas the remaining five methods will be outlined briefly as they have been well documented in the literature.

2.1. Normal distribution

Hourly mean wind speed time series are generated by using a sequence of independent random numbers from the normal distribution. The normal probability distribution function is given by

$$f(w) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(w-\mu)^2}{2\sigma^2}\right] \qquad (1)$$

where w is the variable (hourly mean wind speed, in this study), $\mu$ the mean value of wind speed, and $\sigma$ the standard deviation of wind speed. A number of computational methods are available for the generation of random numbers with normal probability distribution of mean $\mu$ and standard deviation $\sigma$.

2.2. Weibull distribution

The Weibull distribution is another probability distribution function commonly used for the frequency analysis of wind speed data [14,15]. It is given by

$$f(w) = \frac{a}{b^a}\, w^{a-1} \exp\left(-\frac{1}{b^a}\, w^a\right), \qquad w \ge 0;\ a, b > 0 \qquad (2)$$

where a and b are shape and scale parameters, respectively, that can be determined by using either a graphical method or the method of moments. They can also be determined using the method of probability weighted moments (PWMs), for which explicit equations are available. It is the method used in this study for the determination of parameters.

Equations to be used for this purpose are given by

$$a = \frac{\ln(2)}{L_{2,(\ln w)}} \qquad (3a)$$

$$b = \exp\left(L_{1,(\ln w)} + \frac{0.5772}{a}\right) \qquad (3b)$$

In Eqs. (3a,b), $L_{1,(\ln w)}$ and $L_{2,(\ln w)}$ are the L1 and L2 moments of the logarithm of the hourly mean wind speed time series. The L1 and L2 moments of a series are given by

$$L_1 = b_0 \qquad (4a)$$

$$L_2 = 2b_1 - b_0 \qquad (4b)$$

in which $b_0$ and $b_1$ are given by

$$b_0 = \bar{x} \qquad (5a)$$

$$b_1 = \sum_{j=1}^{N-1} \frac{(N-j)}{N(N-1)}\, x_j \qquad (5b)$$

$x_j$ in Eq. (5b) comes from the time series sorted in descending order as $x_N \le \cdots \le x_i \le \cdots \le x_1$. Detailed information on L moments and the method of PWM is given in [20].

Once the parameters are determined, the generation of Weibull distributed random numbers is a matter of a simple computer code, as the cumulative distribution function of the Weibull distribution can be obtained in closed form.
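The parameter estimation of Eqs. (3)–(5) and the closed-form generation step can be sketched in a few lines. The following is only an illustrative sketch, not the code used in the study: the function names, the use of NumPy and the choice of inverse-CDF sampling are assumptions made here for illustration.

```python
import numpy as np

def weibull_params_lmoments(w):
    """Estimate Weibull shape a and scale b from L-moments of ln(w), Eqs. (3)-(5)."""
    w = np.asarray(w, dtype=float)
    x = np.log(w[w > 0.0])          # zero speeds are dropped (logarithm undefined)
    x = np.sort(x)[::-1]            # descending order: x_1 is the largest value
    n = x.size
    j = np.arange(1, n + 1)
    b0 = x.mean()                                   # Eq. (5a)
    b1 = np.sum((n - j) / (n * (n - 1.0)) * x)      # Eq. (5b); the j = N term is zero
    l1, l2 = b0, 2.0 * b1 - b0                      # Eqs. (4a) and (4b)
    a = np.log(2.0) / l2                            # Eq. (3a)
    b = np.exp(l1 + 0.5772 / a)                     # Eq. (3b)
    return a, b

def generate_weibull(a, b, size, rng):
    """Inverse-CDF sampling from F(w) = 1 - exp(-(w/b)^a) (assumed generation method)."""
    u = rng.random(size)                            # uniform numbers in [0, 1)
    return b * (-np.log1p(-u)) ** (1.0 / a)

# Round trip on synthetic data with arbitrary illustrative parameters
rng = np.random.default_rng(0)
sample = generate_weibull(2.0, 3.0, 200_000, rng)
a_hat, b_hat = weibull_params_lmoments(sample)
```

For a long synthetic sample the estimates should land close to the parameters used for generation, which is a convenient check on the descending-order convention of Eq. (5b).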

2.3. AR(1) model

The hourly mean wind speed time series is highly dependent. This property requires a wind speed data generation model that incorporates the dependence structure of the observations. As mentioned, neither normal nor Weibull distributed random numbers take this property into account, as they are independent, whereas autoregressive models are of correlated type and hence capable of simulating this property of the data series.

The use of autoregressive type models is reported in the literature very commonly. The first-order autoregressive [AR(1)] model accommodates only the effect of the previous value in the series. In general, the observed sequence of wind speed data $\{w_1, w_2, \ldots, w_t, \ldots\}$ is used to fit a model of the form

$$w_i = \sum_{j=1}^{m} a_j w_{i-j} + e_i \qquad (6)$$

where w is the hourly mean wind speed, a the autoregressive coefficient, that is, the model parameter, and e a normally distributed independent random variable. It is noted that Eq. (6) is written for the mth order. The simplest case of Eq. (6) is obtained for m = 1, which is also called the Markov model. Eq. (6) then becomes

$$y_i = r_1 y_{i-1} + e_i \qquad (7)$$

where y is the standardised (zero mean and unit variance) version of the variable and $r_1$ the lag-one serial correlation coefficient of the sequence.


The random component (e) in AR(1) is of normal distribution with zero mean and a variance of $1 - r_1^2$. The simulation procedure for the process is very simple. It requires only a random number of a normal distribution to be generated.
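A minimal sketch of the AR(1) recursion of Eq. (7) is given below. The function name, the use of NumPy and the initialisation of the first value from the standard normal distribution are assumptions made for illustration; the output is the standardised series y, which would still have to be rescaled with the mean and standard deviation of the fitted series.

```python
import numpy as np

def generate_ar1(r1, n, rng):
    """Generate n values of the standardised AR(1) process of Eq. (7)."""
    # Random component: zero mean, variance 1 - r1^2
    e = rng.normal(0.0, np.sqrt(1.0 - r1**2), size=n)
    y = np.empty(n)
    y[0] = rng.normal()              # assumed start: draw from the stationary N(0, 1)
    for i in range(1, n):
        y[i] = r1 * y[i - 1] + e[i]
    return y

rng = np.random.default_rng(0)
y = generate_ar1(0.820, 100_000, rng)   # r1 value taken from Table 2
```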

2.4. AR(2) model

With increase in order of the autoregressive model, the dependence structure in the observations is better preserved. Therefore, the second-order autoregressive [AR(2)] model is preferred to AR(1). This becomes more important in cases where dependence in the data set is very obvious, as in the hourly mean wind speed data. AR(2) is formulated as

$$y_i = \phi_1 y_{i-1} + \phi_2 y_{i-2} + e_i \qquad (8)$$

where the autoregressive coefficients $\phi_1$ and $\phi_2$ are given by

$$\phi_1 = r_1 (1 - r_2) / (1 - r_1^2) \qquad (9a)$$

$$\phi_2 = (r_2 - r_1^2) / (1 - r_1^2) \qquad (9b)$$

in which $r_1$ is the lag-one autocorrelation coefficient and $r_2$ the lag-two autocorrelation coefficient of the wind speed time series. The random component in AR(2) is again of normal distribution, with zero mean and variance equal to $1 - R^2$, where $R^2 = \phi_1 r_1 + \phi_2 r_2$, which with Eqs. (9a,b) gives

$$R^2 = \frac{r_1^2 + r_2^2 - 2 r_1^2 r_2}{1 - r_1^2} \qquad (10)$$
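The AR(2) generation can be sketched in the same way. In the sketch below the coefficients follow Eqs. (9a,b), and the residual variance is computed from the Yule-Walker identity $R^2 = \phi_1 r_1 + \phi_2 r_2$. The function names, the zero initial values and the `burn_in` length are hypothetical choices made for illustration only.

```python
import numpy as np

def ar2_coefficients(r1, r2):
    """Return phi1, phi2 of Eqs. (9a,b) and R^2 = phi1*r1 + phi2*r2."""
    phi1 = r1 * (1.0 - r2) / (1.0 - r1**2)
    phi2 = (r2 - r1**2) / (1.0 - r1**2)
    r_squared = phi1 * r1 + phi2 * r2
    return phi1, phi2, r_squared

def generate_ar2(r1, r2, n, rng, burn_in=1000):
    """Generate n values of the standardised AR(2) process of Eq. (8)."""
    phi1, phi2, r_squared = ar2_coefficients(r1, r2)
    total = n + burn_in
    # Random component: zero mean, variance 1 - R^2
    e = rng.normal(0.0, np.sqrt(1.0 - r_squared), size=total)
    y = np.zeros(total)              # assumed start from zero; burn-in is discarded
    for i in range(2, total):
        y[i] = phi1 * y[i - 1] + phi2 * y[i - 2] + e[i]
    return y[burn_in:]

rng = np.random.default_rng(0)
y = generate_ar2(0.820, 0.688, 100_000, rng)   # r1, r2 values taken from Table 2
```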

2.5. Markov chain

In this approach, the observed time series is divided into a number of states. A wind speed state contains wind speeds between certain values. For example, State 1 might include wind speeds below 2 m/s, State 2 wind speeds between 2 and 4 m/s, etc. until the final wind speed state includes all speeds above the highest observed value or a predefined upper limit. The upper and lower limits of the states are highly subjective values. For instance, the hourly mean wind speed data set in this study was divided into 10 states. In another wind speed study [11], states were defined depending upon the standard deviation of the data set. Each state in that study [11] was taken as wide as one standard deviation of the observed hourly mean wind speed time series. Dukes and Palutikof [10], on the other hand, used a fixed width for the states, which was equal to 2 m/s.

In the Markov chain approach, the state of wind speed in the current hour can be defined depending only upon the previous state. This is called the first-order or one-step Markov chain. Two previous states are used in the second-order or two-step Markov chain in determining the current state of the wind speed. Although they are not as common as the first- and second-order Markov chains, higher-order Markov chains can also be used. However, the dramatic increase in the number of their parameters limits their use.


The parameter set of a Markov chain consists of probabilities of transition from one state to another that are given in transition probability matrices. The transition probability matrix of a first-order Markov chain with m states can be written symbolically as

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \cdots & \cdots & P_{ij} & \cdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix} \qquad (11)$$

where $P_{ij}$ is the probability of transition from state i to state j. The number of parameters is $m(m-1)$, as the sum of the probabilities is equal to 1 (100%) for each row of the matrix. If $n_{ij}$ is the total number of hours of observation in state j with the previous state i, the probabilities of transition from state i to state j can be calculated as

$$P_{ij} = \frac{n_{ij}}{\sum_{j} n_{ij}}, \qquad i, j = 1, 2, \ldots, m \qquad (12)$$
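Eq. (12) amounts to counting hour-to-hour state changes and normalising each row. A minimal sketch is given below, assuming equal-width states as used later in the application (10 states, each 1.5 m/s wide). The function name and the 0-based state indexing (State 1 of the text is index 0) are choices made here for illustration.

```python
import numpy as np

def transition_matrix(speeds, width=1.5, n_states=10):
    """Estimate the first-order transition probability matrix of Eq. (12)."""
    speeds = np.asarray(speeds, dtype=float)
    # State index from equal-width bins; the last state has no upper limit
    states = np.minimum((speeds // width).astype(int), n_states - 1)
    counts = np.zeros((n_states, n_states))
    # n_ij: number of hours in state j whose previous hour was in state i
    np.add.at(counts, (states[:-1], states[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that never occur are left as zeros
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Tiny illustrative series (m/s): alternates between the first two states
P = transition_matrix([0.5, 2.0, 0.5, 2.0, 0.5], width=1.5, n_states=3)
```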

The procedure for generating the simulated hourly mean wind speed time series is explained below.

First, the cumulative transition probability matrix is calculated. In the cumulative transition probability matrix, cumulative summation of probabilities within each row is carried out; hence, each row in that matrix ends with 1. Then, an initial state is adopted. No wind (State 1) can, for example, be assumed as the initial state. Using a uniform random number, the next state of wind speed can be determined. If State 1 is obtained as the new state of wind speed, then it is first checked if the wind speed is zero. If the wind speed is not zero, then a uniform random number is generated from the interval of State 1. If the highest state is found to be the new state of wind speed, then a shifted one-parameter gamma distributed random number is used in order to find the magnitude of the wind speed. The reason for choosing the gamma distribution will be discussed in the section where results obtained from application of the methods are presented. For intermediate states, a uniform random number from the interval of the corresponding state is generated and set as the wind speed at the current hour.
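The generation procedure described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function name, the zero-speed probability `p_zero` and the gamma shape `gamma_shape` are hypothetical parameters that would have to be estimated from the observations, every row of the matrix is assumed to have a nonzero sum, and states are indexed from 0.

```python
import numpy as np

def simulate_markov_speeds(P, n, rng, width=1.5, p_zero=0.0, gamma_shape=1.0):
    """Simulate n hourly wind speeds from a first-order Markov chain."""
    P = np.asarray(P, dtype=float)
    m = P.shape[0]
    cum = np.cumsum(P, axis=1)       # cumulative transition probability matrix
    cum /= cum[:, -1:]               # make each row end at exactly 1
    state = 0                        # initial state: lowest state ("no wind")
    speeds = np.empty(n)
    for t in range(n):
        # Next state: first column whose cumulative probability exceeds u
        state = int(np.searchsorted(cum[state], rng.random(), side="right"))
        lower = state * width
        if state == m - 1:
            # Highest state: one-parameter gamma shifted to the state's lower limit
            speeds[t] = lower + rng.gamma(gamma_shape)
        elif state == 0 and rng.random() < p_zero:
            speeds[t] = 0.0          # zero wind speed within the lowest state
        else:
            speeds[t] = lower + width * rng.random()   # uniform within the state
    return speeds

# Two-state toy matrix, for illustration only
rng = np.random.default_rng(0)
speeds = simulate_markov_speeds([[0.9, 0.1], [0.5, 0.5]], 10_000, rng)
```

Using `side="right"` in the search means that states with zero transition probability are never selected, which is the behaviour behind the bounded maxima discussed for this method in the results.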

2.6. Wavelet-based approach

A real or complex-valued continuous function with zero mean and finite variance is called a wavelet [21]. There are many functions that can qualify as wavelets. Some examples of wavelets are Morlet, Mexican hat, Shannon and Meyer. A simple wavelet is the Haar wavelet (Fig. 1), defined as

$$\psi(t) = \begin{cases} 1 & 0 \le t \le 1/2 \\ -1 & 1/2 \le t \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad (13)$$

Decomposing a signal and then reconstructing it is the basis of the wavelet transform. In this study, the Haar wavelet was used due to its simplicity. Therefore, decomposition of a signal (multiresolution analysis) with the Haar wavelet is considered and explained in detail below.

For a certain value of k, let us define $f_k(t)$ as the average of f(s) over an interval of size $2^k$:

$$f_k(t) = \frac{1}{2^k} \int_{2^k l}^{2^k (l+1)} f(s)\,ds, \qquad 2^k l < t < 2^k (l+1) \qquad (14)$$

where k and l are integers, k a scale variable (k > 0 means stretching and k < 0 means contracting of the wavelet) and l a translation variable [21]. For $k = -\infty, \ldots, -1, 0, 1, \ldots, \infty$, $f_k(t)$ is as follows:

$$\begin{aligned}
f_{-\infty}(t) &= f(t) \\
&\;\;\vdots \\
f_{-1}(t) &= 2 \int_{l/2}^{(l+1)/2} f(s)\,ds, && \frac{l}{2} < t < \frac{l+1}{2} \\
f_{0}(t) &= \int_{l}^{l+1} f(s)\,ds, && l < t < l+1 \\
f_{1}(t) &= \frac{1}{2} \int_{2l}^{2(l+1)} f(s)\,ds, && 2l < t < 2(l+1) \\
&\;\;\vdots \\
f_{\infty}(t) &= 0
\end{aligned} \qquad (15)$$

Fig. 1. Haar wavelet.


The resolution decreases as k increases. The difference between the successive averages $f_{k-1}(t)$ and $f_k(t)$ is defined as a detail function:

$$g_k(t) = f_{k-1}(t) - f_k(t) \qquad (16)$$

It can be easily seen that

$$f(t) = \sum_{k=-\infty}^{\infty} g_k(t) \qquad (17)$$
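The step from Eq. (16) to Eq. (17) is a telescoping sum. A short derivation, using only the limiting cases $f_{-\infty}(t) = f(t)$ and $f_{\infty}(t) = 0$ from Eq. (15) and a truncation index n introduced here for illustration, reads

$$\sum_{k=-n}^{n} g_k(t) = \sum_{k=-n}^{n} \left[ f_{k-1}(t) - f_k(t) \right] = f_{-n-1}(t) - f_n(t) \;\longrightarrow\; f(t) - 0 = f(t) \quad \text{as } n \to \infty .$$

All intermediate averages cancel in pairs, so only the finest and the coarsest averages remain.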

According to Eq. (17), the original signal is obtained when all detail functions are summed up. Change in data resolution with change in k, the resolution level, can be seen in the upper part of Fig. 2, in which the average of the time series taken at different resolution levels according to Eq. (15) is shown. Note that the data sample used in Fig. 2 has 16 elements. Increase in the ordinates of $f_k(t)$ with decrease in k shows the change (increase) in the resolution. The middle part of Fig. 2 shows the detail functions calculated using Eq. (16) for different resolution levels. Note from Eq. (15) that $f_4(t) = 0$ for all t. At the bottom of Fig. 2, f(t), the sum of the four detail functions according to Eq. (17), is seen, and it represents the original data, $f_0(t)$. Eq. (17) is the basis for the generation algorithm explained below.

Let us consider a data sample of size $M = 2^K$, where K is a positive integer (K = 4 for the sequence in Fig. 2), taken from a stochastic process f(t) with zero mean: f(1), f(2), ..., f(M). Define the sample $f_k(i)$ (k = 0, 1, ..., K; i = 1, ..., M) consisting of averages of $2^k$ successive elements of the sample. $f_0(i)$ is the original sample and $f_K(i)$ is a sample of all zeros, since the average of M elements is zero. The detail function $g_k(t)$ has a sample consisting of M elements given by Eq. (16) for k = 1, 2, ..., K.

Thus, for each element $f_i$ of the original sample, we have K detail function values, $g_k(i)$, corresponding to different resolutions. Choosing from M elements for each $g_k(t)$ randomly, and then summing them up using Eq. (17), one obtains a simulated value for f(t) as

$$f(j) = \sum_{k=1}^{K} g_k(j) \qquad (18)$$

where j is the index for generated elements.

The generation algorithm is given step by step as follows [22] and is illustrated in Fig. 3 for K = 4.

1. In order to obtain the first element of the series (j = 1), $g_k$ values (k = 1, 2, ..., K) are chosen from M values randomly and summed up to obtain $f_1$ (Fig. 3).

2. The second element (j = 2) is generated by choosing, for each k, the $g_k$ coming just after the $g_k$ values chosen in the first step. $f_2$ is obtained by the summation of these (Fig. 3).

3. Data generation is continued in this way for a desired number of times using, for the generation of each element $f_j$, the detail function values right next to those of the previous step j−1 at each resolution level.
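The decomposition and the three steps above can be sketched as follows for a zero-mean sample of size $M = 2^K$. This is a minimal illustration, not the authors' implementation: the function names and the optional `starts` argument are hypothetical, and wrapping around to the beginning of the sample when its end is reached is an assumption, since the text does not state how the end of the sample is handled.

```python
import numpy as np

def haar_details(f):
    """Detail functions g_k(i), k = 1..K, of a sample of size M = 2^K (Eq. 16)."""
    f = np.asarray(f, dtype=float)
    M = f.size
    K = M.bit_length() - 1
    if M != 2**K:
        raise ValueError("sample size must be a power of two")
    details = np.empty((K, M))
    prev = f                                   # f_0(i): the original sample
    for k in range(1, K + 1):
        block = 2**k
        # f_k(i): average of blocks of 2^k successive elements
        avg = np.repeat(f.reshape(-1, block).mean(axis=1), block)
        details[k - 1] = prev - avg            # g_k = f_{k-1} - f_k
        prev = avg
    return details

def wavelet_generate(details, n, rng, starts=None):
    """Generate n values by summing successive detail values (Eq. 18)."""
    K, M = details.shape
    if starts is None:
        starts = rng.integers(0, M, size=K)    # step 1: random position per level
    starts = np.asarray(starts)
    steps = np.arange(n)
    out = np.zeros(n)
    for k in range(K):
        # steps 2-3: move one position forward at every level (assumed wrap-around)
        out += details[k, (starts[k] + steps) % M]
    return out

rng = np.random.default_rng(0)
f = rng.normal(size=16)
f -= f.mean()                                  # the method assumes a zero-mean sample
g = haar_details(f)
simulated = wavelet_generate(g, 64, rng)
```

When all levels start from the same position the sum of the detail functions returns the original sample, which is Eq. (17) restricted to k = 1, ..., K; independent random starting positions at each level are what produce a new sequence.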


Fig. 2. Decomposition and reconstruction of a data sequence.


This generation algorithm is a newly developed approach for data simulation purposes. It was first used in non-skewed annual and monthly streamflow data simulation studies [22,23]. The approach was later used for the simulation of the storage capacity of river reservoirs [24]. Modelling of suspended sediment discharge series [25] and annual and monthly rainfall data series [26] was also performed successfully by this approach. The algorithm reproduced the mean, standard deviation and correlation structure of the observed streamflow data sets. When one is interested in the generation of skewed data, it is first required to transform the data to a non-skewed structure, generate them and then transform them back to their skewed structure.

3. Application

The methods were applied to an hourly mean wind speed data set that will be introduced in the following subsections. Results obtained from the application of the methods are presented and discussed below. The performance of the methods was measured according to their ability to capture the statistical behaviour of the observed data set. A comparison of the methods is finally presented.

Fig. 3. Construction of a simulated data sequence.

3.1. Data

Table 1 shows the main statistical characteristics of the data set of hourly mean wind speed taken from the State Meteorological Works’ meteorology station in Diyarbakir, a southeastern Anatolian city. The data set is of four years’ length, from 1994 to 1997 (35064 hours in total). The region is normal, as is seen in Table 1. The data set is highly correlated, as expected, and skewed. For the wavelet-based approach, 32768 hours of data, extending from the first hour of April 6, 1994, to the eighth hour of December 31, 1997, were used. This is a choice with no specific reason. Characteristics corresponding to that part of the observed series are also given in Table 1.

3.2. Parameters

The hourly mean wind speed data set used in the study is of skewed structure. This prevents fitting of the normal distribution to the data. Therefore, a power transformation [$y = x^h$, where x is the raw (untransformed) variable, y the transformed variable, and h the transformation coefficient] was adopted in order to obtain non-skewed data, to which the normal distribution can be fitted. The transformation coefficient was obtained as h = 0.38585 for the data set in the study. As the normal distribution is fitted to the transformed hourly mean wind speed time series (but not to the raw data series), the parameters of the normal distribution are the mean and standard deviation of the transformed hourly mean wind speed time series. Those parameters are presented in Table 2. The normal probability distribution function based upon the determined parameters was fitted to the transformed wind speed data series (Fig. 4). It is seen that the distribution fits very well both the observations and the generated data, which are explained in the following sections.

The Weibull distribution has two parameters (a, the shape parameter, and b, the scale parameter). The parameters were determined using the method of L-moments, on which detailed information was given previously. The reason for choosing this method is that explicit equations are available for determination of the parameters of the distribution. The method also has the advantage of being less sensitive to outliers, which means that outliers have little effect on the parameter estimates. The only problem with this method is the presence of zero wind speeds, which makes the method inapplicable due to the logarithm included. In order to overcome this problem, zero wind speeds were removed from the observed time series, as their number of occurrences was very small, less than 0.5%. The parameters of the Weibull distribution determined by the method of L-moments are listed in Table 2. Fig. 5 shows the agreement between the observed data and the fitted Weibull probability distribution function. It can be considered a very good fit, although the Weibull probability distribution function gives the mode an occurrence probability slightly lower than that in the observation.

Table 1
Statistical characteristics of observed hourly mean wind speed time series

Date                               Number    Mean    Standard    Coefficient   Coefficient   Maximum      Correlation coefficient
                                   of data   (m/s)   deviation   of variation  of skewness   wind speed   r1      r2      r3      r4      r5
                                                     (m/s)                                   (m/s)
1 January 1994–31 December 1997    35064     2.538   1.786       0.703         1.285         14.4         0.860   0.732   0.633   0.549   0.476
6 April 1994–31 December 1997      32768     2.555   1.794       0.702         1.283         14.4         0.861   0.733   0.635   0.551   0.438

AR(1) is a parametric model with two parameters (a, the autoregression coefficient, and $\sigma_e^2$, the variance of the independent normal variable). The model requires only the lag-one serial correlation coefficient ($r_1$), as both parameters are dependent only upon $r_1$.

AR(2) has three parameters ($\phi_1$ and $\phi_2$, the autoregression coefficients, and $\sigma_e^2$, the variance of the independent normal variable), all functions of $r_1$ and $r_2$, the lag-one and lag-two serial correlation coefficients listed in Table 2.

Of the six methods, the Markov chain is the one that requires the highest number of parameters. The number of parameters required changes with the number of states used for the wind speed. In this study, 10 states were chosen for the wind speed, each 1.5 m/s wide. This resulted in 90 transition probabilities to be determined from the observed wind speed data set, when it is considered that summation over any row in the transition probability matrix results in 100% probability. The transition probability matrix of the data set is given in Table 3. Not only transition probabilities, but also the wind speed distribution in each state should be known in this method. In this study, wind speed was assumed to be distributed uniformly over the states except for the last one (state of highest wind speeds with no upper limit), where the one-parameter gamma distribution was used. In State 1 with the lower limit of zero, the probability of occurrence of zero wind speed was also taken into consideration in order to reproduce the zero wind speeds, although their occurrence was very low.

There is no parameter to be listed for the wavelet approach, as it is a nonparametric method. The length of the series to be used in this method is equal to $2^K$, where K is a positive integer and equal to 15 in this study. This corresponds to a series 32768 hours in length. Another requirement for the wavelet approach is that the data set should be of a non-skewed structure. Therefore, the part of the observed series used for the wavelet approach was transformed by using the power transformation with h = 0.3853.

Table 2
Parameter sets of methods

Method          Parameter set
Normal          μ = 1.347 m/s    σ = 0.392 m/s
Weibull         a = 1.583        b = 1.973
AR(1), AR(2)    r1 = 0.820       r2 = 0.688

Fig. 4. Normal probability distribution function fitted to the observed and simulated random wind speed sequences.

Fig. 5. Weibull probability distribution function fitted to the observed and simulated random wind speed sequences.

Table 3
Transition probability matrix of the observed hourly mean wind speed data set

Pij       j = 1    2        3        4        5        6        7        8        9        10
i = 1     0.7053   0.2779   0.0144   0.0015   0.0008   0.0001   0.0000   0.0000   0.0000   0.0000
2         0.2405   0.6089   0.1306   0.0153   0.0041   0.0005   0.0000   0.0001   0.0000   0.0000
3         0.0256   0.2839   0.5317   0.1352   0.0178   0.0048   0.0005   0.0003   0.0002   0.0000
4         0.0042   0.0491   0.3116   0.4954   0.1191   0.0179   0.0023   0.0003   0.0000   0.0000
5         0.0008   0.0176   0.0865   0.3486   0.4311   0.0978   0.0168   0.0008   0.0000   0.0000
6         0.0000   0.0089   0.0266   0.1197   0.3437   0.4013   0.0865   0.0111   0.0022   0.0000
7         0.0000   0.0152   0.0076   0.0455   0.1212   0.3561   0.3485   0.1061   0.0000   0.0000
8         0.0000   0.0000   0.0000   0.0526   0.0000   0.2105   0.3684   0.2632   0.0789   0.0263
9         0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.1818   0.3636   0.3636   0.0909
10        0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   1.0000   0.0000

3.3. Simulation and results

A thousand-year-long (8760000-hour) series was generated for each method. The correlogram, frequency distribution of maximum wind speeds and wind duration curve obtained from the simulations are compared to those of the observed series.

It is obvious that the hourly mean wind speed time series has a highly dependent structure. The normal and Weibull distributions, however, are of independent structures (Fig. 6), yet they are very common methods used in generating wind speed data. These methods may be useful in offering, to the structural designer, the highest wind speed that the structure will possibly face during its lifetime.

It is seen from Fig. 4 that wind speed data generated by the normal distribution fit the observed series very well. It is seen in Fig. 5 that the Weibull fit is very good as well.

Other than those two methods, the AR(1), AR(2), and Markov chain methods appear to reproduce the dependence structure of the series. However, with increasing lags in time, the success of those methods in reproducing the correlation structure of the series decreases (Fig. 6). The wavelet method was found to be the best of the six methods studied in preserving the correlation structure of the series.

Fig. 6. Correlogram of the observed and simulated wind speeds.

The annual maximum values of the simulation series are compared in Fig. 7. It is seen that the normal distribution, AR(1) and AR(2) produced similar maxima, whereas the wavelet approach produced higher, the Markov chain slightly lower and the Weibull distribution considerably lower maxima. From the structural engineering point of view, therefore, it is safer to use the wind load due to the maximum wind speed generated by the wavelet-based approach.

Maxima obtained from the Markov chain method should be discussed specifically. There are three vertical jumps (one of them is very obvious) in the cumulative frequency diagram of the maxima of this method, as is seen in Fig. 7. The reason for those jumps can be explained very simply. It is seen from Table 3 that the probabilities of transition of the wind speed to the highest states are very low, making transition of wind speed to those states almost impossible in the simulation series. It is only possible to make a transition to State 10 if the previous state in the simulation series is either State 8 or State 9. Otherwise State 10 is not simulated. This causes the maximum value of the series to be bounded by the upper limit of State 9, which was taken as 13.5 m/s in this study. A very small jump exists in the frequency curve in Fig. 7 due to this circumstance. Similarly, State 9 can be simulated if and only if the previous state of the wind speed is one of the following states: 3, 6, 8, 9 and 10 (Table 3). The big jump in the cumulative frequency curve in Fig. 7 is due to this situation. It is seen that maxima of the simulated series are bounded by the upper limit of State 8, which was taken as 12 m/s in this study. The third jump at the very beginning of the curve (close to the y-axis of the graph) is due to a similar situation. This is the result of not having simulated wind speeds from State 8, which limits the maximum wind speed to 10.5 m/s at the upper limit of State 7. This drawback of the method can be overcome by forcing the simulation series to have at least one value from the highest state, which results in a maximum wind speed data series all generated from the highest state with no upper limit. Such a forcing can be considered quite reasonable and it does not affect the transition probability matrix, as the number of data is usually very large (of the order of tens of thousands).

Fig. 7. Cumulative frequency diagram of maximum wind speeds.

Uniformly distributed wind speeds were accepted for the intermediate states, whereas the one-parameter gamma distribution was adopted for the highest state (State 10 in this study). The reason for choosing this distribution is explained below together with a discussion on other distributions.

A distribution with no upper limit should be used for the highest state so that maximum wind speeds higher than those in the observed series can possibly be generated. Therefore, in this study, it was first thought to simulate wind speeds of the highest state by using the exponential distribution shifted to that state, as Sahin and Sen [11] did. This is quite a reasonable choice for simulating the wind speeds in that state. However, it was seen that the exponential distribution generated lower maximum wind speeds compared to those generated by other methods. Therefore, the Gumbel distribution was tested. It was seen that the maximum wind speeds generated by this distribution were too low compared to those obtained by the other methods. The Frechet distribution, which is accepted as the distribution of the maximum wind speeds [1], was also found to be unsuccessful in generating maximum wind speeds compared to other methods. The distribution generated low maximum wind speeds. In the end, the two-parameter gamma distribution was fitted, which resulted again in low maximum wind speeds. Finally, the one-parameter gamma distribution was fitted and results comparable to those of the other methods (in Fig. 7) were obtained.

The conclusion that can be drawn from those trials is that a one-parameter distribution can fit the highest state better than distributions with two or more parameters. If the standard deviation of the highest state, which is bounded by the lower and upper limits of the state, is included in the generation scheme, then lower maximum wind speeds are generated. Therefore, mean-dependent probability distribution functions are better in the simulation of maximum wind speeds.

The transition probability matrix of the wind speed series simulated by the Markov chain is given in Table 4. It is almost the same as its observed counterpart given in Table 3, which means that the Markov chain based simulation technique worked very well in the simulation of the state of the wind speed series.

The wind duration curve is a graph with time percentage as abscissa and wind speed as ordinate (Fig. 8). It is an important tool used in determining the percentage of time that the wind speed exceeds a specified level. Wind energy production systems use this graph in order to determine the wind energy potential of the region under consideration. A very good fit was obtained in Fig. 8, where the wind duration curves of the six methods were plotted together with the one extracted from the observed series. Although the wind duration curve of the Markov chain method fluctuates around the others, it has a fit that is good enough as well.

The first three central moments of the observed and simulated series are given in Table 5. The maximum values and the first five lags of the correlation are also listed. It is seen that the mean values of the simulated series are almost the same as those of the observations. The wavelet-based method approaches its counterparts with a relative error of 0.3%. Standard deviation and variation coefficient were best captured by the normal probability distribution, and AR(1) and AR(2) processes. Skewness coefficient in the wind speed time series was best reproduced by

Table 4
Transition probability matrix of hourly mean wind speed data simulated by Markov chain method

Pij      j = 1    2        3        4        5        6        7        8        9        10
i = 1    0.7049   0.2782   0.0143   0.0016   0.0009   0.0001   0.0000   0.0000   0.0000   0.0000
    2    0.2406   0.6086   0.1309   0.0153   0.0041   0.0005   0.0000   0.0001   0.0000   0.0000
    3    0.0255   0.2845   0.5318   0.1346   0.0179   0.0048   0.0005   0.0003   0.0002   0.0000
    4    0.0042   0.0491   0.3115   0.4958   0.1191   0.0177   0.0023   0.0003   0.0000   0.0000
    5    0.0008   0.0181   0.0860   0.3474   0.4322   0.0982   0.0165   0.0008   0.0000   0.0000
    6    0.0001   0.0089   0.0259   0.1206   0.3421   0.4028   0.0861   0.0113   0.0022   0.0000
    7    0.0001   0.0156   0.0077   0.0478   0.1227   0.3526   0.3491   0.1045   0.0000   0.0000
    8    0.0000   0.0000   0.0000   0.0501   0.0000   0.2197   0.3667   0.2606   0.0777   0.0252
    9    0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.1888   0.3619   0.3720   0.0773
   10    0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   1.0000   0.0000

Fig. 8. Wind duration curve of observed and simulated wind speeds.
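
A wind duration curve like the one in Fig. 8 can be obtained by sorting the hourly speeds in descending order and pairing them with cumulative time percentages. The sketch below is illustrative only: the function names and the plotting position k/n are assumptions made here, since the text does not state which plotting position was used.

```python
import numpy as np

def wind_duration_curve(speeds):
    """Return (percent_of_time, sorted_speeds) for a wind duration curve.

    percent_of_time[k] is the percentage of hours in which the wind speed
    equals or exceeds sorted_speeds[k] (assumed plotting position k/n).
    """
    v = np.sort(np.asarray(speeds, dtype=float))[::-1]  # descending order
    n = v.size
    percent = 100.0 * np.arange(1, n + 1) / n
    return percent, v

def percent_time_exceeding(speeds, level):
    """Percentage of hours with wind speed strictly above `level`."""
    speeds = np.asarray(speeds, dtype=float)
    return 100.0 * np.mean(speeds > level)
```

Plotting the first return value on the abscissa and the second on the ordinate, once for the observed series and once per simulated series, gives the kind of overlay described for Fig. 8.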

Table 5
Statistical characteristics of simulated series

Series    Mean    Standard    Coefficient   Coefficient   Maximum      Correlation coefficient
          (m/s)   deviation   of variation  of skewness   wind speed   r1        r2        r3        r4        r5
                  (m/s)                                   (m/s)
Normal    2.538   1.777       0.700         1.315         26.94        0.0005    −0.0002   0.0005    0.0002    −0.0002
Weibull   2.529   1.634       0.646         0.980         16.22        −0.0003   0.0005    0.0003    −0.0001   −0.0007
AR(1)     2.537   1.776       0.700         1.309         27.30        0.815     0.661     0.536     0.436     0.355
AR(2)     2.537   1.776       0.700         1.313         25.23        0.815     0.677     0.561     0.466     0.387
Markov    2.585   2.058       0.796         0.983         21.29        0.715     0.587     0.483     0.399     0.330
Wavelet   2.566   1.833       0.714         1.438         31.13        0.715     0.580     0.523     0.443     0.421



AR(1). Higher maximums were obtained by the methods of AR(1), wavelet and normal distribution and lower maximums by Weibull distribution. Correlation structure, as discussed earlier, was best simulated by the wavelet-based method.
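
The quantities listed in Table 5 can be computed from any generated series with a few lines of code. The sketch below is a hedged illustration: the function name `summary_statistics` and the particular estimators (sample standard deviation with n − 1, moment-based skewness, lag-k correlation normalised by the total sum of squares) are assumptions made here, as the text does not specify which estimators were used.

```python
import numpy as np

def summary_statistics(x, max_lag=5):
    """Mean, standard deviation, CV, skewness, maximum and lag-1..max_lag correlations."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)              # sample standard deviation (assumed estimator)
    dev = x - mean
    # Skewness: third central moment over the 1.5 power of the second central moment.
    skew = np.mean(dev**3) / np.mean(dev**2) ** 1.5
    denom = np.sum(dev**2)
    # Lag-k correlation coefficient r_k for k = 1..max_lag.
    r = [float(np.sum(dev[:-k] * dev[k:]) / denom) for k in range(1, max_lag + 1)]
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,            # coefficient of variation
        "skewness": skew,
        "max": x.max(),
        "r": r,
    }
```

Calling the function on each simulated series and on the observations produces one row of such a table per series.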

4. Summary and conclusion

In this study, hourly mean wind speed data sets were generated by traditional simulation methods: the normal and Weibull probability distribution functions, the first- and second-order autoregressive processes, and the Markov chain. Additionally, the newly developed wavelet-based approach was used. The normal and Weibull methods generate independent, identically distributed random numbers. The autoregressive models include the correlation structure of the observations and hence generate dependent series. The Markov chain is a two-step method that first determines the state of the wind speed and then generates its magnitude by using a preselected distribution. All the mentioned methods are parametric and they therefore require the time series to have a specific probability distribution. This is a drawback of parametric models more than a limitation. A nonparametric model, of which the wavelet approach in this study is one of the best examples, can be applied to data sets with any distribution. However, it should be kept in mind that the wavelet approach works only with sequences of zero skewness.
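
To make the difference between independent and dependent generation concrete, a first-order autoregressive series can be sketched as follows. This is a textbook AR(1) recursion under stated assumptions, not the authors' code: the function name `generate_ar1` is hypothetical, the innovations are assumed to be standard normal, and no transformation for skewness or for avoiding negative speeds is included.

```python
import numpy as np

def generate_ar1(n, mean, std, r1, rng=None):
    """Generate n values from a stationary AR(1) process.

    mean, std: target mean and standard deviation of the series.
    r1: target lag-one correlation coefficient (|r1| < 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(n)      # assumed normal innovations
    z = np.empty(n)
    z[0] = eps[0]                     # start from the stationary distribution
    scale = np.sqrt(1.0 - r1**2)      # keeps the variance of z equal to 1
    for t in range(1, n):
        z[t] = r1 * z[t - 1] + scale * eps[t]
    return mean + std * z
```

Setting `r1 = 0` reduces the recursion to independent normal random numbers, which corresponds to the normal-distribution method; a nonzero `r1` carries the lag-one correlation into the generated series.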

The correlation structure of the observations, distribution of the maximum wind speeds, wind duration curve and statistical features of the series were used in order to compare the success of the methods.

The generation of maximum wind speeds requires special attention in Markov chain based simulation methods. Based upon the application in this study, it is concluded that the uniform probability distribution function is suitable for use in the first and intermediate states. A probability distribution function with no upper limit should be used for the highest state. It is concluded that the one-parameter gamma distribution is good enough in fitting to the wind speed data in the highest state of the series for normal regions, such as the one used in this study.
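
The two-step Markov chain generation summarised above can be sketched as follows. This is an illustration under explicit assumptions rather than the scheme used in the study: the function name `simulate_markov_speeds` and its parameters are hypothetical, and the one-parameter gamma distribution is assumed here to be a gamma distribution with unit scale whose shape equals the mean excess above the lower limit of the highest state.

```python
import numpy as np

def simulate_markov_speeds(P, edges, n, top_mean_excess, start_state=0, rng=None):
    """Two-step Markov chain generation of n wind speeds.

    P: (m, m) transition probability matrix.
    edges: m increasing lower limits of the states; state i spans
           [edges[i], edges[i + 1]) and the last state has no upper limit.
    top_mean_excess: assumed mean of (speed - edges[-1]) in the highest state.
    start_state: state occupied in the hour before the first generated value.
    """
    rng = np.random.default_rng() if rng is None else rng
    P = np.asarray(P, dtype=float)
    # Renormalise rows so that rounding in a published table does not break sampling.
    P = P / P.sum(axis=1, keepdims=True)
    edges = np.asarray(edges, dtype=float)
    m = len(edges)
    speeds = np.empty(n)
    state = start_state
    for t in range(n):
        # Step 1: draw the next state from row `state` of P.
        state = rng.choice(m, p=P[state])
        # Step 2: draw the magnitude within that state.
        if state < m - 1:
            # Uniform distribution in the first and intermediate states.
            speeds[t] = rng.uniform(edges[state], edges[state + 1])
        else:
            # Assumed one-parameter gamma (unit scale) for the unbounded highest state.
            speeds[t] = edges[-1] + rng.gamma(shape=top_mean_excess, scale=1.0)
    return speeds
```

Because the highest state is sampled from a distribution without an upper limit, the generated maximum can exceed the largest observed value, which is the property the text asks for.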

Some methods performed better in preserving some particular characteristics than other methods did. For example, the wavelet method is obviously the best in preserving the correlation structure of the sequence. This method is as good at preserving other statistical features of the series as other methods. Therefore, in conclusion, the wavelet method is proposed as a tool to substitute for the classical generation schemes for the simulation of hourly mean wind speed data.

Acknowledgements

The wavelet approach presented in this study is a result of an earlier cooperation between the first author (H. Aksoy) and Professor M. Bayazit of Istanbul Technical University, Turkey, whom the authors sincerely thank.


References

[1] Simiu E, Scanlan RH. Wind effects on structures. New York: John Wiley & Sons; 1986.

[2] American Society of Civil Engineers. Minimum design loads for buildings and other structures. ANSI/ASCE 7-93 (Revision of ANSI/ASCE 7-88), New York, 1994.

[3] Simiu E, Filliben JJ, Shaver JR. Short-term records and extreme wind speeds. ASCE Journal of the Structural Division 1982;108(ST11):2571–7.

[4] Cheng EDH, Chiu ANL. Extreme winds simulated from short-period records. ASCE Journal of Structural Engineering 1985;111(1):77–94.

[5] Cheng EDH, Chiu ANL. Extreme winds generated from short records in a tropical cyclone-prone region. Journal of Wind Engineering and Industrial Aerodynamics 1988;28:69–78.

[6] Cheng EDH. Wind data generator: a knowledge-based expert system. Journal of Wind Engineering and Industrial Aerodynamics 1991;38:101–8.

[7] Cheng EDH, Chiu ANL. An expert system for extreme wind simulation. Journal of Wind Engineering and Industrial Aerodynamics 1990;36:1235–43.

[8] Kaminsky FC, Kirchhoff RH, Syu CY, Manwell JF. A comparison of alternative approaches for the synthetic generation of a wind speed time series. Transactions of the ASME 1991;113:280–9.

[9] Sfetsos A. A comparison of various forecasting techniques applied to mean hourly wind speed time series. Renewable Energy 2000;21:23–35.

[10] Dukes MDG, Palutikof JP. Estimation of extreme wind speeds with very long return periods. Journal of Applied Meteorology 1995;34:1950–61.

[11] Sahin AD, Sen Z. First-order Markov chain approach to wind speed modelling. Journal of Wind Engineering and Industrial Aerodynamics 2001;89:263–9.

[12] Castino F, Festa R, Ratto CF. Stochastic modelling of wind velocities time series. Journal of Wind Engineering and Industrial Aerodynamics 1998;74–76:141–51.

[13] Kitagawa T, Nomura T. A wavelet-based method to generate artificial wind fluctuation data. Journal of Wind Engineering and Industrial Aerodynamics 2003;91:943–64.

[14] Garcia A, Torres JL, Prieto E, De Francisco A. Fitting wind speed distributions: a case study. Solar Energy 1998;62(2):139–44.

[15] Grigoriu M. Estimates of design wind from short records. ASCE Journal of the Structural Division 1982;108(ST5):1034–48.

[16] Heckert NA, Simiu E, Whalen T. Estimates of hurricane wind speeds by ‘peaks over threshold’ method. ASCE Journal of Structural Engineering 1998;124(4):445–9.

[17] Lechner A, Simiu E, Heckert NA. Assessment of ‘peaks over threshold’ methods for estimating extreme value distribution tails. Structural Safety 1993;12:305–14.

[18] Pandey MD, Van Gelder PHAJM, Vrijling JK. The estimation of extreme quantiles of wind velocity using L-moments in the peaks-over-threshold approach. Structural Safety 2001;23:179–92.

[19] Simiu E, Heckert NA. Extreme wind distribution tails: a ‘peaks over threshold’ approach. ASCE Journal of Structural Engineering 1996;122(5):539–47.

[20] Stedinger JR, Vogel RM, Foufoula-Georgiou E. Frequency analysis of extreme events. In: Maidment D, editor. Handbook of hydrology. New York: McGraw Hill Book Co; 1993 [Chapter 18].

[21] Rao RM, Bopardikar AJ. Wavelet transforms, introduction to theory and applications. Reading, MA: Addison-Wesley; 1998.

[22] Bayazit M, Aksoy H. Using wavelets for data generation. Journal of Applied Statistics 2001;28(2):157–66.

[23] Bayazit M, Onoz B, Aksoy H. Nonparametric streamflow simulation by wavelet or Fourier analysis. Hydrological Sciences Journal 2001;46(4):623–34.

[24] Aksoy H. Storage capacity for river reservoirs by wavelet-based generation of sequent peak algorithm. Water Resources Management 2001;15(6):423–37.

[25] Aksoy H, Akar T, Unal NE. Wavelet analysis for modeling suspended sediment discharge. Nordic Hydrology 2004;35:165–74.

[26] Unal NE, Aksoy H, Akar T. Annual and monthly rainfall data generation schemes. Stochastic Environmental Research and Risk Assessment 2004;18(6):in press.
