
Virtual Wind Speed Sensor for Wind Turbines

Andrew Kusiak¹; Haiyang Zheng²; and Zijun Zhang³

Abstract: A data-driven approach for development of a virtual wind-speed sensor for wind turbines is presented. The virtual wind-speed sensor is built from historical wind-farm data by data-mining algorithms. Four different data-mining algorithms are used to develop models using wind-speed data collected by anemometers of various wind turbines on a wind farm. The computational results produced by different algorithms are discussed. The neural network (NN) with the multilayer perceptron (MLP) algorithm produced the most accurate wind-speed prediction among all the algorithms tested. Wavelets are employed to denoise the high-frequency wind-speed data measured by anemometers. The models built with data-mining algorithms on the basis of the wavelet-transformed data are to serve as virtual wind-speed sensors for wind turbines. The wind speed generated by a virtual sensor can be used for different purposes, including online monitoring and calibration of the wind-speed sensors, as well as providing reliable wind-speed input to a turbine controller. The approach presented in this paper is applicable to utility-scale wind turbines of any type. DOI: 10.1061/(ASCE)EY.1943-7897.0000035. © 2011 American Society of Civil Engineers.

CE Database subject headings: Turbines; Wind speed; Data collection; Neural networks; Dynamic models; Wavelet; Probe instruments; Statistics.

Author keywords: Wind turbine; Wind speed; Data mining; Neural network; Dynamic modeling; Wavelet transformation; Virtual sensor; Statistical control chart.

Introduction

Wind power has become a leading source of renewable energy and is undergoing a remarkable expansion (U.S. Department of Energy 2008). Wind velocity is one of the key parameters determining the performance of a wind turbine. Besides turbine control, wind speed is widely used in wind-farm siting, power scheduling, and grid management. The fast-growing wind energy industry depends strongly on the accuracy and robustness of wind-speed measurements.

Wind speed is usually measured by a mechanical anemometer installed on top of the nacelle of a wind turbine. Prior research has considered wind-speed measurement, sensor calibration, prediction of wind speed, and control optimization on the basis of the estimated wind speed. Freilich and Vanhoff (2006) compared the accuracy of measurements by WindSat (satellite-based wind velocity measurement), National Data Buoy Center (NDBC) buoys, and the Quick Scatterometer (QuikSCAT). Hsu et al. (2007) modeled wind over the central and southern California coasts and showed the importance of high-resolution wind in coastal ocean modeling. Louka et al. (2008) improved wind-speed forecasts by using Kalman filtering. Qiao et al. (2008) developed a power maximization control method for a doubly fed induction generator (DFIG) wind turbine on the basis of wind-speed estimation. This approach allowed a turbine to produce more power without installing a mechanical anemometer.

Soft sensing applications have been reported in the literature. Pena and Duro (2003) developed a virtual instrument for automatic anemometer calibration using a neural network model. Charfi et al. (2006) developed a soft sensor to estimate atmospheric parameters (temperature and humidity) on the basis of data from an ambient air monitoring station; different topologies of neural networks were studied. Aminian and Aminian (2007) developed a modular analog circuit fault-diagnostic system in which wavelet decomposition and principal component analysis were integrated with neural networks. Chella et al. (2006) presented an approach for fault detection and diagnosis in power systems to detect, identify, and classify faults.

Wind-speed data measured by an anemometer are seldom analyzed for noise. Like other parameters of the wind turbine, wind-speed measurements are recorded at different time intervals (e.g., 10 s and 10 min), thus forming a time series. Wavelets are widely used in signal processing (Kobayashi 1998; Medina et al. 2003; Tang et al. 2000; Walker 1999). Zhang and Ulrych (2002) argued that a wavelet used in signal denoising should be designed on the basis of the characteristics of the signal. However, most research has focused on methods of threshold selection. Dash et al. (2003) used a modified wavelet transform called the “Multiresolution S-transform” to detect the transient features of electric power signals.

Dynamic modeling is widely used in control and engineering systems, chemistry, economics, and ecology (Hannon and Ruth 2001; Kulalowski et al. 2007). A process can be abstractly considered as a dynamic multi-input-single-output (MISO) system. The dynamic MISO model can be extracted from historical process data by data-mining algorithms. Developing an accurate and robust virtual wind-speed sensor is a challenge in the wind energy industry. Data mining and dynamic modeling are promising avenues. Numerous applications of data mining in manufacturing (Backus et al. 2006; Kusiak 2006), marketing (Berry and Linoff 2004; Tan et al. 2006), medical informatics (Kusiak et al. 2005; Seidel et al. 2007), and energy (Kusiak and Song 2006; Kusiak et al. 2009a) have proven to be effective.

¹Dept. of Mechanical and Industrial Engineering, 3131 Seamans Center, Univ. of Iowa, Iowa City, IA 52242-1527 (corresponding author). E-mail: [email protected]

²Dept. of Mechanical and Industrial Engineering, 3131 Seamans Center, Univ. of Iowa, Iowa City, IA 52242-1527.

³Dept. of Mechanical and Industrial Engineering, 3131 Seamans Center, Univ. of Iowa, Iowa City, IA 52242-1527.

Note. This manuscript was submitted on April 30, 2009; approved on September 16, 2010; published online on September 18, 2010. Discussion period open until November 1, 2011; separate discussions must be submitted for individual papers. This paper is part of the Journal of Energy Engineering, Vol. 137, No. 2, June 1, 2011. ©ASCE, ISSN 0733-9402/2011/2-59–69/$25.00.

JOURNAL OF ENERGY ENGINEERING © ASCE / JUNE 2011 / 59

Downloaded 13 Jun 2011 to 128.255.19.134. Redistribution subject to ASCE license or copyright. Visit http://www.ascelibrary.org

Control charts are used in process monitoring to determine and eliminate the sources of process variation. They have been widely studied in the statistical quality and process control literature (Woodall et al. 2004). Recent advances in profile monitoring have led to applications in manufacturing, calibration, logistics, service, marketing, finance, and accounting (Kang and Albin 2000; Mestek et al. 1994; Mitra 1998; Montgomery 2005; Wheeler 2000).

In this paper, data-mining algorithms are used to develop a dynamic model of a wind-speed sensor. The model is built using historical data collected by the Supervisory Control and Data Acquisition (SCADA) system. To enhance model performance, wavelets are used to denoise the wind-speed data. The constructed model provides reliable input to the turbine controller and can be used to monitor the performance of the turbine anemometers.

Data Description and Development of a Virtual Wind-Speed Sensor

Data Characteristics

The data used in this research were generated at a wind farm with more than 100 turbines. Although the data are sampled at high frequency (e.g., every 2 s), they are usually averaged and stored at 10-s or 10-min intervals, referred to in the wind industry as the 10-s or the 10-min average data. The 10-s (high-frequency) SCADA data show the dynamic nature of the wind turbine, while the 10-min data reflect the steady state of the turbine. The data used in this research are high-frequency (10-s average) SCADA data collected over a period of one week on the wind farm.

The parameters collected at each turbine are shown in Table 1. In this paper, one turbine was selected for in-depth analysis, and two additional turbines were selected to validate the proposed methodology and concept. However, the approach presented in this paper can be applied to a wind turbine of any type.

The data set used for the development of a wind-speed sensor was divided into two independent subsets: a training data set and a test data set. The training data set was used to develop models of the wind-speed sensor, while the test data set was used to validate the performance of the models learned from the training data set. Figs. 1(a) and 1(b) illustrate typical plots of 10-min wind-speed data and 10-s wind-speed data, respectively. The low-frequency (10-min) wind-speed plot is smoother than the high-frequency (10-s) wind-speed plot. Turbine control calls for 10-s rather than 10-min wind-speed data; however, the noisy 10-s wind-speed data usually make the control inefficient.

The wind-speed plot in Fig. 1(b) shows frequent, steep speed changes. There could be various reasons for the spikes, including degraded performance of the sensors and unexpected operational conditions around the sensors. Such spikes feed noisy signals to the turbine controller, leading to inferior performance of the wind turbine. An accurate and robust wind-speed sensor model that provides reliable input to the turbine controller is therefore necessary.

Dynamic Modeling

The dynamics of the relationships between the wind speed measured at a turbine and its other SCADA parameters are complex. Development of a quality model for wind-speed prediction on the basis of high-dimensional SCADA data can be accomplished with data-mining algorithms.

A process can be considered as a dynamic system changing over time. The concept of dynamic modeling is used to build a virtual sensor of wind speed. Assume that the system parameter $y(t)$ can be determined on the basis of the previous system status:

$$\{y(t-1), \ldots, y(t-d_y)\},\ \{x_1(t-1), \ldots, x_1(t-d_{x_1})\},\ \ldots,\ \{x_k(t-1), \ldots, x_k(t-d_{x_k})\}.$$

The positive integers $d_y, d_{x_1}, \ldots, d_{x_k}$ are the maximum possible time delays to be considered for the corresponding variables. Here, $y$ is the response (or dependent) variable, e.g., wind speed, and $x_1, x_2, \ldots, x_k$ are the predictors (input variables) of $y$. The dynamic model of the wind-speed sensor will be extracted from the historical process data by data-mining algorithms.

Table 1. Parameter Description of the SCADA Data Set

Parameter   Description                        Unit
T           Generator torque                   Nm
WS          Wind speed                         m/s
P           Power produced                     kW
GS          Generator speed                    rpm
GBTA        Generator bearing A temperature    °C
GBTB        Generator bearing B temperature    °C
BPA1        Blade 1 pitch angle                °
BPA2        Blade 2 pitch angle                °
BPA3        Blade 3 pitch angle                °
YE          Yaw error                          °
RS          Rotor speed                        rpm

Fig. 1. Illustrative plots of the high-frequency and low-frequency wind speed: (a) low-frequency data; (b) high-frequency data

Selecting appropriate predictors for the dynamic model is of critical importance to the performance of the wind-speed sensor model, and data-mining algorithms can perform feature selection. The boosting tree algorithm (Friedman 2001, 2002) can compute the relative importance of a predictor and select important variables for modeling, whereas the wrapper approach (Kohavi and John 1997) combined with a genetic random search (Espinosa et al. 2005) can select the best predictor sets. For the dependent (response) variable $y(t)$, the most important predictors among $\{y(t), y(t-1), \ldots, y(t-d_y)\}$, $\{x_1(t), x_1(t-1), \ldots, x_1(t-d_{x_1})\}$, $\ldots$, $\{x_k(t), x_k(t-1), \ldots, x_k(t-d_{x_k})\}$ are selected.

For ease of discussion, the index sets of time delays used for $y$ need to be defined.

For the response (or dependent) variable $y$, $D_y = \{d_y^{\mathrm{low}}, \ldots, d_y^{\mathrm{high}}\}$ is a set of integers selected from $\{1, \ldots, d_y\}$, related to the previous values of $y$ and arranged in ascending order, with $d_y^{\mathrm{low}} \le d_y^{\mathrm{high}}$. Similarly, $D_{x_1} = \{d_{x_1}^{\mathrm{low}}, \ldots, d_{x_1}^{\mathrm{high}}\}$ is a set selected from $\{1, \ldots, d_{x_1}\}$ for predictors related to $x_1$, and $D_{x_k} = \{d_{x_k}^{\mathrm{low}}, \ldots, d_{x_k}^{\mathrm{high}}\}$ is a set selected from $\{1, \ldots, d_{x_k}\}$ for predictors related to $x_k$. In total there are $k + 1$ individual sets for $y$: $D_y, D_{x_1}, \ldots, D_{x_k}$.

Based on this definition, the response variable $y$ can be expressed as the following dynamic model:

$$y(t) = f\left([y(t-d)]_{d \in D_y},\ [x_1(t-d)]_{d \in D_{x_1}},\ \ldots,\ [x_k(t-d)]_{d \in D_{x_k}}\right) \tag{1}$$

where $[y(t-d)]_{d \in D_y}, [x_1(t-d)]_{d \in D_{x_1}}, \ldots, [x_k(t-d)]_{d \in D_{x_k}}$ involve all possible elements in the corresponding sets. In Eq. (1), $y$ is the dependent (response) variable, wind speed, and $x_1, \ldots, x_k$ are the SCADA parameters used in this dynamic model as predictors. The $x$ parameters (e.g., generator torque and rotor speed) are listed in Table 1. The function $f(\cdot)$ is learned from the historical SCADA data by data-mining algorithms.
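To make the structure of Eq. (1) concrete, the following is a minimal sketch of how delayed values could be assembled into input vectors for a learning algorithm. The function name `build_lagged_rows`, the dictionary layout, and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
def build_lagged_rows(y, xs, d_y, d_xs):
    """Assemble (inputs, target) pairs following the form of Eq. (1).

    y    : list of response values (e.g., wind speed), one per time step
    xs   : dict mapping a predictor name to its list of values
    d_y  : delays D_y applied to y
    d_xs : dict mapping a predictor name to its delays D_x
    """
    max_delay = max(list(d_y) + [d for ds in d_xs.values() for d in ds])
    rows = []
    # The first usable time step is the one for which every delayed value exists.
    for t in range(max_delay, len(y)):
        inputs = [y[t - d] for d in d_y]
        for name, delays in d_xs.items():
            inputs.extend(xs[name][t - d] for d in delays)
        rows.append((inputs, y[t]))
    return rows


# Hypothetical series, for illustration only.
y = [5.0, 5.2, 5.1, 5.4, 5.6]
xs = {"T": [10, 11, 12, 13, 14], "RS": [1, 2, 3, 4, 5]}
rows = build_lagged_rows(y, xs, d_y=[1, 2], d_xs={"T": [1], "RS": [1, 2]})
print(rows[0])  # ([5.2, 5.0, 11, 2, 1], 5.1)
```

Each row pairs the delayed values with the current response, so any regression algorithm that accepts fixed-length input vectors could in principle be used to learn $f(\cdot)$.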

Metrics for Algorithm Selection and Model Performance

The selection of an appropriate data-mining algorithm for building a dynamic model [Eq. (1)] of the wind-speed sensor is critical. Two basic metrics, the mean absolute error (MAE) and the standard deviation of the absolute error (Std), are used to compare the performance of various data-mining algorithms and models. The absolute error (AE), MAE, and Std are expressed in Eqs. (2)–(4):

$$\mathrm{AE} = |\hat{y} - y| \tag{2}$$

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \mathrm{AE}(i)}{N} \tag{3}$$

$$\mathrm{Std} = \sqrt{\frac{\sum_{i=1}^{N} \left(\mathrm{AE}(i) - \mathrm{MAE}\right)^2}{N - 1}} \tag{4}$$

where $\hat{y}$ = predicted wind speed; $y$ = observed (measured by a mechanical anemometer) wind-speed value; and $N$ = number of test data points used to validate the performance of the wind-speed sensor model. Small values of the MAE and Std imply superior prediction performance of the wind-speed sensor model.
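As a worked illustration of Eqs. (2)–(4), the sketch below computes the MAE and Std for a short list of predicted and observed values. The function name and the three sample points are hypothetical; the formulas follow the definitions above, with the $N - 1$ denominator in the standard deviation.

```python
import math


def absolute_error_stats(predicted, observed):
    """Return (MAE, Std) of the absolute error, following Eqs. (2)-(4)."""
    abs_errors = [abs(p - o) for p, o in zip(predicted, observed)]  # Eq. (2)
    n = len(abs_errors)
    mae = sum(abs_errors) / n                                       # Eq. (3)
    std = math.sqrt(sum((e - mae) ** 2 for e in abs_errors) / (n - 1))  # Eq. (4)
    return mae, std


# Hypothetical wind speeds in m/s; the absolute errors are 0.5, 0.0, and 1.0.
mae, std = absolute_error_stats([5.0, 6.0, 7.0], [5.5, 6.0, 8.0])
print(mae, std)  # 0.5 0.5
```

At least two data points are needed, since Eq. (4) divides by $N - 1$.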

Dynamic Modeling of the Wind Speed Sensor: Industrial Case Study

The high-frequency (10-s) SCADA data collected over a period of 7 days on the wind farm are used to validate the methodology presented in this paper. The data set description is shown in Table 2. The runtime of data set 1 in Table 2 considered for analysis begins at “8/8/2007 0:00:00” and ends at “8/15/2007 0:00:00”. During this time period, the overall wind-farm performance was considered normal. Data set 1 was divided into two subsets, data set 2 and data set 3. Data set 2 contains 42,336 time-consecutive data points and was used to develop a wind-speed sensor model with data-mining algorithms. Data set 3 includes 18,145 time-consecutive data points and was used to test the prediction performance of the dynamic model learned from data set 2.
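Because both subsets consist of time-consecutive points, the split is chronological rather than random. A minimal sketch of such a split is given below; the function name is hypothetical, and a list of integers stands in for the 60,481 SCADA records.

```python
def chronological_split(records, n_train):
    """Split time-ordered records into a training head and a test tail."""
    if not 0 < n_train < len(records):
        raise ValueError("n_train must leave both subsets non-empty")
    return records[:n_train], records[n_train:]


# Stand-in for the 60,481 observations of data set 1.
records = list(range(60481))
train, test = chronological_split(records, 42336)
print(len(train), len(test))  # 42336 18145
```

Keeping the test points strictly after the training points avoids evaluating the model on data that precede what it was trained on.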

Feature Selection for Dynamic Model of Wind Speed Sensor

Because of the high dimensionality of the predictors of the dynamic wind-speed model, the number of inputs needs to be reduced. Important predictors are determined by the boosting tree algorithm. The basic idea of the boosting tree algorithm (Friedman 2001, 2002) is to build a number of trees (e.g., binary trees) splitting the data set and approximating the underlying function. The importance of each predictor is measured by its contribution to the prediction accuracy of the training data set (Kusiak et al. 2009b).

The basic steps for computing the predictor importance are as follows:
1. During the building of each tree, for each split, predictor statistics (sums of squares regression) are computed for every predictor variable. The best predictor variable (yielding the best split at the respective node) is selected for the actual split.
2. The averages of the predictor statistics for all variables over all splits and over all trees in the boosting sequence are computed.
3. The final predictor importance values are computed by normalizing those averages so that the highest average is assigned the value of 1. The importance of every other predictor is expressed as the magnitude of its average predictor statistic relative to that of the most important predictor.

There are two ways to build a virtual sensor: one is to predict the wind speed without WS as input, and the other is to build a dynamic model with WS as input. The latter is shown in Eq. (1), whereas the former is expressed in Eq. (5):

$$y(t) = f\left([x_1(t-d)]_{d \in D_{x_1}},\ \ldots,\ [x_k(t-d)]_{d \in D_{x_k}}\right) \tag{5}$$

where $y$ = dependent (response) variable, wind speed; and $x_1, \ldots, x_k$ = predictors in the dynamic model, i.e., the SCADA parameters with WS excluded. The 10 parameters of Table 1 other than WS were evaluated by the boosting tree algorithm, and thus the final predictors $x_1, \ldots, x_k$ were determined. Table 3 and Fig. 2 show the importance of the 10 predictors computed by the boosting tree algorithm on the basis of data set 2 of Table 2. Variable rank is the scaled value (multiplied by 100%) of the importance computed by the boosting tree algorithm.

Table 2. Data Set Description

Data set   Start time stamp      End time stamp        Description
1          8/8/2007 0:00:00      8/15/2007 0:00:00     Total data set; 60,481 observations
2          8/8/2007 0:00:00      8/12/2007 21:35:00    Training data set; 42,336 observations
3          8/12/2007 21:35:10    8/15/2007 0:00:00     Test data set; 18,145 observations

It is important to select predictors $x_1, \ldots, x_k$ with the highest information content among $\{T, P, GS, GBTA, GBTB, BPA1, BPA2, BPA3, YE, RS\}$ to maximize the prediction accuracy of the wind-speed sensor. A threshold value of 0.90 was established heuristically to select the predictors for the wind-speed sensor model, as this value produced good-quality results in computation. A lower threshold value leads to more predictors and greater complexity of the dynamic model. A large number of predictors could result in inferior performance of the extracted models because of the “curse of dimensionality.” Three predictors $\{T, P, RS\}$ were selected from the parameters listed in Table 1, as only these three predictors have values of relative importance larger than the threshold value of 0.9.
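The normalization in step 3 of the importance computation and the threshold rule can be sketched as follows. The function names are hypothetical; the importance values are those reported in Table 3, which are already scaled so that the largest equals 1.

```python
def normalize_importance(avg_stats):
    """Scale average predictor statistics so the largest becomes 1.0."""
    top = max(avg_stats.values())
    return {name: value / top for name, value in avg_stats.items()}


def select_predictors(importance, threshold):
    """Keep predictors whose relative importance exceeds the threshold."""
    return sorted(name for name, value in importance.items() if value > threshold)


# Relative importance values as reported in Table 3.
table3 = {"RS": 1.00, "T": 0.96, "P": 0.92, "GS": 0.88, "BPA1": 0.64,
          "BPA2": 0.63, "BPA3": 0.63, "YE": 0.52, "GBTA": 0.46, "GBTB": 0.35}

selected = select_predictors(normalize_importance(table3), threshold=0.90)
print(selected)  # ['P', 'RS', 'T']
```

With the 0.90 threshold, generator speed (0.88) falls just below the cut, which matches the three predictors retained in the text.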

The second step is to select the specific predictors for the $k + 1$ individual sets for $y$: $D_y, D_{x_1}, \ldots, D_{x_k}$. In this research, the maximum time delay of the response variable $y$ and the predictors is assumed to be 30 s. As the sampling time of the SCADA data is 10 s, one time step corresponds to 10 s, and the three previous values of each of the predictors $\{T, P, RS\}$ are taken into account and evaluated by the boosting tree algorithm for dynamic modeling. Important predictors are selected from the following three predictor sets: $\{T(t), T(t-1), \ldots, T(t-3)\}$, $\{P(t), P(t-1), \ldots, P(t-3)\}$, and $\{RS(t), RS(t-1), \ldots, RS(t-3)\}$. Table 4 and Fig. 3 show the importance of the three predictor sets $\{T, P, RS\}$ computed by the boosting tree algorithm on the basis of data set 2 in Table 2. With the threshold set at 0.9, the predictors for dynamic modeling are reduced from 12 to 5. The final sets of predictors for the dynamic model of the wind-speed sensor are $\{T(t), T(t-1)\}$, $\{P(t), P(t-1)\}$, and $\{RS(t)\}$.

Algorithm Selection for Modeling Wind-Speed Sensor

Four different algorithms were used to learn (identify) the dynamic model of wind speed [Eq. (5)]: the multilayer perceptron neural network (MLP NN) (Bishop 1995; Espinosa et al. 2005), random forest (Breiman 2001), support vector machine regression (SVMreg) (Shevade et al. 2000; Smola and Schoelkopf 2004), and classification and regression (C&R) tree (Breiman et al. 1984; Tan et al. 2006). MLP NN algorithms are usually used in nonlinear regression and classification modeling because of their ability to capture complex relationships between parameters. The random forest algorithm grows many classification trees to classify a new object from an input vector. Each tree “votes” for a class, and the forest chooses the classification having the most votes over all the trees in the forest. The SVM is a supervised learning algorithm used in classification and regression. It constructs a linear discriminant function, separating instances (examples) as widely as possible. A C&R tree builds a decision tree to predict either classes (classification) or Gaussians (regression).

In this research, 35 MLP NN models with different activation functions and structures were built, and the most accurate and robust model was selected. Five different activation functions were used for the hidden and output neurons, namely, the logistic, identity, tanh, exponential, and sine functions. The number of hidden units ranged from 3 to 11, and the weight decay for both the hidden and output layers varied from 0.0001 to 0.001. The best trained MLP NN has a 5-11-1 three-layer structure, i.e., 5 inputs, 11 hidden units, and 1 output. It uses the logistic and identity activation functions for the hidden and output neurons, respectively.
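For orientation, a forward pass through a network with the 5-11-1 structure (logistic hidden units, identity output) can be sketched as below. The weights here are random placeholders rather than trained values, the input vector is hypothetical, and the training procedure used in the paper is not reproduced.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: logistic hidden layer, identity output neuron."""
    hidden = [logistic(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


random.seed(0)
n_in, n_hidden = 5, 11  # 5 inputs, 11 hidden units, 1 output
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_in)]
            for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.0

# Hypothetical, already-scaled inputs: T(t), T(t-1), P(t), P(t-1), RS(t).
x = [0.4, 0.4, 0.6, 0.6, 0.5]
y_hat = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

In practice the weights would be fitted to the training data set, with weight decay acting as a penalty on their magnitude.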

The MAE in Eq. (3) and the Std in Eq. (4) were used as the two main metrics to select the most suitable algorithm for building the dynamic model [Eq. (5)]. Small values of the MAE and Std indicate satisfactory prediction performance. Table 5 summarizes the prediction accuracy of the models learned by different data-mining algorithms using data set 3 of Table 2. Fig. 4 shows a bar chart of the absolute error statistics of the wind-speed sensor models learned by different data-mining algorithms. The MLP algorithm performed best among the four data-mining algorithms, with the smallest MAE and Std, whereas SVMreg produced the largest MAE. The MLP algorithm provided high-quality predictions for the test data set of Table 2 and captured the wind dynamics with high fidelity.

Fig. 5 shows the first 305 data points of wind speed predicted by the MLP algorithm and the observed wind speed for the test data set of Table 2. The observed and predicted wind speeds are almost identical, and the predicted wind speed closely follows the trend of the observed wind speed. The virtual wind-speed sensor, based on dynamic modeling and built by the MLP algorithm, accurately predicts wind velocity in data set 3 of Table 2.

Extended Validation of the Proposed Sensor Development Concept

The MLP NN algorithm performed best among the four data-mining algorithms, and thus it was selected for further investigation on the data sets of two other turbines. The methodology of a virtual wind-speed sensor developed on the basis of dynamic modeling is validated using SCADA data from two additional wind turbines, which were randomly selected from 100 turbines on the wind farm. Table 6 shows the data sets of turbines 2 and 3, with the same beginning time stamp of “8/8/2007 0:00:00” and the same ending time stamp of “8/15/2007 0:00:00” as those of the data set in Table 2. Data set 1 of turbine 2 and data set 4 of turbine 3 were each divided into two separate data subsets: data sets 2 and 3, and data sets 5 and 6, respectively. All the data sets contain consecutive data points drawn in a time sequence.

Table 3. Variable Rank and Importance of the Predictors of Table 1

Parameter   Variable rank (%)   Importance
RS          100                 1.00
T           96                  0.96
P           92                  0.92
GS          88                  0.88
BPA1        64                  0.64
BPA2        63                  0.63
BPA3        63                  0.63
YE          52                  0.52
GBTA        46                  0.46
GBTB        35                  0.35

Fig. 2. Variable rank of the predictors from Table 1 computed by the boosting tree algorithm

Table 4. Variable Rank and Importance of Predictors of the Dynamic Model [Eq. (5)]

Parameter   Variable rank (%)   Importance
T(t)        100                 1.00
P(t)        99                  0.99
P(t-1)      95                  0.95
T(t-1)      95                  0.95
RS(t)       93                  0.93
P(t-2)      88                  0.88
T(t-2)      88                  0.88
P(t-3)      86                  0.86
T(t-3)      86                  0.86
RS(t-1)     86                  0.86
RS(t-2)     83                  0.83
RS(t-3)     81                  0.81

The SCADA data for turbines operating in abnormal conditions were removed, and therefore the number of data points varies between turbines. Abnormal conditions include turbine maintenance, low wind speed, turbine shutdown because of high wind speed, and sensor malfunctions. Data set 2 of turbine 2, containing 42,305 time-consecutive data points, and data set 5 of turbine 3, containing 42,050 time-consecutive data points, were used to develop a model $f(\cdot)$ estimating the wind speed. Data set 3 of turbine 2, containing 18,078 time-consecutive data points, and data set 6 of turbine 3, containing 18,124 time-consecutive data points, were used to test the performance of the models $f(\cdot)$ learned from data set 2 and data set 5.

Figs. 6(a) and 6(b) show the predicted (by MLP) and observed wind speed of turbines 2 and 3 for the test data sets of Table 6, respectively. The observed and predicted values of wind speed match quite well.

Table 7 summarizes the prediction accuracy of the wind-speed models built by the MLP algorithm. The results in Table 7 produced by the MLP algorithm for turbines 2 and 3 are consistent with the results for turbine 1 shown in Table 5. A large prediction error (MAE and Std) indicates that the model built on historical data needs to be updated. How often an update is needed depends on various factors, e.g., seasonal and rapid changes in weather conditions.

The sensitivity (Berry and Linoff 2004) of a neural network’s output to its input perturbation is of importance to virtual sensor development. The sum of squared residuals is computed for the model when the respective predictor is eliminated from the neural net; ratios (of the full model versus the reduced model) are also reported, and the predictors can be sorted by their importance or relevance for the particular neural net. The sensitivity analysis (Zheng and Yeung 2001) identifies input variables that are important and those that can be safely ignored. Table 8 and Fig. 7 show the sensitivity analysis results of the wind-speed models of two turbines built by the MLP. The same predictors have

Fig. 3. Variable rank of the predictors for dynamic model of Eq. (5) computed by the boosting tree algorithm

Table 5. Statistics of Prediction Accuracy of Different Algorithms

| Algorithm | MAE (m/s) | Std (m/s) | Max (m/s) | Min (m/s) |
|---|---|---|---|---|
| SVMreg | 0.579 | 0.368 | 4.260 | 0.001 |
| MLP | 0.299 | 0.248 | 3.605 | 0.000 |
| Random forest | 0.432 | 0.369 | 3.854 | 0.000 |
| C&R tree | 0.492 | 0.494 | 8.574 | 0.000 |

Fig. 4. Bar chart of prediction performance of different algorithms for test data set 3 of Table 2

Fig. 5. Predicted and observed values of wind speed of the test data set from Table 2

JOURNAL OF ENERGY ENGINEERING © ASCE / JUNE 2011 / 63

Downloaded 13 Jun 2011 to 128.255.19.134. Redistribution subject to ASCE license or copyright. Visithttp://www.ascelibrary.org


different sensitivity ranks in the wind-speed models for different turbines. The sensitivity analysis offers insight into the nonlinear relationship between various SCADA parameters of a utility-scale wind turbine.
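The sensitivity ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions: "eliminating" a predictor is approximated by replacing its column with the column mean, the ratio is taken as the sum of squared residuals of the reduced model over that of the full model (so larger values mean more important inputs, in line with the magnitudes in Table 8), and a simple linear function stands in for the trained MLP. The names `sensitivity_ratios` and `toy_model` are hypothetical.

```python
import numpy as np

def sensitivity_ratios(predict, X, y):
    """SSE(reduced) / SSE(full) for each predictor column of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sse_full = np.sum((y - predict(X)) ** 2)
    ratios = []
    for j in range(X.shape[1]):
        X_reduced = X.copy()
        # Neutralize predictor j by fixing it at its mean (assumption)
        X_reduced[:, j] = X[:, j].mean()
        sse_reduced = np.sum((y - predict(X_reduced)) ** 2)
        ratios.append(sse_reduced / sse_full)
    return np.array(ratios)

def toy_model(X):
    """Stand-in for a trained network: strong dependence on column 0, weak on column 1."""
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = toy_model(X) + rng.normal(scale=0.1, size=500)

ratios = sensitivity_ratios(toy_model, X, y)
ranking = np.argsort(ratios)[::-1]  # most important predictor first
print(ratios, ranking)
```

Sorting the ratios gives the kind of predictor ranking shown in Table 8 and Fig. 7; the ranking can differ from turbine to turbine because each turbine has its own trained model.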

Virtual Sensor Based on Wavelet-Transformed Data

Though the virtual wind-speed sensor developed on the basis of dynamic modeling and data mining presented in the section “Dynamic Modeling of the Wind Speed Sensor: Industrial Case Study” performs well, the noisiness of the data deserves attention.

Wavelet Transformation of Wind Speed

Data quality impacts the accuracy of the models built on the basis of a data-driven approach. In many practical applications, for example, in medicine and energy, the data may not be of the desired quality for various reasons: nonstandard data collection processes, sensor reading errors, missing values, and system breakdowns. To deal with noisy data, a method must be selected on the basis of the properties of the data set. In this case, all SCADA parameters are continuous and recorded every 10 s, forming a set of time series (signals). An example signal of wind speed is shown in Fig. 8. The wind speed oscillates around some values, and those oscillations can be regarded as noise in the signal.

Moving average and wavelets are widely used in signal processing. The moving average method is easy to implement, but it leads to the “lagging” phenomenon. Wavelets usually perform better than the moving average in signal processing.
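The “lagging” phenomenon of the moving average can be seen in a small sketch. It assumes a causal (trailing) window, which is what an online filter would have to use, and a synthetic step in wind speed; the window length of six samples and the function name `trailing_moving_average` are illustrative choices, not values from the paper.

```python
import numpy as np

def trailing_moving_average(signal, window):
    """Causal moving average: each output uses only the current and past samples."""
    signal = np.asarray(signal, dtype=float)
    out = np.empty_like(signal)
    for t in range(len(signal)):
        start = max(0, t - window + 1)
        out[t] = signal[start:t + 1].mean()
    return out

# Synthetic wind speed: 6 m/s that jumps to 10 m/s at sample 20
wind = np.concatenate([np.full(20, 6.0), np.full(20, 10.0)])
smoothed = trailing_moving_average(wind, window=6)

# First sample at which the smoothed signal has fully reached the new level
settle_index = int(np.argmax(smoothed >= 10.0))
print(smoothed[20], settle_index)
```

The true signal changes at sample 20, but the smoothed signal only reaches the new level at sample 25, i.e., `window - 1` samples later. With 10-s SCADA data this would correspond to a delay of 50 s for this window length.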

The selection of a suitable wavelet and the threshold setting for signal denoising are two important elements for wavelet applications. Although there are numerous wavelets, the most frequently used wavelets are the Haar and Daubechies (Daub) series (Kobayashi 1998). Meng et al. (2008) presented a search method for the optimal threshold by minimizing the mean square error.

Table 6. Description of Data Set of Two Additional Turbines

| Turbine number | Data set | Start time stamp | End time stamp | Description |
|---|---|---|---|---|
| 2 | 1 | 8/8/2007 0:00:00 | 8/15/2007 0:00:00 | Total data set; 60,383 observations |
| 2 | 2 | 8/8/2007 0:00:00 | 8/12/2007 21:35:00 | Training data set; 42,305 observations |
| 2 | 3 | 8/12/2007 21:35:10 | 8/15/2007 0:00:00 | Test data set; 18,078 observations |
| 3 | 4 | 8/8/2007 0:00:00 | 8/15/2007 0:00:00 | Total data set; 60,174 observations |
| 3 | 5 | 8/8/2007 0:00:00 | 8/12/2007 21:35:00 | Training data set; 42,050 observations |
| 3 | 6 | 8/12/2007 21:35:10 | 8/15/2007 0:00:00 | Test data set; 18,124 observations |

Table 7. Prediction Accuracy of the MLP NN Algorithm on the Test Data Set of Table 6

| Turbine number | MAE (m/s) | Std (m/s) | Max (m/s) | Min (m/s) |
|---|---|---|---|---|
| 2 | 0.286 | 0.237 | 2.941 | 0.000 |
| 3 | 0.297 | 0.238 | 3.396 | 0.000 |

Table 8. Sensitivity Analysis of MLP NN Constructed on the Basis of Test Data Set of Table 6

| Parameter | Sensitivity (turbine 2) | Sensitivity (turbine 3) |
|---|---|---|
| RS(t) | 16.306 | 22.709 |
| T(t) | 6.074 | 17.175 |
| T(t − 1) | 5.469 | 5.761 |
| P(t) | 7.846 | 5.183 |
| P(t − 1) | 11.177 | 3.519 |

Fig. 6. Predicted and observed values of wind speed for data sets 3 and 6 of Table 6: (a) turbine 2; (b) turbine 3

Fig. 7. Sensitivity analysis of MLP NNs constructed for turbines 2 and 3 in Table 6


Medina et al. (2003) applied a neural network for selecting the suitable threshold. Chen and Bui (2003) claimed to achieve good denoising results by using multiwavelets and considering the neighbor coefficients. Minimizing the square error between the denoised signal and the clean signal is the most widely used criterion in denoising. The ultimate goal of denoising signals in this paper is to improve the accuracy and robustness of the wind-speed sensor model. The threshold scheme of wavelets should be selected on the basis of the final application goal.

Different threshold schemes of Daub4 wavelets with 12 levels have been applied in this research. Two issues emerge: one is how to choose the threshold, and the other is how to perform the thresholding. Hard and soft thresholding (Kobayashi 1998; Tang et al. 2000) could produce different denoising results. Hard thresholding is the usual process of setting to zero the elements whose absolute values are lower than the threshold. Soft thresholding is an extension of hard thresholding, first setting to zero the elements whose absolute values are lower than the threshold, and then shrinking the nonzero coefficients toward zero. Various threshold selection rules exist; the three most widely used ones are Fix, Rigrsure, and Heursure (Kobayashi 1998). Fix uses a fixed threshold multiplied by a small factor; Rigrsure is the threshold selection based on the principle of Stein’s unbiased risk estimate (SURE) (Stein 1981); Heursure is a mixture of the rules of Rigrsure and Fix. In this paper, three threshold selection rules and two thresholding methods are combined to yield six different threshold settings, namely Fix_Soft, Fix_Hard, Rigrsure_Soft, Rigrsure_Hard, Heursure_Soft, and Heursure_Hard.
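The difference between hard and soft thresholding can be made concrete with a short sketch. It operates on an arbitrary array of wavelet detail coefficients and takes the threshold value as given; how the threshold is chosen (Fix, Rigrsure, or Heursure) is outside the scope of this example, and the function names are illustrative.

```python
import numpy as np

def hard_threshold(coeffs, thr):
    """Zero every coefficient whose absolute value is below thr; keep the rest unchanged."""
    c = np.asarray(coeffs, dtype=float)
    return np.where(np.abs(c) >= thr, c, 0.0)

def soft_threshold(coeffs, thr):
    """Zero small coefficients, then shrink the survivors toward zero by thr."""
    c = np.asarray(coeffs, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

coeffs = np.array([-3.0, -0.4, 0.2, 1.5])
hard = hard_threshold(coeffs, 1.0)   # large coefficients pass through untouched
soft = soft_threshold(coeffs, 1.0)   # large coefficients are reduced in magnitude by 1.0
print(hard, soft)
```

Hard thresholding leaves a jump at the threshold, whereas soft thresholding is continuous in the coefficient value, which is one reason the two can give different denoising results.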

The wind-speed data after wavelet transformation should maintain the high fidelity of the original data. In this research, MAE and Std are used as the main metrics to measure the fidelity of the wavelets with different threshold settings. Table 9 shows the statistics of the denoising results of different wavelet transformation schemes. Of the six settings, Heursure_Soft yields the lowest MAE (0.365 m/s) and Std (0.235 m/s).

Fig. 9 shows the first 800 denoised (by wavelet with the Heursure_Soft thresholding method) and observed (by mechanical anemometer) wind-speed values of data set 2 in Table 2. The denoised wind speed changes continuously without steep changes. The denoised wind speed follows the observed wind speed accurately without the “lag” phenomenon.

Wavelet-Based Virtual Sensor Model

The wavelet transformation can denoise the wind speed and maintain high fidelity; however, wavelets cannot be used in an online mode because of their computational complexity and the requirement of a large volume of data for wavelet analysis. If the denoised wind speed could be generated in an online mode, the turbine control system would be the major beneficiary.

To cope with this online problem, data mining could be used to build a wind-speed sensor model on the basis of the wavelet-transformed data. The wavelet-based model is shown in Eq. (6):

$$
y_w(t) = f\left( [y(t-d)]_{d \in D_y},\ [x_1(t-d)]_{d \in D_{x_1}},\ \ldots,\ [x_k(t-d)]_{d \in D_{x_k}} \right) \tag{6}
$$

where $y_w$ = dependent variable, the wind speed after wavelet transformation; $y$ = current or past wind speed (WS) collected by the SCADA system; and $x$ = predictors (with WS excluded) in the dynamic modeling. The basic steps to implement the wavelet-based wind-speed sensor model are as follows:
1. Preprocess current SCADA data of a certain period, e.g., one week, into a suitable format.
2. Transform the wind-speed data by the Daub4 wavelet.
3. Learn the dynamic model [Eq. (6)] of wavelet-transformed wind speed by data-mining algorithms from the training data set.

In this case, the final predictors selected by the boosting tree algorithm are {WS(t − 1), WS(t − 2)}, {T(t), T(t − 1)}, {P(t), P(t − 1)}, and {RS(t)}. The target and output of the wavelet-based virtual sensor model is the wavelet-transformed wind speed, rather than the original wind speed measured by the anemometer. However, the input of the model is still the original SCADA data (without transformation). The MLP NN algorithm, which performed well in “Dynamic Modeling of the Wind Speed Sensor: Industrial Case Study,” is selected to model the wavelet-based wind-speed sensor from the training data set in Table 2. The performance of the trained model on the test data set of Table 2 is reported in Table 10. Fig. 10 shows the first 400 values of the wavelet-denoised wind speed and the wind speed predicted by the model of Eq. (6) (learned by the MLP NN algorithm) for data set 3 of Table 2. The wavelet-based wind-speed model accurately and robustly predicted the denoised wind speed. The model learned by the MLP NN algorithm could generate denoised wind speed in an online mode. The target of the dynamic wind-speed model in the section

Fig. 8. Time series of 10-s wind speed

Table 9. Performance Statistics of Wavelets with Different Thresholding Methods

| Thresholding method | MAE (m/s) | Std (m/s) | Max (m/s) | Min (m/s) |
|---|---|---|---|---|
| Fix_Soft | 0.634 | 0.552 | 5.736 | 0.000 |
| Fix_Hard | 0.704 | 0.546 | 7.548 | 0.000 |
| Rigrsure_Soft | 0.512 | 0.474 | 5.330 | 0.000 |
| Rigrsure_Hard | 0.568 | 0.482 | 5.469 | 0.000 |
| Heursure_Soft | 0.365 | 0.235 | 3.089 | 0.000 |
| Heursure_Hard | 0.396 | 0.295 | 3.384 | 0.000 |

Fig. 9. Original and denoised wind speed (by wavelet with Heursure_Soft threshold setting) of data set 2 in Table 2

“Dynamic Modeling of the Wind Speed Sensor: Industrial Case Study” is the original SCADA wind speed with noise, whereas that of the wavelet-based wind-speed model is the denoised, wavelet-transformed wind speed. For this reason, the accuracy of the wavelet-based virtual wind-speed sensor has significantly improved (an MAE of 0.0275 m/s in Table 10 compared with 0.299 m/s for the MLP in Table 5).

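The predictor set selected for Eq. (6) can be assembled into a training matrix as in the following sketch. It assumes four equally sampled SCADA channels named after the symbols WS, T, P, and RS used in the text (their physical definitions are given elsewhere in the paper); the function name `build_lagged_inputs` and the toy data are hypothetical. The model target for row `t` would be the wavelet-denoised wind speed at the same index, while the inputs stay untransformed.

```python
import numpy as np

def build_lagged_inputs(ws, T, P, RS):
    """Build rows [WS(t-1), WS(t-2), T(t), T(t-1), P(t), P(t-1), RS(t)] for t = 2..n-1."""
    ws, T, P, RS = (np.asarray(a, dtype=float) for a in (ws, T, P, RS))
    t = np.arange(2, len(ws))       # the first usable index needs WS(t-2)
    X = np.column_stack([
        ws[t - 1], ws[t - 2],       # past raw anemometer readings
        T[t], T[t - 1],
        P[t], P[t - 1],
        RS[t],
    ])
    return X, t

# Toy channels with recognizable values so the lag structure is easy to read
n = 6
ws = np.arange(n, dtype=float)
T = 10.0 + np.arange(n)
P = 100.0 + np.arange(n)
RS = 1000.0 + np.arange(n)

X, t = build_lagged_inputs(ws, T, P, RS)
print(X[0], t)
```

Given a denoised series `y_w` of the same length, the training pairs would be `(X, y_w[t])`; any regression learner could then be fitted to them, with the MLP being the one chosen in the paper.
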
Wind Speed Sensor Monitoring

Control Chart-Based Monitoring

The anemometer installed on a wind turbine degrades over time. This leads to inferior turbine performance and poor power quality attributable to incorrect wind-speed input to the turbine controller. A formal approach for online monitoring of wind-speed sensors is necessary. The wavelet-based models of wind-speed sensors and control charts borrowed from statistical process control theory can be used to detect and remedy performance anomalies.

Fig. 11 illustrates the basic concept of wind-speed sensor modeling and online monitoring presented in this paper. A data-mining algorithm is used to identify a dynamic sensor model from the historical wavelet-transformed SCADA data. The model can be updated to reflect the process changing over time or refreshed once the model performance has degraded. The update frequency could be, for example, one month. The operational update frequency depends on the wind turbine operational conditions and the accuracy requirements. At the same time, a control chart is generated from the wavelet-transformed SCADA data and the dynamic model of the wind-speed sensor and can be further used for online monitoring of a mechanical anemometer. The virtual wind-speed sensor and control chart monitor the anemometer performance at a certain time interval, e.g., every 20 s.

The residual control chart approach (statistical quality control) (Kang and Albin 2000; Mitra 1998) is used to analyze residuals between the virtual sensor value and the observed value of the wind speed (measured by the sensor). The residual ε is expressed in Eq. (7) (Kang and Albin 2000):

$$
\varepsilon = y - \hat{y} \tag{7}
$$

where $y$ = observed wind speed; and $\hat{y}$ = reference value predicted by the wind-speed sensor model.

The control chart approach (Kusiak et al. 2009a; Mitra 1998; Montgomery 2005) allows the residuals and their variations to be monitored, and thus can detect abnormal conditions and anemometer faults. A training data set of $N_{\mathrm{train}}$ observations with outliers removed was selected to build a control chart. The training data set is represented as $y_{\mathrm{TrainSet}} = [y(i), \hat{y}(i)]$, $i = 1, \ldots, N_{\mathrm{train}}$.

Using the training data set, the residual ε for each point is computed, as well as the mean and the standard deviation of ε. The mean residual $\mu_{\mathrm{train}}$ and the standard deviation $\sigma_{\mathrm{train}}$ are shown in Eq. (8) (Montgomery 2005):

$$
\mu_{\mathrm{train}} = \frac{1}{N_{\mathrm{train}}} \sum_{i=1}^{N_{\mathrm{train}}} \left[ y(i) - \hat{y}(i) \right], \qquad
\sigma_{\mathrm{train}} = \sqrt{ \frac{1}{N_{\mathrm{train}} - 1} \sum_{i=1}^{N_{\mathrm{train}}} \left\{ \left[ y(i) - \hat{y}(i) \right] - \mu_{\mathrm{train}} \right\}^2 } \tag{8}
$$

The test data set $y_{\mathrm{TestSet}} = [y(i), \hat{y}(i)]$ includes $N_{\mathrm{test}}$ consecutive data points drawn in time sequence from the test data set.

Similarly, the mean residual $\mu_{\mathrm{test}}$ and the standard deviation $\sigma_{\mathrm{test}}$ of the test data set are expressed in Eq. (9) (Montgomery 2005):

$$
\mu_{\mathrm{test}} = \frac{1}{N_{\mathrm{test}}} \sum_{i=1}^{N_{\mathrm{test}}} \left[ y(i) - \hat{y}(i) \right], \qquad
\sigma_{\mathrm{test}} = \sqrt{ \frac{1}{N_{\mathrm{test}} - 1} \sum_{i=1}^{N_{\mathrm{test}}} \left\{ \left[ y(i) - \hat{y}(i) \right] - \mu_{\mathrm{test}} \right\}^2 } \tag{9}
$$

Table 10. Prediction Performance Statistics of Wavelet-Based Wind-Speed Sensor Model

| MAE (m/s) | Std (m/s) | Max (m/s) | Min (m/s) |
|---|---|---|---|
| 0.0275 | 0.054 | 1.389 | 0.000 |

Fig. 10. Denoised and predicted wind speed of data set 3 from Table 2

Fig. 11. Framework for modeling and online monitoring of wind-speed sensors


Once $\mu_{\mathrm{train}}$ and $\sigma_{\mathrm{train}}$ are known, the upper and lower control limits of the control chart are computed and used to detect anomalies. On the basis of Eq. (8), the control limits of the control chart are derived from Eq. (10) (Montgomery 2005):

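$$
\mathrm{UCL}_1 = \mu_{\mathrm{train}} + \eta \frac{\sigma_{\mathrm{train}}}{\sqrt{N_{\mathrm{test}}}}, \qquad
\mathrm{CenterLine}_1 = \mu_{\mathrm{train}}, \qquad
\mathrm{LCL}_1 = \mu_{\mathrm{train}} - \eta \frac{\sigma_{\mathrm{train}}}{\sqrt{N_{\mathrm{test}}}} \tag{10}
$$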

Here, $N_{\mathrm{test}}$ = number of points in $y_{\mathrm{TestSet}}$; η (usually fixed at 3) is the integer multiple for the control limits and can be adjusted to make the control chart less sensitive to the data variability and thus reduce the risk of false alarms. If $\mu_{\mathrm{test}}$ is higher than UCL1 or lower than LCL1, the wind speed at the sampling time of $y_{\mathrm{TestSet}}$ is considered to be deficient, and this type of fault is defined as fault type I in this paper.
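The residual-mean chart of Eqs. (7)–(10) can be sketched in a few lines. This is a minimal illustration, assuming the residuals (observed minus predicted wind speed) have already been computed; the function names and the toy residuals are hypothetical, and η = 3 follows the usual choice mentioned above.

```python
import numpy as np

def mean_chart_limits(train_residuals, n_test, eta=3.0):
    """Return (LCL1, CenterLine1, UCL1) for the mean of n_test test residuals."""
    r = np.asarray(train_residuals, dtype=float)
    mu_train = r.mean()
    sigma_train = r.std(ddof=1)                 # N_train - 1 in the denominator
    half_width = eta * sigma_train / np.sqrt(n_test)
    return mu_train - half_width, mu_train, mu_train + half_width

def is_fault_type_1(test_residuals, lcl1, ucl1):
    """Flag the sample when its mean residual leaves [LCL1, UCL1]."""
    mu_test = float(np.mean(test_residuals))
    return bool(mu_test > ucl1 or mu_test < lcl1)

# Toy training residuals (m/s), for illustration only
train_res = np.array([-0.2, 0.1, 0.0, 0.3, -0.1, -0.1])
lcl1, center1, ucl1 = mean_chart_limits(train_res, n_test=4)

healthy = is_fault_type_1([0.1, -0.1, 0.0, 0.05], lcl1, ucl1)
drifted = is_fault_type_1([0.5, 0.4, 0.6, 0.5], lcl1, ucl1)
print(lcl1, center1, ucl1, healthy, drifted)
```

Raising `eta` widens the limits, which lowers the false-alarm rate at the cost of slower detection of a drifting anemometer.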

Similarly, the control limits for $\sigma^2_{\mathrm{test}}$ are calculated from Eq. (11) (Montgomery 2005):

$$
\mathrm{UCL}_2 = \frac{\sigma^2_{\mathrm{train}}}{N_{\mathrm{test}} - 1} \times \chi^2_{\alpha/2,\, N_{\mathrm{test}} - 1}, \qquad
\mathrm{CenterLine}_2 = \sigma^2_{\mathrm{train}}, \qquad
\mathrm{LCL}_2 = 0 \tag{11}
$$

where $\chi^2_{\alpha/2,\, N_{\mathrm{test}} - 1}$ denotes the right α/2 percentage point of the chi-square distribution; and $N_{\mathrm{test}} - 1$ = degrees of freedom of the chi-square distribution. The parameter α needs to be adjusted to make the control chart less sensitive to the variability of the data.
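The variance limit of Eq. (11) can be sketched with the standard library alone. Two assumptions are made here and are not from the paper: α is treated as a tail probability between 0 and 1, and the chi-square percentage point is obtained from the Wilson–Hilferty approximation rather than from statistical tables or a statistics package. The function names and toy residuals are illustrative.

```python
from statistics import NormalDist, variance

def chi2_upper_quantile(tail_prob, dof):
    """Approximate the value exceeded with probability tail_prob (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1.0 - tail_prob)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3

def variance_chart_ucl(sigma2_train, n_test, alpha):
    """UCL2 of Eq. (11): sigma_train^2 / (N_test - 1) times the chi-square point."""
    dof = n_test - 1
    return sigma2_train / dof * chi2_upper_quantile(alpha / 2.0, dof)

def is_fault_type_2(test_residuals, ucl2):
    """Flag the sample when its residual variance exceeds UCL2 (LCL2 is 0)."""
    return variance(test_residuals) > ucl2

q = chi2_upper_quantile(0.025, 10)          # tabulated value is about 20.48
ucl2 = variance_chart_ucl(sigma2_train=0.04, n_test=11, alpha=0.05)

steady = is_fault_type_2([0.01, -0.01] * 5 + [0.0], ucl2)
erratic = is_fault_type_2([0.5, -0.5, 0.6, -0.6, 0.4, -0.4, 0.5, -0.5, 0.6, -0.6, 0.0], ucl2)
print(q, ucl2, steady, erratic)
```

A smaller α moves the percentage point further into the tail and so raises UCL2, making the chart less sensitive to ordinary variability in the residuals.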

Table 11. Limit Values for the Control Chart of the Wind-Speed Sensor

| UCL1 | LCL1 | UCL2 | LCL2 |
|---|---|---|---|
| 1.2330 | −1.2732 | 3.1898 | 0.0000 |

Fig. 12. Online monitoring chart of wind speed with upper and lower limits of data set 3 of Table 2: (a) test data starting at “8/10/2007 23:22:00” and ending at “8/10/2007 4:28:50” for data set 3 of Table 2; (b) test data starting at “8/12/2007 8:54:30” and ending at “8/12/2007 9:27:00” for data set 3 of Table 2


LCL2 is set to 0, which corresponds to zero variation of the residuals in the test data, i.e., the wind speed measured by the mechanical anemometer matches the reference virtual sensor value in the normal status. If $\sigma^2_{\mathrm{test}}$ is higher than UCL2, the wind-speed value at the sampling time of $y_{\mathrm{TestSet}}$ is considered deficient, and this type of fault is defined as fault type II in this paper.

On-Line Monitoring: Industrial Example

The wavelet-based virtual wind-speed sensor was discussed in “Wavelet-Based Virtual Sensor Model.” Online monitoring simulation of the wind-speed sensor models is performed and validated using the test data set of Table 2. Table 11 shows the control chart parameters of the wind-speed sensor models generated from the training data set of Table 2. The parameter α in Eq. (11) is fixed at 2, and η of Eq. (10) is set to 3 to enhance the confidence of detecting anomalies in the mechanical anemometer. However, these two important parameters α and η can be adjusted according to the specific application scenario to reduce the risk of false alarms. A control chart monitors both the residual mean and the residual variation of the sampling data at the same time.
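As a consistency check on Table 11, Eq. (10) implies that the center line is the midpoint of UCL1 and LCL1 and that half of their distance equals the η-scaled term. The worked arithmetic below uses only the tabulated limits and η = 3; since $N_{\mathrm{test}}$ is not stated in this section, $\sigma_{\mathrm{train}}$ itself is not recovered.

```latex
\mu_{\mathrm{train}} = \frac{\mathrm{UCL}_1 + \mathrm{LCL}_1}{2}
  = \frac{1.2330 + (-1.2732)}{2} = -0.0201
\qquad
\eta \frac{\sigma_{\mathrm{train}}}{\sqrt{N_{\mathrm{test}}}}
  = \frac{\mathrm{UCL}_1 - \mathrm{LCL}_1}{2}
  = \frac{1.2330 - (-1.2732)}{2} = 1.2531
\qquad
\frac{\sigma_{\mathrm{train}}}{\sqrt{N_{\mathrm{test}}}}
  = \frac{1.2531}{3} \approx 0.4177
```

The slightly negative center line indicates a small mean offset between the observed and reference wind speed in the training data.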

Figs. 12(a) and 12(b) show the results of the online monitoring simulation of the wind-speed sensor model [Eq. (6)] learned by the MLP NN algorithm for two different periods of the test data set of Table 2. The observed wind-speed plot and the upper and lower limit plots were constructed from the test data set of Table 2. The upper limit (boundary) curve of the wind-speed sensor is calculated on the basis of the UCL1 of Eq. (10), whereas the lower limit (boundary) curve is calculated on the basis of the LCL1 of Eq. (10).

The values of UCL1 and LCL1, calculated from Eq. (10), monitor the residual mean of the test data set of Table 2, and the sensor fault detected because of the abnormal residual mean is defined as fault type I. The value of UCL2, calculated from Eq. (11), monitors the residual variation of the test data set of Table 2, and the sensor fault detected because of the abnormal residual variation is defined as fault type II. Table 12 shows the statistics of the online monitoring simulation results of the wind-speed sensor control charts on the basis of data set 3 of Table 2. It shows the number of type I faults and type II faults and their percentage among the entire test data set (18,145 data points) of Table 2.
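The percentages in Table 12 follow directly from the fault counts and the size of the test data set stated above, as the short worked calculation shows.

```latex
\text{Fault type I: } \frac{319}{18{,}145} \approx 0.01758 \approx 1.76\%
\qquad
\text{Fault type II: } \frac{373}{18{,}145} \approx 0.02056 \approx 2.06\%
```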

Conclusion

In this paper, a wind-speed sensor model for wind turbines was built by data-mining algorithms. Four data-mining algorithms were used to construct a wind-speed sensor model. The MLP algorithm outperformed other data-mining algorithms. The performance of the selected MLP algorithm was validated in the data sets of two additional wind turbines. Data-mining algorithms identified dynamic models of the wind-speed sensor from the actual SCADA data and captured relationships between various SCADA parameters of wind turbines. The variables necessary for building the wind-speed sensor model were selected by the boosting tree algorithm. Wavelet transformation was used to denoise the wind speed measured by the anemometer. The wind-speed sensor model developed from the wavelet-transformed data provides robust and accurate wind-speed input to turbine controllers.

More stable and reliable setting values for the controllable variables of the wind turbine system could be generated on the basis of the wind-speed input of the wavelet-based wind-speed sensor model, offering improved power quality and an extended lifetime of mechanical and electrical components. The wind-speed sensor models were used as the reference wind-speed sensors (online profile) for monitoring the performance of anemometers. The control chart approach can be used to monitor the residuals between the observed and the reference wind speed and their variation.

Constructing data-driven models with data-mining algorithms is applicable to sensors in a wide range of processes in other domains. Control charts can detect the sensor faults; however, they cannot identify the specific reasons or factors causing these faults. Further research is needed to implement the online fault detection and diagnosis of wind turbines. The virtual wind-speed sensor could be used for online monitoring and calibration of anemometers. It could also conceivably replace the mechanical anemometer.

Acknowledgments

The research reported in the paper has been supported by funding from the Iowa Energy Center, Grant No. 07-01.

References

Aminian, M., and Aminian, F. (2007). “A modular fault-diagnostic system for analog electronic circuits using neural networks with wavelet transform as a preprocessor.” IEEE Trans. Instrum. Meas., 56(5), 1546–1554.

Backus, P., Janakiram, M., Mowzoon, S., Runger, G. C., and Bhargava, A. (2006). “Factory cycle-time prediction with data-mining approach.” IEEE Trans. Semicond. Manuf., 19(2), 252–258.

Berry, M. J. A., and Linoff, G. S. (2004). Data mining techniques: For marketing, sales, and customer relationship management, 2nd Ed., Wiley, New York.

Bishop, C. M. (1995). Neural networks for pattern recognition, Oxford University Press, New York.

Breiman, L. (2001). “Random forests.” Mach. Learn., 45(1), 5–32.

Breiman, L., Friedman, J., Olshen, R. A., and Stone, C. J. (1984). Classification and regression trees, Wadsworth International, Monterey, CA.

Charfi, F., Sellami, F., and Al-Haddad, K. (2006). “Fault diagnostic in power system using wavelet transforms and neural networks.” IEEE Int. Symp. on Industrial Electronics, New York, 2, 1143–1148.

Chella, A., Ciarlini, P., and Maniscalco, U. (2006). “Neural networks as soft sensors: A comparison in a real world application.” Int. Joint Conf. on Neural Networks, IEEE, New York, 2662–2668.

Chen, G. Y., and Bui, T. D. (2003). “Multiwavelets denoising using neighboring coefficients.” IEEE Signal Process. Lett., 10(7), 211–214.

Dash, P. K., Chun, I. L. W., and Chilukuri, M. V. (2003). “Power quality data mining using soft computing and wavelet transform.” Proc. IEEE Region 10 Annual Int. Conf., New York, 976–980.

Espinosa, J., Vandewalle, J., and Wertz, V. (2005). Fuzzy logic, identification and predictive control, Springer-Verlag, London.

Freilich, M. H., and Vanhoff, B. A. (2006). “The accuracy of preliminary WindSat vector wind measurements: Comparisons with NDBC buoys and QuikSCAT.” IEEE Trans. Geosci. Remote Sens., 44(3), 622–637.

Friedman, J. H. (2001). “Greedy function approximation: A gradient boosting machine.” Ann. Stat., 29(5), 1189–1232.

Friedman, J. H. (2002). “Stochastic gradient boosting.” Comput. Stat. Data Anal., 38(4), 367–378.

Hannon, B., and Ruth, M. (2001). Dynamic modeling, 2nd Ed., Springer-Verlag, New York.

Table 12. Statistics of Fault Detection of Control Charts for the Virtual Wind-Speed Sensor

| Fault type I (%) | Fault type II (%) | Fault type I (count) | Fault type II (count) |
|---|---|---|---|
| 1.76 | 2.06 | 319 | 373 |


Hsu, H., Oey, L., Johnson, W., Dorman, C., and Hodur, R. (2007). “Model wind over the central and southern California coastal ocean.” Mon. Weather Rev., 135(5), 1931–1944.

Kang, L., and Albin, S. L. (2000). “On-line monitoring when the process yields a linear profile.” J. Qual. Technol., 32(4), 418–426.

Kobayashi, M. (1998). Wavelets and their applications: Case studies, 1st Ed., Society for Industrial and Applied Mathematics, Philadelphia.

Kohavi, R., and John, G. H. (1997). “Wrappers for feature subset selection.” Artif. Intell., 97(1–2), 273–324.

Kulakowski, B. T., Gardner, J. F., and Shearer, J. L. (2007). Dynamic modeling and control of engineering systems, 1st Ed., Cambridge University Press, New York.

Kusiak, A. (2006). “Data mining: Manufacturing and service applications.” Int. J. Prod. Res., 44(18–19), 4175–4191.

Kusiak, A., Dixon, B., and Shah, S. (2005). “Predicting survival time for kidney dialysis patients: A data mining approach.” Comput. Biol. Med., 35(4), 311–327.

Kusiak, A., and Song, Z. (2006). “Combustion efficiency optimization and virtual testing: A data-mining approach.” IEEE Trans. Ind. Inf., 2(3), 176–184.

Kusiak, A., Zheng, H., and Song, Z. (2009a). “Models for monitoring of wind farm power.” Renewable Energy, 34(3), 583–590.

Kusiak, A., Zheng, H.-Y., and Song, Z. (2009b). “Short-term prediction of wind farm power: A data-mining approach.” IEEE Trans. Energy Convers., 24(1), 125–136.

Louka, P., et al. (2008). “Improvements in wind speed forecasts for wind power prediction purposes using Kalman filtering.” J. Wind Eng. Ind. Aerodyn., 96(12), 2348–2362.

Medina, C. A., Alcaim, A., and Apolinario, J. A. (2003). “Wavelet denoising of speech using neural networks for threshold selection.” Electron. Lett., 39(25), 1869–1871.

Meng, J., Jin, L., Pan, Q., and Zhang, H. (2008). “MSE analyses of thresholding by multi-scale product.” Proc. 2007 Int. Conf. on Wavelet Analysis and Pattern Recognition, IEEE, New York, 891–896.

Mestek, O., Pavlik, J., and Suchanek, M. (1994). “Multivariate control charts: Control charts for calibration curves.” J. Anal. Chem., 350(6), 344–351.

Mitra, A. (1998). Fundamentals of quality control and improvement, 2nd Ed., Prentice Hall, Upper Saddle River, NJ.

Montgomery, D. C. (2005). Introduction to statistical quality control, 5th Ed., Wiley, New York.

Pena, F. L., and Duro, R. J. (2003). “A virtual instrument for automatic anemometer calibration with ANN based supervision.” IEEE Trans. Instrum. Meas., 52(3), 654–661.

Qiao, W., Zhou, W., Aller, J. M., and Harley, R. G. (2008). “Wind speed estimation based sensorless output maximization control for a wind turbine driving a DFIG.” IEEE Trans. Power Electron., 23(3), 1156–1169.

Seidel, P., Seidel, A., and Herbarth, O. (2007). “Multilayer perceptron tumour diagnosis based on chromatography analysis of urinary nucleoside.” Neural Networks, 20(5), 646–651.

Shevade, S. K., Keerthi, S. S., Bhattacharyya, C., and Murthy, K. R. K. (2000). “Improvements to the SMO algorithm for SVM regression.” IEEE Trans. Neural Networks, 11(5), 1188–1193.

Smola, A. J., and Schoelkopf, B. (2004). “A tutorial on support vector regression.” Stat. Comput., 14(3), 199–222.

Stein, C. M. (1981). “Estimation of the mean of a multivariate normal distribution.” Ann. Stat., 9(6), 1135–1151.

Tan, P. N., Steinbach, M., and Kumar, V. (2006). Introduction to data mining, Pearson Education/Addison Wesley, Boston.

Tang, Y. Y., Yang, L. H., Liu, J., and Ma, H. (2000). Wavelet theory and its application to pattern recognition, 1st Ed., World Scientific, River Edge, NJ.

U.S. Department of Energy. (2008). “20% wind energy by 2030: Increasing wind energy's contribution to U.S. electricity supply.” ⟨http://www1.eere.energy.gov/windandhydro/wind_2030.html⟩ (Nov. 1, 2008).

Walker, J. S. (1999). A primer on wavelets and their scientific applications, 1st Ed., CRC Press, Boca Raton, FL.

Wheeler, D. J. (2000). Understanding variation: The key to managing chaos, 2nd Ed., SPC Press, Knoxville, TN.

Woodall, W. H., Spitzner, D. J., Montgomery, D. C., and Gupta, S. (2004). “Using control charts to monitor process and product quality profile.” J. Qual. Technol., 36(3), 309–320.

Zhang, C., and Ulrych, T. J. (2002). “Estimation of quality factors from CMP records.” Geophysics, 67(5), 1542–1547.

Zheng, X., and Yeung, D. S. (2001). “Sensitivity analysis of multilayer perceptron to input and weight perturbations.” IEEE Trans. Neural Networks, 12(6), 1358–1366.
