[ieee 2008 5th international conference on electrical engineering/electronics, computer,...



Cross-Substation Short Term Load Forecasting Using Support Vector Machine

Jonglak Pahasa* and Nipon Theera-Umpon, Senior Member, IEEE**

*Department of Electrical Engineering, School of Engineering, Naresuan University Phayao, Phayao 56000 Thailand (e-mail: [email protected])

**Department of Electrical Engineering, Faculty of Engineering, Chiang Mai University, Chiang Mai 50200 Thailand (e-mail: [email protected])

Abstract- This paper investigates the behavior of a short-term load forecasting system in the cross-substation scheme. The proposed forecasting system is based on the support vector machine, with past loads and temperature as input features. It is trained with data from one substation and tested on blind-test data from other substations. A set of real-world data from 4 substations in Bangkok, i.e., Bangkok Noi, North Bangkok, South Thonburi and Rangsit, is used in the experiments. The results show that the amplitude ranges and patterns of the daily load at the training and test substations must be similar for cross-substation forecasting to work. This observation benefits model development in that the retraining stage at a new substation may be omitted if these similarities hold.

Index Terms—Short-term load forecasting, cross-substation forecasting, support vector machine, support vector regression.

I. INTRODUCTION

Substation short-term load forecasting plays an important role in both planning and operation of distribution systems [1]. Forecasting load at the substation level is more difficult and less accurate than forecasting the total system load demand. Many factors reduce the forecasting accuracy, such as incomplete historical load and weather data [2]. Nevertheless, predicting load at a substation remains an important and interesting problem. Researchers have employed many techniques, such as fuzzy regression [1, 3], neural networks [2], periodic time series [4], the Kalman filtering algorithm [5] and the support vector machine (SVM) [6], to improve the accuracy of the forecast load.

Normally, planning and operation at the substation level use the forecast load predicted by expert systems, in which human experts use their experience to predict the load one day or more ahead. The expert system has many drawbacks; in particular, it is neither accurate nor reliable.

The SVM has been applied to many prediction problems. The structural risk minimization (SRM) used in the learning stage of the SVM is more powerful than the empirical risk minimization (ERM) used in multilayer perceptrons. The SVM is a powerful technique for nonlinear function approximation problems such as load forecasting. It uses a kernel function to map the data from the input space to a high-dimensional feature space in which the problem can be solved in a linear form [7, 8]. However, in load forecasting, constructing a separate SVM model at every substation in order to predict all substations' loads is burdensome. A load forecasting system that generalizes well enough to be used at every substation is preferable. In this research, we investigate the generalization of our proposed SVM-based load forecasting system on a real-world data set collected from 4 substations in Bangkok, Thailand.

The rest of this paper is organized as follows. The proposed cross-substation investigation is described in section II. In section III, the data set descriptions are given. Section IV presents the experimental results and discussion. Finally, section V concludes the paper.

II. THE PROPOSED ALGORITHM

A. Cross-Substation Forecasting System

This work investigates cross-substation electrical load forecasting. This alternative scheme trains and constructs the forecast model at one substation and tests the model on data from other substations. We investigate the factors that make a model trained at one substation applicable to another, so that we can tell in advance whether a model generalizes well enough to be applied at a given substation. In that case, the model development time is reduced because the model does not have to be retrained when it is applied at a new substation.

The conventional scheme used nowadays is shown in Fig. 1. Each model is trained separately to forecast load at each substation. The scheme we are investigating consists of only 1 forecasting model trained by using the data from a substation and tested at other substations as shown in Fig. 2.
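The investigated scheme amounts to a train-once, test-everywhere loop. The following is a minimal sketch of that loop under stated assumptions: the function names, the toy data, and the stand-in "model" (the mean training load) are hypothetical and are not the SVM used in this paper.

```python
def cross_substation_errors(datasets, fit, error):
    """Train one model per substation and test it at every substation.

    datasets: {name: (train_data, test_data)}
    Returns {(train_name, test_name): error_value}.
    """
    results = {}
    for train_name, (train_data, _) in datasets.items():
        model = fit(train_data)  # trained once, at one substation
        for test_name, (_, test_data) in datasets.items():
            results[(train_name, test_name)] = error(model, test_data)
    return results


# Toy stand-ins (NOT the paper's SVM): the "model" is the mean training load.
def fit_mean(loads):
    return sum(loads) / len(loads)


def mean_abs_pct_error(model, loads):
    return 100.0 * sum(abs(v - model) / v for v in loads) / len(loads)


# Hypothetical substations "A" and "B" with different load levels.
toy = {
    "A": ([10.0, 10.0], [10.0, 10.0]),
    "B": ([20.0, 20.0], [20.0, 20.0]),
}
errors = cross_substation_errors(toy, fit_mean, mean_abs_pct_error)
```

Even with this trivial stand-in, the off-diagonal entries of `errors` are large when the two load levels differ, which is the kind of effect the cross-substation experiments are designed to expose.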

B. SVM-based Forecasting Models

Support vector machine (SVM) is used as a forecasting tool. The details of SVM are well known and widely available in the literature [7, 8]. In [6], we proposed an SVM-based forecasting model using a set of features extracted from the wavelet transform. We achieved very good forecasting performance in the same-substation scheme, i.e., a model was trained and tested at the same substation.

Proceedings of ECTI-CON 2008

978-1-4244-2101-5/08/$25.00 ©2008 IEEE


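Fig. 1. Conventional scheme for forecasting at 2 substations. [Block diagram: forecast tool 1 is trained (Train #1) and tested (Test #1) on data from Substation #1 and outputs the forecast load at Substation #1; forecast tool 2 is trained (Train #2) and tested (Test #2) on data from Substation #2 and outputs the forecast load at Substation #2.]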

Fig. 2. Investigated scheme for forecasting at 2 substations. [Block diagram: a single forecast tool 1 is trained on data from Substation #1 (Train #1) and tested on data from both Substation #1 (Test #1) and Substation #2 (Test #2), outputting the forecast load at each substation.]

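[Block diagram: the input vector $\mathbf{x}$, with components $L_{(0)}$, $L_{(-24)}$, $L_{(-144)}$ and $T_{(+24)}$, is passed with each support vector $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_i, \ldots, \mathbf{x}_p$ through the RBF kernel functions $K(\mathbf{x}_1, \mathbf{x}), \ldots, K(\mathbf{x}_p, \mathbf{x})$. The kernel outputs are weighted by $\beta_i = \alpha_i - \alpha_i^*$ and summed with the bias $b$ to give the output

$$L_{(+24)} = \sum_{i=1}^{p} (\alpha_i - \alpha_i^*) K(\mathbf{x}_i, \mathbf{x}) + b.$$]

Fig. 3. SVM-based load forecasting model used in this research.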

To decide the number of models for this problem, the load data are grouped according to the similarity of the load patterns. We do not group by season because the assignment of any given day to a season is fuzzy. Instead, we group the forecast load by time of day and day type, because the load patterns differ between days of the week and between time instants within a day. Normal days and abnormal days such as holidays also have different load demands. However, this work does not forecast the load on holidays, because different holidays have different load demands that must be considered more carefully. The forecast load data are therefore grouped into 7×48 = 336 groups: the 7 days of the week times the 48 data points collected every half hour in a day.
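The 7×48 grouping can be illustrated with a small indexing sketch. The function name and the ordering of the groups (Monday first, midnight first) are assumptions made for illustration; the paper does not specify how the groups are numbered.

```python
from datetime import datetime

SLOTS_PER_DAY = 48  # one load sample every half hour


def group_index(ts: datetime) -> int:
    """Map a timestamp to one of the 7 x 48 = 336 (day, half-hour slot) groups."""
    day = ts.weekday()                    # 0 = Monday ... 6 = Sunday
    slot = ts.hour * 2 + ts.minute // 30  # 0 ... 47
    return day * SLOTS_PER_DAY + slot


# Example: the half-hour slot starting at 14:30 on Friday, January 28, 2005.
example_group = group_index(datetime(2005, 1, 28, 14, 30))
```

Under this numbering, each group would get its own forecasting model, and holidays would be excluded before grouping.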

We use the SVM as the forecasting tool. Fig. 3 shows the SVM-based forecasting model. The SVM parameters that yield the best performance are the regularization constant C = 10, the deviation ε = 0.001, and the kernel function parameter σ = 4. The support vectors (x_i) are obtained in the training stage. In the testing stage, they are used in the kernel function, which maps the data from the input space to a high-dimensional feature space, i.e.,

$$K(\mathbf{x}_i, \mathbf{x}) = \exp\left(-\|\mathbf{x} - \mathbf{x}_i\|^2 / 2\sigma^2\right). \quad (1)$$
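As a rough illustration of how the testing stage combines the RBF kernel with the support vectors, the sketch below evaluates a kernel expansion of the form sum_i beta_i K(x_i, x) + b, with beta_i = alpha_i - alpha_i* as in Fig. 3. The function names and all numeric values in the usage lines are hypothetical; only σ = 4 comes from the text.

```python
import math


def rbf_kernel(x, xi, sigma=4.0):
    """RBF kernel exp(-||x - xi||^2 / (2 sigma^2)) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, betas, bias, sigma=4.0):
    """Kernel expansion: sum_i beta_i * K(x_i, x) + bias."""
    weighted = (b * rbf_kernel(x, sv, sigma) for sv, b in zip(support_vectors, betas))
    return sum(weighted) + bias


# Hypothetical toy values, for illustration only.
toy_support_vectors = [[0.0, 0.0], [4.0, 0.0]]
toy_betas = [2.0, -1.0]
toy_prediction = svr_predict([0.0, 0.0], toy_support_vectors, toy_betas, bias=1.0)
```

In libraries that parameterize the RBF kernel as exp(-gamma ||x - x_i||^2), such as scikit-learn, σ = 4 corresponds to gamma = 1/(2σ²) = 0.03125.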

C. Features Selection

One of the important issues in short-term load forecasting is feature selection. Many factors affect the daily load. The main features are the characteristics of the past and current loads. Weather conditions also play an important role, and the weather characteristics that most strongly affect the load patterns differ from area to area. For example, in Thailand, the temperature is the most important weather condition for load prediction.

The features used in our work are the input vectors shown in Fig. 3, in which L and T denote the electrical load and the temperature features, respectively. The subscripts indicate the time instants of the features, i.e., (0), (−24), (−144), and (+24) indicate the present time, the previous 24 hours, the previous 144 hours, and the next 24 hours, respectively.
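A minimal sketch of how one training example could be assembled from half-hourly series, assuming the inputs are the loads at the present time, 24 hours earlier and 144 hours earlier plus the temperature 24 hours ahead, and the target is the load 24 hours ahead. The function name, the list-based data layout and the ordering of the features are assumptions.

```python
SAMPLES_PER_HOUR = 2  # half-hourly data


def build_example(load, temp, t):
    """Return (features, target) for the present-time index t.

    load, temp: equal-length half-hourly series (lists of floats).
    features = [L(0), L(-24), L(-144), T(+24)], target = L(+24).
    """
    back_24 = t - 24 * SAMPLES_PER_HOUR
    back_144 = t - 144 * SAMPLES_PER_HOUR
    ahead_24 = t + 24 * SAMPLES_PER_HOUR
    if back_144 < 0 or ahead_24 >= min(len(load), len(temp)):
        raise IndexError("not enough history or future data around index t")
    features = [load[t], load[back_24], load[back_144], temp[ahead_24]]
    return features, load[ahead_24]
```

With half-hourly data, a lag of 144 hours is 288 samples, so the first usable present-time index is 288.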

III. DATA DESCRIPTIONS

The data sets used in our experiments consist of the half-hourly electrical load series and temperature data from 4 substations in Bangkok, Thailand, over the two-year period from January 1, 2004 to December 31, 2005. Fig. 4 shows the pictorial illustrations of these load series. Some details of the customers at each substation are as follows:

Bangkok-Noi Substation (BN): The patterns of the daily load from the one-year data (365 days) are shown in Fig. 4(a). The demand at this substation is rather high from 8.00 a.m. to 12.00 p.m. and from 2.00 p.m. to 5.00 p.m., because the customers during these periods are business and industrial. From 7.00 p.m. to 12.00 a.m., the demand is rather high again because of residential demand.

North-Bangkok Substation (NB): The patterns of the load are shown in Fig. 4(b). The demand is higher from 8.00 a.m. to 12.00 p.m. and from 2.00 p.m. to 4.00 p.m. than in the other time periods, because the customers are big business buildings and shopping centers. From 8.00 p.m. to 12.00 a.m., the demand is rather low because this is not a residential area.

Fig. 4. The load data from 4 substations: (a) Bangkok Noi, (b) North Bangkok, (c) South Thonburi, (d) Rangsit. [Each panel plots Load (MW) against Time (hour).]


TABLE I
EXPERIMENTAL RESULTS FOR ONE-DAY AHEAD FORECASTING (MAPE, %)

Train/Test   BN      NB      STB     RS
BN           6.31    8.72    6.47    37.70
NB           8.75    6.02    8.88    50.89
STB          7.12    10.57   6.65    51.34
RS           19.60   29.58   29.60   5.48

South-Thonburi Substation (STB): The load of this substation consists of the demand from both industrial and residential customers, as shown in Fig. 4(c). The load is rather high from 8.00 a.m. to 12.00 p.m. and from 2.00 p.m. to 5.00 p.m. because of the demand from industrial customers. The demand of residential customers is high from 8.00 p.m. to 12.00 a.m.

Rangsit Substation (RS): The patterns of the load are shown in Fig. 4(d). The load of this substation consists of the demand from industrial and residential customers. The load of this substation is much higher than that of the aforementioned substations.

The entire data set is separated into 2 sets, i.e., a training set and a test set. The training set is the one-year data from January 1, 2004 to December 31, 2004. It is used to develop the proposed algorithm, i.e., to select the input features and the SVM parameters. The test set is the other one-year data from January 1, 2005 to December 31, 2005. It is used as a blind test set to evaluate the performance of the model. In the training stage, 5-fold cross validation is applied to evaluate the performance of the proposed method. It helps detect and prevent over-fitting, in which the model fits the training data but not the test data and therefore produces poor forecasts.

In k-fold cross validation, the training data are first broken into k subsets. A regression model is then trained on the union of k−1 subsets and evaluated on the remaining subset. The process is repeated k times, using each subset as the test set once. Finally, the results from the test subsets are combined to get an overall estimate of the effectiveness of the training procedure. The choice of k depends on the data set. In our experiments, we use k = 5. All the parameters of the forecasting tools and the features in the previous sections are therefore obtained based on 5-fold cross validation.
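The k-fold procedure described above can be sketched as follows. The contiguous, unshuffled fold assignment and the `fit`/`score` callback interface are assumptions made for illustration; the paper does not state how the subsets are formed.

```python
def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous folds of nearly equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, fit, score, k=5):
    """Average the held-out score over k folds.

    fit(train_xs, train_ys) returns a model;
    score(model, test_xs, test_ys) returns an error value.
    """
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in held_out], [ys[i] for i in held_out]))
    return sum(scores) / len(scores)
```

A parameter search would call `cross_validate` once per candidate setting (for example, per value of C, ε and σ) and keep the setting with the lowest average error.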

IV. EXPERIMENTAL RESULTS AND DISCUSSION

In our experiments, we perform one-day ahead electrical load forecasting. The support vector machine (SVM) is employed as the forecasting tool. The cross-substation evaluation is performed between every pair of substations. We report the forecasting accuracy in terms of the mean absolute percentage error (MAPE), which is defined as

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left|\mathrm{ActualLoad}_i - \mathrm{ForecastLoad}_i\right|}{\mathrm{ActualLoad}_i} \times 100\%, \quad (2)$$

where n is the number of the test data.
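The MAPE can be computed directly from paired actual and forecast values. This is a minimal sketch; the function name and the toy numbers in the usage line are illustrative only.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent, over paired load values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Toy usage: each forecast is off by 10% of the actual value.
toy_mape = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the actual load, the measure is undefined when an actual load is zero, which does not arise for the substation loads considered here.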

Fig. 5 Forecasting results when tested at Bangkok Noi (BN) substation. [Plot of Load (MW) against Hour; curves: Actual, BN-model, NB-model, STB-model.]

Fig. 6 Forecasting results when tested at North Bangkok (NB) substation. [Plot of Load (MW) against Hour; curves: Actual, BN-model, NB-model, STB-model.]

Fig. 7 Forecasting results when tested at South Thonburi (STB) substation. [Plot of Load (MW) against Hour; curves: Actual, BN-model, NB-model, STB-model.]

Table I shows the MAPE of the experimental results from the 4 substations. The model performs well when the training data and the test data come from the same substation. The table also shows that not every pair of substations works well in the cross-substation evaluation. The Rangsit substation cannot be used in the cross-substation evaluation in this experiment: it yields a very high MAPE when it is used as either the training or the test data. This is not surprising because the load range of the Rangsit substation is very different from that of the other substations. Meanwhile, the other 3 substations, i.e., North Bangkok, Bangkok Noi, and South Thonburi, yield promising results in the cross-substation evaluation.

Therefore, we consider the details of the forecasting results of these 3 substations in order to investigate the actual performance of the proposed algorithm. Figs. 5 to 7 show sample forecasting results on January 28, 2005. The BN-model, NB-model, and STB-model are the one-day ahead load forecasting models trained using the data from the Bangkok Noi (BN), North Bangkok (NB) and South Thonburi (STB) substations, respectively. These 3 models are tested on the data from the 3 substations to evaluate the cross-substation forecasting performance. It should be noted that the test results shown in this section are evaluated on the blind test set described in section III.


TABLE II
THE TOTAL MAPE OF THE ONE-DAY AHEAD LOAD FORECASTING FROM ONE-YEAR BLIND-TEST DATA AT 3 SUBSTATIONS

Time   Test at BN             Test at NB             Test at STB
(Hr)   Train at               Train at               Train at
       BN     NB     STB      BN     NB     STB      BN     NB     STB
1      7.95   9.28   8.18     7.95   6.94   8.84     9.26   11.76  8.72
2      8.01   9.39   8.02     8.83   6.81   9.32     8.64   11.68  8.39
3      7.47   9.35   7.59     8.38   6.78   9.12     8.67   11.85  8.22
4      7.51   8.73   7.30     8.35   5.99   9.22     8.48   11.33  7.68
5      6.60   7.80   6.60     6.99   4.94   8.34     7.03   10.45  6.49
6      5.33   7.11   5.61     5.86   4.09   7.24     5.81   9.31   5.29
7      4.67   7.67   4.59     6.50   4.80   7.83     5.08   9.19   4.69
8      6.23   8.56   8.12     9.81   5.39   12.13    5.62   10.71  7.73
9      6.02   9.09   6.05     10.67  6.85   10.39    7.09   11.58  6.41
10     6.48   8.76   6.21     10.02  7.33   10.37    7.52   11.63  6.70
11     6.06   9.16   6.23     10.45  7.30   10.95    7.15   11.47  6.75
12     6.14   8.29   6.24     9.20   7.21   9.76     7.36   10.24  7.03
13     7.30   8.86   7.11     10.86  6.63   10.66    7.78   10.15  7.62
14     6.02   9.94   5.74     9.74   7.29   10.68    6.92   10.69  6.23
15     5.25   9.37   5.48     10.90  6.66   10.71    5.71   10.00  5.29
16     5.22   8.71   5.26     9.85   6.74   9.88     6.41   9.80   5.30
17     4.50   8.09   5.43     6.26   5.46   6.05     6.28   8.30   5.35
18     4.60   7.37   4.36     6.02   4.49   6.00     6.94   8.18   4.94
19     3.96   7.69   4.04     6.92   3.76   6.72     5.48   9.11   4.18
20     4.26   7.76   4.32     7.03   3.82   6.02     5.29   8.76   4.00
21     4.51   6.49   4.66     7.74   4.28   6.41     5.19   7.80   4.26
22     5.63   7.46   5.96     7.89   5.04   6.83     5.89   9.08   5.73
23     6.94   8.83   7.02     7.56   6.32   7.49     8.61   11.56  7.95
24     10.33  9.44   9.69     8.96   6.68   8.68     10.84  10.74  11.57

The forecasting results when the models are tested at the Bangkok Noi substation are shown in Fig. 5. The BN-model and STB-model perform well on this test data set. It is expected that the BN-model can predict the data from the BN substation; the interesting observation is that the STB-model can also predict the BN load data. Fig. 6 shows the forecasting results when the models are tested at the North Bangkok substation. Only the NB-model performs well on this test data set. Fig. 7 shows the forecasting results when the models are tested at the South Thonburi substation. The BN-model and STB-model perform well on this test data set. The interesting observation is that the BN-model can also predict the STB load data.

Table II lists the total MAPE of the forecasts on the one-year blind-test data for these 3 substations at different times of the day. The results confirm the observations mentioned earlier. They indicate that two substations, i.e., Bangkok Noi and South Thonburi, support cross-substation forecasting. The model trained at the Bangkok Noi substation can be used to forecast the load at both the Bangkok Noi and South Thonburi substations. In the same way, the model trained at the South Thonburi substation can be used to predict the load at both the Bangkok Noi and South Thonburi substations.

The properties of the loads explained in section III and the plots shown in Fig. 4 provide some information on the correlation between the loads at different substations. The demands of customers are similar at the Bangkok Noi and South Thonburi substations, whereas the customers at the North Bangkok substation have a different power demand. This can be seen in the load patterns plotted in Fig. 4(a) to 4(c): the patterns in Fig. 4(a) are similar to those in Fig. 4(c), whereas the patterns in Fig. 4(b) differ from the other two. For the Rangsit substation, even though the load patterns are similar to those of the Bangkok Noi and South Thonburi substations, the load values are much higher than those of the other two. Therefore, the forecasting model trained at the Rangsit substation is not suitable for forecasting at the other substations. Cross-substation forecasting can be performed if the range and patterns of the daily load at the training substation are similar to those at the test substation.

V. CONCLUSION

We investigate a cross-substation short-term load forecasting system based on the support vector machine (SVM). The cross-substation scheme trains a forecasting model using the electrical load from one substation and tests it at other substations. The input features of the forecasting system include the past loads and the predicted temperature. The experimental results on the data from 4 substations show that cross-substation forecasting can be performed if the amplitude ranges and patterns of the daily load at the training substation are similar to those at the test substation. This observation benefits model development in that the retraining stage may be omitted at a new substation if these characteristics are similar to those of the training substation.

ACKNOWLEDGMENT

The authors would like to thank the Metropolitan Area Control Center, Metropolitan Area Operation Division, Electricity Generating Authority of Thailand (EGAT) for providing the data.

REFERENCES

[1] T. Konjic, V. Miranda and I. Kapetanovic, “Fuzzy Inference Systems Applied to LV Substation Load Estimation,” IEEE Trans. on Power Syst., vol.20, no. 2, pp.742-749, May 2005.

[2] W. Tayati and W. Chankaipol, “Substation Short Term Load Forecasting Using Neural Network With Genetic Algorithm,” IEEE Proc. of TENCON’02, vol. 3, pp.1787-1790, Oct. 2002.

[3] J. Nazarko and W. Zalewski, “The Fuzzy Regression Approach to Peak Load Estimation in Power Distribution Systems,” IEEE Trans. on Power Syst., vol.14, no. 3, pp.809-814, Aug. 1999.

[4] M. Espinoza, C. Joye, R. Belmans and B. D. Moor, “Short-Term Load Forecasting, Profile Identification, and Customer Segmentation: A Methodology Based on Periodic Time Series,” IEEE Trans. on Power Syst., vol. 20, no. 3, pp.1622-1630, Aug. 2005.

[5] S. Sargunaraj, D. P. Sen Gupta and S. Devi, “Short-Term Load Forecasting for Demand Side Management,” IEE Proc. Gener. Transm. Distrib., vol.144, no. 1, pp.68-74, Jan. 1997.

[6] J. Pahasa and N. Theera-Umpon, “Short-Term Load Forecasting Using Wavelet Transform and Support Vector Machine,” The 8th Inter. Power Eng. Con., pp.47-52, Dec. 2007.

[7] A. J. Smola and B. Scholkopf, “A Tutorial on Support Vector Regression,” Statistics and Computing, vol.14, pp.199-222, 2004.

[8] S. Gunn, “Support Vector Machines for Classification and Regression,” ISIS Technical Report ISIS-1-98, Image Speech & Intelligent Systems Research Group, University of Southampton, May 1998.
