
CLASSIFICATION OF VOLTAGE DISTURBANCES USING SVMs AND BOOSTING CLASSIFIERS

September 29, 2015


Submitted by MOHAN KASHYAP P (SC13M055)

M.Tech in Machine Learning and Computing, Department of Mathematics


List of Figures

2.1 Simulink model for generation of different kinds of electrical faults and their analysis
3.1 Confusion matrix for the Gradient Boosting classifier with 100 estimators
3.2 Confusion matrix for the AdaBoost classifier with 100 estimators
3.3 Confusion matrix for the Random Forest classifier with 100 estimators


Contents

1 Introduction
2 Simulink model and Algorithmic implementation
  2.1 Simulink modeling
    2.1.1 Classification of faults
    2.1.2 Extraction of different features
  2.2 Algorithm Implementation
    2.2.1 Generation of data points
    2.2.2 Implementation of LIBSVM
    2.2.3 Implementation of Gradient Boosting algorithm
    2.2.4 AdaBoost Classifier
    2.2.5 Random Forest Classifiers
3 Results
  3.1 Results for Gradient Boosting Classifier
  3.2 Results for AdaBoost Classifier
  3.3 Results for Random Forest Classifier
    3.3.1 Results for SVM
4 Conclusion and Future work
  4.1 Conclusion
  4.2 Future work


Abstract

This report briefly describes the classification of voltage disturbances using support vector machines and boosting classifiers. The process starts with building a MATLAB/Simulink model to simulate various electrical faults, followed by feature extraction to generate the dataset, and finally classification of the obtained data using boosting and SVM. The results obtained were found to be promising.


Chapter 1

Introduction

Over the past two decades utilities worldwide have gone through radical changes. One big change is the deregulation of the energy market that has taken place in a number of countries worldwide. Another change is that today's customers are more demanding than customers in the past. These changes have forced the utilities to become even more customer-oriented, where high network reliability and good power quality have become increasingly important to keep customers satisfied. Therefore, recording and analyzing voltage disturbances (also referred to as voltage events) and power quality abnormalities have become a vital issue in order to better understand the behavior of the power network. Disturbance data and power quality data have become important information both for statistical purposes and as a decision-making document in mitigation projects. Reliable disturbance and power quality information also opens up a proactive maintenance approach focused on increasing power network reliability. Today most disturbance data is analyzed manually by specialists. However, a lot of time could be saved if unimportant or minor disturbances could be removed or classified automatically, so that the specialists could focus on solving more sophisticated disturbance problems. This requires the development of robust automatic classification systems. During the last few years yet another classification method, the support vector machine (SVM), has become increasingly popular due to its attractive theoretical and practical characteristics. The SVM, which is based on statistical learning theory, is a general classification method. In recent years boosting algorithms, which are based on the ensemble learning approach, have also slowly gained popularity.


Chapter 2

Simulink model and Algorithmic implementation

2.1 Simulink modeling

The basic idea is that a Simulink model was used to simulate different kinds of faults and to extract the features from which the data points were generated. From the generated data points the multiclass SVM, AdaBoost, gradient boosting, and random forest algorithms were implemented. The Simulink model used, the faults, and the various features extracted are shown in the figure below:

2.1.1 Classification of faults

Four different kinds of faults are classified, as described below:

• single line to ground fault (SLG): one line is faulty and connected to ground; the other two lines are functioning properly.

• double line to ground fault (DLG): two lines are faulty and connected to ground; one line is functioning properly.

• three phase to ground fault: all three lines are faulty and connected to ground.

• unequally distributed faults: the voltages in the three lines are improperly distributed.


Figure 2.1: Simulink model for generation of different kinds of electrical faults and their analysis

2.1.2 Extraction of different features

1. RMS (root mean square) value: when a fault occurs, one of the phase voltages goes to zero or the phase voltages become unevenly distributed. The RMS value of the voltage is therefore a determining factor and one of the features used for classification. The formula for the RMS value is

$$f_{\mathrm{rms}} = \lim_{T\to\infty}\sqrt{\frac{1}{T}\int_0^T [f(t)]^2\,dt}$$

where $f_{\mathrm{rms}} = V_{\mathrm{rms}}$ is the RMS value of the voltage, $f(t) = v(t) = v_m \sin(2\pi t/T)$ is the voltage magnitude at time instant $t$, and $T$ is the time period of the sinusoidal wave.

2. Total harmonic distortion (THD): when the input is a sine wave, the measurement is most commonly defined as the ratio of the RMS amplitude of a set of higher harmonic frequencies to the RMS amplitude of the first harmonic, or fundamental, frequency. Since the voltage magnitudes of the three phases differ during a fault, their RMS values differ, and therefore their THDs differ as well, which makes THD useful for classifying voltage disturbances. The equation for THD is

$$THD_F = \frac{\sqrt{V_2^2 + V_3^2 + V_4^2 + \cdots + V_n^2}}{V_1}$$

where $V_i$ is the RMS voltage of the $i$-th harmonic, $i = 1, 2, 3, \ldots, n$, and $i = 1$ is the fundamental frequency.

3. Magnitude and phase sequence analyzer: the magnitude sequence analyzer gives the change in voltage magnitude at various time instants, and the phase sequence analyzer analyses the phase change with respect to time. Since an AC voltage is represented in polar form, and the magnitudes of the three phases differ for different kinds of faults, these quantities can be used for classifying faults. The polar representation is

$$V = R\angle\theta$$

where $R$ is the magnitude of the voltage and $\theta$ is its phase.

4. Voltage at various selected intervals.

These were the features extracted in order to differentiate the different kinds of faults.
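The two waveform features above can be sketched in a few lines of Python. This is an illustrative stand-in, not the report's Simulink feature extraction: the sampling rate, amplitude, and the clean 50 Hz test signal are all assumptions.

```python
import numpy as np

# Synthetic 50 Hz phase voltage: 230 V RMS, sampled at an assumed 10 kHz
# over exactly 10 cycles (the values are illustrative, not from the model).
fs = 10_000                      # samples per second (assumed)
f0 = 50.0                        # fundamental frequency in Hz
t = np.arange(0.0, 0.2, 1.0 / fs)
v = 230.0 * np.sqrt(2.0) * np.sin(2.0 * np.pi * f0 * t)

def rms(x):
    """Discrete form of f_rms = sqrt((1/T) * integral of f(t)^2 dt)."""
    return np.sqrt(np.mean(x ** 2))

def thd(x, fs, f0, n_harmonics=10):
    """THD_F: RMS of harmonics 2..n divided by the fundamental amplitude."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    def amp(k):  # spectral amplitude at the k-th harmonic of f0
        return spectrum[np.argmin(np.abs(freqs - k * f0))]
    harmonics = np.sqrt(sum(amp(k) ** 2 for k in range(2, n_harmonics + 1)))
    return harmonics / amp(1)

print(round(rms(v), 1))   # close to 230.0 for a clean sine
print(thd(v, fs, f0))     # near zero for an undistorted waveform
```

During a simulated fault the per-phase RMS drops and the distorted waveform raises the THD, which is what makes both quantities discriminative features.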

2.2 Algorithm Implementation

The classification of the generated data was performed using support vector machines, the Gradient Boosting classifier, AdaBoost, and the random forest classifier. Multiclass classification was performed using the one-versus-all method.


2.2.1 Generation of data points

The next step was to generate the data points. This was done by feature extraction: all the data points were generated using the workspace block of Simulink. The orientation of the data points, the class assignments, and the normalization of the data were done using Excel. There were 809 data points from 4 different classes, and the number of features was reduced to 9 by data preprocessing.
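The Excel normalization step can be sketched with NumPy as a min-max scaling of each feature column. The array shapes mirror the report's dataset (809 points, 9 features, 4 classes), but the random values below are only a placeholder for the Simulink workspace export.

```python
import numpy as np

# Placeholder arrays standing in for the exported Simulink data:
# 809 data points, 9 features, 4 fault classes (shapes from the report).
rng = np.random.default_rng(0)
X = rng.uniform(low=0.0, high=400.0, size=(809, 9))  # hypothetical raw features
y = rng.integers(0, 4, size=809)                     # hypothetical class labels

# Min-max normalization: scale each feature column to [0, 1].
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_norm = (X - X_min) / (X_max - X_min)

print(X_norm.shape)                 # (809, 9)
print(X_norm.min(), X_norm.max())   # bounds now lie in [0, 1]
```

Normalizing the features this way keeps large-magnitude features such as RMS voltage from dominating the SVM kernel distances.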

2.2.2 Implementation of LIBSVM

After generating the data points, an SVM implementation is needed to classify the data. Multiclass SVM was implemented using LIBSVM, which is an open-source library. The important parameter to be tuned was the kernel of the SVM. From 5-fold cross validation we found that the RBF kernel fits our data best.
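The kernel-selection step can be sketched with scikit-learn's `SVC`, which wraps the same LIBSVM library. The synthetic four-class data below is only a placeholder for the 809-point, 9-feature fault dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the fault dataset (809 points, 9 features, 4 classes).
X, y = make_classification(n_samples=809, n_features=9, n_informative=6,
                           n_classes=4, random_state=0)

# Compare kernels with 5-fold cross validation, as described above.
scores = {}
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel)
    scores[kernel] = cross_val_score(clf, X, y, cv=5).mean()
    print(kernel, round(scores[kernel], 4))
```

On the real fault data the report found the RBF kernel to perform best; on this placeholder data the ranking may of course differ.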

2.2.3 Implementation of Gradient Boosting algorithm

Boosting algorithms are a set of machine learning algorithms which build a strong classifier from a set of weak classifiers, typically decision trees. Gradient boosting is one such algorithm: it builds the model in a stage-wise fashion and generalizes it by allowing optimization of an arbitrary differentiable loss function. The differentiable loss function in our case is the binomial deviance loss. The algorithm is implemented as described in (Friedman et al., 2001).
Input: a training set $(X_i, y_i)$, where $i = 1, \ldots, n$, $X_i \in H \subseteq R^n$ and $y_i \in \{-1, 1\}$; a differentiable loss function $L(y, F(X))$, in our case the binomial deviance loss $\log(1 + \exp(-2yF(X)))$; and $M$, the number of iterations.

1. Initialize the model with a constant value:

$$F_0(X) = \arg\min_{\gamma} \sum_{i=1}^n L(y_i, \gamma)$$

2. For m = 1 to M:

(a) Compute the pseudo-responses:

$$r_{im} = -\left[\frac{\partial L(y_i, F(X_i))}{\partial F(X_i)}\right]_{F(X)=F_{m-1}(X)} \quad \text{for } i = 1, \ldots, n$$

(b) Fit a base learner $h_m(X)$ to the pseudo-responses, training on the set $\{(X_i, r_{im})\}_{i=1}^n$.

(c) Compute the multiplier $\gamma_m$ by solving the optimization problem:

$$\gamma_m = \arg\min_{\gamma} \sum_{i=1}^n L(y_i, F_{m-1}(X_i) + \gamma h_m(X_i))$$

(d) Update the model: $F_m(X) = F_{m-1}(X) + \gamma_m h_m(X)$.

3. Output $F_M(X) = \sum_{m=1}^M \gamma_m h_m(X)$.

The value of the weight $\gamma_m$ is found by an approximate Newton-Raphson solution, given as

$$\gamma_m = \frac{\sum_{X_i \in h_m} r_{im}}{\sum_{X_i \in h_m} |r_{im}|(2 - |r_{im}|)}$$

The parameters to be tuned were the number of classifiers within the ensemble and the base classifier within the ensemble. With 5-fold cross validation the optimal number of base estimators was found to be 100, with a decision tree as the base classifier.
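The tuned setup can be sketched with scikit-learn's `GradientBoostingClassifier`: 100 tree-based estimators scored by 5-fold cross validation, plus a confusion matrix on a held-out split. Synthetic data stands in for the fault dataset, and note that for more than two classes scikit-learn uses the multinomial rather than the binomial deviance loss.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the fault dataset (809 points, 9 features, 4 classes).
X, y = make_classification(n_samples=809, n_features=9, n_informative=6,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Gradient boosting with the tuned number of estimators (100).
clf = GradientBoostingClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X_tr, y_tr, cv=5).mean())  # training-side 5-fold CV

clf.fit(X_tr, y_tr)
cm = confusion_matrix(y_te, clf.predict(X_te))  # rows: true, cols: predicted
print(cm)
```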

2.2.4 AdaBoost Classifier

In AdaBoost we assign (non-negative) weights to the points in the data set, normalized so that they form a distribution. In each iteration we generate a training set by sampling from the data using the weights, i.e. the data point $(X_i, y_i)$ is chosen with probability $w_i$, where $w_i$ is the current weight for that data point. We generate the training set by such repeated independent sampling. After learning the current classifier, we increase the (relative) weights of the data points that it misclassifies, generate a fresh training set using the modified weights, and so on. The final classifier is essentially a weighted majority vote over all the classifiers. The description of the algorithm as in (Freund et al., 1995) is given below:
Input: n examples $(X_1, y_1), \ldots, (X_n, y_n)$, $X_i \in H \subseteq R^n$, $y_i \in \{-1, 1\}$.

1. Initialize: $w_i(1) = \frac{1}{n}$ for all $i$. Each data point starts with equal weight, so when data points are sampled from the probability distribution, each is equally likely to appear in the training set.

2. Assume there are M classifiers within the ensemble. For m = 1 to M:

(a) Generate a training set by sampling with $w_i(m)$.

(b) Learn classifier $h_m$ using this training set.

(c) Let

$$\xi_m = \sum_{i=1}^n w_i(m)\, I_{[y_i \neq h_m(X_i)]}$$

where $I_A$ is the indicator function of A, defined as $I_A = 1$ if $y_i \neq h_m(X_i)$ and $I_A = 0$ if $y_i = h_m(X_i)$; so $\xi_m$ is the weighted error of the m-th classifier.

(d) Set the hypothesis weight

$$\alpha_m = \log\left(\frac{1 - \xi_m}{\xi_m}\right)$$

such that $\alpha_m > 0$ because of the assumption that $\xi_m < 0.5$.

(e) Update the weight distribution over the training set as

$$w'_i(m+1) = w_i(m)\exp\left(\alpha_m I_{[y_i \neq h_m(X_i)]}\right)$$

and normalize the updated weights so that $w_i(m+1)$ is a distribution:

$$w_i(m+1) = \frac{w'_i(m+1)}{\sum_i w'_i(m+1)}$$

3. Output the final vote

$$h(X) = \mathrm{sgn}\left(\sum_{m=1}^M \alpha_m h_m(X)\right)$$

the weighted sum of all classifiers in the ensemble.

In the AdaBoost algorithm M is a parameter; because of the sampling with weights, the procedure can be continued for an arbitrary number of iterations. The loss function used in AdaBoost is the exponential loss, defined for a particular data point as $\exp(-y_i f(X_i))$. The parameters to be tuned were the number of classifiers within the ensemble and the base classifier within the ensemble. With 5-fold cross validation the optimal number of base estimators was found to be 100, with a decision tree as the base classifier.
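The AdaBoost setup above can be sketched with scikit-learn's `AdaBoostClassifier`, using M = 100 base classifiers; its default base learner is a depth-1 decision tree (a decision stump). Synthetic data again replaces the fault dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the fault dataset (809 points, 9 features, 4 classes).
X, y = make_classification(n_samples=809, n_features=9, n_informative=6,
                           n_classes=4, random_state=0)

# AdaBoost with the tuned number of estimators (100); the base learner
# defaults to a decision stump.
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
ada_scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross validation
print(ada_scores.mean())
```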

2.2.5 Random Forest Classifiers

Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently, with the same distribution for all trees in the forest. The main difference between standard decision trees and random forests is that in a decision tree each node is split using the best split among all variables, whereas in a random forest each node is split using the best split among a subset of predictors randomly chosen at that node. In a random forest classifier, ntree bootstrap samples are drawn from the original data, and for each bootstrap sample an unpruned classification decision tree is grown with the following modification: at each node, rather than choosing the best split among all predictors, randomly sample mtry of the predictors and choose the best split from among those variables. New data are predicted by aggregating the predictions of the ntree trees (i.e., majority vote for classification). The algorithm is described as follows, as in (Breiman, 2001):
Input: n examples $(X_1, y_1), \ldots, (X_n, y_n) = D$, $X_i \in R^n$, where D is the whole dataset.
For i = 1, ..., B:

1. Choose a bootstrap sample $D_i$ from D.

2. Construct a decision tree $T_i$ from the bootstrap sample $D_i$ such that at each node a random subset of m features is chosen and only splits on those features are considered.

Finally, given the test data $X_t$, take the majority vote for classification. Here B is the number of bootstrap data sets generated from the original data set D. The parameters to be tuned were the number of classifiers within the ensemble and the base classifier within the ensemble. With 5-fold cross validation the optimal number of base estimators was found to be 100, with a decision tree as the base classifier.
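The forest described above can be sketched with scikit-learn's `RandomForestClassifier`: B = 100 bootstrapped trees, with `max_features` playing the role of mtry (the random subset of features considered at each split). Synthetic data again replaces the fault dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the fault dataset (809 points, 9 features, 4 classes).
X, y = make_classification(n_samples=809, n_features=9, n_informative=6,
                           n_classes=4, random_state=0)

# 100 bootstrapped trees; max_features="sqrt" draws a random subset of
# sqrt(9) = 3 features at each split, as in Breiman's mtry.
clf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                             bootstrap=True, random_state=0)
rf_scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross validation
print(rf_scores.mean())
```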


Chapter 3

Results

The generated data points were few in number, and the electrical faults are normally distinguishable given the extracted features. Due to this, the classification accuracies obtained were very high. The ensemble classifiers (i.e. the boosting classifiers) were implemented in Python; the SVM classification was implemented using LIBSVM.

3.1 Results for Gradient Boosting Classifier

This section shows the training accuracies using 5-fold cross validation, the testing accuracies, and the confusion matrix.

Table 3.1: Training accuracies for the Gradient Boosting classifier with 100 estimators (5-fold cross validation)

Fold    Accuracy
1       0.9922
2       0.9844
3       0.9972
4       0.9982
5       0.9957

Table 3.2: Test accuracy for the Gradient Boosting classifier with 100 estimators

Test accuracy: 99.83%


Figure 3.1: Confusion matrix for the Gradient Boosting classifier with 100 estimators

3.2 Results for Ada Boosting Classifier

This section shows the training accuracies using 5-fold cross validation, the testing accuracies, and the confusion matrix.

Table 3.3: Training accuracies for the AdaBoost classifier with 100 estimators (5-fold cross validation)

Fold    Accuracy
1       0.9922
2       0.9922
3       0.9972
4       0.9983
5       0.9945

Table 3.4: Test accuracy for the AdaBoost classifier with 100 estimators

Test accuracy: 98.75%


Figure 3.2: Confusion matrix for the AdaBoost classifier with 100 estimators

3.3 Results for Random Forest Classifier

This section shows the training accuracies using 5-fold cross validation, the testing accuracies, and the confusion matrix.

Table 3.5: Training accuracies for the Random Forest classifier with 100 estimators (5-fold cross validation)

Fold    Accuracy
1       0.9934
2       0.9958
3       0.9972
4       0.9987
5       0.9954

Table 3.6: Test accuracy for the Random Forest classifier with 100 estimators

Test accuracy: 99.37%


Figure 3.3: Confusion matrix for the Random Forest classifier with 100 estimators

3.3.1 Results for SVM

This section shows the training accuracies using 5-fold cross validation and the testing accuracies.

Table 3.7: Training accuracies for the SVM classifier with RBF kernel (5-fold cross validation)

Fold    Accuracy
1       0.9922
2       0.9844
3       1.0000
4       0.9991
5       0.9973

Table 3.8: Test accuracy for the SVM classifier with RBF kernel

Test accuracy: 99.37%


Chapter 4

Conclusion and Future work

4.1 Conclusion

The classification of voltage disturbances using SVMs and boosting algorithms was performed, and the results obtained were promising. The main motivation behind the project was to understand various algorithms (i.e. SVMs and boosting algorithms) and their implementation. The implementation integrates two domains, Electrical Engineering and Machine Learning, to solve real-time engineering problems. The implementation was done using synthetic data for both training and testing, so the accuracies were found to be very high.

4.2 Future work

The above analysis covered only faults that occur very frequently in power system networks. There are other faults, such as the L-L (line to line) fault and arcing faults. The concept of voltage sag could also be incorporated into the Simulink model so that the occurrence of faults can be modeled more realistically and dynamically.


Bibliography

[1] Peter G. V. Axelberg, Irene Yu-Hua Gu. Support Vector Machine for Classification of Voltage Disturbances. IEEE Transactions on Power Delivery, vol. 22, no. 3, July 2007.

[2] Breiman, Leo. Random Forests. Machine Learning 45, no. 1 (2001), pp. 5-32.

[3] Friedman, Jerome H. Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics (2001), pp. 1189-1232.
