
Int. Journal of Electrical & Electronics Engg. Vol. 2, Spl. Issue 1 (2015) e-ISSN: 1694-2310 | p-ISSN: 1694-2426

NITTTR, Chandigarh, EDIT-2015

Implementation of Back-Propagation Neural Network using Scilab and its Convergence Speed Improvement

Abstract—Artificial neural networks have been widely used for solving non-linear complex tasks. With the development of computer technology, machine learning techniques are becoming a good choice, and the selection of a machine learning technique depends upon its viability for the particular application. Most non-linear problems have been solved using back-propagation based neural networks. The training time of a neural network is directly affected by its convergence speed, and several efforts have been made to improve the convergence speed of the back-propagation algorithm. This paper focuses on the implementation of the back-propagation algorithm and an effort to improve its convergence speed. The algorithm is written in SCILAB, and a UCI standard data set is used for analysis purposes. The proposed modification to the standard back-propagation algorithm provides a substantial improvement in convergence speed.

Keywords—Back-propagation, SCILAB, Neural network

I. INTRODUCTION

Multilayer perceptron (MLP) based neural networks are playing a crucial role in artificial intelligence. Non-linear complex problems can be easily solved using an MLP system. The gradient-based back-propagation algorithm is commonly used for designing the system to an optimal criterion. Back-propagation based neural network controllers show remarkable performance in the trajectory tracking problem of robotic arms[1]. Water pollution forecasting models, breast cancer detection[2], heart disease classification[3], and remote sensing image classification[4] are some of the wide range of applications which employ neural networks efficiently. Recent trends in back-propagation based neural networks use raw images directly as input, handling problems such as selecting appropriate features and dimensionality reduction efficiently. Pedestrian gender classification, traffic sign recognition, character recognition, age estimation, vision-based classification of cells, and vehicle recognition are some of the applications that process raw images directly with the gradient-based back-propagation algorithm[5]-[9].

Figure 1.1 shows the multilayer perceptron neural network architecture. The architecture consists of a single input layer, multiple hidden layers, and an output layer. All layers except the input layer are processing layers. The system processes the data in two passes. In the first pass, the forward pass, input features are fed to the input layer, and the output and gradients are calculated for each layer. In the backward pass, the free parameters are updated. The stochastic gradient (online updation) and batch gradient (offline updation) methods are used for weight updation[10]. Continuous activation functions are used for gradient-based system design. Back-propagation based systems are slow learners.

Figure 1.1 MLP architecture
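For illustration, a continuous, differentiable activation such as the logistic sigmoid and its derivative can be written in a few lines of Scilab. This is a minimal sketch under the assumption that the sigmoid is the activation used; the paper does not name its activation function explicitly, and the function names are illustrative.

```scilab
// Logistic sigmoid: a continuous activation function suitable for
// gradient-based design; operates element-wise on matrices.
function y = sigmoid(v)
    y = 1 ./ (1 + exp(-v));
endfunction

// Derivative expressed via the activation output y = sigmoid(v),
// as needed when forming the delta (error) signals.
function d = sigmoid_deriv(y)
    d = y .* (1 - y);
endfunction
```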

II. SOFTWARE IMPLEMENTATION OF BPA

SCILAB is open source software providing much the same functionality as MATLAB. The back-propagation algorithm is written in SCILAB 5.5.1. The following pseudo code is followed for writing the code.

1. Initialization
2. Hidden layer output calculation
3. Final layer output calculation
4. Error calculation and delta calculation for each processing unit
5. Calculate the loss function; if the stopping criterion is met, stop, otherwise go to the next step
6. Update the input-to-hidden layer weights
7. Update the hidden-to-output layer weights
8. Go to step 2

Initialization includes parameter initialization (learning constant, number of hidden neurons, input, desired output, weights). The output of a neuron is a function of the weighted sum of the inputs connected to that neuron. Mean square error is used as the loss function. The weights are updated using equation (i).

w(t+1) = w(t) + η · r(t) · x        (i)

where w(t+1) is the updated weight value, w(t) is the previous weight value, η is the learning constant, r(t) is the learning signal, and x is the input signal. r(t) is the delta signal, which is the product of the error and the gradient of the output. The designed system accepts the number of hidden neurons from the user and contains only a single hidden layer with one unit in the output layer. The system updates the weights in offline mode: the maximum mean square error over the whole set of samples is taken, and the gradient is calculated based on that. The updation for all samples is done with the same learning signal.
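A minimal Scilab sketch of the eight steps above is given below, for a single hidden layer and one output unit. It assumes the sigmoid and sigmoid_deriv functions sketched earlier; the identifiers (train_bpa, X, d, W1, W2, eta) are illustrative rather than the paper's actual code, and the offline max-MSE update follows one reading of the description above.

```scilab
// Illustrative back-propagation trainer (not the paper's actual code).
// X: N x n input samples, d: N x 1 desired outputs in [0,1],
// h: hidden neurons, eta: learning constant, tol: MSE criterion.
function [W1, W2, mse_hist] = train_bpa(X, d, h, eta, tol, max_iter)
    [N, n] = size(X);
    W1 = 0.01 * rand(h, n);        // step 1: input-to-hidden weights in [0, 0.01]
    W2 = 0.01 * rand(1, h);        // step 1: hidden-to-output weights in [0, 0.01]
    mse_hist = [];
    for it = 1:max_iter
        H = sigmoid(W1 * X');      // step 2: hidden layer outputs (h x N)
        Y = sigmoid(W2 * H);       // step 3: final layer outputs (1 x N)
        e = d' - Y;                // step 4: errors for all samples
        [emax, k] = max(e .^ 2);   // step 5: maximum mean square error
        mse_hist($ + 1) = emax;
        if emax < tol then break; end
        // Delta (learning) signals taken from the worst sample k
        r2 = e(k) * sigmoid_deriv(Y(k));             // output delta
        r1 = (W2' * r2) .* sigmoid_deriv(H(:, k));   // hidden deltas
        W1 = W1 + eta * r1 * X(k, :);    // step 6: input-to-hidden update
        W2 = W2 + eta * r2 * H(:, k)';   // step 7: hidden-to-output update
    end
endfunction
```

A call such as [W1, W2, hist] = train_bpa(X, d, 6, 0.5, 0.001, 5000) would mirror the settings reported in Section III (6 hidden neurons, learning rate 0.5, error criterion 0.001).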


The maximum mean square error minimization reduces the errors of all the input samples over time and converges to the desired level. A convergence graph is also plotted between the mean square error and the number of iterations, or epochs. A single iteration corresponds to the calculation of the maximum mean square error over all the samples. The developed system is tested on the Wisconsin breast cancer dataset. The dataset information and the system implementation for this dataset are explained in the next section.
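The convergence graph can be drawn directly from the recorded error history; a minimal Scilab plotting sketch, assuming the mse_hist vector produced by the training sketch in Section II, is:

```scilab
// Plot maximum mean square error against iteration (epoch)
plot(1:length(mse_hist), mse_hist);
xtitle("Convergence graph", "Iteration (epoch)", "Maximum mean square error");
```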

III. DATASET INFORMATION AND SYSTEM IMPLEMENTATION

The Wisconsin Breast Cancer (UCI) database of 699 patients is used for experimental purposes. This dataset contains 9 input features, namely Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli, and Mitoses, and, corresponding to these features, there are two classes of data: benign and malignant. There are 458 benign and 241 malignant instances; 16 instances are of unknown category, and some feature vectors are replicated within the same class. After removing the replicated and unknown (16) instances, the dataset for experimentation reduces to 445 instances: 209 benign and 236 malignant. The benign and malignant data are each divided into a training set (70%) and a testing set (30%): 146 benign instances are used for training and 63 for testing, while 165 malignant instances are used for training and 71 for testing[11].

An architecture of 9 input units, 6 hidden neurons, and 1 output unit is built. The benign and malignant datasets are trained individually. The learning rate is 0.5 and the mean square error criterion is 0.001; the stopping criterion is an error less than this mean square error. The weights are initialized in the range 0 to 0.01; several experiments were conducted to select this weight initialization.
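A sketch of the 70/30 split in Scilab is shown below; the file name wbc_clean.txt and the column layout (nine features in columns 1-9, class label in column 10 with 0 = benign, 1 = malignant) are assumptions for illustration, not the paper's actual files.

```scilab
// Illustrative loading and 70/30 split of the cleaned dataset
// (file name and column layout are assumptions).
D = fscanfMat("wbc_clean.txt");              // 445 x 10 matrix
benign    = D(find(D(:, 10) == 0), 1:9);     // 209 benign samples
malignant = D(find(D(:, 10) == 1), 1:9);     // 236 malignant samples

nb = round(0.7 * size(benign, 1));           // 146 benign for training
nm = round(0.7 * size(malignant, 1));        // 165 malignant for training
benign_train    = benign(1:nb, :);
benign_test     = benign(nb+1:$, :);
malignant_train = malignant(1:nm, :);
malignant_test  = malignant(nm+1:$, :);
```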

IV. CONVERGENCE GRAPH FOR BENIGN AND MALIGNANT DATASET

Figure 1.2 shows the convergence graph for the benign dataset. The system achieves the stopping criterion after 1452 iterations. Figure 1.3 shows the convergence graph for the malignant dataset, which converges after 480 iterations.

Figure 1.2 Benign dataset convergence curve

Figure 1.3 Malignant dataset convergence curve

V. PROPOSED MODIFICATION

The convergence speed of a neural-network-based system is a very crucial parameter for good training. The generalization of the system depends upon how well the system gets trained, and training usually requires a lot of time for complex data. Keeping the training time in mind, an idea was developed to change the order of weight updation. The standard back-propagation algorithm follows the 8 steps discussed above. If the weight updation steps 6 and 7 are exchanged, the updated weights get multiplied by the delta error, and hence a modified error flows from the output to the hidden layer. There is a chance of faster convergence without deviating largely from the standard steps, and the accuracy of the proposed system will quantify the validity of the new idea. Based on the above assumption, the proposed modification is applied and the convergence graphs for the benign and malignant datasets are plotted. The proposed method is tested with the same initialization as the standard method, differing only in steps 6 and 7; the reordered updates are sketched below.
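In terms of the illustrative training loop from Section II, one reading of the modification is the following: the hidden-to-output weights W2 are updated first, and the hidden delta r1 is then formed from the already-updated W2, so the modified error flows from the output to the hidden layer.

```scilab
// Proposed ordering (steps 7 then 6): update W2 first, then let the
// updated W2 shape the hidden delta before updating W1.
W2 = W2 + eta * r2 * H(:, k)';                 // former step 7 first
r1 = (W2' * r2) .* sigmoid_deriv(H(:, k));     // delta uses updated W2
W1 = W1 + eta * r1 * X(k, :);                  // former step 6 second
```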

Figure 1.4 shows the convergence graph for the benign dataset; the system achieves the stopping criterion after 1417 iterations. Figure 1.5 shows the convergence graph for the malignant dataset, which converges after 434 iterations.

Figure 1.4 Benign dataset convergence curve (modified system)


Figure 1.5 Malignant dataset convergence curve (modified system)

VI. RESULTS

Table 1 Standard and modified BPA convergence iterations

Dataset      Standard BPA convergence iterations    Proposed BPA convergence iterations
Benign       1452                                   1417
Malignant    480                                    434

Table 1 shows the number of iterations required for convergence. An improvement of 2.41% for the benign data and 9.58% for the malignant data is observed. The trained system is also tested for accuracy using the test dataset. It has been found that there is no change in accuracy owing to the proposed method.
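The improvement figures follow directly from the iteration counts in Table 1, e.g.:

```scilab
// Percentage reduction in convergence iterations (Table 1)
benign_gain    = (1452 - 1417) / 1452 * 100;   // approx. 2.41
malignant_gain = (480 - 434) / 480 * 100;      // approx. 9.58
mprintf("benign: %.2f%%, malignant: %.2f%%\n", benign_gain, malignant_gain);
```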

VII. CONCLUSION

Open source software is always beneficial as far as resources are concerned. SCILAB is evolving swiftly and is good software for researchers, avoiding the constraints of costly licensed software. Back-propagation based multilayer perceptron code was written in SCILAB, and the system works well. The proposed method of weight updation provides faster convergence while keeping the same accuracy.

FUTURE SCOPE

The proposed system has been tested on only a single dataset. Efforts will be made to test the effectiveness of the modified algorithm on more standard datasets and on multi-class datasets.

REFERENCES

[1] J. J. Rubio, "Modified optimal control with a backpropagation network for robotic arms", IET Control Theory & Applications, vol. 6, pp. 2216–2225, 2012.

[2] Y. M. George, H. H. Zayed, M. I. Roushdy, and B. M. Elbagoury, "Remote Computer-Aided Breast Cancer Detection and Diagnosis System Based on Cytological Images", IEEE Systems Journal, vol. 8, no. 3, pp. 949–964, 2014.

[3] K. U. Rani, "Analysis of heart diseases dataset using neural network approach", International Journal of Data Mining & Knowledge Management Process, vol. 1, no. 5, pp. 1–8, 2011.

[4] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of Back Propagation Neural Network in the Classification of High Resolution Remote Sensing Image", 18th International Conference on Geoinformatics, pp. 1–6, 2010.

[5] C.-B. Ng, Y.-H. Tay, and B.-M. Goi, "A convolutional neural network for pedestrian gender recognition", in Advances in Neural Networks -- ISNN 2013, Lecture Notes in Computer Science, vol. 7951, Springer International Publishing, pp. 558–564, 2013.

[6] S. A. Radzi and M. Khalil-Hani, "Character recognition of license plate number using convolutional neural network", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7066 LNCS, pp. 45–55, 2011.

[7] P. Buyssens, A. Elmoataz, and L. Olivier, "Multiscale Convolutional Neural Networks for Vision-based Classification of Cells", in Computer Vision -- ACCV 2012, Lecture Notes in Computer Science, vol. 7725, Springer Berlin Heidelberg, pp. 342–352, 2013.

[8] J. Jin, K. Fu, and C. Zhang, "Traffic Sign Recognition With Hinge Loss Trained Convolutional Neural Networks", IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 1991–2000, 2014.

[9] C. Yan, C. Lang, T. Wang, X. Du, and C. Zhang, "Age Estimation Based on Convolutional Neural Network", in Advances in Multimedia Information Processing -- PCM 2014, Lecture Notes in Computer Science, vol. 8879, Springer International Publishing, pp. 211–220, 2014.

[10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition", Proceedings of the IEEE, vol. 86, pp. 2278–2323, 1998.

[11] O. L. Mangasarian, R. Setiono, and W. H. Wolberg, "Pattern recognition via linear programming: Theory and application to medical diagnosis", in Large-Scale Numerical Optimization, T. F. Coleman and Y. Li, Eds., SIAM Publications, Philadelphia, pp. 22–30, 1990.

[12] P. M. Leod and B. Verma, "Variable hidden neuron ensemble for mass classification in digital mammograms", IEEE Computational Intelligence Magazine, vol. 8, no. 1, pp. 68–76, 2013.

[13] V. Baskaran, A. Guergachi, R. K. Bali, and R. N. Naguib, "Predicting breast screening attendance using machine learning techniques", IEEE Trans. Inf. Technol. Biomed., vol. 15, no. 2, pp. 251–259, 2011.

[14] S. Ghosh, S. Mondal, and B. Ghosh, "A comparative study of breast cancer detection based on SVM and MLP BPN classifier", in Proceedings of the IEEE Conference on Automation, Control, Energy and Systems (ACES), pp. 1–4, 2014.