
Int. Journal of Electrical & Electronics Engg. Vol. 2, Spl. Issue 1 (2015) e-ISSN: 1694-2310 | p-ISSN: 1694-2426

NITTTR, Chandigarh, EDIT-2015

Implementation of Back-Propagation Neural Network using Scilab and its Convergence Speed Improvement

Abstract—Artificial neural networks have been widely used for solving non-linear complex tasks. With the development of computer technology, machine learning techniques are becoming a good choice, and the selection of a machine learning technique depends upon its viability for the particular application. Most non-linear problems have been solved using back-propagation based neural networks. The training time of a neural network is directly affected by its convergence speed, and several efforts have been made to improve the convergence speed of the back-propagation algorithm. This paper focuses on the implementation of the back-propagation algorithm and an effort to improve its convergence speed. The algorithm is written in SCILAB, and a UCI standard data set is used for analysis purposes. The proposed modification to the standard back-propagation algorithm provides a substantial improvement in convergence speed.

Keywords—Back-propagation, SCILAB, Neural network

I. INTRODUCTION

Multilayer perceptron (MLP) based neural networks are playing a crucial role in artificial intelligence. Non-linear complex problems can be easily solved using an MLP system. The gradient-based back-propagation algorithm is commonly used for designing the system to an optimal criterion. Back-propagation based neural network controllers show remarkable performance in the trajectory tracking problem of robotic arms[1]. Water pollution forecasting models, breast cancer detection[2], heart disease classification[3], and remote sensing image classification[4] are some of the wide range of applications which employ neural networks efficiently. Recent trends in back-propagation based neural networks use raw images directly as input, handling problems such as selecting appropriate features and dimensionality reduction efficiently. Pedestrian gender classification, traffic sign recognition, character recognition, age estimation, vision-based classification of cells, and vehicle recognition are some of the applications that process raw images directly with the gradient-based back-propagation algorithm[5]-[9].

Figure 1.1 shows the multilayer perceptron neural network architecture. The architecture consists of a single input layer, multiple hidden layers, and an output layer. All layers except the input layer are processing layers. The system processes the data in two passes. In the first pass, the forward pass, input features are fed to the input layer, and the output and gradients are calculated for each layer. In the backward pass, the free parameters are updated. The stochastic gradient (online updation) and batch gradient (offline updation) methods are used for weight updation[10]. Continuous activation functions are used for gradient-based system design. Back-propagation based systems are slow learners.

Figure 1.1 MLP architecture
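For illustration, a continuous, differentiable activation such as the logistic sigmoid and its derivative can be written in a few lines of Scilab. This is a minimal sketch under the assumption that the sigmoid is the activation used; the paper does not name its activation function explicitly, and the function names are illustrative.

```scilab
// Logistic sigmoid: a continuous activation function suitable for
// gradient-based design; operates element-wise on matrices.
function y = sigmoid(v)
    y = 1 ./ (1 + exp(-v));
endfunction

// Derivative expressed via the activation output y = sigmoid(v),
// as needed when forming the delta (error) signals.
function d = sigmoid_deriv(y)
    d = y .* (1 - y);
endfunction
```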

II. SOFTWARE IMPLEMENTATION OF BPA

SCILAB is open source software providing much the same functionality as MATLAB. The back-propagation algorithm is written in SCILAB 5.5.1. The following pseudo code is followed for writing the code.

1. Initialization
2. Hidden layer output calculation
3. Final layer output calculation
4. Error calculation and delta calculation for each processing unit
5. Calculate the loss function; if the stopping criterion is met, stop, otherwise go to the next step
6. Update the input-to-hidden layer weights
7. Update the hidden-to-output layer weights
8. Go to step 2

Initialization includes parameter initialization (learning constant, number of hidden neurons, input, desired output, weights). The output of a neuron is a function of the weighted sum of the inputs connected to that neuron. Mean square error is used as the loss function. The weights are updated using equation (i).

w(t+1) = w(t) + η · r(t) · x        (i)

where w(t+1) is the updated weight value, w(t) is the previous weight value, η is the learning constant, r(t) is the learning signal, and x is the input signal. r(t) is the delta signal, which is the product of the error and the gradient of the output. The designed system accepts the number of hidden neurons from the user and contains only a single hidden layer with one unit in the output layer. The system updates the weights in offline mode: the maximum mean square error over the whole set of samples is taken, and the gradient is calculated based on that. The updation for all samples is done with the same learning signal.
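A minimal Scilab sketch of the eight steps above is given below, for a single hidden layer and one output unit. It assumes the sigmoid and sigmoid_deriv functions sketched earlier; the identifiers (train_bpa, X, d, W1, W2, eta) are illustrative rather than the paper's actual code, and the offline max-MSE update follows one reading of the description above.

```scilab
// Illustrative back-propagation trainer (not the paper's actual code).
// X: N x n input samples, d: N x 1 desired outputs in [0,1],
// h: hidden neurons, eta: learning constant, tol: MSE criterion.
function [W1, W2, mse_hist] = train_bpa(X, d, h, eta, tol, max_iter)
    [N, n] = size(X);
    W1 = 0.01 * rand(h, n);        // step 1: input-to-hidden weights in [0, 0.01]
    W2 = 0.01 * rand(1, h);        // step 1: hidden-to-output weights in [0, 0.01]
    mse_hist = [];
    for it = 1:max_iter
        H = sigmoid(W1 * X');      // step 2: hidden layer outputs (h x N)
        Y = sigmoid(W2 * H);       // step 3: final layer outputs (1 x N)
        e = d' - Y;                // step 4: errors for all samples
        [emax, k] = max(e .^ 2);   // step 5: maximum mean square error
        mse_hist($ + 1) = emax;
        if emax < tol then break; end
        // Delta (learning) signals taken from the worst sample k
        r2 = e(k) * sigmoid_deriv(Y(k));             // output delta
        r1 = (W2' * r2) .* sigmoid_deriv(H(:, k));   // hidden deltas
        W1 = W1 + eta * r1 * X(k, :);    // step 6: input-to-hidden update
        W2 = W2 + eta * r2 * H(:, k)';   // step 7: hidden-to-output update
    end
endfunction
```

A call such as [W1, W2, hist] = train_bpa(X, d, 6, 0.5, 0.001, 5000) would mirror the settings reported in Section III (6 hidden neurons, learning rate 0.5, error criterion 0.001).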


The maximum mean square error minimization reduces the errors of all the input samples over time and converges to the desired level. A convergence graph is also plotted between the mean square error and the number of iterations, or epochs. A single iteration corresponds to the calculation of the maximum mean square error over all the samples. The developed system is tested on the Wisconsin breast cancer dataset. The dataset information and the system implementation for this dataset are explained in the next section.
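The convergence graph can be drawn directly from the recorded error history; a minimal Scilab plotting sketch, assuming the mse_hist vector produced by the training sketch in Section II, is:

```scilab
// Plot maximum mean square error against iteration (epoch)
plot(1:length(mse_hist), mse_hist);
xtitle("Convergence graph", "Iteration (epoch)", "Maximum mean square error");
```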

III. DATASET INFORMATION AND SYSTEM IMPLEMENTATION

The Wisconsin Breast Cancer (UCI) database of 699 patients is used for experimental purposes. This dataset contains 9 input features, namely Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli, and Mitoses, and, corresponding to these features, there are two classes of data: benign and malignant. There are 458 benign and 241 malignant instances; 16 instances are of unknown category, and some feature vectors are replicated within the same class. After removing the replicated and unknown (16) instances, the dataset for experimentation reduces to 445 instances: 209 benign and 236 malignant. The benign and malignant data are each divided into a training set (70%) and a testing set (30%): 146 benign instances are used for training and 63 for testing, while 165 malignant instances are used for training and 71 for testing[11].

An architecture of 9 input units, 6 hidden neurons, and 1 output unit is built. The benign and malignant datasets are trained individually. The learning rate is 0.5 and the mean square error criterion is 0.001; the stopping criterion is an error less than this mean square error. The weights are initialized in the range 0 to 0.01; several experiments were conducted to select this weight initialization.
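A sketch of the 70/30 split in Scilab is shown below; the file name wbc_clean.txt and the column layout (nine features in columns 1-9, class label in column 10 with 0 = benign, 1 = malignant) are assumptions for illustration, not the paper's actual files.

```scilab
// Illustrative loading and 70/30 split of the cleaned dataset
// (file name and column layout are assumptions).
D = fscanfMat("wbc_clean.txt");              // 445 x 10 matrix
benign    = D(find(D(:, 10) == 0), 1:9);     // 209 benign samples
malignant = D(find(D(:, 10) == 1), 1:9);     // 236 malignant samples

nb = round(0.7 * size(benign, 1));           // 146 benign for training
nm = round(0.7 * size(malignant, 1));        // 165 malignant for training
benign_train    = benign(1:nb, :);
benign_test     = benign(nb+1:$, :);
malignant_train = malignant(1:nm, :);
malignant_test  = malignant(nm+1:$, :);
```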

IV. CONVERGENCE GRAPH FOR BENIGN AND MALIGNANT DATASET

Figure 1.2 shows the convergence graph for the benign dataset. The system achieves the stopping criterion after 1452 iterations. Figure 1.3 shows the convergence graph for the malignant dataset, which converges after 480 iterations.

Figure 1.2 Benign dataset convergence curve

Figure 1.3 Malignant dataset convergence curve

V. PROPOSED MODIFICATION

The convergence speed of a neural-network-based system is a very crucial parameter for good training. The generalization of the system depends upon how well the system gets trained, and training usually requires a lot of time for complex data. Keeping the training time in mind, an idea was developed to change the order of weight updation. The standard back-propagation algorithm follows the 8 steps discussed above. If the weight updation steps 6 and 7 are exchanged, the updated weights get multiplied by the delta error, and hence a modified error flows from the output to the hidden layer. There is a chance of faster convergence without deviating largely from the standard steps, and the accuracy of the proposed system will quantify the validity of the new idea. Based on the above assumption, the proposed modification is applied and the convergence graphs for the benign and malignant datasets are plotted. The proposed method is tested with the same initialization as the standard method, differing only in steps 6 and 7; the reordered updates are sketched below.
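In terms of the illustrative training loop from Section II, one reading of the modification is the following: the hidden-to-output weights W2 are updated first, and the hidden delta r1 is then formed from the already-updated W2, so the modified error flows from the output to the hidden layer.

```scilab
// Proposed ordering (steps 7 then 6): update W2 first, then let the
// updated W2 shape the hidden delta before updating W1.
W2 = W2 + eta * r2 * H(:, k)';                 // former step 7 first
r1 = (W2' * r2) .* sigmoid_deriv(H(:, k));     // delta uses updated W2
W1 = W1 + eta * r1 * X(k, :);                  // former step 6 second
```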

Figure 1.4 shows the convergence graph for the benign dataset; the system achieves the stopping criterion after 1417 iterations. Figure 1.5 shows the convergence graph for the malignant dataset, which converges after 434 iterations.

Figure 1.4 Benign dataset convergence curve (modified system)


Figure 1.5 Malignant dataset convergence curve (modified system)

VI. RESULTS

Table 1 Standard and modified BPA convergence iterations

Dataset      Standard BPA convergence iterations    Proposed BPA convergence iterations
Benign       1452                                   1417
Malignant    480                                    434

Table 1 shows the number of iterations required for convergence. An improvement of 2.41% for the benign data and 9.58% for the malignant data is observed. The trained system is also tested for accuracy using the test dataset. It has been found that there is no change in accuracy owing to the proposed method.
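The improvement figures follow directly from the iteration counts in Table 1, e.g.:

```scilab
// Percentage reduction in convergence iterations (Table 1)
benign_gain    = (1452 - 1417) / 1452 * 100;   // approx. 2.41
malignant_gain = (480 - 434) / 480 * 100;      // approx. 9.58
mprintf("benign: %.2f%%, malignant: %.2f%%\n", benign_gain, malignant_gain);
```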

VII. CONCLUSION

Open source software is always beneficial as far as resources are concerned. SCILAB is evolving swiftly and is good software for researchers, avoiding the constraints of costly licensed software. Back-propagation based multilayer perceptron code was written in SCILAB, and the system works well. The proposed method of weight updation provides faster convergence while keeping the same accuracy.

FUTURE SCOPE

The proposed system has been tested on only a single dataset. Efforts will be made to test the effectiveness of the modified algorithm on more standard datasets and on multi-class datasets.

REFERENCES

[1] J. J. Rubio, "Modified optimal control with a backpropagation network for robotic arms", IET Control Theory & Applications, vol. 6, pp. 2216–2225, 2012.

[2] Y. M. George, H. H. Zayed, M. I. Roushdy, and B. M. Elbagoury, "Remote Computer-Aided Breast Cancer Detection and Diagnosis System Based on Cytological Images", IEEE Systems Journal, vol. 8, no. 3, pp. 949–964, 2014.

[3] K. U. Rani, "Analysis of heart diseases dataset using neural network approach", International Journal of Data Mining & Knowledge Management Process, vol. 1, no. 5, pp. 1–8, 2011.

[4] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of Back Propagation Neural Network in the Classification of High Resolution Remote Sensing Image", 18th International Conference on Geoinformatics, pp. 1–6, 2010.

[5] C.-B. Ng, Y.-H. Tay, and B.-M. Goi, "A convolutional neural network for pedestrian gender recognition", in Advances in Neural Networks -- ISNN 2013, Lecture Notes in Computer Science, vol. 7951, Springer International Publishing, pp. 558–564, 2013.

[6] S. A. Radzi and M. Khalil-Hani, "Character recognition of license plate number using convolutional neural network", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7066 LNCS, pp. 45–55, 2011.

[7] P. Buyssens, A. Elmoataz, and L. Olivier, "Multiscale Convolutional Neural Networks for Vision-based Classification of Cells", in Computer Vision -- ACCV 2012, Lecture Notes in Computer Science, vol. 7725, Springer Berlin Heidelberg, pp. 342–352, 2013.

[8] J. Jin, K. Fu, and C. Zhang, "Traffic Sign Recognition With Hinge Loss Trained Convolutional Neural Networks", IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 1991–2000, 2014.

[9] C. Yan, C. Lang, T. Wang, X. Du, and C. Zhang, "Age Estimation Based on Convolutional Neural Network", in Advances in Multimedia Information Processing -- PCM 2014, Lecture Notes in Computer Science, vol. 8879, Springer International Publishing, pp. 211–220, 2014.

[10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition", Proceedings of the IEEE, vol. 86, pp. 2278–2323, 1998.

[11] O. L. Mangasarian, R. Setiono, and W. H. Wolberg, "Pattern recognition via linear programming: Theory and application to medical diagnosis", in Large-Scale Numerical Optimization, T. F. Coleman and Y. Li, Eds., SIAM Publications, Philadelphia, pp. 22–30, 1990.

[12] P. M. Leod and B. Verma, "Variable hidden neuron ensemble for mass classification in digital mammograms", IEEE Computational Intelligence Magazine, vol. 8, no. 1, pp. 68–76, 2013.

[13] V. Baskaran, A. Guergachi, R. K. Bali, and R. N. Naguib, "Predicting breast screening attendance using machine learning techniques", IEEE Trans. Inf. Technol. Biomed., vol. 15, no. 2, pp. 251–259, 2011.

[14] S. Ghosh, S. Mondal, and B. Ghosh, "A comparative study of breast cancer detection based on SVM and MLP BPN classifier", in Proceedings of the IEEE Conference on Automation, Control, Energy and Systems (ACES), pp. 1–4, 2014.