
Research Article

A Novel Distributed Online Anomaly Detection Method in Resource-Constrained Wireless Sensor Networks

Zhiguo Ding,1,2 Haikuan Wang,1 Minrui Fei,1 and Dajun Du1

1 Shanghai Key Laboratory of Power Station Automation Technology, School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200072, China
2 College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua, Zhejiang 321004, China

Correspondence should be addressed to Haikuan Wang; eeewhk@163.com

Received 17 March 2015; Accepted 14 May 2015

Academic Editor: Fuwen Yang

Copyright © 2015 Zhiguo Ding et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In this paper, a novel distributed online anomaly detection method for resource-constrained WSNs is proposed. Firstly, the spatiotemporal correlation existing in the sensed data is exploited, and a series of single anomaly detectors is built in each distributed sensor node based on ensemble learning theory. Secondly, these trained detectors are broadcast to the member sensor nodes in the cluster and, combined with each node's own trained detector, form the initial ensemble detector. Thirdly, considering the resource constraints of WSNs, ensemble pruning based on biogeography-based optimization (BBO) is employed in the cluster head node to obtain an optimized subset of ensemble members. Further, the pruned ensemble detector, coded by a state matrix, is broadcast to each member sensor node for distributed online global anomaly detection. Finally, experiments on a real WSN dataset demonstrate the effectiveness of the proposed method.

1. Introduction

Wireless sensor networks (WSNs) integrate sensing, data processing, and wireless communication capabilities [1] and have received considerable attention for many types of applications. However, WSNs are highly susceptible to various kinds of interference and faults, such as hardware faults, electromagnetic interference, environmental factors, and network intrusion. Consequently, anomalous observations arise inevitably in WSNs. These unusual observations (i.e., anomalies or outliers) can generally be classified into two different types: one is the error and the other is the event [2, 3]. The former refers to observations that deviate significantly from the true measurement, such as dirty data; detecting and cleaning them in a timely manner can save the limited memory and computation as well as expensive communication resources. The latter usually refers to an event that has occurred, such as a temperature change caused by a forest fire; detecting such an event in time helps to take corresponding measures. With the wide application of WSNs, detecting these anomalous observations accurately and in a timely manner is an important task.

Though many anomaly detection methods based on data mining and machine learning are available, most of them do not take the resource limitation into account and are not designed specifically for WSNs. Considering the limited resources (i.e., computation, memory, communication, and so on) of WSNs, how to develop a suitable anomaly detection method becomes an important and urgent problem. Up to now, researchers have done some work and proposed anomaly detection methods for WSNs [1, 4–7], which take the resource limitation into account to some extent.

As one of the four main research directions in the machine learning community, ensemble learning has attracted many researchers' attention and has been used widely in different applications [8]. However, little work has been done on anomaly detection for WSNs. A large body of theoretical and empirical research has shown that combining the detection results of multiple individual detectors can improve the generalization performance observably, but the original ensemble learning method usually needs to build and store multiple individual detectors, which incurs a large amount of computation and storage resources and may not be



appropriate for WSNs. A possible strategy is to select part of the individual detectors to perform the anomaly detection. Consequently, ensemble pruning is a necessary strategy [9], which can obtain better (or at least the same) performance compared to the initial ensemble while the number of individual detectors decreases greatly.

Analyzing the spatiotemporal correlation of sensed data in WSNs, and motivated by the online ensemble learning method, this paper proposes a distributed anomaly detection method for WSNs from the perspective of both model building and resource saving. Further, to mitigate the high communication requirements caused by broadcasting ensemble detectors, BBO-based ensemble pruning is used to select the optimized individual detectors to build the final ensemble detector, which has at least the same performance as the initial ensemble detector. The main contributions of this paper include the following:

(1) A distributed anomaly detection method for WSNs is proposed based on online ensemble learning.

(2) BBO-based ensemble pruning is used to obtain the optimal subset, saving the limited storage and communication resources in WSNs.

(3) A state matrix encoding method is designed for the ensemble detector, which can decrease the communication and memory overhead significantly.

The rest of this paper is organized as follows. The related work is described in Section 2. Based on ensemble learning theory and BBO, our proposed anomaly detection method is presented in Section 3. Experimental analysis is provided in Section 4. Finally, conclusions and future work are presented in Section 5.

2. Related Work

To clearly explain the motivation of this paper, the state of the art of three key aspects related to our work is summarized, that is, anomaly detection in WSNs, online ensemble learning, and ensemble pruning.

2.1. Anomaly Detection Method and Classification in WSNs. With the rapid development and wide application of WSNs, some anomaly detection techniques for WSNs have been developed and summarized from different perspectives. For example, [10] discussed the prioritization of various characteristics of WSNs, including the spatiotemporal and attribute correlations of sensed data, anomaly types, anomaly identification, anomaly score, and so forth; it provided a brief overview of the classification strategies for anomaly detection methods in WSNs deployed in harsh environments, grouping them into four types, that is, statistical-based, nearest-neighbor-based, clustering-based, and classification-based techniques. Based on the nature of sensor data and the specific requirements and limitations of WSNs, [1] provided a comprehensive overview of existing anomaly detection techniques specifically developed for WSNs. It presented a technique-based taxonomy and gave a comparative table

which can be used as a guideline to select a suitable method for a specific application. For example, based on characteristics such as data types, anomaly types, and anomaly degree, statistical-based methods are further classified into parametric and nonparametric methods; based on how the probability distribution model is built, classification-based methods are categorized as support vector machine-based methods, radial basis function neural network-based methods [11], and so on. The interested reader is referred to further anomaly detection methods and taxonomies in [5, 6, 12, 13]. There may be some overlaps among the taxonomies of the aforementioned methods, and machine learning and computational intelligence-based techniques are, beyond all doubt, an increasingly important research direction for complicated applications. Moreover, though these methods have acceptable performance to some extent, the resource constraint was usually not, or only seldom, taken into account; with the wide application of WSNs, this constraint has also attracted some researchers' attention [14]. Another noticeable characteristic of the aforementioned methods is that only a single detector or model is trained. It is well known that a single model may not learn the complicated decision boundary well for a complicated dataset. For sensed streaming data with a dynamic data distribution, a single model can only learn the whole profile with difficulty or at an expensive cost; training an artificial neural network in this way, for example, leads to overlearning and degrades the generalization performance. Besides, concept drift [15] is a common phenomenon in datasets collected from WSNs, and a single model has difficulty dealing with such dynamic changes of the data distribution and providing a comprehensive detector. Moreover, updating the detector based on all available data is also hard for online learning.

2.2. Ensemble Learning Method. Ensemble learning is a computational intelligence method, and theory and experiments have proved that combining the predictions of many individual detectors can enhance the generalization performance. Many different ensemble learning methods are used widely and successfully, such as Bagging [16, 17], Boosting [18, 19], Random Forest [20], and their online versions [21, 22]. Generally, an ensemble anomaly detector is constructed in two steps. Firstly, a number of base detectors are trained using the training dataset. Secondly, a result combination strategy is designed to obtain the aggregated result from the results of each single detector. For a time-series dataset, such as the sensed dataset in WSNs, learning a single model to profile the whole dataset is usually difficult or impossible. Generally, there are two ensemble patterns to handle streaming data, that is, the horizontal ensemble and the vertical ensemble. The former follows the strategy that the nearest n consecutive data chunks are first used to train n base detectors, and a combination method is then employed to build the ensemble detector used to predict the data in the yet-to-arrive chunk. The advantage of the horizontal ensemble is that it can handle noisy data in the streaming dataset, because the prediction on the newly arriving data chunk depends on the combination of different chunks. Even if


the noisy data deteriorate some chunks, the ensemble can still generate a relatively accurate prediction. The disadvantage of the horizontal ensemble is that the streaming data is continuously changing and the information contained in the previous chunks may become invalid, so using these old-concept models will not improve the overall prediction result. The latter ensemble pattern is the vertical ensemble, which uses only the newest chunk to build the ensemble model. The advantage of the vertical ensemble is that it uses different algorithms (heterogeneous ensemble) on the same dataset, or the same algorithm (homogeneous ensemble) on different sampled subdatasets from the chunk, to build the model, which can decrease the bias error between models. The disadvantage is that the vertical ensemble assumes that the data chunk is errorless; in real situations, this precondition is usually hard to meet. Because online ensemble learning can address the concept drift and noisy data problems in streaming data, ensemble learning has recently been used in anomaly detection for WSNs [23–25]. In this paper, after exploiting the spatiotemporal correlation existing in the sensed dataset of WSNs, a distributed method is proposed based on the horizontal ensemble and a vertical-like ensemble; Section 3 gives the detailed description.

2.3. Ensemble Pruning Based on Optimization Search Methods. Although ensemble learning has many advantages, a nontrivial disadvantage is that it needs more memory and, especially, more communication resources to store and communicate multiple detectors in WSNs, which can drain energy quickly and is intolerable in WSNs. The principle of "many could be better than all" in the ensemble learning community [9] implies that combining all detectors may not be a good choice. Ensemble pruning, a necessary strategy to solve the resource-limitation problem [26], is therefore employed; it selects a subset of the initial ensemble and obtains better, or at least equal, detection performance than the original ensemble. Its most important advantage here is that it reduces the communication requirement greatly; in WSNs, broadcasting relatively few detectors can save battery energy considerably. However, it is well known that pruning an ensemble of size N requires searching the space composed of 2^N − 1 nonempty subensembles, which is an NP-complete problem. Hence, heuristic searching approaches are used to find an appropriate subset. Biogeography-based optimization (BBO) [27, 28], a novel population-based global optimization method, has some features in common with existing optimization methods such as the genetic algorithm (GA) and harmony search (HS) [29]. In this paper, BBO is used to obtain an optimal/suboptimal ensemble for reducing the communication cost. To the best of our knowledge, as a new optimization method, BBO has not previously been applied in the field of WSNs, and our study extends its application.

3. Proposed Method

Motivated by the increasingly used online ensemble learning methodology [25] and considering the resource limitation of sensor nodes in WSNs, we propose a distributed online anomaly detection method based on ensemble learning. Further, BBO is used for ensemble pruning to decrease the communication and memory requirements.

Figure 1: The considered WSN, showing the base station (BS), cluster head (CH) nodes, non-CH nodes, the CH-to-BS links, and the boundaries of the cluster heads.

3.1. Problem Statement of WSNs. In this paper, we assume that the WSN is deployed in an untouched area and that, to assure the quality of the sensed data, the sensor nodes are deployed densely. Besides, we assume that the sensor nodes are time synchronized, which is mainly for clear presentation rather than a limitation of our proposed method. Figure 1 shows a WSN which consists of a large number of sensor nodes and a base station (BS) [30]. Generally, the WSN can be represented as a graph $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_{|V|}\}$ is a finite set of vertices and $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is a finite set of edges; vertex $v_i$ ($i = 1, \ldots, |V|$) and edge $e_i$ ($i = 1, \ldots, |E|$) refer to a sensor node and the one-hop or multihop communication link reachable between sensors $v_i$ and $v_j$, respectively.

From Figure 1, we can see that clusters are formed based on the geographical position information of the nodes and the reachable communication capability. Here we only consider one-hop communication among sensor nodes; similarly, this assumption is mainly for a clear presentation of our proposed method rather than a limitation of the communication capability of the sensors, and the method can easily be extended to multihop relaying communication. Besides, in order to describe our proposed anomaly detection method concisely, a relatively small subnetwork consisting of densely deployed sensor nodes is taken into account, which forms a cluster $C_i$ consisting of one cluster head node and a number of member sensor nodes, represented as $\mathrm{CH}_i$ and $N_{ij}$, $j = 1, \ldots, |C_i|$, respectively. For the whole WSN, $V = C_1 \cup C_2 \cup \cdots \cup C_n$ and $C_i \cap C_j = \Phi$. All nodes in a cluster are reachable to each other by one-hop communication, and the communication between clusters depends on the direct links of the cluster heads. In each cluster, the selection of the cluster head is randomized among all nodes in that cluster to avoid draining the energy of any single node.

Consider one cluster $C_i = \{\mathrm{CH}_i, N_{i1}, \ldots, N_{im}\}$, which contains a cluster head $\mathrm{CH}_i$ and its $m$ spatially neighboring nodes ($N_{ij}$, $j = 1, \ldots, m$). Each sensor node in the subnetwork measures a data vector at every time interval Δt, which is composed of multiple attribute values. For the cluster head $\mathrm{CH}_i$, the observation is $X^i = (x^i_1, x^i_2, \ldots, x^i_d)$, where $d$ denotes the dimension. For the $j$th neighbor node $N_{ij}$, the observation is $X^i_j = (x^i_{j1}, x^i_{j2}, \ldots, x^i_{jd})$. Nodes in the cluster collect samples synchronously, and the goal of our proposed method is to identify online whether the new observations of each sensor node are normal or anomalous.

3.2. Spatial and Temporal (Spatiotemporal) Correlation of the Sensed Dataset. For the sensed dataset in a cluster, we first describe the spatiotemporal correlation, which will be used later to build our proposed online ensemble detection.

The sensor dataset collected from a WSN is a time series dataset. A time series is a sequence of values $X = \{x(t), t = 1, \ldots, n\}$ which follows a nonrandom order, where the $n$ consecutive observation values are collected at the same time interval. Analyzing and learning from these observations [31] can help to understand the data trend over time, to build an appropriate detector based on the temporal correlation, and to predict the label of new coming observations.

To obtain the detector, the foremost requirement is a stationary time series dataset. Several data processing methods can eliminate the data trend and produce a stationary time series, such as polynomial fitting, moving averages, differencing, and double exponential smoothing [32–34]. Considering the requirement of low computational complexity, a simple and efficient nonparametric technique (i.e., first differencing) is used to eliminate the temporal trend and obtain a stationary time series for the dataset collected in WSNs, which can be formulated as

$$X' = \{x'(s, t) = x(s, t) - x(s, t - 1)\}, \quad t = 2, 3, \ldots, n. \quad (1)$$

Besides, the sensor nodes are always deployed densely, so spatial redundancy exists. A dataset $X = \{x(s), s = 1, \ldots, m\}$ is collected from the $m$ sensor nodes in a cluster at a given timestamp. This dataset can help to understand the spatial correlation structure of the data and to predict the data value at a nearby location. Spatial data may present local dependency, which represents the similarity of observations collected at adjacent locations in a local region. Usually, for a specified region, the observation of one sensor can be estimated by a linear weighted combination of the observations collected at its adjacent locations [32], which can be expressed as

$$x(s_i) = \lambda_1 x(s_1) + \cdots + \lambda_{i-1} x(s_{i-1}) + \lambda_{i+1} x(s_{i+1}) + \cdots + \lambda_m x(s_m), \quad (2)$$

where $s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_m$ denote the positions of the sensor nodes, $\lambda_1, \ldots, \lambda_{i-1}, \lambda_{i+1}, \ldots, \lambda_m$ denote the weights of the observations, and $\sum_{k=1, k \ne i}^{m} \lambda_k = 1$.

Consequently, for sensed data collected in a local region, two reasonable assumptions are made:

(1) The sensed data of adjacent nonfaulty sensor nodes are similar at the same timestamp.

(2) The sensed data of adjacent nonfaulty sensor nodes have a similar trend over time.

Motivated by these two assumptions and ensemble learning theory, a novel anomaly detection method is proposed in this paper. We give the details in the following section.
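To make the two preprocessing operations concrete, the following minimal sketch (ours, not from the paper) applies the first differencing of (1) and the weighted spatial estimate of (2) to toy readings; the numbers and weights below are illustrative only.

```python
import numpy as np

def first_difference(x):
    """Eq. (1): x'(s, t) = x(s, t) - x(s, t - 1) removes the temporal trend."""
    x = np.asarray(x, dtype=float)
    return x[1:] - x[:-1]

def spatial_estimate(neighbor_values, weights):
    """Eq. (2): estimate x(s_i) as a weighted combination of the readings of
    its adjacent neighbors; the weights are normalized to sum to one."""
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(neighbor_values, weights / weights.sum()))

temps_node = [18.2, 18.3, 18.3, 18.5, 19.9]                  # toy readings, not IBRL data
print(first_difference(temps_node))                          # ~[0.1, 0.0, 0.2, 1.4]
print(spatial_estimate([18.3, 18.4, 18.2], [0.4, 0.3, 0.3])) # 18.3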

3.3. Proposed Ensemble Learning Method of Anomaly Detection in WSNs. Spatiotemporal correlation exists among sensor data in a local region of a WSN, and a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is considered in order to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector and mitigate the resource requirements. The optimized ensemble detector is used to identify global anomalous observations at each individual sensor in a timely manner. Our proposed method is shown in Figure 2.

The online anomaly detection method consists of three key procedures, that is, detector training, online detecting, and online detector updating. From Figure 2, it can be seen that our proposed method enables each distributed sensor node to judge globally and in time whether every new coming observation is normal or anomalous. Distributed detecting is employed to spread the load (communication, computation, and storage) evenly over the network and to prolong the lifetime of the whole network.

The whole procedure of the proposed method is described as follows.

Step 1. Considering the temporal correlation over a certain time period, each sensor node $s_i$ trains a local ensemble detector using the history dataset collected over a time interval. In fact, using this initial local ensemble detector, whether a new coming observation is normal or anomalous can already be determined locally.

Step 2. Each sensor node $s_i$ transmits its local ensemble detector, as well as some related parameters such as the maximum value, minimum value, and mean of the training dataset, to the cluster head node and the other member sensor nodes.

Step 3. The cluster head node receives the local ensemble detectors from its member nodes and, combined with its own trained detector, builds the initial global ensemble detector.

Step 4. The BBO method is introduced in the cluster head to prune the initial global ensemble detector and to obtain an acceptable final ensemble detector.

Step 5. The pruned ensemble detector, that is, the final ensemble detector, is broadcast to each member sensor node for online global anomaly detection.

Step 6. Each sensor node selectively retains the test data for online updating based on the predefined sampling probability p.

Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs. Each member node (MN1, MN2, ..., MNm) inputs and preprocesses its training data, learns from it, and outputs a local ensemble detector; the local ensemble detectors are broadcast to the cluster head node (CH), which aggregates them into the global initial ensemble detector, applies BBO ensemble pruning with a bit-coding mechanism, and broadcasts the resulting state matrix back to the member nodes.

Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.

This method scales well with an increasing number of nodes in the WSN due to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.

Next, we describe some of the important procedures mentioned above in detail. Further, considering the resource constraints of each sensor node in WSNs, some tricks are designed to save communication and memory.

3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially at each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, previously trained detectors may be useless for future detection; moreover, the limited memory of a sensor node is another constraint on storing too many previous detectors. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble of one sensor node. For example, at sensor node $i$, the sensed data is collected and divided into data chunks based on a time interval Δt, which is determined by the actual monitoring process; consequently, each node trains multiple individual detectors over time. In this paper, supposing the $n$ latest detectors are kept per sensor node and there are $m$ nodes in one cluster, then in total $n \ast m$ detectors are obtained for the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its $n$ trained detectors in the cluster. Taking the cluster head as an example, after all $n \ast (m - 1)$ individual detectors have been received from its member nodes, the cluster head combines them with its own $n$ trained detectors, and the initial ensemble (including $n \ast m$ individual detectors) is built in the cluster head node.

Many techniques can be employed to combine the results of the individual detectors into the final detection result. The commonly used methods in the literature are the majority vote (for classification problems) and the weighted average (for regression problems). In this paper, the final ensemble detection result is calculated by (3), where $w_i$ denotes the weight coefficient; $w_i = 1$ means the simple average, otherwise a weighted average is obtained. For simplicity, the simple average strategy is employed to combine the final result:

$$y_{\mathrm{fin}}(x) = \frac{1}{n \ast m} \sum_{i=1}^{n \ast m} y_i(x) \ast w_i. \quad (3)$$
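As an illustration of the combination rule (3), the sketch below (our own naming, assuming each base detector votes +1 for normal and −1 for anomalous, as a one-class SVM does) averages the weighted votes and thresholds the result; with all weights equal to one it reduces to the simple average used in the paper.

```python
import numpy as np

def combine_detections(base_outputs, weights=None, threshold=0.0):
    """Combine base-detector votes y_i(x) into y_fin(x) as in (3).

    base_outputs: (n_detectors, n_samples) array of +1 (normal) / -1 (anomalous).
    weights:      per-detector weights w_i; None means w_i = 1 (simple average).
    """
    base_outputs = np.asarray(base_outputs, dtype=float)
    if weights is None:
        weights = np.ones(base_outputs.shape[0])
    # y_fin(x) = (1 / (n*m)) * sum_i y_i(x) * w_i
    y_fin = (base_outputs * np.asarray(weights)[:, None]).mean(axis=0)
    # Flag an observation as anomalous when the averaged vote falls below the threshold.
    return np.where(y_fin < threshold, -1, +1)

# Example: three detectors voting on four observations.
votes = [[+1, +1, -1, +1],
         [+1, -1, -1, +1],
         [+1, +1, -1, -1]]
print(combine_detections(votes))   # -> [ 1  1 -1  1]
```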

3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.

Given an initial ensemble anomaly detector $E = \{\mathrm{AD}_1, \mathrm{AD}_2, \ldots, \mathrm{AD}_{n \ast m}\}$, where $\mathrm{AD}_i$ is a trained anomaly detector that can test whether an observation is anomalous or not, a combination method $C$, and a test dataset $T$, the goal of ensemble pruning is to find an optimal/suboptimal subset $E' \subseteq E$ which minimizes the generalization error and obtains better, or at least the same, detection performance compared to $E$. Let $f_{ij}$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$) be the fitness values of the detection performance, such as the true positive rate, the false positive rate, the accuracy, and so on. The fitness value matrix $F$ can then be defined as (4) based on the results of the testing data:

$$F = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ f_{m1} & f_{m2} & \cdots & f_{mn} \end{bmatrix}. \quad (4)$$

The final fitness function can be defined as

$$\text{Maximize} \left( \sum_{i=1,\ldots,m;\; j=1,\ldots,n}^{N'} f_{ij} \right), \quad \text{s.t.} \; N' \le m \ast n. \quad (5)$$

Here, the problem of ensemble pruning is to find the subset $E'$ composed of part of the single detectors. Finding the optimized subset requires heavy and delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed to find an acceptable ensemble set; we only present some key information about BBO here, and the interested reader is referred to the detailed description in [28].

BBO is a population-based global optimization method which has some characteristics in common with existing evolutionary algorithms (EAs), such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain and obtain an optimal/suboptimal solution, some operators are employed to share information among solutions, which makes BBO applicable to many problems to which GA and PSO are applied. The more distinctive differences between BBO and other EAs can be found in [27, 28].

The pseudocode of ensemble pruning based on BBO is shown in Algorithm 1 [7]. Here, H indicates a habitat, HSI is the fitness, and an SIV (suitability index variable) is a solution feature.

Input: E—initial ensemble anomaly detector; T—the number of maximization iterations
Output: E′—final ensemble anomaly detector
/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, the HSI values
/* Optimization search process */
While (T)
    Compute the immigration rate λ and emigration rate μ for each habitat based on HSI
    /* Migration */
    Select H_i with probability based on λ_i
    If H_i is selected
        Select H_j with probability based on μ_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate η
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T − 1
End while
/* Ensemble pruning */
Get the final ensemble of anomaly detectors E* based on the habitats H_i* with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).
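For concreteness, here is a rough Python sketch of Algorithm 1 with habitats encoded as 0/1 strings over the n ∗ m detectors; it uses linear rank-based migration rates, applies immigration per SIV, and omits the elitism step, and all names and defaults are ours rather than the paper's. Here fitness_of(mask) would compute, for example, the F-measure of (8) for the subensemble selected by the mask on the held-out test data T.

```python
import numpy as np

rng = np.random.default_rng(0)

def bbo_prune(fitness_of, n_detectors, pop_size=30, iters=100, mutation=0.01):
    """Simplified BBO ensemble pruning: each habitat is a 0/1 vector (SIVs)
    marking which base detectors are kept; fitness_of(mask) returns the HSI."""
    pop = rng.integers(0, 2, size=(pop_size, n_detectors))   # random habitats
    hsi = np.array([fitness_of(h) for h in pop])
    for _ in range(iters):
        order = np.argsort(-hsi)                  # rank habitats by HSI (best first)
        ranks = np.empty(pop_size)
        ranks[order] = np.arange(pop_size)
        lam = ranks / (pop_size - 1)              # immigration rate: poor habitats immigrate more
        mu = 1.0 - lam                            # emigration rate: good habitats emigrate more
        for i in range(pop_size):
            for s in range(n_detectors):
                if rng.random() < lam[i]:         # migration: copy this SIV from an emitter H_j
                    j = rng.choice(pop_size, p=mu / mu.sum())
                    pop[i, s] = pop[j, s]
                if rng.random() < mutation:       # mutation: replace with a random SIV
                    pop[i, s] = rng.integers(0, 2)
        hsi = np.array([fitness_of(h) for h in pop])
    return pop[np.argmax(hsi)]                    # habitat (detector mask) with the best HSI
```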

3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main cause of quick energy depletion is the radio communication among the sensor nodes. It is known that the cost of communicating one bit equals the cost of processing thousands of bits in a sensor [35]. This means that most of the energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity decreases the power requirement and eventually lengthens the lifetime of the whole WSN.

It is obvious that the aforementioned method has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each of its member sensor nodes. In order to relieve the communication burden, some techniques are used to reduce the communication overhead.

In fact, the distributed training/learning method only transmits the summary information of the trained local ensemble detector to the cluster head, which already decreases the communication cost significantly compared to centralized anomaly detection manners that send all training data to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head node. A straightforward method is to broadcast this pruned ensemble to the member sensor nodes; this is a commonly used strategy, but it does not make full use of the local ensemble detector information already present at each node and costs more communication resources. Here, a state matrix P is designed in the cluster head; its element $p_{ij}$, defined by formula (6), represents each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector; a detector is included in or excluded from the ensemble detector depending on the value of the corresponding bit, that is, 1 denotes that the single detector is included in the final ensemble and 0 means that it is not included:

$$p_{ij} = \begin{cases} 1, & \mathrm{AD}_{ij} \in E', \quad i = 1, \ldots, m, \; j = 1, \ldots, n, \\ 0, & \text{otherwise}, \end{cases} \quad (6)$$

so that $P = [p_{ij}]$ is an m × n bit matrix in which row $S_i$ records which of the n detectors of node i are kept in the pruned ensemble.

After the pruning procedure is finished, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix can save energy greatly. For example, after the ensemble pruning is finished, N′ (N′ ≤ n ∗ m) individual detectors are to be broadcast in the cluster. If matrix P is not used, this needs 4 ∗ N′ ∗ d bytes of communication (supposing that an individual detector can be represented by d parameters and each parameter needs at least 4 bytes). If matrix P is introduced, each item of matrix P only needs 1 bit to represent an individual detector; consequently, only m ∗ n/8 bytes are required to broadcast. Suppose that one-third of the individual detectors are pruned (i.e., N′ = 2 ∗ n ∗ m/3); then (4 ∗ n ∗ m ∗ d ∗ 2/3)/(m ∗ n/8) ≈ 21.33d. By introducing the ensemble pruning and the state matrix, the amount of energy saved in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
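A small sketch of the bit-coding idea and the byte-count comparison above; the cluster size m, chunk count n, and parameter count d below are arbitrary examples, and the helper names are ours.

```python
import numpy as np

def pack_state_matrix(P):
    """Pack the 0/1 state matrix P (m x n) into bytes for broadcast: 1 bit per detector."""
    return np.packbits(np.asarray(P, dtype=np.uint8)).tobytes()

def broadcast_cost_bytes(m, n, d, kept_fraction=2 / 3, bytes_per_param=4):
    """Compare broadcasting the pruned detectors themselves (d parameters each,
    4 bytes per parameter) against broadcasting the packed state matrix."""
    raw = bytes_per_param * (kept_fraction * n * m) * d   # 4 * N' * d bytes
    packed = m * n / 8.0                                  # m * n / 8 bytes
    return raw, packed, raw / packed

raw, packed, ratio = broadcast_cost_bytes(m=4, n=20, d=10)
print(raw, packed, ratio)   # ~2133 bytes vs 10 bytes, ratio ~213.3 (= 21.33 * d for d = 10)
```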

3.3.4. Online Update and Relearning. The distribution of the sensed dataset may change over time, so detector updating is necessary, and an online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) can cater for this situation and save computation, communication, and memory resources to some extent. Simply put, whether a new coming observation is saved and used to update the current detector is decided by a sampling probability p. Some heuristic rules can be employed to guide its value; for example, if the dynamics are relatively stationary, a small p should be used, otherwise a large p should be chosen. When the buffer of a sensor node has been completely replaced by new data, the online update is triggered and a new detector is trained. The pseudocode is described in Algorithm 2.

Input: E′—current pruned ensemble anomaly detector; p—sampling probability
Output: E*—updated pruned ensemble anomaly detector
For each sensor node
    Retain the new observation with probability p
    If the buffer is replaced completely by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E′, T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online_Updating(E′, p).
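The delayed-update rule of Algorithm 2 can be sketched as follows; the class and attribute names are our own, and retraining plus re-pruning is left to the caller once the trigger fires.

```python
import random

class OnlineUpdater:
    """Keep each new observation with probability p and signal a retrain
    once the buffer has been completely replaced by new data."""

    def __init__(self, buffer_size, p):
        self.buffer, self.size, self.p = [], buffer_size, p
        self.replaced = 0

    def observe(self, x):
        if random.random() < self.p:              # retain with sampling probability p
            self.buffer.append(x)
            self.buffer = self.buffer[-self.size:]
            self.replaced += 1
        if self.replaced >= self.size:            # buffer fully renewed
            self.replaced = 0
            return True                           # caller retrains and reruns Ensemble_Pruning_BBO
        return False
```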

4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a personal PC with an Intel Core 2 Duo CPU P7450 at 2.13 GHz and 4 GB of memory; the operating system is Windows 7 Professional. The data processing was partly done in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.

4.1. Dataset and Data Preprocessing. The IBRL dataset [37] is used in this paper to validate the proposed method; it was collected from a WSN deployed in the Intel Research Laboratory at the University of California, Berkeley, and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38–41]. This network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurement data, that is, light, temperature, humidity, and voltage, were collected, and the measurements were recorded at 31-second intervals. Because these sensors were deployed inside a lab and the measured variables changed little over time (except the light, which had sudden changes due to the irregular nature of this variable and frequent on/off operation), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, some artificial anomalies are created by randomly modifying some observations, an approach widely used by many researchers in the literature [41].

Figure 3: Sensor node locations in the IBRL deployment.

Since our proposed method adopts the cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and the dataset collected on 29/02/2004 are chosen; the data distribution can be seen in [7]. Here, only part of the observations (during 00:00:00 am–07:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions in a cluster are almost the same, which confirms that spatial correlation exists. There are some trivial differences; after analyzing the dataset carefully, the main reason turns out to be that the dataset has some missing data points, largely due to packet loss, which can also be observed in Figure 4. In our experiment, these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.

Suppose that $D = \{(x_i, y_i), i = 1, 2, \ldots, n\}$ is a dataset used to train an anomaly detector. Here, $x_i$ is a vector of feature values, and $y_i$ is the label which indicates whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points are generated and inserted to evaluate the performance of our proposed method. In this paper, 30 artificial anomalies per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution very different from that of the training dataset, but their ranges should overlap as much as possible; besides, an anomalous event should be a small-probability event for a normal dataset collected by a nonfaulty sensor node. The anomalies were generated using a normal randomizer whose statistical characteristics deviate slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.

4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate our proposed method, some commonly used performance evaluation metrics for anomaly detection are used in this paper, such as the detection accuracy (ACC), the true positive rate (TPR), and the false positive/alarm rate (FPR). They are defined as follows:

$$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \qquad \mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}, \quad (7)$$

where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.

Figure 4: The data (temperature, humidity) trend of nodes N7, N8, N9, and N10 during 0:00:00 am–7:59:59 am on February 29, 2004: (a) temperature; (b) humidity.

Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature, H: humidity).

Node | Initial samples | Mean (T / H) | Variance (T / H) | Injected anomalies | Anomaly mean (T / H) | Anomaly variance (T / H)
N7 | 823 | 18.4154 / 40.9176 | 0.5238 / 1.4494 | 30 | 18.21 / 41.10 | 0.54 / 1.46
N8 | 548 | 17.9844 / 41.7123 | 0.5315 / 1.4612 | 30 | 17.75 / 41.95 | 0.55 / 1.48
N9 | 652 | 18.1140 / 42.6295 | 0.5288 / 1.4827 | 30 | 18.35 / 42.45 | 0.55 / 1.50
N10 | 620 | 18.1144 / 42.6215 | 0.5244 / 1.4191 | 30 | 18.33 / 42.47 | 0.54 / 1.43

BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; the number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; the maximum migration rates E = 1 and I = 1; the mutation rate η = 0.01, where λ and μ are the immigration and emigration rates, respectively; and the elitism parameter ρ = 2.

The HSI (habitat suitability index) is a fitness function, as in other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of the binary classification problem:

$$F\text{-measure} = \frac{(1 + \beta^2)\,\mathrm{precision} \ast \mathrm{recall}}{\beta^2 \ast \mathrm{precision} + \mathrm{recall}} = \frac{(1 + \beta^2) \ast \mathrm{TP}}{(1 + \beta^2) \ast \mathrm{TP} + \beta^2 \ast \mathrm{FN} + \mathrm{FP}}. \quad (8)$$

The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance between precision and recall, typically β = 0.5, 1, 2. Usually, the value of the F-measure is close to the smaller of precision and recall; that is, a large F-measure means that precision and recall are both large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, β = 1 is specified.
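For reference, a minimal helper (ours) that evaluates the metrics in (7) and the F-measure in (8) from a confusion matrix; the counts in the example are made up, not taken from the experiments.

```python
def detection_metrics(tp, fp, tn, fn, beta=1.0):
    """ACC, TPR, FPR from (7) and the F-measure from (8)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    b2 = beta ** 2
    f_measure = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)
    return acc, tpr, fpr, f_measure

# Toy counts only, for illustration.
print(detection_metrics(tp=25, fp=12, tn=258, fn=5))
```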

4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have been widely used in classification problems; they separate the data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42–44]. In this paper, it was used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder, as the test set, was used to evaluate the proposed method.
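Since the paper does not give the one-class SVM settings, the following sketch shows how base detectors could be trained per data chunk with scikit-learn; the kernel, nu, and gamma values, as well as the synthetic (temperature, humidity) stream, are placeholders rather than the authors' configuration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def train_local_detectors(chunks, nu=0.05, gamma="scale"):
    """Train one one-class SVM per data chunk of a sensor node, playing the
    role of the base detectors; hyperparameters here are illustrative."""
    return [OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(c) for c in chunks]

# Toy usage: split a node's (temperature, humidity) stream into chunks.
rng = np.random.default_rng(1)
stream = rng.normal(loc=[18.4, 40.9], scale=[0.7, 1.2], size=(300, 2))
chunks = np.array_split(stream, 5)            # keep the n = 5 latest chunks
detectors = train_local_detectors(chunks)
print(detectors[0].predict(stream[:3]))       # +1 normal, -1 anomalous
```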

Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: firstly, to prove the effectiveness of the proposed method based on ensemble learning theory; secondly, to prove that the pruned ensemble detector can obtain better (or at least equal) performance compared to the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were done, that is, the local ensemble anomaly detector considering only the temporal correlation of each sensor node, the global ensemble anomaly detector considering the spatiotemporal correlation, and the global pruned ensemble anomaly detector based on BBO. The experimental results can be seen in Tables 2, 3, and 4, respectively.

Table 2 shows the performance of each sensor node under different ensemble sizes, without taking into account the spatial correlation of the sensed data in a cluster. Though the ensemble detection performance gradually improves with the increase of the ensemble size (the higher the ACC/TPR, the better, and the lower the FPR, the better), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detectors were trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], an obvious fact is that the detection performances are higher than those presented in Table 2; with the help of the neighbors' detectors, the detection results become better and better as the ensemble size increases.

Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7 (ACC / TPR / FPR) | N8 (ACC / TPR / FPR) | N9 (ACC / TPR / FPR) | N10 (ACC / TPR / FPR)
5 | 0.8700 / 0.5833 / 0.1181 | 0.7900 / 0.3333 / 0.1809 | 0.8267 / 0.5000 / 0.1549 | 0.8267 / 0.5714 / 0.1608
10 | 0.8800 / 0.6667 / 0.1111 | 0.8033 / 0.3889 / 0.1702 | 0.8267 / 0.4375 / 0.1514 | 0.8333 / 0.6429 / 0.1573
15 | 0.8900 / 0.7500 / 0.1042 | 0.8167 / 0.5000 / 0.1631 | 0.8433 / 0.5000 / 0.1373 | 0.8600 / 0.7143 / 0.1329
20 | 0.8933 / 0.8333 / 0.1042 | 0.8200 / 0.5000 / 0.1596 | 0.8367 / 0.5000 / 0.1444 | 0.8567 / 0.7143 / 0.1364

Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7 (ACC / TPR / FPR) | N8 (ACC / TPR / FPR) | N9 (ACC / TPR / FPR) | N10 (ACC / TPR / FPR)
20 | 0.9467 / 0.8333 / 0.0486 | 0.9300 / 0.7778 / 0.0603 | 0.9467 / 0.7500 / 0.0423 | 0.9500 / 0.7857 / 0.0420
40 | 0.9700 / 0.7500 / 0.0208 | 0.9433 / 0.8333 / 0.0496 | 0.9710 / 0.8938 / 0.0246 | 0.9650 / 0.8929 / 0.0315
60 | 0.9700 / 0.8333 / 0.0243 | 0.9733 / 0.8889 / 0.0213 | 0.9800 / 0.9375 / 0.0176 | 0.9783 / 0.9357 / 0.0196
80 | 0.9817 / 0.9583 / 0.0174 | 0.9800 / 0.9444 / 0.0177 | 0.9767 / 0.9375 / 0.0211 | 0.9780 / 0.9714 / 0.0217

In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.

Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only for validating the method's effectiveness, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to keep is an open topic and is decided by many factors, such as the computation capability, the communication cost, and the memory usage of the sensor node, as well as the required detection accuracy; in practical applications, a trade-off is commonly considered.

Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned) | N7 (ACC / TPR / FPR) | N8 (ACC / TPR / FPR) | N9 (ACC / TPR / FPR) | N10 (ACC / TPR / FPR)
14 | 0.9480 / 0.8000 / 0.0458 | 0.9327 / 0.7667 / 0.0567 | 0.9500 / 0.8125 / 0.0423 | 0.9533 / 0.8571 / 0.0420
23 | 0.9710 / 0.7750 / 0.0208 | 0.9447 / 0.8000 / 0.0461 | 0.9733 / 0.9250 / 0.0239 | 0.9697 / 0.9143 / 0.0276
27 | 0.9713 / 0.8500 / 0.0236 | 0.9683 / 0.8333 / 0.0230 | 0.9810 / 0.9563 / 0.0176 | 0.9797 / 0.9357 / 0.0182
32 | 0.9820 / 0.9750 / 0.0177 | 0.9750 / 0.8333 / 0.0160 | 0.9820 / 0.9500 / 0.0162 | 0.9830 / 0.9786 / 0.0168

Table 5: Rate of resource cost saving based on the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Resource cost saving
1 | 20 | 14 | 30%
2 | 40 | 23 | 42.5%
3 | 60 | 27 | 55%
4 | 80 | 32 | 60%

5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints in WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and to obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that our proposed method is effective.

Because the diversity of the base learners is a key factor related to the performance of ensemble learning, as a possible extension of this work we plan to include some diversity measures in the fitness function to improve the detection performance in the future. Besides, since the cost of communication is the main reason for the quick energy depletion of sensor nodes, especially for the cluster head, the adaptive selection of the cluster head based on its energy state will be taken into account to lengthen the lifetime of WSNs in future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).

References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.

[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.

[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.

[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.

[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.

[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.

[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.

[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.

[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.

[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.

[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.

[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.

[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.

[14] S. Rajasegarar, A. Gluhak, M. A. Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.

[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.

[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.

[17] S. Seguí, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.

[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.

[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.

[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.

[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.

[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.

[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.

[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.

[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.

[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.

[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.

[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.

[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.

[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.

[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.

[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.

[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.

[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.

[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing Journal, vol. 13, no. 4, pp. 1978–1996, 2013.

[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.

[37] Intel Berkely Reseach Lab (IBRL) dataset 2004 httpdbcsailmitedulabdatalabdatahtml

[38] J W Branch C Giannella B Szymanski R Wolff and HKargupta ldquoIn-network outlier detection in wireless sensornetworksrdquo Knowledge and Information Systems vol 34 no 1pp 23ndash54 2013

[39] M Moshtaghi T C Havens J C Bezdek et al ldquoClusteringellipses for anomaly detectionrdquo Pattern Recognition vol 44 no1 pp 55ndash69 2011

[40] S Rajasegarar J C Bezdek C Leckie and M PalaniswamildquoElliptical anomalies in wireless sensor networksrdquo ACM Trans-actions on Sensor Networks vol 6 no 1 pp 1ndash28 2009

[41] M A Rassam A Zainal and M A Maarof ldquoOne-classprincipal component classifier for anomaly detection inwirelesssensor networkrdquo in Proceedings of the 4th International Confer-ence on Computational Aspects of Social Networks (CASoN rsquo12)pp 271ndash276 IEEE Sao Carlos Brazil November 2012

[42] H Sagha H Bayati J D R Millan and R Chavarriaga ldquoOn-line anomaly detection and resilience in classifier ensemblesrdquoPattern Recognition Letters vol 34 no 15 pp 1916ndash1927 2013

[43] M Hejazi and Y P Singh ldquoOne-class support vector machinesapproach to anomaly detectionrdquo Applied Artificial Intelligencevol 27 no 5 pp 351ndash366 2013

[44] Y Zhang NMeratnia and P JMHavinga ldquoDistributed onlineoutlier detection in wireless sensor networks using ellipsoidalsupport vector machinerdquo Ad Hoc Networks vol 11 no 3 pp1062ndash1074 2013

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 2: Research Article A Novel Distributed Online Anomaly ...downloads.hindawi.com/journals/ijdsn/2015/146189.pdfin the cluster head node to obtain an optimized subset of ensemble members

2 International Journal of Distributed Sensor Networks

appropriate for WSNs. A possible strategy is to select part of the individual detectors to perform the anomaly detection. Consequently, ensemble pruning is a necessary strategy [9], which can obtain better (or at least the same) performance compared to the initial ensemble while the number of individual detectors is decreased greatly.

Analyzing the spatiotemporal correlation of the sensed data in WSNs and motivated by the online ensemble learning method, this paper proposes a distributed anomaly detection method for WSNs from the perspective of both model building and resource saving. Further, to mitigate the high communication requirements caused by broadcasting ensemble detectors, BBO-based ensemble pruning is used to select the optimized individual detectors to build the final ensemble detector, which has at least the same performance as the initial ensemble detector. The main contributions of this paper include the following:

(1) A distributed anomaly detection method for WSNs is proposed based on online ensemble learning.

(2) BBO-based ensemble pruning is used to obtain the optimal subset of detectors, saving the limited storage and communication resources in WSNs.

(3) A state matrix encoding method is designed for the ensemble detector, which decreases the communication and memory overhead significantly.

The rest of this paper is organized as follows. The related work is described in Section 2. Based on ensemble learning theory and BBO, our proposed anomaly detection method is presented in Section 3. Experimental analysis is provided in Section 4. Finally, conclusions and future work are presented in Section 5.

2. Related Work

To clearly explain the motivation of this paper, the state of the art of three key aspects related to our work is summarized, that is, anomaly detection in WSNs, online ensemble learning, and ensemble pruning.

2.1. Anomaly Detection Method and Classification in WSNs. With the rapid development and wide application of WSNs, some anomaly detection techniques for WSNs have been developed and summarized from different perspectives. For example, [10] discussed the prioritization of various characteristics of WSNs, including the spatiotemporal and attribute correlations of sensed data, anomaly types, anomaly identification, anomaly score, and so forth. A brief overview of the classification strategies for anomaly detection methods in WSNs deployed in harsh environments was provided, which grouped anomaly detection methods into four types, that is, statistical-based, nearest-neighbor-based, clustering-based, and classification-based techniques. Based on the nature of sensor data and the specific requirements and limitations of WSNs, [1] provided a comprehensive overview of existing anomaly detection techniques specifically developed for WSNs. It presented a technique-based taxonomy and gave a comparative table

which can be used as a guideline to select a suitable method for a specific application. For example, based on characteristics such as data types, anomaly types, and anomaly degree, statistical-based methods are further classified into parametric and nonparametric methods. Based on how the probability distribution model is built, classification-based methods are categorized into support vector machine-based methods, radial basis function neural network-based methods [11], and so on. The interested reader is referred to more anomaly detection methods and taxonomies in [5, 6, 12, 13]. The taxonomies of the aforementioned methods may overlap to some extent, and machine learning and computational intelligence-based techniques are without doubt an increasingly important research direction for complicated applications. Moreover, though these methods have acceptable performance to some extent, the resource constraint was usually not, or only seldom, taken into account; with the wide application of WSNs, this issue has also attracted some researchers' attention [14]. Another noticeable characteristic of the aforementioned methods is that only a single detector or model was trained. It is well known that a single model may not learn the complicated decision boundary of a complicated dataset well. For sensed streaming data with a dynamic data distribution, a single model can hardly, or only at great cost, learn the whole profile; for example, training an artificial neural network on the whole stream leads to overlearning and degrades the generalization performance. Besides, concept drift [15] is a common phenomenon in datasets collected from WSNs, and a single model has difficulty dealing with such dynamic changes of the data distribution and providing a comprehensive anomaly detector. Moreover, updating the detector based on all available data is also hard work for online learning.

2.2. Ensemble Learning Method. Ensemble learning is a computational intelligence method, and theory and experiment have proved that the combination of the predictions of many individual detectors can enhance the generalization performance. There are many different ensemble learning methods used widely and successfully, such as Bagging [16, 17], Boosting [18, 19], Random Forest [20], and their online versions [21, 22]. Generally, an ensemble anomaly detector is constructed in two steps. Firstly, a number of base detectors are trained using the training dataset. Secondly, a combination strategy is designed to obtain the aggregated result from the results of each single detector. For a time-series dataset such as the sensed dataset in WSNs, learning a single model to profile the whole dataset is usually difficult or impossible. Generally, there are two ensemble patterns to handle streaming data, that is, the horizontal ensemble and the vertical ensemble. The former follows the strategy that the nearest n consecutive data chunks are first used to train n base detectors, and the combination method is employed to build the ensemble detector used to predict the data in the yet-to-arrive chunk. The advantage of the horizontal ensemble is that it can handle noisy data in the streaming dataset, because the prediction of a newly arriving data chunk depends on the combination of different chunks. Even if


the noisy data deteriorates some chunks, the ensemble can still generate a relatively accurate prediction. The disadvantage of the horizontal ensemble is that the streaming data is continuously changing and the information contained in the previous chunks may become invalid, so that using these old concept models will not improve the overall prediction result. The latter pattern, the vertical ensemble, uses only the newest chunk to build the ensemble model. Its advantage is that it uses different algorithms (heterogeneous ensemble) on the same dataset, or the same algorithm (homogeneous ensemble) on different sampled subsets of the chunk, to build the models, which can decrease the bias error between models. The disadvantage is that the vertical ensemble assumes the data chunk is error-free; in real situations this precondition is usually hard to meet. Currently, because online ensemble learning can address the concept drift and noisy data problems in streaming data, ensemble learning has been used for anomaly detection in WSNs [23-25]. In this paper, after exploiting the spatiotemporal correlation existing in the sensed dataset of WSNs, a distributed method is proposed based on the horizontal ensemble and a vertical-like ensemble; Section 3 gives the detailed description.

2.3. Ensemble Pruning Based on Optimization Search Methods. Although ensemble learning has many advantages, its nontrivial disadvantage is that it needs more memory and, especially, more communication resources to store and transmit multiple detectors in WSNs, which can drain energy quickly and is intolerable in WSNs. The principle of "many could be better than all" in the ensemble learning community [9] implies that the combination of all detectors may not be a good choice. Ensemble pruning, as a necessary strategy to address the resource-limitation problem [26], is therefore employed: it selects a subset of the initial ensemble and obtains better, or at least equal, detection performance compared with the original ensemble. The most important advantage of ensemble pruning is that it reduces the communication requirement greatly; in WSNs, broadcasting relatively few detectors can save the battery energy considerably. However, it is well known that pruning an ensemble of size $N$ requires searching the space composed of $2^N - 1$ nonempty subensembles, which is an NP-complete problem. Hence, heuristic search approaches are used to find an appropriate subset. Biogeography-based optimization (BBO) [27, 28], as a novel population-based global optimization method, has some features in common with existing optimization methods such as the genetic algorithm (GA) and harmony search (HS) [29]. In this paper, BBO is used to obtain an optimal/suboptimal ensemble for reducing the communication cost. To the best of our knowledge, as a new optimization method, BBO has not yet been applied in the field of WSNs, and our study extends its application.

3. Proposed Method

Motivated by the increasing use of online ensemble learning methodology [25] and considering the resource limitation of sensor nodes in WSNs, we propose a distributed online anomaly detection method based on ensemble learning. Further, BBO is used for ensemble pruning to decrease the communication and memory requirements.

Figure 1: The considered WSN, consisting of a base station (BS), cluster head (CH) nodes, and non-CH member nodes.

3.1. Problem Statement of WSNs. In this paper, we assume that the WSN is applied in an untouched area and that, to assure the sensed data quality, the sensor nodes are deployed densely. Besides, we assume that sensor nodes are time synchronized, which is mainly for clear presentation rather than a limitation of our proposed method. Figure 1 shows a WSN, which consists of a large number of sensor nodes and a base station (BS) [30]. Generally, the WSN can be represented as a graph $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_{|V|}\}$ is a finite set of vertices and $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is a finite set of edges; vertex $v_i$ ($i = 1, \ldots, |V|$) and edge $e_i$ ($i = 1, \ldots, |E|$) refer to a sensor node and the one-hop or multihop communication link between sensors $v_i$ and $v_j$, respectively.

From Figure 1 we can clearly see that some

clusters are formed based on the nodes' geographical position information and reachable communication capability. Here we only consider one-hop communications among sensor nodes; this assumption is, similarly, mainly for clear presentation of our proposed method rather than a limitation of the communication capability of sensors, and our proposed method can easily be extended to multihop relaying communication. Besides, in order to concisely describe our proposed anomaly detection method, a relatively small subnetwork consisting of densely deployed sensor nodes is taken into account, which forms a cluster $C_i$ consisting of one cluster head node and a number of sensor nodes, represented as $\mathrm{CH}_i$ and $N_{ij}$ ($j = 1, \ldots, |C_i|$), respectively. For the whole WSN, $V = C_1 \cup C_2 \cup \cdots \cup C_n$ and $C_i \cap C_j = \Phi$. All nodes in a cluster are reachable from each other by one-hop communication, and the communication between clusters depends on the direct links of the cluster heads. In each cluster, the selection of the cluster head is randomized among all nodes in that cluster to avoid draining its energy.

Consider one cluster $C_i = \{\mathrm{CH}_i, N_{i1}, \ldots, N_{im}\}$, which contains a cluster head $\mathrm{CH}_i$ and its $m$ spatially neighboring nodes ($N_{ij}$, $j = 1, \ldots, m$). Each sensor node in the subnetwork measures a data vector at every time interval $\Delta t$, which is composed of multiple attribute values. For the cluster head $\mathrm{CH}_i$, the observation is $X_i = (x^i_1, x^i_2, \ldots, x^i_d)$, where $d$ denotes the dimension. For the $j$th neighbor node $N_{ij}$, the observation is $X_{ij} = (x^i_{j1}, x^i_{j2}, \ldots, x^i_{jd})$. Nodes in the cluster collect samples synchronously, and the goal of our proposed method is to identify each new observation of each sensor node as normal or anomalous online.

3.2. Spatial and Temporal (Spatiotemporal) Correlation of the Sensed Dataset. For the sensed dataset in a cluster, we first describe the spatiotemporal correlation, which will be used later to build our proposed online ensemble detector.

The sensor dataset collected from WSNs is a time series dataset. A time series is a sequence of values $X = \{x(t), t = 1, \ldots, n\}$ which follows a nonrandom order, where the $n$ consecutive observation values are collected at the same time interval. Analyzing and learning from these observations [31] can help to understand the data trend over time, build an appropriate detector based on the temporal correlation, and predict the label of newly arriving observations.

To obtain the detector, the foremost requirement is to achieve a stationary time series dataset. Some data processing methods can be used to eliminate the data trend and obtain a stationary time series, such as polynomial fitting, moving averages, differencing, and double exponential smoothing [32-34]. Considering the requirement of low computational complexity, a simple and efficient nonparametric technique (i.e., first differencing) is used to eliminate the temporal trend and obtain a stationary time series for the dataset collected in WSNs, which can be formulated as

$$X' = x'(s, t) = x(s, t) - x(s, t - 1), \quad t = 2, 3, \ldots, n. \quad (1)$$
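As a minimal illustration of formula (1), the following Python sketch (the function name and the use of NumPy are our own assumptions, not part of the original implementation) computes the first-differenced series for one sensor node:

```python
import numpy as np

def first_difference(x):
    """First differencing of formula (1): x'(s, t) = x(s, t) - x(s, t - 1).

    x: 1-D array of n consecutive observations from one sensor node.
    Returns an array of length n - 1 (the detrended, more stationary series).
    """
    x = np.asarray(x, dtype=float)
    return x[1:] - x[:-1]

# Example: a slowly rising temperature trace becomes roughly stationary.
temperature = np.array([18.1, 18.2, 18.2, 18.4, 18.5, 18.7])
print(first_difference(temperature))  # [0.1 0.  0.2 0.1 0.2]
```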

Besides, the sensor nodes are always deployed densely, so spatial redundancy exists. A dataset $X = \{x(s), s = 1, \ldots, m\}$ is collected from the $m$ sensor nodes in a cluster at a given timestamp. This dataset can help to understand the spatial correlation structure of the data and to predict the data value at a nearby location. Spatial data may present local dependency, which represents the similarity of observations collected at adjacent locations in a local region. Usually, for a specified region, the observation of one sensor can be estimated by a linear weighted combination of the observations collected at its adjacent locations [32], which can be expressed as

$$x(s_i) = \lambda_1 x(s_1) + \cdots + \lambda_{i-1} x(s_{i-1}) + \lambda_{i+1} x(s_{i+1}) + \cdots + \lambda_m x(s_m), \quad (2)$$

where $s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_m$ denote the positions of the sensor nodes and $\lambda_1, \ldots, \lambda_{i-1}, \lambda_{i+1}, \ldots, \lambda_m$ denote the weights of the observations, with $\sum_{k=1, k \neq i}^{m} \lambda_k = 1$. Consequently, for sensed data collected in a local region,

two reasonable assumptions are described as follows:

(1) The sensed data of adjacent nonfaulty sensor nodes are similar at the same timestamp.

(2) The sensed data of adjacent nonfaulty sensor nodes have a similar trend over time.

Motivated by these two assumptions and by ensemble learning theory, a novel anomaly detection method is proposed in this paper. We give the details in the following section.
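To make the spatial side of formula (2) concrete, the short sketch below (our own illustrative helper, with equal neighbor weights assumed) estimates one node's reading from its cluster neighbors, which is also how a missing observation could be interpolated:

```python
import numpy as np

def spatial_estimate(neighbor_values, weights=None):
    """Estimate x(s_i) from its cluster neighbors, following formula (2).

    neighbor_values: readings x(s_k) of the other nodes at the same timestamp.
    weights: lambda_k coefficients; if None, equal weights summing to 1 are used.
    """
    neighbor_values = np.asarray(neighbor_values, dtype=float)
    if weights is None:
        weights = np.full(len(neighbor_values), 1.0 / len(neighbor_values))
    weights = np.asarray(weights, dtype=float)
    assert abs(weights.sum() - 1.0) < 1e-9  # the weights must sum to 1
    return float(np.dot(weights, neighbor_values))

# Example: estimate node N7's temperature from N8, N9, and N10.
print(spatial_estimate([18.0, 18.1, 18.2]))  # 18.1
```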

3.3. Proposed Ensemble Learning Method for Anomaly Detection in WSNs. Spatiotemporal correlation exists among the sensor data in a local region of WSNs, and a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is considered to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector and to mitigate the resource requirements. The optimized ensemble detector is used to identify global anomalous observations at each individual sensor in a timely manner. Our proposed method is shown in Figure 2.

The online anomaly detection method consists of three key procedures, that is, detector training, online detecting, and online detector updating. From Figure 2 it can be seen that our proposed method enables each distributed sensor node to judge globally, and in time, whether every new observation is normal or anomalous. Distributed detecting is employed to spread the load (communication, computation, and storage) evenly over the network and to prolong the lifetime of the whole network.

The whole procedure of the proposed method is described as follows.

Step 1. Considering the temporal correlation over a certain time period, each sensor node $s_i$ trains a local ensemble detector using the history dataset collected in a time interval. In fact, using this initial local ensemble detector, whether a new observation is normal or anomalous can already be determined locally.

Step 2. Each sensor node $s_i$ transmits its local ensemble detector, as well as some related parameters such as the maximum value, minimum value, and mean of the training dataset, to the cluster head node and the other member sensor nodes.

Step 3. The cluster head node receives the local ensemble detectors from its member nodes and, combined with its own trained detector, builds the initial global ensemble detector.

Step 4. The BBO method is applied in the cluster head to prune the initial global ensemble detector and to obtain an acceptable final ensemble detector.

Step 5. The pruned ensemble detector, that is, the final ensemble detector, is broadcast to each member sensor node for online global anomaly detection.

Step 6. Each sensor node selectively retains the test data for online updating based on the predefined sampling probability $p$.

Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs (each member node MN1, ..., MNm learns a local ensemble detector from its training data and broadcasts it; the cluster head node aggregates them into the global initial ensemble detector, prunes it with BBO, and broadcasts the result back using the bit-coding state matrix).

Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.

This method scales well with an increasing number of nodes in WSNs due to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.

Next, we describe some of the important procedures mentioned above in detail. Further, considering the resource constraints of each sensor node in WSNs, some techniques are designed to save communication and memory.

3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially for each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, a previously trained detector may be useless for future detection. Moreover, the limited memory resource of the sensor node is another constraint against storing too many previous detectors. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble for one sensor node. For example, for sensor node $i$, the sensed data is collected and divided into data chunks based on a time interval $\Delta t$, which is determined by the actual monitoring process; consequently, each node trains multiple individual detectors over time. In our paper, supposing the $n$ latest detectors are kept for a sensor node and there are $m$ nodes in one cluster, in total $n * m$ detectors are obtained for the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its $n$ trained detectors in the cluster. Taking the cluster head as an example, after all $n * (m - 1)$ individual detectors are received from its member nodes, the cluster head combines them with its own $n$ trained detectors, and the initial ensemble (including $n * m$ individual detectors) is built in the cluster head node.

Many techniques can be employed for combining the results of each detector to obtain the final detection result. The most commonly used methods in the literature are the majority vote (for classification problems) and the weighted average (for regression problems). In our paper, the final ensemble detection result is calculated by (3), where $w_i$ denotes the weight coefficient; $w_i = 1$ corresponds to the simple average, otherwise a weighted average is taken. For simplicity, the simple average strategy is employed to combine the final result:

$$y_{\mathrm{fin}}(x) = \frac{1}{n * m} \sum_{i=1}^{n * m} y_i(x) * w_i. \quad (3)$$

3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.

We are given an initial ensemble anomaly detector $E = \{\mathrm{AD}_1, \mathrm{AD}_2, \ldots, \mathrm{AD}_{n*m}\}$, where each $\mathrm{AD}_i$ is a trained anomaly detector which can test whether an observation is anomalous or not, a combination method $C$, and a test dataset $T$. The goal of ensemble pruning is to find an optimal/suboptimal subset $E' \subseteq E$ which minimizes the generalization error and obtains better, or at least the same, detection performance compared to $E$.


Input: E (initial ensemble anomaly detector), T (the number of maximization iterations)
Output: E' (final ensemble anomaly detector)
/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, the HSI values
/* Optimization search process */
While (T)
    Compute the immigration rate lambda and emigration rate mu for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on lambda_i
    If H_i is selected
        Select H_j with probability based on mu_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate eta
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T - 1
End while
/* Ensemble pruning */
Get the final ensemble anomaly detector E' based on the habitats H_i* with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).

Let $f_{ij}$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$) be the fitness values of the detection performance, such as the true positive rate, false positive rate, and accuracy. Obviously, the fitness value matrix $F$ can be defined as (4) based on the results on the testing data:

$$F = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ f_{m1} & f_{m2} & \cdots & f_{mn} \end{bmatrix}. \quad (4)$$

The final fitness function can be defined as

$$\text{Maximize} \left( \sum_{i=1,\ldots,m;\ j=1,\ldots,n}^{N'} f_{ij} \right), \quad \text{s.t.} \quad N' \le m * n. \quad (5)$$

Here the problem of ensemble pruning is to find the subset $E'$, composed of part of the single detectors. Finding the optimized subset requires much heavier and more delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed here to find an acceptable ensemble subset. We only briefly present some key information about BBO; the interested reader is referred to the detailed description in [28].

BBO is a population-based global optimization method which shares some characteristics with existing evolutionary algorithms (EAs) such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain and obtain an optimal/suboptimal solution, some operators are employed to share information among solutions, which makes BBO applicable to many problems to which GA and PSO are applied. The more distinctive differences between BBO and other EAs can be seen in [27, 28].

The pseudocode of ensemble pruning based on BBO is shown in Algorithm 1 [7]. Here $H$ indicates a habitat, HSI is the fitness, and an SIV (suitability index variable) is a solution feature.
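To make Algorithm 1 more concrete, the following Python sketch implements a simplified BBO-style pruning loop over bit-string habitats (each bit marks whether a base detector is kept). The `fitness` callback, the linear migration model, and the default parameters are our own illustrative assumptions, not the exact settings of the paper:

```python
import random

def bbo_prune(num_detectors, fitness, pop_size=30, iterations=100, mutation_rate=0.01):
    """Simplified BBO search for a pruned ensemble, in the spirit of Algorithm 1.

    num_detectors: size of the initial ensemble (n*m bits per habitat).
    fitness: callable mapping a bit list (1 = detector kept) to an HSI-like score.
    Returns the best bit string found.
    """
    habitats = [[random.randint(0, 1) for _ in range(num_detectors)]
                for _ in range(pop_size)]
    for _ in range(iterations):
        hsi = [fitness(h) for h in habitats]
        order = sorted(range(pop_size), key=lambda k: hsi[k], reverse=True)
        ranks = {idx: r for r, idx in enumerate(order)}                 # 0 = best habitat
        # Linear migration model: good habitats emigrate, poor habitats immigrate.
        lam = [ranks[i] / (pop_size - 1) for i in range(pop_size)]      # immigration rates
        mu = [1.0 - lam[i] for i in range(pop_size)]                    # emigration rates
        for i in range(pop_size):
            for s in range(num_detectors):
                if random.random() < lam[i]:                            # migrate this SIV
                    j = random.choices(range(pop_size), weights=mu)[0]
                    habitats[i][s] = habitats[j][s]
                if random.random() < mutation_rate:                     # mutate this SIV
                    habitats[i][s] = 1 - habitats[i][s]
    return max(habitats, key=fitness)
```

In practice, `fitness` would evaluate the detection quality (e.g., the F-measure of Section 4.2) of the subensemble selected by the bit string on the test data.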

3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main cause of quick energy depletion is the radio communication among the sensor nodes. It is known that the cost of communicating one bit equals the cost of processing thousands of bits in a sensor [35]. This means that most of the energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity will decrease the power requirement and eventually lengthen the lifetime of the whole WSN.

It is obvious that the method described so far has a relatively high communication overhead.


Input: E' (current pruned ensemble anomaly detector), p (sampling probability)
Output: E* (updated pruned ensemble anomaly detector)
For each sensor node
    Retain the new observation with probability p
    If the buffer is replaced completely by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E', T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online_Updating(E', p).

Each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each of its member sensor nodes. In order to relieve the communication burden, some techniques are used to reduce this communication overhead.

In fact, the distributed training/learning method only transmits the summary information of the trained local ensemble detector to the cluster head, which already decreases the communication cost significantly compared to centralized anomaly detection, where all training data are sent to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head node. A straightforward method is to broadcast this pruned ensemble to the member sensor nodes. This is a commonly used strategy, but it does not make full use of the local ensemble detector information already present at each node and costs more communication resources. Here, a state matrix $P$ is designed in the cluster head; its element $p_{ij}$ is defined by formula (6) to represent each single detector in the initial ensemble. Then each local ensemble detector is represented as a bit string, using one bit per single detector. A detector is included in or excluded from the ensemble detector depending on the value of the corresponding bit; that is, 1 denotes that this single detector is included in the final ensemble and 0 means it is not included:

$$p_{ij} = \begin{cases} 1, & \mathrm{AD}_{ij} \in E',\ i = 1, \ldots, m,\ j = 1, \ldots, n, \\ 0, & \text{otherwise}, \end{cases} \qquad
P = \begin{array}{c} S_1 \\ S_2 \\ \vdots \\ S_m \end{array} \left[ \begin{array}{cccc} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{array} \right], \quad (6)$$

where the row $S_i$ (with columns indexed $1, 2, \ldots, n$) is the bit string indicating which of node $i$'s local detectors are kept.

After the pruning procedure is finished, the cluster head broadcasts the state matrix $P$ to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state element equals 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, after ensemble pruning is finished, $N'$ ($N' \le n * m$) individual detectors have to be distributed in the cluster. If matrix $P$ is not used, $4 * N' * d$ bytes of communication are needed (supposing that an individual detector can be represented by $d$ parameters and each parameter needs at least 4 bytes). If matrix $P$ is introduced, each item of $P$ needs only 1 bit to represent an individual detector; consequently, only $m * n / 8$ bytes are required for the broadcast. Suppose that one-third of the individual detectors are pruned (i.e., $N' = 2 * n * m / 3$); then $(4 * n * m * d * 2/3) / (m * n / 8) \approx 21.33\,d$. By introducing the ensemble pruning and the state matrix, the energy saving in the cluster head sensor is significant and the lifetime of the WSN can be lengthened.
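The bit coding of formula (6) and the byte count above can be sketched as follows; packing the bits with NumPy is our own illustrative choice, not the paper's on-node implementation:

```python
import numpy as np

def encode_state_matrix(kept):
    """Pack the m x n state matrix P (1 = detector kept) into a compact byte string."""
    bits = np.asarray(kept, dtype=np.uint8)
    return np.packbits(bits).tobytes()          # about m*n/8 bytes

def decode_state_matrix(payload, m, n):
    """Recover the m x n bit matrix a member node uses to drop its pruned detectors."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))[: m * n]
    return bits.reshape(m, n)

# Example: 4 nodes x 20 detectors each -> 80 bits -> 10 bytes, instead of
# 4 * N' * d bytes for re-broadcasting the pruned detectors themselves.
P = np.random.randint(0, 2, size=(4, 20))
payload = encode_state_matrix(P)
assert (decode_state_matrix(payload, 4, 20) == P).all()
print(len(payload))  # 10
```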

3.3.4. Online Update and Relearning. Changes in the distribution of the sensed data may occur, so detector updating is necessary. Online detector updating is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) can cater for this situation and save computation, communication, and memory resources to some extent. Simply put, whether a new observation is saved and used to update the current detector is decided by a sampling probability $p$. Some heuristic rules can be employed to guide its value; for example, if the dynamics are relatively stationary, a small $p$ should be used; otherwise, a larger $p$ should be chosen. When the buffer of a sensor node has been completely replaced by new data, online updating is triggered and a new detector is trained. The pseudocode is shown in Algorithm 2.
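A minimal sketch of this delayed-updating rule follows; the buffer size, the `retrain` hook, and the random source are illustrative assumptions rather than the paper's exact implementation:

```python
import random

class DelayedUpdater:
    """Keep new observations with probability p; retrain once the buffer is fully refreshed."""

    def __init__(self, buffer_size, p, retrain):
        self.buffer_size = buffer_size   # capacity of the node's sample buffer
        self.p = p                       # sampling probability from Algorithm 2
        self.retrain = retrain           # callback: rebuild the local detector from a buffer
        self.buffer = []

    def observe(self, x):
        if random.random() < self.p:     # selectively retain the test data
            self.buffer.append(x)
        if len(self.buffer) >= self.buffer_size:   # buffer completely replaced
            self.retrain(self.buffer)    # train a new detector; pruning/broadcast follow upstream
            self.buffer = []
```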

4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a personal PC with an Intel Core 2 Duo CPU P7450 (2.13 GHz) and 4 GB of memory. The operating system is Windows 7 Professional. The data processing was done partly in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.

4.1. Dataset and Data Preprocessing. The IBRL dataset [37] was used in our paper to validate the proposed method. It was collected from a WSN deployed in the Intel Berkeley Research Lab and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38-41]. This network consists of 54 Mica2Dot sensor nodes.

Figure 3: Sensor node locations in the IBRL deployment.

Figure 3 shows the location of each node of the deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from February 29, 2004 to April 5, 2004. Four types of measurement data, that is, light, temperature, and humidity as well as voltage, were collected, and the measurements were recorded at 31 s intervals. Because these sensors were deployed inside a lab and the measured variables changed little over time (except the light, which showed sudden changes due to the irregular nature of this variable and frequent on/off operation), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate our proposed anomaly detection algorithm, some artificial anomalies are created by randomly modifying some observations, which is widely done by many researchers in the literature [41].

Since our proposed method adopts the cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and the dataset collected on February 29, 2004 are chosen. The data distribution can be seen in [7]. Here only part of the observations (during 0:00:00 am-7:59:59 am) from each sensor node are employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions within the cluster are almost the same, which demonstrates that spatial correlation exists. There are some small differences; careful analysis of the dataset shows that the main reason is missing data points, largely due to packet loss, which can also be seen from Figure 4. In our experiment these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.

Suppose that $D = \{x_i, y_i\}$, $i = 1, 2, \ldots, n$, is a dataset used to train an anomaly detector. Here $x_i$ is a vector of feature values and $y_i$ is the label which indicates whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points are generated and inserted to evaluate the performance of our proposed method. In this paper, 30 artificial anomalous data points per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution very different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a nonfaulty sensor node. The anomalies were generated using a normal randomizer with statistical characteristics deviating slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
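A hedged sketch of this kind of anomaly injection is shown below; the shift and scale factors are our own illustrative choices, not the values used to produce Table 1:

```python
import numpy as np

def inject_anomalies(normal, count=30, shift=1.5, scale=1.2, seed=0):
    """Append `count` synthetic anomalies drawn from a normal distribution whose
    mean/variance deviate slightly from the normal data's statistics."""
    normal = np.asarray(normal, dtype=float)
    rng = np.random.default_rng(seed)
    mu, sigma = normal.mean(), normal.std()
    anomalies = rng.normal(mu + shift * sigma, scale * sigma, size=count)
    data = np.concatenate([normal, anomalies])
    labels = np.concatenate([np.zeros(len(normal)), np.ones(count)])  # 1 = anomalous
    return data, labels
```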

4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate our proposed method, some commonly used performance evaluation metrics for anomaly detection are used in this paper, namely detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR). They are described as follows:

$$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \qquad \mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}, \quad (7)$$

Figure 4: The data trend ((a) temperature, (b) humidity) during 0:00:00 am-7:59:59 am on February 29, 2004, for nodes N7-N10.

Table 1: Detailed dataset information of the selected sensor nodes on February 29, 2004 (T: temperature, H: humidity).

Node  Initial samples  Mean (T / H)        Variance (T / H)   Injected anomalies  Mean (T / H)     Variance (T / H)
N7    823              18.4154 / 40.9176   0.5238 / 1.4494    30                  18.21 / 41.10    0.54 / 1.46
N8    548              17.9844 / 41.7123   0.5315 / 1.4612    30                  17.75 / 41.95    0.55 / 1.48
N9    652              18.1140 / 42.6295   0.5288 / 1.4827    30                  18.35 / 42.45    0.55 / 1.50
N10   620              18.1144 / 42.6215   0.5244 / 1.4191    30                  18.33 / 42.47    0.54 / 1.43

where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.
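The metrics in (7) are straightforward to compute; a small helper (our own, not from the paper's code) is sketched here:

```python
def detection_metrics(y_true, y_pred):
    """Compute ACC, TPR, and FPR of formula (7) from 0/1 labels (1 = anomalous)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return acc, tpr, fpr
```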

BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows.

Habitat (population) size $S = 30$; the number of SIVs (suitability index variables) in each island $n = 20, 40, 60, 80$; the maximum migration rates $E = 1$ and $I = 1$; the mutation rate $\eta = 0.01$; $\lambda$ and $\mu$ are the immigration rate and the emigration rate, respectively; the elitism parameter $\rho = 2$.

The HSI (habitat suitability index) is the fitness function, as in other population-based optimization algorithms. The HSI is evaluated by the $F$-measure ($F$-score), which considers both the precision and the recall of the binary classification problem:

$$F\text{-measure} = \frac{(1 + \beta^2) * \text{precision} * \text{recall}}{\beta^2 * \text{precision} + \text{recall}} = \frac{(1 + \beta^2) * \mathrm{TP}}{(1 + \beta^2) * \mathrm{TP} + \beta^2 * \mathrm{FN} + \mathrm{FP}}. \quad (8)$$

The $F$-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. $\beta$ is a parameter used to adjust the relative importance of precision and recall, $\beta = 0.5, 1, 2$. Usually the value of the $F$-measure is close to the smaller of precision and recall; that is, a large $F$-measure means that precision and recall are both large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, whereas a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, $\beta = 1$ is used.
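For completeness, the HSI used in the BBO sketch above can be taken as the $F$-measure of formula (8); a minimal helper (with $\beta = 1$ as in the paper) might look as follows:

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure of formula (8), usable as the HSI fitness of a candidate subensemble."""
    denom = (1 + beta ** 2) * tp + beta ** 2 * fn + fp
    return (1 + beta ** 2) * tp / denom if denom else 0.0
```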

4.3. Results Presentation and Discussions. In the data mining and machine learning communities, SVM-based methods have been widely used for classification problems; they separate the data belonging to different classes by fitting a hyperplane. The one-class SVM, as a variation of this method, is especially favored for anomaly detection [42-44]. In this paper it was used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder, as the test set, was used to evaluate the proposed method.

Table 2: Detection performance of the local ensemble detector.

Ensemble size   N7 (ACC / TPR / FPR)        N8 (ACC / TPR / FPR)        N9 (ACC / TPR / FPR)        N10 (ACC / TPR / FPR)
5               0.8700 / 0.5833 / 0.1181    0.7900 / 0.3333 / 0.1809    0.8267 / 0.5000 / 0.1549    0.8267 / 0.5714 / 0.1608
10              0.8800 / 0.6667 / 0.1111    0.8033 / 0.3889 / 0.1702    0.8267 / 0.4375 / 0.1514    0.8333 / 0.6429 / 0.1573
15              0.8900 / 0.7500 / 0.1042    0.8167 / 0.5000 / 0.1631    0.8433 / 0.5000 / 0.1373    0.8600 / 0.7143 / 0.1329
20              0.8933 / 0.8333 / 0.1042    0.8200 / 0.5000 / 0.1596    0.8367 / 0.5000 / 0.1444    0.8567 / 0.7143 / 0.1364

Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size   N7 (ACC / TPR / FPR)        N8 (ACC / TPR / FPR)        N9 (ACC / TPR / FPR)        N10 (ACC / TPR / FPR)
20                       0.9467 / 0.8333 / 0.0486    0.9300 / 0.7778 / 0.0603    0.9467 / 0.7500 / 0.0423    0.9500 / 0.7857 / 0.0420
40                       0.9700 / 0.7500 / 0.0208    0.9433 / 0.8333 / 0.0496    0.9710 / 0.8938 / 0.0246    0.9650 / 0.8929 / 0.0315
60                       0.9700 / 0.8333 / 0.0243    0.9733 / 0.8889 / 0.0213    0.9800 / 0.9375 / 0.0176    0.9783 / 0.9357 / 0.0196
80                       0.9817 / 0.9583 / 0.0174    0.9800 / 0.9444 / 0.0177    0.9767 / 0.9375 / 0.0211    0.9780 / 0.9714 / 0.0217

Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim to achieve two goals: firstly, to demonstrate the effectiveness of the proposed method based on ensemble learning theory; secondly, to show that the pruned ensemble detector can obtain better (or at least equal) performance compared to the initial ensemble detector while mitigating the resource requirement. As a result, three experiments were done, that is, the local ensemble anomaly detector considering only the temporal correlation of each sensor node, the global ensemble anomaly detector considering the spatiotemporal correlation, and the global pruned ensemble anomaly detector based on BBO. The experimental results can be seen in Tables 2, 3, and 4, respectively.
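As an illustration of how a base detector could be trained per data chunk in this setting, the following sketch uses scikit-learn's OneClassSVM (the library choice, the chunking, and the hyperparameters are our own assumptions; the paper's implementation was in MATLAB/Visual C++):

```python
import numpy as np
from sklearn.svm import OneClassSVM

def train_local_ensemble(chunks, nu=0.05, gamma="scale"):
    """Train one one-class SVM per data chunk, forming a node's local ensemble."""
    detectors = []
    for chunk in chunks:
        det = OneClassSVM(nu=nu, gamma=gamma).fit(np.asarray(chunk).reshape(-1, 1))
        detectors.append(det)
    return detectors

def is_anomalous(detectors, x):
    """Average vote over the ensemble: OneClassSVM.predict returns -1 for outliers."""
    votes = [1 if det.predict([[x]])[0] == -1 else 0 for det in detectors]
    return sum(votes) / len(votes) >= 0.5
```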

Table 2 shows the performance of each sensor node under different ensemble sizes when the spatial correlation of the sensed data in the cluster is not taken into account. Though the ensemble detection performance gradually improves with increasing ensemble size (higher ACC and TPR values and lower FPR values indicate better performance), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detectors were trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], an obvious fact is that the detection performance is higher than that presented in Table 2; with the help of the neighbors' detectors, the detection results become better and better as the ensemble size increases.

In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.

Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only for validating the method, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to use is an open topic and is decided by many factors, such as the computation capability, communication cost, and memory usage of the sensor node, the expected detection accuracy, and so on. In practical applications a trade-off is commonly considered.

5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints of WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and to obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that our proposed method is effective.

Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of our work we plan to include some diversity measures in the fitness function to improve the detection performance. Besides, since the cost of communication is the main reason for the quick energy depletion of sensor nodes, especially the cluster head, adaptive selection of the cluster head based on its energy state will be taken into account to lengthen the lifetime of WSNs in future work.

Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned)   N7 (ACC / TPR / FPR)        N8 (ACC / TPR / FPR)        N9 (ACC / TPR / FPR)        N10 (ACC / TPR / FPR)
14                           0.9480 / 0.8000 / 0.0458    0.9327 / 0.7667 / 0.0567    0.9500 / 0.8125 / 0.0423    0.9533 / 0.8571 / 0.0420
23                           0.9710 / 0.7750 / 0.0208    0.9447 / 0.8000 / 0.0461    0.9733 / 0.9250 / 0.0239    0.9697 / 0.9143 / 0.0276
27                           0.9713 / 0.8500 / 0.0236    0.9683 / 0.8333 / 0.0230    0.9810 / 0.9563 / 0.0176    0.9797 / 0.9357 / 0.0182
32                           0.9820 / 0.9750 / 0.0177    0.9750 / 0.8333 / 0.0160    0.9820 / 0.9500 / 0.0162    0.9830 / 0.9786 / 0.0168

Table 5: Rate of resource cost saving with the BBO-pruned global ensemble detector.

Number   Initial ensemble size   Pruned ensemble size   Saved resource cost
1        20                      14                     30%
2        40                      23                     42.5%
3        60                      27                     55%
4        80                      32                     60%

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of Top Key Discipline of Computer Software and Theory in Zhejiang Provincial (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).

References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.

[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.

[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.

[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.

[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.

[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.

[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.

[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.

[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.

[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.

[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.

[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.

[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.

[14] S. Rajasegarar, A. Gluhak, M. Ali Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.

[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.

[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.

[17] S. Seguí, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.

[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.

[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.

[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.

[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.

[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.

[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.

[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.

[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.

[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.

[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.

[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.

[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.

[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.

[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.

[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.

[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.

[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.

[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing Journal, vol. 13, no. 4, pp. 1978–1996, 2013.

[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.

[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.

[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.

[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.

[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.

[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.

[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.

[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.

[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 3: Research Article A Novel Distributed Online Anomaly ...downloads.hindawi.com/journals/ijdsn/2015/146189.pdfin the cluster head node to obtain an optimized subset of ensemble members

International Journal of Distributed Sensor Networks 3

the noise data may deteriorate some chunks, the ensemble can still produce relatively accurate predictions. The disadvantage of the horizontal ensemble is that streaming data change continuously, so the information contained in previous chunks may become invalid, and reusing these old concept models does not improve the overall prediction. The latter pattern is the vertical ensemble, which builds the ensemble model only from the newest chunk. Its advantage is that it applies different algorithms (heterogeneous ensemble) to the same dataset, or the same algorithm (homogeneous ensemble) to different sampled subsets of the chunk, which decreases the bias error between models. Its disadvantage is that it assumes the current data chunk is error-free, a precondition that is usually hard to meet in practice. Because online ensemble learning can address the concept drift and noisy data problems in streaming data, ensemble learning has been used for anomaly detection in WSNs [23-25]. In this paper, after exploiting the spatiotemporal correlation existing in the sensed dataset of WSNs, a distributed method is proposed based on the horizontal ensemble and a vertical-like ensemble. Section 3 gives the detailed description.

2.3. Ensemble Pruning Based on Optimization Search Method. Although ensemble learning has many advantages, its nontrivial disadvantage is that it needs more memory and, especially, more communication resources to store and exchange multiple detectors in WSNs, which drains energy quickly and is intolerable in WSNs. The principle of "many could be better than all" in the ensemble learning community [9] implies that combining all detectors may not be the best choice. Ensemble pruning, a necessary strategy to address the resource-limitation problem [26], is therefore employed: it selects a subset of the initial ensemble and obtains better, or at least equal, detection performance compared with the original ensemble. Its greatest advantage is that it reduces the communication requirement considerably, since broadcasting relatively few detectors in a WSN saves battery energy. However, pruning an ensemble of size $N$ requires searching the space of $2^N - 1$ nonempty subensembles, which is an NP-complete problem. Hence, heuristic search approaches are used to find an appropriate subset. Biogeography-based optimization (BBO) [27, 28], a novel population-based global optimization method, shares some features with existing optimization methods such as the genetic algorithm (GA) and harmony search (HS) [29]. In this paper, BBO is used to obtain an optimal/suboptimal ensemble for reducing the communication cost. To the best of our knowledge, no previous work has applied this optimization method in the field of WSNs, and our study extends its application.

3. Proposed Method

Motivated by the development of online ensemble learning methodology [25] and considering the resource limitation of sensor nodes in WSNs, we propose a distributed online anomaly detection method based on ensemble learning. Further, BBO is used for ensemble pruning to decrease the communication and memory requirements.

Figure 1: The considered WSN (base station (BS), cluster head (CH) nodes, non-CH nodes, CH-to-BS links, and the boundaries of the clusters).

3.1. Problem Statement of WSNs. In this paper, we assume that the WSN is deployed in an unattended area and that, to assure the quality of the sensed data, the sensor nodes are deployed densely. Besides, we assume that the sensor nodes are time synchronized, which is mainly for clarity of presentation rather than a limitation of the proposed method. Figure 1 shows a WSN consisting of a large number of sensor nodes and a base station (BS) [30]. Generally, a WSN can be represented as a graph $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_{|V|}\}$ is a finite set of vertices and $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is a finite set of edges; a vertex $v_i$ ($i = 1, \ldots, |V|$) refers to a sensor node, and an edge $e_i$ ($i = 1, \ldots, |E|$) refers to a one-hop or multihop communication link between two reachable sensors $v_i$ and $v_j$.

From Figure 1, we can see that clusters are formed based on the nodes' geographical positions and reachable communication capability. Here we only consider one-hop communication among sensor nodes; similarly, this assumption is mainly for clear presentation of the proposed method rather than a limitation on the communication capability of the sensors, and the method can easily be extended to multihop relaying communication. Besides, in order to describe the proposed anomaly detection method concisely, a relatively small, densely deployed subnetwork is taken into account, which forms a cluster $C_i$ consisting of one cluster head node and a number of member sensor nodes, denoted $\mathrm{CH}_i$ and $N_{ij}$, $j = 1, \ldots, |C_i|$, respectively. For the whole WSN, $V = C_1 \cup C_2 \cup \cdots \cup C_n$ and $C_i \cap C_j = \varnothing$. All nodes in a cluster are reachable from each other by one-hop communication, and the communication between clusters depends on the direct links of the cluster heads. In each cluster, the selection of the cluster head is randomized among all nodes in that cluster to avoid draining its energy.

Consider one cluster $C_i = \{\mathrm{CH}_i, N_{i1}, \ldots, N_{im}\}$, which contains a cluster head $\mathrm{CH}_i$ and its $m$ spatially neighboring nodes ($N_{ij}$, $j = 1, \ldots, m$). Each sensor node in the subnetwork


measures a data vector at every time interval $\Delta t$, which is composed of multiple attribute values. For the cluster head $\mathrm{CH}_i$, the observation is $X^i = (x^i_1, x^i_2, \ldots, x^i_d)$, where $d$ denotes the dimension. For the $j$th neighbor node $N_{ij}$, the observation is $X^i_j = (x^i_{j1}, x^i_{j2}, \ldots, x^i_{jd})$. Nodes in the cluster collect samples synchronously, and our proposed method identifies these new observations of each sensor node as normal or anomalous online.

3.2. Spatial and Temporal (Spatiotemporal) Correlation of the Sensed Dataset. For the sensed dataset in a cluster, we first describe the spatiotemporal correlation, which will be used later to build the proposed online ensemble detector.

The dataset collected from a WSN is a time series dataset. A time series is a sequence of values $X = \{x(t), t = 1, \ldots, n\}$ that follows a nonrandom order, where the $n$ consecutive observation values are collected at the same time intervals. Analyzing and learning from these observations [31] helps to understand the data trend over time, to build an appropriate detector based on the temporal correlation, and to predict the label of newly arriving observations.

To obtain the detector, the foremost requirement is a stationary time series. Several data processing methods can be used to eliminate the trend and obtain a stationary time series, such as polynomial fitting, moving averages, differencing, and double exponential smoothing [32-34]. Considering the requirement of low computational complexity, a simple and efficient nonparametric technique (i.e., first differencing) is used to eliminate the temporal trend of the data collected in WSNs, which can be formulated as

$$X' = \{x'(s, t) = x(s, t) - x(s, t - 1)\}, \quad t = 2, 3, \ldots, n. \quad (1)$$
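For illustration, the following is a minimal sketch of first differencing (1) applied to one node's reading sequence; the function name and the synthetic data are only for this example.

```python
import numpy as np

def first_difference(readings):
    """Return the first-differenced series x'(t) = x(t) - x(t-1), t >= 2 (eq. (1))."""
    readings = np.asarray(readings, dtype=float)
    return readings[1:] - readings[:-1]

# Example: a slowly drifting temperature series becomes roughly stationary.
temperature = np.array([18.10, 18.12, 18.15, 18.20, 18.24])
print(first_difference(temperature))   # [0.02 0.03 0.05 0.04]
```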

Besides, the sensor nodes are always deployed densely, so spatial redundancy exists. A dataset $X = \{x(s), s = 1, \ldots, m\}$ is collected from the $m$ sensor nodes in a cluster at one timestamp. This dataset helps to understand the spatial correlation structure of the data and to predict the data value at a nearby location. Spatial data may present local dependency, that is, the similarity of observations collected at adjacent locations in a local region. Usually, for a specified region, the observation of one sensor can be estimated by a linear weighted combination of the observations collected at its adjacent locations [32], which can be expressed as

$$x(s_i) = \lambda_1 x(s_1) + \cdots + \lambda_{i-1} x(s_{i-1}) + \lambda_{i+1} x(s_{i+1}) + \cdots + \lambda_m x(s_m), \quad (2)$$

where $s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_m$ denote the positions of the sensor nodes, $\lambda_1, \ldots, \lambda_{i-1}, \lambda_{i+1}, \ldots, \lambda_m$ denote the weights of the observations, and $\sum_{k=1, k \neq i}^{m} \lambda_k = 1$; a small sketch of this weighted estimate is given at the end of this subsection.

Consequently, for sensed data collected in a local region, two reasonable assumptions are made:

(1) the sensed data of adjacent nonfaulty sensor nodes are similar at the same timestamp;

(2) the sensed data of adjacent nonfaulty sensor nodes have a similar trend over time.

Motivated by these two assumptions and ensemble learning theory, a novel anomaly detection method is proposed in this paper. Details are given in the following section.
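As a hedged illustration of (2), the sketch below estimates one node's reading from its neighbors with weights that sum to one; the inverse-distance weighting and the numbers are assumptions made only for this example, not part of the paper's method.

```python
import numpy as np

def spatial_estimate(neighbor_values, weights):
    """Estimate x(s_i) as a weighted combination of neighbor readings (eq. (2))."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # enforce sum of lambda_k equal to 1
    return float(np.dot(weights, neighbor_values))

# Example: three neighbors of a node, weighted by (assumed) inverse distance.
neighbor_values = [18.20, 18.35, 18.05]        # readings of adjacent nodes
inverse_distance = [1.0, 0.5, 0.8]
print(spatial_estimate(neighbor_values, inverse_distance))  # about 18.18
```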

3.3. Proposed Ensemble Learning Method for Anomaly Detection in WSNs. Spatiotemporal correlation exists among the sensor data in a local region of a WSN, and a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is considered to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector and mitigate the resource requirements. The optimized ensemble detector is then used to identify global anomalous observations at each individual sensor in a timely manner. The proposed method is shown in Figure 2.

The online anomaly detection method consists of three key procedures, that is, detector training, online detecting, and online detector updating. From Figure 2, it can be seen that the proposed method enables each distributed sensor node to judge, globally and in time, whether every new observation is normal or anomalous. Distributed detecting is employed to balance the load (communication, computation, and storage) evenly across the network and to prolong the lifetime of the whole network.

The whole procedure of the proposed method is described as follows.

Step 1. Considering the temporal correlation within a certain time period, each sensor node $s_i$ trains a local ensemble detector using the history dataset collected over a time interval. In fact, using this initial local ensemble detector, whether a new observation is normal or anomalous can already be determined locally.

Step 2. Each sensor node $s_i$ transmits its local ensemble detector, as well as some related parameters such as the maximum value, minimum value, and mean of the training dataset, to the cluster head node and the other member sensor nodes.

Step 3. The cluster head node receives the local ensemble detectors from its member nodes and, combined with its own trained detector, builds the initial global ensemble detector.

Step 4. The BBO method is applied in the cluster head to prune the initial global ensemble detector and obtain an acceptable final ensemble detector.

Step 5. The pruned ensemble detector, that is, the final ensemble detector, is broadcast to each member sensor node for online global anomaly detection.

Step 6. Each sensor node selectively retains the test data for online updating, based on the predefined sampling probability $p$.


Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs (each member node MN1, ..., MNm learns a local ensemble detector from its training data and broadcasts it; the cluster head node (CH) aggregates the local detectors into a global initial ensemble detector, prunes it with BBO, and broadcasts the result back to the member nodes as a bit-coded state matrix).

Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.

This method scales well with an increasing number of nodes in the WSN owing to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.

Next, we describe the important procedures mentioned above in detail. Further, considering the resource constraints of each sensor node in WSNs, some tricks are designed to reduce the communication and memory requirements.

3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially on each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, a previously trained detector may be useless for future detection. Moreover, the limited memory of a sensor node is another constraint on storing too many previous detectors. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble of one sensor node. For example, for sensor node $i$, the sensed data are collected and divided into data chunks based on a time interval $\Delta t$, which is determined by the actual monitoring process; consequently, each node trains multiple individual detectors over time. In this paper, supposing that the $n$ latest detectors are kept for a sensor node and there are $m$ nodes in one cluster, then $n * m$ detectors in total are obtained for the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its $n$ trained detectors within the cluster. Taking the cluster head as an example, after all $n * (m - 1)$ individual detectors are received from its member nodes, the cluster head combines them with its own $n$ trained detectors, and the initial ensemble (containing $n * m$ individual detectors) is built in the cluster head node.

Many techniques can be employed to combine the results of the individual detectors into the final detection result. The commonly used methods in the literature are the majority vote (for classification problems) and the weighted average (for regression problems). In this paper, the final ensemble detection result is calculated by (3), where $w_i$ denotes the weight coefficient; $w_i = 1$ means the simple average, otherwise a weighted average is used. For simplicity, the simple average strategy is employed to combine the final result:

$$y_{\mathrm{fin}}(x) = \frac{1}{n * m} \sum_{i=1}^{n * m} y_i(x) * w_i. \quad (3)$$
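A minimal sketch of the combination rule (3), assuming each base detector exposes a scoring function; the toy detectors below are placeholders for illustration, not the paper's exact implementation.

```python
def combine_detectors(detectors, x, weights=None):
    """Average the outputs y_i(x) of the base detectors as in eq. (3)."""
    if weights is None:
        weights = [1.0] * len(detectors)     # simple average (w_i = 1)
    scores = [w * d(x) for d, w in zip(detectors, weights)]
    return sum(scores) / len(detectors)

# Example with three toy detectors that output 1 for anomaly, 0 for normal.
detectors = [lambda x: x > 20.0, lambda x: x > 19.5, lambda x: x > 25.0]
print(combine_detectors(detectors, 21.0))   # 2/3: flag as anomaly if the average exceeds 0.5
```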

3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.

Given an initial ensemble anomaly detector $E = \{\mathrm{AD}_1, \mathrm{AD}_2, \ldots, \mathrm{AD}_{n*m}\}$, where $\mathrm{AD}_i$ is a trained anomaly detector that can test whether an observation is anomalous or not, a combination method $C$, and a test dataset $T$, the goal of ensemble pruning is to find an optimal/suboptimal subset $E' \subseteq E$ that minimizes the generalization error and obtains better, or at least the same, detection performance compared to $E$.


Input: E (initial ensemble anomaly detector), T (number of maximization iterations)
Output: E' (final ensemble anomaly detector)
/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, the HSI values
/* Optimization search process */
While (T)
    Compute the immigration rate lambda and emigration rate mu for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on lambda_i
    If H_i is selected
        Select H_j with probability based on mu_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with the one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate eta
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T - 1
End while
/* Ensemble pruning */
Get the final ensemble anomaly detector E' from the habitat H* with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).

Let $f_{ij}$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$) be the fitness values of the detection performance, such as the true positive rate, false positive rate, accuracy, and so on. The fitness value $F$ can then be defined as (4), based on the results on the testing data:

$$F = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ f_{m1} & f_{m2} & \cdots & f_{mn} \end{bmatrix}. \quad (4)$$

The final fitness function can be defined as

$$\text{Maximize} \sum_{i=1,\ldots,m;\; j=1,\ldots,n}^{N'} f_{ij}, \quad \text{s.t. } N' \le m * n. \quad (5)$$

Here, the problem of ensemble pruning is to find the subset $E'$ composed of part of the single detectors. Finding the optimal subset requires heavy and delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed to find an acceptable ensemble subset. We only present some key information about BBO; the interested reader is referred to the detailed description in [28].

BBO is a population-based global optimization method that shares common characteristics with existing evolutionary algorithms (EAs) such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it searches the solution domain for an optimal/suboptimal solution, migration operators are employed to share information among solutions, which makes BBO applicable to many problems to which GA and PSO are applied. The distinctive differences between BBO and other EAs can be found in [27, 28].

The pseudo-code of ensemble pruning based on BBO is shown in Algorithm 1 [7]. Here, $H$ indicates a habitat, HSI (habitat suitability index) is the fitness, and an SIV (suitability index variable) is a solution feature.
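The following is a minimal, hedged sketch of BBO-style ensemble pruning in the spirit of Algorithm 1: each habitat is a bit vector selecting detectors, migration copies bits from high-fitness habitats, and mutation flips bits. The fitness function, rates, and sizes are simplified assumptions for illustration rather than the authors' exact implementation.

```python
import random

def bbo_prune(fitness_fn, num_detectors, pop_size=30, iters=50, mutation_rate=0.01):
    """BBO-style search for a detector-selection bit vector (1 = keep the detector)."""
    habitats = [[random.randint(0, 1) for _ in range(num_detectors)]
                for _ in range(pop_size)]
    for _ in range(iters):
        hsi = [fitness_fn(h) for h in habitats]
        order = sorted(range(pop_size), key=lambda k: hsi[k], reverse=True)
        rank = {idx: r for r, idx in enumerate(order)}        # rank 0 = best habitat
        # Good habitats emigrate more; poor habitats immigrate more.
        mu = [(pop_size - rank[j]) / pop_size for j in range(pop_size)]
        for i in range(pop_size):
            lam = (rank[i] + 1) / pop_size                    # immigration rate of habitat i
            for s in range(num_detectors):
                if random.random() < lam:                     # migrate this SIV
                    j = random.choices(range(pop_size), weights=mu)[0]
                    habitats[i][s] = habitats[j][s]
                if random.random() < mutation_rate:           # mutate this SIV
                    habitats[i][s] = 1 - habitats[i][s]
    return max(habitats, key=fitness_fn)

# Toy fitness: reward the (assumed) F-scores of kept detectors, penalize ensemble size.
f_scores = [random.uniform(0.6, 0.95) for _ in range(20)]
fitness = lambda bits: sum(f * b for f, b in zip(f_scores, bits)) - 0.3 * sum(bits)
print(bbo_prune(fitness, num_detectors=20))
```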

3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main cause of quick energy depletion is the radio communication among the sensor nodes. It is known that the cost of communicating one bit equals the cost of processing thousands of bits in a sensor [35]. This means that most of the energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity decreases the power requirement and eventually lengthens the lifetime of the whole WSN.

It is obvious that the aforementioned method has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each


Input: E' (current pruned ensemble anomaly detector), p (sampling probability)
Output: E* (updated pruned ensemble anomaly detector)

For each sensor node
    Retain the new observation with probability p
    If the buffer is replaced completely by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E', T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection
    End if
End for

Algorithm 2: Online_Updating(E', p).

member sensor node. In order to relieve the communication burden, some skills are used to reduce the communication overhead.

In fact, the distributed training/learning method only transmits the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared with centralized anomaly detection schemes that send all training data to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head node. A straightforward method is to broadcast this pruned ensemble to the member sensor nodes; this is a commonly used strategy, but it does not make full use of the local ensemble detector information and costs more communication resources. Here, a state matrix $P$ is designed in the cluster head; its element $p_{ij}$, defined by formula (6), represents each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector. A detector is included in or excluded from the final ensemble depending on the value of the corresponding bit, that is, 1 denotes that the single detector is included in the final ensemble and 0 means that it is not included:

$$p_{ij} = \begin{cases} 1, & \mathrm{AD}_{ij} \in E', \ i = 1, \ldots, m, \ j = 1, \ldots, n, \\ 0, & \text{otherwise}, \end{cases}$$

$$P = \begin{bmatrix} 0 & 1 & \cdots & 0 & 1 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 0 & 1 & 1 & \cdots & 0 \\ \vdots & & & & & & & \vdots \\ 0 & 0 & \cdots & 1 & 1 & 1 & \cdots & 1 \end{bmatrix}, \quad (6)$$

where row $i$ corresponds to sensor node $S_i$ ($i = 1, \ldots, m$) and column $j$ ($j = 1, \ldots, n$) indexes the $j$th detector trained by that node (the 0/1 entries shown are illustrative).

After the pruning procedure is finished, the cluster head broadcasts the state matrix $P$ to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, after the ensemble pruning is finished, $N'$ ($N' \le n * m$) individual detectors are broadcast in the cluster. If the matrix $P$ is not used, this requires $4 * N' * d$ bytes of communication (supposing that an individual detector can be represented by $d$ parameters and each parameter needs at least 4 bytes). If the matrix $P$ is introduced, each item of $P$ needs only 1 bit to represent an individual detector, so only $m * n / 8$ bytes are required for the broadcast. Supposing that one-third of the individual detectors are pruned (i.e., $N' = 2 * n * m / 3$), then $(4 * n * m * d * 2/3) / (m * n / 8) \approx 21.33 d$. By introducing ensemble pruning and the state matrix, the energy saving in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
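As a small sanity check of the bit-coding argument, the sketch below packs the pruned-ensemble selection into a bitmask-sized payload and compares it with sending the selected detectors' parameters directly; the parameter count d and the other sizes are assumptions for illustration.

```python
import math

def state_matrix_bytes(m, n):
    """Bytes needed to broadcast the m x n state matrix, one bit per detector."""
    return math.ceil(m * n / 8)

def raw_broadcast_bytes(n_selected, d, bytes_per_param=4):
    """Bytes needed to broadcast the N' selected detectors' parameters directly."""
    return bytes_per_param * n_selected * d

m, n, d = 4, 20, 10                      # 4 nodes, 20 detectors each, d parameters per detector
n_selected = 2 * n * m // 3              # one-third of the detectors pruned
print(raw_broadcast_bytes(n_selected, d))                              # 2120 bytes without P
print(state_matrix_bytes(m, n))                                        # 10 bytes with P
print(raw_broadcast_bytes(n_selected, d) / state_matrix_bytes(m, n))   # about 212, i.e., roughly 21.33 * d for d = 10
```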

3.3.4. Online Update and Relearning. The distribution of the sensed dataset may change over time, so detector updating is necessary, and an online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) caters to this situation and saves computation, communication, and memory resources to some extent. Simply put, whether a newly arriving observation is saved and used to update the current detector is decided by a sampling probability $p$. Some heuristic rules can be employed to guide its value; for example, if the dynamics are relatively stationary, a small $p$ should be used, otherwise a larger $p$ should be chosen. When the buffer of a sensor node is completely replaced by new data, the online update is triggered and a new detector is trained. The pseudo-code of this algorithm is shown in Algorithm 2.
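A hedged sketch of the delayed-updating idea: observations are kept with probability p, and retraining is triggered once the buffer has been fully refreshed. The buffer size, the retraining callback, and the class name are placeholders for this example.

```python
import random
from collections import deque

class DelayedUpdater:
    """Keep new observations with probability p; trigger retraining when the buffer turns over."""
    def __init__(self, buffer_size, p, retrain_fn):
        self.buffer = deque(maxlen=buffer_size)
        self.p = p
        self.retrain_fn = retrain_fn
        self.fresh = 0                      # observations accepted since the last retraining

    def observe(self, x):
        if random.random() < self.p:        # selectively retain the test data
            self.buffer.append(x)
            self.fresh += 1
            if self.fresh >= self.buffer.maxlen:
                self.retrain_fn(list(self.buffer))   # relearn from the refreshed buffer
                self.fresh = 0

updater = DelayedUpdater(buffer_size=100, p=0.2,
                         retrain_fn=lambda data: print("retrain on", len(data), "samples"))
for reading in range(2000):
    updater.observe(reading)
```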

4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a personal PC with an Intel Core 2 Duo P7450 CPU at 2.13 GHz and 4 GB of memory, running Windows 7 Professional. The data processing was partly done in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.

4.1. Dataset and Data Preprocessing. The IBRL dataset [37] was used in this paper to validate the proposed method. It was collected from a WSN deployed in the Intel Berkeley Research Laboratory and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38-41]. The network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the


Figure 3: Sensor node locations in the IBRL deployment (54 numbered nodes laid out on the lab floor plan, including the server room, kitchen, copy room, storage, conference room, and offices).

deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurements, that is, light, temperature, humidity, and voltage, were collected, and the measurements were recorded at 31 s intervals. Because these sensors were deployed inside a lab and the measured variables changed little over time (except for light, which shows sudden changes due to its irregular nature and frequent on/off operation), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, artificial anomalies were created by randomly modifying some observations, a practice widely used in the literature [41].

Since the proposed method adopts a cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and the dataset collected on 29/02/2004 are chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 0:00:00 am to 7:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions in a cluster are almost the same, which confirms that spatial correlation exists. Although there are some minor differences, careful analysis shows that the main reason is that the dataset has some missing data points, largely due to packet loss, which can be further observed in Figure 4. In our experiments, these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.

Suppose that $D = \{x_i, y_i\}, i = 1, 2, \ldots, n$, is a dataset used to train an anomaly detector, where $x_i$ is a vector of feature values and $y_i$ is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points are generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomaly data points per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible; besides, an anomalous event should be a small-probability event with respect to a normal dataset collected by a nonfaulty sensor node. The anomalies were generated using a normal randomizer whose statistical characteristics deviate slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.

4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted, namely, the detection accuracy (ACC), the true positive rate (TPR), and the false positive (alarm) rate (FPR). They are defined as follows:

$$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \qquad \mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}, \quad (7)$$

where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.

Figure 4: The data (temperature and humidity) trend during 0:00:00 am to 7:59:59 am on February 29, 2004, for nodes N7, N8, N9, and N10.

Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature; H: humidity).

Node | Initial samples | Mean (T) | Mean (H) | Variance (T) | Variance (H) | Injected anomalies | Anomaly mean (T) | Anomaly mean (H) | Anomaly variance (T) | Anomaly variance (H)
N7  | 823 | 18.4154 | 40.9176 | 0.5238 | 1.4494 | 30 | 18.21 | 41.10 | 0.54 | 1.46
N8  | 548 | 17.9844 | 41.7123 | 0.5315 | 1.4612 | 30 | 17.75 | 41.95 | 0.55 | 1.48
N9  | 652 | 18.1140 | 42.6295 | 0.5288 | 1.4827 | 30 | 18.35 | 42.45 | 0.55 | 1.50
N10 | 620 | 18.1144 | 42.6215 | 0.5244 | 1.4191 | 30 | 18.33 | 42.47 | 0.54 | 1.43

BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size $S = 30$; number of SIVs (suitability index variables) in each island $n = 20, 40, 60, 80$; maximum migration rates $E = 1$ and $I = 1$; mutation rate $\eta = 0.01$, where $\lambda$ and $\mu$ are the immigration rate and the emigration rate, respectively; and elitism parameter $\rho = 2$.

The HSI (habitat suitability index) is a fitness function, as in other population-based optimization algorithms. The HSI is evaluated by the $F$-measure ($F$-score), which considers both the precision and the recall of a binary classification problem:

$$F\text{-measure} = \frac{(1 + \beta^2)\, \mathrm{precision} * \mathrm{recall}}{\beta^2 * \mathrm{precision} + \mathrm{recall}} = \frac{(1 + \beta^2) * \mathrm{TP}}{(1 + \beta^2) * \mathrm{TP} + \beta^2 * \mathrm{FN} + \mathrm{FP}}. \quad (8)$$

The $F$-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. $\beta$ is a parameter used to adjust the relative importance of precision and recall, typically $\beta = 0.5, 1, 2$. Usually, the value of the $F$-measure is close to the smaller of precision and recall, that is, a large $F$-measure means that both precision and recall are large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, $\beta = 1$ is used.
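For illustration, a small sketch computing ACC, TPR, FPR (7) and the F-measure (8) from confusion counts; the example counts are made up.

```python
def detection_metrics(tp, fp, tn, fn, beta=1.0):
    """Compute ACC, TPR, FPR (eq. (7)) and the F-measure (eq. (8))."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    f_measure = (1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp)
    return acc, tpr, fpr, f_measure

# Example confusion counts (made up): 25 anomalies caught, 5 missed, 10 false alarms.
print(detection_metrics(tp=25, fp=10, tn=260, fn=5))   # ACC 0.95, TPR 0.833, FPR 0.037, F1 0.769
```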

4.3. Results and Discussion. In the data mining and machine learning communities, SVM-based methods have


Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7 ACC | N7 TPR | N7 FPR | N8 ACC | N8 TPR | N8 FPR | N9 ACC | N9 TPR | N9 FPR | N10 ACC | N10 TPR | N10 FPR
5  | 0.8700 | 0.5833 | 0.1181 | 0.7900 | 0.3333 | 0.1809 | 0.8267 | 0.5000 | 0.1549 | 0.8267 | 0.5714 | 0.1608
10 | 0.8800 | 0.6667 | 0.1111 | 0.8033 | 0.3889 | 0.1702 | 0.8267 | 0.4375 | 0.1514 | 0.8333 | 0.6429 | 0.1573
15 | 0.8900 | 0.7500 | 0.1042 | 0.8167 | 0.5000 | 0.1631 | 0.8433 | 0.5000 | 0.1373 | 0.8600 | 0.7143 | 0.1329
20 | 0.8933 | 0.8333 | 0.1042 | 0.8200 | 0.5000 | 0.1596 | 0.8367 | 0.5000 | 0.1444 | 0.8567 | 0.7143 | 0.1364

Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7 ACC | N7 TPR | N7 FPR | N8 ACC | N8 TPR | N8 FPR | N9 ACC | N9 TPR | N9 FPR | N10 ACC | N10 TPR | N10 FPR
20 | 0.9467 | 0.8333 | 0.0486 | 0.9300 | 0.7778 | 0.0603 | 0.9467 | 0.7500 | 0.0423 | 0.9500 | 0.7857 | 0.0420
40 | 0.9700 | 0.7500 | 0.0208 | 0.9433 | 0.8333 | 0.0496 | 0.9710 | 0.8938 | 0.0246 | 0.9650 | 0.8929 | 0.0315
60 | 0.9700 | 0.8333 | 0.0243 | 0.9733 | 0.8889 | 0.0213 | 0.9800 | 0.9375 | 0.0176 | 0.9783 | 0.9357 | 0.0196
80 | 0.9817 | 0.9583 | 0.0174 | 0.9800 | 0.9444 | 0.0177 | 0.9767 | 0.9375 | 0.0211 | 0.9780 | 0.9714 | 0.0217

been widely used for classification problems, separating the data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42-44]; in this paper, it is used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder, as the test set, was used to evaluate the proposed method.
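A hedged sketch of training one-class SVM base detectors per data chunk, here using scikit-learn's OneClassSVM as a stand-in for the authors' C++ implementation; the kernel, nu value, and chunking are assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def train_local_ensemble(node_readings, n_detectors=5, nu=0.1):
    """Train one one-class SVM per data chunk; keep the n latest detectors as the local ensemble."""
    chunks = np.array_split(np.asarray(node_readings).reshape(-1, 1), n_detectors)
    return [OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(chunk) for chunk in chunks]

def ensemble_vote(detectors, x):
    """Average the +1 / -1 outputs of the detectors; a negative average means anomalous."""
    votes = [int(d.predict([[x]])[0]) for d in detectors]
    return sum(votes) / len(votes)

rng = np.random.default_rng(0)
readings = 18.2 + 0.3 * rng.standard_normal(400)      # synthetic temperature-like data
ensemble = train_local_ensemble(readings)
print(ensemble_vote(ensemble, 18.3), ensemble_vote(ensemble, 25.0))   # near +1 vs near -1
```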

Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: firstly, to prove the effectiveness of the proposed method based on ensemble learning theory; secondly, to prove that the pruned ensemble detector obtains better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirements. Accordingly, three experiments were carried out: the local ensemble anomaly detector considering only the temporal correlation of each sensor node, the global ensemble anomaly detector considering the spatiotemporal correlation, and the global pruned ensemble anomaly detector based on BBO. The experimental results are given in Tables 2, 3, and 4, respectively.
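Online Bagging, as commonly described, presents each arriving example to every base learner a Poisson(1)-distributed number of times. The sketch below shows that update loop under the assumption that each base learner supports an incremental partial_fit-style update; the CountingLearner class is a hypothetical stand-in used only to make the example runnable.

```python
import numpy as np

class CountingLearner:
    """Stand-in base learner that just counts how many (weighted) examples it has seen."""
    def __init__(self):
        self.seen = 0
    def partial_fit(self, x, y):
        self.seen += 1

def online_bagging_update(base_learners, x, y, rng=np.random.default_rng()):
    """Present example (x, y) to each base learner k ~ Poisson(1) times (Online Bagging)."""
    for learner in base_learners:
        for _ in range(rng.poisson(1.0)):
            learner.partial_fit(x, y)
    return base_learners

learners = [CountingLearner() for _ in range(5)]
for t in range(100):
    online_bagging_update(learners, x=[t], y=0)
print([l.seen for l in learners])    # each learner sees roughly 100 weighted examples
```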

Table 2 shows the performance of each sensor node under different ensemble sizes when the spatial correlation of the sensed data in a cluster is not taken into account. Although the ensemble detection performance gradually becomes better with increasing ensemble size (higher ACC and TPR mean better performance, and lower FPR means better performance), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most true positive rates are unacceptable, and most false positive rates are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detectors were trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], an obvious fact is that the detection performances are higher than those in Table 2; with the help of the neighbors' detectors, the detection results keep improving as the ensemble size increases.

In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.

Table 4 shows a more practicable result: the size of the global ensemble decreases sharply, while the detector performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiments, only to validate the method, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local detectors to keep is an open question and is decided by many factors, such as the computation capability, communication cost, and memory usage of the sensor nodes, the expected detection accuracy, and so on. In practical applications, a trade-off is commonly made.

5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Owing to the specific resource constraints in WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that the proposed method is effective.

Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of our work we plan to include some diversity


Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned) | N7 ACC | N7 TPR | N7 FPR | N8 ACC | N8 TPR | N8 FPR | N9 ACC | N9 TPR | N9 FPR | N10 ACC | N10 TPR | N10 FPR
14 | 0.9480 | 0.8000 | 0.0458 | 0.9327 | 0.7667 | 0.0567 | 0.9500 | 0.8125 | 0.0423 | 0.9533 | 0.8571 | 0.0420
23 | 0.9710 | 0.7750 | 0.0208 | 0.9447 | 0.8000 | 0.0461 | 0.9733 | 0.9250 | 0.0239 | 0.9697 | 0.9143 | 0.0276
27 | 0.9713 | 0.8500 | 0.0236 | 0.9683 | 0.8333 | 0.0230 | 0.9810 | 0.9563 | 0.0176 | 0.9797 | 0.9357 | 0.0182
32 | 0.9820 | 0.9750 | 0.0177 | 0.9750 | 0.8333 | 0.0160 | 0.9820 | 0.9500 | 0.0162 | 0.9830 | 0.9786 | 0.0168

Table 5: Rate of resource cost saving of the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Resource cost saving (%)
1 | 20 | 14 | 30
2 | 40 | 23 | 42.5
3 | 60 | 27 | 55
4 | 80 | 32 | 60

measures in the fitness function to improve the detection performance. Besides, since the cost of communication is the main cause of quick energy depletion of sensor nodes, especially the cluster head, the adaptive selection of the cluster head based on its energy state will be considered in future work to lengthen the lifetime of the WSN.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).

References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.

[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.

[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.

[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.

[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.

[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.

[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.

[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.

[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.

[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.

[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.

[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.

[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.

[14] S. Rajasegarar, A. Gluhak, M. Ali Imran, et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.

[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.

[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.

[17] S. Seguí, L. Igual, and J. Vitrià, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.

[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.

[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.

[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.

[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.

[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavaldà, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.

[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.

[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.

[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.

[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.

[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.

[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.

[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.

[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.

[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.

[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.

[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.

[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.

[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing Journal, vol. 13, no. 4, pp. 1978–1996, 2013.

[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.

[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.

[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.

[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek, et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.

[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.

[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.

[42] H. Sagha, H. Bayati, J. D. R. Millán, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.

[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.

[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 4: Research Article A Novel Distributed Online Anomaly ...downloads.hindawi.com/journals/ijdsn/2015/146189.pdfin the cluster head node to obtain an optimized subset of ensemble members

4 International Journal of Distributed Sensor Networks

measures a data vector at every time interval Δ119905 which iscomposed of multiple attribute values For the cluster headCH119894 the observation is 119883

119894= (119909

119894

1 119909119894

2 119909

119894

119889) where 119889

denotes the dimension For the 119895th neighbor node 119873119894119895 the

observation is 119883119894119895= (119909119894

1198951 119909119894

1198952 119909

119894

119895119889) Nodes in the cluster

collect samples synchronously and our proposed method isto identify these new observations of each sensor node asnormal or anomalous online

32 Spatial and Temporal (Spatiotemporal) Correlation ofSensed Dataset For the sensed dataset in a cluster wedescribed the spatiotemporal correlation firstly which will beused later to build our proposed online ensemble detection

The collected sensor dataset from WSNs is a time seriesdataset A time series is a sequence of value 119883 = 119909(119905) 119905 =

1 119899which follows a nonrandom order and the 119899 consec-utive observation values are collected at same time intervalsAnalyzing and learning from these observations [31] canhelp to understand the data trend over time and build theappropriate detector based on temporal correlation as well asto predict the label of new coming observations

To obtain the detector the foremost requirement is toachieve a stationary time series dataset Some data processingmethods are used to eliminate data trend and obtain a sta-tionary time series dataset such as polynomial fitting movingaverages differencing and double exponential smoothing[32ndash34] Considering the requirement of low computationalcomplexity a simple and efficient nonparametric technique(ie first differencing) is used to eliminate the temporal trendand obtain a stationary time series for dataset collected inWSNs which can be formulated as

1198831015840= 1199091015840(119904 119905) = 119909 (119904 119905) minus 119909 (119904 119905 minus 1) 119905 = 2 3 119899 (1)

Besides the sensor nodes are always deployed denselyand the space redundancy existed A dataset 119883 = 119909(119904) 119904 =

1 119898 is collected from 119898 sensor nodes in a clusterat a timestamp This dataset can help to understand thespatial correlation structure of data and predict the datavalue at a location nearby Spatial data may present thelocal dependencywhich represents the similarity relationshipof observations collected at adjacent locations in a localregion Usually for a specified region the observations of onesensor can be estimated by a linear weighted combination ofobservations collected at its adjacent locations [32] which canbe expressed as

x(s_i) = λ_1 x(s_1) + ⋯ + λ_{i−1} x(s_{i−1}) + λ_{i+1} x(s_{i+1}) + ⋯ + λ_m x(s_m),    (2)

where s_1, ..., s_{i−1}, s_{i+1}, ..., s_m denote the positions of the sensor nodes, λ_1, ..., λ_{i−1}, λ_{i+1}, ..., λ_m denote the weights of the observations, and Σ_{k=1, k≠i}^{m} λ_k = 1. Consequently, for sensed data collected in a local region, two reasonable assumptions are described as follows.

(1) The sensed data of adjacent nonfault sensor nodes are similar at the same timestamp.

(2) The sensed data of adjacent nonfault sensor nodes have a similar trend over time.
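A minimal sketch of the spatial estimate in (2), assuming equal weights λ_k that sum to 1 over the neighbors (in practice the weights could also reflect distance or correlation):

def spatial_estimate(neighbor_values):
    # Estimate one node's reading from the readings of the other nodes in the cluster.
    weight = 1.0 / len(neighbor_values)
    return sum(weight * x for x in neighbor_values)

# Example: estimate N7's temperature from N8, N9, and N10 at the same timestamp,
# e.g. spatial_estimate([18.02, 18.11, 18.08]).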

Motivated by these two assumptions and ensemble learning theory, a novel anomaly detection method is proposed in this paper. We give the details in the following section.

3.3. Proposed Ensemble Learning Method of Anomaly Detection in WSNs. Spatiotemporal correlation exists among sensor data in a local region of WSNs, so a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is considered to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector and mitigate the resource requirements. The optimized ensemble detector is used to identify global anomalous observations at each individual sensor in a timely manner. Our proposed method is shown in Figure 2.

The online anomaly detection method consists of three key procedures, that is, detector training, online detecting, and online detector updating. From Figure 2, it can be seen that our proposed method enables each distributed deployment sensor node to judge globally, and in time, whether every new coming observation is normal or anomalous. Distributed detecting is employed to spread the load (communication, computation, and storage) evenly over the network and to prolong the lifetime of the whole network.

The whole procedure of the proposed method is described as follows.

Step 1. Considering the temporal correlation over a certain time period, each sensor node s_i trains a local ensemble detector using the history dataset collected during a time interval. In fact, using this initial local ensemble detector, whether a new coming observation is normal or anomalous can already be determined locally.

Step 2. Each sensor node s_i transmits its local ensemble detector, as well as some related parameters such as the maximum value, minimum value, and mean of the training dataset, to the cluster head node and the other member sensor nodes.

Step 3. The cluster head node receives the local ensemble detectors from its member nodes and, combined with its own trained detector, builds the initial global ensemble detector.

Step 4. The BBO method is introduced in the cluster head to prune the initial global ensemble detector and to obtain an acceptable final ensemble detector.

Step 5. The pruned ensemble detector, that is, the final ensemble detector, is broadcast to each member sensor node for online global anomaly detection.

Step 6. Each sensor node selectively retains the test data for online updating based on the predefined sampling probability p.


[Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs. Each member node (MN1, MN2, ..., MNm) learns a local ensemble detector from its training data (input, preprocess, learn, output) and broadcasts it; the cluster head node (CH) aggregates the received local ensemble detectors with its own into the global initial ensemble detector, prunes it with BBO, and broadcasts the result back to the member nodes as a bit-coded state matrix.]

Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.

This method scales well with an increasing number of nodes in WSNs due to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.

Next, we describe some important procedures mentioned above in detail. Further, considering the resource constraints of each sensor node in WSNs, some tricks are designed to save communication and memory.

3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially for each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, a previously trained detector may be useless for future detection. Moreover, the limited memory resource of the sensor node is another constraint against storing too many previous detectors. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble of one sensor node. For example, for sensor node i, the sensed data are collected and divided into data chunks based on a time interval Δt, which is determined by the actual monitoring process. Consequently, each node trains multiple individual detectors over time. In our paper, supposing the n latest detectors are kept for a sensor node and there are m nodes in one cluster, n·m detectors in total are obtained for the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its n trained detectors in the cluster. Taking the cluster head as an example, after all n·(m − 1) individual detectors are received from its member nodes, the cluster head combines them with its own n trained detectors, and the initial ensemble (including n·m individual detectors) is built in the cluster head node.

Many techniques can be employed for combining the results of the individual detectors to obtain the final detection result. The commonly used methods in the literature are majority vote (for classification problems) and weighted average (for regression problems). In our paper, the final ensemble detection result is calculated by (3), where w_i denotes the weight coefficient; w_i = 1 means simple averaging, otherwise weighted averaging. For simplicity, the simple average strategy is employed to combine the final result:

y_fin(x) = (1 / (n · m)) Σ_{i=1}^{n·m} y_i(x) · w_i.    (3)
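A minimal sketch of the combination rule in (3) with the simple average strategy (w_i = 1), assuming each base detector is a callable that returns 1 for an anomalous observation and 0 for a normal one (the ±1 output of a one-class SVM can be mapped accordingly):

def ensemble_score(detectors, x):
    # Average of the n*m individual detector outputs y_i(x).
    return sum(det(x) for det in detectors) / len(detectors)

def is_anomalous(detectors, x, threshold=0.5):
    return ensemble_score(detectors, x) > threshold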

3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.

Given an initial ensemble anomaly detector E = {AD_1, AD_2, ..., AD_{n·m}}, where AD_i is a trained anomaly detector that can test whether an observation is anomalous or not, a combination method C, and a test dataset T, the goal of ensemble pruning is to find an optimal/suboptimal subset E' ⊆ E which can minimize the generalization error and obtain better or at least the same detection performance compared to E.


Input: E: initial ensemble anomaly detector; T: the maximum number of iterations
Output: E': final ensemble anomaly detector
/* BBO parameter initialization */
Create a random set of habitats (population) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, HSI values
/* Optimization search process */
While (T)
    Compute the immigration rate λ and emigration rate μ for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on λ_i
    If H_i is selected
        Select H_j with probability based on μ_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with the one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate η
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T − 1
End while
/* Ensemble pruning */
Get the final ensemble of anomaly detectors E* based on the habitats H_i* with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).

Let f_ij (i = 1, 2, ..., m; j = 1, 2, ..., n) be the fitness values of the detecting performance, such as the true positive rate, false positive rate, accuracy, and so on. Obviously, the fitness value matrix F can be defined as (4) based on the results on the testing data:

F = [ f_11  f_12  ⋯  f_1n
      f_21  f_22  ⋯  f_2n
       ⋮     ⋮    ⋱    ⋮
      f_m1  f_m2  ⋯  f_mn ].    (4)

The final fitness function can be defined as

Maximize  Σ_{AD_ij ∈ E'} f_ij
s.t.  N' ≤ m · n,    (5)

where N' = |E'| is the number of individual detectors retained in the pruned ensemble.

Here, the problem of ensemble pruning is to find the subset E', which is composed of part of the single detectors. Finding the optimal subset requires much heavier and more delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed to find an acceptable ensemble subset. We only briefly present some key information about BBO; the interested reader is referred to the detailed description in [28].

BBO is a population-based global optimization method which shares some common characteristics with existing evolutionary algorithms (EAs) such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain and obtain an optimal/suboptimal solution, some operators are employed to share information among solutions, which makes BBO applicable to many problems to which GA and PSO are applied. The more distinctive differences between BBO and other EAs can be seen in [27, 28].

The pseudo-code of ensemble pruning based on BBO is described in Algorithm 1 [7]. Here, H indicates a habitat, HSI is its fitness, and an SIV (suitability index variable) is a solution feature.
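A minimal Python sketch of the pruning loop in Algorithm 1, under the following assumptions: each habitat is a 0/1 mask over the n·m trained detectors, a simple linear migration model is used, and hsi is a user-supplied fitness function (e.g., the F-measure of the masked ensemble on held-out data):

import random

def bbo_prune(num_detectors, hsi, pop_size=30, iters=100, mutation_rate=0.01):
    habitats = [[random.randint(0, 1) for _ in range(num_detectors)]
                for _ in range(pop_size)]
    for _ in range(iters):
        fitness = [hsi(h) for h in habitats]
        order = sorted(range(pop_size), key=lambda k: fitness[k], reverse=True)
        rank = {idx: r for r, idx in enumerate(order)}           # 0 = best habitat
        # Linear migration model: good habitats emigrate more and immigrate less.
        mu = [(pop_size - rank[j]) / pop_size for j in range(pop_size)]
        for i in range(pop_size):
            lam = (rank[i] + 1) / pop_size                       # immigration rate
            for s in range(num_detectors):
                if random.random() < lam:
                    j = random.choices(range(pop_size), weights=mu, k=1)[0]
                    habitats[i][s] = habitats[j][s]              # migrate one SIV
                if random.random() < mutation_rate:
                    habitats[i][s] = random.randint(0, 1)        # mutate one SIV
    return max(habitats, key=hsi)  # mask of detectors kept in the pruned ensemble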

3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main reason for quick energy depletion is radio communication among the sensor nodes. It is known that the cost of communicating one bit equals the cost of processing thousands of bits in a sensor [35]. This means that most of the energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity will decrease the power requirement and eventually lengthen the lifetime of the whole WSN.

It is obvious that the aforementioned method has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each of its member sensor nodes.


Input: E': current pruned ensemble anomaly detector; p: sampling probability
Output: E*: updated pruned ensemble anomaly detector

For each sensor node
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E', T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online_Updating(E', p).

In order to relieve the communication burden, some techniques are used to reduce the communication overhead.

In fact, the distributed training/learning method only transmits the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared to centralized anomaly detection manners that send all training data to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster needs to obtain the pruned ensemble detector from the cluster head node. A straightforward method is to broadcast this pruned ensemble to the member sensor nodes. This is a commonly used strategy, but it does not make full use of the local ensemble detector information and costs more communication resources. Here, a state matrix P is designed in the cluster head; its element p_ij is defined by formula (6) to represent each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit for each single detector. A detector is included in or excluded from the ensemble detector depending on the value of the corresponding bit; that is, 1 denotes that the single detector is included in the final ensemble, and 0 means it is not included:

p_ij = 1 if AD_ij ∈ E' (i = 1, ..., m; j = 1, ..., n), and p_ij = 0 otherwise;

             1   2   ⋯  i−1   i  i+1  ⋯   n
      S_1  [ 0   1   ⋯   0    1   1   ⋯   1 ]
P =   S_2  [ 1   0   ⋯   1    1   0   ⋯   0 ]     (6)
       ⋮   [ ⋮   ⋮       ⋮    ⋮   ⋮       ⋮ ]
      S_m  [ 0   0   ⋯   1    1   1   ⋯   1 ]

After the pruning procedure is finished, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, after the ensemble pruning is finished, N' (N' ≤ n · m) individual detectors are to be broadcast in the cluster. If matrix P is not used, this needs 4 · N' · d bytes of communication (supposing that an individual detector can be represented by d parameters and each parameter needs at least 4 bytes). If matrix P is introduced, each item of matrix P only needs 1 bit to represent an individual detector; consequently, only m · n / 8 bytes are required for the broadcast. Suppose that one-third of the individual detectors are pruned (i.e., N' = 2 · n · m / 3); then (4 · n · m · d · 2/3) / (m · n / 8) ≈ 21.33 d. By introducing ensemble pruning and the state matrix, the amount of energy saved in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
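A minimal sketch of the bit-coded state matrix, assuming one bit per trained detector and one row per sensor node (the helper names are illustrative, not from the original implementation):

def pack_state_matrix(mask, n):
    # mask[i][j] = 1 if detector j of node i is kept; returns one packed payload per node.
    payloads = []
    for row in mask:
        value = 0
        for j, bit in enumerate(row):
            value |= (bit & 1) << j
        payloads.append(value.to_bytes((n + 7) // 8, "little"))
    return payloads

def unpack_row(payload, n):
    value = int.from_bytes(payload, "little")
    return [(value >> j) & 1 for j in range(n)]

# Broadcasting m*n/8 bytes of mask instead of 4*N'*d bytes of detector parameters.
mask = [[0, 1, 1, 0, 1], [1, 1, 0, 0, 1]]
assert unpack_row(pack_state_matrix(mask, 5)[0], 5) == mask[0]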

3.3.4. Online Update and Relearning. The distribution of the sensed dataset may change, so detector updating is necessary. An online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) can cater to this situation and save computation, communication, and memory resources to some extent. Simply put, for a new coming observation, whether to save it and use it to update the current detector is decided by a sampling probability p. Some heuristic rules can be employed to guide its value; for example, if the dynamics are relatively stationary, a small p should be used; otherwise, a big p should be chosen. When the buffer of a sensor node is completely replaced by new data, an online update is triggered and a new detector is trained. The pseudo-code of the algorithm is described in Algorithm 2.
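A minimal sketch of this delayed-update policy, assuming a fixed-size buffer and a retrain callback that stands in for the retraining and re-pruning steps of Algorithm 2:

import random

class DelayedUpdater:
    def __init__(self, buffer_size, p, retrain):
        self.buffer, self.size, self.p, self.retrain = [], buffer_size, p, retrain
        self.replaced = 0

    def observe(self, x):
        # Keep each new observation with sampling probability p.
        if random.random() < self.p:
            self.buffer.append(x)
            self.replaced += 1
            if len(self.buffer) > self.size:
                self.buffer.pop(0)
            # Trigger retraining once the buffer has been completely replaced.
            if self.replaced >= self.size:
                self.retrain(self.buffer)
                self.replaced = 0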

4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described, respectively. Experiments were conducted on a personal PC with an Intel Core 2 Duo CPU P7450 at 2.13 GHz and 4 GB of memory. The operating system is Windows 7 Professional. The data processing was partly done in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.

4.1. Dataset and Data Preprocessing. The IBRL dataset [37] was used in this paper to validate the proposed method. It was collected from a WSN deployed in the Intel Berkeley Research Laboratory and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38-41]. This network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the deployment (node locations are shown as black hexagons with their corresponding node IDs) [35].


[Figure 3: Sensor node locations in the IBRL deployment. The 54 node IDs are overlaid on the laboratory floor plan (lab, server, kitchen, copy, storage, conference, and office areas).]

The whole dataset was collected from 29.02.2004 to 05.04.2004. Four types of measurement data, that is, light, temperature, humidity, and voltage, were collected, and the measurements were recorded at a 31 s interval. Because these sensors were deployed inside a lab and the measured variables changed little over time (except light, which shows sudden changes due to the irregular nature of this variable and frequent on/off operation), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, some artificial anomalies are created by randomly modifying some observations, which is a widely used practice in the literature [41].

Since our proposed method adopts the cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and a dataset (collected on 29.02.2004) are chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 0:00:00 am-7:59:59 am) from each sensor node are employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions in a cluster are almost the same, which well proves that spatial correlation exists. Though there are some trivial differences, careful analysis of the dataset shows that the main reason is that the dataset has some missing data points, largely due to packet loss, which can also be seen from Figure 4. In our experiment, these missing observations can be interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an interesting event may have occurred.

Suppose that D = {(x_i, y_i), i = 1, 2, ..., n} is a dataset used to train an anomaly detector. Here, x_i is a vector of feature values, and y_i is the label that indicates whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points are generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomalous data points per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution very different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a nonfault sensor node. The anomalies were generated using a normal randomizer with statistical characteristics deviating slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
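A minimal sketch of this injection step, assuming anomalies are drawn from a normal distribution whose mean and variance deviate slightly from those of the clean readings (the concrete offsets below are illustrative, not the ones used in the experiments):

import numpy as np

def inject_anomalies(clean, n_anomalies=30, mean_shift=0.4, var_scale=1.1, seed=0):
    # clean: 1-D array of normal readings from one sensor node
    rng = np.random.default_rng(seed)
    mu, sigma = clean.mean(), clean.std()
    anomalies = rng.normal(mu + mean_shift, sigma * var_scale, size=n_anomalies)
    data = np.concatenate([clean, anomalies])
    labels = np.concatenate([np.zeros(len(clean)), np.ones(n_anomalies)])  # 1 = anomaly
    return data, labels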

4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted in this paper, such as detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR). They are described as follows:

ACC = (TP + TN) / (TP + TN + FP + FN),


[Figure 4: The data trends of (a) temperature and (b) humidity for nodes N7, N8, N9, and N10 during 0:00:00 am-7:59:59 am on February 29, 2004. Temperature readings lie roughly between 17 and 20, and humidity readings between 38 and 46.]

Table 1: Detailed dataset information of the selected sensor nodes on 29.02.2004 (T: temperature; H: humidity).

Node | Initial samples | Mean (T / H)      | Variance (T / H) | Injected anomalies | Anomaly mean (T / H) | Anomaly variance (T / H)
N7   | 823             | 18.4154 / 40.9176 | 0.5238 / 1.4494  | 30                 | 18.21 / 41.10        | 0.54 / 1.46
N8   | 548             | 17.9844 / 41.7123 | 0.5315 / 1.4612  | 30                 | 17.75 / 41.95        | 0.55 / 1.48
N9   | 652             | 18.1140 / 42.6295 | 0.5288 / 1.4827  | 30                 | 18.35 / 42.45        | 0.55 / 1.50
N10  | 620             | 18.1144 / 42.6215 | 0.5244 / 1.4191  | 30                 | 18.33 / 42.47        | 0.54 / 1.43

TPR = TP / (TP + FN),

FPR = FP / (FP + TN),    (7)

where TP denotes the number of samples correctly predicted as the anomaly class, FP the number of samples incorrectly predicted as the anomaly class, TN the number of samples correctly predicted as the normal class, and FN the number of samples incorrectly predicted as the normal class.
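A minimal sketch of the metrics in (7), taking the four counts as inputs:

def acc(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def tpr(tp, fn):
    return tp / (tp + fn)

def fpr(fp, tn):
    return fp / (fp + tn)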

BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; the number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; the maximum migration rates E = 1 and I = 1; the mutation rate η = 0.01, where λ and μ are the immigration rate and the emigration rate, respectively; and the elitism parameter ρ = 2.

HSI (habitat suitability index) is the fitness function, similar to those of other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision probability and the recall probability of the binary classification problem:

F-measure = (1 + β²) · precision · recall / (β² · precision + recall)
          = (1 + β²) · TP / ((1 + β²) · TP + β² · FN + FP).    (8)

The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance of precision versus recall, β = 0.5, 1, 2. Usually, the value of the F-measure is close to the smaller of precision and recall; that is, a big F-measure means that both precision and recall are big. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, whereas a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, β = 1 is specified.
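A minimal sketch of the F-measure fitness in (8) (β = 1 as specified here), which can serve as the hsi function of the BBO pruning sketch above:

def f_measure(tp, fp, fn, beta=1.0):
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)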

4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have been widely used for classification problems, separating the data belonging to different classes by fitting a hyperplane.


Table 2: Detection performance of the local ensemble detector (ACC / TPR / FPR per node).

Ensemble size | N7                       | N8                       | N9                       | N10
5             | 0.8700 / 0.5833 / 0.1181 | 0.7900 / 0.3333 / 0.1809 | 0.8267 / 0.5000 / 0.1549 | 0.8267 / 0.5714 / 0.1608
10            | 0.8800 / 0.6667 / 0.1111 | 0.8033 / 0.3889 / 0.1702 | 0.8267 / 0.4375 / 0.1514 | 0.8333 / 0.6429 / 0.1573
15            | 0.8900 / 0.7500 / 0.1042 | 0.8167 / 0.5000 / 0.1631 | 0.8433 / 0.5000 / 0.1373 | 0.8600 / 0.7143 / 0.1329
20            | 0.8933 / 0.8333 / 0.1042 | 0.8200 / 0.5000 / 0.1596 | 0.8367 / 0.5000 / 0.1444 | 0.8567 / 0.7143 / 0.1364

Table 3: Detection performance of the global ensemble detector [7] (ACC / TPR / FPR per node).

Combined ensemble size | N7                       | N8                       | N9                       | N10
20                     | 0.9467 / 0.8333 / 0.0486 | 0.9300 / 0.7778 / 0.0603 | 0.9467 / 0.7500 / 0.0423 | 0.9500 / 0.7857 / 0.0420
40                     | 0.9700 / 0.7500 / 0.0208 | 0.9433 / 0.8333 / 0.0496 | 0.9710 / 0.8938 / 0.0246 | 0.9650 / 0.8929 / 0.0315
60                     | 0.9700 / 0.8333 / 0.0243 | 0.9733 / 0.8889 / 0.0213 | 0.9800 / 0.9375 / 0.0176 | 0.9783 / 0.9357 / 0.0196
80                     | 0.9817 / 0.9583 / 0.0174 | 0.9800 / 0.9444 / 0.0177 | 0.9767 / 0.9375 / 0.0211 | 0.9780 / 0.9714 / 0.0217

The one-class SVM, a variation of this method, is especially favored for anomaly detection [42-44]. In this paper, it was used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detectors, and the remainder, as the test set, was used to evaluate the proposed method.
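A minimal sketch of training one base detector on a data chunk, assuming a scikit-learn one-class SVM (the original detectors were implemented in C++; the hyperparameters below are illustrative):

import numpy as np
from sklearn.svm import OneClassSVM

def train_base_detector(chunk, nu=0.05):
    # chunk: array of shape (samples, features), e.g. temperature and humidity readings.
    detector = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
    detector.fit(np.asarray(chunk))
    return detector

# predict() returns +1 for normal observations and -1 for anomalies:
# is_anomaly = train_base_detector(train_chunk).predict(x.reshape(1, -1)) == -1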

Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: firstly, to prove the effectiveness of the proposed method based on ensemble learning theory; secondly, to prove that the pruned ensemble detector can obtain better (or at least equal) performance compared to the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were performed: a local ensemble anomaly detector considering only the temporal correlation of each sensor node, a global ensemble anomaly detector considering the spatiotemporal correlation, and a global pruned ensemble anomaly detector based on BBO. The experimental results can be seen in Tables 2, 3, and 4, respectively.
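The paper does not spell out the Online Bagging variant; a common formulation (Oza and Russell) updates each base detector with each new sample a Poisson(1)-distributed number of times. A minimal sketch under that assumption, where update is an assumed incremental training hook on each base detector:

import numpy as np

def online_bagging_update(detectors, x, rng=None):
    rng = rng or np.random.default_rng()
    for det in detectors:
        for _ in range(rng.poisson(1.0)):
            det.update(x)   # assumed streaming update of one base detector
    return detectors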

Table 2 shows the performance of each sensor node under different ensemble sizes, without taking into account the spatial correlation of the sensed data in a cluster. Though the ensemble detection performance gradually becomes better as the ensemble size increases (the higher the ACC and TPR values, the better the performance; the lower the FPR value, the better the performance), the overall performance is relatively low. The maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detectors were trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], an obvious fact is that the detection performances are higher than those presented in Table 2. With the help of the neighbor detectors, the detection results improve steadily as the ensemble size increases.

In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.

Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only for validating the method, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to choose is an open topic and is decided by many factors, such as the computation capability, communication cost, and memory usage of the sensor node, the expected detection accuracy requirement, and so on. In practical applications, a trade-off is commonly considered.

5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints in WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that the proposed method is effective.

Because the diversity of the base learners is a key factor related to the performance of ensemble learning, as a possible extension of this work we plan to include some diversity


Table 4: Detection performance of the global ensemble detector based on BBO pruning [7] (ACC / TPR / FPR per node).

Ensemble size (BBO pruned) | N7                       | N8                       | N9                       | N10
14                         | 0.9480 / 0.8000 / 0.0458 | 0.9327 / 0.7667 / 0.0567 | 0.9500 / 0.8125 / 0.0423 | 0.9533 / 0.8571 / 0.0420
23                         | 0.9710 / 0.7750 / 0.0208 | 0.9447 / 0.8000 / 0.0461 | 0.9733 / 0.9250 / 0.0239 | 0.9697 / 0.9143 / 0.0276
27                         | 0.9713 / 0.8500 / 0.0236 | 0.9683 / 0.8333 / 0.0230 | 0.9810 / 0.9563 / 0.0176 | 0.9797 / 0.9357 / 0.0182
32                         | 0.9820 / 0.9750 / 0.0177 | 0.9750 / 0.8333 / 0.0160 | 0.9820 / 0.9500 / 0.0162 | 0.9830 / 0.9786 / 0.0168

Table 5: Rate of resource cost saving based on the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Resource cost saving
1      | 20                    | 14                   | 30%
2      | 40                    | 23                   | 42.5%
3      | 60                    | 27                   | 55%
4      | 80                    | 32                   | 60%

measures in the fitness function to improve the detection performance in the future. Besides, since the cost of communication is the main reason for the quick energy depletion of sensor nodes, especially the cluster head, adaptive selection of the cluster head based on its energy state will be taken into account to lengthen the lifetime of WSNs in future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).

References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159-170, 2010.

[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373-1392, 2012.

[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620-2626, 2013.

[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1-5, IEEE, Singapore, October 2006.

[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34-40, 2008.

[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302-1325, 2011.

[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160-169, Springer, Berlin, Germany, 2014.

[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97-136, 1997.

[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239-263, 2002.

[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1-36, 2012.

[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10-12, pp. 2196-2202, 2010.

[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589-1596, 2014.

[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44-57, 2014.

[14] S. Rajasegarar, A. Gluhak, M. Ali Imran, et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867-2879, 2014.

[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11-28, 2014.

[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123-140, 1996.

[17] S. Seguí, L. Igual, and J. Vitrià, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.

[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153-200, 2002.

[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892-902, 2010.

[20] C. Désir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490-3506, 2013.

[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71-109, 2003.

[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavaldà, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23-37, Springer, Berlin, Germany, 2009.

[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087-9096, 2012.

[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.

[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901-1914, 2011.

[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446-458, 2014.

[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444-3464, 2010.

[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702-713, 2008.

[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679-1686, 2013.

[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361-373, 2010.

[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97-109, 2013.

[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.

[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101-1112, 2014.

[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436-446, 2014.

[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978-1996, 2013.

[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661-1670, 2013.

[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.

[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23-54, 2013.

[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek, et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55-69, 2011.

[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1-28, 2009.

[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271-276, IEEE, São Carlos, Brazil, November 2012.

[42] H. Sagha, H. Bayati, J. D. R. Millán, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916-1927, 2013.

[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351-366, 2013.

[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062-1074, 2013.

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 5: Research Article A Novel Distributed Online Anomaly ...downloads.hindawi.com/journals/ijdsn/2015/146189.pdfin the cluster head node to obtain an optimized subset of ensemble members

International Journal of Distributed Sensor Networks 5

Learning(i) Input training data

(ii) Preprocess training data(iii) Learn from training data(iv) Output the local ensemble detector

Member nodes (MN1)

Learning(i) Input training data

(ii) Preprocess training data(iii) Learn from training data(iv) Output the local ensemble detector

Member nodes (MN2)

Learning(i) Input training data

(ii) Preprocess training data(iii) Learn from training data(iv) Output the local ensemble detector

Member nodes (MNm)

Cluster head node (CH)

Ensemble aggregating

(i) Receive the local ensemble detector(ii) Aggregating

(iii) Output the global ensemble detector

Bit coding mechanism

BBO (ensemble pruning)

Local ensembledetector

Local ensembledetector

Local ensembledetector

Broadcastcommunication

Broadcastcommunication

Broadcastcommunication

Broadcastcommunication

Local ensembledetector

Global initialensemble detector

State matrix Global initial ensemble detector

Figure 2 Distributed ensemble anomaly detection method based on BBO pruning in WSNs

Step 7 Once the updating condition was activated theprocedure of retraining and detector updating was triggered

This method can scale well with increase of number ofnodes in WSNs due to its distributed processing nature Ithas low communication requirements and does not need totransmit any actual observations between cluster head nodeand itsmember sensor node which saves the communicationresource significantly

Next we described some important procedures men-tioned above in detail Further considering the context ofresource constraint of each sensor node in WSNs sometricks are designed to save the communication and memoryrequirements

331 Building the Initial EnsembleDetector An initial ensem-ble detector is constructed by two steps Firstly a numberof base detectors are trained sequentially for each sensornodes in a cluster (including the cluster head node itself)based on the history dataset Because the data distributionmay be changed over time the previous trained detector maybe useless for the future detection Moreover the limitedmemory resource in the sensor node is another constraintto store too many previous detectors In practice accordingto the space of memory resource only the latest multipledetectors are kept to build the initial local ensemble for onesensor node For example to sensor node 119894 the sensed data iscollected anddivided into data chunk based on a time intervalΔ119905 which is determined by the actual monitoring processConsequently each node trains multiple individual detectorsover time In our paper supposing 119899 latest detector is kept fora sensor node if there are119898 nodes in one cluster then totally119899lowast119898 detectors are obtained for the initial ensemble Secondly

each sensor node (including cluster head node) broadcasts its119899 trained detector in the cluster Taking the cluster head as anexample after all (119899lowast(119898minus1)) individual detectors are receivedfrom its member nodes the cluster head combines with its 119899trained detector and the initial ensemble (including 119899 lowast 119898

individual detectors) is built in cluster head nodeMany techniques can be employed for combining the

results of each detector to obtain the final detection resultThe common used method in the literature is the major-ity vote (for classification problem) and weighted average(for regression problem) In our paper the final ensembledetection result can be calculated by (3) where 119908

119894denotes

weight coefficient that is 119908119894= 1 means the simple average

otherwise weighted average In our paper for simplicity thesimple average strategy is employed to combine the finallyresult

119910fin (119909) =1

119899 lowast 119898

119899lowast119898

sum

119894=1119910119894 (119909) lowast 119908119894

(3)

332 Ensemble Pruning Based on BBO Search To miti-gate the expensive communication cost and high memoryrequirement induced by ensemble learning inspired by theprinciple of ldquomany could be better than allrdquo in the ensemblelearning community the ensemble pruning is necessary

Given an initial ensemble anomaly detector 119864 =

AD1AD2 AD

119899lowast119898 AD

119894is a trained anomaly detector

which can test an observation anomalous or not a combi-nation method 119862 and a test dataset 119879 The goal of ensemblepruning is to find an optimalsuboptimal subset 1198641015840 sube 119864which can minimize the generalization error and obtainbetter or at least same detection performance compared to119864 Let 119891

119894119895(119894 = 1 2 119898 119895 = 1 2 119899) be the fitness values

6 International Journal of Distributed Sensor Networks

Input 119864mdashinitial ensemble anomaly detector 119879mdashThe number of maximization iterationOutput 1198641015840mdashfinal ensemble anomaly detectorlowast BBO parameter initialization lowast

Create a random set of habitats (populations) 1198671 1198672 119867

119873

Compute corresponding fitness that is HSI valueslowastOptimization search process lowast

While (119879)Compute immigration rate 120582 and emigration rate 119906 for each habitat based on HSIlowast Migration lowastSelect119867

119894with probability based on 120582

119894

If119867119894is selected

Select119867119895with probability based on 119906

119895

If119867119895is selected

Randomly select a SIV form119867119895

Replace a random SIV in119867119894with one from119867

119895

End ifEnd iflowast Mutation lowastSelect an SIV in119867

119894with probability based on the mutation rate 120578

If119867119894(SIV) is selected

Replace119867119894(SIV) with a randomly generated SIV

End ifRe-compute HSI values119879 = T minus 1

End whilelowast Ensemble pruning lowast

Get the final ensemble of anomaly detector 119864lowast based on the habitats119867119894

lowast with acceptable HSI

Algorithm 1 Ensemble pruning BBO (E T)

of the detecting performance such as true positive rate falsepositive rate accuracy and so on Obviously the fitness value119865 can be defined as (4) based on the results of testing data

119865 =

[

[

[

[

[

[

11989111 11989112 sdot sdot sdot 1198911119899

11989121 11989122 sdot sdot sdot 1198912119899

sdot sdot sdot sdot sdot sdot sdot sdot sdot sdot sdot sdot

1198911198981 1198911198982 sdot sdot sdot 119891

119898119899

]

]

]

]

]

]

(4)

The final fitness function can be defined as

Maximize (

1198731015840

sum

119894=1119898119895=1119899

119891119894119895)

st 1198731015840le 119898 lowast 119899

(5)

Here the problem of ensemble pruning is to find thesubset of 1198641015840 which was composed of part single detectorsFinding the optimized subset requires much heavier andmore delicate computation resources Biogeography-basedoptimization (BBO) is a novel optimization method and isemployed to find out the acceptable set of ensemble Weonly simply present some key information about BBO theinterested reader can be referred to the detailed descriptionin [28]

BBO is a population-based global optimization methodwhich has some common characteristics similar to the

existing evolutionary algorithms (EAs) such as genetic algo-rithm (GA) particle swarm optimization (PSO) and antcolony optimization (ACO) When it was used to search thesolution domain and obtain an optimalsuboptimal solutionsome operators were employed to share information amongsolutions which makes BBO applicable to many problemsthat GA and PSO are used The more distinctive differencebetween BBO and other EAs can be seen in [27 28]

The pseudo-code of ensemble pruning based on BBO canbe described as shown in Algorithm 1 [7] Here 119867 indicateshabit HIS is fitness and SIV (suitability index variable) is asolution feature

333 Some Tricks Designed to Mitigate the CommunicationRequirement In the WSNs the main reason of quick energydepletion is the radio communication among the sensornodes It has been known that the cost of communicationof one bit equals the cost of processing thousands of bitsin sensors [35] This means that the most energy in sensornode is consumed by radio communication rather thancollecting or processing data Consequently reducing thecommunication quantity will decrease the power resourcerequirement and eventually lengthen the lifetime of thewholeWSNs

It is obvious that the aforementioned method has relativehigh communication overhead Each sensor node transmitsits local ensemble detector to the cluster head and the finalpruned global ensemble detector broadcasts back to its each

International Journal of Distributed Sensor Networks 7

Input 1198641015840mdashCurrent pruned ensemble anomaly detector 119901mdashSampling probabilityOutput 119864lowastmdashUpdated pruned ensemble anomaly detector

For each sensor nodeRetain the new observation with probability 119901If buffer is replaced completely by new observations

Train new detector and transmit its summary to cluster head119864lowast = Ensemble Pruning BBO(1198641015840 119879)

Broadcast 119864lowast to its member sensor node for subsequent anomaly detection

Algorithm 2 Online updating (1198641015840 119901)

member sensor nodes In order to relieve communicationburden some skills are used to descend the communicationoverhead

In fact the distributed traininglearning method onlytransmits the summary information of trained local ensembledetector to the cluster head which has significantly decreasedthe communication cost compared to centralized anomalydetectionmanners that sent all trained data to cluster head tobuild detector Besides after the pruned ensemble is obtainedin cluster head node each member sensor node in thiscluster can obtain the pruned ensemble detector from thecluster head node A straightforward method is broadcastingthis pruned ensemble to its member sensor nodes Thisis a common used strategy but it does not make full useof local ensemble detector information and will cost morecommunication resources Here a state matrix 119875 is designedin the cluster head its element 119901

119894119895is defined by formula (6)

to represent each single detector in initial ensemble Theneach local ensemble detector is represented as a bit stringusing one bit for each single detector Detector is included orexcluded from the ensemble detector depending on the valueof the corresponding bit that is 1 denotes this single detectorthat is included in the final ensemble and 0 means it was notincluded

119901119894119895=

1 AD119894119895isin 1198641015840 119894 = 1 119898 119895 = 1 119899

0 otherwise

119875

=

1 2 119894 minus 1 119894 119894 + 1 sdot sdot sdot 119899

1198781

1198782

119878119898

[

[

[

[

[

[

0

1

sdot

0

1

0

sdot

0

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

0

1

sdot

1

1

1

sdot

1

1

0

sdot

1

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

1

0

sdot

1

]

]

]

]

]

]

(6)

After the pruned procedure is finished the cluster headbroadcasts the statematrix119875 to its member sensor node eachsensor node keeps the single detector whose correspondingvalue of state element equals 1 and it deletes the rest to buildthe pruned ensemble global detector Employing the statematrix can save the energy greatly For example after theensemble pruning is finished 1198731015840 (1198731015840 le 119899 lowast 119898) individualdetectors are broadcast in cluster If matrix 119875 is not used it

will need 4 lowast1198731015840 lowast 119889 bytes communication cost (suppose thatthe individual detector can be represented by 119889 parametersand each parameter needs at least 4 bytes) If matrix 119875 isintroduced each itemofmatrix119875 only needs 1 bit to representan individual detector Consequently only 119898 lowast 1198998 bytes arerequired to broadcast Suppose that one-third of individualdetectors are pruned (ie1198731015840 = 2lowast119899lowast1198983) then (4lowast119899lowast119898lowast

119889 lowast 23)(119898 lowast 1198998) asymp 2133119889 By introducing the ensemblepruning and state matrix the quantity of energy saving incluster head sensor is significant and the lifetime ofWSNs canbe lengthened

334 Online Update and Relearning Distribution changeof sensed dataset occurred possibly and detector updat-ing is necessary Online detector update will accompanya relearning procedure A comprised strategy (ie delayupdating strategy [36]) can cater this situation and savethe computation communication and memory resources tosome extent Simple to say for the new coming observationwhether saving and using it to update the current detectoror not are decided by a sample probability 119901 Some heuristicrules can be employed to guide its value for example if thedynamics is relatively stationary the small 119901 should be usedotherwise the big 119901 should be chosen When the buffer of asensor node is replaced by the new data completely onlineupdate is triggered and new detector is trained The pseudo-codes of algorithm can be described as shown inAlgorithm 2

4 Experimental and Analysis

In this section the dataset data preprocessing methodexperiment results and analysis are described respectivelyExperiments were conducted on a personal PC with IntelCore 2 Duo CPU P7450213GHZ and 4GB memoryThe operating system is Windows 7 professional The dataprocessing is partly on the MATLAB 2010 and the algorithmmentioned in Section 3 was implemented with MicrosoftVisual C++ platform

41 Dataset and Data Preprocessing IBRL datasets [37] wereused in our paper to validate proposed method which wascollected from aWSN deployed in Intel Research Laboratoryat University of Berkeley and commonly used to evaluatethe performance of some existing models for WSNs [3536 38ndash41] This network consists of 54 Mica2Dot sensornodes Figure 3 shows the location of each node of the

8 International Journal of Distributed Sensor Networks

1

2

3

4

5 6

7

8

9

10

11

12

1314

15

16

17

18

19

20

21

22

23

2425

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

4142

43

44

45

46

47

48

49

50

51

52

54

Lab

Server

Quiet

Phon

e

Kitchen

Elec

Copy

Storage

Conference

Office Office

53

Figure 3 Sensor nodes location in the IBRL deployment

deployment (node locations are shown in black hexagonwith their corresponding node IDs) [35] The whole datasetwas collected from 29022004 to 05042004 Four typesof measures data that is light temperature and humidityas well as voltage were collected and those measurementswere recorded in 31 s interval Because these sensors weredeployed inside a lab and the measurement variables hadlittle changes over time (except the light having the suddenchanges due to the irregular nature of this variable andfrequent onoff operation) this dataset was considered a typeof static datasets for many researchers In our experimentsto evaluate our proposed anomaly detection algorithm someartificial anomalies are created by randomly modifying someobservations which is widely used bymany researchers in theliterature [41]

Since our proposed method adopts the cluster structurea cluster (consisting of 4 sensor nodes ie N7 N8 N9and N10) and dataset (collected on 29022004) are chosenThe data distribution can be seen in [7] Here only partobservations (during 000000 amndash075959 am) from eachsensor node are employed to evaluate proposed methodThedata trend is depicted in Figure 4

From Figure 4 an obvious fact is that data distributionin a cluster is almost same which well proved that spatialcorrelation exists Though there are some trivial differencesafter analyzing the dataset carefully the main reason is thatdataset has some missing data points largely due to packetloss which can be further proved from Figure 4 In ourexperiment these missing observations can be interpolatedusing the method described in Section 33 The obvious factis that sudden peakvalley appeared in Figure 4 for eachsensor observation which implies that an interested eventmay occurred

Suppose that 119863 = 119909119894 119910119894 119894 = 1 2 119899 is a dataset

used to train an anomaly detector Here the 119909119894is a vector

with feature values and 119910119894is the label which indicates whether

the given observation is normal or anomalous Because theIBRL dataset regards all its observations as normal someanomaly data points are generated and inserted to evaluatethe performance of our proposed method In the paper anumber of 30 data points of artificial anomalies for eachsensor were injected consecutively in each dataset to calculatethe true positive rate (TPR) false negative rate (FPR) anddetection accuracy (ACC) Without loss of generality theanomalous dataset should follow a distribution very muchdifferent from that of the training dataset but their rangesshould be overlapped as much as possible Besides ananomalous event should be a small probability event fora normal dataset collected by a nonfault sensor node Theanomalies were generated using a normal randomizer withslightly deviate statistical characteristics from the normaldata characteristics [41] The detailed dataset information(including statistical parameters) of selected sensor node ispresented in Table 1

4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are used in this paper, namely, detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR). They are defined as follows:

ACC = (TP + TN) / (TP + TN + FP + FN),
TPR = TP / (TP + FN),
FPR = FP / (FP + TN). (7)


Figure 4: The data (temperature, humidity) trend during 0:00:00 am–7:59:59 am on February 29, 2004: (a) temperature (roughly 17–20) and (b) humidity (roughly 38–46) of nodes N7, N8, N9, and N10 over about 1000 observations.

Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature, H: humidity).

Node   Initial samples   Mean (T / H)          Variance (T / H)   Injected anomalies   Anomaly mean (T / H)   Anomaly variance (T / H)
N7     823               18.4154 / 40.9176     0.5238 / 1.4494    30                   18.21 / 41.10          0.54 / 1.46
N8     548               17.9844 / 41.7123     0.5315 / 1.4612    30                   17.75 / 41.95          0.55 / 1.48
N9     652               18.1140 / 42.6295     0.5288 / 1.4827    30                   18.35 / 42.45          0.55 / 1.50
N10    620               18.1144 / 42.6215     0.5244 / 1.4191    30                   18.33 / 42.47          0.54 / 1.43


In (7), TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.
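As a small worked illustration of (7), the snippet below computes ACC, TPR, and FPR from arrays of true and predicted labels (1 = anomaly, 0 = normal); the function name and label convention are ours, not from the original implementation.

import numpy as np

def detection_metrics(y_true, y_pred):
    # ACC, TPR, and FPR exactly as defined in (7).
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return acc, tpr, fpr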

BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01, where λ and μ denote the immigration rate and the emigration rate, respectively; and elitism parameter ρ = 2.
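For readability, these settings can be gathered into one configuration block; the sketch below is only a convenient summary of the values listed above, and the field names are our own shorthand.

# BBO settings used in Section 4.2 (field names are illustrative)
BBO_PARAMS = {
    "habitat_size": 30,            # S, number of candidate habitats
    "num_sivs": (20, 40, 60, 80),  # n, SIVs per island (initial ensemble sizes tried)
    "max_immigration_rate": 1.0,   # I
    "max_emigration_rate": 1.0,    # E
    "mutation_rate": 0.01,         # eta
    "elitism": 2,                  # rho
}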

The HSI (habitat suitability index) is a fitness function similar to those of other population-based optimization algorithms. Here the HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of a binary classification problem:

F-measure = (1 + β²) · precision · recall / (β² · precision + recall) = (1 + β²) · TP / ((1 + β²) · TP + β² · FN + FP). (8)

The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance of precision and recall, typically β = 0.5, 1, 2. Usually the value of the F-measure is close to the smaller of precision and recall, so a large F-measure means that precision and recall are both large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, whereas a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper β = 1 is specified.
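To illustrate how the F-measure acts as the HSI, the sketch below scores one candidate habitat, encoded as a 0/1 selection vector over the initial ensemble, by majority-voting the selected detectors on validation data and returning the F-measure of (8) with β = 1. The majority-vote combination rule and the function interface are assumptions made for this example only.

import numpy as np

def hsi_fitness(selection, detector_preds, y_true, beta=1.0):
    # selection      : 0/1 vector, one entry per detector in the initial ensemble
    # detector_preds : (n_detectors, n_samples) array of 0/1 predictions
    # y_true         : ground-truth labels (1 = anomaly)
    selection = np.asarray(selection, dtype=bool)
    detector_preds = np.asarray(detector_preds)
    y_true = np.asarray(y_true)
    chosen = detector_preds[selection]
    if chosen.size == 0:
        return 0.0
    votes = chosen.mean(axis=0) >= 0.5          # simple majority vote (assumption)
    tp = np.sum(votes & (y_true == 1))
    fp = np.sum(votes & (y_true == 0))
    fn = np.sum(~votes & (y_true == 1))
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

A habitat with a higher hsi_fitness value is more likely to export its SIVs during migration and to survive into the final pruned ensemble.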

4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have been widely used for classification problems; they separate data belonging to different classes by fitting a hyperplane. The one-class SVM, a variant of this method, is especially favored for anomaly detection [42–44] and was used in this paper to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder served as the test set to evaluate the proposed method.
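As an illustration of this training setup (the authors' actual implementation was in Visual C++; the scikit-learn call below is only a sketch, and the ν and γ values are placeholder assumptions), a one-class SVM base detector for a single node could be fitted on the first 66% of its observations as follows.

import numpy as np
from sklearn.svm import OneClassSVM

def train_local_detector(X, train_frac=0.66, nu=0.05, gamma="scale"):
    # Fit a one-class SVM on ~66% of a node's observations and keep the rest
    # as the test set; predict() later returns +1 for normal, -1 for anomalous.
    split = int(train_frac * len(X))
    X_train, X_test = X[:split], X[split:]
    detector = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X_train)
    return detector, X_test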


Table 2: Detection performance of the local ensemble detector.

Ensemble size   N7 (ACC / TPR / FPR)        N8 (ACC / TPR / FPR)        N9 (ACC / TPR / FPR)        N10 (ACC / TPR / FPR)
5               0.8700 / 0.5833 / 0.1181    0.7900 / 0.3333 / 0.1809    0.8267 / 0.5000 / 0.1549    0.8267 / 0.5714 / 0.1608
10              0.8800 / 0.6667 / 0.1111    0.8033 / 0.3889 / 0.1702    0.8267 / 0.4375 / 0.1514    0.8333 / 0.6429 / 0.1573
15              0.8900 / 0.7500 / 0.1042    0.8167 / 0.5000 / 0.1631    0.8433 / 0.5000 / 0.1373    0.8600 / 0.7143 / 0.1329
20              0.8933 / 0.8333 / 0.1042    0.8200 / 0.5000 / 0.1596    0.8367 / 0.5000 / 0.1444    0.8567 / 0.7143 / 0.1364

Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size   N7 (ACC / TPR / FPR)        N8 (ACC / TPR / FPR)        N9 (ACC / TPR / FPR)        N10 (ACC / TPR / FPR)
20                       0.9467 / 0.8333 / 0.0486    0.9300 / 0.7778 / 0.0603    0.9467 / 0.7500 / 0.0423    0.9500 / 0.7857 / 0.0420
40                       0.9700 / 0.7500 / 0.0208    0.9433 / 0.8333 / 0.0496    0.9710 / 0.8938 / 0.0246    0.9650 / 0.8929 / 0.0315
60                       0.9700 / 0.8333 / 0.0243    0.9733 / 0.8889 / 0.0213    0.9800 / 0.9375 / 0.0176    0.9783 / 0.9357 / 0.0196
80                       0.9817 / 0.9583 / 0.0174    0.9800 / 0.9444 / 0.0177    0.9767 / 0.9375 / 0.0211    0.9780 / 0.9714 / 0.0217


Online bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: first, to demonstrate the effectiveness of the proposed method based on ensemble learning theory; second, to show that the pruned ensemble detector can obtain better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were carried out: a local ensemble anomaly detector considering only the temporal correlation of each sensor node, a global ensemble anomaly detector considering the spatiotemporal correlation, and the global pruned ensemble anomaly detector based on BBO. The experimental results are shown in Tables 2, 3, and 4, respectively.
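The initial ensemble can be grown with the standard Oza–Russell online bagging rule, in which each new observation is presented to every base detector k ~ Poisson(1) times; the incremental update(x, y) interface below is an assumption made for illustration, not the paper's API.

import numpy as np

def online_bagging_update(ensemble, x, y, rng=None):
    # Present one observation to each base detector k ~ Poisson(1) times,
    # which approximates bootstrap resampling in the streaming setting.
    if rng is None:
        rng = np.random.default_rng()
    for detector in ensemble:
        for _ in range(rng.poisson(1.0)):
            detector.update(x, y)
    return ensemble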

Table 2 shows the performance for each sensor node under different ensemble sizes when the spatial correlation of the sensed data within a cluster is not taken into account. Although the ensemble detection performance gradually improves as the ensemble size increases (higher ACC and TPR are better; lower FPR is better), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detectors were trained, each member node sent its local ensemble to the other members to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results in Table 3 [7], an obvious fact is that the detection performance is higher than that presented in Table 2; with the help of the neighbors' detectors, the detection results improve steadily as the ensemble size increases.

In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.

Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detection performance is as good as or better than that of the initial global ensemble detector. From the results in Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiments, only to validate the method, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors should be chosen is an open question, decided by many factors such as the computation capability, the communication cost, and the memory usage of the sensor nodes, as well as the expected detection accuracy. In practical applications a trade-off is commonly made.

5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Owing to the specific resource constraints of WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and to obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrated that the proposed method is effective.



Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned)   N7 (ACC / TPR / FPR)        N8 (ACC / TPR / FPR)        N9 (ACC / TPR / FPR)        N10 (ACC / TPR / FPR)
14                           0.9480 / 0.8000 / 0.0458    0.9327 / 0.7667 / 0.0567    0.9500 / 0.8125 / 0.0423    0.9533 / 0.8571 / 0.0420
23                           0.9710 / 0.7750 / 0.0208    0.9447 / 0.8000 / 0.0461    0.9733 / 0.9250 / 0.0239    0.9697 / 0.9143 / 0.0276
27                           0.9713 / 0.8500 / 0.0236    0.9683 / 0.8333 / 0.0230    0.9810 / 0.9563 / 0.0176    0.9797 / 0.9357 / 0.0182
32                           0.9820 / 0.9750 / 0.0177    0.9750 / 0.8333 / 0.0160    0.9820 / 0.9500 / 0.0162    0.9830 / 0.9786 / 0.0168

Table 5: Rate of resource-cost saving for the BBO-pruned global ensemble detector (saving = 1 − pruned size / initial size).

Number   Initial ensemble size   Pruned ensemble size   Saved resource cost
1        20                      14                     30%
2        40                      23                     42.5%
3        60                      27                     55%
4        80                      32                     60%

Because the diversity of the base learners is a key factor in the performance of ensemble learning, a possible extension of this work is to include diversity measures in the fitness function to further improve detection performance. Besides, since the cost of communication is the main cause of rapid energy depletion of sensor nodes, especially for the cluster head, adaptive selection of the cluster head based on its energy state will be considered in future work to lengthen the lifetime of the WSN.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).

References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research—four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Seguí, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.


[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing Journal, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.



41 Dataset and Data Preprocessing IBRL datasets [37] wereused in our paper to validate proposed method which wascollected from aWSN deployed in Intel Research Laboratoryat University of Berkeley and commonly used to evaluatethe performance of some existing models for WSNs [3536 38ndash41] This network consists of 54 Mica2Dot sensornodes Figure 3 shows the location of each node of the

8 International Journal of Distributed Sensor Networks

1

2

3

4

5 6

7

8

9

10

11

12

1314

15

16

17

18

19

20

21

22

23

2425

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

4142

43

44

45

46

47

48

49

50

51

52

54

Lab

Server

Quiet

Phon

e

Kitchen

Elec

Copy

Storage

Conference

Office Office

53

Figure 3 Sensor nodes location in the IBRL deployment

deployment (node locations are shown in black hexagonwith their corresponding node IDs) [35] The whole datasetwas collected from 29022004 to 05042004 Four typesof measures data that is light temperature and humidityas well as voltage were collected and those measurementswere recorded in 31 s interval Because these sensors weredeployed inside a lab and the measurement variables hadlittle changes over time (except the light having the suddenchanges due to the irregular nature of this variable andfrequent onoff operation) this dataset was considered a typeof static datasets for many researchers In our experimentsto evaluate our proposed anomaly detection algorithm someartificial anomalies are created by randomly modifying someobservations which is widely used bymany researchers in theliterature [41]

Since our proposed method adopts the cluster structurea cluster (consisting of 4 sensor nodes ie N7 N8 N9and N10) and dataset (collected on 29022004) are chosenThe data distribution can be seen in [7] Here only partobservations (during 000000 amndash075959 am) from eachsensor node are employed to evaluate proposed methodThedata trend is depicted in Figure 4

From Figure 4 an obvious fact is that data distributionin a cluster is almost same which well proved that spatialcorrelation exists Though there are some trivial differencesafter analyzing the dataset carefully the main reason is thatdataset has some missing data points largely due to packetloss which can be further proved from Figure 4 In ourexperiment these missing observations can be interpolatedusing the method described in Section 33 The obvious factis that sudden peakvalley appeared in Figure 4 for eachsensor observation which implies that an interested eventmay occurred

Suppose that 119863 = 119909119894 119910119894 119894 = 1 2 119899 is a dataset

used to train an anomaly detector Here the 119909119894is a vector

with feature values and 119910119894is the label which indicates whether

the given observation is normal or anomalous Because theIBRL dataset regards all its observations as normal someanomaly data points are generated and inserted to evaluatethe performance of our proposed method In the paper anumber of 30 data points of artificial anomalies for eachsensor were injected consecutively in each dataset to calculatethe true positive rate (TPR) false negative rate (FPR) anddetection accuracy (ACC) Without loss of generality theanomalous dataset should follow a distribution very muchdifferent from that of the training dataset but their rangesshould be overlapped as much as possible Besides ananomalous event should be a small probability event fora normal dataset collected by a nonfault sensor node Theanomalies were generated using a normal randomizer withslightly deviate statistical characteristics from the normaldata characteristics [41] The detailed dataset information(including statistical parameters) of selected sensor node ispresented in Table 1

42 Performance EvaluationMetrics and BBO Parameters Inorder to evaluate our proposed method some commonlyused performance evaluation metrics for anomaly detectionare used in our paper such as detection accuracy (ACC) truepositive rate (TPR) and false positivealarm rate (FPR)Theyare described as follows

ACC =

(TP + TN)(TP + TN + FP + FN)

International Journal of Distributed Sensor Networks 9

17

175

18

185

19

195

20Temperature

0 200 400 600 800 1000

N7N8

N9N10

(a)

38

39

40

41

42

43

44

45

46

0 200 400 600 800 1000

Humidity

N7N8

N9N10

(b)

Figure 4 The data (temperature humidity) trend during 00000 amndash75959 am on February 29 2004

Table 1 Detail dataset information of selected sensor node on 29022004

Node Initial sample Mean Variance Injected anomaly Mean Variance119879 119867 119879 119867 119879 119867 119879 119867

N7 823 184154 409176 05238 14494 30 1821 4110 054 146N8 548 179844 417123 05315 14612 30 1775 4195 055 148N9 652 181140 426295 05288 14827 30 1835 4245 055 150N10 620 181144 426215 05244 14191 30 1833 4247 054 143119879 temperature119867 humidity

TPR =

TP(TP + FN)

FPR =

FP(FP + TN)

(7)

where TP means number of samples correctly predicted asanomaly class FP means number of samples incorrectlypredicted as anomaly class TN means number of samplescorrectly predicted as normal class and FN means numberof samples incorrectly predicted as normal class

BBO is employed to prune the initial ensemble and themigration model is same as that present in [27 28] and therelated parameters are set as follows

Habitat (population) size 119878 = 30 the number of SIVs(suitability index variables) in each island 119899 = 20 40 60 80the maximum migration rates 119864 = 1 and 119868 = 1 and themutation rate 120578 = 001 and 120582 120583 are the immigration rateand the emigration rate respectively The elitism parameter120588 = 2

HSI (habitat suitability index) is a fitness function similarto other population-based optimization algorithms HIS isevaluated by 119865-measure (119865-score) which considers both the

precision probability and the recall probability of binaryclassification problem

119865-measure =(1 + 120573

2) precision lowast recall

1205732lowast precision + recall

=

(1 + 1205732) lowast TP

(1 + 1205732) lowast TP + 1205732 lowast FN + FP

(8)

119865-measure can be interpreted as a weighted average of theprecision and recall and its value reaches best at 1 and worstat 0 120573 is a parameter used to adjust the relative importancebetween precision and recall 120573 = 05 1 2 Usually thevalue of 119865-measure is close to the relative small value ofprecision and recall that is the big 119865-measuremeans that theprecision and recall are all big Consequently a good detectoris analogous to a habitat with a high HSI and is included inthe final ensemble detector and a poor detector is analogousto a habitat with a low HIS and is discarded from the finalensemble detector In our paper 120573 = 1 is specified

43 Results Presentation and Discussions In the data miningand machine learning communities SVM-based method has

10 International Journal of Distributed Sensor Networks

Table 2 Detection performance of local ensemble detector

Ensemble size N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

5 08700 05833 01181 07900 03333 01809 08267 05000 01549 08267 05714 0160810 08800 06667 01111 08033 03889 01702 08267 04375 01514 08333 06429 0157315 08900 07500 01042 08167 05000 01631 08433 05000 01373 08600 07143 0132920 08933 08333 01042 08200 05000 01596 08367 05000 01444 08567 07143 01364

Table 3 Detection performance of global ensemble detector [7]

Combined ensemble size N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

20 09467 08333 00486 09300 07778 00603 09467 07500 00423 09500 07857 0042040 09700 07500 00208 09433 08333 00496 09710 08938 00246 09650 08929 0031560 09700 08333 00243 09733 08889 00213 09800 09375 00176 09783 09357 0019680 09817 09583 00174 09800 09444 00177 09767 09375 00211 09780 09714 00217

been widely used in classification problem which separatesthe data belonging to the different classes by fitting a hyper-plane One class SVM based method as a variation of thismethod is especially favored for anomaly detection [42ndash44]In the paper it was used to train the base detectorThe datasetof each sensor node was divided into two parts about 66was used for training the local detector and the remainder asthe test set was to evaluate proposed method

Online Bagging the commonly used ensemble strategywas used to build initial ensemble detector Our experimentsaim to achieve two goals Firstly it is to prove the effectivenessof proposed method based on ensemble learning theorySecondly it is to prove that pruned ensemble detector canobtain better (at least equal) performance compared to initialensemble detector and mitigate the resource requirementAs a result three experiments were done that is localensemble anomaly detector only considering the temporalcorrelation of each sensor node global ensemble anomalydetector considering the spatiotemporal correlation and theglobal pruned ensemble anomaly detector based on BBOThe experimental results can be seen in Tables 2 3 and 4respectively

Table 2 shows the performance of each sensor node underthe different ensemble size which does not take into accountthe spatial correlation of sensed data in a cluster Though theensemble detection performance is becoming ldquogoodrdquo gradualwith the increasing of ensemble size (the higher value ofACCTPR the better performance and the lower value of FPR thebetter performance) the overall performance is relatively lowThemaximumvalue of detection accuracy is only 8933 andmost of true positive rates are unacceptable and most of falsepositive rates (FPR) have a relative high valueAll these resultsindicate that the performance of local ensemble detectoris poor Table 3 shows the global detection performance ofeach sensor node Here after the local ensemble detectorwas trained each member node sent its local ensembleto each other to form the global ensemble detector andeach member node used this global detector to online testthe local observation From the results of Table 3 [7] an

obvious fact is that the detection performances are higherthan presented in Table 2With the help of neighbor detectorthe detection results become better and better correspondingto the increasing of ensemble size

In order to further optimize the proposed algorithmperformance and save the resource ensemble pruning is usedfor global ensemble detector Table 4 [7] shows the result ofdetection performance of pruned global ensemble detectorbased on BBO

Table 4 shows a more practicable result and the sizeof global ensemble decreases sharply while the detectorperformance is as good as or better than the initial globalensemble detector From the results of Table 5 when thesize of initial ensemble reaches 80 the 60 resource costis saved In our experiment only for validating the methodeffectively we set the ensemble sizes 5 10 15 and 20 for eachlocal ensemble detector which may be small for the practicalapplications In fact how many local ensemble detectors arechosen is an open topic and is decided by many factors suchas the computation capability and the communication cost aswell as memory usage of sensor node the expected detectingaccuracy requirement and so on In the practical applicationa trade-off is commonly considered

5 Conclusion and Future Work

After exploiting the spatiotemporal correlation existing inthe sensed data of WSNs and motivated by the advantagesof online ensemble learning a distributed online ensembleanomaly detector method has been proposed Due to thespecific resource constrained in theWSNs ensemble pruningbased on BBO is employed to mitigate the high resourcerequirement and obtain the optimized detector that performsat least as good as the original ones The experimental resultson real dataset demonstrated that our proposed method iseffective

Because the diversity of base learners is a key factorrelated to the performance of ensemble learning as a possibleextension of our work we plan to include some diversity

International Journal of Distributed Sensor Networks 11

Table 4 Detection performance of global ensemble detector based on BBO pruning [7]

Ensemble size(BBO pruned)

N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

14 09480 08000 00458 09327 07667 00567 09500 08125 00423 09533 08571 0042023 09710 07750 00208 09447 08000 00461 09733 09250 00239 09697 09143 0027627 09713 08500 00236 09683 08333 00230 09810 09563 00176 09797 09357 0018232 09820 09750 00177 09750 08333 00160 09820 09500 00162 09830 09786 00168

Table 5 Rate of saving resource cost based on global ensembledetector of BBO pruned

Number Initial ensemblesize

Prunedensemble size

Saving resourcecost

1 20 14 302 40 23 4253 60 27 554 80 32 60

measures in fitness function to improve the detecting per-formance in future Besides the cost of communication isthe main reason of quick energy depletion of sensor nodesespecially for the cluster head the adaptive selection of clusterhead based on energy state will be taken into account tolengthen the lifetime of WSNs in next work

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by the National Key ScientificInstrument and Equipment Development Project(2012YQ15008703) the Zhejiang Provincial Natural ScienceFoundation of China (LY13F020015) the Open Project of TopKeyDiscipline of Computer Software andTheory in ZhejiangProvincial (ZC323014100) National Science Foundation ofChina (61473182) Science and Technology Commission ofShanghai Municipality (11JC1404000 14JC1402200) andShanghai Rising-Star Program (13QA1401600)

References

[1] Y Zhang N Meratnia and P Havinga ldquoOutlier detectiontechniques for wireless sensor networks a surveyrdquo IEEE Com-munications Surveys and Tutorials vol 12 no 2 pp 159ndash1702010

[2] Y Zhang N A S Hamm N Meratnia A Stein M van deVoort and P J M Havinga ldquoStatistics-based outlier detectionfor wireless sensor networksrdquo International Journal of Geo-graphical Information Science vol 26 no 8 pp 1373ndash1392 2012

[3] C Peng and Q-L Han ldquoA novel event-triggered transmissionscheme and L

2control co-design for sampled-data control

systemsrdquo IEEE Transactions on Automatic Control vol 58 no10 pp 2620ndash2626 2013

[4] S Rajasegarar C Leckie M Palaniswami and J C BezdekldquoDistributed anomaly detection in wireless sensor networksrdquo inProceedings of the 10th IEEE Singapore International Conferenceon Communication systems (ICCS rsquo06) pp 1ndash5 IEEE SingaporeOctober 2006

[5] S Rajasegarar C Leckie and M Palaniswami ldquoAnomalydetection in wireless sensor networksrdquo IEEE Wireless Commu-nications vol 15 no 4 pp 34ndash40 2008

[6] M Xie S Han B Tian and S Parvin ldquoAnomaly detectionin wireless sensor networks a surveyrdquo Journal of Network andComputer Applications vol 34 no 4 pp 1302ndash1325 2011

[7] Z Ding M Fei D Du and S Xu ldquoOnline anomaly detectionmethod based on BBO ensemble pruning in wireless sensornetworksrdquo in Life System Modeling and Simulation vol 461 ofCommunications in Computer and Information Science pp 160ndash169 Springer Berlin Germany 2014

[8] T G Dietterich ldquoMachine-learning researchmdashfour currentdirectionsrdquo AI Magazine vol 18 no 4 pp 97ndash136 1997

[9] Z-H Zhou J Wu andW Tang ldquoEnsembling neural networksmany could be better than allrdquoArtificial Intelligence vol 137 no1-2 pp 239ndash263 2002

[10] N Shahid I H Naqvi and S B Qaisar ldquoCharacteristics andclassification of outlier detection techniques for wireless sensornetworks in harsh environments a surveyrdquoArtificial IntelligenceReview vol 137 pp 1ndash36 2012

[11] D Du K Li and M Fei ldquoA fast multi-output RBF neuralnetwork constructionmethodrdquoNeurocomputing vol 73 no 10ndash12 pp 2196ndash2202 2010

[12] P Gil A Santos and A Cardoso ldquoDealing with outliers inwireless sensor networks an oil refinery applicationrdquo IEEETransactions on Control Systems Technology vol 23 no 4 pp1589ndash1596 2014

[13] M A Rassam M A Maarof and A Zainal ldquoAdaptive andonline data anomaly detection for wireless sensor systemsrdquoKnowledge-Based Systems vol 60 pp 44ndash57 2014

[14] S Rajasegarar A Gluhak M Ali Imran et al ldquoEllipsoidalneighbourhood outlier factor for distributed anomaly detectionin resource constrained networksrdquo Pattern Recognition vol 47no 9 pp 2867ndash2879 2014

[15] N Lu G Zhang and J Lu ldquoConcept drift detection viacompetence modelsrdquo Artificial Intelligence vol 209 pp 11ndash282014

[16] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[17] S Seguı L Igual and J Vitria ldquoBagged one-class classifiersin the presence of outliersrdquo International Journal of PatternRecognition and Artificial Intelligence vol 27 no 5 Article ID1350014 2013

12 International Journal of Distributed Sensor Networks

[18] N Duffy and D Helmbold ldquoBoosting methods for regressionrdquoMachine Learning vol 47 no 2-3 pp 153ndash200 2002

[19] W-C Chang and C-W Cho ldquoOnline boosting for vehicledetectionrdquo IEEETransactions on SystemsMan and CyberneticsPart B Cybernetics vol 40 no 3 pp 892ndash902 2010

[20] C Desir S Bernard C Petitjean and L Heutte ldquoOne classrandom forestsrdquo Pattern Recognition vol 46 no 12 pp 3490ndash3506 2013

[21] A Fern and R Givan ldquoOnline ensemble learning an empiricalstudyrdquoMachine Learning vol 53 no 1-2 pp 71ndash109 2003

[22] A Bifet G Holmes B Pfahringer and R Gavalda ldquoImprov-ing adaptive bagging methods for evolving data streamsrdquo inAdvances in Machine Learning vol 5828 of Lecture Notes inComputer Science pp 23ndash37 Springer Berlin Germany 2009

[23] D I Curiac and C Volosencu ldquoEnsemble based sensinganomaly detection in wireless sensor networksrdquo Expert Systemswith Applications vol 39 no 10 pp 9087ndash9096 2012

[24] X Zhou S Li and Z Ye ldquoA novel system anomaly predictionsystem based on belief markov model and ensemble classifica-tionrdquo Mathematical Problems in Engineering vol 2013 ArticleID 179390 10 pages 2013

[25] H He S Chen K Li and X Xu ldquoIncremental learning fromstream datardquo IEEE Transactions on Neural Networks vol 22 no12 pp 1901ndash1914 2011

[26] D Du K Li X Li and M Fei ldquoA novel forward gene selectionalgorithm for microarray datardquo Neurocomputing vol 133 pp446ndash458 2014

[27] H Ma ldquoAn analysis of the equilibrium of migration models forbiogeography-based optimizationrdquo Information Sciences vol180 no 18 pp 3444ndash3464 2010

[28] D Simon ldquoBiogeography-based optimizationrdquo IEEE Transac-tions on Evolutionary Computation vol 12 no 6 pp 702ndash7132008

[29] S Sheen R Anitha and P Sirisha ldquoMalware detection bypruning of parallel ensembles using harmony searchrdquo PatternRecognition Letters vol 34 no 14 pp 1679ndash1686 2013

[30] Y-Y Zhang H-C Chao M Chen L Shu C-H Park and M-S Park ldquoOutlier detection and countermeasure for hierarchicalwireless sensor networksrdquo IET Information Security vol 4 no4 pp 361ndash373 2010

[31] C Peng and M-R Fei ldquoAn improved result on the stability ofuncertain T-S fuzzy systems with interval time-varying delayrdquoFuzzy Sets and Systems vol 212 pp 97ndash109 2013

[32] Y Zhang Observing the Unobservable Distributed Online Out-lier Detection inWireless Sensor Networks University of TwenteEnschede The Netherlands 2010

[33] C Peng D Yue and M Fei ldquoRelaxed stability and stabilizationconditions of networked fuzzy control systems subject toasynchronous grades of membershiprdquo IEEE Transactions onFuzzy Systems vol 22 no 5 pp 1101ndash1112 2014

[34] C Peng M-R Fei E Tian and Y-P Guan ldquoOn hold or dropout-of-order packets in networked control systemsrdquo Informa-tion Sciences vol 268 pp 436ndash446 2014

[35] M A Rassam A Zainal and M A Maarof ldquoAn adaptive andefficient dimension reduction model for multivariate wirelesssensor networks applicationsrdquo Applied Soft Computing Journalvol 13 no 4 pp 1978ndash1996 2013

[36] M Xie J Hu S Han and H-H Chen ldquoScalable hypergridk-NN-based online anomaly detection in wireless sensor net-worksrdquo IEEE Transactions on Parallel and Distributed Systemsvol 24 no 8 pp 1661ndash1670 2013

[37] Intel Berkely Reseach Lab (IBRL) dataset 2004 httpdbcsailmitedulabdatalabdatahtml

[38] J W Branch C Giannella B Szymanski R Wolff and HKargupta ldquoIn-network outlier detection in wireless sensornetworksrdquo Knowledge and Information Systems vol 34 no 1pp 23ndash54 2013

[39] M Moshtaghi T C Havens J C Bezdek et al ldquoClusteringellipses for anomaly detectionrdquo Pattern Recognition vol 44 no1 pp 55ndash69 2011

[40] S Rajasegarar J C Bezdek C Leckie and M PalaniswamildquoElliptical anomalies in wireless sensor networksrdquo ACM Trans-actions on Sensor Networks vol 6 no 1 pp 1ndash28 2009

[41] M A Rassam A Zainal and M A Maarof ldquoOne-classprincipal component classifier for anomaly detection inwirelesssensor networkrdquo in Proceedings of the 4th International Confer-ence on Computational Aspects of Social Networks (CASoN rsquo12)pp 271ndash276 IEEE Sao Carlos Brazil November 2012

[42] H Sagha H Bayati J D R Millan and R Chavarriaga ldquoOn-line anomaly detection and resilience in classifier ensemblesrdquoPattern Recognition Letters vol 34 no 15 pp 1916ndash1927 2013

[43] M Hejazi and Y P Singh ldquoOne-class support vector machinesapproach to anomaly detectionrdquo Applied Artificial Intelligencevol 27 no 5 pp 351ndash366 2013

[44] Y Zhang NMeratnia and P JMHavinga ldquoDistributed onlineoutlier detection in wireless sensor networks using ellipsoidalsupport vector machinerdquo Ad Hoc Networks vol 11 no 3 pp1062ndash1074 2013

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 7: Research Article A Novel Distributed Online Anomaly ...downloads.hindawi.com/journals/ijdsn/2015/146189.pdfin the cluster head node to obtain an optimized subset of ensemble members

International Journal of Distributed Sensor Networks 7

Input 1198641015840mdashCurrent pruned ensemble anomaly detector 119901mdashSampling probabilityOutput 119864lowastmdashUpdated pruned ensemble anomaly detector

For each sensor nodeRetain the new observation with probability 119901If buffer is replaced completely by new observations

Train new detector and transmit its summary to cluster head119864lowast = Ensemble Pruning BBO(1198641015840 119879)

Broadcast 119864lowast to its member sensor node for subsequent anomaly detection

Algorithm 2 Online updating (1198641015840 119901)

member sensor nodes In order to relieve communicationburden some skills are used to descend the communicationoverhead

In fact the distributed traininglearning method onlytransmits the summary information of trained local ensembledetector to the cluster head which has significantly decreasedthe communication cost compared to centralized anomalydetectionmanners that sent all trained data to cluster head tobuild detector Besides after the pruned ensemble is obtainedin cluster head node each member sensor node in thiscluster can obtain the pruned ensemble detector from thecluster head node A straightforward method is broadcastingthis pruned ensemble to its member sensor nodes Thisis a common used strategy but it does not make full useof local ensemble detector information and will cost morecommunication resources Here a state matrix 119875 is designedin the cluster head its element 119901

119894119895is defined by formula (6)

to represent each single detector in initial ensemble Theneach local ensemble detector is represented as a bit stringusing one bit for each single detector Detector is included orexcluded from the ensemble detector depending on the valueof the corresponding bit that is 1 denotes this single detectorthat is included in the final ensemble and 0 means it was notincluded

119901119894119895=

1 AD119894119895isin 1198641015840 119894 = 1 119898 119895 = 1 119899

0 otherwise

119875

=

1 2 119894 minus 1 119894 119894 + 1 sdot sdot sdot 119899

1198781

1198782

119878119898

[

[

[

[

[

[

0

1

sdot

0

1

0

sdot

0

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

0

1

sdot

1

1

1

sdot

1

1

0

sdot

1

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

sdot sdot sdot

1

0

sdot

1

]

]

]

]

]

]

(6)

After the pruned procedure is finished the cluster headbroadcasts the statematrix119875 to its member sensor node eachsensor node keeps the single detector whose correspondingvalue of state element equals 1 and it deletes the rest to buildthe pruned ensemble global detector Employing the statematrix can save the energy greatly For example after theensemble pruning is finished 1198731015840 (1198731015840 le 119899 lowast 119898) individualdetectors are broadcast in cluster If matrix 119875 is not used it

will need 4 lowast1198731015840 lowast 119889 bytes communication cost (suppose thatthe individual detector can be represented by 119889 parametersand each parameter needs at least 4 bytes) If matrix 119875 isintroduced each itemofmatrix119875 only needs 1 bit to representan individual detector Consequently only 119898 lowast 1198998 bytes arerequired to broadcast Suppose that one-third of individualdetectors are pruned (ie1198731015840 = 2lowast119899lowast1198983) then (4lowast119899lowast119898lowast

119889 lowast 23)(119898 lowast 1198998) asymp 2133119889 By introducing the ensemblepruning and state matrix the quantity of energy saving incluster head sensor is significant and the lifetime ofWSNs canbe lengthened

334 Online Update and Relearning Distribution changeof sensed dataset occurred possibly and detector updat-ing is necessary Online detector update will accompanya relearning procedure A comprised strategy (ie delayupdating strategy [36]) can cater this situation and savethe computation communication and memory resources tosome extent Simple to say for the new coming observationwhether saving and using it to update the current detectoror not are decided by a sample probability 119901 Some heuristicrules can be employed to guide its value for example if thedynamics is relatively stationary the small 119901 should be usedotherwise the big 119901 should be chosen When the buffer of asensor node is replaced by the new data completely onlineupdate is triggered and new detector is trained The pseudo-codes of algorithm can be described as shown inAlgorithm 2

4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a personal PC with an Intel Core 2 Duo CPU P7450 (2.13 GHz) and 4 GB memory. The operating system is Windows 7 Professional. The data processing was partly done in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.

4.1. Dataset and Data Preprocessing. The IBRL dataset [37] was used in this paper to validate the proposed method; it was collected from a WSN deployed in the Intel Research Laboratory at Berkeley and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38–41]. This network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the deployment (node locations are shown as black hexagons with their corresponding node IDs) [35].

Figure 3: Sensor node locations in the IBRL deployment.

The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurements, that is, light, temperature, and humidity as well as voltage, were collected, and the measurements were recorded at a 31 s interval. Because these sensors were deployed inside a lab and the measured variables changed little over time (except light, which exhibits sudden changes due to the irregular nature of this variable and frequent on/off operation), this dataset has been considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, some artificial anomalies are created by randomly modifying some observations, a practice widely used in the literature [41].

Since our proposed method adopts the cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and a dataset (collected on 29/02/2004) are chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 0:00:00 am–7:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions within a cluster are almost the same, which proves well that spatial correlation exists. Though there are some trivial differences after analyzing the dataset carefully, the main reason is that the dataset has some missing data points, largely due to packet loss, which can be further observed in Figure 4. In our experiment, these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.

Suppose that $D = \{(x_i, y_i)\}$, $i = 1, 2, \dots, n$, is a dataset used to train an anomaly detector. Here $x_i$ is a vector of feature values and $y_i$ is the label which indicates whether the given observation is normal or anomalous. Because the IBRL dataset regards all of its observations as normal, some anomalous data points are generated and inserted to evaluate the performance of our proposed method. In this paper, 30 artificial anomalies per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a nonfaulty sensor node. The anomalies were generated using a normal randomizer with statistical characteristics that deviate slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
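A hedged illustration of this injection scheme is sketched below; the paper does not give the exact generator, so the shift and scale values used here are assumptions chosen only to mimic Table 1, where the injected means and variances deviate slightly from the normal ones.

```python
import numpy as np

def inject_anomalies(normal: np.ndarray, n_anomalies: int = 30,
                     mean_shift: float = 0.3, std_scale: float = 1.05,
                     seed: int = 0):
    """Append anomalies drawn from a slightly shifted/scaled normal distribution.

    normal: (N, d) array of normal observations (e.g., temperature and humidity).
    Returns the augmented data and 0/1 labels (1 = injected anomaly).
    """
    rng = np.random.default_rng(seed)
    mu, sigma = normal.mean(axis=0), normal.std(axis=0)
    anomalies = rng.normal(mu + mean_shift, std_scale * sigma,
                           size=(n_anomalies, normal.shape[1]))
    data = np.vstack([normal, anomalies])          # anomalies appended consecutively
    labels = np.r_[np.zeros(len(normal)), np.ones(n_anomalies)]
    return data, labels
```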

4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate our proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted in this paper, such as detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR). They are defined as follows:

$$
\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}},
$$


Figure 4: The data (temperature, humidity) trends during 0:00:00 am–7:59:59 am on February 29, 2004, for nodes N7–N10: (a) temperature; (b) humidity.

Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature, H: humidity).

Node | Initial samples | Mean (T / H)      | Variance (T / H) | Injected anomalies | Mean (T / H)  | Variance (T / H)
N7   | 823             | 18.4154 / 40.9176 | 0.5238 / 1.4494  | 30                 | 18.21 / 41.10 | 0.54 / 1.46
N8   | 548             | 17.9844 / 41.7123 | 0.5315 / 1.4612  | 30                 | 17.75 / 41.95 | 0.55 / 1.48
N9   | 652             | 18.1140 / 42.6295 | 0.5288 / 1.4827  | 30                 | 18.35 / 42.45 | 0.55 / 1.50
N10  | 620             | 18.1144 / 42.6215 | 0.5244 / 1.4191  | 30                 | 18.33 / 42.47 | 0.54 / 1.43

$$
\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}},
\qquad
\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}},
\tag{7}
$$

where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.
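For reference, the three metrics follow directly from these four counts; the small helper below simply restates the definitions in (7) and is not tied to the paper's code.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute ACC, TPR, and FPR from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity on the anomaly class
    fpr = fp / (fp + tn) if (fp + tn) else 0.0   # false-alarm rate on the normal class
    return {"ACC": acc, "TPR": tpr, "FPR": fpr}

# Example: 25 of 30 injected anomalies detected, 10 false alarms on 270 normal samples.
print(detection_metrics(tp=25, fp=10, tn=260, fn=5))
```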

BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size $S = 30$; number of SIVs (suitability index variables) in each island $n = 20, 40, 60, 80$; maximum migration rates $E = 1$ and $I = 1$; mutation rate $\eta = 0.01$; $\lambda$ and $\mu$ are the immigration rate and the emigration rate, respectively; and the elitism parameter is $\rho = 2$.
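Below is a compact, hedged sketch of how a binary BBO search over detector-selection masks can be organized. The population size, mutation rate, and elitism count mirror the settings above, but the rank-based migration rates, the loop structure, and the API are simplifications of the models in [27, 28] rather than the authors' implementation; `fitness` is expected to return the F-measure (the HSI) of the candidate sub-ensemble on held-out data.

```python
import numpy as np

def bbo_prune(fitness, n_bits: int, pop_size: int = 30, iters: int = 50,
              mutation_rate: float = 0.01, elitism: int = 2, seed: int = 0):
    """Binary BBO: each habitat is a 0/1 mask over the n_bits single detectors."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n_bits))       # random initial masks
    hsi = np.array([fitness(mask) for mask in pop])

    for _ in range(iters):
        order = np.argsort(-hsi)                             # best habitats first
        pop, hsi = pop[order], hsi[order]
        ranks = np.arange(pop_size)
        lam = ranks / (pop_size - 1)                         # immigration: low for good habitats
        mu = 1.0 - lam                                       # emigration: high for good habitats

        new_pop = pop.copy()
        for i in range(elitism, pop_size):                   # keep the elite habitats untouched
            for j in range(n_bits):
                if rng.random() < lam[i]:                    # immigrate this SIV from another habitat
                    src = rng.choice(pop_size, p=mu / mu.sum())
                    new_pop[i, j] = pop[src, j]
                if rng.random() < mutation_rate:             # flip-bit mutation
                    new_pop[i, j] = 1 - new_pop[i, j]
        pop = new_pop
        hsi = np.array([fitness(mask) for mask in pop])

    best = int(np.argmax(hsi))
    return pop[best], float(hsi[best])                       # pruned-ensemble mask and its HSI
```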

The HSI (habitat suitability index) is a fitness function, similar to those of other population-based optimization algorithms. The HSI is evaluated by the $F$-measure ($F$-score), which considers both the precision probability and the recall probability of the binary classification problem:

$$
F\text{-measure} = \frac{(1+\beta^{2})\,\mathrm{precision}\cdot\mathrm{recall}}{\beta^{2}\,\mathrm{precision}+\mathrm{recall}}
= \frac{(1+\beta^{2})\,\mathrm{TP}}{(1+\beta^{2})\,\mathrm{TP}+\beta^{2}\,\mathrm{FN}+\mathrm{FP}}.
\tag{8}
$$

The $F$-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. $\beta$ is a parameter used to adjust the relative importance of precision versus recall, commonly $\beta = 0.5, 1, 2$. Usually the value of the $F$-measure is close to the smaller of precision and recall; that is, a large $F$-measure means that precision and recall are both large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, whereas a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, $\beta = 1$ is specified.
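The HSI used by the pruning sketch above can then be computed from the counts produced by letting the selected sub-ensemble classify a validation set; the helper below merely restates formula (8).

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F_beta computed directly from counts, as in formula (8)."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0
```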

4.3. Results Presentation and Discussion.


Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7 (ACC/TPR/FPR)     | N8 (ACC/TPR/FPR)     | N9 (ACC/TPR/FPR)     | N10 (ACC/TPR/FPR)
5             | 0.8700/0.5833/0.1181 | 0.7900/0.3333/0.1809 | 0.8267/0.5000/0.1549 | 0.8267/0.5714/0.1608
10            | 0.8800/0.6667/0.1111 | 0.8033/0.3889/0.1702 | 0.8267/0.4375/0.1514 | 0.8333/0.6429/0.1573
15            | 0.8900/0.7500/0.1042 | 0.8167/0.5000/0.1631 | 0.8433/0.5000/0.1373 | 0.8600/0.7143/0.1329
20            | 0.8933/0.8333/0.1042 | 0.8200/0.5000/0.1596 | 0.8367/0.5000/0.1444 | 0.8567/0.7143/0.1364

Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7 (ACC/TPR/FPR)     | N8 (ACC/TPR/FPR)     | N9 (ACC/TPR/FPR)     | N10 (ACC/TPR/FPR)
20                     | 0.9467/0.8333/0.0486 | 0.9300/0.7778/0.0603 | 0.9467/0.7500/0.0423 | 0.9500/0.7857/0.0420
40                     | 0.9700/0.7500/0.0208 | 0.9433/0.8333/0.0496 | 0.9710/0.8938/0.0246 | 0.9650/0.8929/0.0315
60                     | 0.9700/0.8333/0.0243 | 0.9733/0.8889/0.0213 | 0.9800/0.9375/0.0176 | 0.9783/0.9357/0.0196
80                     | 0.9817/0.9583/0.0174 | 0.9800/0.9444/0.0177 | 0.9767/0.9375/0.0211 | 0.9780/0.9714/0.0217

In the data mining and machine learning communities, SVM-based methods have been widely used in classification problems; they separate the data belonging to different classes by fitting a hyperplane. The one-class SVM, as a variation of this method, is especially favored for anomaly detection [42–44]; in this paper it was used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder, as the test set, was used to evaluate the proposed method.

Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: first, to prove the effectiveness of the proposed method based on ensemble learning theory; second, to prove that the pruned ensemble detector can obtain better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirement. As a result, three experiments were done: the local ensemble anomaly detector, which only considers the temporal correlation of each sensor node; the global ensemble anomaly detector, which considers the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results can be seen in Tables 2, 3, and 4, respectively.
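As an illustration of how such a local base-detector ensemble might be assembled, the following sketch combines Online Bagging's Poisson(1) sample weighting with scikit-learn's OneClassSVM; the kernel settings, ensemble size, and majority-vote rule are assumptions made for the example rather than the paper's exact configuration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def train_local_ensemble(X_train: np.ndarray, ensemble_size: int = 20, seed: int = 0):
    """Online-Bagging-style ensemble of one-class SVMs: each base detector sees
    every training sample k ~ Poisson(1) times (approximated here by resampling)."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(ensemble_size):
        counts = rng.poisson(1.0, size=len(X_train))        # per-sample replication counts
        idx = np.repeat(np.arange(len(X_train)), counts)
        if len(idx) == 0:                                    # degenerate draw: fall back to the full set
            idx = np.arange(len(X_train))
        ensemble.append(OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(X_train[idx]))
    return ensemble

def ensemble_predict(ensemble, X: np.ndarray) -> np.ndarray:
    """Majority vote: OneClassSVM.predict returns +1 (normal) / -1 (anomaly)."""
    votes = np.mean([clf.predict(X) for clf in ensemble], axis=0)
    return np.where(votes < 0, -1, 1)
```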

Table 2 shows the performance of each sensor node under different ensemble sizes when the spatial correlation of the sensed data in a cluster is not taken into account. Though the ensemble detection performance becomes gradually better with increasing ensemble size (the higher the ACC and TPR and the lower the FPR, the better the performance), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detectors were trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results in Table 3 [7], an obvious fact is that the detection performances are higher than those presented in Table 2: with the help of the neighbors' detectors, the detection results improve steadily as the ensemble size increases.

In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.

Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance is as good as or better than that of the initial global ensemble detector. From the results in Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only for validating the method, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors should be chosen is an open topic and is decided by many factors, such as the computation capability, communication cost, and memory usage of the sensor node, the expected detection accuracy requirement, and so on. In practical applications, a trade-off is commonly considered.

5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints in WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrated that the proposed method is effective.

Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of this work we plan to include some diversity measures in the fitness function to improve the detection performance in the future.


Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned) | N7 (ACC/TPR/FPR)     | N8 (ACC/TPR/FPR)     | N9 (ACC/TPR/FPR)     | N10 (ACC/TPR/FPR)
14                         | 0.9480/0.8000/0.0458 | 0.9327/0.7667/0.0567 | 0.9500/0.8125/0.0423 | 0.9533/0.8571/0.0420
23                         | 0.9710/0.7750/0.0208 | 0.9447/0.8000/0.0461 | 0.9733/0.9250/0.0239 | 0.9697/0.9143/0.0276
27                         | 0.9713/0.8500/0.0236 | 0.9683/0.8333/0.0230 | 0.9810/0.9563/0.0176 | 0.9797/0.9357/0.0182
32                         | 0.9820/0.9750/0.0177 | 0.9750/0.8333/0.0160 | 0.9820/0.9500/0.0162 | 0.9830/0.9786/0.0168

Table 5: Rate of resource cost saved by the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Saved resource cost
1      | 20                    | 14                   | 30%
2      | 40                    | 23                   | 42.5%
3      | 60                    | 27                   | 55%
4      | 80                    | 32                   | 60%

Besides, since the cost of communication is the main reason for the quick energy depletion of sensor nodes, especially the cluster head, adaptive selection of the cluster head based on its energy state will be taken into account to lengthen the lifetime of WSNs in future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).

References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.

[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.

[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.

[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.

[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.

[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.

[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.

[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.

[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.

[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.

[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.

[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.

[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.

[14] S. Rajasegarar, A. Gluhak, M. Ali Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.

[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.

[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.

[17] S. Segui, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.

[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.

[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.

[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.

[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.

[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.

[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.

[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.

[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.

[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.

[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.

[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.

[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.

[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.

[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.

[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.

[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.

[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.

[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.

[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.

[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.

[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.

[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.

[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.

[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.

[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.

[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.

[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 8: Research Article A Novel Distributed Online Anomaly ...downloads.hindawi.com/journals/ijdsn/2015/146189.pdfin the cluster head node to obtain an optimized subset of ensemble members

8 International Journal of Distributed Sensor Networks

1

2

3

4

5 6

7

8

9

10

11

12

1314

15

16

17

18

19

20

21

22

23

2425

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

4142

43

44

45

46

47

48

49

50

51

52

54

Lab

Server

Quiet

Phon

e

Kitchen

Elec

Copy

Storage

Conference

Office Office

53

Figure 3 Sensor nodes location in the IBRL deployment

deployment (node locations are shown in black hexagonwith their corresponding node IDs) [35] The whole datasetwas collected from 29022004 to 05042004 Four typesof measures data that is light temperature and humidityas well as voltage were collected and those measurementswere recorded in 31 s interval Because these sensors weredeployed inside a lab and the measurement variables hadlittle changes over time (except the light having the suddenchanges due to the irregular nature of this variable andfrequent onoff operation) this dataset was considered a typeof static datasets for many researchers In our experimentsto evaluate our proposed anomaly detection algorithm someartificial anomalies are created by randomly modifying someobservations which is widely used bymany researchers in theliterature [41]

Since our proposed method adopts the cluster structurea cluster (consisting of 4 sensor nodes ie N7 N8 N9and N10) and dataset (collected on 29022004) are chosenThe data distribution can be seen in [7] Here only partobservations (during 000000 amndash075959 am) from eachsensor node are employed to evaluate proposed methodThedata trend is depicted in Figure 4

From Figure 4 an obvious fact is that data distributionin a cluster is almost same which well proved that spatialcorrelation exists Though there are some trivial differencesafter analyzing the dataset carefully the main reason is thatdataset has some missing data points largely due to packetloss which can be further proved from Figure 4 In ourexperiment these missing observations can be interpolatedusing the method described in Section 33 The obvious factis that sudden peakvalley appeared in Figure 4 for eachsensor observation which implies that an interested eventmay occurred

Suppose that 119863 = 119909119894 119910119894 119894 = 1 2 119899 is a dataset

used to train an anomaly detector Here the 119909119894is a vector

with feature values and 119910119894is the label which indicates whether

the given observation is normal or anomalous Because theIBRL dataset regards all its observations as normal someanomaly data points are generated and inserted to evaluatethe performance of our proposed method In the paper anumber of 30 data points of artificial anomalies for eachsensor were injected consecutively in each dataset to calculatethe true positive rate (TPR) false negative rate (FPR) anddetection accuracy (ACC) Without loss of generality theanomalous dataset should follow a distribution very muchdifferent from that of the training dataset but their rangesshould be overlapped as much as possible Besides ananomalous event should be a small probability event fora normal dataset collected by a nonfault sensor node Theanomalies were generated using a normal randomizer withslightly deviate statistical characteristics from the normaldata characteristics [41] The detailed dataset information(including statistical parameters) of selected sensor node ispresented in Table 1

42 Performance EvaluationMetrics and BBO Parameters Inorder to evaluate our proposed method some commonlyused performance evaluation metrics for anomaly detectionare used in our paper such as detection accuracy (ACC) truepositive rate (TPR) and false positivealarm rate (FPR)Theyare described as follows

ACC =

(TP + TN)(TP + TN + FP + FN)

International Journal of Distributed Sensor Networks 9

17

175

18

185

19

195

20Temperature

0 200 400 600 800 1000

N7N8

N9N10

(a)

38

39

40

41

42

43

44

45

46

0 200 400 600 800 1000

Humidity

N7N8

N9N10

(b)

Figure 4 The data (temperature humidity) trend during 00000 amndash75959 am on February 29 2004

Table 1 Detail dataset information of selected sensor node on 29022004

Node Initial sample Mean Variance Injected anomaly Mean Variance119879 119867 119879 119867 119879 119867 119879 119867

N7 823 184154 409176 05238 14494 30 1821 4110 054 146N8 548 179844 417123 05315 14612 30 1775 4195 055 148N9 652 181140 426295 05288 14827 30 1835 4245 055 150N10 620 181144 426215 05244 14191 30 1833 4247 054 143119879 temperature119867 humidity

TPR =

TP(TP + FN)

FPR =

FP(FP + TN)

(7)

where TP means number of samples correctly predicted asanomaly class FP means number of samples incorrectlypredicted as anomaly class TN means number of samplescorrectly predicted as normal class and FN means numberof samples incorrectly predicted as normal class

BBO is employed to prune the initial ensemble and themigration model is same as that present in [27 28] and therelated parameters are set as follows

Habitat (population) size 119878 = 30 the number of SIVs(suitability index variables) in each island 119899 = 20 40 60 80the maximum migration rates 119864 = 1 and 119868 = 1 and themutation rate 120578 = 001 and 120582 120583 are the immigration rateand the emigration rate respectively The elitism parameter120588 = 2

HSI (habitat suitability index) is a fitness function similarto other population-based optimization algorithms HIS isevaluated by 119865-measure (119865-score) which considers both the

precision probability and the recall probability of binaryclassification problem

119865-measure =(1 + 120573

2) precision lowast recall

1205732lowast precision + recall

=

(1 + 1205732) lowast TP

(1 + 1205732) lowast TP + 1205732 lowast FN + FP

(8)

119865-measure can be interpreted as a weighted average of theprecision and recall and its value reaches best at 1 and worstat 0 120573 is a parameter used to adjust the relative importancebetween precision and recall 120573 = 05 1 2 Usually thevalue of 119865-measure is close to the relative small value ofprecision and recall that is the big 119865-measuremeans that theprecision and recall are all big Consequently a good detectoris analogous to a habitat with a high HSI and is included inthe final ensemble detector and a poor detector is analogousto a habitat with a low HIS and is discarded from the finalensemble detector In our paper 120573 = 1 is specified

43 Results Presentation and Discussions In the data miningand machine learning communities SVM-based method has

10 International Journal of Distributed Sensor Networks

Table 2 Detection performance of local ensemble detector

Ensemble size N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

5 08700 05833 01181 07900 03333 01809 08267 05000 01549 08267 05714 0160810 08800 06667 01111 08033 03889 01702 08267 04375 01514 08333 06429 0157315 08900 07500 01042 08167 05000 01631 08433 05000 01373 08600 07143 0132920 08933 08333 01042 08200 05000 01596 08367 05000 01444 08567 07143 01364

Table 3 Detection performance of global ensemble detector [7]

Combined ensemble size N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

20 09467 08333 00486 09300 07778 00603 09467 07500 00423 09500 07857 0042040 09700 07500 00208 09433 08333 00496 09710 08938 00246 09650 08929 0031560 09700 08333 00243 09733 08889 00213 09800 09375 00176 09783 09357 0019680 09817 09583 00174 09800 09444 00177 09767 09375 00211 09780 09714 00217

been widely used in classification problem which separatesthe data belonging to the different classes by fitting a hyper-plane One class SVM based method as a variation of thismethod is especially favored for anomaly detection [42ndash44]In the paper it was used to train the base detectorThe datasetof each sensor node was divided into two parts about 66was used for training the local detector and the remainder asthe test set was to evaluate proposed method

Online Bagging the commonly used ensemble strategywas used to build initial ensemble detector Our experimentsaim to achieve two goals Firstly it is to prove the effectivenessof proposed method based on ensemble learning theorySecondly it is to prove that pruned ensemble detector canobtain better (at least equal) performance compared to initialensemble detector and mitigate the resource requirementAs a result three experiments were done that is localensemble anomaly detector only considering the temporalcorrelation of each sensor node global ensemble anomalydetector considering the spatiotemporal correlation and theglobal pruned ensemble anomaly detector based on BBOThe experimental results can be seen in Tables 2 3 and 4respectively

Table 2 shows the performance of each sensor node underthe different ensemble size which does not take into accountthe spatial correlation of sensed data in a cluster Though theensemble detection performance is becoming ldquogoodrdquo gradualwith the increasing of ensemble size (the higher value ofACCTPR the better performance and the lower value of FPR thebetter performance) the overall performance is relatively lowThemaximumvalue of detection accuracy is only 8933 andmost of true positive rates are unacceptable and most of falsepositive rates (FPR) have a relative high valueAll these resultsindicate that the performance of local ensemble detectoris poor Table 3 shows the global detection performance ofeach sensor node Here after the local ensemble detectorwas trained each member node sent its local ensembleto each other to form the global ensemble detector andeach member node used this global detector to online testthe local observation From the results of Table 3 [7] an

obvious fact is that the detection performances are higherthan presented in Table 2With the help of neighbor detectorthe detection results become better and better correspondingto the increasing of ensemble size

In order to further optimize the proposed algorithmperformance and save the resource ensemble pruning is usedfor global ensemble detector Table 4 [7] shows the result ofdetection performance of pruned global ensemble detectorbased on BBO

Table 4 shows a more practicable result and the sizeof global ensemble decreases sharply while the detectorperformance is as good as or better than the initial globalensemble detector From the results of Table 5 when thesize of initial ensemble reaches 80 the 60 resource costis saved In our experiment only for validating the methodeffectively we set the ensemble sizes 5 10 15 and 20 for eachlocal ensemble detector which may be small for the practicalapplications In fact how many local ensemble detectors arechosen is an open topic and is decided by many factors suchas the computation capability and the communication cost aswell as memory usage of sensor node the expected detectingaccuracy requirement and so on In the practical applicationa trade-off is commonly considered

5 Conclusion and Future Work

After exploiting the spatiotemporal correlation existing inthe sensed data of WSNs and motivated by the advantagesof online ensemble learning a distributed online ensembleanomaly detector method has been proposed Due to thespecific resource constrained in theWSNs ensemble pruningbased on BBO is employed to mitigate the high resourcerequirement and obtain the optimized detector that performsat least as good as the original ones The experimental resultson real dataset demonstrated that our proposed method iseffective

Because the diversity of base learners is a key factorrelated to the performance of ensemble learning as a possibleextension of our work we plan to include some diversity

International Journal of Distributed Sensor Networks 11

Table 4 Detection performance of global ensemble detector based on BBO pruning [7]

Ensemble size(BBO pruned)

N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

14 09480 08000 00458 09327 07667 00567 09500 08125 00423 09533 08571 0042023 09710 07750 00208 09447 08000 00461 09733 09250 00239 09697 09143 0027627 09713 08500 00236 09683 08333 00230 09810 09563 00176 09797 09357 0018232 09820 09750 00177 09750 08333 00160 09820 09500 00162 09830 09786 00168

Table 5 Rate of saving resource cost based on global ensembledetector of BBO pruned

Number Initial ensemblesize

Prunedensemble size

Saving resourcecost

1 20 14 302 40 23 4253 60 27 554 80 32 60

measures in fitness function to improve the detecting per-formance in future Besides the cost of communication isthe main reason of quick energy depletion of sensor nodesespecially for the cluster head the adaptive selection of clusterhead based on energy state will be taken into account tolengthen the lifetime of WSNs in next work

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by the National Key ScientificInstrument and Equipment Development Project(2012YQ15008703) the Zhejiang Provincial Natural ScienceFoundation of China (LY13F020015) the Open Project of TopKeyDiscipline of Computer Software andTheory in ZhejiangProvincial (ZC323014100) National Science Foundation ofChina (61473182) Science and Technology Commission ofShanghai Municipality (11JC1404000 14JC1402200) andShanghai Rising-Star Program (13QA1401600)

References

[1] Y Zhang N Meratnia and P Havinga ldquoOutlier detectiontechniques for wireless sensor networks a surveyrdquo IEEE Com-munications Surveys and Tutorials vol 12 no 2 pp 159ndash1702010

[2] Y Zhang N A S Hamm N Meratnia A Stein M van deVoort and P J M Havinga ldquoStatistics-based outlier detectionfor wireless sensor networksrdquo International Journal of Geo-graphical Information Science vol 26 no 8 pp 1373ndash1392 2012

[3] C Peng and Q-L Han ldquoA novel event-triggered transmissionscheme and L

2control co-design for sampled-data control

systemsrdquo IEEE Transactions on Automatic Control vol 58 no10 pp 2620ndash2626 2013

[4] S Rajasegarar C Leckie M Palaniswami and J C BezdekldquoDistributed anomaly detection in wireless sensor networksrdquo inProceedings of the 10th IEEE Singapore International Conferenceon Communication systems (ICCS rsquo06) pp 1ndash5 IEEE SingaporeOctober 2006

[5] S Rajasegarar C Leckie and M Palaniswami ldquoAnomalydetection in wireless sensor networksrdquo IEEE Wireless Commu-nications vol 15 no 4 pp 34ndash40 2008

[6] M Xie S Han B Tian and S Parvin ldquoAnomaly detectionin wireless sensor networks a surveyrdquo Journal of Network andComputer Applications vol 34 no 4 pp 1302ndash1325 2011

[7] Z Ding M Fei D Du and S Xu ldquoOnline anomaly detectionmethod based on BBO ensemble pruning in wireless sensornetworksrdquo in Life System Modeling and Simulation vol 461 ofCommunications in Computer and Information Science pp 160ndash169 Springer Berlin Germany 2014

[8] T G Dietterich ldquoMachine-learning researchmdashfour currentdirectionsrdquo AI Magazine vol 18 no 4 pp 97ndash136 1997

[9] Z-H Zhou J Wu andW Tang ldquoEnsembling neural networksmany could be better than allrdquoArtificial Intelligence vol 137 no1-2 pp 239ndash263 2002

[10] N Shahid I H Naqvi and S B Qaisar ldquoCharacteristics andclassification of outlier detection techniques for wireless sensornetworks in harsh environments a surveyrdquoArtificial IntelligenceReview vol 137 pp 1ndash36 2012

[11] D Du K Li and M Fei ldquoA fast multi-output RBF neuralnetwork constructionmethodrdquoNeurocomputing vol 73 no 10ndash12 pp 2196ndash2202 2010

[12] P Gil A Santos and A Cardoso ldquoDealing with outliers inwireless sensor networks an oil refinery applicationrdquo IEEETransactions on Control Systems Technology vol 23 no 4 pp1589ndash1596 2014

[13] M A Rassam M A Maarof and A Zainal ldquoAdaptive andonline data anomaly detection for wireless sensor systemsrdquoKnowledge-Based Systems vol 60 pp 44ndash57 2014

[14] S Rajasegarar A Gluhak M Ali Imran et al ldquoEllipsoidalneighbourhood outlier factor for distributed anomaly detectionin resource constrained networksrdquo Pattern Recognition vol 47no 9 pp 2867ndash2879 2014

[15] N Lu G Zhang and J Lu ldquoConcept drift detection viacompetence modelsrdquo Artificial Intelligence vol 209 pp 11ndash282014

[16] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[17] S Seguı L Igual and J Vitria ldquoBagged one-class classifiersin the presence of outliersrdquo International Journal of PatternRecognition and Artificial Intelligence vol 27 no 5 Article ID1350014 2013

12 International Journal of Distributed Sensor Networks

[18] N Duffy and D Helmbold ldquoBoosting methods for regressionrdquoMachine Learning vol 47 no 2-3 pp 153ndash200 2002

[19] W-C Chang and C-W Cho ldquoOnline boosting for vehicledetectionrdquo IEEETransactions on SystemsMan and CyberneticsPart B Cybernetics vol 40 no 3 pp 892ndash902 2010

[20] C Desir S Bernard C Petitjean and L Heutte ldquoOne classrandom forestsrdquo Pattern Recognition vol 46 no 12 pp 3490ndash3506 2013

[21] A Fern and R Givan ldquoOnline ensemble learning an empiricalstudyrdquoMachine Learning vol 53 no 1-2 pp 71ndash109 2003

[22] A Bifet G Holmes B Pfahringer and R Gavalda ldquoImprov-ing adaptive bagging methods for evolving data streamsrdquo inAdvances in Machine Learning vol 5828 of Lecture Notes inComputer Science pp 23ndash37 Springer Berlin Germany 2009

[23] D I Curiac and C Volosencu ldquoEnsemble based sensinganomaly detection in wireless sensor networksrdquo Expert Systemswith Applications vol 39 no 10 pp 9087ndash9096 2012

[24] X Zhou S Li and Z Ye ldquoA novel system anomaly predictionsystem based on belief markov model and ensemble classifica-tionrdquo Mathematical Problems in Engineering vol 2013 ArticleID 179390 10 pages 2013

[25] H He S Chen K Li and X Xu ldquoIncremental learning fromstream datardquo IEEE Transactions on Neural Networks vol 22 no12 pp 1901ndash1914 2011

[26] D Du K Li X Li and M Fei ldquoA novel forward gene selectionalgorithm for microarray datardquo Neurocomputing vol 133 pp446ndash458 2014

[27] H Ma ldquoAn analysis of the equilibrium of migration models forbiogeography-based optimizationrdquo Information Sciences vol180 no 18 pp 3444ndash3464 2010

[28] D Simon ldquoBiogeography-based optimizationrdquo IEEE Transac-tions on Evolutionary Computation vol 12 no 6 pp 702ndash7132008

[29] S Sheen R Anitha and P Sirisha ldquoMalware detection bypruning of parallel ensembles using harmony searchrdquo PatternRecognition Letters vol 34 no 14 pp 1679ndash1686 2013

[30] Y-Y Zhang H-C Chao M Chen L Shu C-H Park and M-S Park ldquoOutlier detection and countermeasure for hierarchicalwireless sensor networksrdquo IET Information Security vol 4 no4 pp 361ndash373 2010

[31] C Peng and M-R Fei ldquoAn improved result on the stability ofuncertain T-S fuzzy systems with interval time-varying delayrdquoFuzzy Sets and Systems vol 212 pp 97ndash109 2013

[32] Y Zhang Observing the Unobservable Distributed Online Out-lier Detection inWireless Sensor Networks University of TwenteEnschede The Netherlands 2010

[33] C Peng D Yue and M Fei ldquoRelaxed stability and stabilizationconditions of networked fuzzy control systems subject toasynchronous grades of membershiprdquo IEEE Transactions onFuzzy Systems vol 22 no 5 pp 1101ndash1112 2014

[34] C Peng M-R Fei E Tian and Y-P Guan ldquoOn hold or dropout-of-order packets in networked control systemsrdquo Informa-tion Sciences vol 268 pp 436ndash446 2014

[35] M A Rassam A Zainal and M A Maarof ldquoAn adaptive andefficient dimension reduction model for multivariate wirelesssensor networks applicationsrdquo Applied Soft Computing Journalvol 13 no 4 pp 1978ndash1996 2013

[36] M Xie J Hu S Han and H-H Chen ldquoScalable hypergridk-NN-based online anomaly detection in wireless sensor net-worksrdquo IEEE Transactions on Parallel and Distributed Systemsvol 24 no 8 pp 1661ndash1670 2013

[37] Intel Berkely Reseach Lab (IBRL) dataset 2004 httpdbcsailmitedulabdatalabdatahtml

[38] J W Branch C Giannella B Szymanski R Wolff and HKargupta ldquoIn-network outlier detection in wireless sensornetworksrdquo Knowledge and Information Systems vol 34 no 1pp 23ndash54 2013

[39] M Moshtaghi T C Havens J C Bezdek et al ldquoClusteringellipses for anomaly detectionrdquo Pattern Recognition vol 44 no1 pp 55ndash69 2011

[40] S Rajasegarar J C Bezdek C Leckie and M PalaniswamildquoElliptical anomalies in wireless sensor networksrdquo ACM Trans-actions on Sensor Networks vol 6 no 1 pp 1ndash28 2009

[41] M A Rassam A Zainal and M A Maarof ldquoOne-classprincipal component classifier for anomaly detection inwirelesssensor networkrdquo in Proceedings of the 4th International Confer-ence on Computational Aspects of Social Networks (CASoN rsquo12)pp 271ndash276 IEEE Sao Carlos Brazil November 2012

[42] H Sagha H Bayati J D R Millan and R Chavarriaga ldquoOn-line anomaly detection and resilience in classifier ensemblesrdquoPattern Recognition Letters vol 34 no 15 pp 1916ndash1927 2013

[43] M Hejazi and Y P Singh ldquoOne-class support vector machinesapproach to anomaly detectionrdquo Applied Artificial Intelligencevol 27 no 5 pp 351ndash366 2013

[44] Y Zhang NMeratnia and P JMHavinga ldquoDistributed onlineoutlier detection in wireless sensor networks using ellipsoidalsupport vector machinerdquo Ad Hoc Networks vol 11 no 3 pp1062ndash1074 2013

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 9: Research Article A Novel Distributed Online Anomaly ...downloads.hindawi.com/journals/ijdsn/2015/146189.pdfin the cluster head node to obtain an optimized subset of ensemble members

International Journal of Distributed Sensor Networks 9

17

175

18

185

19

195

20Temperature

0 200 400 600 800 1000

N7N8

N9N10

(a)

38

39

40

41

42

43

44

45

46

0 200 400 600 800 1000

Humidity

N7N8

N9N10

(b)

Figure 4 The data (temperature humidity) trend during 00000 amndash75959 am on February 29 2004

Table 1 Detail dataset information of selected sensor node on 29022004

Node Initial sample Mean Variance Injected anomaly Mean Variance119879 119867 119879 119867 119879 119867 119879 119867

N7 823 184154 409176 05238 14494 30 1821 4110 054 146N8 548 179844 417123 05315 14612 30 1775 4195 055 148N9 652 181140 426295 05288 14827 30 1835 4245 055 150N10 620 181144 426215 05244 14191 30 1833 4247 054 143119879 temperature119867 humidity

TPR =

TP(TP + FN)

FPR =

FP(FP + TN)

(7)

where TP means number of samples correctly predicted asanomaly class FP means number of samples incorrectlypredicted as anomaly class TN means number of samplescorrectly predicted as normal class and FN means numberof samples incorrectly predicted as normal class

BBO is employed to prune the initial ensemble and themigration model is same as that present in [27 28] and therelated parameters are set as follows

Habitat (population) size 119878 = 30 the number of SIVs(suitability index variables) in each island 119899 = 20 40 60 80the maximum migration rates 119864 = 1 and 119868 = 1 and themutation rate 120578 = 001 and 120582 120583 are the immigration rateand the emigration rate respectively The elitism parameter120588 = 2

HSI (habitat suitability index) is a fitness function similarto other population-based optimization algorithms HIS isevaluated by 119865-measure (119865-score) which considers both the

precision probability and the recall probability of binaryclassification problem

119865-measure =(1 + 120573

2) precision lowast recall

1205732lowast precision + recall

=

(1 + 1205732) lowast TP

(1 + 1205732) lowast TP + 1205732 lowast FN + FP

(8)

119865-measure can be interpreted as a weighted average of theprecision and recall and its value reaches best at 1 and worstat 0 120573 is a parameter used to adjust the relative importancebetween precision and recall 120573 = 05 1 2 Usually thevalue of 119865-measure is close to the relative small value ofprecision and recall that is the big 119865-measuremeans that theprecision and recall are all big Consequently a good detectoris analogous to a habitat with a high HSI and is included inthe final ensemble detector and a poor detector is analogousto a habitat with a low HIS and is discarded from the finalensemble detector In our paper 120573 = 1 is specified

43 Results Presentation and Discussions In the data miningand machine learning communities SVM-based method has

10 International Journal of Distributed Sensor Networks

Table 2 Detection performance of local ensemble detector

Ensemble size N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

5 08700 05833 01181 07900 03333 01809 08267 05000 01549 08267 05714 0160810 08800 06667 01111 08033 03889 01702 08267 04375 01514 08333 06429 0157315 08900 07500 01042 08167 05000 01631 08433 05000 01373 08600 07143 0132920 08933 08333 01042 08200 05000 01596 08367 05000 01444 08567 07143 01364

Table 3 Detection performance of global ensemble detector [7]

Combined ensemble size N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

20 09467 08333 00486 09300 07778 00603 09467 07500 00423 09500 07857 0042040 09700 07500 00208 09433 08333 00496 09710 08938 00246 09650 08929 0031560 09700 08333 00243 09733 08889 00213 09800 09375 00176 09783 09357 0019680 09817 09583 00174 09800 09444 00177 09767 09375 00211 09780 09714 00217

been widely used in classification problem which separatesthe data belonging to the different classes by fitting a hyper-plane One class SVM based method as a variation of thismethod is especially favored for anomaly detection [42ndash44]In the paper it was used to train the base detectorThe datasetof each sensor node was divided into two parts about 66was used for training the local detector and the remainder asthe test set was to evaluate proposed method

Online Bagging the commonly used ensemble strategywas used to build initial ensemble detector Our experimentsaim to achieve two goals Firstly it is to prove the effectivenessof proposed method based on ensemble learning theorySecondly it is to prove that pruned ensemble detector canobtain better (at least equal) performance compared to initialensemble detector and mitigate the resource requirementAs a result three experiments were done that is localensemble anomaly detector only considering the temporalcorrelation of each sensor node global ensemble anomalydetector considering the spatiotemporal correlation and theglobal pruned ensemble anomaly detector based on BBOThe experimental results can be seen in Tables 2 3 and 4respectively

Table 2 shows the performance of each sensor node underthe different ensemble size which does not take into accountthe spatial correlation of sensed data in a cluster Though theensemble detection performance is becoming ldquogoodrdquo gradualwith the increasing of ensemble size (the higher value ofACCTPR the better performance and the lower value of FPR thebetter performance) the overall performance is relatively lowThemaximumvalue of detection accuracy is only 8933 andmost of true positive rates are unacceptable and most of falsepositive rates (FPR) have a relative high valueAll these resultsindicate that the performance of local ensemble detectoris poor Table 3 shows the global detection performance ofeach sensor node Here after the local ensemble detectorwas trained each member node sent its local ensembleto each other to form the global ensemble detector andeach member node used this global detector to online testthe local observation From the results of Table 3 [7] an

obvious fact is that the detection performances are higherthan presented in Table 2With the help of neighbor detectorthe detection results become better and better correspondingto the increasing of ensemble size

In order to further optimize the proposed algorithmperformance and save the resource ensemble pruning is usedfor global ensemble detector Table 4 [7] shows the result ofdetection performance of pruned global ensemble detectorbased on BBO

Table 4 shows a more practicable result and the sizeof global ensemble decreases sharply while the detectorperformance is as good as or better than the initial globalensemble detector From the results of Table 5 when thesize of initial ensemble reaches 80 the 60 resource costis saved In our experiment only for validating the methodeffectively we set the ensemble sizes 5 10 15 and 20 for eachlocal ensemble detector which may be small for the practicalapplications In fact how many local ensemble detectors arechosen is an open topic and is decided by many factors suchas the computation capability and the communication cost aswell as memory usage of sensor node the expected detectingaccuracy requirement and so on In the practical applicationa trade-off is commonly considered

5 Conclusion and Future Work

After exploiting the spatiotemporal correlation existing inthe sensed data of WSNs and motivated by the advantagesof online ensemble learning a distributed online ensembleanomaly detector method has been proposed Due to thespecific resource constrained in theWSNs ensemble pruningbased on BBO is employed to mitigate the high resourcerequirement and obtain the optimized detector that performsat least as good as the original ones The experimental resultson real dataset demonstrated that our proposed method iseffective

Because the diversity of base learners is a key factorrelated to the performance of ensemble learning as a possibleextension of our work we plan to include some diversity

International Journal of Distributed Sensor Networks 11

Table 4 Detection performance of global ensemble detector based on BBO pruning [7]

Ensemble size(BBO pruned)

N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR

14 09480 08000 00458 09327 07667 00567 09500 08125 00423 09533 08571 0042023 09710 07750 00208 09447 08000 00461 09733 09250 00239 09697 09143 0027627 09713 08500 00236 09683 08333 00230 09810 09563 00176 09797 09357 0018232 09820 09750 00177 09750 08333 00160 09820 09500 00162 09830 09786 00168

Table 5 Rate of saving resource cost based on global ensembledetector of BBO pruned

Number Initial ensemblesize

Prunedensemble size

Saving resourcecost

1 20 14 302 40 23 4253 60 27 554 80 32 60

measures in fitness function to improve the detecting per-formance in future Besides the cost of communication isthe main reason of quick energy depletion of sensor nodesespecially for the cluster head the adaptive selection of clusterhead based on energy state will be taken into account tolengthen the lifetime of WSNs in next work

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by the National Key ScientificInstrument and Equipment Development Project(2012YQ15008703) the Zhejiang Provincial Natural ScienceFoundation of China (LY13F020015) the Open Project of TopKeyDiscipline of Computer Software andTheory in ZhejiangProvincial (ZC323014100) National Science Foundation ofChina (61473182) Science and Technology Commission ofShanghai Municipality (11JC1404000 14JC1402200) andShanghai Rising-Star Program (13QA1401600)

References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159-170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373-1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620-2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1-5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34-40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302-1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160-169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97-136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239-263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1-36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10-12, pp. 2196-2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589-1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44-57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran, et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867-2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11-28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123-140, 1996.
[17] S. Seguí, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153-200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892-902, 2010.
[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490-3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71-109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23-37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087-9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901-1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446-458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444-3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702-713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679-1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361-373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97-109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101-1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436-446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing Journal, vol. 13, no. 4, pp. 1978-1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661-1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23-54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek, et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55-69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1-28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271-276, IEEE, Sao Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916-1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351-366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062-1074, 2013.


