

Information Fusion 10 (2009) 342–353

Contents lists available at ScienceDirect

Information Fusion

journal homepage: www.elsevier.com/locate/inffus

Distributed data source verification in wireless sensor networks

Mauro Conti a,*, Roberto Di Pietro b, Luigi V. Mancini a, Alessandro Mei a

a Università di Roma “La Sapienza”, Dipartimento di Informatica, Via Salaria 113, 00198 Roma, Italy
b Università di Roma “Tre”, Dipartimento di Matematica, Largo San Leonardo Murialdo 1, 00146 Roma, Italy

Article info

Article history:
Received 20 December 2006
Received in revised form 3 September 2007
Accepted 23 January 2009
Available online 4 February 2009

Keywords:
Wireless sensor networks
Securing data fusion
Clone detection
Distributed protocol

1566-2535/$ - see front matter © 2009 Elsevier B.V. All rights reserved.
doi:10.1016/j.inffus.2009.01.002

* Corresponding author. Tel.: +39 06 49918421; fax: +39 06 8541842.
E-mail addresses: [email protected] (M. Conti), [email protected] (R. Di Pietro), [email protected] (L.V. Mancini), [email protected] (A. Mei).

Abstract

In-network data aggregation is favorable for wireless sensor networks (WSNs): it allows in-network data processing while reducing network traffic, hence saving the sensors' energy. However, due to the distributed and unattended nature of WSNs, several attacks aiming at compromising the authenticity of the collected data could be perpetrated. For example, an adversary could capture a node to create clones of the captured one. These clones, disseminated through the network, could provide malicious data to the aggregating node, thus poisoning or disrupting the aggregation process. In this paper we address the problem of detecting cloned nodes, a requirement to be fulfilled to provide authenticity of the data fusion process.

First, we analyze the desirable properties a distributed clone detection protocol should meet. Specifically: it should avoid having a single point of failure; the load should be totally distributed across the nodes in the network; the position of the clones in the network should not influence the detection probability. We then show that current solutions do not meet these requirements. Next, we propose the Information Fusion Based Clone Detection Protocol (ICD). ICD is a probabilistic, completely distributed protocol that efficiently detects clones. ICD combines two cryptographic mechanisms: pseudo-random key pre-distribution, usually employed to secure pairwise node communications, and a sparing use of asymmetric crypto primitives. We show that ICD meets all the requirements mentioned above and compare its performance with current solutions in the literature; experimental results show that ICD performs better than existing solutions for all the cost parameters considered: number of messages sent, per-sensor storage requirement, and signature verifications. These savings allow the network operating lifetime to be increased. Finally, note that the ICD protocol could be used as an independent layer by any data aggregation mechanism.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

A Wireless Sensor Network (WSN) is a collection of sensors with limited resources that collaborate to achieve a common goal. A WSN can be deployed in harsh environments to fulfil both military and civil applications [1]. In WSN applications, data-centric mechanisms that perform in-network aggregation of data are needed for energy-efficient information flow [30,39]. These mechanisms also allow in-network data processing, hence decreasing the time that the network needs to react to the sensed data. WSNs are often unattended, hence prone to different kinds of novel attacks. For instance, an adversary could eavesdrop on the exchanged messages and capture nodes, acquiring all the information stored in the devices (sensors are assumed not to be tamper-proof [1]). Further, the adversary could clone captured nodes and create multiple nodes with the same identity. The clones could then be deployed in the network area and, for instance, subvert the data aggregation or the decision making in the network if it is based on some voting mechanism [9,18,20,36]. An adversary interested in modifying the data the network stores, processes or sends to a sink (base station) can even leverage the data aggregation mechanism to go undetected. In fact, in several aggregation or storage protocols the aggregator/destination node is selected in a pseudo-random way [39]: influencing this decision mechanism, as well as providing several bogus readings, could poison the data fusion process. Indeed, for an aggregated value, no information on the source nodes that contributed to that value is usually kept (for efficiency reasons). Then, a cloned node could send malicious data to a randomly chosen aggregator. The aggregator has no way to identify a source node as a corrupted one by analyzing received messages only: indeed, the clone could use the credentials of the original node (for example its secret private key). A similar attack, the sybil attack [18,36], consists of claiming multiple existing identities stolen from corrupted nodes. Both sybil and clone attacks result in identity theft. While the former can be efficiently addressed with mechanisms based on RSSI (Received Signal Strength Indicator) [12] or with authentication based on the knowledge of a fixed set of keys [9–11,15], efficient detection of clone attacks is still an open issue. To the best of our knowledge, [37] is the first distributed clone detection solution. Before it, only centralized or localized protocols were proposed: while the former have a single point of failure, the latter might not detect replicated nodes distributed in different areas of the network.

This paper provides several contributions. First, we analyze the desirable properties of distributed mechanisms for the detection of replication attacks. Within this framework, we analyze the first protocol for distributed detection, recently proposed in [37], and show that the protocol does not meet the identified requirements. Hence, the efficient and distributed detection of node-replication attacks remains an open issue. Second, we propose the Information Fusion Based Clone Detection Protocol (ICD): a completely distributed, probabilistic protocol for the detection of node-replication attacks. The ICD protocol is designed to be executed in conjunction with a data fusion protocol to enforce authenticity of the data to be fused. The detection of a cloned node results in a distributed revocation procedure. The ICD protocol combines two different security mechanisms: random key pre-deployment and a sparing use of digital signatures. We show that ICD meets the requirements mentioned above. In particular, experimental results show that ICD outperforms the solutions in the literature, since it requires sending and storing fewer messages as well as verifying fewer signatures. Note that these features save energy, hence extend the operating life of the WSN. Finally, note that the proposed protocol could be adopted as an independent layer by any data aggregation mechanism.

The remainder of this paper is organized as follows. The next section reviews related work and introduces some background notions; Section 3 illustrates the threat model assumed in the paper; Section 4 discusses the requirements of protocols for the detection of node identity replicas in WSNs. In Section 5 we introduce and detail the ICD protocol, while in Section 6 we perform extensive simulations to compare the performance of ICD with state-of-the-art solutions and discuss these findings. Finally, Section 7 presents some concluding remarks.

2. Related work and background

2.1. Security in information fusion

Early work on data aggregation for WSNs took for granted that every node of the network is honest [21,28]. This optimistic assumption has been relaxed in [27], which addresses the problem of aggregating data when one node in the network is compromised. However, the protocol in [27] is vulnerable when as few as two nodes (neighbors in the tree used to convey the data to the base station) collude. Further, the protocol is expensive: each aggregator has to forward a number of messages that is proportional to the number of contributors.

The work in [38] enables the base station to verify whether the aggregated value provided by the network is a good approximation of the values actually collected by the sensors, even when a fraction of the sensor nodes are corrupted. However, this is achieved by an interactive proof between the aggregators and the base station. Hence, due to the computational and communication cost incurred by this scheme, the aggregators should be more powerful than common low-end sensors. The work in [41] analyses the intrinsic degree of security of various aggregating functions (computing the average, the maximum, the median, and so on) in terms of their robustness in the presence of maliciously modified data. In some cases insecure functions can be replaced by similar, more robust ones (e.g. computing the median should replace computing the average), usually with an increase in computational cost.

The Concealed Data Aggregation (CDA) concept was first introduced in [24,25]. CDA is a solution for end-to-end encryption that uses homomorphic encryption to enforce secure data aggregation. In [43] this notion is extended: the authors present a generalization of the approach for a class of routing protocols and also propose a key pre-distribution algorithm that limits the adversary's capability of disrupting the confidentiality of the aggregated data. Another work [7], building upon the concepts introduced in [24,25], proposes a new scheme that addresses both confidentiality and efficiency issues in data aggregation algorithms. This scheme relies on a simple but provably secure homomorphic encryption function. However, some drawbacks affect this scheme: first, it is not robust against node compromise, that is, a single node failure can disrupt the whole computation; second, the data packet is large; finally, there is the need to piggyback toward the base station either the IDs of all the nodes that contributed to the computation or the IDs of the nodes that did not contribute.

Finally, the watchdog mechanism was introduced in [33] in the context of wireless mobile networks as a mechanism to detect nodes that misbehave in routing packets. The same mechanism has been adopted to enforce cooperation among nodes [35,23].

2.2. Clone detection

One of the first solutions for replicated node detection in WSNs relies on a centralized base station [20]. In this solution, each node sends a list of its neighbors and their claimed locations to a base station. The same entry in two lists sent by nodes that are not “close” to each other results in a replica detection. Then, the base station revokes the replicated node. This solution has several drawbacks, for instance a single point of failure (the base station) and a high communication cost due to the large number of messages. Other solutions rely on local detection [9]: using a localized voting mechanism, a set of neighbors can agree on the replication of a given node that has been replicated within the neighborhood. However, this kind of method fails to detect replicated nodes that are not within the same neighborhood.

A naïve distributed solution for the node-replication attack uses Node-To-Network Broadcasting [37]. Each node floods the network with a message containing its location information and compares the received location information with that of its neighbors. If a neighbor $s_w$ of node $s_a$ receives a location claim stating that the same node $s_a$ is in a position not coherent with the position of $s_a$ detected by $s_w$, this results in a clone detection. However, this method is energy consuming, since each node is required to send $O(n)$ messages, where $n$ is the size of the network.

To the best of our knowledge, the first globally-aware distributed node-replication detection solution was recently proposed in [37]. In particular, two distributed detection protocols have been proposed. The first one, Randomized Multicast (RM), distributes node location information to randomly selected nodes. The second one, Line-Selected Multicast (LSM), uses the routing topology of the network to detect replication. In RM, when a node announces its location, each of its neighbors sends (with probability $p$) a digitally signed copy of the location claim to a set of randomly selected nodes. If every neighbor selects a given number of claim destinations, that is $O(\sqrt{n})$, then, exploiting the birthday paradox [34], with high probability at least one node, the witness, will receive a pair of non-coherent location claims (that is, a node is detected in two different locations in the same time-frame). The RM protocol implies a high communication cost: each neighbor has to send $O(\sqrt{n})$ messages. To solve this problem the authors propose the LSM protocol. In the LSM protocol, when a node announces its location, every neighbor forwards this location claim with probability $p$ (with probability $1 - p$ no operation is performed). If the neighbor forwards the claim, it randomly selects a fixed number $g$ of destination nodes and sends the signed claim to all of them. Moreover, every node that routes this claim message stores the message and checks its coherence with the other location claims received within the same iteration of the detection protocol. If, during a check, the same node $s_a$ is present with at least two non-coherent locations, the witness triggers a revocation protocol for node $s_a$.
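To make the birthday-paradox argument behind RM concrete, the following minimal sketch computes the probability that two conflicting location claims reach at least one common node. It assumes, for illustration only, that each claim ends up at m distinct nodes chosen uniformly and independently among the n nodes; the function name `collision_probability` and this simplified model are assumptions of the sketch and are not taken from [37].

```python
import math


def collision_probability(n, m):
    """Probability that two independent, uniformly chosen sets of m distinct
    nodes (out of n) have at least one node in common."""
    if 2 * m > n:
        # Two sets of m distinct nodes cannot be disjoint (pigeonhole).
        return 1.0
    # P(no common node) = C(n - m, m) / C(n, m)
    return 1.0 - math.comb(n - m, m) / math.comb(n, m)


if __name__ == "__main__":
    n = 1000
    for m in (8, 16, 32, 64):
        print(m, round(collision_probability(n, m), 3))
```

Under these assumptions, with n = 1000 and m = 32 (about the square root of n) the probability is roughly 0.65, and it grows quickly as m increases, which is why on the order of $\sqrt{n}$ destinations per claim are enough to obtain a witness with good probability.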

2.3. Pair-wise key establishment and key pre-deployment

In many applications, especially when communication confidentiality is an issue, pair-wise secure communication is a fundamental building block.

Many recent research threads have focused on finding distributed solutions for pair-wise key establishment based on symmetric keys. We can classify these solutions into deterministic and probabilistic ones. For deterministic solutions, see [5,8,32]. However, none of these is completely satisfactory. In [32], the adversary only needs to corrupt a constant number of nodes to disrupt the confidentiality of the whole network. In [5], the authors acknowledge that, given a fixed key-ring size, the number of sensors in the network is limited. Finally, in [8], each sensor is required to store $O(\sqrt{n})$ keys; moreover, the number of sensors that belong to the same network must be known at design time. To overcome these limitations some probabilistic solutions have been proposed. The idea was first introduced in [20]. In the proposed solution, each of the $n$ sensors is assigned $k$ symmetric encryption keys randomly selected from a common pool of $P$ keys (key pre-deployment phase). When two sensors need to communicate securely, they must first find out which keys (if any) of the pool they share (shared key discovery phase). Then, they compute a common key as a function of the shared keys (pairwise-key establishment phase). This latter key is used to secure the channel by using a symmetric key encryption algorithm. Recent solutions based on pseudo-random key assignment were presented in [9,13,14,19,32,44]. Among these, the Efficient and Secure Pre-deployment (ESP) scheme [14] guarantees an efficient and secure shared key discovery phase without any message exchange, and also probabilistic authentication between any pair of nodes (a node proves its identity by proving knowledge of the keys it is supposed to hold).
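The three phases of the probabilistic approach (key pre-deployment, shared key discovery, pairwise-key establishment) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the pool is generated with Python's non-cryptographic `random` module for reproducibility, and SHA-256 over the shared keys is an assumed choice for the "function of the shared keys" mentioned in the text.

```python
import hashlib
import random


def make_pool(pool_size, rng):
    """Common pool of P symmetric keys (16 random bytes each; demo only,
    `random` is not a cryptographically secure generator)."""
    return [rng.getrandbits(128).to_bytes(16, "big") for _ in range(pool_size)]


def assign_key_ring(pool_size, ring_size, rng):
    """Key pre-deployment: k distinct key indexes drawn at random from the pool."""
    return set(rng.sample(range(pool_size), ring_size))


def shared_key_indexes(ring_a, ring_b):
    """Shared key discovery: indexes of the keys held by both sensors."""
    return sorted(ring_a & ring_b)


def pairwise_key(pool, shared):
    """Pairwise-key establishment: hash of the shared keys (assumed function).
    Returns None when the two sensors share no key."""
    if not shared:
        return None
    digest = hashlib.sha256()
    for index in shared:
        digest.update(pool[index])
    return digest.digest()


if __name__ == "__main__":
    rng = random.Random(42)
    pool = make_pool(1000, rng)
    ring_a = assign_key_ring(len(pool), 75, rng)
    ring_b = assign_key_ring(len(pool), 75, rng)
    shared = shared_key_indexes(ring_a, ring_b)
    print(len(shared), pairwise_key(pool, shared) is not None)
```

Because both sensors sort the shared indexes before hashing, they derive the same pairwise key regardless of which side runs the computation.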

In our proposal we use the ESP scheme as a basic tool; the core idea behind ESP is briefly reviewed in Section 5.1. For full details on ESP, refer to [15]. Table 1 summarizes the notation used in this paper.

3. Threat model

Table 1. Summary of the notation.

Symbol                 Meaning
$n$                    number of sensors in the WSN
$r$                    sensor's communication radius
$d$                    average neighborhood size
$g$                    number of destinations for every location claim (LSM)
$p$                    claiming probability (LSM)
$K^{priv}_a(\cdot)$    signature function invoked by sensor $ID_a$
$K^{pub}_a(\cdot)$     verification of the signature possibly generated by sensor $ID_a$
$K^{priv}_a$           asymmetric private key of sensor $ID_a$
$K^{pub}_a$            asymmetric public key of sensor $ID_a$
$k$                    key-ring size
$l_a$                  location of node $ID_a$
$P$                    pool size
$H(\cdot)$             hash function

We devise a simple yet powerful adversary: before a round of the replica detection protocol is invoked, the adversary can compromise a certain fixed number of sensors. To cope with this threat, one could assume that sensors are tamper-proof. However, consistently with a large part of the literature, we assume that sensors do not have tamper-proof components and that they can be captured. The adversary's goal is to prevent the replicated sensors under its control from being detected. Hence, we assume that the adversary will try to subvert those sensors that will possibly act as witnesses. To formalize the adversary model, we introduce the following definition.

Definition 3.1. Assume that the adversary's goal is to subvert the distributed detection protocol by compromising a possibly small subset $T$ of the sensors. The adversary has already compromised a set of sensors $W$, while $N$ is the initial set of sensors in the WSN. For every sensor $s$ in the WSN, the sensor appeal $S(s)$ returns the probability that $s \in N \setminus W$ is a witness for the next round.

We define two adversaries, both of which tamper with sensors sequentially:

(i) The oblivious adversary: at each step of the attack sequence, the next sensor to be tampered with is chosen randomly among the ones that have yet to be compromised;

(ii) The smart adversary: at each step of the attack sequence, the next sensor to be tampered with is the sensor $s$ that maximises $S(s)$, $s \in N \setminus W$.

Intuitively, the oblivious adversary does not take advantage of any information about the protocol used by the network. Conversely, the smart adversary greedily chooses which sensor to corrupt (the one that maximises its appeal) in order to maximize the chance for its replicas to go undetected.
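The difference between the two adversaries can be illustrated with a short sketch. It assumes, as a simplification, that the appeal of every sensor is given as a fixed dictionary (in the definition above it may change as more sensors are compromised); all function names are hypothetical.

```python
import random


def oblivious_next(candidates, rng):
    """Oblivious adversary: next victim chosen uniformly at random among
    the sensors not yet compromised."""
    return rng.choice(sorted(candidates))


def smart_next(candidates, appeal):
    """Smart adversary: next victim is the not-yet-compromised sensor with
    the highest appeal S(s)."""
    return max(sorted(candidates), key=lambda s: appeal[s])


def attack_sequence(sensors, appeal, steps, smart, rng=None):
    """Return the ordered list of sensors compromised in `steps` steps."""
    rng = rng or random.Random()
    compromised = []            # the set W, in order of compromise
    remaining = set(sensors)    # sensors in N that are not in W
    for _ in range(min(steps, len(remaining))):
        if smart:
            victim = smart_next(remaining, appeal)
        else:
            victim = oblivious_next(remaining, rng)
        compromised.append(victim)
        remaining.remove(victim)
    return compromised


if __name__ == "__main__":
    appeal = {"s1": 0.1, "s2": 0.7, "s3": 0.2}  # assumed appeal values
    print(attack_sequence(appeal, appeal, 2, smart=True))
    print(attack_sequence(appeal, appeal, 2, smart=False, rng=random.Random(1)))
```

If the appeal is the same for every sensor, the greedy choice gives the smart adversary no advantage over the oblivious one, which is the situation a detection protocol should aim for.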

4. Requirements for distributed detection

4.1. Witnesses distribution

An important problem when designing a protocol to detect the replica attack is how the witnesses, the nodes that detect the attack, are selected. Indeed, assume that the adversary could precisely predict which nodes are going to be the witnesses during an attack. In this case, the adversary could subvert these nodes beforehand, and the attack would go undetected. The prediction can be such that the adversary is able to compute the IDs of the witnesses or their geographic position in the network. In the former case we talk of ID predictability; in the latter, of geographic predictability.

We say that a protocol for replica detection assures ID obliviousness if the adversary cannot compute any information on the identity of the witnesses once provided with the public parameters of the protocol and the IDs of the sensors in the network. To introduce the concept of geographic predictability, assume that the probability $S(s_i)$ depends on the geographical position of sensor $s_i$ within the network. In this case, the adversary can concentrate its effort on a subset of the sensors, based on their position in the network area. We can thus introduce the concept of geographic obliviousness: a protocol guarantees geographic obliviousness if the probability $S(s_i)$ does not depend on the geographical position of sensor $s_i$ in the network.

4.2. Overhead

Designing protocols for WSNs is a challenging task due to the resource constraints: any protocol is required to generate little overhead to preserve the sensors' batteries. Moreover, overhead should be evenly distributed among sensors; otherwise, even if a protocol generates little overhead on each sensor on average, it is possible that a subset of the sensors is so heavily overwhelmed with computations and communications that it quickly runs out of battery power, possibly creating serious problems for the network functionality. A more subtle consideration regards local memory. Assume that a protocol, due to uneven distribution of memory overhead, requires some of the sensors to use more memory than is available on the device. In this case, it is possible that the protocol efficiency drops considerably, or that it simply fails to detect replicas.

We can summarize the above discussion with the general requirement that the overhead generated by the protocol should be small, that is, sustainable by the WSN as a whole, and (almost) evenly distributed among sensors. As a concrete example, in the LSM protocol [37] every sensor that forwards a position claim should also store the message. Since every line-segment is of length $O(\sqrt{n})$ on average, every node stores $O(\sqrt{n})$ location claims on average. Note that this could be impractical in a real network with thousands of nodes. Table 2 shows the asymptotic memory overhead of one round of the LSM protocol. The second row of the table reports the average overhead for a network of 1,000 sensors deployed in the unit square with transmitting radius $r = 0.1$ (31 neighbors on average), $p = 0.1$, and $g = 1$. Finally, the third row shows the overhead experienced by the sensor with maximal load: some sensors in the network are required to use a much higher amount of memory than predicted by the average.

Table 2. LSM overhead.

             Memory occupancy                        Sent messages                           Received messages                       Signature checks
Asymptotic   $O(g \cdot p \cdot d \cdot \sqrt{n})$   $O(g \cdot p \cdot d \cdot \sqrt{n})$   $O(g \cdot p \cdot d \cdot \sqrt{n})$   $O(g \cdot p \cdot d \cdot \sqrt{n})$
Average      20.33                                   22.08                                   49.84                                   21.08
Max          197                                     216                                     252                                     223

5. ICD protocol

In this section we propose a new information-fusion based clone detection protocol (ICD). Our solution integrates two different security mechanisms: the pseudo-random key pre-deployment introduced in [14,15] and the limited use of digital signatures, as proposed in [37]. Before describing our protocol in Sections 5.2 and 5.3, in Section 5.1 we review the ESP mechanism used in our proposal.

5.1. Details on the ESP protocol

ESP works as follows. Consider a sensor $a$ and a pool of symmetric keys $P$. For every key $k^P_i$ of the pool, compute $z = f_y(a \| k^P_i)$, where $f_y$ is a pseudo-random function, that is, an efficient (deterministic) algorithm which, given an $h$-bit seed $y$ and an $h$-bit argument $x$, returns an $h$-bit string, denoted $f_y(x)$, so that it is infeasible to distinguish the responses of $f_y$, for a uniformly chosen $y$, from the responses of a truly random function. Then, put $k^P_i$ into the key ring of $a$ if and only if $z \equiv 0 \bmod (|P|/k)$. Two aspects of the assignment phase should be underlined:

– Due to physical memory limitations, whenever the number of keys assigned by this mechanism exceeds the number of keys the node can store, the corresponding node ID should be discarded and a new one should be used.

– Due to security requirements, whenever the number of keys assigned by this mechanism is lower than $\alpha k$, where $\alpha$ is a security parameter, the corresponding node ID should be discarded and a new one should be used.
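The assignment rule and the two ID-rejection conditions above can be sketched as follows. This is an illustration, not the ESP implementation: HMAC-SHA256 keyed with the seed y is an assumed stand-in for the pseudo-random function $f_y$, node IDs are assumed to be fixed-length byte strings, k is assumed to divide |P|, and the names `prf`, `esp_key_ring` and `id_is_acceptable` (with its security parameter called `alpha`) are hypothetical.

```python
import hashlib
import hmac


def prf(seed, message):
    """Stand-in for f_y: HMAC-SHA256 keyed with the seed y (an assumption)."""
    return int.from_bytes(hmac.new(seed, message, hashlib.sha256).digest(), "big")


def esp_key_ring(node_id, pool, seed, k):
    """Indexes i of the pool keys k_i such that f_y(a || k_i) = 0 mod (|P|/k)."""
    modulus = len(pool) // k  # assumes k divides |P|
    return [i for i, key in enumerate(pool)
            if prf(seed, node_id + key) % modulus == 0]


def id_is_acceptable(ring, k, max_keys, alpha):
    """An ID is kept only if its ring fits in memory (at most max_keys keys)
    and holds at least alpha * k keys; otherwise a new ID should be drawn."""
    return alpha * k <= len(ring) <= max_keys


if __name__ == "__main__":
    # Toy pool of 256 keys; the expected ring size is about k = 16.
    pool = [hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(256)]
    ring = esp_key_ring(b"node-a", pool, b"seed-y", 16)
    print(len(ring), id_is_acceptable(ring, k=16, max_keys=32, alpha=0.5))
```

Each pool key passes the test with probability about k/|P|, so the ring size is k on average but varies from ID to ID, which is why the two rejection rules are needed.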

ESP supports a very efficient key discovery procedure. Consider a sensor $b$ that wants to know which keys it shares with sensor $a$. For every key $k^b_j$ in its key ring, sensor $b$ computes $z = f_y(a \| k^b_j)$. Then, by testing $z \equiv 0 \bmod (|P|/k)$, $b$ discovers whether sensor $a$ also has key $k^b_j$ or not. Indeed, only those who already know key $k^P_i$ can know whether $k^P_i$ is in the key ring of $a$ or not. This is computationally infeasible for all other entities, since $f_y$, being a pseudo-random function, is also one-way and thus hard to invert [26]. For this reason, from the ID of a node an adversary can acquire neither the keys stored by this node nor the corresponding key indexes: $f_y(x)$ is applied to the actual value of the key, not to the corresponding key index. For the same reason, a node can authenticate itself by proving that it knows the keys it is supposed to hold (the keys depend on the assigned ID).
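The key discovery step needs no message exchange: sensor b only evaluates the pseudo-random function on the ID of a and on its own keys. The sketch below illustrates this under the same assumptions as before, namely HMAC-SHA256 as a stand-in for $f_y$, fixed-length byte-string IDs, and k dividing |P|; the function names are hypothetical.

```python
import hashlib
import hmac


def prf(seed, message):
    """Stand-in for f_y: HMAC-SHA256 keyed with the seed y (an assumption)."""
    return int.from_bytes(hmac.new(seed, message, hashlib.sha256).digest(), "big")


def key_ring(node_id, pool, seed, k):
    """Keys of the pool assigned to node_id by the rule z = 0 mod (|P|/k)."""
    modulus = len(pool) // k
    return [key for key in pool if prf(seed, node_id + key) % modulus == 0]


def discover_shared_keys(own_ring, other_id, seed, pool_size, k):
    """Run locally by a sensor: which of my keys does `other_id` also hold?"""
    modulus = pool_size // k
    return [key for key in own_ring
            if prf(seed, other_id + key) % modulus == 0]


if __name__ == "__main__":
    pool = [hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(200)]
    seed = b"seed-y"
    ring_a = key_ring(b"a", pool, seed, 20)
    ring_b = key_ring(b"b", pool, seed, 20)
    # Sensor b learns the common keys knowing only the ID of a and its own ring.
    shared = discover_shared_keys(ring_b, b"a", seed, len(pool), 20)
    print(len(ring_a), len(ring_b), len(shared))
```

The result computed by b coincides with the intersection of the two key rings, and a obtains the same set by running the procedure on its own ring with the ID of b.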

5.2. High level protocol description

The first security mechanism integrated in our protocol is the ESP pseudo-random key pre-distribution [14] introduced in Section 2.3 and described in the previous section. Note that ESP relies on the pseudo-random key pre-distribution scheme; for this scheme it has been shown in [16,17,31] that nodes are required to store $O(\log n)$ keys: a storage overhead that can be considered negligible compared to that of the LSM protocol, as discussed in Section 6.3.

The second security mechanism is a public-key cryptosystem where the public key of every node can be publicly derived from the node ID, an idea originally proposed in [4]. Each node stores its own private key of the asymmetric scheme and the symmetric keys assigned with the ESP mechanism. We also use hashing to randomize the location of the witnesses, an idea similar to that employed in [39] to provide an efficient implementation of the data-centric storage paradigm.

Every run of ICD works as follows. Every node in the network broadcasts a location claim signed with its own private key. The location claim contains the ID and the current location of the claiming node. We remark that we make use of asymmetric cryptography only once per run (that is, in a sporadic way). Each node within the transmitting radius $r$ of the claiming node computes the symmetric keys shared (if any) with the claiming node. If at least one key is shared, for each of these keys a network location is computed via a pseudo-random function seeded with the ID of the claiming node, the specific shared key, and the value of the current protocol round. We need to consider the protocol round, the node ID, and the shared key(s) to identify the targeted network location for the following reasons:

– The protocol round is used to induce a (probabilistic) change in the witnesses for a node at every protocol iteration.

– The ID is used in order to distribute the task of being a witness for a given key (a key can belong to nodes with different IDs).

– The shared key is used to prevent an adversary from computing the witnesses of a node based only on the knowledge of the ID and the protocol round.
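The way a neighbor could derive the destination of a claim from the three inputs discussed above can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the deployment area is the unit square, HMAC-SHA256 keyed with the shared key plays the role of the pseudo-random function, the round is encoded on 8 bytes, and the name `witness_location` is hypothetical.

```python
import hashlib
import hmac


def witness_location(node_id, shared_key, round_number):
    """Map (claiming node ID, shared key, protocol round) to a point (x, y)
    of the unit square, deterministically and pseudo-randomly."""
    message = node_id + round_number.to_bytes(8, "big")
    digest = hmac.new(shared_key, message, hashlib.sha256).digest()
    # Use two disjoint 8-byte chunks of the digest as the two coordinates.
    x = int.from_bytes(digest[:8], "big") / 2**64
    y = int.from_bytes(digest[8:16], "big") / 2**64
    return (x, y)


if __name__ == "__main__":
    # Same ID, key and round: every neighbor computes the same destination.
    print(witness_location(b"node-a", b"shared-key", 3))
    # A new round (or another shared key) moves the witness elsewhere.
    print(witness_location(b"node-a", b"shared-key", 4))
```

Since the shared key is used as the HMAC key, someone who knows only the node ID and the round cannot evaluate the function, which mirrors the third reason listed above.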

Suppose node $a$ is cloned: let $a'$ be its clone and $k_z$ a key shared between $a$ and one of its neighbors $b$. Also, suppose the same key $k_z$ is shared between the node $a'$ and one of its neighbors $b'$. In this way, abiding by the protocol, the nodes $b$ and $b'$ will send the position claims signed by $a$ and $a'$ to the same network location. Observe that, for each key shared between a neighbor $b$ and the claiming node $a$, the claim is sent to a different network location. For example, if $a$ and $b$ share four keys, the node $b$ will send the claim of the position of $a$ to four pseudo-randomly selected network locations. Further, note that since the destination is also a function of the value of round, the four destinations selected at protocol round $t$ will be independent of the four destinations selected at protocol round $t + 1$. Finally, note that if the event where the neighbors $b$ and $b'$ do not share any key occurs, the cloning will not be detected. In order for the clone attack to be detected, at least one witness should receive the claim from both $b$ and $b'$. Even if this could seem a weakness of our detection protocol, the use of pseudo-random key pre-distribution thwarts the negative event described above, as supported by simulation results showing our detection probability.

Fig. 1. Pseudo-code of the ICD protocol.

346 M. Conti et al. / Information Fusion 10 (2009) 342–353

In order to compare our protocol with LSM we assume (as the LSM authors do) that the communication between nodes is reliable and that the routing will deliver a message destined to a network location to the node closest to this location [6,29,39].

Note that the ICD protocol also works in contexts where nodes can be mobile.

5.3. Protocol description

The ICD protocol executes at fixed time intervals; that is, we assume that nodes are loosely synchronized [22,40]. We further assume that all the nodes store, in the variable round, the number of elapsed time intervals. The pseudo-code is given in Fig. 1. We use the following notation to indicate a message sent from node $a$ to node $b$: $a \rightarrow b : \langle M \rangle$. The message $M$ can also be indicated by listing its components separated by commas: $\langle \text{msg-part}_1, \ldots, \text{msg-part}_n \rangle$. The notation $\langle var_1, \ldots, var_n \rangle \leftarrow M$ indicates the assignment of the components of the message $M$ to the variables $var_1, \ldots, var_n$. The destination of a message can also be expressed as a network location $l_{dst}$. The rationale of this choice is clarified in the following protocol description. Finally, the output produced by the procedures invoked in the protocol is as follows:

– neighborsOf($a$): Gives the set of neighbors of node $a$.

– ReceiveMessage($M$): Lets the invoking node receive a message $M$, if one exists.

– IsClaim($M$), IsForwardedClaim($M$): Check whether $\text{msg-part}_3$ is equal to "IsClaim" or "ForwardedClaim", respectively.

– IsShared($k_i$, $ID_x$): Checks whether $k_i$ is in the key ring of node $ID_x$.

– PseudoRand($ID_x$, $k_i$, $round_x$): Pseudo-randomly generates an $l_{dst}$ value as a function of the input values.

– IsNotPresent($MEM$, $ID_x$), Add($MEM$, values), LookUpLocation($MEM$, $ID_x$): These functions respectively check whether an entry corresponding to a given $ID_x$ is absent from the node's local memory, add the given values to the node's memory, and retrieve from the node's local memory the last location associated with a given $ID_x$.

– IsNotCoherent($l_x$, $l_y$): Checks whether the two given locations are too far apart to be coherent with a single node position (that is, whether these locations are not within the communication range of the node).

We assume that every node stores a key ring of symmetric keys assigned in a pseudo-random way as described in [14]. The size of the key ring (that is, the number of keys stored per node) varies between $0.8k$ and $k$. We also assume that each node is assigned an asymmetric key pair (that is, the private key and the corresponding public key). While the private key of a node is locally stored, the public key of a node $i$ can be computed by every node through a publicly known function based on the ID of node $i$.

After a time-out (Fig. 1, line i), each node digitally signs and broadcasts the hash (computed via the function $H(\cdot)$) of its own position claim: ID and location (line ii). For each node, each of its $d$ neighbors checks the symmetric keys shared with the claiming node (line vii). Note that this operation can be performed without any message exchange thanks to the properties of the key assignment we use [14]. For every claim message received and every symmetric key shared with the claiming node, the neighbor pseudo-randomly selects a network location (line viii). The input to the pseudo-random function is given by the ID of the claiming node, the shared symmetric key, and the value of the current protocol round.

Loose time synchronization among nodes is used to derive the current protocol round. The current protocol round makes it possible to distinguish among different invocations of the protocol and to refresh, at each invocation, the set of possible witnesses.

Note that a claim is sent to a network location: we want to avoid sending a claim to a specific node ID (the same solution is used in [37]) because this kind of solution is not robust. Indeed, a claim sent to a witness ID that is no longer present in the network would be lost; further, nodes deployed in successive steps after the first network deployment could not act as witnesses without updating information in all the nodes. Using geographical locations instead overcomes these problems.

Fig. 2. Example of sensors deployment and 5% incremental areas. n = 1000.

Fig. 3. Example of LSM Protocol iteration: n = 1000, r = 0.1, g = 1, p = 0.1.

Every node signs its claim message with its private key before sending it (Fig. 1, line ii). The relay nodes on the way to the destination are not required to add any signature or to store any message. For every received claim, the witness node:

– verifies the signature;

– checks the freshness of the message by looking at the value of the protocol round.

For every genuine message that passes the previous checks, the witness node extracts the information (ID and location) and proceeds as follows:

– If this is the first claim received for this ID, it simply stores the message (Fig. 1, line xxi);

– otherwise, it checks whether the claimed location is coherent with the other claims stored for this ID (Fig. 1, line xxiv). If it is not, the witness node triggers a revocation procedure for the given ID (line xxv).

Finally, when the time-out expires, the node frees the memory.
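The witness-side checks described above can be sketched as follows. This is a simplified illustration rather than the pseudo-code of Fig. 1: signature verification is reduced to a boolean input, coherence is assumed to mean a Euclidean distance within the communication range, and the names `process_claim` and `is_not_coherent` are hypothetical.

```python
import math


def is_not_coherent(loc_a, loc_b, comm_range):
    """True when two claimed locations are too far apart for a single node.

    Assumption: coherence means Euclidean distance <= communication range.
    """
    distance = math.hypot(loc_a[0] - loc_b[0], loc_a[1] - loc_b[1])
    return distance > comm_range


def process_claim(memory, node_id, location, claim_round, current_round,
                  signature_ok, comm_range=0.1):
    """Handle one forwarded claim at a witness and return the action taken."""
    if not signature_ok:
        return "drop: bad signature"
    if claim_round != current_round:
        return "drop: stale"
    if node_id not in memory:
        memory[node_id] = location  # first claim seen for this ID
        return "stored"
    if is_not_coherent(memory[node_id], location, comm_range):
        return "revoke"             # same ID claimed at incompatible locations
    return "coherent"
```

The dictionary `memory` plays the role of the node's local memory and would be emptied when the time-out expires.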

6. Simulations and discussion

In this section we compare ICD with LSM in order to assess the compliance of these protocols with the requirements in Section 4, that is: ID obliviousness; geographic obliviousness; low and balanced overhead; and replica attack detection probability. In the following, we assume that the deployment area is the unit square [2,3,16]. In the simulations shown in the next three sections we consider a network composed of 1000 nodes and a communication range of 0.1. For the LSM protocol we use p = 0.1 and g = 1, where p is the probability for a neighbor to send a position claim and g is the number of witnesses chosen as destinations of that claim. We choose the ICD protocol parameters k (number of symmetric keys stored by each node) and P (size of the pool from which symmetric keys are chosen) in such a way that the underlying WSN is securely connected. In particular, as proved in [17,31], a choice of k and P such that $k^2/P = O(\log n / n)$ assures secure connectivity with high probability. Since in the following we will consider a WSN of 1000 nodes, our parameter choice is k = 10, P = 1000. We use ESP as key pre-deployment mechanism, setting a = 0.8 (that is, no node has fewer than ka = 8 symmetric keys as a result of the ESP node construction phase) (see Table 3).

6.1. Witnesses distribution

Due to randomization, it is straightforward to verify that both the LSM and ICD protocols are ID oblivious: in both protocols the IDs of the witness nodes are randomly and uniformly selected among all the nodes. In order to assess geographic obliviousness, we study the distribution of the witnesses as follows: we select larger and larger sub-areas of the network, where each sub-area is larger than the previous one by 5% of the total area (as in Fig. 2), and for each sub-area we count the number of witnesses after a run of the detection protocol.

Fig. 3 shows the result of one iteration of the LSM protocol: the large filled circles indicate the cloned nodes, the small filled circles indicate relay nodes, and the empty circles indicate the witnesses. In Fig. 4 the same result is shown for one iteration of ICD. Comparison between Figs. 3 and 4 suggests that the LSM protocol uses a higher number of relay nodes than the ICD protocol. Also, the witnesses are located near the center for the LSM protocol, while this is not the case for ICD. In the following we will see through extensive simulations that this phenomenon is not accidental, and we will also see how it affects the performance of the two protocols.

Table 3. ICD overhead.

| | Memory occupancy | Sent messages | Received messages | Signature check |
|---|---|---|---|---|
| Asymptotic | $O(1)$ | $O\left(\frac{k^2}{P} \cdot d \cdot \sqrt{n}\right)$ | $O\left(\frac{k^2}{P} \cdot d \cdot \sqrt{n}\right)$ | $O(1)$ |
| Average | 2.05 | 18.19 | 46.00 | 2.34 |
| Max | 18 | 133 | 170 | 21 |

We simulated 10,000 different network deployments. For each deployment we randomly select two nodes and assign them the same ID. For each deployment, and with the same pair of cloned nodes, we execute a single LSM iteration and a single ICD iteration. After each of these iterations we localize the witness nodes in the two protocols: for each of the 20 incremental sub-areas we compute the percentage of witnesses present, with respect to the total number of witnesses. Finally, we plot the average. The x-axis of Fig. 5 indicates the percentage of the network area considered, while the y-axis reports the corresponding percentage of the witnesses in that area.
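The per-area counting described above can be sketched as follows, assuming that the incremental sub-areas are concentric squares centered in the unit square (as the axis label of Fig. 5 suggests). The function names are hypothetical and the witness coordinates are supplied by the caller; this is not the simulator used for the reported results.

```python
import math


def area_index(x, y, steps=20):
    """Index (1..steps) of the smallest concentric square containing (x, y).

    Square i covers a fraction i/steps of the unit-square area.
    """
    half_side = max(abs(x - 0.5), abs(y - 0.5))
    area_fraction = (2.0 * half_side) ** 2
    return min(steps, max(1, math.ceil(area_fraction * steps)))


def cumulative_witness_share(witnesses, steps=20):
    """Percentage of witnesses inside each of the growing sub-areas."""
    counts = [0] * steps
    for x, y in witnesses:
        counts[area_index(x, y, steps) - 1] += 1
    total = len(witnesses)
    if total == 0:
        return [0.0] * steps
    shares, running = [], 0
    for count in counts:
        running += count
        shares.append(100.0 * running / total)
    return shares
```

For a uniform witness distribution each entry grows by about 5 percentage points, which corresponds to the diagonal behavior of an ideal protocol.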

Fig. 4. Example of ICD Protocol iteration: n = 1000, r = 0.1, P = 1000, k = 10.


It is interesting to note that, for the LSM protocol, the central area corresponding to 20% of the network area ($A_1$) collects more than 50% of the witnesses, while the outermost area corresponding to 20% of the network area ($A_2$) contains only 1.75% of all the witnesses. LSM is therefore not area-oblivious, since $S(s_i) \gg S(s_j)$ for an $s_i$ selected from $A_1$ and an $s_j$ selected from $A_2$. Due to the pseudo-random choice of witness nodes in the ICD protocol, it is straightforward to prove that ICD has a uniform witness distribution. The simulations reported in Fig. 5 support the fact that the behavior of the ICD protocol corresponds to that of an ideal protocol: the witnesses are equally distributed in the network area. The ICD protocol has geographic obliviousness.

6.2. Threat analysis

Note that the described protocol can be subject to a possible threat: once an adversary has captured a node, the adversary knows the symmetric keys stored on that node; hence, the adversary could determine the IDs of the witnesses for any given future protocol round. If the adversary could tamper with these witnesses, the replica would go undetected.

Fig. 5. Witness density: n = 1000, r = 0.1. LSM: p = 0.1, g = 1. ICD: P = 1000, k = 10, a = 0.8. [Plot: percentage of witnesses within the area (y-axis) versus percentage of the network area, taken as concentric squares (x-axis); curves: LSM Protocol, ICD Protocol.]

In the following we show that, for the adversary to go undetected, it should corrupt a significant fraction of all the nodes in the network. Indeed, first note that at each protocol round the set of witnesses will be different, with non-negligible probability. Further, recall that for each protocol round the set of witnesses for a given ID is randomly chosen from all of the network nodes (this is due to the PseudoRand function of line viii of the algorithm in Fig. 1). Hence, for the adversary to remain undetected, it has to compromise an increasing number of nodes. Based on the above considerations, we argue that this threat is not realistic: as time goes by, the adversary should capture almost all of the network nodes to escape detection. These considerations are supported by the following analytical results.

Let $X_{i,j}$ be the random variable that takes on the value 1 if the node $i$ is selected as a witness during any of the first $j$ rounds, and 0 otherwise. It can be shown that $\Pr[X_{i,j} = 0] = (1 - w/n)^j$, where we assume that $w$ witnesses have been involved by node $i$ for each of the $j$ rounds. In the following, we denote by $X_j$ the random variable that describes the number of different nodes that have been selected as witness up to round $j$, that is, $X_j = \sum_{i=1}^{n} X_{i,j}$. Now, let us evaluate $E[X_j]$:

$$
\begin{aligned}
E[X_j] &= E\left[\sum_{i=1}^{n} X_{i,j}\right] = n \Pr[X_{1,j} = 1] = n\left(1 - \Pr[X_{1,j} = 0]\right) \\
&= n\left(1 - \left(1 - \frac{w}{n}\right)^{j}\right) \ge n\left(1 - e^{-wj/n}\right) \qquad (1)
\end{aligned}
$$

In Fig. 6 we plot the result from Eq. (1) for a suitable range of the parameters, coherent with the parameters of the figures shown in this paper. In particular, for a low number of rounds, the increase is approximately proportional to $wj$. It follows that the number of different witnesses to capture is strictly increasing with the protocol rounds, confirming our analysis.
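As a numerical illustration of Eq. (1), the following sketch evaluates both the exact expression $n(1 - (1 - w/n)^j)$ and the exponential lower bound $n(1 - e^{-wj/n})$. The function names are hypothetical; the values n = 1000 and w = 2, 4, 8 mirror the setting of Fig. 6.

```python
import math


def expected_distinct_witnesses(n, w, j):
    """Exact expression of Eq. (1): n * (1 - (1 - w/n)^j)."""
    return n * (1.0 - (1.0 - w / n) ** j)


def exponential_lower_bound(n, w, j):
    """Lower bound of Eq. (1): n * (1 - exp(-w*j/n))."""
    return n * (1.0 - math.exp(-w * j / n))


if __name__ == "__main__":
    n = 1000
    for w in (2, 4, 8):
        for j in (1, 10, 100, 1000):
            exact = expected_distinct_witnesses(n, w, j)
            bound = exponential_lower_bound(n, w, j)
            print(f"w={w} j={j}: exact={exact:.1f} bound={bound:.1f}")
```

For small $j$ the exact value stays close to $wj$, and it saturates toward $n$ as the number of rounds grows.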

6.3. Storage overhead

In order to evaluate the distribution of the storage requirements, Fig. 7 reports the number of messages the nodes are required to store for the LSM and the ICD protocols, respectively. For a given number of messages in memory reported on the x-axis, we report on the y-axis the percentage of the nodes that are required to store that number of messages. The values were obtained by averaging the results of 10,000 simulations.

Fig. 6. Number of different witnesses involved for several protocol iterations: n = 1000. [Plot: number of different witnesses (y-axis) versus protocol round (x-axis); curves: ICD (w = 2), ICD (w = 4), ICD (w = 8).]

Fig. 8. Exhausted nodes in different iterations: n = 1000, r = 0.1. LSM: p = 0.1, g = 1. ICD: P = 1000, k = 10, a = 0.8. [Plot: percentage of exhausted sensors (y-axis) versus iterations (x-axis); curves: LSM Protocol, ICD Protocol.]

Fig. 7. Used memory: n = 1000, r = 0.1. LSM: p = 0.1, g = 1. ICD: P = 1000, k = 10, a = 0.8. [Plot: percentage of sensors storing the fixed number of messages (y-axis) versus number of messages in the sensor's memory (x-axis); curves: LSM Protocol, ICD Protocol.]

Fig. 9. Exhausted nodes distribution after 200 iterations: n = 1000, r = 0.1. LSM: p = 0.1, g = 1. ICD: P = 1000, k = 10, a = 0.8. [Plot: percentage of exhausted sensors (y-axis) versus areas (x-axis); curves: LSM Protocol, ICD Protocol.]


Note that for LSM some nodes could be required to store as many as 200 messages. We decided not to report the values exceeding 100 stored messages. Nevertheless, Fig. 7 shows that the LSM protocol requires 2.6% of the nodes to store more than 60 messages, 8.8% of the nodes to store between 40 and 59 messages, and 30.3% of the nodes to store between 20 and 39 messages. Since each message carries 512 bits (a digital signature and the list of neighbors), 2.6% of the sensors would require more than 512 × 60 = 30,720 bits. Note that the Mica2 motes provide only 4 kB of RAM [1]; that is, more than 92% of the memory would be used only to store messages related to the detection protocol. For the ICD protocol only a negligible percentage of nodes (0.13%) needs to store more than 10 messages. Moreover, Fig. 7 shows that for the ICD protocol just 5% of the nodes need to store more than 5 messages, and less than 30% of the nodes need to store between 3 and 5 messages. It is interesting to note that 47% of the nodes store only one or two messages, while 20.18% of the nodes do not need to store any message at all. Observe that for LSM only 0.2% of the nodes do not store any message.

As discussed in Section 5.3, we assume that the symmetric keys stored for the pair-wise key establishment are not a direct cost of our protocol. However, referring to the simulation results shown, note that each node needs to store at most 10 pre-deployed symmetric keys. Assuming 128-bit keys, this implies only 160 bytes of memory dedicated to storing symmetric keys. Even if we consider this cost as a direct overhead of our proposed protocol, it is still negligible compared to the LSM memory overhead. For example, more than 40% of the nodes in the LSM protocol need to store at least 20 messages, that is, at least 1280 bytes. Finally, note that in general the pair-wise key establishment protocol requires only a small number of stored symmetric keys [17,31].
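The storage figures quoted above follow from simple unit conversions, sketched below. The message size (512 bits), key size (128 bits), and key-ring size (10) are taken from the text; interpreting "4 kB" as 4096 bytes and the function names are assumptions made for illustration.

```python
MESSAGE_BITS = 512     # one stored claim (signature and neighbor list)
RAM_BYTES = 4 * 1024   # assumption: "4 kB" read as 4096 bytes


def messages_to_bytes(count, message_bits=MESSAGE_BITS):
    """Memory needed to store `count` claim messages, in bytes."""
    return count * message_bits // 8


def ram_fraction(count):
    """Fraction of the mote RAM taken by `count` stored messages."""
    return messages_to_bytes(count) / RAM_BYTES


def key_ring_bytes(keys=10, key_bits=128):
    """Memory needed by the symmetric key ring, in bytes."""
    return keys * key_bits // 8


if __name__ == "__main__":
    print(messages_to_bytes(60), ram_fraction(60))  # LSM, 60 stored messages
    print(messages_to_bytes(20))                    # LSM, 20 stored messages
    print(key_ring_bytes())                         # ICD key ring
```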

The number of signatures is proportional to the number of messages stored, shown in Fig. 7. Further, the number of messages sent is proportional to the number of messages received. Note that transmission is a rather battery-consuming operation [42]; the impact of message transmission is detailed in the following section.

6.4. Energy consumption

In LSM every forwarding node is required to verify the signature of the claim, while in ICD only the witnesses are required to perform this operation. Note that digital signature verification comes with an energy cost [42]. The number of signatures required is proportional to the number of messages stored, shown in Fig. 7. Further, the number of messages sent is proportional to the number of messages received. Note that transmission is a battery-consuming operation [42]. In the following simulation we considered the energy model proposed in [42]: a node battery of 324,000 mJ; 45.0 mJ to sign a packet (or to check a signature); 15.104 mJ to send a packet and 7.168 mJ to receive it (assuming a packet length of 32 bytes).
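The energy model above translates directly into a per-round budget. The sketch below uses the constants quoted from [42]; the accounting (one signature for the node's own claim plus the signature checks, sends, and receives of a round) and the function names are assumptions, not the simulator behind the reported figures.

```python
BATTERY_MJ = 324_000.0     # node battery
SIGN_OR_VERIFY_MJ = 45.0   # sign a packet or check a signature
SEND_MJ = 15.104           # send one 32-byte packet
RECEIVE_MJ = 7.168         # receive one 32-byte packet


def energy_per_round(sent, received, signature_checks, own_signatures=1):
    """Energy (mJ) spent by one node in one protocol round."""
    crypto = (own_signatures + signature_checks) * SIGN_OR_VERIFY_MJ
    radio = sent * SEND_MJ + received * RECEIVE_MJ
    return crypto + radio


def rounds_until_exhausted(per_round_mj):
    """Number of full rounds a fresh battery can sustain."""
    return int(BATTERY_MJ // per_round_mj)


if __name__ == "__main__":
    # Average ICD load per node as listed in Table 3 (illustrative use only).
    avg = energy_per_round(sent=18.19, received=46.00, signature_checks=2.34)
    print(avg, rounds_until_exhausted(avg))
```

Since the load is not uniform across nodes, heavily loaded nodes would exhaust their battery well before the average node does.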

The different computational overhead of LSM and ICD results in different energy consumption. Fig. 8 shows this behavior. After 100 iterations started from the same deployment condition, 20% of the nodes are exhausted for the LSM protocol, while for the ICD protocol all the nodes are still alive. After 150 iterations, LSM shows 40% of exhausted nodes, while ICD shows only 0.1%. Finally, after 200 iterations LSM shows 50% of exhausted nodes, while for ICD this percentage is less than 2%. It is also interesting to note the different distributions of exhausted nodes in the network deployment area. Fig. 9 shows the distribution after 200 protocol iterations. The x-axis indicates the network sub-areas (as plotted in Fig. 2), numbered sequentially from the innermost to the outermost. The y-axis indicates the percentage of exhausted nodes in the area. For the LSM protocol it is interesting to note the small increase in the percentage of exhausted nodes for the areas closer to the center (from the 1st to the 5th one). We explain this behavior as follows: after a certain number of protocol iterations some nodes in the central area become isolated, even if these nodes do not run out of battery and are thus alive. The same phenomenon happens with an increasing degree of mitigation, moving from the center to the border. The parameters that we chose for the ICD protocol yield almost the same detection probability at the first iteration (0.36 for LSM, 0.41 for ICD). However, different distributions of node exhaustion also imply different clone attack detection probabilities, as shown in Fig. 10: this figure shows the detection probability at different iterations (x-axis). In particular, we plot the detection probability for the first 200 iterations. Plotted values are computed by averaging over 10,000 network deployments. Each single deployment is evaluated for both the LSM and the ICD protocol.

Fig. 10. Detection probability: n = 1000, r = 0.1. LSM: p = 0.1, g = 1. ICD: P = 1000, k = 10, a = 0.8. [Plot: detection probability (y-axis) versus iterations (x-axis); curves: LSM Protocol, ICD Protocol.]

Fig. 11. Detection probability: n = 1000, r = 0.1. LSM: p = 0.2, g = 1. ICD: a = 0.8. [Plot: detection probability (y-axis) versus iterations (x-axis); curves: LSM Protocol, ICD Protocol (k = 20, P = 2000), ICD Protocol (k = 30, P = 4500).]

Fig. 12. Used memory: n = 1000, r = 0.1. LSM: p = 0.2, g = 1. ICD: a = 0.8. [Plot: percentage of sensors storing the fixed number of messages (y-axis) versus number of messages in the sensor's memory (x-axis); curves: LSM Protocol, ICD Protocol (k = 20, P = 2000), ICD Protocol (k = 30, P = 4500).]

It is interesting to note the strong correlation between the percentage of exhausted nodes (Fig. 8) and the detection probability (Fig. 10). For the LSM protocol node exhaustion starts after 50 iterations; at the same iteration number, the detection probability starts decreasing. A similar behavior can also be observed for ICD.

The experiments reported in this section indicate that the implementation of LSM does not match the requirement of balancing the overhead (almost) evenly among the nodes. Conversely, the ICD protocol matches the requirement of low, balanced overhead, and also shows a better node-replication attack detection capability. Further, ICD is more efficient than LSM; this is also reflected in a better detection probability and in a longer operational life of the network.

6.5. Witnesses and clone detection

In this subsection we analyze how the number of claim messages for each node affects clone detection in both ICD and LSM. Further, we study the behavior of the ICD protocol when the size of the key ring increases.

To increase the number of claim messages per node in LSM, we set p = 0.2 rather than p = 0.1. In a similar way, for the ICD protocol we consider values of k and P such that the ratio $k^2/P$ is doubled with respect to the ratio considered in the previous set of experiments. Note that the ratio $k^2/P$ is the average number of keys shared between two nodes [16]. To analyze the behavior of ICD as the key ring size increases, we consider $\langle k = 20, P = 2000 \rangle$ and $\langle k = 30, P = 4500 \rangle$.
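The claim that $k^2/P$ is the average number of shared keys can be checked with a small experiment. The sketch below models key rings as uniformly random subsets of the pool, which is an assumption (ESP assigns keys pseudo-randomly); the function names are hypothetical.

```python
import random


def expected_shared_keys(k, pool_size):
    """Average number of keys shared by two nodes: k^2 / P."""
    return k * k / pool_size


def simulated_shared_keys(k, pool_size, trials=5000, seed=0):
    """Monte Carlo estimate using two uniformly random key rings per trial."""
    rng = random.Random(seed)
    pool = range(pool_size)
    total = 0
    for _ in range(trials):
        ring_a = set(rng.sample(pool, k))
        ring_b = rng.sample(pool, k)
        total += sum(1 for key in ring_b if key in ring_a)
    return total / trials


if __name__ == "__main__":
    for k, pool_size in ((10, 1000), (20, 2000), (30, 4500)):
        print(k, pool_size, expected_shared_keys(k, pool_size),
              simulated_shared_keys(k, pool_size))
```

Both new settings give a ratio of 0.2, twice the 0.1 obtained with k = 10 and P = 1000.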

The first plot that we analyze is the detection probability for the two protocols, reported in Fig. 11. We can notice that in the first few runs of the detection protocol, the detection probability is almost doubled in LSM. This supports the intuition that doubling the number of witnesses doubles the detection probability. However, after a few runs of the protocol, the detection probability for LSM drops dramatically. After the 50th run of the protocol, the detection probability is lower than the detection probability experienced in the first set of experiments (Fig. 10). Indeed, more messages mean more overhead and therefore higher energy consumption. This implies a faster node battery exhaustion and hence a lower detection probability.

As for the ICD protocol, the detection probability is lower than the one shown by LSM for the first few runs. However, after 60 runs of the protocol, the detection probability of ICD is always greater than that of LSM. After 80 runs, the detection probability of ICD is at least twice the detection probability of LSM. In the first few runs of the protocols, LSM performs better than ICD because of the witness selection mechanism adopted by ICD. Indeed, for a given ratio $k^2/P$, the number of claim messages sent by a node is fixed (on average). However, the total number of potential witnesses for a node depends on k as well, and not only on the ratio $k^2/P$. In particular, a larger key ring implies a lower probability that two neighbors of a given node select the same witness for that node. There is a kind of dispersion of claim messages. This dispersion decreases the number of possible witnesses, which in turn affects the detection probability.

Fig. 15. Energy consumption in presence of a DoS attack. [Plot: overall energy consumption in milliJoule (y-axis) versus number of attack messages (x-axis); curves: LSM, ICD.]


If we focus on the two parameter settings of the ICD protocol, we can notice that the two curves exhibit exactly the same qualitative behavior. Moreover, the curve plotted with k = 20 has a better detection probability than the one with k = 30. This agrees with the above observation on the dispersion of witnesses.

As for the memory requirements, note that for both LSM and ICD the curves in Fig. 12 show the same behavior: the number of nodes that need to store fewer messages decreases with respect to Fig. 7, while the number of nodes that need to store more messages increases. For instance, comparing Figs. 7 and 12 for LSM, the percentage of nodes storing 5 messages moves from 3.59 to 0.82, while the percentage of network nodes storing 40 messages moves from 0.77 to 1.23. While we can notice a similar qualitative behavior in node storage requirements for ICD as well, note that the vast majority of nodes in ICD still require a small amount of local storage, showing excellent scalability.

Focusing on the number of exhausted sensors, note that the plot reported in Fig. 13 shows the same qualitative behavior as the plot in Fig. 8. However, the values on the y-axis are reached twice as fast. This reflects the observation that sending twice the number of claim messages halves the lifetime of a node.

Finally, in Fig. 14 we report the area distribution of exhausted sensors. Again, the LSM protocol shows a considerable concentration of exhausted nodes in the central areas. As for ICD, this phenomenon is much smoother; that is, the exhausted sensors are more evenly distributed across the different areas. However, while LSM reflects the same behavior shown in Fig. 9, this is not the case for ICD. This is due to the fact that routing still requires nodes in the central area to route more messages, hence anticipating their battery exhaustion.

Fig. 13. Exhausted nodes in different iterations: n = 1000, r = 0.1. LSM: p = 0.2, g = 1. ICD: a = 0.8. [Plot: percentage of exhausted sensors (y-axis); curves: LSM Protocol, ICD Protocol (k = 20, P = 2000), ICD Protocol (k = 30, P = 4500).]

Fig. 14. Exhausted nodes distribution after 200 iterations: n = 1000, r = 0.1. LSM: p = 0.2, g = 1. ICD: a = 0.8. [Plot: percentage of exhausted sensors (y-axis) versus areas (x-axis); curves: LSM Protocol, ICD Protocol (k = 20, P = 2000), ICD Protocol (k = 30, P = 4500).]

6.6. Influence of per-hop signature verification on preventing DoS attack

In this section we quantify the cost of a Denial of Service (DoS) attack against our proposed protocol compared to the cost incurred by LSM when subject to a similar attack. The scenario is the following: the WSN is required to route 100 genuine messages. Together with these messages, we assume that the adversary injects a given number of bogus messages, that is, messages that will not pass the signature verification check. In our simulations, we varied the number of bogus messages from 0 to 200 (x-axis). Our objective is to analyze the overall network energy consumption (in milliJoule, y-axis). We assume a path length of $\lceil \sqrt{n} \rceil$, while the energy costs to send, receive, sign, and verify a message are the same as in the previous simulations. Fig. 15 shows the result. We can notice that, as long as the number of bogus messages is less than 180, the overall cost is favorable to ICD in comparison with LSM.
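The trade-off between per-hop and end-to-end signature verification can be illustrated with a deliberately simple cost model. The hop-level accounting below is an assumption (every relay receives and forwards, LSM-style relays also verify and drop bogus traffic at the first hop, ICD-style traffic is verified only at the destination); it is not the model used to produce Fig. 15 and is not expected to reproduce its values or its break-even point. The function names are hypothetical.

```python
import math

SEND_MJ = 15.104
RECEIVE_MJ = 7.168
VERIFY_MJ = 45.0


def path_length(n):
    """Assumed path length in hops: ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n))


def per_hop_verification_cost(genuine, bogus, hops):
    """Every relay verifies, so bogus traffic is dropped at the first hop."""
    genuine_cost = genuine * hops * (RECEIVE_MJ + VERIFY_MJ + SEND_MJ)
    bogus_cost = bogus * (RECEIVE_MJ + VERIFY_MJ)
    return genuine_cost + bogus_cost


def end_to_end_verification_cost(genuine, bogus, hops):
    """Only the destination verifies, so bogus traffic is relayed all the way."""
    relay_cost = hops * (RECEIVE_MJ + SEND_MJ)
    return (genuine + bogus) * (relay_cost + VERIFY_MJ)


if __name__ == "__main__":
    hops = path_length(1000)
    for bogus in (0, 50, 100, 150, 200):
        print(bogus,
              per_hop_verification_cost(100, bogus, hops),
              end_to_end_verification_cost(100, bogus, hops))
```

Under these assumptions, end-to-end verification is cheaper when there is no attack, while each bogus message costs it more than it costs the per-hop scheme, so a crossover exists for a sufficiently large number of bogus messages.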

Finally, one should also notice that if the adversary can send a number of bogus messages that is double the number of genuine messages, the network is completely under adversary control. That is, in a realistic scenario, the number of bogus messages injected would be far less than double the number of genuine messages; hence our protocol shows better DoS resilience.

7. Concluding remarks

In this paper we presented a few fundamental requirements that an ideal protocol for distributed detection of node replicas should have. Note that such a protocol could enforce authentication of collected data, preventing cloned nodes from injecting bogus data, hence securing a specific aspect of information fusion. In particular, we have introduced the preliminary notions of ID obliviousness and geographic obliviousness, which convey a measure of the quality of a node identity replica detection algorithm. Moreover, we have indicated that the overhead of such a protocol should be not only small, but also evenly distributed among nodes; otherwise the protocol itself could significantly impact both the network lifetime, because of the energy required by the exchanged messages and the computations performed, and the effectiveness of the protocol itself, if the memory requirements exceed the storage available to the sensor. Further, we have analyzed the state-of-the-art solution for node identity replica detection, and we have shown that it does not completely satisfy the requirements described above. To overcome these issues, we have proposed the Information Fusion Based Clone Detection Protocol (ICD). ICD is a probabilistic, scalable, distributed protocol that efficiently detects cloned nodes. In particular, ICD combines two different security primitives (random key pre-deployment and sparing use of asymmetric cryptography) to match all the requirements highlighted above. Extensive simulations show that ICD is particularly efficient, and that its performance is quite robust across different sets of parameters. Finally, note that the ICD protocol could be used as an independent layer by any data fusion mechanism.

Acknowledgement

The authors would like to thank the anonymous reviewers for their comments that helped improve the quality of the paper.

References

[1] Ian F. Akyildiz, Weilian Su, Yogesh Sankarasubramaniam, Erdai Cayirci,Wireless sensor networks: a survey, International Journal of Computer andTelecommunications Networking Elsevier 38 (4) (2002) 393–422.

[2] Christian Bettstetter, On the minimum node degree and connectivity of awireless multihop network, in: Proceedings of the 3rd ACM InternationalSymposium on Mobile Ad Hoc Networking and Computing (MobiHoc’02),2002, pp. 80–91.

[3] Christian Bettstetter, Christian Hartmann, Connectivity of wireless multihopnetworks in a shadow fading environment, in: Proceedings of the 6th ACMInternational Workshop on Modeling, Analysis and Simulation of Wireless andMobile Systems (MSWiM’03), 2003, pp. 28–32.

[4] Dan Boneh, Matthew Franklin, Identity-based encryption from the weilpairing, SIAM Journal on Computing 32 (3) (2003) 586–615.

[5] Seyit A. Camtepe, Bulent Yener, Combinatorial design of key distributionmechanisms for wireless sensor networks, in: Proceedings of the 9th EuropeanSymposium On Research Computer Security (ESORICS’04), vol. 3193 of LNCS,Springer, 2004, pp. 293–308.

[6] Antonio Caruso, Alessandro Urpi, Stefano Chessa, Swades De, Gps-freecoordinate assignment and routing in wireless sensor networks, in:Proceedings of the 24th Annual Joint Conference of the IEEE Computer andCommunications Societies (INFOCOM’05), 2005, pp. 150–160.

[7] Claude Castelluccia, Einar Mykletun, Gene Tsudik, Efficient aggregation ofencrypted data in wireless sensor networks, in: Proceedings of the SecondACM/IEEE Annual International Conference on Mobile and UbiquitousSystems: Networking and Services (Mobiquitous’05), San Diego, CA, USA,July 2005, pp. 109–117.

[8] Haowen Chan, Adrian Perrig, PIKE: peer intermediaries for key establishmentin sensor networks, in: Proceedings of the 24th Annual Joint Conference of theIEEE Computer and Communications Societies (INFOCOM’05), 2005, pp. 524–535.

[9] Haowen Chan, Adrian Perrig, Dawn Song, Random key predistribution schemesfor sensor networks, in: Proceedings of the 2003 IEEE Symposium on Securityand Privacy (S&P’03), 2003, pp. 197–213.

[10] Mauro Conti, Roberto Di Pietro, Luigi Vincenzo Mancini, Secure cooperativechannel establishment in wireless sensor networks, in: Proceedings of theFourth Annual IEEE International Conference on Pervasive Computing andCommunications Workshops (PERCOMW’06), IEEE Computer Society,Washington, DC, USA, 2006, pp. 327–331.

[11] Mauro Conti, Roberto Di Pietro, Luigi Vincenzo Mancini, ECCE: enhanced cooperative channel establishment for secure pair-wise communication in wireless sensor networks, Journal of Ad Hoc Networks Elsevier 5 (1) (2007) 49–62. January.

[12] Murat Demirbas, Youngwhan Song, An RSSI-based scheme for Sybil attack detection in wireless sensor networks, in: 1st Workshop on Advanced EXPerimental Activities on Wireless Networks and Systems (EXPONWIRELESS 2006), 2006, pp. 564–570.

[13] Roberto Di Pietro, Luigi Vincenzo Mancini, Alessandro Mei, Random key-assignment for secure wireless sensor networks, in: Proceedings of the 1st ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN’03), 2003, pp. 62–71.

[14] Roberto Di Pietro, Luigi Vincenzo Mancini, Alessandro Mei, Efficient and resilient key discovery based on pseudo-random key pre-deployment, in: Proceedings of the 18th IEEE International Parallel and Distributed Processing Symposium (IPDPS’04), 2004, pp. 217–224.

[15] Roberto Di Pietro, Luigi Vincenzo Mancini, Alessandro Mei, Energy efficient node-to-node authentication and communication confidentiality in wireless sensor networks, Wireless Networks 12 (6) (2006) 709–721.

[16] Roberto Di Pietro, Luigi Vincenzo Mancini, Alessandro Mei, Alessandro Panconesi, Jaikumar Radhakrishnan, Connectivity properties of secure wireless sensor networks, in: Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN’04), 2004, pp. 53–58.

[17] Roberto Di Pietro, Alessandro Mei, Luigi V. Mancini, Alessandro Panconesi, Jaikumar Radhakrishnan, Sensor networks that are provably resilient, in: Proceedings of the 2nd IEEE International Conference on Security and Privacy for Emerging Areas in Communication Networks (SecureComm’06), Baltimore, MD, USA, August 2006, pp. 1–10.

[18] John R. Douceur, The Sybil attack, in: Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS’01), Springer, 2002, pp. 251–260.

[19] Wenliang Du, Jing Deng, Yunghsiang S. Han, Pramod K. Varshney, A pairwise key pre-distribution scheme for wireless sensor networks, in: Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS’03), 2003, pp. 42–51.

[20] Laurent Eschenauer, Virgil D. Gligor, A key-management scheme for distributed sensor networks, in: Proceedings of the 9th ACM Conference on Computer and Communications Security (CCS’02), 2002, pp. 41–47.

[21] Mike Esler, Jeffrey Hightower, Thomas E. Anderson, Gaetano Borriello, Next century challenges: data-centric networking for invisible computing, in: Proceedings of the 5th Annual International Conference on Mobile Computing and Networking (MobiCom’99), 1999, pp. 256–262.

[22] Emerson Farrugia, Robert Simon, An efficient and secure protocol for sensor network time synchronization, Journal of Systems and Software 79 (2) (2006) 147–162.

[23] Michal Feldman, Kevin Lai, Ion Stoica, John Chuang, Robust incentive techniques for peer-to-peer networks, in: Proceedings of the 5th ACM Conference on Electronic Commerce (EC’04), ACM Press, New York, NY, USA, 2004, pp. 102–111.

[24] Joao Girao, Markus Schneider, Dirk Westhoff, CDA: concealed data aggregation in wireless sensor networks, in: Proceedings of the 3rd ACM Workshop on Wireless Security (WiSe’04), Philadelphia, USA, October 2004.

[25] Joao Girao, Dirk Westhoff, Markus Schneider, CDA: concealed data aggregation for reverse multicast traffic in wireless sensor networks, in: Proceedings of the 2005 IEEE International Conference on Communications (ICC2005), Seoul, Korea, May 2005.

[26] Oded Goldreich, Foundations of Cryptography: Basic Tools, Cambridge University Press, 2001. August.

[27] Lingxuan Hu, David Evans, Secure aggregation for wireless networks, in: Proceedings of the 2003 Symposium on Applications and the Internet Workshops (SAINT’03 Workshops), IEEE Computer Society, Washington, DC, USA, 2003, pp. 384.

[28] Chalermek Intanagonwiwat, Deborah Estrin, Ramesh Govindan, John Heidemann, Impact of network density on data aggregation in wireless sensor networks, in: ICDCS’02: Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS’02), IEEE Computer Society, Washington, DC, USA, 2002, pp. 457.

[29] Brad Karp, H.T. Kung, GPSR: greedy perimeter stateless routing for wireless networks, in: Proceedings of the 6th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom’00), 2000, pp. 243–254.

[30] Bhaskar Krishnamachari, Deborah Estrin, Stephen B. Wicker, The impact of data aggregation in wireless sensor networks, in: Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCSW’02), IEEE Computer Society, Washington, DC, USA, 2002, pp. 575–578.

[31] Yee Wei Law, Li-Hsing Yen, Roberto Di Pietro, Marimuthu Palaniswami, Secure k-connectivity properties of wireless sensor networks, in: Proceedings of the 3rd IEEE International Workshop on Wireless and Sensor Networks Security (WSNS), Pisa, Italy, October 2007, pp. 1–6.

[32] Donggang Liu, Peng Ning, Establishing pairwise keys in distributed sensor networks, in: Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS’03), 2003, pp. 52–61.

[33] Sergio Marti, T.J. Giuli, Kevin Lai, Mary Baker, Mitigating routing misbehavior in mobile ad hoc networks, in: Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MobiCom’00), ACM Press, New York, NY, USA, 2000, pp. 255–265.

[34] Alfred J. Menezes, Scott A. Vanstone, Paul C. van Oorschot, Handbook of Applied Cryptography, CRC Press, Inc., 1996.

[35] Pietro Michiardi, Refik Molva, Core: a collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks, in: Proceedings of the IFIP TC6/TC11 Sixth Joint Working Conference on Communications and Multimedia Security, Kluwer, B.V., Deventer, The Netherlands, 2002, pp. 107–121.

[36] James Newsome, Elaine Shi, Dawn Song, Adrian Perrig, The Sybil attack in sensor networks: analysis & defenses, in: Proceedings of the 3rd ACM International Symposium on Information Processing in Sensor Networks (IPSN’04), 2004, pp. 259–268.

[37] Bryan Parno, Adrian Perrig, Virgil D. Gligor, Distributed detection of node replication attacks in sensor networks, in: Proceedings of the 2005 IEEE Symposium on Security and Privacy (SP’05), Washington, DC, USA, 2005, pp. 49–63.

[38] Bartosz Przydatek, Dawn Song, Adrian Perrig, SIA: secure information aggregation in sensor networks, in: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys’03), ACM Press, New York, NY, USA, 2003, pp. 255–265.

[39] Sylvia Ratnasamy, Brad Karp, Scott Shenker, Deborah Estrin, Ramesh Govindan, Li Yin, Fang Yu, Data-centric storage in sensornets with GHT, a geographic hash table, Mobile Networks and Applications (MONET) 8 (4) (2003) 427–442.

[40] Kun Sun, Peng Ning, Cliff Wang, Fault-tolerant cluster-wise clock synchronization for wireless sensor networks, IEEE Transactions on Dependable and Secure Computing 2 (3) (2005) 177–189. September.

[41] David Wagner, Resilient aggregation in sensor networks, in: Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN’04), ACM Press, New York, NY, USA, 2004, pp. 78–87.

[42] Arvinderpal Wander, Nils Gura, Hans Eberle, Vipul Gupta, Sheueling Chang Shantz, Energy analysis of public-key cryptography for wireless sensor networks, in: Proceedings of the Third Annual IEEE International Conference on Pervasive Computing and Communications (PERCOM’05), 2005, pp. 324–328.

[43] Dirk Westhoff, Joao Girao, Mithun Acharya, Concealed data aggregation for reverse multicast traffic in sensor networks: encryption key distribution and routing adaptation, IEEE Transactions on Mobile Computing 5 (10) (2006) 1417–1431.

[44] Sencun Zhu, Shouhuai Xu, Sanjeev Setia, Sushil Jajodia, Establishing pair-wise keys for secure communication in ad hoc networks: a probabilistic approach, in: Proceedings of the 11th IEEE International Conference on Network Protocols (ICNP’03), 2003, pp. 326.