Fault Tolerance in WSNs


Elham Hormozi & Razieh Asadi


Outline

Review of Wireless Sensor Network

Fault Tolerance in WSNs

Fault Detection

Fault Recovery

Relay Node Placement in Wireless Sensor Networks

Hop-by-Hop TCP for Sensor Networks

Conclusion


Review of Wireless Sensor Network

A WSN is a self-organized network consisting of a large number of low-cost, low-power sensor devices, called sensor nodes.

Sensor nodes can be deployed on the ground, in the air, in vehicles, on bodies, under water, and inside buildings.

Each sensor node is equipped with a sensing unit, which captures events of interest, and a wireless transceiver, which transmits the captured events back to the base station, called the sink node.

Sensor nodes collaborate with each other to perform data sensing, data communication, and data processing.


Types of Failure in WSNs

Energy depletion: Sensor nodes have very limited energy, and their batteries usually cannot be recharged or replaced because of hostile or hazardous environments.

Hardware failure: A sensor node has two components, a sensing unit and a wireless transceiver, which usually interact directly with an environment that is subject to a variety of physical, chemical, and biological factors.

Communication link errors: Even if the hardware is in good condition, communication between sensor nodes is affected by many factors, such as signal strength, antenna angle, obstacles, and weather conditions.

Malicious attack

These failures result in low reliability of sensor node performance.

Therefore, fault tolerance is one of the critical issues in WSNs.


Fault Detection:

Centralized Approach

• Sympathy

• Secure Locations

Distributed Approach

1. Node Self-detection

2. Clustering Approach (MANNA)


Sympathy[4]

Uses a message-flooding approach to pool event data and current states (metrics) from sensor nodes.

Nodes periodically send metrics back to the sink, which uses them to detect failures and their causes.

Given sensor hardware and network limitations, the transmitted metrics must be kept to a minimum.

Insufficient data at the sink implies a failure; sufficient data at the sink implies acceptable network behavior.

Based on these metrics, the sink detects which nodes or components have not delivered sufficient data and infers the causes of the failures.
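The "insufficient data implies failure" rule can be sketched as a simple per-window count at the sink. This is only an illustration, not Sympathy's actual implementation: the function name `find_silent_nodes`, the threshold `min_packets`, and the sample data are assumptions, and root-cause inference is left out.

```python
from collections import Counter


def find_silent_nodes(expected_nodes, received, min_packets):
    """Return the nodes whose packet count at the sink is below min_packets.

    received is an iterable of node ids, one entry per packet that
    reached the sink during the current observation window.
    """
    counts = Counter(received)
    # Counter returns 0 for nodes that delivered nothing at all.
    return sorted(n for n in expected_nodes if counts[n] < min_packets)


# Hypothetical window: node "d" is silent and node "c" delivered too little.
window = ["a", "b", "a", "c", "a", "b"]
suspects = find_silent_nodes({"a", "b", "c", "d"}, window, min_packets=2)
print(suspects)
```

A real deployment would also have to pick the window length and threshold per node, since nodes far from the sink naturally deliver fewer packets.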

Secure Locations [6]

Targets location-aware sensor networks.

Introduces a scalable trust-based routing protocol (TRANS).

Selects trusted paths that do not include misbehaving nodes by identifying insecure locations and routing around them.

Consists of two parts:

1. trust routing

2. insecure location discovery and isolation


Secure Locations (cont’d)

Selects a secure path and avoids insecure locations.

All destination nodes use TESLA to authenticate all requests.

1. The sink creates a message containing (source location, destination location, authentication message).

2. The sink encrypts this message with its shared key and broadcasts it.

3. Neighbors that know the shared key are able to decrypt the request.

4. A trusted neighbor decrypts the request, adds its location, encrypts the message with its own shared key, and sends it to its neighbors.


Secure Locations (cont’d)

Uses Expanding TTL Search (ETS).

1. The sink marks data packets with an increasing hop count.

2. Each intermediate node decrements the hop count before forwarding.

3. When the hop count reaches zero, the node sends an ACK to the source, informing it that its location is safe.

4. The source marks that part of the path as safe and increases the hop count in subsequent packets.
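The four ETS steps can be simulated on a single path. This is a minimal sketch under stated assumptions: a misbehaving node is modeled as one that silently drops every packet, a missing ACK is treated as a timeout, and the names `expanding_ttl_search`, `path`, and `misbehaving` are illustrative rather than taken from the protocol.

```python
def expanding_ttl_search(path, misbehaving):
    """Probe a path hop by hop and return (safe_prefix, first_suspect).

    path        -- ordered node ids from the sink's first hop to the destination
    misbehaving -- assumed set of nodes that silently drop packets
    """
    safe = []
    for ttl in range(1, len(path) + 1):      # increasing hop count
        hops_left = ttl
        acked_by = None
        for node in path:
            if node in misbehaving:
                break                        # packet dropped, no ACK is sent
            hops_left -= 1                   # node decrements the hop count
            if hops_left == 0:
                acked_by = node              # node ACKs back to the source
                break
        if acked_by is None:
            return safe, path[ttl - 1]       # no ACK: this hop is suspect
        safe.append(acked_by)                # this part of the path is safe
    return safe, None


print(expanding_ttl_search(["n1", "n2", "n3", "n4"], {"n3"}))
```

Because the hop count grows by one per probe, the first probe that goes unanswered points at the hop just past the known-safe prefix.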


Advantages & Disadvantages of Centralized Approaches

The centralized approach is efficient and accurate at identifying certain kinds of network faults.

Resource-constrained sensor networks cannot always afford to periodically collect all sensor measurements and states in a centralized manner.

The central node easily becomes a single point of data traffic concentration in the network, as it is responsible for all fault detection and fault management.

This causes a high volume of message traffic and quick energy depletion in certain regions of the network, especially at the nodes closer to the base station.


Advantages & Disadvantages of Centralized Approaches (cont’d)

This approach becomes extremely inefficient and expensive in a large-scale sensor network.

Multi-hop communication also increases the base station's response delay to faults that occur in the network.

Therefore, we have to seek a localized and more computationally efficient fault detection model.


Distributed Approach & Node Self-detection

A flexible circuit acts as a sensing layer around a node, capable of sensing the node's physical condition.

Detecting physical faults requires:

1. A hardware interface, consisting of a sensing layer that wraps around the node.

2. A software interface that reads the sensors and transmits the data to the sink.

Uses TinyOS (very small footprint, energy-aware, event-based).


Figure 1: SYS25 node.

Distributed Approach & Clustering Approach MANNA

Designed for event-driven WSNs.

Clustering is used to build scalable and energy-balanced applications for WSNs.

Fault management is distributed into each cluster.

Management agents execute in the cluster heads.

This mechanism decreases both the information flow and the energy consumption.

A manager located outside the WSN has a global view of the network.



Approach MANNA

The management application is divided into two phases:

Installation

Occurs as soon as the nodes are deployed in the network.

Each node reports its position and energy to the agent located in the cluster head.

The agent sends a LOCATION TRAP and an ENERGY TRAP to the manager.

The manager builds the topology map model and the WSN energy model.



Approach MANNA

The management application is divided into two phases:

Operation

Each node reports its energy level and position to the agent whenever there is a state change (another ENERGY TRAP or LOCATION TRAP).

The manager rebuilds the topology map model and the energy model.

The manager sends GET operations to retrieve the node state.
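The manager's bookkeeping across both phases can be sketched as two maps updated by incoming traps. This is an illustrative model only: the class name `Manager`, the method names, and the trap encoding are assumptions, and the GET exchange with the nodes is not modeled.

```python
class Manager:
    """Keeps the topology map and energy map built from agent traps."""

    def __init__(self):
        self.topology = {}   # node id -> (x, y) position
        self.energy = {}     # node id -> residual energy

    def on_trap(self, kind, node, value):
        # Traps arrive from cluster-head agents, first during installation
        # and later whenever a node reports a state change.
        if kind == "LOCATION":
            self.topology[node] = value
        elif kind == "ENERGY":
            self.energy[node] = value
        else:
            raise ValueError(f"unknown trap kind: {kind}")

    def known_state(self, node):
        """Return what the manager currently believes about a node."""
        return {"position": self.topology.get(node),
                "energy": self.energy.get(node)}


manager = Manager()
manager.on_trap("LOCATION", "n7", (3, 4))   # installation phase
manager.on_trap("ENERGY", "n7", 0.9)
manager.on_trap("ENERGY", "n7", 0.4)        # operation phase: state change
print(manager.known_state("n7"))
```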


Fault Recovery

The WSN is restructured or reconfigured so that failures or faulty nodes have no further impact on network performance.

The most commonly used technique for fault recovery is replication, or redundancy, of the components that are prone to failure.

When some nodes fail to provide data, the base station still gets sufficient data if redundant sensor nodes are deployed in the region.


Fault Recovery(cont’d)

Relay Node Placement in Wireless Sensor Networks

Two-Tiered Wireless Sensor Networks


RideSharing: Fault Tolerant Aggregation


Relay Node Placement in Wireless Sensor Networks (Two-Tiered Wireless Sensor Networks)

Aims at improving the reliability and prolonging the lifetime of WSNs.

Energy consumption is proportional to d^α for transmitting over distance d, where α is a constant in the interval [2, 4], so long-distance transmission in WSNs is costly.

Employs some powerful relay nodes whose main function is to gather information from the raw data of sensor nodes and relay the information to the sink.

Relay nodes serve as the backbone of the network.

The relay nodes are more powerful than sensor nodes (in energy storage, computing, and communication capabilities).
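A small calculation shows why long-distance transmission is costly and why relaying helps. It assumes the common power-law model in which transmit energy grows as distance raised to an exponent, written `alpha` here (an assumed name, commonly taken between 2 and 4). The constant `c` and the function names are illustrative, and receive and electronics energy are ignored.

```python
import math


def tx_energy(distance, alpha=2.0, c=1.0):
    """Transmit energy for one hop under the assumed power-law model."""
    return c * distance ** alpha


def relayed_energy(distance, hops, alpha=2.0, c=1.0):
    """Total transmit energy when the distance is split into equal hops."""
    return hops * tx_energy(distance / hops, alpha, c)


direct = tx_energy(100.0, alpha=2.0)
relayed = relayed_energy(100.0, hops=4, alpha=2.0)
print(direct, relayed, direct / relayed)
```

With an exponent of 2, splitting a link into k equal hops cuts the total transmit energy by a factor of k; larger exponents increase the saving.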



Each cluster has only one cluster head and each sensor

belongs to at least (backup cluster heads)

If the receiver of a relay node fails:

Data sent by the sensors will be lost.

The sensors must be reallocated to other cluster heads.

To handle general communication faults:

There should be at least two node-disjoint paths between each pair of relay nodes in the network.
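The two-disjoint-paths requirement can be checked on a given relay topology. The sketch below relies on a standard graph-theory fact (Menger's theorem): in a graph with at least three nodes, every pair has two node-disjoint paths exactly when the graph is connected and stays connected after removing any single node. The adjacency-dictionary representation and the function names are assumptions, and this brute-force check is not a placement algorithm.

```python
from collections import deque


def is_connected(adj, removed=None):
    """Breadth-first search over adj, optionally ignoring one node."""
    nodes = [n for n in adj if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v != removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)


def two_disjoint_paths_everywhere(adj):
    """True if every pair of nodes has at least two node-disjoint paths."""
    if len(adj) < 3:
        return False
    return is_connected(adj) and all(is_connected(adj, removed=n) for n in adj)


ring = {"r1": {"r2", "r4"}, "r2": {"r1", "r3"},
        "r3": {"r2", "r4"}, "r4": {"r3", "r1"}}
chain = {"r1": {"r2"}, "r2": {"r1", "r3"}, "r3": {"r2"}}
print(two_disjoint_paths_everywhere(ring), two_disjoint_paths_everywhere(chain))
```

The ring survives the loss of any one relay, while the chain is cut in two as soon as its middle relay fails.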



An intuitive objective of relay node placement in two-tiered WSNs is to place the minimum number of relay nodes such that some degree of fault tolerance can be achieved.

Other works study the placement of sensor nodes to make a sensor network k-connected.



Why can't the conventional TCP protocol be used?

Communication links in a sensor network are unstable.

TCP over links with a high loss rate suffers severe performance degradation.

Sensors may not have sufficient computing power to implement the entire TCP/IP protocol stack.


Hop-by-Hop TCP aims to accelerate reliable packet delivery by:

Minimizing end-to-end packet delivery time without too much throughput degradation

Minimizing the number of retransmissions



Every intermediate node executes a lightweight local TCP.

The protocol has two parts:

1. End-to-End TCP

Works on the source and destination nodes

2. One-Hop TCP

Works on every node

The sender module of One-Hop TCP works at the sender end of a link, and the receiver module works at the receiver end.


Figure 2: Protocol stack of Hop-by-Hop TCP.

End-to-End TCP

Reuses an existing popular TCP variant, NewReno, with several modifications:

1. The sender module forwards packets to the One-Hop TCP module.

2. The receiver module receives packets from the One-Hop TCP module.

3. One-Hop TCP in each node forwards data packets hop by hop.

4. End-to-end ACKs are forwarded to the source node using One-Hop TCP in the opposite direction.

5. A larger initial RTO value is set.


One-Hop TCP

A lightweight version of TCP running on each node to forward received packets reliably.

Many TCP features, such as packetization and congestion control, are removed.

1. Add the IP address of the current node to the packet header (so the receiver knows where to send the local ACK).

2. Set CWND to 1.

3. Set an upper threshold for the number of retransmissions.
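With CWND fixed at 1, forwarding over one link reduces to stop-and-wait with a bounded number of retransmissions. The following is a rough sketch under stated assumptions: `link_ok` is a hypothetical callback standing in for the radio link and the local ACK, and reporting a packet as dropped once the limit is reached is an assumption, since the slides do not say what happens at that point.

```python
def one_hop_send(packets, link_ok, max_retx=3):
    """Stop-and-wait (CWND = 1) over one link with bounded retransmissions.

    link_ok(seq, attempt) -- hypothetical callback: True when the packet
                             and its local ACK both get through
    Returns (delivered, dropped).
    """
    delivered, dropped = [], []
    for seq, payload in enumerate(packets):
        # One packet in flight at a time: first try plus retransmissions.
        for attempt in range(max_retx + 1):
            if link_ok(seq, attempt):
                delivered.append(payload)
                break
        else:
            dropped.append(payload)   # retransmission limit reached
    return delivered, dropped


# Toy link: packet number seq needs seq failed attempts before it succeeds.
result = one_hop_send(["p0", "p1", "p2"], lambda seq, attempt: attempt >= seq,
                      max_retx=1)
print(result)
```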


Aggregation is used to filter redundancy and reduce communication and energy consumption.

Multipath routing can overcome losses by duplicating and forwarding each sensor measurement.

When a link fails, one or more other sensors may still have correctly overheard the packet.

Some aggregate functions, such as SUM and COUNT, are duplicate-sensitive.

The RideSharing (RS) scheme provides fault-tolerant, duplicate-sensitive aggregation.
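A toy example shows what duplicate sensitivity means in practice. The readings and variable names are made up, and the de-duplication by sensor id is only there to show the correct totals; it is not how RideSharing itself avoids duplicates.

```python
# One reading per sensor, but multipath routing delivered s2 twice.
received = [("s1", 4), ("s2", 7), ("s2", 7), ("s3", 2)]

naive_sum = sum(value for _, value in received)    # counts s2 twice
naive_count = len(received)                        # counts s2 twice
naive_max = max(value for _, value in received)    # unaffected by duplicates

unique = dict(received)                            # one entry per sensor id
true_sum = sum(unique.values())
true_count = len(unique)

print(naive_sum, true_sum, naive_count, true_count, naive_max)
```

SUM and COUNT are inflated by the duplicate, while MAX is not, which is why plain multipath forwarding is safe only for duplicate-insensitive aggregates.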



Edges are classified into three types: primary, backup, and side edges.

Each parent attaches a small bit vector to each data message it sends.

Parents detect link errors when one or more children are missing from the bit vector.

Figure 3: Track topology.

Cascaded RideSharing


Each parent broadcasts its children's IDs and their bit positions inside its bit vector.

When an error occurs, each backup parent decides whether or not to correct the error based on its order in a correction sequence (the parent with the smallest ID first).
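The bit-vector check and the smallest-ID rule can be sketched in a few lines. The encoding is an assumption made for illustration: bit i of an integer is 1 when the child at position i was included in the parent's message, and the function names `missing_children` and `choose_corrector` are hypothetical.

```python
def missing_children(bit_positions, bit_vector):
    """Children whose bit is 0 in a parent's bit vector.

    bit_positions -- child id -> bit position, as broadcast by the parent
    bit_vector    -- integer whose bit i is 1 if the child at position i
                     was included in the parent's message
    """
    return sorted(child for child, pos in bit_positions.items()
                  if not (bit_vector >> pos) & 1)


def choose_corrector(backup_parents):
    """Pick the backup parent that goes first in the correction sequence.

    backup_parents -- ids of backup parents that overheard the child
    """
    return min(backup_parents) if backup_parents else None


positions = {"c1": 0, "c2": 1, "c3": 2}
lost = missing_children(positions, 0b101)   # bit 1 is clear
print(lost, choose_corrector({7, 3}))
```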

References

[1] Hai Liu, Amiya Nayak, and Ivan Stojmenović, 'Fault-Tolerant Algorithms/Protocols in Wireless Sensor Networks', Department of Computer Science, Hong Kong Baptist University, Springer-Verlag London Limited, 2009

[2] M. Yu, H. Mokhtar, and M. Merabti, 'A Survey on Fault Management in Wireless Sensor Networks', School of Computing & Mathematical Science, Liverpool John Moores University, 2007

[3] Farinaz Koushanfar, Miodrag Potkonjak, and Alberto Sangiovanni-Vincentelli, 'Fault Tolerance in Wireless Sensor Networks', Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, US 94720; Department of Computer Science, University of California, Los Angeles, CA, US 90095

[4] Nithya Ramanathan, Kevin Chang, Rahul Kapur, Lewis Girod, Eddie Kohler, and Deborah Estrin, 'Sympathy for the Sensor Network Debugger', UCLA Center for Embedded Network Sensing, ACM, 2005


[5] Jessica Staddon, Dirk Balfanz, and Glenn Durfee, 'Efficient Tracing of Failed Nodes in Sensor Networks', September 28, 2002, Atlanta, Georgia, USA, ACM

[6] Sapon Tanachaiwiwat, Pinalkumar Dave, Rohan Bhindwale, and Ahmed Helmy, 'Secure Locations: Routing on Trust and Isolating Compromised Sensors in Location-Aware Sensor Networks', Department of Electrical Engineering-Systems and Department of Computer Science, University of Southern California, ACM, 2003

[7] Gaurav Gupta and Mohamed Younis, 'Fault-Tolerant Clustering of Wireless Sensor Networks', Dept. of Computer Science and Elec. Eng., University of Maryland Baltimore County, IEEE, 2003


[8] Jinran Chen, Shubha Kher, and Arun Somani, 'Distributed Fault Detection of Wireless Sensor Networks', Dependable Computing and Networking Lab, Iowa State University, Ames, Iowa 50010, IEEE, 2006

[9] Sameh Gobriel, Sherif Khattab, Daniel Mossé, José Brustoloni, and Rami Melhem, 'RideSharing: Fault Tolerant Aggregation in Sensor Networks Using Corrective Actions', Computer Science Department, University of Pittsburgh, 2006

[10] Weiyi Zhang, Guoliang Xue, and Satyajayant Misra, 'Fault-Tolerant Relay Node Placement in Wireless Sensor Networks', Department of Computer Science and Engineering, Arizona State University, IEEE INFOCOM 2007

[11] S. Harte, A. Rahman, and K. M. Razeeb, 'Fault Tolerance in Sensor Networks Using Self-Diagnosing Sensor Nodes', University of Limerick, Ireland; Tyndall National Institute, Ireland, 2005
