
408 IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 60, NO. 2, FEBRUARY 2011

An Architecture for a Dependable Distributed Sensor System

Sebastian Zug, André Dietrich, and Jörg Kaiser

Abstract—In future smart environments, mobile applications will find a dynamically varying number of networked sensors that offer their measurement data. This additional information supports mobile applications in operating faster, with a higher precision and enhanced safety. The potentially increased redundancy obtained in such scenarios, however, is seriously affected by additional uncertainties as well. First, the dependence on wireless communication introduces new latencies and faults, as well as errors, and second, sensors of the environment may be disturbed, of low quality, or even faulty. The quality of the collected data therefore has to be dynamically assessed. Our work aims to provide a generic programming abstraction for fault-tolerant sensors and fusion nodes that cope with varying quality of measurements and communication.

Index Terms—Data fusion, distributed system, fault tolerance, intelligent sensors, smart sensors.

I. INTRODUCTION

DURING the last decade, applications for distributed environment monitoring have received increased attention. Many example applications exist, including habitat monitoring [1], object tracking [2], pollution detection [3], and climate observation [4]. In addition to applications that are restricted to a fixed set of sensors, those that exploit the existence of ambient sensors for mobile applications have also come into focus. These include, among others, driverless transport systems and cooperating robots. In many everyday applications, distributed sensor systems supply their information about the environment, for example, automatic door-openers monitoring moving objects through a combined system of cameras and radar sensors or the heating system of a building controlled by a temperature sensor network that provides the ambient temperature for various mobile applications. However, today’s applications collect this perception and only use it for a single purpose. We assume that, in the future, due to the rising number of wireless sensor networks, more and more sensors will be available for collecting data about the environment. In this scenario, the different sensor nodes establish an instrumented and intelligent environment with the capability of varying the degree of information exchanged according to the specific task at hand. For example, temperature values are useful for ultrasonic distance measurements due to the relation between the speed of sound and temperature, or a mobile transportation platform uses the information from an automatic door system and decreases its velocity based on knowledge of waiting people.

Manuscript received December 15, 2009; revised May 25, 2010; accepted May 26, 2010. Date of publication November 11, 2010; date of current version January 7, 2011. This work was supported in part by the Ministry of Education and Science (BMBF) within the project “Virtual and Augmented Reality for Highly Safety and Reliable Embedded Systems” (ViERforES). The Associate Editor coordinating the review process for this paper was Dr. Cesare Alippi.

The authors are with the Faculty of Computer Science, Institute of Distributed Systems, Otto-von-Guericke University Magdeburg, 39106 Magdeburg, Germany (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIM.2010.2085871

The main challenge is to effectively handle the amount and variability of the observed and relevant information. The availability of information depends on the existence of appropriate sensor nodes, as well as on network range, and can thus be disturbed by faults (i.e., hardware, software, and communication).

A generalized programming abstraction is necessary in this context and leads to the notion of smart sensors and smart fusion nodes. The idea of smart sensors is rather old [5] but gained momentum due to the technological progress of recent years and is supported by the development of respective standards [6]. The smart sensor is composed of a simple transducer and a processing unit. It offers preprocessed, application-related information via a well-defined communication interface. Our work extends this former approach and combines smart sensor nodes (SSNs) and adaptive fusion nodes (AFNs) for computationally expensive tasks in a single architectural concept. The AFN flexibly provides an improved result based on all relevant sensor information. This combination represents an architectural framework, defining the basic components necessary for fault-tolerant distributed applications. A special feature of the fault detection mechanisms is the feedback approach for SSNs. The aggregated information is shared with all SSNs to provide an individual validation of the current measurements inside each SSN. In [7], we presented this idea and illustrated its benefits by exploiting the feedback of the actual joint estimation in every SSN. The second contribution of this concept is a common format of validity estimation transmitted with each value. Thus, an AFN enables assessment of the quality of measurement.

SSNs and AFNs use a generalized interface that complements each measurement value with additional information, for example, a time stamp, localization information, and a validity estimation. Our common interface of SSNs and AFNs provides the correct interpretation of those values, based on an electronic datasheet for each sensor. Of course, messages transmitted by SSNs connected to a camera, a laser scanner, or a simple bumper vary in the size of the result, validity, units, etc. Only a global knowledge of the message format allows dynamic interaction and combination [8].
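Such a self-describing message can be pictured as a small record of the measurement plus its metadata. The following Python sketch is purely illustrative: the field names and the `is_usable` helper are hypothetical, since the paper defers the concrete layout to a per-sensor electronic datasheet.

```python
from dataclasses import dataclass

@dataclass
class SensorMessage:
    """Illustrative SSN/AFN message: a value tagged with metadata.

    Field names are hypothetical; the actual format is defined in
    each sensor's electronic datasheet.
    """
    value: float      # measurement in the unit given by `unit`
    unit: str         # physical unit, e.g. "m" or "degC"
    timestamp: float  # acquisition time on a shared time base
    position: tuple   # sensor location (x, y) in a common frame
    validity: float   # quality estimate in [0, 1]; 0 means discard

    def is_usable(self, threshold: float = 0.5) -> bool:
        """Consumer-side check: accept only sufficiently valid data."""
        return self.validity >= threshold

msg = SensorMessage(value=1.23, unit="m", timestamp=17.5,
                    position=(0.0, 2.0), validity=0.9)
```

A fusion node receiving such records can filter them by `validity` before combining them, regardless of the transducer type behind each message.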

0018-9456/$26.00 © 2010 IEEE


The distributed dynamic approach includes a number of additional sources of faults and errors. For this reason, we designed the internal structure of the SSN and AFN to consider a broad classification of faults and errors. In [7], we discuss different types of faulty measurements, as well as adequate detection mechanisms and their influence on the final result of fusing the data. In this paper, we extend our analysis to communication faults and unpredictable network latencies. This paper will examine the influence from both sensor and network faults. To evaluate the concepts, we simulated an application scenario of a mobile robot driving in an instrumented environment. In particular, we measure the impact of sensor and network faults on position estimation. To base the evaluation on realistic assumptions, we derived the main parameters for the sensors and actuators from a physical system and used them to simulate multiple fault conditions.

The contribution of this paper is an architecture for distributed fault-tolerant and flexible sensor systems. This is based on an analysis of the different types of smart sensor faults. Section II gives a short overview of the different issues that are related to fault-tolerant sensing and reviews the respective state of the art. In Section III, we summarize and compare the faults that can be encountered in the described scenario. Based on this analysis, our design of a fault-tolerant SSN and AFN is developed in Section IV. The approach is validated through a simulation that is elaborated on in Section V. A conclusion and an outlook on future research are provided in Section VI.

II. STATE OF THE ART

A. Fault-Tolerant Fusion Abstraction

A fault-tolerant programming abstraction for the fusion of distributed measurements involves and integrates many aspects and disciplines. Issues such as fault detection and isolation (FDI), fusion architectures, and fault-tolerant data fusion and networking, among others, have to be considered. Most authors concentrate on partial problems, for instance, self-validating sensors without an integrated view of the distributed ensemble. However, only a holistic approach exploits the benefits of a fault-tolerant fusion network completely.

1) Methods of Fault Detection: A systematic examination of smart sensor faults is given in [9] and [10]. The resulting sensor fault detection is based on a comparison of redundant information. This redundancy in a measurement system can be obtained in three ways.

a) Hardware redundancy: Hardware redundancy is used for safety-critical applications in different ways [11]. Redundant, heterogeneous, or homogeneous sensors measure the same or related values and observe a common area. Faulty sensors are identified by monitoring measurement deviations from the joint mean, median, etc. The criterion for acceptance can be, for instance, a simple threshold related to a measurement uncertainty or based on statistical knowledge like an x-percentile [12]. Such methods discard a deviating minority as faulty measurements, like a k-out-of-n voter. This means that the maximum number of simultaneous faulty sensors that can be detected is defined by the number of redundant measurements. In [13], the authors derive the number of necessary additional sensors for different types of sensor faults. If a general disturbance manipulates the majority of the measurements, then the majority-oriented approach for detection fails.
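The majority-based criterion described above can be sketched as a deviation test against the joint median. The function name and threshold below are illustrative, not taken from the cited schemes.

```python
from statistics import median

def flag_deviants(readings, threshold):
    """Flag sensors whose reading deviates from the joint median by
    more than `threshold` (a stand-in for the uncertainty-based
    acceptance criterion). Returns the indices of suspect sensors.
    Like a k-out-of-n voter, this assumes the faulty sensors are a
    minority; a disturbance hitting the majority defeats the check.
    """
    m = median(readings)
    return [i for i, r in enumerate(readings) if abs(r - m) > threshold]

# Four redundant distance sensors observing the same area; one is faulty.
faulty = flag_deviants([2.01, 1.98, 2.03, 3.40], threshold=0.2)
```

With the sample readings, only the fourth sensor (index 3) deviates far enough from the median of about 2.02 m to be flagged.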

Another way of selection is to compare the impact of a measurement on the fusion quality. If the elimination of an individual sensor improves the consistency of the fusion result, the sensor node is assumed to be faulty [14].

b) Model-based redundancy: Another approach is to “simulate” redundancy with a mathematical model of the observed system. An overview is presented in [15]. The response to known inputs is calculated using this system model and is compared to the reaction of the real system. If a state vector is assumed, the resulting residuals can be classified by rule-based systems, neural networks, or fuzzy sets [16]. This approach is very common in fault-tolerant control applications (for a good overview, see [17]).
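The residual comparison can be sketched as follows; the first-order model and the fixed residual limit are toy assumptions (real FDI schemes classify residuals with rules, neural networks, or fuzzy sets, as noted above).

```python
def residual_check(model_step, inputs, measurements, x0, limit):
    """Model-based redundancy sketch: drive a system model with the
    known inputs, compare its predicted output to the measured
    sequence, and flag time steps whose residual exceeds `limit`.
    """
    flagged, x = [], x0
    for t, (u, y) in enumerate(zip(inputs, measurements)):
        x = model_step(x, u)        # predicted output at step t
        if abs(y - x) > limit:      # residual classification
            flagged.append(t)
    return flagged

def step(x, u):
    # Toy first-order lag: the output follows the input with gain 0.5.
    return 0.5 * x + 0.5 * u

# A constant input; the measurement at t = 2 contradicts the model.
bad = residual_check(step, inputs=[1, 1, 1, 1],
                     measurements=[0.5, 0.75, 3.0, 0.94],
                     x0=0.0, limit=0.5)
```

The open-loop prediction for the constant input is 0.5, 0.75, 0.875, 0.9375, so only the third sample produces a residual above the limit.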

c) Signal analysis: Model-based redundancy uses knowledge of the observed system to derive the validity of a measurement. Signal analysis monitors the parameters of the measurement process and models the transducer behavior. This is more robust in case of uncertain behavior of the controlled system. Signal noise, frequency response, velocity of amplitude change, etc., are known parameters of a measurement for an undisturbed system. If such a value significantly changes, then the sensor is classified as faulty, or the observed system is modified. In contrast, mechanical and civil engineers reverse the last assumption for damage detection in engines and buildings [18], [19], whereby they assume that the observed building or engine is damaged and the sensor works fine. We assume the opposite situation, i.e., unchanged system parameters, and that the faulty sensor is responsible for the changed signal.

There is a large body of research concerning methods for online FDI. Most methods integrate redundant hardware for this purpose. We developed a statistical method for signal analysis comparing a short time series with a fault-free reference sample. This is included as a stochastic test in the SSN described in [20].

B. Data Fusion Architectures

Most fusion architectures for sensor systems do not consider fault-tolerant schemes (see [21] and [22]). A hierarchical framework is presented in [23], based on three different types of “virtual sensors.” The fusion task is user defined by an SQL statement, and the fault tolerance approach tries to match the currently available information for this query. In [24], the authors describe a programming abstraction called “logical sensor” that provides a general structure and combines self-validation modules, selection mechanisms, and a processing unit. The fault detection is limited to each logical sensor. Validity information for a joint validation of a measurement was not intended. The joint fusion of different measurements using a voter is discussed in [25] for smart sensors. The sensor message of this approach consists of the sensor output and an additional validity estimation. As in other fault-tolerant projects for control engineering, the authors do not consider communication parameters. Delayed or missed messages due to a disturbed wireless communication channel are not taken into account.


Fig. 1. Categorization of faulty measurement.

The different approaches presented in this review of the state of the art represent broad research topics from multiple fields. Most of them handle very specific problems and have to be combined to achieve a general fault-tolerant fusion architecture.

III. SENSOR FAULT TYPES

The design of a fault-tolerant sensor and fusion structure requires an analysis and classification of typical sensor faults. Based on this, we can determine an efficient error detection strategy according to the individual properties of the fault types. By efficient, we mean that we try to identify and handle a sensor fault as close to its origin as possible. A sensor fault may occur anywhere in the processing and dissemination chain, from the transducer via the computational components in the sensor node to the components of the communication channel. In this paper, we focus on errors originating from transducer faults and uncertainties in the network. Faults of the computational components are not explicitly considered. The reason for this is that there has already been much work carried out on how to enforce a fail-silent behavior for this component. Therefore, we assume that such faults are covered by the stuck-at fault type in our scheme. Fig. 1 specifies the fault model of a smart sensor as perceived by a remote consumer. Table I illustrates how the faults impact the measurements of the smart sensor. It also specifies the error compared to an ideal reference for simple examples. The scheme also considers network crashes and omissions.

We consider five typical smart sensor faults in our analysis. Fig. 1 relates these faults in a tree structure. Table I illustrates how the different fault types from the classification scheme affect measurements as compared to an ideal sensor, thus defining an error model. The dashed lines in all diagrams of Table I plot the correct reference measurements of the physical value y(t). We assume a simple linear progression of this value for illustration purposes here. Common mathematical models are used for the description of the fault F(t) and are presented below the respective diagrams. These basic fault models of a smart sensor are used for the validity estimation of the current measurement.

The first level in the tree distinguishes between constant and variable measurement faults. Constant faults (marked by “1” in Fig. 1 and Table I) represent a constant (relative) offset from the correct value. This fault occurs, for example, due to uncalibrated sensors, variations in temperature, or the offset of an analog-to-digital converter. All other fault types change their relation to the reference over time. On the second level, the varying faults can be divided into continuous and non-continuous faults with a linear or nonlinear deviation. Faults belonging to the continuous group (2) can be modeled through multiplicative combination of piecewise constant factors c_t and c_y with time, F(t) = c_t·t, or in relation to the observed physical value, F(t) = c_y·y(t). Examples of multiplicative faults are wrong assumptions about the environment or the aging process of a transducer. A single sensor is not able to detect sensor faults of types 1 and 2 in a standalone situation. It needs redundancy to detect such faults, for example, an analytical model or additional sensors. We discuss this issue in Section IV. In the case of communication with remote sensor nodes, continuous faults can also be caused by an offset between clocks resulting in an imprecise global time. Since a sensor value is related to the time it was acquired, an offset between clocks at different nodes will associate the value of a sensor reading with different points in time. The resulting error increases with the clock offset. Usually, the offset between clocks can be bounded by a synchronization protocol. If synchronization messages are lost, this error grows proportionally with the time to the next successful synchronization and the clocks’ skews.
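Assuming the fault term F(t) adds to the true value y(t) (as Table I's error-versus-reference diagrams suggest), fault categories 1 and 2 can be sketched as simple generator functions; the numeric factors are illustrative only.

```python
def constant_fault(y, c):
    """Category 1: constant offset, e.g. an uncalibrated sensor or
    an analog-to-digital converter offset."""
    return y + c

def drift_fault(y, t, c_t):
    """Category 2, time-related: F(t) = c_t * t, e.g. transducer aging."""
    return y + c_t * t

def scale_fault(y, c_y):
    """Category 2, value-related: F(t) = c_y * y(t), e.g. a wrong
    environmental assumption scaling every reading."""
    return y + c_y * y

# Observe the true value y(t) = 2t at t = 3 through each fault model.
y = 2 * 3
readings = (constant_fault(y, 0.5),
            drift_fault(y, 3, 0.1),
            scale_fault(y, 0.05))
```

Note that without redundancy, a consumer seeing only `readings` cannot tell any of these from a fault-free sensor, which is exactly why categories 1 and 2 require an analytical model or additional sensors.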

Fault category 3 specifies transient and permanent faults that result in “stuck” fixed values. Accordingly, they are divided into variants 3a and 3b. Faults of category 3a (intermittent) may result from a massive external disturbance, like strong sunlight for an infrared sensor. These environmental conditions result in a constant output value X_c as long as the external condition holds, as shown in Table I for time slot T_d. If a sensor completely crashes or the communication goes down for a longer period of time, measurement information can no longer be received, and this is classified as fault model 3b (permanent). A fusion node consuming the respective sensor data cannot distinguish between a network fault and a complete sensor crash. This lack of knowledge is fatal for a sensor data selection strategy. If it is assumed that wireless communication is temporarily disturbed, then model 3a would be appropriate, while a sensor crash means permanent loss (3b).
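From the data alone, a consumer can only suspect a stuck-at condition; a minimal sketch follows, where the window length and the exact-repeat criterion are illustrative assumptions (quantized ADC outputs repeat exactly, so `==` is meaningful here).

```python
def stuck_at(values, min_run):
    """Detect fault category 3: a sensor 'stuck' at a fixed output.
    Returns True when the last `min_run` samples are identical, which
    on a normally noisy channel is a strong stuck-at indication.
    It cannot tell intermittent (3a) from permanent (3b) faults;
    only the fault's duration distinguishes them.
    """
    tail = values[-min_run:]
    return len(tail) == min_run and len(set(tail)) == 1

stuck = stuck_at([1.1, 1.3, 2.0, 2.0, 2.0, 2.0], min_run=4)
healthy = stuck_at([1.1, 1.3, 2.0, 2.1, 1.9, 2.0], min_run=4)
```

A complete loss of messages (3b with a crashed sensor or link) produces no values at all, so it must be detected by timeouts rather than by a test like this.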

Category 4 models outliers. These are faults that are sporadic in the temporal domain and stochastic in the value domain. They are caused by the physical properties of the measurement process. Good examples for this fault category are ultrasonic sensors, which show a comparatively large rate of outliers in a sequence of measurements. We discuss a qualitative evaluation of this fact in Section V-B2. With a well-adapted model of the occurrence and amplitude, an SSN is able to detect outliers by applying simple filter or smoothing algorithms.
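A smoothing-based outlier test of the kind an SSN could run locally might look as follows; the window size and deviation limit are illustrative parameters, not values from the paper.

```python
from statistics import median

def mark_outliers(series, window, limit):
    """Category 4 sketch: flag samples that deviate from the median
    of the preceding `window` samples by more than `limit`.
    The first `window` samples are taken on trust.
    """
    out = []
    for i in range(window, len(series)):
        if abs(series[i] - median(series[i - window:i])) > limit:
            out.append(i)
    return out

# Ultrasonic-style trace: a steady distance with one sporadic outlier.
spikes = mark_outliers([1.0, 1.02, 0.99, 1.01, 4.7, 1.0, 0.98],
                       window=3, limit=0.5)
```

Because the median of the preceding window is robust against a single spike, the outlier at index 4 does not corrupt the reference used for the samples that follow it.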

So far, we have not considered continuous high-frequency noise in the presented fault classes. Noise is inherent in every measurement, and the noise level depends on environmental conditions, the physics of the measurement process, and the behavior of the electronic components in the sensor unit. Network latencies are another source of noise. In a scenario with unknown communication delays and without a global time stamp, noise and uncertainty represent an additional fault dimension. There are different approaches for coping with noisy measurements and for extracting an optimal estimation of the correct value. It is common to model the observed system as a stochastic process using a Kalman or Bayes filter. In the last few decades, many authors assumed that smart sensors cannot provide such complex filtering algorithms due to their limited computational performance, as, for instance, described in [25]. Since prices for powerful microcontrollers and signal processors are continuously dropping, we are convinced that performance will not be a problem in the future and that the most common filtering operations will be available on smart sensors.

TABLE I: FAULT CATEGORIES AND ERROR MODELS
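For a locally constant quantity observed through noise, the stochastic-process modeling mentioned above reduces to a scalar Kalman filter; the sketch below is generic, and the noise variances are illustrative rather than derived from the paper's sensors.

```python
def kalman_1d(zs, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (locally) constant value observed
    through noise: process variance `q`, measurement variance `r`.
    Returns the filtered estimate after each measurement. A real SSN
    would use a model matched to the observed system's dynamics.
    """
    x, p, est = x0, p0, []
    for z in zs:
        p += q               # predict: uncertainty grows
        k = p / (p + r)      # Kalman gain
        x += k * (z - x)     # update with the measurement
        p *= (1 - k)         # shrink uncertainty
        est.append(x)
    return est

smoothed = kalman_1d([5.1, 4.9, 5.3, 4.8, 5.0], q=1e-4, r=0.1)
```

After a few samples the estimate settles near the underlying value while individual measurement noise is damped, which is precisely the filtering operation the text argues modern smart sensors can afford.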

Timing and omission errors on the network are caused by various factors, such as oversaturated network links, corrupted packets, signal degradation over the network medium, and erroneous hardware and software. Note that timing problems can create omissions and that omissions can cause timing errors.

The occurrence of the introduced fault types depends on the sensor types, the communication structures, and the environmental conditions. A fault-tolerant fusion structure for SSNs and AFNs therefore has to be adaptable to those varying conditions and must consider a large number of possible combinations because, in most realistic scenarios, the presented faults do not occur strictly separately from each other.

IV. FAULT-TOLERANT PROGRAMMING ABSTRACTIONS

As discussed in the previous section, there are many approaches to dependable sensor-based systems. We partly use and integrate these schemes in our work. Three major points go beyond current approaches and distinguish our work. First, we provide an architectural framework and a structure that allows for integrating this previous work in a common scheme for a sensor. We developed a generic internal structure based on an analysis of the typical fault types. This defines the necessary building blocks for a reliable sensor, keeping all necessary computations and assessment within the smart sensor, which, in turn, may greatly ease overall system development. The user has the freedom to customize these modules, for example, the stochastic tests or the filter, according to the specific conditions that may vary with the sensor type and the application. Second, we added the option to exploit a history of measurements to be compared to the actual measurement of the transducer. This enables a smart sensor to validate the current measurement and to achieve a higher dependability. Because we assume a set of sensors (possibly with different modalities) participating in the perception of the environment, the results of the respective aggregation or fusion should be exploited. For this feedback, we provide an additional network interface. Finally, we include an assessment of the quality of information provided by a smart sensor in each of its messages. We firmly believe that this is needed in a distributed sensor-based system, particularly if ambient sensors are dynamically included in the overall environmental perception.

There are already standards for smart sensors, for example, [6]. They mainly describe a structure and an interface for such devices. However, aspects concerning erroneous sensor readings or faulty sensors are not considered.

Our architectural framework defines two components for different purposes: an SSN and an AFN. They are depicted in Figs. 2 and 4, respectively. Of course, an SSN can contain an AFN and vice versa, but due to the limited resources of wireless sensor nodes, it is appropriate to separate the tasks, delegating them to different nodes. The SSN is composed of a transducer, computational modules that form a sequence of processing steps for improving the raw transducer data, a validity estimator for assessing the quality of the local sensor information, and a network interface. The network interface receives information from the AFN for checking local measurements via the network input port and disseminates its results on the network output port.

Fig. 2. Internal structure of our SSN.

A. Fault-Tolerant SSNs

The SSN processes measurements of a single transducer (or sensor arrays of a unique type) and prepares the data for dissemination. It produces a message including an information item that reflects the proper physical unit of the respective environment state and is tagged with an assessment of its validity. The transducer module represents the sensor interface providing the digital representation of environmental conditions. This includes the direct interaction with the sensor via I2C, SPI, etc., and the analog-to-digital conversion. For error detection, the raw measurements are compared to an expected input range that is defined by either the sensor specifications or knowledge about the respective environmental conditions. For example, the measurements of larger distances made by our short-range ultrasonic sensor are not considered because of the very high error rate in long-distance measurements. This interval check produces a binary fault detection f_IC ∈ R, R = {0, 1}.
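The interval check amounts to a single range comparison; in the sketch below, the range limits for a short-range ultrasonic sensor are hypothetical values, not the paper's.

```python
def interval_check(raw, lo, hi):
    """Binary interval check: returns f_IC = 1 (fault) when the raw
    reading leaves the expected range [lo, hi] taken from the sensor
    specification or from knowledge of the environment.
    """
    return 0 if lo <= raw <= hi else 1

# Hypothetical short-range ultrasonic sensor rated for 0.03 m to 2.0 m:
# long-distance readings are discarded due to their high error rate.
f_ic_ok = interval_check(0.8, 0.03, 2.0)
f_ic_bad = interval_check(3.5, 0.03, 2.0)
```

This is the cheapest test in the chain, which is why it sits directly behind the transducer module, before any statistical processing.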

The second preprocessing step is the stochastic test. It compares the noise spectrum of the measured signal with a reference distribution of an undisturbed measurement process and calculates a similarity value. This allows for detection of disturbed (different noise spectrum) or crashed sensors (fault category 3a) with high probability. The reference distribution is included in the electronic data sheet of the sensor element. Of course, due to the sliding-window technique, which exploits measurement history, the detection of a fault will be delayed. The similarity value of 0 ≤ f_ST ≤ 1 represents the probability that a fault will occur. Additionally, SSNs are able to receive the results of the sensor fusion carried out by an AFN and scale them to the current point in time, according to the available system and environmental model.
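The actual stochastic test is defined in the paper's reference [20]; as a simplified stand-in, one can compare the noise level of a short window against the datasheet reference and map the mismatch to f_ST.

```python
from statistics import pstdev

def stochastic_test(window, ref_std):
    """Simplified stand-in for the SSN's stochastic test: compare the
    noise level of a short measurement window with the fault-free
    reference from the electronic datasheet, and map the relative
    mismatch to f_ST in [0, 1], where values near 1 indicate a
    probable fault. (The paper's statistic compares distributions,
    not just standard deviations.)
    """
    mismatch = abs(pstdev(window) - ref_std) / ref_std
    return min(1.0, mismatch)

# Reference noise of 0.01 m; a healthy window versus a "stuck" one.
quiet = stochastic_test([2.00, 2.01, 1.99, 2.02, 2.00], ref_std=0.01)
stuck = stochastic_test([2.0, 2.0, 2.0, 2.0, 2.0], ref_std=0.01)
```

A stuck sensor produces zero noise, so its mismatch saturates at 1.0, while a healthy window stays close to 0; both heavily disturbed (over-noisy) and crashed (noise-free) sensors are caught by the same statistic.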

As an example, let us assume an inertial navigation system consisting of three SSNs equipped with odometric, gyro, and acceleration sensors. The respective measurements are fused on an AFN that calculates position and velocity estimates. The AFN transmits the fused velocity/position values based on multiple SSNs, and the odometric system uses this jointly estimated velocity to check the validity of its own measurements. This way, the SSNs first provide an improved perception of the current environment and then are able to judge their own primary measurements. The chain of modules dealing with the feedback of the fusion results is depicted by the dashed lines in Fig. 2. Because fusion results may not always be available, these modules require some degree of adaptability. The transformation module performs a time adjustment and scales the received joint results to the individual coordinate system.

Current measurements are compared with the transformed joint estimation received from an AFN to detect outliers. The developer specifies an algorithm that examines the significance of those deviations. The result has to be matched to an error probability value 0 ≤ f_OD ≤ 1. If the current value derived from the real measurement is not identified as an outlier, then it is passed along to the filter module. For a specific environment and sensor type, various filter algorithms can be utilized, depending on the dynamics of the system, the frequency of disturbances, and the computational performance of the SSN. All modules calculate a fault probability value in addition to the current estimate x and its variance estimation σ, which is analyzed and interpreted by the validity estimation module.

Currently, we use simple rules to merge the individual fault probabilities (f_IC, f_ST, f_OD, f_FI) into a joint validity estimation v. If the measurements leave the permitted interval (f_IC = 1) or the filter module detects an outlier, the joint validity value becomes zero. The message disseminated via the network includes the value and the estimated validity as a quality indicator. The format and contents of the messages are defined in an electronic data sheet for each SSN [8].
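The rule-based merging can be sketched as follows. The zero cases follow the text; combining the remaining fault probabilities through their maximum is our illustrative assumption, not the paper's exact rule set:

```python
def joint_validity(f_ic, f_st, f_od, f_fi, outlier_detected=False):
    """Merge the individual fault probabilities into one validity
    estimation v in [0, 1]."""
    # Interval violated or outlier detected: validity becomes zero.
    if f_ic >= 1.0 or outlier_detected:
        return 0.0
    # Otherwise, let the strongest fault indicator dominate (assumption).
    return 1.0 - max(f_ic, f_st, f_od, f_fi)

# The disseminated message carries the value plus the quality indicator.
message = {"value": 1.234, "validity": joint_validity(0.0, 0.1, 0.05, 0.2)}
assert message["validity"] == 0.8
assert joint_validity(1.0, 0.0, 0.0, 0.0) == 0.0
```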

Fig. 3 shows an ultrasonic measurement used in the scenario in Section V. The diagrams compare the raw measurement of the transducer to the output produced by an SSN. The correct position of the robot is marked by the dotted line. The upper diagram illustrates the noisy measurements of the ultrasonic sensor, combining fault types 3b, 4, and 5. The SSN is able to detect outliers due to the feedback of the estimate jointly derived from all three sensors. The system uses a dynamically adapted and weighted sliding-window filter that significantly reduces the deviation. The solid line in the upper diagram in Fig. 3 represents the output of the SSN.

The experimental evaluation shows a significant gain in the quality of SSN information compared to a raw sensor. Although a raw transducer output can be filtered and improved, the use of consolidated information from the AFN further improves the measurement results.


Fig. 3. Noisy ultrasonic distance measurements, detected outliers, and the resulting SSN output.

Fig. 4. Internal structure of our AFN.

B. Fault-Tolerant AFNs

An AFN, which is depicted in Fig. 4, receives the messages of SSNs as input and computes a result according to some fusion algorithm. The general architecture and structure of an AFN is similar to an SSN. In fact, it outputs the information via the common interface in a way analogous to the method used by an SSN. It differs in that the main input does not come from a transducer. The problems that the fusion node is faced with center around the varying latencies of messages and the selection of the most appropriate SSN messages. We call the fusion node adaptive because it can respond to latencies of the network, the dynamic availability of sensor information, and the changing quality of sensor information in a flexible way. The synchronization module in the AFN has the same task as the transformation module in the SSN. It maps the sensor information received from multiple SSNs to a common point in time, exploiting the time stamp and a model of how the respective value (e.g., a position) changes over time. This is necessary to compensate for large skews that negatively affect the fusion process but requires, in turn, appropriate time synchronization. The subsequent selection module also uses the age of a measurement and its variance estimate, together with the estimation of the quality of a received value, to choose the most appropriate sensor information. The filter module takes the selected measurements and calculates an estimation of the observed value. It should be noted that more than one value is usually used in this calculation. In fact, there may be a series of values available from an SSN, whereby the filter considers them all within a defined time window. The Kalman filter is a good choice for measurements that are disturbed only by Gaussian noise. More general assumptions about process and sensor uncertainties, as well as noise, may result in applying a Bayesian filter. For less demanding requirements or systems with substantial performance and memory constraints, simple exponential smoothing or weighted average functions may be sufficient.
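The synchronization and selection steps above can be sketched as follows, with a validity/variance-weighted average standing in for the Kalman or Bayesian filter variants. The message fields, the staleness threshold, and the weighting scheme are our assumptions:

```python
from dataclasses import dataclass

@dataclass
class SSNMessage:
    value: float      # measurement, already in the common coordinate system
    velocity: float   # rate of change, used to scale to a common time
    variance: float   # variance estimate reported by the SSN
    validity: float   # quality indicator v in [0, 1]
    timestamp: float  # send time (assumes synchronized clocks)

def synchronize(msg, t_now):
    # Map a received value to the common point in time t_now using a
    # constant-rate model of how the value changes (sketch).
    return msg.value + msg.velocity * (t_now - msg.timestamp)

def fuse(messages, t_now, max_age=0.5):
    """Selection + filter (sketch): drop stale or invalid messages and
    combine the rest as a validity/variance-weighted average."""
    selected = [m for m in messages
                if t_now - m.timestamp <= max_age and m.validity > 0.0]
    if not selected:
        return None
    weights = [m.validity / m.variance for m in selected]
    values = [synchronize(m, t_now) for m in selected]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

msgs = [SSNMessage(1.00, 0.0, 0.01, 1.0, 0.95),
        SSNMessage(1.40, 0.0, 0.01, 0.0, 0.98),   # validity 0: ignored
        SSNMessage(2.00, 0.0, 0.01, 1.0, 0.10)]   # too old: ignored
assert fuse(msgs, t_now=1.0) == 1.00
```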

In this section, we described the interaction of our SSN, which represents an improved smart sensor, and its new complement, the AFN. We examined the fault tolerance related to sensor faults in [7]. In the following section, we enhance the scenario with a communication simulation to determine the influence of communication faults on the fusion result. In Section V-D2, we outline the individual implementation of the SSN and AFN based on our simulated scenario.

V. EVALUATION

A. Scenario

In this section, we evaluate our approach in an experimental setting and illustrate the benefits of our SSN and AFN architecture. Fig. 7 depicts the application of a mobile robot moving through an environment with sensors. To allow for different evaluation parameters and for producing various fault situations, we base this evaluation on a simulation. Fig. 5 presents the respective Simulink model that shows all components and signals included in the network.

The mobile system is equipped with an odometric velocity sensor that measures speed and an ultrasonic system with a limited range. A controller area network (CAN) [26] connects those transducers with two AFNs, the “virtual position” and the “position estimation.” The former integrates the velocity signals and calculates a position result. This virtual position measurement, together with the outputs of the other smart position sensors, is merged into a joint position estimate in the second AFN.

Two additional remote sensors are available. A precise laser distance sensor monitors the scenario, and the camera system


Fig. 5. Illustration of the simulated validation scenario.

TABLE II SENSOR PARAMETERS OF THE SCENARIO

localizes the robot within a limited area. The different sensor periods and observation ranges are available in Table II. Both external sensors are connected to the smart position fusion via a ZigBee network [27].

We specify a fusion period of 100 ms. This value is motivated by the control cycle of the position control of a real robot.

B. Sensors

To obtain realistic behavior of the SSNs, we chose the sensor models according to parameters of real transducers.

1) Ultrasonic Sensors: With a known operating temperature and for waves emitted orthogonally to the obstacle, previous work (see, for example, [28] and [29]) determined an error of 2% of the measured distance for ultrasonic sensors. Based on additional examinations with nonoptimal reflection angles, we assumed a normal distribution for the ultrasonic system with a standard deviation of σ_us(d) = (1/3) · 6% · d. Seignez et al. [28] differentiated two further measurement distortions: 5% of the measurements are outliers and generate a larger error than expected from the default distribution and σ_us. If a sensor operates in an environment with unknown temperature, an additional uncertainty of approximately 3% has to be taken into account. We did not consider changes in temperature in our simulation but did account for outliers. We assume that our mobile robot is equipped with SRF08 ultrasonic sensors made by Devantech. This type of sensor is often used in robotics applications. The maximum distance of 3 m that can be measured by the sensor depends on the measurement period.
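The resulting sensor model can be sketched as follows. The 3 m range limit and the σ_us(d) = (1/3) · 6% · d noise model follow the text; the magnitude of the 5% outliers is an assumption:

```python
import random

def ultrasonic_measurement(true_distance, rng,
                           outlier_rate=0.05, outlier_scale=5.0):
    """Simulated ultrasonic reading (sketch): Gaussian noise with
    sigma_us(d) = (1/3) * 6% * d, plus 5% outliers whose error is
    larger than the default distribution predicts.  The outlier
    magnitude (outlier_scale * sigma) is our assumption."""
    if true_distance > 3.0:   # beyond the sensor's maximum range
        return None
    sigma = (1.0 / 3.0) * 0.06 * true_distance
    if rng.random() < outlier_rate:
        sigma *= outlier_scale
    return rng.gauss(true_distance, sigma)

rng = random.Random(42)
samples = [ultrasonic_measurement(2.0, rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
assert abs(mean - 2.0) < 0.05   # unbiased around the true distance
```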

2) Odometric Sensors: The small robot modeled in our scenario has an odometric system with 120 ticks per rotation. The assumptions concerning noise in the velocity measurements are similar to the approaches presented in [30].

3) Laser Distance Sensor: The model of the laser distance sensor was inspired by a YT87MG sensor made by Wenglor [31], which we use in robot applications for dependable distance measurements. The maximum deviation is known to be 2% of the working range. The manufacturer specifies the response time to be 8 ms.

4) Camera: Based on our experience in robot localization with web cameras, we assume a processing time of 100 ms for one frame in our scenario. If the environment provides constant homogeneous illumination, a precise localization is possible. The error function of the calculated position can be approximated by a Gaussian distribution like the one used in the robot simulator described in [32].

Knowledge of the sensor specifications and individual uncertainties was used to implement a realistic “measurement” of the robot’s position by the SSNs in the validation scenario.

C. Communication Simulation

Every simulation faces the challenge of reproducing results as close as possible to a real environment. Next to an adequate simulation of the sensors, we also tried to model the communication realistically. As described in Section III, we consider communication characteristics like jitter and latencies. Considering end-to-end delivery, packet loss produces by far the largest latencies. Other sources, such as media access, have a rather small impact, and we therefore concentrate on packet loss rates in our evaluation. For an examination of the influence of packet loss, we have to choose an appropriate network simulator.

A good overview of different simulators is given in [33], and a table with various attributes like “parallel execution,” “radio propagation models,” “mobility models,” and “physical-layer and antenna models” can be found in [34]. Of course, today, a wide range of network simulators and emulators for different domains and purposes exists. The most common are ns-2 [35] and ns-3 [36], which support the simulation of TCP, UDP, routing, and multicast protocols over wired and wireless networks. ns-2 is also applicable for network simulations in combination with Matlab/Simulink through the cosimulator PiccSIM [37], which handles communication, time synchronization, and the update of node positions between both simulators. In addition to PiccSIM, however, some other toolboxes for the simulation of communication channels in Matlab/Simulink exist, such as the SimEvents toolbox [38] and TrueTime [39]. SimEvents offers models for queues, servers, routing, timers, etc., and built-in statistics (such as delay, throughput, and average queue length), which allows for the simulation of transactions between components. This functionality makes it possible to model various types of network protocols. In contrast to SimEvents, TrueTime is aimed at real-time networks, as well as real-time kernels. The major advantage of TrueTime is that it already has models for the distribution of messages according to a chosen network model. We chose TrueTime for the network simulation because of its large number of communication models, its simple extensibility, and the fact that it is easy to integrate into Simulink simulations.

As stated earlier, packet loss causes the largest variance in end-to-end latency. Therefore, we were mainly interested in


Fig. 6. Gilbert–Elliot model.

determining the impact of this parameter on the sensor information and ran the simulation under different packet loss rates and distribution models. In addition, TrueTime simulates latency and jitter for the respective CAN and the wireless network to provide realistic network characteristics. In the following paragraphs, we introduce two common models that we used for simulating packet losses.

1) Random: This is the standard model in TrueTime, where packet loss occurs independently with a specified probability P_R during a transmission.

2) Simplified Gilbert–Elliot: As the name suggests, this model is a simplified version of the computationally expensive Gilbert–Elliot bit-error model, which was first introduced by Gilbert [40] in 1960 and extended by Elliott [41] in 1963. It was shown in [42] that this model is useful for simulating errors on wireless links in industrial environments. In contrast to the previous model, packet losses do not occur independently but are instead correlated.

The simplified Gilbert–Elliot model as described in [43] consists of two states (see Fig. 6), where G denotes the “good” channel state, and B denotes the “bad” channel state. The probability of a packet loss in the “good” state is zero, and in the “bad” state, it is greater than zero (for simplicity, we set this probability to 0.95). The probability of remaining in state G if the previous state was also G is P_GG, and that of remaining in the bad state if the previous state was B is P_BB. The transition from state B to state G takes place with probability P_BG = 1 − P_BB, and that for the opposite direction takes place with probability P_GB = 1 − P_GG.

Unlike the previous model, the determination of parameters even in the simplified Gilbert–Elliot model can be very complicated. Gilbert himself suggested a method to estimate those parameters from a measured trace. Different methods for the parameter estimation were proposed in [44] and [45]. In the method we used, the transition probabilities P_GB and P_BG are deduced from the variance of the burst length of packet losses and the variance of the burst length of correctly transmitted packets (see [46]).
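Under these definitions, the simplified Gilbert–Elliot channel reduces to a two-state Markov chain. A sketch, with the 0.95 loss probability in the bad state taken from the text and an illustrative P_BB:

```python
import random

def gilbert_elliot_losses(n, p_gg, p_bb, p_loss_bad=0.95, seed=0):
    """Simplified Gilbert-Elliot model (sketch): packets are never lost
    in the 'good' state G and are lost with probability 0.95 in the
    'bad' state B; states persist with probabilities P_GG and P_BB."""
    rng = random.Random(seed)
    state = "G"
    lost = []
    for _ in range(n):
        lost.append(state == "B" and rng.random() < p_loss_bad)
        if state == "G":
            state = "G" if rng.random() < p_gg else "B"
        else:
            state = "B" if rng.random() < p_bb else "G"
    return lost

lost = gilbert_elliot_losses(100_000, p_gg=0.9798, p_bb=0.5)
rate = sum(lost) / len(lost)
assert 0.0 < rate < 0.1   # bursty, correlated losses at a low overall rate
```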

3) Other Packet Loss Models: For different areas of application, like wireless local area networks or, for example, the global system for mobile communications, there also exist other packet loss models based on Markov chains with more than two states, like the three-state packet loss model introduced in [47] or the four-state model introduced in [48]. An advantage of using Markov chains is that they can be trained to produce results that are similar to a measured trace. In some cases, it is also common to directly use traces to decide whether a packet is lost or not [35]. Periodic packet loss models are mostly used to build reliable multimedia applications [49].

D. Simulation Implementation

1) Simulink: The robot model, SSNs, AFNs, and the communication simulation were implemented in the Simulink model shown in Fig. 7. On the left side, the blocks of the internal SSN are connected with the CAN simulation. In Simulink, the small arrows and those blocks with a point on one side represent data flow between two blocks. The simulated SSNs of the intelligent environment are visible on the right. Above the central position estimator, the virtual position calculation receives the measurements of the velocity sensor. The output of the position estimation block is fed back to every SSN for outlier detection.

2) AFN and SSN Implementation: Each AFN and SSN uses a noise acceleration model (described, for example, in [50]) for smoothing and estimation. The maneuvering model of the robot movement is defined for two state variables, position and velocity, x(t) = [p ṗ]^T, in a discrete-time system, i.e.,

x(t_{k+1}) = [ 1  ΔT ; 0  1 ] x(t_k) + v(t_k)

y(t_k) = [ 1  0 ] x(t_k) + w(t_k)

where ΔT is the sampling interval of the sensor or the temporal distance from the last sensor measurement to the current estimation. The variable v(t_k) represents zero-mean white Gaussian process noise. The observation model for the sensors is defined such that w is a zero-mean white Gaussian process with covariance R.

Of course, this model does not describe realistic robot motion. Hence, for a specific scenario, a more sophisticated model could be employed. However, we do not aim for a perfectly adapted mathematical description of the process but wish to show that a quite general model can be used in our context with good results.

The mentioned model was implemented in each SSN and was used for outlier detection according to [51], in combination with a Kalman filter for state and variance estimation. The selection module of the AFN sorts all incoming measurements according to their time stamps. Depending on the current velocity of the robot, the last n measurements are selected and smoothed, and a fusion estimate is calculated by a Kalman filter.
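A self-contained sketch of one predict/update cycle of such a Kalman filter for the constant-velocity model above. The noise intensities q and r are illustrative, not the values used in the simulation:

```python
def kalman_step(x, P, y, dt, q=0.1, r=0.05):
    """One predict/update cycle for the model in the text:
    x = [position, velocity], F = [[1, dt], [0, 1]], H = [1, 0].
    q and r are illustrative process/measurement noise intensities."""
    # predict: x' = F x,  P' = F P F^T + Q  (with Q = q * I for simplicity)
    xp = [x[0] + dt * x[1], x[1]]
    Pp = [[P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q,
           P[0][1] + dt * P[1][1]],
          [P[1][0] + dt * P[1][1],
           P[1][1] + q]]
    # update with a position measurement y
    s = Pp[0][0] + r                      # innovation variance
    k = [Pp[0][0] / s, Pp[1][0] / s]      # Kalman gain
    innov = y - xp[0]
    x_new = [xp[0] + k[0] * innov, xp[1] + k[1] * innov]
    P_new = [[(1 - k[0]) * Pp[0][0], (1 - k[0]) * Pp[0][1]],
             [Pp[1][0] - k[1] * Pp[0][0], Pp[1][1] - k[1] * Pp[0][1]]]
    return x_new, P_new

# track a robot moving at 0.5 m/s, sampled every 100 ms
x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for k in range(1, 51):
    x, P = kalman_step(x, P, y=0.5 * k * 0.1, dt=0.1)
assert abs(x[0] - 2.5) < 0.1   # position estimate converges to the truth
assert abs(x[1] - 0.5) < 0.2   # velocity estimate converges as well
```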

3) Network Configuration: The TrueTime CAN and ZigBee networks were configured in compliance with the CAN standard (see [26]) and the IEEE 802.15.4 standard [52], as listed in Table III. The standard allows a maximum of seven retransmissions if no acknowledgment packet (ACK) is received within a given time. To examine the impact of retransmissions, we ran simulations with and without them.

4) Simulation: For the sake of repeatability, we initialized all random number generators before a simulation run with a given seed. A sequence of simulations for one packet loss model, one packet loss probability, and a fixed number of retransmissions was performed with different seeds until no more significant change in the cumulative position estimation error occurred. The results of all simulations are presented and discussed in the following section.


Fig. 7. Simulink implementation of the validation scenario.

TABLE III TRUETIME NETWORK CONFIGURATION

E. Results

In Fig. 8, we present the results of the position estimation of the AFN. The deviation of the estimation is presented as a cumulative error for different models and packet loss probabilities. The diagrams in the first row are based on simulations with random packet loss, while the second row contains the results produced with the simplified Gilbert–Elliot model. Different types of lines indicate different error probabilities, as shown in the legends. The columns separate communication without retransmissions on the left and with up to seven retransmissions on the right.

All four diagrams show a joint result for different packet loss probabilities (e.g., 0, 0.2, 0.5, 0.8, and 1, where “0” means that no disturbance occurred and “1” means that no communication was possible): the position estimation error increases with communication disturbances. The influence of the packet loss probability differs according to the disturbance model.

1) Random Packet Loss Model: The loosely dashed line indicates a packet loss of 50%. The comparison of the cumulative error for this disturbance level with the “optimal” estimation (upper solid line), which means that no packet loss occurred, shows only a small deviation. Our dynamic integration of all incoming measurements into an appropriate fusion model copes with communication disturbances of up to 50%. However, in the same simulations, we also obtain larger outliers of up to 40 cm, although with a small frequency. The appearance rate and magnitude of those extreme outliers rise with a higher rate of packet loss.

As expected, the retransmission of packets stabilizes the position estimation. Retransmission reduces the overall probability of losing one of the packets. Now, a loss rate of 80% shows a level of deviation similar to a packet loss of 50% without retransmissions.

2) Simplified Gilbert–Elliot Model: According to the results in [42], we defined a period of 50 consecutively transmitted packets as the average packet-loss-free burst length. For that reason, we calculated a fixed value of P_GG = 0.979800, which is consistent with that average number of transmitted packets, with the method proposed in [46]. We calculated different values for P_BB according to increasing packet loss burst lengths. The tested burst lengths of packet losses and the corresponding overall packet loss probabilities are listed in Fig. 8(c) and (d). The left value defines the average burst length of consecutive lost packets, while the right one in brackets gives the probability of a packet loss.
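The relation between a self-transition probability and the mean burst length can be checked with the geometric dwell-time argument. The simple inversion below yields P_GG = 0.98, close to but not identical to the value 0.9798 above, which was obtained with the variance-based method of [46]:

```python
def p_stay_from_mean_burst(mean_burst_length):
    # In a two-state Markov channel, the dwell time in a state with
    # self-transition probability p is geometrically distributed with
    # mean L = 1 / (1 - p), so p = 1 - 1 / L.
    return 1.0 - 1.0 / mean_burst_length

p_gg = p_stay_from_mean_burst(50)   # ~50 loss-free packets on average
assert abs(p_gg - 0.98) < 1e-12
p_bb = p_stay_from_mean_burst(5)    # mean loss burst of 5 packets (example)
assert abs(p_bb - 0.8) < 1e-12
```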

The more realistic Gilbert–Elliot model produces a larger estimation error than comparable settings using the random model. The loosely dashed line indicating 50% packet loss shows that 80% of the estimation errors are smaller than 10 cm. In comparison, the random model shows an error of 2.2 cm in Fig. 8(a). The retransmission of lost packets improves the position estimation validity. It should be noted that the results of both models exhibit similar behavior in this respect [compare the loosely dashed lines in Fig. 8(b) and (d)]. Retransmission decreases the probability of larger bursts of lost packets.

3) Problem of Undersampled Measurements: In Fig. 8(a)–(d), we illustrate that the integration of external measurements improves the position estimate. The cumulative error of an estimation based on local sensors represents a


Fig. 8. Cumulative error probability of the estimation for a random disturbance in the communication and based on the Gilbert–Elliot model. The single values represent the overall packet loss probability, while the values in brackets for the Gilbert–Elliot model correspond to the mean packet loss burst length. (a) Random disturbance. (b) Random disturbance with retransmission. (c) Gilbert–Elliot disturbance. (d) Gilbert–Elliot disturbance with retransmission.

lower limit. In the case of no retransmission, however, we observe that the results concerning the loss probability become less precise. The cumulative error graphs to the right of the lower solid reference line (no external sensor) indicate this. We suppose that this behavior might be caused by our assumptions of a Gaussian distribution in our selection, synchronization, and fusion implementation. Packet losses reduce the number of measurements and therefore change the distribution function of the measurements. Undersampling decreases the possibility of estimating the underlying value from a number of measurements within a given confidence range.

VI. CONCLUSION AND FUTURE WORK

We have presented a holistic concept of fault-tolerant distributed environment perception. In contrast to other approaches, we have considered all aspects of a distributed measurement chain: sensing, communication, and fusion. For those elements, we have analyzed and classified the possible faults of transducers and the network. Based on this investigation, we have derived a generic programming abstraction. The basic elements are SSNs and AFNs, which closely cooperate to improve robustness and dependability.

To validate our approach, we have considered disturbances in the environment and faults of the system. We have shown the benefit of the flexible and modular architecture of SSNs and AFNs. The outcome of our experiments further demonstrated major advantages. First, our modules provide information about the quality of a measurement that is exploited for fusion and application programming. Second, the fault tolerance properties of the sensing system are improved by using information from previous fusions. This results in the detection of a large number of failures and disturbances that are difficult to detect by other forms of redundancy. The quantitative evaluation of the combined mechanisms shows that we can tolerate packet losses of as much as 50% without major validity degradation. Our experiments have also shown that the results of the aggregated position estimation are much better than those based on local sensors only.

Currently, we are working on the extension of the presented structure in the following directions.

• Fault classification within the SSN uses a simple rule-based method to analyze and identify faults. We want to examine adaptable and robust classification strategies within the SSN and AFN structure. The extended fault identification method could be used for an improved selection of measurements, and it will result in a fine-grained adaptation of smoothing and estimation algorithms in the SSN and AFN.

• As shown in the preceding section, packet loss produces an incorrect perception of the environment. In a


periodic system, those missing messages can be detected. If this loss of packets affects the current distribution function of sensor measurements in a way that it significantly differs from the reference, then we have to adapt the subsequent processing steps.

• The “data sheets” of the sensors integrated in our scenario so far are simple Matlab structures. More general solutions are presented in [8] and [53]. Based on the XML description, we will automatically generate the implementation of the SSN and AFN.

• Currently, our results are based on a simulation. The Matlab/Simulink environment offers a code generation tool chain for embedded devices. We are pursuing work on an improved tool chain to apply this concept to real hardware.

REFERENCES

[1] R. Szewczyk, E. Osterweil, J. Polastre, M. Hamilton, A. Mainwaring, and D. Estrin, “Habitat monitoring with sensor networks,” Commun. ACM, vol. 47, no. 6, pp. 34–40, Jun. 2004.

[2] N. Priyantha, A. Chakraborty, and H. Balakrishnan, “The cricket location-support system,” in Proc. 6th Annu. Int. Conf. Mobile Comput. Netw., 2000, pp. 32–43.

[3] W. Tsujita, A. Yoshino, H. Ishida, and T. Moriizumi, “Gas sensor network for air-pollution monitoring,” Sens. Actuators B, Chem., vol. 110, no. 2, pp. 304–311, Oct. 2005.

[4] N. Leonard, D. Paley, F. Lekien, R. Sepulchre, D. Fratantoni, and R. Davis, “Collective motion, sensor networks, and ocean sampling,” Proc. IEEE, vol. 95, no. 1, pp. 48–74, Jan. 2007.

[5] R. Breckenridge and S. Katzberg, “The status of smart sensors,” in Proc. Sensor Syst. 80’s Conf., Colorado Springs, CO, 1980, pp. 41–46.

[6] IEEE Standard for a Smart Transducer Interface for Sensors and Actuators, IEEE 1451.2, 1997.

[7] S. Zug and J. Kaiser, “An approach towards smart fault-tolerant sensors,” in Proc. Int. Workshop ROSE, Lecco, Italy, Nov. 2009, pp. 35–40.

[8] J. Kaiser, S. Zug, M. Schulze, and H. Piontek, “Exploiting self-descriptions for checking interoperations between embedded components,” in Proc. Int. Workshop DNCMS, Napoli, Italy, Oct. 2008, pp. 41–45.

[9] M. van der Meulen, “On the use of smart sensors, common cause failure and the need for diversity,” in Proc. 6th Int. Symp. Programmable Electron. Syst. Safety Related Appl., Cologne, Germany, May 2004.

[10] G. Vachtsevanos, F. Lewis, M. Roemer, A. Hess, and B. Wu, Intelligent Fault Diagnosis and Prognosis for Engineering Systems. Hoboken, NJ: Wiley, 2006.

[11] V. P. Nelson, “Fault-tolerant computing: Fundamental concepts,” Computer, vol. 23, no. 7, pp. 19–25, Jul. 1990.

[12] B. Hardekopf, K. Kwiat, and S. Upadhyaya, “Secure and fault-tolerant voting in distributed systems,” in Proc. IEEE Aerosp. Conf., 2001, vol. 3, pp. 1117–1126.

[13] K. Marzullo, “Tolerating failures of continuous-valued sensors,” ACM Trans. Comput. Syst. (TOCS), vol. 8, no. 4, pp. 284–304, Nov. 1990.

[14] F. Koushanfar, M. Potkonjak, and A. Sangiovanni-Vincentelli, “On-line fault detection of sensor measurements,” in Proc. IEEE Sensors, 2003, vol. 2, pp. 974–979.

[15] R. Isermann, “Model-based fault-detection and diagnosis—Status and applications,” Annu. Rev. Control, vol. 29, no. 1, pp. 71–85, 2005.

[16] P. Frank and B. Köppen-Seliger, “Fuzzy logic and neural network applications to fault diagnosis,” Int. J. Approx. Reason., vol. 16, no. 1, pp. 67–88, Jan. 1997.

[17] R. Patton, “Fault-tolerant control systems: The 1997 situation,” in Proc. IFAC Symp. Fault Detection Supervision Safety Tech. Processes, 1997, vol. 3, pp. 1033–1054.

[18] K. Worden, G. Manson, and N. Fieller, “Damage detection using outlier analysis,” J. Sound Vib., vol. 229, no. 3, pp. 647–667, Jan. 2000.

[19] S. Doebling, C. Farrar, and M. Prime, “A summary review of vibration-based damage identification methods,” Shock Vib. Dig., vol. 30, no. 2, pp. 91–105, 1998.

[20] A. Dietrich, S. Zug, and J. Kaiser, “Detecting external measurement disturbances based on statistical analysis for smart sensors,” in Proc. IEEE ISIE, 2010, pp. 2067–2072.

[21] A. Makarenko and H. Durrant-Whyte, “Decentralized data fusion and control in active sensor networks,” in Proc. 7th Int. Conf. Inf. Fusion, 2004, pp. 479–486.

[22] B. Agarwalla, P. Hutto, A. Paul, and U. Ramachandran, “Fusion channels: A multi-sensor data fusion architecture,” Georgia Tech College Comput., Atlanta, GA, Tech. Rep. 53, GIT-CC-02, 2002.

[23] R. Bose, A. Helal, V. Sivakumar, and S. Lim, “Virtual sensors for service oriented intelligent environments,” in Proc. 3rd IASTED Int. Conf. Adv. Comput. Sci. Technol., 2007, pp. 165–170.

[24] T. C. Henderson and M. Dekhil, “Instrumented sensor system architecture,” Int. J. Robot. Res., vol. 17, no. 4, pp. 402–417, Apr. 1998.

[25] H. Benítez Pérez, J. Ortega Arjona, and G. Reza Latif Shabgahi, “Definition and empirical evaluation of voters for redundant smart sensor systems,” Computación y Sistemas, vol. 11, no. 1, pp. 39–60, Sep. 2007. [Online]. Available: http://redalyc.uaemex.mx/redalyc/pdf/615/61511105.pdf

[26] CAN Specification, ver. 2.0, Robert Bosch GmbH, Stuttgart, Germany, Sep. 1991.

[27] ZigBee Specification, IEEE 802.15.4, 2003.

[28] E. Seignez, M. Kieffer, A. Lambert, E. Walter, and T. Maurin, “Real-time bounded-error state estimation for vehicle tracking,” Int. J. Robot. Res., vol. 28, no. 1, pp. 34–48, Jan. 2009.

[29] J. de Gois and M. Hiller, “Ultrasound sensor system with fuzzy data processing,” in Proc. 7th Int. Conf. CLAWAR, 2004, pp. 411–417.

[30] A. Martinelli and R. Siegwart, “Estimating the odometry error of a mobile robot during navigation,” in Proc. ECMR, Warsaw, Poland, Sep. 2003.

[31] Wenglor Sensoric, Data sheet YT87MGV80, 2003. [Online]. Available: http://www.q-tech.hu/pdf/Wenglor/Photoelectric%20sensors/YT87MGV80.pdf

[32] R. Groß and M. Dorigo, “Evolving a cooperative transport behavior for two simple robots,” in Proc. 6th Int. Conf. Artif. Evol., Evolution Artificielle (EA), vol. 2936, Lecture Notes in Computer Science, 2004, pp. 305–317.

[33] G. Nicolescu and P. J. Mosterman, Model-Based Design for Embedded Systems. Boca Raton, FL: CRC Press, 2009.

[34] M. Becker, Simulation tool comparison matrix—CRUISE IST project, 2008. [Online]. Available: http://www.comnets.uni-bremen.de/~mab/cruise/simulation-tool-comparison-matrix.html

[35] The network simulator—ns-2, 2006. [Online]. Available: http://www.isi.edu/nsnam/ns/

[36] The ns-3 network simulator, 2009. [Online]. Available: http://www.nsnam.org

[37] PiccSIM—Simulation of wireless control systems, 2009. [Online]. Available: http://autsys.tkk.fi/en/Control/PiccSIM

[38] SimEvents 3.0—Model and simulate discrete-event systems, 2009.

[39] TrueTime: Simulation of networked and embedded control systems, 2009. [Online]. Available: http://www.control.lth.se/truetime/

[40] E. Gilbert, “Capacity of a burst-noise channel,” Bell Syst. Tech. J., vol. 39, no. 9, pp. 1253–1265, Sep. 1960.

[41] E. Elliott, “Estimates of error rates for codes on burst-noise channels,” Bell Syst. Tech. J., vol. 42, no. 9, pp. 1977–1997, Sep. 1963.

[42] A. Willig, M. Kubisch, C. Hoene, and A. Wolisz, “Measurements of a wireless link in an industrial environment using an IEEE 802.11-compliant physical layer,” IEEE Trans. Ind. Electron., vol. 49, no. 6, pp. 1265–1282, Dec. 2002.

[43] B. Wysocki and A. Dadej, Advanced Wired and Wireless Networks. New York: Springer-Verlag, 2004.

[44] M. Yajnik, S. Moon, J. Kurose, and D. Towsley, “Measurement and modeling of the temporal dependence in packet loss,” in Proc. IEEE INFOCOM, 1999, vol. 1, pp. 345–352.

[45] J. Hartwell and A. Fapojuwo, “Modeling and characterization of frame loss process in IEEE 802.11 wireless local area networks,” in Proc. IEEE 60th VTC—Fall, Sep. 2004, vol. 6, pp. 4481–4485.

[46] J. Poikonen, “Geometric run length packet channel models applied in DVB-H simulations,” in Proc. IEEE 17th Int. Symp. PIMRC, Helsinki, Finland, 2006, pp. 1–5.

[47] B. Milner, “Robust speech recognition in burst-like packet loss,” in Proc. IEEE ICASSP, Salt Lake City, UT, May 2001, vol. 1, pp. 261–264.

[48] A. Clark, “Modeling the effects of burst packet loss and recency on subjective voice quality,” in Proc. Internet Telephony Workshop, New York, 2001, pp. 123–127.

[49] J. Ellis, C. Pursell, and J. Rahman, Voice, Video, and Data Network Convergence. New York: Academic, Sep. 2003.

[50] Y. Bar-Shalom, X.-R. Li, and T. Kirubarajan, Estimation With Applications to Tracking and Navigation, 1st ed. Hoboken, NJ: Wiley, Jun. 2001.


[51] H. B. Mitchell, Multi-Sensor Data Fusion: An Introduction, 1st ed. Berlin, Germany: Springer-Verlag, 2007. [Online]. Available: http://www.loc.gov/catdir/enhancements/fy0825/2007926176-d.html

[52] Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (WPANs), IEEE 802.15.4, 2006. [Online]. Available: http://standards.ieee.org/getieee802/download/802.15.4-2006.pdf

[53] J. Kaiser and H. Piontek, "CODES: Supporting the development process in a publish/subscribe system," in Proc. 4th WISES, Vienna, Austria, Jun. 30, 2006, pp. 1–12.

Sebastian Zug received the diploma degree in mechanical engineering from the Technical University of Cottbus, Cottbus, Germany, in 2005. He is currently working toward the Ph.D. degree at Otto-von-Guericke University Magdeburg, Magdeburg, Germany. His Ph.D. research, supervised by Prof. J. Kaiser, focuses on methods and architectures for fault-tolerant data fusion in distributed heterogeneous systems.

Since 2005, he has been a member of the Department of Embedded Systems and Operating Systems (EOS), Faculty of Computer Science, Institute of Distributed Systems, Otto-von-Guericke University Magdeburg.

André Dietrich received the M.S. degree in computer science from the University of Leipzig, Leipzig, Germany, in 2009.

He was an Application Developer with the Research Establishment for Applied Science (FGAN), Wachtberg, Germany. He is currently a Research Assistant with the "Virtual and Augmented Reality for Highly Safety and Reliable Embedded Systems" project (ViERforES), Faculty of Computer Science, Institute of Distributed Systems, Otto-von-Guericke University Magdeburg, Magdeburg, Germany.

Jörg Kaiser received the M.S. and Ph.D. degrees from Bonn University, Bonn, Germany.

He worked with the German National Research Centre for Information Technology, St. Augustin, Germany, and was a Professor with the University of Ulm, Ulm, Germany. He is currently a Full Professor with the Faculty of Computer Science, Otto-von-Guericke University Magdeburg, Magdeburg, Germany, where he also leads the Department of Embedded Systems and Operating Systems (EOS). His research ranges from computer architecture and operating systems to distributed systems and middleware, with emphasis on component orientation, fault tolerance, and real time. His current research areas have a strong emphasis on large-scale distributed real-time systems supervising and controlling real-world applications. He recently coinitiated a Competence Centre for Service Robotics together with the Fraunhofer IFF in Magdeburg to foster the exchange between academia and more industry-oriented developments.