irisnet: an architecture for a worldwide sensor web · an architecture for a worldwide sensor web w...

22 PERVASIVEcomputing Published by the IEEE CS and IEEE ComSoc ■ 1536-1268/03/$17.00 © 2003 IEEE

S E N S O R A N D A C T U AT O R N E T W O R K S

IrisNet:An Architecture for aWorldwide Sensor Web

Wide-area architectures for perva-sive sensing will enable a newgeneration of powerful distrib-uted sensing services. Such archi-tectures have so far received lit-

tle attention but are increasingly relevant becauseof a confluence of technological trends. Com-modity off-the-shelf sensors—including videocameras (Webcams), microphones, and motiondetectors—have become readily available. Suchsensors interface easily with today’s low-cost PCs,which are deployed globally and connect throughthe Internet. Indeed, we could even view a PC’s

high-speed network interface asa rich sensor that senses the vir-tual environment of a LAN orthe Internet rather than thephysical environment. Together,these devices present a promis-ing hardware platform for anemerging wide-area sensor net-

work. What’s missing are the architecture, algo-rithms, and software system needed to orchestratethis hardware into a global sensor system thatresponds to users’ queries.

We envision a worldwide sensor web, in whichusers can query, as a single unit, vast quantities ofdata from thousands or even millions of widelydistributed, heterogeneous sensors. Internet-con-nected PCs that source sensor feeds and cooper-ate to answer users’ queries will form the globalsensor web’s backbone. Developers of wide-areasensing services (service authors) will deploy their

services on this distributed infrastructure.Imagine the following scenario: after an oil

spill, an ecologist wishes to know all locationswhere oil has significantly encroached on thecoastal habitat. She queries a coastal-monitoringservice, which collects data from video camerasdirected at the coastline. In response, she receivesboth images of these contaminated sites and theirgeographic locations. The same coastal-moni-toring service can store triggered queries, wherebythe service notifies the appropriate lifeguardswhen a strong riptide develops in a beach region.

In the IrisNet (Internet-scale Resource-Inten-sive Sensor Network Services) project at IntelResearch, we are designing an architecture andbuilding a system that enable easy deployment ofsuch wide-area sensing services. Our aim is toprovide the missing software components forrealizing a worldwide sensor web.

Worldwide sensor web:context and benefits

To date, sensor-network research has largelybeen defined by the design of algorithms and sys-tems to cope with the severe resource constraintsof tiny battery-powered sensors that use wirelesscommunication (for example, slow CPUs, low-bit-rate radios, and scarce energy). Such sensor net-works are distributed over a single, contiguouscommunication domain. They use simple sensorsthat provide time series of single numerical mea-surements, such as temperature, pressure, lightlevel, and so on. Researchers have developed spe-

Today’s common computing hardware—Internet connected desktop PCsand inexpensive, commodity off-the-shelf sensors such as Webcams—isan ideal platform for a worldwide sensor web. IrisNet provides asoftware infrastructure for this platform that lets users query globallydistributed collections of high-bit-rate sensors powerfully and efficiently.

Phillip B. Gibbons and Brad Karp Intel Research Pittsburgh

Yan Ke, Suman Nath, andSrinivasan Seshan Carnegie Mellon University

cialized hardware, operating systems, pro-gramming languages, and database sys-tems to accommodate such severely con-strained environments.1–3

In contrast, in IrisNet we seek tobroaden the traditional notion of sensornetworks to include wide-area “sensorwebs,”4 such as those comprising Inter-net-connected, widely-dispersed PC-classnodes with powerful CPUs that canprocess rich sensor data sources. A world-wide sensor web seamlessly integrates awide range of sensor feeds, from high-bit-rate feeds from Webcam-equipped PCs tolow-bit-rate feeds from traditional wire-less sensor networks.

Additionally, a worldwide sensor webenables a wide variety of useful services.Consumer-oriented services include

• Alert services for notifying users whento head to the bus stop or when waterconditions have become dangerous

• Waiting-time monitors for reportingon queueing delays at post offices,food courts, and so on

• Parking-space-finder services fordirecting drivers to available parkingspaces near their destinations

• Lost-and-found services for locatinglost objects or pets

• Watch-my-child (or watch-my-parent)services for monitoring children play-ing in the neighborhood (or elderlyparents about town)

Other promising services exist in vari-ous domains. Epidemic early warning ser-vices (public health), homeland defenseservices (security), computer networkmonitoring services (technology), andInternet-scale sensing observatories (nat-ural sciences) are a few possibilities.

Envisioning a worldwidesensor web

Wide-area sensing services have sev-eral key demands that drive worldwidesensor web design.

Planet-wide local data collectionand storage

First and foremost, wide-area sensingnecessarily employs numerous globallydistributed sensing devices that observethe physical world. Because the sensorswould collect a vast volume of data, andbecause we would want to retain both themost recent observations and a historicalrecord, the system should store observa-tions near their sources and transmit themacross the Internet only as needed.

Real-time adaptation of collectionand processing

The system should be able to reconfig-ure data-collection and -filtering processesin reaction to sensed data. Sampling ratesmight change; we might invoke new, spe-cial-purpose processing routines; wemight even control actuators to modifydata collection. Deciding when and howto adapt collection and processing mightdepend on sophisticated analysis of dataderived from multiple sensors.

Data as a single queriable unitThe user should view the sensing

device network as a single unit that sup-ports a high-level query language. Eachquery would be able to operate over datacollected from across the global sensornetwork, just as a single Google searchquery encompasses millions of Webpages. Beyond the keyword searchesGoogle offers, the worldwide sensor webshould support rich queries, which couldinclude arithmetic, aggregation, andother database operators. (Google is notintended for sensing applications and sooffers few of these features.)

Queries posed anywhere on the Internet

We are accustomed to retrieving infor-mation stored anywhere on the Internetfrom anywhere on the Internet. The sen-sor web should preserve this ubiquitousinformation access. At the same time, the

system should actively seek to exploit anylocality between the querier and thequeried data. Because the sensed data areinherently coupled to a physical location,many queries will be posed within tens ofmiles of the data. For example, lifeguardsresponsible for monitoring a beach regionwould tend to request riptide alerts forthe waters adjacent to “their” beach.

Data integrity and privacyPervasive monitoring of the physical

world raises significant data integrityand privacy concerns. In our initial Iris-Net prototype, we assume that the entireworldwide sensor web is administeredby a single, universally trusted author-ity. In any real-world deployment of aglobal sensor web, different authoritieswill control portions of the sensing infra-structure, and sensor service authorsmight wish to compose services acrossauthority boundaries.

Moreover, different service usersmight trust different authorities, as coulddifferent service authors. The sensor webshould support defining and enforcingdata integrity and privacy policies thatmatch the concerns of four constituen-cies: service users, service authors, thosein a sensed region, and those operatingthe sensor web. The combination ofthese policies should determine how dataare distributed, processed, and shared ina sensor web.

RobustnessIn a system that uses so many sensing

devices and so many computing devices,failures will occur often. The systemshould operate smoothly despite thesefailures and the resulting unreachablehosts, unavailability of previously storedobservations, missed new observations,and so on.

Ease of service authorshipFinally, the system should make it as

easy as possible for service authors to

OCTOBER–DECEMBER 2003 PERVASIVEcomputing 23

develop sensing services by providinghigh-level abstractions of the sensinginfrastructure that hide the complexitiesof the underlying distributed-data-col-lection and query-processing mecha-nisms. Ease of service authorship is cru-cial to fostering system adoption andproliferation of new services that use it.

Challenges for service authorsA service author deploying a wide-

area sensing service faces several chal-

lenges to realizing the vision we justdescribed. IrisNet can help ease serviceauthorship.

Wide-area sensing’s two most funda-mental challenges are data collection andquery answering. IrisNet provides soft-ware tools that automate these twoprocesses in a service-neutral fashion sothat several different services can bedeployed on a single IrisNet softwareinfrastructure.

Data collectionAlthough different services may seek

different measurements from sensors, asubset of data collection tasks is com-mon to all services. First, all servicesmust instrument the environment withsensors. In a naive sensor deployment,each service author would need todeploy a separate sensor to monitor thesame physical area. Requiring that sep-arate sensors be deployed for each ser-vice, however would greatly increase thecost and configuration effort involved indeploying a service, hindering new ser-vice deployment. IrisNet shares sensorsamong multiple sensing services to max-

imize each deployed sensor’s use to ser-vice authors. A particular sensor mightbe positioned to measure a physical areaof interest for one particular service, butit might, in fact, provide data useful inmultiple services.

While shared sensors simplify servicedeployment, they complicate resourcemanagement in the infrastructure. Howdo sensor owners allocate resourcesamong services competing for the samesensor? How do they protect services

running on the same sensor from oneanother? How do they avoid wastingcomputation if two sensing services over-lap in the processing they perform ontheir input? The IrisNet architecture pro-vides a runtime environment for sensornodes that assists with these tasks. More-over, sharing the sensor infrastructureacross all services forces each service toselect a set of sensor feeds relevant to thatservice. In the current IrisNet prototype,service authors learn about availabledeployed sensor feeds out-of-band ofIrisNet. In the future, IrisNet will includea sensor feed discovery service to pro-vide this system functionality.

To support several rich sensor feeds, aservice must reduce raw observations topostprocessed, extracted informationnear the data source. This filtering strat-egy avoids overloading the network andspreads the computational burden amongmany machines operating independentlyin parallel. The largest data reductionarises from using service-specific filter-ing. For example, filtering code for acoastal imaging service can reduce a 1.5-Mbps compressed video feed of a coast-

line to a single time-exposure image of20–150 Kbytes every 10 minutes (under2 Kbps). The IrisNet architecture makesit easy for services to upload such filter-ing code to sensing devices.

Querying the collected dataSensing services typically collect fil-

tered sensor readings in a database thatusers can query. For example, a coastalimaging service might use a database ofall past single-frame images for moni-tored coastline locations, while a person-locator service might use a database ofindividual identifying information (forexample, name or social security num-ber) and location. Databases for differ-ent services might have grossly differentproperties. The database for an imagingservice for an entire coast could be quitelarge, while that for a citywide personlocator might be much smaller. Similarly,updates to sensed data for services suchas a network and host monitor servicethat records every packet traversing anetwork’s monitored portions mightoccur much more frequently than updatesfor services such as a parking-space finderand coastal imaging. Furthermore, thequery patterns for distinct services mightdiffer substantially. One would expectmost parking-space-finder queries to befor neighborhoods near the querier,while queries for a coastal imaging ser-vice might cover large regions far fromthe querier. IrisNet provides tools thatlet service authors implement sensor-ser-vice databases. The key challenge indesigning such tools is ensuring that theymeet the wide range of real demands dif-ferent services make.

Sensing services have a few commonkey properties that IrisNet’s databasesupport specifically addresses. First,although update rates for sensor data-bases can vary from service to service,they are often much greater than thosein traditional databases. This occursbecause data sources in a sensor data-

24 PERVASIVEcomputing http://computer.org/pervasive


A particular sensor might be positioned to

measure a physical area of interest for one

particular service, but it might, in fact, provide

data useful in multiple services.

base are automated, allowing for fre-quent updates without human interven-tion. Consider a person-locator service,which updates stored user positions onceevery 10 minutes. For a moderately sizedmetropolitan area of 1.5 million people,the user location database would needto process approximately 25,000 up-dates per second. The entire US (approx-imately 300 million people) would haveapproximately 500,000 updates per sec-ond. To support high update rates, wecan partition the database across multi-ple host nodes. However, partitioning adatabase often limits query types orquery performance. Fortunately, whileall sensing services let users query the col-lected sensor readings, services alsocommonly target particular query types.For example, a coastal-imaging servicemight require queries to include a targetlocation and date. Similarly, a person-locator service might need queries tospecify a person’s name and social secu-rity number. IrisNet gives service authorsa hierarchically organized distributeddatabase system tailored to these com-

mon characteristics. In addition, IrisNetautomatically partitions the databaseamong the hosts that participate in it androutes queries among these hosts.

The support IrisNet gives data collec-tion and query processing drasticallyreduces the difficulty of authoring awide-area sensing service. A serviceauthor using IrisNet need only write thecode that filters sensor readings and theschema that define the database’s con-tents. IrisNet’s architecture implementsthis high-level abstraction of a sensingservice for use by service authors.

The IrisNet architectureFigure 1 shows IrisNet’s two-tier archi-

tecture. The following observations moti-vated this design:

• Despite differences between sensortypes, developers need a generic dataacquisition interface to access sensors.In IrisNet, the nodes that implementthis interface are called sensing agents(SAs).

• Services must store the service-specific

data the SAs produce in a distributeddatabase. In IrisNet, the nodes thatimplement this distributed databaseare called organizing agents (OAs).

OA architectureService developers deploy sensing ser-

vices by orchestrating a group of dedicatedOAs. Consequently, each OA partici-pates in only one sensing service (a singlephysical machine can run multiple OAs).The group of OAs for a single servicemust collect and organize sensor data toanswer the particular class of queries rel-evant to the service (for example, queriesabout parking spots for a parking-space-finder service). OAs also should providefault tolerance and balance load acrossthe system. We use the parking-space-finder service to illustrate the importantproperties of IrisNet’s query processing.

Choice of database. We envision a rich andevolving set of data types, aggregate fields,and so on within and across services, bestcaptured by self-describing tags. In addi-tion, each sensor reads a geographic loca-


SA

SA

SA

SA

SA

SA

Parking-space-finder service OA group

University Downtown

AmyPerson-locator service OA group

Kim Tom

Figure 1. The IrisNet architecture, with sensing agent (SA) and organizing agent (OA) nodes.

tion, so it’s natural to organize sensor datainto a geographic- or political-boundaryhierarchy. So, we choose to represent sen-sor-derived data in XML, which is well-suited to representing hierarchical dataand uses self-describing tags to organize

data. Figure 2a shows part of an XMLdocument representing the parking-space-finder service’s schema. Theschema describes the static (for example,<handicapped>) and dynamic (for example,<available>) data as well as the hierarchythe service uses. The database schema theservice author provides identifies the data-base’s hierarchical portion using specialID tags. Figure 2b shows the hierarchy’stree representation, formed by the ID-tagged nodes.

Distributing the database. To adapt toquery and update workloads, IrisNetdynamically partitions the sensor data-base among a collection of OAs. Iris-Net permits an OA to own any subsetof the nodes in the hierarchy (includ-ing noncontiguous subsets). It employsa distributed algorithm that—using sev-eral statistics that the OAs maintain—dynamically decides which parts ofthe sensor database should be parti-tioned or replicated and which OA tochoose when placing database frag-ments. This adaptive data placementalgorithm reduces average query re-

sponse time and network traffic (forread and write operations) while keep-ing each OA’s load below a load-limitthreshold.

The path from the hierarchy’s root toa lower node defines that lower node’sglobally unique name. (Achieving thisproperty requires that the sibling nodes’ID tags be unique.) For example, thePittsburgh node in Figure 2b is namedcity-Pittsburgh.state-PA.usRegion-NE. Each OA reg-isters with the Internet’s Domain NameSystem (DNS)5 each node that it ownswhose parent is owned by a differentOA. The registered name is the node’sglobal name prepended to the service’sname and a registered domain suffix (forexample, intel-iris.net). This is the onlymapping from the logical database hier-archy to physical hosts’ IP addresses inthe system. This mapping allows con-siderable flexibility in mapping nodes inthe XML document to OAs and OAs tophysical machines. For example, Figure3 shows an example configurationwhere one OA owns the nodes NE, PA, andPittsburgh, and different OAs own each ofthe remaining nodes.



<USRegion id="NE"><state id="PA">

<city id="Pittsburgh">

</neighborhood></city>

</state></USRegion>

<block id="block3"><parkingSpace>

<available>yes</available></parkingSpace>

</block></neighborhood><neighborhood id="Shadyside">


<available>no</available></parkingSpace>

</block>


<available>yes</available></parkingSpace>

</block>


<handicapped>yes</handicapped><available>yes</available>

</parkingSpace></block>

<neighborhood id="Oakland">

block1

NE

PA

Pittsburgh

OaklandShadyside

block2 block3block1

(b)

(a)

Figure 2. (a) Part of the XML schema usedin the parking-space-finder service; (b) thehierarchy defined by the ID-tagged nodes.

3

3

4

2

2

1 OaklandShadyside

block1

Web server

Physicalmachine

Webserver

Hierarchy Message

/parkingSpace[available=’yes’]

Pittsburgh

NE, PA

block2 block3

/block[@id=’block1’ OR @id=’block3’]/city[@id=’Pittsburgh’]/neighborhood[@id=’Oakland’]

block1

/USRegion[@id=’NE’]/state[@id=’PA’]

Figure 3. An example database configuration including (a) an XPATH query and (b) amapping of logical nodes to seven machines and the messages sent between themachines to answer the query (numbers depict the relative order of messages).

Query routing. We use the XPATH querylanguage because it is the most widelyused for XML, with good query pro-cessing support. Figure 3 shows anexample XPATH query asking for allavailable parking spaces at block1 andblock3 of Oakland.

Because our XML database is distrib-uted, providing fast and correct answersto user queries is challenging. The sys-tem must route a query directly to therelevant nodes and must pass databetween OAs only as needed. An XPATH

query selects data from a node set in thehierarchy. In IrisNet, the query is routeddirectly to the lowest common ancestor(LCA) of the nodes the query potentiallyselects. For example, Oakland is the LCAnode for the XPATH query Figure 3shows. Note that we can derive the LCAnode’s globally unique name simply byparsing the XPATH query. A DNS lookupon this name provides the IP addressesof all OAs owning the LCA node. (Aswe discuss later, a node can be replicated,and thus owned by multiple OAs.) Iris-Net selects one of these OAs, referred toas the LCA OA, and routes the query tothat OA. This mechanism prevents thehierarchy’s root from becoming a bot-tleneck; the LCA is typically far down inthe hierarchy.

On receiving a query, the LCA OAqueries its portion of the overall XMLdocument and evaluates the result. Formany queries, a single OA might nothave enough of the document to respond.The OA determines which part of a user’squery it can answer from the local docu-ment and where to gather the missingparts (extracting the needed global namesfrom the document). The OA looks upother OAs’ IP addresses and sends sub-queries to them. These OAs might, inturn, perform a similar gathering task.Finally, the LCA OA collects the differ-ent responses and sends the combinedresult back to the user. For the examplein Figure 3, the Oakland OA receives the

query from the web server and sends sub-queries to the block1 and block3 OAs. Theseeach return a list of available parkingspaces to be combined at the Oakland OAand returned to the user.

Caching and data consistency. To improvethe performance of repeated requests forsimilar information (for example, multi-ple users requesting parking spacesdowntown), OAs cache data from anyquery-gathering task that they perform.IrisNet’s query-processing mechanismuses cached data even if the new queryonly partially matches the cached data.Using this cached information proveschallenging because it complicates theOA’s task of identifying the remaininginformation it must gather to answer aquery.

Because of network propagation delaysand the use of cached data, answersreturned to users might not reflect themost recent database updates. In IrisNet,a query might specify consistency criteriaindicating its tolerance for stale data.IrisNet stores timestamps along with thecached data that indicate when the datawere created, so that an XPATH queryspecifying a tolerance automatically goesto data of appropriate freshness.

More details on IrisNet’s database dis-tribution, query processing, and cachingtechniques are available elsewhere.6

Fault tolerance. IrisNet replicates nodesin the logical hierarchy on multiple OAs.It uses two types of replicas: primaryreplicas that are kept strongly mutuallyconsistent and placed in geographicallyoptimal locations (for example, near thesensor reading sources) and secondaryreplicas that only maintain a weakly con-sistent copy of the data and are placedfar from the primary replicas to main-tain robustness when the primary repli-cas suffer correlated failure. Duringquery routing, an OA generating a sub-query to a target OA retrieves the list ofprimary replicas for the target OA usinga DNS lookup and tries replicas sequen-tially until the query successfully reachesa live target OA. If the OA fails to find alive target OA, it performs a second DNSlookup to retrieve the list of secondaryreplicas, and again tries replicas sequen-tially. IrisNet uses a weighted proba-bilistic scheme that favors nearby repli-cas to select which replica to query next.

SA architectureSAs collect raw sensor data from sev-

eral sensors (possibly of different types).The sensor types can range from Web-cams and microphones to temperatureand pressure gauges. We focus our designon sensors such as Webcams that producelarge volumes of data and that variousservices can use. Figure 4 shows the exe-


SA hostWebcam

NonI

risNe

t app

licat

ions

sensor bufferRaw

Privacy filters

SA d

aem

on

memory poolShared

Trustedsenselets

sensor bufferProcessed

Sharedmemory pool

Untrustedsenselets

Figure 4. The execution environment in a sensing-agent host.

cution environment in an IrisNet sensorhost. A sensor host receives one or moreraw sensor feeds from directly attachedsensors and stores them in circular sharedmemory buffers. Multiple services canshare an SA and access these buffers.

Programmable SAs. IrisNet lets servicesupload and control the execution of codethat filters sensor readings dynamicallyin a service-specific fashion. We call this

code a senselet. A single SA can executeone or more senselets for each service thatwants to access its sensor feeds. A sense-let instructs the SA to take the raw sensorfeed, perform a specified set of processingsteps (as the specific service requires), andsend the resulting distilled informationto a nearby OA. Although the OA receiv-ing sensor readings from an SA will typ-ically own the parts of the global sensordatabase that store the readings, this is

not required. In the more general sce-nario, the nearby OA routes the updateto the appropriate set of OAs owning therelevant portion of the database. Decou-pling the SA that generates a sensor read-ing and the OA that owns the relevantdatabase part provides a useful abstrac-tion to those services that monitor mobileentities. For example, a person-locatorservice might use an organizational hier-archy to store people’s locations (for



Researchers have made various related efforts toward enabling

wide-area sensing services. We can classify these into efforts with

similar applications goals (sensor networks and video surveillance)

and those that employ similar techniques (Internet service frame-

works and distributed databases).

Sensor networksSensor networks and IrisNet share the goal of letting users access

real-world measurements. The work on sensor networks has largely

concentrated on using motes, small nodes containing a simple

processor, a little memory, a wireless-network connection, and a

sensing device. Due to the emphasis on resource-constrained

motes, earlier key contributions have been in the areas of tiny oper-

ating systems1 and low-power network protocols.2 Mote-based

systems have relied on techniques such as directed diffusion3 to

direct sensor readings to interested parties or long-running queries4

to retrieve the needed sensor data to a front-end database. Other

groups have explored using query techniques for streaming data and

using sensor proxies to coordinate queries to address sensor motes’

limitations.5–7 None of this work considers sensor networks with

intelligent sensor nodes, high-bit-rate sensor feeds, and global scale.

Video surveillanceEfforts such as the Video Surveillance and Monitoring project8

have explored using video sensors. Such efforts have concentrated

on image-processing challenges such as identifying and tracking

moving objects in a camera’s field of vision. These efforts are com-

plementary to our focus on wide-area scaling and service author-

ship tools.

Internet services frameworksSeveral efforts have produced frameworks for simplifying the

development of scalable, robust Internet services.9–12 In general,

these projects target a lower level of the architecture than IrisNet.

They concentrate on problems that are generic across all Internet ser-

vices, such as load balancing, resource allocation, and network place-

ment. In contrast, IrisNet addresses problems unique to services that

must collect vast amounts of data and process queries on the data.

In this way, IrisNet is largely complementary to these previous efforts

and could, in fact, be implemented using these frameworks.

Distributed databasesIrisNet’s distributed database infrastructure has much in common

with various large-scale distributed databases. For example, the

Domain Name System13 relies on a distributed database that uses

a hierarchy on the basis of host name structure to support name-

to-address mapping. The Lightweight Directory Access Protocol14

addresses some of the DNS’s limitations by enabling richer stan-

dardized naming using hierarchically organized values and attributes.

However, it still supports only a relatively restrictive querying model.

Matthew Harren and his colleagues have investigated peer-to-

peer databases that provide a richer querying model than the

exact-match queries existing distributed hash table (DHT) systems

support. (They reported significant hotspots in storage, processing,

and routing with their initial design.)15 Replicated DHT-based data-

bases and replicated hierarchical databases offer qualitatively differ-

ent robustness behaviors. DHTs replicate portions of the DHT key

space. Because DHT keys are typically hashes of data, failure of all

replicas for the same portion of the DHT key space results in the dis-

appearance of data items diffusely scattered throughout the data-

base. In contrast, a hierarchical database replicates contiguous por-

tions of its hierarchy across replicas. Should all replicas fail, all data

beneath the corresponding contiguous portion of the hierarchy

becomes unavailable. DHT-based databases are less well suited to

leveraging XML and the hierarchically scoped queries typical in

sensing services. Distributed databases supporting a full query-pro-

cessing language such as SQL (Structured Query Language) are a

well-studied topic,16 with the focus on supporting distributed trans-

actions or other consistency guarantees.17–23 The previous work

doesn’t address the difficulties in distributed query processing over

Related Work

example, people in academic depart-ments in universities). When a partici-pant in the service travels, he or she willpass within the view of sensors whoseSAs might have no a priori associationwith the OAs that own the databasenode containing his or her location. Eachsuch SA sends the updated locationinformation to a nearby OA. IrisNetroutes that update to the OAs owningthe node.

Protecting senselets and SA hosts. TheSA execution environment protects thehost and senselets from buggy or mali-cious senselets. Each senselet executes asa different process. This separation pro-vides process-level protection betweensenselets. It also enables limiting theresources that a senselet consumes,although our current implementationdoesn’t enforce such limits. This lack ofresource limitations would not signifi-

cantly hinder deployments in which ser-vice developers pay service providers forthe resources consumed.

Privacy mechanisms. Video streams fromcameras in public places will often con-tain human faces, automobile licenseplates, and other data that can be used toidentify a person. This disclosure of iden-tifying information raises an importantquestion: how can IrisNet help protect


an XML document. Researchers have also done considerable work

on query processing over a federation of heterogeneous data-

bases,16 dealing with problems such as incompatible schemas.

These problems do not arise in the current IrisNet architecture

because the service author defines the schema for an entire

service.

REFERENCES

1. J. Hill et al., “System Architecture Directions for Network Sensors,” Proc.9th Int’l Conf. Architectural Support for Programming Languages andOperating Systems, ACM Press, 2000, pp. 93–104.

2. J. Kulik, W. Rabiner, and H. Balakrishnan, “Adaptive Protocols for Infor-mation Dissemination in Wireless Sensor Networks,” Proc. 5th Ann. Int’lConf. Mobile Computing and Networking, ACM Press, 1999, pp. 174–185.

3. J. Heidemann et al., “Building Efficient Wireless Sensor Networks withLow-Level Naming,” Proc. 18th ACM Symp. Operating Systems Principles,ACM Press, 2001, pp. 146–159.

4. P. Bonnet, J.E. Gehrke, and P. Seshadri, “Towards Sensor Database Sys-tems,” Proc. 2nd Int’l Conf. Mobile Data Management, LNCS, Springer-Verlag, 2001.

5. S. Madden and M.J. Franklin, “Fjording the Stream: An Architecture forQueries Over Streaming Sensor Data,” Proc. 18th Int’l Conf. Data Eng.,IEEE CS Press, 2002, pp. 555–566.

6. S. Madden et al., “TAG: A Tiny Aggregation Service for Ad Hoc SensorNetworks,” ACM SIGOPS Operating Systems Rev., vol. 36, no. SI, Winter2002, pp. 131–146.

7. S. Madden et al., “The Design of an Acquisitional Query Processor forSensor Networks,” Proc. ACM SIGMOD Int’l Conf. Management of Data,ACM Press, 2003, pp. 491–502.

8. R. Collins et al., “Algorithms for Cooperative Multisensor Surveillance,”Proc. IEEE, vol. 89, no. 10, Oct. 2001, pp. 1456–1477.

9. M. Welsh, D.E. Culler, and E.A. Brewer, “SEDA: An Architecture for Well-Conditioned, Scalable Internet Services,” Proc. 18th ACM Symp. Operat-ing Systems Principles, ACM Press, 2001, pp. 230–243.

10. J.R. von Behren et al., “Ninja: A Framework for Network Services,” Proc.

USENIX Ann. Tech. Conf., USENIX, 2002.

11. Libra: Scalable Advanced Network Services Based on Coordinated ActiveComponents, http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/cmcl/www/team.

12. I. Foster and C. Kesselman, The Grid: Blueprint for a New ComputingInfrastructure, Morgan Kaufmann, 1999.

13. P.V. Mockapetris and K.J. Dunlap, “Development of the Domain NameSystem,” Proc. Symp. Communications Architectures and Protocols, ACMPress, 1988, pp. 123–133.

14. M. Wahl, T. Howes, and S. Kille, Lightweight Directory Access Protocol (ver-sion 3), tech. report RFC 2251, Internet Engineering Task Force, 1997.

15. M. Harren et al., “Complex Queries in DHT-Based Peer-to-PeerNetworks,” Proc. 1st Int’l workshop on Peer-to-Peer Systems, LNCS,Springer-Verlag, 2002, pp. 242–250.

16. A. Silberschatz, H.F. Korth, and S. Sudarshan, Database SystemsConcepts, McGraw Hill, 2002.

17. J. Sidell et al., “Data Replication in Mariposa,” Proc. 12th Int’l Conf.Data Eng., IEEE CS Press, 1996, pp. 485–494.

18. J. Gray et al., “The Dangers of Replication and a Solution,” Proc. ACMSIGMOD Int’l Conf. Management of Data, ACM Press, 1996, pp. 173–182.

19. C. Pu and A. Leff, “Replica Control in Distributed Systems: An Asynchro-nous Approach,” Proc. ACM SIGMOD Int’l Conf. Management of Data,ACM Press, 1991, pp. 377–386.

20. D. Agrawal and S. Sengupta, “Modular Synchronization in Distributed,Multiversion Databases: Version Control and Concurrency Control,”IEEE Trans. Knowledge and Data Eng., vol. 5, no. 1, 1993, pp. 126–137.

21. N. Krishnakumar and A. Bernstein, “Bounded Ignorance in ReplicatedSystems,” Proc. Symp. Principles of Database Systems, ACM Press, 1991,pp. 63–74.

22. R. Alonso, D. Barbara, and H. Garcia-Molina, “Data Caching Issues in anInformation Retrieval System,” ACM Trans. Database Systems, vol. 15,no. 3, Sept. 1990, pp. 359–384.

23. S.W. Chen and C. Pu, A Structural Classification of Integrated Replica Con-trol Mechanisms, tech. report CUCS-006-92, Columbia Univ. 1992.

people’s privacy? Ensuring with full gen-erality that no one can use a video streamto compromise another individual’s pri-vacy is out of scope for our work on Iris-Net. Nevertheless, we believe that Iris-Net must provide a framework to helplimit senselets’ ability to misuse videostreams for unauthorized surveillance.To this end, our current SA implemen-tation uses privacy filters to identify thesensitive components (for example,human faces) in an image and replacesthese regions with black rectangles. Sys-tem administrators can load arbitrary

privacy filters to enforce policies that areappropriate for their sites. As Figure 4shows, IrisNet distinguishes betweentrusted and untrusted senselets; trustedsenselets receive raw video feeds, whereasuntrusted ones receive only the privacy-filtered video data. IrisNet uses digital sig-natures to authenticate trusted senseletsto ensure that if sensitive information isleaked, system administrators can dis-cover who is responsible.

Shared computation among senselets.The filtering that senselets do is compu-

tationally expensive. To increase the effi-ciency of executing multiple senselets onthe same SA, we exploit the fact that onesensor feed might interest multiple, dif-ferent IrisNet services. For example, avideo feed in a particular location mightmonitor parking spaces in one service andtrack passersby in the same visual field inanother. Senselets working on the samevideo stream often use many of the sameimage-processing primitives (for exam-ple, color-to-gray conversion, noise reduc-tion, edge detection, maintaining a sta-tistical background model, and so on).IrisNet provides a limited memoizationmechanism that senselets can use to shareintermediate computational results withother senselets through a shared memorypool. Intermediate results are namedusing the sequence of function calls per-formed on them, where each function isfrom a common library of image pro-cessing primitives. Each function call firstchecks whether the desired result alreadyexists in the shared memory.

Prototype IrisNet applicationsAlong with our collaborators (in the

case of the coastal imaging service) wehave built three services from differentapplication domains using IrisNet. Herewe describe their implementation anddeployment.

Parking-space finderThe parking-space finder uses cameras

throughout a metropolitan area to trackparking space availability. Users fill outa Web form to specify a destination andany constraints on a desired parkingspace (for example, does not require apermit, must be covered, and so on).Based on the input criteria, the parking-space-finder service identifies the near-est available parking space that satisfiesthe user constraints, then uses theYahoo! Maps service to find drivingdirections to that parking space from theuser’s current location.



Figure 5. (a) Mock parking lots, (b) postprocessed video stream of one parking lot, and(c) driving directions to the parking spot.

(a)

(c)

(b)

Although our current prototypeoperates on mock parking lots andminiature cars laid out on a table top,the rest of the system operates as itwould in an outdoor setting. Senseletsthat recognize parking space avail-ability run on laptops with Webcams.After the reference background imageis initially calibrated, each senseletdetects a car’s presence by compar-ing the current image of the spaceswith the corresponding backgroundimages.7 The senselets do these imageprocessing tasks using the Intel OpenSource Computer Vision library (www.intel.com/research/mrl/research/opencv) provided with the SA envi-ronment. Once a senselet determineswhich parking spaces are empty, itsends the availability information tothe appropriate OAs.

Figure 2 shows part of the databaseschema for this service. As mentionedearlier, this schema defines the database’sgeographically hierarchical organizationand the dynamic (for example, avail-ability information from the SAs) andstatic (for example, handicapped or not)descriptions for each space.

Figure 5 shows our laboratory deploy-ment of this service on IrisNet. Figure 5ashows the mock parking lots, monitoredby Webcams. Figure 5b shows the viewof one parking lot after local image pro-cessing to recognize full and empty park-ing spaces; red rectangles indicate fullspaces, while green rectangles indicateempty ones. Figure 5c shows an exampleof the driving directions the user receives.These directions are continually updatedas the user drives toward the destination,if the parking spot availability changes,or if a closer parking spot satisfying theuser’s constraints becomes available.

Network and host monitor (IrisLog)The IrisLog service collects data from

host and network monitoring tools(which act as sensing devices) running

on a widely dispersed set of hosts andlets users query those data efficiently.IrisLog is deployed on PlanetLab,8 anopen, shared, planetary-scale applicationtestbed comprising hundreds of nodesdistributed across five continents (seeFigure 6). Using a Web-based form, userscan query sets of PlanetLab nodes (forexample, all the nodes at Carnegie Mel-lon University) by particular metrics (forexample, CPU load) and time periodsover which the data have been collected(for example, the last hour). The Web

user interface also lets users issue arbi-trary XPATH queries over the distributedXML database of all IrisLog data.

IrisLog supports a superset of thequeries from the Ganglia PlanetLab mon-itoring service,9 but incurs far less networkoverhead. Each PlanetLab node runs anSA, which uses the local Ganglia daemonoutput to create a sensor feed describing30 different performance metrics. Thesenselet for IrisLog transmits these met-rics to the matching OA in the schema.

The IrisLog schema describes the


CMUcluster

Cambridgecluster

CMUCambridge

USA East Europe

IrisLog root

Query:least loadednode at CMU

SA

Ganglia

Client

Response

OA

OA OA

OA

OA

OA

OA

OA

OAs arereplicated

OAOA

OA

OA

Figure 7. A portion of the IrisLog hierarchy.

Figure 6. Deployed PlanetLab nodes as of October 2003.

metrics each PlanetLab node must mon-itor (for example, CPU and memoryload, bandwidth usage, and so on) andorganizes them into a geographical hier-archy. This hierarchy (see Figure 7)allows efficient processing of geo-graphically scoped queries (for exam-ple, find the least loaded CMU node).To support simple historical queriesover the monitored data, the schemauses multiresolution vectors to storeeach monitored metric. These vectorsprovide samples of past data where thesampling rate is higher for recent datathan for older data.

Coastal imaging serviceIn collaboration with oceanogra-

phers of the Argus project at OregonState University, we have deployed acoastal imaging service on IrisNet (seethe Applications department on page14).10 The service uses cameras in-stalled at sites along the Oregon coast-line and processes the live feed from thecameras to identify the visible signa-tures of nearshore phenomena such asriptides, sandbar formations, and soon. For example, Figure 8 shows howwe can use the time-averaged expo-

sures of camera images shown in fig-ure 8b, generated by merging the rawframes in Figure 8a, to identify sand-bar formations. The front-end of thisservice lets users query this historicalinformation distributed across multi-ple sites. Users can change collectionand filtering parameters remotely andinstall triggers to change the data ac-quisition rate after certain events (forexample, darkness).

The senselet for this service producesimage data such as 10-minute intervalsnapshots, 10-minute time-averagedexposure images that show wave dissi-pation patterns (indicating submergedsandbars), variance images, and pho-togrammetric metadata.

The schema defines a shallow hierar-chy comprising a root node with eachcoastal camera location connected as leafnodes. Each leaf node stores only themost recent snapshot and 10-minutetime exposure, while the root stores anarchive of all past snapshots from all thecoastal cameras. These nodes aremapped onto physical OAs such that theroot is located at a computer at OregonState and the leaf OAs are located at thecorresponding coastal locations.

Sensor network research has largelyfocused on localized deploymentsof low-power wireless sensors col-lecting numerical measurements.

However, as our examples show, manysensing services require a wide-area net-work of powerful sensors such as videocameras. IrisNet is a general-purposesoftware infrastructure that supports thecentral tasks common to these services:collecting, filtering, and combining sen-sor feeds, and performing distributedqueries within reasonable response times.

We are far from the vision of a highlyavailable, high-performance, easy-to-useworldwide sensor web. While IrisNet rep-resents an important first step toward thisvision, its design focuses on the technicalchallenges we’ve described. Importantpolicy, privacy, and security concerns mustbe addressed before rich sensors can existpervasively at a global scale. Moreover,sensors must be deployed and maintained.Currently, companies such as Honeywelland ADT provide sensor-based homesecurity systems. We envision that in time,such private-sector companies, in combi-nation with public-sector entities, non-profits, and individuals, might provide theneeded sensor infrastructure.



Figure 8. Coastal images in the form of (a) raw video frames and (b) 10-minute time-averaged exposure.

(a) (b)

Sand bars

ACKNOWLEDGMENTSWe thank Amol Deshpande, Rahul Sukthankar, andShimin Chen for their important contributions to theIrisNet system. We thank Mahadev Satyanarayananfor many valuable suggestions and Casey Helfrichfor helping with the experimental setup. Finally, wethank Mark Abbott, Rob Holman, Chuck Sears,Ganesh Gopalan, and Curt Vandetta for their collab-oration on the coastal-imaging service.

REFERENCES1. J. Hill et al., “System Architecture Direc-

tions for Network Sensors,” Proc. 9th Int’lConf. Architectural Support for Program-ming Languages and Operating Systems,ACM Press, 2000, pp. 93–104.

2. J. Kulik, W. Rabiner, and H. Balakrishnan,“Adaptive Protocols for Information Dis-semination in Wireless Sensor Networks,”Proc. 5th Ann. Int’l Conf. Mobile Com-puting and Networking, ACM Press, 1999,pp. 174–185.

3. S. Madden et al., “TAG: A Tiny Aggrega-tion Service for Ad Hoc Sensor Networks,”ACM SIGOPS Operating Systems Rev.,vol. 36, no. SI, Winter 2002, pp. 131–146.

4. Open GIS Consortium, Sensor Web Enable-ment and OpenGIS SensorWeb, www.opengis.org/functional/?page=swe.

5. P.V. Mockapetris and K.J. Dunlap, “Devel-opment of the Domain Name System,” Proc.Symp. Communications Architectures andProtocols, ACM Press, 1988, pp. 123–133.

6. A. Deshpande et al., “Cache-and-Query forWide Area Sensor Databases,” Proc. ACMSIGMOD Int’l Conf. Management of Data,ACM Press, 2003, pp. 503–514.

7. L.G. Shapiro and G.C. Stockman, Com-puter Vision, Prentice Hall, 2001.

8. L. Peterson et al., “A Blueprint for Intro-ducing Disruptive Technology into theInternet,” Proc. 1st Workshop Hot Topicsin Networks (Hotnets-I), ACM Press, 2002.

9. M.L. Massie, B.N. Chun, and D.E. Culler,“The Ganglia Distributed Monitoring Sys-tem: Design, Implementation, and Experi-ence,” submitted for publication, Feb.2003; http://ganglia.sourceforge.net.

10. R. Holman, J. Stanley, and T. Ozkan-Haller,“Applying Video Sensor Networks toNearshore Environment Monitoring,”IEEE Pervasive Computing, vol. 2, no. 4,Oct.–Dec. 2003, pp. 14–21.


the AUTHORS

Phillip B. Gibbons is a principal research scientist at Intel Research Pittsburgh andan adjunct professor in the computer science departments at Carnegie Mellon Uni-versity and the University of Pittsburgh. His research interests include wide-areasensing, massive data sets, data streams, query processing and optimization, high-performance databases, approximate query answering, models of parallel computa-tion, verification and testing, multiprocessor cache protocols, multiprocessor sched-uling, and distributed computing. He received his PhD in computer science fromthe University of California at Berkeley. He is on the editorial board for the Journal of

the ACM, and he is Conference Chair for the ACM Symposium on Parallelism in Algorithms and Archi-tectures. He is a member of the ACM, the IEEE, and SIAM. Contact him at [email protected].

Brad Karp is a staff researcher at Intel Research Pittsburgh and an adjunct assistantprofessor of computer science at Carnegie Mellon University. His research interestsinclude algorithms and systems for routing, congestion control, and distributedstorage in sensor networks, multihop wireless networks, and the Internet. He re-ceived his PhD in computer science from Harvard University. He is a member of the ACM. Contact him at [email protected].

Yan Ke is a graduate student at Carnegie Mellon University. His current researchinterests are computer vision and computer systems. He received his MS fromCarnegie Mellon University. Contact him at [email protected].

Suman Nath is a PhD candidate in the Computer Science Department at CarnegieMellon University. His research interests include distributed systems, sensor networks,database systems, and networking. He received his MS in computer science fromCarnegie Mellon University. He is a member of the ACM. Contact him at [email protected].

Srinivasan Seshan is an associate professor in Carnegie Mellon University’s Com-puter Science Department. His primary interests are in the broad areas of networkprotocols and distributed network applications. His current work explores newapproaches in overlay networking, sensor networking, online multiplayer gamesand wide-area Internet routing. He received his PhD from the Computer ScienceDepartment at the University of California at Berkeley. Contact him at [email protected]; www.cs.cmu.edu/~srini.

on all conferences sponsored by the

IEEE Computer Society.

Not a member?Join online today!

computer.org/join/

IEEEComputerSociety

memberssave 25%

IEEEComputerSociety

memberssave 25%

irisnet: an architecture for a worldwide sensor web · an architecture for a worldwide sensor web w...

Documents