JOINT ROUTING AND COMPRESSION IN SENSOR NETWORKS: FROM
THEORY TO PRACTICE
by
Sundeep Pattem
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
August 2010
Copyright 2010 Sundeep Pattem
Dedication
To Sameera
Acknowledgements
My work at USC owes a great deal to collaborations with and help from several faculty
and colleagues: Prof. Bhaskar Krishnamachari, Prof. Antonio Ortega, Prof. Ramesh
Govindan, Prof. Gaurav Sukhatme, Prof. Kristina Lerman (USC/ISI), Sameera Poduri,
Avinash Sridharan, Ying Chen, Alexandre Ciancio, Godwin Shen, Sungwon Lee, Matt
Klimesh (JPL), Maheswaran Sathiamoorthy, Aaron Tu, Aram Galstyan (USC/ISI).
It has been a privilege to be associated with Prof. Bhaskar Krishnamachari for all
these years. It is no exaggeration to say that I would not be writing a thesis but for
Bhaskar’s help: the extraordinary patience and kindness, the ability to empathize, enthuse
and inspire, and the passion for helping students realize their potential, not just in
research, but as well-rounded people. I hope my life will reflect what I imbibed from his
emphasis on values and service.
My roommates and buddies, Apoorva Jindal, Rahul Urgaonkar, Avinash Sridharan,
Sonal Jain, Ankit Singhal, made time fly. The wise old(er) folks, Narayanan Sadagopan,
Venkata Pingali, Amit Ramesh, Srikanth Saripalli, Krishnakant Chintalapudi, Karthik
Dantu, made life easy. ANRG members - Marco Zuniga, Ying Chen, Shyam Kapadia,
Divya Devaguptapu, Joon Ahn, Hua Liu, Pai-Han Huang, Kiran Yedavalli, Jung-Hyun
Jun, made working in the lab a pleasure. I will miss Shane Goodoff and his edgy sense
of humor.
My mother, Sri Shobha Rani, is my first teacher. I owe all my learning to her. My
father, Sri Rajeswara Rao, toiled hard so we could dream. I hope he thinks his harvest has
been good. My grandmother, Sri Kalavathi, introduced me to the power, and pleasures,
of the imagination. My sister, Santoshi, is my biggest (and only) fan. We’ll always have
each other. My friend, Rakesh, helped open my eyes to the vast and the blissful. My wife,
Sameera, has shared in all the delight and the despair. I hope she thinks it worthwhile.
For me, she has been a blessing.
Table of Contents
Dedication

Acknowledgements

List of Figures

Abstract

Chapter 1: Introduction
  1.1 Data gathering sensor networks
  1.2 Thesis and Research Summary
    1.2.1 Impact of spatial correlations on optimal routing
    1.2.2 Algorithms for achieving distributed compression
    1.2.3 Architecture and system implementation for distributed compression

Chapter 2: Background on in-network compression
  2.1 Distributed aggregation
  2.2 Distributed compression
    2.2.1 Distributed source coding
    2.2.2 Analysis of impact of correlations on routing
    2.2.3 Spatial transforms
    2.2.4 Compressed sensing
  2.3 Distributed configuration
  2.4 System implementation

Chapter 3: Modeling of joint routing and compression
  3.1 Assumptions and Methodology
    3.1.1 Note on Heuristic Approximation
  3.2 Routing Schemes
    3.2.1 Comparison of the schemes
  3.3 A Generalized Clustering Scheme
    3.3.1 Description of the scheme
      3.3.1.1 Metrics for evaluation of the scheme
    3.3.2 1-D Analysis
      3.3.2.1 Sequential compression along SPT to cluster head
      3.3.2.2 Compression at cluster head only
    3.3.3 2-D analysis
      3.3.3.1 Opportunistic compression along SPT to cluster head
      3.3.3.2 Compression at cluster head only
  3.4 Simulation Results
    3.4.1 Communication and Topology models
      3.4.1.1 Random geometric graphs
      3.4.1.2 Realistic Wireless Communication model
    3.4.2 Joint entropy models
      3.4.2.1 Linear and convex functions of distance
      3.4.2.2 Continuous, Gaussian data model
    3.4.3 Summary of results
  3.5 Summary and Conclusions

Chapter 4: Practical schemes for distributed compression
  4.1 Wavelet transform design for wireless broadcast advantage
    4.1.1 Wavelet basics: The 5/3 lifting transform
    4.1.2 Wavelets for sensor networks
      4.1.2.1 Unidirectional 1D wavelet
      4.1.2.2 2D wavelet for tree topologies
    4.1.3 2D wavelet for wireless broadcast scenario
      4.1.3.1 Augmented neighborhoods
      4.1.3.2 New transform definition
      4.1.3.3 Performance of new transform
  4.2 Compressed sensing for multi-hop network setting
    4.2.1 Combining routing with known results in compressed sensing

Chapter 5: SenZip: Distributed compression as a service
  5.1 SenZip architecture
    5.1.1 SenZip Specification
      5.1.1.1 Compression Service
      5.1.1.2 Networking components
    5.1.2 Discussion
  5.2 Mapping algorithms to architecture
    5.2.1 Algorithm details
      5.2.1.1 DPCM
      5.2.1.2 2D wavelet
    5.2.2 Relating algorithms to SenZip
      5.2.2.1 Initialization
      5.2.2.2 Data forwarding and compression
      5.2.2.3 Reconfiguration
  5.3 System implementation
    5.3.1 TinyOS code
      5.3.1.1 Interfaces
      5.3.1.2 AggregationP component
      5.3.1.3 CompressionP component
      5.3.1.4 Changes to CTP
      5.3.1.5 Application
    5.3.2 Experimental Results
      5.3.2.1 Static topologies
      5.3.2.2 Dynamic topologies

Chapter 6: Conclusion
  6.1 Contributions
  6.2 Future work

References
List Of Figures
1.1 (a) Illustration of a distributed phenomenon and data gathering using a sensor network. (b) Hardware: Telosb mote device.
1.2 (a) Software abstraction from the application developer perspective. (b) Possible fit for compression.
1.3 Software abstraction for compression as a service.
3.1 Empirical data (from the rainfall data-set) and approximation for joint entropy of linearly placed sources separated by different distances.
3.2 Illustration of routing for the three schemes: DSC, CDR, and RDC. H_i is the joint entropy of i sources.
3.3 Comparison of energy expenditures for the RDC, CDR, and DSC schemes with respect to the degree of correlation c.
3.4 Illustration of clustering for a two-dimensional field of sensors.
3.5 Comparison of the performance of different cluster sizes for a linear array of sources (n = D = 105) with compression performed sequentially along the path to cluster heads. The optimal cluster size is a function of the correlation parameter c. Also, cluster size s = 15 performs close to optimal over the range of c.
3.6 Illustration of the existence of a static cluster for near-optimal performance across a range of correlations. The sources are in a linear array and data is sequentially compressed along the path to cluster heads.
3.7 Performance with compression only at the cluster head with nodes in a linear array (n = D = 105). Cluster sizes s = 5, 7 are close to optimal over the range of c.
3.8 Illustration of the near-optimal cluster size with compression only at the cluster head with nodes in a linear array. The performance of cluster sizes near s = 7 (≈ √(105/2)) is close to optimal over the range of c values.
3.9 Routing in a 2-D grid arrangement. (a) Calculation of joint entropy. Under the iterative approximation, the joint entropy of k nodes forming a contiguous set is the same as the joint entropy of k sensors lying on a straight line; this is illustrated along the diagonal. It also illustrates opportunistic compression along the SPT to the cluster head. (b) Intra-cluster, shortest-path routing from source to cluster head with compression only at the cluster head. Routing from cluster heads to the sink is similar.
3.10 Comparison of the performance of various cluster sizes for a network with 10^6 nodes on a 1000x1000 grid when compression is possible only at cluster heads. The performance for s = 5, 10 is observed to be close to optimal over the range of c values.
3.11 Illustration of the existence of a near-optimal cluster size. The network size is n × n = 1000 × 1000 and compression is possible only at cluster heads. The performance of cluster side values near s = 0.6487 n^(1/3) is quite close to optimal for all values of c ranging from 0.0001 to 10000.
3.12 Random geometric graph topology. Performance of clustering with density = 1 node/m^2 and communication radius = 3 m for networks of size (a) 24x24 (b) 84x84 (c) 200x200. Near-optimal cluster sizes are (a) 3, 4 (b) 4, 7 (c) 8, 10.
3.13 Realistic wireless communication topology. Performance of clustering in a 48m x 48m network with density = 0.25 nodes/m^2 for power level (a) -3 dB (b) -7 dB (c) -10 dB. Cluster sizes 6, 8 are near-optimal.
3.14 (a) Example forms of joint entropy functions for 2 sources. The entropy of each source is normalized to 1 unit. The convex and linear curves are clipped when the joint entropy equals the sum of individual entropies. The curves shown are for correlation parameter c = 2. Performance of clustering in a 72m × 72m network with density = 0.25 nodes/m^2 for the (b) linear model and (c) convex model of joint entropy. Cluster size 6 is near-optimal.
3.15 Performance of clustering in a 48m × 48m network with density = 0.25 nodes/m^2 with a continuous, jointly Gaussian data model and quantization step (a) δ = 1 (b) δ = 0.5 (c) δ = 0.05. Cluster sizes 6, 8 are near-optimal.
4.1 Example (a) signal and (b) 5/3 wavelet coefficients.
4.2 Illustration of odd (green) and even (blue) nodes in a subtree for the 2D wavelet (a) with unicast and (b) exploiting the broadcast nature of wireless communications. Solid arrows are part of the tree routing paths; dashed arrows are wireless links not part of the tree. Arrows crossed off in red denote interactions disallowed for transform invertibility and unidirectionality.
4.3 (a) Sample tree topology. (b) With additional broadcast links in the augmented neighborhoods at each node. (c) Performance gain in terms of SNR vs. cost for the new transform compared to the 2D wavelet for tree topologies.
4.4 Compressed sensing performance in a multi-hop setting. Plot of SNR vs. cost for different schemes. The black and green curves are for Sparse Random Projections (SRP); the blue and red curves are for two variations of computing projections over shortest-path routing.
5.1 The SenZip architecture. A completely distributed compression service is enabled by having the interacting components shown here at each network node.
5.2 Aggregation table example. The recursive entry structure allows the same definition for different compression schemes.
5.3 Partial computations for the 2D wavelet. Gray (white) circles denote even (odd) nodes. Operations at each node are done in the order listed.
5.4 Code structure of (a) CTP and (b) the SenZip compression service over CTP.
5.5 (a) Distributed compression and (b) centralized reconstruction.
5.6 Experiments on static trees with the 2D wavelet transform and fixed quantization. (a) Two fixed tree topologies, tree 1 and tree 2, for the same set and locations of nodes. Raw measurement (dashed red) and reconstruction (solid blue) for node 7 with 2 bits allocated per sample for (b) tree 1 and (c) tree 2, and for node 12 with 3 bits per sample for (d) tree 1 and (e) tree 2. Histogram of RMS error at all nodes with 3 bits per sample for (f) tree 1 and (g) tree 2.
5.7 (a) Average RMS error for tree 1 with increasing bit allocation per sample for DPCM and the 2D wavelet. (b) Normalized cost w.r.t. raw data gathering with CTP for increasing bit allocation per sample.
Abstract
In-network compression is essential for extending the lifetime of data gathering sensor
networks. To be truly effective, not only the computations but also the configuration
required for such compression must be achieved in a distributed manner. The thesis of
this dissertation is that it is possible to demonstrate completely distributed in-network
compression in sensor networks. Establishing this thesis requires studying several aspects
of joint routing and compression.
First, our analysis of the impact of spatial correlations on optimal routing shows that
there exist correlation-independent schemes for routing that are near-optimal. This im-
plies that static routing schemes may perform as well as sophisticated ones based on
learning correlations. Next, we develop novel and practical algorithms for distributed
compression that take into account the routing structure over which data is transported.
Finally, we argue that the lack of work on (a) distributed configuration for compression
operations and (b) reusable software development is the primary reason why compression has
not been widely adopted in sensor network deployments. Our solution to address this gap
is SenZip, an architectural view of compression as a service that interacts with standard
networking services. A system implementation based on SenZip and results from experi-
ments concretely demonstrate distributed and self-organizing in-network compression.
Chapter 1
Introduction
The sensor networks vision arose in the late 1990s, with the emergence of a new class
of devices that allow fine-grained sensing of the physical world. Technological advances
made it possible to integrate computation, communication, sensing, and even actuation,
on the same platform, while keeping the form factor small and cost low. Potentially large
numbers of these devices could be distributed in space to form a collaborating wireless
network capable of achieving complex global tasks. It was apparent that such networks
will enable several new applications of benefit to society. The last ten years have seen a
great amount of research activity in this area in both academia and industry.
1.1 Data gathering sensor networks
Sensor networks are aiding the evolution of monitoring systems for earth and space science
applications [GBR, TMEC+10, WALJ+06]. Frequently, these systems require continuous
data gathering from a distributed field to a central base station. A typical scenario is
illustrated in Figure 1.1(a). The image shows the temperature of the ocean surface off
the Los Angeles coast. A sensor network has been deployed to collect the temperature
Figure 1.1: (a) Illustration of a distributed phenomenon and data gathering using a sensor network. (b) Hardware: Telosb mote device.
Figure 1.2: (a) Software abstraction from the application developer perspective. (b) Possible fit for compression.
measurements and transport them to a base station on the coast. The hardware de-
ployed could be, for instance, Telosb motes shown in Figure 1.1(b). In software, the data
gathering application interfaces with sensors to receive sensor measurements and sends
them to a networking “black box” that will perform operations necessary to transport
the measurements to the sink. A crude abstraction for the software at each node from
an application developer perspective is shown in Figure 1.2(a).
The phenomenon being sensed evolves in space and time. For most naturally
occurring phenomena, the signal can be expected to be correlated in both of these
Figure 1.3: Software abstraction for compression as a service
dimensions. In-network compression or multi-node fusion is considered a necessity due
to the energy constraints of sensor nodes. Since the energy cost is directly related to the
number of bits transmitted, it would be more efficient to exploit the correlations in the
data to compress it inside the network. Where should this compression be performed?
Perhaps as part of the application, as is the case in the Internet? The abstraction would
then look something like Figure 1.2(b). However, in this situation, the “spatial image” is
not available at any single node. The compression needs to be performed as the data is
routed to the sink.
1.2 Thesis and Research Summary
From an application developer perspective, compression needs to be provided as a ser-
vice. Given such a service, as shown in Figure 1.3, the application now sends the sensor
measurements to a “compression plus networking black box”. In addition to the regular
networking functions, this “black box” will be capable of achieving both the computations
and configuration required for in-network compression in a distributed manner. Is it pos-
sible to define such a service? What is inside the box? Our thesis is that it is possible
to practically demonstrate completely distributed in-network compression in
sensor networks. Establishing this thesis to arrive at our goal of “compression as a
service” requires studying several aspects of joint routing and compression: What is
the impact of spatial correlations on optimal routing? What algorithms can be used for
distributed en-route compression? What issues need to be addressed in going from the
theory to a widely adopted system design for distributed compression?
1.2.1 Impact of spatial correlations on optimal routing
In considering the impact of spatial correlations on routing, since energy-efficiency is the
prime motivation for compression of correlated data, it makes sense to route along paths
which allow for more compression. However, the increased routing costs for deviating
too much from the original shortest paths might overwhelm the gains from compression.
We build models and perform analysis to explore this tension. Clustering is a natural
way of trading off progress towards the sink and opportunities for compression close
to data sources. The optimal cluster size can be expected to depend on the degree
or level of correlation in the data. Our analysis confirms this but also throws up two
surprising results. First, when every node is capable of compression computations, the
optimal cluster consists of the whole network, i.e., shortest path routing is optimal. Second,
when compression operations are performed only at cluster heads, there exists a near-
optimal cluster size that works well over a range of correlation levels. The implication
for correlated data gathering is that simple, non-adaptive routing schemes can perform
as well as sophisticated, adaptive ones.
1.2.2 Algorithms for achieving distributed compression
In the second part, we focus on the design of distributed compression algorithms. We
consider two different views of structure in data: one based on wavelet transforms and
the other on compressed sensing. Shen and Ortega [SO08a] have developed lifting-based
wavelet transforms that can operate over any 2-D tree routing topology. Their algorithms
assume unicast communications between nodes in the network. We extend their work
by designing a new transform to take advantage of the broadcast nature of wireless
communication [SPO09]. This transform allows for better compression of data and hence
energy efficiency. The second approach is to extend recent results in compressed sensing
to the multi-hop routing scenario. Our work is the first to consider this problem.
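Since the transforms in this part build on lifting, a one-dimensional sketch helps fix ideas. The following is a generic, float-valued 5/3-style lifting step with clamped boundaries, given for illustration only, not the tree-based transform developed in Chapter 4: odd samples are predicted from their even neighbours to give high-pass details, even samples are updated with the details to give a low-pass signal, and reversing the two steps recovers the input exactly.

```python
def lifting_53_forward(x):
    # Split into even/odd samples, then lift. Boundary indices are clamped.
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample minus the average of its even neighbours
    # becomes a high-pass detail coefficient.
    d = [odd[i] - (even[i] + even[min(i + 1, len(even) - 1)]) / 2
         for i in range(len(odd))]
    # Update: each even sample plus a quarter of the adjacent details
    # becomes a low-pass (smooth) coefficient.
    s = [even[i] + (d[max(i - 1, 0)] + d[min(i, len(d) - 1)]) / 4
         for i in range(len(even))]
    return s, d

def lifting_53_inverse(s, d):
    # Undo the update, then the predict, using the same clamped expressions.
    even = [s[i] - (d[max(i - 1, 0)] + d[min(i, len(d) - 1)]) / 4
            for i in range(len(s))]
    odd = [d[i] + (even[i] + even[min(i + 1, len(even) - 1)]) / 2
           for i in range(len(d))]
    x = [0.0] * (len(even) + len(odd))
    x[0::2], x[1::2] = even, odd
    return x
```

On smooth data the details are near zero and compress well; a constant signal yields all-zero details, which is the property the spatial transforms in Chapter 4 exploit.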
1.2.3 Architecture and system implementation for distributed compression
In this part, we focus on software development and system implementation issues for dis-
tributed compression. Earlier work has mostly dealt with specific schemes and optimiza-
tions and has not led to reusable software development, which is the crucial step in wide
adoption in deployments. Another important issue that earlier work has not addressed
is that of distributed configuration and re-configuration. Along with the computations,
which have to be performed in a distributed fashion at the nodes, the configuration of
compression operations must also happen in a distributed manner: which “roles” nodes
play in the transform, which other nodes they receive data from and perform computations
over, the topology-specific parameter settings in the transform, and so on, along with
re-configuration in the face of network dynamics. Finally, to avoid a re-design of the stack,
the compression should be able to work over standard networking (esp. routing) compo-
nents. Our solution incorporates these issues to propose SenZip, an architectural view of
compression as a service that works over standard networking components. To establish
that SenZip is a working architecture, we have implemented a nesC/TinyOS system that
provides a compression service based on the SenZip architecture that works on top of
the Collection Tree Protocol [tos]. The resulting system demonstrated distributed con-
figuration and computations and good reconstruction for compression with two different
schemes, DPCM and the 2D wavelet, over both static and dynamic routing topologies. This
system adapts to changes in the network topology using the tools provided by CTP. When
the topology changes, the local aggregation tree is re-configured in a distributed manner
and both compression and reconstruction continue smoothly. The software modules are
available for download on tinyos-contribs.
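To make the simpler of the two schemes concrete, a lossless DPCM-style pass over an aggregation tree can be sketched as follows. The node ids and parent map are illustrative, and the actual implementation quantizes the differentials and runs as nesC/TinyOS components; here each non-root node ships the difference from its parent's raw measurement, and the sink reconstructs by resolving each node after its parent.

```python
def dpcm_encode(parent, values, root):
    # `parent` maps each non-root node to its parent in the aggregation tree.
    # Each non-root node forwards the difference from its parent's raw value;
    # the root forwards its measurement unchanged.
    coded = {root: values[root]}
    for node, par in parent.items():
        coded[node] = values[node] - values[par]
    return coded

def dpcm_decode(parent, coded, root):
    # Sink-side reconstruction: resolve each node after its parent.
    values = {root: coded[root]}
    def resolve(node):
        if node not in values:
            resolve(parent[node])
            values[node] = coded[node] + values[parent[node]]
    for node in parent:
        resolve(node)
    return values
```

For spatially correlated readings the differentials are small and need few bits, which is where the compression gain comes from.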
The rest of the dissertation is organized as follows. Chapter 2 provides background
on in-network compression for sensor networks by discussing related literature. Chapter
3 presents the modeling and analysis of the impact of spatial correlations on routing.
Chapter 4 presents new schemes for distributed compression. Chapter 5 presents the
SenZip architecture and details of the system implementation based on it. Chapter 6
concludes the thesis with a summary of contributions and future work.
Chapter 2
Background on in-network compression
In-network compression or multi-node fusion is essential for data gathering sensor net-
works due to the energy constraints of the nodes. We discuss several approaches
that have been proposed to exploit correlations for efficient and long-lived operation.
There is some limited work on in-network compression in wired networks. A set of stan-
dards has been developed for header compression inside the network [rfc90, rfc99, rfc01].
Obviously, in this case, the payload of the packets is not altered. Work on active networks
looked at performing operations on data inside the network to trade off computation and
communication [BK01, TW96]. However, the vision of an ActiveNet that will succeed
and replace the Internet did not materialize. The Internet is based on the end-to-end
paradigm with only end hosts performing operations on the data.
In this chapter, we begin by looking at schemes for distributed aggregation and com-
pression for sensor networks. These works primarily focus on achieving the computations
required for compression in a distributed manner. We then discuss the problem of dis-
tributed configuration and reconfiguration required at the nodes to perform compression
operations. Finally, we describe work on system design and software development for
energy-efficient data gathering.
2.1 Distributed aggregation
Aggregation schemes aim to avoid redundancy at the packet level. Some examples are
duplicate suppression and computing statistics such as the minimum, maximum, average,
and count over the measurements of distributed sensors.
Krishnamachari et al. [KEW02] presented models and performance analysis for simple
aggregation (duplicate suppression, min, max) and illustrated the gains when compared
to end-to-end routing. They also studied the effects of network topology and the nature
of optimal routing for such aggregation. Optimal aggregation is shown to correspond to
routing over a minimum Steiner tree, which is NP-hard to compute, and some sub-optimal
structures are then considered. Intanagonwiwat et al. [IEGH02] observed that greedy aggregation based on
directed diffusion [IGE+03] can do better than opportunistic aggregation in high density
scenarios. Madden et al. [MFHH] argued that aggregation should be provided as a core
service for sensor network applications. They proposed the TAG (Tiny AGgregation)
service for answering declarative queries over a routing tree.
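The partial state records behind TAG-style aggregation can be sketched in a few lines. The field names here are illustrative, not TAG's actual interface: each node merges the records of its children with a record for its own reading and forwards a single fixed-size record, from which min, max, count, and average all fall out at the sink.

```python
def make_record(reading):
    # Partial state record covering a single node's measurement.
    return {"min": reading, "max": reading, "sum": reading, "count": 1}

def merge(a, b):
    # Combine two partial records; the result summarizes both subtrees,
    # so a node forwards one record regardless of subtree size.
    return {"min": min(a["min"], b["min"]),
            "max": max(a["max"], b["max"]),
            "sum": a["sum"] + b["sum"],
            "count": a["count"] + b["count"]}

def finalize(rec):
    # Derived aggregates computed once at the sink.
    return {"min": rec["min"], "max": rec["max"],
            "avg": rec["sum"] / rec["count"]}
```

Because `merge` is associative and commutative, records can be combined in any order up the routing tree, which is what makes opportunistic in-network aggregation possible.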
2.2 Distributed compression
In this section, we discuss literature on distributed compression.
2.2.1 Distributed source coding
These works involve constructive approximations to distributed Slepian-Wolf coding, in
several cases with little or no interaction between the encoders. Typically, these approaches
require knowledge of the global correlation structure at the sink or at all nodes. Multi-hop routing is
not considered. The techniques proposed by Pradhan et al. [PR99] suggest mechanisms
to compress the content at the original sources in a distributed manner without explicit
routing-based aggregation. The sink has complete knowledge of the correlation structure,
which it uses to arrive at the optimal coding rates at each node and then disseminates
the same to them. No inter-sensor communication is required for compression purposes.
Gastpar et al. [GDV06] present the distributed K-L transform that has applications for
distributed compression problems. The authors consider the optimal local operations at
distributed agents, such as sensor nodes, to provide a locally compressed version of the
data to a central base station which will then reconstruct the whole field with minimum
error. In general, the solution needs knowledge of global correlation structure and is
shown to be globally convergent for the Gauss-Markov data case.
In DSC techniques, the correlation between data captured by different nodes has
to be known, which in practice will require data exchange between nodes. Practical
application and deployment of such techniques for sensor network data gathering has not
been attempted.
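The rate-allocation step in such schemes can be illustrated with a small sketch. The decoding order and the distance-based correlation model below are assumptions for illustration: the sink fixes an order and assigns each node the conditional entropy of its measurement given all previously decoded nodes (a corner point of the Slepian-Wolf rate region), so the rates sum to the joint entropy and the sensors never exchange data among themselves.

```python
def corner_point_rates(order, cond_entropy):
    # Slepian-Wolf corner point: node i encodes at H(X_i | X_1, ..., X_{i-1})
    # for a decoding order fixed by the sink.
    rates, decoded = {}, []
    for node in order:
        rates[node] = cond_entropy(node, tuple(decoded))
        decoded.append(node)
    return rates

def distance_model(positions, c):
    # Toy conditional-entropy model: 1 bit for the first source, then
    # d / (d + c) where d is the distance to the nearest decoded source.
    def h(node, decoded):
        if not decoded:
            return 1.0
        d = min(abs(positions[node] - positions[v]) for v in decoded)
        return d / (d + c)
    return h
```

The sink disseminates the resulting per-node rates once; the hard part in practice, as noted above, is that the correlation model itself must come from somewhere.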
2.2.2 Analysis of impact of correlations on routing
Work by Scaglione and Servetto [SS02] was the first to explicitly consider the problem of
joint routing and compression. By considering the joint entropy of sources as the data
metric and routing for compression within localized partitions (or clusters), it is shown
that the network broadcast problem in multi-hop networks is feasible.
Work by Enachescu et al. [EGGM04] presents a randomized algorithm which is a
constant factor approximation (in expectation) to the optimum aggregation tree simulta-
neously for all correlation parameters. A notion of correlation is introduced in which the
information gathered by a sensor is proportional to the area it covers and the aggregate
information generated by a set of sensors is the total area they cover. The performance of
aggregation under an arbitrary, general model is considered by Goel and Estrin [GE03].
In this thesis, we analyze the relative performance of various routing and compression
schemes based on using an empirically motivated model for the joint entropy as a func-
tion of inter-source distances [PKG04, PKG08]. The optimal routing structure is then
analyzed using this approximation. The analysis demonstrates that the optimal routing
structure depends on where the actual data compression is performed; at each individual
node or at “micro-servers” acting as intermediate data collection points. In both cases,
we show that there exist efficient correlation independent routing schemes.
The correlated data gathering problem and the need for jointly optimizing the coding
rate at nodes and routing structure is also considered in [CBLV04]. The authors provide
analysis of two strategies: the Slepian-Wolf or DSC model, for which the optimal coding
is complex (needs global knowledge of correlations) and optimal routing is simple (always
along a shortest path tree) and a joint entropy coding model with explicit communication
for which coding is simple and optimizing routing structure is difficult. For the Slepian-
Wolf model, a closed form solution is derived while for the explicit communication case
it is shown that the optimization problem is NP-complete and approximation algorithms
are presented.
In [vRW04], “self-coding” and “foreign-coding” are differentiated. In self-coding,
a node uses data from other nodes to compress its own data, while in foreign-coding
a node can also compress data from other nodes. With foreign-coding, the authors
show that energy-optimal data gathering involves building a directed minimum span-
ning tree (DMST). For self-coding, it is shown in [CBLV04] that the optimal solution is
NP-complete.
Zhu et al. [ZSS05] have shown that under many network scenarios, a shortest path
tree has performance that is comparable to an optimal correlation aware routing struc-
ture. [GE03] takes a more general view of aggregation functions, rather than treating
aggregation as compression of spatially correlated sources, and the results in [ZSS05]
are contingent on a limited data compression model in which the compression gain is
independent of the number of neighbors and of the distances between nodes. Nevertheless,
our finding that there exists a near-optimal clustering scheme that performs well for a
wide range of correlations is in keeping with the results presented in these works.
2.2.3 Spatial transforms
The design of spatial transforms involves separating the spatially distributed signal into
low and high pass portions. One particular class of methods sends only trend data or data
models within a given cluster. In Ken [CDHH06] and PAQ [TM06], nodes are separated
into clusters and assigned roles as cluster head or non-cluster head. Then, nodes forward
data to cluster heads on some aggregation graph, model parameters for data in each
cluster are estimated at cluster heads and only model parameters are forwarded to the
sink along a routing tree. Note that an ordering of communications is implicit in this
process. Another simple form of distributed data compression is differential encoding.
For example, in DOSA [ZCH07], nodes are assigned roles as either correlating (C) or
non-correlating (NC) nodes, NC nodes forward data to C nodes and C nodes compute
and forward differentials of their NC neighbors. More sophisticated techniques have
also been proposed [LTP05]. Distributed computation of differentials must be done in
a predefined order on an aggregation graph. Differentials are forwarded along a routing
tree.
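As a concrete illustration of this pattern, differential encoding in the style of DOSA can be sketched as follows; the function names, readings, and cluster layout are hypothetical stand-ins, not DOSA's actual interface.

```python
# Hedged sketch of differential encoding on an aggregation graph:
# NC nodes forward raw readings to a neighboring C node, which
# transmits its own reading plus one differential per NC neighbor.

def encode_cluster(c_reading, nc_readings):
    """C node output: its own reading, then differentials for NC neighbors."""
    return [c_reading] + [r - c_reading for r in nc_readings]

def decode_cluster(payload):
    """The sink inverts the differentials using the C node's reading."""
    c_reading, diffs = payload[0], payload[1:]
    return c_reading, [c_reading + d for d in diffs]

# Round trip for one cluster with three NC neighbors
payload = encode_cluster(20.5, [20.7, 20.4, 21.0])
assert decode_cluster(payload) == (20.5, [20.7, 20.4, 21.0])
```

When nearby readings are strongly correlated, the differentials are small and can be entropy-coded in fewer bits than the raw values, which is the source of the compression gain.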
Ciancio and Ortega [CO05] developed a distributed scheme for removing spatial cor-
relations using wavelet transforms via lifting steps. This scheme was for 1D paths and
was later extended by Ciancio et al. [CPOK06] to handle the merging of multiple paths.
A further enhancement by Shen and Ortega [SO08a] designed a transform to work over
any given tree. We describe a new transform [SPO09] that exploits the broadcast nature
of wireless transmission to achieve better SNR vs. cost performance.
2.2.4 Compressed sensing
Wavelet transform techniques are essentially critically sampled approaches, so that their
cost of gathering scales up with the number of sensors, which could be undesirable when
large deployments are considered. Compressed sensing (CS) has been considered as a
potential alternative in this context, as the number of samples required (i.e., number
of sensors that need to transmit data), depends on the characteristics (sparseness) of
the signal [CRT06, Don06]. In addition CS is also potentially attractive for wireless
sensor networks because most computations take place at the decoder (sink), rather than
encoder (sensors), and thus sensors with minimal computational power can efficiently
encode data.
CS theoretical developments have focused on minimizing the number of measurements
(i.e., the number of samples captured), rather than on minimizing the cost of each mea-
surement. In many CS applications (e.g., [DDT+08, LDP07]), each measurement is a
linear combination of many (or all) samples of the signal to be reconstructed.
It is easy to see why this is not desirable in the context of a sensor network: the
signal to be sampled is spatially distributed so that measuring a linear combination of all
the samples would entail a significant transport cost to generate each aggregate measure-
ment. To address this problem, sparse measurement approaches (where each measure-
ment requires information from a few sensors) have been proposed. Wang et al. [WGR07]
look at such an approach in a single hop network. We consider multi-hop sensor net-
works [LPS+09, PLS+09]. Compared with state of the art compressed sensing techniques
for sensor networks, our experimental results demonstrate significant gains in reconstruc-
tion accuracy and transmission cost.
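The contrast between dense and sparse measurements can be made concrete with a small sketch. The sizes below (100 sensors, 20 measurements, 5 sensors per sparse measurement) are purely illustrative, and reconstruction itself is omitted; the point is only the per-measurement transport cost.

```python
# Each dense CS measurement is a linear combination of all sensor
# readings, so every sensor contributes to every measurement; a sparse
# measurement touches only k sensors, reducing per-measurement transport.
import random

random.seed(0)
n_sensors, n_meas, k = 100, 20, 5

# Dense measurement matrix: every entry nonzero (Gaussian).
dense = [[random.gauss(0, 1) for _ in range(n_sensors)] for _ in range(n_meas)]

# Sparse measurement matrix: k nonzero entries per row.
sparse = []
for _ in range(n_meas):
    row = [0.0] * n_sensors
    for j in random.sample(range(n_sensors), k):
        row[j] = random.gauss(0, 1)
    sparse.append(row)

def sensors_touched(rows):
    """Total sensor contributions summed over all measurements."""
    return sum(sum(1 for x in row if x != 0.0) for row in rows)

assert sensors_touched(sparse) == n_meas * k
assert sensors_touched(sparse) < sensors_touched(dense)
```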
2.3 Distributed configuration
Only limited efforts have been devoted to understanding the problems associated with
distributed node configuration for compression. For efficiency and scalability, only a small
amount of “local” communications should be needed to determine which nodes exactly
perform which compression computations, over what data and how the data is then routed
to them. Distributed configuration is also desirable as it can help reduce initialization and
reconfiguration times since it is not necessary for a sink node to first gather information
about all nodes.
From a purely architectural viewpoint, it is well understood that addressing the re-
source constraints in sensor network operation requires cross-layer designs. However, this
flexibility has led to a proliferation of monolithic and vertically integrated systems. In the
absence of an agreement on the decomposition of services provided by system components
and their interactions, interfacing such designs for a practical deployment is more or less
infeasible.
Culler et al. [CDE+05] advocate the need for an overall sensor network architecture.
In a follow up paper by Tavakoli et al. [TDJ+07], a set of design principles is proposed
for the development of elements of the networking software architecture. In addition
to the traditional goals of code reuse and interoperability, these include extensibility.
This requirement arises in view of the relative immaturity of the field, where a rigid
and complete modularization stifles innovation. They recommend a hybrid approach,
with modularity for low level components (underlying infrastructure) and flexibility and
extensibility at the higher layer (programming paradigm).
We address the problem of distributed configuration for compression by proposing the
SenZip architecture. SenZip specifies a compression service that can encompass differ-
ent compression schemes and its modular interactions with standard networking services
such as routing. This architecture enables a distributed node configuration for compres-
sion, just as existing systems make it possible for sensors to configure themselves for
routing in a distributed manner. The architecture proposal is based on (a) lessons from
overall architectural principles for sensor networks [TDJ+07], (b) our own experience in
implementing a practical wavelet-based distributed compression system, and (c) identify-
ing common patterns in existing compression schemes. Work by Tarrio et al. [TVSO09]
(GSN '09) considers the design of simple wavelet-like techniques for distributed
compression which are explicitly designed to work over, and take advantage of, the
configuration mechanisms provided by the Collection Tree Protocol [tos].
2.4 System implementation
Most earlier work has focused on theory and simulations to understand performance
limits. These studies, and some limited system implementations (e.g., [ZCH07]), have
therefore had limited impact on technology adoption and sensor network software devel-
opment because they have not yielded modular and inter-operable software.
There have been previous efforts to implement simpler kinds of aggregation mechanisms in
sensor networks. These include the aggregation services in TAG/TinyDB [MFHH], application
independent distributed aggregation (AIDA) [HBSA04], and the differential encoding-based
distributed compression scheme whose implementation is described in [ZCH07]. There
has also been some prior work [GGP+03] on implementing traditional non-distributed
wavelet compression for multi-resolution storage and querying in sensor networks.
To demonstrate the utility and practicality of SenZip, we have implemented a system
to achieve compression over the Collection Tree Protocol [tos]. The resulting system
demonstrated distributed configuration and computation, and good reconstruction, for
compression with two different schemes, DPCM and 2D wavelets, over both static and
dynamic routing topologies [PSC+09]. The software modules were designed to be
reusable and extensible and are available on tinyos-contribs [sen].
Chapter 3
Modeling of joint routing and compression
In order to understand the space of interactions between routing and compression, we
study simplified models of three qualitatively different schemes. In routing-driven
compression, data is routed along shortest paths to the sink, with compression taking
place opportunistically wherever these routes happen to overlap [IEGH02, KEW02]. In
compression-driven routing, the route is chosen so as to compress the data from all nodes
sequentially, not necessarily along a shortest path to the sink. Our analysis shows that
these two schemes perform well at low and high spatial correlation, respectively.
As an ideal performance bound on joint routing-compression
techniques, we consider distributed source coding in which perfect source compression is
done a priori at the sources using complete knowledge of all correlations.
In order to obtain an application-independent abstraction for compression, we use the
joint entropy of sources as a measure of the uncorrelated data they generate. An empirical
The work described in this section was published as follows:
Sundeep Pattem, Bhaskar Krishnamachari and Ramesh Govindan, "The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks", Transactions on Sensor Networks (TOSN), Volume 4, Number 4, August 2008.
Sundeep Pattem, Bhaskar Krishnamachari and Ramesh Govindan, "The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks," Third Symposium on Information Processing in Sensor Networks (IPSN), 2004.
approximation for the joint entropy of sources as a function of the distance between them
is developed. A bit-hop metric is used to quantify the total cost of joint routing with
compression. Evaluation of the above schemes using these metrics leads naturally to a
clustering approach for schemes that perform well over the range of correlations.
We develop a simple scheme based on static, localized clustering that generalizes these
techniques. Analysis shows that the nature of optimal routing will depend on the number
of nodes, level of correlation and also on where the compression is effected; at the individ-
ual nodes or at intermediate aggregation points (cluster heads). Our main contribution
is a surprising result that there exists a near-optimal cluster size that performs well over
a wide range of spatial correlations. A min-max optimization metric for the near-optimal
performance is defined and a rigorous analysis of the solution is presented for both 1-D
(line) and 2-D (grid) network topologies. We show further that this near-optimal size is in
fact asymptotically optimal in the sense that, for any constant correlation level, the ratio
of the energy costs associated with the near-optimal cluster size to those associated with
the optimal clustering goes to one as the network size increases. Simulation experiments
confirm that the results hold for more general topologies - 2-D random geometric graphs
and realistic wireless communication topology with lossy links, and also for a continuous,
Gaussian data model for the joint entropy with varying quantization.
From a system-engineering perspective, this is a very desirable result because it elim-
inates the need for highly sophisticated compression-aware routing algorithms that adapt
to changing correlations in the environment (which may even incur additional overhead
for adaptation), and therefore simplifies the overall system design.
3.1 Assumptions and Methodology
Our focus is on applications which involve continuous data gathering for large scale and
distributed physical phenomena using a dense wireless sensor network where joint routing
and compression techniques would be useful. An example of this is the collection of data
from a field of weather sensors. If the nodes are densely deployed, the readings from
nearby nodes are likely to be highly correlated and hence contain redundancies, because
of the inherent smoothness or continuity properties of the physical phenomenon.
To compare and evaluate different routing with compression schemes, we will need a
common metric. Our focus is on energy expenditure, and we have therefore chosen to
use the bit-hop metric. This metric counts the total number of bit transmissions in the
network for one round of gathering data from all sources. Formally, let T = (V,E, ξT )
represent the directed aggregation tree (a subgraph of the communication graph) corre-
sponding to a particular routing scheme with compression, which connects all sources to
the sink. Associated with each edge e = (u, v) is the expected number of bits be to be
transported over that edge in the tree (per cycle). For edges emanating from sources that
are leaves on the tree, the bit count is the amount of data generated by a single source.
For edges emanating from aggregation points, the outgoing edge may have a smaller bit
count than the sum of bits on the incoming edges, due to aggregation. For nodes that
are neither sources nor aggregation points but act solely as routers, the outgoing edge will
contain the same number of bits as the incoming edge. The bit-hop metric $\xi_T$ is simply:

$$\xi_T = \sum_{e \in E} b_e. \qquad (3.1)$$
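A minimal sketch of evaluating this metric, assuming the aggregation tree and the expected per-edge bit counts $b_e$ are already known (the node names and bit counts below are illustrative):

```python
# The bit-hop cost is simply the sum of the expected bits carried on
# every edge of the aggregation tree in one gathering cycle.

def bit_hop_cost(edge_bits):
    """edge_bits maps directed edges (u, v) to expected bits b_e."""
    return sum(edge_bits.values())

# Three sources feed aggregation point 'a'; due to compression, the
# edge from 'a' to the sink 's' carries fewer bits than the sum of
# its incoming edges.
edges = {("x1", "a"): 8, ("x2", "a"): 8, ("x3", "a"): 8, ("a", "s"): 12}
assert bit_hop_cost(edges) == 36
```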
There are two possible criticisms of this metric that we should address directly. The
first is that the total transmissions may not capture the “hot-spot” energy usage of
bottleneck nodes, typically near the sink. However, an alternative metric that better
captures hot-spot behavior is not necessarily relevant if the a priori deployment and
energy placement ensure that the bottlenecks are not near the sink or if the sink changes
over time. The second possible criticism is that this does not incorporate reception costs
explicitly. However, the use of the bit-hop metric is justified because it does in fact implicitly
incorporate reception costs. If every bit transmission incurs the same corresponding
reception cost in the network, the sum of these transmission and reception costs will be
proportional to the total number of bit-hops.
To quantify the bit-hop performance of a particular scheme, therefore, we need to
quantify the amount of information generated by sources and by the aggregation points
after compression. For this purpose we use the entropy H of a source, which is a measure
of the amount of information it originates [CT91]. In this chapter, we consider only lossless
compression of data. In order to characterize correlation in an application-independent
manner, we use the joint entropy of multiple sources to measure the total uncorrelated
data they originate. Theoretically, using entropy-coding techniques this is the maximum
possible lossless compression of the data from these sources. We now attempt to construct
a parsimonious model to capture the essential nature of joint entropy of sources as a
function of distance. The simplicity of this approximation model enables the analysis
presented in the following sections.
In general, the extent of correlation in the data from different sources can be expected
to be a function of the distance between them. We used an empirical data-set pertaining
Figure 3.1: Empirical data (from the rainfall data-set) and approximation for joint entropy of linearly placed sources separated by different distances. The H2, H3 and H4 curves are each fit with c = 25 (RMS errors .03, .09 and .055).
to rainfall1 [WB99] to examine the amount of correlation in the readings of two sources
placed at different distances from each other. Since rainfall measurements are a contin-
uous valued random variable and hence would have infinite entropy, we present results
obtained from quantization. The range of values was normalized for a maximum value
of 100 and all readings ‘binned’ into intervals of size 10. Fig.3.1 is a plot of the average
joint entropy of multiple sources as a function of inter-source distance.
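The processing just described can be sketched as follows; the toy readings, the normalization, and the plug-in entropy estimator are illustrative stand-ins for the actual rainfall data pipeline:

```python
# Sketch of the quantization step: normalize readings to [0, 100],
# bin into intervals of width 10, and estimate (joint) entropy from
# the empirical bin frequencies. Readings are purely illustrative.
import math
from collections import Counter

def quantize(readings, bin_width=10, max_value=100.0):
    """Normalize readings to [0, max_value] and map each to a bin index."""
    top = max(readings)
    return [int(min(r / top * max_value, max_value - 1e-9) // bin_width)
            for r in readings]

def empirical_entropy(symbols):
    """Plug-in entropy estimate (in bits) from empirical frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(k / n * math.log2(k / n) for k in counts.values())

# Joint entropy of two co-located series: treat paired bins as one symbol.
a = quantize([0.0, 3.1, 7.9, 2.2, 9.5, 1.1])
b = quantize([0.1, 3.0, 8.2, 2.0, 9.9, 1.0])
h_a = empirical_entropy(a)
h_joint = empirical_entropy(list(zip(a, b)))
assert h_a <= h_joint + 1e-12 <= h_a + empirical_entropy(b) + 1e-12
```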
The figure shows a steeply rising convex curve that reaches saturation quickly. This
is expected since the inter-source distance is large (in multiples of 50km). From the
empirical curve, a suitable model for the average joint entropy of two sources (H2) as a
function of inter-source distance d is obtained as:
$$H_2(d) = H_1 + \left[1 - \frac{1}{\frac{d}{c} + 1}\right] H_1. \qquad (3.2)$$
1This data-set consists of the daily rainfall precipitation for the Pacific Northwest region over a period of 46 years. The final measurement points in the data-set formed a regular grid of 50km x 50km regions over the entire region under study. Although this is considerably larger-scale than the sensor networks of interest to us, we believe the use of such "real" physical measurements to validate spatial correlation models is important.
Here c is a constant that characterizes the extent of spatial correlation in the data. It
is chosen such that when d = c, $H_2 = \frac{3}{2}H_1$. In other words, when inter-source distance
d = c, the second source generates half the first node's amount in terms of uncorrelated
data. In Fig.3.1, a value of c = 25 matches the H2 curve well.
Finally, this leaves open the question of how to obtain a general expression for the joint
entropy of n sources at arbitrary locations. As we shall show later, this is needed in order
to study the performance of various strategies for combined routing and compression. To
this end, we now present a constructive technique to calculate approximately the total
amount of uncorrelated data generated by a set of n nodes.
From Eqn.3.2, it appears that on average, each new source contributes an amount of
uncorrelated data equal to $\left[1 - \frac{1}{\frac{d}{c} + 1}\right] H_1$, where we take d as the minimum distance
to an existing set of sources. This suggests a constructive iterative technique to calculate
approximately the total amount of uncorrelated data generated by a set of n nodes:
1. Initialize a set $S_1 = \{v_1\}$, where $v_1$ is any node. We will denote by $H(S_i)$ the joint
entropy of the nodes in set $S_i$, with $H(S_1) = H_1$. Let V be the set of all nodes.

2. Iterate the following for i = 2 : n

(a) Update the set by adding a node $v_i$, where $v_i \in V \setminus S_{i-1}$ is the closest (in terms
of Euclidean distance) of the nodes not in $S_{i-1}$ to any node in $S_{i-1}$, i.e. set
$S_i = S_{i-1} \cup \{v_i\}$.

(b) Let $d_i$ be the shortest distance between $v_i$ and the set of nodes in $S_{i-1}$. Then
calculate the joint entropy as $H(S_i) = H(S_{i-1}) + \left[1 - \frac{1}{\frac{d_i}{c} + 1}\right] H_1$.

3. The final iteration yields $H(S_n)$ as an approximation of $H_n$.
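The constructive technique above can be implemented directly. The sketch below uses illustrative coordinates and parameters, and checks the greedy procedure against the closed-form expression for the equally spaced linear case (Eqn. 3.3 in the text).

```python
# Greedy joint-entropy approximation: repeatedly add the node closest
# to the current set and charge it (1 - 1/(d/c + 1)) * H1 of new
# information. Coordinates, c, and H1 are illustrative.
import math

def approx_joint_entropy(points, c, h1=1.0):
    remaining = list(points)
    current = [remaining.pop(0)]          # S1 = {v1}
    h = h1                                # H(S1) = H1
    while remaining:
        # nearest remaining node to the current set, and its distance
        d, v = min((min(math.dist(u, w) for w in current), u)
                   for u in remaining)
        h += (1.0 - 1.0 / (d / c + 1.0)) * h1
        current.append(v)
        remaining.remove(v)
    return h

# Linear, equally spaced check against the closed form
# Hn(d) = H1 + (n - 1) * (1 - 1/(d/c + 1)) * H1, with H1 = 1:
n, d, c = 5, 1.0, 25.0
approx = approx_joint_entropy([(i * d, 0.0) for i in range(n)], c)
closed = 1.0 + (n - 1) * (1.0 - 1.0 / (d / c + 1.0))
assert abs(approx - closed) < 1e-9
```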
In the simple case when all nodes are located on a line equally spaced by a distance
d, this procedure would yield the expression:
$$H_n(d) = H_1 + (n-1)\left[1 - \frac{1}{\frac{d}{c} + 1}\right] H_1. \qquad (3.3)$$
That this closed-form expression provides a good approximation for a linear scenario is
validated by our measurements from the rainfall data set, as seen in Fig.3.1. The curve
for H3 was obtained by considering all sets of grid points (p1, p2, p3) such that they lie in
a straight line with the distance between two adjacent points plotted on the x-axis. The
curve for H4 was similarly obtained using all sets of 4 points.
3.1.1 Note on Heuristic Approximation
We note that the final approximation $H(S_n)$ is guaranteed to be greater than the true joint
entropy $H(v_1, v_2, \ldots, v_n)$. Thus it does represent a rate achievable by lossless compression.
The approximation roughly corresponds to a rate allocation of $H(v_i \mid \eta_{v_i})$ at every node $v_i$,
where $\eta_{v_i}$ is the nearest neighbor of $v_i$. A more precise information-theoretic treatment
in terms of the rate allocations at each node is possible, for instance, as in [CBLV04,
CBLVW06]. We relinquish some rigor with the objective of gaining practical insight. This
approach makes the problem more tractable and is the basis for analysis in subsequent
sections.
Another point of contention is the need for such a heuristic approach instead of using
a continuous data model with analytical expressions for the joint entropy. In this regard,
we note that (a) our model matches the standard jointly Gaussian
entropy model for low correlation [Appendix ??] and (b) since the standard expression is
in covariance form, it cannot be used for high correlation values, necessitating a reasonable
approximation.
3.2 Routing Schemes
Given this framework, we can now evaluate the performance of different routing schemes
across a range of spatial correlations. We choose three qualitatively different routing
schemes; these schemes are simplified models of schemes that have been proposed in the
literature.
1. Distributed Source Coding (DSC): If the sensor nodes have perfect knowledge about
their correlations, they can encode/compress data so as to avoid transmitting re-
dundant information. In this case, each source can send its data to the sink along
the shortest path possible without the need for intermediate aggregation. Since
we ignore the cost of obtaining this global knowledge, our model for DSC is very
idealized and provides a baseline for evaluating the other schemes.
2. Routing Driven Compression (RDC): In this scheme, the sensor nodes do not have
any knowledge about their correlations and send data along the shortest paths to the
sink while allowing for opportunistic aggregation wherever the paths overlap. Such
shortest path tree aggregation techniques are described, for example, in [IEGH02]
and [KEW02].
3. Compression Driven Routing (CDR): As in RDC, nodes have no knowledge of the
correlations but the data is aggregated close to the sources and initially routed so
Figure 3.2: Illustration of routing for the three schemes: DSC, CDR, and RDC. Hi is the joint entropy of i sources.
as to allow for maximum possible aggregation at each hop. Eventually, this leads
to the collection of data removed of all redundancy at a central source from where
it is sent to the sink along the shortest possible path. This model is motivated by
the scheme in [SS05].
3.2.1 Comparison of the schemes
Consider the arrangement of sensor nodes in a grid, where only the 2n− 1 nodes in the
first column are sources. We assume that there are n1 hops on the shortest path between
the sources and the sink. For each of the three schemes, the paths taken by data and the
intermediate aggregation are shown in Fig.3.2.
In our analysis, we ignore the costs incurred by each compressing node in learning the
relevant correlations. This cost is particularly high in DSC, where each node must learn
the correlations with all other source nodes. However the bit-hop cost still provides a
useful metric for evaluating the performance of the various schemes and allows us to treat
DSC as the optimal policy providing a lower-bound on the bit-hop metric.
Using the approximation formulae for joint entropy and the bit-hop metric for energy,
the expressions for the energy expenditure (E) for each scheme are as follows.
For the idealized DSC scheme, each source is able to send exactly the right amount
of uncorrelated data, and each source can send the data along the shortest path to the
sink, so that:
$$E_{DSC} = n_1 H_{2n-1}. \qquad (3.4)$$
Lemma 3.2.1. EDSC represents a lower bound on bit-hop costs for any possible routing
scheme with lossless compression.
Proof: The total joint information of all (2n− 1) sources is H2n−1. As discussed before,
no lossless compression scheme can reduce the total information transmitted below this
level. Each bit of this information must travel at least n1 hops to get from any source to
the sink. Thus n1H2n−1, the cost of the idealized DSC scheme, represents a lower bound
on all possible routing schemes with lossless compression.
In the RDC scheme, the tree is as shown in Fig.3.2 (middle), with data being com-
pressed along the spine in the middle. It is possible to derive an expression for this
scenario:
$$E_{RDC} = (n_1 - n) H_{2n-1} + 2 H_1 \sum_{i=1}^{n-1} i + \sum_{j=0}^{n-2} H_{2j+1}. \qquad (3.5)$$
Figure 3.3: Comparison of energy expenditures for the RDC, CDR and DSC schemes with respect to the degree of correlation c (energy usage in bit-hops vs. the correlation parameter c on a log scale).
For the CDR scheme, the data is compressed along the location of the sources, and
then sent together along the middle, as shown in Fig. 3.2. It can be shown that for this
scenario:
$$E_{CDR} = n_1 H_{2n-1} + 2 \sum_{i=1}^{n-1} H_i. \qquad (3.6)$$
The above expressions, in conjunction with the expression for Hn presented earlier,
allow us to quantify the performance of each scheme. Fig.3.3 plots the energy expenditure
for the DSC, RDC and CDR schemes as a function of the correlation constant c, for
different forms of the correlation function. For these calculations, we assumed a grid
with n1 = n = 53 and 2n − 1 = 105 sources. From this figure it is clear that CDR
approaches DSC and outperforms RDC for higher values of c (high correlation) while
RDC performance matches DSC and outperforms CDR for low c (no correlation). This
can be intuitively explained by the tradeoff between compressing close to the sources and
transporting information toward the sink. CDR places a greater emphasis on maximizing
the amount of compression close to the sources, at the expense of longer routes to the
sink, while RDC does the reverse. When there is no correlation in the data (small c),
no compression is possible and hence it is RDC that minimizes the total bit-hop metric.
When there is high correlation (large c), significant energy gains can be realized by
compressing as close to the sources as possible and hence CDR performs better under
these conditions.
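The crossover described above can be reproduced numerically from expressions (3.4)-(3.6) under the unit-spacing entropy model $H_k = H_1(1 + \frac{k-1}{1+c})$; this sketch assumes H1 = 1 and the grid dimensions used for Fig. 3.3.

```python
# Evaluate the bit-hop expressions (3.4)-(3.6) with n1 = n = 53,
# i.e. 2n - 1 = 105 sources, under H_k = H1 * (1 + (k - 1)/(1 + c)).

def H(k, c, h1=1.0):
    return h1 * (1.0 + (k - 1) / (1.0 + c))

def e_dsc(n, n1, c):
    return n1 * H(2 * n - 1, c)

def e_rdc(n, n1, c):
    return ((n1 - n) * H(2 * n - 1, c)
            + 2 * sum(range(1, n))                     # 2 * H1 * sum of i
            + sum(H(2 * j + 1, c) for j in range(n - 1)))

def e_cdr(n, n1, c):
    return n1 * H(2 * n - 1, c) + 2 * sum(H(i, c) for i in range(1, n))

n = n1 = 53
low, high = 0.1, 100.0
# RDC tracks DSC at low correlation; CDR approaches DSC at high correlation.
assert e_rdc(n, n1, low) < e_cdr(n, n1, low)
assert e_cdr(n, n1, high) < e_rdc(n, n1, high)
assert e_dsc(n, n1, high) <= e_cdr(n, n1, high)
```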
Interestingly, it appears that neither RDC nor CDR performs well for intermediate
correlation values. This suggests that in this range a hybrid scheme may provide energy-
efficient performance closer to the DSC curve. CDR and RDC can be viewed as two
extremes of a clustering scheme, with CDR having all data sources form a single aggre-
gation cluster before sending data towards the sink while RDC has each source acting as
a separate cluster in itself. A hybrid scheme would be one in which sources form small
clusters and data is aggregated within them at a cluster head, which then sends data
to the sink along a shortest path. This insight leads us to an examination of suitable
clustering techniques.
3.3 A Generalized Clustering Scheme
The idea behind using clustering for data routing is to achieve a tradeoff between aggre-
gating near the sources and making progress towards the sink. In addition to factors like
number of nodes and position of sink, the optimal cluster size will also depend on the
amount of correlation in the data originated by the sources (quantified by the value of
c). Generally, the amount of correlation in the data is highest for sensor nodes located
close to each other and can be expected to decrease as the separation between nodes
increases. Once an optimal clustering based on correlations is obtained, aggregation of
data is required only for the sources within a cluster, after which data can be routed to
the sink without the need for further aggregation. As a consequence, none of the scenarios
considered henceforth will resemble RDC exactly.
3.3.1 Description of the scheme
We now describe a simple, location-based clustering scheme. Given a sensor field and
a cluster size, nodes close to each other form clusters. The clusters so formed remain
static for the lifetime of the network. Within each cluster, the data from each of the
nodes is routed along a shortest path tree (SPT) to a cluster head node. This node then
sends the aggregated data from its cluster to the sink along a multi-hop path with no
intermediate aggregation. This is illustrated in Fig. 3.4. The intermediate nodes on the
SPT may or may not perform aggregation. Data aggregation in the form of compression
is computationally intensive. Not all nodes in a network may be capable of performing
compression, either because it is too expensive for them to do so or because the delays involved
are unacceptable. It is conceivable that there will be a few high power nodes or micro-
servers [HCJB04] which will perform the compression. Nodes form clusters around these
nodes and route data to them. In this case, data aggregation takes place only at the
cluster head.
3.3.1.1 Metrics for evaluation of the scheme
Es(c) is defined as the energy cost (in bit-hops) for correlation c and cluster size s.
The optimal cluster size $s_{opt}(c)$ minimizes the cost for a given c. Let $E^*(c) = E_{s_{opt}(c)}(c)$
Figure 3.4: Illustration of clustering for a two-dimensional field of sensors, showing intra-cluster routing from sources to a cluster-head and extra-cluster routing of the compressed data to the sink.
represent the optimal energy cost for a given correlation c. For simplifying system design,
it is desirable to have a cluster size that performs close to the optimal over the range of
c values. We quantify the notion of ‘being close to optimal’ by defining a near-optimal
cluster size sno as the value of s that minimizes the maximum difference metric, i.e.
$$s_{no} = \arg\min_{s \in [1,n]} \max_{c \in [0,\infty)} \; E_s(c) - E^*(c). \qquad (3.7)$$
In the following sections, we analyze the performance of the clustering scheme for
both 1-D and 2-D networks when aggregation is performed
• at intermediate nodes on the SPT, and
• only at the cluster-heads.
3.3.2 1-D Analysis
We begin with an analysis of the energy costs of clustering for a setup involving a linear
array of sources to better understand the tradeoffs. Consider n source nodes linearly
placed with unit spacing (i.e. d = 1) on one side of a 2-D grid of nodes, with the sink on
the other side, and assuming the correlation model $H_n = H_1\left(1 + \frac{n-1}{1+c}\right)$. We consider n/s
clusters, each consisting of s nodes. Since all sources have the same shortest hop distance
to the sink, the position of the cluster head within a cluster has no effect on the results.
Within each cluster, the data can either be compressed sequentially on the path to the
cluster head or only when it reaches the cluster head. The cluster head then sends the
compressed data along a shortest path involving D hops to the sink. The total bit-hop
cost for such a routing scheme is therefore
$$E_s(c) = \frac{n}{s}\left(E^{intra}_{s,c} + E^{extra}_{s,c}\right), \qquad (3.8)$$

where $E^{intra}_{s,c}$ and $E^{extra}_{s,c}$ are, respectively, the bit-hop cost within each cluster and the
bit-hop cost for each cluster to send the aggregate information to the sink.
3.3.2.1 Sequential compression along SPT to cluster head
At each hop within the cluster, a node receives Hi bits, aggregates them with its own
data and transmits Hi+1 bits. This is done sequentially until the data reaches the cluster
head. We have,
$$E^{intra}_{s,c} = \sum_{i=1}^{s-1} H_i = \sum_{i=1}^{s-1}\left(1 + \frac{i-1}{1+c}\right)H_1 = \left(s - 1 + \frac{(s-2)(s-1)}{2(1+c)}\right)H_1.$$
Since the cluster heads get aggregated data from s sources and send it to the sink using
a shortest path of D hops,
$$E^{extra}_{s,c} = H_s \cdot D = \left(1 + \frac{s-1}{1+c}\right)H_1 \cdot D$$

$$\Rightarrow \; E_s(c) = nH_1\left(\frac{s-1}{s} + \frac{(s-2)(s-1)}{2s(1+c)} + \frac{D}{s} + \frac{(s-1)D}{s(1+c)}\right). \qquad (3.9)$$
The optimum value of the cluster size sopt can be determined by setting the derivative
of the above expression equal to zero. It can be shown that
$$s_{opt} = \begin{cases} 1, & \text{if } c \le \dfrac{1}{2(D-1)} \\[2ex] \sqrt{2c(D-1)}, & \text{if } \dfrac{1}{2(D-1)} < c < \dfrac{n^2}{2(D-1)} \\[2ex] n, & \text{if } c \ge \dfrac{n^2}{2(D-1)}. \end{cases}$$
Note that sopt depends on the distance from the sources to the sink2 and the degree of
correlation c.
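The case analysis behind s_opt can be validated by brute force against Eq. 3.9. A sketch under the same setup (helper names are ours):

```python
import math

def cost_39(s, c, n, D, h1=1.0):
    """Bit-hop cost E_s(c) of Eq. 3.9 (sequential compression to the cluster head)."""
    return n * h1 * ((s - 1) / s + (s - 2) * (s - 1) / (2 * s * (1 + c))
                     + D / s + (s - 1) * D / (s * (1 + c)))

def s_opt(c, n, D):
    """Piecewise-optimal cluster size for Eq. 3.9."""
    if c <= 1 / (2 * (D - 1)):
        return 1
    if c >= n * n / (2 * (D - 1)):
        return n
    return math.sqrt(2 * c * (D - 1))
```

A brute-force scan of cost_39 over integer s ∈ [1, n] lands within one unit of the closed form for any c, which is a quick way to validate the three cases.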
Fig. 3.5 shows, based on the analysis presented above, how different cluster sizes perform across a range
of correlation levels for a set of 105 linearly placed nodes. As expected, small cluster sizes and large cluster sizes perform well at low
and high correlations, respectively. However, an intermediate cluster size
near 15 performs well across the whole range of correlation values. The curve with
s = 105 corresponds to CDR, and the DSC curve is also plotted for reference.
2It is, however, assumed that D ≥ n, so there is an implicit dependence on n.
Theorem 3.3.1. For E_s(c) given by Equation 3.9, the near-optimal cluster size s_no defined by Equation 3.7 exists, and is given by

s_no = Θ(min(√D, n)).
The following lemma is required for proving the theorem.
Lemma 3.3.2. To solve the optimization problem in Eqn. 3.7 for E_s(c) given by Eqn. 3.9,
it suffices to find s = s_no such that

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).    (3.10)
Proof. We first show that for any arbitrary s, this difference is maximum at one of the
two extremes (i.e., at c = 0 and c → ∞). Let

E^d_s(c) = E_s(c) − E*(c) = E_s(c) − E_{s_opt}(c)
         = nH_1 (s − s_opt)(s·s_opt − 2c(D−1)) / (2 s·s_opt (1+c)).

Then

∂E^d_s(c)/∂c = −nH_1 (s − 1)(s + 2(D−1)) / (2s(1+c)²),                 if c ≤ 1/(2(D−1))
             = −nH_1 (s − √(2c(D−1)))(s + √(2(D−1)/c)) / (2s(1+c)²),   if 1/(2(D−1)) < c < n²/(2(D−1))
             = −nH_1 (s − n)(s·n + 2(D−1)) / (2s·n(1+c)²),             if c ≥ n²/(2(D−1)).

E^d_s(c) and its derivative vanish for the same values of c, and since E^d_s(c) is non-negative,
the minimum is achieved at these values of c.
The derivative is continuous for all s ∈ [1, n], and
• for a particular value of s ∈ (1, n), it is zero for only one value of c;
• for s = 1, it is zero only for c ∈ [0, 1/(2(D−1))];
• for s = n, it is zero only for c ∈ [n²/(2(D−1)), ∞).
From the non-negativity of Eds (c) and the above properties of its derivative, we can
conclude that:
• for s ∈ (1, n), E^d_s(c) is convex
• for s = 1, it is monotonically increasing
• for s = n, it is monotonically decreasing.
This implies that E^d_s(c) is maximum either at c = 0 or at c = ∞, and Eqn. (3.7) reduces
to

min_{s∈[1,n]} max( E_s(0) − E*(0), E_s(∞) − E*(∞) ).    (3.11)
From Eqn. (3.9), we can derive the following expressions for the energy costs of clustering
schemes at the two extreme correlation values:

E_s(0) = nH_1 ( (s−1)/2 + D )
E*(0) = nH_1 D
E_s(∞) = nH_1 ( 1 + (D−1)/s )
E*(∞) = nH_1 ( 1 + (D−1)/n ).    (3.12)
Substituting Eqn. (3.12) in Eqn. (3.11) and disregarding common factors, we obtain:

min_{s∈[1,n]} max( (s−1)/2, (D−1)/s − (D−1)/n ).    (3.13)
Let f1(s) = (s−1)/2 and f2(s) = (D−1)/s − (D−1)/n. We have

max(f1, f2)|_{s=1} = f2(1)
max(f1, f2)|_{s=n} = f1(n).
For s ∈ (1, n), f1, f2 are continuous, f1 is increasing and f2 is decreasing. Therefore,
max(f1, f2) achieves its minimum for s = s_no such that

f1(s_no) = f2(s_no),

i.e. E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).
Proof of Theorem 3.3.1: Solving f1(s_no) = f2(s_no), we get

(s_no − 1)/2 = (D−1)/s_no − (D−1)/n
⇒ s_no² + ( 2(D−1)/n − 1 ) s_no − 2(D−1) = 0
⇒ s_no = √( 2(D−1) + ((D−1)/n − 1/2)² ) − ( (D−1)/n − 1/2 )
       = Θ(min(√D, n)).
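For the n = D = 105 example, this closed form gives s_no ≈ 14. A quick numerical check (helper name ours) that the root balances the two extreme penalties:

```python
import math

def s_no(n, D):
    """Near-optimal cluster size: positive root of s^2 + (2(D-1)/n - 1)s - 2(D-1) = 0."""
    b = (D - 1) / n - 0.5
    return math.sqrt(2 * (D - 1) + b * b) - b
```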
Figure 3.5: Comparison of the performance of different cluster sizes for a linear array of sources (n = D = 105) with compression performed sequentially along the path to cluster heads. The optimal cluster size is a function of the correlation parameter c. Also, cluster size s = 15 performs close to optimal over the range of c. (Axes: correlation parameter c on a log scale vs. transmission cost E_s(c) in bit-hops; curves for s = 1, 3, 7, 15, 35, 105 (CDR), and DSC.)
This is illustrated in Fig.3.6, in which the costs are plotted with respect to the cluster
sizes for a few different values of the spatial correlation. The figure shows clearly that
although the optimal cluster size does increase with correlation level, the near-optimal
static cluster size performs very well across a range of correlation values. In this figure,
D = n = 105, and the near-optimal cluster size obtained from Theorem 3.3.1, s_no = 14, is
indicated by the vertical line in the plot. Intersections of the dotted lines and the nearest
c curve with this vertical line show the difference in energy cost between the near-optimal
and optimal solutions.
Figure 3.6: Illustration of the existence of a static cluster size for near-optimal performance across a range of correlations. The sources are in a linear array and data is sequentially compressed along the path to cluster heads. (Axes: cluster size s on a log scale vs. transmission cost E_s(c) in bit-hops; curves for c = 0.01, 1, 2, 5, 10, 100, with s_opt(c) and s_no marked.)
3.3.2.2 Compression at cluster head only
In this case, each source within a cluster sends data to the cluster head using a shortest
path. There is no aggregation before reaching the cluster head. We have,
E^{intra}_{s,c} = Σ_{i=1}^{s−1} i · H_1 = ( s(s−1)/2 ) H_1

E^{extra}_{s,c} = (1 + (s−1)/(1+c)) H_1 · D

⇒ E_s(c) = nH_1 ( (s−1)/2 + D/s + (s−1)D/(s(1+c)) ).    (3.14)
It can be shown that

s_opt = 1,               if c ≤ 1/(2D − 1)
      = n,               if c > n²/(2D − n²) and 2D > n²
      = √(2Dc/(c+1)),    otherwise.
Fig. 3.7 shows that for a linear array of sources (with n = D = 105), the performance
for cluster sizes s = 5, 7 is close to optimal over the range of c. The DSC curve is plotted
for reference.
Theorem 3.3.3. For E_s(c) given by Equation 3.14, the near-optimal cluster size s_no
defined by Equation 3.7 exists, and is given by

s_no = Θ(min(√D, n)).
The following lemma is required for proving the theorem.
Lemma 3.3.4. The near-optimal cluster size s = s_no for E_s(c) given by Eqn. 3.14 satisfies
the condition

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).
Proof. The proof is similar to the proof of Lemma 3.3.2, with

f1(s) = ( E_s(0) − E*(0) ) / (nH_1) = (s−1)/2,  and

f2(s) = ( E_s(∞) − E*(∞) ) / (nH_1)
      = s/2 + D/s − √(2D),       if 2D ≤ n²
      = (s−n)/2 + D/s − D/n,     otherwise.
Figure 3.7: Performance with compression only at the cluster head, with nodes in a linear array (n = D = 105). Cluster sizes s = 5, 7 are close to optimal over the range of c. (Axes: correlation parameter c on a log scale vs. transmission cost E_s(c) in bit-hops; curves for s = 1, 3, 5, 7, 15, 105, and DSC.)
Proof of Theorem 3.3.3: Using Lemma 3.3.4 and solving

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞)

for E_s(c) given by Eqn. 3.14, we get

s_no = 2D/(2√(2D) − 1)  (≈ √(D/2)),   if 2D < n²
     = 2Dn/(2D + n(n−1)),             otherwise.

It can be verified that

s_no = Θ(√D)  if D = o(n²)
     = n      if D = Ω(n²).
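The same kind of numerical check applies here (helper name ours): for D = 105, the closed form gives s_no ≈ 7.5, consistent with the s = 7 ≈ √(105/2) read off Fig. 3.8.

```python
import math

def s_no_head_only(D, n):
    """Near-optimal cluster size when compression happens only at the cluster head."""
    if 2 * D < n * n:
        return 2 * D / (2 * math.sqrt(2 * D) - 1)   # approximately sqrt(D/2)
    return 2 * D * n / (2 * D + n * (n - 1))
```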
Figure 3.8: Illustration of the near-optimal cluster size with compression only at the cluster head, with nodes in a linear array. The performance of cluster sizes near s = 7 (≈ √(105/2)) is close to optimal over the range of c values. (Axes: cluster size s on a log scale vs. transmission cost E_s(c) in bit-hops; curves for c = 0.01, 0.5, 1.0, 2.0, 10, 100, 10000, with s_opt(c) and s = (n/2)^{1/2} marked.)
The existence of a near-optimal cluster size is illustrated in Fig. 3.8. The performance
of cluster sizes near s = 7 is close to optimal over the range of c values.
3.3.3 2-D analysis
Consider a 2-D network in which N = n² nodes are placed on an n × n unit grid and are
divided into clusters of size s × s. We assume that each node can communicate directly
only with its 8 immediate neighbors. The routing pattern within a cluster and from
the cluster heads to the sink is similar and is illustrated in Fig. 3.9. Note that using the
iterative approximation described in Section 3.1, the joint entropy of k adjacent3 nodes
on a grid is the same as the joint entropy of k sensors lying on a straight line. Fig. 3.9(a)
illustrates this along the diagonal.
The results for the linear array of sources do not extend directly to a two-dimensional
arrangement where every node is both a source and a router. In the 1-D case, the optimal
3nodes forming a contiguous set
Figure 3.9: Intra-cluster routing in a 2-D grid arrangement. (a) Opportunistic compression along the shortest path to the cluster head. For calculation of joint entropy, using the iterative approximation, the joint entropy of k nodes forming a contiguous set is the same as the joint entropy of k sensors lying on a straight line. This is illustrated along the diagonal. (b) Compression only at the cluster head. The routing from cluster heads to the sink is similar to this case.
aggregation tree is different from the shortest path tree (except for the case with zero
correlation). This is because moving towards the sources allows greater compression than
moving towards the sink. In the 2-D case however, there are opportunities for compres-
sion in all directions. Hence, it is always possible to achieve compression while making
progress towards the sink.
3.3.3.1 Opportunistic compression along SPT to cluster head
According to the approximation we have been using for the joint entropy, the contribution
of a node v is H(v|η_v), where η_v is the nearest neighbor of v. If we assume that H(v|η_v)
is the fixed rate allocation for every node v, it follows4 that a network-wide SPT is the
4see [CBLV04] for a formal proof
optimal routing structure. In other words, the optimal cluster size s = n for all values of
correlation parameter c. There is no incentive for data to deviate from a shortest path
to the sink. The result is established more precisely in the following lemma.
Lemma 3.3.5. For a 2-D grid with opportunistic compression along an SPT to cluster
head, the optimal cluster size is s = n for any value of correlation parameter c ∈ [0,∞].
Proof. Consider a cluster of size s × s. The routing within the cluster is as shown in Fig.
3.9(a), and routing from the cluster head to the sink is as shown in Fig. 3.9(b). The routing costs
are obtained as follows:
E^{intra}_{s,c} = (n/s)² Σ_{i=1}^{s−1} ( 2(s−i) H_i + H_{i²} )
              = (n/s)² Σ_{i=1}^{s−1} ( 2(s−i)(1 + (i−1)/(1+c)) H_1 + (1 + (i²−1)/(1+c)) H_1 )
              = (n/s)² (s−1) ( s + 1 + (s−2)(4s+3)/(6(1+c)) ) H_1

E^{extra}_{s,c} = Σ_{i=0}^{n/s−1} Σ_{j=0}^{n/s−1} max{s·i, s·j} H_{s²}
              = s ( Σ_{i=0}^{n/s−1} Σ_{j=0}^{i} i + Σ_{i=0}^{n/s−1} Σ_{j=i+1}^{n/s−1} j ) (1 + (s²−1)/(1+c)) H_1
              = (n/6)(n/s − 1)(4n/s + 1)(1 + (s²−1)/(1+c)) H_1.
The total cost is

E_s(c) = E^{intra}_{s,c} + E^{extra}_{s,c}.
The routing cost for a network-wide SPT, i.e. with s = n, is

E_n(c) = E^{intra}_{n,c} + 0 = (n−1) ( n + 1 + (n−2)(4n+3)/(6(1+c)) ) H_1.
Now, for any s < n and any value of c, consider the difference

E_s(c) − E_n(c) = ( n/(6(1+c)) ) ( ( ns − n/s − s² + 1 ) + (c/s²)( 4n² − 3ns − s² − 6n + 6s²/n ) ).    (3.15)
It can be verified that the two terms

ns − n/s − s² + 1   and   4n² − 3ns − s² − 6n + 6s²/n

are positive for any value of s < n. Hence the difference in Eqn. 3.15 is always positive.
This implies that for all values of c ∈ [0, ∞], E_s(c) is minimized at s = n.
It should be noted that the optimality of a network-wide SPT obtained above is
contingent on two of our assumptions: 1. a grid topology, and 2. routing within clusters
is along an SPT. Cristescu et al. [CBLV04] and von Rickenbach et al. [vRW04] show results for
general graph topologies.
3.3.3.2 Compression at cluster head only
When compression is possible only at cluster heads, there is a definite tradeoff between progress
towards the sink and compression at intermediate points. Since there is no compression
before reaching and after leaving the cluster heads, shortest-path routing is optimal within
clusters and from cluster heads to the sink (Fig. 3.9(b)). Let E_s(c) be the total cost for a
network with cluster size s × s and correlation parameter c. E^{intra}_s and E^{extra}_s are defined
as the combined intra-cluster costs and the overall cost for routing from cluster heads to
the sink, respectively. From Fig. 3.9, a node at (i, j) will take max{i, j} hops to reach the
cluster head at (0, 0). Since there are (n/s)² clusters, we have
E^{intra}_{s,c} = (n/s)² Σ_{i=0}^{s−1} Σ_{j=0}^{s−1} max{i, j} H_1 = (n/s)² ( Σ_{i=0}^{s−1} Σ_{j=0}^{i} i + Σ_{i=0}^{s−1} Σ_{j=i+1}^{s−1} j ) H_1
              = (n/s)² ( Σ_{i=0}^{s−1} i(i+1) + Σ_{i=0}^{s−1} ( (i+1) + (i+2) + ... + (s−1) ) ) H_1
              = (n/s)² ( Σ_{i=0}^{s−1} i(i+1) + Σ_{i=0}^{s−1} ( (s−1)s/2 − i(i+1)/2 ) ) H_1
              = ( n²/(6s) ) (s−1)(4s+1) H_1.    (3.16)
Now, the shortest route between adjacent cluster heads is s hops. Hence,

E^{extra}_{s,c} = Σ_{i=0}^{n/s−1} Σ_{j=0}^{n/s−1} max{s·i, s·j} H_{s²} = s Σ_{i=0}^{n/s−1} Σ_{j=0}^{n/s−1} max{i, j} (1 + (s²−1)/(1+c)) H_1
              = (n/6)(n/s − 1)(4n/s + 1)(1 + (s²−1)/(1+c)) H_1,    (3.17)

using the expression for Σ Σ max{i, j} from Eqn. 3.16.
E_s(c) = E^{intra}_{s,c} + E^{extra}_{s,c}
       = [ ( n²/(6s) )(s−1)(4s+1) + (n/6)(n/s − 1)(4n/s + 1)(1 + (s²−1)/(1+c)) ] H_1.    (3.18)
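Eq. 3.18 is straightforward to evaluate directly; a minimal sketch (the function name is ours):

```python
def cost_318(s, c, n, h1=1.0):
    """Total bit-hop cost of Eq. 3.18 (2-D grid, compression only at cluster heads)."""
    intra = (n * n / (6.0 * s)) * (s - 1) * (4 * s + 1)
    extra = (n / 6.0) * (n / s - 1) * (4 * n / s + 1) * (1 + (s * s - 1) / (1.0 + c))
    return (intra + extra) * h1
```

Two sanity checks: the extra-cluster term vanishes at s = n (a single cluster), and the cost never increases with the correlation parameter.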
Fig. 3.10 shows the performance of the scheme for various cluster sizes for a 1000 × 1000
network. While the optimal cluster size depends on the value of c, we again find that
there are certain intermediate cluster sizes (s = 5, 10, 25) that perform near-optimally
over a wide range of spatial correlations.
It can be shown that

s_opt(c) = ( (8c/(4c+1)) n )^{1/3} + o(n^{1/3}).
Setting the partial derivative of E_s(c) with respect to s to zero,

∂E_s(c)/∂s = ( n/(6(c+1)) ) ( −2s + (4c+1)n + (4c+1)n/s² − 8cn²/s³ ) H_1 = 0

⇒ −2s³ + ns² + n = 0,  if c = 0
⇒ −2s⁴ + (4c+1)ns³ + (4c+1)ns − 8cn² = 0,  if c ≠ 0.    (3.19)

Differentiating again with respect to s,

∂²E_s(c)/∂s² = −(n/3)(1 + n/s³) H_1,  if c = 0    (3.20)
             = ( n/(3(c+1)s⁴) ) ( 12cn² − s⁴ − (4c+1)ns ) H_1,  if c ≠ 0.    (3.21)

If c = 0, the second derivative in Eqn. 3.20 is always negative, and hence the minimum
is achieved at the two extremities s = 1 and s = n. Therefore,

s_opt(0) = 1, n.    (3.22)
• If c > 0, for s = o(n^{1/2}), we have s⁴ = o(n²) and (4c+1)ns = o(n²). Solving Eqn. 3.19 under
this constraint,

(4c+1)ns³ − 8cn² + o(n²) = 0
⇒ s_opt(c) = ( (8c/(4c+1)) n )^{1/3} + o(n^{1/3}).    (3.23)

It can be verified that a minimum is achieved, since the second derivative in Eqn. 3.21
is positive for this value of s.

• If c > 0, for s = Ω(n^{1/2}), it can be verified that Eqn. 3.19 has no solution for s ≤ n.
Lemma 3.3.6. The near-optimal cluster size s = s_no for E_s(c) given by Eqn. 3.18 satisfies
the condition

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).
The proof is similar to the proof of Lemma 3.3.2, with

f1(s) = ( E_s(0) − E*(0) ) / ( (n/6)H_1 ) − (n/s)(s−1)(4s+1)
      = −s² − 3ns + 3n + 1,  and

f2(s) = ( E_s(∞) − E*(∞) ) / ( (n/6)H_1 ) − (n/s)(s−1)(4s+1)
      = 4n²/s² − 3n/s − 6·2^{1/3} n^{4/3} + 3n + 2·2^{2/3} n^{2/3}.
Theorem 3.3.7. For E_s(c) given by Equation 3.18, the near-optimal cluster size

s_no = Θ(n^{1/3}) (≈ 0.6487 n^{1/3}).
Proof. From Eqns. 3.22 and 3.23, s_opt(0) = 1, n and s_opt(∞) → (2n)^{1/3}.
Using Lemma 3.3.6, the near-optimal cluster size s = s_no satisfies:

E_s(0) − E*(0) = E_s(∞) − E*(∞)

⇒ [ ( n²/(6s) )(s−1)(4s+1) + (n/6)(n/s − 1)(4n/s + 1)s² ] − [ (n/6)(n−1)(4n+1) ]
 = [ ( n²/(6s) )(s−1)(4s+1) + (n/6)(n/s − 1)(4n/s + 1) ]
   − [ ( n²/(6(2n)^{1/3}) )( (2n)^{1/3} − 1 )( 4(2n)^{1/3} + 1 ) + (n/6)( n/(2n)^{1/3} − 1 )( 4n/(2n)^{1/3} + 1 ) ].    (3.24)
Rearranging Eqn. 3.24 and factoring out n/(6s²), we get the condition:

s⁴ + 3ns³ − ( 6·2^{1/3} n^{4/3} + 3n + 2 )s² − 3ns + 4n² + o(n²) = 0.    (3.25)

Since s⁴ = o(ns³) and ns = o(n²), by factoring out n, Eqn. 3.25 reduces to

3s³ − 6·2^{1/3} n^{1/3} s² + 4n + o(s³) + o(n) = 0.    (3.26)

It can be verified that Eqn. 3.26 has only one non-negative solution,

s_no = 0.6487 n^{1/3} + o(n^{1/3}).
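The min-max construction can also be checked by brute force. This numeric sketch is ours: it scans integer cluster sides and approximates c → ∞ with a large finite value, so it validates only the order of magnitude of s_no, not the asymptotic constant:

```python
def cost(s, c, n):
    """E_s(c)/H_1 from Eq. 3.18."""
    return ((n * n / (6.0 * s)) * (s - 1) * (4 * s + 1)
            + (n / 6.0) * (n / s - 1) * (4 * n / s + 1) * (1 + (s * s - 1) / (1.0 + c)))

n, BIG = 1000, 1e12                          # BIG stands in for c = infinity
sides = range(1, n + 1)
e0 = min(cost(s, 0.0, n) for s in sides)     # E*(0)
einf = min(cost(s, BIG, n) for s in sides)   # E*(infinity)
s_no = min(sides, key=lambda s: max(cost(s, 0.0, n) - e0, cost(s, BIG, n) - einf))
```

For n = 1000 this scan lands on a small cluster side on the order of n^{1/3}, as the theorem predicts.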
Figure 3.10: Comparison of the performance of various cluster sizes for a network with 10^6 nodes on a 1000 × 1000 grid when compression is possible only at cluster heads. The performance for s = 5, 10 is observed to be close to optimal over the range of c values. (Axes: correlation parameter c on a log scale vs. transmission cost E_s(c) in bit-hops; curves for s = 1, 5, 10, 100, 200, 500.)
Figure 3.11: Illustration of the existence of a near-optimal cluster size. The network size is n × n = 1000 × 1000 and compression is possible only at cluster heads. The performance of cluster-side values near s = 0.6487 n^{1/3} is quite close to optimal for all values of c ranging from 0.0001 to 10000. (Axes: cluster side s on a log scale vs. transmission cost E_s(c) in bit-hops; curves for c = 0.0001, 0.1, 1.0, 10, 100, 10000, with s_opt(c), s = 0.6487 N^{1/3}, and s = (2N)^{1/3} marked.)
Fig. 3.11 illustrates the existence of the near-optimal cluster size for a network of 10^6
nodes on a 1000 × 1000 grid. Clearly, the transmission cost with cluster-side values near
s = 7 (= ⌈0.6487 n^{1/3}⌉) is quite close to the optimal for a large range of correlation parameter
c values.
3.4 Simulation Results
The analysis in Section 3.3 is based on simple and restricted communication, topology,
and joint entropy models. To verify the robustness of the conclusions from the analysis,
we present results from extensive simulation experiments with more general models. As
before, the network is deployed in an N × N area which is partitioned into grids of size
s × s, for s ∈ [1, N]. All nodes located within the same grid form a cluster.
3.4.1 Communication and Topology models
We consider more general communication and topology models, while using the same
entropy model as in the analysis. Nodes are deployed uniformly at random within the
network area. Each node is assumed to transmit 1 bit of data. The joint entropy of nodes
within a cluster is calculated using the iterative approximation technique described
in Section 3.1.
3.4.1.1 Random geometric graphs
In this model, all nodes that are within the communication radius can communicate with
each other over ideal, lossless links. Since each link has a unit cost, the routing cost is
calculated as:

intra-cluster cost = Σ_{all nodes in cluster} (node depth in cluster SPT)
extra-cluster cost = Σ_{all clusters in network} (cluster-head depth in network SPT) · (cluster joint entropy)
total cost = intra-cluster cost + extra-cluster cost.
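A compact sketch of this cost computation (all names are ours; for brevity the cluster joint entropy uses the H_k = H_1(1 + (k−1)/(1+c)) approximation on the cluster size rather than the distance-based iterative rule, and nodes sit on a small deterministic grid rather than being placed uniformly at random):

```python
import math
from collections import deque

def bfs_depths(adj, root):
    """Hop depth of every node reachable from root."""
    depth = {root: 0}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                q.append(v)
    return depth

# Nodes on a small grid with unit spacing; links within radius 1.5 (includes diagonals).
pts = [(x, y) for x in range(6) for y in range(6)]
R = 1.5
adj = {i: [j for j, q in enumerate(pts) if j != i and math.dist(p, q) <= R]
       for i, p in enumerate(pts)}

sink, s, c, H1 = 0, 3, 1.0, 1.0
net_depth = bfs_depths(adj, sink)            # network-wide SPT depths

cells = {}
for i, (x, y) in enumerate(pts):             # partition into s x s grid cells
    cells.setdefault((x // s, y // s), []).append(i)

total_intra = total_extra = 0.0
for members in cells.values():
    head = min(members, key=lambda i: net_depth[i])          # head closest to sink
    sub = {i: [j for j in adj[i] if j in members] for i in members}
    d = bfs_depths(sub, head)                                # cluster SPT
    total_intra += sum(d[i] for i in members)                # node depth in cluster SPT
    k = len(members)
    joint_h = H1 * (1 + (k - 1) / (1 + c))                   # cluster joint entropy
    total_extra += net_depth[head] * joint_h                 # head depth x joint entropy

total = total_intra + total_extra
```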
The simulation parameters are as follows:
• network sizes 24mx24m, 84mx84m, 240mx240m
• density of deployment = 1 node/m2
• communication radius = 3m
Figures 3.12 (a), (b), and (c) show the performance of clustering for the network sizes considered.
As predicted by the analysis, for a network of N nodes, N^{1/6} is a good estimate of the
near-optimal cluster size.
3.4.1.2 Realistic Wireless Communication model
We consider the model for lossy, low-power wireless links proposed in [ZK04a]. Link
costs are the average number of transmissions required for a successful transfer, and these
are used as weights for obtaining the shortest-path tree. The routing cost is calculated as:

intra-cluster cost = Σ_{all nodes in cluster} (node cost in cluster SPT)
extra-cluster cost = Σ_{all clusters in network} (cluster-head cost in network SPT) · (cluster joint entropy)
Figure 3.12: Random geometric graph topology. Performance of clustering with density = 1 node/m², communication radius = 3m for networks of size (a) 24×24, (b) 84×84, (c) 200×200. Near-optimal cluster sizes are (a) 3, 4; (b) 4, 7; (c) 8, 10. (Axes: correlation parameter on a log scale vs. transmission cost.)
The authors have made code available online for a topology generator based on the
model [ZK04b]. The parameters used in the simulations are as follows:
• network size = 48m x 48m, density of deployment = .25 nodes/m²
• random node placement
• NCFSK modulation, Manchester encoding
• PREAMBLE LENGTH = 2, FRAME LENGTH = 50
• NOISE FLOOR = -105.0; power levels: -3dB, -7dB and -10dB.
Figures 3.13 (a), (b), and (c) show the performance of clustering for the different power values.
For lower power, there is an increase in the routing cost since links become more
Figure 3.13: Realistic wireless communication topology. Performance of clustering in a 48m × 48m network with density = .25 nodes/m² for power level (a) -3dB, (b) -7dB, (c) -10dB. Cluster sizes 6, 8 are near-optimal. (Axes: correlation parameter on a log scale vs. transmission cost; curves for s = 2, 4, 6, 8, 12, 24.)
lossy. However, since proximity relationships between nodes do not change drastically,
the relative routing costs for different cluster sizes remain similar.
3.4.2 Joint entropy models
We now consider more general models for the joint entropy of sources, while using the
realistic lossy link model from Section 3.4.1.2. The routing cost is calculated using the same
equations, and simulations are performed with a power level of -3dB, all other parameters
remaining the same.
3.4.2.1 Linear and convex functions of distance
In the empirically obtained model, the joint entropy is a concave function of the distance
between sources. We also look at a linear function, for which

H_2(d) = H_1 + min(1, d/c) · H_1,

and a convex function, for which

H_2(d) = H_1 + min(1, d²/c²) · H_1.
Fig. 3.14 (a) illustrates the three forms of joint entropy functions for 2 sources. The
entropy of each source is normalized to 1 unit. The convex and linear curves are clipped
when the joint entropy equals the sum of individual entropies. Figures 3.14 (b) and (c)
show the performance of clustering.
3.4.2.2 Continuous, Gaussian data model
In order to verify that the results from the analysis and all earlier simulations are not an artifact
of the simple approximation models for joint entropy, we now consider a continuous,
jointly Gaussian data model and use its entropy as the metric for uncorrelated data in
Figure 3.14: (a) Example forms of joint entropy functions (concave, linear, convex) for 2 sources. The entropy of each source is normalized to 1 unit. The convex and linear curves are clipped when the joint entropy equals the sum of individual entropies. The curves shown are for correlation parameter c = 2. Performance of clustering in a 72m × 72m network with density = .25 nodes/m² for (b) the linear model and (c) the convex model of joint entropy. Cluster size 6 is near-optimal. (Panel (a) axes: inter-node distance vs. joint entropy; panels (b), (c): correlation parameter on a log scale vs. transmission cost.)
the routing cost calculations. The data is assumed to have a zero-mean jointly Gaussian
distribution X ∼ N_N(0, K), with unit variances σ_ii = 1:

f(X) = ( 1 / ( (2π)^{N/2} |K|^{1/2} ) ) e^{−(1/2) X^T K^{−1} X},

where K is the covariance matrix of X, with elements depending on the distance between
the corresponding nodes and the degree of correlation: K_ij = e^{−d_ij/c}, where d_ij is the
distance between nodes i and j and c is the correlation parameter. For this distribution
and with quantization step size δ, the entropy of a single source is [CT91]:

H_1 = (1/2) log2(2πe) − log2(δ)
and the joint entropy of n sources is:

H_n = (1/2) log2( (2πe)^n |K| ) − n log2(δ).

Since K becomes nearly singular for large c values, we clip H_n by using

H_n = max( (1/2) log2(2πe), (1/2) log2( (2πe)^n |K| ) ) − n log2(δ).
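These expressions can be evaluated directly; a sketch (helper names ours) using a pure-Python Cholesky factorization for log|K|, and omitting the clipping step for simplicity:

```python
import math

def chol_logdet(K):
    """Natural log of det(K) via Cholesky factorization (K symmetric positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - acc)
            else:
                L[i][j] = (K[i][j] - acc) / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))

def gaussian_joint_entropy(dists, c, delta):
    """H_n = 0.5*log2((2*pi*e)^n |K|) - n*log2(delta), with K_ij = exp(-d_ij/c)."""
    n = len(dists)
    K = [[math.exp(-dists[i][j] / c) for j in range(n)] for i in range(n)]
    logdet_base2 = chol_logdet(K) / math.log(2)
    return 0.5 * (n * math.log2(2 * math.pi * math.e) + logdet_base2) - n * math.log2(delta)
```

With widely separated nodes K ≈ I, so the joint entropy approaches n·H_1; close nodes with strong correlation drive it below that.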
Figures 3.15 (a), (b), and (c) show the performance of clustering for quantization steps δ
= 1, 0.5, and 0.05. The cluster sizes s = 6, 8 are near-optimal. In Figures 3.15 (d), (e), and
(f), the same curves are presented with the transmission cost normalized to make the
highest value equal to 1. For lower values of δ, the quantization cost dominates and the
gains from removing inter-source correlations in data are diminished. Accordingly, the
relative gains from optimizing cluster size are also reduced.
3.4.3 Summary of results
Overall, the results presented in this section show that the basic conclusions from the
analysis hold even when the limiting assumptions of the analysis regarding node placement,
communication link quality, the exact form of the correlation model, and quantization
are relaxed. In all cases, we observe the existence of small cluster sizes that provide
near-optimal performance over a wide range of correlation settings.
Figure 3.15: Performance of clustering in a 48m × 48m network with density = .25 nodes/m² with a continuous, jointly Gaussian data model and quantization step (a) δ = 1, (b) δ = 0.5, (c) δ = 0.05. Cluster sizes 6, 8 are near-optimal. (Axes: correlation parameter on a log scale vs. transmission cost.)
3.5 Summary and Conclusions
We study the correlated data gathering problem in sensor networks using an empirically
obtained approximation for the joint entropy of sources. We present an analysis of the
optimal routing structure under this approximation. This analysis leads naturally to a
clustering approach for schemes that perform well (in terms of energy-efficiency) over the
range of correlations. The optimal clustering depends on the level of correlation and
also on where the actual data compression is performed: at each individual node, or at
intermediate data collection points or cluster heads. Remarkably, however, there exists a
static, near-optimal cluster size which performs well over the range of correlations. The
notion of near-optimality is formulated as a min-max optimization problem and a rigorous
analysis of the solution is presented for both 1-D and 2-D network topologies. For a linear
arrangement of N sources, the near-optimal cluster size is Θ(√D) irrespective of where
compression occurs, where D (≥ N and O(N²)) is the shortest hop distance of each source to
the sink. For a 2-D grid deployment with N sources and unit density, a network-wide
shortest path tree is optimal if every node compresses its data using side information from
its neighbors. If compression is possible only at cluster heads, a Θ(N^{1/6}) cluster size is
shown to be near-optimal. The robustness of the conclusions from the analysis is established
using extensive simulations with more general communication and entropy models.
The practical implication of these results for sensor network data gathering is that a
simple, static cluster-based system design can perform as well as sophisticated adaptive
schemes for joint routing and compression.
Chapter 4
Practical schemes for distributed compression
The details of how exactly compression will be achieved were ignored in the earlier
analysis for reasons of tractability. We now consider the design of practical schemes
for achieving distributed compression based on two different views of structure in data.
First, we build on work by Ciancio, Shen and Ortega [CO05, SO08a, SO08b] to obtain a
transform that takes advantage of the broadcast nature of wireless communications. Next,
we extend the ideas of Candès et al. [CRT06], Donoho [Don06], and Wang et al. [WGR07]
to the multi-hop routing scenario.
4.1 Wavelet transform design for wireless broadcast advantage
Ciancio, Shen and Ortega [CO05, SO08a, SO08b] have developed lifting based wavelet
transforms that can operate over tree routing topologies. Their algorithms assume unicast
The work described in this section was published as follows:
Godwin Shen, Sundeep Pattem, Antonio Ortega, “Energy-Efficient Graph-Based Wavelets for Distributed Coding in Wireless Sensor Networks”, 34th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2009.
Sungwon Lee, Sundeep Pattem, Maheswaran Sathiamoorthy, Bhaskar Krishnamachari, Antonio Ortega, “Spatially-Localized Compressed Sensing and Routing in Multi-Hop Sensor Networks”, 3rd International Conference on Geosensor Networks, July 2009.
Figure 4.1: Example (a) signal x(n) and (b) its 5/3 wavelet coefficients: smooth coefficients s(n) and predict coefficients p(n).
communications between nodes in the network. In this section, we extend their work
by designing a new transform that takes advantage of the broadcast nature of wireless
communication. This transform allows for better compression of data and hence greater energy
efficiency.
4.1.1 Wavelet basics: The 5/3 lifting transform
We start by presenting an intuitive explanation of de-correlation using lifting steps for
the 5/3 wavelet transform. For a rigorous treatment of wavelets and lifting, see Vetterli
and Kovacevic [VK91] and Daubechies and Sweldens [DS98], respectively. Consider a discrete-time
signal x(n). The basic idea is to separate the low-pass and high-pass components of x(n).
Each even-time sample x(2t) can be decomposed into an estimate, computed from the adjacent
odd-time samples x(2t−1) and x(2t+1), plus a residual value. Given smoothness in the
time-evolution of the signal, i.e., temporal correlations, the residuals have a much smaller
magnitude than the original samples and require significantly fewer bits
to represent. This is how compression is achieved. The 5/3 lifting wavelet
transform for a signal x(n) is defined as follows:
• even “predict” coefficients: d(2k) = x(2k) − (x(2k−1) + x(2k+1))/2
• odd “smooth” coefficients: s(2k+1) = x(2k+1) + (d(2k) + d(2k+2))/4
An example signal and its coefficients are illustrated in Figure 4.1.
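These two steps can be sketched directly. The following is a minimal sketch (helper names are ours) using symmetric boundary extension, a choice on our part since the text does not specify boundary handling; the inverse simply undoes the steps in reverse order:

```python
def _r(i, n):
    """Symmetric (mirror) boundary extension; preserves index parity."""
    return -i if i < 0 else (2 * (n - 1) - i if i > n - 1 else i)

def lift53(x):
    """Forward 5/3 lifting, thesis convention: predict at even, smooth at odd indices."""
    n, y = len(x), list(x)
    for i in range(0, n, 2):        # d(2k) = x(2k) - average of odd neighbors
        y[i] = x[i] - (x[_r(i - 1, n)] + x[_r(i + 1, n)]) / 2.0
    for i in range(1, n, 2):        # s(2k+1) = x(2k+1) + average of details / 2
        y[i] = x[i] + (y[_r(i - 1, n)] + y[_r(i + 1, n)]) / 4.0
    return y

def unlift53(y):
    """Inverse transform: undo the smooth step, then undo the predict step."""
    n, x = len(y), list(y)
    for i in range(1, n, 2):
        x[i] = y[i] - (y[_r(i - 1, n)] + y[_r(i + 1, n)]) / 4.0
    for i in range(0, n, 2):
        x[i] = y[i] + (x[_r(i - 1, n)] + x[_r(i + 1, n)]) / 2.0
    return x
```

On a smooth (linear) signal the interior predict coefficients vanish, which is exactly the compression effect the text describes.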
4.1.2 Wavelets for sensor networks
We now discuss existing schemes for computing wavelet transforms in a distributed man-
ner at sensor nodes.
4.1.2.1 Unidirectional 1D wavelet
Ciancio and Ortega [CO05] proposed wavelet transforms for use in a sensor network
scenario. For a linear array of sensor nodes transporting data hop by hop to a sink at
one end, nodes at odd and even depth provide the odd and even samples of the spatial
signal. The 5/3 wavelet computations are modified in a way that ensures that data
always makes unidirectional progress, i.e., towards the sink. This scheme was extended to
tree topologies [CPOK06] by considering heuristic (and sub-optimal) ways of handling
the merging of 1-D paths in the tree.
4.1.2.2 2D wavelet for tree topologies
Shen and Ortega [SO08b, SO08a] proposed a lifting transform that works for any tree
topology. As before, the sink is at the root of the tree; nodes at odd depth provide
the “smooth” coefficients and nodes at even depth the “predict” coefficients. The
difference from the 1-D transform is that at each node there can be more than just two
“adjacent” samples. This is illustrated in Figure 4.2 (a). It was shown that the following
Figure 4.2: Illustration of odd (green) and even (blue) nodes in a subtree for the 2D wavelet (a) with unicast and (b) exploiting the broadcast nature of wireless communications. The solid arrows are part of the tree routing paths. The dashed arrows are wireless links not part of the tree. The arrows crossed off in red denote interactions disallowed for transform invertibility and unidirectionality.
computations over a tree topology T for the set of vertices V result in an invertible
transform:

• For i ∈ V, let ρ(i) be the parent of i in T and C_i the set of children of i in T.
• For node m at even depth in T, the “predict” coefficient is
  d_m = x_m − ( 1/(|C_m|+1) ) Σ_{k∈C_m} x_k − ( 1/(|C_m|+1) ) · x_{ρ(m)}
• For node n at odd depth in T, the “smooth” coefficient is
  s_n = x_n + ( 1/(2(|C_n|+1)) ) Σ_{k∈C_n} d_k + ( 1/(2(|C_n|+1)) ) · d_{ρ(n)}
Note that the above computations implicitly impose a schedule or ordering on the
transmissions at nodes. Transmissions begin at the leaf nodes in the tree and every non-
leaf node is constrained to hold its transmission until all nodes in the subtree rooted at
itself have finished transmission.
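A sketch of this tree transform on a toy topology (the tree encoding and function names are ours; as one simple convention consistent with invertibility, we take the root to be the sink and leave its raw value untouched, using it in place of a detail coefficient where a depth-1 node needs its parent's value):

```python
# Toy tree: node -> parent (None for the root/sink).
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 3}
children = {u: [v for v, p in parent.items() if p == u] for u in parent}
depth = {}
def get_depth(u):
    if u not in depth:
        depth[u] = 0 if parent[u] is None else get_depth(parent[u]) + 1
    return depth[u]
for u in parent:
    get_depth(u)

def forward(x):
    y = dict(x)
    evens = [u for u in parent if depth[u] % 2 == 0 and parent[u] is not None]
    odds = [u for u in parent if depth[u] % 2 == 1]
    for m in evens:                     # predict from raw odd-depth neighbors
        w = 1.0 / (len(children[m]) + 1)
        y[m] = x[m] - w * (sum(x[k] for k in children[m]) + x[parent[m]])
    for n in odds:                      # smooth using detail coefficients
        w = 1.0 / (2 * (len(children[n]) + 1))
        y[n] = x[n] + w * (sum(y[k] for k in children[n]) + y[parent[n]])
    return y

def inverse(y):
    x = dict(y)
    odds = [u for u in parent if depth[u] % 2 == 1]
    evens = [u for u in parent if depth[u] % 2 == 0 and parent[u] is not None]
    for n in odds:                      # undo smooth (details were transmitted)
        w = 1.0 / (2 * (len(children[n]) + 1))
        x[n] = y[n] - w * (sum(y[k] for k in children[n]) + y[parent[n]])
    for m in evens:                     # undo predict from recovered odd values
        w = 1.0 / (len(children[m]) + 1)
        x[m] = y[m] + w * (sum(x[k] for k in children[m]) + x[parent[m]])
    return x
```

Invertibility holds because every predict uses only raw odd-depth values and every smooth uses only detail coefficients, exactly the parity structure the bullets above impose.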
4.1.3 2D wavelet for wireless broadcast scenario
The 2D wavelet just described treats the transmissions along the routing tree as unicasts
i.e. destined only for a particular node, in this case the parent in the tree. However, in
the context of sensor networks, the wireless transmissions at each node can be potentially
heard by many nodes in its neighborhood, based on the topology and the transmission
power. In the earlier 2D wavelet, a node contributed to de-correlation operations only at
its parent in the tree. Taking advantage of the broadcast nature of wireless transmissions,
a single transmission at a node can be used for de-correlation operations potentially at
all nodes that can receive it.
We consider the design of a wavelet transform that exploits broadcast advantage.
The routing tree is assumed to be known. The key issue is deciding which of the available
broadcast links, and the data they provide, can be incorporated into de-correlation
operations while still ensuring an invertible and unidirectional transform.
4.1.3.1 Augmented neighborhoods
Starting with a given tree topology T over a set of vertices V , we consider an “augmented”
neighborhood at each node. For node i ∈ V , define the augmented neighborhood N_i
according to the following constraints:
• avoid odd-odd and even-even pairs (for invertibility)
• send only to nodes with lower hop-count (for unidirectionality)
• compute only over data from earlier time-slots (for timely and correct computations)
[Figure 4.3 plots: panels (a) and (b) are topology plots; panel (c) plots SNR (dB) against total energy consumption (Joules), comparing the tree transform and the graph transform.]
Figure 4.3: (a) Sample tree topology (b) With additional broadcast links in the augmented neighborhoods at each node (c) Performance gain in terms of SNR vs. cost for the new transform compared to the 2D wavelet for tree topologies
4.1.3.2 New transform definition
Given (V, T, T_AUG), the new transform is defined as follows:
• For node m at even depth in T , the “predict” coefficient is d_m = x_m + ∑_{k∈N_m} p_m(k) x_k
• For node n at odd depth in T , the “smooth” coefficient is s_n = x_n + ∑_{k∈N_n} u_n(k) d_k
The above conditions are provably necessary for invertibility and unidirectionality of
the transform [SPO09].
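A minimal sketch of how the first two constraints could be applied to filter broadcast links into augmented neighborhoods (hypothetical Python; the function and link representation are my own, and the third constraint, computing only over data from earlier time slots, is assumed to be enforced by the transmission schedule rather than modeled here):

```python
def augmented_neighbors(links, hops):
    """Filter broadcast links into augmented neighborhoods N_i (sketch).

    links: iterable of (sender, receiver) pairs, tree links plus overheard
    broadcast links; hops: dict node -> hop count from the sink.
    A link is kept only if the endpoints differ in depth parity (no odd-odd
    or even-even pairs, for invertibility) and the receiver is strictly
    closer to the sink (for unidirectionality)."""
    N = {}
    for s, r in links:
        if hops[s] % 2 == hops[r] % 2:
            continue  # odd-odd or even-even pair would break invertibility
        if hops[r] >= hops[s]:
            continue  # data must flow toward the sink
        N.setdefault(r, set()).add(s)
    return N
```

Each surviving set N_i then feeds the weighted predict/update sums above.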
4.1.3.3 Performance of new transform
A sample tree topology T and the augmented graph T_AUG are shown in Figures 4.3 (a)
and (b) respectively. Figure 4.3 (c) shows plots of SNR vs. cost for the new transform
and the unicast-based 2D wavelet. It can be seen that there is a significant gain in
performance. A higher SNR is obtained for the same cost or the same SNR is obtained
at a lower cost.
4.2 Compressed sensing for multi-hop network setting
The results of our earlier studies are for traditional data compression and transport. Com-
pressed sensing is a recent advance that allows a different solution for field reconstruction.
While the results from this area apply only for specific classes of signals, we investigate
the implications for joint routing and compression in multi-hop sensor networks.
Pioneering work by Candes, Romberg and Tao [CRT06] and Donoho [Don06] established
that an n-dimensional vector that is k-sparse in some basis can be reconstructed from
O(k log n) random projections, and that near-optimal reconstruction can be obtained
by solving a linear program. Tropp and Gilbert [TG07]
subsequently showed that similar reconstruction can be achieved through a greedy algo-
rithm, namely orthogonal matching pursuit (OMP). The number of projections required
for reconstruction depends on incoherence between the sparsity inducing basis and the
measurement matrix. The projection matrices used by Candes and Donoho are dense
random ±1 Bernoulli matrices or Gaussian matrices. Wang et al. [WGR07] showed that
the remarkable results of compressed sensing can also be obtained using sparse random
projections. They showed that while, in a distributed network scenario, CS in its original
formulation would require each node to transmit O(n) packets, similar results can be
obtained with sparse random projections using O(log n) packets per node. However,
this scheme is still very expensive in a multi-hop scenario. We present an extension to
obtain SRPs in a distributed manner with shortest path routing.
We use the following notation:
Φ: measurement matrix whose rows are projection vectors
Ψ: sparsity inducing basis whose columns are the basis vectors
H: the holographic basis H = ΨΦ
4.2.1 Combining routing with known results in compressed sensing
Consider a network of n sensor nodes with diameter d hops. The average distance of
nodes from the sink is also O(d) hops. If every node sends its raw sensor measurement
to the sink (independently) via the shortest path tree, then the average cost per reading
for the network is
Cost_raw-SPT = O(nd).
Now consider compressed sensing and assume a spanning tree topology. Nodes route
data to the sink along this tree. Each node adds its own reading multiplied by ±1 to the
value received from all its children in the tree and sends this new value to its parent. The
sink can add values received from each of its children to obtain one complete projection.
Since each node in the tree transmits exactly once, the cost per projection is n. Assuming
that the projection matrix is known to sink and nodes (each node only needs its column
vector) in advance, the cost for obtaining O(k log n) projections is

Cost_CS-DRP = O(n · k log n) = O(k n log n).
The measurement matrix for sparse random projections is defined as [WGR07]:

Φ_ij = +1 with probability 1/(2s), −1 with probability 1/(2s), and 0 otherwise.
For obtaining sparse random projections, each node decides to send with probability
1/s = (log n)/n and the measurement is routed along the shortest path. The sink generates
the row of the measurement matrix by placing ±1 at the positions of nodes from which
data was received and 0 for all others. Since the node choice is random, the average path
length remains O(d) and the cost using O(k log n) SRPs is

Cost_CS-SRP = O(d · log n · k log n) = O(k · d · log² n). (4.1)
This is a bound on the cost that any new CS-based scheme must improve upon.
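A sketch of generating such a sparse measurement matrix at the sink (hypothetical Python, pure standard library; the function name and interface are my own):

```python
import math
import random

def sparse_random_projections(m, n, seed=0):
    """m x n sparse random projection matrix (sketch, per the definition
    above): each entry is +1 or -1 with probability 1/(2s) each and 0
    otherwise, where 1/s = log(n)/n, so each row has about log(n)
    nonzero entries and only those nodes need to transmit."""
    rng = random.Random(seed)
    p = math.log(n) / n  # probability that a given node participates in a row
    phi = []
    for _ in range(m):
        row = []
        for _ in range(n):
            u = rng.random()
            row.append(1 if u < p / 2 else (-1 if u < p else 0))
        phi.append(row)
    return phi

phi = sparse_random_projections(50, 256)
# each row averages about log(256) ~ 5.5 nonzero entries
```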
We make the following propositions for using CS in the multi-hop scenario.
Proposition 4.2.1. CS with time-domain sparsity is ineffective in the multi-hop scenario.

Reasoning: At most k out of n sensors set off alarms when they sense a value greater
than a threshold. In this case, the sparsity inducing basis is Ψ = I, the identity matrix. If
we use sparse random projections, Cost_CS-SRP = O(k d log² n). However, if only the nodes
that set off alarms route their measurements to the sink via shortest paths, the cost is
O(kd).
[Figure 4.4 plot: SNR (dB) vs. cost ratio to raw data transmission; curves: SPT256, APR, SRP2, SRP4.]

Figure 4.4: Compressed sensing performance in the multi-hop setting. Plot of SNR vs. cost for different schemes. The black and green curves are for Sparse Random Projections (SRP). The blue and red curves are for two variations of computing projections over shortest path routing.
Proposition 4.2.2. In a multi-hop scenario, shortest path routing is optimal for com-
pressed sensing via sparse random projections.
Reasoning: Each node decides to send its measurement to the sink with probability
1/s = (log n)/n. Since the distribution of the O(log n) (in expectation) nodes that choose to send
measurements is random, there can be no coordination in the routing. When individual
nodes route data independently, they have no incentive to move away from the shortest
path.
Figure 4.4 shows the comparative performance of schemes computing projections
along routing paths to the sink versus sparse random projections computed at the sink.
Several routing schemes and their performance are considered by Lee et al. [LPS+09].
Chapter 5
SenZip: Distributed compression as a service
Our work up to this point, and in general work on correlated data gathering in sensor
networks in the literature, has focused on theory and simulations to understand perfor-
mance limits. These studies, and some limited system implementations (e.g., [ZCH07]),
have had limited impact on technology adoption and sensor network software
development because they have not yielded modular and inter-operable software. We
move towards addressing this problem by (i) proposing a novel architecture, SenZip, that
fits into the overall networking software architecture for sensor networks and (ii) demon-
strating that a practical design based on this architecture can be deployed on motes and
can achieve distributed configuration and modularity.
The SenZip architecture specifies a compression service that can encompass different
compression schemes and its modular interactions with standard networking services such
as routing. This architecture enables a distributed node configuration for compression,
just as existing systems make it possible for sensors to configure themselves for routing
The work described in this section was published as follows: Sundeep Pattem, Godwin Shen, Ying Chen, Bhaskar Krishnamachari, Antonio Ortega, “SenZip: An Architecture for Distributed En-Route Compression in Wireless Sensor Networks”, Workshop on Earth and Space Science Applications (ESSA), April 2009.
Figure 5.1: The SenZip architecture. A completely distributed compression service is enabled by having the interacting components shown here at each network node.
in a distributed manner. The architecture proposal is based on (a) lessons from overall
architectural principles for sensor networks [TDJ+07], (b) our own experience in imple-
menting a practical wavelet-based distributed compression system, and (c) identifying
common patterns in existing compression schemes. To concretely illustrate the utility
of the architecture, we show how it can incorporate two different compression schemes,
DPCM and 2D wavelets and present results from mote experiments for data gathering in
which nodes can configure themselves for compression under different routing conditions.
5.1 SenZip architecture
We propose and detail SenZip, an architecture for distributed en-route compression in
sensor networks. The primary goals of SenZip are flexibility, modularity, and distributed
configuration and reconfiguration. In addition to the lessons from the principles
of an overall architecture for sensor networks and the common abstractions identified for
existing compression schemes, our design of the SenZip architecture is based on a system
implementation effort.
5.1.1 SenZip Specification
The SenZip architecture specifies:
1. a compression service that can encompass different compression schemes and,
2. its interactions with standard routing and other networking services.
Figure 5.1 is a block diagram representation of the SenZip architecture. It needs to
be emphasized that a system based on SenZip would be completely distributed and com-
ponents shown in Figure 5.1 would reside on each network node. Of course, compressed
data from all nodes in the network finally reaches the base station where it is jointly
reconstructed. We now describe the services, their responsibilities and interactions.
5.1.1.1 Compression Service
The compression service consists of the aggregation module and the compression module.
Aggregation module: The aggregation module disseminates and gathers information
for maintaining the local aggregation tree by exchanging messages. This information is
collated in an aggregation table. The aggregation graph abstraction allows the definition
of a generic table that works for different compression schemes. Pseudo-code for such a
table is shown in Figure 5.2.
struct attributes {
    int upstreamOnehopNeighborhoodSize;
    int downstreamOnehopNeighborhoodSize;
    ...
} weight_attributes;

struct entry {
    int node_id;
    weight_attributes weights;
    int further_hops;
    tableEntry *neighborEntry[MAX_NHOOD_SIZE];
} tableEntry;

tableEntry AggregationTable[MAX_NHOOD_SIZE];
Figure 5.2: Aggregation table example. The recursive entry structure allows the same definition for different compression schemes.
Compression module: This module has the following functions: (a) From the aggre-
gation tree structure provided by routing, it obtains the role played by the node (which
computations to perform, and for which nodes), the parameters involved in the compu-
tations, and the ordering information (the sequence in which nodes process and forward
data). (b) It receives raw measurements from the application and packets with data that
needs further processing from forwarding. (c) It performs further processing over the
partially processed data in storage and initiates processing for the node's own data; the
computations are specific to the compression scheme and based on the role and parameter
information. (d) Data that is still partially processed is packetized and sent to forwarding.
For data that is fully processed, it checks whether enough has been buffered in storage to
fill a packet; if so, it performs quantization and bit reduction operations and sends the
packet to forwarding.
5.1.1.2 Networking components
SenZip introduces small changes to standard networking components as follows:
Routing engine: In addition to the standard routing functionality, this component in
SenZip has an extra interface to the compression service. It reports path routing
information relevant to the local aggregation, for example, the parent and hop count
in a tree topology. Optionally, decisions on changing the parent can be coordinated with
the compression service, which can also provide a specific metric for the routing cost.
Forwarding engine: While partially processed data from nodes in the local aggrega-
tion tree is allowed to be intercepted by the compression service, fully processed data
is forwarded directly along the route to the sink. Optionally, it might apply different
settings, such as transmission power and number of retries, for the different types of packets.
Link estimator : Efficient link estimation requires a limited choice of links to moni-
tor [FGJL07]. To remove a link (or node in the neighbor table) that is part of the current
aggregation tree, a joint decision has to be made with the compression service to maintain
consistency in the data processing.
5.1.2 Discussion
We emphasize that the configuration of roles, parameters and ordering is to be achieved
purely locally from the aggregation graph and based on the compression scheme. There
is no centralized decision and dissemination. This is a design criterion for compression
schemes that can fit into the architecture. There is an overhead cost for the exchange
of beacons to maintain the aggregation table. Whether the overhead is acceptable or
not depends on the relative frequency of measurement versus the frequency of topology
changes. If the frequency of topology changes is very high, the potential gains from
compression might be overwhelmed by the cost of packet exchanges to maintain the
table.
Which component is best suited for constructing and maintaining the local aggregation
graph? One option is to give this additional responsibility to the routing engine, which
already generates and receives messages to setup path routing. However, we believe it
is much better for the compression service to handle the aggregation graph operations.
This will aid code-reuse and flexibility by restricting the changes to the routing engine to
providing a single extra interface.
To ensure flexibility and extensibility, important goals for an overall sensor network ar-
chitecture [CDE+05, TDJ+07], SenZip only details the interactions between compression
and networking services and not the interfaces. The components within the compression
service also follow the larger goal of “meaningful separation of concerns”. The abstrac-
tion helps avoid over-specification, by ensuring that the compression components are
required by most existing schemes. Overall, the specification of SenZip has the features
of a desirable programming paradigm described by Tavakoli et al. [TDJ+07].
5.2 Mapping algorithms to architecture
We now discuss two compression schemes that work over tree routing topologies - a
simple differential encoding scheme, DPCM, and a more sophisticated 2D wavelet scheme
developed by Shen and Ortega [SO08a]. We describe how these schemes fit into the SenZip
architecture.
table entry element           | DPCM              | 2D wavelet
weight attributes             | not needed        | upstreamOnehopNeighborhoodSize ≡ number of children in tree; downstreamOnehopNeighborhoodSize ≡ 1 (for parent in tree)
further hops                  | 1 (upstream only) | 2 for upstream node, 1 for downstream
neighborEntry[].further hops  | 0                 | 1 for upstream node, 0 for downstream

Table 5.1: Aggregation table initialization
5.2.1 Algorithm details
Assume a given graph G(V,E) with vertices defined by node locations and edges defined
by communication links between nodes. Assume a tree graph T (V,R) (R ⊂ E) rooted
at a single sink node. Suppose every node is indexed by an integer n ∈ V , Cn is the set
of child indices of n, and ρ(n) is the parent index of n in T . We say that node n
has depth k when it is k hops from the sink. Also let x_n denote the data measured at
node n. For simplicity, we assume data is forwarded and compressed along the same tree
T , i.e., the aggregation graph is T . In both schemes, we define the following transmission
schedule. Initially, nodes without any children (leaf nodes) forward raw data to their
parents in T . Then, every node n waits until it receives data from all children m ∈ Cn
before it transmits its own data. This induces an ordering of the communications which
is necessary for nodes to compress data as it is forwarded to the sink.
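The schedule just described is a post-order traversal of the tree, which can be sketched as (hypothetical Python; names are my own):

```python
def transmission_order(children, root):
    """Transmission schedule for the tree (sketch): a node appears only
    after every node in the subtree rooted at it, i.e. a post-order walk.
    The sink (root) is last; it receives but has no one to transmit to."""
    order = []
    def visit(n):
        for c in children.get(n, []):
            visit(c)
        order.append(n)
    visit(root)
    return order
```

For children = {0: [1, 2], 1: [3, 4]} with sink 0 this yields [3, 4, 1, 2, 0]: the leaves 3 and 4 go first and node 1 waits for both of them.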
5.2.1.1 DPCM
Leaf nodes first forward raw data to their parents. Each node n waits to receive raw
measurements from all its children in T and then computes residual prediction errors as
differences from its own measurement as follows:
d_m = x_m − x_n, ∀ m ∈ C_n
s_n = x_n. (5.1)
Node n then forwards the compressed prediction residuals of its children (and other
descendants) and its own raw measurements to its parent ρ(n).
5.2.1.2 2D wavelet
This transform is constructed as follows for a single level of decomposition. First, vertices
of G are assigned roles by being split into disjoint sets of predicts (odd depth) and updates
(even depth) based on depth in T . Next, a high-pass “detail” coefficient dm for each
predict node m is computed by subtracting from the data at node m, xm, a prediction
that is based on information available at neighboring nodes (where neighbors are defined
as nodes that are 1-hop away in the aggregation graph):
d_m = x_m − (1/(|C_m|+1)) ∑_{k∈C_m} x_k − (1/(|C_m|+1)) · x_{ρ(m)} (5.2)
Finally, a low-pass “smooth” coefficient sn for each update node n is computed by
adding to xn a correction term based on the detail coefficients of neighboring nodes:
s_n = x_n + (1/(2(|C_n|+1))) ∑_{k∈C_n} d_k + (1/(2(|C_n|+1))) · d_{ρ(n)} (5.3)
Under the given transmission schedule, each node only has access to data from its
descendants and only forwards its own data and data from its descendants. Since each
node n uses data from its parent, transform computations for n cannot be completed at
n. However, note that terms corresponding to children Cm and parent ρ(m) are explicitly
separated in the computations. This allows us to compute partial wavelet coefficients and
to update partial coefficients as data flows towards the sink to make them full wavelet
coefficients as described in [CPOK06, SO08a].
This process is summarized as follows. Leaf nodes first forward raw data. Each
predict node m waits to receive data from its children, then generates a partial coefficient
d_p(m) using data from its children as d_p(m) = x_m − (1/(|C_m|+1)) ∑_{k∈C_m} x_k. Then m forwards
its partial d_p(m) (and data from descendants), and ρ(m) completes the computation as
d(m) = d_p(m) − (1/(|C_m|+1)) · x_{ρ(m)}. Each update node performs similar operations. This
process is illustrated in Figure 5.3. Note that this induces an ordering of the computations.
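The partial/complete split for a detail coefficient can be sketched as (hypothetical Python helpers; w is the weight 1/(|C_m|+1) shared by both steps):

```python
def partial_detail(x_m, child_samples):
    """Partial predict coefficient d_p(m), computed at node m itself:
    only the children's data is available when m transmits."""
    w = 1.0 / (len(child_samples) + 1)
    return x_m - w * sum(child_samples), w

def complete_detail(dp_m, w, x_parent):
    """Completion at the parent rho(m), which subtracts its own sample."""
    return dp_m - w * x_parent
```

Completing the partial gives the same d(m) as evaluating Equation 5.2 directly.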
5.2.2 Relating algorithms to SenZip
We now describe the operation overview for SenZip based systems deploying the two
algorithms.
5.2.2.1 Initialization
The aggregation component configures aggregation table entries and initiates message
exchanges (with its neighbors) in order to gather information needed to build the aggre-
gation table. The specifics of table entries for each scheme are shown in Table 5.1. This is
shared with the compression component which can then identify their role, parent in the
[Figure 5.3 annotations: Nodes 5 and 6 forward raw data x_5 and x_6 to node 4. Node 4: (a) generate partials dp(4), sp(5) and sp(6); (b) forward [dp(4) sp(5) sp(6)] to node 3. Node 2 forwards raw data x_2 to node 1. Node 3: (a) complete partial 4 to get d(4); (b) complete partials 5, 6 to get s(5), s(6); (c) generate partial sp(3); (d) forward [d(4) s(5) s(6) sp(3)] to node 1. Node 1: (a) generate partials sp(2) and dp(1); (b) forward [dp(1) sp(2) sp(3) d(4) s(5) s(6)].]

Figure 5.3: Partial computations for 2D wavelet. Gray (white) circles denote even (odd) nodes. Operations at each node are done in the order listed.
tree and children in the tree, and ordering of computations, to configure each compression
scheme as follows:
DPCM: The roles are uniform, i.e., all nodes have the same role. The ordering is
that leaf nodes start forwarding and intermediate nodes wait for all one-hop upstream
descendants (children) in the aggregation tree.
2D wavelet: The roles are decided based on depth in the tree from the root: odd depth
nodes are predict nodes and even depth nodes are update nodes. The parameters in the
computation are equal to the weights, i.e., the number of one-hop (children) and two-hop (grandchildren)
upstream descendants. The ordering is that leaf nodes start forwarding and intermedi-
ate nodes wait for partial coefficients of one-hop (children) and two-hop (grandchildren)
upstream descendants in the aggregation tree.
5.2.2.2 Data forwarding and compression
DPCM: At each node n, the partially processed data to be received is the raw data
from its children; the data to be sent is the raw data of node n itself, while the fully
processed data of the children is their differentials according to Equation 5.1.
2D wavelet: At each node, the partially processed data received is the raw data from
children and grandchildren. The partially processed data sent is the raw data of node n
and all its children, and the fully processed data is the coefficients of all grandchildren,
according to Equations 5.2 and 5.3.
5.2.2.3 Reconfiguration
The routing engine informs aggregation component of a change in parent (and hop count)
in the tree.
DPCM: When the parent changes at node n, n sends an explicit parent change message
to the old parent ρ_old(n) and initiates a message to the new parent. When a parent
change message is received by ρ_old(n), the child is removed from the table. The number
of children is decremented, so the waiting criterion in the ordering changes.
2D wavelet: When the parent of node n changes, n sends an explicit DELETE message
to the ex-parent ρ_old(n) and an ADD message to the new parent. If the hop count changes
parity from before, the change is propagated to all upstream nodes (descendants in the subtree). When a
Figure 5.4: Code structure of (a) CTP and (b) SenZip compression service over CTP
parent change message is received by ρ_old(n), the child is removed from the table. The
number of children and grandchildren is decremented, so the waiting criterion in the
ordering changes. ρ_old(n) also sends a grandparent change message to ρ(ρ_old(n)), where
changes in the ordering are made.
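A toy model of this parent-change bookkeeping (hypothetical Python; the real system exchanges ADD/DELETE beacons over the radio, which are modeled here as direct method calls):

```python
class NodeTable:
    """Toy model of the per-node aggregation table for DPCM (sketch)."""
    def __init__(self):
        self.children = set()

    def on_add(self, child):
        """ADD beacon from a new child."""
        self.children.add(child)

    def on_delete(self, child):
        """DELETE beacon from a departing child; with fewer children,
        the waiting criterion in the transmission ordering changes."""
        self.children.discard(child)

def change_parent(node, old_parent, new_parent):
    """A node's reaction to a parent change signalled by the routing engine."""
    old_parent.on_delete(node)  # explicit parent-change (DELETE) message
    new_parent.on_add(node)     # ADD message to the new parent
```

The 2D wavelet case additionally propagates a grandparent change message one more hop, which this sketch omits.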
5.3 System implementation
We have implemented a SenZip compression service in nesC/TinyOS [Tin] to run over
the Collection Tree Protocol (CTP) [TinyOS Enhancement Proposal (TEP) 123] [tos].
This implementation effort has informed the design of the SenZip architecture and in
turn, concretely demonstrates it in software.
5.3.1 TinyOS code
The code structure of CTP and the SenZip extension are illustrated in Figure 5.4. We
now present some details of the code for components, interfaces, changes to CTP and
application.
5.3.1.1 Interfaces
The following new interfaces have been defined for the interactions of the new components
with other parts of the system.
• AggregationInformation: interactions between routing and aggregation component.
• AggregationTable: interactions between aggregation and compression component.
• StartGathering: interactions between application and compression component.
5.3.1.2 AggregationP component
The aggregation component maintains the local aggregation tree. The routing component
signals changes in parent in routing tree. At this point, the aggregation component sends
an ADD beacon to the new parent and a DELETE beacon to the old parent. The old
and new parents update their aggregation tables accordingly.
• Events:
1. Routing.parentChange: Signalled from the Routing engine to indicate a change
in the parent in routing tree.
(a) Send ADD beacon to new parent and DELETE message to the old parent.
(b) Signal change to Compression component.
2. AggBeaconReceive.receive:
(a) Update table for ADD/DELETE beacons from neighbors.
• Commands:
(a) (b)
Figure 5.5: (a) Distributed compression and (b) Centralized reconstruction
1. Table.contactDescendant: Called by the Compression component to directly
contact neighbors in the table from which expected packets have not been
received.
5.3.1.3 CompressionP component
When the aggregation component signals changes in the aggregation table, the compres-
sion component allocates and de-allocates memory for storing the data of children in the
aggregation tree. When the forwarding engine presents packets with data arriving from
children, the data is stored, transformed, compressed and packetized to be handed back
to the forwarding engine to transport it to the sink. Figure 5.5 shows the sequence of
operations for compression at each node. Currently the DPCM transform is applied and
fixed quantization encoding is used for compression. Given the overheads, the per-packet
payload available for compressed data is 10 bytes, or five 16-bit measurements.
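The 10-byte payload of five 16-bit measurements could be packed as follows (hypothetical Python sketch; the actual packetization is done in nesC, and little-endian byte order is my assumption):

```python
import struct

def pack_payload(samples):
    """Pack up to five 16-bit signed measurements into the 10-byte payload
    left after packet overheads (sketch; little-endian assumed)."""
    assert len(samples) <= 5
    return struct.pack('<%dh' % len(samples), *samples)

payload = pack_payload([2100, 2105, 2099, 2110, 2098])
# len(payload) == 10
```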
• Events:
1. Table.tablePointer: Signalled from the Aggregation component to provide a
pointer to the table for the local aggregation neighborhood.
2. Table.change: Signalled from the Aggregation component to inform of changes
in the local aggregation neighborhood.
3. Intercept.forward: Signalled from the forwarding engine to filter out packets
meant for in-network processing i.e. compression.
4. AllRxTimer.fired: Internal timer set up to check if all expected packets from
the local aggregation neighborhood were received. If not, the readings from
the previous epoch are currently used.
• Commands:
1. StartGathering.isStarted: Called from application to check if compression has
already started.
2. StartGathering.getStarted: Called from application to get compression started.
3. Measurements.set: Called from the application to transfer sensor measure-
ments.
• Tasks:
1. changesTask: Posted to update internal table according to changes signalled
by Aggregation component.
2. encodeCoefficientsTask: To encode and compress the coefficients generated by
the transform. Currently using fixed quantization encoding.
• Functions:
1. computeTransform: To apply the transform on data received from the local
aggregation neighborhood. Currently DPCM or differential computation.
5.3.1.4 Changes to CTP
Some small changes are introduced in CTP components to account for and aid in-network
compression.
• RoutingEngine: include AggregationInformation interface and inform Aggregation
component of changes in parent.
• ForwardingEngine: obtain the next hop for forwarding packets from Compression
component rather than Routing.
5.3.1.5 Application
The current application is written for a Tmote Sky mote with an on-board temperature
sensor.
• Events:
1. StartGathering.startDone: Signal from Compression component to begin sen-
sor measurements.
2. SubReceive.receive: At the sink node, the Compression component transfers
all packets to application. They are then sent over the air to the base station
attached to a PC/laptop.
5.3.2 Experimental Results
An in-lab testbed with Tmote Sky motes [tmo] is used for the evaluation. Ambient tem-
perature is the sensed phenomenon and we introduce temperature gradients by switching
hot lamps on and off.
5.3.2.1 Static topologies
We use fixed topologies with 15 nodes for this set of experiments. The setting and two
sample topologies are illustrated in Figure 5.6 (a). The spatial transforms used are DPCM
and 2D wavelet and the bit reduction is via fixed quantization. We assume a uniform
bit allocation for all nodes. The same experiments (sequence of switches) are repeated
for the two different tree topologies in Figure 5.6 (a) with different bit allocations per
sample.
On initialization, all nodes in the network self-configured the roles, parameters and
ordering according to the topology. Figures 5.6 (b) and (c) show the reconstruction
with 2 bits allocated per sample at node 7, which has a different depth, and hence role, in
the two trees. Similarly, Figures 5.6 (d) and (e) show the reconstruction for node 12 with
3 bits per sample. Figures 5.6 (f) and (g) compare the reconstruction error at the nodes
for each topology for a 3-bit allocation. The RMS error ranges between 0.01°C and 0.16°C
over the temperature range of 20°C to 28°C for 3-bit quantization of coefficients against
an original sample of 16 bits. Since good and similar reconstruction is obtained, it is verified that the
compression operations were correctly configured in a completely distributed manner.
Figure 5.7 (a) shows the average RMS error for compression for tree 1 with varying
bit allocation. As expected, better reconstruction is obtained for higher bit allocations.
[Figure 5.6 plots: panels (b)-(e) show temperature (centigrade) vs. sample number for the original signal and its reconstruction; panels (f) and (g) are histograms of RMS error (centigrade) by node id.]

Figure 5.6: Experiments on static trees with 2D wavelet transform and fixed quantization. (a) Two fixed tree topologies, tree 1 and tree 2, for the same set and locations of nodes. Raw measurement (dashed red) and reconstruction (solid blue) for node 7 with 2 bits allocated per sample for (b) tree 1 and (c) tree 2, and for node 12 with 3 bits per sample for (d) tree 1 and (e) tree 2. Histogram of RMS error at all nodes with 3 bits per sample for (f) tree 1 and (g) tree 2.
[Figure 5.7 plots: (a) average RMS error vs. bit allocation per sample, with curves for 2D wavelet and DPCM; (b) cost (normalized wrt CTP) vs. bit allocation per sample, with curves for 2D wavelet and DPCM on trees 1 and 2, plus the asymptotic bound.]

Figure 5.7: (a) Average RMS error for tree 1 with increasing bit allocation per sample for DPCM and 2D wavelet. (b) Cost normalized with respect to raw data gathering with CTP for increasing bit allocation per sample.
Figure 5.7 (b) shows the cost gain over raw data collection with CTP. In these experi-
ments, DPCM has a lower cost since the partially processed data travels only one hop,
while for the 2D wavelet it travels two hops. The cost gains compared to raw data are
relatively limited due to the small network size, particularly the small average depth,
which is 3.27 for tree 1 and 4.07 for tree 2. It can be shown that, in general, with
increasing average depth the cost for both schemes approaches the ratio of the bit
allocation to the raw measurement size. For the same number of bits, the wavelet scheme
has better reconstruction but, as just discussed, a higher cost.
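This asymptotic behavior can be illustrated with a rough per-node cost model (my own back-of-the-envelope sketch, not a formula from the thesis: data travels at full resolution for the first hop or two and quantized thereafter):

```python
def cost_ratio(avg_depth, bits, raw_bits=16, partial_hops=1):
    """Rough per-node cost relative to raw gathering (sketch): data
    travels at full raw_bits resolution for partial_hops hops (1 for
    DPCM, 2 for the 2D wavelet) and at `bits` per sample for the
    remaining hops toward the sink."""
    hops = max(avg_depth, partial_hops)
    compressed = partial_hops * raw_bits + (hops - partial_hops) * bits
    return compressed / (hops * raw_bits)

# as average depth grows, cost_ratio(depth, bits) approaches bits / raw_bits
```

At the small depths of these testbeds the first raw hop dominates, which is consistent with the limited gains observed above.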
5.3.2.2 Dynamic topologies
In these experiments, we send explicit messages to nodes to alter their parent in the
routing tree while data gathering with compression is in progress. The compression
settings are to use the DPCM transform and Golomb-Rice encoding.
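For reference, a minimal Golomb-Rice encoder for a nonnegative integer (a standard textbook construction, not the thesis code; mapping signed residuals to nonnegative integers, e.g. by zigzag interleaving, is omitted):

```python
def golomb_rice_encode(value, k):
    """Golomb-Rice codeword for a nonnegative integer: the quotient
    value >> k in unary (q ones and a terminating zero), followed by
    the k low-order remainder bits."""
    q = value >> k
    r = value & ((1 << k) - 1)
    return '1' * q + '0' + format(r, '0{}b'.format(k))
```

For example, golomb_rice_encode(9, 2) gives '11001': quotient 2 in unary ('110'), then remainder bits '01'.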
These results verify (a) correct updating of the aggregation table and configuration of
storage, transform computations and packetization at the node that adds a new child to its
aggregation table, (b) correct handling of coefficients “pending” packetization at node
that deletes a child from its aggregation table and (c) correct reconstruction of altered
topology during reconstruction at base station.
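The bookkeeping being verified can be sketched as follows. This is a hypothetical Python model of the per-node state, not the actual nesC modules; class and method names are illustrative:

```python
class AggregationTable:
    """Per-node record of the children whose data this node aggregates."""

    def __init__(self):
        self.children = set()
        self.pending = {}   # child id -> coefficients awaiting packetization

    def add_child(self, node_id):
        # (a) a new child: configure storage so its samples enter the
        # transform computations and packetization.
        self.children.add(node_id)
        self.pending.setdefault(node_id, [])

    def delete_child(self, node_id):
        # (b) a departing child: return coefficients still pending
        # packetization so they are flushed rather than silently dropped.
        self.children.discard(node_id)
        return self.pending.pop(node_id, [])

# A parent switch for node 7: the old parent flushes its pending
# coefficients, while the new parent allocates storage for it.
old_parent, new_parent = AggregationTable(), AggregationTable()
old_parent.add_child(7)
old_parent.pending[7].append(0.25)
flushed = old_parent.delete_child(7)
new_parent.add_child(7)
```

The base station then replays the same sequence of table updates to reconstruct the altered topology, point (c) above.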
Chapter 6
Conclusion
This work studied several aspects of the joint routing and compression problem in sensor networks to arrive at a comprehensive solution. We have made significant progress towards demonstrating completely distributed in-network compression in sensor networks.
We conclude with a discussion of the contributions and future work.
6.1 Contributions
The main contributions of this thesis are as follows:
Theoretical understanding of the interplay between routing and in-network compression: Two different scenarios, homogeneous and heterogeneous, are shown to have different near-optimal routing structures. This problem was subsequently addressed by other researchers, primarily for the homogeneous case; while they use different models, their results agree with our basic conclusions. In particular, when the spatial correlation is uniform, shortest-path routing is order-optimal for the homogeneous case, where every node is capable of compression computations.
Design of algorithms for spatial compression: The first is a wavelet compression algorithm that takes advantage of the broadcast nature of wireless communications. This algorithm works for any type of data over any connected 2D topology. The second is a compressed sensing based scheme that extends the classical framework to the multi-hop scenario. This scheme works when the data is known to be sparse in some known spatial basis.
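The classical recovery step that such schemes build on can be illustrated with orthogonal matching pursuit [TG07]: given measurements y = Ax of a k-sparse x, greedily select the dictionary column most correlated with the residual. This is a generic single-hop sketch in Python, not the thesis's multi-hop scheme:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: estimate a k-sparse x from y = A @ x."""
    residual, support, coef = y.astype(float), [], np.zeros(0)
    for _ in range(k):
        # Greedy step: column most correlated with the current residual.
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        # Least-squares fit on the chosen support, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

# 50 sensor readings that are 3-sparse in the identity basis, observed
# through 20 random projections; at this mild sparsity level OMP
# typically recovers the support exactly.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 50)) / np.sqrt(20)
x = np.zeros(50)
x[[3, 17, 41]] = [1.0, -2.0, 0.5]
x_hat = omp(A, A @ x, 3)
```

In the multi-hop setting the measurement matrix A is additionally shaped by the routing tree, which is precisely the extension studied in this thesis.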
SenZip, an architectural view of distributed compression as a service: A new “compression layer” is defined to interact with standard networking components to achieve the configuration (and dynamic reconfiguration) and computations required for compression in a completely distributed fashion.
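The layered interaction can be sketched in a few lines. The sketch below is a hypothetical Python model (the actual SenZip modules are TinyOS/nesC components wired to a routing engine); all names are illustrative:

```python
class RoutingLayer:
    """Stand-in for the networking component, e.g. a collection tree."""

    def __init__(self):
        self._listeners = []

    def on_parent_change(self, callback):
        # The compression layer registers for topology events.
        self._listeners.append(callback)

    def set_parent(self, new_parent):
        for notify in self._listeners:
            notify(new_parent)

class CompressionLayer:
    """Configures, and dynamically reconfigures, transform state to
    follow the routes chosen by the networking components."""

    def __init__(self, routing):
        self.parent = None
        routing.on_parent_change(self._reconfigure)

    def _reconfigure(self, new_parent):
        self.parent = new_parent   # transform neighborhood follows the route

routing = RoutingLayer()
compression = CompressionLayer(routing)
routing.set_parent(4)              # a route change reconfigures compression
```

The key design point is that the compression layer never computes routes itself; it only observes the routing component's decisions and keeps the transform configuration consistent with them.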
System design and software development: Software modules for a SenZip compression
service that works on top of the Collection Tree Protocol (which provides the networking
components). This concretely demonstrates that SenZip is a working architecture. The
code has been released to tinyos-contribs.
6.2 Future work
Some directions for future work on the analysis, algorithm design and system development
for distributed compression follow.
There is very little work on analyzing the case when regions with correlated data are not geographically proximate [DBF07]. The analysis presented in this thesis is limited to the case of uniform spatial correlations. It will be interesting to extend it to scenarios with non-uniform spatial correlations.
In the design of wavelet based algorithms, we have assumed that the optimal routing
is known and that the compression operations are configured for the chosen routes. The
design of algorithms that jointly optimize routing and compression needs more attention.
Our analysis and algorithms focus primarily on ways to exploit spatial correlations. The implicit understanding is that temporal correlations can be handled at each node individually. However, algorithms that account for temporal correlations across nodes, and for the general space of spatio-temporal correlations, need further research. Further work is also needed on distributed compression algorithms to understand how they might fit into the SenZip architecture; simplifications and modifications to these algorithms may be needed for them to correspond to the abstraction used in the SenZip design.
With some extensions, the SenZip based system can allow distributed compression to be widely adopted in data gathering sensor networks. We are working on a TinyOS Enhancement Proposal (TEP) for the standardization of SenZip. The system currently provides a few options in terms of spatio-temporal transforms and encoding schemes. It will be useful to develop and provide a suite of compression schemes in TinyOS. Further, there is a need for a manual to help users choose the right scheme based on domain- and application-specific knowledge. Improvements are needed to ensure robustness: the current distributed initialization is based on a simple broadcast flooding scheme, and practical deployments will require a reliable flooding scheme. The reconstruction code needs to be extended to handle changes in topology and packet losses. For long-lived operation, the current system needs to be integrated with a sleep-scheduling mechanism. Finally, the system needs to be tested at scale, i.e., in medium and large sized networks.
References
[BK01] Stephen F. Bush and Amit Kulkarni. Active Networks and Active Network Management: A Proactive Management Framework. Kluwer Academic/Plenum Publishers, 2001.
[CBLV04] R. Cristescu, B. Beferull-Lozano, and M. Vetterli. On network correlated data gathering. In Proceedings of the 23rd Conference of the IEEE Communications Society. IEEE Communications Society, March 2004.
[CBLVW06] R. Cristescu, B. Beferull-Lozano, M. Vetterli, and R. Wattenhofer. Network correlated data gathering with explicit communication: NP-completeness and algorithms. IEEE/ACM Transactions on Networking, 14(1):41–54, February 2006.
[CDE+05] D. Culler, P. Dutta, C. T. Ee, R. Fonseca, J. Hui, P. Levis, J. Polastre, S. Shenker, I. Stoica, G. Tolle, and J. Zhao. Towards a sensor network architecture: Lowering the waistline. In Proceedings of the Tenth Workshop on Hot Topics in Operating Systems. USENIX, June 2005.
[CDHH06] David Chu, Amol Deshpande, Joseph Hellerstein, and Wei Hong. Approximate data collection in sensor networks using probabilistic models. In IEEE International Conference on Data Engineering (ICDE), pages 3–7. IEEE, April 2006.
[CO05] A. Ciancio and A. Ortega. A distributed wavelet compression algorithm for wireless multihop sensor networks using lifting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, March 2005.
[CPOK06] A. Ciancio, S. Pattem, A. Ortega, and B. Krishnamachari. Energy-efficient data representation and routing for wireless sensor networks based on a distributed wavelet compression algorithm. In Proceedings of the ACM/IEEE International Symposium on Information Processing in Sensor Networks (IPSN). Springer Verlag, April 2006.
[CRT06] E. J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2):489–509, February 2006.
[CT91] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley, New York, NY, USA, 1991.
[DBF07] T. Dang, N. Bulusu, and W. Feng. RIDA: A robust information-driven data compression architecture for irregular wireless sensor networks. In Proceedings of the 4th European Workshop on Sensor Networks. IEEE, January 2007.
[DDT+08] M. Duarte, M. Davenport, D. Takhar, J. Laska, T. Sun, K. Kelly, and R. Baraniuk. Single-pixel imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2):83–91, March 2008.
[Don06] D. L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, April 2006.
[DS98] I. Daubechies and W. Sweldens. Factoring wavelet transforms into lifting steps. Journal of Fourier Analysis and Applications, 4(3):247–269, March 1998.
[EGGM04] M. Enachescu, A. Goel, R. Govindan, and R. Motwani. Scale-free aggregation in sensor networks. In 1st International Workshop on Algorithmic Aspects of Wireless Sensor Networks, pages 71–84. Springer-Verlag, July 2004.
[FGJL07] R. Fonseca, O. Gnawali, K. Jamieson, and P. Levis. Four-bit wireless link estimation. In Proceedings of the Sixth ACM Workshop on Hot Topics in Networks. ACM, November 2007.
[GBR] GBROOS. Great Barrier Reef Ocean Observing System. http://imos.org.au/gbroos.html/.
[GDV06] M. Gastpar, P. L. Dragotti, and M. Vetterli. The distributed Karhunen-Loeve transform. IEEE Transactions on Information Theory, 52(12):5177–5196, December 2006.
[GE03] A. Goel and D. Estrin. Simultaneous optimization for concave costs: single sink aggregation or single source buy-at-bulk. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 499–505. ACM/SIAM, January 2003.
[GGP+03] D. Ganesan, B. Greenstein, D. Perelyubskiy, D. Estrin, and J. Heidemann. An evaluation of multi-resolution search and storage in resource-constrained sensor networks. In Proceedings of the First ACM Conference on Embedded Networked Sensor Systems, November 2003.
[HBSA04] T. He, B. M. Blum, J. A. Stankovic, and T. F. Abdelzaher. AIDA: Adaptive application-independent data aggregation in wireless sensor networks. ACM Transactions on Embedded Computing Systems, Special Issue on Dynamically Adaptable Embedded Systems, 3(2):426–457, May 2004.
[HCJB04] W. Hu, C. T. Chou, S. Jha, and N. Bulusu. Deploying long-lived and cost-effective hybrid sensor networks. In The 1st Workshop on Broadband Advanced Sensor Networks. IEEE Communications Society, October 2004.
[IEGH02] C. Intanagonwiwat, D. Estrin, R. Govindan, and J. S. Heidemann. Impact of network density on data aggregation in wireless sensor networks. In Proceedings of the 22nd International Conference on Distributed Computing Systems, pages 457–458. IEEE Computer Society, July 2002.
[IGE+03] C. Intanagonwiwat, R. Govindan, D. Estrin, J. S. Heidemann, and F. Silva. Directed diffusion for wireless sensor networking. IEEE/ACM Transactions on Networking, 11(1):2–16, January 2003.
[KEW02] B. Krishnamachari, D. Estrin, and S. W. Wicker. The impact of data aggregation in wireless sensor networks. In Proceedings of the 22nd International Conference on Distributed Computing Systems, pages 575–578. IEEE Computer Society, July 2002.
[LDP07] M. Lustig, D. Donoho, and J. M. Pauly. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6):1182–1195, December 2007.
[LPS+09] Sungwon Lee, Sundeep Pattem, Maheswaran Sathiamoorthy, Antonio Ortega, and Bhaskar Krishnamachari. Spatially-localized compressed sensing and routing in multi-hop sensor networks. In Proceedings of the 3rd International Conference on Geosensor Networks, July 2009.
[LTP05] H. Luo, Y. Tong, and G. Pottie. A two-stage DPCM scheme for wireless sensor networks. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, April 2005.
[MFHH] Sam Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. TAG: A tiny aggregation service for ad hoc sensor networks. In Proceedings of the 5th USENIX Symposium on Operating Systems Design and Implementation, December 2002.
[PKG04] S. Pattem, B. Krishnamachari, and R. Govindan. The impact of spatial correlation on routing with compression in wireless sensor networks. In Proceedings of the ACM/IEEE International Symposium on Information Processing in Sensor Networks, pages 28–35. Springer-Verlag, April 2004.
[PKG08] S. Pattem, B. Krishnamachari, and R. Govindan. The impact of spatial correlation on routing with compression in wireless sensor networks. ACM Transactions on Sensor Networks, 4(4), August 2008.
[PLS+09] S. Pattem, S. Lee, M. Sathiamoorthy, A. Ortega, and B. Krishnamachari. Compressed sensing and routing in multi-hop sensor networks. Technical Report CENG-2009-4, USC, October 2009.
[PR99] S. S. Pradhan and K. Ramchandran. Distributed source coding using syndromes (DISCUS): Design and construction. In Proceedings of the IEEE Data Compression Conference, pages 158–167. IEEE Computer Society, March 1999.
[PSC+09] S. Pattem, G. Shen, Y. Chen, B. Krishnamachari, and A. Ortega. SenZip: An architecture for distributed en-route compression in wireless sensor networks. In Proceedings of the Workshop on Sensor Networks for Earth and Space Science Applications. IEEE/ACM, April 2009.
[rfc90] Compressing TCP/IP headers for low-speed serial links, IETF RFC 1144. http://tools.ietf.org/html/rfc1144, February 1990.
[rfc99] Compressing IP/UDP/RTP headers for low-speed serial links, IETF RFC 2508. http://tools.ietf.org/html/rfc2508, February 1999.
[rfc01] Robust header compression, IETF RFC 3095. http://tools.ietf.org/html/rfc3095, July 2001.
[sen] SenZip code release. http://tinyos.cvs.sourceforge.net/viewvc/tinyos/tinyos-2.x-contrib/usc/senzip/.
[SO08a] G. Shen and A. Ortega. Optimized distributed 2D transforms for irregularly sampled sensor network grids using wavelet lifting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2008, Las Vegas, NV, USA, 2008.
[SO08b] G. Shen and A. Ortega. Optimized distributed 2D transforms for irregularly sampled sensor network grids using wavelet lifting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, March 2008.
[SPO09] G. Shen, S. Pattem, and A. Ortega. Energy-efficient graph-based wavelets for distributed coding in wireless sensor networks. In Proceedings of the 34th International Conference on Acoustics, Speech, and Signal Processing. IEEE, April 2009.
[SS02] A. Scaglione and S. D. Servetto. On the interdependence of routing and data compression in multi-hop sensor networks. In Proceedings of The 8th ACM International Conference on Mobile Computing and Networking, pages 140–147. ACM, August 2002.
[SS05] A. Scaglione and S. D. Servetto. On the interdependence of routing and data compression in multi-hop sensor networks. Wireless Networks, 11(1-2):149–160, January 2005.
[TDJ+07] A. Tavakoli, P. Dutta, J. Jeong, S. Kim, J. Ortiz, P. Levis, and S. Shenker. A modular sensornet architecture: Past, present, and future directions. In Proceedings of the International Workshop on Wireless Sensornet Architecture, April 2007.
[TG07] J. Tropp and A. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory, 53(12):4655–4666, December 2007.
[Tin] TinyOS. An operating system for wireless embedded sensor networks. http://www.tinyos.net/.
[TM06] D. Tulone and S. Madden. PAQ: Time series forecasting for approximate query answering in sensor networks. In Proceedings of the European Conference on Wireless Sensor Networks, pages 21–37. IEEE, February 2006.
[TMEC+10] A. Terzis, R. Musaloiu-E., J. Cogan, K. Szlavecz, A. Szalay, J. Gray, S. Ozer, M. Liang, J. Gupchup, and R. Burns. Wireless sensor networks for soil science. International Journal on Sensor Networks, Special Issue on Environmental Sensor Networks, 7(1/2):53–70, January 2010.
[tmo] Tmote Sky device. http://www.snm.ethz.ch/Projects/TmoteSky.
[tos] TinyOS 2.0 Network Protocol Working Group. Collection Tree Protocol, TinyOS Enhancement Proposal (TEP) 123. http://www.tinyos.net/tinyos-2.x/doc/.
[TVSO09] Paula Tarrio, Giuseppe Valenzise, Godwin Shen, and Antonio Ortega. Distributed network configuration for wavelet-based compression in sensor networks. In Proceedings of the 3rd International Conference on Geosensor Networks, July 2009.
[TW96] David L. Tennenhouse and David J. Wetherall. Towards an active network architecture. ACM SIGCOMM Computer Communication Review, 26(2):5–18, March 1996.
[VK91] M. Vetterli and J. Kovacevic. Wavelets and Subband Coding. Prentice Hall, Upper Saddle River, NJ, USA, 1991.
[vRW04] P. von Rickenbach and R. Wattenhofer. Gathering correlated data in sensor networks. In Proceedings of the DIALM-POMC Joint Workshop on Foundations of Mobile Computing, pages 60–66. ACM, October 2004.
[WALJ+06] G. Werner-Allen, K. Lorincz, J. Johnson, J. Lees, and M. Welsh. Fidelity and yield in a volcano monitoring sensor network. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation. USENIX, December 2006.
[WB99] M. Widmann and C. Bretherton. 50 km resolution daily precipitation for the Pacific Northwest, 1949-94. Online data set located at <http://www.jisao.washington.edu/data sets/widmann>, 1999.
[WGR07] W. Wang, M. Garofalakis, and K. Ramchandran. Distributed sparse random projections for refinable approximation. In Proceedings of the ACM/IEEE International Symposium on Information Processing in Sensor Networks, pages 331–339. Springer Verlag, April 2007.
[ZCH07] Y. Zhang, S. Chatterjea, and P. Havinga. Experiences with implementing a distributed and self-organizing scheduling algorithm for energy-efficient data gathering on a real-life sensor network platform. In Proceedings of the First IEEE International Workshop: From Theory to Practice in Wireless Sensor Networks. IEEE, June 2007.
[ZK04a] M. Zuniga and B. Krishnamachari. Analyzing the transitional region in low power wireless links. In Proceedings of the First IEEE International Conference on Sensor and Ad hoc Communications and Networks. IEEE, October 2004.
[ZK04b] M. Zuniga and B. Krishnamachari. Realistic wireless link quality model and generator. Available online for download at <http://ceng.usc.edu/ anrg/downloads.html>, 2004.
[ZSS05] Y. Zhu, K. Sundaresan, and R. Sivakumar. Practical limits on achievable energy improvements and useable delay tolerance in correlation aware data gathering in wireless sensor networks. In Proceedings of the 2nd IEEE Communications Society Conference on Sensor and Ad Hoc Communications and Networks. IEEE, September 2005.