JOINT ROUTING AND COMPRESSION IN SENSOR NETWORKS: FROM
THEORY TO PRACTICE
by
Sundeep Pattem
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
August 2010
Copyright 2010 Sundeep Pattem
Dedication
To Sameera
Acknowledgements
My work at USC owes a great deal to collaborations with and help from several faculty
and colleagues: Prof. Bhaskar Krishnamachari, Prof. Antonio Ortega, Prof. Ramesh
Govindan, Prof. Gaurav Sukhatme, Prof. Kristina Lerman (USC/ISI), Sameera Poduri,
Avinash Sridharan, Ying Chen, Alexandre Ciancio, Godwin Shen, Sungwon Lee, Matt
Klimesh (JPL), Maheswaran Sathiamoorthy, Aaron Tu, Aram Galstyan (USC/ISI).
It has been a privilege to be associated with Prof. Bhaskar Krishnamachari for all
these years. It is no exaggeration to say that I would not be writing a thesis but for
Bhaskar’s help: the extraordinary patience and kindness, the ability to empathize, enthuse
and inspire, and the passion for helping students realize their potential, not just in
research, but as well-rounded people. I hope my life will reflect what I imbibed from his
emphasis on values and service.
My roommates and buddies, Apoorva Jindal, Rahul Urgaonkar, Avinash Sridharan,
Sonal Jain, Ankit Singhal, made time fly. The wise old(er) folks, Narayanan Sadagopan,
Venkata Pingali, Amit Ramesh, Srikanth Saripalli, Krishnakant Chintalapudi, Karthik
Dantu, made life easy. ANRG members - Marco Zuniga, Ying Chen, Shyam Kapadia,
Divya Devaguptapu, Joon Ahn, Hua Liu, Pai-Han Huang, Kiran Yedavalli, Jung-Hyun
Jun, made working in the lab a pleasure. I will miss Shane Goodoff and his edgy sense
of humor.
My mother, Sri Shobha Rani, is my first teacher. I owe all my learning to her. My
father, Sri Rajeswara Rao, toiled hard so we could dream. I hope he thinks his harvest has
been good. My grandmother, Sri Kalavathi, introduced me to the power, and pleasures,
of the imagination. My sister, Santoshi, is my biggest (and only) fan. We’ll always have
each other. My friend, Rakesh, helped open my eyes to the vast and the blissful. My wife,
Sameera, has shared in all the delight and the despair. I hope she thinks it worthwhile.
For me, she has been a blessing.
Table of Contents
Dedication

Acknowledgements

List of Figures

Abstract

Chapter 1: Introduction
  1.1 Data gathering sensor networks
  1.2 Thesis and Research Summary
    1.2.1 Impact of spatial correlations on optimal routing
    1.2.2 Algorithms for achieving distributed compression
    1.2.3 Architecture and system implementation for distributed compression

Chapter 2: Background on in-network compression
  2.1 Distributed aggregation
  2.2 Distributed compression
    2.2.1 Distributed source coding
    2.2.2 Analysis of impact of correlations on routing
    2.2.3 Spatial transforms
    2.2.4 Compressed sensing
  2.3 Distributed configuration
  2.4 System implementation

Chapter 3: Modeling of joint routing and compression
  3.1 Assumptions and Methodology
    3.1.1 Note on Heuristic Approximation
  3.2 Routing Schemes
    3.2.1 Comparison of the schemes
  3.3 A Generalized Clustering Scheme
    3.3.1 Description of the scheme
      3.3.1.1 Metrics for evaluation of the scheme
    3.3.2 1-D Analysis
      3.3.2.1 Sequential compression along SPT to cluster head
      3.3.2.2 Compression at cluster head only
    3.3.3 2-D analysis
      3.3.3.1 Opportunistic compression along SPT to cluster head
      3.3.3.2 Compression at cluster head only
  3.4 Simulation Results
    3.4.1 Communication and Topology models
      3.4.1.1 Random geometric graphs
      3.4.1.2 Realistic Wireless Communication model
    3.4.2 Joint entropy models
      3.4.2.1 Linear and convex functions of distance
      3.4.2.2 Continuous, Gaussian data model
    3.4.3 Summary of results
  3.5 Summary and Conclusions

Chapter 4: Practical schemes for distributed compression
  4.1 Wavelet transform design for wireless broadcast advantage
    4.1.1 Wavelet basics: The 5/3 lifting transform
    4.1.2 Wavelets for sensor networks
      4.1.2.1 Unidirectional 1D wavelet
      4.1.2.2 2D wavelet for tree topologies
    4.1.3 2D wavelet for wireless broadcast scenario
      4.1.3.1 Augmented neighborhoods
      4.1.3.2 New transform definition
      4.1.3.3 Performance of new transform
  4.2 Compressed sensing for multi-hop network setting
    4.2.1 Combining routing with known results in compressed sensing

Chapter 5: SenZip: Distributed compression as a service
  5.1 SenZip architecture
    5.1.1 SenZip Specification
      5.1.1.1 Compression Service
      5.1.1.2 Networking components
    5.1.2 Discussion
  5.2 Mapping algorithms to architecture
    5.2.1 Algorithm details
      5.2.1.1 DPCM
      5.2.1.2 2D wavelet
    5.2.2 Relating algorithms to SenZip
      5.2.2.1 Initialization
      5.2.2.2 Data forwarding and compression
      5.2.2.3 Reconfiguration
  5.3 System implementation
    5.3.1 TinyOS code
      5.3.1.1 Interfaces
      5.3.1.2 AggregationP component
      5.3.1.3 CompressionP component
      5.3.1.4 Changes to CTP
      5.3.1.5 Application
    5.3.2 Experimental Results
      5.3.2.1 Static topologies
      5.3.2.2 Dynamic topologies

Chapter 6: Conclusion
  6.1 Contributions
  6.2 Future work

References
List Of Figures
1.1 (a) Illustration of a distributed phenomenon and data gathering using a sensor network. (b) Hardware: Telosb mote device.
1.2 (a) Software abstraction from the application developer perspective. (b) Possible fit for compression.
1.3 Software abstraction for compression as a service.
3.1 Empirical data (from the rainfall data-set) and approximation for joint entropy of linearly placed sources separated by different distances.
3.2 Illustration of routing for the three schemes: DSC, CDR, and RDC. H_i is the joint entropy of i sources.
3.3 Comparison of energy expenditures for the RDC, CDR, and DSC schemes with respect to the degree of correlation c.
3.4 Illustration of clustering for a two-dimensional field of sensors.
3.5 Comparison of the performance of different cluster sizes for a linear array of sources (n = D = 105) with compression performed sequentially along the path to cluster heads. The optimal cluster size is a function of the correlation parameter c. Also, cluster size s = 15 performs close to optimal over the range of c.
3.6 Illustration of the existence of a static cluster for near-optimal performance across a range of correlations. The sources are in a linear array and data is sequentially compressed along the path to cluster heads.
3.7 Performance with compression only at the cluster head with nodes in a linear array (n = D = 105). Cluster sizes s = 5, 7 are close to optimal over the range of c.
3.8 Illustration of the near-optimal cluster size with compression only at the cluster head with nodes in a linear array. The performance of cluster sizes near s = 7 (≈ √(105/2)) is close to optimal over the range of c values.
3.9 Routing in a 2-D grid arrangement. (a) Calculation of joint entropy. Under the iterative approximation, the joint entropy of k nodes forming a contiguous set is the same as the joint entropy of k sensors lying on a straight line; this is illustrated along the diagonal. It also illustrates opportunistic compression along the SPT to the cluster head. (b) Intra-cluster, shortest-path routing from source to cluster head with compression only at the cluster head. Routing from cluster heads to the sink is similar.
3.10 Comparison of the performance of various cluster sizes for a network with 10^6 nodes on a 1000x1000 grid when compression is possible only at cluster heads. The performance for s = 5, 10 is observed to be close to optimal over the range of c values.
3.11 Illustration of the existence of a near-optimal cluster size. The network size is n × n = 1000 × 1000 and compression is possible only at cluster heads. The performance of cluster side values near s = 0.6487 n^(1/3) is quite close to optimal for all values of c ranging from 0.0001 to 10000.
3.12 Random geometric graph topology. Performance of clustering with density = 1 node/m^2 and communication radius = 3 m for networks of size (a) 24x24 (b) 84x84 (c) 200x200. Near-optimal cluster sizes are (a) 3, 4 (b) 4, 7 (c) 8, 10.
3.13 Realistic wireless communication topology. Performance of clustering in a 48m x 48m network with density = 0.25 nodes/m^2 for power level (a) -3 dB (b) -7 dB (c) -10 dB. Cluster sizes 6, 8 are near-optimal.
3.14 (a) Example forms of joint entropy functions for 2 sources. The entropy of each source is normalized to 1 unit. The convex and linear curves are clipped when the joint entropy equals the sum of individual entropies. The curves shown are for correlation parameter c = 2. Performance of clustering in a 72m × 72m network with density = 0.25 nodes/m^2 for the (b) linear model and (c) convex model of joint entropy. Cluster size 6 is near-optimal.
3.15 Performance of clustering in a 48m × 48m network with density = 0.25 nodes/m^2 with a continuous, jointly Gaussian data model and quantization step (a) δ = 1 (b) δ = 0.5 (c) δ = 0.05. Cluster sizes 6, 8 are near-optimal.
4.1 Example (a) signal and (b) 5/3 wavelet coefficients.
4.2 Illustration of odd (green) and even (blue) nodes in a subtree for the 2D wavelet (a) with unicast and (b) exploiting the broadcast nature of wireless communications. Solid arrows are part of the tree routing paths; dashed arrows are wireless links not part of the tree. Arrows crossed off in red denote interactions disallowed for transform invertibility and unidirectionality.
4.3 (a) Sample tree topology. (b) With additional broadcast links in the augmented neighborhoods at each node. (c) Performance gain in terms of SNR vs. cost for the new transform compared to the 2D wavelet for tree topologies.
4.4 Compressed sensing performance in a multi-hop setting. Plot of SNR vs. cost for different schemes. The black and green curves are for Sparse Random Projections (SRP); the blue and red curves are for two variations of computing projections over shortest-path routing.
5.1 The SenZip architecture. A completely distributed compression service is enabled by having the interacting components shown here at each network node.
5.2 Aggregation table example. The recursive entry structure allows the same definition for different compression schemes.
5.3 Partial computations for the 2D wavelet. Gray (white) circles denote even (odd) nodes. Operations at each node are done in the order listed.
5.4 Code structure of (a) CTP and (b) the SenZip compression service over CTP.
5.5 (a) Distributed compression and (b) centralized reconstruction.
5.6 Experiments on static trees with the 2D wavelet transform and fixed quantization. (a) Two fixed tree topologies, tree 1 and tree 2, for the same set and locations of nodes. Raw measurement (dashed red) and reconstruction (solid blue) for node 7 with 2 bits allocated per sample for (b) tree 1 and (c) tree 2, and for node 12 with 3 bits per sample for (d) tree 1 and (e) tree 2. Histogram of RMS error at all nodes with 3 bits per sample for (f) tree 1 and (g) tree 2.
5.7 (a) Average RMS error for tree 1 with increasing bit allocation per sample for DPCM and the 2D wavelet. (b) Normalized cost w.r.t. raw data gathering with CTP for increasing bit allocation per sample.
Abstract
In-network compression is essential for extending the lifetime of data gathering sensor
networks. To be truly effective, not only the computations but also the configuration
required for such compression must be achieved in a distributed manner. The thesis of
this dissertation is that it is possible to demonstrate completely distributed in-network
compression in sensor networks. Establishing this thesis requires studying several aspects
of joint routing and compression.
First, our analysis of the impact of spatial correlations on optimal routing shows that
there exist correlation-independent schemes for routing that are near-optimal. This im-
plies that static routing schemes may perform as well as sophisticated ones based on
learning correlations. Next, we develop novel and practical algorithms for distributed
compression that take into account the routing structure over which data is transported.
Finally, we argue that the lack of work on (a) distributed configuration for compression
operations and (b) reusable software development is the primary reason why compression has
not been widely adopted in sensor network deployments. Our solution to address this gap
is SenZip, an architectural view of compression as a service that interacts with standard
networking services. A system implementation based on SenZip and results from experi-
ments concretely demonstrate distributed and self-organizing in-network compression.
Chapter 1
Introduction
The sensor networks vision arose in the late 1990s, with the emergence of a new class
of devices that allow fine-grained sensing of the physical world. Technological advances
made it possible to integrate computation, communication, sensing, and even actuation,
on the same platform, while keeping the form factor small and cost low. Potentially large
numbers of these devices could be distributed in space to form a collaborating wireless
network capable of achieving complex global tasks. It was apparent that such networks
will enable several new applications of benefit to society. The last ten years have seen a
great amount of research activity in this area in both academia and industry.
1.1 Data gathering sensor networks
Sensor networks are aiding the evolution of monitoring systems for earth and space science
applications [GBR, TMEC+10, WALJ+06]. Frequently, these systems require continuous
data gathering from a distributed field to a central base station. A typical scenario is
illustrated in Figure 1.1(a). The image shows the temperature of the ocean surface off
the Los Angeles coast. A sensor network has been deployed to collect the temperature
Figure 1.1: (a) Illustration of a distributed phenomenon and data gathering using a sensor network. (b) Hardware: Telosb mote device.
Figure 1.2: (a) Software abstraction from the application developer perspective. (b) Possible fit for compression.
measurements and transport them to a base station on the coast. The hardware de-
ployed could be, for instance, Telosb motes shown in Figure 1.1(b). In software, the data
gathering application interfaces with sensors to receive sensor measurements and sends
them to a networking “black box” that will perform operations necessary to transport
the measurements to the sink. A crude abstraction for the software at each node from
an application developer perspective is shown in Figure 1.2(a).
The phenomenon being sensed evolves in space and time. For most naturally
occurring phenomena, the signal can be expected to be correlated in both of these
Figure 1.3: Software abstraction for compression as a service
dimensions. In-network compression or multi-node fusion is considered a necessity due
to the energy constraints of sensor nodes. Since the energy cost is directly related to the
number of bits transmitted, it would be more efficient to exploit the correlations in the
data to compress it inside the network. Where should this compression be performed?
Perhaps as part of the application, as is the case in the Internet? The abstraction would
then look something like Figure 1.2(b). However, in this situation, the “spatial image” is
not available at any single node. The compression needs to be performed as the data is
routed to the sink.
1.2 Thesis and Research Summary
From an application developer perspective, compression needs to be provided as a ser-
vice. Given such a service, as shown in Figure 1.3, the application now sends the sensor
measurements to a “compression plus networking black box”. In addition to the regular
networking functions, this “black box” will be capable of achieving both the computations
and configuration required for in-network compression in a distributed manner. Is it pos-
sible to define such a service? What is inside the box? Our thesis is that it is possible
to practically demonstrate completely distributed in-network compression in
sensor networks. Establishing this thesis to arrive at our goal of “compression as a
service” requires studying several aspects of joint routing and compression: What is
the impact of spatial correlations on optimal routing? What algorithms can be used for
distributed en-route compression? What issues need to be addressed in going from the
theory to a widely adopted system design for distributed compression?
1.2.1 Impact of spatial correlations on optimal routing
In considering the impact of spatial correlations on routing, since energy-efficiency is the
prime motivation for compression of correlated data, it makes sense to route along paths
which allow for more compression. However, the increased routing costs for deviating
too much from the original shortest paths might overwhelm the gains from compression.
We build models and perform analysis to explore this tension. Clustering is a natural
way of trading off progress towards the sink and opportunities for compression close
to data sources. The optimal cluster size can be expected to depend on the degree
or level of correlation in the data. Our analysis confirms this but also throws up two
surprising results. First, when every node is capable of compression computations, the
optimal cluster consists of the whole network, i.e., shortest path routing is optimal. Second,
when compression operations are performed only at cluster heads, there exists a near-
optimal cluster size that works well over a range of correlation levels. The implication
for correlated data gathering is that simple, non-adaptive routing schemes can perform
as well as sophisticated, adaptive ones.
1.2.2 Algorithms for achieving distributed compression
In the second part, we focus on the design of distributed compression algorithms. We
consider two different views of structure in data: one based on wavelet transforms and
the other on compressed sensing. Shen and Ortega [SO08a] have developed lifting-based
wavelet transforms that can operate over any 2-D tree routing topology. Their algorithms
assume unicast communications between nodes in the network. We extend their work
by designing a new transform to take advantage of the broadcast nature of wireless
communication [SPO09]. This transform allows for better compression of data and hence
energy efficiency. The second approach is to extend recent results in compressed sensing
to the multi-hop routing scenario. Our work is the first to consider this problem.
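Since the transforms in this part build on lifting, a one-dimensional sketch helps fix ideas. The following is a generic, float-valued 5/3-style lifting step with clamped boundaries, given for illustration only, not the tree-based transform developed in Chapter 4: odd samples are predicted from their even neighbours to give high-pass details, even samples are updated with the details to give a low-pass signal, and reversing the two steps recovers the input exactly.

```python
def lifting_53_forward(x):
    # Split into even/odd samples, then lift. Boundary indices are clamped.
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample minus the average of its even neighbours
    # becomes a high-pass detail coefficient.
    d = [odd[i] - (even[i] + even[min(i + 1, len(even) - 1)]) / 2
         for i in range(len(odd))]
    # Update: each even sample plus a quarter of the adjacent details
    # becomes a low-pass (smooth) coefficient.
    s = [even[i] + (d[max(i - 1, 0)] + d[min(i, len(d) - 1)]) / 4
         for i in range(len(even))]
    return s, d

def lifting_53_inverse(s, d):
    # Undo the update, then the predict, using the same clamped expressions.
    even = [s[i] - (d[max(i - 1, 0)] + d[min(i, len(d) - 1)]) / 4
            for i in range(len(s))]
    odd = [d[i] + (even[i] + even[min(i + 1, len(even) - 1)]) / 2
           for i in range(len(d))]
    x = [0.0] * (len(even) + len(odd))
    x[0::2], x[1::2] = even, odd
    return x
```

On smooth data the details are near zero and compress well; a constant signal yields all-zero details, which is the property the spatial transforms in Chapter 4 exploit.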
1.2.3 Architecture and system implementation for distributed compression
In this part, we focus on software development and system implementation issues for dis-
tributed compression. Earlier work has mostly dealt with specific schemes and optimiza-
tions and has not led to reusable software development, which is the crucial step in wide
adoption in deployments. Another important issue that earlier work has not addressed
is that of distributed configuration and re-configuration. Along with the computations,
which have to be performed in a distributed fashion at the nodes, the configuration of
compression operations must also happen in a distributed manner: which “roles” nodes
play in the transform, which other nodes they receive data from and perform computations
over, the topology-specific parameter settings in the transform, and so on, along with
re-configuration in the face of network dynamics. Finally, to avoid a re-design of the stack,
the compression should be able to work over standard networking (esp. routing) compo-
nents. Our solution incorporates these issues to propose SenZip, an architectural view of
compression as a service that works over standard networking components. To establish
that SenZip is a working architecture, we have implemented a nesC/TinyOS system that
provides a compression service based on the SenZip architecture that works on top of
the Collection Tree Protocol [tos]. The resulting system demonstrated distributed con-
figuration and computations and good reconstruction for compression with two different
schemes, DPCM and the 2D wavelet, over both static and dynamic routing topologies. This
system adapts to changes in the network topology using the tools provided by CTP. When
the topology changes, the local aggregation tree is re-configured in a distributed manner
and both compression and reconstruction continue smoothly. The software modules are
available for download on tinyos-contribs.
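To make the simpler of the two schemes concrete, a lossless DPCM-style pass over an aggregation tree can be sketched as follows. The node ids and parent map are illustrative, and the actual implementation quantizes the differentials and runs as nesC/TinyOS components; here each non-root node ships the difference from its parent's raw measurement, and the sink reconstructs by resolving each node after its parent.

```python
def dpcm_encode(parent, values, root):
    # `parent` maps each non-root node to its parent in the aggregation tree.
    # Each non-root node forwards the difference from its parent's raw value;
    # the root forwards its measurement unchanged.
    coded = {root: values[root]}
    for node, par in parent.items():
        coded[node] = values[node] - values[par]
    return coded

def dpcm_decode(parent, coded, root):
    # Sink-side reconstruction: resolve each node after its parent.
    values = {root: coded[root]}
    def resolve(node):
        if node not in values:
            resolve(parent[node])
            values[node] = coded[node] + values[parent[node]]
    for node in parent:
        resolve(node)
    return values
```

For spatially correlated readings the differentials are small and need few bits, which is where the compression gain comes from.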
The rest of the dissertation is organized as follows. Chapter 2 provides background
on in-network compression for sensor networks by discussing related literature. Chapter
3 presents the modeling and analysis of the impact of spatial correlations on routing.
Chapter 4 presents new schemes for distributed compression. Chapter 5 presents the
SenZip architecture and details of the system implementation based on it. Chapter 6
concludes the thesis with a summary of contributions and future work.
Chapter 2
Background on in-network compression
In-network compression or multi-node fusion is essential for data gathering sensor net-
works due to the energy constraints of the nodes. We discuss several approaches
that have been proposed to exploit correlations for efficient and long-lived operation.
There is some limited work on in-network compression in wired networks. A set of stan-
dards has been developed for header compression inside the network [rfc90, rfc99, rfc01].
Obviously, in this case, the payload of the packets is not altered. Work on active networks
looked at performing operations on data inside the network to trade off computation and
communication [BK01, TW96]. However, the vision of an ActiveNet that will succeed
and replace the Internet did not materialize. The Internet is based on the end-to-end
paradigm with only end hosts performing operations on the data.
In this chapter, we begin by looking at schemes for distributed aggregation and com-
pression for sensor networks. These works primarily focus on achieving the computations
required for compression in a distributed manner. We then discuss the problem of dis-
tributed configuration and reconfiguration required at the nodes to perform compression
operations. Finally, we describe work on system design and software development for
energy-efficient data gathering.
2.1 Distributed aggregation
Aggregation schemes aim to avoid redundancy at the packet level. Some examples are
duplicate suppression and computing statistics such as the minimum, maximum, average,
and count over the measurements of distributed sensors.
Krishnamachari et al. [KEW02] presented models and performance analysis for simple
aggregation (duplicate suppression, min, max) and illustrated the gains when compared
to end-to-end routing. They also studied the effects of network topology and the nature
of optimal routing for such aggregation. Optimal aggregation is shown to correspond to
routing over a minimum Steiner tree, which is NP-hard to compute, and some sub-optimal
structures are then considered. Intanagonwiwat et al. [IEGH02] observed that greedy aggregation based on
directed diffusion [IGE+03] can do better than opportunistic aggregation in high density
scenarios. Madden et al. [MFHH] argued that aggregation should be provided as a core
service for sensor network applications. They proposed the TAG (Tiny AGgregation)
service for answering declarative queries over a routing tree.
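The partial state records behind TAG-style aggregation can be sketched in a few lines. The field names here are illustrative, not TAG's actual interface: each node merges the records of its children with a record for its own reading and forwards a single fixed-size record, from which min, max, count, and average all fall out at the sink.

```python
def make_record(reading):
    # Partial state record covering a single node's measurement.
    return {"min": reading, "max": reading, "sum": reading, "count": 1}

def merge(a, b):
    # Combine two partial records; the result summarizes both subtrees,
    # so a node forwards one record regardless of subtree size.
    return {"min": min(a["min"], b["min"]),
            "max": max(a["max"], b["max"]),
            "sum": a["sum"] + b["sum"],
            "count": a["count"] + b["count"]}

def finalize(rec):
    # Derived aggregates computed once at the sink.
    return {"min": rec["min"], "max": rec["max"],
            "avg": rec["sum"] / rec["count"]}
```

Because `merge` is associative and commutative, records can be combined in any order up the routing tree, which is what makes opportunistic in-network aggregation possible.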
2.2 Distributed compression
In this section, we discuss literature on distributed compression.
2.2.1 Distributed source coding
These works involve constructive approximations to distributed Slepian-Wolf coding, in
several cases with little or no interaction between the encoders. Typically, these approaches
require knowledge of the global correlation structure at the sink or at all nodes. Multi-hop routing is
not considered. The techniques proposed by Pradhan et al. [PR99] suggest mechanisms
to compress the content at the original sources in a distributed manner without explicit
routing-based aggregation. The sink has complete knowledge of the correlation structure,
which it uses to arrive at the optimal coding rates at each node and then disseminates
the same to them. No inter-sensor communication is required for compression purposes.
Gastpar et al. [GDV06] present the distributed K-L transform that has applications for
distributed compression problems. The authors consider the optimal local operations at
distributed agents, such as sensor nodes, to provide a locally compressed version of the
data to a central base station which will then reconstruct the whole field with minimum
error. In general, the solution needs knowledge of global correlation structure and is
shown to be globally convergent for the Gauss-Markov data case.
In DSC techniques, the correlation between data captured by different nodes has
to be known, which in practice will require data exchange between nodes. Practical
application and deployment of such techniques for sensor network data gathering has not
been attempted.
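The rate-allocation step in such schemes can be illustrated with a small sketch. The decoding order and the distance-based correlation model below are assumptions for illustration: the sink fixes an order and assigns each node the conditional entropy of its measurement given all previously decoded nodes (a corner point of the Slepian-Wolf rate region), so the rates sum to the joint entropy and the sensors never exchange data among themselves.

```python
def corner_point_rates(order, cond_entropy):
    # Slepian-Wolf corner point: node i encodes at H(X_i | X_1, ..., X_{i-1})
    # for a decoding order fixed by the sink.
    rates, decoded = {}, []
    for node in order:
        rates[node] = cond_entropy(node, tuple(decoded))
        decoded.append(node)
    return rates

def distance_model(positions, c):
    # Toy conditional-entropy model: 1 bit for the first source, then
    # d / (d + c) where d is the distance to the nearest decoded source.
    def h(node, decoded):
        if not decoded:
            return 1.0
        d = min(abs(positions[node] - positions[v]) for v in decoded)
        return d / (d + c)
    return h
```

The sink disseminates the resulting per-node rates once; the hard part in practice, as noted above, is that the correlation model itself must come from somewhere.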
2.2.2 Analysis of impact of correlations on routing
Work by Scaglione and Servetto [SS02] was the first to explicitly consider the problem of
joint routing and compression. By considering the joint entropy of sources as the data
metric and routing for compression within localized partitions (or clusters), it is shown
that the network broadcast problem in multi-hop networks is feasible.
Work by Enachescu et al. [EGGM04] presents a randomized algorithm which is a
constant factor approximation (in expectation) to the optimum aggregation tree simulta-
neously for all correlation parameters. A notion of correlation is introduced in which the
information gathered by a sensor is proportional to the area it covers and the aggregate
information generated by a set of sensors is the total area they cover. The performance of
aggregation under an arbitrary, general model is considered by Goel and Estrin [GE03].
In this thesis, we analyze the relative performance of various routing and compression
schemes based on using an empirically motivated model for the joint entropy as a func-
tion of inter-source distances [PKG04, PKG08]. The optimal routing structure is then
analyzed using this approximation. The analysis demonstrates that the optimal routing
structure depends on where the actual data compression is performed; at each individual
node or at “micro-servers” acting as intermediate data collection points. In both cases,
we show that there exist efficient correlation independent routing schemes.
The correlated data gathering problem and the need for jointly optimizing the coding
rate at nodes and routing structure is also considered in [CBLV04]. The authors provide
analysis of two strategies: the Slepian-Wolf or DSC model, for which the optimal coding
is complex (needs global knowledge of correlations) and optimal routing is simple (always
along a shortest path tree) and a joint entropy coding model with explicit communication
for which coding is simple and optimizing routing structure is difficult. For the Slepian-
Wolf model, a closed form solution is derived while for the explicit communication case
it is shown that the optimization problem is NP-complete and approximation algorithms
are presented.
In [vRW04], “self-coding” and “foreign-coding” are differentiated. In self-coding,
a node uses data from other nodes to compress its own data, while in foreign-coding
a node can also compress data from other nodes. With foreign-coding, the authors
show that energy-optimal data gathering involves building a directed minimum span-
ning tree (DMST). For self-coding, it is shown in [CBLV04] that the optimal solution is
NP-complete.
Zhu et al. [ZSS05] have shown that under many network scenarios, a shortest path
tree has performance that is comparable to an optimal correlation aware routing struc-
ture. [GE03] takes a more general view of aggregation functions, rather than treating
aggregation as compression of spatially correlated sources, and the results in [ZSS05]
are contingent on a limited data compression model in which the compression gain is
independent of the number of neighbors and of the distances between nodes. Nevertheless,
our finding that there exists a near-optimal clustering scheme that performs well for a
wide range of correlations is in keeping with the results presented in these works.
2.2.3 Spatial transforms
The design of spatial transforms involves separating the spatially distributed signal into
low and high pass portions. One particular class of methods sends only trend data or data
models within a given cluster. In Ken [CDHH06] and PAQ [TM06], nodes are separated
into clusters and assigned roles as cluster head or non-cluster head. Then, nodes forward
data to cluster heads on some aggregation graph, model parameters for data in each
cluster are estimated at cluster heads and only model parameters are forwarded to the
sink along a routing tree. Note that an ordering of communications is implicit in this
process. Another simple form of distributed data compression is differential encoding.
For example, in DOSA [ZCH07], nodes are assigned roles as either correlating (C) or
non-correlating (NC) nodes, NC nodes forward data to C nodes and C nodes compute
and forward differentials of their NC neighbors. More sophisticated techniques have
also been proposed [LTP05]. Distributed computation of differentials must be done in
a predefined order on an aggregation graph. Differentials are forwarded along a routing
tree.
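As a concrete illustration of this pattern, differential encoding in the style of DOSA can be sketched as follows; the function names, readings, and cluster layout are hypothetical stand-ins, not DOSA's actual interface.

```python
# Hedged sketch of differential encoding on an aggregation graph:
# NC nodes forward raw readings to a neighboring C node, which
# transmits its own reading plus one differential per NC neighbor.

def encode_cluster(c_reading, nc_readings):
    """C node output: its own reading, then differentials for NC neighbors."""
    return [c_reading] + [r - c_reading for r in nc_readings]

def decode_cluster(payload):
    """The sink inverts the differentials using the C node's reading."""
    c_reading, diffs = payload[0], payload[1:]
    return c_reading, [c_reading + d for d in diffs]

# Round trip for one cluster with three NC neighbors
payload = encode_cluster(20.5, [20.7, 20.4, 21.0])
assert decode_cluster(payload) == (20.5, [20.7, 20.4, 21.0])
```

When nearby readings are strongly correlated, the differentials are small and can be entropy-coded in fewer bits than the raw values, which is the source of the compression gain.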
Ciancio and Ortega [CO05] developed a distributed scheme for removing spatial cor-
relations using wavelet transforms via lifting steps. This scheme was for 1D paths and
was later extended by Ciancio et al. [CPOK06] to handle the merging of multiple paths.
A further enhancement by Shen and Ortega [SO08a] designed a transform to work over
any given tree. We describe a new transform [SPO09] that exploits the broadcast nature
of wireless transmission to achieve better SNR vs. cost performance.
2.2.4 Compressed sensing
Wavelet transform techniques are essentially critically sampled approaches, so that their
cost of gathering scales up with the number of sensors, which could be undesirable when
large deployments are considered. Compressed sensing (CS) has been considered as a
potential alternative in this context, as the number of samples required (i.e., number
of sensors that need to transmit data), depends on the characteristics (sparseness) of
the signal [CRT06, Don06]. In addition CS is also potentially attractive for wireless
sensor networks because most computations take place at the decoder (sink), rather than
encoder (sensors), and thus sensors with minimal computational power can efficiently
encode data.
CS theoretical developments have focused on minimizing the number of measurements
(i.e., the number of samples captured), rather than on minimizing the cost of each mea-
surement. In many CS applications (e.g., [DDT+08, LDP07]), each measurement is a
linear combination of many (or all) samples of the signal to be reconstructed.
It is easy to see why this is not desirable in the context of a sensor network: the
signal to be sampled is spatially distributed so that measuring a linear combination of all
the samples would entail a significant transport cost to generate each aggregate measure-
ment. To address this problem, sparse measurement approaches (where each measure-
ment requires information from a few sensors) have been proposed. Wang et al. [WGR07]
look at such an approach in a single hop network. We consider multi-hop sensor net-
works [LPS+09, PLS+09]. Compared with state of the art compressed sensing techniques
for sensor networks, our experimental results demonstrate significant gains in reconstruc-
tion accuracy and transmission cost.
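The contrast between dense and sparse measurements can be made concrete with a small sketch. The sizes below (100 sensors, 20 measurements, 5 sensors per sparse measurement) are purely illustrative, and reconstruction itself is omitted; the point is only the per-measurement transport cost.

```python
# Each dense CS measurement is a linear combination of all sensor
# readings, so every sensor contributes to every measurement; a sparse
# measurement touches only k sensors, reducing per-measurement transport.
import random

random.seed(0)
n_sensors, n_meas, k = 100, 20, 5

# Dense measurement matrix: every entry nonzero (Gaussian).
dense = [[random.gauss(0, 1) for _ in range(n_sensors)] for _ in range(n_meas)]

# Sparse measurement matrix: k nonzero entries per row.
sparse = []
for _ in range(n_meas):
    row = [0.0] * n_sensors
    for j in random.sample(range(n_sensors), k):
        row[j] = random.gauss(0, 1)
    sparse.append(row)

def sensors_touched(rows):
    """Total sensor contributions summed over all measurements."""
    return sum(sum(1 for x in row if x != 0.0) for row in rows)

assert sensors_touched(sparse) == n_meas * k
assert sensors_touched(sparse) < sensors_touched(dense)
```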
2.3 Distributed configuration
Only limited efforts have been devoted to understanding the problems associated with
distributed node configuration for compression. For efficiency and scalability, only a small
amount of “local” communications should be needed to determine which nodes exactly
perform which compression computations, over what data and how the data is then routed
to them. Distributed configuration is also desirable as it can help reduce initialization and
reconfiguration times since it is not necessary for a sink node to first gather information
about all nodes.
From a purely architectural viewpoint, it is well understood that addressing the re-
source constraints in sensor network operation requires cross-layer designs. However, this
flexibility has led to a proliferation of monolithic and vertically integrated systems. In the
absence of an agreement on the decomposition of services provided by system components
and their interactions, interfacing such designs for a practical deployment is more or less
infeasible.
Culler et al. [CDE+05] advocate the need for an overall sensor network architecture.
In a follow up paper by Tavakoli et al. [TDJ+07], a set of design principles is proposed
for the development of elements of the networking software architecture. In addition
to the traditional goals of code reuse and interoperability, these include extensibility.
This requirement arises in view of the relative immaturity of the field, where a rigid
and complete modularization stifles innovation. They recommend a hybrid approach,
with modularity for low level components (underlying infrastructure) and flexibility and
extensibility at the higher layer (programming paradigm).
We address the problem of distributed configuration for compression by proposing the
SenZip architecture. SenZip specifies a compression service that can encompass differ-
ent compression schemes and its modular interactions with standard networking services
such as routing. This architecture enables a distributed node configuration for compres-
sion, just as existing systems make it possible for sensors to configure themselves for
routing in a distributed manner. The architecture proposal is based on (a) lessons from
overall architectural principles for sensor networks [TDJ+07], (b) our own experience in
implementing a practical wavelet-based distributed compression system, and (c) identify-
ing common patterns in existing compression schemes. Work by Tarrio et al. [TVSO09]
(GSN '09) considers the design of simple wavelet-like techniques for distributed
compression which are explicitly designed to work over, and take advantage of, the
configuration mechanisms provided by the Collection Tree Protocol [tos].
2.4 System implementation
Most earlier work has focused on theory and simulations to understand performance
limits. These studies, and some limited system implementations (e.g., [ZCH07]), have
therefore had limited impact on technology adoption and sensor network software devel-
opment because they have not yielded modular and inter-operable software.
There have been previous efforts to implement simpler kinds of aggregation mechanisms in
sensor networks. These include the aggregation services in TAG/TinyDB [MFHH], application
independent distributed aggregation (AIDA) [HBSA04], and the differential encoding-based
distributed compression scheme whose implementation is described in [ZCH07]. There
has also been some prior work [GGP+03] on implementing traditional non-distributed
wavelet compression for multi-resolution storage and querying in sensor networks.
To demonstrate the utility and practicality of SenZip, we have implemented a system
to achieve compression over the Collection Tree Protocol [tos]. The resulting system
demonstrated distributed configuration and computation, and good reconstruction, for
compression with two different schemes, DPCM and 2D wavelets, over both static and
dynamic routing topologies [PSC+09]. The software modules were designed to be
reusable and extensible and are available on tinyos-contribs [sen].
Chapter 3
Modeling of joint routing and compression
In order to understand the space of interactions between routing and compression, we
study simplified models of three qualitatively different schemes. In routing-driven
compression, data is routed along shortest paths to the sink, with compression taking
place opportunistically wherever these routes happen to overlap [IEGH02, KEW02]. In
compression-driven routing, the route is chosen so as to compress the data from all nodes
sequentially, not necessarily along a shortest path to the sink. Our analysis shows that
these two schemes perform well at low and high spatial correlation, respectively.
As an ideal performance bound on joint routing-compression
techniques, we consider distributed source coding in which perfect source compression is
done a priori at the sources using complete knowledge of all correlations.
In order to obtain an application-independent abstraction for compression, we use the
joint entropy of sources as a measure of the uncorrelated data they generate. An empirical
The work described in this section was published as follows:
Sundeep Pattem, Bhaskar Krishnamachari and Ramesh Govindan, "The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks", Transactions on Sensor Networks (TOSN), Volume 4, Number 4, August 2008.
Sundeep Pattem, Bhaskar Krishnamachari and Ramesh Govindan, "The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks," Third Symposium on Information Processing in Sensor Networks (IPSN), 2004.
approximation for the joint entropy of sources as a function of the distance between them
is developed. A bit-hop metric is used to quantify the total cost of joint routing with
compression. Evaluation of the above schemes using these metrics leads naturally to a
clustering approach for schemes that perform well over the range of correlations.
We develop a simple scheme based on static, localized clustering that generalizes these
techniques. Analysis shows that the nature of optimal routing will depend on the number
of nodes, level of correlation and also on where the compression is effected; at the individ-
ual nodes or at intermediate aggregation points (cluster heads). Our main contribution
is a surprising result that there exists a near-optimal cluster size that performs well over
a wide range of spatial correlations. A min-max optimization metric for the near-optimal
performance is defined and a rigorous analysis of the solution is presented for both 1-D
(line) and 2-D (grid) network topologies. We show further that this near-optimal size is in
fact asymptotically optimal in the sense that, for any constant correlation level, the ratio
of the energy costs associated with the near-optimal cluster size to those associated with
the optimal clustering goes to one as the network size increases. Simulation experiments
confirm that the results hold for more general topologies - 2-D random geometric graphs
and realistic wireless communication topology with lossy links, and also for a continuous,
Gaussian data model for the joint entropy with varying quantization.
From a system-engineering perspective, this is a very desirable result because it elim-
inates the need for highly sophisticated compression-aware routing algorithms that adapt
to changing correlations in the environment (which may even incur additional overhead
for adaptation), and therefore simplifies the overall system design.
3.1 Assumptions and Methodology
Our focus is on applications which involve continuous data gathering for large scale and
distributed physical phenomena using a dense wireless sensor network where joint routing
and compression techniques would be useful. An example of this is the collection of data
from a field of weather sensors. If the nodes are densely deployed, the readings from
nearby nodes are likely to be highly correlated and hence contain redundancies, because
of the inherent smoothness or continuity properties of the physical phenomenon.
To compare and evaluate different routing with compression schemes, we will need a
common metric. Our focus is on energy expenditure, and we have therefore chosen to
use the bit-hop metric. This metric counts the total number of bit transmissions in the
network for one round of gathering data from all sources. Formally, let T = (V,E, ξT )
represent the directed aggregation tree (a subgraph of the communication graph) corre-
sponding to a particular routing scheme with compression, which connects all sources to
the sink. Associated with each edge e = (u, v) is the expected number of bits be to be
transported over that edge in the tree (per cycle). For edges emanating from sources that
are leaves on the tree, the bit count is the amount of data generated by a single source.
For edges emanating from aggregation points, the outgoing edge may have a smaller bit
count than the sum of bits on the incoming edges, due to aggregation. For nodes that
are neither sources nor aggregation points but act solely as routers, the outgoing edge will
contain the same number of bits as the incoming edge. The bit-hop metric $\xi_T$ is simply:

$$\xi_T = \sum_{e \in E} b_e. \qquad (3.1)$$
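A minimal sketch of evaluating this metric, assuming the aggregation tree and the expected per-edge bit counts $b_e$ are already known (the node names and bit counts below are illustrative):

```python
# The bit-hop cost is simply the sum of the expected bits carried on
# every edge of the aggregation tree in one gathering cycle.

def bit_hop_cost(edge_bits):
    """edge_bits maps directed edges (u, v) to expected bits b_e."""
    return sum(edge_bits.values())

# Three sources feed aggregation point 'a'; due to compression, the
# edge from 'a' to the sink 's' carries fewer bits than the sum of
# its incoming edges.
edges = {("x1", "a"): 8, ("x2", "a"): 8, ("x3", "a"): 8, ("a", "s"): 12}
assert bit_hop_cost(edges) == 36
```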
There are two possible criticisms of this metric that we should address directly. The
first is that the total transmissions may not capture the “hot-spot” energy usage of
bottleneck nodes, typically near the sink. However, an alternative metric that better
captures hot-spot behavior is not necessarily relevant if the a priori deployment and
energy placement ensure that the bottlenecks are not near the sink or if the sink changes
over time. The second possible criticism is that this does not incorporate reception costs
explicitly. However, the use of the bit-hop metric is justified because it does in fact implicitly
incorporate reception costs. If every bit transmission incurs the same corresponding
reception cost in the network, the sum of these transmission and reception costs will be
proportional to the total number of bit-hops.
To quantify the bit-hop performance of a particular scheme, therefore, we need to
quantify the amount of information generated by sources and by the aggregation points
after compression. For this purpose we use the entropy H of a source, which is a measure
of the amount of information it originates [CT91]. In this chapter, we consider only lossless
compression of data. In order to characterize correlation in an application-independent
manner, we use the joint entropy of multiple sources to measure the total uncorrelated
data they originate. Theoretically, using entropy-coding techniques this is the maximum
possible lossless compression of the data from these sources. We now attempt to construct
a parsimonious model to capture the essential nature of joint entropy of sources as a
function of distance. The simplicity of this approximation model enables the analysis
presented in the following sections.
In general, the extent of correlation in the data from different sources can be expected
to be a function of the distance between them. We used an empirical data-set pertaining
Figure 3.1: Empirical data (from the rainfall data-set) and approximation for joint entropy of linearly placed sources separated by different distances. The H2, H3 and H4 curves are each fit with c = 25 (RMS errors .03, .09 and .055).
to rainfall1 [WB99] to examine the amount of correlation in the readings of two sources
placed at different distances from each other. Since rainfall measurements are a contin-
uous valued random variable and hence would have infinite entropy, we present results
obtained from quantization. The range of values was normalized for a maximum value
of 100 and all readings ‘binned’ into intervals of size 10. Fig.3.1 is a plot of the average
joint entropy of multiple sources as a function of inter-source distance.
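The processing just described can be sketched as follows; the toy readings, the normalization, and the plug-in entropy estimator are illustrative stand-ins for the actual rainfall data pipeline:

```python
# Sketch of the quantization step: normalize readings to [0, 100],
# bin into intervals of width 10, and estimate (joint) entropy from
# the empirical bin frequencies. Readings are purely illustrative.
import math
from collections import Counter

def quantize(readings, bin_width=10, max_value=100.0):
    """Normalize readings to [0, max_value] and map each to a bin index."""
    top = max(readings)
    return [int(min(r / top * max_value, max_value - 1e-9) // bin_width)
            for r in readings]

def empirical_entropy(symbols):
    """Plug-in entropy estimate (in bits) from empirical frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(k / n * math.log2(k / n) for k in counts.values())

# Joint entropy of two co-located series: treat paired bins as one symbol.
a = quantize([0.0, 3.1, 7.9, 2.2, 9.5, 1.1])
b = quantize([0.1, 3.0, 8.2, 2.0, 9.9, 1.0])
h_a = empirical_entropy(a)
h_joint = empirical_entropy(list(zip(a, b)))
assert h_a <= h_joint + 1e-12 <= h_a + empirical_entropy(b) + 1e-12
```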
The figure shows a steeply rising convex curve that reaches saturation quickly. This
is expected since the inter-source distance is large (in multiples of 50km). From the
empirical curve, a suitable model for the average joint entropy of two sources (H2) as a
function of inter-source distance d is obtained as:
$$H_2(d) = H_1 + \left[1 - \frac{1}{\frac{d}{c} + 1}\right] H_1. \qquad (3.2)$$
1This data-set consists of the daily rainfall precipitation for the Pacific Northwest region over a period of 46 years. The final measurement points in the data-set formed a regular grid of 50km x 50km regions over the entire region under study. Although this is considerably larger-scale than the sensor networks of interest to us, we believe the use of such "real" physical measurements to validate spatial correlation models is important.
Here c is a constant that characterizes the extent of spatial correlation in the data. It
is chosen such that when d = c, $H_2 = \frac{3}{2}H_1$. In other words, when inter-source distance
d = c, the second source generates half the first node's amount in terms of uncorrelated
data. In Fig.3.1, a value of c = 25 matches the H2 curve well.
Finally, this leaves open the question of how to obtain a general expression for the joint
entropy of n sources at arbitrary locations. As we shall show later, this is needed in order
to study the performance of various strategies for combined routing and compression. To
this end, we now present a constructive technique to calculate approximately the total
amount of uncorrelated data generated by a set of n nodes.
From Eqn.3.2, it appears that on average, each new source contributes an amount of
uncorrelated data equal to $\left[1 - \frac{1}{\frac{d}{c} + 1}\right] H_1$, where we take d as the minimum distance
to an existing set of sources. This suggests a constructive iterative technique to calculate
approximately the total amount of uncorrelated data generated by a set of n nodes:
1. Initialize a set $S_1 = \{v_1\}$, where $v_1$ is any node. We will denote by $H(S_i)$ the joint
entropy of the nodes in set $S_i$, with $H(S_1) = H_1$. Let V be the set of all nodes.

2. Iterate the following for i = 2 : n

(a) Update the set by adding a node $v_i$, where $v_i \in V \setminus S_{i-1}$ is the closest (in terms
of Euclidean distance) of the nodes not in $S_{i-1}$ to any node in $S_{i-1}$, i.e. set
$S_i = S_{i-1} \cup \{v_i\}$.

(b) Let $d_i$ be the shortest distance between $v_i$ and the set of nodes in $S_{i-1}$. Then
calculate the joint entropy as $H(S_i) = H(S_{i-1}) + \left[1 - \frac{1}{\frac{d_i}{c} + 1}\right] H_1$.

3. The final iteration yields $H(S_n)$ as an approximation of $H_n$.
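The constructive technique above can be implemented directly. The sketch below uses illustrative coordinates and parameters, and checks the greedy procedure against the closed-form expression for the equally spaced linear case (Eqn. 3.3 in the text).

```python
# Greedy joint-entropy approximation: repeatedly add the node closest
# to the current set and charge it (1 - 1/(d/c + 1)) * H1 of new
# information. Coordinates, c, and H1 are illustrative.
import math

def approx_joint_entropy(points, c, h1=1.0):
    remaining = list(points)
    current = [remaining.pop(0)]          # S1 = {v1}
    h = h1                                # H(S1) = H1
    while remaining:
        # nearest remaining node to the current set, and its distance
        d, v = min((min(math.dist(u, w) for w in current), u)
                   for u in remaining)
        h += (1.0 - 1.0 / (d / c + 1.0)) * h1
        current.append(v)
        remaining.remove(v)
    return h

# Linear, equally spaced check against the closed form
# Hn(d) = H1 + (n - 1) * (1 - 1/(d/c + 1)) * H1, with H1 = 1:
n, d, c = 5, 1.0, 25.0
approx = approx_joint_entropy([(i * d, 0.0) for i in range(n)], c)
closed = 1.0 + (n - 1) * (1.0 - 1.0 / (d / c + 1.0))
assert abs(approx - closed) < 1e-9
```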
In the simple case when all nodes are located on a line equally spaced by a distance
d, this procedure would yield the expression:
$$H_n(d) = H_1 + (n-1)\left[1 - \frac{1}{\frac{d}{c} + 1}\right] H_1. \qquad (3.3)$$
That this closed-form expression provides a good approximation for a linear scenario is
validated by our measurements from the rainfall data set, as seen in Fig.3.1. The curve
for H3 was obtained by considering all sets of grid points (p1, p2, p3) such that they lie in
a straight line with the distance between two adjacent points plotted on the x-axis. The
curve for H4 was similarly obtained using all sets of 4 points.
3.1.1 Note on Heuristic Approximation
We note that the final approximation $H(S_n)$ is guaranteed to be greater than the true joint
entropy $H(v_1, v_2, \ldots, v_n)$. Thus it does represent a rate achievable by lossless compression.
The approximation roughly corresponds to a rate allocation of $H(v_i \mid \eta_{v_i})$ at every node $v_i$,
where $\eta_{v_i}$ is the nearest neighbor of $v_i$. A more precise information-theoretic treatment
in terms of the rate allocations at each node is possible, for instance, as in [CBLV04,
CBLVW06]. We relinquish some rigor with the objective of gaining practical insight. This
approach makes the problem more tractable and is the basis for analysis in subsequent
sections.
Another point of contention is the need for such a heuristic approach instead of using
a continuous data model with analytical expressions for the joint entropy. In this regard,
we note that (a) our model matches the standard jointly Gaussian
entropy model for low correlation [Appendix ??] and (b) since the standard expression is
in covariance form, it cannot be used for high correlation values, necessitating a reasonable
approximation.
3.2 Routing Schemes
Given this framework, we can now evaluate the performance of different routing schemes
across a range of spatial correlations. We choose three qualitatively different routing
schemes; these schemes are simplified models of schemes that have been proposed in the
literature.
1. Distributed Source Coding (DSC): If the sensor nodes have perfect knowledge about
their correlations, they can encode/compress data so as to avoid transmitting re-
dundant information. In this case, each source can send its data to the sink along
the shortest path possible without the need for intermediate aggregation. Since
we ignore the cost of obtaining this global knowledge, our model for DSC is very
idealized and provides a baseline for evaluating the other schemes.
2. Routing Driven Compression (RDC): In this scheme, the sensor nodes do not have
any knowledge about their correlations and send data along the shortest paths to the
sink while allowing for opportunistic aggregation wherever the paths overlap. Such
shortest path tree aggregation techniques are described, for example, in [IEGH02]
and [KEW02].
3. Compression Driven Routing (CDR): As in RDC, nodes have no knowledge of the
correlations but the data is aggregated close to the sources and initially routed so
Figure 3.2: Illustration of routing for the three schemes: DSC, CDR, and RDC. Hi is the joint entropy of i sources.
as to allow for maximum possible aggregation at each hop. Eventually, this leads
to the collection of data removed of all redundancy at a central source from where
it is sent to the sink along the shortest possible path. This model is motivated by
the scheme in [SS05].
3.2.1 Comparison of the schemes
Consider the arrangement of sensor nodes in a grid, where only the 2n− 1 nodes in the
first column are sources. We assume that there are n1 hops on the shortest path between
the sources and the sink. For each of the three schemes, the paths taken by data and the
intermediate aggregation are shown in Fig.3.2.
In our analysis, we ignore the costs incurred by each compressing node in learning the
relevant correlations. This cost is particularly high in DSC, where each node must learn
the correlations with all other source nodes. However the bit-hop cost still provides a
useful metric for evaluating the performance of the various schemes and allows us to treat
DSC as the optimal policy providing a lower-bound on the bit-hop metric.
Using the approximation formulae for joint entropy and the bit-hop metric for energy,
the expressions for the energy expenditure (E) for each scheme are as follows.
For the idealized DSC scheme, each source is able to send exactly the right amount
of uncorrelated data, and each source can send the data along the shortest path to the
sink, so that:
$$E_{DSC} = n_1 H_{2n-1}. \qquad (3.4)$$
Lemma 3.2.1. EDSC represents a lower bound on bit-hop costs for any possible routing
scheme with lossless compression.
Proof: The total joint information of all (2n− 1) sources is H2n−1. As discussed before,
no lossless compression scheme can reduce the total information transmitted below this
level. Each bit of this information must travel at least n1 hops to get from any source to
the sink. Thus n1H2n−1, the cost of the idealized DSC scheme, represents a lower bound
on all possible routing schemes with lossless compression.
In the RDC scheme, the tree is as shown in Fig.3.2 (middle), with data being com-
pressed along the spine in the middle. It is possible to derive an expression for this
scenario:
$$E_{RDC} = (n_1 - n) H_{2n-1} + 2 H_1 \sum_{i=1}^{n-1} i + \sum_{j=0}^{n-2} H_{2j+1}. \qquad (3.5)$$
Figure 3.3: Comparison of energy expenditures for the RDC, CDR and DSC schemes with respect to the degree of correlation c (energy usage in bit-hops vs. the correlation parameter c on a log scale).
For the CDR scheme, the data is compressed along the location of the sources, and
then sent together along the middle, as shown in Fig. 3.2. It can be shown that for this
scenario:
$$E_{CDR} = n_1 H_{2n-1} + 2 \sum_{i=1}^{n-1} H_i. \qquad (3.6)$$
The above expressions, in conjunction with the expression for Hn presented earlier,
allow us to quantify the performance of each scheme. Fig.3.3 plots the energy expenditure
for the DSC, RDC and CDR schemes as a function of the correlation constant c, for
different forms of the correlation function. For these calculations, we assumed a grid
with n1 = n = 53 and 2n − 1 = 105 sources. From this figure it is clear that CDR
approaches DSC and outperforms RDC for higher values of c (high correlation) while
RDC performance matches DSC and outperforms CDR for low c (no correlation). This
can be intuitively explained by the tradeoff between compressing close to the sources and
transporting information toward the sink. CDR places a greater emphasis on maximizing
the amount of compression close to the sources, at the expense of longer routes to the
sink, while RDC does the reverse. When there is no correlation in the data (small c),
no compression is possible and hence it is RDC that minimizes the total bit-hop metric.
When there is high correlation (large c), significant energy gains can be realized by
compressing as close to the sources as possible and hence CDR performs better under
these conditions.
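The crossover described above can be reproduced numerically from expressions (3.4)-(3.6) under the unit-spacing entropy model $H_k = H_1(1 + \frac{k-1}{1+c})$; this sketch assumes H1 = 1 and the grid dimensions used for Fig. 3.3.

```python
# Evaluate the bit-hop expressions (3.4)-(3.6) with n1 = n = 53,
# i.e. 2n - 1 = 105 sources, under H_k = H1 * (1 + (k - 1)/(1 + c)).

def H(k, c, h1=1.0):
    return h1 * (1.0 + (k - 1) / (1.0 + c))

def e_dsc(n, n1, c):
    return n1 * H(2 * n - 1, c)

def e_rdc(n, n1, c):
    return ((n1 - n) * H(2 * n - 1, c)
            + 2 * sum(range(1, n))                     # 2 * H1 * sum of i
            + sum(H(2 * j + 1, c) for j in range(n - 1)))

def e_cdr(n, n1, c):
    return n1 * H(2 * n - 1, c) + 2 * sum(H(i, c) for i in range(1, n))

n = n1 = 53
low, high = 0.1, 100.0
# RDC tracks DSC at low correlation; CDR approaches DSC at high correlation.
assert e_rdc(n, n1, low) < e_cdr(n, n1, low)
assert e_cdr(n, n1, high) < e_rdc(n, n1, high)
assert e_dsc(n, n1, high) <= e_cdr(n, n1, high)
```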
Interestingly, it appears that neither RDC nor CDR performs well for intermediate
correlation values. This suggests that in this range a hybrid scheme may provide energy-
efficient performance closer to the DSC curve. CDR and RDC can be viewed as two
extremes of a clustering scheme, with CDR having all data sources form a single aggre-
gation cluster before sending data towards the sink while RDC has each source acting as
a separate cluster in itself. A hybrid scheme would be one in which sources form small
clusters and data is aggregated within them at a cluster head, which then sends data
to the sink along a shortest path. This insight leads us to an examination of suitable
clustering techniques.
3.3 A Generalized Clustering Scheme
The idea behind using clustering for data routing is to achieve a tradeoff between aggre-
gating near the sources and making progress towards the sink. In addition to factors like
number of nodes and position of sink, the optimal cluster size will also depend on the
amount of correlation in the data originated by the sources (quantified by the value of
c). Generally, the amount of correlation in the data is highest for sensor nodes located
close to each other and can be expected to decrease as the separation between nodes
increases. Once an optimal clustering based on correlations is obtained, aggregation of
data is required only for the sources within a cluster, after which data can be routed to
the sink without the need for further aggregation. As a consequence, none of the scenarios
considered henceforth will resemble RDC exactly.
3.3.1 Description of the scheme
We now describe a simple, location-based clustering scheme. Given a sensor field and
a cluster size, nodes close to each other form clusters. The clusters so formed remain
static for the lifetime of the network. Within each cluster, the data from each of the
nodes is routed along a shortest path tree (SPT) to a cluster head node. This node then
sends the aggregated data from its cluster to the sink along a multi-hop path with no
intermediate aggregation. This is illustrated in Fig. 3.4. The intermediate nodes on the
SPT may or may not perform aggregation. Data aggregation in the form of compression
is computationally intensive. Not all nodes in a network may be capable of performing
compression, either because it is too expensive for them to do so or because the delays involved
are unacceptable. It is conceivable that there will be a few high power nodes or micro-
servers [HCJB04] which will perform the compression. Nodes form clusters around these
nodes and route data to them. In this case, data aggregation takes place only at the
cluster head.
3.3.1.1 Metrics for evaluation of the scheme
Es(c) is defined as the energy cost (in bit-hops) for correlation c and cluster size s.
The optimal cluster size $s_{opt}(c)$ minimizes the cost for a given c. Let $E^*(c) = E_{s_{opt}(c)}(c)$
Figure 3.4: Illustration of clustering for a two-dimensional field of sensors, showing intra-cluster routing from sources to a cluster-head and extra-cluster routing of the compressed data to the sink.
represent the optimal energy cost for a given correlation c. For simplifying system design,
it is desirable to have a cluster size that performs close to the optimal over the range of
c values. We quantify the notion of ‘being close to optimal’ by defining a near-optimal
cluster size sno as the value of s that minimizes the maximum difference metric, i.e.
$$s_{no} = \arg\min_{s \in [1,n]} \max_{c \in [0,\infty)} \; E_s(c) - E^*(c). \qquad (3.7)$$
In the following sections, we analyze the performance of the clustering scheme for
both 1-D and 2-D networks when aggregation is performed
• at intermediate nodes on the SPT, and
• only at the cluster-heads.
3.3.2 1-D Analysis
We begin with an analysis of the energy costs of clustering for a setup involving a linear
array of sources to better understand the tradeoffs. Consider n source nodes linearly
placed with unit spacing (i.e. d = 1) on one side of a 2-D grid of nodes, with the sink on
the other side, and assuming the correlation model $H_n = H_1\left(1 + \frac{n-1}{1+c}\right)$. We consider n/s
clusters, each consisting of s nodes. Since all sources have the same shortest hop distance
to the sink, the position of the cluster head within a cluster has no effect on the results.
Within each cluster, the data can either be compressed sequentially on the path to the
cluster head or only when it reaches the cluster head. The cluster head then sends the
compressed data along a shortest path involving D hops to the sink. The total bit-hop
cost for such a routing scheme is therefore
$$E_s(c) = \frac{n}{s}\left(E^{intra}_{s,c} + E^{extra}_{s,c}\right), \qquad (3.8)$$

where $E^{intra}_{s,c}$ and $E^{extra}_{s,c}$ are, respectively, the bit-hop cost within each cluster and the
bit-hop cost for each cluster to send the aggregate information to the sink.
3.3.2.1 Sequential compression along SPT to cluster head
At each hop within the cluster, a node receives Hi bits, aggregates them with its own
data and transmits Hi+1 bits. This is done sequentially until the data reaches the cluster
head. We have,
$$E^{intra}_{s,c} = \sum_{i=1}^{s-1} H_i = \sum_{i=1}^{s-1}\left(1 + \frac{i-1}{1+c}\right)H_1 = \left(s - 1 + \frac{(s-2)(s-1)}{2(1+c)}\right)H_1.$$
Since the cluster heads get aggregated data from s sources and send it to the sink using
a shortest path of D hops,
$$E^{extra}_{s,c} = H_s \cdot D = \left(1 + \frac{s-1}{1+c}\right)H_1 \cdot D$$

$$\Rightarrow \; E_s(c) = nH_1\left(\frac{s-1}{s} + \frac{(s-2)(s-1)}{2s(1+c)} + \frac{D}{s} + \frac{(s-1)D}{s(1+c)}\right). \qquad (3.9)$$
The optimum value of the cluster size sopt can be determined by setting the derivative
of the above expression equal to zero. It can be shown that
$$s_{opt} = \begin{cases} 1, & \text{if } c \le \dfrac{1}{2(D-1)} \\[2ex] \sqrt{2c(D-1)}, & \text{if } \dfrac{1}{2(D-1)} < c < \dfrac{n^2}{2(D-1)} \\[2ex] n, & \text{if } c \ge \dfrac{n^2}{2(D-1)}. \end{cases}$$
Note that sopt depends on the distance from the sources to the sink2 and the degree of
correlation c.
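The case analysis behind s_opt can be validated by brute force against Eq. 3.9. A sketch under the same setup (helper names are ours):

```python
import math

def cost_39(s, c, n, D, h1=1.0):
    """Bit-hop cost E_s(c) of Eq. 3.9 (sequential compression to the cluster head)."""
    return n * h1 * ((s - 1) / s + (s - 2) * (s - 1) / (2 * s * (1 + c))
                     + D / s + (s - 1) * D / (s * (1 + c)))

def s_opt(c, n, D):
    """Piecewise-optimal cluster size for Eq. 3.9."""
    if c <= 1 / (2 * (D - 1)):
        return 1
    if c >= n * n / (2 * (D - 1)):
        return n
    return math.sqrt(2 * c * (D - 1))
```

A brute-force scan of cost_39 over integer s ∈ [1, n] lands within one unit of the closed form for any c, which is a quick way to validate the three cases.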
Fig. 3.5 shows, based on the analysis presented above, how different cluster sizes perform across a range
of correlation levels for a set of 105 linearly placed nodes. As expected, small cluster sizes and large cluster sizes perform well at low
and high correlations, respectively. However, an intermediate cluster size
near 15 performs well across the whole range of correlation values. The curve with
s = 105 corresponds to CDR, and the DSC curve is also plotted for reference.
2It is, however, assumed that D ≥ n, so there is an implicit dependence on n.
Theorem 3.3.1. For E_s(c) given by Equation 3.9, the near-optimal cluster size s_no defined by Equation 3.7 exists, and is given by

s_no = Θ(min(√D, n)).
The following lemma is required for proving the theorem.
Lemma 3.3.2. To solve the optimization problem in Eqn. 3.7 for E_s(c) given by Eqn. 3.9,
it suffices to find s = s_no such that

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).    (3.10)
Proof. We first show that for any arbitrary s, this difference is maximum at one of the
two extremes (i.e., at c = 0 and c → ∞). Let

E^d_s(c) = E_s(c) − E*(c) = E_s(c) − E_{s_opt}(c)
         = nH_1 (s − s_opt)(s·s_opt − 2c(D−1)) / (2 s·s_opt (1+c)).

Then

∂E^d_s(c)/∂c = −nH_1 (s − 1)(s + 2(D−1)) / (2s(1+c)²),                 if c ≤ 1/(2(D−1))
             = −nH_1 (s − √(2c(D−1)))(s + √(2(D−1)/c)) / (2s(1+c)²),   if 1/(2(D−1)) < c < n²/(2(D−1))
             = −nH_1 (s − n)(s·n + 2(D−1)) / (2s·n(1+c)²),             if c ≥ n²/(2(D−1)).

E^d_s(c) and its derivative vanish for the same values of c, and since E^d_s(c) is non-negative,
the minimum is achieved at these values of c.
The derivative is continuous for all s ∈ [1, n], and
• for a particular value of s ∈ (1, n), it is zero for only one value of c;
• for s = 1, it is zero only for c ∈ [0, 1/(2(D−1))];
• for s = n, it is zero only for c ∈ [n²/(2(D−1)), ∞).
From the non-negativity of Eds (c) and the above properties of its derivative, we can
conclude that:
• for s ∈ (1, n), E^d_s(c) is convex
• for s = 1, it is monotonically increasing
• for s = n, it is monotonically decreasing.
This implies that E^d_s(c) is maximum either at c = 0 or at c = ∞, and Eqn. (3.7) reduces
to

min_{s∈[1,n]} max( E_s(0) − E*(0), E_s(∞) − E*(∞) ).    (3.11)
From Eqn. (3.9), we can derive the following expressions for the energy costs of clustering
schemes at the two extreme correlation values:

E_s(0) = nH_1 ( (s−1)/2 + D )
E*(0) = nH_1 D
E_s(∞) = nH_1 ( 1 + (D−1)/s )
E*(∞) = nH_1 ( 1 + (D−1)/n ).    (3.12)
Substituting Eqn. (3.12) in Eqn. (3.11) and disregarding common factors, we obtain:

min_{s∈[1,n]} max( (s−1)/2, (D−1)/s − (D−1)/n ).    (3.13)
Let f1(s) = (s−1)/2 and f2(s) = (D−1)/s − (D−1)/n. We have

max(f1, f2)|_{s=1} = f2(1)
max(f1, f2)|_{s=n} = f1(n).
For s ∈ (1, n), f1, f2 are continuous, f1 is increasing and f2 is decreasing. Therefore,
max(f1, f2) achieves its minimum for s = s_no such that

f1(s_no) = f2(s_no),

i.e. E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).
Proof of Theorem 3.3.1: Solving f1(s_no) = f2(s_no), we get

(s_no − 1)/2 = (D−1)/s_no − (D−1)/n
⇒ s_no² + ( 2(D−1)/n − 1 ) s_no − 2(D−1) = 0
⇒ s_no = √( 2(D−1) + ((D−1)/n − 1/2)² ) − ( (D−1)/n − 1/2 )
       = Θ(min(√D, n)).
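For the n = D = 105 example, this closed form gives s_no ≈ 14. A quick numerical check (helper name ours) that the root balances the two extreme penalties:

```python
import math

def s_no(n, D):
    """Near-optimal cluster size: positive root of s^2 + (2(D-1)/n - 1)s - 2(D-1) = 0."""
    b = (D - 1) / n - 0.5
    return math.sqrt(2 * (D - 1) + b * b) - b
```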
Figure 3.5: Comparison of the performance of different cluster sizes for a linear array of sources (n = D = 105) with compression performed sequentially along the path to cluster heads. The optimal cluster size is a function of the correlation parameter c. Also, cluster size s = 15 performs close to optimal over the range of c. (Axes: correlation parameter c on a log scale vs. transmission cost E_s(c) in bit-hops; curves for s = 1, 3, 7, 15, 35, 105 (CDR), and DSC.)
This is illustrated in Fig.3.6, in which the costs are plotted with respect to the cluster
sizes for a few different values of the spatial correlation. The figure shows clearly that
although the optimal cluster size does increase with correlation level, the near-optimal
static cluster size performs very well across a range of correlation values. In this figure,
D = n = 105, and the near-optimal cluster size obtained from Theorem 3.3.1, s_no = 14, is
indicated by the vertical line in the plot. Intersections of the dotted lines and the nearest
c curve with this vertical line show the difference in energy cost between the near-optimal
and optimal solutions.
Figure 3.6: Illustration of the existence of a static cluster size for near-optimal performance across a range of correlations. The sources are in a linear array and data is sequentially compressed along the path to cluster heads. (Axes: cluster size s on a log scale vs. transmission cost E_s(c) in bit-hops; curves for c = 0.01, 1, 2, 5, 10, 100, with s_opt(c) and s_no marked.)
3.3.2.2 Compression at cluster head only
In this case, each source within a cluster sends data to the cluster head using a shortest
path. There is no aggregation before reaching the cluster head. We have,
E^{intra}_{s,c} = Σ_{i=1}^{s−1} i · H_1 = ( s(s−1)/2 ) H_1

E^{extra}_{s,c} = (1 + (s−1)/(1+c)) H_1 · D

⇒ E_s(c) = nH_1 ( (s−1)/2 + D/s + (s−1)D/(s(1+c)) ).    (3.14)
It can be shown that

s_opt = 1,               if c ≤ 1/(2D − 1)
      = n,               if c > n²/(2D − n²) and 2D > n²
      = √(2Dc/(c+1)),    otherwise.
Fig. 3.7 shows that for a linear array of sources (with n = D = 105), the performance
for cluster sizes s = 5, 7 is close to optimal over the range of c. The DSC curve is plotted
for reference.
Theorem 3.3.3. For E_s(c) given by Equation 3.14, the near-optimal cluster size s_no
defined by Equation 3.7 exists, and is given by

s_no = Θ(min(√D, n)).
The following lemma is required for proving the theorem.
Lemma 3.3.4. The near-optimal cluster size s = s_no for E_s(c) given by Eqn. 3.14 satisfies
the condition

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).
Proof. The proof is similar to the proof of Lemma 3.3.2, with

f1(s) = ( E_s(0) − E*(0) ) / (nH_1) = (s−1)/2,  and

f2(s) = ( E_s(∞) − E*(∞) ) / (nH_1)
      = s/2 + D/s − √(2D),       if 2D ≤ n²
      = (s−n)/2 + D/s − D/n,     otherwise.
Figure 3.7: Performance with compression only at the cluster head, with nodes in a linear array (n = D = 105). Cluster sizes s = 5, 7 are close to optimal over the range of c. (Axes: correlation parameter c on a log scale vs. transmission cost E_s(c) in bit-hops; curves for s = 1, 3, 5, 7, 15, 105, and DSC.)
Proof of Theorem 3.3.3: Using Lemma 3.3.4 and solving

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞)

for E_s(c) given by Eqn. 3.14, we get

s_no = 2D/(2√(2D) − 1)  (≈ √(D/2)),   if 2D < n²
     = 2Dn/(2D + n(n−1)),             otherwise.

It can be verified that

s_no = Θ(√D)  if D = o(n²)
     = n      if D = Ω(n²).
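The same kind of numerical check applies here (helper name ours): for D = 105, the closed form gives s_no ≈ 7.5, consistent with the s = 7 ≈ √(105/2) read off Fig. 3.8.

```python
import math

def s_no_head_only(D, n):
    """Near-optimal cluster size when compression happens only at the cluster head."""
    if 2 * D < n * n:
        return 2 * D / (2 * math.sqrt(2 * D) - 1)   # approximately sqrt(D/2)
    return 2 * D * n / (2 * D + n * (n - 1))
```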
Figure 3.8: Illustration of the near-optimal cluster size with compression only at the cluster head, with nodes in a linear array. The performance of cluster sizes near s = 7 (≈ √(105/2)) is close to optimal over the range of c values. (Axes: cluster size s on a log scale vs. transmission cost E_s(c) in bit-hops; curves for c = 0.01, 0.5, 1.0, 2.0, 10, 100, 10000, with s_opt(c) and s = (n/2)^{1/2} marked.)
The existence of a near-optimal cluster size is illustrated in Fig. 3.8. The performance
of cluster sizes near s = 7 is close to optimal over the range of c values.
3.3.3 2-D analysis
Consider a 2-D network in which N = n² nodes are placed on an n × n unit grid and are
divided into clusters of size s × s. We assume that each node can communicate directly
only with its 8 immediate neighbors. The routing pattern within a cluster and from
the cluster heads to the sink is similar and is illustrated in Fig. 3.9. Note that using the
iterative approximation described in Section 3.1, the joint entropy of k adjacent3 nodes
on a grid is the same as the joint entropy of k sensors lying on a straight line. Fig. 3.9(a)
illustrates this along the diagonal.
The results for the linear array of sources do not extend directly to a two-dimensional
arrangement where every node is both a source and a router. In the 1-D case, the optimal
3nodes forming a contiguous set
Figure 3.9: Intra-cluster routing in a 2-D grid arrangement. (a) Opportunistic compression along the shortest path to the cluster head. For calculation of joint entropy, using the iterative approximation, the joint entropy of k nodes forming a contiguous set is the same as the joint entropy of k sensors lying on a straight line. This is illustrated along the diagonal. (b) Compression only at the cluster head. The routing from cluster heads to the sink is similar to this case.
aggregation tree is different from the shortest path tree (except for the case with zero
correlation). This is because moving towards the sources allows greater compression than
moving towards the sink. In the 2-D case however, there are opportunities for compres-
sion in all directions. Hence, it is always possible to achieve compression while making
progress towards the sink.
3.3.3.1 Opportunistic compression along SPT to cluster head
According to the approximation we have been using for the joint entropy, the contribution
of a node v is H(v|η_v), where η_v is the nearest neighbor of v. If we assume that H(v|η_v)
is the fixed rate allocation for every node v, it follows4 that a network-wide SPT is the
4see [CBLV04] for a formal proof
optimal routing structure. In other words, the optimal cluster size s = n for all values of
correlation parameter c. There is no incentive for data to deviate from a shortest path
to the sink. The result is established more precisely in the following lemma.
Lemma 3.3.5. For a 2-D grid with opportunistic compression along an SPT to cluster
head, the optimal cluster size is s = n for any value of correlation parameter c ∈ [0,∞].
Proof. Consider a cluster of size s × s. The routing within the cluster is as shown in Fig.
3.9(a), and routing from the cluster head to the sink is as shown in Fig. 3.9(b). The routing costs
are obtained as follows:
E^{intra}_{s,c} = (n/s)² Σ_{i=1}^{s−1} ( 2(s−i) H_i + H_{i²} )
              = (n/s)² Σ_{i=1}^{s−1} ( 2(s−i)(1 + (i−1)/(1+c)) H_1 + (1 + (i²−1)/(1+c)) H_1 )
              = (n/s)² (s−1) ( s + 1 + (s−2)(4s+3)/(6(1+c)) ) H_1

E^{extra}_{s,c} = Σ_{i=0}^{n/s−1} Σ_{j=0}^{n/s−1} max{s·i, s·j} H_{s²}
              = s ( Σ_{i=0}^{n/s−1} Σ_{j=0}^{i} i + Σ_{i=0}^{n/s−1} Σ_{j=i+1}^{n/s−1} j ) (1 + (s²−1)/(1+c)) H_1
              = (n/6)(n/s − 1)(4n/s + 1)(1 + (s²−1)/(1+c)) H_1.
The total cost is

E_s(c) = E^{intra}_{s,c} + E^{extra}_{s,c}.
The routing cost for a network-wide SPT, i.e. with s = n, is

E_n(c) = E^{intra}_{n,c} + 0 = (n−1) ( n + 1 + (n−2)(4n+3)/(6(1+c)) ) H_1.
Now, for any s < n and any value of c, consider the difference

E_s(c) − E_n(c) = ( n/(6(1+c)) ) ( ( ns − n/s − s² + 1 ) + (c/s²)( 4n² − 3ns − s² − 6n + 6s²/n ) ).    (3.15)
It can be verified that the two terms

ns − n/s − s² + 1   and   4n² − 3ns − s² − 6n + 6s²/n

are positive for any value of s < n. Hence the difference in Eqn. 3.15 is always positive.
This implies that for all values of c ∈ [0, ∞], E_s(c) is minimized at s = n.
It should be noted that the optimality of a network-wide SPT obtained above is
contingent on two of our assumptions: 1. a grid topology, and 2. routing within clusters
is along an SPT. Cristescu et al. [CBLV04] and von Rickenbach et al. [vRW04] show results for
general graph topologies.
3.3.3.2 Compression at cluster head only
When compression is possible only at cluster heads, there is a definite tradeoff between progress
towards the sink and compression at intermediate points. Since there is no compression
before reaching and after leaving the cluster heads, shortest-path routing is optimal within
clusters and from cluster heads to the sink (Fig. 3.9(b)). Let E_s(c) be the total cost for a
network with cluster size s × s and correlation parameter c. E^{intra}_s and E^{extra}_s are defined
as the combined intra-cluster costs and the overall cost for routing from cluster heads to
the sink, respectively. From Fig. 3.9, a node at (i, j) will take max{i, j} hops to reach the
cluster head at (0, 0). Since there are (n/s)² clusters, we have
E^{intra}_{s,c} = (n/s)² Σ_{i=0}^{s−1} Σ_{j=0}^{s−1} max{i, j} H_1 = (n/s)² ( Σ_{i=0}^{s−1} Σ_{j=0}^{i} i + Σ_{i=0}^{s−1} Σ_{j=i+1}^{s−1} j ) H_1
              = (n/s)² ( Σ_{i=0}^{s−1} i(i+1) + Σ_{i=0}^{s−1} ( (i+1) + (i+2) + ... + (s−1) ) ) H_1
              = (n/s)² ( Σ_{i=0}^{s−1} i(i+1) + Σ_{i=0}^{s−1} ( (s−1)s/2 − i(i+1)/2 ) ) H_1
              = ( n²/(6s) ) (s−1)(4s+1) H_1.    (3.16)
Now, the shortest route between adjacent cluster heads is s hops. Hence,

E^{extra}_{s,c} = Σ_{i=0}^{n/s−1} Σ_{j=0}^{n/s−1} max{s·i, s·j} H_{s²} = s Σ_{i=0}^{n/s−1} Σ_{j=0}^{n/s−1} max{i, j} (1 + (s²−1)/(1+c)) H_1
              = (n/6)(n/s − 1)(4n/s + 1)(1 + (s²−1)/(1+c)) H_1,    (3.17)

using the expression for Σ Σ max{i, j} from Eqn. 3.16.
E_s(c) = E^{intra}_{s,c} + E^{extra}_{s,c}
       = [ ( n²/(6s) )(s−1)(4s+1) + (n/6)(n/s − 1)(4n/s + 1)(1 + (s²−1)/(1+c)) ] H_1.    (3.18)
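Eq. 3.18 is straightforward to evaluate directly; a minimal sketch (the function name is ours):

```python
def cost_318(s, c, n, h1=1.0):
    """Total bit-hop cost of Eq. 3.18 (2-D grid, compression only at cluster heads)."""
    intra = (n * n / (6.0 * s)) * (s - 1) * (4 * s + 1)
    extra = (n / 6.0) * (n / s - 1) * (4 * n / s + 1) * (1 + (s * s - 1) / (1.0 + c))
    return (intra + extra) * h1
```

Two sanity checks: the extra-cluster term vanishes at s = n (a single cluster), and the cost never increases with the correlation parameter.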
Fig. 3.10 shows the performance of the scheme for various cluster sizes for a 1000 × 1000
network. While the optimal cluster size depends on the value of c, we again find that
there are certain intermediate cluster sizes (s = 5, 10, 25) that perform near-optimally
over a wide range of spatial correlations.
It can be shown that

s_opt(c) = ( (8c/(4c+1)) n )^{1/3} + o(n^{1/3}).
Setting the partial derivative of E_s(c) with respect to s to zero,

∂E_s(c)/∂s = ( n/(6(c+1)) ) ( −2s + (4c+1)n + (4c+1)n/s² − 8cn²/s³ ) H_1 = 0

⇒ −2s³ + ns² + n = 0,  if c = 0
⇒ −2s⁴ + (4c+1)ns³ + (4c+1)ns − 8cn² = 0,  if c ≠ 0.    (3.19)

Differentiating again with respect to s,

∂²E_s(c)/∂s² = −(n/3)(1 + n/s³) H_1,  if c = 0    (3.20)
             = ( n/(3(c+1)s⁴) ) ( 12cn² − s⁴ − (4c+1)ns ) H_1,  if c ≠ 0.    (3.21)

If c = 0, the second derivative in Eqn. 3.20 is always negative, and hence the minimum
is achieved at the two extremities s = 1 and s = n. Therefore,

s_opt(0) = 1, n.    (3.22)
• If c > 0, for s = o(n^{1/2}), we have s⁴ = o(n²) and (4c+1)ns = o(n²). Solving Eqn. 3.19 under
this constraint,

(4c+1)ns³ − 8cn² + o(n²) = 0
⇒ s_opt(c) = ( (8c/(4c+1)) n )^{1/3} + o(n^{1/3}).    (3.23)

It can be verified that a minimum is achieved, since the second derivative in Eqn. 3.21
is positive for this value of s.

• If c > 0, for s = Ω(n^{1/2}), it can be verified that Eqn. 3.19 has no solution for s ≤ n.
Lemma 3.3.6. The near-optimal cluster size s = s_no for E_s(c) given by Eqn. 3.18 satisfies
the condition

E_{s_no}(0) − E*(0) = E_{s_no}(∞) − E*(∞).
The proof is similar to the proof of Lemma 3.3.2, with

f1(s) = ( E_s(0) − E*(0) ) / ( (n/6)H_1 ) − (n/s)(s−1)(4s+1)
      = −s² − 3ns + 3n + 1,  and

f2(s) = ( E_s(∞) − E*(∞) ) / ( (n/6)H_1 ) − (n/s)(s−1)(4s+1)
      = 4n²/s² − 3n/s − 6·2^{1/3} n^{4/3} + 3n + 2·2^{2/3} n^{2/3}.
Theorem 3.3.7. For E_s(c) given by Equation 3.18, the near-optimal cluster size

s_no = Θ(n^{1/3}) (≈ 0.6487 n^{1/3}).
Proof. From Eqns. 3.22 and 3.23, s_opt(0) = 1, n and s_opt(∞) → (2n)^{1/3}.
Using Lemma 3.3.6, the near-optimal cluster size s = s_no satisfies:

E_s(0) − E*(0) = E_s(∞) − E*(∞)

⇒ [ ( n²/(6s) )(s−1)(4s+1) + (n/6)(n/s − 1)(4n/s + 1)s² ] − [ (n/6)(n−1)(4n+1) ]
 = [ ( n²/(6s) )(s−1)(4s+1) + (n/6)(n/s − 1)(4n/s + 1) ]
   − [ ( n²/(6(2n)^{1/3}) )( (2n)^{1/3} − 1 )( 4(2n)^{1/3} + 1 ) + (n/6)( n/(2n)^{1/3} − 1 )( 4n/(2n)^{1/3} + 1 ) ].    (3.24)
Rearranging Eqn. 3.24 and factoring out n/(6s²), we get the condition:

s⁴ + 3ns³ − ( 6·2^{1/3} n^{4/3} + 3n + 2 )s² − 3ns + 4n² + o(n²) = 0.    (3.25)

Since s⁴ = o(ns³) and ns = o(n²), by factoring out n, Eqn. 3.25 reduces to

3s³ − 6·2^{1/3} n^{1/3} s² + 4n + o(s³) + o(n) = 0.    (3.26)

It can be verified that Eqn. 3.26 has only one non-negative solution,

s_no = 0.6487 n^{1/3} + o(n^{1/3}).
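The min-max construction can also be checked by brute force. This numeric sketch is ours: it scans integer cluster sides and approximates c → ∞ with a large finite value, so it validates only the order of magnitude of s_no, not the asymptotic constant:

```python
def cost(s, c, n):
    """E_s(c)/H_1 from Eq. 3.18."""
    return ((n * n / (6.0 * s)) * (s - 1) * (4 * s + 1)
            + (n / 6.0) * (n / s - 1) * (4 * n / s + 1) * (1 + (s * s - 1) / (1.0 + c)))

n, BIG = 1000, 1e12                          # BIG stands in for c = infinity
sides = range(1, n + 1)
e0 = min(cost(s, 0.0, n) for s in sides)     # E*(0)
einf = min(cost(s, BIG, n) for s in sides)   # E*(infinity)
s_no = min(sides, key=lambda s: max(cost(s, 0.0, n) - e0, cost(s, BIG, n) - einf))
```

For n = 1000 this scan lands on a small cluster side on the order of n^{1/3}, as the theorem predicts.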
Figure 3.10: Comparison of the performance of various cluster sizes for a network with 10^6 nodes on a 1000 × 1000 grid when compression is possible only at cluster heads. The performance for s = 5, 10 is observed to be close to optimal over the range of c values. (Axes: correlation parameter c on a log scale vs. transmission cost E_s(c) in bit-hops; curves for s = 1, 5, 10, 100, 200, 500.)
Figure 3.11: Illustration of the existence of a near-optimal cluster size. The network size is n × n = 1000 × 1000 and compression is possible only at cluster heads. The performance of cluster-side values near s = 0.6487 n^{1/3} is quite close to optimal for all values of c ranging from 0.0001 to 10000. (Axes: cluster side s on a log scale vs. transmission cost E_s(c) in bit-hops; curves for c = 0.0001, 0.1, 1.0, 10, 100, 10000, with s_opt(c), s = 0.6487 N^{1/3}, and s = (2N)^{1/3} marked.)
Fig. 3.11 illustrates the existence of the near-optimal cluster size for a network of 10^6
nodes on a 1000 × 1000 grid. Clearly, the transmission cost with cluster-side values near
s = 7 (= ⌈0.6487 n^{1/3}⌉) is quite close to the optimal for a large range of correlation parameter
c values.
3.4 Simulation Results
The analysis in Section 3.3 is based on simple and restricted communication, topology,
and joint entropy models. To verify the robustness of the conclusions from the analysis,
we present results from extensive simulation experiments with more general models. As
before, the network is deployed in an N × N area which is partitioned into grids of size
s × s, for s ∈ [1, N]. All nodes located within the same grid form a cluster.
3.4.1 Communication and Topology models
We consider more general communication and topology models, while using the same
entropy model as in the analysis. Nodes are deployed uniformly at random within the
network area. Each node is assumed to transmit 1 bit of data. The joint entropy of nodes
within a cluster is calculated using the iterative approximation technique described
in Section 3.1.
3.4.1.1 Random geometric graphs
In this model, all nodes that are within the communication radius can communicate with
each other over ideal, lossless links. Since each link has a unit cost, the routing cost is
calculated as:

intra-cluster cost = Σ_{all nodes in cluster} (node depth in cluster SPT)
extra-cluster cost = Σ_{all clusters in network} (cluster-head depth in network SPT) · (cluster joint entropy)
total cost = intra-cluster cost + extra-cluster cost.
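A compact sketch of this cost computation (all names are ours; for brevity the cluster joint entropy uses the H_k = H_1(1 + (k−1)/(1+c)) approximation on the cluster size rather than the distance-based iterative rule, and nodes sit on a small deterministic grid rather than being placed uniformly at random):

```python
import math
from collections import deque

def bfs_depths(adj, root):
    """Hop depth of every node reachable from root."""
    depth = {root: 0}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                q.append(v)
    return depth

# Nodes on a small grid with unit spacing; links within radius 1.5 (includes diagonals).
pts = [(x, y) for x in range(6) for y in range(6)]
R = 1.5
adj = {i: [j for j, q in enumerate(pts) if j != i and math.dist(p, q) <= R]
       for i, p in enumerate(pts)}

sink, s, c, H1 = 0, 3, 1.0, 1.0
net_depth = bfs_depths(adj, sink)            # network-wide SPT depths

cells = {}
for i, (x, y) in enumerate(pts):             # partition into s x s grid cells
    cells.setdefault((x // s, y // s), []).append(i)

total_intra = total_extra = 0.0
for members in cells.values():
    head = min(members, key=lambda i: net_depth[i])          # head closest to sink
    sub = {i: [j for j in adj[i] if j in members] for i in members}
    d = bfs_depths(sub, head)                                # cluster SPT
    total_intra += sum(d[i] for i in members)                # node depth in cluster SPT
    k = len(members)
    joint_h = H1 * (1 + (k - 1) / (1 + c))                   # cluster joint entropy
    total_extra += net_depth[head] * joint_h                 # head depth x joint entropy

total = total_intra + total_extra
```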
The simulation parameters are as follows:
• network sizes 24mx24m, 84mx84m, 240mx240m
• density of deployment = 1 node/m2
• communication radius = 3m
Figures 3.12 (a), (b), and (c) show the performance of clustering for the network sizes considered.
As predicted by the analysis, for a network of N nodes, N^{1/6} is a good estimate of the
near-optimal cluster size.
3.4.1.2 Realistic Wireless Communication model
We consider the model for lossy, low-power wireless links proposed in [ZK04a]. Link
costs are the average number of transmissions required for a successful transfer, and these
are used as weights for obtaining the shortest-path tree. The routing cost is calculated as:

intra-cluster cost = Σ_{all nodes in cluster} (node cost in cluster SPT)
extra-cluster cost = Σ_{all clusters in network} (cluster-head cost in network SPT) · (cluster joint entropy)
Figure 3.12: Random geometric graph topology. Performance of clustering with density = 1 node/m², communication radius = 3m for networks of size (a) 24×24, (b) 84×84, (c) 200×200. Near-optimal cluster sizes are (a) 3, 4; (b) 4, 7; (c) 8, 10. (Axes: correlation parameter on a log scale vs. transmission cost.)
The authors have made code available online for a topology generator based on the
model [ZK04b]. The parameters used in the simulations are as follows:
• network size = 48m x 48m, density of deployment = .25 nodes/m²
• random node placement
• NCFSK modulation, Manchester encoding
• PREAMBLE LENGTH = 2, FRAME LENGTH = 50
• NOISE FLOOR = -105.0; power levels: -3dB, -7dB and -10dB.
Figures 3.13 (a), (b), and (c) show the performance of clustering for the different power values.
For lower power, there is an increase in the routing cost since links become more
Figure 3.13: Realistic wireless communication topology. Performance of clustering in a 48m × 48m network with density = .25 nodes/m² for power level (a) -3dB, (b) -7dB, (c) -10dB. Cluster sizes 6, 8 are near-optimal. (Axes: correlation parameter on a log scale vs. transmission cost; curves for s = 2, 4, 6, 8, 12, 24.)
lossy. However, since proximity relationships between nodes do not change drastically,
the relative routing costs for different cluster sizes remain similar.
3.4.2 Joint entropy models
We now consider more general models for the joint entropy of sources, while using the
realistic lossy link model from Section 3.4.1.2. The routing cost is calculated using the same
equations, and simulations are performed with a power level of -3dB, all other parameters
remaining the same.
3.4.2.1 Linear and convex functions of distance
In the empirically obtained model, the joint entropy is a concave function of the distance
between sources. We also look at a linear function, for which

H_2(d) = H_1 + min(1, d/c) · H_1,

and a convex function, for which

H_2(d) = H_1 + min(1, d²/c²) · H_1.
Fig. 3.14 (a) illustrates the three forms of joint entropy functions for 2 sources. The
entropy of each source is normalized to 1 unit. The convex and linear curves are clipped
when the joint entropy equals the sum of individual entropies. Figures 3.14 (b) and (c)
show the performance of clustering.
3.4.2.2 Continuous, Gaussian data model
In order to verify that the results from the analysis and all earlier simulations are not an artifact
of the simple approximation models for joint entropy, we now consider a continuous,
jointly Gaussian data model and use its entropy as the metric for uncorrelated data in
Figure 3.14: (a) Example forms of joint entropy functions (concave, linear, convex) for 2 sources. The entropy of each source is normalized to 1 unit. The convex and linear curves are clipped when the joint entropy equals the sum of individual entropies. The curves shown are for correlation parameter c = 2. Performance of clustering in a 72m × 72m network with density = .25 nodes/m² for (b) the linear model and (c) the convex model of joint entropy. Cluster size 6 is near-optimal. (Panel (a) axes: inter-node distance vs. joint entropy; panels (b), (c): correlation parameter on a log scale vs. transmission cost.)
the routing cost calculations. The data is assumed to have a zero-mean jointly Gaussian
distribution X ∼ N_N(0, K), with unit variances σ_ii = 1:

f(X) = ( 1 / ( (2π)^{N/2} |K|^{1/2} ) ) e^{−(1/2) X^T K^{−1} X},

where K is the covariance matrix of X, with elements depending on the distance between
the corresponding nodes and the degree of correlation: K_ij = e^{−d_ij/c}, where d_ij is the
distance between nodes i and j and c is the correlation parameter. For this distribution
and with quantization step size δ, the entropy of a single source is [CT91]:

H_1 = (1/2) log2(2πe) − log2(δ)
and the joint entropy of n sources is:

H_n = (1/2) log2( (2πe)^n |K| ) − n log2(δ).

Since K becomes nearly singular for large c values, we clip H_n by using

H_n = max( (1/2) log2(2πe), (1/2) log2( (2πe)^n |K| ) ) − n log2(δ).
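These expressions can be evaluated directly; a sketch (helper names ours) using a pure-Python Cholesky factorization for log|K|, and omitting the clipping step for simplicity:

```python
import math

def chol_logdet(K):
    """Natural log of det(K) via Cholesky factorization (K symmetric positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - acc)
            else:
                L[i][j] = (K[i][j] - acc) / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))

def gaussian_joint_entropy(dists, c, delta):
    """H_n = 0.5*log2((2*pi*e)^n |K|) - n*log2(delta), with K_ij = exp(-d_ij/c)."""
    n = len(dists)
    K = [[math.exp(-dists[i][j] / c) for j in range(n)] for i in range(n)]
    logdet_base2 = chol_logdet(K) / math.log(2)
    return 0.5 * (n * math.log2(2 * math.pi * math.e) + logdet_base2) - n * math.log2(delta)
```

With widely separated nodes K ≈ I, so the joint entropy approaches n·H_1; close nodes with strong correlation drive it below that.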
Figures 3.15 (a), (b), and (c) show the performance of clustering for quantization steps δ
= 1, 0.5, and 0.05. The cluster sizes s = 6, 8 are near-optimal. In Figures 3.15 (d), (e), and
(f), the same curves are presented with the transmission cost normalized to make the
highest value equal to 1. For lower values of δ, the quantization cost dominates and the
gains from removing inter-source correlations in data are diminished. Accordingly, the
relative gains from optimizing cluster size are also reduced.
3.4.3 Summary of results
Overall, the results presented in this section show that the basic conclusions from the
analysis hold even when the limiting assumptions of the analysis regarding node placement,
communication link quality, the exact form of the correlation model, and quantization
are relaxed. In all cases, we observe the existence of small cluster sizes that provide
near-optimal performance over a wide range of correlation settings.
Figure 3.15: Performance of clustering in a 48m × 48m network with density = .25 nodes/m² with a continuous, jointly Gaussian data model and quantization step (a) δ = 1, (b) δ = 0.5, (c) δ = 0.05. Cluster sizes 6, 8 are near-optimal. (Axes: correlation parameter on a log scale vs. transmission cost.)
3.5 Summary and Conclusions
We study the correlated data gathering problem in sensor networks using an empirically
obtained approximation for the joint entropy of sources. We present an analysis of the
optimal routing structure under this approximation. This analysis leads naturally to a
clustering approach for schemes that perform well (in terms of energy-efficiency) over the
range of correlations. The optimal clustering depends on the level of correlation and
also on where the actual data compression is performed: at each individual node, or at
intermediate data collection points or cluster heads. Remarkably, however, there exists a
static, near-optimal cluster size which performs well over the range of correlations. The
notion of near-optimality is formulated as a min-max optimization problem and a rigorous
analysis of the solution is presented for both 1-D and 2-D network topologies. For a linear
arrangement of N sources, the near-optimal cluster size is Θ(√D) irrespective of where
compression occurs, where D (≥ N and O(N²)) is the shortest hop distance of each source to
the sink. For a 2-D grid deployment with N sources and unit density, a network-wide
shortest path tree is optimal if every node compresses its data using side information from
its neighbors. If compression is possible only at cluster heads, a Θ(N^{1/6}) cluster size is
shown to be near-optimal. The robustness of the conclusions from the analysis is established
using extensive simulations with more general communication and entropy models.
The practical implication of these results for sensor network data gathering is that a
simple, static cluster-based system design can perform as well as sophisticated adaptive
schemes for joint routing and compression.
Chapter 4
Practical schemes for distributed compression
The details of how exactly compression will be achieved were ignored in the earlier
analysis for reasons of tractability. We now consider the design of practical schemes
for achieving distributed compression based on two different views of structure in data.
First, we build on work by Ciancio, Shen and Ortega [CO05, SO08a, SO08b] to obtain a
transform that takes advantage of the broadcast nature of wireless communications. Next,
we extend the ideas of Candès et al. [CRT06], Donoho [Don06], and Wang et al. [WGR07]
to the multi-hop routing scenario.
4.1 Wavelet transform design for wireless broadcast advantage
Ciancio, Shen and Ortega [CO05, SO08a, SO08b] have developed lifting based wavelet
transforms that can operate over tree routing topologies. Their algorithms assume unicast
The work described in this section was published as follows:
Godwin Shen, Sundeep Pattem, Antonio Ortega, “Energy-Efficient Graph-Based Wavelets for Distributed Coding in Wireless Sensor Networks”, 34th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2009.
Sungwon Lee, Sundeep Pattem, Maheswaran Sathiamoorthy, Bhaskar Krishnamachari, Antonio Ortega, “Spatially-Localized Compressed Sensing and Routing in Multi-Hop Sensor Networks”, 3rd International Conference on Geosensor Networks, July 2009.
Figure 4.1: Example (a) signal x(n) and (b) its 5/3 wavelet coefficients: smooth coefficients s(n) and predict coefficients p(n).
communications between nodes in the network. In this section, we extend their work
by designing a new transform that takes advantage of the broadcast nature of wireless
communication. This transform allows for better compression of data and hence greater energy
efficiency.
4.1.1 Wavelet basics: The 5/3 lifting transform
We start by presenting an intuitive explanation of de-correlation using lifting steps for
the 5/3 wavelet transform. For a rigorous treatment of wavelets and lifting, see Vetterli
and Kovacevic [VK91] and Daubechies and Sweldens [DS98], respectively. Consider a discrete-time
signal x(n). The basic idea is to separate the low-pass and high-pass components of x(n).
Each even-time sample x(2t) can be decomposed into an estimate, computed from the adjacent
odd-time samples x(2t−1) and x(2t+1), plus a residual value. Given smoothness in the
time-evolution of the signal, i.e., temporal correlations, the residuals have a much smaller
magnitude than the original samples and require significantly fewer bits
to represent. This is how compression is achieved. The 5/3 lifting wavelet
transform for a signal x(n) is defined as follows:
• even “predict” coefficients: d(2k) = x(2k) − (x(2k−1) + x(2k+1))/2
• odd “smooth” coefficients: s(2k+1) = x(2k+1) + (d(2k) + d(2k+2))/4
An example signal and its coefficients are illustrated in Figure 4.1.
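These two steps can be sketched directly. The following is a minimal sketch (helper names are ours) using symmetric boundary extension, a choice on our part since the text does not specify boundary handling; the inverse simply undoes the steps in reverse order:

```python
def _r(i, n):
    """Symmetric (mirror) boundary extension; preserves index parity."""
    return -i if i < 0 else (2 * (n - 1) - i if i > n - 1 else i)

def lift53(x):
    """Forward 5/3 lifting, thesis convention: predict at even, smooth at odd indices."""
    n, y = len(x), list(x)
    for i in range(0, n, 2):        # d(2k) = x(2k) - average of odd neighbors
        y[i] = x[i] - (x[_r(i - 1, n)] + x[_r(i + 1, n)]) / 2.0
    for i in range(1, n, 2):        # s(2k+1) = x(2k+1) + average of details / 2
        y[i] = x[i] + (y[_r(i - 1, n)] + y[_r(i + 1, n)]) / 4.0
    return y

def unlift53(y):
    """Inverse transform: undo the smooth step, then undo the predict step."""
    n, x = len(y), list(y)
    for i in range(1, n, 2):
        x[i] = y[i] - (y[_r(i - 1, n)] + y[_r(i + 1, n)]) / 4.0
    for i in range(0, n, 2):
        x[i] = y[i] + (x[_r(i - 1, n)] + x[_r(i + 1, n)]) / 2.0
    return x
```

On a smooth (linear) signal the interior predict coefficients vanish, which is exactly the compression effect the text describes.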
4.1.2 Wavelets for sensor networks
We now discuss existing schemes for computing wavelet transforms in a distributed man-
ner at sensor nodes.
4.1.2.1 Unidirectional 1D wavelet
Ciancio and Ortega [CO05] proposed wavelet transforms for use in a sensor network
scenario. For a linear array of sensor nodes transporting data hop by hop to a sink at
one end, nodes at odd and even depth provide the odd and even samples of the spatial
signal. The 5/3 wavelet computations are modified in a way that ensures that data
always makes unidirectional progress, i.e., towards the sink. This scheme was extended to
tree topologies [CPOK06] by considering heuristic (and sub-optimal) ways of handling
the merging of 1-D paths in the tree.
4.1.2.2 2D wavelet for tree topologies
Shen and Ortega [SO08b, SO08a] proposed a lifting transform that works for any tree
topology. As before, the sink is at the root of the tree; nodes at odd depth provide
the “smooth” coefficients and nodes at even depth the “predict” coefficients. The
difference from the 1-D transform is that at each node there can be more than just two
“adjacent” samples. This is illustrated in Figure 4.2 (a). It was shown that the following
Figure 4.2: Illustration of odd (green) and even (blue) nodes in a subtree for the 2D wavelet (a) with unicast and (b) exploiting the broadcast nature of wireless communications. The solid arrows are part of the tree routing paths. The dashed arrows are wireless links not part of the tree. The arrows crossed off in red denote interactions disallowed for transform invertibility and unidirectionality.
computations over a tree topology T for the set of vertices V result in an invertible
transform:

• For i ∈ V, let ρ(i) be the parent of i in T and C_i the set of children of i in T.
• For node m at even depth in T, the “predict” coefficient is
  d_m = x_m − ( 1/(|C_m|+1) ) Σ_{k∈C_m} x_k − ( 1/(|C_m|+1) ) · x_{ρ(m)}
• For node n at odd depth in T, the “smooth” coefficient is
  s_n = x_n + ( 1/(2(|C_n|+1)) ) Σ_{k∈C_n} d_k + ( 1/(2(|C_n|+1)) ) · d_{ρ(n)}
Note that the above computations implicitly impose a schedule or ordering on the
transmissions at nodes. Transmissions begin at the leaf nodes in the tree and every non-
leaf node is constrained to hold its transmission until all nodes in the subtree rooted at
itself have finished transmission.
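A sketch of this tree transform on a toy topology (the tree encoding and function names are ours; as one simple convention consistent with invertibility, we take the root to be the sink and leave its raw value untouched, using it in place of a detail coefficient where a depth-1 node needs its parent's value):

```python
# Toy tree: node -> parent (None for the root/sink).
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 3}
children = {u: [v for v, p in parent.items() if p == u] for u in parent}
depth = {}
def get_depth(u):
    if u not in depth:
        depth[u] = 0 if parent[u] is None else get_depth(parent[u]) + 1
    return depth[u]
for u in parent:
    get_depth(u)

def forward(x):
    y = dict(x)
    evens = [u for u in parent if depth[u] % 2 == 0 and parent[u] is not None]
    odds = [u for u in parent if depth[u] % 2 == 1]
    for m in evens:                     # predict from raw odd-depth neighbors
        w = 1.0 / (len(children[m]) + 1)
        y[m] = x[m] - w * (sum(x[k] for k in children[m]) + x[parent[m]])
    for n in odds:                      # smooth using detail coefficients
        w = 1.0 / (2 * (len(children[n]) + 1))
        y[n] = x[n] + w * (sum(y[k] for k in children[n]) + y[parent[n]])
    return y

def inverse(y):
    x = dict(y)
    odds = [u for u in parent if depth[u] % 2 == 1]
    evens = [u for u in parent if depth[u] % 2 == 0 and parent[u] is not None]
    for n in odds:                      # undo smooth (details were transmitted)
        w = 1.0 / (2 * (len(children[n]) + 1))
        x[n] = y[n] - w * (sum(y[k] for k in children[n]) + y[parent[n]])
    for m in evens:                     # undo predict from recovered odd values
        w = 1.0 / (len(children[m]) + 1)
        x[m] = y[m] + w * (sum(x[k] for k in children[m]) + x[parent[m]])
    return x
```

Invertibility holds because every predict uses only raw odd-depth values and every smooth uses only detail coefficients, exactly the parity structure the bullets above impose.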
4.1.3 2D wavelet for wireless broadcast scenario
The 2D wavelet just described treats the transmissions along the routing tree as unicasts
i.e. destined only for a particular node, in this case the parent in the tree. However, in
the context of sensor networks, the wireless transmissions at each node can be potentially
heard by many nodes in its neighborhood, based on the topology and the transmission
power. In the earlier 2D wavelet, a node contributed to de-correlation operations only at
its parent in the tree. Taking advantage of the broadcast nature of wireless transmissions,
a single transmission at a node can be used for de-correlation operations potentially at
all nodes that can receive it.
We consider the design of a wavelet transform that exploits broadcast advantage.
The routing tree is assumed to be known. The key issue is deciding which of the available
broadcast links, and the data they provide, can be incorporated into de-correlation
operations while still ensuring an invertible and unidirectional transform.
4.1.3.1 Augmented neighborhoods
Starting with a given tree topology T over a set of vertices V , we consider an “augmented”
neighborhood at each node. For node i ∈ V , define the augmented neighborhood N_i
according to the following constraints:
• avoid odd-odd and even-even pairs (for invertibility)
• send only to nodes with lower hop-count (for unidirectionality)
• compute only over data from earlier time-slots (for timely and correct computations)
[Figure 4.3 plots: panels (a) and (b) are topology plots; panel (c) plots SNR (dB) against total energy consumption (Joules), comparing the tree transform and the graph transform.]
Figure 4.3: (a) Sample tree topology (b) With additional broadcast links in the augmented neighborhoods at each node (c) Performance gain in terms of SNR vs. cost for the new transform compared to the 2D wavelet for tree topologies
4.1.3.2 New transform definition
Given (V, T, T_AUG), the new transform is defined as follows:
• For node m at even depth in T , the “predict” coefficient is d_m = x_m + ∑_{k∈N_m} p_m(k) x_k
• For node n at odd depth in T , the “smooth” coefficient is s_n = x_n + ∑_{k∈N_n} u_n(k) d_k
The above conditions are provably necessary for invertibility and unidirectionality of
the transform [SPO09].
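A minimal sketch of how the first two constraints could be applied to filter broadcast links into augmented neighborhoods (hypothetical Python; the function and link representation are my own, and the third constraint, computing only over data from earlier time slots, is assumed to be enforced by the transmission schedule rather than modeled here):

```python
def augmented_neighbors(links, hops):
    """Filter broadcast links into augmented neighborhoods N_i (sketch).

    links: iterable of (sender, receiver) pairs, tree links plus overheard
    broadcast links; hops: dict node -> hop count from the sink.
    A link is kept only if the endpoints differ in depth parity (no odd-odd
    or even-even pairs, for invertibility) and the receiver is strictly
    closer to the sink (for unidirectionality)."""
    N = {}
    for s, r in links:
        if hops[s] % 2 == hops[r] % 2:
            continue  # odd-odd or even-even pair would break invertibility
        if hops[r] >= hops[s]:
            continue  # data must flow toward the sink
        N.setdefault(r, set()).add(s)
    return N
```

Each surviving set N_i then feeds the weighted predict/update sums above.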
4.1.3.3 Performance of new transform
A sample tree topology T and the augmented graph T_AUG are shown in Figures 4.3 (a)
and (b) respectively. Figure 4.3 (c) shows plots of SNR vs. cost for the new transform
and the unicast-based 2D wavelet. It can be seen that there is a significant gain in
performance. A higher SNR is obtained for the same cost or the same SNR is obtained
at a lower cost.
4.2 Compressed sensing for multi-hop network setting
The results of our earlier studies are for traditional data compression and transport. Com-
pressed sensing is a recent advance that allows a different solution for field reconstruction.
While the results from this area apply only for specific classes of signals, we investigate
the implications for joint routing and compression in multi-hop sensor networks.
Pioneering work by Candes, Romberg and Tao [CRT06] and Donoho [Don06] established
that an n-dimensional vector that is k-sparse in some basis can be reconstructed from
O(k log n) random projections, and that near-optimal reconstruction can be obtained
by solving a linear program. Tropp and Gilbert [TG07]
subsequently showed that similar reconstruction can be achieved through a greedy algo-
rithm, namely orthogonal matching pursuit (OMP). The number of projections required
for reconstruction depends on incoherence between the sparsity inducing basis and the
measurement matrix. The projection matrices used by Candes and Donoho are dense
random ±1 Bernoulli matrices or Gaussian matrices. Wang et al. [WGR07] showed that
the remarkable results of compressed sensing can also be obtained using sparse random
projections. They showed that while, in a distributed network scenario, CS in its original
formulation would require each node to transmit O(n) packets, similar results can be
obtained with sparse random projections using O(log n) packets per node. However,
this scheme is still very expensive in a multi-hop scenario. We present an extension to
obtain SRPs in a distributed manner with shortest path routing.
We use the following notation:
Φ: measurement matrix whose rows are projection vectors
Ψ: sparsity inducing basis whose columns are the basis vectors
H: the holographic basis H = ΨΦ
4.2.1 Combining routing with known results in compressed sensing
Consider a network of n sensor nodes with diameter d hops. The average distance of
nodes from the sink is also O(d) hops. If every node sends its raw sensor measurement
to the sink (independently) via the shortest path tree, then the average cost per reading
for the network is
Cost_raw-SPT = O(nd).
Now consider compressed sensing and assume a spanning tree topology. Nodes route
data to the sink along this tree. Each node adds its own reading multiplied by ±1 to the
value received from all its children in the tree and sends this new value to its parent. The
sink can add values received from each of its children to obtain one complete projection.
Since each node in the tree transmits exactly once, the cost per projection is n. Assuming
that the projection matrix is known to sink and nodes (each node only needs its column
vector) in advance, the cost for obtaining O(k log n) projections is

Cost_CS-DRP = O(n · k log n) = O(k n log n).
The measurement matrix for sparse random projections is defined as [WGR07]:

Φ_ij = +1 with probability 1/(2s), −1 with probability 1/(2s), and 0 otherwise.
For obtaining sparse random projections, each node decides to send with probability
1/s = (log n)/n and the measurement is routed along the shortest path. The sink generates
the row of the measurement matrix by placing ±1 at the positions of nodes from which
data was received and 0 for all others. Since the node choice is random, the average path
length remains O(d) and the cost using O(k log n) SRPs is

Cost_CS-SRP = O(d · log n · k log n) = O(k · d · log² n). (4.1)
This is a bound on the cost that any new CS-based scheme must improve upon.
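A sketch of generating such a sparse measurement matrix at the sink (hypothetical Python, pure standard library; the function name and interface are my own):

```python
import math
import random

def sparse_random_projections(m, n, seed=0):
    """m x n sparse random projection matrix (sketch, per the definition
    above): each entry is +1 or -1 with probability 1/(2s) each and 0
    otherwise, where 1/s = log(n)/n, so each row has about log(n)
    nonzero entries and only those nodes need to transmit."""
    rng = random.Random(seed)
    p = math.log(n) / n  # probability that a given node participates in a row
    phi = []
    for _ in range(m):
        row = []
        for _ in range(n):
            u = rng.random()
            row.append(1 if u < p / 2 else (-1 if u < p else 0))
        phi.append(row)
    return phi

phi = sparse_random_projections(50, 256)
# each row averages about log(256) ~ 5.5 nonzero entries
```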
We make the following propositions for using CS in the multi-hop scenario.
Proposition 4.2.1. CS with time-domain sparsity is ineffective in the multi-hop scenario.

Reasoning: At most k out of n sensors set off alarms when they sense a value greater
than a threshold. In this case, the sparsity inducing basis is Ψ = I, the identity matrix. If
we use sparse random projections, Cost_CS-SRP = O(k d log² n). However, if only the nodes
that set off alarms route their measurements to the sink via shortest paths, the cost is
O(kd).
[Figure 4.4 plot: SNR (dB) vs. cost ratio to raw data transmission; curves: SPT256, APR, SRP2, SRP4.]

Figure 4.4: Compressed sensing performance in the multi-hop setting. Plot of SNR vs. cost for different schemes. The black and green curves are for Sparse Random Projections (SRP). The blue and red curves are for two variations of computing projections over shortest path routing.
Proposition 4.2.2. In a multi-hop scenario, shortest path routing is optimal for com-
pressed sensing via sparse random projections.
Reasoning: Each node decides to send its measurement to the sink with probability
1/s = (log n)/n. Since the distribution of the O(log n) (in expectation) nodes that choose to send
measurements is random, there can be no coordination in the routing. When individual
nodes route data independently, they have no incentive to move away from the shortest
path.
Figure 4.4 shows the comparative performance of schemes computing projections
along routing paths to the sink versus sparse random projections computed at the sink.
Several routing schemes and their performance are considered by Lee et al. [LPS+09].
Chapter 5
SenZip: Distributed compression as a service
Our work up to this point, and in general work on correlated data gathering in sensor
networks in the literature, has focused on theory and simulations to understand perfor-
mance limits. These studies, and some limited system implementations (e.g., [ZCH07]),
have had limited impact on technology adoption and sensor network software
development because they have not yielded modular and inter-operable software. We
move towards addressing this problem by (i) proposing a novel architecture, SenZip, that
fits into the overall networking software architecture for sensor networks and (ii) demon-
strating that a practical design based on this architecture can be deployed on motes and
can achieve distributed configuration and modularity.
The SenZip architecture specifies a compression service that can encompass different
compression schemes and its modular interactions with standard networking services such
as routing. This architecture enables a distributed node configuration for compression,
just as existing systems make it possible for sensors to configure themselves for routing
The work described in this section was published as follows: Sundeep Pattem, Godwin Shen, Ying Chen, Bhaskar Krishnamachari, Antonio Ortega, “SenZip: An Architecture for Distributed En-Route Compression in Wireless Sensor Networks”, Workshop on Earth and Space Science Applications (ESSA), April 2009.
Figure 5.1: The SenZip architecture. A completely distributed compression service is enabled by having the interacting components shown here at each network node.
in a distributed manner. The architecture proposal is based on (a) lessons from overall
architectural principles for sensor networks [TDJ+07], (b) our own experience in imple-
menting a practical wavelet-based distributed compression system, and (c) identifying
common patterns in existing compression schemes. To concretely illustrate the utility
of the architecture, we show how it can incorporate two different compression schemes,
DPCM and 2D wavelets and present results from mote experiments for data gathering in
which nodes can configure themselves for compression under different routing conditions.
5.1 SenZip architecture
We propose and detail SenZip, an architecture for distributed en-route compression in
sensor networks. The primary goals of SenZip are flexibility, modularity, and distributed
configuration and reconfiguration. In addition to the lessons from the principles
of an overall architecture for sensor networks and the common abstractions identified for
existing compression schemes, our design of the SenZip architecture is based on a system
implementation effort.
5.1.1 SenZip Specification
The SenZip architecture specifies:
1. a compression service that can encompass different compression schemes and,
2. its interactions with standard routing and other networking services.
Figure 5.1 is a block diagram representation of the SenZip architecture. It needs to
be emphasized that a system based on SenZip would be completely distributed and com-
ponents shown in Figure 5.1 would reside on each network node. Of course, compressed
data from all nodes in the network finally reaches the base station where it is jointly
reconstructed. We now describe the services, their responsibilities and interactions.
5.1.1.1 Compression Service
The compression service consists of the aggregation module and the compression module.
Aggregation module: The aggregation module disseminates and gathers information
for maintaining the local aggregation tree by exchanging messages. This information is
collated in an aggregation table. The aggregation graph abstraction allows the definition
of a generic table that works for different compression schemes. Pseudo-code for such a
table is shown in Figure 5.2.
struct attributes {
    int upstreamOnehopNeighborhoodSize;
    int downstreamOnehopNeighborhoodSize;
    ...
} weight_attributes;

struct entry {
    int node_id;
    weight_attributes weights;
    int further_hops;
    tableEntry *neighborEntry[MAX_NHOOD_SIZE];
} tableEntry;

tableEntry AggregationTable[MAX_NHOOD_SIZE];
Figure 5.2: Aggregation table example. The recursive entry structure allows the same definition for different compression schemes.
Compression module: This module has the following functions: (a) From the aggre-
gation tree structure provided by routing, it obtains the role played by the node (which
computations to perform, and for which nodes), the parameters involved in the compu-
tations, and the ordering information (the sequence in which nodes process and forward
data). (b) It receives raw measurements from the application and packets with data that
needs further processing from forwarding. (c) It performs further processing over the
partially processed data in storage and initiates processing for the node's own data; the
computations are specific to the compression scheme and based on the role and parameter
information. (d) Data that is still partially processed is packetized and sent to forwarding.
For data that is fully processed, it checks whether enough has been buffered in storage to
fill a packet; if so, it performs quantization and bit reduction operations and sends the
packet to forwarding.
5.1.1.2 Networking components
SenZip introduces small changes to standard networking components as follows:
Routing engine: In addition to the standard routing functionality, this component in
SenZip has an extra interface to the compression service. It reports path routing
information relevant to the local aggregation, for example, the parent and hop count
in a tree topology. Optionally, decisions on changing the parent can be coordinated with
the compression service, which can also provide a specific metric for the routing cost.
Forwarding engine: While partially processed data from nodes in the local aggrega-
tion tree is allowed to be intercepted by the compression service, fully processed data
is forwarded directly along the route to the sink. Optionally, it might apply different
settings, such as transmission power and number of retries, for the different types of packets.
Link estimator : Efficient link estimation requires a limited choice of links to moni-
tor [FGJL07]. To remove a link (or node in the neighbor table) that is part of the current
aggregation tree, a joint decision has to be made with the compression service to maintain
consistency in the data processing.
5.1.2 Discussion
We emphasize that the configuration of roles, parameters and ordering is to be achieved
purely locally from the aggregation graph and based on the compression scheme. There
is no centralized decision and dissemination. This is a design criterion for compression
schemes that can fit into the architecture. There is an overhead cost for the exchange
of beacons to maintain the aggregation table. Whether the overhead is acceptable or
not depends on the relative frequency of measurement versus the frequency of topology
changes. If the frequency of topology changes is very high, the potential gains from
compression might be overwhelmed by the cost of packet exchanges to maintain the
table.
Which component is best suited for constructing and maintaining the local aggregation
graph? One option is to give this additional responsibility to the routing engine, which
already generates and receives messages to setup path routing. However, we believe it
is much better for the compression service to handle the aggregation graph operations.
This will aid code-reuse and flexibility by restricting the changes to the routing engine to
providing a single extra interface.
To ensure flexibility and extensibility, important goals for an overall sensor network ar-
chitecture [CDE+05, TDJ+07], SenZip only details the interactions between compression
and networking services and not the interfaces. The components within the compression
service also follow the larger goal of “meaningful separation of concerns”. The abstrac-
tion helps avoid over-specification, by ensuring that the compression components are
required by most existing schemes. Overall, the specification of SenZip has the features
of a desirable programming paradigm described by Tavakoli et al. [TDJ+07].
5.2 Mapping algorithms to architecture
We now discuss two compression schemes that work over tree routing topologies - a
simple differential encoding scheme, DPCM, and a more sophisticated 2D wavelet scheme
developed by Shen and Ortega [SO08a]. We describe how these schemes fit into the SenZip
architecture.
table entry element           | DPCM              | 2D wavelet
weight attributes             | not needed        | upstreamOnehopNeighborhoodSize ≡ number of children in tree; downstreamOnehopNeighborhoodSize ≡ 1 (for parent in tree)
further hops                  | 1 (upstream only) | 2 for upstream node, 1 for downstream
neighborEntry[].further hops  | 0                 | 1 for upstream node, 0 for downstream

Table 5.1: Aggregation table initialization
5.2.1 Algorithm details
Assume a given graph G(V,E) with vertices defined by node locations and edges defined
by communication links between nodes. Assume a tree graph T (V,R) (R ⊂ E) rooted
at a single sink node. Suppose every node is indexed by an integer n ∈ V , Cn is the set
of child indices of n, and ρ(n) is the parent index of n in T . We say that node n
has depth k when it is k hops from the sink. Also let x_n denote the data measured at
node n. For simplicity, we assume data is forwarded and compressed along the same tree
T , i.e., the aggregation graph is T . In both schemes, we define the following transmission
schedule. Initially, nodes without any children (leaf nodes) forward raw data to their
parents in T . Then, every node n waits until it receives data from all children m ∈ Cn
before it transmits its own data. This induces an ordering of the communications which
is necessary for nodes to compress data as it is forwarded to the sink.
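The schedule just described is a post-order traversal of the tree, which can be sketched as (hypothetical Python; names are my own):

```python
def transmission_order(children, root):
    """Transmission schedule for the tree (sketch): a node appears only
    after every node in the subtree rooted at it, i.e. a post-order walk.
    The sink (root) is last; it receives but has no one to transmit to."""
    order = []
    def visit(n):
        for c in children.get(n, []):
            visit(c)
        order.append(n)
    visit(root)
    return order
```

For children = {0: [1, 2], 1: [3, 4]} with sink 0 this yields [3, 4, 1, 2, 0]: the leaves 3 and 4 go first and node 1 waits for both of them.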
5.2.1.1 DPCM
Leaf nodes first forward raw data to their parents. Each node n waits to receive raw
measurements from all its children in T and then computes residual prediction errors as
differences from its own measurement as follows:
d_m = x_m − x_n, ∀ m ∈ C_n
s_n = x_n. (5.1)
Node n then forwards the compressed prediction residuals of its children (and other
descendants) and its own raw measurements to its parent ρ(n).
5.2.1.2 2D wavelet
This transform is constructed as follows for a single level of decomposition. First, vertices
of G are assigned roles by being split into disjoint sets of predicts (odd depth) and updates
(even depth) based on depth in T . Next, a high-pass “detail” coefficient dm for each
predict node m is computed by subtracting from the data at node m, xm, a prediction
that is based on information available at neighboring nodes (where neighbors are defined
as nodes that are 1-hop away in the aggregation graph):
d_m = x_m − (1/(|C_m|+1)) ∑_{k∈C_m} x_k − (1/(|C_m|+1)) · x_{ρ(m)} (5.2)
Finally, a low-pass “smooth” coefficient sn for each update node n is computed by
adding to xn a correction term based on the detail coefficients of neighboring nodes:
s_n = x_n + (1/(2(|C_n|+1))) ∑_{k∈C_n} d_k + (1/(2(|C_n|+1))) · d_{ρ(n)} (5.3)
Under the given transmission schedule, each node only has access to data from its
descendants and only forwards its own data and data from its descendants. Since each
node n uses data from its parent, transform computations for n cannot be completed at
n. However, note that terms corresponding to children Cm and parent ρ(m) are explicitly
separated in the computations. This allows us to compute partial wavelet coefficients and
to update partial coefficients as data flows towards the sink to make them full wavelet
coefficients as described in [CPOK06, SO08a].
This process is summarized as follows. Leaf nodes first forward raw data. Each
predict node m waits to receive data from its children, then generates a partial coefficient
d_p(m) using data from its children as d_p(m) = x_m − (1/(|C_m|+1)) ∑_{k∈C_m} x_k. Then m forwards
its partial d_p(m) (and data from descendants), and ρ(m) completes the computation as
d(m) = d_p(m) − (1/(|C_m|+1)) · x_{ρ(m)}. Each update node performs similar operations. This
process is illustrated in Figure 5.3. Note that this induces an ordering of the computations.
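The partial/complete split for a detail coefficient can be sketched as (hypothetical Python helpers; w is the weight 1/(|C_m|+1) shared by both steps):

```python
def partial_detail(x_m, child_samples):
    """Partial predict coefficient d_p(m), computed at node m itself:
    only the children's data is available when m transmits."""
    w = 1.0 / (len(child_samples) + 1)
    return x_m - w * sum(child_samples), w

def complete_detail(dp_m, w, x_parent):
    """Completion at the parent rho(m), which subtracts its own sample."""
    return dp_m - w * x_parent
```

Completing the partial gives the same d(m) as evaluating Equation 5.2 directly.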
5.2.2 Relating algorithms to SenZip
We now describe the operation overview for SenZip based systems deploying the two
algorithms.
5.2.2.1 Initialization
The aggregation component configures aggregation table entries and initiates message
exchanges (with its neighbors) in order to gather information needed to build the aggre-
gation table. The specifics of table entries for each scheme are shown in Table 5.1. This is
shared with the compression component which can then identify their role, parent in the
[Figure 5.3 annotations: Nodes 5 and 6 forward raw data x_5 and x_6 to node 4. Node 4: (a) generate partials dp(4), sp(5) and sp(6); (b) forward [dp(4) sp(5) sp(6)] to node 3. Node 2 forwards raw data x_2 to node 1. Node 3: (a) complete partial 4 to get d(4); (b) complete partials 5, 6 to get s(5), s(6); (c) generate partial sp(3); (d) forward [d(4) s(5) s(6) sp(3)] to node 1. Node 1: (a) generate partials sp(2) and dp(1); (b) forward [dp(1) sp(2) sp(3) d(4) s(5) s(6)].]

Figure 5.3: Partial computations for 2D wavelet. Gray (white) circles denote even (odd) nodes. Operations at each node are done in the order listed.
tree and children in the tree, and ordering of computations, to configure each compression
scheme as follows:
DPCM: The roles are uniform, i.e., all nodes have the same role. The ordering is
that leaf nodes start forwarding and intermediate nodes wait for all one-hop upstream
descendants (children) in the aggregation tree.
2D wavelet: The roles are decided based on depth in the tree from the root: odd depth
nodes are predict nodes and even depth nodes are update nodes. The parameters in the
computation are equal to the weights, i.e., the number of one-hop (children) and two-hop (grandchildren)
upstream descendants. The ordering is that leaf nodes start forwarding and intermedi-
ate nodes wait for partial coefficients of one-hop (children) and two-hop (grandchildren)
upstream descendants in the aggregation tree.
5.2.2.2 Data forwarding and compression
DPCM: At each node n, the partially processed data to be received is the raw data
from its children; the data to be sent is the raw data of node n itself, while the fully
processed data of the children is their differentials according to Equation 5.1.
2D wavelet: At each node, the partially processed data received is the raw data from
children and grandchildren. The partially processed data sent is the raw data of node n
and all its children, and the fully processed data is the coefficients of all grandchildren,
according to Equations 5.2 and 5.3.
5.2.2.3 Reconfiguration
The routing engine informs aggregation component of a change in parent (and hop count)
in the tree.
DPCM: When the parent changes at node n, n sends an explicit parent change message
to the old parent ρ_old(n) and initiates a message to the new parent. When a parent
change message is received by ρ_old(n), the child is removed from the table. The number
of children is decremented, so the waiting criterion in the ordering changes.
2D wavelet: When the parent of node n changes, n sends an explicit DELETE message
to the ex-parent ρ_old(n) and an ADD message to the new parent. If the hop count changes
parity from before, the change is propagated to all upstream nodes (descendants in the subtree). When a
Figure 5.4: Code structure of (a) CTP and (b) SenZip compression service over CTP
parent change message is received by ρ_old(n), the child is removed from the table. The
number of children and grandchildren is decremented, so the waiting criterion in the
ordering changes. ρ_old(n) also sends a grandparent change message to ρ(ρ_old(n)), where
changes in the ordering are made.
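A toy model of this parent-change bookkeeping (hypothetical Python; the real system exchanges ADD/DELETE beacons over the radio, which are modeled here as direct method calls):

```python
class NodeTable:
    """Toy model of the per-node aggregation table for DPCM (sketch)."""
    def __init__(self):
        self.children = set()

    def on_add(self, child):
        """ADD beacon from a new child."""
        self.children.add(child)

    def on_delete(self, child):
        """DELETE beacon from a departing child; with fewer children,
        the waiting criterion in the transmission ordering changes."""
        self.children.discard(child)

def change_parent(node, old_parent, new_parent):
    """A node's reaction to a parent change signalled by the routing engine."""
    old_parent.on_delete(node)  # explicit parent-change (DELETE) message
    new_parent.on_add(node)     # ADD message to the new parent
```

The 2D wavelet case additionally propagates a grandparent change message one more hop, which this sketch omits.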
5.3 System implementation
We have implemented a SenZip compression service in nesC/TinyOS [Tin] to run over
the Collection Tree Protocol (CTP) [TinyOS Enhancement Proposal (TEP) 123] [tos].
This implementation effort has informed the design of the SenZip architecture and in
turn, concretely demonstrates it in software.
5.3.1 TinyOS code
The code structure of CTP and the SenZip extension are illustrated in Figure 5.4. We
now present some details of the code for components, interfaces, changes to CTP and
application.
5.3.1.1 Interfaces
The following new interfaces have been defined for the interactions of the new components
with other parts of the system.
• AggregationInformation: interactions between routing and aggregation component.
• AggregationTable: interactions between aggregation and compression component.
• StartGathering: interactions between application and compression component.
5.3.1.2 AggregationP component
The aggregation component maintains the local aggregation tree. The routing component
signals changes in parent in routing tree. At this point, the aggregation component sends
an ADD beacon to the new parent and a DELETE beacon to the old parent. The old
and new parents update their aggregation tables accordingly.
• Events:
1. Routing.parentChange: Signalled from the Routing engine to indicate a change
in the parent in routing tree.
(a) Send ADD beacon to new parent and DELETE message to the old parent.
(b) Signal change to Compression component.
2. AggBeaconReceive.receive:
(a) Update table for ADD/DELETE beacons from neighbors.
• Commands:
(a) (b)
Figure 5.5: (a) Distributed compression and (b) Centralized reconstruction
1. Table.contactDescendant: Called by the Compression component to directly
contact neighbors in the table from which expected packets have not been
received.
5.3.1.3 CompressionP component
When the aggregation component signals changes in the aggregation table, the compres-
sion component allocates and de-allocates memory for storing the data of children in the
aggregation tree. When the forwarding engine presents packets with data arriving from
children, the data is stored, transformed, compressed and packetized to be handed back
to the forwarding engine to transport it to the sink. Figure 5.5 shows the sequence of
operations for compression at each node. Currently the DPCM transform is applied and
fixed quantization encoding is used for compression. Given the overheads, the per-packet
payload available for compressed data is 10 bytes, or five 16-bit measurements.
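The 10-byte payload of five 16-bit measurements could be packed as follows (hypothetical Python sketch; the actual packetization is done in nesC, and little-endian byte order is my assumption):

```python
import struct

def pack_payload(samples):
    """Pack up to five 16-bit signed measurements into the 10-byte payload
    left after packet overheads (sketch; little-endian assumed)."""
    assert len(samples) <= 5
    return struct.pack('<%dh' % len(samples), *samples)

payload = pack_payload([2100, 2105, 2099, 2110, 2098])
# len(payload) == 10
```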
• Events:
1. Table.tablePointer: Signalled from the Aggregation component to provide a
pointer to the table for the local aggregation neighborhood.
2. Table.change: Signalled from the Aggregation component to inform of changes
in the local aggregation neighborhood.
3. Intercept.forward: Signalled from the forwarding engine to filter out packets
meant for in-network processing i.e. compression.
4. AllRxTimer.fired: Internal timer set up to check if all expected packets from
the local aggregation neighborhood were received. If not, the readings from
the previous epoch are currently used.
• Commands:
1. StartGathering.isStarted: Called from application to check if compression has
already started.
2. StartGathering.getStarted: Called from application to get compression started.
3. Measurements.set: Called from the application to transfer sensor measure-
ments.
• Tasks:
1. changesTask: Posted to update internal table according to changes signalled
by Aggregation component.
2. encodeCoefficientsTask: To encode and compress the coefficients generated by
the transform. Currently using fixed quantization encoding.
• Functions:
1. computeTransform: To apply the transform on data received from the local
aggregation neighborhood. Currently DPCM or differential computation.
5.3.1.4 Changes to CTP
Some small changes are introduced in CTP components to account for and aid in-network
compression.
• RoutingEngine: include AggregationInformation interface and inform Aggregation
component of changes in parent.
• ForwardingEngine: obtain the next hop for forwarding packets from Compression
component rather than Routing.
5.3.1.5 Application
The current application is written for a Tmote Sky mote with an on-board temperature
sensor.
• Events:
1. StartGathering.startDone: Signal from Compression component to begin sen-
sor measurements.
2. SubReceive.receive: At the sink node, the Compression component transfers
all packets to application. They are then sent over the air to the base station
attached to a PC/laptop.
5.3.2 Experimental Results
An in-lab testbed with Tmote Sky motes [tmo] is used for the evaluation. Ambient tem-
perature is the sensed phenomenon and we introduce temperature gradients by switching
hot lamps on and off.
5.3.2.1 Static topologies
We use fixed topologies with 15 nodes for this set of experiments. The setting and two
sample topologies are illustrated in Figure 5.6 (a). The spatial transforms used are DPCM
and 2D wavelet and the bit reduction is via fixed quantization. We assume a uniform
bit allocation for all nodes. The same experiments (sequence of switches) are repeated
for the two different tree topologies in Figure 5.6 (a) with different bit allocations per
sample.
On initialization, all nodes in the network self-configured the roles, parameters and
ordering according to the topology. Figures 5.6 (b) and (c) show the reconstruction
with 2 bits allocated per sample at node 7, which has a different depth, and hence role, in
the two trees. Similarly, Figures 5.6 (d) and (e) show the reconstruction for node 12 with
3 bits per sample. Figures 5.6 (f) and (g) compare the reconstruction error at the nodes
for each topology for a 3-bit allocation. The RMS error ranges between 0.01°C and 0.16°C
over the temperature range of 20°C to 28°C for 3-bit quantization of coefficients against
an original sample of 16 bits. Since good and similar reconstruction is obtained, it is verified that the
compression operations were correctly configured in a completely distributed manner.
Figure 5.7 (a) shows the average RMS error for compression for tree 1 with varying
bit allocation. As expected, better reconstruction is obtained for higher bit allocations.
[Figure 5.6 plots: panels (b)-(e) show temperature (centigrade) vs. sample number for the original signal and its reconstruction; panels (f) and (g) are histograms of RMS error (centigrade) by node id.]

Figure 5.6: Experiments on static trees with 2D wavelet transform and fixed quantization. (a) Two fixed tree topologies, tree 1 and tree 2, for the same set and locations of nodes. Raw measurement (dashed red) and reconstruction (solid blue) for node 7 with 2 bits allocated per sample for (b) tree 1 and (c) tree 2, and for node 12 with 3 bits per sample for (d) tree 1 and (e) tree 2. Histogram of RMS error at all nodes with 3 bits per sample for (f) tree 1 and (g) tree 2.
[Figure 5.7 plots: (a) average RMS error vs. bit allocation per sample, with curves for 2D wavelet and DPCM; (b) cost (normalized wrt CTP) vs. bit allocation per sample, with curves for 2D wavelet and DPCM on trees 1 and 2, plus the asymptotic bound.]

Figure 5.7: (a) Average RMS error for tree 1 with increasing bit allocation per sample for DPCM and 2D wavelet. (b) Cost normalized with respect to raw data gathering with CTP for increasing bit allocation per sample.
Figure 5.7 (b) shows the cost gain over raw data collection with CTP. In these experi-
ments, DPCM has a lower cost since the partially processed data travels only one hop,
while for the 2D wavelet it travels two hops. The cost gains compared to raw data are
relatively limited due to the small network size, particularly the small average depth,
which is 3.27 for tree 1 and 4.07 for tree 2. It can be shown that, in general, with
increasing average depth the cost for both schemes approaches the ratio of the bit
allocation to the raw measurement size. For the same number of bits, the wavelet scheme
has better reconstruction but, as just discussed, a higher cost.
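This asymptotic behavior can be illustrated with a rough per-node cost model (my own back-of-the-envelope sketch, not a formula from the thesis: data travels at full resolution for the first hop or two and quantized thereafter):

```python
def cost_ratio(avg_depth, bits, raw_bits=16, partial_hops=1):
    """Rough per-node cost relative to raw gathering (sketch): data
    travels at full raw_bits resolution for partial_hops hops (1 for
    DPCM, 2 for the 2D wavelet) and at `bits` per sample for the
    remaining hops toward the sink."""
    hops = max(avg_depth, partial_hops)
    compressed = partial_hops * raw_bits + (hops - partial_hops) * bits
    return compressed / (hops * raw_bits)

# as average depth grows, cost_ratio(depth, bits) approaches bits / raw_bits
```

At the small depths of these testbeds the first raw hop dominates, which is consistent with the limited gains observed above.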
5.3.2.2 Dynamic topologies
In these experiments, we send explicit messages to nodes to alter their parent in the
routing tree while data gathering with compression is in progress. The compression
settings are to use the DPCM transform and Golomb-Rice encoding.
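For reference, a minimal Golomb-Rice encoder for a nonnegative integer (a standard textbook construction, not the thesis code; mapping signed residuals to nonnegative integers, e.g. by zigzag interleaving, is omitted):

```python
def golomb_rice_encode(value, k):
    """Golomb-Rice codeword for a nonnegative integer: the quotient
    value >> k in unary (q ones and a terminating zero), followed by
    the k low-order remainder bits."""
    q = value >> k
    r = value & ((1 << k) - 1)
    return '1' * q + '0' + format(r, '0{}b'.format(k))
```

For example, golomb_rice_encode(9, 2) gives '11001': quotient 2 in unary ('110'), then remainder bits '01'.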
These results verify (a) correct updating of the aggregation table and configuration of
storage, transform computations and packetization at the node that adds a new child to its
aggregation table, (b) correct handling of coefficients “pending” packetization at node
that deletes a child from its aggregation table and (c) correct reconstruction of altered
topology during reconstruction at base station.
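The bookkeeping being verified can be sketched as follows. This is a hypothetical Python model of the per-node state, not the actual nesC modules; class and method names are illustrative:

```python
class AggregationTable:
    """Per-node record of the children whose data this node aggregates."""

    def __init__(self):
        self.children = set()
        self.pending = {}   # child id -> coefficients awaiting packetization

    def add_child(self, node_id):
        # (a) a new child: configure storage so its samples enter the
        # transform computations and packetization.
        self.children.add(node_id)
        self.pending.setdefault(node_id, [])

    def delete_child(self, node_id):
        # (b) a departing child: return coefficients still pending
        # packetization so they are flushed rather than silently dropped.
        self.children.discard(node_id)
        return self.pending.pop(node_id, [])

# A parent switch for node 7: the old parent flushes its pending
# coefficients, while the new parent allocates storage for it.
old_parent, new_parent = AggregationTable(), AggregationTable()
old_parent.add_child(7)
old_parent.pending[7].append(0.25)
flushed = old_parent.delete_child(7)
new_parent.add_child(7)
```

The base station then replays the same sequence of table updates to reconstruct the altered topology, point (c) above.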
Chapter 6
Conclusion
This work studied several aspects of the joint routing and compression problem in sensor networks to arrive at a comprehensive solution. We have made significant progress towards demonstrating completely distributed in-network compression in sensor networks.
We conclude with a discussion of the contributions and future work.
6.1 Contributions
The main contributions of this thesis are as follows:
Theoretical understanding of the interplay between routing and in-network compression: Two different scenarios, homogeneous and heterogeneous, are shown to have different near-optimal routing structures. This problem was subsequently addressed by other researchers, primarily for the homogeneous case; while they use different models, their results agree with our basic conclusions. In particular, when the spatial correlation is uniform, shortest-path routing is order-optimal for the homogeneous case, where every node is capable of compression computations.
Design of algorithms for spatial compression: The first is a wavelet compression algorithm that takes advantage of the broadcast nature of wireless communications. This algorithm works for any type of data over any connected 2D topology. The second is a compressed sensing based scheme that extends the classical framework to the multi-hop scenario. This scheme works when the data is known to be sparse in some known spatial basis.
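The classical recovery step that such schemes build on can be illustrated with orthogonal matching pursuit [TG07]: given measurements y = Ax of a k-sparse x, greedily select the dictionary column most correlated with the residual. This is a generic single-hop sketch in Python, not the thesis's multi-hop scheme:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: estimate a k-sparse x from y = A @ x."""
    residual, support, coef = y.astype(float), [], np.zeros(0)
    for _ in range(k):
        # Greedy step: column most correlated with the current residual.
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        # Least-squares fit on the chosen support, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

# 50 sensor readings that are 3-sparse in the identity basis, observed
# through 20 random projections; at this mild sparsity level OMP
# typically recovers the support exactly.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 50)) / np.sqrt(20)
x = np.zeros(50)
x[[3, 17, 41]] = [1.0, -2.0, 0.5]
x_hat = omp(A, A @ x, 3)
```

In the multi-hop setting the measurement matrix A is additionally shaped by the routing tree, which is precisely the extension studied in this thesis.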
SenZip, an architectural view of distributed compression as a service: A new “compression layer” is defined to interact with standard networking components to achieve the configuration (and dynamic reconfiguration) and computations required for compression in a completely distributed fashion.
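The layered interaction can be sketched in a few lines. The sketch below is a hypothetical Python model (the actual SenZip modules are TinyOS/nesC components wired to a routing engine); all names are illustrative:

```python
class RoutingLayer:
    """Stand-in for the networking component, e.g. a collection tree."""

    def __init__(self):
        self._listeners = []

    def on_parent_change(self, callback):
        # The compression layer registers for topology events.
        self._listeners.append(callback)

    def set_parent(self, new_parent):
        for notify in self._listeners:
            notify(new_parent)

class CompressionLayer:
    """Configures, and dynamically reconfigures, transform state to
    follow the routes chosen by the networking components."""

    def __init__(self, routing):
        self.parent = None
        routing.on_parent_change(self._reconfigure)

    def _reconfigure(self, new_parent):
        self.parent = new_parent   # transform neighborhood follows the route

routing = RoutingLayer()
compression = CompressionLayer(routing)
routing.set_parent(4)              # a route change reconfigures compression
```

The key design point is that the compression layer never computes routes itself; it only observes the routing component's decisions and keeps the transform configuration consistent with them.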
System design and software development: Software modules for a SenZip compression
service that works on top of the Collection Tree Protocol (which provides the networking
components). This concretely demonstrates that SenZip is a working architecture. The
code has been released to tinyos-contribs.
6.2 Future work
Some directions for future work on the analysis, algorithm design and system development
for distributed compression follow.
There is very little work on analyzing the case when regions with correlated data are not geographically proximate [DBF07]. The analysis presented in this thesis is limited to the case of uniform spatial correlations. It will be interesting to extend it to scenarios with non-uniform spatial correlations.
In the design of wavelet based algorithms, we have assumed that the optimal routing
is known and that the compression operations are configured for the chosen routes. The
design of algorithms that jointly optimize routing and compression needs more attention.
Our analysis and algorithms focus primarily on ways to exploit spatial correlations. The implicit understanding is that temporal correlations can be handled at each node individually. However, algorithms that account for temporal correlations across nodes, and for the general space of spatio-temporal correlations, need further research. Further work is also needed on distributed compression algorithms to understand how they might fit into the SenZip architecture; simplifications and modifications to these algorithms may be needed for them to correspond to the abstraction used in the SenZip design.
With some extensions, the SenZip based system can allow distributed compression to be widely adopted in data gathering sensor networks. We are working on a TinyOS Enhancement Proposal (TEP) for the standardization of SenZip. The system currently provides a few options in terms of spatio-temporal transforms and encoding schemes. It will be useful to develop and provide a suite of compression schemes in TinyOS. Further, there is a need for a manual to help users choose the right scheme based on domain- and application-specific knowledge. Improvements are needed to ensure robustness: the current distributed initialization is based on a simple broadcast flooding scheme, and practical deployments will require a reliable flooding scheme. The reconstruction code needs to be extended to handle changes in topology and packet losses. For long-lived operation, the current system needs to be integrated with a sleep-scheduling mechanism. Finally, the system needs to be tested at scale, i.e., in medium and large sized networks.
References
[BK01] Stephen F. Bush and Amit Kulkarni. Active Networks and Active Network Management: A Proactive Management Framework. Kluwer Academic/Plenum Publishers, 2001.
[CBLV04] R. Cristescu, B. Beferull-Lozano, and M. Vetterli. On network correlated data gathering. In Proceedings of the 23rd Conference of the IEEE Communications Society. IEEE Communications Society, March 2004.
[CBLVW06] R. Cristescu, B. Beferull-Lozano, M. Vetterli, and R. Wattenhofer. Network correlated data gathering with explicit communication: NP-completeness and algorithms. IEEE/ACM Transactions on Networking, 14(1):41–54, February 2006.
[CDE+05] D. Culler, P. Dutta, C. T. Ee, R. Fonseca, J. Hui, P. Levis, J. Polastre, S. Shenker, I. Stoica, G. Tolle, and J. Zhao. Towards a sensor network architecture: Lowering the waistline. In Proceedings of the Tenth Workshop on Hot Topics in Operating Systems. USENIX, June 2005.
[CDHH06] David Chu, Amol Deshpande, Joseph Hellerstein, and Wei Hong. Approximate data collection in sensor networks using probabilistic models. In IEEE International Conference on Data Engineering (ICDE), pages 3–7. IEEE, April 2006.
[CO05] A. Ciancio and A. Ortega. A distributed wavelet compression algorithm for wireless multihop sensor networks using lifting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, March 2005.
[CPOK06] A. Ciancio, S. Pattem, A. Ortega, and B. Krishnamachari. Energy-efficient data representation and routing for wireless sensor networks based on a distributed wavelet compression algorithm. In Proceedings of the ACM/IEEE International Symposium on Information Processing in Sensor Networks (IPSN). Springer Verlag, April 2006.
[CRT06] E. J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2):489–509, February 2006.
[CT91] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley, New York, NY, USA, 1991.
[DBF07] T. Dang, N. Bulusu, and W. Feng. RIDA: A robust information-driven data compression architecture for irregular wireless sensor networks. In Proceedings of the 4th European Workshop on Sensor Networks. IEEE, January 2007.
[DDT+08] M. Duarte, M. Davenport, D. Takhar, J. Laska, T. Sun, K. Kelly, and R. Baraniuk. Single-pixel imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2):83–91, March 2008.
[Don06] D. L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, April 2006.
[DS98] I. Daubechies and W. Sweldens. Factoring wavelet transforms into lifting steps. Journal of Fourier Analysis and Applications, 4(3):247–269, March 1998.
[EGGM04] M. Enachescu, A. Goel, R. Govindan, and R. Motwani. Scale-free aggregation in sensor networks. In 1st International Workshop on Algorithmic Aspects of Wireless Sensor Networks, pages 71–84. Springer-Verlag, July 2004.
[FGJL07] R. Fonseca, O. Gnawali, K. Jamieson, and P. Levis. Four-bit wireless link estimation. In Proceedings of the Sixth ACM Workshop on Hot Topics in Networks. ACM, November 2007.
[GBR] GBROOS. Great Barrier Reef Ocean Observing System. http://imos.org.au/gbroos.html/.
[GDV06] M. Gastpar, P. L. Dragotti, and M. Vetterli. The distributed Karhunen-Loeve transform. IEEE Transactions on Information Theory, 52(12):5177–5196, December 2006.
[GE03] A. Goel and D. Estrin. Simultaneous optimization for concave costs: single sink aggregation or single source buy-at-bulk. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 499–505. ACM/SIAM, January 2003.
[GGP+03] D. Ganesan, B. Greenstein, D. Perelyubskiy, D. Estrin, and J. Heidemann. An evaluation of multi-resolution search and storage in resource-constrained sensor networks. In Proceedings of the First ACM Conference on Embedded Networked Sensor Systems, November 2003.
[HBSA04] T. He, B. M. Blum, J. A. Stankovic, and T. F. Abdelzaher. AIDA: Adaptive application-independent data aggregation in wireless sensor networks. ACM Transactions on Embedded Computing Systems, Special Issue on Dynamically Adaptable Embedded Systems, 3(2):426–457, May 2004.
[HCJB04] W. Hu, C. T. Chou, S. Jha, and N. Bulusu. Deploying long-lived and cost-effective hybrid sensor networks. In The 1st Workshop on Broadband Advanced Sensor Networks. IEEE Communications Society, October 2004.
[IEGH02] C. Intanagonwiwat, D. Estrin, R. Govindan, and J. S. Heidemann. Impact of network density on data aggregation in wireless sensor networks. In Proceedings of the 22nd International Conference on Distributed Computing Systems, pages 457–458. IEEE Computer Society, July 2002.
[IGE+03] C. Intanagonwiwat, R. Govindan, D. Estrin, J. S. Heidemann, and F. Silva. Directed diffusion for wireless sensor networking. IEEE/ACM Transactions on Networking, 11(1):2–16, January 2003.
[KEW02] B. Krishnamachari, D. Estrin, and S. W. Wicker. The impact of data aggregation in wireless sensor networks. In Proceedings of the 22nd International Conference on Distributed Computing Systems, pages 575–578. IEEE Computer Society, July 2002.
[LDP07] M. Lustig, D. Donoho, and J. M. Pauly. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6):1182–1195, December 2007.
[LPS+09] Sungwon Lee, Sundeep Pattem, Maheswaran Sathiamoorthy, Antonio Ortega, and Bhaskar Krishnamachari. Spatially-localized compressed sensing and routing in multi-hop sensor networks. In Proceedings of the 3rd International Conference on Geosensor Networks, July 2009.
[LTP05] H. Luo, Y. Tong, and G. Pottie. A two-stage DPCM scheme for wireless sensor networks. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, April 2005.
[MFHH] Sam Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. TAG: A tiny aggregation service for ad hoc sensor networks. In Proceedings of the 5th USENIX Symposium on Operating Systems Design and Implementation, December 2002.
[PKG04] S. Pattem, B. Krishnamachari, and R. Govindan. The impact of spatial correlation on routing with compression in wireless sensor networks. In Proceedings of the ACM/IEEE International Symposium on Information Processing in Sensor Networks, pages 28–35. Springer-Verlag, April 2004.
[PKG08] S. Pattem, B. Krishnamachari, and R. Govindan. The impact of spatial correlation on routing with compression in wireless sensor networks. ACM Transactions on Sensor Networks, 4(4), August 2008.
[PLS+09] S. Pattem, S. Lee, M. Sathiamoorthy, A. Ortega, and B. Krishnamachari. Compressed sensing and routing in multi-hop sensor networks. Technical Report CENG-2009-4, USC, October 2009.
[PR99] S. S. Pradhan and K. Ramchandran. Distributed source coding using syndromes (DISCUS): Design and construction. In Proceedings of the IEEE Data Compression Conference, pages 158–167. IEEE Computer Society, March 1999.
[PSC+09] S. Pattem, G. Shen, Y. Chen, B. Krishnamachari, and A. Ortega. SenZip: An architecture for distributed en-route compression in wireless sensor networks. In Proceedings of the Workshop on Sensor Networks for Earth and Space Science Applications. IEEE/ACM, April 2009.
[rfc90] Compressing TCP/IP headers for low-speed serial links, IETF RFC 1144. http://tools.ietf.org/html/rfc1144, February 1990.
[rfc99] Compressing IP/UDP/RTP headers for low-speed serial links, IETF RFC 2508. http://tools.ietf.org/html/rfc2508, February 1999.
[rfc01] Robust header compression, IETF RFC 3095. http://tools.ietf.org/html/rfc3095, July 2001.
[sen] SenZip code release. http://tinyos.cvs.sourceforge.net/viewvc/tinyos/tinyos-2.x-contrib/usc/senzip/.
[SO08a] G. Shen and A. Ortega. Optimized distributed 2D transforms for irregularly sampled sensor network grids using wavelet lifting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2008, Las Vegas, NV, USA, 2008.
[SO08b] G. Shen and A. Ortega. Optimized distributed 2D transforms for irregularly sampled sensor network grids using wavelet lifting. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, March 2008.
[SPO09] G. Shen, S. Pattem, and A. Ortega. Energy-efficient graph-based wavelets for distributed coding in wireless sensor networks. In Proceedings of the 34th International Conference on Acoustics, Speech, and Signal Processing. IEEE, April 2009.
[SS02] A. Scaglione and S. D. Servetto. On the interdependence of routing and data compression in multi-hop sensor networks. In Proceedings of The 8th ACM International Conference on Mobile Computing and Networking, pages 140–147. ACM, August 2002.
[SS05] A. Scaglione and S. D. Servetto. On the interdependence of routing and data compression in multi-hop sensor networks. Wireless Networks, 11(1-2):149–160, January 2005.
[TDJ+07] A. Tavakoli, P. Dutta, J. Jeong, S. Kim, J. Ortiz, P. Levis, and S. Shenker. A modular sensornet architecture: Past, present, and future directions. In Proceedings of the International Workshop on Wireless Sensornet Architecture, April 2007.
[TG07] J. Tropp and A. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory, 53(12):4655–4666, December 2007.
[Tin] TinyOS. An operating system for wireless embedded sensor networks. http://www.tinyos.net/.
[TM06] D. Tulone and S. Madden. PAQ: Time series forecasting for approximate query answering in sensor networks. In Proceedings of the European Conference on Wireless Sensor Networks, pages 21–37. IEEE, February 2006.
[TMEC+10] A. Terzis, R. Musaloiu-E., J. Cogan, K. Szlavecz, A. Szalay, J. Gray, S. Ozer, M. Liang, J. Gupchup, and R. Burns. Wireless sensor networks for soil science. International Journal on Sensor Networks, Special Issue on Environmental Sensor Networks, 7(1/2):53–70, January 2010.
[tmo] Tmote Sky device. http://www.snm.ethz.ch/Projects/TmoteSky.
[tos] TinyOS 2.0 Network Protocol Working Group. Collection Tree Protocol, TinyOS Enhancement Proposal (TEP) 123. http://www.tinyos.net/tinyos-2.x/doc/.
[TVSO09] Paula Tarrio, Giuseppe Valenzise, Godwin Shen, and Antonio Ortega. Distributed network configuration for wavelet-based compression in sensor networks. In Proceedings of the 3rd International Conference on Geosensor Networks, July 2009.
[TW96] David L. Tennenhouse and David J. Wetherall. Towards an active network architecture. ACM SIGCOMM Computer Communication Review, 26(2):5–18, March 1996.
[VK91] M. Vetterli and J. Kovacevic. Wavelets and Subband Coding. Prentice Hall, Upper Saddle River, NJ, USA, 1991.
[vRW04] P. von Rickenbach and R. Wattenhofer. Gathering correlated data in sensor networks. In Proceedings of the DIALM-POMC Joint Workshop on Foundations of Mobile Computing, pages 60–66. ACM, October 2004.
[WALJ+06] G. Werner-Allen, K. Lorincz, J. Johnson, J. Lees, and M. Welsh. Fidelity and yield in a volcano monitoring sensor network. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation. USENIX, December 2006.
[WB99] M. Widmann and C. Bretherton. 50 km resolution daily precipitation for the Pacific Northwest, 1949-94. Online data set located at <http://www.jisao.washington.edu/data sets/widmann>, 1999.
[WGR07] W. Wang, M. Garofalakis, and K. Ramchandran. Distributed sparse random projections for refinable approximation. In Proceedings of the ACM/IEEE International Symposium on Information Processing in Sensor Networks, pages 331–339. Springer Verlag, April 2007.
[ZCH07] Y. Zhang, S. Chatterjea, and P. Havinga. Experiences with implementing a distributed and self-organizing scheduling algorithm for energy-efficient data gathering on a real-life sensor network platform. In Proceedings of the First IEEE International Workshop: From Theory to Practice in Wireless Sensor Networks. IEEE, June 2007.
[ZK04a] M. Zuniga and B. Krishnamachari. Analyzing the transitional region in low power wireless links. In Proceedings of the First IEEE International Conference on Sensor and Ad hoc Communications and Networks. IEEE, October 2004.
[ZK04b] M. Zuniga and B. Krishnamachari. Realistic wireless link quality model and generator. Available online for download at <http://ceng.usc.edu/ anrg/downloads.html>, 2004.
[ZSS05] Y. Zhu, K. Sundaresan, and R. Sivakumar. Practical limits on achievable energy improvements and useable delay tolerance in correlation aware data gathering in wireless sensor networks. In Proceedings of the 2nd IEEE Communications Society Conference on Sensor and Ad Hoc Communications and Networks. IEEE, September 2005.