On Computing Compression Trees for Data Collection in Wireless Sensor Networks

Jian Li, Amol Deshpande and Samir Khuller

Department of Computer Science, University of Maryland, College Park

Outline

• Introduction
  – Compression tree problem
• Prior approaches
• Approximation algorithm
• Experimental results
• Conclusion

Introduction: Distributed Source Coding (DSC)

• Distributed source coding: Slepian–Wolf coding
  – Allows nodes to use joint coding of correlated data without explicit communication.
  – In theory, minimizes the total amount of data transmitted in a multi-hop network.
  – DSC requires perfect knowledge of the correlations among the nodes, and may return wrong answers if the observed data values deviate from what is expected.
  – Optimal transmission structure: shortest path tree.
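To make the benefit of joint coding concrete, the following minimal sketch compares the bits needed to encode two correlated readings independently with the bits needed when the second reading is coded conditionally on the first. The joint distribution is a hypothetical example, not data from the talk, and the helper names `entropy` and `marginal` are illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(joint, index):
    """Marginal distribution of one coordinate of a joint distribution."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[index]] = out.get(outcome[index], 0.0) + p
    return out

# Hypothetical joint distribution of two correlated binary readings (X1, X2).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

h_x1 = entropy(marginal(joint, 0).values())
h_x2 = entropy(marginal(joint, 1).values())
h_joint = entropy(joint.values())

# Chain rule: H(X1, X2) = H(X1) + H(X2 | X1).
h_x2_given_x1 = h_joint - h_x1

independent_bits = h_x1 + h_x2   # each node encodes on its own
joint_bits = h_joint             # X2 is encoded given X1

print(f"independent: {independent_bits:.3f} bits, joint: {joint_bits:.3f} bits")
```

The gap between the two totals is what correlation-aware schemes try to recover. The sketch only computes entropies and does not implement a Slepian–Wolf code.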

Introduction

• Encoding with explicit communication: Pattem et al. [7], Chu et al. [8], Cristescu et al. [9]
  – Exploit the spatio-temporal correlations through explicit communication among the sensor nodes.
  – These protocols may exploit only a subset of the correlations.
  – They operate without knowing the correlations among the nodes a priori.

Problem: Optimal Compression Tree Problem

• Given a communication topology and a set of correlations among the sensor nodes, find an optimal compression tree that minimizes the total communication cost.
• Assumption:
  – Utilize only second-order marginal or conditional probability distributions.
  – That is, directly utilize only pairwise correlations between the sensor nodes.

Prior Approaches: IND

Prior Approaches: Cluster

Prior Approaches: DSC

Prior Approaches: Compression Tree

Communication Cost

• Necessary Communication (NC): [formula missing]
• Intra-source Communication (IC):
  IC cost = Total cost − NC cost = (6 + 3ε) − (4 + 5ε) = 2 − 2ε, where ε = H(Xi|Xj)

Solution Space

• Subgraphs of G (SG)
  – Compress Xi using Xj only if i and j are neighbors.
• The WL-SG model: uniform entropy and conditional entropy assumption
  – Assume that H(Xi) = 1 for all i, and H(Xi|Xj) = ε for all adjacent pairs of nodes (Xi, Xj).
• Weakly Connected Dominating Set (WCDS) problem
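For readers unfamiliar with the WCDS problem, the sketch below checks the standard textbook definition: a set S is a weakly connected dominating set if every node is in S or adjacent to S, and the graph stays connected when only edges with at least one endpoint in S are kept. The function name `is_wcds` and the adjacency-dict representation are illustrative assumptions, not part of the talk.

```python
def is_wcds(adj, candidate):
    """Check whether `candidate` is a weakly connected dominating set.

    adj: dict mapping each node to the set of its neighbors (undirected).
    candidate: iterable of nodes proposed as the set S.
    """
    s = set(candidate)
    nodes = set(adj)
    if not s:
        return False

    # Domination: every node outside S needs a neighbor in S.
    for v in nodes - s:
        if not (adj[v] & s):
            return False

    # Weak connectivity: traverse using only edges that touch S.
    start = next(iter(s))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen and (u in s or v in s):
                seen.add(v)
                stack.append(v)
    return seen == nodes

# Path graph 0 - 1 - 2 - 3 - 4.
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(is_wcds(path5, {1, 3}))  # True: 1 and 3 are not adjacent, yet S is weakly connected
print(is_wcds(path5, {0, 4}))  # False: node 2 is not dominated
```

The first call shows why WCDS is a relaxation of the connected dominating set problem: {1, 3} does not induce a connected subgraph, but every edge of the path touches it.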

WL-SG Model

The approach for the CDS problem that gives a 2H_Δ approximation [19] gives an (H_Δ + 1) approximation for WCDS [20], where H_Δ is the Δ-th harmonic number and Δ is the maximum node degree.
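To give a feel for how these factors grow, the short sketch below evaluates them for a given maximum degree. It assumes the garbled symbols on the slide are H_Δ, the Δ-th harmonic number with Δ the maximum node degree. The function names are illustrative.

```python
def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

def approximation_factors(max_degree):
    """Factors quoted on the slide, assuming they read 2*H_Delta and H_Delta + 1."""
    h = harmonic(max_degree)
    return {"cds": 2 * h, "wcds": h + 1}

for degree in (4, 16, 64):
    f = approximation_factors(degree)
    print(f"max degree {degree}: CDS {f['cds']:.2f}, WCDS {f['wcds']:.2f}")
```

Because H_Δ grows like ln Δ, both factors grow only logarithmically with the maximum degree of the network.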

The Generic Greedy Framework

• The main algorithm constructs a compression tree greedily, choosing subtrees to merge in each iteration.

• Step 1:
  – Start with an empty graph F1 that consists of only isolated nodes.
• Step 2 (iteration):
  – In each iteration, combine some trees into a new, larger tree by choosing the most cost-effective treestar.
• Step 3:
  – Terminate when only one tree is left.
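The three steps above can be sketched as follows. This is a simplified illustration, not the authors' algorithm. It assumes a hypothetical cost model in which a candidate star consists of a center node plus the cheapest edge into each of k other trees, and its cost-effectiveness is (center_cost + sum of edge costs) / k. The slides do not give the actual treestar definition or cost function. The names `greedy_merge` and `center_cost` are illustrative.

```python
def greedy_merge(nodes, edges, center_cost=1.0):
    """Greedily merge trees until one spanning tree remains.

    nodes: list of node ids.
    edges: dict {(u, v): cost} for an undirected graph.
    center_cost: hypothetical fixed cost charged once per chosen star.
    Returns the list of (center, leaf) edges that were selected.
    """
    parent = {v: v for v in nodes}      # union-find over the current trees

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    adj = {v: [] for v in nodes}
    for (u, v), cost in edges.items():
        adj[u].append((v, cost))
        adj[v].append((u, cost))

    chosen = []
    trees_left = len(nodes)             # Step 1: every node is its own tree
    while trees_left > 1:               # Step 3: stop when one tree is left
        best = None                     # (ratio, center, [leaf, ...])
        for center in nodes:
            root = find(center)
            cheapest = {}               # other tree root -> (cost, leaf)
            for v, cost in adj[center]:
                r = find(v)
                if r != root and (r not in cheapest or cost < cheapest[r][0]):
                    cheapest[r] = (cost, v)
            links = sorted(cheapest.values(), key=lambda t: t[0])
            total = center_cost
            for k, (cost, _) in enumerate(links, start=1):
                total += cost
                ratio = total / k       # cost per tree absorbed
                if best is None or ratio < best[0]:
                    best = (ratio, center, [leaf for _, leaf in links[:k]])
        if best is None:
            raise ValueError("graph is disconnected")
        _, center, leaves = best        # Step 2: merge the best star
        for leaf in leaves:
            parent[find(leaf)] = find(center)
            chosen.append((center, leaf))
            trees_left -= 1
    return chosen

star = {(0, 1): 1.0, (0, 2): 1.0, (0, 3): 1.0}
print(greedy_merge([0, 1, 2, 3], star))  # [(0, 1), (0, 2), (0, 3)]
```

With a positive `center_cost`, absorbing several trees at once is cheaper per tree than merging pairs, which is why the star around node 0 is picked in a single iteration here. With `center_cost=0` the procedure degenerates to picking the cheapest single edge each round.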


Approximation factor

Experimental Results

• Rainfall Data:
  – We use an analytical expression of the entropy that was derived by Pattem et al. [7] for a data set containing precipitation data collected in the states of Washington and Oregon during 1949-1994.

Rainfall Data:

Intel Lab Data:

Conclusion

• This paper addressed the problem of finding an optimal or near-optimal compression tree for a given sensor network:
  – A compression tree is a directed tree over the sensor network nodes such that the value of a node is compressed using the value of its parent.
• We draw connections between the data collection problem and weakly connected dominating sets.
  – We use this to develop novel approximation algorithms for the problem.
• We present comparative results on several synthetic and real-world datasets,
  – showing that our algorithms construct near-optimal compression trees that yield a significant reduction in the data collection cost.
