
Vrije Universiteit Brussel

Multiterminal Source Coding with Copula Regression for Wireless Sensor Networks Gathering Diverse Data
Zimos, Evangelos; Toumpakaris, Dimitris; Munteanu, Adrian; Deligiannis, Nikolaos

Published in: IEEE Sensors Journal

DOI: 10.1109/JSEN.2016.2585042

Publication date: 2017

Document Version: Final published version

Citation for published version (APA): Zimos, E., Toumpakaris, D., Munteanu, A., & Deligiannis, N. (2017). Multiterminal Source Coding with Copula Regression for Wireless Sensor Networks Gathering Diverse Data. IEEE Sensors Journal, 17(1), 139-150. [7499830]. https://doi.org/10.1109/JSEN.2016.2585042


FINAL VERSION 1

Multiterminal Source Coding with Copula Regression for Wireless Sensor Networks Gathering Diverse Data

Evangelos Zimos, Student Member, IEEE, Dimitris Toumpakaris, Member, IEEE, Adrian Munteanu, Member, IEEE, and Nikos Deligiannis, Member, IEEE

Abstract: Efficient data compression at a low processing and communication cost is a key challenge in wireless sensor networks. In this paper, we propose a novel multiterminal source code design, which, contrary to prior work, utilizes both the intra- and the inter-sensor data dependencies. The former is exploited by applying simple DPCM followed by arithmetic entropy coding at each distributed encoder. This approach limits the encoding complexity and provides for a flexible design that adapts to variations in the number of operating sensors. Moreover, we propose a regression method applied at the joint decoder, which aims at leveraging the inter-sensor data dependencies. Unlike existing work that focuses on homogeneous data types, the proposed method makes use of copula functions, namely, a statistical model that captures the dependence structure amongst heterogeneous data types. Experimentation using real sensor measurements, taken from the Intel-Berkeley database, shows that the proposed system achieves significant compression improvements compared to state-of-the-art multiterminal and distributed source coding schemes.

Index Terms: Wireless sensor networks (WSNs), multiterminal (MT) source coding, distributed source coding (DSC), copula regression, differential pulse-code modulation (DPCM).

I. INTRODUCTION

Wireless sensor networks (WSNs) play a key role in a plethora of emerging applications in a wide span of disciplines, ranging from smart cities and smart water systems to smart metering, home automation and smart farming and agriculture [1]. The vision of such systems is realized by the emerging Internet of Things (IoT), which is supported by the integration of WSNs in the generic internet infrastructure via the 6LoWPAN/IPv6 standard [2]. In the context of these applications, wireless sensors collect diverse correlated information such as light, pressure, temperature, or humidity data, process it, and then transmit it to central nodes for storage and/or further processing [1].

Since wireless sensors are typically powered by batteries that cannot easily be changed or recharged, the primary constraint in the design of WSNs is energy consumption. Power savings can be achieved by reducing the radio emission of the sensors, which, therefore, calls for efficient compression of the transmitted data. To this end, the intra- and inter-sensor data dependencies need to be effectively leveraged without increasing the computational effort at the sensor nodes and without requiring inter-sensor communication. The encoding complexity at the sensor nodes should be as low as possible and the computational burden should be shifted towards energy-robust central nodes (e.g., base stations, fusion centers). Moreover, the encoding design needs to be flexible in terms of rate allocation so as to avoid continuous reconfiguration.

E. Zimos, A. Munteanu, and N. Deligiannis are with the Department of Electronics and Informatics, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium and with iMinds, Gaston Crommenlaan 8 (b102), 9050 Ghent, Belgium.

D. Toumpakaris is with the Department of Electrical and Computer Engineering, University of Patras, Rio, Greece 265 00.

Existing works [3]–[5] propose conventional compression algorithms for WSNs, where low-memory differential pulse-code modulation (DPCM) [6] followed by entropy coding is used. Such compression schemes exploit the intra-sensor data dependencies, namely, the dependencies among consecutive samples collected by each sensor. In order to leverage the dependence among data collected by different nodes, the conventional predictive coding paradigm requires that data be exchanged between the nodes, which in turn implies that inter-sensor communication is established. However, this encoding strategy introduces additional radio transmission requirements for the sensors, thereby leading to rapid battery depletion.

An alternative strategy for efficient data compression in WSNs adheres to distributed source coding (DSC), a paradigm that leverages inter-sensor data dependencies at the decoder side. DSC was initiated by Slepian and Wolf [7], who showed that by separate encoding, two correlated sources can be compressed to their joint entropy with vanishing decoding error probability as the code length goes to infinity. Later, Wyner and Ziv [8] established the rate-distortion bound for lossy compression with decoder side information. They showed that when the source and the side information are jointly Gaussian and the mean-squared error (MSE) is used as the distortion metric, there is no performance loss incurred by not using the side information at the encoder. Recently, this no-rate-loss property has been extended to the case where the source and the side information are binary and correlated by means of the Z-channel [9].

Berger [10] and Tung [11] introduced the multiterminal (MT) source coding problem, which refers to separate lossy encoding and joint decoding of two (or more) correlated sources. From a theoretical perspective, the problem is shown to be challenging: an achievable rate region for the general MT problem is still unknown, but inner and outer bounds have been devised [10], [11]. Theoretical studies have focused on special cases such as the quadratic Gaussian, where Gaussian sources and a quadratic distortion criterion are assumed [12].

Towards practical implementations of DSC for WSNs, a two-sensor Slepian-Wolf (SW) coding scheme for temperature monitoring was deployed in [13], where rate adaptation was achieved by means of an entropy tracking algorithm. An alternative SW design using Raptor codes was proposed in [14], [15] for cluster-based WSNs that measure temperature data. Instead of using SW coding as in [13]–[15], the work in [16] devised a Wyner-Ziv (WZ) code construction for WSNs measuring temperature data, which comprised quantization followed by binarization and LDPC encoding. Focusing on the application of wind farm monitoring, an MT code construction [17] was developed to compress wind speed measurements in [18]. Existing DSC designs consider a limited number of sensors (typically two or three), since SW coding for many data sources is difficult to implement in practice. To address this limitation, the authors of [19] and [20] proposed to replace SW coding with entropy coding, thereby obtaining practical code constructions.

Prior studies consider the compression of homogeneous data types such as temperature [13]–[15] or wind speed data [18]. However, many up-to-date applications involve various sensors of heterogeneous modalities measuring diverse yet correlated data (e.g., temperature, humidity, light). In this work, we propose a novel MT source coding scheme that achieves efficient compression by leveraging the dependencies among diverse data types produced by multiple heterogeneous data sources. Our specific contributions are as follows:

• We propose a novel code design for multisensory WSNs, where both intra- and inter-sensor data dependencies are exploited via DPCM and MT source coding, respectively. The proposed system combines the merits of conventional predictive coding [3]–[5], where only intra-sensor data dependencies are leveraged, and DSC systems [14], [15], [18]–[20], which only use inter-sensor dependencies.

• The proposed design is characterized by (i) lightweight encoding, as it applies DPCM to utilize the intra-sensor data dependencies instead of complex vector quantization or trellis-coded quantization (TCQ) as in other works [21]; (ii) optimized compression performance, as it deploys a scalar Lloyd-Max quantizer at each encoder instead of a simple uniform scalar quantizer (USQ) used in other schemes [20]; and (iii) flexibility, since, contrary to classical DSC systems [17], [22]–[27], no system reconfiguration is required when a subset of sensors is not functional.

• Previous work [13], [16], [20] has focused on WSNs collecting homogeneous data and has used a multivariate normal or Laplace distribution to describe the inter-sensor data dependencies. However, in this work, we exploit the data structure among multiple sensors that collect data of different types. In order to accurately express the symmetric and asymmetric dependencies across diverse data sources, we propose the use of statistical models based on copula functions¹ [30], [31]. To this end, we use well-known Elliptical copulas, such as the normal and Student's t copulas, as well as the Clayton copula, which belongs to the Archimedean copula family. We show that copula functions capture the dependencies among heterogeneous data sources more accurately than the conventional multivariate modeling approach.

• The proposed system embodies a copula regression method to leverage the inter-sensor data dependencies at the decoder. In contrast to alternative copula regression approaches [32], [33], the proposed algorithm provides for accurate inference at a reasonable complexity.

• The proposed coding scheme is evaluated using real sensor measurements taken from the well-established Intel-Berkeley database [34].

The remainder of the paper is organized as follows: Section II gives a brief description of MT source coding without SW compression. Section III presents the proposed coding design, whereas Section IV elaborates on the proposed elliptical copula regression. Experimental results are provided in Section V. Section VI draws the conclusion of the work.

II. BACKGROUND ON MT SOURCE CODING

We consider a WSN comprising $L$ sensors that collect data produced by correlated sources $X_1, X_2, \ldots, X_L$, which take values from $L$ continuous alphabets $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_L$ and are drawn i.i.d. according to the joint probability density function (pdf) $f_{\mathbf{X}}(x_1, x_2, \ldots, x_L)$. Each sensor, indexed by $l \in \mathcal{I}_L = \{1, 2, \ldots, L\}$, gathers a sequence of $n$ source samples and forms a data block $\mathbf{x}_l = [x_l(1), x_l(2), \ldots, x_l(n)]$. The data are encoded using $L$ separate encoding functions

$$\varphi_l : \mathcal{X}_l^n \to \{1, \ldots, 2^{nR_l}\}, \quad l \in \mathcal{I}_L, \tag{1}$$

where each $\varphi_l$ compresses the source block $\mathbf{x}_l$ at rate $R_l$ by assigning to it a discrete index $\varphi_l(\mathbf{x}_l)$. The joint decoder is a function

$$\theta : \{1, \ldots, 2^{nR_1}\} \times \cdots \times \{1, \ldots, 2^{nR_L}\} \to \mathcal{X}_1^n \times \cdots \times \mathcal{X}_L^n, \tag{2}$$

that reconstructs the data blocks of all sensors, $[\hat{\mathbf{x}}_1, \hat{\mathbf{x}}_2, \ldots, \hat{\mathbf{x}}_L]$, based on the observed index tuple $[\varphi_1(\mathbf{x}_1), \varphi_2(\mathbf{x}_2), \ldots, \varphi_L(\mathbf{x}_L)]$.

Let $d_l(\cdot)$ be a distortion measure for the sensor $l$, defined as $d_l : \mathcal{X}_l \times \mathcal{X}_l \to \mathbb{R}^+$. Given a distortion tuple $\mathbf{D} = [D_1, D_2, \ldots, D_L]$, the rate tuple $\mathbf{R} = [R_1, R_2, \ldots, R_L]$ is achievable if, for any $\epsilon > 0$, there exist a large enough $n$, $L$ source encoder functions $\varphi_l$, and a decoder function $\theta$ such that the distortion constraint

$$\frac{1}{n} \sum_{i=1}^{n} E\left[d_l(x_l(i), \hat{x}_l(i))\right] \le D_l + \epsilon$$

is satisfied for each $l \in \mathcal{I}_L$. The achievable rate-distortion region $\mathcal{R}^*(\mathbf{D})$ is the convex hull of all achievable rate tuples $\mathbf{R}$.

Two code designs for the two-terminal Gaussian MT problem are proposed in [17], where TCQ is combined with SW coding. In the first scheme, labeled as asymmetric SW Coded Quantization (SWCQ), each data block of one source, say $X_1$, is quantized and entropy encoded so as to act as side information to encode the corresponding data block of $X_2$ by means of WZ coding. Then the decoded information is linearly combined to produce side information that is used to further refine $X_1$. Finally, the reconstructed data blocks, denoted by $\hat{\mathbf{x}}_1$ and $\hat{\mathbf{x}}_2$, respectively, are passed to a linear estimator that yields the final decoded estimates $[\tilde{\mathbf{x}}_1, \tilde{\mathbf{x}}_2]$. In the second scheme [17], referred to as symmetric SWCQ, data blocks produced by both sources are quantized and compressed using symmetric SW coding, based on the concept of channel code partitioning. At the decoder, symmetric SW decoding is followed by inverse quantization to reconstruct the two blocks. Similarly to asymmetric SWCQ, a linear estimation step is finally applied. However, extending the designs in [17] to multiple sources is challenging, as practical SW coding based on channel codes becomes difficult to implement.

¹ Despite their long history in econometrics, copula functions have only recently been explored in signal processing [15], [28], [29].

To increase flexibility at the expense of compression performance, the authors of [19] studied the specific MT source coding scenario where SW coding is replaced with simple entropy coding. A practical realization of this scheme is presented in [20], where $L$ sensors monitor homogeneous data types. Each encoder performs USQ followed by arithmetic entropy encoding. At the decoder, after recovering the blocks $\hat{\mathbf{x}}_l$ from all sensors, a second estimation stage is applied, where the dependencies among the sensed data are exploited through Gaussian regression, as explained below.

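Gaussian Regression Stage. Let the random vector $\mathbf{X} = [X_1, \ldots, X_l, \ldots, X_L]$, which describes the data produced from all sensors at instant $i \in \{1, 2, \ldots, n\}$, follow a multivariate normal distribution, $\mathbf{X} \sim \mathcal{N}(\boldsymbol{\mu}_{\mathbf{X}}, \boldsymbol{\Sigma}_{\mathbf{X}})$, with mean value $\boldsymbol{\mu}_{\mathbf{X}}$ and covariance matrix $\boldsymbol{\Sigma}_{\mathbf{X}}$. Moreover, let the quantization noise $Z_l$, which corrupts each component $X_l$ in $\mathbf{X}$, be additive, independent of $X_l$, and temporally independent. The dequantized data random variable at instant $i$ from the $l$-th sensor is given by $\hat{X}_l = X_l + Z_l$. The variance of the quantization noise $Z_l$ can be calculated as

$$\sigma^2_{Z_l} = \frac{\int_{x_l \in Q[x_l(i)]} \left(\hat{x}_l(i) - x_l\right)^2 f_{X_l}(x_l)\, dx_l}{\int_{x_l \in Q[x_l(i)]} f_{X_l}(x_l)\, dx_l}, \tag{3}$$

where $Q[x_l(i)]$ and $\hat{x}_l(i)$ are, respectively, the quantization index and the reconstructed value assigned to $x_l(i)$, and $f_{X_l}(x_l)$ is the marginal pdf of $X_l$. The vectors $\mathbf{X}$ and $\hat{\mathbf{X}} = [\hat{X}_1, \ldots, \hat{X}_l, \ldots, \hat{X}_L]$ are assumed to be jointly Gaussian, i.e.,

$$\begin{bmatrix} \mathbf{X} \\ \hat{\mathbf{X}} \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \boldsymbol{\mu}_{\mathbf{X}} \\ \boldsymbol{\mu}_{\hat{\mathbf{X}}} \end{bmatrix}, \begin{bmatrix} \boldsymbol{\Sigma}_{\mathbf{X}\mathbf{X}} & \boldsymbol{\Sigma}_{\mathbf{X}\hat{\mathbf{X}}} \\ \boldsymbol{\Sigma}^T_{\mathbf{X}\hat{\mathbf{X}}} & \boldsymbol{\Sigma}_{\hat{\mathbf{X}}\hat{\mathbf{X}}} \end{bmatrix} \right), \tag{4}$$

where $\boldsymbol{\Sigma}_{\mathbf{X}\mathbf{X}} = \boldsymbol{\Sigma}_{\mathbf{X}\hat{\mathbf{X}}} = \boldsymbol{\Sigma}^T_{\mathbf{X}\hat{\mathbf{X}}} = \boldsymbol{\Sigma}_{\mathbf{X}}$, and $\boldsymbol{\Sigma}_{\hat{\mathbf{X}}\hat{\mathbf{X}}} = \boldsymbol{\Sigma}_{\mathbf{X}} + \boldsymbol{\Sigma}_{\mathbf{Z}}$, with $\boldsymbol{\Sigma}_{\mathbf{Z}}$ being a diagonal matrix with nonzero elements $\boldsymbol{\Sigma}_{\mathbf{Z}}(l, l) = \sigma^2_{Z_l}$ and $(\cdot)^T$ denoting matrix transpose. Given the dequantized data $\hat{\mathbf{X}}$, the final estimate $\tilde{\mathbf{X}}$ is given by the conditional mean $\boldsymbol{\mu}_{\mathbf{X}|\hat{\mathbf{X}}}$ of $\mathbf{X}|\hat{\mathbf{X}} \sim \mathcal{N}(\boldsymbol{\mu}_{\mathbf{X}|\hat{\mathbf{X}}}, \boldsymbol{\Sigma}_{\mathbf{X}|\hat{\mathbf{X}}})$, that is, [35]

$$\tilde{\mathbf{X}} = \boldsymbol{\mu}_{\mathbf{X}} + \boldsymbol{\Sigma}^T_{\mathbf{X}\hat{\mathbf{X}}} \boldsymbol{\Sigma}^{-1}_{\hat{\mathbf{X}}\hat{\mathbf{X}}} \left(\hat{\mathbf{X}} - \boldsymbol{\mu}_{\hat{\mathbf{X}}}\right). \tag{5}$$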
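The conditional-mean estimate in (5) can be illustrated for two sensors with a few lines of plain Python. This is only a sketch under stated assumptions: the helper name `gaussian_refine` is hypothetical, the quantization noise is taken to be zero-mean (so the mean of the dequantized data equals the source mean), and all numbers in the usage lines are invented for illustration.

```python
def gaussian_refine(x_hat, mu, sigma_x, noise_var):
    """Conditional-mean estimate of eq. (5) for L = 2 sensors.

    x_hat     : dequantized readings [x1_hat, x2_hat]
    mu        : source means [mu1, mu2] (zero-mean noise assumed)
    sigma_x   : 2x2 source covariance matrix Sigma_X
    noise_var : quantization-noise variances, the diagonal of Sigma_Z
    """
    # Covariance of the dequantized data: Sigma_X + Sigma_Z
    a = sigma_x[0][0] + noise_var[0]
    b = sigma_x[0][1]
    c = sigma_x[1][0]
    d = sigma_x[1][1] + noise_var[1]
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # Deviation of the dequantized readings from the mean
    r = [x_hat[0] - mu[0], x_hat[1] - mu[1]]
    # w = (Sigma_X + Sigma_Z)^{-1} (x_hat - mu)
    w = [inv[0][0] * r[0] + inv[0][1] * r[1],
         inv[1][0] * r[0] + inv[1][1] * r[1]]
    # x_tilde = mu + Sigma_X^T w, since the cross-covariance equals Sigma_X
    return [mu[0] + sigma_x[0][0] * w[0] + sigma_x[1][0] * w[1],
            mu[1] + sigma_x[0][1] * w[0] + sigma_x[1][1] * w[1]]


# Invented example: two negatively correlated sources.
mu = [20.0, 42.0]
sigma_x = [[4.0, -3.0], [-3.0, 9.0]]
refined = gaussian_refine([21.0, 40.0], mu, sigma_x, [0.5, 0.5])
print(refined)
```

With zero noise variance the estimate reduces to the dequantized readings themselves, which is a quick sanity check on the matrix algebra.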

III. THE PROPOSED MT SOURCE CODE DESIGN

Fig. 1. The proposed system architecture.

The architecture of the proposed MT source coding system is depicted in Fig. 1. Unlike prior studies, we assume that the sensors collect heterogeneous correlated data (e.g., temperature, humidity, and light). Furthermore, contrary to prior studies [19], [20], we assume that the consecutive data samples, collected by each sensor, are highly correlated. With the aim to make use of this temporal correlation, each sensor $l \in \mathcal{I}_L$ gathers a block of $n$ readings, $\mathbf{x}_l = [x_l(1), x_l(2), \ldots, x_l(n)]$, and applies DPCM encoding [36], the block diagram of which is shown in Fig. 2. The DPCM encoder comprises a linear prediction function and a Lloyd-Max [6] scalar quantization function $Q(\cdot)$. For the $l$-th sensor, the prediction value for the data sample at instant $i$ is given by

$$v_l(i) = \sum_{j=1}^{m} a_j x_l(i - j), \tag{6}$$

where $m$ is the memory length of the predictor. The coefficients $a_j$, $j = 1, \ldots, m$, are chosen so as to minimize the MSE between $x_l(i)$ and $v_l(i)$, and are estimated by solving the Yule-Walker equations [36]. A Lloyd-Max scalar quantizer with $M$ reconstruction levels is used to quantize the prediction error $w_l(i) = x_l(i) - v_l(i)$. The reconstruction points and the quantization regions of the quantizer are determined during a training period. The $l$-th DPCM encoder outputs a block $\mathbf{q}_l = [Q[w_l(1)], Q[w_l(2)], \ldots, Q[w_l(n)]]$ containing the quantization indices of the prediction errors $\mathbf{w}_l = [w_l(1), w_l(2), \ldots, w_l(n)]$. The DPCM block $\mathbf{q}_l$ is then arithmetic entropy encoded [37] at rate $R_l = \frac{1}{n} H(\mathbf{q}_l)$ bits/sample. Arithmetic coding is very efficient, allowing for a compression rate that is very close to the empirical entropy.
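A minimal DPCM sketch may help make the encoder loop, and the matching decoder loop, concrete. Several choices here are assumptions rather than details taken from the paper: the predictor runs on reconstructed samples (the usual closed-loop form, so that encoder and decoder stay in sync), the predictor memory starts at zero, the three-level codebook and the single predictor coefficient are invented rather than trained, and the per-sample empirical entropy is used only as a proxy for the rate achieved by arithmetic coding.

```python
import math
from collections import Counter


def predict(recon, coeffs):
    """Linear prediction from the most recent reconstructed samples."""
    return sum(a * recon[-j] for j, a in enumerate(coeffs, start=1))


def dpcm_encode(x, coeffs, levels):
    """Return the quantization indices of the prediction errors."""
    recon = [0.0] * len(coeffs)   # assumed all-zero initial predictor memory
    indices = []
    for sample in x:
        v = predict(recon, coeffs)            # prediction value
        w = sample - v                        # prediction error
        # Nearest reconstruction level (scalar quantization)
        q = min(range(len(levels)), key=lambda k: abs(levels[k] - w))
        indices.append(q)
        recon.append(v + levels[q])           # what the decoder will rebuild
    return indices


def dpcm_decode(indices, coeffs, levels):
    """Rebuild the samples from the quantization indices."""
    m = len(coeffs)
    recon = [0.0] * m
    for q in indices:
        recon.append(predict(recon, coeffs) + levels[q])
    return recon[m:]


def empirical_entropy(indices):
    """Per-sample empirical entropy of the index sequence, in bits."""
    n = len(indices)
    return -sum(c / n * math.log2(c / n) for c in Counter(indices).values())


readings = [0.5, 1.0, 1.4, 1.5]     # invented sensor readings
coeffs = [1.0]                      # hypothetical first-order predictor
levels = [-0.5, 0.0, 0.5]           # hypothetical reconstruction levels
q = dpcm_encode(readings, coeffs, levels)
decoded = dpcm_decode(q, coeffs, levels)
print(q, decoded, empirical_entropy(q))
```

In a trained system the `levels` list would hold the Lloyd-Max reconstruction points and `coeffs` the Yule-Walker solution; nearest-level search is the encoding rule of such a quantizer, since its decision boundaries lie midway between neighboring levels.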

At the joint decoder, the bitstream received from each sensor $l \in \mathcal{I}_L$ is arithmetic entropy decoded, producing the quantization indices $\mathbf{q}_l$. The reconstructed values $\hat{\mathbf{w}}_l = [\hat{w}_l(1), \hat{w}_l(2), \ldots, \hat{w}_l(n)]$ of the prediction errors are then calculated via inverse quantization. Subsequently, the decoder applies DPCM decoding per sensor for estimating the source blocks $\hat{\mathbf{x}}_l = [\hat{x}_l(1), \hat{x}_l(2), \ldots, \hat{x}_l(n)]$ for all sensors.

Upon reconstructing the source blocks, the joint decoder performs an additional estimation stage, where the dependence structure among the heterogeneous data collected by the various sensors is exploited. To express the joint statistics among diverse correlated data, the proposed system adheres to a modeling approach based on copula functions [31]. Namely, the vector $\hat{\mathbf{x}}(i) = [\hat{x}_1(i), \hat{x}_2(i), \ldots, \hat{x}_L(i)]$, containing the dequantized values from all sensors at instant $i$, is passed to the proposed copula regression algorithm that outputs a refined version, denoted by $\tilde{\mathbf{x}}(i) = [\tilde{x}_1(i), \tilde{x}_2(i), \ldots, \tilde{x}_L(i)]$. Fig. 3 presents the proposed DPCM decoder of each sensor $l$. The copula regression algorithm that we devise in this paper is described in the next section. The final estimates $\tilde{\mathbf{x}}(i)$ are calculated for all $i = 1, 2, \ldots, n$, and are used to estimate the source blocks $\tilde{\mathbf{x}}_l = [\tilde{x}_l(1), \tilde{x}_l(2), \ldots, \tilde{x}_l(n)]$ for all sensors.

Fig. 2. DPCM encoder of memory length $m$ for sensor $l$.

Fig. 3. Proposed DPCM decoder of memory length $m$ for sensor $l$.

IV. THE PROPOSED SEMI-PARAMETRIC COPULA REGRESSION

Focusing on homogeneous data types, previous studies [13], [16], [20] express the inter-sensor data dependencies using the multivariate normal distribution. This implies that the marginal distributions are assumed normal and that the dependency among the data is considered to be linear. However, when dealing with heterogeneous information sources these assumptions may be inaccurate, due to, for example, variations in signal dimensionality across diverse modalities.

In support of this argument, Figs. 4(a) and 4(b) show that the marginal pdfs of the data collected by a temperature and a humidity sensor, respectively, do not follow a Gaussian distribution. In order to accurately express the dependencies among diverse data sources, we propose the use of statistical models based on copula functions [30], [31]. Copula functions can combine heterogeneous sensor data with disparate marginal distributions into a multivariate pdf.

A. Copula Functions

In the literature [31], [38]–[41], there exist various bivariate and multivariate copula families. In this work, we concentrate mainly on the most well-established of them, namely, the Elliptical and the Archimedean copula families. Elliptical copulas provide symmetric expressions and are suitable for applications where the number of the variables (i.e., $L$) increases. Typical examples of Elliptical copulas are the normal copula and the t-copula (alias, Student's copula). It is worth mentioning that Elliptical copulas facilitate the design of regression schemes since they allow for a non-constant degree of association (i.e., the Spearman rank coefficient) between the response variable and the covariates [33]. Contrary to Elliptical copulas, Archimedean copulas are easily derived and capture a wide range of dependence. However, being parameterized by a single parameter, they lack some modeling flexibility. Among others, the Clayton copula is a widely used Archimedean copula, since it has a simple closed-form expression.

Fig. 4. Example marginal pdfs obtained using kernel density estimation for an appropriate smoothing window of bandwidth $h_{X_l}$. (a) Real-valued temperature data ($h_{X_l} = 0.9315$), and (b) real-valued humidity data ($h_{X_l} = 1.6166$). Axes: (a) temperature (°C) versus probability; (b) humidity (%) versus probability.

In our system, copula functions are used to model the statistical distribution of the random vector $\mathbf{X} = [X_1, \ldots, X_l, \ldots, X_L]$ that describes the sensor values at instant $i$ (see Fig. 2).

Definition. Let $F_{X_1}(x_1), F_{X_2}(x_2), \ldots, F_{X_L}(x_L)$ be the continuous marginal cumulative distribution functions (cdfs) of the random variables in $\mathbf{X}$. By applying the probability integral transform [42] on each $X_l$, $l \in \mathcal{I}_L$, the vector $\mathbf{X}$ is transformed into the vector $\mathbf{U} = [U_1, U_2, \ldots, U_L]$, where $U_l = F_{X_l}(X_l)$. Therefore, regardless of the marginal distribution of $X_l$, the transformed variable $U_l$ always follows a uniform distribution on $[0, 1]$. According to Sklar's theorem [30], [31], if $F_{\mathbf{X}}$ is the $L$-dimensional joint distribution of $\mathbf{X}$, there exists an $L$-dimensional copula function $C : [0, 1]^L \to [0, 1]$ such that

$$F_{\mathbf{X}}(x_1, x_2, \ldots, x_L) = C\left(F_{X_1}(x_1), F_{X_2}(x_2), \ldots, F_{X_L}(x_L)\right). \tag{7}$$

If the marginal distributions are continuous, an assumption that holds in our case, the copula function is unique.
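The probability integral transform used in the definition can be checked numerically. The sketch below is an illustration only: it assumes a hypothetical exponential marginal with an invented rate, passes simulated samples through their own cdf, and compares the first two moments of the result with those of a uniform variable on $[0, 1]$ (mean 1/2, variance 1/12).

```python
import math
import random


def exponential_cdf(x, rate):
    """Marginal cdf of an exponential source with the given rate."""
    return 1.0 - math.exp(-rate * x)


random.seed(0)   # fixed seed so that repeated runs agree
rate = 0.5       # hypothetical marginal parameter
samples = [random.expovariate(rate) for _ in range(20000)]

# Probability integral transform: U = F_X(X)
u = [exponential_cdf(x, rate) for x in samples]

mean_u = sum(u) / len(u)
var_u = sum((v - mean_u) ** 2 for v in u) / len(u)
print(mean_u, var_u)   # should be close to 1/2 and 1/12
```

The same transform applies to any continuous marginal, which is what lets a copula combine sensors with very different distributions on a common $[0, 1]^L$ domain.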

Copula density function. The copula density function, denoted by $c\left(F_{X_1}(x_1), \ldots, F_{X_L}(x_L)\right)$, can be found by differentiating the expression in (7) with respect to $u_l = F_{X_l}(x_l)$, for all $l \in \mathcal{I}_L$. Thus, the multivariate pdf of the sensor data can be written as:

$$f_{\mathbf{X}}(x_1, \ldots, x_L) = c\left(F_{X_1}(x_1), \ldots, F_{X_L}(x_L)\right) \prod_{l=1}^{L} f_{X_l}(x_l). \tag{8}$$

Given the marginal pdfs of the random variables, an appropriate copula function that best captures the dependencies among the sensor data should be selected. In this work, we consider the multivariate normal, t- and Clayton copulas.

Normal copula. The multivariate normal copula function [31] is defined as:

$$C_g(\mathbf{u}) = \Phi_{R_g}\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \ldots, \Phi^{-1}(u_L)\right), \tag{9}$$

where $\mathbf{u} = [u_1, \ldots, u_L]$ is a realization of the random vector $\mathbf{U}$, $\Phi_{R_g}$ denotes the standard multivariate normal distribution with linear correlation matrix $R_g$, and $\Phi^{-1}$ is the inverse function of the standard univariate normal distribution. The normal copula density is given by [43]

$$c_g(\boldsymbol{\xi}) = |R_g|^{-\frac{1}{2}} \exp\left[-\frac{1}{2} \boldsymbol{\xi} \left(R_g^{-1} - I\right) \boldsymbol{\xi}^T\right], \tag{10}$$

where $\boldsymbol{\xi} = [\Phi^{-1}(u_1), \Phi^{-1}(u_2), \ldots, \Phi^{-1}(u_L)]$² and $I$ is the $L \times L$ identity matrix.
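For two sensors, the normal copula density in (10) has a simple closed form, because the correlation matrix depends on a single coefficient. The sketch below writes it out with Python's standard library; the function name and the value of the correlation coefficient `rho` are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def normal_copula_density(u1, u2, rho):
    """Bivariate normal copula density, eq. (10) with R_g = [[1, rho], [rho, 1]]."""
    std = NormalDist()
    # Normal scores: xi_l = Phi^{-1}(u_l); requires 0 < u_l < 1
    xi1, xi2 = std.inv_cdf(u1), std.inv_cdf(u2)
    det = 1.0 - rho * rho                      # |R_g|
    # Quadratic form xi (R_g^{-1} - I) xi^T, written out for the 2x2 case
    quad = (xi1 * xi1 + xi2 * xi2 - 2.0 * rho * xi1 * xi2) / det \
        - (xi1 * xi1 + xi2 * xi2)
    return det ** -0.5 * math.exp(-0.5 * quad)


# With positive correlation, concordant pairs are more likely than discordant ones.
print(normal_copula_density(0.9, 0.9, 0.6))
print(normal_copula_density(0.9, 0.1, 0.6))
```

Setting `rho` to zero gives a density of one everywhere, which corresponds to independent sensors.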

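Student's t-copula. The multivariate t-copula function [31] has the form:

$$C_t(\mathbf{u}) = T_{R_t,\nu}\left(t_\nu^{-1}(u_1), t_\nu^{-1}(u_2), \ldots, t_\nu^{-1}(u_L)\right), \tag{11}$$

where $T_{R_t,\nu}$ is the standardized multivariate t-distribution with $\nu$ degrees of freedom and correlation matrix $R_t$, and $t_\nu^{-1}$ denotes the inverse of the univariate t-distribution. The t-copula density is [31]

$$c_t(\boldsymbol{\eta}) = |R_t|^{-\frac{1}{2}} \frac{\Gamma\left(\frac{\nu+L}{2}\right) \left[\Gamma\left(\frac{\nu}{2}\right)\right]^L \left(1 + \frac{1}{\nu} \boldsymbol{\eta} R_t^{-1} \boldsymbol{\eta}^T\right)^{-\frac{\nu+L}{2}}}{\left[\Gamma\left(\frac{\nu+1}{2}\right)\right]^L \Gamma\left(\frac{\nu}{2}\right) \prod_{l=1}^{L} \left(1 + \frac{\eta_l^2}{\nu}\right)^{-\frac{\nu+1}{2}}}, \tag{12}$$

where $\boldsymbol{\eta} = [\eta_1, \eta_2, \ldots, \eta_L]$, $\eta_l = t_\nu^{-1}(u_l)$, $l \in \mathcal{I}_L$ and $\Gamma(\cdot)$ is the gamma function.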

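Clayton copula. An Archimedean copula $C^{(a)}$ can be represented by [31]:

$$C^{(a)}(\mathbf{u}; \delta) = \gamma^{-1}\left(\gamma(u_1; \delta) + \cdots + \gamma(u_L; \delta); \delta\right), \tag{13}$$

where $\gamma : [0, 1] \times \Delta \to [0, \infty)$ is a continuous, strictly decreasing and convex function such that $\gamma(1; \delta) = 0$, and $\delta$ is the copula parameter defined in the range $\Delta$. The function $\gamma(u_l)$, $l \in \{1, \ldots, L\}$, is called generator and its pseudo-inverse, defined by:

$$\gamma^{[-1]}(u_l; \delta) = \begin{cases} \gamma^{-1}(u_l; \delta) & \text{if } 0 \le u_l \le \gamma(0; \delta), \\ 0 & \text{if } \gamma(0; \delta) \le u_l \le \infty, \end{cases} \tag{14}$$

has to be completely monotonic of order $L$ [44]. For the Clayton copula, the generator has the following form [31]:

$$\gamma_{Cl}(u_l) = \delta_{Cl}^{-1}\left(u_l^{-\delta_{Cl}} - 1\right), \tag{15}$$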

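where $\delta_{Cl}$ is the parameter of the Clayton copula. The expression of the Clayton copula function is given by

$$C^{(Cl)}(\mathbf{u}) = \left(\sum_{l=1}^{L} u_l^{-\delta_{Cl}} - L + 1\right)^{-1/\delta_{Cl}}, \tag{16}$$

whereas the Clayton copula density function can be written as [45]:

$$c^{(Cl)}(\mathbf{u}) = \left(\sum_{l=1}^{L} u_l^{-\delta_{Cl}} - L + 1\right)^{-L-1/\delta_{Cl}} \times \prod_{l=1}^{L} \left(u_l^{-\delta_{Cl}-1} \left[(l - 1)\delta_{Cl} + 1\right]\right). \tag{17}$$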
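Expressions (16) and (17) translate almost directly into code. The following sketch uses only the standard library; the function names, the test point and the parameter value are invented for illustration. In the two-sensor case the density should match the mixed second derivative of the copula function, which the last lines approximate with a central finite difference.

```python
def clayton_cdf(u, delta):
    """Clayton copula function, eq. (16); u is a list of values in (0, 1]."""
    s = sum(ul ** -delta for ul in u) - len(u) + 1
    return s ** (-1.0 / delta)


def clayton_density(u, delta):
    """Clayton copula density, eq. (17)."""
    L = len(u)
    s = sum(ul ** -delta for ul in u) - L + 1
    dens = s ** (-L - 1.0 / delta)
    for l, ul in enumerate(u, start=1):
        dens *= ul ** (-delta - 1.0) * ((l - 1) * delta + 1.0)
    return dens


# Hypothetical evaluation point and parameter.
u1, u2, delta, h = 0.3, 0.6, 2.0, 1e-4
finite_diff = (clayton_cdf([u1 + h, u2 + h], delta)
               - clayton_cdf([u1 + h, u2 - h], delta)
               - clayton_cdf([u1 - h, u2 + h], delta)
               + clayton_cdf([u1 - h, u2 - h], delta)) / (4.0 * h * h)
print(clayton_density([u1, u2], delta), finite_diff)
```

The two printed numbers should agree to several decimal places; a mismatch would point to an error in the exponent or in the product term.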

² Note that vector $\boldsymbol{\xi}$ is a function of $\mathbf{u}$. Thus the density can be written either as $c_g(\boldsymbol{\xi})$ or as $c_g(\mathbf{u})$.

The multivariate pdf of the sensor values is given by replacing the copula density $c\left(F_{X_1}(x_1), \ldots, F_{X_L}(x_L)\right)$ in (8) with the expression in either (10), (12), or (17).

B. The Proposed Copula Regression Method

The proposed code design embodies a copula regression method that takes as input the reconstructed sensor values $\hat{\mathbf{x}}(i) = [\hat{x}_1(i), \hat{x}_2(i), \ldots, \hat{x}_L(i)]$ and produces a refined estimate, denoted by $\tilde{\mathbf{x}}(i) = [\tilde{x}_1(i), \tilde{x}_2(i), \ldots, \tilde{x}_L(i)]$.

During a training stage, the model parameters, namely, either the correlation matrix $R_g$ of the normal copula, the correlation matrix $R_t$ and the degrees of freedom $\nu$ of the t-copula, or the parameter $\delta_{Cl}$ of the Clayton copula, as well as the continuous marginal pdfs and cdfs of sensor data are estimated. The correlation matrix $R_g$ of the normal copula is parametrically estimated using standard Maximum Likelihood Estimation (MLE) [46]. Regarding the t-copula, the correlation matrix $R_t$ and the degrees of freedom parameter $\nu$ are parametrically estimated using approximate MLE [46], where the copula function is fitted by maximizing an objective function that approximates the profile log-likelihood for the degrees of freedom parameter. Moreover, the parameter of the Clayton copula is estimated using MLE.

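The marginal pdfs are non-parametrically estimated using kernel density estimation (KDE) [47]. Specifically, given $\tau$ training samples $x_\lambda$, $\lambda = 1, \ldots, \tau$, from sensor $l \in \mathcal{I}_L$, the KDE estimator is

$$\hat{f}_{X_l}(x_l) = \frac{1}{\tau h_{X_l}} \sum_{\lambda=1}^{\tau} K\left(\frac{x_l - x_\lambda}{h_{X_l}}\right), \tag{18}$$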

where $K(\cdot)$ is the kernel function and $h_{X_l}$ is the bandwidth of the smoothing window. The kernel function is usually chosen to be a smooth unimodal function with a peak at zero. In the literature [47] various kernel functions, such as the Gaussian and the Epanechnikov kernel, have been proposed. Although the Epanechnikov kernel is optimal in the MSE sense [48], the accuracy of the non-parametric estimate depends less on the shape of the kernel function $K(\cdot)$ than on the value of its bandwidth $h_{X_l}$ [49]. For accurate density estimation, an appropriate selection of the bandwidth value is important since small or large values can lead to under- or over-smoothed estimators, respectively. In our implementation, the optimal bandwidth value is obtained by the MATLAB function ksdensity, which uses a rule of thumb [47]. The smooth estimate of the corresponding marginal cdf, $\hat{F}_{X_l}$, is constructed by integrating $\hat{f}_{X_l}$. That is,

$$\hat{F}_{X_l}(x_l) = \int_{-\infty}^{x_l} \hat{f}_{X_l}(x)\, dx = \frac{1}{\tau} \sum_{\lambda=1}^{\tau} \kappa\left(\frac{x_l - x_\lambda}{h_{X_l}}\right), \tag{19}$$

where $\kappa(x) = \int_{-\infty}^{x} K(\chi)\, d\chi$.

This work focuses on slowly-varying data sources, such as temperature and humidity, which are modelled as stationary processes, namely, random processes whose statistical properties are time independent or do not change for a given period of time [36]. Thus, the entries of the correlation matrices $R_g$ (for the normal copula) or $R_t$ (for the t-copula), as well as the parameter of the Clayton copula, are assumed to be constant for a given period of time and offline estimation is sufficient. A similar approach has been followed in previous works [15], [20]. Nevertheless, the use of online estimators for the statistical parameters is left for future research.
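The kernel estimators in (18) and (19) can be sketched with a Gaussian kernel, for which the integrated kernel is the standard normal cdf. This is an illustration under assumptions: the five training samples are invented, the Gaussian kernel is one of several admissible choices, and the bandwidth is simply the temperature value quoted in the caption of Fig. 4 rather than one selected by a rule of thumb for this toy data.

```python
from statistics import NormalDist

STD = NormalDist()   # standard normal: Gaussian kernel K and its integral


def kde_pdf(x, samples, h):
    """Kernel density estimate of the marginal pdf, eq. (18)."""
    return sum(STD.pdf((x - s) / h) for s in samples) / (len(samples) * h)


def kde_cdf(x, samples, h):
    """Smooth estimate of the marginal cdf, eq. (19)."""
    return sum(STD.cdf((x - s) / h) for s in samples) / len(samples)


train = [19.8, 20.4, 21.1, 22.0, 22.3]   # invented temperature samples
h = 0.9315                               # bandwidth quoted for temperature in Fig. 4
for x in (19.0, 21.0, 23.0):
    print(x, kde_pdf(x, train, h), kde_cdf(x, train, h))
```

Because the cdf estimate is a sum of smooth, strictly increasing functions, it is itself strictly increasing, so its inverse exists and can be found by bisection when a value has to be mapped back from $[0, 1]$ to the sensor domain.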

Once the parameters of the models have been determined, the refined estimates $\tilde{\mathbf{x}}(i)$ are calculated based on the proposed regression algorithm. The algorithm refines the reconstructed sensor value $\hat{x}_l(i)$, corresponding to the $l$-th sensor, using the other elements of $\hat{\mathbf{x}}(i)$, which correspond to the other sensors. Let the response variable be $X_l$ and the variables $\{X_\varsigma : \varsigma \in \mathcal{I}_L \setminus \{l\}\}$ denote the covariates. Existing copula regression methods [32], [33] would estimate the values in $\tilde{\mathbf{x}}(i)$ using the conditional mean $E[X_l | X_1, \ldots, X_\varsigma]$, $\varsigma \in \mathcal{I}_L \setminus \{l\}$. This approach works well when the number of covariates is small (two or three). However, when the dimensionality is high, as in our system, exact inference can be intractable, thereby requiring computationally expensive Monte Carlo sampling methods [50].

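Another approach for predicting the refined estimates in the vector $\tilde{\mathbf{x}}(i)$ considers the MLE problem, which can be written as

$$\begin{aligned} \tilde{x}_l &= \arg\max_{x_l} f_{\mathbf{X}_\varsigma | X_l}(x_1, \ldots, x_\varsigma | x_l) \\ &= \arg\max_{x_l} \left[ c\left(F_{X_1}(x_1), \ldots, F_{X_\varsigma}(x_\varsigma), F_{X_l}(x_l)\right) \times \frac{f_{X_l}(x_l) \prod_{\kappa=1}^{\varsigma} f_{X_\kappa}(x_\kappa)}{f_{X_l}(x_l)} \right] && \text{(20)} \\ &= \arg\max_{x_l} c\left(F_{X_1}(x_1), \ldots, F_{X_L}(x_L)\right) \prod_{\kappa=1}^{\varsigma} f_{X_\kappa}(x_\kappa) \\ &= \arg\max_{x_l} c\left(F_{X_1}(x_1), \ldots, F_{X_L}(x_L)\right), && \text{(21)} \end{aligned}$$

where $\mathbf{X}_\varsigma = \{X_1, X_2, \ldots, X_\varsigma\}$, $\varsigma \in \mathcal{I}_L \setminus \{l\}$. Eq. (20) is derived based on Bayes' rule $f_{\mathbf{X}_\varsigma | X_l}(x_1, \ldots, x_\varsigma | x_l) = \frac{f_{\mathbf{X}}(x_1, \ldots, x_\varsigma, x_l)}{f_{X_l}(x_l)}$ and the expression in (8) that provides the description of the multivariate pdf $f_{\mathbf{X}}(x_1, \ldots, x_\varsigma, x_l)$. The last step in (21) holds because the product of the covariate marginals does not depend on $x_l$. In this work, we solve (21) by using an algorithm that delivers accurate inference at reasonable complexity. In particular, the cdf of the response variable $F_{X_l}(x_l)$ is sampled until the considered copula density $c_g(\boldsymbol{\xi})$, $c_t(\boldsymbol{\eta})$ or $c^{(Cl)}(\mathbf{u})$ is maximized. This is expressed by the following optimization problem:

$$u_l^* = \arg\max_{u_l = F_{X_l}(x_l)} c(u_1, \ldots, u_l, \ldots, u_L), \tag{22}$$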

where $u_l \in [0, 1]$ and $c(\cdot)$ is replaced by the expression in either (10), (12) or (17). The objective function in (22) is not necessarily concave, meaning that local maxima may appear. We therefore solve (22) numerically with a greedy approach that does not abide by the "hill-climbing principles" [51], which lead to the calculation of local maxima; instead, the algorithm performs an exhaustive search over values of $u_l$ that span the region $[0, 1]$ and finds the global maximum (within the step-size accuracy) of the copula density. The sampling step is chosen to be $u_{st} = 0.001$ so as to strike a balance between: (a) the decoding complexity, where larger step values speed up the optimization process, and (b) the accuracy of the inference, where smaller values provide finer copula sampling. Finally, the refined estimate of the sensor value is given by

$$\tilde{x}_l(i) = F_{X_l}^{-1}(u_l^*). \tag{23}$$
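The search in (22) followed by the inversion in (23) can be sketched for two sensors as below. Everything specific in this example is an assumption: a bivariate normal copula stands in for whichever copula is fitted, the marginals are Gaussian-kernel cdf estimates built from invented training samples, the inverse cdf is found by bisection, and the grid excludes the endpoints 0 and 1, where the normal scores are infinite.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def kde_cdf(x, samples, h):
    """Smooth marginal cdf estimate with a Gaussian kernel."""
    return sum(STD.cdf((x - s) / h) for s in samples) / len(samples)


def kde_inv_cdf(u, samples, h, tol=1e-9):
    """Invert the smooth cdf by bisection (the cdf is strictly increasing)."""
    lo, hi = min(samples) - 10.0 * h, max(samples) + 10.0 * h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def normal_copula_density(u1, u2, rho):
    """Bivariate normal copula density."""
    xi1, xi2 = STD.inv_cdf(u1), STD.inv_cdf(u2)
    det = 1.0 - rho * rho
    quad = (xi1 * xi1 + xi2 * xi2 - 2.0 * rho * xi1 * xi2) / det \
        - (xi1 * xi1 + xi2 * xi2)
    return det ** -0.5 * math.exp(-0.5 * quad)


def best_u(u_other, rho, step=0.001):
    """Exhaustive grid search for the u_l that maximizes the copula density."""
    n = int(round(1.0 / step))
    grid = [k * step for k in range(1, n)]    # open interval (0, 1)
    return max(grid, key=lambda u: normal_copula_density(u, u_other, rho))


def refine_value(x_other, train_l, train_other, h_l, h_other, rho):
    """Refine sensor l from the dequantized reading of the other sensor."""
    u_other = kde_cdf(x_other, train_other, h_other)
    return kde_inv_cdf(best_u(u_other, rho), train_l, h_l)


# Invented training data for a temperature-like and a humidity-like sensor.
print(refine_value(41.0, [19.0, 20.0, 21.0], [38.0, 40.0, 42.0], 1.0, 1.0, 0.9))
```

A step of 0.001 gives 999 density evaluations per sensor and per time instant, which illustrates the trade-off between decoding complexity and inference accuracy discussed above.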

The procedure is described in Algorithm 1. Initially, the algorithm refines the value of the $l$-th sensor using the other elements of $\hat{\mathbf{x}}(i)$, which correspond to dequantized values of the remaining sensors. Subsequently, the algorithm replaces the corresponding dequantized estimate $\hat{x}_l(i)$ with the refined value $\tilde{x}_l(i)$ in $\hat{\mathbf{x}}(i)$ and continues with refining the value of the next sensor. The same procedure repeats for all unrefined symbols in $\hat{\mathbf{x}}(i)$, yielding the refined estimate vector $\tilde{\mathbf{x}}(i)$. The sensor indices are processed sequentially in increasing order of $l$. Moreover, as shown in Algorithm 1, the proposed methodology can be straightforwardly adapted to cope with the case where a subset of sensors, denoted by $\mathcal{I}_c$, is not operating (due to, for example, battery depletion or duty cycling for extending the lifetime of the system³). In this case, the vector $\hat{\mathbf{x}}_e(i)$ contains only the components of $\hat{\mathbf{x}}(i)$ that correspond to the effective sensors, which are indexed in the set $\mathcal{I}_e = \mathcal{I}_L \setminus \mathcal{I}_c$. Furthermore, only the columns and rows of the correlation matrices $R_g$ (for the normal copula) or $R_t$ (for the t-copula) that correspond to the effective sensors are kept and the remaining rows and columns are removed.
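The bookkeeping described in this paragraph (dropping inactive sensors, then refining the remaining values one after another) can be sketched as follows. This is not Algorithm 1 itself: the helper names are hypothetical, sensor indices are zero-based, and the per-sensor refinement is passed in as a callback so that any copula-based rule can be plugged in. The averaging callback in the usage lines is a placeholder chosen only to make the sequential update visible.

```python
def restrict_correlation(R, effective):
    """Keep only the rows and columns of R that belong to effective sensors."""
    return [[R[i][j] for j in effective] for i in effective]


def refine_all(x_hat, effective, refine_one):
    """Sequentially refine the readings of the effective sensors.

    x_hat      : dequantized readings of all L sensors at one instant
    effective  : sorted zero-based indices of the operating sensors
    refine_one : hypothetical callback (position, current_values) -> refined value
    """
    current = [x_hat[l] for l in effective]   # readings of effective sensors only
    for pos in range(len(current)):           # increasing sensor index
        # Values refined earlier in the loop are reused for later sensors.
        current[pos] = refine_one(pos, current)
    return current


# Invented example: sensor 1 is switched off.
R = [[1.0, 0.8, 0.5],
     [0.8, 1.0, 0.3],
     [0.5, 0.3, 1.0]]
effective = [0, 2]
print(restrict_correlation(R, effective))

# Placeholder refinement rule: average of the current vector.
average = lambda pos, values: sum(values) / len(values)
print(refine_all([1.0, 100.0, 3.0], effective, average))
```

Because only index lists and a sub-matrix change when a sensor drops out, the encoders at the remaining sensors need no reconfiguration.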

V. EXPERIMENTAL EVALUATION

We evaluate the proposed system using actual sensor readings from the Intel-Berkeley Research lab [34] database. The database contains data collected from 54 Mica2Dot sensors equipped with weather boards, monitoring diverse physical parameters (that is, humidity, temperature, and light) in an indoor office and laboratory environment. To conduct our experiments, we randomly selected $L = 21$ sensors from the database, 11 of which collect temperature data (in °C) and the other ten collect humidity data (in %). The sensor readings from the Intel-Berkeley database exhibit various levels of dependence, such as strong ($\rho_{1,3} = 0.9764$), medium ($\rho_{8,18} = 0.7574$) and weak ($\rho_{9,14} = 0.5311$), where $\rho_{l_1,l_2}$ denotes Spearman’s rank correlation coefficient between the data of sensors $l_1$ and $l_2$.
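Spearman’s coefficient is the Pearson correlation of the ranks, so it does not depend on the units of the two sensors (for example, °C versus %). The following Python sketch illustrates the computation; the function names and the short example series are made up for illustration and are not taken from the database.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

# Hypothetical readings: humidity falls whenever temperature rises.
temperature = [20.1, 21.3, 22.8, 24.0, 23.5]
humidity = [40.2, 38.9, 36.0, 33.1, 34.5]
rho = spearman(temperature, humidity)
```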

The collected data were divided into a training set and an evaluation set, without overlap between the two. The former, which consisted of the initial 15% of the data, was used to derive the parameters of the proposed coding scheme and the proposed copula-function-based model. Given the training dataset, we derived two different predictors of memory length $m = 3$: one for the sensors measuring temperature and one for those sensing humidity. The derived predictor coefficients, which led to minimum-MSE predictors [see (6)], were found to be $a_{t,1} = 2.7520$, $a_{t,2} = -2.6647$, $a_{t,3} = -0.9127$ for the temperature data, and $a_{h,1} = 1.1613$, $a_{h,2} = -0.0490$, $a_{h,3} = -0.1124$ for the humidity data. Furthermore, following the semi-parametric approach described in Section IV, we estimated the correlation matrix $R_g$ of the normal copula density, the correlation matrix $R_t$ and the degrees of freedom $\nu$ for the t-copula density, the parameter of the Clayton copula

³ A subset of sensors is periodically turned off during specific time periods.


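Algorithm 1 Proposed copula regression algorithm for refining the dequantized sensor values.

 1: Inputs: Set of effective sensors $\mathcal{I}_e = \{l_1, \ldots, l_{L_e}\}$, vector $\tilde{\mathbf{x}}_e(i)$ with dequantized sensor values in $\mathcal{I}_e$, copula parameters, marginal statistics.
 2: Output: Refined estimates $\hat{\mathbf{x}}_e(i)$.
 3: Modify $R_g$ (or $R_t$) given the set $\mathcal{I}_e = \{l_1, \ldots, l_{L_e}\}$.
 4: for $l \in \mathcal{I}_e$ do
 5:     $c_l^* = 0$;
 6:     $u_l^* = 0$;
 7:     for $u_l = 0 \to 1$ with step $u_{st}$ do
 8:         Create the vector $\mathbf{u} = [F_{X_{l_1}}(\tilde{x}_{l_1}(i)), \ldots, u_l, \ldots, F_{X_{l_{L_e}}}(\tilde{x}_{l_{L_e}}(i))]$.
 9:         if Elliptical copula then
10:             Calculate $\boldsymbol{\xi} = \Phi^{-1}(\mathbf{u})$ (or $\boldsymbol{\eta} = t_\nu^{-1}(\mathbf{u})$).
11:             Calculate $c_g(\boldsymbol{\xi})$ using (10) or $c_t(\boldsymbol{\eta})$ using (12).
12:             if $c_g(\boldsymbol{\xi}) \geq c_l^*$ (or $c_t(\boldsymbol{\eta}) \geq c_l^*$) then
13:                 $c_l^* = c_g(\boldsymbol{\xi})$ (or $c_l^* = c_t(\boldsymbol{\eta})$);
14:                 $u_l^* = u_l$;
15:             end if
16:         else
17:             Calculate $c^{(Cl)}(\mathbf{u})$ using (17).
18:             if $c^{(Cl)}(\mathbf{u}) \geq c_l^*$ then
19:                 $c_l^* = c^{(Cl)}(\mathbf{u})$;
20:                 $u_l^* = u_l$;
21:             end if
22:         end if
23:     end for
24:     Calculate $\hat{x}_l(i) = F_{X_l}^{-1}(u_l^*)$.
25:     Replace $\tilde{x}_l(i)$ with $\hat{x}_l(i)$ in $\tilde{\mathbf{x}}_e(i)$.
26: end for
27: Set $\hat{\mathbf{x}}_e(i) = \tilde{\mathbf{x}}_e(i)$.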

$\delta_{Cl}$, as well as the marginal pdfs $f_{X_l}$ and cdfs $F_{X_l}$ of the sensor values. To compare the proposed modeling approach against the state of the art [20], we also fitted the multivariate Gaussian model to the data; namely, we parametrically estimated⁴ the mean values $\mu_{X_l}$ and the standard deviations $\sigma_{X_l}$ of the marginal distributions of the sensor values, as well as the corresponding covariance matrix $\Sigma_X$. The estimated parameters for the different statistical models are reported in Table I. The degrees-of-freedom parameter of the t-copula function model was found to be $\nu = 6.6995$. Finally, we calculated the parameter of the Clayton copula via MLE, which was found to be $\delta_{Cl} = 0.5302$.

During the training stage, we also configure the Lloyd-Max quantizer and the arithmetic entropy coder that are deployed to compress the data from each sensor in our system. Specifically, we determine the reconstruction values and the partitions of each quantizer, as well as the source statistics used in the arithmetic coders.
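As an illustration of how the reconstruction values and partitions of a Lloyd-Max quantizer can be obtained from training samples, the following Python sketch runs Lloyd's iteration on a toy data set. It is a simplified sketch under our own assumptions (uniform initialization, fixed number of iterations, hypothetical function names), not the configuration used in the experiments.

```python
def lloyd_max(samples, levels, iters=50):
    """Lloyd's algorithm: alternate partition updates and centroid updates."""
    data = sorted(samples)
    lo, hi = data[0], data[-1]
    # Start from uniformly spaced reconstruction values over the data range.
    recon = [lo + (hi - lo) * (k + 0.5) / levels for k in range(levels)]
    for _ in range(iters):
        # Decision thresholds are midpoints between neighbouring reconstruction values.
        thresholds = [(recon[k] + recon[k + 1]) / 2.0 for k in range(levels - 1)]
        cells = [[] for _ in range(levels)]
        for x in data:
            cells[sum(x > t for t in thresholds)].append(x)
        # Each reconstruction value moves to the mean of its cell (kept if the cell is empty).
        recon = [sum(c) / len(c) if c else recon[k] for k, c in enumerate(cells)]
    thresholds = [(recon[k] + recon[k + 1]) / 2.0 for k in range(levels - 1)]
    return recon, thresholds

def quantize(x, thresholds):
    """Index of the partition cell that contains x."""
    return sum(x > t for t in thresholds)

# Toy training set with two well-separated groups of values.
recon, thresholds = lloyd_max([0.0, 0.1, 0.2, 9.8, 9.9, 10.0], levels=2)
```

In a DPCM setting, the training samples passed to such a routine would be prediction residuals rather than raw sensor readings.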

During the evaluation stage, the data from each sensor $l \in \mathcal{I}_{21}$ were aggregated into data blocks, each consisting of $n = 40$ consecutive samples. The block length is chosen to strike a balance between good compression performance and

⁴ The parameters were estimated using the MATLAB function fitdist.

TABLE I
MEAN VALUES AND STANDARD DEVIATIONS FOR THE MULTIVARIATE GAUSSIAN MODEL, AS WELL AS THE BANDWIDTH OF THE SMOOTHING WINDOW FOR THE COPULA FUNCTION MODEL.

Sensor ID     Gaussian model                          Copula model
              Mean $\mu_{X_l}$   St. dev. $\sigma_{X_l}$   Bandwidth $h_{X_l}$
1 (Temp.)     23.2250            4.1648                    0.9315
2 (Hum.)      33.9024            7.8912                    1.6166
3 (Temp.)     22.6966            3.1606                    0.7681
4 (Hum.)      37.4034            5.1743                    0.9993
5 (Temp.)     21.8933            2.4164                    0.4031
6 (Hum.)      38.5627            5.3227                    0.9811
7 (Temp.)     22.2168            2.4620                    0.0054
8 (Hum.)      35.3782            5.7556                    0.4494
9 (Temp.)     22.0934            2.4939                    1.1969
10 (Hum.)     35.9978            5.8755                    0.5175
11 (Temp.)    22.2770            3.3557                    1.0858
12 (Hum.)     36.7319            6.6400                    1.3081
13 (Temp.)    22.0698            2.9606                    0.5910
14 (Hum.)     37.3494            5.4444                    1.0532
15 (Temp.)    21.4909            2.7720                    0.4276
16 (Hum.)     37.5436            5.3022                    0.7690
17 (Temp.)    21.0885            2.8714                    0.4685
18 (Hum.)     38.4008            4.8944                    0.7394
19 (Temp.)    20.5121            2.9205                    0.5148
20 (Hum.)     40.0336            5.5524                    0.9118
21 (Temp.)    20.8336            2.5813                    0.3595

Fig. 5. Fitting of non-parametric functions on temperature data collected by the Intel-Berkeley database [34], where the (a) Gaussian, (b) Laplacian, (c) Box, and (d) Epanechnikov kernels have been used. [Plot: probability versus temperature; curves: histogram, Gaussian kernel, Laplacian kernel, Box kernel, Epanechnikov kernel.]

delay. The quantizer of each sensor uses the same number of quantization levels, $M$, resulting in a stream of $k = n \times \log_2 M$ bits that is passed to each entropy encoder.

A. Choice of the Appropriate Kernel Function

First, we evaluate the impact of the fitting accuracy when different kernel functions are considered for the non-parametric estimation of the marginal pdfs. Fig. 5 depicts the fitting accuracy of different non-parametric distributions on the temperature data collected by sensor 1. The distributions use the Gaussian, the Laplacian, the Box and the Epanechnikov kernels. Moreover, using Kolmogorov-Smirnov fitting tests, Table II shows that the fitting accuracy of the different distributions is quite similar; this agrees with prior results such


TABLE II
ASYMPTOTIC p-VALUES WHEN PERFORMING KOLMOGOROV-SMIRNOV FITTING TESTS BETWEEN THE ACTUAL TEMPERATURE READINGS AND THE SETS PRODUCED BY THE NON-PARAMETRIC DISTRIBUTION WITH DIFFERENT KERNELS. THE SIGNIFICANCE LEVEL IS SET TO 5%.

Kernel function    Bandwidth    p-value
Gaussian           0.4332       0.3792
Laplacian          0.4332       0.3331
Box                0.4332       0.2917
Epanechnikov       0.4332       0.3553

TABLE III
EFFECTIVE RATE VS. EFFECTIVE DISTORTION FOR GAUSSIAN, LAPLACIAN, BOX AND EPANECHNIKOV KERNEL FUNCTIONS ($L_e = 21$).

Eff. rate        Eff. distortion
(all schemes)    Gaussian    Laplacian    Box       Epanechnikov
1.1161           1.5316      1.5331       1.5334    1.5327
1.8870           0.3306      0.3318       0.3320    0.3311
2.7818           0.1035      0.1041       0.1040    0.1039
3.7408           0.0324      0.0329       0.0329    0.0327
4.7228           0.0057      0.0061       0.0064    0.0059
5.6577           0.0019      0.0023       0.0022    0.0020

as [49], where it has been proven that the choice of the kernel function is less significant than the appropriate choice of the bandwidth of the smoothing window.
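To make the comparison of kernels concrete, the following Python sketch evaluates a kernel density estimate with each of the four kernels at a common bandwidth. The kernel normalizations are the textbook ones and may differ from the definitions used earlier in the paper (in particular for the Laplacian kernel); the sample values and the names `KERNELS` and `kde_pdf` are hypothetical.

```python
import math

# Textbook kernel shapes; each integrates to one over the real line.
KERNELS = {
    "gaussian": lambda t: math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi),
    "laplacian": lambda t: 0.5 * math.exp(-abs(t)),
    "box": lambda t: 0.5 if abs(t) <= 1.0 else 0.0,
    "epanechnikov": lambda t: 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0,
}

def kde_pdf(x, samples, h, kernel="gaussian"):
    """Kernel density estimate at x with bandwidth h."""
    k = KERNELS[kernel]
    return sum(k((x - s) / h) for s in samples) / (len(samples) * h)

# Hypothetical temperature samples (in degrees Celsius) and the bandwidth of Table II.
samples = [21.0, 22.5, 23.0, 25.0]
densities = {name: kde_pdf(22.0, samples, 0.4332, name) for name in KERNELS}
```

All four estimates are valid densities; they differ mainly in smoothness and tail behaviour, which is consistent with the bandwidth mattering more than the kernel shape.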

Table III shows the performance of the proposed system for different kernels. For illustrative purposes, we have assumed the normal copula regression for the refinement stage. The compression performance is expressed in terms of the effective distortion (in MSE), $\frac{1}{L_e}\sum_{l \in \mathcal{I}_e} D_l$, versus the effective rate (in bits/sample), $\frac{1}{L_e}\sum_{l \in \mathcal{I}_e} R_l$, for different quantization levels, that is, $M = 8, 16, 32, 64, 128$, or $256$. Here we have assumed that $L_e = 21$, namely, all considered sensors are active. We see that, for all different kernel types, the effective distortion performance is very similar. Nevertheless, the Gaussian kernel provides slightly better results than the other functions and, hence, is used in our experiments.

B. Performance Evaluation of DPCM

We assess the impact of DPCM (including Lloyd-Max quantization) on the performance of the system. In particular, we compare the system in [20], which applies USQ followed by arithmetic encoding, against our approach, which deploys DPCM with Lloyd-Max scalar quantization and arithmetic entropy encoding. In both systems, a Gaussian regression step is performed at the decoder after inverse quantization. The comparisons are conducted for two scenarios. In the first, all sensors are active, whereas in the second a random subset of sensors $\mathcal{I}_c$ may not operate, resulting in a setup with $L_e$ effective sensors (see Section IV-B).

Figs. 6(a) and 6(b) illustrate the performance of the compared systems for $L_e = 21$ and $L_e = 12$, respectively⁵. As in Section V-A, the compression performance is expressed

⁵ $L_e$ is a parameter in our system. Without loss of generality, we can consider different values of $L_e$ without affecting the overall behaviour of the system since (a) the sensor readings from the Intel-Berkeley database exhibit various levels of dependencies, and (b) the indices of the effective sensors are chosen randomly.

Fig. 6. Rate-distortion performance comparison between entropy coding and DPCM when only Gaussian regression is employed at the decoder. The number of effective sensors is (a) $L_e = 21$ and (b) $L_e = 12$. [Plots: effective distortion (dB, logarithmic axis) versus effective rate (bits/sample); curves: DPCM with Gaussian regression, entropy coding with Gaussian regression.]

in terms of the effective distortion versus the effective rate for different quantization levels. It is clear that the proposed approach leads to a substantially higher compression performance, delivering a significant effective rate reduction of up to 36.64% (when $L_e = 21$) and 38.35% (when $L_e = 12$) for a similar distortion level. These results underline the benefit of using a DPCM scheme with an optimized Lloyd-Max quantizer to leverage the intra-sensor dependencies in our setting.
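The way DPCM exploits intra-sensor dependence can be sketched in a few lines of Python. The sketch below is a simplified closed-loop DPCM with a third-order linear predictor; for brevity it uses a uniform residual quantizer as a stand-in for the Lloyd-Max quantizer, starts from an all-zero predictor memory, and omits entropy coding. The function names and the sample readings are hypothetical; the coefficients are the humidity predictor coefficients reported above.

```python
def _predict(recon, n, coeffs):
    """Linear prediction of sample n from the previously reconstructed samples."""
    past = [recon[n - j] if n - j >= 0 else 0.0 for j in range(1, len(coeffs) + 1)]
    return sum(a * p for a, p in zip(coeffs, past))

def dpcm_encode(x, coeffs, step):
    """Closed-loop DPCM: predict from past reconstructions, quantize the residual."""
    recon, indices = [], []
    for n, sample in enumerate(x):
        pred = _predict(recon, n, coeffs)
        q = round((sample - pred) / step)   # uniform quantizer index (stand-in)
        indices.append(q)
        recon.append(pred + q * step)       # the decoder reconstructs the same value
    return indices

def dpcm_decode(indices, coeffs, step):
    """Rebuild the samples from the residual indices with the same predictor."""
    recon = []
    for n, q in enumerate(indices):
        recon.append(_predict(recon, n, coeffs) + q * step)
    return recon

coeffs = [1.1613, -0.0490, -0.1124]          # humidity predictor, memory length 3
readings = [33.9, 34.0, 34.2, 34.1, 33.8, 33.7]
indices = dpcm_encode(readings, coeffs, step=0.1)
decoded = dpcm_decode(indices, coeffs, step=0.1)
```

Because the encoder predicts from reconstructed rather than original samples, quantization errors do not accumulate: each decoded sample differs from the original by at most half a quantization step. After the first sample, the residual indices are small, which is what makes them cheap to entropy code.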

C. Performance Evaluation of the Copula Regression Algorithm

We now evaluate the performance improvement achieved by the proposed copula-based regression algorithm. To this end, we remove the DPCM component from the system; namely, the collected data are quantized with the Lloyd-Max quantizer and the indices are entropy encoded. In particular, we compare the following schemes: (a) the baseline scheme using entropy coding without a refinement stage (i.e., no regression), (b) the scheme in [20] that combines entropy coding and Gaussian regression, (c) the proposed scheme using entropy coding and normal copula regression, (d) the proposed scheme that combines entropy coding and t-copula regression, and (e) the


TABLE IV
AVERAGE EFFECTIVE DISTORTION GAINS (IN %) USING THE SYSTEM IN [20] AS A REFERENCE.

Regression type                            $L_e = 12$    $L_e = 21$
No regression [system (a)]                 4.96          39.07
Normal copula regression [system (c)]      47.05         57.31
t-copula regression [system (d)]           65.93         75.90
Clayton copula regression [system (e)]     79.22         87.55

proposed scheme that combines entropy coding and Clayton copula regression.

The effective rate-distortion performance of the system is given in Figs. 7(a) and 7(b) for $L_e = 21$ and $L_e = 12$, respectively. It is worth observing that the Gaussian regression method in [20] induces a higher distortion of the decoded data compared to the simple case where no regression is applied, especially at low rates. This is because in the low-rate regime the vectors $\mathbf{X}$ and $\tilde{\mathbf{X}}$ in (4) are not jointly Gaussian and thus Gaussian regression leads to poor final estimates. When the rate increases, the assumption of a joint Gaussian distribution becomes more accurate and therefore better estimates are provided. However, the proposed copula regression algorithm systematically outperforms the Gaussian regression scheme in [20] for all copula models. More importantly, the improvements increase as the encoding rate decreases: at low rates, where the quantization of the data is coarse, the copula regression schemes offer a significant improvement in reconstruction quality.

Table IV presents the average percentage distortion reductions obtained by comparing each of the schemes (a), (c)-(e) with the state-of-the-art scheme (b). The improvements in distortion reduction refer to the cases when $L_e = 12$ and $L_e = 21$. These improvements show that copula-based models express the joint statistics among heterogeneous data more accurately than the multivariate Gaussian model. Furthermore, the t-copula function results in higher modeling accuracy than the normal copula, which is attributed to the ability of the former to better express the dependencies between extreme values [52]. Finally, the best performance is obtained when the Clayton copula is used because this copula efficiently captures the asymmetric dependencies among sensor data.

D. Overall Performance of the Proposed System

In this section, we compare the proposed system, configured with DPCM and normal, t- or Clayton copula regression, against the system in [20]. The results reported in Figs. 8(a) and 8(b) show that the proposed system significantly outperforms the benchmark. Specifically, the configuration with normal copula regression achieves an average effective distortion reduction of 80.45% (when $L_e = 21$) and 77.99% (when $L_e = 12$). When the t-copula regression method is applied, average effective distortion reductions of 85.03% (when $L_e = 21$) and 81.28% (when $L_e = 12$) are observed. The best performance is achieved when the Clayton copula is used, where the average distortion reductions are 93.23% (when $L_e = 21$) and 92.06% (when $L_e = 12$).

Fig. 7. Rate-distortion performance comparison for Gaussian, normal copula, t-copula, and Clayton copula regression employed at the decoder. At the encoder, arithmetic entropy coding is performed for all configurations. The number of effective sensors is (a) $L_e = 21$ and (b) $L_e = 12$. [Plots: effective distortion (dB, logarithmic axis) versus effective rate (bits/sample); curves: no regression [system (a)], Gaussian regression [system (b)], normal copula regression [system (c)], t-copula regression [system (d)], Clayton copula regression [system (e)].]

The resulting effective rate gains are the same as in Section V-B, as they are attributed to the DPCM encoder in our system.

Contrary to the state-of-the-art system in [20], the proposed code design leverages both the intra-sensor data correlation by means of DPCM, and the inter-sensor correlation using copula functions. Moreover, the proposed copula-based approach allows for capturing the dependencies among diverse data more accurately than the multivariate Gaussian model.

E. Performance Evaluation for Weaker Intra-Sensor Dependence Structure

In the previous experiments, the proposed method was evaluated using actual sensor readings from the Intel-Berkeley database, which are collected once per minute; we refer to this data as dataset A. Due to the high sampling rate, consecutive sensor readings are highly correlated and, in this case, DPCM provides significant rate savings, as shown in Sections V-B and V-D. In order to assess the performance of the proposed method for weak intra-sensor dependencies, we sub-sample the data in the Intel-Berkeley database, namely, we consider sensor readings collected every 30 minutes; we refer to them as dataset B. Fig. 9 compares the autocorrelation function (ACF) of the temperature readings of sensor 1 for the datasets



Fig. 8. Rate-distortion performance comparison between the state-of-the-art system presented in [20] and the proposed system using DPCM and (normal, t- and Clayton) copula regression at the decoder. The number of effective sensors is (a) $L_e = 21$ and (b) $L_e = 12$. [Plots: effective distortion (dB, logarithmic axis); curves: entropy coding with Gaussian regression, DPCM with normal copula regression, DPCM with t-copula regression, DPCM with Clayton copula regression.]

Fig. 9. Autocorrelation function calculated for the temperature readings of sensor 1 during the training period, when data sampling is performed per (a) 1 minute, or (b) 30 minutes. [Plot: ACF versus lags; curves: 1 minute, 30 minutes.]

A and B. It is clear that the ACF decays faster for dataset B, as the sampling interval is larger.
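The effect of sub-sampling on the ACF can be reproduced on synthetic data with the following Python sketch. The signal is a made-up daily temperature cycle, not data from the database, and the function name `acf` is ours; the estimator is the usual biased sample autocorrelation.

```python
import math

def acf(x, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    return [sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / var
            for k in range(max_lag + 1)]

# Synthetic signal: four days of a daily cycle sampled once per minute.
per_minute = [22.0 + 3.0 * math.sin(2.0 * math.pi * t / 1440.0) for t in range(5760)]
per_30_minutes = per_minute[::30]            # keep one reading every 30 minutes

r_a = acf(per_minute, 1)[1]                  # lag of one sample = 1 minute
r_b = acf(per_30_minutes, 1)[1]              # lag of one sample = 30 minutes
```

One lag in the sub-sampled series spans 30 minutes of the original signal, so consecutive samples are less correlated and the ACF, expressed in lags, decays faster.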

We compare the performance of the proposed system with DPCM and Clayton copula regression against the state-of-the-art system in [20] for dataset B. Moreover, in the comparison, we include the system that applies entropy encoding

Fig. 10. Rate-distortion performance comparison between the proposed system, the system with entropy coding and Clayton copula regression, as well as the state of the art [20] (dataset B). [Plot: effective distortion (dB, logarithmic axis) versus effective rate (bits/sample); curves: DPCM with Clayton copula regression, entropy coding with Clayton copula regression, entropy coding with Gaussian regression.]


Fig. 11. Rate-distortion performance comparison of the proposed system with single- and two-cluster DSC designs. [Plot: effective distortion (dB, logarithmic axis); curves: proposed, non cluster-based DSC, cluster-based DSC.]

and Clayton copula regression. The reason for choosing the Clayton copula for regression is that it delivers the best MSE performance, as shown in Sections V-C and V-D. For dataset B, the Clayton copula parameter was found to be $\delta_{Cl} = 0.5870$.

The effective rate-distortion performance of all schemes is given in Fig. 10, for $L_e = 21$. The results reveal that even when the intra-sensor dependence is weak, the proposed method outperforms the system in [20]. In particular, average rate savings of 1.1493 bits/sample are obtained, whereas the average distortion reduction is 89.65%. The rate gain due to DPCM is smaller than in Section V-D, where dataset A is considered, but still significant. Furthermore, the system with entropy coding and Clayton copula regression delivers better MSE performance than the state-of-the-art system in [20], obtaining an average distortion reduction of 88.03%. Thus, the proposed copula regression algorithm delivers more accurate inference than the Gaussian regression [20], since copulas can efficiently capture the dependencies among the sensor data.

F. Comparison with DSC for Different WSN Topologies

Finally, we compare our system, which applies DPCM and Clayton copula regression, with a state-of-the-art DSC


system [53], [54]. The DSC system in [53], [54] abides by the following WSN topology [15]: the sensors are separated into smaller groups called clusters. Each cluster contains a cluster head (CH) and a number of peripheral nodes (PNs). The data collected from the CH are intra-encoded and communicated to a central node (decoder), where they play the role of side information that is used to decode the data from the PNs, which are Wyner-Ziv [8] encoded. The readings of each sensor are uniformly quantized and the resulting quantization indices are split into bit-planes. The CH performs arithmetic entropy encoding of the bit-planes sequentially, starting from the most significant one. The PNs perform Slepian-Wolf [7] encoding of the bit-planes using Low-Density Parity-Check Accumulate (LDPCA) codes [55]. A multivariate Gaussian distribution is used to describe the statistical dependencies among the sensor readings of the CH and the PNs.
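The splitting of quantization indices into bit-planes, ordered from the most significant plane, can be sketched as follows in Python. The helper names and the example indices are hypothetical, and the sketch covers only the bit-plane decomposition, not the arithmetic or LDPCA coding that follows it.

```python
def to_bitplanes(indices, num_bits):
    """Return bit-planes, most significant first; plane b holds bit b of every index."""
    return [[(q >> b) & 1 for q in indices] for b in range(num_bits - 1, -1, -1)]

def from_bitplanes(planes):
    """Rebuild the indices by shifting in one plane at a time, most significant first."""
    out = [0] * len(planes[0])
    for plane in planes:
        out = [(v << 1) | bit for v, bit in zip(out, plane)]
    return out

# Four quantization indices of a 3-bit (M = 8) quantizer.
planes = to_bitplanes([5, 0, 7, 2], num_bits=3)
restored = from_bitplanes(planes)
```

Decoding the planes in this order lets the decoder refine every sample progressively, since each additional plane halves the uncertainty interval of the index.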

We assume two different configurations of this topology. First, all 21 sensors form one large cluster with sensor 1 being the CH; we refer to this as the single-cluster topology. Second, the WSN is divided into two clusters; the first comprises the 11 sensors measuring temperature, whereas the second includes the sensors that monitor humidity.

Fig. 11 shows that the proposed design significantly outperforms the DSC system for both the single-cluster and the two-cluster topologies. In particular, our system offers average rate savings of 1.2662 bits/sample and 1.0088 bits/sample compared to the single- and two-cluster DSC systems, respectively. The corresponding average effective distortion reductions are 90.14% and 90.25%. Thus, the proposed design efficiently leverages the inter-sensor dependencies among heterogeneous data, i.e., both temperature and humidity. However, we see that the two-cluster DSC system outperforms the single-cluster configuration, showing that this DSC design is more suitable for homogeneous data. Hence, the DSC system requires grouping the sensors that measure homogeneous data (i.e., temperature or humidity) in order to improve its performance.

VI. CONCLUSION

We have proposed a novel MT source code design for multisensory WSNs monitoring diverse data, such as temperature and humidity. Our design achieves significant compression gains compared to the state of the art, because it takes into account both the inter- and intra-sensor data dependencies. Firstly, to express the dependence structure among the diverse data types collected by the various sensors (such as humidity or temperature sensors), we proposed the use of multivariate copula functions belonging to the Elliptical and the Archimedean family. Our system provides accurate statistical inference via regression, by means of a proposed algorithm that delivers accurate estimation at a reasonable complexity. Secondly, to leverage the intra-sensor data dependencies, we used a predictive quantization technique, namely, DPCM. Through experimentation using actual sensor measurements from the well-established Intel-Berkeley database [34], we showed that the proposed system significantly outperforms state-of-the-art designs [20], [53], [54]. Finally, the proposed scheme is flexible, as it does not require reconfiguration when a subset of sensors is not operating.

VII. ACKNOWLEDGEMENTS

The work was supported in part by the FWO under Project G025615, in part by the VUB-UPatras International Joint Research Group on ICT (JICT) under Project OZR/2015/167, and in part by the VUB strategic research programme M3D2. We would also like to thank the reviewers for their constructive comments.

REFERENCES

[1] I. F. Akyildiz and M. C. Vuran, Wireless sensor networks. John Wiley & Sons, 2010, vol. 4.

[2] G. Smart, N. Deligiannis, R. Surace, V. Loscri, G. Fortino, and Y. Andreopoulos, “Decentralized time-synchronized channel swapping for ad hoc wireless networks,” IEEE Trans. Veh. Technol., 2016.

[3] M. Vecchio, R. Giaffreda, and F. Marcelloni, “Adaptive lossless entropy compressors for tiny IoT devices,” IEEE Trans. Wireless Commun., vol. 13, no. 2, pp. 1088–1100, 2014.

[4] F. Marcelloni and M. Vecchio, “A simple algorithm for data compression in wireless sensor networks,” IEEE Commun. Lett., vol. 12, no. 6, pp. 411–413, 2008.

[5] D. I. Sacaleanu, R. Stoian, D. M. Ofrim, and N. Deligiannis, “Compression scheme for increasing the lifetime of wireless intelligent sensor networks,” in Proc. Eur. Signal Process. Conf. (EUSIPCO), 2012, pp. 709–713.

[6] A. Gersho and R. M. Gray, Vector quantization and signal compression. Springer, 1992.

[7] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inf. Theory, vol. 19, no. 4, pp. 471–480, 1973.

[8] A. D. Wyner and J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inf. Theory, vol. 22, no. 1, pp. 1–10, 1976.

[9] N. Deligiannis, A. Sechelea, A. Munteanu, and S. Cheng, “The no-rate-loss property of Wyner-Ziv coding in the Z-channel correlation case,” IEEE Commun. Lett., vol. 18, no. 10, pp. 1675–1678, 2014.

[10] T. Berger, “Multiterminal source coding,” Inf. Theory Approach Commun., vol. 229, pp. 171–231, 1977.

[11] S.-Y. Tung, “Multiterminal source coding,” Ph.D. dissertation, Cornell University, May 1978.

[12] J. Wang, J. Chen, and X. Wu, “On the sum rate of Gaussian multiterminal source coding: New proofs and results,” IEEE Trans. Inf. Theory, vol. 56, no. 8, pp. 3946–3960, 2010.

[13] F. Oldewurtel, M. Foks, and P. Mahonen, “On a practical distributed source coding scheme for wireless sensor networks,” in Proc. IEEE Veh. Technol. Conf. (VTC Spring). IEEE, 2008, pp. 228–232.

[14] N. Deligiannis, E. Zimos, D. M. Ofrim, Y. Andreopoulos, and A. Munteanu, “Distributed joint source-channel coding with raptor codes for correlated data gathering in wireless sensor networks,” in Int. Conf. Body Area Networks (Bodynets), 2014, pp. 279–285.

[15] N. Deligiannis, E. Zimos, D. Ofrim, Y. Andreopoulos, and A. Munteanu, “Distributed joint source-channel coding with copula-function-based correlation modeling for wireless sensors measuring temperature,” IEEE Sensors J., vol. 15, no. 8, pp. 4496–4507, 2015.

[16] F. Chen, M. Rutkowski, C. Fenner, R. C. Huck, S. Wang, and S. Cheng, “Compression of distributed correlated temperature data in sensor networks,” in Proc. Data Compress. Conf. (DCC). IEEE, 2013, pp. 479–479.

[17] Y. Yang, V. Stankovic, Z. Xiong, and W. Zhao, “On multiterminal source code design,” IEEE Trans. Inf. Theory, vol. 54, no. 5, pp. 2278–2302, 2008.

[18] V. Stankovic, L. Stankovic, S. Wang, and S. Cheng, “Distributed compression for condition monitoring of wind farms,” IEEE Trans. Sustain. Energy, vol. 4, no. 1, pp. 174–181, 2013.

[19] Y. Yang and Z. Xiong, “Distributed source coding without Slepian-Wolf compression,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT). IEEE, 2009, pp. 884–888.

[20] S. Cheng, “Multiterminal source coding for many sensors with entropy coding and Gaussian process regression,” in Proc. Data Compress. Conf. (DCC). IEEE, 2013, p. 480.

[21] M. W. Marcellin and T. R. Fischer, “Trellis coded quantization of memoryless and Gauss-Markov sources,” IEEE Trans. Commun., vol. 38, no. 1, pp. 82–93, 1990.


[22] V. Stankovic, A. D. Liveris, Z. Xiong, and C. N. Georghiades, “On code design for the Slepian-Wolf problem and lossless multiterminal networks,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1495–1507, 2006.

[23] S. S. Pradhan and K. Ramchandran, “Distributed source coding using syndromes (DISCUS): Design and construction,” IEEE Trans. Inf. Theory, vol. 49, no. 3, pp. 626–643, 2003.

[24] A. D. Liveris, Z. Xiong, and C. N. Georghiades, “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Commun. Lett., vol. 6, no. 10, pp. 440–442, 2002.

[25] J. Garcia-Frias, “Compression of correlated binary sources using turbo codes,” IEEE Commun. Lett., vol. 5, no. 10, pp. 417–419, 2001.

[26] Q. Xu, V. Stankovic, and Z. Xiong, “Distributed joint source-channel coding of video using raptor codes,” IEEE J. Sel. Areas Commun., vol. 25, no. 4, pp. 851–861, 2007.

[27] M. Fresia, L. Vandendorpe, and H. V. Poor, “Distributed source coding using raptor codes for hidden Markov sources,” IEEE Trans. Signal Process., vol. 57, no. 7, pp. 2868–2875, 2009.

[28] J. S. Murray, D. B. Dunson, L. Carin, and J. E. Lucas, “Bayesian Gaussian copula factor models for mixed data,” Journal of the American Statistical Association, vol. 108, no. 502, pp. 656–665, 2013.

[29] S. G. Iyengar, P. K. Varshney, and T. Damarla, “A parametric copula-based framework for hypothesis testing using heterogeneous data,” IEEE Trans. Signal Process., vol. 59, no. 5, pp. 2308–2319, 2011.

[30] M. Sklar, Fonctions de repartition a n dimensions et leurs marges. Universite Paris 8, 1959.

[31] R. B. Nelsen, An Introduction to Copulas (Springer Series in Statistics). Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2006.

[32] H. Noh, A. E. Ghouch, and T. Bouezmarni, “Copula-based regression estimation and inference,” Journal of the American Statistical Association, vol. 108, no. 502, pp. 676–688, 2013.

[33] R. A. Parsa and S. A. Klugman, “Copula regression,” Variance: Advancing the Science of Risk, vol. 5, pp. 45–54, 2011.

[34] S. Madden, “Intel Berkeley research lab data,” 2003.

[35] C. K. Williams, “Regression with Gaussian processes,” in Mathematics of Neural Networks. Springer, 1997, pp. 378–382.

[36] J. G. Proakis and M. Salehi, Communication systems engineering, 2002.

[37] I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression,” Commun. ACM, vol. 30, no. 6, pp. 520–540, 1987.

[38] P. Embrechts, A. McNeil, and D. Straumann, “Correlation and dependence in risk management: properties and pitfalls,” Risk Management: Value at Risk and Beyond, pp. 176–223, 2002.

[39] H. Joe, Dependence modeling with copulas. CRC Press, 2014.

[40] E. W. Frees and E. A. Valdez, “Understanding relationships using copulas,” North American Actuarial Journal, vol. 2, no. 1, pp. 1–25, 1998.

[41] C. Genest and J. Mackay, “The joy of copulas: bivariate distributions with uniform marginals,” The American Statistician, vol. 40, no. 4, pp. 280–283, 1986.

[42] C. Genest and L.-P. Rivest, “On the multivariate probability integral transformation,” Stat. Probab. Lett., vol. 53, no. 4, pp. 391–399, 2001.

[43] R. T. Clemen and T. Reilly, “Correlations and copulas for decision and risk analysis,” Management Science, vol. 45, no. 2, pp. 208–224, 1999.

[44] A. J. McNeil and J. Neslehova, “Multivariate Archimedean copulas, d-monotone functions and ℓ1-norm symmetric distributions,” Ann. Stat., pp. 3059–3097, 2009.

[45] E. Cuvelier and M. Noirhomme-Fraiture, “Clayton copula and mixture decomposition,” ASMDA 2005, pp. 699–708, 2005.

[46] E. Bouye, V. Durrleman, A. Nikeghbali, G. Riboulet, and T. Roncalli, “Copulas for finance-a reading guide and some applications,” Available at SSRN 1032533, 2000.

[47] A. W. Bowman and A. Azzalini, Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford University Press, 1997.

[48] V. A. Epanechnikov, “Non-parametric estimation of a multivariate probability density,” Theory of Probability and Its Applications, vol. 14, no. 1, pp. 153–158, 1969.

[49] M. P. Wand and M. C. Jones, Kernel smoothing. CRC Press, 1994.

[50] A. E. Gelfand and A. F. Smith, “Sampling-based approaches to calculating marginal densities,” Journal of the American Statistical Association, vol. 85, no. 410, pp. 398–409, 1990.

[51] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein et al., “Introduction to algorithms,” MIT Press, vol. 44, pp. 97–138, 1990.

[52] W. Breymann, A. Dias, and P. Embrechts, “Dependence structures for multivariate high-frequency data in finance,” Quantitative Finance, vol. 3, no. 1, pp. 1–14, 2003.

[53] N. Deligiannis, A. Munteanu, S. Wang, S. Cheng, and P. Schelkens, “Maximum likelihood Laplacian correlation channel estimation in layered Wyner-Ziv coding,” IEEE Trans. Signal Process., vol. 62, no. 4, pp. 892–904, 2014.

[54] Z. Xiong, A. D. Liveris, and S. Cheng, “Distributed source coding for sensor networks,” IEEE Signal Process. Mag., vol. 21, no. 5, pp. 80–94, 2004.

[55] D. Varodayan, A. Aaron, and B. Girod, “Rate-adaptive codes for distributed source coding,” Signal Processing, vol. 86, no. 11, pp. 3123–3130, 2006.

Evangelos Zimos received the Diploma in Electrical and Computer Engineering from the University of Patras, Greece, in 2010. From Aug. 2010 to Dec. 2012 he worked as a communication engineer in the Hellenic military and in industry. Since Feb. 2013 he has been a Ph.D. researcher in the Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel (VUB). His research interests mainly include wireless sensor networks, statistical inference and network information theory.

Dimitris Toumpakaris received a Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece, in 1997, and an M.S. and a Ph.D. degree in Electrical Engineering from Stanford University in 1999 and 2003, respectively. Between 2003 and 2006 he was a Senior Design Engineer at Marvell Semiconductor Inc., Santa Clara, California. He has also worked as a consultant for Ikanos Communications and Marvell Semiconductor Inc. During 2012-2014 he was an Editor of IEEE Communications Letters. He is currently an Assistant Professor in the Department of Electrical and Computer Engineering, University of Patras, Greece. His research interests include information theory with emphasis on multiuser communications systems, interference management, synchronization and estimation. Views expressed in the article reflect the personal opinion of the author.

Adrian Munteanu received the M.Sc. degree in electronics from Politehnica University of Bucharest, Romania, in 1994, the M.Sc. degree in biomedical engineering from University of Patras, Greece, in 1996, and the Doctorate degree in applied sciences (awarded with the highest distinction and congratulations of the jury members) from Vrije Universiteit Brussel (VUB), Belgium, in 2003.

In the period 2004-2010 he was a post-doctoral fellow with the Fund for Scientific Research Flanders (FWO), Belgium, and since 2007 he has been a professor at VUB. His research interests include multiview video processing, scalable image and video coding, distributed video coding, scalable coding of 3D graphics, 3D video coding, error-resilient coding, multiresolution image and video analysis, video segmentation and indexing, multimedia transmission over networks and statistical modeling.

Prof. Munteanu is the author of more than 250 journal and conference publications, book chapters, patent applications and contributions to standards. He is the recipient of the 2004 BARCO-FWO prize for his PhD work. Prof. Munteanu currently serves as Associate Editor for IEEE Transactions on Multimedia.


Nikos Deligiannis (S'08, M'10) is an assistant professor with the Electronics and Informatics department at Vrije Universiteit Brussel. He received the Diploma in electrical and computer engineering from the University of Patras, Greece, in 2006, and the PhD in applied sciences (awarded with Highest Honors and congratulations from the jury) from Vrije Universiteit Brussel, Belgium, in 2012.

From Jun. 2012 to Sep. 2013, he was a postdoctoral researcher with the Department of Electronics and Informatics, Vrije Universiteit Brussel. From Oct. 2013 to Feb. 2015, he worked as a postdoctoral research associate at the Department of Electronic and Electrical Engineering at University College London, UK. During that period, he also acted as a technical consultant on big visual data technologies at the British Academy of Film and Television Arts (BAFTA), UK. His current research interests include big data processing, compressed sensing, data analysis and networking for the internet of things, distributed computing, and visual search. Prof. Deligiannis has authored over 70 journal and conference publications, book chapters, and two patent applications (one owned by iMinds, Belgium and the other by BAFTA, UK). He was a recipient of the 2011 ACM/IEEE International Conference on Distributed Smart Cameras Best Paper Award and the 2013 Scientific Prize FWO-IBM Belgium.
