quality enhancement of printed-and-scanned...

ARTICLE IN PRESS

0165-1684/$ - se

doi:10.1016/j.si

$Parts of thi

F. Deguillaume

side informatio

multimedia and

Photonics West

and Applicatio�CorrespondE-mail addr

Oleksiy.Koval@

Frederic.Deguil

Thierry.Pun@c

URL: http:/

Signal Processing 87 (2007) 1301–1313

www.elsevier.com/locate/sigpro

Quality enhancement of printed-and-scanned images usingdistributed coding$

S. Voloshynovskiy�, O. Koval�, F. Deguillaume, T. Pun

University of Geneva - CUI, 24 rue du General Dufour, CH-1211 Geneva 4, Switzerland

Received 20 December 2005; received in revised form 31 October 2006; accepted 7 November 2006

Available online 8 December 2006

Abstract

In this paper we address visual communications via print-and-scan channels from an information-theoretic point of view

as communications with side information that targets quality enhancement of visual data at the output of this type of

channels. The solution to this problem addresses important aspects of multimedia data processing and management.

A practical approach to side information communications for printed documents based on Wyner–Ziv and Gray setups is

analyzed in the paper that assumes two separated communications channels where an appropriate distributed coding

should be elaborated. The printing channel is considered to be a direct visual channel for images (‘‘analog’’ channel with

degradations). The ‘‘digital channel’’ is considered to be an appropriate auxiliary channel exploited to communicate the

information exploited for quality enhancement of printed-and-scanned image. We demonstrate both theoretically and

practically how one can benefit from this sort of ‘‘distributed paper communications’’.

r 2006 Elsevier B.V. All rights reserved.

Keywords: Data-hiding; Printed-and-scanned documents; Communications with side-information; Bar codes; Distributed coding; Joint

source-channel coding; Gel’fand–Pinsker problem; Wyner–Ziv coding; Gray coding; Image enhancement

1. Introduction

Printed documents are still the most common,habitual and widely used form of visual informationrepresentation. Therefore, as with digital media, the

e front matter r 2006 Elsevier B.V. All rights reserved

gpro.2006.11.003

s paper appeared as S. Voloshynovskiy, O. Koval,

and Thierry Pun, Visual communications with

n via distributed printing channels: extended

security perspectives, in: Proceedings of SPIE

, Electronic Imaging 2004, Multimedia Processing

ns, San Jose, USA, January 18–22, 2004.

ing authors.

esses: [email protected] (S. Voloshynovskiy),

cui.unige.ch (O. Koval),

[email protected] (F. Deguillaume),

ui.unige.ch (T. Pun).

/watermarking.unige.ch/, http://sip.unige.ch.

important emerging issues for printed documentsare copyright protection, authentication, integritycontrol, tracking and finally quality enhancement.The printed document can be considered as a partof visual communications where data are convertedinto some specific form suitable for reliablecommunications via printing channels, with somegiven constraints on perceptual quality. Digitalhalftoning is often used for the transformation ofdigital documents into printed form. Although,being very simple and having been used since1855, this transformation unavoidably leads to thedegradation of the original document quality.This is not the case for digital documents and hasimportant consequences for document security andmanagement.

.

www.elsevier.com/locate/sigpro

dx.doi.org/10.1016/j.sigpro.2006.11.003

mailto:[email protected]




http://watermarking.unige.ch/,

http://sip.unige.ch

ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131302

Digital data-hiding [1–4] has been considered as apossible approach to address the above problemsfor digital documents. However, in most cases thetraditional means of digital data-hiding are notalways suitable for printed documents and newapproaches are required. Moreover, the printingindustry has already recognized a limit in theenhancement of printed/scanned image qualityrelated to inverse halftoning. Further enhancementwould imply the development of more expensiveand complex printing/scanning devices, which is notappropriate for many low-cost and real-time appli-cations. At the same time the security of printeddocuments (not necessarily printed on paper)including passports, visas, ID cards, driving li-censes, diplomas, credit cards, banknotes or checksis usually accomplished by proprietary know-howtechnologies and digital data-hiding is far to bewidely accepted in this field.

In this paper we consider some practicalapproaches and analyze several scenarios aimingat further extending the usage of distributed codingfor printed documents. This problem is tackledfrom an information-theoretic point of view ascommunications with side information. Therefore,the goal of the paper is three-fold:

�
introduce an information-theoretic formalisminto visual communications via distributed print-ing channels; � consider possible communications scenarios withside information for multimedia and securityapplications; �
1Note: More general auxiliary channel can be designed using

different information carriers used in the printed (plastic)

documents such as bar codes, invisible inks or crystals, magnetic

strips, cheap electronic low-capacity chips.

establish corresponding bounds on system per-formance assuming side information available atthe encoder or at the decoder depending on theparticular scenario.

The main practical focus of this paper is onimportant problem of quality enhancement ofdocuments at the output of print-and-scan channels.One of principle applications were such a problem isplaying a crucial role is document authenticationand person identification. In such a protocol oneshould verify if the photo contained in theidentification document (passport, ID card, driverlicense, etc.) is authentic or not. Evidently, in suchan application quality of the data at the output ofthe print-and-scan channel should be sufficientlyhigh to guarantee errorless protocol performance.

As an another possible application one canconsider the case of publicly distributed documents

such as journals, newspapers, books and advertise-ment materials. For instance, Fig. 1 represents thebasic setup where the image of concern (Lena)should be extracted from the printed document withthe purpose of authentication, tracking or qualityenhancement. If this image is simply scanned by a

public user that does not have any access to specificinformation, the best quality of the resultingscanned image will be defined by the appliedprocessing prior to printing (possibly lossy com-pression, resampling or blurring), by the quality ofthe printing and finally by the quality of thescanning (inverse halftoning). It is obvious that insome cases the degradation of quality can be quiteconsiderable so that the scanned image cannot befurther exploited for the targeted applications.

Moreover, in some practical scenarios of personidentification based on ID documents, the printeddocument cannot be reproduced in color, if cheapprinting is used or when the physics of thereproduction process does not allow to achieve itOne example consists in laser engraving of photoson plastic cards used in security printing. In thiscase, the color information can be communicatedvia data-hiding or some auxiliary channel exploitingthe redundancy between color planes.

A different situation arises when the imagecontains some embedded data or when these dataare communicated via some auxiliary channel thatcan assist the quality enhancement of the printeddocument. Assume here for simplicity that thisauxiliary channel is a 2-D high capacity bar codeshown at the bottom of the page (Fig. 1).1 The barcode can be designed as a specific pattern or logo toavoid an annoying random-like appearance on thepage. The bar code can be also attached to the lastpage of the journal or can be printed on somesupplying layout. Access to the embedded data or tothe content of the bar code is authorized and can bemanaged according to the selected distributionprotocol. In the simplest case, it can be grantedfor an extra price. We call the person who hasobtained the above access rights a private user or an

authorized user. The private user can somehowbenefit from the appropriately communicated extrainformation in enhancing the quality of the scanneddocument. Therefore, the overall goal is to design

ARTICLE IN PRESS

Fig. 1. Visual communications via printing channels: example of journal with side information communications in the form of 2-D bar

code.

S. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1303

an optimal system that will provide the maximumpossible quality of the reconstructed image withminimum needed rate of the extra information to beadditionally communicated to the decoder.

Two alternative solutions to the formulatedproblem exist that differ by the physical channelstructure of the channel conveying the qualityupdate information. The first one is based on thehigh capacity data-hiding and exploits a self-embedding principle for this purpose [5]. The secondone exploits high capacity 2-D bar codes, invisibleinks or crystals, magnetic strips, cheap electroniclow-capacity chips an auxiliary channel to transferthis update.

Up to our knowledge, only some theoretical resultsjustifying the performance of protocol where infor-mation and host interference are simultaneouslytransmitted over a channel are available [5–7] andnone existing practical high capacity data-hidingtechnique simultaneously satisfies the visibility andpayload constraints and provides robustness to thedigital-to-analogue-to-digital conversion. That is

why in the scope of this paper we will mostly focusour analysis on the latter approach.

The paper is organized as follows. Section 2outlines the fundamentals of rate-distortion theorywith side information. Sections 3 presents a mathe-matical model of the printing process. Section 4considers a possible protocol auxiliary channel indexcommunications. Section 5 presents experimentalresults and Section 6 concludes the paper.

Notations: We use capital letters to denote scalarrandom variables X, bold capital letters to denotevector random variables X, corresponding smallletters x and x to denote the realizations of scalarand vector random variables, respectively. Thesuperscript N is used to designate length-N vectorsx ¼ xN ¼ ½x1;x2; . . . ;xN �

T with kth element xk. Weuse X�pX ðxÞ or simply X�pðxÞ to indicate that arandom variable X is distributed according topX ðxÞ. The mathematical expectation of a randomvariable X�pX ðxÞ is denoted by EpX

½X � or simplyby E½X � and Var½X � denotes the variance of X.We use IðX ;Y Þ to denote a mutual information


between two random variables X and Y [8].Calligraphic fonts X denote sets X 2 X and jX jdenotes a cardinality of set X . IN denotes the N �

N identity matrix.

2. Visual communications with side information

2.1. Asymmetric two-side information

Visual communications via printing channels can beconsidered from the point of view of communicationswith an asymmetric side information available at theencoder and decoder. The asymmetry of side informa-tion originates from the fact that the different types ofprior information about the actual image X areavailable at the encoder and actual side informationavailable a posteriori at the decoder. Let S1 denotesthe side information available at the encoder and S2

the side information available at the decoder (Fig. 2).One can consider S1 as a ‘‘virtually predicted’’ imageafter halftoning/inverse halftoning and S2 as thephysically printed-and-scanned image. Assume thatsamples of digital image and side information atencoder and at decoder fX k;S1;k;S2;kg�pX ;S1;S2

ðx; s1; s2Þ are independent drawings of jointly distrib-uted random variables X 2 X , S1 2 S1 and S2 2 S2,where X ;S1;S2 are finite sets. A distortion measure isdefined as dðx; xÞ and the corresponding averagedistortion as D ¼ ð1=NÞ

PNk¼1E½dðX k; X kÞ�. The im-

age fX kg is encoded using a length-N blocks code i

(a bar code, a watermark, etc.) into a binary streamof rate R, which should be decoded back as areconstructed image fX kg.

Cover and Chiang [9] have shown that the rate R isachievable at the distortion level D, if there exists a

sequence of ð2NR;NÞ codes i : XN � SN1 ! f1; 2; . . . ;

2NRg, X : f1; 2; . . . ; 2NRg � SN2 ! X

Nsuch that the

average distortion E½dðX; XðiðX;S1Þ; S2ÞÞ�pD and

RðDÞ ¼ minpðujx;s1Þpðxju;s2Þ

½IðU ;S1;X Þ � IðU ;S2Þ�, (1)

where the minimization is performed under thefollowing distortion constraint

Px;u;s1;s2;x

dðx; xÞpðx; s1; s2Þpðujx; s1Þpðxju; s2ÞÞpD and U is an

S2

i (X,S1)EncoderX

S1,S2S1

Decoder X

Fig. 2. Two-sided asymmetric side information available at both

encoder and decoder.

auxiliary random variable. S2! ðX ;S1Þ ! U andðX ;S1Þ ! ðU ;S2Þ ! X form Markov chain.

In the following we will focus on two particularcases of the considered generalized setup fromFig. 2: symmetric side information and asymmetricside information available only at the decoder.

2.2. Symmetric two-side information (S1 ¼ S2 ¼ S)

To introduce the basic concept of visual commu-nications via printing channels for the broad set ofdifferent application scenarios considered in thefollowing sections, we will simplify the previoussetup to a case with symmetric side information S ¼

S1 ¼ S2 that can be available at both encoder anddecoder or only at the decoder according to Fig. 3.

The rate distortion with symmetric side informa-tion at both encoder and decoder S ¼ S1 ¼ S2 wasfirst introduced by Gray [10] and further consideredby Berger [11] and Flynn and Gray [12] (Fig. 3:dashed line corresponds to availability of SN at theencoder). The corresponding rate-distortion func-tion is defined as

RX jSðDÞ ¼ minpðxjx;sÞ

IðX ; X jSÞ, (2)

where the minimization is over all conditionaldistributions pðxjx; sÞ such that

Px;x;s dðx; xÞpðx; sÞ

pðxjx; sÞpD. The above rate-distortion-functionalso follows from (1) assuming S ¼ S1 ¼ S2 [9].

2.3. Asymmetric side information at decoder

The rate-distortion problem with side informa-tion available at the decoder was considered byWyner and Ziv [13]. It also follows from (1), ifS1 ¼ ;, S2 ¼ S and ðS1;X Þ ¼ X :

RðDÞWZX jS ¼ min

pðujxÞpðxju;sÞ½IðU ;X Þ � IðU ;SÞ�

¼ minpðujxÞpðxju;sÞ

HðUÞ �HðU jX Þ

�HðUÞ þHðU jSÞ


HðU jSÞ �HðU jX ;SÞ


IðU ;X jSÞ, ð3Þ

EncoderX

S

Decoder X

ps,x (x,s)

i (X)

Fig. 3. Symmetric side information S1 ¼ S2 ¼ S.

ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1305

where the minimization is performed over allpðujxÞpðxju; sÞ, i.e., pðx; s; u; xÞ ¼ pðx; sÞpðujxÞpðxju; sÞfor fixed pðx; sÞ, and all decoder functions X ðiðX Þ;SÞsuch that

Px;x;u;s pðx; sÞ pðujxÞpðxju; sÞdðx; xÞpD and

jUjpjX j þ 1. The third equality in (3) follows fromthe fact that S! X ! U form a Markov chain andHðU jX Þ ¼ HðU jX ;SÞ, consequently, while the secondMarkov chain is formed as X ! ðU ;SÞ ! X .

In the general case, IðX ;X jSÞpIðU ;X jSÞ andtherefore it can be a loss in performance of theWyner–Ziv setup in comparison to the rate-distor-tion function with side information available atboth encoder and decoder [14].

We will denote the classic rate-distortion functionwithout any side information (S1 ¼ ;, S2 ¼ ;) as

RðDÞ ¼ minpðxjxÞ

IðX ;X Þ, (4)

where the minimization is performed over all pðxjxÞ

such thatP

x;x pðxÞpðxjxÞdðx; xÞpD.In the subsequent subsections we will consider the

case of a Gaussian formulation of the rate-distor-tion problem. A methodology allowing to extendthe results obtained for the discrete alphabets tocontinuous formulation can be found in [15].

2.4. Shannon lower bound (SLB): Gaussian source

and MSE distortion

The classic rate-distortion function without anyside information (4) for a Gaussian N ð0;s2X Þ sourcewith squared error distortion is represented by theSLB [8]:

RðDÞ ¼1

2logþ2

s2XD

, (5)

where logþ2 x ¼ maxf0; log2ðxÞg. The correspondingrate-distortion function simultaneously describingN independent Gaussian random variables (aparallel Gaussian source model) X k�N ð0;s2X k

Þ, k ¼

1; 2; . . . ;N with a rate R bits/source vector is givenby

RðDÞ ¼XN

k¼1

1

2log2

s2X k

Dk

, (6)

where

Dk ¼l if los2X k

;

s2X kif lXs2X k

;

((7)

where l is chosen to satisfyPN

i¼1Dk ¼ D.

2.4.1. Side information at encoder and decoder

Let X be a sample generated by a zero mean andstationary Gaussian memoryless source, i.e.,X�N ð0;s2X Þ. The side information is given by anoisy version of source, i.e., S ¼ X þ Z0 where X

and Z0 are independent and Z0�N ð0; s2Z0 Þ.The rate-distortion function (2) in this case will

have the same character as for the stationaryGaussian source:

RðDÞX jS ¼1

2logþ2

s2X jSD

, (8)

where s2X jS ¼ s2X ð1� r2Þ is conditional variance,

r ¼ E½ðX � X ÞðS � SÞT�=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiVar½X �Var½S�p

¼ sX=sS

is the correlation coefficient and s2S ¼ s2X þ s2Z0 ,thus, providing s2X jS ¼ s2Zs

2X=ðs

2Z þ s2X Þ.

Consequently, the rate saving due to the availableside information at the encoder and the decoderwith respect to the independent coding of X is

RðDÞ � RðDÞX jS ¼1

2log2

1

1� r2. (9)

If r2! 0, i.e., the variance of the noise Z0 isincreasing and X and S become less correlated,RðDÞX jS ! RðDÞ and no gain can be expected fromthe joint decoding.

In the case of non-stationary Gaussian vector(parallel Gaussian source), X ¼ fX 1;X 2; . . . ;X Ng,where X k�N ð0;s2X k

Þ, we have similarly to (6):

RðDÞX jS ¼XN

k¼1

1

2log2

s2X ks2Z

ðs2X kþ s2ZÞDk

, (10)

where

Dk ¼

l if los2X k

s2Zs2X kþ s2Z

;

s2X ks2Z

s2X kþ s2Z

ifs2X k

s2Zs2X kþ s2Z

;

8>>>><>>>>: (11)

where l is chosen to satisfyPN

i¼1Dk ¼ D.

2.4.2. Side information at decoder

Wyner and Ziv [13,15] demonstrated that for theGaussian case RðDÞWZ

X jS ¼ RðDÞX jS. However, gen-erally RðDÞWZ

X jSXRðDÞX jS [14]. This means that thereis a rate loss with the Wyner–Ziv coding. Also it isknown that for discrete sources when D ¼ 0, theWyner–Ziv problem reduces to the Slepian-Wolfproblem [16] with Rð0ÞWZ

X jS ¼ Rð0ÞX jS ¼ HðX jSÞ

assuming that S is communicated to the decoderat the rate HðSÞ.


2.4.3. Approaching theoretical rate-distortion function

with symmetric side-information (Wyner– Ziv vs Gray

setups)

For practical reasons, we will perform a com-parative analysis of the non-collaborative encodingsetup (Wyner–Ziv problem) with collaborative one(Gray problem) for distributed coding in printingapplications.

In the general case, the following ‘‘inequalitysandwich’’ is valid:

RðDÞXRðDÞWZX jSXRðDÞX jS. (12)

Therefore, the rate-distortion function RðDÞWZX jS is

above the rate-distortion function RðDÞX jS. In ourparticular case of distributed coding with sideinformation in printing applications, the sideinformation at the encoder is naturally given bythe physical nature of the problem. Therefore, theprinting application should definitively benefit fromthe availability of side information.

2.4.4. Compressed channel index communication

In printed documents, the side information S isdirectly available as the printed image (possiblyafter inverse halftoning). However, the index i

should be somehow communicated to the decoderin practical systems. According to the justification,given in Introduction, we will assume that it iscommunicated via an auxiliary channel that caninclude different information carriers used in theprinted (plastic) documents such as bar codes,invisible inks or crystals, magnetic strips, cheapelectronic low-capacity chips.

3. Mathematical model of printing

To investigate the ‘‘quality’’ of side information,i.e., to establish how close S is correlated with X

from one side and to investigate the possibility ofself-embedded communications from another side,one should consider a mathematical model of aprinting channel.

3.1. Halftoning and inverse halftoning: image

processing perspectives

Digital halftoning is a process that converts agray-scale image (X 2 f0; 1; . . . ; 255g) into a bi-levelimage (X 2 f0; 255g) that can easily be reproducedby a simple printing device. There are two types ofhalftoning, i.e., ordered and random (or so-called

error diffusion) halftoning. The ordered halftoningis mostly used in laser printers while error diffusionrepresents a broad class of ink jet printing devices.We will focus on the later for demonstrationpurposes. The error diffusion takes the error fromthe quantized gray-scale pixel to bi-level pixel anddiffuses the quantization error over a causalneighborhood The inverse halftoning of errordiffusion should ensure that the spatially localizedaverage pixels values of the halftone and originalgray-scale image are similar. Kite et al. [17]proposed an accurate linear model for errordiffusion halftoning. This model accurately predictsthe ‘‘blue noise’’ (high-frequency noise) and edgesharpening effects observed in various error-dif-fused halftones. The resulting halftone image s isobtained as

s ¼ HxþQw ¼ Hxþ z0, (13)

where H represents a linear shift-invariant (LTI)halftone filter, Q is a LTI filter corresponding to theerror diffusion coloring, w is white noise and z0 ¼

Qw is colored noise.Therefore, the inverse halftoning according to (13)

can be posed as a deconvolution or restorationproblem (in image processing) or channel equaliza-tion problem (in digital communications) whereoperator H represents the equivalent blurring orchannel and z0 corresponds to the image acquisitionor channel noise.

The distributed coding system with communica-tions via printing channels is shown in Fig. 4. Thesystem consists of two parts. The first part isthe physically printed image S and the second partis the auxiliary channel dedicated to the communica-tion of index i that can be considered as a bar code.The physical printing channel is represented as asequential set of operations comprising halftoning,printing and scanning. Here, we assume that printingitself, as the black point reproduction process with aspecified resolution, and scanning do not introduceany distortions. In more general case one would needto take into account some optical blurring andaberrations as well as sensor non-linearity.

Considering the halftoning problem according tothe vector channel (13), assuming a vector Gaussiansetup X�N ð0;CX Þ and Z0�N ð0;CZ0 Þ, the deconvo-lution problem can be formulated as the MMSEestimation of x given the conditional meanX0¼ E½XjS�:

x0¼ ðHTC�1Z0 H þ C�1X Þ

�1HTC�1Z0 s ¼ HW s, (14)

ARTICLE IN PRESS

XS

Z'

X'H HW

MMSE estimate

X X'⇒

Z

Fig. 5. Equivalent halftoning/inverse halftoning channel.

Encoder

Channel

Decoder ˆ

Halftoning Printing Scanning

X i (X,S) X

S

X S

Fig. 4. Printing channel communication setup.


where HW ¼ ðHTC�1Z0 H þ C�1X Þ

�1HTC�1Z0 representsa Wiener ‘‘restoration-equalization’’ filter.

The estimation error Z ¼ X0� X ¼ E½XjS� � X is

also Gaussian Z�N ð0;CZÞ due to linear nature ofthe MMSE estimate where the covariance matrix isdefined as

CZ ¼ CX jS ¼ E½ðX0� XÞðX

0� XÞT�

¼ ðHTC�1Z0 H þ C�1X Þ�1. ð15Þ

The halftoning/inverse halftoning printing channelcan be sequentially represented by an equivalentchannel shown in Fig. 5.

3.2. Inverse halftoning: practical low-complexity

algorithm

In this section we will extend the results obtainedfor i.i.d. Gaussian signals to real images. The keyissue is the availability of an accurate, simple andtractable stochastic image model leading to a closeform solution. Although stochastic inverse half-toning is a stand-alone important problem, we willuse here the state-of-the-art inverse halftoningmethod proposed by Neelamani et al. [18] forwavelet-based inverse halftoning via deconvolution(WInHD). This approach is based on the originalidea of Donoho [19] called wavelet-vaguelette

decomposition. The wavelet-vaguelette decomposi-tion splits the deconvolution problem into twoparts. In the first part, the operator H in (13) isinverted leading to the ‘‘deblurred’’ estimate:

ex ¼ H�1s ¼ xþH�1Qz0 (16)

that produces the clean image x with noisy partH�1Qz0 assuming that the inverse operator H�1

exists (that it is not the case in general for all imagerestoration problems).

The second part of wavelet-vaguelette decom-position tries to remove the noisy part H�1Qz0 usingwavelet-based denoising. Therefore, the relevantproblem of stochastic image modeling is reducedto the proper modeling of wavelet coefficients.Hjorungnes et al. [20] have proposed a simpleand tractable model that has found a numberof applications in state-of-the-art image coding(EQ-coder [21]), denoising (wavelet Wiener filter[22]) and watermarking for evaluation of data-hiding capacity [3] and content-adaptive data-hiding[23]. According to the EQ framework, imagecoefficients are considered coming from a realizationof independent Gaussian distribution with a‘‘slowly’’ varying variance. This represents a classof local Gaussian models and justifies the applica-tion of local operations. It should be mentioned thatthe relationship between the global image statisticsand locally Gaussian assumption was elegantlydemonstrated based on mixture model where thevariance also becomes a random parameter drawnfrom the exponential distribution [20]. This alsojustifies a doubly stochastic modeling of imagecoefficients and provides a logical connection to theparallel Gaussian source model considered inSection 2.4.

The denoiser of WInHD is also based on the samemodel that leads to the Wiener filter as the MMSE(MAP) estimate for Gaussian case. To enhancethe performance of the denoiser, the WInHD


implements the Wiener filter in an overcompletetransform domain (complex wavelets) to overcomethe problems of the critically sampled wavelettransform. Thus the final estimate is obtained as

x0j;k ¼s2j;k

s2j;k þ s2jexj;k, (17)

where j denotes the scale, s2j;k the local variance ofimage coefficients for scale j, s2j the variance ofthe residual noisy term in the transform domainand all variables are considered in the transformdomain.

It is not our primary goal to enhance the WInHD,we will use it in its original form. It should,however, be mentioned that more powerful stochas-tic image models could be used, taking into accountthe stationarity of transform image coefficientsalong the edges and transitions. Such modelsshould bring additional enhancement of the inversehalftone image quality and corresponding higherreliability of side information at the decoder [24].

4. System implementation: index communication via

auxiliary physical channel

The approach based on the index communicationvia auxiliary physical channel can be designed eitherusing the Gray setup, i.e., symmetric two-sidedinformation, or the Wyner–Ziv setup, i.e., sideinformation available only at the decoder. Thebasic system block-diagram for both setups isshown in Fig. 6 with the difference consisting inthe availability of side information SN at theencoder, i.e., Gray setup.

The practical design of the auxiliary channelindex communication system includes two mainfunctional blocks, namely source coding for theresidual compression and channel coding (2-D barcode) for index communication.

Halftoning

i

Z

X

Encoder

HW

S

MultilevelEncoder

Fig. 6. Index communication via au

4.1. Source coding

To establish the lower bound for the rate-distortion function we will consider here only theGray setup assuming that the Wyner–Ziv systemwill do the best to approach this bound usingmultidimensional lattice/coset codes. According tothe analysis presented in Sections 2.4 and 3, weassume that the side information, which is given inthe form of analog (halftone) image is also availableat the encoder due to the complete knowledge of theprinting process (in other words, we assume that theinverse halftone image produced at the encoder isvery accurate approximation of the real scannedimage obtained at the decoder). Thus, the encodershould only communicate the difference Z ¼

E½XjS� � X to the decoder with the given boundeddistortion D choosing corresponding index iðX;SÞ.

From an information-theoretic perspective, anoptimal source code ð2NR;NÞ for this setup consistsof an encoding map i : XN � SN

! f1; 2; . . . ; 2NRg

and a reconstruction map X : f1; 2; . . . ; 2NRg�

SN! XN

, where both encoder ið:Þ and decoderX ð:Þ are designed based on the state information SN .In the general case, rate-distortion performance ofthis encoding scheme is lower bounded by (2) and inthe Gaussian setup the conditional rate distortionfunction is given by (8). The details of a particularpractical implementations of the source encoding/decoding are presented in Section 5.

4.2. Channel coding

The task of the channel code is to communicateerrorlessly (Pr½iaiðSÞ� ! 0) the index i via someanalog auxiliary channel. A natural auxiliarychannel for printed documents consists of 2-D barcodes [25], although as was mentioned in theintroduction other types of information carrierswith sufficient capacity can be used for this purpose.

Printing/Scanning

ˆDecoderHWX'

2-D BarCode

MultistageDecoder

Joint DecoderAnalog

Channel

i

SX

xiliary (2-D bar code) channel.

ARTICLE IN PRESS

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

D

R

0.9 0.8 0.7

0.6

0.5

0.40.3

0.10.2

Fig. 7. Conditional rate-distortion function for different r and

X�N ð0; 1Þ.


In the case of 2-D bar codes, multilevel codes (MLC)[26,27] can be used for errorless index communica-tion. The MLC are known to closely approachcapacity for band-limited channels The main idea ofmultilevel coding consists in the protection of eachaddress bit of the signal point by an individualbinary code C‘ at level ‘. One can use irregular low-density parity-check (LDPC) codes as the binarycode C‘ for each level ‘ followed by 8-PAM chosenfor simplicity reasons and easy integration intoprinted 2-D bar code. One can also apply precodingfor bar-codes to avoid intersymbol interferencecaused by the halftoning filter H. Finally, multistage

decoding is applied based on multiple access channeldecomposition of MLC using chain rule for mutualinformation between channel output and each level.More details about a particular implementation ofthe MLC/MSD channel code will be given in thenext section.

5. Results

First, to investigate the theoretical limits ofsource coding with side information available atboth encoder and decoder, we have modeled therate-distortion function RðDÞX jS for a Gaussiansource and the MSE distortion measure for differentvalues of r2 (Fig. 7), where r is a correlationcoefficient. The highest gain with respect to theclassic coding without side information can beexpected for highly correlated sources X and S

(reliable side information) when r2! 1. If r2 ! 0,there is no gain with respect to RðDÞ. Therefore, theprinting channel (and particularly a joint half-toning/inverse halftoning) should provide an accu-rate estimate of the source X, i.e., to reduce theimpact of the halftoning process. In real applica-tions, r2 is expected to be in the range of0:2pr2p0:6 and therefore for the stationaryGaussian source the maximum gain can be as muchas 0.26–0.66 bit per sample.

Secondly, we have performed the comparison ofdifferent coding strategies for a number of realimages based on Floyd halftoning and wavelet-based inverse haftoning via deconvolution [18].Here we only report results for the images Lenaand Baboon of size of 512� 512. The ‘‘lower

empirical bound’’ is estimated for the Gray setupdescribed in Section 4. The prediction of the inversehalftone images s based on the MMSE estimateðX0¼ E½XjS�Þ is performed at the encoder and the

residual Z ¼ X0� X is compressed at various rates

and communicated to the decoder via 2-D bar code.At the decoder, the MMSE estimate S is computedfrom the printed image and combined with thedecoded residual QðZÞ.

Three different lossy coders are used to compressthe residual QðZÞ, i.e., the DCT-based JPEG,wavelet-based WaveConvert [28] and wavelet-basedEQ coder. The last two codecs are wavelet trans-form-based. In the case of WaveConvert [29] theencoding algorithm is a progressive zero-tree-basedwavelet encoder exploiting progressive quantizationfollowed by the adaptive arithmetic entropy coder.

The EQ lossy image codec [21] exploits anintraband correlations among coefficients in thehigh-frequency wavelet subbands. The main under-lying idea of this method is that the samples in thesesubbands are distributed according to locallyGaussian distribution with slowly varying variancethat might be locally estimated using a maximumlikelihood strategy based on the causal neighbor-hood and potentially a parent sample located in thesame spatial position.

At the end of the estimation step, the coefficientsare quantized using a uniform threshold quantizerselected accordingly to the results of the rate-distortion optimization that is followed by anoptimal adaptive arithmetic coding.

While at the high-rate regime the approximationof the local variance by its quantized version isvalid, it is not the case at low rates. The reason forthat is the quantization to zero most of the datasamples that makes local variance estimationextremely inaccurate. The simple solution proposed


in [21] consists in the placement of all thecoefficients that fall into the quantizer deadzone inthe so-called unpredictable class, and the rest in theso-called predictable class. The samples of the firstone are considered to be distributed globally as ani.i.d. generalized Gaussian distribution, while theinfinite Gaussian mixture model is used to capturethe statistics of the samples in the second one. Thisseparation is performed using a simple rate-depen-dent thresholding operation. The parameters of theunpredictable class are exploited in the rate-distor-tion optimization and are sent to the decoder as sideinformation. The experimental results presented in[12,19] allow to conclude about the state-of-the-art

Fig. 8. Results of computer modeling for the frame of Lena image: (

halftone image; (d) directly JPEG compressed image (rate 0.2 bpp); (e

(f) directly EQ compressed image (rate 0.2 bpp) and (g) EQ compresse

performance of this technique in the image com-pression application.

In this paper we used the following parametersfor the EQ algorithm: 4 level DWT based on 9

7

biorthogonal wavelet filter pair with local varianceML estimate in a causal window of size 3� 3without considering interscale dependencies).

The performed experiments for all mentionedencoders were targeting the following compressionrates: 0.20, 0.25, 0.50 and 1.00 bpp. The obtainedresults are presented in Figs. 8 and 9. Moreover, forcomparison reasons, we indicate in Tables 1 and 2the results of direct compression by theabove methods assuming that one can directly

a) original image Lena; (b) halftone image; (c) WInHD inverse

) JPEG compressed image with side information (rate 0.2 bpp);

d image with side information (rate 0.2 bpp).


communicate the compressed image via 2-D barcode to the decoder without usage of side informa-tion for final image reconstruction. A possible gaindue to the decoding with the side information in theform of a halftone image is also shown in the table.The PSNR of the inverse halftone images (withrespect to the original images) was 31.95 dB for thecase of image Lena and 23.73 dB for image Baboon,respectively. The maximum gain is obtained for theEQ coder, and is in the range of 0.26–2.87 dB fordifferent rates with respect to the directly com-pressed image. Obviously, the capacity of theauxiliary channel will be a factor that determines aparticular operational rate depending on the appli-cation constraints. The enhancement of ‘‘analog’’

Fig. 9. Results of computer modeling for the frame of Baboon image: (

halftone image; (d) directly JPEG compressed image (rate 0.2 bpp); (e

(f) directly EQ compressed image (rate 0.2 bpp) and (g) EQ compresse

(halftone) image quality is also varying for differentrates between 2–8 dB, which indicates a consider-able potential of such an ‘‘upgrade’’ for manypractical applications where the image qualityshould be relatively high. Using advanced methodsof inverse halftoning and more accurate stochasticimage models one can expect a further enhancementof image quality for low-bit rates.

One important question that concerns the pro-posed bar code-based solution to the problem ofquality enhancement of printed-and-scanned docu-ments consists in the physical size of a bar codeconveying quality update information. Accordingly tothe recently obtained results [30], the storage capacityof modern such storage modules is more than

a) original image Baboon; (b) halftone image; (c) WInHD inverse

) JPEG compressed image with side information (rate 0.2 bpp);

d image with side information (rate 0.2 bpp).

ARTICLE IN PRESS

Table 1

Comparison of resulting PSNR [dB] for several compression

methods for image ‘‘Lena’’

Compression method Rate (bpp)

0.20 0.25 0.50 1.00

DCT-based JPEG coder

Direct compression 28.89 30.41 34.64 37.83

Compression with SI 33.55 34.17 36.19 38.61

Gain over direct compression 4.66 3.76 1.55 0.78

Gain over direct halftoning WInHD 1.60 2.22 4.24 6.66

WaveConvert coder





EQ-based coder





Table 2

Comparison of resulting PSNR [dB] for several compression

methods for test image ‘‘Baboon’’

Compression method Rate (bpp)

0.20 0.25 0.50 1.00

DCT-based JPEG coder




Gain over direct halftoning WInHD �0.12 0.66 1.86 3.93

WaveConvert coder





EQ-based coder





S. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131312

2000bytes=in2. In particular, our implementation ofthe 2-D bar codes based on 15 level non-equidistant 8pulse amplitude modulation MLC/MSD schemesusing the LDPC codes allows to reliably store andextract 2400bytes=in2 of information. Thus, it is easyto compute, for instance, that in the case ofcompression rate equal to 0.2 bits/element a bar code

will occupy approximately 17:6 cm2. Supposing that aprinted document should be reproduced on the paperof A4 format, a possible solution might consist in theplacement of a stripe containing compressed informa-tion 1� 17:6 cm2 in the top/bottom of the page.

6. Conclusions

We have considered the problem of visualcommunications with side information via distrib-uted printing channels. In particular, we haveanalyzed the possibility of side information-assistedimage processing in printing applications. We havealso provided a mathematical modeling of theprinting channel in the scope of digital communica-tions with side information. The related problem ofsource coding based on Wyner–Ziv and Gray setupshas been considered. Corresponding practical sce-nario has been analyzed and finally the theoreticalbounds on systems performance are given. Theresults of computer simulation show a range ofpossible applications where the expected gain canconsiderably increase the possibilities and function-alities of modern multimedia and security systems.

Acknowledgments

This paper was partially supported by the SwissNational Science Foundation Professorship GrantNo. PP002-68653/1, by the European Commissionthrough the IST Programme under contract IST-2002-507932-ECRYPT, European project FP6-507609-SIMILAR and Swiss IM2 projects. Theauthors are thankful to Fernando Perez-Gonzalezfor many helpful and interesting discussions ondifferent subjects of lattice coding in personalcommunications and during his visit to Universityof Geneva. The authors are also thankful to themembers of Stochastic Image Processing group(University of Geneva) for many stimulatingdiscussions during seminars.

The information in this document reflects onlythe authors views, is provided as is and noguarantee or warranty is given that the informationis fit for any particular purpose. The user thereofuses the information at its sole risk and liability.

References

[1] I.J. Cox, M.L. Miller, J.A. Bloom, Digital Watermarking,

Morgan Kaufmann Publishers, Academic Press, Los Altos,

CA, New York, 2002.


[2] F. Deguillaume, S. Voloshynovskiy, T. Pun, Hybrid robust

watermarking resistant against copy attack, in: Eleventh

European Signal Processing Conference (EUSIPCO’2002),

Toulouse, France, September 3–6, 2002.

[3] P. Moulin, M.K. Mihcak, A framework for evaluating the

data-hiding capacity of image sources, IEEE Trans. Image

Process. 11 (9) (September 2002) 1029–1042.

[4] F. Perez-Gonzalez, F. Balado, J.R. Hernandez, Performance

analysis of existing and new methods for data hiding with

known-host information in additive channels, IEEE Trans.

Signal Process. (Special Issue on Signal Processing for Data

Hiding in Digital Media and Secure Content Delivery) 51(4)

(April 2003).

[5] E. Martinian, G. Wornell, B. Chen, Authentication with

distortion criteria, IEEE Trans. Inform. Theory (July 2005)

2523–2542.

[6] T. Kalker, F.M. Willems, Coding theorems for reversible

embedding, in: DIMACS Workshop on Network Informa-

tion Theory, vol. 66, Rutgers University, Piscataway, NJ,

March 2003.

[7] A. Sutivong, M. Chiang, T. Cover, Y.-H. Kim, Channel

capacity and state estimation for state-dependent gaussian

channels, IEEE Trans. Inform. Theory 51 (4) (April 2005).

[8] T. Cover, J. Thomas, Elements of Information Theory,

Wiley, New York, 1991.

[9] T. Cover, M. Chiang, Duality of channel capacity and rate

distortion with two sided state information, IEEE Trans.

Inform. Theory (2002) 48(6).

[10] R. Gray, A new class of lower bounds on information rates

of stationary sources via conditional rate-distortion func-

tion, Inform. Control 19 (7) (July 1973) 480–489.

[11] T. Berger, The information theory approach to communica-

tions, in: G. Longo (Ed.), Multiterminal Source Coding,

Springer, Berlin, 1978.

[12] T.J. Flynn, R.M. Gray, Encoding of correlated observa-

tions, IEEE Trans. Inform. Theory 33 (11) (November 1987)

773–787.

[13] A. Wyner, J. Ziv, The rate-distortion function for source

coding with side information at the decoder, IEEE Trans.

Inform. Theory 22 (1) (January 1976) 1–10.

[14] R. Zamir, The rate loss in the Wyner–Ziv problem, IEEE

Trans. Inform. Theory 19 (November 1996) 2073–2084.

[15] A. Wyner, The rate-distortion function for source coding

with side information at the decoder-II: general sources,

Inform. Control 38 (1978) 60–80.

[16] D. Slepian, J.K. Wolf, Noiseless encoding of correlated

information sources, IEEE Trans. Inform. Theory 19 (July

1973) 471–480.

[17] T.D. Kite, B.L. Evans, A.C. Bovik, T.L. Sculley, Modeling

and quality assessment of halftoning by error diffusion,

IEEE Trans. Image Process. 9 (6) (May 2000) 909–922.

[18] R. Neelamani, R. Nowak, R.G. Baraniuk, WInHD: wavelet-

based inverse halftoning via deconvolution, submitted to

IEEE Trans. Image Process. (October 2002), submitted for

publication.

[19] D. Donoho, Nonlinear solution of linear inverse problems

by wavelet-vaguelette decomposition, Appl. Comput. Har-

monic Anal. 2 (1995) 101–126.

[20] A. Hjorungnes, J. Lervik, T. Ramstad, Entropy coding of

composite sources modeled by infinite gaussian mixture

distributions, in: IEEE Digital Signal Processing Workshop,

20–24 January, 1996, pp. 235–238.

[21] S. LoPresto, K. Ramchandran, M. Orhard, Image coding

based on mixture modeling of wavelet coefficients and

a fast estimation-quantization framework, in: Data Com-

pression Conference 97, Snowbird, Utah, USA, 1997,

pp. 221–230.

[22] M.K. Mihcak, I. Kozintsev, K. Ramchandran, P. Moulin,

Low-complexity image denoising based on statistical model-

ing of wavelet coefficients, IEEE Signal Process. Lett. 6 (12)

(December 1999) 300–303.

[23] S. Voloshynovskiy, A. Herrigel, N. Baumgaertner, T. Pun, A

stochastic approach to content adaptive digital image

watermarking, in: Third International Workshop on In-

formation Hiding, vol. 1768, Dresden, Germany, September

29–October 1, 1999, pp. 212–236.

[24] S. Voloshynovskiy, O. Koval, T. Pun, Wavelet-based image

denoising using non-stationary stochastic geometrical image

priors, in: IS&T/SPIE’s Annual Symposium, Electronic

Imaging 2003: Image and Video Communications and

Processing V, Santa Clara, CA, USA, 20–24 January 2003.

[25] N. Degara-Quintela, F. Perez-Gonzalez, Visible encryption:

using paper as a secure channel, in: Security and Water-

marking of Multimedia Contents V, vol. 5020, Santa Clara,

CA, USA, 20–24 January 2003.

[26] J. Huber, U. Wachsmann, R. Fischer, Coded modulation by

multilevelcodes: overview and state of the art, in: ITG–

Fachbericht: Codierung fur Quelle, Kanal und Ubertragung,

Aachen, Germany, March 1998, pp. 255–266.

[27] H. Imai, S. Hirakawa, A new multilevel coding method

using error-correcting codes, IEEE Trans. Inform. Theory 3

(23) (1995) 371–377.

[28] T. Strutz, in: WaveConvert, 2001, hhttp://www-nt.e-techni-

k.uni-rostock.de/�ts/software/download.htmli.

[29] E. Muller, T. Strutz, Scalable wavelet-based coding of color

images, in: Fourth International Conference on Actual

Problems of Electronic Instrument Engineering (APEIE’98),

Nowosibirsk, Russia, September 1998, pp. 29–35.

[30] R. Villan, S. Voloshynovskiy, O. Koval, T. Pun, Multilevel

2d bar codes: Towards high capacity storage modules for

multimedia security and management, IEEE Trans. Inform.

Forensics Secur. (2006), accepted for publication.

http://www-nt.e-technik.uni-rostock.de/~ts/software/download.html



quality enhancement of printed-and-scanned...

Documents