quality enhancement of printed-and-scanned...
TRANSCRIPT
ARTICLE IN PRESS
0165-1684/$ - se
doi:10.1016/j.si
$Parts of thi
F. Deguillaume
side informatio
multimedia and
Photonics West
and Applicatio�CorrespondE-mail addr
Oleksiy.Koval@
Frederic.Deguil
Thierry.Pun@c
URL: http:/
Signal Processing 87 (2007) 1301–1313
www.elsevier.com/locate/sigpro
Quality enhancement of printed-and-scanned images usingdistributed coding$
S. Voloshynovskiy�, O. Koval�, F. Deguillaume, T. Pun
University of Geneva - CUI, 24 rue du General Dufour, CH-1211 Geneva 4, Switzerland
Received 20 December 2005; received in revised form 31 October 2006; accepted 7 November 2006
Available online 8 December 2006
Abstract
In this paper we address visual communications via print-and-scan channels from an information-theoretic point of view
as communications with side information that targets quality enhancement of visual data at the output of this type of
channels. The solution to this problem addresses important aspects of multimedia data processing and management.
A practical approach to side information communications for printed documents based on Wyner–Ziv and Gray setups is
analyzed in the paper that assumes two separated communications channels where an appropriate distributed coding
should be elaborated. The printing channel is considered to be a direct visual channel for images (‘‘analog’’ channel with
degradations). The ‘‘digital channel’’ is considered to be an appropriate auxiliary channel exploited to communicate the
information exploited for quality enhancement of printed-and-scanned image. We demonstrate both theoretically and
practically how one can benefit from this sort of ‘‘distributed paper communications’’.
r 2006 Elsevier B.V. All rights reserved.
Keywords: Data-hiding; Printed-and-scanned documents; Communications with side-information; Bar codes; Distributed coding; Joint
source-channel coding; Gel’fand–Pinsker problem; Wyner–Ziv coding; Gray coding; Image enhancement
1. Introduction
Printed documents are still the most common,habitual and widely used form of visual informationrepresentation. Therefore, as with digital media, the
e front matter r 2006 Elsevier B.V. All rights reserved
gpro.2006.11.003
s paper appeared as S. Voloshynovskiy, O. Koval,
and Thierry Pun, Visual communications with
n via distributed printing channels: extended
security perspectives, in: Proceedings of SPIE
, Electronic Imaging 2004, Multimedia Processing
ns, San Jose, USA, January 18–22, 2004.
ing authors.
esses: [email protected] (S. Voloshynovskiy),
cui.unige.ch (O. Koval),
[email protected] (F. Deguillaume),
ui.unige.ch (T. Pun).
/watermarking.unige.ch/, http://sip.unige.ch.
important emerging issues for printed documentsare copyright protection, authentication, integritycontrol, tracking and finally quality enhancement.The printed document can be considered as a partof visual communications where data are convertedinto some specific form suitable for reliablecommunications via printing channels, with somegiven constraints on perceptual quality. Digitalhalftoning is often used for the transformation ofdigital documents into printed form. Although,being very simple and having been used since1855, this transformation unavoidably leads to thedegradation of the original document quality.This is not the case for digital documents and hasimportant consequences for document security andmanagement.
.
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131302
Digital data-hiding [1–4] has been considered as apossible approach to address the above problemsfor digital documents. However, in most cases thetraditional means of digital data-hiding are notalways suitable for printed documents and newapproaches are required. Moreover, the printingindustry has already recognized a limit in theenhancement of printed/scanned image qualityrelated to inverse halftoning. Further enhancementwould imply the development of more expensiveand complex printing/scanning devices, which is notappropriate for many low-cost and real-time appli-cations. At the same time the security of printeddocuments (not necessarily printed on paper)including passports, visas, ID cards, driving li-censes, diplomas, credit cards, banknotes or checksis usually accomplished by proprietary know-howtechnologies and digital data-hiding is far to bewidely accepted in this field.
In this paper we consider some practicalapproaches and analyze several scenarios aimingat further extending the usage of distributed codingfor printed documents. This problem is tackledfrom an information-theoretic point of view ascommunications with side information. Therefore,the goal of the paper is three-fold:
�
introduce an information-theoretic formalisminto visual communications via distributed print-ing channels; � consider possible communications scenarios withside information for multimedia and securityapplications; �1Note: More general auxiliary channel can be designed using
different information carriers used in the printed (plastic)
documents such as bar codes, invisible inks or crystals, magnetic
strips, cheap electronic low-capacity chips.
establish corresponding bounds on system per-formance assuming side information available atthe encoder or at the decoder depending on theparticular scenario.
The main practical focus of this paper is onimportant problem of quality enhancement ofdocuments at the output of print-and-scan channels.One of principle applications were such a problem isplaying a crucial role is document authenticationand person identification. In such a protocol oneshould verify if the photo contained in theidentification document (passport, ID card, driverlicense, etc.) is authentic or not. Evidently, in suchan application quality of the data at the output ofthe print-and-scan channel should be sufficientlyhigh to guarantee errorless protocol performance.
As an another possible application one canconsider the case of publicly distributed documents
such as journals, newspapers, books and advertise-ment materials. For instance, Fig. 1 represents thebasic setup where the image of concern (Lena)should be extracted from the printed document withthe purpose of authentication, tracking or qualityenhancement. If this image is simply scanned by a
public user that does not have any access to specificinformation, the best quality of the resultingscanned image will be defined by the appliedprocessing prior to printing (possibly lossy com-pression, resampling or blurring), by the quality ofthe printing and finally by the quality of thescanning (inverse halftoning). It is obvious that insome cases the degradation of quality can be quiteconsiderable so that the scanned image cannot befurther exploited for the targeted applications.
Moreover, in some practical scenarios of personidentification based on ID documents, the printeddocument cannot be reproduced in color, if cheapprinting is used or when the physics of thereproduction process does not allow to achieve itOne example consists in laser engraving of photoson plastic cards used in security printing. In thiscase, the color information can be communicatedvia data-hiding or some auxiliary channel exploitingthe redundancy between color planes.
A different situation arises when the imagecontains some embedded data or when these dataare communicated via some auxiliary channel thatcan assist the quality enhancement of the printeddocument. Assume here for simplicity that thisauxiliary channel is a 2-D high capacity bar codeshown at the bottom of the page (Fig. 1).1 The barcode can be designed as a specific pattern or logo toavoid an annoying random-like appearance on thepage. The bar code can be also attached to the lastpage of the journal or can be printed on somesupplying layout. Access to the embedded data or tothe content of the bar code is authorized and can bemanaged according to the selected distributionprotocol. In the simplest case, it can be grantedfor an extra price. We call the person who hasobtained the above access rights a private user or an
authorized user. The private user can somehowbenefit from the appropriately communicated extrainformation in enhancing the quality of the scanneddocument. Therefore, the overall goal is to design
ARTICLE IN PRESS
Fig. 1. Visual communications via printing channels: example of journal with side information communications in the form of 2-D bar
code.
S. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1303
an optimal system that will provide the maximumpossible quality of the reconstructed image withminimum needed rate of the extra information to beadditionally communicated to the decoder.
Two alternative solutions to the formulatedproblem exist that differ by the physical channelstructure of the channel conveying the qualityupdate information. The first one is based on thehigh capacity data-hiding and exploits a self-embedding principle for this purpose [5]. The secondone exploits high capacity 2-D bar codes, invisibleinks or crystals, magnetic strips, cheap electroniclow-capacity chips an auxiliary channel to transferthis update.
Up to our knowledge, only some theoretical resultsjustifying the performance of protocol where infor-mation and host interference are simultaneouslytransmitted over a channel are available [5–7] andnone existing practical high capacity data-hidingtechnique simultaneously satisfies the visibility andpayload constraints and provides robustness to thedigital-to-analogue-to-digital conversion. That is
why in the scope of this paper we will mostly focusour analysis on the latter approach.
The paper is organized as follows. Section 2outlines the fundamentals of rate-distortion theorywith side information. Sections 3 presents a mathe-matical model of the printing process. Section 4considers a possible protocol auxiliary channel indexcommunications. Section 5 presents experimentalresults and Section 6 concludes the paper.
Notations: We use capital letters to denote scalarrandom variables X, bold capital letters to denotevector random variables X, corresponding smallletters x and x to denote the realizations of scalarand vector random variables, respectively. Thesuperscript N is used to designate length-N vectorsx ¼ xN ¼ ½x1;x2; . . . ;xN �
T with kth element xk. Weuse X�pX ðxÞ or simply X�pðxÞ to indicate that arandom variable X is distributed according topX ðxÞ. The mathematical expectation of a randomvariable X�pX ðxÞ is denoted by EpX
½X � or simplyby E½X � and Var½X � denotes the variance of X.We use IðX ;Y Þ to denote a mutual information
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131304
between two random variables X and Y [8].Calligraphic fonts X denote sets X 2 X and jX jdenotes a cardinality of set X . IN denotes the N �
N identity matrix.
2. Visual communications with side information
2.1. Asymmetric two-side information
Visual communications via printing channels can beconsidered from the point of view of communicationswith an asymmetric side information available at theencoder and decoder. The asymmetry of side informa-tion originates from the fact that the different types ofprior information about the actual image X areavailable at the encoder and actual side informationavailable a posteriori at the decoder. Let S1 denotesthe side information available at the encoder and S2
the side information available at the decoder (Fig. 2).One can consider S1 as a ‘‘virtually predicted’’ imageafter halftoning/inverse halftoning and S2 as thephysically printed-and-scanned image. Assume thatsamples of digital image and side information atencoder and at decoder fX k;S1;k;S2;kg�pX ;S1;S2
ðx; s1; s2Þ are independent drawings of jointly distrib-uted random variables X 2 X , S1 2 S1 and S2 2 S2,where X ;S1;S2 are finite sets. A distortion measure isdefined as dðx; xÞ and the corresponding averagedistortion as D ¼ ð1=NÞ
PNk¼1E½dðX k; X kÞ�. The im-
age fX kg is encoded using a length-N blocks code i
(a bar code, a watermark, etc.) into a binary streamof rate R, which should be decoded back as areconstructed image fX kg.
Cover and Chiang [9] have shown that the rate R isachievable at the distortion level D, if there exists a
sequence of ð2NR;NÞ codes i : XN � SN1 ! f1; 2; . . . ;
2NRg, X : f1; 2; . . . ; 2NRg � SN2 ! X
Nsuch that the
average distortion E½dðX; XðiðX;S1Þ; S2ÞÞ�pD and
RðDÞ ¼ minpðujx;s1Þpðxju;s2Þ
½IðU ;S1;X Þ � IðU ;S2Þ�, (1)
where the minimization is performed under thefollowing distortion constraint
Px;u;s1;s2;x
dðx; xÞpðx; s1; s2Þpðujx; s1Þpðxju; s2ÞÞpD and U is an
S2
i (X,S1)EncoderX
S1,S2S1
Decoder X
Fig. 2. Two-sided asymmetric side information available at both
encoder and decoder.
auxiliary random variable. S2! ðX ;S1Þ ! U andðX ;S1Þ ! ðU ;S2Þ ! X form Markov chain.
In the following we will focus on two particularcases of the considered generalized setup fromFig. 2: symmetric side information and asymmetricside information available only at the decoder.
2.2. Symmetric two-side information (S1 ¼ S2 ¼ S)
To introduce the basic concept of visual commu-nications via printing channels for the broad set ofdifferent application scenarios considered in thefollowing sections, we will simplify the previoussetup to a case with symmetric side information S ¼
S1 ¼ S2 that can be available at both encoder anddecoder or only at the decoder according to Fig. 3.
The rate distortion with symmetric side informa-tion at both encoder and decoder S ¼ S1 ¼ S2 wasfirst introduced by Gray [10] and further consideredby Berger [11] and Flynn and Gray [12] (Fig. 3:dashed line corresponds to availability of SN at theencoder). The corresponding rate-distortion func-tion is defined as
RX jSðDÞ ¼ minpðxjx;sÞ
IðX ; X jSÞ, (2)
where the minimization is over all conditionaldistributions pðxjx; sÞ such that
Px;x;s dðx; xÞpðx; sÞ
pðxjx; sÞpD. The above rate-distortion-functionalso follows from (1) assuming S ¼ S1 ¼ S2 [9].
2.3. Asymmetric side information at decoder
The rate-distortion problem with side informa-tion available at the decoder was considered byWyner and Ziv [13]. It also follows from (1), ifS1 ¼ ;, S2 ¼ S and ðS1;X Þ ¼ X :
RðDÞWZX jS ¼ min
pðujxÞpðxju;sÞ½IðU ;X Þ � IðU ;SÞ�
¼ minpðujxÞpðxju;sÞ
HðUÞ �HðU jX Þ
�HðUÞ þHðU jSÞ
¼ minpðujxÞpðxju;sÞ
HðU jSÞ �HðU jX ;SÞ
¼ minpðujxÞpðxju;sÞ
IðU ;X jSÞ, ð3Þ
EncoderX
S
Decoder X
ps,x (x,s)
i (X)
Fig. 3. Symmetric side information S1 ¼ S2 ¼ S.
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1305
where the minimization is performed over allpðujxÞpðxju; sÞ, i.e., pðx; s; u; xÞ ¼ pðx; sÞpðujxÞpðxju; sÞfor fixed pðx; sÞ, and all decoder functions X ðiðX Þ;SÞsuch that
Px;x;u;s pðx; sÞ pðujxÞpðxju; sÞdðx; xÞpD and
jUjpjX j þ 1. The third equality in (3) follows fromthe fact that S! X ! U form a Markov chain andHðU jX Þ ¼ HðU jX ;SÞ, consequently, while the secondMarkov chain is formed as X ! ðU ;SÞ ! X .
In the general case, IðX ;X jSÞpIðU ;X jSÞ andtherefore it can be a loss in performance of theWyner–Ziv setup in comparison to the rate-distor-tion function with side information available atboth encoder and decoder [14].
We will denote the classic rate-distortion functionwithout any side information (S1 ¼ ;, S2 ¼ ;) as
RðDÞ ¼ minpðxjxÞ
IðX ;X Þ, (4)
where the minimization is performed over all pðxjxÞ
such thatP
x;x pðxÞpðxjxÞdðx; xÞpD.In the subsequent subsections we will consider the
case of a Gaussian formulation of the rate-distor-tion problem. A methodology allowing to extendthe results obtained for the discrete alphabets tocontinuous formulation can be found in [15].
2.4. Shannon lower bound (SLB): Gaussian source
and MSE distortion
The classic rate-distortion function without anyside information (4) for a Gaussian N ð0;s2X Þ sourcewith squared error distortion is represented by theSLB [8]:
RðDÞ ¼1
2logþ2
s2XD
, (5)
where logþ2 x ¼ maxf0; log2ðxÞg. The correspondingrate-distortion function simultaneously describingN independent Gaussian random variables (aparallel Gaussian source model) X k�N ð0;s2X k
Þ, k ¼
1; 2; . . . ;N with a rate R bits/source vector is givenby
RðDÞ ¼XN
k¼1
1
2log2
s2X k
Dk
, (6)
where
Dk ¼l if los2X k
;
s2X kif lXs2X k
;
((7)
where l is chosen to satisfyPN
i¼1Dk ¼ D.
2.4.1. Side information at encoder and decoder
Let X be a sample generated by a zero mean andstationary Gaussian memoryless source, i.e.,X�N ð0;s2X Þ. The side information is given by anoisy version of source, i.e., S ¼ X þ Z0 where X
and Z0 are independent and Z0�N ð0; s2Z0 Þ.The rate-distortion function (2) in this case will
have the same character as for the stationaryGaussian source:
RðDÞX jS ¼1
2logþ2
s2X jSD
, (8)
where s2X jS ¼ s2X ð1� r2Þ is conditional variance,
r ¼ E½ðX � X ÞðS � SÞT�=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiVar½X �Var½S�p
¼ sX=sS
is the correlation coefficient and s2S ¼ s2X þ s2Z0 ,thus, providing s2X jS ¼ s2Zs
2X=ðs
2Z þ s2X Þ.
Consequently, the rate saving due to the availableside information at the encoder and the decoderwith respect to the independent coding of X is
RðDÞ � RðDÞX jS ¼1
2log2
1
1� r2. (9)
If r2! 0, i.e., the variance of the noise Z0 isincreasing and X and S become less correlated,RðDÞX jS ! RðDÞ and no gain can be expected fromthe joint decoding.
In the case of non-stationary Gaussian vector(parallel Gaussian source), X ¼ fX 1;X 2; . . . ;X Ng,where X k�N ð0;s2X k
Þ, we have similarly to (6):
RðDÞX jS ¼XN
k¼1
1
2log2
s2X ks2Z
ðs2X kþ s2ZÞDk
, (10)
where
Dk ¼
l if los2X k
s2Zs2X kþ s2Z
;
s2X ks2Z
s2X kþ s2Z
ifs2X k
s2Zs2X kþ s2Z
;
8>>>><>>>>: (11)
where l is chosen to satisfyPN
i¼1Dk ¼ D.
2.4.2. Side information at decoder
Wyner and Ziv [13,15] demonstrated that for theGaussian case RðDÞWZ
X jS ¼ RðDÞX jS. However, gen-erally RðDÞWZ
X jSXRðDÞX jS [14]. This means that thereis a rate loss with the Wyner–Ziv coding. Also it isknown that for discrete sources when D ¼ 0, theWyner–Ziv problem reduces to the Slepian-Wolfproblem [16] with Rð0ÞWZ
X jS ¼ Rð0ÞX jS ¼ HðX jSÞ
assuming that S is communicated to the decoderat the rate HðSÞ.
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131306
2.4.3. Approaching theoretical rate-distortion function
with symmetric side-information (Wyner– Ziv vs Gray
setups)
For practical reasons, we will perform a com-parative analysis of the non-collaborative encodingsetup (Wyner–Ziv problem) with collaborative one(Gray problem) for distributed coding in printingapplications.
In the general case, the following ‘‘inequalitysandwich’’ is valid:
RðDÞXRðDÞWZX jSXRðDÞX jS. (12)
Therefore, the rate-distortion function RðDÞWZX jS is
above the rate-distortion function RðDÞX jS. In ourparticular case of distributed coding with sideinformation in printing applications, the sideinformation at the encoder is naturally given bythe physical nature of the problem. Therefore, theprinting application should definitively benefit fromthe availability of side information.
2.4.4. Compressed channel index communication
In printed documents, the side information S isdirectly available as the printed image (possiblyafter inverse halftoning). However, the index i
should be somehow communicated to the decoderin practical systems. According to the justification,given in Introduction, we will assume that it iscommunicated via an auxiliary channel that caninclude different information carriers used in theprinted (plastic) documents such as bar codes,invisible inks or crystals, magnetic strips, cheapelectronic low-capacity chips.
3. Mathematical model of printing
To investigate the ‘‘quality’’ of side information,i.e., to establish how close S is correlated with X
from one side and to investigate the possibility ofself-embedded communications from another side,one should consider a mathematical model of aprinting channel.
3.1. Halftoning and inverse halftoning: image
processing perspectives
Digital halftoning is a process that converts agray-scale image (X 2 f0; 1; . . . ; 255g) into a bi-levelimage (X 2 f0; 255g) that can easily be reproducedby a simple printing device. There are two types ofhalftoning, i.e., ordered and random (or so-called
error diffusion) halftoning. The ordered halftoningis mostly used in laser printers while error diffusionrepresents a broad class of ink jet printing devices.We will focus on the later for demonstrationpurposes. The error diffusion takes the error fromthe quantized gray-scale pixel to bi-level pixel anddiffuses the quantization error over a causalneighborhood The inverse halftoning of errordiffusion should ensure that the spatially localizedaverage pixels values of the halftone and originalgray-scale image are similar. Kite et al. [17]proposed an accurate linear model for errordiffusion halftoning. This model accurately predictsthe ‘‘blue noise’’ (high-frequency noise) and edgesharpening effects observed in various error-dif-fused halftones. The resulting halftone image s isobtained as
s ¼ HxþQw ¼ Hxþ z0, (13)
where H represents a linear shift-invariant (LTI)halftone filter, Q is a LTI filter corresponding to theerror diffusion coloring, w is white noise and z0 ¼
Qw is colored noise.Therefore, the inverse halftoning according to (13)
can be posed as a deconvolution or restorationproblem (in image processing) or channel equaliza-tion problem (in digital communications) whereoperator H represents the equivalent blurring orchannel and z0 corresponds to the image acquisitionor channel noise.
The distributed coding system with communica-tions via printing channels is shown in Fig. 4. Thesystem consists of two parts. The first part isthe physically printed image S and the second partis the auxiliary channel dedicated to the communica-tion of index i that can be considered as a bar code.The physical printing channel is represented as asequential set of operations comprising halftoning,printing and scanning. Here, we assume that printingitself, as the black point reproduction process with aspecified resolution, and scanning do not introduceany distortions. In more general case one would needto take into account some optical blurring andaberrations as well as sensor non-linearity.
Considering the halftoning problem according tothe vector channel (13), assuming a vector Gaussiansetup X�N ð0;CX Þ and Z0�N ð0;CZ0 Þ, the deconvo-lution problem can be formulated as the MMSEestimation of x given the conditional meanX0¼ E½XjS�:
x0¼ ðHTC�1Z0 H þ C�1X Þ
�1HTC�1Z0 s ¼ HW s, (14)
ARTICLE IN PRESS
XS
Z'
X'H HW
MMSE estimate
X X'⇒
Z
Fig. 5. Equivalent halftoning/inverse halftoning channel.
Encoder
Channel
Decoder ˆ
Halftoning Printing Scanning
X i (X,S) X
S
X S
Fig. 4. Printing channel communication setup.
S. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1307
where HW ¼ ðHTC�1Z0 H þ C�1X Þ
�1HTC�1Z0 representsa Wiener ‘‘restoration-equalization’’ filter.
The estimation error Z ¼ X0� X ¼ E½XjS� � X is
also Gaussian Z�N ð0;CZÞ due to linear nature ofthe MMSE estimate where the covariance matrix isdefined as
CZ ¼ CX jS ¼ E½ðX0� XÞðX
0� XÞT�
¼ ðHTC�1Z0 H þ C�1X Þ�1. ð15Þ
The halftoning/inverse halftoning printing channelcan be sequentially represented by an equivalentchannel shown in Fig. 5.
3.2. Inverse halftoning: practical low-complexity
algorithm
In this section we will extend the results obtainedfor i.i.d. Gaussian signals to real images. The keyissue is the availability of an accurate, simple andtractable stochastic image model leading to a closeform solution. Although stochastic inverse half-toning is a stand-alone important problem, we willuse here the state-of-the-art inverse halftoningmethod proposed by Neelamani et al. [18] forwavelet-based inverse halftoning via deconvolution(WInHD). This approach is based on the originalidea of Donoho [19] called wavelet-vaguelette
decomposition. The wavelet-vaguelette decomposi-tion splits the deconvolution problem into twoparts. In the first part, the operator H in (13) isinverted leading to the ‘‘deblurred’’ estimate:
ex ¼ H�1s ¼ xþH�1Qz0 (16)
that produces the clean image x with noisy partH�1Qz0 assuming that the inverse operator H�1
exists (that it is not the case in general for all imagerestoration problems).
The second part of wavelet-vaguelette decom-position tries to remove the noisy part H�1Qz0 usingwavelet-based denoising. Therefore, the relevantproblem of stochastic image modeling is reducedto the proper modeling of wavelet coefficients.Hjorungnes et al. [20] have proposed a simpleand tractable model that has found a numberof applications in state-of-the-art image coding(EQ-coder [21]), denoising (wavelet Wiener filter[22]) and watermarking for evaluation of data-hiding capacity [3] and content-adaptive data-hiding[23]. According to the EQ framework, imagecoefficients are considered coming from a realizationof independent Gaussian distribution with a‘‘slowly’’ varying variance. This represents a classof local Gaussian models and justifies the applica-tion of local operations. It should be mentioned thatthe relationship between the global image statisticsand locally Gaussian assumption was elegantlydemonstrated based on mixture model where thevariance also becomes a random parameter drawnfrom the exponential distribution [20]. This alsojustifies a doubly stochastic modeling of imagecoefficients and provides a logical connection to theparallel Gaussian source model considered inSection 2.4.
The denoiser of WInHD is also based on the samemodel that leads to the Wiener filter as the MMSE(MAP) estimate for Gaussian case. To enhancethe performance of the denoiser, the WInHD
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131308
implements the Wiener filter in an overcompletetransform domain (complex wavelets) to overcomethe problems of the critically sampled wavelettransform. Thus the final estimate is obtained as
x0j;k ¼s2j;k
s2j;k þ s2jexj;k, (17)
where j denotes the scale, s2j;k the local variance ofimage coefficients for scale j, s2j the variance ofthe residual noisy term in the transform domainand all variables are considered in the transformdomain.
It is not our primary goal to enhance the WInHD,we will use it in its original form. It should,however, be mentioned that more powerful stochas-tic image models could be used, taking into accountthe stationarity of transform image coefficientsalong the edges and transitions. Such modelsshould bring additional enhancement of the inversehalftone image quality and corresponding higherreliability of side information at the decoder [24].
4. System implementation: index communication via
auxiliary physical channel
The approach based on the index communicationvia auxiliary physical channel can be designed eitherusing the Gray setup, i.e., symmetric two-sidedinformation, or the Wyner–Ziv setup, i.e., sideinformation available only at the decoder. Thebasic system block-diagram for both setups isshown in Fig. 6 with the difference consisting inthe availability of side information SN at theencoder, i.e., Gray setup.
The practical design of the auxiliary channelindex communication system includes two mainfunctional blocks, namely source coding for theresidual compression and channel coding (2-D barcode) for index communication.
Halftoning
i
Z
X
Encoder
HW
S
MultilevelEncoder
Fig. 6. Index communication via au
4.1. Source coding
To establish the lower bound for the rate-distortion function we will consider here only theGray setup assuming that the Wyner–Ziv systemwill do the best to approach this bound usingmultidimensional lattice/coset codes. According tothe analysis presented in Sections 2.4 and 3, weassume that the side information, which is given inthe form of analog (halftone) image is also availableat the encoder due to the complete knowledge of theprinting process (in other words, we assume that theinverse halftone image produced at the encoder isvery accurate approximation of the real scannedimage obtained at the decoder). Thus, the encodershould only communicate the difference Z ¼
E½XjS� � X to the decoder with the given boundeddistortion D choosing corresponding index iðX;SÞ.
From an information-theoretic perspective, anoptimal source code ð2NR;NÞ for this setup consistsof an encoding map i : XN � SN
! f1; 2; . . . ; 2NRg
and a reconstruction map X : f1; 2; . . . ; 2NRg�
SN! XN
, where both encoder ið:Þ and decoderX ð:Þ are designed based on the state information SN .In the general case, rate-distortion performance ofthis encoding scheme is lower bounded by (2) and inthe Gaussian setup the conditional rate distortionfunction is given by (8). The details of a particularpractical implementations of the source encoding/decoding are presented in Section 5.
4.2. Channel coding
The task of the channel code is to communicateerrorlessly (Pr½iaiðSÞ� ! 0) the index i via someanalog auxiliary channel. A natural auxiliarychannel for printed documents consists of 2-D barcodes [25], although as was mentioned in theintroduction other types of information carrierswith sufficient capacity can be used for this purpose.
Printing/Scanning
ˆDecoderHWX'
2-D BarCode
MultistageDecoder
Joint DecoderAnalog
Channel
i
SX
xiliary (2-D bar code) channel.
ARTICLE IN PRESS
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
D
R
0.9 0.8 0.7
0.6
0.5
0.40.3
0.10.2
Fig. 7. Conditional rate-distortion function for different r and
X�N ð0; 1Þ.
S. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1309
In the case of 2-D bar codes, multilevel codes (MLC)[26,27] can be used for errorless index communica-tion. The MLC are known to closely approachcapacity for band-limited channels The main idea ofmultilevel coding consists in the protection of eachaddress bit of the signal point by an individualbinary code C‘ at level ‘. One can use irregular low-density parity-check (LDPC) codes as the binarycode C‘ for each level ‘ followed by 8-PAM chosenfor simplicity reasons and easy integration intoprinted 2-D bar code. One can also apply precodingfor bar-codes to avoid intersymbol interferencecaused by the halftoning filter H. Finally, multistage
decoding is applied based on multiple access channeldecomposition of MLC using chain rule for mutualinformation between channel output and each level.More details about a particular implementation ofthe MLC/MSD channel code will be given in thenext section.
5. Results
First, to investigate the theoretical limits ofsource coding with side information available atboth encoder and decoder, we have modeled therate-distortion function RðDÞX jS for a Gaussiansource and the MSE distortion measure for differentvalues of r2 (Fig. 7), where r is a correlationcoefficient. The highest gain with respect to theclassic coding without side information can beexpected for highly correlated sources X and S
(reliable side information) when r2! 1. If r2 ! 0,there is no gain with respect to RðDÞ. Therefore, theprinting channel (and particularly a joint half-toning/inverse halftoning) should provide an accu-rate estimate of the source X, i.e., to reduce theimpact of the halftoning process. In real applica-tions, r2 is expected to be in the range of0:2pr2p0:6 and therefore for the stationaryGaussian source the maximum gain can be as muchas 0.26–0.66 bit per sample.
Secondly, we have performed the comparison ofdifferent coding strategies for a number of realimages based on Floyd halftoning and wavelet-based inverse haftoning via deconvolution [18].Here we only report results for the images Lenaand Baboon of size of 512� 512. The ‘‘lower
empirical bound’’ is estimated for the Gray setupdescribed in Section 4. The prediction of the inversehalftone images s based on the MMSE estimateðX0¼ E½XjS�Þ is performed at the encoder and the
residual Z ¼ X0� X is compressed at various rates
and communicated to the decoder via 2-D bar code.At the decoder, the MMSE estimate S is computedfrom the printed image and combined with thedecoded residual QðZÞ.
Three different lossy coders are used to compressthe residual QðZÞ, i.e., the DCT-based JPEG,wavelet-based WaveConvert [28] and wavelet-basedEQ coder. The last two codecs are wavelet trans-form-based. In the case of WaveConvert [29] theencoding algorithm is a progressive zero-tree-basedwavelet encoder exploiting progressive quantizationfollowed by the adaptive arithmetic entropy coder.
The EQ lossy image codec [21] exploits anintraband correlations among coefficients in thehigh-frequency wavelet subbands. The main under-lying idea of this method is that the samples in thesesubbands are distributed according to locallyGaussian distribution with slowly varying variancethat might be locally estimated using a maximumlikelihood strategy based on the causal neighbor-hood and potentially a parent sample located in thesame spatial position.
At the end of the estimation step, the coefficientsare quantized using a uniform threshold quantizerselected accordingly to the results of the rate-distortion optimization that is followed by anoptimal adaptive arithmetic coding.
While at the high-rate regime the approximationof the local variance by its quantized version isvalid, it is not the case at low rates. The reason forthat is the quantization to zero most of the datasamples that makes local variance estimationextremely inaccurate. The simple solution proposed
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131310
in [21] consists in the placement of all thecoefficients that fall into the quantizer deadzone inthe so-called unpredictable class, and the rest in theso-called predictable class. The samples of the firstone are considered to be distributed globally as ani.i.d. generalized Gaussian distribution, while theinfinite Gaussian mixture model is used to capturethe statistics of the samples in the second one. Thisseparation is performed using a simple rate-depen-dent thresholding operation. The parameters of theunpredictable class are exploited in the rate-distor-tion optimization and are sent to the decoder as sideinformation. The experimental results presented in[12,19] allow to conclude about the state-of-the-art
Fig. 8. Results of computer modeling for the frame of Lena image: (
halftone image; (d) directly JPEG compressed image (rate 0.2 bpp); (e
(f) directly EQ compressed image (rate 0.2 bpp) and (g) EQ compresse
performance of this technique in the image com-pression application.
In this paper we used the following parametersfor the EQ algorithm: 4 level DWT based on 9
7
biorthogonal wavelet filter pair with local varianceML estimate in a causal window of size 3� 3without considering interscale dependencies).
The performed experiments for all mentionedencoders were targeting the following compressionrates: 0.20, 0.25, 0.50 and 1.00 bpp. The obtainedresults are presented in Figs. 8 and 9. Moreover, forcomparison reasons, we indicate in Tables 1 and 2the results of direct compression by theabove methods assuming that one can directly
a) original image Lena; (b) halftone image; (c) WInHD inverse
) JPEG compressed image with side information (rate 0.2 bpp);
d image with side information (rate 0.2 bpp).
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1311
communicate the compressed image via 2-D barcode to the decoder without usage of side informa-tion for final image reconstruction. A possible gaindue to the decoding with the side information in theform of a halftone image is also shown in the table.The PSNR of the inverse halftone images (withrespect to the original images) was 31.95 dB for thecase of image Lena and 23.73 dB for image Baboon,respectively. The maximum gain is obtained for theEQ coder, and is in the range of 0.26–2.87 dB fordifferent rates with respect to the directly com-pressed image. Obviously, the capacity of theauxiliary channel will be a factor that determines aparticular operational rate depending on the appli-cation constraints. The enhancement of ‘‘analog’’
Fig. 9. Results of computer modeling for the frame of Baboon image: (
halftone image; (d) directly JPEG compressed image (rate 0.2 bpp); (e
(f) directly EQ compressed image (rate 0.2 bpp) and (g) EQ compresse
(halftone) image quality is also varying for differentrates between 2–8 dB, which indicates a consider-able potential of such an ‘‘upgrade’’ for manypractical applications where the image qualityshould be relatively high. Using advanced methodsof inverse halftoning and more accurate stochasticimage models one can expect a further enhancementof image quality for low-bit rates.
One important question that concerns the pro-posed bar code-based solution to the problem ofquality enhancement of printed-and-scanned docu-ments consists in the physical size of a bar codeconveying quality update information. Accordingly tothe recently obtained results [30], the storage capacityof modern such storage modules is more than
a) original image Baboon; (b) halftone image; (c) WInHD inverse
) JPEG compressed image with side information (rate 0.2 bpp);
d image with side information (rate 0.2 bpp).
ARTICLE IN PRESS
Table 1
Comparison of resulting PSNR [dB] for several compression
methods for image ‘‘Lena’’
Compression method Rate (bpp)
0.20 0.25 0.50 1.00
DCT-based JPEG coder
Direct compression 28.89 30.41 34.64 37.83
Compression with SI 33.55 34.17 36.19 38.61
Gain over direct compression 4.66 3.76 1.55 0.78
Gain over direct halftoning WInHD 1.60 2.22 4.24 6.66
WaveConvert coder
Direct compression 32.47 33.48 36.59 39.66
Compression with SI 35.11 35.39 37.77 40.03
Gain over direct compression 2.64 1.91 1.18 0.37
Gain over direct halftoning WInHD 3.16 3.44 5.82 8.08
EQ-based coder
Direct compression 32.87 34.11 37.20 40.40
Compression with SI 35.42 35.90 38.24 40.76
Gain over direct compression 2.55 1.79 1.04 0.36
Gain over direct halftoning WInHD 3.47 3.95 6.29 8.81
Table 2
Comparison of resulting PSNR [dB] for several compression
methods for test image ‘‘Baboon’’
Compression method Rate (bpp)
0.20 0.25 0.50 1.00
DCT-based JPEG coder
Direct compression 19.42 21.51 23.67 26.44
Compression with SI 23.61 24.39 25.59 27.66
Gain over direct compression 4.19 2.88 1.92 1.22
Gain over direct halftoning WInHD �0.12 0.66 1.86 3.93
WaveConvert coder
Direct compression 22.19 22.70 24.97 28.19
Compression with SI 25.35 25.71 27.25 30.32
Gain over direct compression 3.16 3.01 2.28 2.13
Gain over direct halftoning WInHD 1.62 1.98 3.52 6.59
EQ-based coder
Direct compression 22.90 23.32 25.84 29.60
Compression with SI 25.77 26.19 28.00 31.04
Gain over direct compression 2.87 2.87 2.16 1.44
Gain over direct halftoning WInHD 2.04 2.46 4.27 7.31
S. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–13131312
2000bytes=in2. In particular, our implementation ofthe 2-D bar codes based on 15 level non-equidistant 8pulse amplitude modulation MLC/MSD schemesusing the LDPC codes allows to reliably store andextract 2400bytes=in2 of information. Thus, it is easyto compute, for instance, that in the case ofcompression rate equal to 0.2 bits/element a bar code
will occupy approximately 17:6 cm2. Supposing that aprinted document should be reproduced on the paperof A4 format, a possible solution might consist in theplacement of a stripe containing compressed informa-tion 1� 17:6 cm2 in the top/bottom of the page.
6. Conclusions
We have considered the problem of visualcommunications with side information via distrib-uted printing channels. In particular, we haveanalyzed the possibility of side information-assistedimage processing in printing applications. We havealso provided a mathematical modeling of theprinting channel in the scope of digital communica-tions with side information. The related problem ofsource coding based on Wyner–Ziv and Gray setupshas been considered. Corresponding practical sce-nario has been analyzed and finally the theoreticalbounds on systems performance are given. Theresults of computer simulation show a range ofpossible applications where the expected gain canconsiderably increase the possibilities and function-alities of modern multimedia and security systems.
Acknowledgments
This paper was partially supported by the SwissNational Science Foundation Professorship GrantNo. PP002-68653/1, by the European Commissionthrough the IST Programme under contract IST-2002-507932-ECRYPT, European project FP6-507609-SIMILAR and Swiss IM2 projects. Theauthors are thankful to Fernando Perez-Gonzalezfor many helpful and interesting discussions ondifferent subjects of lattice coding in personalcommunications and during his visit to Universityof Geneva. The authors are also thankful to themembers of Stochastic Image Processing group(University of Geneva) for many stimulatingdiscussions during seminars.
The information in this document reflects onlythe authors views, is provided as is and noguarantee or warranty is given that the informationis fit for any particular purpose. The user thereofuses the information at its sole risk and liability.
References
[1] I.J. Cox, M.L. Miller, J.A. Bloom, Digital Watermarking,
Morgan Kaufmann Publishers, Academic Press, Los Altos,
CA, New York, 2002.
ARTICLE IN PRESSS. Voloshynovskiy et al. / Signal Processing 87 (2007) 1301–1313 1313
[2] F. Deguillaume, S. Voloshynovskiy, T. Pun, Hybrid robust
watermarking resistant against copy attack, in: Eleventh
European Signal Processing Conference (EUSIPCO’2002),
Toulouse, France, September 3–6, 2002.
[3] P. Moulin, M.K. Mihcak, A framework for evaluating the
data-hiding capacity of image sources, IEEE Trans. Image
Process. 11 (9) (September 2002) 1029–1042.
[4] F. Perez-Gonzalez, F. Balado, J.R. Hernandez, Performance
analysis of existing and new methods for data hiding with
known-host information in additive channels, IEEE Trans.
Signal Process. (Special Issue on Signal Processing for Data
Hiding in Digital Media and Secure Content Delivery) 51(4)
(April 2003).
[5] E. Martinian, G. Wornell, B. Chen, Authentication with
distortion criteria, IEEE Trans. Inform. Theory (July 2005)
2523–2542.
[6] T. Kalker, F.M. Willems, Coding theorems for reversible
embedding, in: DIMACS Workshop on Network Informa-
tion Theory, vol. 66, Rutgers University, Piscataway, NJ,
March 2003.
[7] A. Sutivong, M. Chiang, T. Cover, Y.-H. Kim, Channel
capacity and state estimation for state-dependent gaussian
channels, IEEE Trans. Inform. Theory 51 (4) (April 2005).
[8] T. Cover, J. Thomas, Elements of Information Theory,
Wiley, New York, 1991.
[9] T. Cover, M. Chiang, Duality of channel capacity and rate
distortion with two sided state information, IEEE Trans.
Inform. Theory (2002) 48(6).
[10] R. Gray, A new class of lower bounds on information rates
of stationary sources via conditional rate-distortion func-
tion, Inform. Control 19 (7) (July 1973) 480–489.
[11] T. Berger, The information theory approach to communica-
tions, in: G. Longo (Ed.), Multiterminal Source Coding,
Springer, Berlin, 1978.
[12] T.J. Flynn, R.M. Gray, Encoding of correlated observa-
tions, IEEE Trans. Inform. Theory 33 (11) (November 1987)
773–787.
[13] A. Wyner, J. Ziv, The rate-distortion function for source
coding with side information at the decoder, IEEE Trans.
Inform. Theory 22 (1) (January 1976) 1–10.
[14] R. Zamir, The rate loss in the Wyner–Ziv problem, IEEE
Trans. Inform. Theory 19 (November 1996) 2073–2084.
[15] A. Wyner, The rate-distortion function for source coding
with side information at the decoder-II: general sources,
Inform. Control 38 (1978) 60–80.
[16] D. Slepian, J.K. Wolf, Noiseless encoding of correlated
information sources, IEEE Trans. Inform. Theory 19 (July
1973) 471–480.
[17] T.D. Kite, B.L. Evans, A.C. Bovik, T.L. Sculley, Modeling
and quality assessment of halftoning by error diffusion,
IEEE Trans. Image Process. 9 (6) (May 2000) 909–922.
[18] R. Neelamani, R. Nowak, R.G. Baraniuk, WInHD: wavelet-
based inverse halftoning via deconvolution, submitted to
IEEE Trans. Image Process. (October 2002), submitted for
publication.
[19] D. Donoho, Nonlinear solution of linear inverse problems
by wavelet-vaguelette decomposition, Appl. Comput. Har-
monic Anal. 2 (1995) 101–126.
[20] A. Hjorungnes, J. Lervik, T. Ramstad, Entropy coding of
composite sources modeled by infinite gaussian mixture
distributions, in: IEEE Digital Signal Processing Workshop,
20–24 January, 1996, pp. 235–238.
[21] S. LoPresto, K. Ramchandran, M. Orhard, Image coding
based on mixture modeling of wavelet coefficients and
a fast estimation-quantization framework, in: Data Com-
pression Conference 97, Snowbird, Utah, USA, 1997,
pp. 221–230.
[22] M.K. Mihcak, I. Kozintsev, K. Ramchandran, P. Moulin,
Low-complexity image denoising based on statistical model-
ing of wavelet coefficients, IEEE Signal Process. Lett. 6 (12)
(December 1999) 300–303.
[23] S. Voloshynovskiy, A. Herrigel, N. Baumgaertner, T. Pun, A
stochastic approach to content adaptive digital image
watermarking, in: Third International Workshop on In-
formation Hiding, vol. 1768, Dresden, Germany, September
29–October 1, 1999, pp. 212–236.
[24] S. Voloshynovskiy, O. Koval, T. Pun, Wavelet-based image
denoising using non-stationary stochastic geometrical image
priors, in: IS&T/SPIE’s Annual Symposium, Electronic
Imaging 2003: Image and Video Communications and
Processing V, Santa Clara, CA, USA, 20–24 January 2003.
[25] N. Degara-Quintela, F. Perez-Gonzalez, Visible encryption:
using paper as a secure channel, in: Security and Water-
marking of Multimedia Contents V, vol. 5020, Santa Clara,
CA, USA, 20–24 January 2003.
[26] J. Huber, U. Wachsmann, R. Fischer, Coded modulation by
multilevelcodes: overview and state of the art, in: ITG–
Fachbericht: Codierung fur Quelle, Kanal und Ubertragung,
Aachen, Germany, March 1998, pp. 255–266.
[27] H. Imai, S. Hirakawa, A new multilevel coding method
using error-correcting codes, IEEE Trans. Inform. Theory 3
(23) (1995) 371–377.
[28] T. Strutz, in: WaveConvert, 2001, hhttp://www-nt.e-techni-
k.uni-rostock.de/�ts/software/download.htmli.
[29] E. Muller, T. Strutz, Scalable wavelet-based coding of color
images, in: Fourth International Conference on Actual
Problems of Electronic Instrument Engineering (APEIE’98),
Nowosibirsk, Russia, September 1998, pp. 29–35.
[30] R. Villan, S. Voloshynovskiy, O. Koval, T. Pun, Multilevel
2d bar codes: Towards high capacity storage modules for
multimedia security and management, IEEE Trans. Inform.
Forensics Secur. (2006), accepted for publication.