
Image and Vision Computing 28 (2010) 1220–1228


The problems in digital watermarking into intra-frames of H.264/AVC

Dong-Wook Kim *, Young-Geun Choi, Hwa-Sung Kim, Ji-Sang Yoo, Hyun-Jun Choi, Young-Ho Seo

Kwangwoon University, 447-1, Welgye-Dong, Nowon-Gu, Seoul 139-701, Republic of Korea

Article info

Article history:
Received 16 July 2008
Received in revised form 4 August 2009
Accepted 20 December 2009

Keywords:
H.264/AVC
Digital watermarking
Intra picture
Intra-prediction

0262-8856/$ - see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.imavis.2009.12.006

* Corresponding author. Tel.: +82 2 940 5167.
E-mail addresses: [email protected] (D.-W. Kim), [email protected] (H.-J. Choi).

Abstract

Unlike other image and video compression techniques, H.264/AVC has proved difficult to watermark successfully. In this paper we, as researchers in digital watermarking, try to find the reason(s) for this difficulty. Among the various techniques in H.264/AVC we consider only intra-prediction, which is the main technique used to form an intra picture, and we closely examine its properties from the viewpoint of digital watermarking.

We consider three watermarking schemes: (1) a blind scheme that fixes the watermark positions without using any information from the encoding process, (2) a semi-blind scheme that selects the watermark positions by thresholding the cost function used to determine the prediction mode, and (3) a blind scheme that watermarks only the blocks predicted by a 16 × 16 mode. These three schemes are increasingly restrictive in selecting the watermarking positions. The experiments showed that none of the three schemes was satisfactory, even though the error rates of the extracted watermark data tended to decrease.

After examining the data, we found the following. The intra-prediction of H.264/AVC is itself not reversible. That is, re-engineering (re-encoding) the intra pictures, which is necessary to extract the embedded watermark, does not guarantee the same results as before: the intra-prediction modes of many blocks and the coefficient values change. In addition, watermarking and attacks further change the prediction modes and coefficient values, which makes the watermark useless.

We conclude that the prediction modes and coefficient values change because of the heuristic nature of the technique: intra-prediction includes many heuristic steps that cannot recover the data exactly. Consequently, a new watermarking method, rather than a conventional one, needs to be found.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Since digital media became common to the public, users have been demanding more and more information-intensive media, and the one that currently best satisfies this demand is video. Because video involves an enormous amount of data, the data must be compressed for communication. During the last two decades, various techniques for compressing video data have been researched. The most representative ones are the MPEG family [1] and H.26x [2]. Eventually the two groups joined to produce a common video compression scheme, H.264/AVC [3]. Because of its high compression ratio for a given quality, it has been adopted in many applications.

The digitalization of data, including video, has made it easier to illegally copy, modify, manipulate, and distribute content. This seriously threatens the ownership of the content and has created a strong demand for a technique to protect ownership. Because video data must be compressed and H.264/AVC is expected to be the dominant scheme for this, an ownership-protection technique targeting H.264/AVC is necessary. Digital watermarking is known as the best technique for ownership protection, and it is the subject of this paper.

So far, several studies on watermarking based on H.264/AVC have been published. Sakazawa et al. [4] recently proposed a watermarking scheme in the inter-prediction domain of H.264/AVC, together with a solution that compensates for the drift problem caused by manipulating the residual data. In [5], a hybrid scheme was proposed that combines robust watermarking in the DCT domain with fragile watermarking in the motion vectors. For intra pictures, [6] proposed a blind watermarking scheme that inserts the watermark in the quantized coefficients using block polarity and index modulation. In [7], a public key was made from the relative changes of the DC coefficients after the DCT of intra pictures and used as the owner's secret key to insert the watermark at random positions. However, the performance of most of these schemes, such as the error ratio of the watermark extracted after attacks, was not good enough for actual applications. In [8], the same authors as [7] proposed a watermarking framework that uses a statistical method to determine whether a watermark exists, but they did not present a method to extract the embedded watermark itself.

The purpose of this paper is to clarify why conventional watermarking schemes perform poorly for H.264/AVC, rather than to introduce a new algorithm. As the target data, we focus on intra pictures with respect to embedding and extracting a digital watermark. That is, this paper examines the whole process of forming an intra picture from the viewpoint of digital watermarking, to find what makes watermarking so hard. The intra process includes intra-prediction and quantization of the residual data, and both are considered here. As the watermarking schemes, we consider three blind or semi-blind, data-inserting, robust schemes of a kind that has been common for other image and video compression techniques. We also apply some typical attacks to see how much they affect the performance.

The intra process of H.264/AVC is explained in the next section, and its properties from the viewpoint of digital watermarking are explained in Section 3. Then the three watermarking schemes are applied to intra pictures and the results are analyzed in Section 4. Finally, we discuss the results of the analyses and conclude the paper in Section 5.

2. The compression process for intra picture of H.264/AVC

H.264/AVC has three kinds of prediction processes: spatial (intra) prediction for an intra picture (I-picture), unidirectional temporal (inter) prediction (P-picture), and bi-directional temporal (inter) prediction (B-picture), although their usage depends on the profile. Intra-prediction removes the spatial redundancy within a frame or picture, while inter-prediction removes the temporal redundancy by block-based motion prediction. Both involve many heuristic factors such as trial and error.

Let us narrow our focus to the intra-prediction of H.264/AVC, whose block diagram is shown in Fig. 1. Intra-prediction in encoding (Fig. 1a) finds the best among the 13 prediction modes in Fig. 2: four for a 16 × 16 block and nine for a 4 × 4 block. Once the prediction mode is determined, the predicted value of each coefficient in the block is calculated by the corresponding pre-defined formula and subtracted from the original coefficient value to form the residual value. Each 4 × 4 block of residual values is DCTed and quantized, and then entropy-encoded. The quantized data is also de-quantized, inverse-DCTed, and added to the predicted values to reconstruct the block, which is used as the reference for further predictions. The decoding process in Fig. 1b is the same, except that the prediction mode determined during the encoding process is supplied as side information of the data to be decoded.

Fig. 1. Block diagram of intra-prediction of H.264/AVC.

The prediction modes are divided into two classes, as shown in Fig. 2: those for a 16 × 16 block in (a) and those for a 4 × 4 block in (b). Each mode uses the adjacent coefficients of the upper and left blocks, which have already been coded and reconstructed. All 13 modes are examined to find the one with the least cost according to the following function.

$$\mathrm{Cost} = \mathrm{Distortion} + \lambda_{mode} \times \mathrm{Rate}, \qquad \lambda_{mode} = 0.85 \times 2^{(QP-12)/3} \tag{1}$$

Here, Distortion is an error measure such as the SAD (sum of absolute differences) or SSD (sum of squared differences) between the predicted block and the original block, Rate is the bit rate of the current macro-block, and QP is the quantization index.
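As a rough illustration of Eq. (1), the following sketch computes the multiplier from QP and picks the least-cost mode from a few candidates. The distortion and rate figures are made-up numbers, and the names `lambda_mode`, `mode_cost`, and `best_mode` are ours for illustration; they are not taken from the JM reference software.

```python
def lambda_mode(qp):
    """Multiplier of Eq. (1): 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def mode_cost(distortion, rate, qp):
    """Cost = Distortion + lambda_mode * Rate, as in Eq. (1)."""
    return distortion + lambda_mode(qp) * rate


def best_mode(candidates, qp):
    """Return the name of the candidate mode with the least cost.

    candidates maps a mode name to a (distortion, rate) pair.
    """
    return min(candidates, key=lambda m: mode_cost(*candidates[m], qp))


# Hypothetical (distortion, rate) pairs for three candidate modes.
candidates = {
    "I16x16_DC": (900.0, 10),
    "I4x4_set": (400.0, 40),
    "I16x16_V": (700.0, 20),
}
print(best_mode(candidates, 12))  # I4x4_set: small lambda, distortion dominates
print(best_mode(candidates, 36))  # I16x16_DC: large lambda, rate dominates
```

With these invented numbers the chosen mode flips as QP grows, which hints at how small changes in Distortion or Rate can alter the decision when two costs are close.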

3. Properties of intra process related to digital watermarking

3.1. Digital watermarking

Digital watermarking is basically a technique for asserting the ownership of digital content by means of pre-defined data extracted from the content. So far, various techniques have been presented according to the applications and requirements, and they can be classified as follows.

• Relation to the signal processing procedure: Depending on the application, digital watermarking may have to be performed in parallel with a signal process such as data compression. When a video must be sent to users immediately after it is acquired, as when a football game is broadcast in real time, watermarking must be performed during the data compression procedure without losing real-time speed. But if video content is acquired and stored for future use, the watermarking can be independent of the signal process. The former usually imposes more restrictions on the design of a watermarking scheme.

Fig. 2. Intra-prediction modes.

• Blind or non-blind watermarking: When the embedded watermark is extracted, some or all of the information about the original image may be needed (non-blind watermarking), or it may not be needed at all (blind watermarking). Blindness depends entirely on the watermarking scheme itself. A blind watermarking scheme is, of course, harder to design.

• Watermarking for robustness or fragility: Some applications need the embedded or selected watermark to remain as unchanged as possible under an attack (robustness). In other applications the scheme is designed so that an attack destroys part or all of the watermark or the original image (fragility). The former is used, for example, to assert ownership of the content, while the latter is used to determine whether the content has been attacked.

• Inserting watermark data or extracting data as the watermark: Some schemes insert owner-defined data into the content, whereas others extract some information from the content and use it as the watermark of that content. The watermark of the former is always the same as long as the owner does not change it, while that of the latter differs from content to content. Thus, when using the latter scheme, the owner must save the watermarks of all his or her contents.

• Inserting absolute values or relative values: When a scheme inserts watermark data, the insertion can take one of two forms: absolute value insertion (Eq. (2)) or relative value insertion (Eq. (3)).

$$I' = I + \alpha W \tag{2}$$
$$I' = I(1 + \beta W) \tag{3}$$

where $I$ and $I'$ are the values of the original content and the watermarked content, respectively, and $\alpha$ or $\beta$ is the scaling factor in each case. In particular, when the watermark is binary data, the values {−1, 1} can be used in place of {0, 1} in many cases.
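To make the difference between Eqs. (2) and (3) concrete, here is a minimal sketch that applies both insertion rules to a few made-up host values. The function names and the mapping of bits 0/1 to −1/+1 are illustrative assumptions, not part of any scheme evaluated in this paper.

```python
def embed_absolute(values, bits, alpha):
    """Eq. (2): I' = I + alpha * W, with W in {-1, +1}."""
    return [v + alpha * (2 * b - 1) for v, b in zip(values, bits)]


def embed_relative(values, bits, beta):
    """Eq. (3): I' = I * (1 + beta * W), with W in {-1, +1}."""
    return [v * (1 + beta * (2 * b - 1)) for v, b in zip(values, bits)]


host = [10.0, 100.0, 40.0]  # made-up host values
bits = [1, 0, 1]            # watermark bits, mapped 0 -> -1 and 1 -> +1
print(embed_absolute(host, bits, alpha=2.0))
print(embed_relative(host, bits, beta=0.1))
```

Absolute insertion shifts every value by the same amount, whereas relative insertion scales the change with the magnitude of the host value.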

With respect to each classification above, we consider a blind or semi-blind scheme for asserting ownership (robustness) that is performed during the signal processing (data compression) and inserts the absolute value of the watermark data.


The fundamental requirements for a watermarking scheme that protects ownership are imperceptibility and robustness. That is, the embedded watermark must be as imperceptible as possible, so the image degradation caused by embedding must be minimized. The watermark must also remain as unchanged as possible under various malicious and non-malicious attacks. In general these two requirements trade off against each other. They are strongly affected by the following two factors:

[Where to watermark?]: the watermarking positions.
[How to watermark?]: the watermark insertion method, for data-insertion watermarking.

The watermarking position is closely associated with the human visual system, and the watermark-insertion scheme is related to the strength of the inserted data. Details can be found in [9].

3.2. Properties of intra-prediction associated to digital watermarking

As explained above, we assume a data-insertion watermarking scheme applied during data compression by H.264/AVC. We also take intra-prediction (I-pictures) as the target process, although some techniques have targeted the inter-prediction process (P-pictures) [4]. As the target data to be watermarked, we chose the quantized data of the residual blocks after intra-prediction, which is marked with a star in Fig. 1a. The residual image data is selected because manipulating other data, such as the prediction mode, for the purpose of watermarking changes the real image value(s) too much. In addition, a watermark embedded in the data before quantization may be removed by the quantization step.

Consider the watermark extraction process. A user must have an H.264/AVC decoder to watch a video compressed by H.264/AVC. Thus, the transmitted bit-stream compressed by H.264/AVC remains in encoded form as long as the user does not manipulate the data. In this case the extraction process is straightforward: take the data at the watermarked positions and invert the insertion scheme. But if the transmitted data has been manipulated for some reason (such as an attack) or stored in a different data format, which is the case that mostly concerns us, the encoding process must be performed again to extract the inserted watermark. We concentrate on the latter case without loss of generality.

Let us assume that the encoding process must be re-performed to extract the embedded watermark. We can then regard the watermarked and/or attacked video content as the input video of the encoding process of Fig. 1a. This means that the whole process, including the selection of the positions to be watermarked, is the same as the compression process. Only the step that embeds the watermark data into the image value is replaced with a step that finds the embedded data value from the watermarked image value.

In this section, however, we focus on the process itself: what happens if data obtained by decoding compressed data, without any watermarking or attack, is re-compressed under the same conditions as the 1st compression? The cases with an actual watermarking scheme and/or attack are deferred to the next section. We tested this situation with several video streams of QCIF size (176 × 144). The experimental procedure was as follows:

(1) A video stream is encoded with given conditions, treating all the frames as I-pictures.
(2) The encoded video is reconstructed.
(3) The result of (2) is re-compressed with the same conditions as (1).
(4) The results of (1) and (3) are compared.

Here, we used the JM9.8 software of H.264/AVC for a fairer judgment.

3.2.1. Change in intra-prediction modes by re-engineering

The results of the experiment above showed that many code blocks change their prediction modes. The experimental results for two representative videos (Foreman and Mobile) are summarized in Table 1, whose entries are averages per frame, in units of 4 × 4 blocks, over the 50 measured frames. In the table the cases are defined as follows:

[Case 1]: The prediction mode changes from one 16 × 16 mode to a different 16 × 16 mode.
[Case 2]: The prediction mode changes from a 16 × 16 mode to a set of 4 × 4 modes.
[Case 3]: The prediction mode changes from a set of 4 × 4 modes to a 16 × 16 mode.
[Case 4]: The prediction mode changes from a 4 × 4 mode to a different 4 × 4 mode.
[Total]: Case 1 + Case 2 + Case 3 + Case 4.

Table 1
Average numbers of mode-changed blocks, in units of 4 × 4 blocks (1584 blocks in total in a QCIF frame).

| Video | QP | Case 1 | Case 2 | Case 3 | Case 4 | Total |
|---|---|---|---|---|---|---|
| Foreman | 28 | 0 | 4.8 | 0 | 29.2 | 34.0 |
| Foreman | 30 | 0 | 1.6 | 0 | 34.3 | 35.9 |
| Mobile | 28 | 0 | 0 | 0 | 61.0 | 61.0 |
| Mobile | 30 | 0 | 0 | 0 | 69.9 | 69.9 |

As can be seen in the table, more intra-prediction modes change as the compression ratio increases and as the high-frequency content increases. In all the videos, the change from a 4 × 4 mode to a different 4 × 4 mode is dominant, and some of the blocks predicted by a 16 × 16 mode are changed to a set of 4 × 4 modes. Most of the blocks predicted by a 16 × 16 mode remain unchanged (in spite of the results in Table 1, we cannot say that this always happens). This experimental result shows that the ratio of changed prediction modes is between 2% and 5% of the total blocks. This ratio does not seem too high, but it may have serious effects in applications such as watermarking, because a mode change alters the values of one or more coefficients in that block. Note that if a watermark embedded for ownership protection consists of a series of bits, and both the values and the positions matter, the loss of one watermark bit during extraction may ruin the meaning of the watermark. Fig. 3 shows an example frame map of the 4 × 4 blocks (white blocks) whose prediction modes were changed by the 2nd compression process.

Fig. 3. An example frame map showing the 4 × 4 blocks whose prediction modes were changed after the 2nd encoding process.

3.2.2. Change in the quantized residual coefficients

If the prediction mode of a block is changed, it is certain that at least some of the coefficient values of the residual block are changed. What about the residual coefficients in a block whose prediction mode is not changed? To find out, we examined the coefficients resulting from the experiments for Table 1; the results are shown in Table 2. One example frame is shown in Fig. 4, in which the white dots mark the coefficients whose values after the 2nd encoding process differed from those after the 1st encoding process. The values in Table 2 are the average numbers of coefficients in a 4 × 4 block (16 coefficients) whose values are changed.

Fig. 4. An example frame map showing the coefficients whose values were changed after the 2nd encoding process.

As can be seen in Table 2, more coefficients change their values in the 2nd compression at higher compression ratios and in videos with more high-frequency components. The average ratio of changed coefficients was between 1% and 5%. About 90% of them were in blocks whose prediction modes were changed, but even in the mode-unchanged blocks 1% to 3% of the coefficients changed their values. This means that the intra coding process inherently carries some probability that information resulting from the 1st compression is lost after the 2nd compression, even though the compression ratios of the two compressions are the same.
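The kind of bookkeeping behind Tables 1 and 2 can be sketched as follows. This is only an illustration under our own assumptions: each 4 × 4 block is described by a hypothetical label such as `("16x16", 2)` or `("4x4", 5)` (partition type and mode index) plus a list of 16 quantized coefficients, and the function name `compare_encodings` is invented. It is not the measurement code used with JM9.8.

```python
def compare_encodings(modes1, modes2, coeffs1, coeffs2):
    """Compare the 1st and 2nd encoding of one frame, 4x4 block by block.

    Returns the Case 1-4 counts of Table 1 and the average number of
    changed coefficients per block (unchanged-mode, changed-mode, total).
    """
    cases = {"case1": 0, "case2": 0, "case3": 0, "case4": 0}
    diffs = {"unchanged": [], "changed": []}
    for m1, m2, c1, c2 in zip(modes1, modes2, coeffs1, coeffs2):
        n_diff = sum(1 for a, b in zip(c1, c2) if a != b)
        if m1 == m2:
            diffs["unchanged"].append(n_diff)
            continue
        diffs["changed"].append(n_diff)
        if m1[0] == "16x16":
            cases["case1" if m2[0] == "16x16" else "case2"] += 1
        else:
            cases["case3" if m2[0] == "16x16" else "case4"] += 1

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    averages = {key: mean(values) for key, values in diffs.items()}
    averages["total"] = mean(diffs["unchanged"] + diffs["changed"])
    return cases, averages


# Four made-up blocks: one 16x16 -> 4x4 change and one 4x4 -> 4x4 change.
modes1 = [("16x16", 0), ("16x16", 0), ("4x4", 1), ("4x4", 3)]
modes2 = [("16x16", 0), ("4x4", 2), ("4x4", 1), ("4x4", 4)]
coeffs1 = [[0] * 16 for _ in range(4)]
coeffs2 = [[0] * 16, [1] + [0] * 15, [0] * 15 + [1], [1, 1] + [0] * 14]
cases, averages = compare_encodings(modes1, modes2, coeffs1, coeffs2)
print(cases)     # one Case 2 block and one Case 4 block
print(averages)  # unchanged: 0.5, changed: 1.5, total: 1.0
```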

3.2.3. Analyses for the changes in prediction mode and coefficient values

Why does this happen? Let us examine the encoding process in Fig. 1a again. Because entropy coding does not change any information, we only consider the processing steps before entropy coding. They can be expressed by the following equations:

$$c_2(i) = Q\{\mathrm{DCT}[\,c_0(i) - P\langle c_0(i) : c_1(i-1)\rangle\,]\} \tag{4}$$
$$c_1(i) = \mathrm{DCT}^{-1}\{Q^{-1}[c_2(i)]\} + P\langle c_0(i) : c_1(i-1)\rangle \tag{5}$$

where $c_0(i)$, $c_1(i)$, and $c_2(i)$ are the current coefficients of the original image, the reconstructed image, and the data just before entropy coding, respectively (refer to Fig. 1a). Also, $c_1(i-1)$ is the reconstructed coefficient used to predict the current coefficient. The operator $P\langle x : y\rangle$ is the intra predictor that predicts $x$ with respect to $y$; $Q[p]$ ($Q^{-1}[p]$) is the quantizer (de-quantizer), and $\mathrm{DCT}[q]$ ($\mathrm{DCT}^{-1}[q]$) is the DCT (inverse DCT) operation.

Table 2
Average numbers of changed coefficients, in units of 4 × 4 blocks (25,344 coefficients in total in a QCIF frame).

| Video | QP | # of changed coeff. in an unchanged block | # of changed coeff. in a changed block | Total # of changed coeff. in a block |
|---|---|---|---|---|
| Foreman | 28 | 0.1427 | 1.3888 | 0.1687 |
| Foreman | 30 | 0.1734 | 1.5266 | 0.1968 |
| Mobile | 28 | 0.4072 | 5.1711 | 0.5915 |
| Mobile | 30 | 0.4743 | 4.5102 | 0.6506 |

These equations mean that the results of the current prediction, subtraction, DCT, and quantization are used to predict the right and lower blocks. In this process, the 1st and 2nd compressions differ in two factors. The first is the data to be encoded. In the 1st compression, the blocks to be encoded are the original ones, which have not been processed at all, whereas in the 2nd compression they are blocks that were already encoded in the 1st compression. This difference may change the values of Distortion, and hence the Cost values in Eq. (1), which may change the prediction modes and/or the values of the residual coefficients. The second factor is the reference blocks. Intra-prediction refers to the left and upper blocks. For the reason above, the reference blocks in the 2nd compression may differ from those in the 1st compression, which may also change the prediction modes and/or the values of the residual coefficients. As a result, these two factors may combine to increase the changes in the prediction modes and coefficient values (in principle they could also decrease them, but the experimental results suggest that an increase is more likely).

Both factors are intrinsic to the current intra picture coding process, so they cannot be modified or removed. This means that a watermarking process in intra pictures inherently carries a high possibility of losing some of the embedded data during the extraction process. This is why the extraction ratio of the embedded watermark was so low in previous works such as [7].

4. Watermarking attempts in intra pictures

In this section, we introduce three attempts at digital watermarking in the intra pictures of H.264/AVC, to show the effects of watermarking and/or attacks on the changes in prediction modes and coefficient values, in addition to the effect of the inherent property of intra-prediction. For this purpose, we consider schemes that are more general and/or have wider applications: they are blind (one scheme is semi-blind), data-inserting, robust, and performed during the data compression process. For all the attempts the procedure in Section 3.2 was modified as follows:

(1) A video stream is encoded with given conditions, treating all the frames as I-pictures, and watermarked with a given scheme.
(2) The encoded video is reconstructed.
(3) Each image is attacked by various attacks, if necessary.
(4) The resulting video is re-processed with the same conditions as (1) to extract the watermark.
(5) The data resulting from (1) to (4) are compared.

In this experiment, we used CIF (352 × 288) videos. As the watermark we used the binary image shown in Fig. 5a, whose size was 32 × 32 (1024 bits). Because H.264/AVC performs the DCT and quantization on each 4 × 4 block, watermarking in this paper is also performed on this basis. The watermarking position in a 4 × 4 block is the lowest-frequency diagonal coefficient, which is position ‘3’ in Fig. 5b. The other two positions (positions ‘1’ and ‘2’) showed similar performance, but ‘3’ was the best. The reason for choosing this position can easily be found in previous works.

Fig. 5. (a) Watermark and (b) watermarking position.

Fig. 6. Watermarked and attacked images: (a) original; (b) just compressed; (c) compressed and watermarked only; compressed, watermarked, and (d) JPEG compressed with quality = 3, (e) JPEG compressed with quality = 8, (f) 1% Gaussian noise added, (g) 2% Gaussian noise added, (h) weakly sharpened, (i) strongly sharpened, (j) weakly blurred, and (k) strongly blurred.

Once the blocks to be watermarked are selected, the actual embedding process at position ‘3’ of each block is as follows:

if watermark bit is 0
    if coefficient is odd
        coefficient = coefficient - 1;
    else if coefficient is even
        do not modify;
else
    if coefficient is odd
        do not modify;
    else if coefficient is even
        coefficient = coefficient + 1;

With this embedding, the extraction process simply takes the LSB of coefficient ‘3’ of an embedded 4 × 4 block. In general, the watermark data consists of a series of values, so each watermark value and its relative position in the watermark data are important for asserting ownership. The position is especially important for binary watermark data, because if one bit is missed or an extra bit is added during extraction, the remaining data bits are mis-positioned and the meaning may be totally ruined.
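The parity rule and the LSB extraction can be written compactly as below. This is a minimal sketch on plain integers with made-up coefficient values; the helper names `embed_bit` and `extract_bit` are ours, and a real implementation would operate on the quantized coefficient at position ‘3’ of each selected block inside the encoder.

```python
def embed_bit(coefficient, bit):
    """Force the parity of an integer coefficient to match the watermark bit."""
    if bit == 0 and coefficient % 2 == 1:
        return coefficient - 1  # odd -> even
    if bit == 1 and coefficient % 2 == 0:
        return coefficient + 1  # even -> odd
    return coefficient          # parity already matches


def extract_bit(coefficient):
    """Read the watermark bit back as the LSB of the coefficient."""
    return coefficient & 1


bits = [1, 0, 1, 1, 0]
coeffs = [4, 7, -2, 0, -5]  # made-up quantized coefficients
marked = [embed_bit(c, b) for c, b in zip(coeffs, bits)]
print(marked)                            # [5, 6, -1, 1, -6]
print([extract_bit(c) for c in marked])  # [1, 0, 1, 1, 0]

# If the extractor misses the block at index 1, every later bit shifts.
shifted = [extract_bit(c) for i, c in enumerate(marked) if i != 1]
print(shifted)                           # [1, 1, 1, 0]
```

The last lines illustrate the positional sensitivity described above: a single missed block leaves the remaining bits intact in value but wrong in position.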

The experiments in this paper were performed with two video streams, Foreman and Mobile; for each, 50 frames were used and the resulting data were averaged. For the compression ratio, we chose QP = 28 and QP = 30. The attacks included JPEG compression (quality = 3 and 8), Gaussian noise addition (1% and 2%), sharpening (weak and strong), and blurring (weak and strong). As the experimental results we measured the PSNR of the watermarked/attacked image relative to the original, and the error rate of the watermark extracted from the watermarked/attacked image relative to the original watermark.

Fig. 6 shows examples of the watermarked and/or attacked images. Some of the attacked images in Fig. 6 are not worth reusing (refer to the PSNR values in Table 3), but we still use them for experimental purposes. Typically an image with a PSNR lower than 25 dB is considered too damaged.
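For reference, the two measures can be computed as in the following sketch. It assumes the common PSNR definition with an 8-bit peak value of 255 and a bit-wise error rate in percent; the paper does not spell out its exact formulas, so both definitions and the function names are assumptions.

```python
import math


def psnr(original, distorted, peak=255):
    """PSNR in dB between two equally sized sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


def bit_error_rate(embedded, extracted):
    """Percentage of watermark bits that differ between two bit lists."""
    errors = sum(1 for a, b in zip(embedded, extracted) if a != b)
    return 100.0 * errors / len(embedded)


pixels = [100, 100, 100, 100]   # made-up pixel values
noisy = [101, 99, 100, 100]
print(round(psnr(pixels, noisy), 2))
print(bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0]))  # 50.0
```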

4.1. Blind watermarking by fixing the watermarking positions

One way not to miss any watermark bit position in extraction is to fix the positions regardless of the original image. In this attempt, we decided to insert one watermark bit per 8 × 8 block, choosing the upper-left 4 × 4 block as the block to be watermarked. This watermarking scheme is blind, in that no data from the original image is needed in the extraction process. With this scheme all 1024 watermark bits can be inserted in each frame.

The experimental results are summarized in Table 3. First, when the watermarked image was not attacked, the quality degradation caused by watermarking alone was less than 0.5 dB, which is quite acceptable. The error rate of the watermark extracted without any attack was about 1%, which is due to the reason explained in the previous section and is also acceptable. But many of the attacked cases showed very high error rates in the extracted watermarks. The error rate increases as the compression ratio increases and as the complexity of the image increases. Even if we exclude the cases in which the attacked image is too degraded to be of value (PSNR lower than 25 dB), the error rates of some attacked cases (2% Gaussian noise addition, weak blurring, weak sharpening, and Q = 3 JPEG compression) were higher than 10%, at which the watermark data is not recognizable, as shown in Fig. 7.

Table 3
Experimental results for the blind watermarking scheme by fixing the watermarking positions. Err: error rate (%); PSNR in dB.

| Attack | Foreman QP = 28 Err | Foreman QP = 28 PSNR | Foreman QP = 30 Err | Foreman QP = 30 PSNR | Mobile QP = 28 Err | Mobile QP = 28 PSNR | Mobile QP = 30 Err | Mobile QP = 30 PSNR |
|---|---|---|---|---|---|---|---|---|
| No watermarking | – | 37.45 | – | 36.09 | – | 35.27 | – | 33.40 |
| No attack | 1.01 | 37.16 | 1.16 | 35.74 | 0.93 | 34.97 | 0.96 | 33.10 |
| Gaussian noise addition, 1% | 3.14 | 34.99 | 1.95 | 34.08 | 5.68 | 33.54 | 2.70 | 32.12 |
| Gaussian noise addition, 2% | 11.09 | 32.46 | 5.61 | 31.91 | 15.91 | 31.59 | 8.82 | 30.62 |
| Blurring, weak | 16.65 | 35.08 | 12.07 | 34.38 | 27.70 | 28.34 | 23.48 | 27.97 |
| Blurring, strong | 22.16 | 32.85 | 21.10 | 32.50 | 30.23 | 24.93 | 28.10 | 24.79 |
| Sharpening, weak | 11.32 | 31.43 | 15.63 | 30.85 | 27.77 | 24.40 | 26.22 | 24.16 |
| Sharpening, strong | 19.93 | 24.35 | 16.15 | 24.23 | 35.95 | 17.69 | 33.65 | 17.58 |
| JPEG compress, Q = 3 | 20.57 | 34.00 | 16.98 | 33.48 | 26.60 | 29.51 | 20.28 | 29.07 |
| JPEG compress, Q = 8 | 3.91 | 36.25 | 2.71 | 35.21 | 6.56 | 33.32 | 4.18 | 32.13 |

This happens for the reason explained in the previous sections: attacks, as well as watermarking, increase the changes in intra-prediction modes and coefficient values. It also tells us that manipulating the image data randomly, as this watermarking scheme and the attacks do, seriously changes both the prediction modes and the coefficient values.

4.2. Watermarking using the cost function

4.2.1. Blind watermarking

The prediction mode is determined by calculating the cost function in Eq. (1). Intuitively, the prediction mode chosen in the 1st compression is less likely to be changed by the 2nd compression if the difference in cost between the best mode and the 2nd-best mode is large enough. Because the cost function must be calculated during the encoding process anyway, it is not hard to use the calculation results. Therefore, a method that selects the blocks whose cost differences, defined in Eq. (6), are large enough can be considered.

$$\text{cost difference} = \mathrm{cost}_{\text{2nd best}} - \mathrm{cost}_{\text{best}} \tag{6}$$

where $\mathrm{cost}_{\text{best}}$ and $\mathrm{cost}_{\text{2nd best}}$ are the costs of the best and 2nd-best modes, respectively. We pre-determined a threshold cost difference ($C_{th}$) for this method, and only the blocks whose cost difference is greater than $C_{th}$ are selected to be watermarked. This results in blind watermarking because extraction does not use any embedding information. But the method can still mis-select the watermarked blocks during the extraction process, because the cost differences of some blocks near $C_{th}$ may change (one lower than $C_{th}$ becomes higher than $C_{th}$, or vice versa), which results in losing some watermark bits or acquiring additional bits.
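The block selection by Eq. (6), and the way it can go wrong at extraction, can be sketched as follows. The per-block mode costs are invented numbers chosen so that one block sits just below the threshold in the first pass and just above it in the second; the function names are ours.

```python
def cost_difference(costs):
    """Eq. (6): cost of the 2nd-best mode minus cost of the best mode."""
    best, second = sorted(costs)[:2]
    return second - best


def select_blocks(block_costs, c_th):
    """Indices of blocks whose cost difference exceeds the threshold C_th."""
    return [i for i, costs in enumerate(block_costs)
            if cost_difference(costs) > c_th]


# Made-up mode costs per block at embedding (1st compression) ...
first_pass = [[500, 620, 900], [300, 395, 410], [700, 1000, 1200]]
# ... and after re-encoding (2nd compression); block 1 drifts across C_th.
second_pass = [[505, 618, 905], [298, 402, 415], [702, 1001, 1190]]

print(select_blocks(first_pass, 100))   # [0, 2]: blocks chosen by the embedder
print(select_blocks(second_pass, 100))  # [0, 1, 2]: extractor picks an extra block
```

In this toy case the extractor reads one bit from a block that was never watermarked, so every subsequent bit is displaced by one position.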

Fig. 7. The extracted watermark image with error rate of (a) 1.95%, (b) 5.57%, (c) 10.16% and (d) 14.84%.

Fig. 8. Extracted watermark images by blind watermarking using cost function when threshold is (a) 100, (b) 200, (c) 300 and (d) 400.


Example watermark images extracted with this scheme are shown in Fig. 8 for various cost threshold values, for the Foreman video at QP = 28 without any attack. As the figures show, the number of embedded watermark bits decreases as the threshold value increases, to the point that only about half of the data in Fig. 5a was embedded and extracted. As explained above, the extraction process mis-locates the watermarked positions and ruins the extracted watermark even when $C_{th} = 100$ (compare with Fig. 7). Consequently this method cannot be used.

4.2.2. Semi-blind watermarking

The problem with the above method was the mis-location of the watermarked positions in extraction. To solve it, we stored the watermarked locations determined by the method in the previous subsection and used them in extraction, which makes the scheme semi-blind.
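A minimal sketch of the semi-blind variant is given below, assuming the stored locations are simply a list of block indices kept as side information. The data layout and function names are illustrative assumptions; only the parity rule is taken from the embedding procedure described earlier.

```python
def embed_semi_blind(coeffs, bits, positions):
    """Embed bits by parity at the given block indices; return a new list."""
    marked = list(coeffs)
    for pos, bit in zip(positions, bits):
        if marked[pos] % 2 != bit:
            marked[pos] += 1 if bit == 1 else -1
    return marked


def extract_semi_blind(coeffs, positions):
    """Read the LSB at the stored positions only, with no re-selection."""
    return [coeffs[pos] & 1 for pos in positions]


coeffs = [4, 7, -2, 0, -5, 8]  # made-up quantized coefficients, one per block
positions = [0, 2, 5]          # stored at embedding time (side information)
bits = [1, 0, 1]
marked = embed_semi_blind(coeffs, bits, positions)
print(marked)                                 # [5, 7, -2, 0, -5, 9]
print(extract_semi_blind(marked, positions))  # [1, 0, 1]
```

Because the positions are read from storage, bits can no longer be displaced, but a coefficient whose value changes under re-encoding or attack still yields a wrong bit.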

Even though the watermarked positions can no longer be lost, the experimental results showed that the performance was still not good enough, as shown in Table 4, whose conditions are the same as those of Table 3, with $C_{th} = 100$. In this case the average numbers of inserted watermark bits for Foreman and Mobile were 796 and 1024 when QP = 28, and 920 and 1024 when QP = 30, respectively. When the watermarked images were not attacked, the error rates were a little lower than with the fixed-position method. But in the attacked cases the error rates were higher.

Table 4
Experimental results for the semi-blind watermarking scheme using the cost function. Err: error rate (%); PSNR in dB.

| Attack | Foreman QP = 28 Err | Foreman QP = 28 PSNR | Foreman QP = 30 Err | Foreman QP = 30 PSNR | Mobile QP = 28 Err | Mobile QP = 28 PSNR | Mobile QP = 30 Err | Mobile QP = 30 PSNR |
|---|---|---|---|---|---|---|---|---|
| No watermarking | – | 37.45 | – | 36.09 | – | 35.27 | – | 33.40 |
| No attack | 0.08 | 37.24 | 0.83 | 35.83 | 0.36 | 35.04 | 0.74 | 33.16 |
| Gaussian noise addition, 1% | 3.66 | 35.06 | 1.66 | 34.14 | 3.69 | 33.58 | 1.73 | 32.17 |
| Gaussian noise addition, 2% | 18.05 | 32.47 | 9.48 | 31.96 | 16.07 | 31.60 | 9.95 | 30.66 |
| Blurring, weak | 16.70 | 35.09 | 15.85 | 34.39 | 30.91 | 28.35 | 27.77 | 27.98 |
| Blurring, strong | 24.36 | 32.84 | 22.44 | 32.49 | 35.02 | 24.93 | 32.43 | 24.79 |
| Sharpening, weak | 16.69 | 31.53 | 17.70 | 30.99 | 33.49 | 24.41 | 29.74 | 24.18 |
| Sharpening, strong | 31.53 | 24.43 | 29.15 | 24.35 | 43.81 | 17.69 | 41.65 | 17.59 |
| JPEG compress, Q = 3 | 23.39 | 34.01 | 17.73 | 33.52 | 28.73 | 29.53 | 22.31 | 29.09 |
| JPEG compress, Q = 8 | 6.48 | 36.32 | 2.25 | 35.28 | 5.64 | 33.36 | 3.92 | 32.18 |

The analyses revealed the following. Because of the threshold, the number of mode-changed watermarked blocks decreased, which lowered the error rate in the un-attacked cases. In the attacked cases, too, the number of changes in prediction modes decreased, but the attacks affected the values of more coefficients, even in blocks whose prediction modes did not change. Consequently this method does not satisfy the robustness requirement of the watermark.

4.3. Blind watermarking to embed only in the blocks predicted by 16 × 16 mode

Another point of interest in Table 1 was that the blocks predicted by a 16 × 16 mode in the 1st compression do not change (or are very unlikely to change) their prediction modes in the 2nd compression. This suggests that selecting only these blocks for watermarking may lower the error rate. Thus, we designed a watermarking scheme in which one watermark bit is inserted into each 4 × 4 block of the macro-blocks predicted by a 16 × 16 mode. The insertion position in a 4 × 4 block was the same as in Fig. 5b. With this scheme the average numbers of inserted bits in Foreman and Mobile were 255 and 44 for QP = 28, and 376 and 60 for QP = 30, respectively.

Table 5
Experimental results for the blind watermarking scheme to embed only in the blocks predicted by 16 × 16 mode in the 1st compression. Err: error rate (%); PSNR in dB.

| Attack | Foreman QP = 28 Err | Foreman QP = 28 PSNR | Foreman QP = 30 Err | Foreman QP = 30 PSNR | Mobile QP = 28 Err | Mobile QP = 28 PSNR | Mobile QP = 30 Err | Mobile QP = 30 PSNR |
|---|---|---|---|---|---|---|---|---|
| No watermarking | – | 37.45 | – | 36.06 | – | 35.27 | – | 33.40 |
| No attack | 0.00 | 37.29 | 0.00 | 35.89 | 0.02 | 35.24 | 0.00 | 33.35 |
| Gaussian noise addition, 1% | 0.98 | 35.07 | 0.22 | 34.18 | 1.40 | 33.73 | 0.32 | 32.32 |
| Gaussian noise addition, 2% | 6.81 | 32.48 | 2.65 | 31.95 | 8.08 | 31.70 | 3.25 | 30.77 |
| Blurring, weak | 14.54 | 35.14 | 11.24 | 34.45 | 19.48 | 28.39 | 15.24 | 28.03 |
| Blurring, strong | 21.08 | 32.88 | 20.93 | 32.53 | 26.93 | 24.95 | 26.51 | 24.81 |
| Sharpening, weak | 9.26 | 31.50 | 12.71 | 30.96 | 14.25 | 24.42 | 17.45 | 24.20 |
| Sharpening, strong | 12.19 | 24.39 | 11.51 | 24.31 | 18.36 | 17.70 | 17.68 | 17.60 |
| JPEG compress, Q = 3 | 12.91 | 34.02 | 11.38 | 33.54 | 17.29 | 29.58 | 15.49 | 29.16 |
| JPEG compress, Q = 8 | 1.35 | 36.36 | 0.24 | 35.32 | 1.46 | 33.50 | 0.34 | 32.32 |

The experimental results are in Table 5. As the table shows, the error rates were lower than those in Table 3 or Table 4. In particular, the cases in which the image was only watermarked, without any attack, showed very low error rates. But the blurring attack, the sharpening attack, and JPEG compression with quality 3 still showed unacceptably high error rates. The analysis of the experimental results showed that the watermarking did not change the prediction modes, even though some coefficient values were changed, but the attacks changed the prediction modes as well as the coefficient values.
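The block selection of this scheme, one bit per 4 × 4 block inside every macro-block predicted by 16 × 16 mode, can be sketched as below. The mode labels `"I16x16"` and `"I4x4"`, the function names, and the example mode lists are hypothetical and chosen only for illustration; a real encoder would supply the per-macro-block mode decisions.

```python
BLOCKS_PER_MB = 16  # a 16x16 macro-block holds sixteen 4x4 blocks

def watermarkable_blocks(mb_modes):
    """(macro-block index, 4x4 block index) pairs inside 16x16-mode macro-blocks."""
    return [
        (mb, blk)
        for mb, mode in enumerate(mb_modes)
        if mode == "I16x16"
        for blk in range(BLOCKS_PER_MB)
    ]

def capacity(mb_modes):
    """Number of watermark bits the frame can carry under this rule."""
    return len(watermarkable_blocks(mb_modes))

def mislocated(modes_first, modes_second):
    """Blocks chosen at embedding that extraction would no longer select."""
    second = set(watermarkable_blocks(modes_second))
    return [b for b in watermarkable_blocks(modes_first) if b not in second]

# Hypothetical mode decisions at the 1st compression and after an attack.
modes_first = ["I4x4", "I16x16", "I4x4", "I16x16"]
modes_attacked = ["I4x4", "I16x16", "I4x4", "I4x4"]

print(capacity(modes_first))                         # 32 bits
print(len(mislocated(modes_first, modes_first)))     # 0: modes unchanged
print(len(mislocated(modes_first, modes_attacked)))  # 16: one macro-block lost
```

The sketch only counts positions. It shows why the capacity depends on how many macro-blocks the encoder predicts by 16 × 16 mode, and why an attack that changes a prediction mode removes all sixteen positions of that macro-block at once.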

5. Discussion and conclusion

So far we have examined the properties of the intra-prediction of H.264/AVC in connection with digital watermarking. The most serious problem was that re-engineering the intra frame under the same conditions as the first encoding does not guarantee the same prediction modes and coefficient values. That is, the re-engineering changes the intra-prediction modes and the coefficient values. A change in the prediction mode makes the change in the coefficient values more serious.
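A toy example may help to see how re-encoding a decoded block can pick a different prediction mode. The sketch below is a strong simplification and not the H.264/AVC algorithm: it assumes only two modes, a plain SAD cost, quantization of the residual directly in the pixel domain, and unchanged neighbouring pixels. All names and values are illustrative.

```python
MODES = ("vertical", "horizontal")  # two intra modes only, for brevity

def predict(mode, top, left):
    """4x4 prediction copied from the row above or the column to the left."""
    if mode == "vertical":
        return [[top[c] for c in range(4)] for _ in range(4)]
    if mode == "horizontal":
        return [[left[r] for _ in range(4)] for r in range(4)]
    raise ValueError(mode)

def sad(block, pred):
    """Sum of absolute differences, used here as the mode-decision cost."""
    return sum(abs(block[r][c] - pred[r][c]) for r in range(4) for c in range(4))

def choose_mode(block, top, left):
    """Mode with the lowest cost; min() keeps the first mode on a tie."""
    return min(MODES, key=lambda m: sad(block, predict(m, top, left)))

def quantize(residual, step):
    """Round the residual to the nearest multiple of step (halves round up)."""
    return ((residual + step // 2) // step) * step

def encode_decode(block, top, left, step):
    """Return the chosen mode and the block as a decoder would reconstruct it."""
    mode = choose_mode(block, top, left)
    pred = predict(mode, top, left)
    recon = [
        [pred[r][c] + quantize(block[r][c] - pred[r][c], step) for c in range(4)]
        for r in range(4)
    ]
    return mode, recon

top, left = [100] * 4, [104] * 4
original = [[102] * 4 for _ in range(4)]

first_mode, decoded = encode_decode(original, top, left, step=4)
second_mode, _ = encode_decode(decoded, top, left, step=4)
print(first_mode, second_mode)
```

Here both modes cost the same for the original block, so the first one is taken. Quantization then moves the decoded pixels to 104, which matches the left neighbours exactly, and the second encoding selects the other mode. The case is deliberately constructed around a tie, but it shows the mechanism: the second mode decision is made on data that already contain quantization error.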

We have attempted three watermarking schemes that embed binary watermark data into the images: a blind scheme with fixed watermarking positions, a semi-blind scheme using the cost function, and a blind scheme using the blocks predicted by 16 × 16 mode. But none of them satisfied the requirements; some attacks seriously raise the error rates of the extracted watermarks. This is because the attacks further change the prediction modes and coefficient values.

In general, intra-frames have been used to watermark a video stream for the previous MPEG standards, which do not include the intra-prediction scheme. Because of the problem explained above, watermarking inter-frames can be tried instead for H.264/AVC, and some research has been published [4]. But our analysis revealed that inter-frames also have a similar problem, in spite of [4].

Consequently, we think that a traditional scheme such as those above cannot be a good solution because of the heuristic methods in the intra-prediction of H.264/AVC. A new kind of watermarking scheme that does not involve the intra- or inter-prediction is needed, in which using the data in the content might be better than inserting some data into the content. The scheme in [8] is an example, except that it does not provide a method to extract the values of the inserted watermark.

Acknowledgements

This work was supported by the IT R&D program of MKE/IITA [2009-F-028-01, Signal Processing Elements and their SoC Developments to Realize the Integrated Service System for Interactive Digital Holograms].

References

[1] I.E.G. Richardson, H.264/AVC and MPEG-4 Video Compression, John Wiley & Sons Ltd., 2003.

[2] I.E.G. Richardson, Video Codec Design: Developing Image and Video Compression Systems, John Wiley & Sons Ltd., 2002.

[3] ISO/IEC, ISO/IEC JTC1/SC29/WG11 Coding of Moving Picture and Audio, Draft of Version 4 of ISO/IEC 14496-10 (E) MPEG05/N7081, 2005.

[4] S. Sakazawa, Y. Takishima, Y. Nakajima, H.264/AVC native video watermarking method, in: IEEE International Symposium on Circuits and Systems (ISCAS 2006) Proceedings, 2006, pp. 1439–1442.

[5] G. Qiu, P. Marziliano, A.T.S. Ho, D. He, Q. Sun, A hybrid watermarking scheme for H.264/AVC video, in: Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 4, 2006, pp. 2353–2356.

[6] T.T. Lu, W.L. Hsu, P.C. Chang, Blind video watermarking for H.264/AVC, in: Canadian Conference on Electrical and Computer Engineering (CCECE 2006), 2006, pp. 2353–2356.

[7] M. Noorkami, R. Mersereau, Compressed-domain video watermarking for H.264/AVC, in: IEEE International Conference on Image Processing (ICIP 2005), vol. 2, 2005, pp. 2353–2356.

[8] M. Noorkami, R. Mersereau, A framework for robust watermarking of H.264/AVC-encoded video with controllable detection performance, IEEE Transactions on Information Forensics and Security 2 (1) (2007) 14–23.

[9] I.J. Cox, M.L. Miller, J.A. Bloom, Digital Watermarking, Morgan Kaufmann Publishers, San Francisco, CA, 2002.
