detectability of spread spectrum audio watermarking ... · detectability of spread spectrum audio...

12
Detectability of Spread Spectrum Audio Watermarking Technique in Ambient Noisy Environment as Second Screen Young-Seok Lee Dept. of Electronics, Chungwoon University, Incheon Campus 113 Suk-Gol Ro, Incheon, 22100, KOREA [email protected] February 4, 2018 Abstract Background /Objectives: As second screen applica- tion, we consider spread spectrum audio watermarking tech- nique. And also we focus on the detectability of watermark from recorded audio. Methods/ Statistical analysis: we present results of detection rate of standard direct sequence spread spectrum audio watermarking scheme8 that has significantly higher robust detection. In addition, we experiment desynchro- nization problem in the second screen application of spread spectrum audio watermarking technique. Findings: we experiment the usage of spread spectrum audio watermarking technique as Second screen. Exper- imental results show that it failed to extract the water- mark from the recorded audio signal. In the analysis of the recorded audio signal, the magnitude of recorded audio signal can observe great attenuation compared to the mag- nitude of watermarked audio signal. It is also possible to estimate that the embedded watermark has been deformed during the recording process. This estimation is a feasi- ble result when observing the frequency characteristics of 1 International Journal of Pure and Applied Mathematics Volume 118 No. 19 2018, 1483-1493 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu Special Issue ijpam.eu 1483

Upload: hoangthuy

Post on 14-Mar-2019

217 views

Category:

Documents


0 download

TRANSCRIPT

Detectability of Spread Spectrum AudioWatermarking Technique in AmbientNoisy Environment as Second Screen

Young-Seok LeeDept. of Electronics, Chungwoon University,

Incheon Campus 113 Suk-Gol Ro,Incheon, 22100, KOREA

[email protected]

February 4, 2018

Abstract

Background /Objectives: As second screen applica-tion, we consider spread spectrum audio watermarking tech-nique. And also we focus on the detectability of watermarkfrom recorded audio.

Methods/ Statistical analysis: we present results ofdetection rate of standard direct sequence spread spectrumaudio watermarking scheme8 that has significantly higherrobust detection. In addition, we experiment desynchro-nization problem in the second screen application of spreadspectrum audio watermarking technique.

Findings: we experiment the usage of spread spectrumaudio watermarking technique as Second screen. Exper-imental results show that it failed to extract the water-mark from the recorded audio signal. In the analysis ofthe recorded audio signal, the magnitude of recorded audiosignal can observe great attenuation compared to the mag-nitude of watermarked audio signal. It is also possible toestimate that the embedded watermark has been deformedduring the recording process. This estimation is a feasi-ble result when observing the frequency characteristics of

1

International Journal of Pure and Applied MathematicsVolume 118 No. 19 2018, 1483-1493ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)url: http://www.ijpam.euSpecial Issue ijpam.eu

1483

a microphone in a mobile phone. he microphones in mostmobile phones are designed to compress audio signals asmuch as possible. This means that components outside thehuman audible frequency band are removed using a psy-choacoustic model. In our experiment, we confirmed thatthe frequency characteristics of the microphone in mobileare band pass filters with center frequencies of 1.5KHz and3.0KHz. Therefore, it is estimated that this result makes itdifficult to extract the watermark from the recorded audiosignal.

Improvements /Applications: The watermark tech-nique for the recorded audio signal should be designed inconsideration of the attenuation and the frequency charac-teristics of the microphone in addition to the general water-mark attack.

Key Words:Second screen, Audio watermarking tech-nique, Spread spectrum, Attenuation, Recorded audio.

1 Introduction

The second screen application has two main methods of contentsynchronization for audio signals (Automatic Content Awareness1,ACR) such as fingerprint and digital watermark. These technolo-gies have fundamentally different approaches. Fingerprint2 - smalland stable representation of content for distortion. The recognitionprocess is based on the fingerprint generation of the audio signaland the reference sample, and retrieves the necessary data later. Ifthe information of audio is necessary for recognition and is suffi-cient, it will disappear directly from the audio signal.

The second screen application continuously records the audiostream and sends the request along with the fingerprint to theserver. Database storage and retrieval is typically implemented onthe server side. If a highly popular project is generally interestedin the second screen, you should be able to use enough resources tomaintain a high load. Successful recognition based on fingerprintsis only possible with the originality of short audio. However, theactual content may have audio duplicates, such as the same moodmusic on the credit, or the same background music on the back-ground. Recognition of this section under the condition of noise

2

International Journal of Pure and Applied Mathematics Special Issue

1484

distortion is followed by false positive operations of the second sort.Therefore, to properly apply the second screen, we consider to

be able to apply the ACR system and perform additional adjust-ments. The transition to digital watermarking technology3 is quiteattractive as a possible solution to this problem. All the necessarydata ”is pre-embedded in the audio stream in advance and is fullyrecognizable on the client side and the uniqueness of the watermarkeliminates redundant and similar audio fragmentation problems.

Depending on whether the detector of the watermark requiresthe original signal, the algorithm is shared with blind watermarkingand visual watermarking. In the context of ACR, we have beeninterested in visual impairment algorithms4. That is, it allowed theextraction of the sign without the availability of the initial speechsignal. Watermarks implemented with audio must be transparent(inaudible) - Do not enter distortions that significantly affect thequality of the original signal. The simple quantitative propertyof transparency is the SNR parameter (signal-to-noise ratio) thatdefines the relationship between the power of the original signal andthe distortion power of the generated the watermark.

According to the recommendation of the Phonographic Industryof International Phonographic Industry (IFPI) of SNR, it shouldbe more than 20dB. The ODG parameter (objective differentialgrade), which is calculated according to the algorithm of PEAQalong with the SNR for transparency evaluation and changing from0 to absolute obscurity, is used as -4 in distortion causing strongstimuli. Unlike SNR, the ODG parameter takes into account thecharacteristics of a person’s hearing system, such as frequency andtemporary masking. For successful recognition of watermarks, thesignal processing method should be stable (robust). Loss compres-sion, filtering, digital-to-analog / analog-to-digital conversion, andadding noise are needed. Resistance to attack is the number of bitsthat were mistakenly decoded in time units by bit error rate (BER).

Keeping on the importance of blind watermarking, the paperbased on the blind detection mechanism of the watermark fromWatermarked signal is reviewed. Previous audio watermarkingtechnique was not very powerful and the watermarked signals havevery sensitivity level in the human auditory system. By early re-searchers, a robust audio watermarking technique using the blindwatermark detection method was proposed by Bassia5 in the time

3

International Journal of Pure and Applied Mathematics Special Issue

1485

domain which was application of amplitude modification of audio.A watermark key was used to create a watermark to be embeddedin an audio signal known only to the copyright holder. Water-marks with high amplitude values are used, but robustness can beincreased if recognizable distortion is introduced. The proposedwatermarking technique was not statistically detectable and couldresist MPEG compression, rescaling, filtering and re-quantization.

Spread-spectrum technology is one of the most important tech-nique to hide secret information in an audio signal. The charac-teristics of this technique are spreading the embedded informationto be impossible detecting for malicious attackers. There are twoimplementations of this technology: Direct Sequence Spread Spec-trum Method or Frequency Usage Hopping spread spectrum6. InDirect Sequence Spread Spectrum approach, the exclusive OR isoperating while spreading embedded information. Frequency hop-ping spread spectrum technology, spreading Hopping sequence ishopping sequence is specified7. In this paper, we present results ofdetection rate of standard direct sequence spread spectrum audiowatermarking scheme8 that has significantly higher robust detec-tion. In addition, we experiment desynchronization problem in thesecond screen application of spread spectrum audio watermarkingtechnique.

2 Spread Spectrum Audio Watermark-

ing Technique

In this paper, we restrict our experiment for detectability of water-mark to direct-sequence spread spectrum watermarking and discussa set of technologies to improve the effectiveness of their embeddingand detecting in noisy environment. The robustness of watermark-ing algorithm is enabled using:

(i) The repetition of block coding for keep from de-synchronizationattacks9

(ii)Psycho-acoustic frequency masking (PAFM)10. PAFM cre-ates an imbalance in the number of positive and negative watermarkchips in the part of the spread spectrum sequence that is used for

4

International Journal of Pure and Applied Mathematics Special Issue

1486

watermark correlation detection and that corresponds to the audi-ble part of the frequency spectrum.

(iii)To improve reliability of watermark detection, the varianceof the correlation test is reduced.

The audio signal to be watermarked xεRNcan be representedas a random vector which is assumed independent identically dis-tributed (i. i. d) Gaussian random variable with standard deviationσ x11. A watermark is defined as direct sequence spread spectrumthat is a vector pseudo randomly generated in ωε(−1,+1) whereare generated such that they are mutually independent with respectto the original audio x. The watermarked signal y is expressed by

y = x+ δω (1)

Where δ is embedding strength of watermark.A watermarkω can be detected correlation by given signal vector

z with ω:C(z, ω) = z.ω = E[z.ω] +N(0,

σx√N

) (2)

When the detector is operated in optimum, the positive probabilityerror PFA is

PFA = Pr[C(z, ω) ≥ τ |(z = x)] =1

2erfc(

τ√N

σx√

2) (3)

and the negative probability error PMDis

PMD = Pr[C(z, ω) ≥ τ |(z = x+ ω)] =1

2erfc(

(E[z.ω]− τ)√N

σx√

2)

(4)

5

International Journal of Pure and Applied Mathematics Special Issue

1487

Fig. 1. Typical embedding procedure of spread spectrum audiowatermarking technique

Figure 1 shows the typical embedding procedure of the spreadspectrum audio watermarking technique. The watermark that israndomly generated may be audible adding audio signal. To over-come the effect of audible sequence, psycho-acoustic model is ap-plied to embedding algorithm. In psycho-acoustic model of humanauditory model, in order to quantify the audibility of a particularfrequency component, spread spectrum audio watermarking tech-nique uses a simple psycho-acoustic frequency masking model. Foreach modulated complex lapped transform magnitude coefficient,the likelihood that it is audible averages 0.6 in the crucial 200 Hz -2 kHz subband, in our audio benchmark suite.

Figure 2 illustrates the frequency spectrum of modulated com-plex lapped transform block as well as the psycho-acoustic fre-quency masking boundary. Psycho-acoustic frequency masking fil-tering introduces the problem of spread spectrum sequence imbal-ance; a problem also illustrated in Figure 212. When embedding apositive chip (ω =1), an inaudible frequency magnitude becomesaudible over the level of audibility for given modulated complexlapped transform block

Fig.2.Threshold masking of psycho-acoustic model of humanauditory model

6

International Journal of Pure and Applied Mathematics Special Issue

1488

3 Results and Discussion

When the watermarked audio signal propagates in the air, the audioclip of watermarked audio from speaker of media devices as televi-sion is collected by microphone system in mobile phone. The audioclips in the air are polluted by additive noise and AD/DA attackand also severe attenuation. Figure 3 shows a watermarked audioand collected audio clip of the watermarked audio. The collectedaudio clip is recorded distance 4m, 60dB sound level from televi-sion speaker. In the comparison of amplitude between two audios,the amplitude level of recorded audio clip is 10% of the originalwatermarked audio. Attenuation by propagation in the air affectsdistortion of watermarked audio.

Fig.3.Comparison of audio level between watermarked andrecorded audio

In our experiments, we modified watermark audio in Figure 4.As in Figure 3, the recorded audio loses information about syn-chronization. To recover the information of synchronization, at thefront of recorded audio, the known random sequence and silent pe-riod are added. When the watermark is detected from recordedaudio, we can use the known random sequence to the position ofsynchronization of the recorded audio by correlator. The appli-cation of psycho-acoustic model performed in frequency domain tofind threshold masking. To determine the masking threshold at anyframe of audio clip, samples of frame are transformed to Fourierdomain with Hamming window and then masking threshold is cal-culated.

7

International Journal of Pure and Applied Mathematics Special Issue

1489

Fig.4.The modified watermarked audio consideringdesynchronization by recording

Fig.5. Schematic of masking threshold procedure bypsychoacoustic model

Our experimental results are shown in Figure 6. The left figurein Figure 6 is the correlation result from the watermarked audiosignal and the right figure is the result from the recorded audiosignal. We can observe peak signal as correlation results but cannotobserve peak, respectively. It means that the watermark cannot bedetected in the recorded audio.

Fig.6.Correlation results of watermarked and recorded audio

8

International Journal of Pure and Applied Mathematics Special Issue

1490

Our experimental results show that we failed to extract the wa-termark from the recorded audio signal. These results imply thatthere is a great difference between the recorded audio and the wa-termarked audio and also that the watermark embedded in therecorded audio signal has been deformed or distorted.

4 Conclusion

In this paper we experiment the usage of spread spectrum audio wa-termarking technique as Second screen. Experimental results showthat it failed to extract the watermark from the recorded audio sig-nal. In the analysis of the recorded audio signal, the magnitude ofrecorded audio signal can observe great attenuation compared tothe magnitude of watermarked audio signal. It is also possible toestimate that the embedded watermark has been deformed duringthe recording process. This estimation is a feasible result when ob-serving the frequency characteristics of a microphone in a mobilephone. he microphones in most mobile phones are designed to com-press audio signals as much as possible. This means that compo-nents outside the human audible frequency band are removed usinga psychoacoustic model. In our experiment, we confirmed that thefrequency characteristics of the microphone in mobile are band passfilters with center frequencies of 1.5kHz and 3.0kHz. Therefore, itis estimated that this result makes it difficult to extract the water-mark from the recorded audio signal. As a result, the watermarktechnique for the recorded audio signal should be designed in con-sideration of the attenuation and the frequency characteristics ofthe microphone in addition to the general watermark attack.

References

[1] Kawagoe M, Tojo A, Fingerprint classification, Pattern Recog-nition 17(3), 1984, pp. 293-303.

[2] Candela G T, Grother P J, Watson C I, Wilkinson R A, Wil-son C L, A pattern level classification automation system forfingerprints, NISTIR 5647, Technical Report and CD, NIST,1995.

9

International Journal of Pure and Applied Mathematics Special Issue

1491

[3] Van Cauwenberge A, Schaap G, van Roy R, TV no longercommands our full attention: Effects of second-screen viewingand task relevance on cognitive load and learning from news38, 2014, pp. 100-109.

[4] Kesar P, Bulterman D, Jansen, A J, Usages of the SecondaryScreen in an Interactive Television Environment: Control, En-rich, Share, and Transfer Television Content. In Proceedingsof the 6th European conference on Changing Television Envi-ronments. 2014.

[5] Bassia P, Pitas I, robust audio watermarking in the time do-main, IEEE transactions on Multimedia, 2001, 2(2), pp. 232241.

[6] Baranwal N, Datta K, Comparative study of spread spectrumbased audio watermarking techniques. International Confer-ence on Recent Trends in Information Technology (ICRTIT),2011, pp. 896-900.

[7] Zhang P, Xu S, Yang H, Robust and transparent audio water-marking based on improved spread spectrum and psychoacous-tic masking. International Conference on Information Scienceand Technology (ICIST), 2012, pp. 640-643.

[8] Haitsma J, Van Der Veen M, Kalker T, Bruekers F, Audiowatermarking for monitoring and copy protection, Proceedingsof the Eighth ACM Multimedia Workshop, 2000, pp. 119 122.

[9] Anderson R J, Petitcolas, On the limits of Steganography.IEEE J. Select. Areas Commun., 16, 1998, pp. 474481.

[10] Baumgarte F, Psychoacoustic model for audio coding based onthe cochlear filter bank. IEEE workshop on the Applications ofSignal Processing to Audio and Acoustics. 2001, pp. 139-142.

[11] Van Wim E, Introduction to random signal and noise, JohnWiley & Sons Ltd., 2006

[12] Blumstein S E, Stevens K N, Perceptual Invariance and OnsetSpectra for Stop Consonants in Different Vowel Environments.J Acoustic Society of American, 67, 1980, pp. 648-662.

10

International Journal of Pure and Applied Mathematics Special Issue

1492

1493

1494