
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 45, NO. 8, AUGUST 1997 2037

Quickest Detection of a Tonal Burst

Robert J. Stahl and Peter K. Willett

Abstract: In this paper, we describe and analyze the performance of a technique for the quickest detection of a sinusoid of unknown frequency, amplitude, and phase in additive white noise. The approach is based on the work of Broder and Schwartz and relies on asymptotic results, that is, the “signal” to be detected as quickly as possible is assumed to be of vanishingly small amplitude, which is the most difficult (and interesting) situation. In the literature, the relationship between the small-signal Page’s test and locally optimal fixed-length detection theory is explored in detail for the case of a known contaminant. Here, these results are extended to the case of a stochastic contaminant (i.e., the unknown sinusoid).

We derive the version of Page’s test optimized under the assumptions that the amplitude is small, the data arrives in blocks, and the frequency of the sinusoid is uniformly distributed in a given band, and we verify the performance predictions via simulation. To detect a sinusoid of completely unknown frequency, an ensemble of such detectors is required, and this ensemble is very close to an FFT-based scheme. If FFT’s are to be used, however, the best performance is obtained when each is augmented by a half-band-shifted version of itself.

Index Terms: Efficacy, Page’s test, quickest detection, signal detection, unknown sinusoid.

I. INTRODUCTION

THIS PAPER investigates the problem of detecting a sinusoidal disturbance occurring within an additive noise background. A disturbance (disorder) problem is one in which an observed process undergoes a change in distribution at some unknown time. Here, not only is the time the sinusoid “turns on” unknown, but so are its frequency and phase. In addition, a constraint that the signal amplitude is small will also be applied to the study. Since the problem is one of detecting a nonstationarity, a traditional fixed block detection scheme is inappropriate; therefore, a sequential detector will be employed. In particular, a scheme based on Page’s test [4], [5] will be chosen. Page’s test, which is a quickest detection scheme originally introduced to detect failures in manufacturing processes, is a modification of the original work performed by Wald [6]–[8].

Page’s procedure is implemented using a cumulative sum (CUSUM) approach:

$$S_n = \max\{0,\ S_{n-1} + g(x_n)\}; \quad \text{declare detection when } S_n \text{ crosses } h \tag{1}$$

where $S_n$ is the CUSUM, $h$ is the detection threshold, $g(\cdot)$ is the statistic, and $x_n$ is the input sample.

Manuscript received March 16, 1994; revised March 28, 1997. This work was supported in part by ONR through NUWC Newport under Contract N66604-96-C-0553. The associate editor coordinating the review of this paper and approving it for publication was Prof. Douglas Williams.

R. J. Stahl is with the Atlantic Aerospace Electronics Corporation, Waltham, MA 02154 USA.

P. K. Willett is with the University of Connecticut, Storrs, CT 06269 USA.

Publisher Item Identifier S 1053-587X(97)05786-3.

Page’s original procedure was defined with the statistic equal to the log-likelihood ratio. Since a great deal about the signal is unknown, a likelihood ratio cannot be implemented directly. Nonetheless, for the purposes of obtaining a test statistic, one can initially place some constraints on the signal. For example, it will be assumed that the sinusoid’s frequency is contained within a narrow band of frequencies; this requirement will be relaxed later. In addition, since the signal amplitude is assumed small, a locally optimal test statistic will be utilized.
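As an illustration of the CUSUM recursion just described, the following is a minimal sketch, not the authors' implementation. It assumes the simplest setting reviewed in the next section (a shift in the mean of unit Gaussian noise), for which the log-likelihood ratio is available in closed form. The names `page_test` and `make_data`, the threshold value, and the use of `>=` at the threshold are all choices made here for illustration.

```python
import random

def page_test(samples, g, h):
    """Run the CUSUM recursion; return the 1-based alarm index, or None."""
    s = 0.0
    for n, x in enumerate(samples, start=1):
        # Accumulate the update and reset to zero whenever the sum goes negative.
        s = max(0.0, s + g(x))
        if s >= h:
            return n
    return None

def make_data(n_total, n_on, theta, rng):
    """Unit Gaussian noise whose mean jumps from 0 to theta at sample n_on."""
    return [rng.gauss(0.0, 1.0) + (theta if n >= n_on else 0.0)
            for n in range(1, n_total + 1)]

rng = random.Random(0)
theta = 1.0  # assumed disturbance strength for this toy example

# Log-likelihood ratio for a mean shift of theta in unit Gaussian noise.
def llr(x):
    return theta * x - 0.5 * theta ** 2

alarm = page_test(make_data(2000, 250, theta, rng), llr, h=8.0)
print("alarm raised at sample:", alarm)
```

The reset to zero is what lets the test "forget" the pre-disturbance data, so the delay after the change does not grow with the time spent waiting for it.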

Before the problem of detecting the sinusoid is taken up, a review of some earlier results is in order. The following section discusses the work of Broder and Schwartz [1]–[3], in particular, the quickest detection of a change in mean. Here, some important performance metrics are defined, and a key result equating “Page’s performance” with the standard fixed-length detector efficacy measure is summarized.

In the sections that follow, a similar result will be obtained for the unknown signal case, along with a detector structure that permits one to consider any independent and identically distributed (i.i.d.) noise background; for the Gaussian case, this detector’s performance is almost identical to the DFT. Finally, the single-band case will be extended to a multiband implementation. Simulation results are also shown.

II. PREVIOUS RESULTS

In [1]–[3], the model explored was

(2)

with

observations process; additive white noise process having univariate density ; unknown time of disturbance.

In order to obtain this time of disturbance, a detector based on Page’s test was investigated.

In the above studies, the utility of Page’s test was demonstrated with several test statistics and under a variety of background noise densities. Here, two valuable performance metrics were defined; they are the mean time between false alarms $T$ and the average delay in detecting the signal $D$.

These quantities are the disorder problem’s equivalent to the false alarm and detection probabilities found in standard detection theory. For a more general version of (2), given by

(3)

(where denotes the probability density function), then using a detection threshold of , we have [1]–[3], [9], [10]

(4)

(5)

(We present a brief tutorial explanation of this in Appendix A.) Here, is the nonzero root of the moment generating function identity of [the update, as defined in (1)], that is

(6)

Given that (and hence ) is sufficiently large, the approximations

(7)

(8)

hold. (In the development of (7) and (8), it is necessary that These follow in a straightforward manner from (6), provided , and as we shall discuss, these latter must be true in order for the Page procedure to work at all.) The subscripts and are used to denote , respectively, before and after the disturbance.
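The nonzero root of the moment generating function identity in (6) rarely has a closed form, but it is easy to find numerically. The sketch below is an illustration only: it assumes the update is Gaussian before the disturbance (so that the log-MGF is a known quadratic) and uses a plain bisection. The helper names and the test values are hypothetical. For a Gaussian update with mean $\mu < 0$ and variance $\sigma^2$, the root is $-2\mu/\sigma^2$, which gives a check on the search.

```python
def log_mgf_gaussian(t, mean, var):
    """ln E[exp(t*g)] for a Gaussian update g with the given mean and variance."""
    return mean * t + 0.5 * var * t * t

def nonzero_root(log_mgf, t_hi=1.0, tol=1e-12):
    """Positive root of log_mgf(t) = 0.

    Assumes log_mgf is convex, zero at t = 0, and has negative slope there
    (that is, the update is negatively biased before the disturbance).
    """
    while log_mgf(t_hi) <= 0.0:   # move right until the function is positive
        t_hi *= 2.0
    t_lo = t_hi / 2.0
    while log_mgf(t_lo) > 0.0:    # move left until the function is non-positive
        t_lo /= 2.0
    while t_hi - t_lo > tol:      # bisect the bracketing interval
        mid = 0.5 * (t_lo + t_hi)
        if log_mgf(mid) > 0.0:
            t_hi = mid
        else:
            t_lo = mid
    return 0.5 * (t_lo + t_hi)

# Log-likelihood-ratio update for a unit mean shift in unit Gaussian noise:
# before the disturbance it is Gaussian with mean -0.5 and variance 1.
t0 = nonzero_root(lambda t: log_mgf_gaussian(t, -0.5, 1.0))
print("root for the log-likelihood-ratio update:", t0)
```

In this example the root comes out at one, which is the general property of a log-likelihood-ratio update: the no-signal expectation of the likelihood ratio itself is unity.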

A key empirical observation was made in [1]–[3]: Independent of background noise and test statistic used, a linear relationship existed between $\ln T$ and $D$ for large $T$. This prompted the definition of an asymptotic performance measure (Page’s efficiency), which is defined as

(9)

Now, once again, assuming large, we can use (7)–(9) to approximate Page’s efficiency as

(10)

where represents the expectation operator. While this may appear to be a simple expression, the difficulty in finding a closed-form solution for may preclude its direct implementation. Given this, a small signal solution for this quantity was also pursued in [1], that is, a solution valid as the signal strength approaches zero. In this case, both and its first derivative (with respect to the signal strength) are zero, but the identity

(11)

was shown to hold, where is the efficacy of a statistic in noise with density from standard fixed-length detection theory. This is a powerful result because it enables us to obtain the performance of Page’s test using a well-known quantity: the efficacy [7], [11]; an application is that an optimal choice of is the “locally-optimal” nonlinearity.

Fig. 1. Sinusoidal disturbance occurring at sample n = 250 in unit Gaussian noise and with disturbance strength 1.

III. THE ANALYSIS

A. Problem Statement

In the present application, the goal is to detect, as quickly as possible, the addition of a sinusoid to a white noise process, as exemplified in Fig. 1. Mathematically, the model is

(12)

with the frequency and the phase both unknown. As in [1], we shall assume that the strength of the disturbance is small and concentrate on asymptotic results. (As a broad but at least somewhat defensible statement, almost anything works well when this strength is large.)
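To make the setting concrete, here is a small sketch that generates data of the kind shown in Fig. 1: white unit Gaussian noise to which a sinusoid is added from some onset sample onward. The exact form of (12) is not reproduced here; the sine convention, the function name `tonal_burst`, and the parameter values are assumptions made for illustration.

```python
import math
import random

def tonal_burst(n_total, n_on, theta, omega, phase, rng):
    """Unit Gaussian noise plus theta*sin(omega*n + phase) for n >= n_on."""
    out = []
    for n in range(n_total):
        x = rng.gauss(0.0, 1.0)
        if n >= n_on:
            # The disturbance keeps a single coherent phase once it turns on.
            x += theta * math.sin(omega * n + phase)
        out.append(x)
    return out

# Same noise realisation with and without the disturbance (same seed).
noise_only = tonal_burst(500, 250, 0.0, 0.3, 0.7, random.Random(1))
with_tone = tonal_burst(500, 250, 1.0, 0.3, 0.7, random.Random(1))
print("sample 300 without and with the tone:", noise_only[300], with_tone[300])
```

At a strength comparable to the noise standard deviation the onset is hard to see by eye in a single record, which is the regime the analysis concentrates on.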

The quickest detection of a disturbance that is itself a function of unknown parameters is not a well-understood problem. Generalized likelihood ratio (GLR) techniques are appealing, but the resulting structures are not well suited to implementation: Estimation of the frequency and phase is, of course, feasible, but the number of searches over possible starting times makes such an approach prohibitive. In this paper, we seek a scheme that both performs well and is tractable; Page’s test satisfies the latter requirement, and it will be shown that the former is also met.

It is not immediately apparent how a CUSUM test is to be applied in the case at hand since the “setting to zero” action implies “forgetting” of previous data, and this conflicts with the strongly autocorrelated nature of the disturbance. As such, we modify the model to

(13)

with the phases i.i.d. uniform variates. Essentially, this is a block-stationary scenario, with the blocks uncorrelated. The reader may be uncomfortable with the modified model of (13), and we shall here attempt to address this in a question-and-answer format.

Objection: The “decimation” of the disturbance starting time into blocks of length $N$ will increase the reporting delay.

Response: This is true, but the increase in delay will, in general, be only a fraction of a block length. Since the signal strength is small, this is a minor effect.

Objection: The phases are not uniform random variates. Any scheme based on this assumption will perform poorly in the true model (12).

Response: This turns out to be untrue, as will be shown.

Objection: Even with independent phases, the blocks are not independent (for ), being coupled through the frequency.

Response: This is true; however, the blocks are uncorrelated, and this, given our small-signal scenario, is sufficient.

Objection: By making the blocks uncorrelated, a great deal of phase information is lost. As such, whatever scheme is proposed will be grossly suboptimal.

Response: It is unfortunate that the CUSUM structure does not allow for memory, as could be applied to parameter estimation. However, it will be shown that by increasing the block length, phase coherence may be recovered and exploited; this is at the expense of computation and of robustness to modulated (i.e., not single-tone) disturbances.

It is difficult to be persuasive in challenging the fourth objection: There may be a way to detect tonal disturbances more quickly than that to be proposed. Autoregressive (AR) and/or eigenstructure methods come to mind; however, the computational difficulties of knowing when to start the estimation procedure, and the practical difficulties of implementing such schemes when the disturbance power is very small, argue against such approaches.
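The block-stationary model discussed above can be simulated directly, which is useful for comparing it with the phase-coherent model. The sketch below is an assumption-laden illustration: it draws a fresh uniform phase for every block, restarts the time index inside each block, and turns the sinusoid on at a block boundary. The name `block_model` and all parameter values are hypothetical.

```python
import math
import random

def block_model(n_blocks, block_len, first_signal_block, theta, omega, rng):
    """Blocks of unit Gaussian noise; from first_signal_block onward each block
    also carries a sinusoid whose phase is drawn independently and uniformly."""
    blocks = []
    for k in range(n_blocks):
        phase = rng.uniform(0.0, 2.0 * math.pi)  # independent phase per block
        block = []
        for i in range(block_len):
            x = rng.gauss(0.0, 1.0)
            if k >= first_signal_block:
                x += theta * math.sin(omega * i + phase)
            block.append(x)
        blocks.append(block)
    return blocks

quiet = block_model(20, 8, 10, 0.0, 0.9, random.Random(2))
burst = block_model(20, 8, 10, 0.5, 0.9, random.Random(2))
print("first sample of block 10, quiet vs burst:", quiet[10][0], burst[10][0])
```

Feeding the same detector with data from this generator and from a phase-coherent generator is one way to repeat the kind of comparison the paper reports between independent and constant phase updates.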

B. Local Performance of Page’s Test in the Stochastic Signal Case

With reference to (13), the problem we are interested in is of the form

(14)

where it is [analogous to of (13)], which defines the onset of a “block-disturbance.” In the case at hand, is the th block of data, and the th element of is ; however, since for the results of this section to hold all that is necessary is that , we shall be general.

The test whose performance we are investigating is

declare detection (15)

Apart from the obvious block processing, as previously mentioned, the noteworthy aspect of this test is the dependence of the update. The reason for this is that Page’s test requires

(16)

that is, that the test statistic be negatively “biased” before the disturbance and positively afterwards; this is not possible for arbitrarily small unless we have , which is a function of (Note that with , a log-likelihood ratio (16) is automatically satisfied, and an excellent discussion of bias in Page’s test is available in [13].)

It is shown in Appendix A that for the problem of (14), we have
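The bias requirement can be illustrated with a block-wise CUSUM in which a constant is subtracted from a non-negative block statistic, so that the update drifts downward before the disturbance and upward after it. This is a sketch under stated assumptions, not the test of (15): the block energy is used only because it is simple (it is not the locally optimal statistic), and `block_page_test`, `energy`, the bias, and the threshold are hypothetical choices.

```python
def energy(block):
    """Sum of squares of a block; a stand-in statistic for illustration."""
    return sum(x * x for x in block)

def block_page_test(blocks, statistic, bias, h):
    """Block-wise CUSUM: add (statistic - bias) per block, reset at zero.

    Returns the 1-based index of the block at which the sum reaches h,
    or None if it never does.
    """
    s = 0.0
    for k, block in enumerate(blocks, start=1):
        s = max(0.0, s + statistic(block) - bias)
        if s >= h:
            return k
    return None

# Toy data: one weak block followed by stronger ones. The bias sits between
# the weak-block energy (2) and the strong-block energy (8).
toy_blocks = [[1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]
print("alarm block:", block_page_test(toy_blocks, energy, bias=3.0, h=8.0))
```

If the bias is chosen too small the sum drifts upward on noise alone and false alarms become frequent; if it is chosen too large a weak disturbance never produces a positive drift, which is why the bias has to depend on the assumed signal level.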

(17)

and

(18)

where the derivatives are with respect to , and the integrals are dimensional. Treating the integrals in (18) as variables and differentiating, we get that performance (in terms of efficacy) is best when

(19)

which, if achievable, gives

Var

(20)

where is the familiar second-derivative efficacy [11]. The resulting Taylor approximation

(21)

will be shown to be reasonable for small (and even moderate) signal levels.

It is shown in Appendix B that we must also have

(22)

Any dependence of that satisfies (19) and (22) gives the same small-signal performance; from here on, we shall assume

(23)

for both notational and implementational simplicity. It is reasonable to assume that over the range of in which we are interested, increases with , that is, the stronger the disturbance, the stronger the reaction provoked in the test statistic. (Although this must be true for between 0 and some upper bound, this bound can be finite; a case in point is the test statistic designed for Cauchy. It is not our intention here to make general statements about when this does and does not happen, and hence, we shall content ourselves with our assumption.) At any rate, using this assumption and (23), we have

(24)

and

(25)

and, hence, (16) is satisfied. The reader may object that is, for practical purposes, unknown; for example, if a test is designed for and in fact , then we have , which is a poor test indeed. However, specification of a fixed value as a “minimum detectable level” allows use of for which

(26)

and for any

(27)

Note that substitution of these into (7) and (8) indicates testing using when, in fact, results in an unchanged $T$ but a reduced $D$.

C. The Locally Optimal Test

Apart from the intrinsic utility of the performance measure resulting from the relationship between and , its message is that the optimal (small-signal) choice for is that which maximizes the efficacy, that is, the locally optimal test statistic for a sinusoid of unknown frequency and phase in white noise. For this case, we have

(28)

where is the second derivative efficacy of a locally optimal (fixed length) test, and

(29)

where

(30)

While this seems clear and simple, there is a problem. If the unknown frequency is assumed to be uniformly distributed on , then is the energy detector, which is a trivial solution that takes no account of the useful dependency structure of the signal. If we instead assume that the frequency is confined to the $k$th of $M$ narrow bands, i.e.,

(31)

then a more interesting structure results. This is at the cost of interrogating only a single band of frequencies; we shall discuss the “total-system” implementation shortly.

Using (13), (30), and (31), the locally optimal test may be written as [9], [10]

(32)

where

(33)

with and representing the upper and lower bounds, respectively, which are shown above for At this point, any background density may be substituted. For example, in Laplace noise [11] with variance , we have

(34)

For Gaussian noise with zero mean and variance (the main focus of this study), (32) may be written as

(35)

A compact matrix representation for this is

(36)

with the matrix defined as

(37)

Note that and that the and subscripts have been removed for brevity. For this statistic and noise with variance , we have (see Appendix C)

(38)

As a comparison, the $N$-point magnitude-squared discrete Fourier transform (DFT) has efficacy [9], [10]

(39)

To illustrate this, Fig. 2 shows efficacy plots for the locally optimal detector and the DFT. One can see that on a single-band basis, these are practically identical, which implies near optimality of the DFT for this application.
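The closeness of a band-limited quadratic form to the DFT can be checked numerically. The sketch below does not reproduce the matrix of (37), whose entries are not available here. Instead it assumes a quadratic form whose kernel is the band average of $\cos(\omega(i-j))$, which is the covariance structure of a random-phase sinusoid with frequency uniform on the band, and leaves out any scaling and bias. For a vanishingly narrow band this quadratic form equals the squared magnitude of the Fourier transform at the band centre exactly; `band_matrix`, `quad_form`, and `periodogram` are hypothetical helper names.

```python
import cmath
import math

def band_matrix(n, w_lo, w_hi):
    """Kernel q[i][j] = average of cos(w*(i-j)) for w uniform on [w_lo, w_hi]."""
    q = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = i - j
            if d != 0:
                q[i][j] = (math.sin(w_hi * d) - math.sin(w_lo * d)) / ((w_hi - w_lo) * d)
    return q

def quad_form(x, q):
    """Quadratic form x^T q x."""
    n = len(x)
    return sum(x[i] * q[i][j] * x[j] for i in range(n) for j in range(n))

def periodogram(x, w):
    """Squared magnitude of the discrete-time Fourier transform of x at w."""
    return abs(sum(xn * cmath.exp(-1j * w * n) for n, xn in enumerate(x))) ** 2

x = [0.3, -1.2, 0.8, 0.1, -0.5, 1.1, 0.4, -0.9]
w0 = 0.9
narrow = quad_form(x, band_matrix(len(x), w0 - 5e-7, w0 + 5e-7))
wide = quad_form(x, band_matrix(len(x), w0 - 0.2, w0 + 0.2))
print("narrow band:", narrow, "centre periodogram:", periodogram(x, w0))
print("wider band :", wide)
```

Widening the band replaces the single-frequency periodogram with its average over the band, which is one way to see why a detector designed for a band about one DFT bin wide behaves much like the DFT itself.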

It seems reasonable, at this point, to check the applicability of our results. Specifically, we have (10) relating to the root of the moment generating function (MGF) of , and we have used a small-signal assumption to approximate this. The approximation, from (82) of Appendix B, is

(40)

Fig. 2. Plots of the efficacy of the DFT and of the locally optimal (lo) detector.

which for the locally optimal reduces to

(41)

How good an approximation is this? Using the quadratic form shown in (36), we have

where The MGF of a quadratic form can be shown for the Gaussian case to be [12]

(42)

where

covariance matrix of ; mean vector; identity matrix.

Since we are interested in , we have ; in addition, note that if is a constant, then In solving for , we set the MGF to one and get

Substituting for , the equality

must be solved to obtain the desired root. With the eigenvalues of represented by , the above expression can be rewritten as

(43)

(Note that the MGF does not exist at all for [9].) Along with the approximation of (41), this, solved numerically, is plotted in Fig. 3 for $N = M = 8$ and $k = 3$. Note the agreement for In general, the larger the , the smaller this range is, but for most reasonable values of , the approximation appears good for and is usually acceptable even for larger

Fig. 3. Plot of approximate and exact values of the MGF root for N = M = 8 and k = 3 and for the locally optimal statistic.
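The numerical solution mentioned above can be sketched as follows. This is an illustration, not a reproduction of (43): it assumes a biased quadratic-form update $x^T Q x - b$ with zero-mean white Gaussian data of variance $\sigma^2$ before the disturbance, for which the standard result $\ln E[e^{t\,x^TQx}] = -\tfrac12\sum_i \ln(1 - 2t\sigma^2\lambda_i)$ holds whenever every factor is positive. The eigenvalues are taken as given inputs, and the function names and numbers are hypothetical.

```python
import math

def log_mgf_quadratic(t, eigenvalues, bias, var=1.0):
    """ln E[exp(t*(x'Qx - bias))] for white Gaussian x with the given variance.

    eigenvalues are those of Q; valid only while 1 - 2*t*var*lam > 0 for all lam.
    """
    return -t * bias - 0.5 * sum(math.log(1.0 - 2.0 * t * var * lam)
                                 for lam in eigenvalues)

def quadratic_root(eigenvalues, bias, var=1.0, rel_tol=1e-12):
    """Nonzero root of the log-MGF on (0, t_max) by bisection."""
    if bias <= var * sum(eigenvalues):
        raise ValueError("bias must exceed the no-signal mean of the quadratic form")
    t_max = 1.0 / (2.0 * var * max(eigenvalues))  # the MGF does not exist beyond this
    lo, hi = 0.0, t_max
    while hi - lo > rel_tol * t_max:
        mid = 0.5 * (lo + hi)
        if log_mgf_quadratic(mid, eigenvalues, bias, var) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

eigs = [1.0]   # hypothetical single unit eigenvalue
root = quadratic_root(eigs, bias=2.0)
print("root:", root, "log-MGF at root:", log_mgf_quadratic(root, eigs, 2.0))
```

The search interval is bounded above because the log-MGF diverges as any factor $1 - 2t\sigma^2\lambda_i$ approaches zero, which matches the remark that the MGF does not exist for all arguments.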

IV. RESULTS AND IMPLEMENTATION CONCERNS

To verify the small signal approximation of , simulations were run for several combinations of and Some of these results are shown in Table I for $M = N$ and Table II for $M \neq N$ (others are available in [9]). Note that represents the value predicted by (21); and are the empirical results. One can see that in all cases, the approximation holds quite well for large In addition, while the approximation is based on an assumption that the signal level is vanishingly small, one can see that reasonable levels of still provide the desired results. A set of these results is also shown graphically in Fig. 4; here, sample points of $D$ versus $T$ along with a solid line reflecting and a dashed line for are shown. Note that the approximation holds more closely as increases due to the approximation between (7), (8) and (4), (5); a similar effect was observed in [1]. The scenario here is of phase coherence, that is, the performance, as calculated under the assumed model of uncorrelated blocks, is very similar to that obtained in this more practical situation. To further explore this, several trials were performed for both independent blocks and constant phase blocks. These results are shown in Table III. Here, the average delay in detection is, for all practical purposes, identical for both cases. This is not surprising, as phase coherence is not being exploited.

Up to this point, the assumption has been that the signal to be detected was confined to a narrow band. A more practical problem is of detecting a sinusoid that may appear anywhere in the spectrum (i.e., on the interval ); to detect this signal, all bands must be interrogated. This raises the following question: What is the best choice of $N$ (the length of each block of data) and $M$ (the number of bands)? Figs. 5 and 6 show two different perspectives on this matter. Fig. 5 shows the bandshape of a single band as the value of $M$ is varied and $N$ held constant (note that the case of most closely resembles the FFT). Fig. 6 shows curves of efficacy, hence, efficiency versus $M$ for a single band for three different values of $N$ (8, 16, and 32). Two observations can be made: Performance is proportional to and levels off after This tells us that should be made as large as possible, with

However, care should be taken not to make the block length too large as it may prohibit a reasonable delay in detection.

Fig. 4. Plot of D versus T for N = M = 4, k = 2, and disturbance strength 0.8. Note that T and D are in units of blocks of length N and that the circles represent simulation results.

TABLE I
PAGE’S EFFICIENCY RESULTS FOR M = N. D AND T ARE IN UNITS OF BLOCKS

TABLE II
PAGE’S EFFICIENCY RESULTS FOR M ≠ N. D AND T ARE IN UNITS OF BLOCKS

TABLE III
TEST OF INDEPENDENT VERSUS CONSTANT PHASE UPDATES

Fig. 5. Efficacy versus ω for N = 64 and M = 4, 8, 16, and 64.

Fig. 6. Efficacy versus M for N = 8, 16, and 32.

To get an understanding for what the performance will be for the total system, one can consider how $T$ and $D$ will change as more bands are utilized. Since the signal to be detected is a sinusoid, it will, in general, be detected by the band that covers it (i.e., the one with the best performance for that frequency). Therefore, $D$ will change little as more bands are added. On the other hand, since the background noise is white, the mean time between false alarms will decrease (by a factor of approximately $M$) as more of the spectrum is covered. Based on this, the inequalities

(44)

(45)

TABLE IV
PERFORMANCE OF MULTI-BAND DETECTOR FOR N = 8

are reasonable. Here, the subscript represents the total system quantities, and the subscript represents the value averaged over the design band. Equality would result for the case of independent (nonoverlapping) bands of equal shape. Given the definition of Page’s efficiency shown in (9), the total system efficiency will be lower bounded by

(46)

The above shows us that the system efficiency will be lower operating over all frequencies than the single band detector operating over its design band. However, one can see that as , and thus , becomes large, the system performance will be approximately the same as for the single band detector.
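The reasoning behind the multiband bound can be put into numbers. The sketch below is one reading of the argument, not a reconstruction of (46): it assumes that running several bands divides the single-band mean time between false alarms by the number of bands while leaving the delay unchanged, and it evaluates the finite-threshold ratio of the log of the mean time between false alarms to the delay. The function name and the numerical values are hypothetical.

```python
import math

def system_efficiency_bound(t_band, d_band, n_bands):
    """Lower bound ln(T_band / n_bands) / D_band on the system-level ratio ln(T)/D.

    t_band and d_band are the single-band mean time between false alarms and
    average delay, both in the same units (for example, blocks).
    """
    if t_band <= n_bands:
        raise ValueError("the bound is only meaningful when T_band exceeds n_bands")
    return math.log(t_band / n_bands) / d_band

single = math.log(1e5) / 50.0                   # single-band ratio ln(T)/D
bound = system_efficiency_bound(1e5, 50.0, 8)   # eight bands, same delay
print("single band:", single, "system lower bound:", bound)

# The relative loss is ln(n_bands)/ln(T_band), so it shrinks as T_band grows.
for t_band in (1e3, 1e6, 1e12):
    print(t_band, system_efficiency_bound(t_band, 50.0, 8) / (math.log(t_band) / 50.0))
```

Because the penalty enters only through a logarithm, the multiband system approaches the single-band figure as the mean time between false alarms becomes large, which is the behaviour described in the text.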

To verify the relationship shown above, simulations were run for the case of $N = 8$ with and . In order to calculate and for each , the center band was used as the “representative” band. The expression (see [1]) and (38) were averaged over the band to obtain these two quantities. For each , thresholds were found such that all bands operated at the same false alarm rate and yielded a system false alarm rate of roughly $10^5$; this number was chosen because it should be sufficient to obtain the asymptotic result. Once again, as in previous simulations, a value of was used. Table IV shows the results of these simulations. The column is the approximation for using the lower bound shown in (46). One can see that the experimental results agree with this lower bound, thus providing a degree of confidence in the approximation.

Note that if one is using the FFT implementation of the DFT, an $N$-point transform results in unique bands from 0 to Observing Fig. 2, one can see that the FFT bands actually correspond to alternating bands of the locally optimal detector. The above (that is best) tells us that this will not quite provide optimum performance. The difference is small in , but for large , the logarithmic relationship can engender a significant deviation in The FFT’s performance can be maximized by the use of two FFT’s: one on the original data and the second on the data bandshifted (modulated) by This will effectively “fill in the gaps” between bands and provide performance similar to the locally optimal detector with

For those interested in implementing such a system, there are a number of practical concerns.
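The two-transform idea can be sketched with a direct DFT. The example assumes that the band shift is half a bin width, as the conclusion of the paper states, and implements it by modulating each block with $e^{-j\pi n/N}$ before the second transform; the helper names are hypothetical, and a direct $O(N^2)$ DFT stands in for an FFT routine to keep the sketch dependency-free.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT (an FFT would return the same values)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def half_bin_shifted_dft(x):
    """DFT of the block after modulation by exp(-j*pi*m/N): bins at (k + 1/2)."""
    n = len(x)
    return dft([x[m] * cmath.exp(-1j * math.pi * m / n) for m in range(n)])

def dtft(x, w):
    """Discrete-time Fourier transform of the block at frequency w (rad/sample)."""
    return sum(xm * cmath.exp(-1j * w * m) for m, xm in enumerate(x))

block = [0.3, -1.2, 0.8, 0.1, -0.5, 1.1, 0.4, -0.9]
plain = [abs(v) ** 2 for v in dft(block)]
shifted = [abs(v) ** 2 for v in half_bin_shifted_dft(block)]

# Interleave the two sets of band outputs: centres at 2*pi*k/N and 2*pi*(k+0.5)/N.
interleaved = [p for pair in zip(plain, shifted) for p in pair]
print(len(interleaved), "band outputs from two", len(block), "point transforms")
```

Each shifted bin evaluates the spectrum exactly halfway between two ordinary bins, so a tone that falls between the original bin centres lands near the centre of a shifted bin instead.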

Choice of $M$: The parameter $M$ is the number of bands being tested. It has been shown that ideally, we should have

Choice of $N$: There is no clear answer here. In terms of number of samples (rather than in terms of number of blocks, which most of the previous has dealt with), we have

The blockwise efficiency is roughly proportional to ; the decrease in due to the greater number of bands tested is not a “first-order” effect. The implication is that, for example, doubling $N$ results in a halving of the average delay to detection; this is due to the better use made of phase coherence in larger blocks. It would therefore appear that $N$ should be chosen as large as possible. Conversely, however, a large choice of $N$ will certainly increase the computational load and may also decrease robustness to disturbances that are narrowband rather than sinusoidal. $N$ is a design parameter.

Choice of Bias: The negative bias [as given by (29)] is a function of , and the signal-to-noise ratio is, in general, unknown. The recommendation is to choose a minimum detectable level and design the bias based on this value. A threshold thereby set will (naturally) specify the mean time between false alarms regardless of the actual , and the mean time to detection will decrease from its predicted value [see (8)] when

Choice of : We have for the most part assumed Gaussian noise, which has implied a quadratic-form update. There is no need for this: Equation (32) allows for an arbitrary noise density. In practice, replacing by an amplifier-limiter in (32) would offer robustness against impulsive interference with little degradation in performance and would most likely be the implementation of choice.

Choice of Threshold: Equation (7) gives the threshold explicitly, if approximately; if a locally optimal statistic is used and the bias is set using , then a reasonable approximation here is

V. CONCLUSION

The goal of this work was to develop a detection scheme to detect, as quickly as possible, the occurrence of a sinusoidal signal in a noisy background. The logical choice for a “quick” detector was Page’s test, and based on a small signal assumption, a locally optimal test statistic was utilized. In addition to this, the issue of performance was addressed. Page’s efficiency was introduced, and a small-signal approximation was obtained. This, in fact, turned out to be directly related to the efficacy measure used in standard detection theory.

While obtaining the small-signal approximation for , it was observed that as the signal level tended to zero, so did , along with its first three derivatives. The intention was that the approximation for would hold for signals of moderate amplitude; this, indeed, was the case. For all of the trials conducted, the experimental was quite close to the theoretical calculation. In addition, for applicable cases, a large mean time between false alarms was accompanied by a reasonable average delay in detection; this is desirable.

Given the type of signal (a sinusoid), one natural comparison for this detector existed: the discrete Fourier transform (DFT). While not necessarily the “optimal” choice, the DFT is often employed in similar applications. One reason for this is the existence of the fast Fourier transform (FFT), which is a very efficient implementation of the DFT.

When efficacy (hence, Page’s efficiency) plots were compared for the DFT and locally optimal detector, the performance was seen to be almost identical. This is important because it implies optimality for the DFT for this application. The equivalency holds when the locally optimal detector band was chosen to coincide with a DFT bin. One advantage was pointed out for the locally optimal detector; the number of bands and bandshape is user definable and independent of the block size. This is not the case for the DFT, especially when the FFT implementation is utilized.

After observing the case of one band, an attempt was madeto develop a total system approach by implementing the locallyoptimal detector on several bands across the frequency spec-trum. The signal was also allowed to take on any frequency.Some key observations were made. First, for a givencombination, the system efficiency approaches that of a singleband operating over its design frequencies. Next, no gain isto be had by increasing to a value greater than Thebottom line conclusion is that is probably the bestchoice from the standpoint of maximizing the efficiency. Itwas also shown that while larger values of yield betterperformance, constraints on may prevent one from usingblock sizes that are too big.

Finally, while the locally optimal detector with seems to be the best choice, the FFT is still an appealing algorithm due to its rapid implementation. However, the -point FFT only gives unique bands. A modified FFT scheme can be used to obtain the desired performance. This involves performing two -point FFT’s on each block, where one is bandshifted by half a binwidth, thereby obtaining the same frequency coverage.
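The half-bin shift can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, a direct DFT stands in for the FFT so that only the Python standard library is needed, and it is assumed that each transform contributes the bins below half the sampling rate. Multiplying sample k of a block of length N by exp(-jπk/N) before transforming moves every bin centre up by half a bin width.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT; a stand-in for an FFT on short illustrative blocks."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def half_bin_shifted_dft(x):
    """DFT evaluated at frequencies (m + 1/2)/N instead of m/N.

    Sample k is multiplied by exp(-j*pi*k/N) before the transform, which
    moves every bin centre up by half a bin width.
    """
    n = len(x)
    shifted = [x[k] * cmath.exp(-1j * math.pi * k / n) for k in range(n)]
    return dft(shifted)


def band_magnitudes(x):
    """Interleave plain and shifted bins below half the sampling rate.

    Returns (centre frequency in cycles per sample, magnitude) pairs,
    spaced half a bin apart. Keeping N // 2 bins per transform is an
    assumption made for this illustration.
    """
    n = len(x)
    plain, shifted = dft(x), half_bin_shifted_dft(x)
    bands = []
    for m in range(n // 2):
        bands.append((m / n, abs(plain[m])))
        bands.append(((m + 0.5) / n, abs(shifted[m])))
    return bands


# A tone exactly halfway between bins 3 and 4 of a 16-point block.
N = 16
tone = [math.cos(2 * math.pi * 3.5 / N * k + 0.3) for k in range(N)]
peak_freq, peak_mag = max(band_magnitudes(tone), key=lambda b: b[1])
plain_peak = max(abs(v) for v in dft(tone)[: N // 2])
```

For this tone the largest plain bin is noticeably smaller than the shifted bin centred on the tone, which is the scalloping loss that the second, bandshifted transform is meant to recover.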

APPENDIX A
THE AVERAGE DELAY APPROXIMATION

In this Appendix, we briefly develop the approximations for the average delay to detection and to false alarm for Page’s test. The expressions are reasonably well known; this discussion follows [3].

In the CUSUM procedure described in (1), each time the statistic drops below 0, it is reset to zero. Let us temporarily revise this such that each time the statistic drops below a negative value, it is reset to zero. Page’s procedure can then be looked on as a sequence of Wald tests (sequential detection problems with upper threshold and lower threshold), only the last of which ends with a crossing of the upper threshold. We define

Pr(any Wald test ends at its lower threshold);
number of samples in any Wald test ending above;
number of samples in any Wald test ending below;
number of samples in any Wald test;
number of samples in Page’s test.
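The view of Page's procedure as a sequence of Wald tests can be made concrete with a short sketch. It is an illustration only: the function name `page_test`, the threshold value, and the Gaussian data are assumptions, and the lower threshold is taken to be zero as in the unmodified procedure. Every reset to zero closes one Wald test at its lower threshold, and the final Wald test is the one that crosses the upper threshold.

```python
import random


def page_test(samples, threshold):
    """Run Page's CUSUM recursion with reset at zero.

    Returns (stopping_index, wald_tests_ended_low): the 1-based sample
    index at which the statistic first reaches `threshold` (None if it
    never does), and the number of resets to zero before that point.
    Each reset closes one Wald test at its lower threshold.
    """
    stat = 0.0
    resets = 0
    for n, g in enumerate(samples, start=1):
        stat += g
        if stat >= threshold:
            return n, resets
        if stat < 0.0:
            stat = 0.0
            resets += 1
    return None, resets


# Hypothetical data: negative drift before the change, positive drift after.
rng = random.Random(0)
change_at = 200
data = ([rng.gauss(-0.5, 1.0) for _ in range(change_at)]
        + [rng.gauss(0.5, 1.0) for _ in range(300)])
stop, resets = page_test(data, threshold=8.0)
```

The per-sample inputs here play the role of the update terms in the CUSUM recursion; in the paper's setting they would come from the detector nonlinearity applied to each block of data.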

Since the number of Wald tests ending with a lower threshold crossing prior to the first upper threshold crossing is a geometric random variable with parameter, we have

(47)

where the approximation follows from standard average sample number analysis for sequential testing [6]. Now, consider Wald’s identity [6]

(48)

in which

(49)

is the moment-generating function of a single sample of data. Wald’s identity applies for any value of; here, we take the particular value such that , meaning that is a “root” of the moment-generating function. If the root exists, then we may substitute Wald’s approximation (basically that each sequential test ends at exactly its threshold) into Wald’s identity evaluated at this value and get

(50)

This may be solved for and substituted into (47) to yield

(51)

We take the limit as (i.e., we are returning to the proper Page test scenario) and get (4) and (5). We note that the accuracy of Wald’s approximation for very small is suspect; a more accurate analysis (and more complicated expression) is available in [14].
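The steps of this Appendix can be sketched numerically. The sketch uses an assumed notation that may differ from that of (4) and (5): the moment-generating function is taken as M(t) = E[exp(t g)], t0 is its nonzero root, h is the upper threshold, and the per-sample update g is taken to be Gaussian purely for illustration. Substituting Wald's approximation into Wald's identity at t0, dividing the average sample number of one Wald test by the probability of ending at the upper threshold, and letting the lower threshold tend to zero gives (h - (exp(t0 h) - 1)/t0)/E[g] under these assumptions.

```python
import math


def mgf_root_gaussian(mean, var):
    """Nonzero root t0 of M(t) = exp(mean*t + var*t*t/2) = 1 (Gaussian g)."""
    return -2.0 * mean / var


def mgf_root_bisect(mgf, lo, hi, iters=200):
    """Bisection for a root of mgf(t) - 1 on [lo, hi].

    The bracket must exclude t = 0 (always a root) and mgf(t) - 1 must
    change sign across it.
    """
    f_lo = mgf(lo) - 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = mgf(mid) - 1.0
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def page_avg_samples(mean, t0, h):
    """Wald-approximation average number of samples for Page's test.

    Assumed convention: M(t) = E[exp(t*g)], M(t0) = 1 with t0 != 0,
    upper threshold h, lower threshold taken to zero in the limit.
    """
    return (h - (math.exp(t0 * h) - 1.0) / t0) / mean


# Hypothetical operating point: drift -0.5 before the change, +0.5 after.
h = 8.0
t0_before = mgf_root_gaussian(-0.5, 1.0)   # positive root under negative drift
t0_after = mgf_root_gaussian(0.5, 1.0)     # negative root under positive drift
mean_time_between_false_alarms = page_avg_samples(-0.5, t0_before, h)
mean_delay_to_detection = page_avg_samples(0.5, t0_after, h)
```

The positive root before the change makes the false-alarm time grow roughly exponentially in h, while the negative root after the change leaves a delay that is close to linear in h. Because overshoot is ignored, these numbers should be read as rough guides rather than exact run lengths.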

APPENDIX B
DERIVATION OF THE EFFICACY RELATIONSHIP

In this Appendix, we explore the small signal performance of the detector specified in (15). Specifically, we wish to approximate

(52)

for small , where we write

(53)

with integration (expectation) assumed over all random variables. The notation is somewhat confusing but is unfortunately necessary. refers to the density of after a disturbance of strength occurs (and, naturally, may be used before the disturbance), whereas refers to the test statistic used to detect a disturbance of strength As such, refers to the root of the MGF of when the actual value of is

At any rate, we must assume that

(54)

(55)

for positive ; these are the conditions necessary for Page’s test to work (that the statistic be “biased” negatively and positively, respectively, before and after the disturbance occurs). Naturally, these also imply that

(56)

(57)

(58)

where here (as later), the prime indicates differentiation with respect to Now, since

(59)

we have, since , that

(60)

(61)

Equations (58), (60), and (61), coupled with the straightforward consequence of (58) that

(62)

give us all we need to proceed.

Since , we must examine its derivatives, as in [1] and [2]. Beginning with the first

(63)

Based on the conditions stated above

(64)

Now, the second derivative

(65)

Taking the limit and applying the initial conditions also results in

(66)

At this point, one can see a difference between this and the result obtained in [1] and [2]; here, the second derivative of does not result in a useful quantity. Continuing with the third derivative

(67)

and applying the limit and conditions

(68)

Now, we are required to find the quantity This can be accomplished by differentiating the MGF identity (6) under the null hypothesis

(69)

Differentiating once, we get

(70)

Taking the limit, applying the initial conditions, and given the fact that , we get

(71)

This provides no useful information about Differentiating (69) a second time results in

(72)

Now, applying the limit and initial conditions, we have

(73)

or

(74)

Given this, we have

(75)

Differentiating for the fourth time

(76)

Taking the limit and applying all conditions

(77)

Now, we are faced with the task of obtaining Differentiating (69) a third time results in

(78)


Taking the limit once again results in

(79)

Differentiating (69) for the fourth time

(80)

Now, letting , applying all conditions, and replacing the expectation with an integral results in

(81)

which has the nonzero solution

(82)

Substituting this expression into (76) yields

(83)

APPENDIX C
EFFICACY OF THE LOCALLY OPTIMAL DETECTOR

The following equation will be used for evaluating the efficacy of the detector [11]:

Var(84)

Note that the numerator expectation contains an This is there to show that the efficacy, hence, the performance, will be represented as a function of frequency.

The locally optimal test statistic for the white Gaussian noise case is shown in (33) and (35) of Section III-C. The expected value of can be found with

(85)

where subscripts and have been dropped from [see (33)] for ease of notation.

Continuing

(86)

(87)

At this time, it is appropriate to make the following replacements. If

then

Focusing on the double sum of (85), it is straightforward to show that for each , there are identical terms. Therefore, the double sum can be replaced with a single sum. Equation (85) now becomes
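The reduction of a double sum to a single sum can be checked numerically. The sketch assumes that the summand depends only on the difference of the two indices, in which case a lag k occurs n - |k| times in an n-by-n double sum; the summand `f` below is an arbitrary hypothetical choice and is not the term appearing in (85).

```python
import math


def double_sum(f, n):
    """Sum f(i - j) over all index pairs 0 <= i, j < n."""
    return sum(f(i - j) for i in range(n) for j in range(n))


def single_sum(f, n):
    """Same total, grouping the n - |k| identical terms at each lag k."""
    return sum((n - abs(k)) * f(k) for k in range(-(n - 1), n))


def f(k):
    """Hypothetical summand that depends only on the lag k."""
    return math.cos(0.7 * k) / (1.0 + k * k)


n = 12
difference = abs(double_sum(f, n) - single_sum(f, n))
```

The grouped form needs 2n - 1 evaluations of the summand instead of n squared, which is what makes the single-sum expression convenient for the expectation and variance calculations that follow.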

(88)

In order to make the variance calculation easier, a modified will be used.

It is easy to show that

Var

Additionally

(89)

where

Continuing

(90)

Careful inspection will show that the second expectation term is equal to zero for any combination of and The first expectation can take on two values:

( terms)
( terms).


The number of terms is equal to the number of occurrences of that value in the double sum. This yields

(91)

The last expectation term can be shown to be nonzero only for the cases of

and or and

In both cases

Given the above conditions, one can see that for any such that the expectation is nonzero

For any value of , there are terms. This then allows us to reduce the quadruple sum of (90) with

(92)

Given (90)–(92)

which, along with (89), gives

Var

(93)

This, along with (84) and (88), produces

(94)

An analysis for statistics of the form

(95)

is a straightforward extension of the above and has been omitted in the interests of clarity and brevity.

REFERENCES

[1] B. Broder, “Quickest detection procedures and transient signal detection,” Ph.D. dissertation, Princeton Univ., Princeton, NJ, 1990.

[2] B. Broder and S. C. Schwartz, “The performance of Page’s test and nonparametric quickest detection,” in Proc. 1989 Conf. Inform. Sci. Syst., Johns Hopkins Univ., Baltimore, MD, Mar. 1989, pp. 34–39.

[3] B. Broder and S. C. Schwartz, “The performance of the generalized CUSUM test in the change-point problem,” Tech. Rep., Princeton Univ., Princeton, NJ, 1989.

[4] E. S. Page, “Continuous inspection schemes,” Biometrika, vol. 41, pp. 100–114, 1954.

[5] E. S. Page, “An improvement to Wald’s approximations for some properties of sequential tests,” J. Royal Statist. Soc., vol. B-16, no. 1, pp. 136–139, 1954.

[6] A. Wald, Sequential Analysis. New York: Wiley, 1947.

[7] H. V. Poor, An Introduction to Signal Detection and Estimation. New York: Springer-Verlag, 1988.

[8] B. K. Ghosh, Sequential Tests of Statistical Hypotheses. Reading, MA: Addison-Wesley, 1970.

[9] R. J. Stahl, “An application of Page’s test for detecting a sinusoidal disturbance of unknown frequency and phase,” Master’s thesis, Univ. Connecticut, Storrs, 1991.

[10] R. J. Stahl and P. K. Willett, “Quickest detection of an unknown sinusoid,” in Proc. 1992 Conf. Inform. Sci. Syst., Princeton Univ., Princeton, NJ, Mar. 1992, pp. 190–195.

[11] S. A. Kassam, Signal Detection in Non-Gaussian Noise. New York: Springer-Verlag, 1989.

[12] S. R. Searle, Linear Models. New York: Wiley, 1971.

[13] D. Abraham, “Asymptotically optimal bias for a general nonlinearity in Page’s test,” IEEE Trans. Aerosp. Electron. Syst., pp. 1–8, Jan. 1996.

[14] D. Siegmund, Sequential Analysis. New York: Springer-Verlag, 1985.

Robert J. Stahl was born in New York, NY, on August 7, 1961. He received the B.S. degree from the State University of New York, Buffalo, in 1984 and the M.Sc. degree from the University of Connecticut, Storrs, in 1991, both in electrical engineering.

From 1984 to 1996, he worked as an engineer for the Naval Undersea Warfare Center, New London, CT. His main technical focus included the development of signal processing, detection, estimation, and classification algorithms for real-time implementation in submarine sonar systems. He is currently employed by the Atlantic Aerospace Electronics Corporation, Waltham, MA, developing signal and image processing algorithms.

Peter K. Willett received the B.A.Sc. degree in 1982 from the University of Toronto, Toronto, Ont., Canada, and the Ph.D. degree from Princeton University, Princeton, NJ, in 1986.

He is currently an associate professor at the University of Connecticut, Storrs, where he has worked since 1986. His interests are generally in the areas of detection theory and signal processing.