SPECTRAL ANALYSIS OF SIGNALS
The Missing Data Case
Copyright 2005 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.
Spectral Analysis of Signals, The Missing Data Case
Yanwei Wang, Jian Li, and Petre Stoica
www.morganclaypool.com
ISBN: 1598290002
Library of Congress Cataloging-in-Publication Data
First Edition
10 9 8 7 6 5 4 3 2 1
Printed in the United States of America
Yanwei Wang
Diagnostic Ultrasound Corporation, Bothell, WA 98021

Jian Li
Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA

Petre Stoica
Department of Information Technology, Division of Systems and Control, Uppsala University, Uppsala, Sweden

Morgan & Claypool Publishers
ABSTRACT
Spectral estimation is important in many fields including astronomy, meteorology,
seismology, communications, economics, speech analysis, medical imaging, radar,
sonar, and underwater acoustics. Most existing spectral estimation algorithms are
devised for uniformly sampled complete-data sequences. However, the spectral
estimation for data sequences with missing samples is also important in many applications ranging from astronomical time series analysis to synthetic aperture radar
imaging with angular diversity. For spectral estimation in the missing-data case,
the challenge is how to extend the existing spectral estimation techniques to deal
with these missing-data samples. Recently, nonparametric adaptive filtering based
techniques have been developed successfully for various missing-data problems.
Collectively, these algorithms provide a comprehensive toolset for the missing-data
problem based exclusively on the nonparametric adaptive filter-bank approaches,
which are robust and accurate, and can provide high resolution and low sidelobes.
In this lecture, we present these algorithms for both one-dimensional and two-dimensional spectral estimation problems.
KEYWORDS
Adaptive filter-bank, APES (amplitude and phase estimation), missing data, nonparametric methods, spectral estimation
Contents

1. Introduction . . . . 1
   1.1 Complete-Data Case . . . . 1
   1.2 Missing-Data Case . . . . 2
   1.3 Summary . . . . 3
2. APES for Complete-Data Spectral Estimation . . . . 5
   2.1 Introduction . . . . 5
   2.2 Problem Formulation . . . . 6
   2.3 Forward-Only APES Estimator . . . . 6
   2.4 Two-Step Filtering-Based Interpretation . . . . 8
   2.5 Forward-Backward Averaging . . . . 8
   2.6 Fast Implementation . . . . 11
3. Gapped-Data APES . . . . 13
   3.1 Introduction . . . . 13
   3.2 GAPES . . . . 14
       3.2.1 Initial Estimates via APES . . . . 14
       3.2.2 Data Interpolation . . . . 15
       3.2.3 Summary of GAPES . . . . 17
   3.3 Two-Dimensional GAPES . . . . 18
       3.3.1 Two-Dimensional APES Filter . . . . 18
       3.3.2 Two-Dimensional GAPES . . . . 21
   3.4 Numerical Examples . . . . 24
       3.4.1 One-Dimensional Example . . . . 25
       3.4.2 Two-Dimensional Examples . . . . 25
4. Maximum Likelihood Fitting Interpretation of APES . . . . 31
   4.1 Introduction . . . . 31
   4.2 ML Fitting Based Spectral Estimator . . . . 31
   4.3 Remarks on the ML Fitting Criterion . . . . 33
5. One-Dimensional Missing-Data APES via Expectation Maximization . . . . 35
   5.1 Introduction . . . . 35
   5.2 EM for Missing-Data Spectral Estimation . . . . 36
   5.3 MAPES-EM1 . . . . 37
   5.4 MAPES-EM2 . . . . 41
   5.5 Aspects of Interest . . . . 45
       5.5.1 Some Insights into the MAPES-EM Algorithms . . . . 45
       5.5.2 MAPES-EM1 versus MAPES-EM2 . . . . 46
       5.5.3 Missing-Sample Estimation . . . . 46
       5.5.4 Initialization . . . . 47
       5.5.5 Stopping Criterion . . . . 47
   5.6 MAPES Compared with GAPES . . . . 47
   5.7 Numerical Examples . . . . 48
6. Two-Dimensional MAPES via Expectation Maximization and Cyclic Maximization . . . . 61
   6.1 Introduction . . . . 61
   6.2 Two-Dimensional ML-Based APES . . . . 62
   6.3 Two-Dimensional MAPES via EM . . . . 64
       6.3.1 Two-Dimensional MAPES-EM1 . . . . 64
       6.3.2 Two-Dimensional MAPES-EM2 . . . . 68
   6.4 Two-Dimensional MAPES via CM . . . . 72
   6.5 MAPES-EM versus MAPES-CM . . . . 74
   6.6 Numerical Examples . . . . 75
       6.6.1 Convergence Speed . . . . 76
       6.6.2 Performance Study . . . . 78
       6.6.3 Synthetic Aperture Radar Imaging Applications . . . . 82
7. Conclusions and Software . . . . 87
   7.1 Concluding Remarks . . . . 87
   7.2 Online Software . . . . 88
References . . . . 91
The Authors . . . . 97
Preface
This lecture considers the spectral estimation problem in the case where some of
the data samples are missing. The challenge is how to extend the existing spectral
estimation techniques to deal with these missing-data samples. Recently, nonparametric adaptive filtering based techniques have been developed successfully for
various missing-data spectral estimation problems. Collectively, these algorithms
provide a comprehensive toolset for the missing-data problem based exclusively on
the nonparametric adaptive filter-bank approaches. They provide the main topic
of this book.
The authors would like to acknowledge the contributions of several other
people and organizations to the completion of this lecture. We are grateful to our
collaborators on this topic, including Erik G. Larsson, Hongbin Li, and Thomas
L. Marzetta, for their excellent work and support. In particular, we thank Erik G.
Larsson for providing us the Matlab codes that implement the two-dimensional
GAPES algorithm. Most of the topics described here are outgrowths of our research
programs in spectral analysis. We would like to thank those who supported our
research in this area: the National Science Foundation, the Swedish Science Council
(VR), and the Swedish Foundation for International Cooperation in Research and
Higher Education (STINT). We also wish to thank Jose M. F. Moura for inviting
us to write this lecture and Joel Claypool for publishing our work.
List of Abbreviations
1-D one-dimensional
2-D two-dimensional
APES amplitude and phase estimation
AR autoregressive
ARMA autoregressive moving-average
CAD computer aided design
CM cyclic maximization
DFT discrete Fourier transform
EM expectation maximization
FFT fast Fourier transform
FIR finite impulse response
GAPES gapped-data amplitude and phase estimation
LS least squares
MAPES missing-data amplitude and phase estimation
MAPES-CM missing-data amplitude and phase estimation via cyclic maximization
MAPES-EM missing-data amplitude and phase estimation via expectation maximization
ML maximum likelihood
RCF robust Capon filter-bank
RF radio frequency
RMSEs root mean-squared errors
SAR synthetic aperture radar
WFFT windowed fast Fourier transform
CHAPTER 1
Introduction
Spectral estimation is important in many fields including astronomy, meteorology,
seismology, communications, economics, speech analysis, medical imaging, radar,
and underwater acoustics. Most existing spectral estimation algorithms are devised
for uniformly sampled complete-data sequences. However, the spectral estimation
for data sequences with missing samples is also important in a wide range of applications. For example, sensor failure or outliers can lead to missing-data problems.
In astronomical, meteorological, or satellite-based applications, weather or other
conditions may disturb sample taking schemes (e.g., measurements are available
only during nighttime for astronomical applications), which will result in missing
or gapped data [1]. In synthetic aperture radar imaging, missing-sample problems arise when the synthetic aperture is gapped to reduce the radar resources needed for the high-resolution imaging of a scene [2-4]. For foliage- and ground-penetrating radar systems, certain radar operating frequency bands are reserved for applications such as aviation and cannot be used, or they are under strong electromagnetic or radio frequency interference [5, 6], so that the corresponding samples must be discarded, both resulting in missing data. Similar problems arise in data fusion via ultrawideband coherent processing [7].
1.1 COMPLETE-DATA CASE
For complete-data spectral estimation, extensive work has already been carried out
in the literature, see, e.g., [8]. The conventional discrete Fourier transform (DFT) or
fast Fourier transform based methods have been widely used for spectral estimation
tasks because of their robustness and high computational efficiency. However, they
suffer from low resolution and poor accuracy problems. Many advanced spectral
estimation methods have also been proposed, including parametric [9-11] and
nonparametric adaptive filtering based approaches [12, 13]. One problem associated
with the parametric methods is order selection. Even with properly selected order, it
is hard to compare parametric and nonparametric approaches since the parametric
methods (except [11]) do not provide complex amplitude estimation. In general,
the nonparametric approaches are less sensitive to data mismodelling than their
parametric counterparts. Moreover, the adaptive filter-bank based nonparametric
spectral estimators can provide high resolution, low sidelobes, and accurate spectral
estimates while retaining the robust nature of the nonparametric methods [14, 15].
These include the amplitude and phase estimation (APES) method [13] and the
Capon spectral estimator [12].
However, the complete-data spectral estimation methods do not work well
in the missing-data case when the missing data samples are simply set to zero. For
the DFT-based spectral estimators, setting the missing samples to zero corresponds
to multiplying the original data with a windowing function that assumes a value of
one whenever a sample is available, and zero otherwise. In the frequency domain,
the resulting spectrum is the convolution between the Fourier transform of the
complete data and that of the windowing function. Since the Fourier transform of
the windowing function typically has a distorted mainlobe and an extended pattern of undesirable sidelobes, the resulting spectrum will be poorly estimated and
contain severe artifacts. For the parametric and adaptive filtering based approaches,
similar performance degradations will also occur.
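The windowing argument above can be checked numerically. The sketch below (our own illustration, not from the book; the signal, gap location, and sizes are made-up) verifies that the DFT of a zero-filled sequence equals the circular convolution, scaled by 1/N, of the complete-data spectrum with the window's transform, and that the peak survives while energy leaks into sidelobes:

```python
import numpy as np

# Zero-filling missing samples = multiplying by a 0/1 window.
N = 128
n = np.arange(N)
y = np.exp(1j * 2 * np.pi * 25 / N * n)   # complex sinusoid on DFT bin 25

w = np.ones(N)
w[40:70] = 0.0                            # window is zero over the gap
y_gapped = y * w                          # missing samples set to zero

Y_full = np.fft.fft(y)
Y_gap = np.fft.fft(y_gapped)
W = np.fft.fft(w)

# Circular convolution of the two spectra, scaled by 1/N
Y_conv = np.array(
    [Y_full @ W[(k - np.arange(N)) % N] for k in range(N)]
) / N
assert np.allclose(Y_gap, Y_conv)         # windowing <-> spectral convolution

# The mainlobe peak survives, but sidelobe leakage spreads across the band.
assert np.argmax(np.abs(Y_gap)) == 25
```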
1.2 MISSING-DATA CASE
For missing-data spectral estimation, various techniques have been developed previously. In [16] and [17], the Lomb-Scargle periodogram is developed for irregularly sampled (unevenly spaced) data. In the missing-data case, the Lomb-Scargle
periodogram is nothing but the DFT with the missing samples set to zero. The CLEAN algorithm [18] is used to estimate the spectrum by deconvolving the missing-data DFT spectrum (the so-called dirty map) into the true signal spectrum (the so-called clean map) and the Fourier transform of the windowing function (the so-called dirty beam) via an iterative approach. Although the CLEAN algorithm works for both missing and irregularly sampled data sequences, it cannot resolve closely spaced spectral lines, and hence it may not be a suitable tool for high-resolution spectral estimation. The multi-taper methods [19, 20] compute spectral
estimates by assuming certain quadratic functions of the available data samples.
The coefficients in the corresponding quadratic functions are optimized according
to certain criteria, but it appears that this approach cannot overcome the resolution
limit of DFT. To achieve high resolution, several parametric algorithms, e.g., those based on autoregressive or autoregressive moving-average models, were used to handle the missing-data problem [21-24]. Although these parametric methods
can provide improved spectral estimates, they are sensitive to model errors. Nonparametric adaptive filtering based techniques are promising for the missing-data problem, as we will show later.
1.3 SUMMARY
In this book, we present the recently developed nonparametric adaptive filtering based algorithms for the missing-data case, namely gapped-data APES (GAPES)
and the more general missing-data APES (MAPES). The outlines of the remaining
chapters are as follows:
Chapter 2: In this chapter, we introduce the APES filter for the complete-data
case. The APES filter is needed for the missing-data algorithm developed in
Chapter 3.
Chapter 3: We consider the spectral analysis of a gapped-data sequence where
the available samples are clustered together in groups of reasonable size.
Following the filter design framework introduced in Chapter 2, GAPES
is developed to iteratively interpolate the missing data and to estimate the
spectrum. A two-dimensional extension of GAPES is also presented.
Chapter 4: In this chapter, we introduce a maximum likelihood (ML) based interpretation of APES. This framework will lay the ground for the general missing-data problem discussed in the following chapters.
Chapter 5: Although GAPES performs quite well for gapped data, it does not work
well for the more general problem of missing samples occurring in arbitrary
patterns. In this chapter, we develop two MAPES algorithms by using an ML fitting criterion as discussed in Chapter 4. Then we use the well-known
expectation maximization (EM) method to solve the so-obtained estimation
problem iteratively. We also demonstrate the advantage of MAPES-EM over
GAPES by comparing their design approaches.
Chapter 6: Two-dimensional extensions of the MAPES-EM algorithms are devel-
oped. However, because of the high computational complexity involved, the
direct application of MAPES-EM to large data sets, e.g., two-dimensional
data, is computationally prohibitive. To reduce the computational complexity, we develop another MAPES algorithm, referred to as MAPES-CM, by solving an ML fitting problem iteratively via cyclic maximization (CM). MAPES-EM and MAPES-CM possess similar spectral estimation performance, but the computational complexity of the latter is much lower than
that of the former.
Chapter 7: We summarize the book and provide some concluding remarks. Additional online resources, such as Matlab codes that implement the missing-data algorithms, are also provided.
CHAPTER 2

APES for Complete-Data Spectral Estimation
2.1 INTRODUCTION
Filter-bank approaches are commonly used for spectral analysis. As nonparametric spectral estimators, they attempt to compute the spectral content of a signal without using any a priori model information or making any explicit model assumption about the signal. For any of these approaches, the key element is to design narrowband filters centered at the frequencies of interest. In fact, the well-known periodogram can be interpreted as such a spectral estimator with a data-independent filter-bank. In general, data-dependent (or data-adaptive) filters outperform their data-independent counterparts and are hence preferred in many applications. A well-known adaptive filter-bank method is the Capon spectral estimator [12]. More
recently, Li and Stoica [13] devised another adaptive filter-bank method with enhanced performance, which is referred to as the amplitude and phase estimation (APES) method. APES surpasses its rivals in several aspects [15, 25] and finds applications in various fields [1, 26-31].
In this chapter, we derive the APES filter from pure narrowband-filter design considerations [32]. It is useful as the initialization step of the algorithms in Chapter 3. The remainder of this chapter is organized as follows: the problem formulation is given in Section 2.2, and the forward-only APES filter is presented in Section 2.3. Section 2.4 provides a two-step filtering interpretation of the APES estimator. Section 2.5 shows how forward-backward averaging can be used
to improve the performance of the estimator. A brief discussion about the fast
implementation of APES appears in Section 2.6.
2.2 PROBLEM FORMULATION
Consider the problem of estimating the amplitude spectrum of a complex-valued uniformly sampled discrete-time signal $\{y_n\}_{n=0}^{N-1}$. For a frequency $\omega$ of interest, the signal $y_n$ is modeled as

$$y_n = \alpha(\omega)\, e^{j\omega n} + e_n(\omega), \qquad n = 0, \ldots, N-1, \quad \omega \in [0, 2\pi),$$ (2.1)

where $\alpha(\omega)$ denotes the complex amplitude of the sinusoidal component at frequency $\omega$, and $e_n(\omega)$ denotes the residual term (assumed zero-mean), which includes the unmodeled noise and interference from frequencies other than $\omega$. The problem of interest is to estimate $\alpha(\omega)$ from $\{y_n\}_{n=0}^{N-1}$ for any given frequency $\omega$.
2.3 FORWARD-ONLY APES ESTIMATOR
Let $\mathbf{h}(\omega)$ denote the impulse response of an $M$-tap finite impulse response (FIR) filter-bank:

$$\mathbf{h}(\omega) = [h_0(\omega)\ h_1(\omega)\ \cdots\ h_{M-1}(\omega)]^T,$$ (2.2)

where $(\cdot)^T$ denotes the transpose. Then the filter output can be written as $\mathbf{h}^H(\omega)\bar{\mathbf{y}}_l$, where

$$\bar{\mathbf{y}}_l = [y_l\ y_{l+1}\ \cdots\ y_{l+M-1}]^T, \qquad l = 0, \ldots, L-1,$$ (2.3)

are the $M \times 1$ overlapping forward data subvectors (snapshots) and $L = N - M + 1$. Here $(\cdot)^H$ denotes the conjugate transpose.

For each $\omega$ of interest, we consider the following design objective:

$$\min_{\alpha(\omega),\,\mathbf{h}(\omega)} \frac{1}{L}\sum_{l=0}^{L-1} \left|\mathbf{h}^H(\omega)\bar{\mathbf{y}}_l - \alpha(\omega)\, e^{j\omega l}\right|^2 \quad \text{s.t.} \quad \mathbf{h}^H(\omega)\mathbf{a}(\omega) = 1,$$ (2.4)

where $\mathbf{a}(\omega)$ is an $M \times 1$ vector given by

$$\mathbf{a}(\omega) = [1\ e^{j\omega}\ \cdots\ e^{j(M-1)\omega}]^T.$$ (2.5)
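The building blocks in (2.3) and (2.5) can be sketched in code. This is our own illustration (function names and sizes are not from the book): the forward snapshots stacked as columns, and the steering vector $\mathbf{a}(\omega)$.

```python
import numpy as np

def forward_snapshots(y, M):
    """Columns are the L = N - M + 1 overlapping M x 1 subvectors of (2.3)."""
    L = len(y) - M + 1
    return np.column_stack([y[l:l + M] for l in range(L)])

def steering_vector(omega, M):
    """a(omega) = [1, e^{j omega}, ..., e^{j(M-1) omega}]^T of (2.5)."""
    return np.exp(1j * omega * np.arange(M))

y = np.arange(10, dtype=complex)
Y = forward_snapshots(y, M=4)
assert Y.shape == (4, 7)                    # M x L, with L = 10 - 4 + 1 = 7
assert np.allclose(Y[:, 2], [2, 3, 4, 5])   # snapshot l = 2 is [y_2 y_3 y_4 y_5]^T
assert steering_vector(0.0, 4).tolist() == [1, 1, 1, 1]
```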
-
APES FOR COMPLETE DATA SPECTRAL ESTIMATION 7
In the above approach, the filter-bank $\mathbf{h}(\omega)$ is designed such that

1. the filtered sequence is as close to a sinusoidal signal as possible in a least squares (LS) sense;
2. the complex spectrum $\alpha(\omega)$ is not distorted by the filtering.
Let $\mathbf{g}(\omega)$ denote the normalized Fourier transform of $\bar{\mathbf{y}}_l$:

$$\mathbf{g}(\omega) = \frac{1}{L}\sum_{l=0}^{L-1} \bar{\mathbf{y}}_l\, e^{-j\omega l}$$ (2.6)

and define

$$\hat{\mathbf{R}} = \frac{1}{L}\sum_{l=0}^{L-1} \bar{\mathbf{y}}_l \bar{\mathbf{y}}_l^H.$$ (2.7)

A straightforward calculation shows that the objective function in (2.4) can be rewritten as

$$\begin{aligned}
\frac{1}{L}\sum_{l=0}^{L-1} \left|\mathbf{h}^H(\omega)\bar{\mathbf{y}}_l - \alpha(\omega)\, e^{j\omega l}\right|^2
&= \mathbf{h}^H(\omega)\hat{\mathbf{R}}\mathbf{h}(\omega) - \alpha^*(\omega)\mathbf{h}^H(\omega)\mathbf{g}(\omega) - \alpha(\omega)\mathbf{g}^H(\omega)\mathbf{h}(\omega) + |\alpha(\omega)|^2 \\
&= \left|\alpha(\omega) - \mathbf{h}^H(\omega)\mathbf{g}(\omega)\right|^2 + \mathbf{h}^H(\omega)\hat{\mathbf{R}}\mathbf{h}(\omega) - \left|\mathbf{h}^H(\omega)\mathbf{g}(\omega)\right|^2,
\end{aligned}$$ (2.8)

where $(\cdot)^*$ denotes the complex conjugate. The minimizer of (2.8) with respect to $\alpha(\omega)$ is given by

$$\hat{\alpha}(\omega) = \mathbf{h}^H(\omega)\mathbf{g}(\omega).$$ (2.9)

Insertion of (2.9) in (2.8) yields the following minimization problem for the determination of $\mathbf{h}(\omega)$:

$$\min_{\mathbf{h}(\omega)} \mathbf{h}^H(\omega)\hat{\mathbf{S}}(\omega)\mathbf{h}(\omega) \quad \text{s.t.} \quad \mathbf{h}^H(\omega)\mathbf{a}(\omega) = 1,$$ (2.10)

where

$$\hat{\mathbf{S}}(\omega) \triangleq \hat{\mathbf{R}} - \mathbf{g}(\omega)\mathbf{g}^H(\omega).$$ (2.11)
The solution to (2.10) is readily obtained [33] as

$$\hat{\mathbf{h}}(\omega) = \frac{\hat{\mathbf{S}}^{-1}(\omega)\mathbf{a}(\omega)}{\mathbf{a}^H(\omega)\hat{\mathbf{S}}^{-1}(\omega)\mathbf{a}(\omega)}.$$ (2.12)

This is the forward-only APES filter, and the forward-only APES estimator in (2.9) becomes

$$\hat{\alpha}(\omega) = \frac{\mathbf{a}^H(\omega)\hat{\mathbf{S}}^{-1}(\omega)\mathbf{g}(\omega)}{\mathbf{a}^H(\omega)\hat{\mathbf{S}}^{-1}(\omega)\mathbf{a}(\omega)}.$$ (2.13)
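The chain (2.6), (2.7), (2.11), (2.13) can be sketched as a short estimator. This is an illustrative implementation under our own naming and sizing assumptions, not the authors' reference code; it checks that APES recovers the complex amplitude of a single noisy sinusoid:

```python
import numpy as np

def apes_forward(y, M, omegas):
    """Minimal forward-only APES following (2.6)-(2.13)."""
    y = np.asarray(y, dtype=complex)
    N = len(y)
    L = N - M + 1
    Y = np.column_stack([y[l:l + M] for l in range(L)])   # M x L snapshots, (2.3)
    R = Y @ Y.conj().T / L                                # sample covariance, (2.7)
    alphas = np.zeros(len(omegas), dtype=complex)
    for i, w in enumerate(omegas):
        g = Y @ np.exp(-1j * w * np.arange(L)) / L        # normalized FT, (2.6)
        S = R - np.outer(g, g.conj())                     # signal-free covariance, (2.11)
        a = np.exp(1j * w * np.arange(M))                 # steering vector, (2.5)
        Sinv_a = np.linalg.solve(S, a)
        Sinv_g = np.linalg.solve(S, g)
        alphas[i] = (a.conj() @ Sinv_g) / (a.conj() @ Sinv_a)  # (2.13)
    return alphas

# Single complex sinusoid in light noise: APES recovers its complex amplitude.
N, M = 64, 16
n = np.arange(N)
w0 = 2 * np.pi * 0.25
amp = 2.0 * np.exp(1j * 0.7)
rng = np.random.default_rng(0)
y = amp * np.exp(1j * w0 * n) + 0.01 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
est = apes_forward(y, M, [w0])[0]
assert abs(est - amp) < 0.05
```

Note that a small amount of noise is needed in the toy example: for a single noiseless sinusoid, $\hat{\mathbf{S}}(\omega_0)$ in (2.11) is exactly singular because the signal component cancels out.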
2.4 TWO-STEP FILTERING-BASED INTERPRETATION
The APES spectral estimator has a two-step filtering interpretation: first, the data $\{y_n\}_{n=0}^{N-1}$ are passed through a bank of FIR bandpass filters with varying center frequency $\omega$; then the spectrum estimate $\hat{\alpha}(\omega)$ for $\omega \in [0, 2\pi)$ is obtained from the filtered data.

For each frequency $\omega$, the corresponding $M$-tap FIR filter-bank is given by (2.12). Hence the output obtained by passing $\bar{\mathbf{y}}_l$ through the FIR filter $\mathbf{h}(\omega)$ can be written as

$$\mathbf{h}^H(\omega)\bar{\mathbf{y}}_l = \alpha(\omega)\left[\mathbf{h}^H(\omega)\mathbf{a}(\omega)\right] e^{j\omega l} + w_l(\omega) = \alpha(\omega)\, e^{j\omega l} + w_l(\omega),$$ (2.14)

where $w_l(\omega) = \mathbf{h}^H(\omega)\bar{\mathbf{e}}_l(\omega)$ denotes the residue term at the filter output and the second equality follows from the constraint

$$\mathbf{h}^H(\omega)\mathbf{a}(\omega) = 1.$$ (2.15)

Thus, from the output of the FIR filter, we can obtain the LS estimate of $\alpha(\omega)$ as

$$\hat{\alpha}(\omega) = \mathbf{h}^H(\omega)\mathbf{g}(\omega).$$ (2.16)
2.5 FORWARD-BACKWARD AVERAGING
Forward-backward averaging has been widely used for enhanced performance in many spectral analysis applications. In the previous section, we obtained the APES
filter by using only forward data vectors. Here we show that forward-backward averaging can be readily incorporated into the APES filter design by considering both the forward and the backward data vectors.
Let the backward data subvectors (snapshots) be constructed as

$$\tilde{\mathbf{y}}_l = [y^*_{N-l-1}\ y^*_{N-l-2}\ \cdots\ y^*_{N-l-M}]^T, \qquad l = 0, \ldots, L-1.$$ (2.17)

We require that the outputs obtained by running the data through the filter both forward and backward are as close as possible to a sinusoid with frequency $\omega$. This design objective can be written as

$$\min_{\mathbf{h}(\omega),\,\alpha(\omega),\,\tilde{\alpha}(\omega)} \frac{1}{2L}\sum_{l=0}^{L-1} \left\{\left|\mathbf{h}^H(\omega)\bar{\mathbf{y}}_l - \alpha(\omega)\, e^{j\omega l}\right|^2 + \left|\mathbf{h}^H(\omega)\tilde{\mathbf{y}}_l - \tilde{\alpha}(\omega)\, e^{j\omega l}\right|^2\right\} \quad \text{s.t.} \quad \mathbf{h}^H(\omega)\mathbf{a}(\omega) = 1.$$ (2.18)

The minimization of (2.18) with respect to $\alpha(\omega)$ and $\tilde{\alpha}(\omega)$ gives $\hat{\alpha}(\omega) = \mathbf{h}^H(\omega)\mathbf{g}(\omega)$ and $\hat{\tilde{\alpha}}(\omega) = \mathbf{h}^H(\omega)\tilde{\mathbf{g}}(\omega)$, where $\tilde{\mathbf{g}}(\omega)$ is the normalized Fourier transform of $\tilde{\mathbf{y}}_l$:

$$\tilde{\mathbf{g}}(\omega) = \frac{1}{L}\sum_{l=0}^{L-1} \tilde{\mathbf{y}}_l\, e^{-j\omega l}.$$ (2.19)

It follows that (2.18) leads to

$$\min_{\mathbf{h}(\omega)} \mathbf{h}^H(\omega)\hat{\mathbf{S}}_{fb}(\omega)\mathbf{h}(\omega) \quad \text{s.t.} \quad \mathbf{h}^H(\omega)\mathbf{a}(\omega) = 1,$$ (2.20)

where

$$\hat{\mathbf{S}}_{fb}(\omega) \triangleq \hat{\mathbf{R}}_{fb} - \frac{\mathbf{g}(\omega)\mathbf{g}^H(\omega) + \tilde{\mathbf{g}}(\omega)\tilde{\mathbf{g}}^H(\omega)}{2}$$ (2.21)

with

$$\hat{\mathbf{R}}_f = \frac{1}{L}\sum_{l=0}^{L-1} \bar{\mathbf{y}}_l \bar{\mathbf{y}}_l^H,$$ (2.22)

$$\hat{\mathbf{R}}_b = \frac{1}{L}\sum_{l=0}^{L-1} \tilde{\mathbf{y}}_l \tilde{\mathbf{y}}_l^H,$$ (2.23)
and

$$\hat{\mathbf{R}}_{fb} = \frac{\hat{\mathbf{R}}_f + \hat{\mathbf{R}}_b}{2}.$$ (2.24)

Note that here we use $\hat{\mathbf{R}}_f$ instead of $\hat{\mathbf{R}}$ to emphasize the fact that it is estimated from the forward-only approach. The solution of (2.20) is given by

$$\hat{\mathbf{h}}_{fb}(\omega) = \frac{\hat{\mathbf{S}}_{fb}^{-1}(\omega)\mathbf{a}(\omega)}{\mathbf{a}^H(\omega)\hat{\mathbf{S}}_{fb}^{-1}(\omega)\mathbf{a}(\omega)}.$$ (2.25)

Because of the following readily verified relationship

$$\tilde{\mathbf{y}}_l = \mathbf{J}\,\bar{\mathbf{y}}^*_{L-l-1},$$ (2.26)

we have

$$\tilde{\mathbf{g}}(\omega) = \mathbf{J}\,\mathbf{g}^*(\omega)\, e^{-j\omega(L-1)},$$ (2.27)

$$\hat{\mathbf{R}}_b = \mathbf{J}\hat{\mathbf{R}}_f^T\mathbf{J},$$ (2.28)

and

$$\tilde{\mathbf{g}}(\omega)\tilde{\mathbf{g}}^H(\omega) = \mathbf{J}\left[\mathbf{g}(\omega)\mathbf{g}^H(\omega)\right]^T\mathbf{J},$$ (2.29)

where $\mathbf{J}$ denotes the exchange matrix whose antidiagonal elements are ones and whose remaining elements are zeros. So $\hat{\mathbf{S}}_{fb}(\omega)$ can be conveniently calculated as

$$\hat{\mathbf{S}}_{fb}(\omega) = \frac{\hat{\mathbf{S}}_f(\omega) + \mathbf{J}\hat{\mathbf{S}}_f^T(\omega)\mathbf{J}}{2},$$ (2.30)

where

$$\hat{\mathbf{S}}_f(\omega) \triangleq \hat{\mathbf{R}}_f - \mathbf{g}(\omega)\mathbf{g}^H(\omega).$$ (2.31)

Given the forward-backward APES filter $\hat{\mathbf{h}}_{fb}(\omega)$, the forward-backward spectral estimator can be written as

$$\hat{\alpha}_{fb}(\omega) = \frac{\mathbf{a}^H(\omega)\hat{\mathbf{S}}_{fb}^{-1}(\omega)\mathbf{g}(\omega)}{\mathbf{a}^H(\omega)\hat{\mathbf{S}}_{fb}^{-1}(\omega)\mathbf{a}(\omega)}.$$ (2.32)
Note that due to the above relationships, the forward-backward estimator of $\tilde{\alpha}(\omega)$ can be simplified as

$$\hat{\tilde{\alpha}}_{fb}(\omega) = \hat{\mathbf{h}}_{fb}^H(\omega)\tilde{\mathbf{g}}(\omega) = \hat{\alpha}^*_{fb}(\omega)\, e^{-j\omega(N-1)},$$ (2.33)

which indicates that from $\hat{\tilde{\alpha}}_{fb}(\omega)$ we will get the same forward-backward spectral estimator $\hat{\alpha}_{fb}(\omega)$.

In summary, the forward-backward APES filter and APES spectral estimator still have the same forms as in (2.12) and (2.13), but $\hat{\mathbf{R}}$ and $\hat{\mathbf{S}}(\omega)$ are replaced by $\hat{\mathbf{R}}_{fb}$ and $\hat{\mathbf{S}}_{fb}(\omega)$, respectively. Note that $\hat{\mathbf{R}}_{fb}$ and $\hat{\mathbf{S}}_{fb}(\omega)$ are persymmetric matrices. Compared with the non-persymmetric estimates $\hat{\mathbf{R}}_f$ and $\hat{\mathbf{S}}_f(\omega)$, they are generally better estimates of the true $\mathbf{R}$ and $\mathbf{Q}(\omega)$, where $\mathbf{R}$ and $\mathbf{Q}(\omega)$ are the ideal covariance matrices with and without the presence of the signal of interest, respectively. See Chapter 4 for more details about $\mathbf{R}$ and $\mathbf{Q}(\omega)$.

For simplicity, all the APES-like algorithms we develop in the subsequent chapters are based on the forward-only approach. For better estimation accuracy, forward-backward averaging is used in all numerical examples.
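The exchange-matrix shortcuts (2.26)-(2.30) are easy to verify numerically. The sketch below is our own check on random data (sizes and names are illustrative): the backward quantities follow from the forward ones through $\mathbf{J}$, so only the forward snapshots need to be formed.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 32, 8
L = N - M + 1
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)

Yf = np.column_stack([y[l:l + M] for l in range(L)])            # forward snapshots, (2.3)
Yb = np.column_stack(
    [np.conj(y[N - l - 1 - np.arange(M)]) for l in range(L)]    # backward snapshots, (2.17)
)

J = np.eye(M)[::-1]                                             # exchange matrix
Rf = Yf @ Yf.conj().T / L
Rb = Yb @ Yb.conj().T / L
assert np.allclose(Rb, J @ Rf.T @ J)                            # (2.28)

w = 2 * np.pi * 0.1
g = Yf @ np.exp(-1j * w * np.arange(L)) / L                     # (2.6)
gt = Yb @ np.exp(-1j * w * np.arange(L)) / L                    # (2.19)
assert np.allclose(gt, J @ g.conj() * np.exp(-1j * w * (L - 1)))  # (2.27)

Sf = Rf - np.outer(g, g.conj())                                 # (2.31)
Sfb = (Rf + Rb) / 2 - (np.outer(g, g.conj()) + np.outer(gt, gt.conj())) / 2  # (2.21)
assert np.allclose(Sfb, (Sf + J @ Sf.T @ J) / 2)                # (2.30)
```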
2.6 FAST IMPLEMENTATION
The direct implementation of APES by simply computing (2.13) for many different $\omega$ of interest is computationally demanding. Several papers in the literature have addressed this problem [29, 34-36]. Here we give a brief discussion of implementing APES efficiently.

To avoid the inversion of an $M \times M$ matrix $\hat{\mathbf{S}}(\omega)$ for each $\omega$, we use the matrix inversion lemma (see, e.g., [8]) to obtain

$$\hat{\mathbf{S}}^{-1}(\omega) = \hat{\mathbf{R}}^{-1} + \frac{\hat{\mathbf{R}}^{-1}\mathbf{g}(\omega)\mathbf{g}^H(\omega)\hat{\mathbf{R}}^{-1}}{1 - \mathbf{g}^H(\omega)\hat{\mathbf{R}}^{-1}\mathbf{g}(\omega)}.$$ (2.34)
Let $\hat{\mathbf{R}}^{-1/2}$ denote the Cholesky factor of $\hat{\mathbf{R}}^{-1}$, and let

$$\breve{\mathbf{a}}(\omega) = \hat{\mathbf{R}}^{-1/2}\mathbf{a}(\omega), \quad \breve{\mathbf{g}}(\omega) = \hat{\mathbf{R}}^{-1/2}\mathbf{g}(\omega), \quad \phi(\omega) = \breve{\mathbf{a}}^H(\omega)\breve{\mathbf{a}}(\omega), \quad \psi(\omega) = \breve{\mathbf{a}}^H(\omega)\breve{\mathbf{g}}(\omega), \quad \rho(\omega) = \breve{\mathbf{g}}^H(\omega)\breve{\mathbf{g}}(\omega).$$ (2.35)

Then we can write (2.12) and (2.13) as

$$\hat{\mathbf{h}}(\omega) = \frac{\left[\hat{\mathbf{R}}^{-1/2}\right]^H \left[(1 - \rho(\omega))\,\breve{\mathbf{a}}(\omega) + \psi^*(\omega)\,\breve{\mathbf{g}}(\omega)\right]}{\phi(\omega)(1 - \rho(\omega)) + |\psi(\omega)|^2}$$ (2.36)

and

$$\hat{\alpha}(\omega) = \frac{\psi(\omega)}{\phi(\omega)(1 - \rho(\omega)) + |\psi(\omega)|^2},$$ (2.37)

whose implementation requires only the Cholesky factorization of the matrix $\hat{\mathbf{R}}$, which is independent of $\omega$.

This strategy can be readily generalized to the forward-backward averaging case. Since the complete-data case is not the focus of this book, we refer the readers to [29, 34-36] for more details about the efficient implementations of APES.
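The identities (2.34)-(2.37) can be checked numerically. The sketch below is our own verification on random data (the scalar names phi, psi, rho follow the reconstruction in (2.35), which is an assumption of ours): a single Cholesky factorization of $\hat{\mathbf{R}}$ reproduces the per-frequency result of inverting $\hat{\mathbf{S}}(\omega)$.

```python
import numpy as np

rng = np.random.default_rng(2)
M, L = 6, 40
Y = rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))
R = Y @ Y.conj().T / L
w = 2 * np.pi * 0.3
g = Y @ np.exp(-1j * w * np.arange(L)) / L
a = np.exp(1j * w * np.arange(M))
S = R - np.outer(g, g.conj())

# (2.34): matrix inversion lemma for S = R - g g^H
Rinv = np.linalg.inv(R)
Sinv_lemma = Rinv + (Rinv @ np.outer(g, g.conj()) @ Rinv) / (1 - g.conj() @ Rinv @ g)
assert np.allclose(Sinv_lemma, np.linalg.inv(S))

# (2.35)-(2.37): R = C C^H (Cholesky), so R^{-1/2} = C^{-1}
C = np.linalg.cholesky(R)
Rmhalf = np.linalg.inv(C)                  # (R^{-1/2})^H R^{-1/2} = R^{-1}
ab, gb = Rmhalf @ a, Rmhalf @ g
phi = (ab.conj() @ ab).real
psi = ab.conj() @ gb
rho = (gb.conj() @ gb).real
alpha_fast = psi / (phi * (1 - rho) + abs(psi) ** 2)            # (2.37)
alpha_direct = (a.conj() @ np.linalg.solve(S, g)) / (a.conj() @ np.linalg.solve(S, a))
assert np.allclose(alpha_fast, alpha_direct)
```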
CHAPTER 3
Gapped-Data APES
3.1 INTRODUCTION
One special case of the missing-data problem is called gapped data, where the measurements during certain periods are not valid due to reasons such as interference or jamming. The difference between the gapped-data problem and the general missing-data problem, where the missing samples can occur at arbitrary places among the complete data set, is that in the gapped-data case there exist group(s) of available data samples, where within each group there are no missing samples.
Such scenarios exist in astronomical or radar applications where large segments of data are available in spite of the fact that the data between these segments are missing. For example, in radar signal processing, the problem of combining several sets of measurements made at different azimuth angle locations can be posed as a problem of spectral estimation from gapped data [2, 4]. Similar problems arise in data fusion via ultrawideband coherent processing [7]. In astronomy, data are often available as groups of samples with rather long intervals during which no measurements can be taken [17, 37-41].
Gapped-data APES (GAPES) uses the APES filter (as introduced in Chapter 2) for the spectral estimation of gapped data. Specifically, the GAPES algorithm consists of two steps: (1) estimating the adaptive filter and the corresponding spectrum via APES, and (2) filling in the gaps via an LS fit.
In the remainder of this chapter, one-dimensional (1-D) and two-dimensional (2-D) GAPES are presented in Sections 3.2 and 3.3, respectively. Numerical results are provided in Section 3.4.
3.2 GAPES
Assume that some segments of the 1-D data sequence $\{y_n\}_{n=0}^{N-1}$ are unavailable. Let

$$\mathbf{y} \triangleq [y_0\ y_1\ \cdots\ y_{N-1}]^T \triangleq \left[\bar{\mathbf{y}}_1^T\ \bar{\mathbf{y}}_2^T\ \cdots\ \bar{\mathbf{y}}_P^T\right]^T$$ (3.1)

be the complete data vector, where $\bar{\mathbf{y}}_1, \ldots, \bar{\mathbf{y}}_P$ are subvectors of $\mathbf{y}$ whose lengths are $N_1, \ldots, N_P$, respectively, with $N_1 + N_2 + \cdots + N_P = N$. A gapped-data vector is formed by assuming that the $\bar{\mathbf{y}}_p$ with $p = 1, 3, \ldots, P$ ($P$ is always an odd number) are available:

$$\boldsymbol{\gamma} \triangleq \left[\bar{\mathbf{y}}_1^T\ \bar{\mathbf{y}}_3^T\ \cdots\ \bar{\mathbf{y}}_P^T\right]^T.$$ (3.2)

Similarly,

$$\boldsymbol{\mu} \triangleq \left[\bar{\mathbf{y}}_2^T\ \bar{\mathbf{y}}_4^T\ \cdots\ \bar{\mathbf{y}}_{P-1}^T\right]^T$$ (3.3)

denotes all the missing samples. Then $\boldsymbol{\gamma}$ and $\boldsymbol{\mu}$ have dimensions $g \times 1$ and $(N - g) \times 1$, respectively, where $g = N_1 + N_3 + \cdots + N_P$ is the total number of available samples.
3.2.1 Initial Estimates via APES
We obtain the initial APES estimates of $\mathbf{h}(\omega)$ and $\alpha(\omega)$ from the available data $\boldsymbol{\gamma}$ as follows.

Choose an initial filter length $M_0$ such that an initial full-rank covariance matrix $\hat{\mathbf{R}}$ can be built with the filter length $M_0$ using only the available data segments. This requires

$$\sum_{p \in \{1, 3, \ldots, P\}} \max(0,\, N_p - M_0 + 1) > M_0.$$ (3.4)

Let $L_p = N_p - M_0 + 1$ and let $\mathcal{J}$ be the subset of $\{1, 3, \ldots, P\}$ for which $L_p > 0$. Then the filter-bank $\mathbf{h}(\omega)$ is calculated from (2.11) and (2.12) by using the
following redefinitions:

$$\hat{\mathbf{R}} = \frac{1}{\sum_{p \in \mathcal{J}} L_p} \sum_{p \in \mathcal{J}}\ \sum_{l = N_1 + \cdots + N_{p-1}}^{N_1 + \cdots + N_{p-1} + L_p - 1} \bar{\mathbf{y}}_l \bar{\mathbf{y}}_l^H,$$ (3.5)

$$\mathbf{g}(\omega) = \frac{1}{\sum_{p \in \mathcal{J}} L_p} \sum_{p \in \mathcal{J}}\ \sum_{l = N_1 + \cdots + N_{p-1}}^{N_1 + \cdots + N_{p-1} + L_p - 1} \bar{\mathbf{y}}_l\, e^{-j\omega l}.$$ (3.6)

Note that the data snapshots used above have a size of $M_0 \times 1$ and their elements come only from $\boldsymbol{\gamma}$; hence they do not contain any missing samples. Correspondingly, the $\hat{\mathbf{R}}$ and $\mathbf{g}(\omega)$ estimated above have sizes of $M_0 \times M_0$ and $M_0 \times 1$, respectively.

Next, the filter-bank $\hat{\mathbf{h}}(\omega)$ is applied to the available data $\boldsymbol{\gamma}$, and the LS estimate of $\alpha(\omega)$ from the filter output is calculated by using (2.16), where $\mathbf{g}(\omega)$ is replaced by (3.6). Note that in the above filtering process, only the available samples are passed through the filter. The initial LS estimate of $\alpha(\omega)$ is based on these so-obtained filter outputs only.
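The initialization (3.4)-(3.6) can be sketched as follows. This is our own illustration (segment layout, sizes, and names are assumptions): snapshots of length $M_0$ are drawn only from within gap-free segments, and each snapshot's absolute starting position is carried into the $e^{-j\omega l}$ phase.

```python
import numpy as np

def segment_R_and_g(y, segments, M0):
    """segments: list of (start, length) pairs of available runs."""
    snaps, pos = [], []
    for start, Np in segments:
        for l in range(max(0, Np - M0 + 1)):     # only segments with Lp > 0 contribute
            snaps.append(y[start + l:start + l + M0])
            pos.append(start + l)                # absolute position for the phase term
    Y, pos = np.column_stack(snaps), np.array(pos)
    R = Y @ Y.conj().T / Y.shape[1]              # (3.5)
    g = lambda w: Y @ np.exp(-1j * w * pos) / Y.shape[1]   # (3.6)
    return R, g

rng = np.random.default_rng(3)
N, M0 = 60, 8
n = np.arange(N)
y = np.exp(1j * 2 * np.pi * 0.2 * n) + 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
segments = [(0, 25), (40, 20)]                   # samples 25..39 are missing

# Condition (3.4): enough gap-free snapshots for a full-rank M0 x M0 matrix
assert sum(max(0, Np - M0 + 1) for _, Np in segments) > M0
R, g = segment_R_and_g(y, segments, M0)
assert R.shape == (M0, M0) and g(0.3).shape == (M0,)
assert np.linalg.matrix_rank(R) == M0            # full rank, as required
```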
3.2.2 Data Interpolation
Now we consider the estimation of $\boldsymbol{\mu}$ based on the initial spectral estimates $\hat{\alpha}(\omega)$ and $\hat{\mathbf{h}}(\omega)$ obtained as outlined above. Under the assumption that the missing data have the same spectral content as the available data, we can determine $\boldsymbol{\mu}$ under the condition that the output of the filter $\hat{\mathbf{h}}(\omega)$, fed with the complete data sequence made from $\boldsymbol{\gamma}$ and $\boldsymbol{\mu}$, is as close as possible (in the LS sense) to $\hat{\alpha}(\omega)\, e^{j\omega l}$, for $l = 0, \ldots, L-1$. Since we usually evaluate $\hat{\alpha}(\omega)$ on a $K$-point DFT grid, $\omega_k = 2\pi k/K$ for $k = 0, \ldots, K-1$ (usually we have $K > N$), we obtain $\boldsymbol{\mu}$ as the solution to the following LS problem:

$$\min_{\boldsymbol{\mu}} \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \left|\hat{\mathbf{h}}^H(\omega_k)\bar{\mathbf{y}}_l - \hat{\alpha}(\omega_k)\, e^{j\omega_k l}\right|^2.$$ (3.7)

Note that by estimating $\boldsymbol{\mu}$ in this way, we remain in the LS fitting framework of APES.
The quadratic minimization problem (3.7) can be readily solved. Let

$$\mathbf{H}(\omega_k) = \begin{bmatrix}
\hat{h}_0^* & \cdots & \hat{h}_{M_0-1}^* & 0 & \cdots & 0 \\
0 & \hat{h}_0^* & \cdots & \hat{h}_{M_0-1}^* & \cdots & 0 \\
\vdots & & \ddots & & \ddots & \vdots \\
0 & \cdots & 0 & \hat{h}_0^* & \cdots & \hat{h}_{M_0-1}^*
\end{bmatrix} \in \mathbb{C}^{L \times N}$$ (3.8)

(each row of $\mathbf{H}(\omega_k)$ contains $\hat{\mathbf{h}}^H(\omega_k)$, shifted one position to the right relative to the row above) and

$$\bar{\boldsymbol{\alpha}}(\omega_k) = \hat{\alpha}(\omega_k) \begin{bmatrix} 1 \\ e^{j\omega_k} \\ \vdots \\ e^{j\omega_k(L-1)} \end{bmatrix} \in \mathbb{C}^{L \times 1}.$$ (3.9)

Using this notation we can write the objective function in (3.7) as

$$\sum_{k=0}^{K-1} \left\| \mathbf{H}(\omega_k) \begin{bmatrix} y_0 \\ \vdots \\ y_{N-1} \end{bmatrix} - \bar{\boldsymbol{\alpha}}(\omega_k) \right\|^2.$$ (3.10)

Define the $L \times g$ and $L \times (N - g)$ matrices $\mathbf{A}(\omega_k)$ and $\mathbf{B}(\omega_k)$ from $\mathbf{H}(\omega_k)$ via the following equality:

$$\mathbf{H}(\omega_k) \begin{bmatrix} y_0 \\ \vdots \\ y_{N-1} \end{bmatrix} = \mathbf{A}(\omega_k)\boldsymbol{\gamma} + \mathbf{B}(\omega_k)\boldsymbol{\mu}.$$ (3.11)

Also, let

$$\mathbf{d}(\omega_k) = \bar{\boldsymbol{\alpha}}(\omega_k) - \mathbf{A}(\omega_k)\boldsymbol{\gamma}.$$ (3.12)

With this notation the objective function (3.10) becomes

$$\sum_{k=0}^{K-1} \left\| \mathbf{B}(\omega_k)\boldsymbol{\mu} - \mathbf{d}(\omega_k) \right\|^2,$$ (3.13)
whose minimizer with respect to $\boldsymbol{\mu}$ is readily found to be

$$\hat{\boldsymbol{\mu}} = \left(\sum_{k=0}^{K-1} \mathbf{B}^H(\omega_k)\mathbf{B}(\omega_k)\right)^{-1} \left(\sum_{k=0}^{K-1} \mathbf{B}^H(\omega_k)\mathbf{d}(\omega_k)\right).$$ (3.14)
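The interpolation step (3.8)-(3.14) can be sketched as follows. This is our own illustration under a toy setup (names, the single-frequency grid, and the simple constraint-satisfying filter are assumptions): for each grid frequency, the $L \times N$ filtering matrix is built, its columns are split into $\mathbf{A}$ (available) and $\mathbf{B}$ (missing), and the stacked LS problem is solved for the missing-sample vector. In the noiseless single-sinusoid case the recovery is exact.

```python
import numpy as np

def interpolate_missing(y, avail, h_of, alpha_of, omegas, M):
    N = len(y)
    L = N - M + 1
    miss = ~avail
    BtB = np.zeros((miss.sum(), miss.sum()), dtype=complex)
    Btd = np.zeros(miss.sum(), dtype=complex)
    for w in omegas:
        h = h_of(w)
        H = np.zeros((L, N), dtype=complex)
        for l in range(L):
            H[l, l:l + M] = h.conj()                        # row l applies h^H, (3.8)
        abar = alpha_of(w) * np.exp(1j * w * np.arange(L))  # (3.9)
        A, B = H[:, avail], H[:, miss]                      # column split, (3.11)
        d = abar - A @ y[avail]                             # (3.12)
        BtB += B.conj().T @ B
        Btd += B.conj().T @ d
    return np.linalg.solve(BtB, Btd)                        # (3.14)

N, M = 32, 4
n = np.arange(N)
w0 = 2 * np.pi * 0.2
amp = 1.5 * np.exp(1j * 0.3)
y_true = amp * np.exp(1j * w0 * n)
avail = np.ones(N, dtype=bool)
avail[10:14] = False                                        # a 4-sample gap

a = np.exp(1j * w0 * np.arange(M))                          # h = a/M satisfies h^H a = 1
mu = interpolate_missing(y_true, avail, lambda w: a / M, lambda w: amp, [w0], M)
assert np.allclose(mu, y_true[~avail])                      # exact noiseless recovery
```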
3.2.3 Summary of GAPES
Once an estimate μ̂ has become available, the next logical step should consist
of reestimating the spectrum and the filter-bank, by applying APES to the data
sequence made from γ and μ̂. According to the discussion around (2.4), this entails
the minimization with respect to h(ω_k) and α(ω_k) of the function
Σ_{k=0}^{K−1} Σ_{l=0}^{L−1} | h^H(ω_k) y_l − α(ω_k) e^{jω_k l} |²   (3.15)
subject to h^H(ω_k) a(ω_k) = 1, where y_l is made from γ and μ̂. Evidently, the
minimization of (3.15) with respect to {h(ω_k), α(ω_k)}_{k=0}^{K−1} can be decoupled
into K minimization problems of the form of (2.4), yet we prefer to write the criterion
function as in (3.15) to make the connection with (3.7). In effect, comparing (3.7)
and (3.15) clearly shows that the alternating estimation of {α(ω_k), h(ω_k)} and μ
outlined above can be recognized as a cyclic optimization approach (see [42] for a
tutorial on cyclic optimization) for solving the following minimization problem:
min_{μ, {α(ω_k), h(ω_k)}} Σ_{k=0}^{K−1} Σ_{l=0}^{L−1} | h^H(ω_k) y_l − α(ω_k) e^{jω_k l} |²   s.t. h^H(ω_k) a(ω_k) = 1.
   (3.16)
A step-by-step summary of GAPES is as follows:
Step 0: Obtain an initial estimate of {α̂(ω_k), ĥ(ω_k)}.
Step 1: Use the most recent estimate of {α̂(ω_k), ĥ(ω_k)} in (3.16) to estimate μ by
minimizing the so-obtained cost function, whose solution is given by (3.14).
Step 2: Use the latest estimate of μ to fill in the missing data samples and estimate
{α̂(ω_k), ĥ(ω_k)}_{k=0}^{K−1} by minimizing the cost function in (3.16) based on the
interpolated data. (This step is equivalent to applying APES to the complete
data.)
Step 3: Repeat steps 1–2 until practical convergence.
Practical convergence can be declared when the relative change of the cost
function in (3.16) corresponding to the current and previous estimates is smaller
than a preassigned threshold (e.g., ε = 10^{−3}). After convergence, we have a final
spectral estimate {α̂(ω_k)}_{k=0}^{K−1}. If desired, we can use the final interpolated data
sequence to compute the APES spectrum on a grid even finer than the one used in
the aforementioned minimization procedure.
Note that usually the selected initial filter length satisfies M_0 < M due to
the missing data samples, so there are several practical choices for increasing the
filter length after initialization, for example, increasing it after each iteration until
it reaches M. For simplicity, we choose to use filter length M right after the
initialization step.
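To make the structure of (3.8) and the column split in (3.11) concrete, here is a small NumPy sketch. The filter coefficients, data, and missing-sample indices are arbitrary stand-ins, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 12, 4                     # toy sizes (the text uses M_0 during initialization)
L = N - M + 1
h = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # stand-in filter at some w_k

# H(w_k): L x N, row l holds h^H shifted l positions to the right, eq. (3.8)
H = np.zeros((L, N), dtype=complex)
for l in range(L):
    H[l, l:l + M] = h.conj()

# Column split of H into A (available columns) and B (missing columns), eq. (3.11)
miss = [3, 4, 8]                                  # hypothetical missing-sample indices
avail = [n for n in range(N) if n not in miss]
A, B = H[:, avail], H[:, miss]

y = rng.standard_normal(N) + 1j * rng.standard_normal(N)
# H y = A*gamma + B*mu, with gamma = y[avail] and mu = y[miss]
assert np.allclose(H @ y, A @ y[avail] + B @ y[miss])
```

Once A(ω_k) and B(ω_k) are formed for each grid frequency, the interpolation step (3.14) is an ordinary linear LS solve in μ.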
3.3 TWO-DIMENSIONAL GAPES
In this section, we extend the GAPES algorithm developed previously to 2-D data
matrices.
3.3.1 Two-Dimensional APES Filter
Consider the problem of estimating the amplitude spectrum of a complex-valued
uniformly sampled 2-D discrete-time signal {y_{n_1,n_2}}_{n_1=0,n_2=0}^{N_1−1,N_2−1},
where the data matrix has dimension N_1 × N_2.
For a 2-D frequency (ω_1, ω_2) of interest, the signal y_{n_1,n_2} is described as

y_{n_1,n_2} = α(ω_1, ω_2) e^{j(ω_1 n_1 + ω_2 n_2)} + e_{n_1,n_2}(ω_1, ω_2),
   n_1 = 0, …, N_1−1,  n_2 = 0, …, N_2−1,  ω_1, ω_2 ∈ [0, 2π),   (3.17)

where α(ω_1, ω_2) denotes the complex amplitude of the 2-D sinusoidal compo-
nent at frequency (ω_1, ω_2) and e_{n_1,n_2}(ω_1, ω_2) denotes the residual matrix (assumed
zero-mean), which includes the unmodeled noise and interference from frequencies
other than (ω_1, ω_2). The 2-D APES algorithm derived below estimates α(ω_1, ω_2)
from {y_{n_1,n_2}} for any given frequency pair (ω_1, ω_2). Let Y be the N_1 × N_2 data matrix
Y ≜ [ y_{0,0}      y_{0,1}      ⋯  y_{0,N_2−1}
      y_{1,0}      y_{1,1}      ⋯  y_{1,N_2−1}
      ⋮            ⋮            ⋱  ⋮
      y_{N_1−1,0}  y_{N_1−1,1}  ⋯  y_{N_1−1,N_2−1} ],   (3.18)
and let H(ω_1, ω_2) be an M_1 × M_2 matrix that contains the coefficients of a 2-D
FIR filter:

H(ω_1, ω_2) ≜ [ h_{0,0}(ω_1,ω_2)       ⋯  h_{0,M_2−1}(ω_1,ω_2)
                h_{1,0}(ω_1,ω_2)       ⋯  h_{1,M_2−1}(ω_1,ω_2)
                ⋮                      ⋱  ⋮
                h_{M_1−1,0}(ω_1,ω_2)   ⋯  h_{M_1−1,M_2−1}(ω_1,ω_2) ].   (3.19)
Let L_1 ≜ N_1 − M_1 + 1 and L_2 ≜ N_2 − M_2 + 1. Then we denote by
X = H(ω_1, ω_2) ∗ Y   (3.20)

the following L_1 × L_2 output data matrix obtained by filtering Y through the filter
determined by H(ω_1, ω_2):

x_{l_1,l_2} = Σ_{m_1=0}^{M_1−1} Σ_{m_2=0}^{M_2−1} h*_{m_1,m_2}(ω_1, ω_2) y_{l_1+m_1, l_2+m_2}
           = vec^H(H(ω_1, ω_2)) ȳ_{l_1,l_2},   (3.21)
where vec(·) denotes the operation of stacking the columns of a matrix on top of
each other. In (3.21), ȳ_{l_1,l_2} is defined by

ȳ_{l_1,l_2} ≜ vec(Ȳ_{l_1,l_2}) ≜ vec [ y_{l_1,l_2}          y_{l_1,l_2+1}        ⋯  y_{l_1,l_2+M_2−1}
                                       y_{l_1+1,l_2}        y_{l_1+1,l_2+1}      ⋯  y_{l_1+1,l_2+M_2−1}
                                       ⋮                    ⋮                    ⋱  ⋮
                                       y_{l_1+M_1−1,l_2}    y_{l_1+M_1−1,l_2+1}  ⋯  y_{l_1+M_1−1,l_2+M_2−1} ].
   (3.22)
The APES spectrum estimate α̂(ω_1, ω_2) and the filter coefficient matrix
Ĥ(ω_1, ω_2) are the minimizers of the following LS criterion:

min_{α(ω_1,ω_2), H(ω_1,ω_2)} Σ_{l_1=0}^{L_1−1} Σ_{l_2=0}^{L_2−1} | x_{l_1,l_2} − α(ω_1, ω_2) e^{j(ω_1 l_1 + ω_2 l_2)} |²
   s.t. vec^H(H(ω_1, ω_2)) a_{M_1,M_2}(ω_1, ω_2) = 1.   (3.23)
Here a_{M_1,M_2}(ω_1, ω_2) is an M_1 M_2 × 1 vector given by

a_{M_1,M_2}(ω_1, ω_2) ≜ a_{M_2}(ω_2) ⊗ a_{M_1}(ω_1),   (3.24)

where ⊗ denotes the Kronecker matrix product and

a_{M_k}(ω_k) ≜ [ 1  e^{jω_k}  ⋯  e^{j(M_k−1)ω_k} ]^T,  k = 1, 2.   (3.25)
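The Kronecker-structured steering vector in (3.24)–(3.25) is straightforward to form numerically. A small NumPy sketch (function names, sizes, and frequencies are our own illustrative choices):

```python
import numpy as np

def steering(M, w):
    # a_M(w) = [1, e^{jw}, ..., e^{j(M-1)w}]^T, eq. (3.25)
    return np.exp(1j * w * np.arange(M))

def steering_2d(M1, M2, w1, w2):
    # a_{M1,M2}(w1, w2) = a_{M2}(w2) kron a_{M1}(w1), eq. (3.24)
    return np.kron(steering(M2, w2), steering(M1, w1))

# Entry m2*M1 + m1 equals e^{j(w1*m1 + w2*m2)}, matching the vec() column stacking
M1, M2, w1, w2 = 3, 4, 0.7, 1.9
v = steering_2d(M1, M2, w1, w2)
assert v.shape == (M1 * M2,)
assert np.isclose(v[3 * M1 + 2], np.exp(1j * (w1 * 2 + w2 * 3)))
```

The ordering a_{M_2} ⊗ a_{M_1} (rather than the reverse) is what makes the vector line up with column-wise vec() of the M_1 × M_2 filter matrix.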
Substituting X into (3.23), we have the following design objective for 2-D APES:

min_{α(ω_1,ω_2), H(ω_1,ω_2)} ‖ vec(H(ω_1, ω_2) ∗ Y) − α(ω_1, ω_2) a_{L_1,L_2}(ω_1, ω_2) ‖²
   s.t. vec^H(H(ω_1, ω_2)) a_{M_1,M_2}(ω_1, ω_2) = 1,   (3.26)

where a_{L_1,L_2}(ω_1, ω_2) is defined similarly to a_{M_1,M_2}(ω_1, ω_2).
The solution to (3.26) can be readily derived. Define

R̂ = (1/(L_1 L_2)) Σ_{l_1=0}^{L_1−1} Σ_{l_2=0}^{L_2−1} ȳ_{l_1,l_2} ȳ^H_{l_1,l_2}   (3.27)
and let g(ω_1, ω_2) denote the normalized 2-D Fourier transform of ȳ_{l_1,l_2}:

g(ω_1, ω_2) = (1/(L_1 L_2)) Σ_{l_1=0}^{L_1−1} Σ_{l_2=0}^{L_2−1} ȳ_{l_1,l_2} e^{−j(ω_1 l_1 + ω_2 l_2)}.   (3.28)
The filter Ĥ(ω_1, ω_2) that minimizes (3.26) is given by

vec(Ĥ(ω_1, ω_2)) = S^{−1}(ω_1,ω_2) a_{M_1,M_2}(ω_1,ω_2) / [ a^H_{M_1,M_2}(ω_1,ω_2) S^{−1}(ω_1,ω_2) a_{M_1,M_2}(ω_1,ω_2) ]   (3.29)

and the APES spectrum is given by

α̂(ω_1, ω_2) = a^H_{M_1,M_2}(ω_1,ω_2) S^{−1}(ω_1,ω_2) g(ω_1,ω_2) / [ a^H_{M_1,M_2}(ω_1,ω_2) S^{−1}(ω_1,ω_2) a_{M_1,M_2}(ω_1,ω_2) ],   (3.30)

where

S(ω_1, ω_2) ≜ R̂ − g(ω_1, ω_2) g^H(ω_1, ω_2).   (3.31)
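Equations (3.22), (3.24), and (3.27)–(3.31) can be assembled into a single-frequency estimator. The NumPy sketch below is an illustrative implementation (the function name, loop-based snapshot construction, and the one-sinusoid test signal are our own), not an optimized one:

```python
import numpy as np

def apes2d_point(Y, M1, M2, w1, w2):
    """Illustrative 2-D APES estimate at (w1, w2), following eqs. (3.22), (3.24), (3.27)-(3.31)."""
    N1, N2 = Y.shape
    L1, L2 = N1 - M1 + 1, N2 - M2 + 1
    snaps, phases = [], []
    for l2 in range(L2):
        for l1 in range(L1):
            # snapshot y_bar_{l1,l2} = vec(Y[l1:l1+M1, l2:l2+M2]), eq. (3.22)
            snaps.append(Y[l1:l1 + M1, l2:l2 + M2].flatten(order='F'))
            phases.append(np.exp(-1j * (w1 * l1 + w2 * l2)))
    snaps = np.array(snaps)                                  # (L1*L2) x (M1*M2)
    g = (snaps * np.array(phases)[:, None]).mean(axis=0)     # eq. (3.28)
    R = snaps.T @ snaps.conj() / len(snaps)                  # eq. (3.27)
    S = R - np.outer(g, g.conj())                            # eq. (3.31)
    a = np.kron(np.exp(1j * w2 * np.arange(M2)),
                np.exp(1j * w1 * np.arange(M1)))             # eq. (3.24)
    Sia = np.linalg.solve(S, a)                              # S^{-1} a (S is Hermitian)
    return (Sia.conj() @ g) / (Sia.conj() @ a)               # eq. (3.30)

# Hypothetical test: one 2-D sinusoid in light noise
rng = np.random.default_rng(0)
N1, N2, w1, w2, amp = 16, 16, 0.9, 1.4, 2.0 - 1.0j
n1, n2 = np.meshgrid(np.arange(N1), np.arange(N2), indexing='ij')
Y = amp * np.exp(1j * (w1 * n1 + w2 * n2))
Y = Y + 0.01 * (rng.standard_normal((N1, N2)) + 1j * rng.standard_normal((N1, N2)))
est = apes2d_point(Y, 4, 4, w1, w2)
assert abs(est - amp) < 0.1
```

Note that some noise is needed in this toy test: for a single noiseless sinusoid at exactly (ω_1, ω_2), S in (3.31) is singular.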
3.3.2 Two-Dimensional GAPES
Let G be the set of sample indices (n_1, n_2) for which the data samples are available,
and U be the set of sample indices (n_1, n_2) for which the data samples are missing.
The set of available samples {y_{n_1,n_2} : (n_1, n_2) ∈ G} is denoted by the g × 1 vector
γ, whereas the set of missing samples {y_{n_1,n_2} : (n_1, n_2) ∈ U} is denoted by the
(N_1 N_2 − g) × 1 vector μ. The problem of interest is to estimate α(ω_1, ω_2) given γ.
Assume we consider a K_1 × K_2-point DFT grid: (ω_{k_1}, ω_{k_2}) = (2πk_1/K_1,
2πk_2/K_2), for k_1 = 0, …, K_1−1 and k_2 = 0, …, K_2−1 (with K_1 > N_1 and
K_2 > N_2). The 2-D GAPES algorithm tries to solve the following minimization
problem:
min_{μ, {α(ω_{k_1}, ω_{k_2}), H(ω_{k_1}, ω_{k_2})}} Σ_{k_1=0}^{K_1−1} Σ_{k_2=0}^{K_2−1}
‖ vec(H(ω_{k_1}, ω_{k_2}) ∗ Y) − α(ω_{k_1}, ω_{k_2}) a_{L_1,L_2}(ω_{k_1}, ω_{k_2}) ‖²
   s.t. vec^H(H(ω_{k_1}, ω_{k_2})) a_{M_1,M_2}(ω_{k_1}, ω_{k_2}) = 1,   (3.32)

via cyclic optimization [42].
For the initialization step, we obtain the initial APES estimates of H(ω_1, ω_2)
and α(ω_1, ω_2) from the available data in the following way. Let S be the set of
snapshot indices (l_1, l_2) such that the corresponding initial data snapshot indices
{(l_1, l_2), …, (l_1, l_2 + M_2^0 − 1), …, (l_1 + M_1^0 − 1, l_2), …, (l_1 + M_1^0 − 1, l_2 + M_2^0 − 1)} ⊂ G.
Define the set of M_1^0 M_2^0 × 1 vectors {ȳ_{l_1,l_2} : (l_1, l_2) ∈ S}, which contain only
available data samples, and let |S| be the number of vectors in S. Furthermore, define
the initial sample covariance matrix

R̂ = (1/|S|) Σ_{(l_1,l_2) ∈ S} ȳ_{l_1,l_2} ȳ^H_{l_1,l_2}.   (3.33)
The size M_1^0 × M_2^0 of the initial filter matrix must be chosen such that the R̂
calculated in (3.33) has full rank. Similarly, the initial Fourier transform of the data
snapshots is given by

g(ω_1, ω_2) = (1/|S|) Σ_{(l_1,l_2) ∈ S} ȳ_{l_1,l_2} e^{−j(ω_1 l_1 + ω_2 l_2)}.   (3.34)

So the initial estimates of H(ω_1, ω_2) and α(ω_1, ω_2) can be calculated by (3.29)–
(3.31) but using the R̂ and g(ω_1, ω_2) given above.
Next, we introduce some additional notation that will be used later for the step
of interpolating the missing samples. Let the L_1 L_2 × (L_2 N_1 − M_1 + 1) matrix T
be defined by

T = [ I_{L_1}  0_{L_1,M_1−1}
                I_{L_1}  0_{L_1,M_1−1}
                          ⋱
                                   I_{L_1} ]   (3.35)

(block diagonal, with L_2 blocks; the last block is simply I_{L_1}).
Hereafter, 0_{K_1,K_2} denotes a K_1 × K_2 matrix of zeros and I_K stands for the K × K
identity matrix. Furthermore, let G(ω_1, ω_2) be the following (L_2 N_1 − M_1 + 1) × N_1 N_2
Toeplitz matrix:
G(ω_1, ω_2) = [ h_1^H  0_{1,L_1−1}  h_2^H  0_{1,L_1−1}  ⋯  h_{M_2}^H  0  ⋯  0
                0  h_1^H  0_{1,L_1−1}  h_2^H  0_{1,L_1−1}  ⋯  h_{M_2}^H  ⋯  0
                ⋮       ⋱           ⋱           ⋱                    ⋱   ⋮
                0  ⋯  0  h_1^H  0_{1,L_1−1}  h_2^H  0_{1,L_1−1}  ⋯  h_{M_2}^H ],   (3.36)

where {h_{m_2}}_{m_2=1}^{M_2} are the corresponding columns of H(ω_1, ω_2), and each
row of G(ω_1, ω_2) is the previous row shifted one column to the right. With these
definitions, we have

vec(X) = vec(H(ω_1, ω_2) ∗ Y) = T G vec(Y).   (3.37)

By making use of (3.37), the estimate of μ based on the initial estimates
α̂(ω_1, ω_2) and Ĥ(ω_1, ω_2) is given by the solution to the following problem:
min_μ ‖ [ T G(ω_0, ω_0)                         [ α̂(ω_0, ω_0) a_{L_1,L_2}(ω_0, ω_0)
          ⋮                    ] vec(Y)  −        ⋮
          T G(ω_{K_1−1}, ω_{K_2−1}) ]             α̂(ω_{K_1−1}, ω_{K_2−1}) a_{L_1,L_2}(ω_{K_1−1}, ω_{K_2−1}) ] ‖²,
   (3.38)

where the stacking runs over all K_1 K_2 grid points (ω_{k_1}, ω_{k_2}).
To solve (3.38), let the matrices G_γ(ω_{k_1}, ω_{k_2}) and G_μ(ω_{k_1}, ω_{k_2}) be defined
implicitly by the following equality:

G(ω_{k_1}, ω_{k_2}) vec(Y) = G_γ(ω_{k_1}, ω_{k_2}) γ + G_μ(ω_{k_1}, ω_{k_2}) μ,   (3.39)

where γ and μ are the vectors containing the available samples and missing samples,
respectively. In other words, G_γ(ω_{k_1}, ω_{k_2}) and G_μ(ω_{k_1}, ω_{k_2}) contain the columns of
G(ω_{k_1}, ω_{k_2}) that correspond to the indices in G and U, respectively. By introducing
the following matrices:
G_γ ≜ [ T G_γ(ω_0, ω_0)
        ⋮
        T G_γ(ω_{K_1−1}, ω_{K_2−1}) ]   (3.40)
and

G_μ ≜ [ T G_μ(ω_0, ω_0)
        ⋮
        T G_μ(ω_{K_1−1}, ω_{K_2−1}) ],   (3.41)
the criterion (3.38) can then be written as

min_μ ‖ G_γ γ + G_μ μ − ᾱ ‖²,   (3.42)

where

ᾱ ≜ [ α̂(ω_0, ω_0) a_{L_1,L_2}(ω_0, ω_0)
      ⋮
      α̂(ω_{K_1−1}, ω_{K_2−1}) a_{L_1,L_2}(ω_{K_1−1}, ω_{K_2−1}) ].   (3.43)
The closed-form solution of the quadratic problem (3.42) is easily obtained as

μ̂ = ( G_μ^H G_μ )^{−1} G_μ^H ( ᾱ − G_γ γ ).   (3.44)
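The closed form (3.44) is just the normal-equations solution of the LS problem (3.42), as a quick NumPy check with random stand-in matrices (all dimensions arbitrary) confirms:

```python
import numpy as np

rng = np.random.default_rng(1)
rows, g_dim, m_dim = 40, 5, 3     # arbitrary stand-in dimensions
G_g = rng.standard_normal((rows, g_dim)) + 1j * rng.standard_normal((rows, g_dim))
G_m = rng.standard_normal((rows, m_dim)) + 1j * rng.standard_normal((rows, m_dim))
gamma = rng.standard_normal(g_dim) + 1j * rng.standard_normal(g_dim)
alpha_vec = rng.standard_normal(rows) + 1j * rng.standard_normal(rows)

# eq. (3.44): mu = (G_mu^H G_mu)^{-1} G_mu^H (alpha - G_gamma gamma)
rhs = alpha_vec - G_g @ gamma
mu = np.linalg.solve(G_m.conj().T @ G_m, G_m.conj().T @ rhs)

# same answer as solving the LS problem (3.42) directly
assert np.allclose(mu, np.linalg.lstsq(G_m, rhs, rcond=None)[0])
```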
A step-by-step summary of 2-D GAPES is as follows:
Step 0: Obtain an initial estimate of {α̂(ω_1, ω_2), Ĥ(ω_1, ω_2)}.
Step 1: Use the most recent estimate of {α̂(ω_1, ω_2), Ĥ(ω_1, ω_2)} in (3.32) to estimate
μ by minimizing the so-obtained cost function, whose solution is given by
(3.44).
Step 2: Use the latest estimate of μ to fill in the missing data samples and estimate
{α̂(ω_1, ω_2), Ĥ(ω_1, ω_2)}_{k_1=0,k_2=0}^{K_1−1,K_2−1} by minimizing the cost function
in (3.32) based on the interpolated data. (This step is equivalent to applying 2-D
APES to the complete data.)
Step 3: Repeat steps 1–2 until practical convergence.
3.4 NUMERICAL EXAMPLES
We now present several numerical examples to illustrate the performance of
GAPES for the spectral analysis of gapped data. We compare GAPES with the
windowed FFT (WFFT). A Taylor window with order 5 and sidelobe level −35 dB
is used for WFFT.
3.4.1 One-Dimensional Example
In this example, we consider 1-D gapped-data spectral estimation. To imple-
ment GAPES, we choose K = 2N for the iteration steps, and the final spectrum
is estimated on a finer grid with K = 32. The initial filter length is chosen as
M_0 = 20, and we use M = N/2 = 64 after the initialization step. We calculate the
corresponding WFFT spectrum via zero-padded FFT.
The true spectrum of the simulated signal is shown in Fig. 3.1(a), where we
have four spectral lines located at f_1 = 0.05 Hz, f_2 = 0.065 Hz, f_3 = 0.26 Hz, and
f_4 = 0.28 Hz with complex amplitudes α_1 = α_2 = α_3 = 1 and α_4 = 0.5. Besides
these spectral lines, Fig. 3.1(a) also shows a continuous spectral component centered
at 0.18 Hz with a width b = 0.015 Hz and a constant modulus of 0.25. The data
sequence has N = 128 samples, where the samples 23–46 and 76–100 are missing.
The data are corrupted by zero-mean circularly symmetric complex white Gaussian
noise with variance σ_n² = 0.01.
In Fig. 3.1(b) the WFFT is applied to the data by filling in the gaps with
zeros. Note that the artifacts due to the missing data are quite severe in the spectrum.
Figs. 3.1(c) and 3.1(d) show the moduli of the WFFT and APES spectra
of the complete data sequence, where the APES spectrum demonstrates superior
resolution compared to that of WFFT. Figs. 3.1(e) and 3.1(f) illustrate the moduli
of the WFFT and APES spectra of the data sequence interpolated via GAPES.
Comparing Figs. 3.1(e) and 3.1(f ) with 3.1(c) and 3.1(d), we note that GAPES
can effectively fill in the gaps and estimate the spectrum.
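For reference, the line-spectrum part of this test signal and its gap pattern can be generated as follows; the continuous component around 0.18 Hz is omitted in this sketch, and the noise realization is of course arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 128
n = np.arange(N)
freqs = [0.05, 0.065, 0.26, 0.28]          # Hz (unit sampling interval assumed)
amps = [1.0, 1.0, 1.0, 0.5]
y = sum(a * np.exp(2j * np.pi * f * n) for a, f in zip(amps, freqs))
# circularly symmetric complex white Gaussian noise with variance 0.01
y = y + np.sqrt(0.01 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

avail = np.ones(N, dtype=bool)
avail[23:47] = False                        # samples 23-46 missing
avail[76:101] = False                       # samples 76-100 missing
assert int((~avail).sum()) == 49            # 49 of 128 samples missing
```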
3.4.2 Two-Dimensional Examples
GAPES applied to simulated data with line spectrum: In this example we con-
sider a data matrix of size 32 × 50 consisting of three noisy sinusoids, with fre-
quencies (1, 0.8), (1, 1.1), and (1.1, 1.3) and amplitudes 1, 0.7, and 2, respectively,
embedded in white Gaussian noise with standard deviation 0.1. All samples in the
columns 10–20 and 30–40 are missing. The true spectrum is shown in Fig. 3.2(a)
and the missing-data pattern is shown in Fig. 3.2(b). In Fig. 3.2(c) we show the
[Figure 3.1: six panels, each plotting Modulus of Complex Amplitude versus Frequency (Hz).]
FIGURE 3.1: Modulus of the gapped-data spectral estimates [N = 128, σ_n² = 0.01, two
gaps involving 49 (40%) missing samples]. (a) True spectrum, (b) WFFT, (c) complete-data
WFFT, (d) complete-data APES, (e) WFFT with interpolated data via GAPES, and (f)
GAPES.
[Figure 3.2: six panels; the annotated peak amplitudes read (a) 1.0000, 0.7000, 2.0000;
(c) 0.98937, 0.70254, 2.0014; (d) 1.0103, 0.7024, 2.0028; (e) 0.66132, 0.77715, 1.0147;
(f) 1.0461, 0.6949, 2.1452.]
FIGURE 3.2: Modulus of the 2-D spectra. (a) True spectrum, (b) 2-D data missing
pattern, the black stripes indicate missing samples, (c) 2-D complete-data WFFT, (d)
2-D complete-data APES with a 2-D filter of size 16 × 25, (e) 2-D WFFT, and (f) 2-D
GAPES with an initial 2-D filter of size 10 × 8.
[Figure 3.3: six panels of SAR images.]
FIGURE 3.3: Modulus of the SAR images of the backhoe data obtained from a 48 × 48
data matrix with missing samples. (a) 3-D CAD model and K-space data, (b) 2-D complete-
data WFFT, (c) 2-D complete-data APES with a 2-D filter of size 24 × 36, (d) 2-D data
missing pattern, the black stripes indicate missing samples, (e) 2-D WFFT, and (f) 2-D
GAPES with an initial 2-D filter of size 20 × 9.
512 × 512 WFFT spectrum of the full data. In Fig. 3.2(d) we show the 512 × 512
APES spectrum of the full data obtained by using a 2-D filter matrix of size 16 × 25.
Fig. 3.2(e) shows the WFFT spectrum obtained by setting the missing samples to
zero. Fig. 3.2(f) shows the GAPES spectrum with an initial filter of size 10 × 8.
Comparing Fig. 3.2(f) with 3.2(d), we can see that GAPES still gives very good
spectral estimates, as if there were no missing samples.
GAPES applied to SAR data: In this example we apply the GAPES al-
gorithm to SAR data. The Backhoe Data Dome, Version 1.0, consists of
simulated wideband (7–13 GHz), full-polarization, complex backscatter data from
a backhoe vehicle in free space. The 3-D computer-aided design (CAD) model of
the backhoe vehicle is shown in Fig. 3.3(a), with a viewing direction correspond-
ing to (approximately) 45° elevation and 45° azimuth. The backscattered data have
been generated over a full upper 2π steradian viewing hemisphere, which is also
illustrated in Fig. 3.3(a). We consider a 48 × 48 HH-polarization data matrix col-
lected at 0° elevation, from approximately a 3° azimuth cut centered around 0° az-
imuth, covering approximately a 0.45 GHz bandwidth centered around 10 GHz. In
Fig. 3.3(b) we show the SAR image obtained by applying WFFT to the full data.
Fig. 3.3(c) shows the image obtained by the application of APES to the full data
with a 2-D filter of size 24 × 36. Note that the two closely located vertical lines
(corresponding to the loader bucket) are well resolved by APES because of its su-
perresolution. To simulate the gapped data, we create artificial gaps in the phase
history data matrix by removing the columns 10–17 and 30–37, as illustrated in
Fig. 3.3(d). In Fig. 3.3(e) we show the result of applying WFFT to the data where
the missing samples are set to zero. Significant artifacts due to the data gapping can
be observed. Fig. 3.3(f) shows the resulting image of GAPES after one iteration.
(Further iteration did not change the result visibly.) To perform the interpolation,
we apply 2-D GAPES with an initial filter matrix of size 20 × 9 on a 96 × 96 grid.
After the interpolation step, the spectrum of the so-obtained interpolated data ma-
trix is computed via 2-D APES with the same filter size as that used in Fig. 3.3(c).
We can see that GAPES can still resolve the two vertical spectral lines clearly.
C H A P T E R 4
Maximum Likelihood Fitting Interpretation of APES
4.1 INTRODUCTION
In this chapter, we review the APES algorithm for complete-data spectral esti-
mation following the derivations in [13], which provide a maximum likelihood
(ML) fitting interpretation of the APES estimator. They pave the ground for the
missing-data algorithms we will present in later chapters.
4.2 ML FITTING BASED SPECTRAL ESTIMATOR
Recall the problem of estimating the amplitude spectrum of a complex-valued
uniformly sampled data sequence introduced in Section 2.2. The APES algorithm
derived below estimates α(ω) from {y_n}_{n=0}^{N−1} for any given frequency ω.
Partition the data vector

y = [ y_0  y_1  ⋯  y_{N−1} ]^T   (4.1)

into L overlapping subvectors (data snapshots) of size M × 1 with the following
shifted structure:

y_l = [ y_l  y_{l+1}  ⋯  y_{l+M−1} ]^T,  l = 0, …, L−1,   (4.2)
where L = N − M + 1. Then, according to the data model in (2.1), the lth data
snapshot y_l can be written as

y_l = α(ω) a(ω) e^{jωl} + e_l(ω),   (4.3)

where a(ω) is an M × 1 vector given by (2.5) and e_l(ω) = [e_l(ω) e_{l+1}(ω) ⋯
e_{l+M−1}(ω)]^T. The APES algorithm mimics an ML approach to estimate α(ω) by
assuming that e_l(ω), l = 0, 1, …, L−1, are zero-mean circularly symmetric com-
plex Gaussian random vectors that are statistically independent of each other and
have the same unknown covariance matrix

Q(ω) = E[ e_l(ω) e_l^H(ω) ].   (4.4)

Then the covariance matrix of y_l can be written as

R = |α(ω)|² a(ω) a^H(ω) + Q(ω).   (4.5)

Since the vectors {e_l(ω)}_{l=0}^{L−1} in our case are overlapping, they are not statistically
independent of each other. Consequently, APES is not an exact ML estimator.
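The overlapping-snapshot construction in (4.1)–(4.2) can be sketched with NumPy's sliding windows (toy sizes of our own choosing); adjacent snapshots share M − 1 samples, which is exactly why the {e_l(ω)} are not independent:

```python
import numpy as np

N, M = 10, 4
L = N - M + 1
y = (np.arange(N) + 1.0).astype(complex)    # arbitrary stand-in data

# Overlapping snapshots y_l = [y_l, ..., y_{l+M-1}]^T, eq. (4.2)
snapshots = np.lib.stride_tricks.sliding_window_view(y, M)
assert snapshots.shape == (L, M)

# Adjacent snapshots share M - 1 samples, so they cannot be independent
assert np.allclose(snapshots[2][1:], snapshots[3][:-1])
```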
Using the above assumptions, we get the normalized surrogate log-likelihood
function of the data snapshots {y_l} as follows:

(1/L) ln p({y_l} | α(ω), Q(ω))
  = −M ln π − ln |Q(ω)| − (1/L) Σ_{l=0}^{L−1} [ y_l − α(ω) a(ω) e^{jωl} ]^H Q^{−1}(ω) [ y_l − α(ω) a(ω) e^{jωl} ]   (4.6)
  = −M ln π − ln |Q(ω)| − tr{ Q^{−1}(ω) (1/L) Σ_{l=0}^{L−1} [ y_l − α(ω) a(ω) e^{jωl} ][ y_l − α(ω) a(ω) e^{jωl} ]^H },   (4.7)

where tr{·} and |·| denote the trace and the determinant of a matrix, respectively.
For any given α(ω), maximizing (4.7) with respect to Q(ω) gives

Q̂(ω) = (1/L) Σ_{l=0}^{L−1} [ y_l − α(ω) a(ω) e^{jωl} ][ y_l − α(ω) a(ω) e^{jωl} ]^H.   (4.8)
Inserting (4.8) into (4.7) yields the following concentrated cost function (with
changed sign)

G = |Q̂(ω)| = | (1/L) Σ_{l=0}^{L−1} [ y_l − α(ω) a(ω) e^{jωl} ][ y_l − α(ω) a(ω) e^{jωl} ]^H |,   (4.9)

which is to be minimized with respect to α(ω). By using the notation g(ω), R̂, and
S(ω) defined in (2.6), (2.7), and (2.11), respectively, the cost function G in (4.9)
becomes

G = | R̂ + |α(ω)|² a(ω) a^H(ω) − α(ω) a(ω) g^H(ω) − α*(ω) g(ω) a^H(ω) |
  = | R̂ − g(ω) g^H(ω) + [α(ω) a(ω) − g(ω)][α(ω) a(ω) − g(ω)]^H |   (4.10)
  = |S(ω)| · | I + S^{−1}(ω) [α(ω) a(ω) − g(ω)][α(ω) a(ω) − g(ω)]^H |,   (4.11)

where S(ω) can be recognized as an estimate of Q(ω). Making use of the identity
|I + AB| = |I + BA|, we get

G = |S(ω)| { 1 + [α(ω) a(ω) − g(ω)]^H S^{−1}(ω) [α(ω) a(ω) − g(ω)] }.   (4.12)
Minimizing G with respect to α(ω) yields

α̂(ω) = a^H(ω) S^{−1}(ω) g(ω) / [ a^H(ω) S^{−1}(ω) a(ω) ].   (4.13)

Making use of the calculation in (4.10), we get the estimate of Q(ω) as

Q̂(ω) = S(ω) + [α̂(ω) a(ω) − g(ω)][α̂(ω) a(ω) − g(ω)]^H.   (4.14)

In the APES algorithm, α̂(ω) is the sought spectral estimate and Q̂(ω) is the
estimate of the nuisance matrix parameter Q(ω).
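Putting (2.6), (2.7), (2.11), and (4.13) together, a single-frequency APES estimate can be sketched as follows. This is an illustrative NumPy implementation with an arbitrary one-sinusoid test signal, not production code:

```python
import numpy as np

def apes_point(y, M, w):
    """Illustrative single-frequency APES estimate, eqs. (2.6), (2.7), (2.11), (4.13)."""
    N = len(y)
    L = N - M + 1
    snaps = np.lib.stride_tricks.sliding_window_view(y, M)    # L x M snapshots, eq. (4.2)
    phase = np.exp(-1j * w * np.arange(L))
    g = (snaps * phase[:, None]).mean(axis=0)                 # g(w), eq. (2.6)
    R = snaps.T @ snaps.conj() / L                            # R_hat, eq. (2.7)
    S = R - np.outer(g, g.conj())                             # S(w), eq. (2.11)
    a = np.exp(1j * w * np.arange(M))
    Sia = np.linalg.solve(S, a)                               # S^{-1} a (S is Hermitian)
    return (Sia.conj() @ g) / (Sia.conj() @ a)                # eq. (4.13)

rng = np.random.default_rng(3)
n = np.arange(64)
w0, amp = 0.9, 2.0 - 1.0j
y = amp * np.exp(1j * w0 * n)
y = y + 0.01 * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
assert abs(apes_point(y, M=8, w=w0) - amp) < 0.1
```

As in the 2-D case, a little noise is needed in this toy test to keep S(ω) nonsingular.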
4.3 REMARKS ON THE ML FITTING CRITERION
The phrase "ML fitting criterion" used above can be explained as follows. In some
estimation problems, using the exact ML method is computationally prohibitive or
even impossible. In such problems one can make a number of simplifying assump-
tions and derive the corresponding ML criterion. The estimates that minimize the
so-obtained surrogate ML fitting criterion are not exact ML estimates, yet usually
they have good performance and generally they are by design much simpler to com-
pute than the exact ML estimates. For example, even if the data are not Gaussian
distributed, an ML fitting criterion derived under the Gaussian hypothesis will often
lead to computationally convenient and yet accurate estimates. Another example
here is sinusoidal parameter estimation from data corrupted by colored noise: the
ML fitting criterion derived under the assumption that the noise is white leads
to parameter estimates of the sinusoidal components whose accuracy asymptoti-
cally achieves the exact Cramér–Rao bound (derived under the correct assumption
of colored noise); see [43, 44]. The APES method ([13, 15]) is another example
where a surrogate ML fitting criterion, derived under the assumption that the
data snapshots are Gaussian and independent, leads to estimates with excellent
performance. We follow the same approach in the following chapters by extending
the APES method to the missing-data case.
C H A P T E R 5
One-Dimensional Missing-Data APES via Expectation Maximization
5.1 INTRODUCTION
In Chapter 3 we presented GAPES for gapped-data spectral estimation. GAPES
iteratively interpolates the missing data and estimates the spectrum. However,
GAPES can deal only with missing data occurring in gaps, and it does not work
well for the more general problem of missing data samples occurring in arbitrary
patterns.
In this chapter, we consider the problem of nonparametric spectral estima-
tion for data sequences with missing data samples occurring in arbitrary patterns
(including the gapped-data case) [45]. We develop two missing-data amplitude
and phase estimation (MAPES) algorithms by using an ML fitting criterion as
derived in Chapter 4. Then we use the well-known expectation maximization (EM)
[42, 46] method to solve the so-obtained estimation problem iteratively. Through
numerical simulations, we demonstrate the excellent performance of the MAPES
algorithms for missing-data spectral estimation and missing-data restoration.
The remainder of this chapter is organized as follows: In Section 5.2, we give
a brief review of the EM algorithm for the missing-data problem. In Sections 5.3
and 5.4, we develop two nonparametric MAPES algorithms for the missing-data
spectral estimation problem via the EM algorithm. Some aspects of interest are
discussed in Section 5.5. In Section 5.6, we compare MAPES with GAPES for
the missing-data problem. Numerical results are provided in Section 5.7 to illustrate
the performance of the MAPES-EM algorithms.
5.2 EM FOR MISSING-DATA SPECTRAL ESTIMATION
Assume that some arbitrary samples of the uniformly sampled data sequence
{y_n}_{n=0}^{N−1} are missing. Because of these missing samples, which can be treated as
unknowns, the surrogate log-likelihood fitting criterion in (4.6) cannot be maxi-
mized directly. We show below how to tackle this general missing-data problem
through the use of the EM algorithm.
Recall that the g × 1 vector γ and the (N − g) × 1 vector μ contain all the
available samples (incomplete data) and all the missing samples, respectively, of the
N × 1 complete data vector y. Then we have the following relationships:

γ ∪ μ = {y_n}_{n=0}^{N−1}   (5.1)
γ ∩ μ = ∅,   (5.2)

where ∅ denotes the empty set. Let θ = {α(ω), Q(ω)}. An estimate of θ can
be obtained by maximizing the following surrogate ML fitting criterion involving
the available data vector γ:

θ̂ = arg max_θ ln p(γ | θ).   (5.3)
If μ were available, the above problem would be easy to solve (as shown in the
previous chapter). In the absence of μ, however, the EM algorithm maximizes the
conditional (on γ) expectation of the joint log-likelihood function of γ and μ. The
algorithm is iterative. At the ith iteration, we use θ̂^{i−1} from the previous iteration
to update the parameter estimate by maximizing the conditional expectation:

θ̂^i = arg max_θ E{ ln p(γ, μ | θ) | γ, θ̂^{i−1} }.   (5.4)
It can be shown [42, 47] that at each iteration, the increase in the surrogate log-
likelihood function is greater than or equal to the increase in the expected joint
surrogate log-likelihood in (5.4), i.e.,

ln p(γ | θ̂^i) − ln p(γ | θ̂^{i−1}) ≥ E{ ln p(γ, μ | θ̂^i) | γ, θ̂^{i−1} } − E{ ln p(γ, μ | θ̂^{i−1}) | γ, θ̂^{i−1} }.   (5.5)
Since the data snapshots {y_l} are overlapping, one missing sample may occur
in many snapshots (note that there is only one new sample between two adjacent data
snapshots). So two approaches are possible when we try to estimate the missing data:
estimate the missing data separately for each snapshot y_l by ignoring any possible
overlap, or jointly for all snapshots {y_l}_{l=0}^{L−1} by observing the overlaps. In
the following two sections, we make use of these ideas to develop two different
MAPES-EM algorithms, namely MAPES-EM1 and MAPES-EM2.
5.3 MAPES-EM1
In this section we assume that the data snapshots {y_l}_{l=0}^{L−1} are independent of each
other, and hence we estimate the missing data separately for different data snap-
shots. For each data snapshot y_l, let γ_l and μ_l denote the vectors containing the
available and missing elements of y_l, respectively. In general, the indices of the
missing components could be different for different l. Assume that γ_l has dimen-
sion g_l × 1, where 1 ≤ g_l ≤ M is the number of available elements in the snapshot
y_l. (Although g_l could be any integer in the interval 0 ≤ g_l ≤ M, we
assume for now that g_l ≠ 0. Later we will explain what happens when g_l = 0.)
Then γ_l and μ_l are related to y_l by unitary transformations as follows:

γ_l = S_g^T(l) y_l   (5.6)
μ_l = S_m^T(l) y_l,   (5.7)

where S_g(l) and S_m(l) are M × g_l and M × (M − g_l) unitary selection matrices
such that

S_g^T(l) S_g(l) = I_{g_l},   (5.8)
S_m^T(l) S_m(l) = I_{M−g_l},   (5.9)
and

S_g^T(l) S_m(l) = 0_{g_l × (M−g_l)}.   (5.10)
For example, if M = 5 and we observe the first, third, and fourth components of
y_l, then g_l = 3,

S_g(l) = [ 1 0 0
           0 0 0
           0 1 0
           0 0 1
           0 0 0 ]   (5.11)

and

S_m(l) = [ 0 0
           1 0
           0 0
           0 0
           0 1 ].   (5.12)
Because we clearly have

y_l = [ S_g(l) S_g^T(l) + S_m(l) S_m^T(l) ] y_l = S_g(l) γ_l + S_m(l) μ_l,   (5.13)

the joint normalized surrogate log-likelihood function of {γ_l, μ_l} is obtained by
substituting (5.13) into (4.7):

(1/L) ln p({γ_l, μ_l} | α(ω), Q(ω)) = −M ln π − ln |Q(ω)|
  − tr{ Q^{−1}(ω) (1/L) Σ_{l=0}^{L−1} [ S_g(l)γ_l + S_m(l)μ_l − α(ω) a(ω) e^{jωl} ][ S_g(l)γ_l + S_m(l)μ_l − α(ω) a(ω) e^{jωl} ]^H }.
   (5.14)
Owing to the Gaussian assumption on y_l, the random vectors

[ μ_l ]   [ S_m^T(l) ]
[ γ_l ] = [ S_g^T(l) ] y_l,  l = 0, …, L−1   (5.15)

are also Gaussian, with mean

[ S_m^T(l) ]
[ S_g^T(l) ] a(ω) α(ω) e^{jωl},  l = 0, …, L−1   (5.16)

and covariance matrix

[ S_m^T(l) ]
[ S_g^T(l) ] Q(ω) [ S_m(l)  S_g(l) ],  l = 0, …, L−1.   (5.17)

From the Gaussian distribution of [μ_l; γ_l], it follows that the probability density func-
tion of μ_l conditioned on γ_l (for given θ̂ = θ̂^{i−1}) is complex Gaussian with mean
b_l and covariance matrix K_l [48]:

μ_l | γ_l, θ̂^{i−1} ∼ CN(b_l, K_l),   (5.18)
where

b_l = E{ μ_l | γ_l, θ̂^{i−1} }
    = S_m^T(l) a(ω) α̂^{i−1}(ω) e^{jωl}
      + S_m^T(l) Q̂^{i−1}(ω) S_g(l) [ S_g^T(l) Q̂^{i−1}(ω) S_g(l) ]^{−1} ( γ_l − S_g^T(l) a(ω) α̂^{i−1}(ω) e^{jωl} )
   (5.19)

and

K_l = cov{ μ_l | γ_l, θ̂^{i−1} }
    = S_m^T(l) Q̂^{i−1}(ω) S_m(l) − S_m^T(l) Q̂^{i−1}(ω) S_g(l) [ S_g^T(l) Q̂^{i−1}(ω) S_g(l) ]^{−1} S_g^T(l) Q̂^{i−1}(ω) S_m(l).
   (5.20)
Expectation: We evaluate the conditional expectation of the surrogate log-
likelihood in (5.14) using (5.18)–(5.20), which is most easily done by adding and
subtracting the conditional mean b_l from μ_l in (5.14) as follows:

[ S_g(l)γ_l + S_m(l)μ_l − α(ω) a(ω) e^{jωl} ]
  = [ S_m(l)(μ_l − b_l) ] + [ S_g(l)γ_l + S_m(l)b_l − α(ω) a(ω) e^{jωl} ].   (5.21)

The cross-terms that result from the expansion of the quadratic term in (5.14)
vanish when we take the conditional expectation. Therefore the expectation step
yields

E{ (1/L) ln p({γ_l, μ_l} | α(ω), Q(ω)) | {γ_l}, α̂^{i−1}(ω), Q̂^{i−1}(ω) }
  = −M ln π − ln |Q(ω)| − tr{ Q^{−1}(ω) (1/L) Σ_{l=0}^{L−1} ( S_m(l) K_l S_m^T(l)
    + [ S_g(l)γ_l + S_m(l)b_l − α(ω) a(ω) e^{jωl} ][ S_g(l)γ_l + S_m(l)b_l − α(ω) a(ω) e^{jωl} ]^H ) }.
   (5.22)
Maximization: The maximization part of the EM algorithm produces up-
dated estimates of α(ω) and Q(ω). The normalized expected surrogate log-
likelihood (5.22) can be rewritten as

−M ln π − ln |Q(ω)| − tr{ Q^{−1}(ω) (1/L) Σ_{l=0}^{L−1} ( Φ_l + [ z_l − α(ω) a(ω) e^{jωl} ][ z_l − α(ω) a(ω) e^{jωl} ]^H ) },
   (5.23)

where we have defined

Φ_l ≜ S_m(l) K_l S_m^T(l)   (5.24)

and

z_l ≜ S_g(l) γ_l + S_m(l) b_l.   (5.25)

According to the derivation in Chapter 4, maximizing (5.23) with respect to α(ω)
and Q(ω) gives

α̂_1(ω) = a^H(ω) S^{−1}(ω) Z(ω) / [ a^H(ω) S^{−1}(ω) a(ω) ]   (5.26)
and

Q̂_1(ω) = S(ω) + [ α̂_1(ω) a(ω) − Z(ω) ][ α̂_1(ω) a(ω) − Z(ω) ]^H,   (5.27)

where

Z(ω) ≜ (1/L) Σ_{l=0}^{L−1} z_l e^{−jωl}   (5.28)

and

S(ω) ≜ (1/L) Σ_{l=0}^{L−1} Φ_l + (1/L) Σ_{l=0}^{L−1} z_l z_l^H − Z(ω) Z^H(ω).   (5.29)
This completes the derivation of the MAPES-EM1 algorithm, a step-by-step
summary of which is as follows:
Step 0: Obtain an initial estimate of {α̂(ω), Q̂(ω)}.
Step 1: Use the most recent estimate of {α̂(ω), Q̂(ω)} in (5.19) and (5.20) to
calculate b_l and K_l, respectively. Note that b_l can be regarded as the current
estimate of the corresponding missing samples.
Step 2: Update the estimate of {α̂(ω), Q̂(ω)} using (5.26) and (5.27).
Step 3: Repeat steps 1 and 2 until practical convergence.
Note that when g_l = 0, which indicates that there is no available sample
in the current data snapshot y_l, S_g(l) and γ_l do not exist and S_m(l) is an M × M
identity matrix; hence, the above algorithm can still be applied by simply removing
any term that involves S_g(l) or γ_l in the above equations.
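The conditional moments (5.19)–(5.20) are the standard Gaussian conditioning formulas. The NumPy sketch below uses the M = 5 selection matrices of (5.11)–(5.12), with the hypothetical choice Q = I (so that the missing part is independent of the available part and the answers are easy to verify by hand):

```python
import numpy as np

def cond_moments(mean, Q, Sg, Sm, gamma_l):
    """Conditional mean b_l and covariance K_l of the missing part of a
    Gaussian snapshot given its available part, eqs. (5.19)-(5.20)."""
    Qgg = Sg.T @ Q @ Sg
    b = Sm.T @ mean + Sm.T @ Q @ Sg @ np.linalg.solve(Qgg, gamma_l - Sg.T @ mean)
    K = Sm.T @ Q @ Sm - Sm.T @ Q @ Sg @ np.linalg.solve(Qgg, Sg.T @ Q @ Sm)
    return b, K

# The M = 5 example of (5.11)-(5.12): components 1, 3, 4 observed; 2, 5 missing
Sg = np.zeros((5, 3)); Sg[0, 0] = Sg[2, 1] = Sg[3, 2] = 1.0
Sm = np.zeros((5, 2)); Sm[1, 0] = Sm[4, 1] = 1.0
mean = np.zeros(5, dtype=complex)           # stand-in for alpha * a(w) * e^{jwl}
Q = np.eye(5, dtype=complex)                # with Q = I the two parts are independent
gamma_l = np.array([1.0, 2.0, 3.0], dtype=complex)

b, K = cond_moments(mean, Q, Sg, Sm, gamma_l)
assert np.allclose(b, 0) and np.allclose(K, np.eye(2))
```

With a nondiagonal Q, b would pull the missing samples toward values correlated with the observed ones, which is exactly what Step 1 of MAPES-EM1 exploits.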
5.4 MAPES-EM2
Following the observation that the same missing sample may enter many snapshots,
we propose a second method to implement the EM algorithm by estimating the
missing data simultaneously for all data snapshots.
Recall that the available and missing data vectors are denoted γ (g × 1
vector) and μ [(N − g) × 1 vector], respectively. Let ȳ denote the LM × 1 vector
obtained by concatenating all the snapshots:

ȳ ≜ [ y_0
      ⋮
      y_{L−1} ] = S_g γ + S_m μ,   (5.30)

where S_g (LM × g) and S_m (LM × (N − g)) are the corresponding selection ma-
trices for the available and missing data vectors, respectively. Because of the over-
lapping of {y_l}, S_g and S_m are not unitary, but they are still orthogonal to each
other:

S_g^T S_m = 0_{g × (N−g)}.   (5.31)

Instead of (5.6) and (5.7), we have from (5.30)

γ = ( S_g^T S_g )^{−1} S_g^T ȳ = S̃_g^T ȳ   (5.32)

and

μ = ( S_m^T S_m )^{−1} S_m^T ȳ = S̃_m^T ȳ.   (5.33)

The matrices S̃_g and S̃_m introduced above are defined as

S̃_g ≜ S_g ( S_g^T S_g )^{−1}   (5.34)

and

S̃_m ≜ S_m ( S_m^T S_m )^{−1},   (5.35)

and they are also orthogonal to each other:

S̃_g^T S̃_m = 0_{g × (N−g)}.   (5.36)

Note that S_g^T S_g and S_m^T S_m are diagonal matrices whose diagonal elements
indicate how many times the corresponding sample appears in ȳ owing to the
overlapping of {y_l}. Hence both S_g^T S_g and S_m^T S_m can be easily inverted.
Now the normalized surrogate log-likelihood function in (4.6) can be written
as

(1/L) ln p(ȳ | α(ω), Q(ω)) = −M ln π − (1/L) ln |D(ω)| − (1/L) [ ȳ − α(ω) ρ(ω) ]^H D^{−1}(ω) [ ȳ − α(ω) ρ(ω) ],
   (5.37)

where ρ(ω) and D(ω) are defined as

ρ(ω) ≜ [ e^{jω·0} a(ω)
         ⋮
         e^{jω(L−1)} a(ω) ]   (5.38)

and

D(ω) ≜ [ Q(ω)        0
              ⋱
         0        Q(ω) ].   (5.39)
Substituting (5.30) into (5.37), we obtain the joint surrogate log-likelihood of
γ and μ:

(1/L) ln p(γ, μ | α(ω), Q(ω)) = −(1/L) { LM ln π + ln |D(ω)| + [ S_g γ + S_m μ − α(ω) ρ(ω) ]^H D^{−1}(ω) [ S_g γ + S_m μ − α(ω) ρ(ω) ] } + C_J,
   (5.40)

where C_J is a constant that accounts for the Jacobian of the nonunitary transfor-
mation between ȳ and γ and μ in (5.30).
To derive the EM algorithm for the current set of assumptions, we note that
for given α̂^{i−1}(ω) and Q̂^{i−1}(ω), we have (as in (5.18)–(5.20))

μ | γ, θ̂^{i−1} ∼ CN(b, K),   (5.41)

where

b = E{ μ | γ, θ̂^{i−1} }
  = S̃_m^T ρ(ω) α̂^{i−1}(ω) + S̃_m^T D̂^{i−1}(ω) S̃_g [ S̃_g^T D̂^{i−1}(ω) S̃_g ]^{−1} ( γ − S̃_g^T ρ(ω) α̂^{i−1}(ω) )
   (5.42)
and

K = cov{ μ | γ, θ̂^{i−1} }
  = S̃_m^T D̂^{i−1}(ω) S̃_m − S̃_m^T D̂^{i−1}(ω) S̃_g [ S̃_g^T D̂^{i−1}(ω) S̃_g ]^{−1} S̃_g^T D̂^{i−1}(ω) S̃_m.   (5.43)
Expectation: Following the same steps as in (5.21) and (5.22), we obtain the
conditional expectation of the surrogate log-likelihood function in (5.40):

E{ (1/L) ln p(γ, μ | α(ω), Q(ω)) | γ, α̂^{i−1}(ω), Q̂^{i−1}(ω) }
  = −M ln π − (1/L) ln |D(ω)| − tr{ (1/L) D^{−1}(ω) ( S_m K S_m^T
    + [ S_g γ + S_m b − α(ω) ρ(ω) ][ S_g γ + S_m b − α(ω) ρ(ω) ]^H ) } + C_J.   (5.44)
Maximization: To maximize the expected surrogate log-likelihood function
in (5.44), we need to exploit the known structure of D(ω) and ρ(ω). Let

[ z_0
  ⋮
  z_{L−1} ] ≜ S_g γ + S_m b   (5.45)

denote the data snapshots made up of the available and estimated data samples,
where each z_l, l = 0, …, L−1, is an M × 1 vector. Also let Φ_0, …, Φ_{L−1} be the
M × M blocks on the block diagonal of S_m K S_m^T. Then the expected surrogate
log-likelihood function we need to maximize with respect to α(ω) and Q(ω)
becomes (to within an additive constant)

−ln |Q(ω)| − tr{ Q^{−1}(ω) (1/L) Σ_{l=0}^{L−1} ( Φ_l + [ z_l − α(ω) a(ω) e^{jωl} ][ z_l − α(ω) a(ω) e^{jωl} ]^H ) }.
   (5.46)
The solution can be readily obtained by a derivation similar to that in Section 5.3:

$$\hat\alpha_2(\omega) = \frac{\mathbf{a}^H(\omega)\hat{\mathbf{S}}^{-1}(\omega)\bar{\mathbf{Z}}(\omega)}{\mathbf{a}^H(\omega)\hat{\mathbf{S}}^{-1}(\omega)\mathbf{a}(\omega)} \quad (5.47)$$
and
$$\hat{\mathbf{Q}}_2(\omega) = \hat{\mathbf{S}}(\omega) + \big[\hat\alpha_2(\omega)\mathbf{a}(\omega) - \bar{\mathbf{Z}}(\omega)\big]\big[\hat\alpha_2(\omega)\mathbf{a}(\omega) - \bar{\mathbf{Z}}(\omega)\big]^H, \quad (5.48)$$

where $\hat{\mathbf{S}}(\omega)$ and $\bar{\mathbf{Z}}(\omega)$ are defined as

$$\hat{\mathbf{S}}(\omega) \triangleq \frac{1}{L}\sum_{l=0}^{L-1}\boldsymbol{\Phi}_l + \frac{1}{L}\sum_{l=0}^{L-1}\mathbf{z}_l\mathbf{z}_l^H - \bar{\mathbf{Z}}(\omega)\bar{\mathbf{Z}}^H(\omega) \quad (5.49)$$

and

$$\bar{\mathbf{Z}}(\omega) \triangleq \frac{1}{L}\sum_{l=0}^{L-1}\mathbf{z}_l e^{-j\omega l}. \quad (5.50)$$
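The maximization step (5.47)-(5.50) is a short linear-algebra computation. The following sketch applies it to hypothetical snapshots (the snapshot values, the frequency, and the diagonal blocks are all made up for illustration; in MAPES-EM2 they come from (5.45) and from $\mathbf{S}_m\mathbf{K}\mathbf{S}_m^T$):

```python
import numpy as np

rng = np.random.default_rng(1)
M, L = 4, 10
w = 2 * np.pi * 0.3                        # frequency of interest (example value)
a = np.exp(1j * w * np.arange(M))          # steering vector a(w)
# z[l] stands in for the snapshot z_l of available + estimated samples (5.45)
z = rng.standard_normal((L, M)) + 1j * rng.standard_normal((L, M))
# Phi[l] stands in for the l-th diagonal block of S_m K S_m^T
Phi = [0.1 * np.eye(M) for _ in range(L)]

# (5.50): Zbar(w) = (1/L) sum_l z_l e^{-jwl}
Zbar = np.mean(z * np.exp(-1j * w * np.arange(L))[:, None], axis=0)

# (5.49): S(w) = (1/L) sum_l Phi_l + (1/L) sum_l z_l z_l^H - Zbar Zbar^H
S = sum(Phi) / L + (z.T @ z.conj()) / L - np.outer(Zbar, Zbar.conj())

# (5.47): alpha(w) = a^H S^{-1} Zbar / (a^H S^{-1} a)
Si = np.linalg.inv(S)
alpha = (a.conj() @ Si @ Zbar) / (a.conj() @ Si @ a)

# (5.48): Q(w) = S(w) + [alpha a - Zbar][alpha a - Zbar]^H
r = alpha * a - Zbar
Q = S + np.outer(r, r.conj())
```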
The derivation of the MAPES-EM2 algorithm is thus complete, and a step-by-step
summary of this algorithm is as follows:
Step 0: Obtain an initial estimate of $\{\alpha(\omega), \mathbf{Q}(\omega)\}$.
Step 1: Use the most recent estimates of $\{\alpha(\omega), \mathbf{Q}(\omega)\}$ in (5.42) and (5.43) to
calculate $\mathbf{b}$ and $\mathbf{K}$. Note that $\mathbf{b}$ can be regarded as the current estimate of the
missing-sample vector.
Step 2: Update the estimates of $\{\alpha(\omega), \mathbf{Q}(\omega)\}$ using (5.47) and (5.48).
Step 3: Repeat Steps 1 and 2 until practical convergence.
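The four steps above can be sketched as a generic alternation loop. Here `e_step` and `m_step` are hypothetical callables standing in for (5.42)-(5.43) and (5.47)-(5.48) at one frequency; only the control flow is shown:

```python
def mapes_em2_loop(gamma, alpha0, Q0, e_step, m_step, eps=1e-3, max_iter=100):
    """Skeleton of Steps 0-3 of MAPES-EM2 at one frequency of interest.

    e_step(gamma, alpha, Q) -> (b, K) plays the role of (5.42)-(5.43);
    m_step(gamma, b, K) -> (alpha, Q) plays the role of (5.47)-(5.48).
    """
    alpha, Q = alpha0, Q0                       # Step 0: initial estimates
    b = None
    for _ in range(max_iter):
        b, K = e_step(gamma, alpha, Q)          # Step 1: conditional mean/cov of mu
        alpha_new, Q_new = m_step(gamma, b, K)  # Step 2: update alpha(w), Q(w)
        converged = abs(alpha_new - alpha) <= eps * max(abs(alpha), 1e-12)
        alpha, Q = alpha_new, Q_new
        if converged:                           # Step 3: repeat until convergence
            break
    return alpha, Q, b                          # b: final missing-sample estimate
```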
5.5 ASPECTS OF INTEREST
5.5.1 Some Insights into the MAPES-EM Algorithms
Comparing $\{\hat\alpha_1(\omega), \hat{\mathbf{Q}}_1(\omega)\}$ in (5.26) and (5.27) [or $\{\hat\alpha_2(\omega), \hat{\mathbf{Q}}_2(\omega)\}$ in (5.47) and
(5.48)] with $\{\hat\alpha(\omega), \hat{\mathbf{Q}}(\omega)\}$ in (4.13) and (4.14), we can see that the EM algorithms
are doing some intuitively obvious things. In particular, the estimator of $\alpha(\omega)$
estimates the missing data and then uses the estimates $\{\mathbf{b}_l\}$ (or $\mathbf{b}$) as though they were
correct. The estimator of $\mathbf{Q}(\omega)$ does the same thing, but it also adds an extra
term involving the conditional covariance $\mathbf{K}_l$ (or $\mathbf{K}$), which can be regarded as a
generalized diagonal loading operation that makes the spectral estimate robust against
estimation errors.
We stress again that the MAPES approach is based on a surrogate likelihood
function that is not the true likelihood of the data snapshots. However,
such surrogate likelihood functions (for instance, those based on false uncorrelatedness
or Gaussianity assumptions) are known to lead to satisfactory fitting criteria under
fairly reasonable conditions (see, e.g., [42, 49]). Furthermore, it can be shown that
the EM algorithm applied to such a surrogate likelihood function (which is a valid
probability distribution function) still has the key property in (5.5) of monotonically
increasing the function at each iteration.
5.5.2 MAPES-EM1 Versus MAPES-EM2
Because at each iteration and at each frequency of interest $\omega$, MAPES-EM2 estimates
the missing samples only once (for all data snapshots), it has a lower computational
complexity than MAPES-EM1, which estimates the missing samples
separately for each data snapshot.
It is also interesting to observe that MAPES-EM1 makes the assumption
that the snapshots $\{\mathbf{y}_l\}$ are independent when formulating the surrogate data
likelihood function, and it maintains this assumption when estimating the missing
data; hence it consistently ignores the overlapping. On the other hand,
MAPES-EM2 makes the same assumption when formulating the surrogate data
likelihood function, but in a somewhat inconsistent manner it observes the overlapping
when estimating the missing data. This suggests that MAPES-EM2, which
estimates fewer unknowns than MAPES-EM1, may not necessarily have a (much)
better performance, as might be expected (see the examples in Section 5.7).
5.5.3 Missing-Sample Estimation
For many applications, such as data restoration, estimating the missing samples
is needed and can be done via the MAPES-EM algorithms. For MAPES-EM2,
at each frequency of interest $\omega$, we take the conditional mean $\mathbf{b}$ as an estimate of
the missing-sample vector. The final estimate of the missing-sample vector is the
average of all $\mathbf{b}$ obtained from all frequencies of interest. For MAPES-EM1, at
each frequency of interest, there are multiple estimates (obtained from different
overlapping data snapshots) for the same missing sample. We calculate the mean
of these multiple estimates before averaging once again across all frequencies of
interest. We remark that we should not consider the $\{\mathbf{b}_l\}$ (or $\mathbf{b}$) at each frequency
as an estimate of the $\omega$-component of the missing data, because other frequency
components contribute to the residue term as well, which determines the covariance
matrix $\mathbf{Q}(\omega)$ in the APES model.
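The averaging just described is straightforward; a sketch with hypothetical estimate arrays (the values and grid sizes below are made up, standing in for the $\mathbf{b}$ and $\{\mathbf{b}_l\}$ produced at the last EM iteration) could be:

```python
import numpy as np

rng = np.random.default_rng(2)
K_grid, n_missing = 32, 3

# MAPES-EM2: b_freq[k] stands in for the missing-sample vector b
# obtained at frequency w_k. Final estimate: average over frequencies.
b_freq = rng.standard_normal((K_grid, n_missing))
mu_hat_em2 = b_freq.mean(axis=0)

# MAPES-EM1: several estimates of the same missing sample exist, one per
# overlapping snapshot (5 hypothetical copies here). Average over the
# snapshot copies first, then across frequencies.
b1_freq = rng.standard_normal((K_grid, 5, n_missing))
mu_hat_em1 = b1_freq.mean(axis=1).mean(axis=0)
```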
5.5.4 Initialization
Since in general there is no guarantee that the EM algorithm will converge to a
global maximum, the MAPES-EM algorithms may converge to a local maximum,
which depends on the initial estimate $\hat{\boldsymbol{\theta}}_0$ used. To demonstrate the robustness of our
MAPES-EM algorithms to the choice of the initial estimate, we will simply let the
initial estimate of $\alpha(\omega)$ be given by the WFFT with the missing data samples set
to zero. The initial estimate of $\mathbf{Q}(\omega)$ follows from (4.8), where again the missing
data samples are set to zero.
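A zero-filled WFFT initialization of this kind might be sketched as follows. The data, the missing pattern, and the window are all placeholders (a Hann window stands in for the Taylor window used later in Section 5.7):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 128
K = 32 * N                        # DFT grid size, as chosen in Section 5.7
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # placeholder data
missing = rng.choice(N, size=51, replace=False)             # placeholder pattern

x0 = x.copy()
x0[missing] = 0.0                 # missing samples set to zero

# Zero-padded windowed FFT, normalized so that a unit-amplitude sinusoid
# on the grid yields an amplitude estimate near 1.
win = np.hanning(N)
alpha0 = np.fft.fft(win * x0, K) / win.sum()   # initial estimate of alpha(w_k)
```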
5.5.5 Stopping Criterion
We stop the iteration of the MAPES-EM algorithms whenever the relative change
in the total power of the spectra corresponding to the current $[\hat\alpha_i(\omega_k)]$ and previous
$[\hat\alpha_{i-1}(\omega_k)]$ estimates is smaller than a preselected threshold (e.g., $\epsilon = 10^{-3}$):

$$\frac{\Big|\sum_{k=0}^{K-1}|\hat\alpha_i(\omega_k)|^2 - \sum_{k=0}^{K-1}|\hat\alpha_{i-1}(\omega_k)|^2\Big|}{\sum_{k=0}^{K-1}|\hat\alpha_{i-1}(\omega_k)|^2} \le \epsilon, \quad (5.51)$$

where we evaluate $\alpha(\omega)$ on a $K$-point DFT grid: $\omega_k = 2\pi k/K$, for $k = 0, \ldots, K-1$.
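The criterion (5.51) reduces to a one-line check; a sketch (the function name is ours):

```python
import numpy as np

def power_converged(alpha_i, alpha_prev, eps=1e-3):
    """Relative change in total spectral power across the K-point grid,
    as in (5.51). alpha_i and alpha_prev hold the current and previous
    complex amplitude estimates at the grid frequencies w_0, ..., w_{K-1}."""
    p_i = np.sum(np.abs(alpha_i) ** 2)
    p_prev = np.sum(np.abs(alpha_prev) ** 2)
    return abs(p_i - p_prev) <= eps * p_prev
```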
5.6 MAPES COMPARED WITH GAPES
As explained above, MAPES is derived from a surrogate ML formulation of the
APES algorithm; on the other hand, GAPES is derived from an LS formulation
of APES [32]. In the complete-data case, these two approaches are equivalent in
the sense that from either of them we can derive the same full-data APES spectral
estimator. So at first, it might look counterintuitive that these two algorithms
(MAPES and GAPES) will perform differently for the missing-data problem (see
the numerical results in Section 5.7). We will now give a brief discussion about this
issue.
The difference between MAPES and GAPES concerns the way they estimate
$\boldsymbol{\mu}$ when some data samples are missing. Although MAPES-EM estimates
each missing sample separately for each frequency $\omega_k$ (and for each data snapshot
$\mathbf{y}_l$ in MAPES-EM1) while GAPES estimates each missing sample by considering
all $K$ frequencies together, the real difference between them concerns the different
criteria used in (3.16) and (5.3) for the estimation of $\boldsymbol{\mu}$: GAPES estimates the
missing samples based on an LS fitting of the filtered data, $\mathbf{h}^H(\omega_k)\mathbf{y}_l$. On the other
hand, MAPES estimates the missing samples directly from $\{\mathbf{y}_l\}$ based on an
ML fitting criterion. Because the LS formulation of APES focuses on the output
of the filter $\mathbf{h}(\omega_k)$ (which is supposed to suppress any frequency components
other than $\omega_k$), the GAPES algorithm is sensitive to errors in $\mathbf{h}(\omega_k)$ when it tries to
estimate the missing data. This is why GAPES performs well in the gapped-data
case, where a good estimate of $\mathbf{h}(\omega_k)$ can be calculated during the initialization
step. However, when the missing samples occur in an arbitrary pattern, the
performance of GAPES degrades, whereas MAPES-EM does not suffer from such
a degradation.
5.7 NUMERICAL EXAMPLES
In this section we present detailed results of a few numerical examples to demonstrate
the performance of the MAPES-EM algorithms for missing-data spectral
estimation. We compare MAPES-EM with WFFT and GAPES. A Taylor
window with order 5 and a sidelobe level of $-35$ dB is used for WFFT. We choose
$K = 32N$ to have a fine grid of discrete frequencies. We calculate the corresponding
WFFT spectrum via zero-padded FFT. The so-obtained WFFT spectrum is
used as the initial spectral estimate for the MAPES-EM and GAPES algorithms.
The initial estimate of $\mathbf{Q}(\omega)$ for MAPES-EM has been discussed before, and the
initial estimate of $\mathbf{h}(\omega)$ for GAPES is calculated from (2.12), where the missing
samples are set to zero. We stop the MAPES-EM and the GAPES algorithms
using the same stopping criterion in (5.51), with $\epsilon$ selected as $10^{-3}$ and $10^{-2}$,
respectively. The reason we choose a larger $\epsilon$ for GAPES is that it converges relatively
slowly for the general missing-data problem and its spectral estimate would
not improve much if we used an $\epsilon < 10^{-2}$. All the adaptive filtering algorithms
considered (i.e., APES, GAPES, and MAPES-EM) use a filter length of $M = N/2$
to achieve high resolution.
The true spectrum of the simulated signal is shown in Fig. 5.1(a), where we
have four spectral lines located at $f_1 = 0.05$ Hz, $f_2 = 0.065$ Hz, $f_3 = 0.26$ Hz,
and $f_4 = 0.28$ Hz with complex amplitudes $\alpha_1 = \alpha_2 = \alpha_3 = 1$ and $\alpha_4 = 0.5$.
Besides these spectral lines, Fig. 5.1(a) also shows a continuous spectral component
centered at 0.18 Hz with a width $b = 0.015$ Hz and a constant modulus of 0.25.
The data sequence has $N = 128$ samples, among which 51 (40%) are missing;
the locations of the missing samples are chosen arbitrarily. The data are corrupted
by zero-mean circularly symmetric complex white Gaussian noise with variance
$\sigma_n^2 = 0.01$.

In Fig. 5.1(b), the APES algorithm is applied to the complete data and the
resulting spectrum is shown. The APES spectrum will be used later as a reference
for comparison purposes. The WFFT spectrum for the incomplete data is shown
in Fig. 5.1(c), where the artifacts due to the missing data are readily observed.
As expected, the WFFT spectrum has poor resolution and high sidelobes and it
underestimates the true spectrum. Note that the WFFT spectrum will be used as
the initial estimate for the GAPES and MAPES algorithms. Fig. 5.1(d) shows the
GAPES spectrum. GAPES also underestimates the sinusoidal components and
gives some artifacts. Apparently, owing to the poor initial estimate of $\mathbf{h}(\omega_k)$ for the
incomplete data, GAPES converges to one of the local minima of the cost function
in (3.16). Figs. 5.1(e) and 5.1(f ) show the MAPES-EM1 and MAPES-EM2
FIGURE 5.1: Modulus of the missing-data spectral estimates (modulus of complex amplitude versus frequency in Hz) [$N = 128$, $\sigma_n^2 = 0.01$, 51 (40%) missing samples]. (a) True spectrum, (b) complete-data APES, (c) WFFT, (d) GAPES with $M = 64$ and $\epsilon = 10^{-2}$, (e) MAPES-EM1 with $M = 64$ and $\epsilon = 10^{-3}$, and (f) MAPES-EM2 with $M = 64$ and $\epsilon = 10^{-3}$.
spectral estimates. Both MAPES algorithms perform quite well and their spectral
estimates are similar to the high-resolution APES spectrum in Fig. 5.1(b).
The MAPES-EM1 and MAPES-EM2 spectral estimates at different iterations
are plotted in Figs. 5.2(a) and 5.2(b), respectively. Both algorithms converge
quickly, with MAPES-EM1 converging after 10 iterations and MAPES-EM2
after only 6.
The data restoration performance of MAPES-EM is shown in Fig. 5.3.
The missing samples are estimated using the averaging approach we introduced
previously. Figs. 5.3(a) and 5.3(b) display the real and imaginary parts of the inter-
polated data, respectively, obtained via MAPES-EM1. Figs. 5.3(c) and 5.3(d) show
the corresponding results for MAPES-EM2. The locations of the missing samples
are also indicated in Fig. 5.3. The missing samples estimated via the MAPES-
EM algorithms are quite accurate. More detailed results for MAPES-EM2 are
shown in Fig. 5.4. (Those for MAPES-EM1 are similar.) For a clear visualiza-
tion, only the estimates of the first three missing samples are shown in Fig. 5.4.
The real and imaginary parts of the estimated samples as a function of frequency
are plotted in Figs. 5.4(a) and 5.4(b), respectively. All estimates are close to the
corresponding true values, which are also indicated in Fig. 5.4. It is interesting to
note that larger variations occur at frequencies where strong signal components are
present.
The results displayed so far were for one randomly picked realization of the
data. Using 100 Monte Carlo simulations (varying the realizations of the noise, the
initial phases of the different spectral components, and the missing-data patterns),
we obtain the root mean-squared errors (RMSEs) of the magnitude and phase
estimates of the four spectral lines at their true frequency locations. These RMSEs
for WFFT, GAPES, and MAPES-EM are listed in Tables 5.1 and 5.2. Based on
this limited set of Monte Carlo simulations, we can see that the two MAPES-EM
algorithms perform similarly, and that they are much more accurate than WFFT
and GAPES. A similar behavior has been observed in several other numerical
experiments.
FIGURE 5.2: Modulus of the missing-data spectral estimates obtained via the MAPES-EM algorithms at iterations $i = 0, \ldots, 5$ [$N = 128$, $\sigma_n^2 = 0.01$, 51 (40%) missing samples]. (a) MAPES-EM1 and (b) MAPES-EM2.
FIGURE 5.3: Interpolation of the missing samples (true data, interpolated data, and missing-data locations versus sample index $n$) [$N = 128$, $\sigma_n^2 = 0.01$, 51 (40%) missing samples]. (a) Real part of the data interpolated via MAPES-EM1, (b) imaginary part of the data interpolated via MAPES-EM1, (c) real part of the data interpolated via MAPES-EM2, and (d) imaginary part of the data interpolated via MAPES-EM2.
FIGURE 5.4: Missing-sample estimates obtained via MAPES-EM2 as a function of frequency (Hz) for the first three missing samples, with the corresponding true values indicated. (a) Real part and (b) imaginary part of the estimates.