
GEOPHYSICS, VOL. 65, NO. 2 (MARCH-APRIL 2000); P. 482–491, 9 FIGS.

Wavelet filtering of magnetotelluric data

Daniel O. Trad∗ and Jandyr M. Travassos‡

ABSTRACT

A method is described for filtering magnetotelluric (MT) data in the wavelet domain that requires a minimum of human intervention and leaves good data sections unchanged. Good data sections are preserved because data in the wavelet domain is analyzed through hierarchies, or scale levels, allowing separation of noise from signals. This is done without any assumption on the data distribution on the MT transfer function. Noisy portions of the data are discarded through thresholding wavelet coefficients. The procedure can recognize and filter out point defects that appear as a fraction of unusual observations of impulsive nature either in time domain or frequency domain. Two examples of real MT data are presented, with noise caused by both meteorological activity and power-line contribution. In the examples given in this paper, noise is better seen in time and frequency domains, respectively. Point defects are filtered out to eliminate their deleterious influence on the MT transfer function estimates. After the filtering stage, data is processed in the frequency domain, using a robust algorithm to yield two sets of reliable MT transfer functions.

INTRODUCTION

There are two basic methods for estimating the magnetotelluric (MT) transfer functions from field data: the single-site (SS) method and the remote-reference (RR) method. In the first method, one or two biased estimates at each frequency are obtained for each measuring direction (Sim et al., 1971). In the second method, only one unbiased estimate can be obtained at one site using simultaneous measurement of the magnetic field at a remote site (Gamble et al., 1979; Clarke et al., 1983). The latter can be used effectively only when noises are uncorrelated between local and remote sites. The remote-reference method has become standard procedure in MT data acquisition because it is the only way to remove bias. The single-site method still is used when a second instrument is not available or is faulty. Regardless of the choice between the two methods, the least-squares method is the standard procedure to produce estimates of the MT transfer function.

Manuscript received by the Editor April 1, 1997; revised manuscript received May 26, 1999.
∗Formerly CONICET-CRICYT, C.C. 131, Mendoza 5500, Argentina; presently University of British Columbia, Dept. of Earth and Ocean Sciences, 2219 Main Mall, Vancouver, British Columbia V6T 1Z4, Canada. E-mail: [email protected].
‡Formerly Lamont-Doherty Observatory, 61 Route 9W, Palisades, New York 10964; presently CNPq-Observatorio Nacional, 20921-400 Rio de Janeiro-RJ, Brazil. E-mail: [email protected], to whom all correspondence should be addressed.
© 2000 Society of Exploration Geophysicists. All rights reserved.

The MT linear relationship that follows from the usual assumption of a low-dimensionality external source is

E = ZH+N, (1)

where E is the local electric field, Z is the MT transfer function tensor, H is the magnetic field, and N is the noise. The impedance Z is estimated by either the SS or the RR method. The noise N accounts for higher-dimensionality sources and short-duration nonstationary and instrumental noise. Accordingly, N vanishes for the strict zero-wavenumber, or plane-wave, model. Once Z is estimated, the residuals are given by the difference between the observed E field and the predicted E field, where the predicted E field is obtained by substituting the estimated value of Z in relation (1).
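As an illustration of relation (1) and of the residuals just defined, the sketch below fits a single scalar impedance by ordinary least squares and forms the residuals. It is a minimal example under simplifying assumptions that are not in the text: Z is treated as one complex scalar rather than a tensor, the data are synthetic and noise free, and the function names `ls_impedance` and `residuals` are hypothetical.

```python
def ls_impedance(E, H):
    """Scalar least-squares estimate of Z in E = Z*H + N (complex samples)."""
    num = sum(e * h.conjugate() for e, h in zip(E, H))
    den = sum(abs(h) ** 2 for h in H)
    return num / den


def residuals(E, H, Z):
    """Observed E minus the E predicted from H with the estimated Z."""
    return [e - Z * h for e, h in zip(E, H)]


# Synthetic, noise-free example (values chosen for illustration only).
H = [1 + 1j, 2 - 1j, -1 + 0.5j, 0.5 + 2j]
Z_true = 3 - 2j
E = [Z_true * h for h in H]

Z_hat = ls_impedance(E, H)
r = residuals(E, H, Z_hat)
```

With noise-free input the estimate reproduces the impedance used to build the data and the residuals vanish to rounding error; outliers in E or leverage points in H would instead pull the estimate away from the true value.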

The use of classical spectral analysis together with least-squares regression is warranted if data follow a stationary and Gaussian model. In this case, it is also common to assume further that residuals follow a multivariate normal-probability distribution. If this is satisfied, one obtains maximum-likelihood estimates with minimum variance.

In practice, most data display gross departures, or outliers, from such a simple regression model. The main causes are geomagnetic phenomena, thunderstorms, anthropogenic contribution, and instrumental problems. Those contributions have high dimensionality, producing heavily biased and wildly oscillatory transfer functions and poor variance estimates. Outliers usually appear as a fraction of the useful observations, with distinct characteristics from the rest of the sample. The phenomenon is independent of the structure of the bulk of geomagnetic field observations. Those unusual observations are termed outliers or leverage points, depending on whether they occur in the dependent variable (E) or in the independent variable (H) in relation (1) (Rousseeuw and Leroy, 1987).

Two forms of outliers and leverage points are common: point defects and high-dimension nonstationarity (Chave et al., 1987). Point defects are completely independent of the process under study. They often are associated with lightning nearby. In that case, they appear as spike noise, usually made of several point defects. This kind of noise has an impulsive nature, being well localized in time. Local nonstationarity results from finite duration and nonplanar excursions of the natural geomagnetic field caused by intense disturbances such as magnetic storms. In that case, the stationarity of the normal field is interrupted by the onset of a phenomenon displaying a distinct structure.

Point defects and local nonstationarities usually share a common characteristic of being identified and removed more easily in the time domain because they are well localized in time. Other noise sources are seen better in the frequency domain, such as spectral lines in the residuals caused by anthropogenic contribution. This is a case of highly correlated noise signals. Like point defects, spectral lines also must have their influence on the transfer function estimates removed.

Because impulsive noise components in MT field measurements frequently are well localized in time, it has been suggested that time-domain estimation of MT impulse functions might be more efficient than frequency-domain estimation (Yee et al., 1988). Several papers are available in the literature on time-domain estimation of MT impulse functions (Kunetz, 1972; McMechan and Barrodale, 1985; Yee et al., 1988; Spagnolini, 1994). Time-domain estimation also may be attractive in some cases because it requires only short-time stationarity (Spagnolini, 1994). Notwithstanding these advantages, most MT experts favor estimating the impedance in the frequency domain because it is much simpler.

Frequency-domain robust methods are capable of yielding reliable impedance estimates in the presence of a moderate number of violations of the assumptions of Gaussian distribution and stationarity. As in the classical approach, robust methods are nonparametric methods based on the Fourier transform. Usually these methods are based on an iterative reweighted least-squares scheme (Egbert and Booker, 1986; Chave et al., 1987; Chave and Thomson, 1989; Larsen, 1989; Sutarno and Vozoff, 1991; Egbert et al., 1992). In the robust approach, one attempts to downweight noisy data sections adaptively. Generally, this procedure may be of limited effectiveness because one whole section is downweighted even if it has only a few large outliers. This can be a problem if the majority of sections are contaminated by a few outliers each.

Robust methods applied to MT data processing were improved with the introduction of techniques for identifying and removing leverage points with a jackknifed estimate of error (Chave and Thomson, 1989; Chave and Thomson, 1992; Larsen et al., 1996). In particular, one can mention the use of the hat matrix to control the bias caused by noise in the H field (Chave and Thomson, 1992) and the use of a smoothed transfer function to identify and substitute bad data (Larsen et al., 1996). These two procedures are very efficient in determining and controlling the disastrous effect of outliers and leverage points.

The identification and substitution of outliers and leverage points in the MT time series should be considered as one extremely important stage in the quest for robust estimation of the MT transfer functions and realistic associated errors. Obviously, that is far from being an easy task. For instance, just the identification of leverage points can be very tricky because outliers in the residuals may not correspond to magnetic outliers caused by distortion of the hat matrix (Chave and Thomson, 1989). One way to identify and substitute outliers and leverage points is to work back and forth between time and frequency domains. In that way, one can take advantage of the particularities in the representation of data and noise in both domains. Notwithstanding, it is usually necessary to make some assumption on the transfer and impulse functions. Egbert et al. (1992) have substituted outliers and gaps in a long time series with a predicted E field obtained from the H field, using an impulse response function. In another work, Larsen et al. (1996) identify and substitute outliers, leverage points, and gaps using a frequency-domain smooth transfer function. Those two lines of approach in identifying and substituting bad data sections are equivalent because there is always a well-defined transfer function that produces the same results as its time-domain counterpart, i.e., the impulse function (Egbert, 1992).

This paper does not deal with robust estimation of the MT transfer functions. Instead, it follows another path in using the wavelet transform to identify and filter out point defects in the data before applying a robust code. Once the data are filtered, the estimation of the MT transfer functions is performed as usual in the frequency domain. We restrict ourselves to noise sources that are of impulsive nature either in time or frequency domain, where wavelet analysis proved to be very useful (Trad and Travassos, 1996).

The representation of the data in the wavelet domain is another way to look at the data. The idea behind transforming data into the wavelet domain is to achieve a better localization of discrete events. It is easy to understand the advantage of transforming the data into the wavelet domain. Assume that a data section is contaminated with outliers and leverage points of impulsive nature. Those point defects usually are localized in the time domain but spread out in the frequency domain. They do not have a compact representation in the frequency domain because sines and cosines, the basis functions of the Fourier transform, are nonlocal, stretching out to infinity in time domain. Conversely, the signal in the wavelet domain has basis functions localized in both time and frequency. We will take advantage of this duality when dealing with well-defined anthropogenic spectral lines, regarding them as point defects in the frequency domain. In both cases, the representation of impulsive defects is natural in the wavelet domain. The wavelet coefficients are influenced by local events that then can be identified and filtered easily.
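The localization argument can be checked numerically on a toy case. The sketch below counts how many coefficients a single unit spike occupies in a discrete Fourier transform and in a wavelet transform. It is only an illustration: for brevity it uses the Haar wavelet rather than the Daubechies wavelet adopted in this paper, and the series length, spike position, and function names are arbitrary choices.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]


def haar_dwt(x):
    """Full orthonormal Haar pyramid; len(x) must be a power of two."""
    s = list(x)
    details = []
    while len(s) > 1:
        half = len(s) // 2
        smooth = [(s[2 * i] + s[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(s[2 * i] - s[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = detail + details   # coarser levels go in front
        s = smooth
    return s + details


n = 64
spike = [0.0] * n
spike[37] = 1.0                      # one point defect in the time domain

n_fourier = sum(1 for c in dft(spike) if abs(c) > 1e-9)
n_haar = sum(1 for c in haar_dwt(spike) if abs(c) > 1e-9)
```

The spike excites all 64 Fourier coefficients but only 7 Haar coefficients (one per scale level plus the final smooth coefficient), which is why it can be removed in the wavelet domain while touching very little of the remaining data.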

A major benefit of the wavelet representation is the separation of the signal in hierarchies, or scale levels. These scale levels assist in deciding on the data portion that will undergo filtering or downweighting by reversing the phase-plane threshold partitioning technique (Kumar and Foufoula-Georgiou, 1997) for signal reconstruction. Assume a white-noise behavior for the MT signals. In this case, the MT signal smears out in the wavelet domain at all scales with low-amplitude wavelet coefficients. At any given scale, we recognize the wavelet coefficients with amplitudes above a certain threshold. Now it is possible to extract noise by downweighting only large-amplitude coefficients resulting from impulsive noise or spectral lines, leaving good data sections unchanged. After filtering, data sections are then ready to be processed further by any robust code.

A few other papers use wavelets to extract useful signal from a predominantly white-noise time series. Examples of extracting nonstationary signal from thermistance measurements (Moreau et al., 1996) and the detection of the so-called jerks (Alexandrescu et al., 1995) are found in the literature. Wavelets have been used for local filtering of potential-field data (Hornby et al., 1999; Fedi and Quarta, 1998). In an approach opposite to that in this paper, wavelets have been used to extract useful lightning far-field signals from a wind-contaminated time series (Yuanchou and Paulson, 1997). This last work obtains the wavelet transform by multiplying the Fourier transform of the Morlet wavelet (Daubechies, 1992) with the frequency-domain representation of the electromagnetic field. A good review of the use of wavelets for geophysical applications can be found in the literature (Kumar and Foufoula-Georgiou, 1997).

As a final note, we point out that the estimation of the MT transfer functions could have been done in the wavelet domain because Maxwell equations have a natural representation in that domain. However, such an approach is beyond the scope of this paper.

DISCRETE WAVELET FILTERING

Wavelets are used to represent a time series in the same way we use trigonometric functions in Fourier analysis. One important difference is that in wavelet analysis, the scale in which we look at the data plays a crucial role; wavelets process data at different scales, or resolutions. Broadly speaking, it is as if one looked at the data through different-sized windows. If one looks at the data through a large window, one can see gross data features. On the other hand, through a small window, one can see smaller data features. Multiresolution analysis, or the study of signals and processes at distinct resolutions, is done using this framework. Note that Fourier analysis uses only a single spectral window.

Because of the different scales in which the original data can be represented in terms of a wavelet expansion, it is possible to have a detailed look at discontinuities or point defects. Use short-basis functions, i.e., a contracted, or high-frequency, form of the prototype wavelet, to look at the discontinuities. Conversely, use longer-basis functions, i.e., a dilated, or low-frequency, form of the prototype wavelet, to perform frequency analysis. Moreover, because the original signal can be represented in terms of a wavelet expansion, data operations such as filtering can be performed on the wavelets' coefficients.

In wavelet analysis, one adopts a wavelet prototype function called the analyzing, or mother, wavelet. This work uses the Daubechies wavelet (Daubechies, 1992) as the analyzing wavelet. This is an orthogonal, fractal wavelet with a compact representation. The class of orthogonal wavelets is used widely in multiresolution analysis. As with any particular set of wavelets, the Daubechies is specified by a set of coefficients. In particular, we have generated this set of wavelets with 4 to 20 coefficients. The best results in this work were obtained with the Daubechies wavelet with four coefficients, as shown in Figure 1.

Let $\psi(t)$ be the mother wavelet defined in the space of square-integrable functions over the real numbers, $L^2(\mathbb{R})$. Through dilations and translations of $\psi(t)$, one constructs an orthogonal basis (Vetterli and Kovacevic, 1995; Strang, 1989)

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right), \qquad (2)$$

where $a \neq 0$, $b \in \mathbb{R}$. The scale factor $a$ gives dilations and compressions, and $b$ is a translation in time. Varying $a$ gives rise to a spectrum. The translation factor $b$ represents the sliding of $\psi(t)$ over a given time series $F(t)$. The continuous wavelet transform of a time series $F(t) \in L^2(\mathbb{R})$ is defined as

$$f(a, b) = \langle \psi_{a,b}(t), F(t) \rangle = \int_{-\infty}^{\infty} \psi_{a,b}(t)\,F(t)\,dt. \qquad (3)$$

The time series $F(t)$ can be recovered fully from the transform (3) by the reconstruction relation, also called resolution of the identity (Vetterli and Kovacevic, 1995). That relation states that any time series $F(t)$ can be written as a superposition of shifted and dilated wavelets.

The electromagnetic field was transformed into the wavelet domain through a discrete wavelet transform (DWT) algorithm (Press et al., 1992). The DWT works with two classes of filters called quadrature mirror filters. The first class is a smoothing filter, and the second class is a nonsmoothing filter. Each output value of the nonsmoothing filter at any level of filtering is accumulated as a wavelet coefficient of the original data vector belonging to a particular scale. The smoothing-filter output is used again in the next level of filtering. Therefore, the DWT consists of applying two filters, followed by downsampling to avoid aliasing. The output of the DWT consists of those detail components that were accumulated along the filtering steps. The end product will always be a vector with two smooth coefficients and a hierarchy of nonsmoothed coefficients. The final output values of the nonsmoothing filter are the mother-wavelet coefficients. To obtain the original data from the DWT coefficients, one simply reverses this procedure.
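The pyramid just described can be sketched in a few lines. The code below is an illustrative reimplementation, not the routine actually used in this work: it assumes the four-coefficient Daubechies filter with periodic wrap-around at the ends of the series, and the function names (`d4_step`, `d4_unstep`, `dwt`, `idwt`) are hypothetical. The output vector holds two smooth coefficients followed by the nonsmoothed coefficients ordered from the coarsest to the finest level.

```python
import math

SQ3, SQ2 = math.sqrt(3.0), math.sqrt(2.0)
# Four-coefficient Daubechies smoothing filter.
C = [(1 + SQ3) / (4 * SQ2), (3 + SQ3) / (4 * SQ2),
     (3 - SQ3) / (4 * SQ2), (1 - SQ3) / (4 * SQ2)]


def d4_step(s):
    """One level: smoothing and nonsmoothing filters, downsampled by 2."""
    n = len(s)
    smooth, detail = [], []
    for j in range(0, n, 2):
        x = [s[(j + k) % n] for k in range(4)]        # periodic wrap-around
        smooth.append(C[0] * x[0] + C[1] * x[1] + C[2] * x[2] + C[3] * x[3])
        detail.append(C[3] * x[0] - C[2] * x[1] + C[1] * x[2] - C[0] * x[3])
    return smooth, detail


def d4_unstep(smooth, detail):
    """Inverse of d4_step (transpose of the orthogonal filter matrix)."""
    n = 2 * len(smooth)
    s = [0.0] * n
    for i, (sm, de) in enumerate(zip(smooth, detail)):
        j = 2 * i
        s[j] += C[0] * sm + C[3] * de
        s[(j + 1) % n] += C[1] * sm - C[2] * de
        s[(j + 2) % n] += C[2] * sm + C[1] * de
        s[(j + 3) % n] += C[3] * sm - C[0] * de
    return s


def dwt(x):
    """Forward pyramid; len(x) must be a power of two, at least 4."""
    w = list(x)
    n = len(w)
    while n >= 4:
        smooth, detail = d4_step(w[:n])
        w[:n] = smooth + detail      # details stay, smooth part is refiltered
        n //= 2
    return w


def idwt(w):
    """Reverse the pyramid, from the coarsest level to the finest."""
    x = list(w)
    n = 4
    while n <= len(x):
        x[:n] = d4_unstep(x[:n // 2], x[n // 2:n])
        n *= 2
    return x


x = [math.sin(0.3 * k) + (5.0 if k == 9 else 0.0) for k in range(16)]
w = dwt(x)
x_back = idwt(w)
```

Because the transform is orthogonal, the round trip reproduces the input to rounding error and the sum of squares is the same in both domains, so modifying a few coefficients changes only the corresponding part of the signal energy.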

Assume a time series $F(t)$ of any component of the geomagnetic field, i.e., $E(t)$ or $H(t)$ from relation (1), with $N = 2^{J+1}$ data points. Assume its DWT, i.e., $e(a, b) = \langle \psi_{a,b}(t), E(t) \rangle^{\partial}$ or $h(a, b) = \langle \psi_{a,b}(t), H(t) \rangle^{\partial}$, where the superscript $\partial$ indicates the digital counterpart of the continuous wavelet transform $f(a, b)$ in relation (3). For the sake of simplicity, $n_a$ will be used for the number of coefficients at the resolution level $a$. The highest-resolution level of $f(a, b)$, namely level $a = J$, has $n_a = 2^J$ wavelet coefficients. The level $a = 4$ is the lowest level considered in this work and has $n_a = 2^4$ coefficients.

FIG. 1. Four Daubechies wavelets at several scales: 4, 5, 6, and 7. These are asymmetrical, orthogonal fractal wavelets with four coefficients, shown here at different levels of detail. The wavelets are shown here as a time series with a sampling interval of 2 s. The amplitude scale is arbitrary.

In the remainder of this section, we apply wavelet-domain filtering to MT data either in time or in frequency domain. From this point onward, it is assumed that all the data already have been transformed into the wavelet domain.

To apply the filter to the data set, we follow a procedure similar to the application of weights in reweighted least squares, which is found easily in the MT literature (Chave et al., 1987). The main difference here is that we work with wavelet coefficients instead of residuals. Begin by estimating a scale for the wavelet coefficients. This scale will be needed for thresholding in the filtering process. Assume a zero mean Gaussian noise with a small fraction of outliers. The variance of the data is estimated by assuming that most of the wavelet coefficients f(a, b) represent useful signal, and then the median absolute deviation (MAD) does reflect the size of the original data.

The following scale is used in this work (Chave et al., 1987; Rousseeuw and Leroy, 1987)

$$s(a; i) = \frac{\mathrm{median}\{|f_a(b) - \bar{f}_a|\}}{\sigma_i}, \qquad (4)$$

where $f_a(b)$ is the $b$-shifted wavelet coefficient at a fixed scale $a$, and $\bar{f}_a = \mathrm{median}\{f_a(b)\}$, for all possible shifts $b$. It is clear from relation (4) that the scale depends on the resolution level $a$. Each $s(a; i)$ is a robust estimate for the standard deviation. The symbol $\sigma_i$ stands for the theoretical MAD for the considered standard-probability distribution. Two values for $\sigma_i$ are considered in this paper: $\sigma_1 = 0.6745$ for a Gaussian distribution, and $\sigma_2 = 0.44845$ for a Rayleigh distribution.
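Relation (4) translates directly into code. The sketch below is a minimal version using the Python standard library; the function name `mad_scale`, the dictionary `SIGMA`, and the sample values are illustrative assumptions, while the two constants are those quoted above.

```python
import statistics

# Theoretical MAD of the reference distributions, as quoted for relation (4).
SIGMA = {1: 0.6745, 2: 0.44845}   # 1: Gaussian, 2: Rayleigh


def mad_scale(coeffs, i=1):
    """Robust scale s(a; i) of the wavelet coefficients at one level a."""
    med = statistics.median(coeffs)
    mad = statistics.median([abs(c - med) for c in coeffs])
    return mad / SIGMA[i]


# One level of coefficients with a single large, spike-like value.
level = [-2.0, -1.0, 0.0, 1.0, 2.0, 100.0]
s_robust = mad_scale(level)            # 1.5 / 0.6745
s_classic = statistics.pstdev(level)   # ordinary standard deviation
```

The single large value inflates the ordinary standard deviation to about 37, whereas the MAD-based scale stays near 2.2, the spread of the bulk of the coefficients. This is what allows the scale to be estimated from data that still contain the spikes to be removed.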

The next step is to design a threshold level to identify anomalous data. The thresholding scheme is based on the assumption that noise in the wavelet transform is also Gaussian and approximately stationary at each resolution level a. This follows from the fact that the wavelet basis is orthogonal. In the wavelet literature, a threshold usually is set to discard values below a certain level. This work follows an inverse route because its objective is to eliminate energetic short-period events such as spikes. Assuming that the MT useful signal behaves like white noise in time-domain, then it so remains in the wavelet domain and with the same relative amplitude to isolated events. The energy of an impulsive noise such as a spike is compressed into a very small number of large-amplitude wavelet coefficients. Those coefficients stick up well above normal signal amplitude levels where the useful signal smears at all scales because of its white-noise characteristics. We then set a threshold to identify and eventually weight down coefficients that are above it.

Obviously, that threshold will depend on the resolution level $a$ in the same way the scale does. Remember that $n_a$ is the number of wavelet coefficients at a particular resolution level $a$. With the scale $s(a; i)$, relation (4), it is possible to construct a threshold for the wavelet coefficients. We consider two thresholds here, one for the Gaussian and another for the Rayleigh distribution. The threshold for a Gaussian distribution is (Donoho, 1993)

$$\tau_1 = C_a\, s(a; 1)\, \sqrt{2 \log n_a}, \qquad (5)$$

whereas for a Rayleigh distribution, it is (Chave et al., 1987)

$$\tau_2 = C_a\, s(a; 2)\, \sqrt{2 \log 2 n_a}, \qquad (6)$$

where $C_a$ is an empirical factor to account for the need to increase the threshold $\tau_i$ at lower-resolution levels. At those levels, the low-frequency component of the data is represented with fewer coefficients. Therefore, throwing out useful signal is more likely to occur than at the higher-resolution levels, where there are more coefficients. Note an inverse analogy to the cascade decimation estimation of Fourier coefficients (Wight and Bostick, 1980). Only a few periods of signal may occur in the entire time series at the highest levels of decimation.

In this work, we found that

$$C_a = \frac{\log N}{\log n_a} \qquad (7)$$

gives good results. Assuming that the useful signal looks like white noise, its DWT coefficients will have a variance that grows roughly as $2^a$. The factor $C_a$ prevents the heavy damping of the square root in the thresholds $\tau_i$ [relations (5) and (6)].
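Relations (5) through (7) can be combined into a small helper. The sketch below is illustrative only: the function names are hypothetical, and natural logarithms are assumed inside the square root (the ratio in relation (7) does not depend on the base). The example uses N = 4096 points, for which the level with 2048 coefficients is the highest-resolution one.

```python
import math


def c_factor(N, n_a):
    """Empirical factor C_a of relation (7)."""
    return math.log(N) / math.log(n_a)


def threshold(scale, N, n_a, i=1):
    """Threshold of relation (5) (i=1, Gaussian) or (6) (i=2, Rayleigh)."""
    arg = n_a if i == 1 else 2 * n_a
    return c_factor(N, n_a) * scale * math.sqrt(2.0 * math.log(arg))


N = 4096
tau_low = threshold(1.0, N, 16)      # lowest level used, n_a = 2**4
tau_high = threshold(1.0, N, 2048)   # highest-resolution level, n_a = 2**11
```

For unit scale the threshold is about 7.1 at the level with 16 coefficients and about 4.3 at the level with 2048 coefficients. Without the factor C_a the ordering would be reversed (about 2.4 against 3.9), so C_a is what raises the threshold at the lower-resolution levels.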

Any coefficient that sticks above the chosen threshold $\tau_i$ [relation (5) or (6)] is weighted down to produce the coefficients $f(a, b)$ that will be transformed back to the time or frequency domain for the estimation of the impedance. The choice here is to use a severe data-adaptive robust filter that falls off quickly. We tried Huber and Thomson weights (Chave et al., 1987) in our data set. The best results were obtained with Thomson weights. The Thomson weight function is given by

$$w_a(b; i) = \exp\left\{-\exp\left[\tau_i\left(\left|\frac{f_a(b)}{s(a; i)}\right| - \tau_i\right)\right]\right\}, \qquad (8)$$

where the index $i$ has the same meaning as in relations (4), (5), and (6), i.e., $i = 1, 2$ for the Gaussian and Rayleigh distributions, respectively. Relation (8) is smooth and falls off very quickly as soon as coefficients become larger than the thresholds in relation (5) or (6). The filtering algorithm computes weights from relation (8) for each scale, or resolution level $a$, independently.
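The behavior of relation (8) is easy to inspect numerically. In the sketch below the threshold enters as a dimensionless number `beta`, i.e., the threshold expressed in units of the scale s(a; i); this is an interpretation adopted here so that it can be compared directly with the normalized coefficient, and it is an assumption rather than something stated in the text. The function name and the cap on the inner exponent (added only to avoid floating-point overflow) are also illustrative.

```python
import math


def thomson_weight(coeff, scale, beta):
    """Weight of relation (8) for one wavelet coefficient.

    beta is the threshold in units of the scale (assumed interpretation).
    """
    x = abs(coeff) / scale
    inner = min(beta * (x - beta), 700.0)   # cap avoids overflow in exp
    return math.exp(-math.exp(inner))


beta = 4.0
w_small = thomson_weight(0.0, 1.0, beta)    # well below the threshold
w_at = thomson_weight(4.0, 1.0, beta)       # exactly at the threshold
w_large = thomson_weight(6.0, 1.0, beta)    # above the threshold
```

Coefficients well below the threshold keep a weight very close to 1, a coefficient exactly at the threshold gets exp(-1), about 0.37, and anything moderately above it is driven to zero. A filtered coefficient is then the product of the weight and the original coefficient.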

EXAMPLES

The filtering procedure presented in this paper was developed to filter a data set obtained in northern Argentina. Two sites from that field campaign are shown here as examples to illustrate the application of wavelet filtering to real MT data. Data were collected with a broadband MT system manufactured by EMI Inc.

No remote reference was available for that data set. Note that the unavailability of a remote reference is not a limitation in the examples shown here, and that the application of the wavelet filtering to a remote-reference data set is straightforward. The filtering procedure is applied to the data prior to the estimation of the MT response functions. Because of lack of a remote reference, one may regard the single-site case as the most unfavorable. It is important to remember that a single-site data set can yield unbiased transfer functions only if the electric noises are uncorrelated with the magnetic ones and the magnetic measurements are noise free. This is a very strict requirement seldom found in practice.

The two examples below display large wavenumber events associated with thunderstorms and strongly correlated cultural noise. The first example is a time series with a large spike, and the second is contaminated heavily with 50-Hz anthropogenic noise. All components are affected in both data sets. Both are common examples in MT exploration and may provide instructive results to the reader.

In the first example, a time series is transformed to the wavelet domain, filtered, transformed back to time domain, and then processed in the frequency domain to produce the MT transfer functions. Each time series has to be filtered independently because the filter will depend on the spectral energy content of each channel. Moreover, the wavelet transformation is applied to the raw time series before applying the system response. The application of the system response alters the frequency content of the original recorded series, overemphasizing the lower frequencies relative to the higher ones. The system responses act as a nonsymmetrical band-pass filter. This artificially influences the determination of the wavelet filter, yielding unreliable impedance estimates. For impedance-estimation purposes, it is mandatory to apply the system responses on the raw series, but for the wavelet filtering, it is important to work with a whiter spectrum. The reader should be aware that prewhitening of time series may be necessary prior to the application of the wavelet filtering to other data sets. The differences in instrument responses are removed later in the filtered raw-time series.

In the second example, the time series is transformed to the frequency domain in the usual way, using FFT. The Fourier coefficients then are transformed to the wavelet domain, filtered, transformed back to the frequency domain, and then processed to produce the MT transfer functions. Here the real and imaginary parts of the spectral data at each channel are transformed to the wavelet domain independently. This ensures that the phase is not distorted. As before, the system response is applied after the filtering stage.

In both examples below, the transfer functions are presented in their usual form of apparent resistivity and phase. To estimate the transfer functions, we have used a robust reweighted least-squares (RWLS) algorithm (Chave et al., 1987; Chave and Thomson, 1989) with a few modifications of our own. In this algorithm, an initial least-squares solution is used to produce a set of residuals that is compared against a distribution model. Residuals larger than the theoretically expected values then are used to produce weights that reduce the influence of their parent data sections. This is repeated until the sum of squares of the residuals does not change significantly. Residuals are assumed to be Gaussian distributed. Transfer functions are estimated and their associated errors are given by the jackknife, a nonparametric error estimator (Chave and Thomson, 1989).
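The reweighting loop can be illustrated with a deliberately simplified sketch. It is not the RWLS code used in this work: it fits one real scalar transfer function, weights individual samples rather than whole data sections, uses Huber weights with a MAD-based residual scale, and runs a fixed number of iterations instead of monitoring the residual sum of squares. The function name `rwls`, the tuning constant `k`, and the synthetic data are assumptions made for the example.

```python
import statistics


def rwls(E, H, n_iter=20, k=1.5):
    """Iteratively reweighted least squares for E = z*H + noise (real scalars)."""
    w = [1.0] * len(E)
    z = 0.0
    for _ in range(n_iter):
        # Weighted least-squares solution with the current weights.
        z = (sum(wi * e * h for wi, e, h in zip(w, E, H))
             / sum(wi * h * h for wi, h in zip(w, H)))
        r = [e - z * h for e, h in zip(E, H)]
        # Robust residual scale from the median absolute deviation.
        med = statistics.median(r)
        s = statistics.median([abs(ri - med) for ri in r]) / 0.6745
        if s == 0.0:
            break
        # Huber weights: 1 inside k*s, decaying as 1/|r| outside.
        w = [1.0 if abs(ri) <= k * s else k * s / abs(ri) for ri in r]
    return z, w


H = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.1, -0.1, 0.05, 0.0, -0.05, 0.1, -0.1, 0.05]
E = [2.0 * h + n for h, n in zip(H, noise)]
E[3] += 50.0                                   # one gross outlier in E

z_ls = sum(e * h for e, h in zip(E, H)) / sum(h * h for h in H)
z_rob, weights = rwls(E, H)
```

In this example the ordinary least-squares estimate is pulled to roughly 3 by the single outlier, while the reweighted estimate returns close to the value of 2 used to build the data, with the outlier receiving a weight near zero.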

The adopted RWLS algorithm has important limitations, however. First, it considers outliers in the dependent variable only; it does not consider leverage points. Second, because it weights entire data sections, it may produce biased results. It is well known that trying to reduce the effect of outliers through weighting data sections may downweight data with good signal-to-noise ratio, resulting in increasing bias (Chave and Thomson, 1989; Larsen, 1989; Larsen et al., 1996; Egbert and Livelybrooks, 1996). Third, it cannot deal with strongly correlated noise signals in most data sections.

The choice of this somewhat old-fashioned robust algorithm is justified on the basis of conspicuously displaying the effect of our wavelet-filtering scheme on the MT transfer functions. The RWLS algorithm does not interfere too much with the data set, but it weights only entire data sections. This leaves enough room for displaying data improvement that the wavelet-filtering procedure can provide. In this way, we can single out the efficiency of the proposed filtering scheme with results that were not worked with other powerful processing techniques.

The reader should be aware that more recent codes are much more efficient in dealing with the point defects shown here than the RWLS algorithm used in this paper. Those codes do a good job of downweighting bad data, leaving good data unchanged, as well as dealing with leverage points. A serious limitation still may be met, however, when there is strong and continuous correlated noise (Larsen et al., 1996). We mention two recent codes: Chave's code uses the hat matrix to control the bias resulting from noise in the H field (Chave and Thomson, 1992); and Larsen's uses a smoothed transfer function to identify and substitute bad data points (Larsen et al., 1996).

The Ramblon data

This data set was obtained in February 1995, in the Andes, near the city of Mendoza, Argentina. Measurements were taken north-south, namely the x-direction, and east-west, namely the y-direction, both magnetic. Data acquisition began in fine weather and ended up under a severe storm. That storm produced several spikes, in particular a very large one in the last window when strong electrical activity began and data collection was halted. All channels were affected heavily by meteorological activity.

The filtering procedure described in the last section was applied to the Ramblon data set. Concentrate on the time series obtained, recording the electromagnetic field with a sampling interval of 2 s. Each component has eight data windows of 512 data points producing a time series 8192 s long. The five components of the electromagnetic field are transformed into the wavelet domain. We assume a zero mean time series with Gaussian noise and a small fraction of outliers. Thresholding at each resolution level a is carried out using the scale given by relation (4), with $\sigma_1 = 0.6745$. Ramblon data thresholding is done using $\tau_1$ in relation (5). Wavelet coefficients that stick above the threshold $\tau_1$ are weighted down using Thomson weights $w_a(b; 1)$ given in relation (8). The filtered wavelet coefficients f(a, b) then are transformed back to time domain to yield a cleaned time series. This new time series then is processed with the RWLS algorithm mentioned at the beginning of this section.
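The chain of steps in the preceding paragraph (transform, level-by-level scale, threshold, weighting, inverse transform) can be sketched end to end on synthetic data. The code below is a hedged illustration and not the processing applied to the Ramblon series: it substitutes the Haar wavelet for the four-coefficient Daubechies wavelet to stay short, takes the threshold in units of the scale, filters only levels 4 and above, and uses a 256-point pseudo-random series with one artificial spike. All function names are hypothetical.

```python
import math
import random
import statistics


def haar_forward(x):
    """Return (smooth, details); details[a] holds the 2**a coefficients of level a."""
    s, details = list(x), []
    while len(s) > 1:
        half = len(s) // 2
        d = [(s[2 * i] - s[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        s = [(s[2 * i] + s[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details.insert(0, d)          # coarsest level ends up first
    return s, details


def haar_inverse(s, details):
    s = list(s)
    for d in details:                 # rebuild from coarsest to finest level
        nxt = []
        for si, di in zip(s, d):
            nxt += [(si + di) / math.sqrt(2), (si - di) / math.sqrt(2)]
        s = nxt
    return s


def filter_level(d, N, sigma=0.6745):
    """Relations (4), (5), (7), (8) for one level, Gaussian case."""
    n_a = len(d)
    med = statistics.median(d)
    s = statistics.median([abs(c - med) for c in d]) / sigma
    if s == 0.0:
        return list(d)
    # Threshold in units of the scale (assumed interpretation of relation (5)).
    beta = (math.log(N) / math.log(n_a)) * math.sqrt(2.0 * math.log(n_a))
    out = []
    for c in d:
        inner = min(beta * (abs(c) / s - beta), 700.0)   # cap avoids overflow
        out.append(c * math.exp(-math.exp(inner)))       # Thomson weight
    return out


def wavelet_filter(x, lowest_level=4):
    s, details = haar_forward(x)
    cleaned = [filter_level(d, len(x)) if a >= lowest_level else d
               for a, d in enumerate(details)]
    return haar_inverse(s, cleaned)


rng = random.Random(0)
raw = [rng.gauss(0.0, 1.0) for _ in range(256)]
raw[100] += 50.0                      # artificial spike
clean = wavelet_filter(raw)
```

In this toy case the spike is reduced to a small remnant carried by the unfiltered low-resolution levels, while samples away from it change very little. The remnant is a consequence of leaving the levels below 4 untouched, as is done in this work to protect the low-frequency signal.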

Figure 2 shows the raw (a) and filtered (b) time series and their difference (c) for the Hy(t) (east-west) component. The large spike, at about 8000 s in the last data window, is clearly visible in the raw time series (a). Note that this leverage point in Hy(t) causes such a severe distortion in the estimated transfer function that the corresponding residuals will not correspond to the outlier in Ex(t) at the same location in time. In the difference between the two time series (Figure 2c), other spikes become visible: one just below 1000 s, two between 2000 s and 3000 s, one just below 4000 s, and another at about 6500 s. Several other less energetic spikes are not seen because of the scale of Figure 2. All those spikes were filtered out successfully, leaving useful data untouched.

Figure 3 shows the raw (a) and filtered (b) time series forthe Ex(t) component. This figure also shows the wavelet coef-ficients for the high-resolution scales, namely 11 (c), 10 (d), and9 (e). The large spike in the last data window, at about 8000 s,is still dominant, as is its counterpart in the Hy(t) component.

FIG. 2. Hy(t) time series (east-west direction) at the Ramblonsite. (a) is the measured (raw) data, (b) is the wavelet-filtereddata, and (c) is the difference between (a) and (b). The arrowshows a burst of energy at 3000 s. The three time series aredisplaced arbitrarily for the sake of clarity.

FIG. 3. Ex(t) time series (north-south direction) and waveletcoefficients at Ramblon site. (a) is the raw data and (b) is thewavelet-filtered data. The wavelet coefficients at the three high-est-resolution scales are displayed in the same panel: (c) is scale11, (d) is scale 10, and (e) is scale 9. Time series and waveletcoefficients are displaced arbitrarily for the sake of clarity. Thevertical scale is good for the Ex(t) component only. The waveletscale is arbitrary.

A clear difference is that the energy here is 400 times smaller, but still significantly higher than signal levels. As for the Hy(t) component, spikes were filtered out completely. Figures 2 and 3 provide good illustrations of the power of wavelet filtering. It can be said that the filtering shown in this paper is better at removing spikes than other methods because it does not affect the useful part of the data.

A few other features become clear from Figure 3. As long as the threshold in relation (5) is chosen adequately, there is little danger of discarding useful signal. The short-duration and, consequently, nonstationary burst of energy at about 3000 s remains after the application of the filter. That burst is induced by activity in the Hy(t) channel and may be regarded as useful signal. From the same figure, it is possible to see how locally the wavelet coefficients represent spikes. Note that the abrupt change in DC level remains after the largest spike is removed. This shift is characteristic of the finite recovery time of the hardware filters. Another point worth noting is that there are spikes that do not have any direct correspondence to spikes in Hy(t) (Figure 2a, c). For instance, the spike at about 6000 s has one-third the energy of the dominant spike in Ex(t), but it is not induced by any detectable activity in the Hy(t) component. This is an example of a point defect easily corrected by any standard robust scheme because it is a single outlier. It does not have any correspondence to a leverage point.

It is important to point out that thresholds are different for Hy(t) and Ex(t). In other words, distinct time series, say, Hy(t) and Ex(t), are filtered independently, depending on their energy. This could have produced a biased set of impedance estimates if some of the useful signal had been filtered out. However, bias is avoided by the judicious choice for the thresholds, relations (5) and (6). We would have induced bias in the transfer functions if we had used the same threshold for two distinct time series. For instance, using the Ex(t)-derived threshold to filter the Hy(t) time series will not filter spikes correctly and will produce biased transfer functions because the energy of the noise and signal components generally varies between distinct time series.
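A toy example may make the point about channel-dependent thresholds concrete. The sketch below assumes a robust scale of the form median absolute deviation divided by 0.6745 and a cutoff of `tau` scale units; both are illustrative stand-ins for relations (4) to (6), which are not reproduced in this section. The coefficient values are synthetic, and the helper names are hypothetical.

```python
def median(values):
    v = sorted(values)
    m = len(v) // 2
    return v[m] if len(v) % 2 else 0.5 * (v[m - 1] + v[m])


def mad_scale(coeffs, sigma=0.6745):
    """Robust scale of one channel (assumed form of the scale estimate)."""
    med = median(coeffs)
    return median([abs(c - med) for c in coeffs]) / sigma


def flagged(coeffs, scale, tau=3.0):
    """Indices of coefficients lying more than tau scale units from the median."""
    med = median(coeffs)
    return [i for i, c in enumerate(coeffs) if abs(c - med) > tau * scale]


# Synthetic coefficients: the "Hy" channel is 20 times larger in amplitude
# (400 times in energy) than the "Ex" channel; each has one spike at index 7.
base = [-2.0, -1.0, 0.0, 1.0, 2.0] * 4
ex = list(base)
ex[7] = 30.0
hy = [20.0 * c for c in base]
hy[7] = 600.0

own_ex = flagged(ex, mad_scale(ex))    # Ex filtered with its own scale
own_hy = flagged(hy, mad_scale(hy))    # Hy filtered with its own scale
cross_hy = flagged(hy, mad_scale(ex))  # Hy filtered with the Ex scale
cross_ex = flagged(ex, mad_scale(hy))  # Ex filtered with the Hy scale
```

With its own scale, each channel flags only the spike. With the Ex-derived scale, most of the ordinary Hy coefficients are flagged as well, and with the Hy-derived scale the Ex spike goes undetected.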

To further show the effect of the filtering procedure described in this work, we compare transfer functions before and after application of the filter. Transfer function estimation is carried out in the frequency domain.

Not surprisingly, the RWLS algorithm fails catastrophically with this data set, mainly because of the pulling effect of the large regression residuals in the independent variable H. In fact, the transfer function that was estimated to obtain the first residuals is far from the true value, and so are the residuals. Weights are unable to downweight outliers properly because they are based on poorly estimated residuals. Figure 4 shows the MT transfer functions obtained using all the recorded data windows. Both directions are biased strongly and show scatter at or below 0.1 Hz. Transfer functions at those frequencies were estimated using the eight data windows sampled at 2 s, partially shown in Figures 2 and 3. The x-direction is worse because of the lower signal-to-noise ratios.
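The mechanism can be reproduced with a toy problem. The sketch below is a real-valued, single-channel analogue with Huber-type weights, not the RWLS algorithm used in this paper; the data values and the function name `rwls_toy` are hypothetical. A single leverage point in H pulls the first least-squares estimate so strongly that its own residual becomes the smallest in the set, so residual-based weights never downweight it.

```python
def median(values):
    v = sorted(values)
    m = len(v) // 2
    return v[m] if len(v) % 2 else 0.5 * (v[m - 1] + v[m])


def rwls_toy(h, e, iterations=20, k=1.5):
    """Iteratively reweighted estimate of z in e = z * h (toy scalar model)."""
    w = [1.0] * len(h)
    z = 0.0
    for _ in range(iterations):
        z = (sum(wi * hi * ei for wi, hi, ei in zip(w, h, e))
             / sum(wi * hi * hi for wi, hi in zip(w, h)))
        r = [ei - z * hi for hi, ei in zip(h, e)]
        scale = median([abs(ri) for ri in r]) / 0.6745
        if scale == 0.0:  # perfect fit, nothing left to reweight
            break
        # Huber-type weights computed from the current residuals.
        w = [1.0 if abs(ri) <= k * scale else k * scale / abs(ri) for ri in r]
    return z, w


# Clean data follow e = 2 * h exactly.
h_clean = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0, 2.0, -2.0]
e_clean = [2.0 * hi for hi in h_clean]

# One leverage point: a very large h with an e that does not follow z = 2.
h_bad = h_clean + [100.0]
e_bad = e_clean + [20.0]

z_clean, _ = rwls_toy(h_clean, e_clean)
z_bad, w_bad = rwls_toy(h_bad, e_bad)
```

Here `z_clean` recovers the true value of 2, whereas `z_bad` stays close to 0.2, the ratio imposed by the leverage point, and the leverage point keeps full weight throughout the iterations.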

In fact, most of the wild behavior of the MT transfer functions obtained from the raw Ramblon data set is caused by the large spike in the last data window (Figures 2 and 3). It is possible, however, to get a better, albeit approximate, estimate for the MT transfer functions by simply removing the last (eighth) data window of the time series sampled at 2 s. The first seven windows also have spikes, but they are one to two orders of magnitude smaller than the largest one. The transfer functions estimated using only the first seven data windows also are displayed in Figure 4. Now it is easier to assess how severe the bias and oscillatory behavior are as a result of the large spike in the eighth data window. But the first seven windows yield only an approximate estimate of the (unknown) true transfer functions. It is far from being a good estimate because there is noise in those data windows too. Obviously, the filtered results shown below are much better than the seven-window results for several reasons. One that may be regarded as the most important is that useful data are not discarded together with the noise.

Figure 5 shows the MT transfer functions obtained with the same RWLS algorithm but using a wavelet-filtered data set. We filter only the data section that is affected by the impulsive noise, i.e., for frequencies below 0.1 Hz. The application of any filter to a noncontaminated data section is not warranted because it may discard useful data; thresholds depend on data variance. A comparison with Figure 4 shows that bias and scatter virtually were eliminated in spite of the limitations of the adopted RWLS algorithm. Remember that the RWLS algorithm cannot handle leverage points such as the ones shown in Figure 2 (cf. Figure 4).

FIG. 4. Apparent resistivity and phase for all raw data windows at Ramblon. Circles represent the north-south component, or x-component; squares represent the east-west component, or y-component. Error bars represent a 95% confidence interval. Lines show the results for apparent resistivity and phase estimated after eliminating the last (eighth) data window sampled at 2 s, i.e., using only the first seven windows. The two dashed lines represent the north-south component, and the two dotted lines represent the east-west component. The two lines for each component represent a 95% confidence interval.

Because the true value of the transfer function is an unknown quantity, the assessment of a particular set of estimates is always relative. However, it is not enough to say that scatter and bias were eliminated in the filtered results when compared with the raw set of estimates. This remains true even if the noisiest, eighth data window was purged from the original data. Results still may be biased to an unknown degree.

An independent test with the Ramblon data set may be attainable by processing the raw data set with other algorithms. We concentrate on two recent programs that are very efficient in dealing with the point defects shown in this paper. We use Larsen’s program (Larsen et al., 1996), referred to as L, and Chave’s program (Chave and Thomson, 1992), referred to as C. We processed Ramblon’s raw data (eight data windows) with the L and C algorithms, obtaining a compatible set of results. Those results also are displayed in Figure 5. A comparison of the RWLS with both the L and C results shows that the wavelet filter successfully dealt with the point defects present in our data set, and that this was done without making any assumption on the data distribution. Of course, this assessment remains relative because the true value for the transfer function is still unknown.

FIG. 5. Apparent resistivity and phase for the filtered Ramblon data. Circles and xs represent the north-south component; squares and asterisks represent the east-west component. Results of this paper are shown as isolated circles and squares. Independent results of processing the raw data set with L and C algorithms are shown as xs and asterisks, linked with lines for easy visualization. L results are linked with continuous lines. C results are linked with dotted lines. Error bars are not shown for L and C results for the sake of clarity.

As a final remark, we note that coherence values run high for the large spike. This renders single-site coherence weighting (Egbert and Livelybrooks, 1996; Travassos and Beamish, 1988) virtually useless for this data set. In fact, single-site coherence weighting, albeit useful, is not a completely safe scheme because noisy data sections may display large coherence values (Travassos and Beamish, 1988).

The Concordia data

The Concordia data set was obtained near the city of Concordia, by the Uruguay River in Argentina. It is located 4.5 km from a high-voltage line (500 kV) from Salto Grande Power Plant, Entre Rios, Argentina. The power plant itself is 25 km northeast of the measuring site. The data set is composed of 16 contiguous data windows of 512 data points each, with a sampling rate of 500 Hz. Power-line noise reaches the MT site by traveling through a resistive Precambrian basement with skin depth in excess of 15 km. The data are contaminated strongly with the 50-Hz line, its harmonics, and subharmonics. Data acquisition was performed during light rain. Measurements were taken along the geomagnetic north-south x-direction and east-west y-direction.

Here we have a case of more or less well defined spectral lines that affect the data. It is a case of continuous and strongly correlated anthropogenic noise. The power plant does not yield sharp and stable spectral lines, however. This renders the hardware notch filters of little avail. In this case, the RWLS algorithm produces unreliable estimates at or around the harmonics and subharmonics of the net. Subharmonics may appear because of the nonlinear components along the line, mainly transformers. Note that transients in the net also may produce nonharmonic currents that have a natural period which is a function of the power line itself. This period and its harmonics do not necessarily have to agree with the system frequency (50 Hz).

We apply the filtering procedure to the spectrum of the Concordia data. We apply the FFT to all the 16 data windows together, or 16 × 512 = 8192 points. In this manner, each component is represented in the frequency domain by 4096 raw spectral estimates. This is done to ensure that there are enough wavelet coefficients to represent the spectral lines through enough scale levels. We consider the spectral lines to be thin compared with the whole length of the spectrum. We still can assume that we have a small fraction of spectral outliers in our data set.
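The bookkeeping behind this choice can be written out explicitly. The short sketch below uses only the numbers quoted in the text; it assumes a one-sided spectrum and a dyadic wavelet transform, and the variable names are illustrative.

```python
import math

n_windows, window_len = 16, 512  # from the text
fs = 500.0                       # sampling rate in Hz, from the text

n_points = n_windows * window_len      # 8192 samples transformed together
n_spectral = n_points // 2             # 4096 one-sided spectral estimates
df = fs / n_points                     # spacing between estimates, in Hz
nyquist = fs / 2.0                     # highest frequency represented, in Hz
n_levels = int(math.log2(n_spectral))  # dyadic scale levels available
bin_50hz = round(50.0 / df)            # estimate nearest the 50-Hz line

# For comparison, a single 512-point window transformed on its own.
single_window_levels = int(math.log2(window_len // 2))
```

Transforming all windows together gives a spacing of about 0.061 Hz and 12 dyadic levels, against 8 levels for a single window, so a narrow spectral line is described by more coefficients across more scales.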

The Fourier coefficients for the Concordia data are transformed into the wavelet domain by using the same set of Daubechies wavelets. The Fourier spectra for the five components of the electromagnetic field are transformed into the wavelet domain, with the real and imaginary parts taken independently. This ensures that phase is not distorted.

Wavelet filtering is applied to a flattened spectrum. All the parameters used in the wavelet filtering, namely scale, thresholds, and weights, relations (4) to (8), are calculated using the modulus of the spectrum. These values then are used for filtering the real and imaginary parts of the Concordia spectrum. Thresholding at each resolution level a is carried out by using the scale given by relation (4), for a Rayleigh distribution, i.e., σ2 = 0.44845. Relation (6) gives the threshold, i.e., τ2, used in the Concordia data. The wavelet coefficients that exceed the threshold τ2 are weighted down by using Thomson weights $a(b; 2) in relation (8). The filtered coefficients f(a, b) then are transformed back to the frequency domain to yield a cleaned spectrum.
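The two constants quoted for the scale, 0.6745 for the Gaussian case and 0.44845 for the Rayleigh case, agree numerically with the theoretical median absolute deviation of a unit-variance Gaussian and of a unit-parameter Rayleigh distribution. The check below is a sketch of that reading: it assumes that relation (4), which is not reproduced in this section, normalizes a median absolute deviation by these constants. The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """CDF of the standard (zero-mean, unit-variance) Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rayleigh_cdf(x):
    """CDF of the Rayleigh distribution with unit scale parameter."""
    return 1.0 - math.exp(-0.5 * x * x) if x > 0.0 else 0.0


def theoretical_mad(cdf, med, hi=5.0):
    """Solve cdf(med + d) - cdf(med - d) = 0.5 for d by bisection."""
    lo = 0.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cdf(med + mid) - cdf(med - mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Medians: 0 for the Gaussian, sqrt(2 ln 2) for the unit Rayleigh.
mad_normal = theoretical_mad(normal_cdf, 0.0)
mad_rayleigh = theoretical_mad(rayleigh_cdf, math.sqrt(2.0 * math.log(2.0)))
```

The two results reproduce the quoted constants to four decimal places, which is consistent with the Rayleigh model being the natural one for the modulus of a complex spectrum with Gaussian real and imaginary parts.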

Figure 6 shows both a raw (a) and filtered (b) power-density spectrum for the component Ex(t). The spectral lines produced by correlated cultural noise are clearly visible in the raw spectrum (Figure 6a). The dominant spectral lines in the spectrum are the fundamental frequency of the power-distribution system and its harmonics: 50 Hz (48.5–51.5 Hz), 100 Hz (98.5–101.5 Hz), 151 Hz (150.6–150.7 Hz), and 249 Hz (248.5–249.8 Hz). Values between parentheses show a frequency range assuming an arbitrary amplitude threshold. Other dominant lines (>100 mV) also appear to have been caused by the power-line system: 33 Hz, 83 Hz, 166 Hz. All those lines are recognized and filtered out a great deal (Figure 6b). In particular, the 50-Hz line virtually disappears, with only 0.1% of its initial energy remaining.
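For scale, the residual of 0.1% of the initial energy quoted for the 50-Hz line can be restated as an attenuation in decibels and as an amplitude ratio. The symbols E_raw and E_filtered are introduced here only for this conversion:

```latex
10 \log_{10}\!\left(\frac{E_{\mathrm{filtered}}}{E_{\mathrm{raw}}}\right)
  = 10 \log_{10}\!\left(10^{-3}\right) = -30\ \mathrm{dB},
\qquad
\sqrt{\frac{E_{\mathrm{filtered}}}{E_{\mathrm{raw}}}} = \sqrt{10^{-3}} \approx 0.032 .
```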

The filtered spectral data can be transformed back to the time domain through the inverse Fourier transform. Figure 7 shows the raw (a) and filtered (b) time series as well as the difference between them (c) for the same Ex(t) (north-south) component. The filter removed the harmonic signal that dominated the original data, as can be seen in the filtered signal (Figure 7b). As for the Ramblon data set, the filtering process has left the useful signal unchanged.

As before, to show the effect of the filtering procedure on the transfer-function estimation, we compare results before and after application of the filter. We filter only the data section that is affected by the cultural noise, i.e., for frequencies above 1 Hz. Transfer-function estimation is carried out in the frequency domain.

FIG. 6. Power spectral density of the Ex(t) component, Concordia data. The upper panel shows the raw spectrum (a), and the lower panel shows the filtered spectrum (b).


The RWLS algorithm does a poor job at the frequencies affected by the power-line noise. The effect of the power line on the transfer functions can be seen in Figure 8. This figure shows the response functions obtained by using all the 16 raw data windows. Results appear to be particularly unreliable at the harmonics 50, 100, and 150 Hz, as well as near the frequency of 3 Hz. Both directions are affected, the x-direction being worse because of the lower signal-to-noise ratio. Frequencies in the dead band also display poor results because of lower signal-to-noise ratios. The wavelet filter described in this paper cannot address this kind of problem.

The filtered response functions can be seen in Figure 9. The influence of spectral lines virtually is removed. The wavelet-filtering procedure on the spectrum lines allowed the RWLS algorithm to produce reliable estimates for the response functions. This is an illustration of the effectiveness of the wavelet filtering in dealing with continuous correlated noise.

The assessment of the set of estimates is done in relative terms. For the Concordia case, this task is easier because the noise effect was restricted to a limited number of frequencies. But an independent test may be attainable by processing the raw data set with L and C algorithms. The results obtained with L and C algorithms also are shown in Figure 9. A comparison of the RWLS with both the L and C results shows that the wavelet filter may have dealt successfully with the correlated noise present in our data set. Again, that was done without making any assumption on data distribution.

CONCLUSIONS

In this paper, we have presented a procedure to filter out outliers and leverage points that are common in the MT trade. The filtering procedure is performed in the wavelet domain, with similar demands on computer time as the FFT. These point defects appear as gross departures from the simple model of strict zero wavenumber, or plane-wave model. Two data sets given here as examples have the characteristic of having point defects easily identified either in time or in frequency domain. They illustrate the occurrence of large wavenumber events of meteorological origin or strongly correlated cultural noise. In both cases, data were transformed into the wavelet domain and successfully cleaned. A robust algorithm then is used to process the data after the filtering stage.

FIG. 7. Ex(t) time series at Concordia site: (a) raw data, (b) filtered data, (c) difference between the two time series. The harmonic signals that dominate the data are removed completely by the filter, as can be seen in the filtered signal, (b). The three time series are displaced arbitrarily for the sake of clarity.

A four-coefficient Daubechies prototype wavelet was used in this work in producing the wavelet representation of our data sets. Filtering is performed on the wavelet coefficients themselves rather than on the data. The filtering procedure takes advantage of the fact that data are analyzed through different scale levels in the wavelet domain. Only bad data portions are filtered out, leaving good portions untouched by choosing the appropriate scale values and threshold-based weighting. Data then are transformed back to their original domain. It is shown here that this procedure yields reliable estimates of the MT transfer functions. We believe that the two cases presented here are particularly unfavorable because no remote reference was available. We conclude that the filtering process presented here is highly effective in removing data defects both in time and in frequency domain. It can recognize outliers and leverage points and filter them out without the need for data substitution. In this way, the procedure given here has a minimum of human intervention, in the sense that it is not necessary to carry out any substitution process. Intervention in the process is restricted to thresholding that should not pose any great challenge to the experienced user, because deleterious point defects are easily recognized energetic events that remain so in the wavelet domain.

FIG. 8. Apparent resistivity and phase for the Concordia data. Results display strong distortions at 50 Hz, as well as at its harmonics and near 3 Hz. Symbols and error bars follow the convention of Figure 4.

FIG. 9. Apparent resistivity and phase obtained for the Concordia data after filtering out spectral lines. The influence of the power line is removed almost completely. Independent results from processing the raw data set with L and C algorithms also are shown. Symbols, error bars, and lines follow the same convention as in Figure 5.

ACKNOWLEDGMENTS

The authors fully acknowledge the support for this work by a number of institutions in Argentina and Brazil. VITAE has supported this work. Travassos was supported partially by a CNPq scholarship. Trad was supported by a CONICET scholarship. We acknowledge the continued support of Sayd J. C. Landaberry in this work. We thank Jimmy Larsen and Alan Chave for use of their codes and for helpful discussions. The authors acknowledge the helpful suggestions of the two reviewers of this paper. We also thank Manuel Mamani, Carlos Moyano, and the Geophysics Group of CRICYT for their help and for permission to use the two data sets given as examples in this work. The support of the above-mentioned institutions and names does not imply that they share the opinions and views expressed in this work by the authors, who are solely responsible for the results shown here.

REFERENCES

Alexandrescu, M., Gilbert, D., Hulot, G., Le Mouel, J.-L., and Saracco, G., 1995, Detection of geomagnetic jerks using wavelet analysis: J. Geophys. Res., 100, 12557–12572.

Chave, A. D., and Thomson, D. J., 1989, Some comments on magnetotelluric response function estimation: J. Geophys. Res., 94, 14215–14225.

Chave, A. D., and Thomson, D. J., 1992, Robust, controlled leverage estimation of magnetotelluric response functions: Proc. 11th Workshop on Electromagnetic Induction in the Earth, Abstract 8.13, Victoria Univ., New Zealand.

Chave, A. D., Thomson, D., and Ander, M., 1987, On the robust estimation of power spectra, coherences, and transfer functions: J. Geophys. Res., 92, 633–648.

Clarke, J., Gamble, T. D., Goubau, W. M., Koch, R. H., and Miracky, R. F., 1983, Remote reference magnetotellurics: Equipment and procedures: Geophys. Prosp., 31, 149–170.

Daubechies, I., 1992, Ten lectures on wavelets: Soc. Industrial and Applied Mathematics, CBMS Ser. Appl. Math., 61.

Donoho, D. L., 1993, Wavelet shrinkage and W. V. A.: A ten minute tour, in Meyer, Y., and Roques, S., Eds., Progress in wavelet analysis and applications: Editions Frontieres, 109–128.

Egbert, G. D., 1992, Noncausality of the discrete-time magnetotelluric impulse response: Geophysics, 57, 1354–1358.

Egbert, G. D., and Booker, J. R., 1986, Robust estimation of geomagnetic transfer function: Geophys. J. Roy. Astr. Soc., 87, 173–194.

Egbert, G. D., Booker, J. R., and Schultz, A., 1992, Very long period magnetotellurics at Tucson Observatory: Estimation of impedances: J. Geophys. Res., 97, 15113–15128.

Egbert, G. D., and Livelybrooks, D. W., 1996, Single station magnetotelluric impedance estimation: Coherence weighting and the regression M-estimate: Geophysics, 61, 964–970.

Fedi, M., and Quarta, T., 1998, Wavelet analysis for the regional-residual and local separation of potential field anomalies: Geophys. Prosp., 46, 507–525.

Gamble, T. D., Goubau, W. M., and Clarke, J., 1979, Magnetotelluric with a remote reference: Geophysics, 44, 53–68.

Hornby, P., Boschetti, F., and Horowitz, F. G., 1999, Analysis of potential field data in the wavelet domain: Geophys. J. Int., 137, 175–196.

Kumar, P., and Foufoula-Georgiou, E., 1997, Wavelet analysis for geophysical applications: Rev. Geophys., 35, 385–412.

Kunetz, G., 1972, Processing and interpretation of magnetotelluric soundings: Geophysics, 37, 1005–1021.

Larsen, J. C., 1989, Transfer functions: Smooth robust estimates by least squares and remote reference methods: Geophys. J. Internat., 99, 645–663.

Larsen, J. C., Mackie, R. L., Manzella, A., Fiordelisi, A., and Rieven, S., 1996, Robust smooth MT transfer functions: Geophys. J. Internat., 124, 801–819.

McMechan, G. A., and Barrodale, I., 1985, Processing electromagnetic data in the time-domain: Geophys. J. Roy. Astr. Soc., 81, 277–294.

Moreau, F., Gilbert, D., and Saracco, G., 1996, Filtering non-stationary geophysical data with orthogonal wavelets: Geophys. Res. Lett., 23, 407–410.

Press, W., Teukolsky, S., Vetterling, W., and Flannery, B., 1992, Numerical recipes: Cambridge Univ. Press.

Rousseeuw, P. J., and Leroy, A. M., 1987, Robust regression and outlier detection: John Wiley & Sons, Inc.

Sims, W. E., Bostick, F. X., Jr., and Smith, H. W., 1971, The estimation of the magnetotelluric impedance tensor elements from measured data: Geophysics, 36, 938–942.

Spagnolini, U., 1994, Time domain estimation of MT impedance tensor: Geophysics, 59, 712–721.

Strang, G., 1989, Wavelets and dilation equations: A brief introduction: SIAM Review, 31, 614–627.

Sutarno, D., and Vozoff, K., 1991, Phase-smoothed robust M-estimation of magnetotelluric impedance functions: Geophysics, 36, 938–942.

Trad, D. O., and Travassos, J. M., 1996, Magnetotelluric data analysis: Robust filter in discrete wavelet analysis: 13th Workshop on Electromagnetic Induction in the Earth, paper 7.13.

Travassos, J. M., and Beamish, D., 1988, Magnetotelluric data processing: A case study: Geophys. J., 93, 377–391.

Vetterli, M., and Kovacevic, J., 1995, Wavelets and subband coding: Prentice-Hall, Inc.

Wight, D. E., and Bostick, F. X., 1980, Cascade decimation: A technique for real time estimation of power spectra, in Vozoff, K., Ed., Magnetotelluric methods: SEG Geophysics Reprint Series 5, 215–218.

Yuanchou, Z., and Paulson, K. V., 1997, Enhancement of signal-to-noise ratio in natural-source transient magnetotelluric data with wavelet transform: Pure Appl. Geophys., 149, 405–419.

Yee, E., Kosteniuk, P. R., and Paulson, K. V., 1988, The reconstruction of the magnetotelluric impedance tensor: An adaptive parametric time-domain approach: Geophysics, 53, 1080–1087.
