• TIME-FREQUENCY ANALYSIS OF NON-STATIONARY PROCESSES: AN INTRODUCTION

  MARIA SANDSTEN

  2013

  Centre for Mathematical Sciences, Centrum Scientiarum Mathematicarum


  • Contents

    1 Introduction and some history
    1.1 A short story on spectral analysis
    1.2 Spectrum analysis
    1.3 A historical chronology of time-frequency analysis

    2 Spectrogram and Wigner distribution
    2.1 Fourier transform and spectrum analysis
    2.2 The spectrogram
    2.2.1 An adaptive window
    2.3 The Wigner distribution and Wigner spectrum
    2.4 Exercises

    3 Cross-terms and the Wigner-Ville distribution
    3.1 Cross-terms
    3.2 The analytic signal and the Wigner-Ville distribution
    3.3 Cross-terms of the spectrogram
    3.4 Exercises

    4 Ambiguity functions and kernels
    4.1 Definitions and properties
    4.2 RID and product kernels
    4.3 Separable kernels
    4.4 Exercises

    5 Concentration measures
    5.1 Concentration measures

    6 Quadratic distributions and Cohen's class
    6.1 Kernel interpretation of the spectrogram
    6.2 Some historically important distributions
    6.3 A view on the four domains of time-frequency analysis
    6.4 Exercises

    7 The instantaneous frequency
    7.1 Instantaneous frequency
    7.2 Time-frequency reassignment
    7.3 Exercises

    8 The uncertainty principle and Gabor expansion
    8.1 The uncertainty principle
    8.2 Gabor expansion

    9 Stochastic time-frequency analysis
    9.1 Different definitions of non-stationary processes
    9.2 The mean square error optimal kernel
    9.3 Multitaper time-frequency analysis
    9.4 Cross-term reduction using a penalty function

  • Chapter 1

    Introduction and some history

Signals in nature vary greatly and sometimes it can be difficult to characterize them. We usually distinguish between deterministic and stochastic (random) signals, where a deterministic signal is explicitly known and a stochastic process is one realization from a collection of signals characterized by properties like expected value and variance. We also distinguish between stationary and non-stationary signals, where a stationary signal has time-invariant properties, i.e. properties that do not change with time. Time is a fundamental way of studying a signal, but we can also study the signal in other representations, of which one of the most important is frequency. Time-frequency analysis of non-stationary and time-varying signals has been a field of research for a number of years.

    1.1 A short story on spectral analysis

The short description here is just intended to give an orientation on the history behind spectral analysis in signal processing; for a more thorough description see, e.g., [1].

Sir Isaac Newton (1642-1727) performed an experiment in 1704, where he used a glass prism to resolve the sunbeams into the colors of the rainbow, a spectrum. The result was an image of the frequencies contained in the sunlight. Jean Baptiste Joseph Fourier (1768-1830) invented the mathematics to handle discontinuities in 1807, where the idea was that a discontinuous function can be expressed as a sum of continuous frequency functions. This idea was seen as absurd by the established scientists at that time, including Pierre Simon Laplace (1749-1827) and Joseph Louis Lagrange (1736-1813). The application behind the idea was the most basic problem regarding


heat, a discontinuity in temperature when hot and cold objects were put together. However, the idea was eventually accepted and is what we today call the Fourier expansion.

Joseph von Fraunhofer (1787-1826) invented the spectroscope in 1815 for measurement of the index of refraction of glasses. He discovered what today has become known as the Fraunhofer lines. Based on this, an important discovery was made by Robert Bunsen (1811-1899) in the 19th century. He studied the light from a burning rag soaked with salt solution. The spectrum from the glass prism consisted of lines, and especially a bright yellow line. Further experiments showed that every material has its own unique spectrum, with different frequency contents. The glass prism was the first spectrum analyser. Based on this discovery, Bunsen and Gustav Kirchhoff (1824-1887) discovered that light spectra can be used for recognition, detection and classification of substances.

The spectrum of electrical signals can be obtained by using narrow bandpass filters. The measured output from one filter is squared and averaged, and the result is the average power at the center frequency of the filter. Another bandpass filter with a different center frequency will give the average power at another frequency. With many filters of different center frequencies, an image of the frequency content is obtained. This spectrum analyzer is equivalent to modulating (shifting) the signal in frequency and using one narrow-band lowpass filter. The modulation procedure moves the power of the signal at the modulation frequency to the base-band. Thereby only one filter, the lowpass filter, is needed. The modulation frequency is different for every new spectrum value. This interpretation is called the heterodyne technique.
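The modulate-then-lowpass procedure above can be sketched in a few lines. The following Python snippet (function name and parameters are our own, not from the notes) uses a plain average as the lowpass filter:

```python
import numpy as np

def heterodyne_power(x, f_mod, fs=1.0):
    """Average power of x around frequency f_mod: modulate (shift) the
    signal so that f_mod lands at the base-band, then low-pass filter
    with a plain average and take the squared magnitude."""
    n = np.arange(len(x))
    baseband = x * np.exp(-2j * np.pi * f_mod * n / fs)  # shift f_mod to 0
    return np.abs(np.mean(baseband)) ** 2                # low-pass + power

# A cosine of amplitude 2 at 0.1 cycles/sample carries power A^2/4 = 1
# at +0.1; probing any other frequency gives (nearly) zero.
n = np.arange(10000)
x = 2 * np.cos(2 * np.pi * 0.1 * n)
```

Sweeping `f_mod` over a grid of frequencies reproduces the bank-of-bandpass-filters picture with a single lowpass filter, exactly as described above.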

    1.2 Spectrum analysis

A first step in modern spectrum analysis (discrete-time, sampled) was made by Sir Arthur Schuster already in 1898, [2]. He fitted Fourier series to sunspot numbers to find hidden periodicities. The Periodogram was invented. The Periodogram can be described with the heterodyne technique where the lowpass filter is a filter with impulse response corresponding to a time-limited rectangle sequence (window). The method has been frequently used for spectrum estimation ever since James Cooley and John Tukey developed the Fast Fourier Transform (FFT) algorithm in 1965, [3, 4, 5]. However, it was later discovered that Carl Friedrich Gauss (1777-1855) had invented the FFT algorithm already in 1805, although he did not publish the result. The FFT is an efficient way of calculating the Fourier transform of a signal which


exploits the structure of the Fourier transform algorithm to minimize the number of multiplications and summations. This made the Fourier transform actually become a practical tool and not just a theoretical description. A recent and more thorough review of Tukey's work in time series analysis and spectrum estimation can be found in [6].

Finding a reliable spectrum with optimal frequency resolution from a sequence of (often short-length) data has become a large research field. For stationary time-series, two major approaches can be found in the literature: the windowed Periodogram (and lag-windowed covariance estimates), which represents the non-parametric approach or classical spectral analysis, and the model based, usually Auto-Regressive (AR), which represents the parametric approach or modern spectral analysis. Developments and combinations of these two approaches have during the last decades been utilized in a huge number of applications, [7]. A fairly recent, short, pedagogical overview is given in [8, 9].

High-resolution spectral estimation of sinusoids with algorithms such as MUSIC, Pisarenko and others is also a very large research field, and descriptions of some of these algorithms can be found in, e.g., [7]. These algorithms are outstanding in certain applications, but in some cases the usual periodogram is applicable even for sinusoids, e.g., to detect or estimate a single sinusoid in a noisy signal, e.g., [10].

All these spectrum analysis techniques rely on the fact that the properties, e.g., the frequency contents, of the signal change slowly with time. If this is true, the time-frequency spectrum of the signal can be estimated and visualized. However, if the signal is non-stationary or time-varying, with e.g. step-wise changes, other methods should be applied.

We give an example where the need for time and frequency analysis becomes obvious. We analyze two different signals, which are obviously not similar, Figure 1.1a) and b). These two signals are not stationary, and their periodogram spectra, i.e., the squared absolute values of the Fourier transforms, will be exactly the same although they have a very different appearance in time, which is seen in Figure 1.1c) and d). From the signals alone, it is not possible to tell what frequencies they consist of, and from the spectra alone, it is not possible to tell how the non-stationary signals evolve with time. With use of time-frequency analysis, however, in the form of spectrograms, these differences become clearly visible, Figure 1.2a) and b), where red color indicates high power and blue color low power.
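The data of Figure 1.1 is not reproduced here, but the phenomenon is easy to recreate numerically with a different pair: a real signal and its time reversal always have identical magnitude spectra (time reversal only conjugates the Fourier transform, up to a linear phase), yet look completely different in time. A small numpy sketch of this, our own construction rather than the figure's data:

```python
import numpy as np

# An increasing-frequency chirp and its time reversal (a decreasing
# chirp): very different appearance in time, identical periodogram.
n = np.arange(512)
up = np.cos(2 * np.pi * (0.05 * n + 0.1 * n**2 / (2 * 512)))  # 0.05 -> 0.15
down = up[::-1].copy()                                        # time-reversed

spec_up = np.abs(np.fft.fft(up)) ** 2
spec_down = np.abs(np.fft.fft(down)) ** 2
```

A spectrogram of `up` and `down`, in contrast, shows an upward and a downward sloping ridge, respectively, and separates the two immediately.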


Figure 1.1: Two different signals with the same spectral estimates. a) Chirp signal; b) impulse signal; c) spectrum of chirp; d) spectrum of impulse (axes: Time, Frequency, Power).

Figure 1.2: Spectrograms of the chirp and impulse signals. (Red color: high power, blue color: low power.)


1.3 A historical chronology of time-frequency analysis

Eugene Wigner (1902-1995) received the Nobel Prize together with Maria Goeppert Mayer and Hans Jensen in 1963 for discoveries concerning the theory of the atomic nucleus and elementary particles. The Wigner distribution, almost as well known as the spectrogram, was suggested in a famous paper from 1932, because it seemed to be the simplest, [11]. The paper is actually in the area of quantum mechanics and not at all in signal analysis. The drawback of the Wigner distribution is that it gives cross-terms, which are located in the middle between, and can be twice as large as, the different signal components. A large area of research has been devoted to the reduction of these cross-terms, where different time-frequency kernels have been proposed.

The design of these kernels is usually made in the so-called ambiguity domain, also known as the doppler-lag domain. The word ambiguity is a bit ambiguous, as it should then stand for something that is not clearly defined. The name comes from the radar field and was introduced in [12], describing the equivalence between time-shifting and frequency-shifting for linear FM signals, which are frequently used in radar. Under the name characteristic function, it was however first introduced by Jean-Andre Ville, [13], where the concept of the Wigner distribution was also used for signal analysis. A similar derivation was suggested at about the same time by Jose Enrique Moyal, [14]. In his paper, Ville also redefined the analytic signal, where a real-valued signal is converted into a positive-frequency signal. The Wigner distribution of the analytic signal, which is what is usually applied in practice today, is therefore called the Wigner-Ville distribution.

The concept of instantaneous frequency, the derivative of the phase, is closely related to the definition of the analytic signal, as the analytic signal gives us a phase. The first interest in instantaneous frequency arose with frequency modulation in radio transmission, with a variable frequency in an electric circuit theory context.

Dennis Gabor (1900-1979) suggested the representation of a signal in two dimensions, time and frequency. In his paper from 1946, Gabor suggested certain elementary signals that occupy the smallest possible area in the two-dimensional information diagram, one quantum of information. He had a large interest in holography, and in 1948 he carried out the basic experiments in holography, at that time called wavefront reconstruction. In 1971 he received the Nobel Prize for his discovery of the principles underlying the science of holography. The Gabor expansion was related to the short-time Fourier transform by Martin Bastiaans in 1980.

In parallel to this, in the 1940s to 60s, during and after the invention of the spectrogram and the Wigner-Ville distribution, a lot of other distributions and methods were invented, e.g., the Rihaczek distribution, [15], the Page distribution, [16], and the Margenau-Hill or Levin distribution, [17]. In 1966 a formulation was made by Leon Cohen, also in quantum mechanics, [18], which included these and an infinite number of other methods as kernel functions. The formulation by Cohen was restricted with a constraint on the kernels so that the marginals should be satisfied. Later, a quadratic class of methods has been defined, also including kernels which do not satisfy the marginals but otherwise fulfill the properties of Cohen's class. The general quadratic class of time-frequency representation methods is the most applied, and a huge number of time-frequency kernels can be found for different representations. Commonly, this class is referred to as Cohen's class. We could however note that Cohen's class from the beginning also included signal-dependent kernels, which cannot be included in the quadratic class, as the methods then are no longer a quadratic form of the signal. The discrete-time and discrete-frequency kernel was introduced in signal analysis by Claasen and Mecklenbrauker, [19], as late as 1980.

The reassignment principle was introduced in 1976 by Kodera, de Villedary and Gendrin, [20], but was not applied to a larger extent until Auger and Flandrin reintroduced the method, [21]. The idea of reassignment is to keep the localization of a single component by reassigning mass to the center of gravity, where the cross-terms are reduced by smoothing of the Wigner distribution.

However, many of these methods are actually developed for deterministic signals (possibly with a disturbance) and used for a more exploratory description of multi-component signals. In the recent 10-15 years, the attempts to find optimal estimation algorithms for non-stationary processes have increased, [22]. The term Wigner spectrum has been established, where the ambiguity function is replaced by the ambiguity spectrum. A fundamental problem is the estimation of a time-frequency spectrum from a single realization of the process. Many kernels, optimized for different purposes, can be found, e.g., [23, 24, 25, 26]. A well-defined framework has also been proposed for the class of underspread processes, for which the ambiguity domain spectrum of the process is concentrated around the origin. Additionally, it is shown, at least approximately, that many rules for time-invariant linear systems and stationary processes apply, [27].

We will also find a connection between the spectrogram formulation and the Wigner-Ville distribution/spectrum using the concept of multitapers, originally invented by David Thomson, [28]. The calculation of the two-dimensional convolution between the kernel and the Wigner-Ville spectrum of a process realization can be simplified using kernel decomposition and calculating multitaper spectrograms, e.g., [29, 30]. The time-lag estimation kernel is rotated and the corresponding eigenvectors and


eigenvalues are calculated. The estimate of the Wigner-Ville spectrum is given as the weighted sum, using the eigenvalues as weights, of the spectrograms of the data with the different eigenvectors as sliding windows, [31]. Different approaches to approximate a kernel with a few multitaper spectrograms have been taken, e.g., comparison of Slepian and Hermite functions, [32], decomposing the Reduced Interference distribution, [33, 34], least squares optimization of the weights using Hermite functions as windows, [35], and utilizing a penalty function, [36]. In [25], time-frequency kernels and connected multitapers for statistically stable frequency marginals are developed.


  • Chapter 2

Spectrogram and Wigner distribution

A spectrogram is a spectral representation that shows how the spectral density of a signal varies with time. Other names found in different applications are spectral waterfall and sonogram. The time-domain data is divided into shorter sequences, which usually overlap, and these are Fourier transformed to calculate the magnitude of the frequency spectrum for each sequence. These frequency spectra are then ordered on a corresponding time-scale and form a three-dimensional picture (time, frequency, magnitude). The Wigner distribution/spectrum, [11], almost as well known as the spectrogram, is another famous spectral representation, which has the best time- and frequency concentration of a signal. However, the Wigner distribution gives cross-terms, which make the interpretation of the Wigner distribution difficult, and a large area of research has been devoted to the reduction of these cross-terms.

    2.1 Fourier transform and spectrum analysis

    The Fourier transform of a continuous signal x(t) is defined as

X(f) = F{x(t)} = ∫ x(t) e^{−i2πft} dt,

    and the signal x(t) can be recovered by the inverse Fourier transform,

x(t) = F⁻¹{X(f)} = ∫ X(f) e^{i2πft} df.


This is a Fourier transform pair; we have a representation in both time and frequency. The absolute value of the Fourier transform gives us the magnitude function, |X(f)|, and the argument is the phase function, arg(X(f)). The spectrum is given by the squared magnitude function,

Sx(f) = |X(f)|².

We should also remember that, using the Wiener-Khintchine theorem, the power spectrum of a zero-mean stationary stochastic process x(t) can also be calculated as the Fourier transform of the covariance function rx(τ),

Sx(f) = F{rx(τ)} = ∫ rx(τ) e^{−i2πfτ} dτ,

where rx(τ) is defined as

rx(τ) = E[x(t + τ) x*(t)],

with E[·] denoting the expected value and * the complex conjugate. The covariance function rx(τ) can be recovered by the inverse Fourier transform of the spectrum,

rx(τ) = F⁻¹{Sx(f)} = ∫ Sx(f) e^{i2πfτ} df.

    2.2 The spectrogram

    The short-time Fourier transform (STFT) of a signal x(t) is defined as

X(t, f) = ∫ x(t1) h(t1 − t) e^{−i2πft1} dt1, (2.1)

where the window function h(t), centered at time t, is multiplied with the signal x(t) before the Fourier transform. The window function views the signal just close to the time t, and the Fourier transform will be a local estimate around t. The usual way of calculating the STFT is to use a fixed positive even window, h(t), of a certain shape, which is centered around zero and has power ∫ |h(t)|² dt = 1. Similar to the

    ordinary Fourier transform and spectrum we can formulate the spectrogram as

Sx(t, f) = |X(t, f)|², (2.2)

which is used very frequently for analyzing time-varying and non-stationary signals. We can also extend the Wiener-Khintchine theorem and define the power spectrum


of a zero-mean non-stationary stochastic process x(t) to be calculated as the Fourier transform of the time-dependent covariance function rx(t, τ),

Sx(t, f) = F{rx(t, τ)} = ∫ rx(t, τ) e^{−i2πfτ} dτ, (2.3)

where rx(t, τ) is defined as

rx(t, τ) = E[x(t + τ) x*(t)]. (2.4)

Using the spectrogram, the signal is divided into several shorter pieces, a spectrum is estimated from each piece, and this gives us information about where in time different frequencies occur. There are several advantages of using the spectrogram, e.g., fast implementation using the fast Fourier transform (FFT), easy interpretation and the clear connection to the periodogram. We illustrate this with an example:

    Example

In Figure 2.1a) a signal consisting of several frequency components is shown. A musical interpretation is three tones of increasing height. A common estimate of the spectrum is defined by the Fourier transform of a windowed sampled data vector, {x(n), n = 0, 1, 2, . . . , N − 1}, (N = 512), where t = n/Fs, (Fs is the sample frequency, and in this example Fs = 1). The (modified) periodogram is defined as

Sx(l) = (1/N) |Σ_{n=0}^{N−1} x(n) h(n) e^{−i2πnl/L}|²,  l = 0, . . . , L − 1, (2.5)

where the frequency scale is f = l/L · Fs. The window function h(n) is normalized according to

h(n) = h1(n) / √( (1/N) Σ_{n=0}^{N−1} h1²(n) ), (2.6)

where h1(n) can be any window function, including h1(n) = 1 (the rectangle window). The periodogram (i.e., using a rectangle window), in Figure 2.1b), shows three frequency components but cannot give any information on how these signals are located in time. Figure 2.1c) presents the modified periodogram (Hanning window), which is a very common analysis window, giving an estimated spectrum that seems to consist of just one strong frequency component and two weaker ones. These examples


Figure 2.1: A three-component signal with tones of increasing frequency; a) data; b) periodogram; c) modified periodogram.

show how important it is to be careful when using different spectral estimation techniques. The spectrogram is defined as

Sx(n, l) = (1/M) |Σ_{n1=0}^{N−1} x(n1) h(n1 − n + M/2) e^{−i2πn1l/L}|², (2.7)

for time-discrete signals and frequencies, where the window function is of length M and normalized according to Eq. (2.6) with N = M. The resulting spectrogram of the three musical tones is presented in Figure 2.2 and shows a clear view of three frequency components and their locations in time and frequency.

The length (and shape) of the window function h(t) is very important, as it determines the resolution in time and frequency, as shown in Figure 2.3. The signal shown in Figure 2.3a) consists of two Gaussian windowed sinusoidal components located in the same time interval around t = 100, but with different frequencies, f0 = 0.07 and f0 = 0.17, and another component with frequency f0 = 0.07 located around t = 160. The three components show up clearly in Figure 2.3c), where the window length of a Hanning window is M = 32. Figures 2.3b) and d) show the result when too short or too long window lengths are chosen. For a too short window length, Figure 2.3b),


Figure 2.2: Spectrogram of the signal in Figure 2.1 using a Hanning window of length 128 samples.

the two components located in the same time interval are smeared together, as the frequency resolution becomes really bad for a window length of M = 8. With a long window, Figure 2.3d), the frequency resolution is good, but now the time resolution is so bad that the two components with the same frequency are smeared together. The time- and frequency resolutions have to fulfill the so-called uncertainty inequality, to which we will return later. This gives a strong limitation on using the spectrogram for signals where the demand on time- and frequency resolution is very high. For now, we just note that usually a proper trade-off between time- and frequency resolution is found by choosing the length of the window about the same size as the time-invariance or stationarity of the individual signal components. The following simple example shows this.

    Example

    If the signal and the window function are

x(t) = (α/π)^{1/4} e^{−αt²/2},    h(t) = (β/π)^{1/4} e^{−βt²/2},


    Figure 2.3: Spectrogram examples with different window lengths

respectively, then the spectrogram is

Sx(t, f) = (2√(αβ)/(α + β)) e^{−(αβt² + 4π²f²)/(α + β)}.

Studying the spectrogram, one can note, e.g., that the area of the ellipse e⁻¹ down from the peak value is

A = π (α + β)/√(αβ).

Differentiation gives α = β and a minimum area A = 2π. The conclusion is that a minimum time-frequency spread is obtained if the window length is equal to the signal length for a Gaussian signal and window.
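Before moving on, the discrete spectrogram of Eq. (2.7), with the window normalization of Eq. (2.6), is straightforward to implement. A minimal Python sketch (the notes use Matlab m-files for this; function name and defaults here are our own):

```python
import numpy as np

def spectrogram(x, M=64, L=256, step=4):
    """Discrete spectrogram in the spirit of Eq. (2.7): slide a Hanning
    window of length M along x and take |FFT|^2 / M of each windowed
    frame. Returns an array indexed by (frame, frequency bin), f = l/L."""
    h = np.hanning(M)
    h = h / np.sqrt(np.mean(h ** 2))      # normalization as in Eq. (2.6)
    frames = []
    for n in range(0, len(x) - M + 1, step):
        seg = x[n:n + M] * h
        frames.append(np.abs(np.fft.fft(seg, L)) ** 2 / M)
    return np.array(frames)

# A complex sinusoid at 0.1 cycles/sample: every frame of the
# spectrogram should peak near frequency bin l = 0.1 * L = 25.6.
x = np.exp(2j * np.pi * 0.1 * np.arange(512))
S = spectrogram(x)
```

Plotting `S` with time on one axis and frequency bin on the other gives the kind of image shown in Figure 2.2.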

    2.2.1 An adaptive window

As it is not always easy to know or estimate the components' individual lengths beforehand, and these also vary for different components, adaptive windows can be


    used, [37, 38]. A variable Gaussian window is defined as

    ht,f (t1) =

    (2


Wigner distribution. The Wigner distribution always has better resolution than the spectrogram for a so-called monocomponent signal, i.e., a complex sinusoidal signal or a complex chirp signal (linearly increasing or decreasing frequency), if they are defined for all values of t. The Wigner distribution gives exactly the instantaneous frequency, i.e., perfectly localized functions for these signals.

The basic properties of the Wigner distribution and the Wigner spectrum are the same, meaning that all formulas mentioned below are valid although we restrict ourselves to mentioning just the Wigner distribution.

The Wigner distribution is always real-valued even if the signal is complex-valued. For real-valued signals/processes the frequency domain is symmetrical, i.e., Wx(t, f) = Wx(t, −f), (compare with the definition of the spectrum and spectrogram for real-valued signals). The Wigner distribution also satisfies the so-called marginals, i.e.,

∫ Wx(t, f) df = |x(t)|², (2.11)

and

∫ Wx(t, f) dt = |X(f)|². (2.12)

If the marginals are satisfied, the total energy condition is also automatically satisfied,

∫∫ Wx(t, f) dt df = ∫ |x(t)|² dt = ∫ |X(f)|² df = Ex, (2.13)

where Ex is the energy of the signal. If we shift the signal in time or frequency, the Wigner distribution will be shifted accordingly; it is time-shift and frequency-shift invariant, i.e., if y(t) = x(t − t0) then

Wy(t, f) = Wx(t − t0, f), (2.14)

and if z(t) = x(t) e^{i2πf0t} then

Wz(t, f) = Wx(t, f − f0). (2.15)

The Wigner distribution has weak time support, as it is limited by the duration of the signal, i.e., if x(t) = 0 when t < t1 and t > t2, then Wx(t, f) = 0 for t < t1 and t > t2. Similarly, it has weak frequency support and is thereby limited by the bandwidth of the signal, i.e., if X(f) = 0 when f < f1 and f > f2, then Wx(t, f) = 0 for f < f1 and f > f2. However, when the signal stops for a while and then starts again, i.e., if there is an interval in the signal that is zero, it does not imply that the Wigner distribution is zero in that time interval. The same goes for frequency intervals where the spectrum is zero: it does not imply that the Wigner distribution is zero in that frequency interval.


    Example

The Wigner distribution of x(t) = e^{i2πf0t}, −∞ < t < ∞, (a complex-valued sinusoidal signal of frequency f0), is given by

Wx(t, f) = ∫ e^{i2πf0(t + τ/2)} e^{−i2πf0(t − τ/2)} e^{−i2πfτ} dτ = ∫ e^{i2π(f0 − f)τ} dτ = δ(f − f0).

    Example

For the chirp signal x(t) = e^{i2π(α/2)t² + i2πf0t}, we can compute the Wigner distribution as

Wx(t, f) = ∫ e^{i2π((α/2)(t + τ/2)² + f0(t + τ/2))} e^{−i2π((α/2)(t − τ/2)² + f0(t − τ/2))} e^{−i2πfτ} dτ = ∫ e^{i2π(αt + f0 − f)τ} dτ = δ(f − f0 − αt).

We could also mention that for x(t) = δ(t − t0), the Wigner distribution is given as Wx(t, f) = δ(t − t0), where we should note that the integral for the Wigner distribution involves the multiplication of two delta functions, which should not be allowed, but we leave this theoretical dilemma to the mathematicians. For these and some other simple deterministic signals, the calculation of the Wigner distribution is possible to perform by hand; see also the examples in the exercises. Remembering that the Wigner distribution is the most concentrated distribution, we also compute the Wigner distribution for the Gaussian signal that was analyzed in the spectrogram section.

    Example

For the same signal as in the spectrogram window length example, i.e., x(t) = (α/π)^{1/4} e^{−αt²/2}, the Wigner distribution is

Wx(t, f) = 2 e^{−(αt² + 4π²f²/α)}.

Studying the area of this ellipse e⁻¹ down from the peak value gives A = π, which is half of the corresponding area of the spectrogram. So, why don't we always use the Wigner distribution in all calculations, and why is there a huge field of research to find new well-resolved time-frequency distributions? The problem is well known as cross-terms, which we will look more closely into in the next chapter.
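The perfect localization of the sinusoid example above can also be seen numerically. A common discretization is to form the instantaneous autocorrelation x(n+m)x*(n−m) at each time and Fourier transform over the lag; a minimal Python sketch of such a pseudo-Wigner distribution (our own code and conventions, not the notes'):

```python
import numpy as np

def pseudo_wigner(x, L=128):
    """Discrete pseudo-Wigner distribution: at each time n, form the
    instantaneous autocorrelation x(n+m) x*(n-m) over the lags m that
    stay inside the signal, and FFT over m. With this convention,
    frequency bin k corresponds to f = k/(2L)."""
    N = len(x)
    W = np.empty((N, L))
    for n in range(N):
        mmax = min(n, N - 1 - n, L // 2 - 1)
        m = np.arange(-mmax, mmax + 1)
        kern = x[n + m] * np.conj(x[n - m])   # Hermitian in m => real FFT
        buf = np.zeros(L, dtype=complex)
        buf[m % L] = kern                     # lag m placed at index m mod L
        W[n] = np.fft.fft(buf).real
    return W

# Complex sinusoid at f0 = 0.1: at each interior time instant the
# distribution should peak at a bin k with k/(2L) close to 0.1.
x = np.exp(2j * np.pi * 0.1 * np.arange(128))
W = pseudo_wigner(x)
```

For a discrete chirp, the same code shows the peak moving linearly with time, the discrete analogue of δ(f − f0 − αt) above, together with the cross-term artifacts discussed in the next chapter when more components are present.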


    2.4 Exercises

2.1 A signal x(t) = δ(t − t0) is windowed with a Gaussian function h(t) = (a/π)^{1/4} e^{−at²/2}. Compute the STFT and the spectrogram.

2.2 Show that the Wigner distribution must be real-valued even if the signal is complex-valued. Hint: Consider the complex conjugate of the Wigner distribution.

2.3 Show that the Wigner distribution is time-shift and frequency-shift invariant by replacing the input signal to the distribution with y(t) = x(t − t0) e^{i2πf0t}.

2.4 Calculate the Wigner distribution of

x(t) = (α/π)^{1/4} e^{−αt²/2} e^{i2π(β/2)t² + i2πf0t}.

C2.5 Create a one-component signal with Gaussian windowed complex-valued sinusoids, (using e.g., the m-file [Xc,T]=gaussdata(N,Nk,Tvect,Fvect); with total data length N=256, component length Nk=128, centre time value Tvect=100 and centre frequency value Fvect=0.1). You can easily create the corresponding real-valued sinusoid by using Xr=real(Xc);. Compute and plot the spectrogram using a Hanning window of length WIN=64. You can use the m-file [S,TI,FI]=mtspectrogram(X,WIN); or alternatively Matlab's spectrogram. Use the complex-valued sinusoid Xc and also the real-valued sinusoid Xr as input signal. Compare the different spectrogram images and make sure that you agree on the view. What will happen if you change the frequency to the very low frequency Fvect=0.02? Explain the difference between the image of the complex-valued sinusoid and the real-valued sinusoid. The following files could be used:

The m-file [X,T]=gaussdata(N,Nk,Tvect,Fvect,Fs); computes an (N x 1) complex-valued data vector X with K different (Nk x 1) Gaussian components with frequencies specified in Fvect, located at centre values specified in Tvect. A sample frequency Fs can be specified.

The m-file [S,TI,FI]=mtspectrogram(X,WIN,Fs,NFFT,NSTEP,wei); computes and plots the (multitaper) spectrogram using the window specified with the vector WIN (or with a Hanning window of length WIN).
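If the m-files above are not at hand, the experiment in C2.5 can be sketched in Python instead. This is an illustration, not the course code: the Gaussian envelope width (Nk/8) and the scipy parameters are assumptions of this sketch.

```python
import numpy as np
from scipy.signal import spectrogram, get_window

# Signal of exercise C2.5: a Gaussian-windowed complex sinusoid of length
# N = 256 centred at time 100 with frequency 0.1.  The envelope width
# (Nk/8) is an assumption of this sketch, not the gaussdata default.
N, Nk, t0, f0 = 256, 128, 100, 0.1
n = np.arange(N)
envelope = np.exp(-0.5 * ((n - t0) / (Nk / 8)) ** 2)
xc = envelope * np.exp(2j * np.pi * f0 * n)   # complex-valued sinusoid
xr = xc.real                                  # corresponding real-valued signal

# Two-sided spectrogram with a 64-point Hann window, so that the mirror
# component of a real-valued signal at f = -0.1 would also be visible.
f, t, S = spectrogram(xc, fs=1.0, window=get_window('hann', 64),
                      noverlap=48, nfft=256, return_onesided=False)

peak_f = f[np.argmax(S.max(axis=1))]
print(round(peak_f, 2))   # energy concentrates near f = 0.1
```

Feeding xr instead of xc adds the mirror component at f = -0.1, which is the difference the exercise asks you to explain.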


Chapter 3

Cross-terms and the Wigner-Ville distribution

The Wigner distribution gives cross-terms, which are located midway between, and can be twice as large as, the different signal components. It does not matter how far apart the signal components are; cross-terms show up anyway, between all components of the signal as well as between disturbances in the case of noise. This makes the interpretation of the Wigner distribution difficult, and a large area of research has been devoted to the reduction of these cross-terms. In his paper [13], Jean-Andre Ville, among other definitions, also redefined the analytic signal. The Wigner distribution of the analytic signal, which is what is usually applied in practice today, is therefore called the Wigner-Ville distribution.

    3.1 Cross-terms

The main problem with the Wigner distribution/spectrum shows up for multi-component signals or processes, and for signals with disturbances the method is often useless. The drawback is seen even if we just study the two-component signal x(t) = x_1(t) + x_2(t), for which the Wigner distribution is

W_x(t, f) = W_{x_1}(t, f) + W_{x_2}(t, f) + 2\Re\{W_{x_1,x_2}(t, f)\}, (3.1)

where W_{x_1,x_2}(t, f) is the cross-Wigner distribution of x_1 and x_2; the last term is the oscillating cross-term.


The oscillation frequency of the cross-term is proportional to the distance between the auto-terms, and the direction of the oscillations will be orthogonal to the line connecting the auto-terms; see Figure 3.1, which shows the Wigner distribution of the analytic signal of the sequence displayed in Figure 2.1. (The concept of analytic signals will be explained later.)

    Example

Comparing Figure 3.1 with Figure 2.1, we know where we should find the auto-terms. For simplicity, they are called components F1, F2 and F3. We find two of them as yellow components and the middle one, F2, marked with a red-yellow square pattern. This pattern consists of the middle component and the added corresponding cross-term from the outer components (oscillating and located midway between F1 and F3). Then there is a cross-term midway between F1 and F2 and one between F2 and F3. In conclusion, from these three signal components, the Wigner distribution will give the three components and additionally three cross-terms (one between each possible pair of components). All the cross-terms oscillate, and it should be noted that they also adopt negative values (blue colour), which seems strange if we interpret the Wigner distribution as a time-varying power estimate. This is a major drawback of the Wigner distribution/spectrum. This example, including only three components, shows that the result of the Wigner distribution/spectrum could easily be misinterpreted.

3.2 The analytic signal and the Wigner-Ville distribution

A signal z(t) is analytic if

Z(f) = 0 for f < 0, (3.2)

where Z(f) = F{z(t)}, i.e. the analytic signal requires the spectrum to be identical to the spectrum of the real signal for positive frequencies and to be zero for negative frequencies. The analytic signal was introduced by Gabor in 1946, [39], and later Ville introduced the Wigner distribution using the analytic signal, [13]. In honour of this contribution, the definition

W_z(t, f) = \int_{-\infty}^{\infty} z(t + \tau/2) z^*(t - \tau/2) e^{-i2\pi f\tau} \, d\tau, (3.3)

    where z(t) now is the analytic signal, is called the Wigner-Ville distribution. Thisis the most used definition today. We illustrate the advantages of the Wigner-Ville


Figure 3.1: The Wigner distribution of the analytic signal of the sequence in Figure 2.1.

    distribution with an example.

    Example

Two real-valued sinusoids windowed with Gaussian windows, with frequencies f_0 = 0.15, located around t = 80, and f_0 = 0.3 at t = 200, are combined into a sequence of length N = 256, see Figure 3.2. The spectrogram is shown in Figure 3.3a), where we clearly can identify the two components and their frequency locations. As the signal is real-valued, we also get two components located at f = -0.15 and f = -0.3. In Figure 3.3b) the Wigner distribution, calculated using Eq. (3.4), is shown, and now the picture becomes complex. The low-frequency component shows up at t = 80 and f = 0.15 and the high-frequency component at t = 200 and f = 0.3 as expected (yellow-orange components), but we also get others that actually look like components and not like oscillating cross-terms: one at t = 80, f = 0.35 and the other at t = 200, f = 0.2. These two components come from the fact that everything repeats with period 0.5. This is a well-known fact for the Discrete Fourier Transform (DFT), although we are used to a period of one, not 0.5. The reason for the period of 0.5 is that we actually have down-sampled the signal by a factor of 2 when the discrete


Wigner distribution is calculated. The discrete Wigner distribution used in the computations is defined as

W_x[n, l] = 2 \sum_{m=-\min(n, N-1-n)}^{\min(n, N-1-n)} x[n+m] \, x^*[n-m] \, e^{-i2\pi m l / L}, (3.4)

for a discrete-time signal x[n], n = 0, ..., N-1, where L is the number of frequency values, f = l/(2L). In sampling of a signal, we get aliasing if the frequency content of the signal exceeds half the sampling frequency. A similar phenomenon appears here, as the sequence is down-sampled by a factor of 2, meaning that an f_0 (here 0.3) above f = 0.25 will alias to the frequency 0.5 - f_0 (here 0.2) and will show up in the spectrum image as a component. The only indication that this frequency component is something strange is that there is no cross-term midway between the frequency component at f = 0.3 and the one at f = 0.2; the cross-terms are found between f = 0.15 and f = 0.3. We also see the cross-terms midway between the frequencies located at f = f_0 and f = -f_0 (at f = 0), as well as the cross-terms midway between e.g. f = 0.15 and f = 0.3 (located at f = (0.15 + 0.3)/2 ≈ 0.22). The picture of a two-component real-valued signal becomes very complex and the interpretation difficult. One elegant solution is to use the analytic signal and calculate the Wigner-Ville distribution, Figure 3.3c). The negative frequency components at f = -0.15 and f = -0.3 disappear, and the only cross-term is between the two components at positive frequencies. The spectrum repeats with period 0.5, but as we know from the beginning that no content is located at negative frequencies (the definition of the analytic signal), we can just ignore the repeated picture at negative frequencies. We also see that no aliasing occurs for the frequency at f = 0.3 (as there are no negative frequencies that can repeat above 0.25). This means that the signal frequency content of an analytic signal can be up to f = 0.5, as usual for discrete-time signals. There are other possible methods to avoid aliasing for signal frequencies higher than 0.25 and still compute the Wigner distribution for real-valued signals; e.g., the signal could be up-sampled by a factor of two before the computation of the IAF, whereby the aliasing is moved to the frequency 0.5 as in usual sampling.
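The discrete definition in Eq. (3.4) and the aliasing discussion can be tried out directly. The following Python sketch (an illustration, not the course's quadtf m-file) implements Eq. (3.4) and checks that using the analytic signal removes the aliased component of a real sinusoid at f_0 = 0.3:

```python
import numpy as np
from scipy.signal import hilbert

def wigner(x, L=None):
    """Discrete Wigner distribution of Eq. (3.4):
    W[n, l] = 2 sum_m x[n+m] x*[n-m] e^{-i 2 pi m l / L},
    evaluated on the frequency grid f = l/(2L)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    L = L or N
    W = np.zeros((N, L))
    for n in range(N):
        mmax = min(n, N - 1 - n)
        m = np.arange(-mmax, mmax + 1)
        iaf = x[n + m] * np.conj(x[n - m])   # instantaneous autocorrelation
        phase = np.exp(-2j * np.pi * np.outer(m, np.arange(L)) / L)
        W[n] = 2 * np.real(iaf @ phase)
    return W

# A real sinusoid at f0 = 0.3 lies above 0.25, so its Wigner picture
# contains an aliased component at 0.5 - 0.3 = 0.2; the analytic signal
# (via the Hilbert transform) avoids this.
N = 128
xr = np.cos(2 * np.pi * 0.3 * np.arange(N))
Wr = wigner(xr)                # real-valued input
Wa = wigner(hilbert(xr))       # analytic signal as input

l_alias, l_true = 51, 77       # bins nearest f = 0.2 and f = 0.3 (l = f * 2L)
mid = N // 2
has_alias = abs(Wr[mid, l_alias]) > 0.5 * abs(Wr[mid, l_true])
alias_gone = abs(Wa[mid, l_alias]) < 0.2 * abs(Wa[mid, l_true])
print(has_alias, alias_gone)   # True True
```

The O(N^2 L) loop is deliberately naive to mirror Eq. (3.4) term by term; practical implementations compute one FFT of the IAF per time instant.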



    Figure 3.2: A two-component real-valued sequence.

    Figure 3.3: Different time-frequency distributions of the sequence in Figure 3.2.


    3.3 Cross-terms of the spectrogram

However, the spectrogram actually also gives cross-terms, but they show up as strong components mainly when the signal components are close to each other. This is seen in the following example: if we have two signals x_1(t) and x_2(t), the spectrum, i.e. the squared absolute value of the Fourier transform, of the resulting sum x(t) = x_1(t) + x_2(t) is

|X(f)|^2 = |F\{x(t)\}|^2 = |X_1(f) + X_2(f)|^2
= |X_1(f)|^2 + |X_2(f)|^2 + X_1(f) X_2^*(f) + X_1^*(f) X_2(f)
= |X_1(f)|^2 + |X_2(f)|^2 + 2\Re\{X_1(f) X_2^*(f)\},

where the last term is the cross-term, which is negligible unless X_1(f) and X_2(f) overlap.
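The identity above is easy to verify numerically; a minimal check with two arbitrary real signals:

```python
import numpy as np

# Check |X1 + X2|^2 = |X1|^2 + |X2|^2 + 2 Re{X1 conj(X2)} bin by bin.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(64)
x2 = rng.standard_normal(64)
X1, X2 = np.fft.fft(x1), np.fft.fft(x2)

lhs = np.abs(X1 + X2) ** 2
rhs = np.abs(X1) ** 2 + np.abs(X2) ** 2 + 2 * np.real(X1 * np.conj(X2))
print(np.allclose(lhs, rhs))   # True
```

The cross-term 2 Re{X1 conj(X2)} is only large at frequencies where both spectra carry energy, which is why the spectrogram cross-terms appear when components are close.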


    Figure 3.4: The cross-terms for the spectrogram and the Wigner distribution of twocomplex Gaussian windowed sinusoids.


    3.4 Exercises

3.1 Calculate the Wigner distributions of

a) x(t) = e^{i2\pi f_1 t} + e^{i2\pi f_2 t}

b) x(t) = A\cos(2\pi f_0 t).

3.2 Calculate the Wigner distribution of

x(t) = \delta(t - t_1) + \delta(t - t_2).

3.3 Calculate the Wigner distribution of

x(t) = e^{i2\pi f_0 t} + \delta(t - t_0).

C3.4 Create a three-component signal with Gaussian windowed complex-valued sinusoids with total data length N=256, component length Nk=128, centre time values Tvect=[70 70 180] and centre frequency values Fvect=[0.1 0.3 0.1] using the function gaussdata. Compute and plot the Wigner distribution using the m-file [W,TI,FI]=quadtf(Xc);. Compare the resulting image to the result of the spectrogram and explain what you see. Also compute the corresponding real-valued signal Xr and calculate its Wigner distribution. What can you conclude from this picture? Can you explain the view?

    The following additional file could be used:

The m-file [W,TI,FI]=quadtf(X,METHOD,par,Fs,NFFT); computes and plots the specified time-frequency distribution (default: the Wigner distribution). To compute the analytic signal as input when you have real-valued data, use the Matlab function hilbert, Xa=hilbert(X).


Chapter 4

Ambiguity functions and kernels

A large area of research has been devoted to the reduction of cross-terms, where different time-frequency kernels have been proposed. The design of these kernels is usually made in the so-called ambiguity domain, also known as the doppler-lag domain. The word ambiguity is a bit ambiguous, as it should then stand for something that is not clearly defined. The name originally comes from the radar field and was introduced in [12], describing the equivalence between time-shifting and frequency-shifting for linear FM signals, which are frequently used in radar. Under the name characteristic function, the concept was however first introduced by Jean-Andre Ville, [13], where the Wigner distribution was also used for signal analysis. A similar derivation was suggested at about the same time by Jose Enrique Moyal, [14]. Ambiguity functions and ambiguity kernels have been shown to be an efficient way of reducing cross-terms.

    4.1 Definitions and properties

    The ambiguity function is defined as

A_z(\nu, \tau) = \int_{-\infty}^{\infty} z(t + \tau/2) z^*(t - \tau/2) e^{-i2\pi\nu t} \, dt, (4.1)

where usually the analytic signal z(t) is used. (Without any restrictions, z(t) could be replaced by x(t).) We note that the formulation is similar to the Wigner-Ville distribution; the difference is that the Fourier transform is taken in the t-variable instead of the \tau-variable, making the ambiguity function dependent on the two variables \nu and \tau. The Fourier transform of the instantaneous auto-correlation function in the


    variable t gives

A_x^s(\nu, \tau) = \int_{-\infty}^{\infty} r_x(t, \tau) e^{-i2\pi\nu t} \, dt, (4.2)

    which we call the ambiguity spectrum.
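A discrete version of Eq. (4.1) can be sketched by transforming the instantaneous autocorrelation over t. The integer-shift approximation of z(t + \tau/2) z^*(t - \tau/2) below is an assumption of this sketch, not the definition in the text:

```python
import numpy as np

def ambiguity(z):
    """Discrete sketch of Eq. (4.1): Fourier transform over t of the
    instantaneous autocorrelation.  z(t + tau/2) z*(t - tau/2) is
    approximated by the integer-shift product z[t + tau] z*[t]."""
    z = np.asarray(z, dtype=complex)
    N = len(z)
    t = np.arange(N)
    taus = np.arange(-(N // 2), N // 2)
    A = np.zeros((len(taus), N), dtype=complex)   # rows: tau, columns: nu
    for i, tau in enumerate(taus):
        r = np.zeros(N, dtype=complex)
        valid = (t + tau >= 0) & (t + tau < N)
        r[valid] = z[t[valid] + tau] * np.conj(z[t[valid]])
        A[i] = np.fft.fft(r)                      # transform t -> nu
    return taus, A

# A single Gaussian atom peaks at the origin (nu, tau) = (0, 0) of the
# ambiguity plane, no matter where it sits in the time-frequency plane.
N = 64
n = np.arange(N)
z = np.exp(-0.5 * ((n - 40) / 6) ** 2) * np.exp(2j * np.pi * 0.2 * n)
taus, A = ambiguity(z)
i_tau, i_nu = np.unravel_index(np.argmax(np.abs(A)), A.shape)
print(taus[i_tau], i_nu)   # 0 0
```

Shifting the atom in time or frequency changes only the phase (the oscillation pattern) of A, not the location of its peak magnitude, which is the property exploited by ambiguity kernels.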

    Example

We compare the views of the Wigner-Ville distribution and the ambiguity function for different signals. In the first case, in Figure 4.1a) and b), we see that a Gaussian function with centre frequency f = 0.05 and centre time t = 40 will be a Gaussian function in the Wigner domain at the corresponding locations in time and frequency. The ambiguity function, however, will be located at \nu = 0 and \tau = 0, and the frequency- and time-shift will show up as the oscillation and the direction of the oscillation respectively. In Figure 4.1c) and d) the Gaussian with higher centre frequency f = 0.2 located at t = 0 also shows up in the centre of the ambiguity function, with a different frequency and direction of the oscillation. The clear advantage of the ambiguity function shows up in the last example, Figure 4.1e) and f), where the signal now consists of the sum of the two signal components of the two other examples. The two signal components and the cross-term of the Wigner distribution are clearly visible in Figure 4.1e). In the ambiguity function, Figure 4.1f), the signal components add together at the centre, whereas the cross-terms now show up located at \nu = f_b - f_a and \tau = t_b - t_a as well as at \nu = f_a - f_b and \tau = t_a - t_b.

As the signal (auto-term) components always will be located at the centre of the ambiguity function, independently of where they are located in the time-frequency plane, and the cross-terms always will be located away from the centre, a natural approach is to keep the terms located at the centre and reduce the components away from the centre of the ambiguity function. This is also the ambition of the many different ambiguity kernels that have been proposed. So-called quadratic time-frequency distributions can be formulated as the multiplication of the ambiguity function and the ambiguity kernel, \phi(\nu, \tau),

A_z^Q(\nu, \tau) = A_z(\nu, \tau) \, \phi(\nu, \tau), (4.3)

and the smoothed Wigner-Ville distribution is found as the 2-dimensional convolution

W_z^Q(t, f) = W_z(t, f) ** \Phi(t, f),


Figure 4.1: A time- and frequency-shifted Gaussian function and its representation as Wigner-Ville distribution and ambiguity function (real part); a) and b): f_1 = 0.1, t_1 = 20; c) and d): f_2 = 0.3, t_2 = 0; e) and f): the sum of the two Gaussian functions.


where

\Phi(t, f) = F\{\phi(\nu, \tau)\} = \iint \phi(\nu, \tau) e^{-i2\pi(f\tau - \nu t)} \, d\nu \, d\tau.

We can note that the Wigner-Ville distribution has the simple ambiguity domain kernel \phi(\nu, \tau) = 1 and the corresponding time-frequency (non-)smoothing kernel \Phi(t, f) = \delta(t)\delta(f). Recalling the time marginal property, we see that if we put \tau = 0 in Eq. (4.1) we get

A_z(\nu, 0) = \int z(t) z^*(t) e^{-i2\pi\nu t} \, dt = \int |z(t)|^2 e^{-i2\pi\nu t} \, dt, (4.4)

which is the Fourier transform of the time marginal. Similarly, using Z(f) = \int z(t) e^{-i2\pi tf} \, dt, we get

A_z(0, \tau) = \int Z(f) Z^*(f) e^{i2\pi f\tau} \, df = \int |Z(f)|^2 e^{i2\pi f\tau} \, df, (4.5)

which is the inverse Fourier transform of the frequency marginal. The frequency marginal is the usual spectral density, and the inverse Fourier transform of the spectral density, A_z(0, \tau), can then be interpreted as the usual covariance function. We can conclude from this that, in order to preserve the marginals of the time-frequency distribution, the ambiguity kernel must fulfill

\phi(0, \tau) = \phi(\nu, 0) = 1. (4.6)

    From this we also conclude that

\phi(0, 0) = 1, (4.7)

to preserve the total energy of the signal. The Wigner distribution is real-valued, and for the filtered ambiguity function also to correspond to a real-valued distribution, the kernel must fulfill

\phi(\nu, \tau) = \phi^*(-\nu, -\tau). (Exercise!) (4.8)

The time-invariance and frequency-invariance properties are important, and to find the restrictions on the ambiguity kernel we study the quadratic distribution for a time- and frequency-shifted signal w(t) = z(t - t_0) e^{i2\pi f_0 t}, giving

W_w^Q(t, f) = \iiint w(u + \tau/2) w^*(u - \tau/2) \phi(\nu, \tau) e^{-i2\pi(f\tau - \nu t + \nu u)} \, du \, d\nu \, d\tau

= \iiint z(u + \tau/2 - t_0) e^{i2\pi f_0(u + \tau/2)} z^*(u - \tau/2 - t_0) e^{-i2\pi f_0(u - \tau/2)} \phi(\nu, \tau) e^{-i2\pi(f\tau - \nu t + \nu u)} \, du \, d\nu \, d\tau

= \iiint z(u_1 + \tau/2) z^*(u_1 - \tau/2) \phi(\nu, \tau) e^{-i2\pi(f\tau - f_0\tau - \nu t + \nu u_1 + \nu t_0)} \, du_1 \, d\nu \, d\tau

= \iiint z(u_1 + \tau/2) z^*(u_1 - \tau/2) \phi(\nu, \tau) e^{-i2\pi((f - f_0)\tau - \nu(t - t_0) + \nu u_1)} \, du_1 \, d\nu \, d\tau

= W_z^Q(t - t_0, f - f_0),

where we have assumed that the kernel \phi(\nu, \tau) is a function of neither time nor frequency. This leads to the conclusion that the quadratic distribution is time-shift invariant if the kernel is independent of time, and frequency-shift invariant if the kernel is independent of frequency. However, the kernel can be signal dependent. The weak time support of a quadratic distribution is satisfied if

\int \phi(\nu, \tau) e^{i2\pi\nu t} \, d\nu = 0 \quad \text{for } |\tau| < 2|t|,

and the weak frequency support if

\int \phi(\nu, \tau) e^{-i2\pi f\tau} \, d\tau = 0 \quad \text{for } |\nu| < 2|f|.

The Wigner distribution has weak time and frequency support, but not strong time and frequency support, as cross-terms arise in between signal components of a multi-component signal. Strong time support means that whenever the signal is zero, the distribution is also zero, giving a stronger restriction on the ambiguity kernel,

\int \phi(\nu, \tau) e^{i2\pi\nu t} \, d\nu = 0 \quad \text{for } |\tau| \neq 2|t|,

and similarly, strong frequency support implies that

\int \phi(\nu, \tau) e^{-i2\pi f\tau} \, d\tau = 0 \quad \text{for } |\nu| \neq 2|f|.

    For proofs see [41].


    4.2 RID and product kernels

Methods to reduce the cross-terms, sometimes also referred to as interference terms, have been proposed, and a number of useful kernels that fall into the so-called Reduced Interference Distribution (RID) class can be found. Maybe the most applied RID is the Choi-Williams distribution, [42], also called the Exponential distribution (ED), with the ambiguity kernel defined as

\phi_{ED}(\nu, \tau) = e^{-\nu^2\tau^2/\sigma}, (4.9)

where \sigma is a design parameter. The Choi-Williams distribution also falls into the subclass of product kernels, which have the advantage in optimization of being dependent on one variable only, i.e. x = \nu\tau. The Choi-Williams distribution also satisfies the marginals, as

\phi_{ED}(0, \tau) = \phi_{ED}(\nu, 0) = 1.

The kernel is also Hermitian, i.e.,

\phi_{ED}(\nu, \tau) = \phi_{ED}^*(-\nu, -\tau),

ensuring realness of the time-frequency distribution.

    Example

Comparing the Choi-Williams distribution with the Wigner-Ville distribution for two example signals, we see in Figure 4.2 a nice view, where the Choi-Williams distribution reduces the cross-terms by smearing them out, but still keeps most of the concentration of the auto-terms, when the signal components are located at different frequencies as well as different times. However, it is important to note that this nice view is not fully repeated for the two complex sinusoids located at the same frequency, see Figure 4.3, where the cross-terms are still present. Comparing their sizes shows that the cross-terms of the Wigner-Ville distribution are twice the height of the auto-terms, whereas the Choi-Williams cross-terms are about half the height of the auto-terms, a reduction by a factor of four. Still, the remaining amplitude of the cross-terms could cause misinterpretation of the distributions. A similar performance would be seen for two components located at the same time.
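The kernel of Eq. (4.9) is easy to inspect numerically. The sketch below uses the reconstructed form e^{-\nu^2\tau^2/\sigma} and grid choices of my own; it checks the marginal-preserving axes and the attenuation away from the origin, where the cross-terms live:

```python
import numpy as np

# Choi-Williams (Exponential distribution) kernel on a doppler-lag grid,
# phi(nu, tau) = exp(-nu^2 tau^2 / sigma), following Eq. (4.9);
# the grid sizes below are arbitrary choices for illustration.
def choi_williams(nu, tau, sigma=1.0):
    NU, TAU = np.meshgrid(nu, tau)        # rows follow tau, columns nu
    return np.exp(-NU**2 * TAU**2 / sigma)

nu = np.linspace(-0.5, 0.5, 65)
tau = np.arange(-32, 33)
phi = choi_williams(nu, tau)

# Eq. (4.6): the kernel equals one along both axes, preserving the marginals,
on_axes = np.allclose(phi[tau == 0, :], 1.0) and np.allclose(phi[:, nu == 0], 1.0)
# while points far from the origin -- where cross-terms live -- are suppressed.
far = phi[int(np.argmin(np.abs(tau - 20))), int(np.argmin(np.abs(nu - 0.25)))]
print(on_axes, far < 1e-3)   # True True
```

Note that the kernel never decays along the \nu- and \tau-axes themselves, which is exactly why cross-terms between components at the same time or the same frequency (as in Figure 4.3) survive the filtering.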

    Other ways of dealing with the cross-terms have been applied, e.g. the Born-Jordandistribution derived by Cohen, [18], using a rule defined by Born and Jordan. The


    Figure 4.2: The Wigner and Choi-Williams distributions of two complex sinusoidswith Gaussian envelopes located at t1 = 64, f1 = 0.1 and t2 = 128, f2 = 0.2.


    Figure 4.3: The Wigner-Ville and Choi-Williams distributions of two complex sinu-soids with Gaussian envelopes located at t1 = 64, f1 = 0.1 and t2 = 128, f2 = 0.1.


properties of this kernel were not fully understood until the 1990s, with the work of Jeong and Williams, [43]. The ambiguity kernel for the Born-Jordan distribution, also called the sinc distribution, is

\phi_{BJ}(\nu, \tau) = \text{sinc}(a\nu\tau) = \frac{\sin(\pi a\nu\tau)}{\pi a\nu\tau}, (4.10)

    which also fits into the RID class and is a product kernel.

    4.3 Separable kernels

A nice form of simple kernels is the separable kernels, defined by

\phi(\nu, \tau) = G_1(\nu) g_2(\tau). (4.11)

This form transfers easily to the time-frequency domain as \Phi(t, f) = g_1(t) G_2(f), with g_1(t) = F^{-1}\{G_1(\nu)\} and G_2(f) = F\{g_2(\tau)\}. The quadratic time-frequency formulation becomes

W_z^Q(t, f) = g_1(t) *_t W_z(t, f) *_f G_2(f), (4.12)

as

A_z^Q(\nu, \tau) = G_1(\nu) A_z(\nu, \tau) g_2(\tau). (4.13)

    The separable kernel replaces the 2-D convolution of the quadratic time-frequencyrepresentation with two 1-D convolutions, which might be beneficial for some signals.Two special cases can be identified: if

G_1(\nu) = 1, (4.14)

a doppler-independent kernel is given as \phi(\nu, \tau) = g_2(\tau), and the quadratic time-frequency distribution reduces to

W_z^Q(t, f) = W_z(t, f) *_f G_2(f), (4.15)

which is a smoothing in the frequency domain, meaning that only the weak time support is satisfied. The doppler-independent kernel is also given the name Pseudo-Wigner or windowed Wigner distribution. The other case is when

g_2(\tau) = 1, (4.16)

giving the lag-independent kernel, \phi(\nu, \tau) = G_1(\nu), where the weak frequency support is satisfied and the time-frequency formulation gives only a smoothing in the variable t,

W_z^Q(t, f) = g_1(t) *_t W_z(t, f). (4.17)
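The point of Eq. (4.11) is computational: the 2-D smoothing of Eq. (4.12) factors into two 1-D convolutions. A Python sketch, where the Hanning smoothing windows and the random stand-in "Wigner matrix" are assumptions for illustration:

```python
import numpy as np
from scipy.ndimage import convolve1d

# Separable smoothing, Eq. (4.12): one 1-D convolution g1 along time and
# one G2 along frequency, applied to a matrix W (time x frequency).
def separable_smooth(W, g1, G2):
    Wt = convolve1d(W, g1 / g1.sum(), axis=0, mode='nearest')    # time
    return convolve1d(Wt, G2 / G2.sum(), axis=1, mode='nearest') # frequency

rng = np.random.default_rng(1)
W = rng.standard_normal((64, 64))   # oscillating, noise-like stand-in field
g1 = np.hanning(9)                  # lag-independent part (time smoothing)
G2 = np.hanning(9)                  # doppler-independent part (frequency smoothing)

Wq = separable_smooth(W, g1, G2)
# Averaging shrinks the variance of an oscillating field -- the mechanism
# by which the oscillating cross-terms are suppressed.
print(Wq.var() < W.var())   # True
```

Setting g1 to a unit impulse recovers the Pseudo-Wigner case of Eq. (4.15), and setting G2 to a unit impulse recovers the lag-independent case of Eq. (4.17).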


Figure 4.4: The resulting time-frequency plots for three complex sinusoids with Gaussian envelopes, located at t_1 = 64, f_1 = 0.1, t_2 = 128, f_2 = 0.1 and t_3 = 128, f_3 = 0.2; a) The Wigner distribution; b) Doppler-independent kernel; c) Lag-independent kernel.

    Example

A comparison for some example signals shows the advantage of the frequency smoothing of the doppler-independent kernel: the cross-terms between signals with different time locations disappear when averaging (smoothing) in the frequency direction, as these cross-terms oscillate in the frequency direction, Figure 4.4a) and b). For the lag-independent kernel we see the advantage of the smoothing in the time direction in Figure 4.4c), where the cross-terms between components separated in frequency disappear.


    4.4 Exercises

4.1 Compute the ambiguity functions of

a) x(t) = e^{i2\pi f_0 t}

b) x(t) = \delta(t - t_0)

c) x(t) = e^{i2\pi f_0 t} + e^{i2\pi f_1 t}

d) x(t) = A\cos(2\pi f_0 t)

4.2 Compute the ambiguity function of

x(t) = e^{i2\pi f_0 t} + \delta(t - t_0)

4.3 Show that Eq. (4.8) is the condition for the real-valued property of the quadratic class.

4.4 Calculate the Pseudo-Wigner distribution using the lag-window

g(\tau) = e^{-a\tau^2/2},

for the complex sine wave

z(t) = e^{i2\pi f_0 t}.

Compare with the result from the Wigner-Ville distribution. Lead: the following definite integral could be helpful:

\int_{-\infty}^{\infty} e^{2\beta x - \gamma x^2} \, dx = \sqrt{\frac{\pi}{\gamma}} \, e^{\beta^2/\gamma}, \quad \gamma > 0. (4.18)


C4.5 a) Start with the one-component complex-valued signal from exercise C2.5a). Compute and plot the ambiguity function with [A,TI,FI]=quadamb(Xc);. The plot shows the absolute value! (For analysis of the real and imaginary parts, study the output A of the function.) Study the ambiguity function for different one-component signals, e.g., change Tvect and Fvect as inputs to gaussdata. Does the absolute value of the ambiguity function change for the different signals? Verify by a closer study of the absolute value of the output A.

b) Construct another signal with more than one component using gaussdata, e.g. one similar to that in C3.4, and study the resulting ambiguity functions and time-frequency distributions, using the Choi-Williams distribution. To change the parameter \sigma (default \sigma = 1) use [A,TI,FI]=quadamb(Xc,'choi',\sigma) where Fs = 1 and NFFT=1024 (default values). A smaller value of \sigma, e.g., \sigma = 0.1, will reduce cross-terms but will also affect the auto-terms. For large \sigma, the Choi-Williams distribution will be similar to the Wigner distribution. Study the ambiguity kernel (figure 2) and the resulting filtered ambiguity function (figure 1) and explain the effect of the kernel. Study the resulting time-frequency distribution using [W,TI,FI]=quadtf(Xc,'choi',\sigma). Change to Tvect=[70 180], Fvect=[0.1 0.3] of the complex-valued signal and study the Choi-Williams distribution for different values of \sigma. Why is this signal easier to resolve? Explain from the view of the ambiguity kernel and the filtered ambiguity function. How would you describe the form of the ambiguity kernel for the Choi-Williams distribution? How can you see that the marginals are satisfied?

c) Use the three-component complex-valued signal from C3.4 and investigate the difference between the doppler-independent kernel, METHOD='w-wig', and the lag-independent kernel, METHOD='l-ind', for different window lengths (Hanning, default length par = NN/10). Study the filtered ambiguity function and the ambiguity kernel as well as the time-frequency distribution, and explain the differences and how different window lengths, par=WIN, affect the results.

The following additional m-file could be used and modified for your own purposes:

    The m-file [A,TI,FI]=quadamb(X,METHOD,par,Fs,NFFT) computes andplots the filtered ambiguity function (absolute value) using a specifiedmethod.


Chapter 5

Concentration measures

Many of the proposed methods in time-frequency analysis are devoted to suppression of cross-terms while maintaining the concentration of the auto-terms. To measure this, and possibly determine parameters automatically, measurement criteria for the concentration are needed. Such criteria were defined already in [39], and data-adaptive methods can be found, e.g., in [37]. The basic idea of concentration measurement can be illustrated using a simple example from probability theory. We use N non-negative numbers, p_1, p_2, ..., p_N, where

    p1 + p2 + . . .+ pN = 1.

A quadratic test function

\mu = p_1^2 + p_2^2 + ... + p_N^2

will attain its minimum value when p_1 = p_2 = ... = p_N = 1/N (maximally spread) and its maximum value when one p_i = 1 and all the others are zero (minimum spread). Incorporating the unity-sum constraint in the test function gives

\mu = \frac{p_1^2 + p_2^2 + ... + p_N^2}{(p_1 + p_2 + ... + p_N)^2}.
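A quick numerical check of the normalized test function, with hypothetical probability vectors:

```python
import numpy as np

# The normalized quadratic test function of this section:
# mu = (p1^2 + ... + pN^2) / (p1 + ... + pN)^2.
def concentration(p):
    p = np.asarray(p, dtype=float)
    return np.sum(p**2) / np.sum(p)**2

N = 8
uniform = np.full(N, 1.0 / N)        # maximally spread
peaky = np.zeros(N); peaky[0] = 1.0  # minimally spread

print(concentration(uniform))   # 0.125, i.e. the minimum 1/N
print(concentration(peaky))     # 1.0, the maximum
```

The normalization by the squared sum makes the measure insensitive to a common scaling of all p_i, which is what allows it to be carried over to un-normalized time-frequency distributions in Eq. (5.1).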

    5.1 Concentration measures

We can formulate a similar measure for the time-frequency concentration, e.g., as suggested in [37],

\mu = \frac{\iint W^4(t, f) \, dt \, df}{\left( \iint W^2(t, f) \, dt \, df \right)^2}, (5.1)


where W(t, f) is a chosen time-frequency distribution or spectrum. Other norms could of course be used, but this one is similar to the often used kurtosis in statistics. The measure in Eq. (5.1) is however not useful for signals including several components, and a better measure is a more local function, also proposed in [37],

\mu(t, f) = \frac{\iint Q^2(t_1 - t, f_1 - f) W^4(t_1, f_1) \, dt_1 \, df_1}{\left( \iint Q(t_1 - t, f_1 - f) W^2(t_1, f_1) \, dt_1 \, df_1 \right)^2}, (5.2)

where Q(t, f) is a weighting function that determines the region for the concentration measurement. One suggestion is to use a Gaussian function as weighting function, [37]. Another measure is based on the Renyi entropy, [44],

R_\alpha = \frac{1}{1 - \alpha} \log_2\left( \iint W^\alpha(t, f) \, dt \, df \right), \quad \alpha > 2. (5.3)

The behavior depends on \iint W^\alpha(t, f) \, dt \, df for \alpha > 2, similarly to the above-mentioned functions, and the logarithm is a monotone function. However, as 1/(1 - \alpha) is negative for \alpha > 2, the entropy of Eq. (5.3) becomes larger for less concentrated distributions. The Renyi entropy can be interpreted as counting the number of components in a multi-component signal. For odd values of \alpha > 1, R_\alpha is asymptotically invariant to time-frequency cross-terms and does not count them. R_\alpha attains its lowest value for the Wigner distribution of a single Gaussian windowed signal. It is also invariant to time and frequency shifts of the signal. The Renyi entropy can be related to the well-known Shannon entropy

H = -\iint W(t, f) \log_2 W(t, f) \, dt \, df,

which can be recovered from the Renyi entropy in the limit \alpha \to 1, [44]. We should note that the Shannon entropy cannot be used for a distribution including negative values at any time-frequency point, as the logarithm is applied inside the integral. For the Renyi entropy, however, the logarithm is applied outside the integral, and many signals, and especially smoothed distributions, are positive in this context. However, multi-component signals and odd values of \alpha can be found for which \iint W^\alpha(t, f) \, dt \, df < 0; see [44] for examples of such Wigner distributions. Often, the value \alpha = 3 is applied. Many of the time-frequency distributions in the literature have been based either implicitly or explicitly on information measures, e.g., [45, 46].
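A sketch of the Renyi measure on toy distributions; the outer-product "distributions" below are a construction of this sketch, standing in for smoothed (non-negative) time-frequency spectra:

```python
import numpy as np

# Renyi entropy, Eq. (5.3): R = log2( sum W^alpha ) / (1 - alpha).
# The matrix is normalized to unit sum; alpha = 3 is the common choice.
def renyi(W, alpha=3):
    W = W / W.sum()
    return np.log2(np.sum(W**alpha)) / (1 - alpha)

t = np.arange(64)
narrow = np.exp(-0.5 * ((t - 32) / 2.0) ** 2)
wide = np.exp(-0.5 * ((t - 32) / 8.0) ** 2)
# Toy non-negative "distributions" built as outer products.
W_narrow = np.outer(narrow, narrow)
W_wide = np.outer(wide, wide)

# A less concentrated distribution gets the larger entropy.
print(renyi(W_narrow) < renyi(W_wide))   # True
```

The normalization to unit sum corresponds to fixing the total energy before comparing distributions, so the measure responds to spread rather than to scale.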


Chapter 6

Quadratic distributions and Cohen's class

In the 1940s to 60s, during and after the invention of the spectrogram and the Wigner-Ville distribution, a number of other distributions and methods were invented, e.g., the Rihaczek distribution, [15], the Page distribution, [16], and the Margenau-Hill or Levin distribution, [17]. In 1966 a formulation was made by Leon Cohen in quantum mechanics, [18], which included these and an infinite number of other methods via kernel functions. Many of the distributions satisfied the marginals, the instantaneous frequency condition and other properties, and all of these methods are nowadays referred to as Cohen's class. Later, a quadratic class was defined, also including kernels that do not satisfy the marginals, which was a restriction of the original Cohen's class. However, the quadratic class is nowadays often referred to as Cohen's class. Quadratic time-frequency distributions can always be formulated as the multiplication of the ambiguity function and the ambiguity kernel, \phi(\nu, \tau),

A_z^Q(\nu, \tau) = A_z(\nu, \tau) \, \phi(\nu, \tau), (6.1)

    where the ambiguity function is the two-dimensional Fourier transform of the Wigner-Ville distribution and accordingly, for any ambiguity kernel, the smoothed Wigner-Ville distribution is found as the 2-dimensional convolution

W_z^Q(t, f) = W_z(t, f) ** \Phi(t, f)

= \iint A_z(\nu, \tau) \phi(\nu, \tau) e^{-i2\pi(f\tau - \nu t)} \, d\nu \, d\tau

= \iiint z(u + \tau/2) z^*(u - \tau/2) \phi(\nu, \tau) e^{-i2\pi(f\tau - \nu t + \nu u)} \, du \, d\nu \, d\tau, (6.2)

where the last line is the most recognized form, defining the quadratic class. The quadratic class also includes the spectrogram.

    6.1 Kernel interpretation of the spectrogram

The spectrogram, using the analytic signal, is defined as

S_z(t, f) = \left| \int h(t - t_1) z(t_1) e^{-i2\pi f t_1} \, dt_1 \right|^2

= \left( \int h(t - t_1) z(t_1) e^{-i2\pi f t_1} \, dt_1 \right) \left( \int h(t - t_2) z^*(t_2) e^{i2\pi f t_2} \, dt_2 \right).

Replacing t_1 = t' + \tau/2 and t_2 = t' - \tau/2,

S_z(t, f) = \iint z(t' + \tau/2) z^*(t' - \tau/2) h(t - t' - \tau/2) h(t - t' + \tau/2) e^{-i2\pi f\tau} \, d\tau \, dt',

where we can identify the time-lag distribution r_z(t', \tau) = z(t' + \tau/2) z^*(t' - \tau/2) and a time-lag kernel

\phi_h(t, \tau) = h(t + \tau/2) h(t - \tau/2),

giving

W_z^{\phi_h}(t, f) = S_z(t, f) = \iint r_z(t', \tau) \phi_h(t - t', \tau) e^{-i2\pi f\tau} \, dt' \, d\tau.

(The proof of the time-lag formulation of the quadratic time-frequency distribution is treated in exercise 6.1.) Another formulation of the spectrogram in the quadratic class is given by starting with an ambiguity kernel \phi_h(\nu, \tau) which is the ambiguity function of a window h(t), i.e.,

\phi_h(\nu, \tau) = \int h(t + \tau/2) h(t - \tau/2) e^{-i2\pi\nu t} \, dt.

Then we get (we drop the integration limits for simplicity)

W_z^{\phi_h}(t, f) = \iiiint z(t_1 + \tau/2) z^*(t_1 - \tau/2) h(t_2 + \tau/2) h(t_2 - \tau/2) e^{-i2\pi(\nu t_1 + \nu t_2 - \nu t + f\tau)} \, dt_1 \, dt_2 \, d\nu \, d\tau,

which after integration over \nu becomes

= \iiint z(t_1 + \tau/2) z^*(t_1 - \tau/2) h(t_2 + \tau/2) h(t_2 - \tau/2) \delta(t - t_1 - t_2) e^{-i2\pi f\tau} \, d\tau \, dt_1 \, dt_2.

Using that t_2 = t - t_1,

= \iint z(t_1 + \tau/2) z^*(t_1 - \tau/2) h(t - t_1 + \tau/2) h(t - t_1 - \tau/2) e^{-i2\pi f\tau} \, d\tau \, dt_1.

Replacing t_1 + \tau/2 = t' and t_1 - \tau/2 = t'' gives

= \iint z(t') z^*(t'') h(t - t'') h(t - t') e^{-i2\pi f(t' - t'')} \, dt' \, dt'',

which equals the spectrogram,

S_z(t, f) = \left( \int h(t - t') z(t') e^{-i2\pi f t'} \, dt' \right) \left( \int h(t - t'') z^*(t'') e^{i2\pi f t''} \, dt'' \right)

= \left| \int h(t - t') z(t') e^{-i2\pi f t'} \, dt' \right|^2.

If we study the assumed ambiguity kernel of the spectrogram above, we see that the marginals can never be fulfilled. Why?

    6.2 Some historical important distributions

The Rihaczek distribution (RD), [15], is derived from the energy of a complex-valued deterministic signal over finite ranges of t and f; if these ranges become infinitesimal, an energy density function is obtained. The energy of a complex-valued signal is

    E = |z(t)|2dt =

    z(t)z(t)dt =

    z(t)Z(f)ei2piftdfdt, (6.3)

    where then for infinitesimal t and f we define

    Rz(t, f) = z(t)Z(f)ei2pift, (6.4)

as the Rihaczek distribution. (Sometimes this distribution is also called the Kirkwood-Rihaczek distribution, as it was actually derived earlier in the context of quantum mechanics, [47].) We note that the marginals are satisfied, as

\int R_z(t, f) df = |z(t)|^2,

(seen from Eq. (6.3)), and

\int R_z(t, f) dt = Z^*(f) \int z(t) e^{-i2\pi f t} dt = |Z(f)|^2.


Other important properties of the Rihaczek distribution are both strong time support and strong frequency support, which is easily seen as the distribution is zero in the time intervals where z(t) is zero and in the frequency intervals where Z(f) is zero. The Fourier transform in two variables of Eq. (6.4) gives the ambiguity function, from which the ambiguity kernel

\phi(\nu, \tau) = e^{i\pi\nu\tau}, (6.5)

can be identified. From this we also verify that the marginals are satisfied, as \phi(\nu, 0) = \phi(0, \tau) = 1.

    Example

Comparing the results of the Wigner-Ville and Rihaczek distributions for some signals shows that for a sum of two complex-valued sinusoids, (f_1 = 0.05 and f_2 = 0.15), multiplied with Gaussian functions and located at the same time instant, Figure 6.1a), the Rihaczek distribution, Figure 6.1c) (real value) and d) (imaginary value), does not generate any cross-terms as the Wigner-Ville distribution does, Figure 6.1b). The view of the components might be difficult to interpret as they seem to oscillate quite strangely. However, in Figure 6.2, for the sum of two complex-valued sinusoids, (f_1 = 0.05 and f_2 = 0.15), with Gaussian envelopes located at different time intervals and with different frequencies, the Rihaczek distribution produces twice as many cross-terms compared to the Wigner-Ville distribution. Why the cross-terms appear as in Figure 6.2 is intuitively easy to understand if we study Eq. (6.4), where the signal z(t) will be present around the times -30 and 30 and the Fourier transform Z(f) will be present around the frequencies f_1 = 0.05 and f_2 = 0.15. These two functions are combined into the two-dimensional time-frequency representation, which naturally will have components appearing at all combinations of these time and frequency instants.

The real part of the Rihaczek distribution is also defined as a distribution, the Levin distribution (LD), [48], i.e.,

L_z(t, f) = \Re\left[ z(t) Z^*(f) e^{-i2\pi f t} \right]. (6.6)


    Figure 6.1: The Wigner-Ville and Rihaczek distributions of a real valued cosine signalwith Gaussian envelope.


    Figure 6.2: The Wigner-Ville and Rihaczek distributions of two complex sinusoidswith Gaussian envelope located at different time intervals and with different frequen-cies.


The Rihaczek and Levin distributions have their obvious drawbacks, but as they are also intuitively nice from an interpretation point of view, they are often applied with a windowed signal; e.g., a windowed Rihaczek distribution is defined as

R_z^w(t, f) = z(t) \left[ F_f\{z(\tau) w(\tau - t)\} \right]^* e^{-i2\pi f t}. (6.7)

With an appropriate size of the window, the cross-terms of the Rihaczek distribution can be reduced.

Another suggested distribution is the running transform, Z^-(t, f), or the Page distribution, [16], which is based on the Fourier transform of the signal up to time t,

Z^-(t, f) = \int_{-\infty}^{t} z(t_1) e^{-i2\pi f t_1} dt_1. (6.8)

If the signal is zero after time t, the marginal in frequency is |Z^-(t, f)|^2. If a distribution satisfies this marginal, then

\int_{-\infty}^{t} P^-(t_1, f) dt_1 = |Z^-(t, f)|^2.
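In discrete time the running transform becomes a cumulative sum over DFT exponentials, and the Page distribution a forward difference of its squared magnitude, so the marginal above turns into a telescoping sum. A Python/NumPy sketch (our own illustration, with hypothetical parameter choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 128
z = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # arbitrary signal

n = np.arange(N)
E = np.exp(-2j * np.pi * np.outer(n, n) / N)  # DFT exponentials e^{-i2pi nk/N}

# Running transform: Z_minus[t, k] = sum_{m <= t} z[m] e^{-i 2 pi k m / N}
Z_minus = np.cumsum(z[:, None] * E, axis=0)

# Discrete Page distribution: forward difference of |Z_minus|^2 in time
P = np.empty((N, N))
P[0] = np.abs(Z_minus[0]) ** 2
P[1:] = np.abs(Z_minus[1:]) ** 2 - np.abs(Z_minus[:-1]) ** 2

# Marginal in time: summing P up to time t recovers |Z_minus[t, k]|^2 ...
assert np.allclose(np.cumsum(P, axis=0), np.abs(Z_minus) ** 2)
# ... and the full sum gives the spectrum |Z[k]|^2 of the whole signal.
assert np.allclose(P.sum(axis=0), np.abs(np.fft.fft(z)) ** 2)
print("Page marginal verified")
```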

Differentiating with respect to time gives

P^-(t, f) = \frac{\partial}{\partial t}\left[ \left| \int_{-\infty}^{t} z(t_1) e^{-i2\pi f t_1} dt_1 \right|^2 \right]

= \frac{\partial}{\partial t}\left[ \left( \int_{-\infty}^{t} z(t_1) e^{-i2\pi f t_1} dt_1 \right) \left( \int_{-\infty}^{t} z^*(t_1) e^{i2\pi f t_1} dt_1 \right) \right]

= z(t) e^{-i2\pi f t} Z^{-*}(t, f) + Z^-(t, f) z^*(t) e^{i2\pi f t}

= 2\Re\left[ z^*(t) Z^-(t, f) e^{i2\pi f t} \right].

• Maria Sandsten The instantaneous frequency

For a differentiable phase \phi(t) and t_2 > t_1, there exists a time t in the interval between t_1 and t_2 such that

\phi(t_2) - \phi(t_1) = (t_2 - t_1) \phi'(t). (7.3)

If p_I is the period of one oscillation of x(t), then the instantaneous frequency is defined as f_I = 1/p_I. Evaluating over one period, i.e., t_2 = t_1 + p_I, results in \phi(t_2) = \phi(t_1) + 2\pi, and \phi'(t) from Eq. (7.3) simplifies to

\phi'(t) = 2\pi / p_I = 2\pi f_I.

The instantaneous frequency will then, similarly to the constant-frequency example, be

f_I = \frac{\phi'(t)}{2\pi}, (7.4)

for the time t during that oscillation. In the case of a real-valued signal, however, e.g.,

x(t) = A(t) \cos(2\pi f(t) t), (7.5)

the instantaneous frequency will be zero, as this signal has no phase in the sense defined above. This will of course be the case for any real-valued signal. If we could choose, we would prefer the instantaneous frequency to be f_I also for a real-valued signal. Therefore, it is of interest to use analytic signals.

    Analytic signals

As already mentioned, a signal z(t) is analytic if

Z(f) = 0 for f < 0, (7.6)

where Z(f) = F\{z(t)\}, i.e., the analytic signal requires the spectrum to be identical to the spectrum of the real signal for positive frequencies and to be zero for negative frequencies.

Theorem 1 The signal

z(t) = x(t) + i y(t), (7.7)

where x(t) and y(t) are real-valued, is analytic with a real-valued DC component, if and only if

Y(f) = (-i \, \mathrm{sign}(f)) X(f), (7.8)

where X(f) = F\{x(t)\} and Y(f) = F\{y(t)\}, and where

\mathrm{sign}(f) = \begin{cases} 1 & \text{if } f > 0 \\ 0 & \text{if } f = 0 \\ -1 & \text{if } f < 0. \end{cases} (7.9)

We get

Z(f) = \begin{cases} 2X(f) & \text{if } f > 0 \\ X(f) & \text{if } f = 0 \\ 0 & \text{if } f < 0. \end{cases} (7.10)

The inverse Fourier transform of Z(f) is the analytic signal corresponding to the real signal x(t). The calculation of the analytic signal can be made in the time domain as well, see e.g., [31], but the interpretation in the frequency domain is much easier, using the Hilbert transform. The Hilbert transform of a signal x(t) is computed as

H[x(t)] = F^{-1}\{(-i \, \mathrm{sign}(f)) F\{x(t)\}\}, (7.11)

giving

z(t) = x(t) + i H[x(t)]. (7.12)

    Example

The Hilbert transform introduces a phase lag of 90 degrees (-\pi/2), caused by the multiplication by -i = e^{-i\pi/2}; the resulting signal is then said to be in quadrature with the original signal. For the positive constant f_0,

H[\cos(2\pi f_0 t)] = \sin(2\pi f_0 t),
H[\sin(2\pi f_0 t)] = -\cos(2\pi f_0 t).

We can also assume

H[A(t) \cos(\phi(t))] = A(t) \sin(\phi(t)),

if the variation of A(t) is sufficiently slow. Sufficiently slow in this case means that the variation should be slow enough to ensure that the spectrum of A(t) does not overlap the spectrum of \cos(\phi(t)). This means that all low frequencies are contained in A(t), while the high frequencies are in \phi(t). We get

z(t) = A(t) \cos(\phi(t)) + i H[A(t) \cos(\phi(t))]
= A(t) \cos(\phi(t)) + i A(t) \sin(\phi(t))
= A(t) e^{i\phi(t)}.
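Equation (7.10) translates directly into a frequency-domain construction of the analytic signal: keep the DC bin, double the positive-frequency bins and zero the negative ones. A Python/NumPy sketch (our own illustration; scipy.signal.hilbert uses the same construction):

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence x via the FFT construction of
    Eq. (7.10): keep DC, double positive frequencies, zero negative ones.
    (For even N the Nyquist bin is also kept unscaled.)"""
    x = np.asarray(x, dtype=float)
    N = len(x)
    X = np.fft.fft(x)
    g = np.zeros(N)
    g[0] = 1.0                      # DC bin kept as X(f) at f = 0
    if N % 2 == 0:
        g[N // 2] = 1.0             # Nyquist bin kept unscaled
        g[1:N // 2] = 2.0           # positive frequencies doubled
    else:
        g[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * g)       # negative frequencies are now zero

# Example: the imaginary part of the analytic signal of cos(2 pi f0 t)
# is its Hilbert transform sin(2 pi f0 t). f0 is chosen on the DFT grid.
N, f0 = 256, 16 / 256
t = np.arange(N)
z = analytic_signal(np.cos(2 * np.pi * f0 * t))
assert np.allclose(z.real, np.cos(2 * np.pi * f0 * t), atol=1e-10)
assert np.allclose(z.imag, np.sin(2 * np.pi * f0 * t), atol=1e-10)
```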


    Example

The convolution of an analytic signal with an arbitrary function is analytic, as

\int z(t_1) h(t - t_1) dt_1 = \int_{0}^{\infty} Z(f) H(f) e^{i2\pi f t} df,

where the resulting signal must be analytic, as Z(f) = 0 for f < 0 and the integral thereby is defined only over positive frequencies.

    Example

It is important to note that when the signal is a multi-component signal, the definition of instantaneous frequency becomes difficult. The concept of instantaneous frequency is no longer valid, and methods that rely on the definition of instantaneous frequency do not work properly. One example is the sum of two complex sinusoids,

x(t) = A_1 e^{i2\pi f_1 t} + A_2 e^{i2\pi f_2 t}, (7.13)

where f_1 and f_2 are positive, i.e., an analytic signal. Depending on the values of the frequencies and the amplitudes, very different instantaneous frequencies show up, see Figure 7.1a), where the parameters are f_1 = 0.11, f_2 = 0.2, A_1 = 0.5 and A_2 = 1, giving an instantaneous frequency varying between 0.17 and 0.29, which includes the frequency 0.2 but not at all 0.11. In Figure 7.1b), the amplitude of the first component is changed to A_1 = 1.5, and the resulting instantaneous frequency even becomes negative for certain time points.
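The behavior described above is easy to reproduce numerically by estimating the instantaneous frequency as the derivative of the unwrapped phase. A Python/NumPy sketch (our own illustration, using the parameter values of Figure 7.1a)):

```python
import numpy as np

def instantaneous_frequency(z):
    """IF as the discrete derivative of the unwrapped phase, in cycles
    per sample: f_I(t) = phi'(t) / (2 pi), cf. Eq. (7.4)."""
    phase = np.unwrap(np.angle(z))
    return np.diff(phase) / (2 * np.pi)

n = np.arange(200)

# Single complex sinusoid: the IF is constant and equals f1.
f1 = 0.11
assert np.allclose(instantaneous_frequency(np.exp(2j * np.pi * f1 * n)), f1)

# Two-component signal as in Figure 7.1a): the IF oscillates around f2,
# between roughly 0.17 and 0.29, and never equals the weaker f1 = 0.11.
f2, A1, A2 = 0.2, 0.5, 1.0
x = A1 * np.exp(2j * np.pi * f1 * n) + A2 * np.exp(2j * np.pi * f2 * n)
fi = instantaneous_frequency(x)
print(fi.min(), fi.max())
```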

    7.2 Time-frequency reassignment

The reassignment principle was introduced in 1976 by Kodera, de Villedary and Gendrin, [20], but was not applied to a larger extent until Auger and Flandrin reintroduced the method, [21]. The idea of reassignment is to keep the localization of a single component by reassigning mass to the center of gravity, while the cross-terms are reduced by the smoothing of the Wigner distribution. The spectrogram using the window h(t) is

S_x(t, f) = |X(t, f)|^2,

where

X(t, f) = \int x(t_1) h(t_1 - t) e^{-i2\pi f t_1} dt_1,


Figure 7.1: Example of instantaneous frequency of a two-component signal; a) f_1 = 0.11, f_2 = 0.2, A_1 = 0.5 and A_2 = 1; b) f_1 = 0.11, f_2 = 0.2, A_1 = 1.5 and A_2 = 1.

but it can also be rewritten as the two-dimensional convolution between the Wigner distribution and the time-frequency kernel of the window function,

S_x(t, f) = \iint W_x(u, v) W_h(t - u, f - v) \, du \, dv. (7.14)

In this form, the spectrogram is easily seen as a weighted sum of all Wigner distribution values at the neighboring points. The mass at these points should be assigned to the center of gravity, which is calculated as

\hat{t}_x(t, f) = \frac{1}{S_x(t, f)} \iint u \, W_x(u, v) W_h(t - u, f - v) \, du \, dv, (7.15)

\hat{f}_x(t, f) = \frac{1}{S_x(t, f)} \iint v \, W_x(u, v) W_h(t - u, f - v) \, du \, dv, (7.16)

giving the reassigned spectrogram

RS_x^h(t, f) = \iint S_x(u, v) \, \delta(t - \hat{t}_x(u, v)) \, \delta(f - \hat{f}_x(u, v)) \, du \, dv. (7.17)

    63

  • Maria Sandsten The instantaneous frequency

From the beginning, \hat{t} (local group delay) and \hat{f} (local instantaneous frequency) were related to the phase of the STFT as

\hat{t}_x(t, f) = -\frac{1}{2\pi} \frac{\partial \phi}{\partial f}(t, f), (7.18)

\hat{f}_x(t, f) = f + \frac{1}{2\pi} \frac{\partial \phi}{\partial t}(t, f), (7.19)

where

\phi(t, f) = \arg\{X(t, f)\}.

However, this does not lead to anything useful for discrete-time signals, and a much more efficient implementation is possible, as

\hat{t}_x(t, f) = t + \Re\left\{ \frac{X^{th}(t, f) X^*(t, f)}{|X(t, f)|^2} \right\},

\hat{f}_x(t, f) = f - \frac{1}{2\pi} \Im\left\{ \frac{X^{dh/dt}(t, f) X^*(t, f)}{|X(t, f)|^2} \right\},

where X^{th} and X^{dh/dt} denote STFTs computed with the modified windows t \cdot h(t) and dh(t)/dt.


    Figure 7.2: Spectrogram and reassigned spectrogram of a chirp-signal using window-length 16
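The STFT-based reassignment can be sketched numerically. The Python/NumPy code below is our own simplified illustration (not the course m-file reassignspectrogram): it follows the well-known three-STFT implementation with the windows h(t), t·h(t) and dh/dt, and moves each spectrogram value to the nearest reassigned bin. Since mass is only moved, the total energy of the spectrogram is preserved, which the final check verifies.

```python
import numpy as np

def stft(x, h, hop):
    """STFT X[i, k] = sum_n x[n] h[n - m_i] e^{-i 2 pi k n / N} on the
    coarse time grid m_i = i*hop; N is the window/FFT length."""
    N = len(h)
    k = np.arange(N)
    return np.array([np.fft.fft(x[m:m + N] * h) *
                     np.exp(-2j * np.pi * k * m / N)
                     for m in range(0, len(x) - N, hop)])

# Gaussian window h plus the auxiliary windows t*h(t) and dh/dt.
N, hop = 32, 4
tw = np.arange(N) - N / 2
h = np.exp(-0.5 * (tw / (N / 8)) ** 2)

# Linear chirp test signal.
n = np.arange(512)
x = np.exp(2j * np.pi * (0.05 * n + 0.1 * n ** 2 / len(n)))

X = stft(x, h, hop)
Xth = stft(x, tw * h, hop)           # window t*h(t)
Xdh = stft(x, np.gradient(h), hop)   # window dh/dt (numerical derivative)

S = np.abs(X) ** 2
eps = 1e-12
that = np.real(Xth * np.conj(X)) / (S + eps)                    # samples
fhat = -np.imag(Xdh * np.conj(X)) / (2 * np.pi * (S + eps)) * N  # bins

# Move every spectrogram value to the nearest reassigned (time, freq) bin.
RS = np.zeros_like(S)
for i in range(S.shape[0]):
    for k in range(N):
        ti = int(np.clip(np.round(i + that[i, k] / hop), 0, S.shape[0] - 1))
        ki = int(np.round(k + fhat[i, k])) % N
        RS[ti, ki] += S[i, k]

# Reassignment only moves mass, so the total energy is unchanged.
assert np.isclose(RS.sum(), S.sum())
```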


    C7.4 a) Use the m-file gausschirpdata to compute a Gaussian windowed complex-valued chirp-signal, e.g., with N=256, Nk=256, Tvect=128, F0vect=0,F1vect=0.5. Investigate the time-frequency distribution using differentmethods. Which of the applied methods should actually, in theory, givethe instantaneous frequency of (an infinite length) chirp signal?

b) Investigate the performance, by visual inspection, of the reassigned spectrogram using the m-file reassignspectrogram on gausschirpdata. Use for example the signal of C7.4a) and investigate different window lengths, e.g., WIN=20 and WIN=250. How do the spectrogram and the reassigned spectrogram change for these values? Especially study the direction of the linear increase. Also investigate the resulting spectra when adding some noise to the signal, and try to set a level of the input parameter e, e.g., e = 0.1 or e = 1.

The following additional m-files could be used and modified for your own purposes:

    The m-file [X,T]=gausschirpdata(N,Nk,Tvect,F0vect,F1vect,Fs) gener-ates a complex-valued signal of total length N including a number ofGaussian windowed chirp-components, each of length Nk. The centre timevalues are specified in Tvect, the initial frequency values in F0vect and thefinal frequency values in F1vect. A sample frequency Fs can be specified.

The m-file [SS,MSS]=reassignspectrogram(X,WIN,e,Fs,NFFT) computes and plots the usual windowed spectrogram (SS) and the corresponding reassigned spectrogram (MSS). A window, or a Hanning window length, can be specified using the input variable WIN. The parameter e sets the level below which no reassignment of spectrum values is made. It can speed up the calculations and can also improve the result in a noisy situation.


  • Chapter 8

    The uncertainty principle andGabor expansion

The spectrogram and STFT rely on the idea of Dennis Gabor, who in 1946 suggested representing a signal in two dimensions, time and frequency. He called the two-dimensional area an information diagram and suggested certain elementary signals that occupy the smallest possible area in the information diagram, one quantum of information. In 1971 he received the Nobel Prize for his discovery of the principles underlying the science of holography. However, the Gabor expansion was not related to the short-time Fourier transform until more recently, [52].

    8.1 The uncertainty principle

A signal cannot be both time-limited and frequency-limited at the same time. This is seen, e.g., in the example where a multiplication of any signal x(t) with a rectangle function certainly will time-limit the signal,

x_T(t) = x(t) \cdot \mathrm{rect}[t/T].

In the frequency domain, the result will be of infinite bandwidth, as the multiplication is transferred to a convolution with an infinite sinc function, (\sin x / x),

X_T(f) = X(f) * T \, \mathrm{sinc}(fT).

Similarly, a bandwidth limitation by a rectangle function in the frequency domain will cause a convolution in the time domain with an infinite sinc function.


  • Maria Sandsten The uncertainty principle and Gabor expansion

We can measure the frequency bandwidth using the so-called effective bandwidth B_e,

B_e^2 = \frac{\int f^2 |X(f)|^2 df}{\int |X(f)|^2 df}, (8.1)

where the denominator is the total energy of the signal, and analogously the so-called effective duration T_e is defined as

T_e^2 = \frac{\int t^2 |x(t)|^2 dt}{\int |x(t)|^2 dt}. (8.2)

In these definitions the signal as well as the spectrum are assumed to be located symmetrically around t = 0 and f = 0. These expressions can be compared with the variance of a stochastic variable: if |X(f)|^2 is interpreted as the probability density function of the stochastic variable f, the denominator would be 1 and the numerator would be the variance of f. Using these measures we can define the bandwidth-duration product, B_e \cdot T_e, which for a Gaussian signal is

B_e \cdot T_e = \frac{1}{4\pi},

[39], and for all other signals

B_e \cdot T_e > \frac{1}{4\pi}.

Another name for the bandwidth-duration product is the uncertainty principle, which serves as a measure of the information richness of the signal. Gabor chose the Gaussian signal as his elementary function in the Gabor expansion, [39], as this function is the most concentrated both in time and frequency.
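The bound can be illustrated numerically by evaluating Eqs. (8.1) and (8.2) on sampled signals. In the Python/NumPy sketch below (our own illustration, with hypothetical sampling parameters), a sampled Gaussian attains B_e T_e close to 1/(4\pi), while a Hanning-type window gives a strictly larger product:

```python
import numpy as np

def bt_product(x, dt):
    """Effective duration-bandwidth product Te*Be of Eqs. (8.1)-(8.2)
    for a signal sampled with spacing dt and centered around t = 0."""
    N = len(x)
    t = (np.arange(N) - N // 2) * dt
    Te2 = np.sum(t ** 2 * np.abs(x) ** 2) / np.sum(np.abs(x) ** 2)
    X = np.fft.fft(x)                      # only |X| is needed below
    f = np.fft.fftfreq(N, dt)
    Be2 = np.sum(f ** 2 * np.abs(X) ** 2) / np.sum(np.abs(X) ** 2)
    return np.sqrt(Te2 * Be2)

N, dt = 4096, 0.01
t = (np.arange(N) - N // 2) * dt
gauss = np.exp(-t ** 2 / 2)                            # Gaussian, sigma = 1
hann = np.where(np.abs(t) < 5, np.cos(np.pi * t / 10) ** 2, 0.0)

bt_g = bt_product(gauss, dt)
bt_h = bt_product(hann, dt)
print(bt_g, 1 / (4 * np.pi))   # the Gaussian attains the lower bound
assert abs(bt_g - 1 / (4 * np.pi)) < 1e-3
assert bt_h > bt_g             # any other shape gives a larger product
```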

    8.2 Gabor expansion

For a continuous signal x(t), the Gabor expansion is defined as

x(t) = \sum_{m=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} a_{m,k} \, g(t - mT_0) \, e^{i2\pi k F_0 t}, (8.3)

where T_0 and F_0 denote the time and frequency sampling steps. The coefficients a_{m,k} are called the Gabor coefficients. Gabor chose the Gaussian function

g(t) = \left( \frac{\alpha}{\pi} \right)^{1/4} e^{-\alpha t^2 / 2},

with \alpha a concentration parameter,


as the elementary function, or synthesis window, because it is optimally concentrated in the joint time-frequency domain in terms of the uncertainty principle. However, in practice the Gaussian window cannot be used as such, since it does not possess the property of compact support; using the Gaussian window actually means using a truncated version of this window. The related sampled STFT, also known as the Gabor transform, is

X(mT_0, kF_0) = \int x(t_1) w^*(t_1 - mT_0) e^{-i2\pi k F_0 t_1} dt_1, (8.4)

where X(mT_0, kF_0) = a_{m,k} if the sampling distances T_0 and F_0 satisfy the critical sampling relation F_0 T_0 = 1. In this case there is a uniquely defined analysis window w(t) given from the synthesis window g(t). This is seen if Eq. (8.4) is substituted into Eq. (8.3) as

x(t) = \int x(t_1) \sum_{m} \sum_{k} g(t - mT_0) w^*(t_1 - mT_0) e^{i2\pi k F_0 (t - t_1)} dt_1,

which is true if

\sum_{m} \sum_{k} g(t - mT_0) w^*(t_1 - mT_0) e^{i2\pi k F_0 (t - t_1)} = \delta(t - t_1).

Using the Poisson sum formula and the biorthogonality, see [52], the double summation is reduced to a single integration,

\int g(t) w^*(t - mT_0) e^{-i2\pi k F_0 t} dt = \delta_k \delta_m,

where \delta_k is the Kronecker delta and the analysis window w(t) follows uniquely from a given synthesis window. However, this window does not have attractive properties from a time-frequency analysis perspective, as it is not well localized in time and frequency. If instead an oversampled lattice is used, i.e., F_0 T_0 < 1, the analysis window is no longer unique, and a choice can be made from a time-frequency analysis perspective.

For discrete-time signals, calculation of the analysis window is made using the Zak transform, [53], and a mean square error solution gives an analysis window that is most similar to the synthesis window, [54], see Figure 8.1. The Gaussian synthesis window in Figure 8.1a) generates at critical sampling the analysis window of Figure 8.1b), which is not attractive for generating a time-frequency distribution, but might be the


Figure 8.1: a) A Gaussian synthesis window g(t). The analysis window w(t) at: b) critical sampling; c) oversampling a factor 4; d) oversampling a factor 16.

best for other applications, e.g., image analysis. For higher oversampling, Figure 8.1c) and d), the analysis window becomes more similar to the synthesis window. The Gabor expansion is often used in applications where analysis as well as reconstruction of the signal is needed, but also for estimation of, e.g., linear time-variant (LTV) filters, [55].
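The oversampled case can be illustrated with a small discrete Gabor system. In the so-called painless case (window support not larger than the number of frequency channels), the canonical analysis window has the closed form w = g / (M \sum_m |g(\cdot - ma)|^2), and the pair (g, w) gives perfect reconstruction. The Python/NumPy sketch below is our own illustration; the lattice parameters L, a and M are hypothetical choices giving oversampling factor 2 (i.e., F_0 T_0 = 1/2):

```python
import numpy as np

# Discrete Gabor system on a length-L cyclic signal: hop a = 12 samples,
# M = 24 frequency channels, oversampling factor M/a = 2.
L, a, M = 144, 12, 24
l = np.arange(L)

# Synthesis window g: a Gaussian supported on M samples, cyclically
# centered at sample 0 (the "painless" case: support <= M).
g = np.zeros(L)
g[:M] = np.exp(-0.5 * ((np.arange(M) - M / 2) / (M / 6)) ** 2)
g = np.roll(g, -M // 2)

# Canonical analysis window: w = g / (M * sum_m |g(. - m*a)|^2).
d = np.zeros(L)
for m in range(L // a):
    d += np.abs(np.roll(g, m * a)) ** 2
w = g / (M * d)

# Gabor coefficients via the sampled STFT (cf. Eq. (8.4)):
# c[m, k] = sum_n x[n] w*(n - m*a) e^{-i 2 pi k n / M}
rng = np.random.default_rng(0)
x = rng.standard_normal(L) + 1j * rng.standard_normal(L)
c = np.array([[np.sum(x * np.conj(np.roll(w, m * a)) *
                      np.exp(-2j * np.pi * k * l / M))
               for k in range(M)] for m in range(L // a)])

# Gabor expansion (cf. Eq. (8.3)) with the synthesis window g:
xr = np.zeros(L, dtype=complex)
for m in range(L // a):
    for k in range(M):
        xr += c[m, k] * np.roll(g, m * a) * np.exp(2j * np.pi * k * l / M)

assert np.allclose(xr, x)   # perfect reconstruction at oversampling 2
```

At critical sampling (a = M) this closed form no longer applies, which mirrors the badly localized analysis window of Figure 8.1b).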


  • Chapter 9

    Stochastic time-frequency analysis

Time-frequency analysis of non-stationary and time-varying signals has, in general, been a field of research for a number of years. The general quadratic class of time-frequency representations can be used for this purpose, and a huge number of known time-frequency kernels can be found for different representations. However, many of these were actually developed for deterministic signals (possibly with a disturbance) and are used for a more exploratory description of multi-component signals. In the last 10-15 years, the attempts to find optimal estimation algorithms for non-stationary processes have increased, [22]. A fundamental problem is the estimation of a time-frequency spectrum from a single realization of the process. The IAF for the non-stationary process is

r_z(t, \tau) = E[z(t + \tau/2) z^*(t - \tau/2)], (9.1)

with the estimate from one realization as

\hat{r}_z(t, \tau) = z(t + \tau/2) z^*(t - \tau/2). (9.2)

The Wigner spectrum is defined as

W_z(t, f) = \int r_z(t, \tau) e^{-i2\pi f \tau} d\tau, (9.3)

with an estimate found as

\hat{W}_z(t, f) = \int \hat{r}_z(t, \tau) e^{-i2\pi f \tau} d\tau.
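The estimate \hat{W}_z(t, f) can be computed directly from one sampled realization by forming the IAF over circular lags and taking a DFT over the lag variable. A Python/NumPy sketch (our own illustration; note the factor-two frequency scaling that comes from the two-sample lag step):

```python
import numpy as np

def wigner(z):
    """Discrete Wigner distribution with circular lags:
    W[t, k] = sum_m z[t+m] z*[t-m] e^{-i 2 pi k m / N}.
    Bin k corresponds to the frequency k / (2N) cycles/sample, since
    the lag in the IAF advances two samples at a time."""
    N = len(z)
    m = np.arange(N)
    W = np.empty((N, N), dtype=complex)
    for t in range(N):
        r = z[(t + m) % N] * np.conj(z[(t - m) % N])   # IAF at time t
        W[t] = np.fft.fft(r)
    return W

# Complex sinusoid on the DFT grid: f0 = k0/N cycles per sample.
N, k0 = 64, 10
z = np.exp(2j * np.pi * k0 * np.arange(N) / N)
W = wigner(z)

# The distribution peaks at bin 2*k0 (frequency k0/N) at all times ...
assert np.all(np.argmax(np.abs(W), axis=1) == 2 * k0)
# ... and the time marginal gives N |z[t]|^2 (DFT scaling).
assert np.allclose(W.sum(axis=1), N * np.abs(z) ** 2)
```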


  • Maria Sandsten Stochastic time-frequency analysis

    9.1 Different definitions of non-stationary processes

The spectral decomposition theorem for a stationary process admits the Cramér harmonic decomposition

z(t) = \int e^{i2\pi f_1 t} dZ(f_1).

This decomposition has double orthogonality, which means that

\int e^{i2\pi f_1 t} e^{-i2\pi f_2 t} dt = \delta(f_2 - f_1),

and that

E[dZ(f_1) dZ^*(f_2)] = \delta(f_1 - f_2) S_z(f_1) df_1 df_2.

A zero-mean Gaussian non-stationary process is characterized by the covariance function

p_z(t_1, t_2) = E[z(t_1) z^*(t_2)],

which is dual to the spectral distribution function P_z(f_1, f_2) by means of the Fourier-type relation

E[z(t_1) z^*(t_2)] = \iint P_z(f_1, f_2) e^{i2\pi(f_1 t_1 - f_2 t_2)} df_1 df_2.

The spectral increments of P_z(f_1, f_2) are not orthogonal, as is the case for a stationary process, where the spectral distribution function reduces to P_z(f_1, f_2) = \delta(f_1 - f_2) S_z(f_1). However, for the existence of a decomposition, the requirement

\iint |P_z(f_1, f_2)| df_1 df_2 < \infty


Using the orthogonality relation

\int \psi(t, f_1) \psi^*(t, f_2) dt = \delta(f_2 - f_1),

we find the Karhunen representation as

\int p_z(t_1, t_2) \psi(t_2, f_1) dt_2 = S_z(f_1) \psi(t_1, f_1). (9.4)

The advantage of the Karhunen representation is the orthogonality in time as well as in frequency. However, the interpretation of the variable f_1 as frequency is no longer obvious. The decomposition of the autocovariance leads to

E[|z(t_1)|^2] = \int |\psi(t_1, f_1)|^2 S_z(f_1) df_1,

which defines a time-dependent power spectrum as

W_z(t_1, f_1) = |\psi(t_1, f_1)|^2 S_z(f_1).
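In discrete time, the Karhunen representation amounts to an eigendecomposition of the covariance matrix, with the eigenvectors playing the role of \psi(t, f_1) and the eigenvalues of S_z(f_1). A Python/NumPy sketch (our own illustration, using an LSP-type covariance of the form in Eq. (9.5) as example):

```python
import numpy as np

# Discrete analogue of the Karhunen representation: diagonalize the
# covariance matrix p_z(t1, t2). Example: LSP-type covariance
# q((t1+t2)/2) * r(t1-t2) with Gaussian q and r (hypothetical choices).
N = 64
t = np.arange(N)
t1, t2 = np.meshgrid(t, t, indexing="ij")
q = lambda u: np.exp(-0.5 * ((u - N / 2) / (N / 6)) ** 2)   # non-negative
r = lambda v: np.exp(-0.5 * (v / 4) ** 2)                   # pos. definite
P = q((t1 + t2) / 2) * r(t1 - t2)

# Eigendecomposition of the symmetric matrix P: P = Psi diag(S) Psi^T.
S, Psi = np.linalg.eigh(P)

# Time-dependent power spectrum W(t, f1) = |psi(t, f1)|^2 S(f1).
W = (np.abs(Psi) ** 2) * S[None, :]

# Its marginal over the spectral index recovers the instantaneous power
# E|z(t)|^2, i.e., the diagonal of the covariance matrix.
assert np.allclose(W.sum(axis=1), np.diag(P))
```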

As the non-stationarity property is very general, different restrictions have been applied. One definition is the locally stationary process (LSP), [58], which has a covariance of the form

p_z(t_1, t_2) = q\left( \frac{t_1 + t_2}{2} \right) r(t_1 - t_2), (9.5)

where q(t) is a non-negative function and r(t) is positive definite. Studying a formulation as a local symmetric covariance function, i.e.,

E[z(t + \tau/2) z^*(t - \tau/2)] = q(t) \, r(\tau),

we see that the LSP is the result of a modulation in time of a stationary covariance function. When q(t) tends to a constant, the LSP becomes more stationary. Examples of realizations of such processes are depicted in Figure 9.1 for different parameter values.

Another similar definition is the quasi-stationary process, which applies a deterministic modulation directly to the stationary process, i.e.,

z(t) = d(t) w(t),


Figure 9.1: Examples of realizations of a locally stationary process for different parameter values, c.

where w(t) is a complex-valued weakly stationary process. The local covariance function becomes

E[z(t + \tau/2) z^*(t - \tau/2)] = d(t + \tau/2) d^*(t - \tau/2) r_w(\tau).
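A quasi-stationary process is also easy to simulate: modulate a stationary process with a deterministic function and check that the local power follows |d(t)|^2 r_w(0). A Python/NumPy sketch (our own illustration; a real-valued AR(1) process is used for simplicity and all parameter values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 200, 4000            # samples per realization, no. of realizations
t = np.arange(N)
d = np.exp(-0.5 * ((t - N / 2) / 30) ** 2)   # deterministic modulation

# Stationary AR(1) process w(t) with variance r_w(0) = sige2 / (1 - a^2).
a, sige2 = 0.8, 1.0
e = rng.standard_normal((R, N + 100)) * np.sqrt(sige2)
w = np.zeros_like(e)
for k in range(1, e.shape[1]):
    w[:, k] = a * w[:, k - 1] + e[:, k]
w = w[:, 100:]              # discard burn-in so w is (nearly) stationary

z = d * w                   # quasi-stationary process z(t) = d(t) w(t)

# Empirical local power vs. the theoretical |d(t)|^2 r_w(0).
emp = np.mean(np.abs(z) ** 2, axis=0)
theory = d ** 2 * sige2 / (1 - a ** 2)
corr = np.corrcoef(emp, theory)[0, 1]
print(corr)                 # close to 1 for many realizations
assert corr > 0.98
```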

If the modulation is slow, i.e., the variation of d(t) is small in comparison with the statistical memory of w(t), then d(t + \tau/2) \approx d(t) \approx d(t - \tau/2).

A third definition is connected to linear systems, where we can represent z(t) as the output of a linear time-varying system H whose input is stationary white noise, denoted n(t), with power spectral density S_n(f) = 1, [59],

z(t) = (Hn)(t) = \int h(t, t') n(t') dt'.

We note that if z(t) is stationary, H is time-invariant and the power spectral density of z(t) is given by S_z(f) = |H(f)|^2, where H(f) = \int h(\tau) e^{-i2\pi f \tau} d\tau. The system H is called an innovations system of the process z(t), and the process is underspread if components that are sufficiently distant from each other in the time-frequency plane are effectively uncorrelated. Many papers relating time-frequency analysis and linear time-varying systems can be found, e.g., [27].


Priestley, [60], suggested the oscillatory random process, which is a rather restricted class of processes, as a linear combination of oscillatory processes is not necessarily oscillatory. Here a basis function

\psi(t, f) = A(t, f) e^{i2\pi f t},

is applied, giving the process z(t) as

z(t) = \int \psi(t, f) dZ(f). (9.6)

It is worth noting that the basis functions are not necessarily orthogonal for the Priestley decomposition. Remembering the Cramér decomposition of a stationary process, z(t) = \int e^{i2\pi f t} dZ(f), where the process is decomposed using the orthogonal basis functions e^{i2\pi f t}, we see that the Priestley decomposition is similar in the sense that an amplitude modulation of each complex exponential is applied for the non-stationary oscillatory process. If A(t, f) is slowly varying in time for each frequency, the decomposition is close to orthogonal, as

\int \psi(t, f_1) \psi^*(t, f_2) dt = \int A(t, f_1) A^*(t, f_2) e^{i2\pi(f_1 - f_2)t} dt \approx \delta(f_1 - f_2),

when A(t, f_1) A^*(t, f_2) \approx 1. The so-called e