The Hilbert-Huang Transform: Theory, Applications, Development

University of Iowa

Iowa Research Online

Theses and Dissertations

2011

The Hilbert-Huang Transform: theory, applications, development

Bradley Lee Barnhart

University of Iowa

Copyright 2011 Bradley L. Barnhart

This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/2670

Follow this and additional works at: http://ir.uiowa.edu/etd

Part of the Physics Commons

Recommended Citation

Barnhart, Bradley Lee. "The Hilbert-Huang Transform: theory, applications, development." PhD diss., University of Iowa, 2011. http://ir.uiowa.edu/etd/2670.

THE HILBERT-HUANG TRANSFORM:

THEORY, APPLICATIONS, DEVELOPMENT

by

Bradley Lee Barnhart

An Abstract

Of a thesis submitted in partial fulfillment of the requirements for the Doctor of

Philosophy degree in Physics in the Graduate College of

The University of Iowa

December 2011

Thesis Supervisor: Professor William Eichinger

ABSTRACT

The Hilbert-Huang Transform (HHT) is a data analysis tool, first developed in 1998,

which can be used to extract the periodic components embedded within oscillatory data.

This thesis is dedicated to the understanding, application, and development of this tool.

First, the background theory of HHT will be described and compared with other spectral

analysis tools. Then, a number of applications will be presented, which demonstrate the

capability for HHT to dissect and analyze the periodic components of different oscillatory

data. Finally, a new algorithm is presented which expands HHT's ability to analyze

discontinuous data. The sum result is the creation of a number of useful tools developed

from the application of HHT, as well as an improvement of the HHT tool itself.

Abstract Approved: ________________________________

Thesis Supervisor ________________________________

Title and Department ________________________________

Date

THE HILBERT-HUANG TRANSFORM:

THEORY, APPLICATIONS, DEVELOPMENT

by

Bradley Lee Barnhart

A thesis submitted in partial fulfillment of the requirements for the Doctor of

Philosophy degree in Physics in the Graduate College of

The University of Iowa

December 2011

Thesis Supervisor: Professor William Eichinger

Graduate College

The University of Iowa

Iowa City, Iowa

CERTIFICATE OF APPROVAL

_______________________

PH.D. THESIS

_______________

This is to certify that the Ph.D. thesis of

Bradley Lee Barnhart

has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Physics at the December 2011 graduation.

Thesis Committee: ___________________________________ William Eichinger, Thesis Supervisor

___________________________________ Thomas Boggess Jr.

___________________________________ Paul Kleiber

___________________________________ Wayne Polyzou

___________________________________ Anton Kruger

Dedicado a Eduardo y su duende

ACKNOWLEDGMENTS

I want to first thank my adviser Dr. Bill Eichinger. Thank you for all of your

encouragement, advice, and support. This work would not be possible without you.

Also thank you to my wife Rebecca. You have always created such joy in my life, and

I thank you for all of your love, kindness, and support.

Thank you to my parents, Randall and Nancy, for a childhood which provided the

pathway to success. You are my role models.

And thank you to my dog Lucy. You always give me a great excuse for a long walk.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

I. INTRODUCTION

II. BACKGROUND

    Traditional Spectral Analysis Tools
        Fourier Analysis
        Short-Time Fourier Transform
        Wavelet Analysis
        Generalized Time-Frequency Distributions

III. HILBERT-HUANG TRANSFORM (HHT)

    Hilbert Spectral Analysis
    Empirical Mode Decomposition (EMD)

IV. ANALYSIS OF SUNSPOT VARIABILITY USING THE HILBERT-HUANG TRANSFORM

    Introduction
    Ensemble Empirical Mode Decomposition (EEMD)
    Results
    Discussion
    Further Research

V. EMD APPLIED TO SOLAR IRRADIANCE, GLOBAL TEMPERATURE, AND CO2 CONCENTRATION DATA

    Introduction
    Data Used
    Results
        Cycles in Data
        IMF Comparisons
    Discussion

VI. CHARACTERIZING SAMPLING ERRORS ASSOCIATED WITH THE NEAR-SURFACE ENERGY BUDGET CLOSURE PROBLEM

    The Energy Balance Problem
    EMD as a Dyadic Filter
    Eddy Covariance Methods
        Traditional Eddy Covariance Method
        EMD Eddy Covariance Method
    Orthogonality and Sampling Durations
    How Long is Long Enough?
    Conclusions

VII. AN IMPROVED ENSEMBLE EMD ALGORITHM

    Motivation
    Ensemble Empirical Mode Decomposition
    Errors Due to Data Gaps
    Error Reduction Methods
    Discussion

VIII. SUMMARY

REFERENCES

LIST OF TABLES

Table

5.1 Mean and standard deviation of instantaneous frequencies (1/yrs) calculated using the Hilbert Transform.

5.2 Periods (in years) calculated using Hilbert analysis and zero-crossing method.

5.3 Correlation coefficients (r) between total solar irradiance and sunspot from 1749 to 2009

5.4 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1880 to 1945

5.5 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1945 to 2009

5.6 Correlation coefficients (r) between sunspot number and global mean temperature from 1880 to 1945

5.7 Correlation coefficients (r) between sunspot number and global mean temperature from 1945 to 2009

LIST OF FIGURES

Figure

4.1 Monthly sunspot data decomposed into its intrinsic mode functions (IMFs) using EEMD

4.2 Statistical significance test for the extracted IMFs. Notice the first extracted IMF is below the 1% confidence limit and is therefore considered statistically insignificant from noise

4.3 The monthly sunspot data denoised by removing the first IMF extracted using EEMD

4.4 Hilbert spectra of IMFs representing the (a) 11-year cycle, (b) 20-50-year cycle, and the (c) quasi-100-year cycle

4.5 Short-time Fourier spectrogram of the monthly sunspot data with window sizes of (a) 100 years and (b) 26 years

4.6 Wigner-Ville distribution of sunspot data

4.7 Extracted IMF representing the 11-year solar cycle plotted along with its instantaneous frequency as calculated using equation (6)

5.1 Sunspot number data set and its decomposed IMFs

5.2 Total Solar Irradiance (TSI) measurements and their decomposed IMFs

5.3 Global mean temperature and its decomposed IMFs

5.4 CO2 concentration as measured from the Mauna Loa Observatory

5.5 Subsection of IMF 2, the yearly cycle extracted from the CO2 data using EEMD

5.6 Comparison between TSI and sunspot number IMFs. The TSI data were multiplied by a factor of 100 in order to improve visibility.

5.7 Comparison of IMFs for global mean temperature and total solar irradiance

5.8 Comparison of global mean temperature and sunspot number data IMFs. The temperature was multiplied by a factor of 500 to allow for visibility

6.1 Dyadic nature of EMD when applied to turbulence

6.2 Variance contributions from IMF pairs for 60 minute data sets of vertical wind velocity and temperature

6.3 Covariance contributions from IMF pairs of vertical wind velocity and temperature

6.4 Covariance contributions from w IMF 10 and all T IMFs

6.5 Absolute value of the nonorthogonal fraction of the total covariance from the IMF Covariance Matrices as calculated for 10 days. The top two plots show SMEX 2002 data from Site 161. The bottom two plots show SMEX 2002 from Site 152

6.6 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wT plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF)

6.7 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wq plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF).

6.8 Orthogonal (blue), nonorthogonal (red), and total (black) contributions from each IMF for the sensible heat flux as calculated from Site 161 from SMEX 2002. Each subplot is a different sampling duration, going from left to right and top to bottom in the following order: 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, 150 minutes

7.1 Original data decomposed into its IMFs as well as the IMFs decomposed from discontinuous data

7.2 Error as defined by the summed differences between the discontinuous and continuous extracted IMFs, plotted against data gap size

7.3 Errors plotted as a function of frequency of IMFs. The errors are primarily in the low-frequency IMFs

7.4 Errors plotted as a function of time. For the high-frequency IMFs, the errors occur largely near the gap endpoints

7.5 Comparison of three different decompositions of data. The original signals (black) contain no gaps. The red signals are the decomposed IMFs from the discontinuous EEMD algorithm, and the blue signals are the discontinuous EEMD algorithm used after a mirroring technique was performed

7.6 Comparison of relative error associated with including or not including the mirror technique when using the discontinuous EEMD decomposition

CHAPTER I INTRODUCTION

In order to describe the physical world, measurements must be gathered and

interpreted. Just as it is essential to understand the specifications of the instruments used to

collect data, it is also necessary to understand the strengths and limitations of the tools used

to interpret data.

Frequency analysis tools are used to analyze the internal fluctuations of a signal in

terms of their frequency, or size scales. While frequency analysis tools are beneficial for

describing the contributions to a signal from various frequency or size scales, oftentimes the

tools which are used have limitations that restrict how the data can be interpreted. For

example, Fourier-based analysis tools rely on the mathematical property that any signal can

be reconstructed from the sum of sinusoidal functions. This, in theory, is advantageous and

can be used to describe the relative contributions to the signal from the various sine

functions with different frequency. However, these sinusoidal functions are infinite in extent,

and are required to have constant amplitudes and phases. Imagine standing in a grass field

for an hour feeling the intermittent puffs of wind on your face, and it becomes clear that

nature is not stationary. Or conversely, imagine if ocean waves were required to have

constant amplitudes and phases, and how oddly convenient the world would seem. Since

nature does not fit stationary and linear assumptions, it is necessary then to extend our

mathematical tools which describe nature to more adaptive methods. That is, methods

should be extended to accommodate signals which are nonstationary and which may be the

result of many, perhaps nonlinear, combinations of processes.

Following the advent of traditional Fourier analysis, many new methods have been

developed to accommodate nonstationary signals. These range from short-time Fourier

transforms (STFT), which allow a signal to be nonstationary as long as it is piece-wise

stationary, to wavelet analysis, which can sift out particular signatures from a signal on a

variety of size scales. Generalized time-frequency distributions have also been derived which

encompass special cases such as wavelets or STFT, and include much more complicated

versions of these tools. With each frequency analysis tool come assumptions and limitations

which affect the signal being analyzed. It is important to understand these limitations to

properly interpret the signal.

This dissertation describes a relatively new data analysis tool called the Hilbert-

Huang transform (HHT) which is able to extract the frequency components from possibly

nonlinear and nonstationary intermittent signals. As with any frequency analysis tool, it has

strengths and weaknesses which need to be understood in order to accurately interpret the

output. However, it is a powerful tool which can describe the frequency components locally

and adaptively for nearly any oscillating signal. This makes the tool extremely versatile. For

instance, HHT has been used to study a wide variety of data including rainfall, earthquakes,

heart-rate variability, financial time series, Lidar data, and ocean waves to name a few

subjects. Therefore, it is justified to continue research on this relatively new tool in order to

fully understand the underlying theory, its potential applications, and its development.

This dissertation is divided into eight chapters. Chapter 2 gives a brief background of

current data analysis tools: their strengths and limitations. Chapter 3 introduces HHT and

compares its abilities to these traditional data analysis tools.

Chapters 4-7 describe four separate papers which were submitted to refereed

journals for publication between 2009 and 2011; two are currently published (Chapters 4 and 5) and

two are currently under review (Chapters 6 and 7).

Chapter 4 demonstrates the utility of HHT when applied to oscillatory data, in

particular, sunspot number data. HHT is compared with other data analysis tools and shown

to be useful to describe the local frequency components of complicated data. Chapter 5 uses

a portion of HHT in order to compare two or more periodic cycles within oscillatory data.

The techniques used in Chapter 4 are extended in Chapter 5 in order to compare two

separate cyclic data oscillations.

Chapters 4 and 5 utilized HHT with well-known data which has been analyzed

extensively using alternative methods. In contrast, Chapter 6 utilizes HHT to address an

unsolved problem: the problem of the lack of a near-surface energy budget closure. This

chapter will apply HHT to meteorological data in an attempt to shed new light on this

problem. The poorly understood measurement sampling errors associated with near-surface

fluxes will be analyzed and conclusions associated with the energy budget closure problem

will be discussed.

Chapter 7 describes an improvement to the EMD algorithm in order to

accommodate discontinuous, intermittently sampled data. The benefits of such an

improvement are discussed as well as its limitations.

Finally, Chapter 8 will give a brief summary of the research completed thus far, and

give several suggestions for needed future research.

In order to understand and predict the natural world around us, it is essential not to

fit the world into mathematical equations but rather to expand our mathematical equations

to better fit the natural world. This dissertation aims to accomplish this by analyzing a new data

analysis tool, HHT, which may more locally and adaptively describe the natural world.

CHAPTER II BACKGROUND

Traditional Spectral Analysis Tools

Data analysis is the fundamental connection between measurements and the

conclusions we draw from those measurements. Typically, data analysis tools attempt to

describe the intrinsic variability of measured variables, whether they be temperature, wind

velocity, heart rate, population, rainfall, stock volatility, or any other variable system.

However, it is important to understand how the tools used to analyze data affect the data

itself.

In order to understand the intrinsic variability of a system, measured signals are

oftentimes written mathematically as the sum of their contributing components. Equation

2.1 shows that a time-dependent signal, f(t), can be written as a sum of products of amplitude

coefficients, $a_n$, and basis functions, $\{\phi_n(t)\}$ [6].

$$f(t) = \sum_{n} a_n \phi_n(t) \tag{2.1}$$

The signal and the basis functions could also be written as a function of space

depending on the system being analyzed. If the basis functions form an orthonormal set, the

amplitude coefficients can be calculated as in equation 2.2.

$$a_n = \int f(t)\, \phi_n^{*}(t)\, dt \tag{2.2}$$

The energy density contribution from each component, then, is shown in equation

2.3 [5][6][43].

$$|a_n|^2 \tag{2.3}$$

Note that the series in equation 2.1 can be thought of as a mathematical

approximation to the original signal.
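As an illustration of equations 2.1 to 2.3, the following minimal sketch projects a sampled signal onto a small orthonormal set, computes the energy of each component, and rebuilds the signal. The sampled-sine basis, the sample count, the test amplitudes, and all function names are assumptions chosen for illustration; they do not come from the original text.

```python
import math

N = 64  # number of samples (assumed for this sketch)

def basis(k):
    """k-th orthonormal basis vector: a sampled sine with k cycles over N points."""
    return [math.sqrt(2.0 / N) * math.sin(2.0 * math.pi * k * n / N) for n in range(N)]

def inner(u, v):
    """Discrete analogue of the integral in equation 2.2."""
    return sum(a * b for a, b in zip(u, v))

# Build a test signal from known amplitudes on basis functions 3 and 7.
true_coeffs = {3: 2.0, 7: -0.5}
signal = [sum(a * basis(k)[n] for k, a in true_coeffs.items()) for n in range(N)]

# Equation 2.2: amplitude coefficients by projection onto each basis function.
coeffs = {k: inner(signal, basis(k)) for k in range(1, 11)}

# Equation 2.3: energy contribution of each component.
energy = {k: abs(a) ** 2 for k, a in coeffs.items()}

# Equation 2.1: rebuild the signal from the coefficients.
rebuilt = [sum(coeffs[k] * basis(k)[n] for k in coeffs) for n in range(N)]
max_error = max(abs(s - r) for s, r in zip(signal, rebuilt))
```

Because the basis vectors are orthonormal, each coefficient is recovered independently of the others, and only components 3 and 7 carry energy in this example.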

An enormous number of solutions to problems utilize this technique of describing a

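complicated signal in terms of simpler ones. For example, consider the Schrödinger equation

in equation 2.4.

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V\right]\Psi(\mathbf{r},t) = i\hbar \frac{\partial}{\partial t}\Psi(\mathbf{r},t) \tag{2.4}$$

The general solution to equation 2.4 is shown in equation 2.5

$$\Psi(\mathbf{r},t) = \sum_{n} c_n \psi_n(\mathbf{r})\, e^{-iE_n t/\hbar} \tag{2.5}$$

where the time-dependence in equation 2.5 can be included only when the potential,

V, is independent of time [23]. $\psi_n(\mathbf{r})$ are the solutions to the time-independent Schrödinger

equation as shown in equation 2.6.

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V\right]\psi_n(\mathbf{r}) = E_n \psi_n(\mathbf{r}) \tag{2.6}$$

Depending on the potential V of the system, different coefficients, $c_n$, and

functions, $\psi_n(\mathbf{r})$, are used to describe the solution. For the hydrogen atom, the functions

$\psi_n(\mathbf{r})$ are written as the product of radial functions, described by Laguerre polynomials, and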

the angular functions, which are known as spherical harmonics. Other potentials give

solutions for the wave functions which require spherical or cylindrical Bessel functions [23].

These functions are mathematical approximations which represent the physical processes of

the system.

Fourier Analysis

When analyzing periodic fluctuations in measured data, the most common form of

data analysis is Fourier analysis. Formulated by Joseph Fourier in the early 1800s, Fourier

analysis utilizes the postulate that any signal can be constructed as a sum of sinusoidal

functions [5][6][43]. Therefore, Fourier analysis relies on the assumption that any signal can

be written as in equation 2.1 where the basis functions, $\{\phi_n(t)\}$, are sine and cosine functions.

For signals which contain multiple frequency components, Fourier analysis describes these

signals as the sum of sine waves, with infinite extent, with different frequencies, as shown in

equation 2.7.

$$f(t) = \sum_{n} a_n e^{i\omega_n t} \tag{2.7}$$

Because the frequency of each sinusoidal function must be time-independent,

Fourier analysis is able to construct stationary data only. That is, the frequency of the signal

being analyzed is assumed to not change with time. Also, because the sine waves used to

describe a signal are infinite in extent, Fourier analysis is considered a global analysis tool.

The amplitudes of the basis functions can be calculated as shown in equation 2.8.

$$a(\omega) = F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt \tag{2.8}$$

Equation 2.8 shows that the amplitude coefficients describe the contribution of the

signal at different frequency components, $\omega$. This equation is called the Fourier transform,

and is useful because it provides a frequency-domain representation of a time-domain

function [5][6][43]. The energy density is the square of the amplitude functions. The relative

contribution to the energy density from each frequency component is shown in equation 2.9.

$$E(\omega) = |F(\omega)|^2 \tag{2.9}$$

The total energy density is then the sum of the contributions from all frequencies. It

is a common practice to plot Fourier spectra which plot energy density vs. frequency. This

allows the largest contributing components to be located at particular frequencies.
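A minimal sketch of such a Fourier spectrum follows, using a direct discrete Fourier transform as a stand-in for the integral in equation 2.8. The test signal (two stationary sinusoids with 5 and 12 cycles per record), the record length, and the function name `dft` are assumptions made for this example only.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform (a discrete form of equation 2.8)."""
    n_samples = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_samples) for n, x in enumerate(samples))
        for k in range(n_samples)
    ]

N = 128
# Two stationary sinusoids: 5 and 12 cycles per record (assumed test values).
signal = [
    1.0 * math.sin(2 * math.pi * 5 * n / N) + 0.4 * math.sin(2 * math.pi * 12 * n / N)
    for n in range(N)
]

spectrum = dft(signal)
# Energy density per frequency bin (equation 2.9), positive frequencies only.
energy = [abs(c) ** 2 for c in spectrum[: N // 2]]

# The two largest peaks locate the contributing frequencies.
peaks = sorted(range(len(energy)), key=lambda k: energy[k], reverse=True)[:2]
```

The spectrum identifies which frequencies are present, but it carries no information about when they occur, which is the global character of Fourier analysis described above.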

Short-Time Fourier Transform

As mentioned before, nonstationary signals such as sporadic impulses or aperiodic

signals cannot be described locally using Fourier analysis. In order to accommodate

nonstationary signals, the short-time Fourier transform (STFT) was developed. It is shown

in equation 2.10.

$$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{\infty} f(t)\, w(t - \tau)\, e^{-i\omega t}\, dt \tag{2.10}$$

The idea of STFT is to break a nonstationary signal into sections, in all of which the

signal is stationary. Then, the regular Fourier transform can be calculated in each section and

the energy density can be determined in each section. Therefore, the original signal can be

nonstationary, as long as it is stationary within each window. The window function,

$w(t - \tau)$, is chosen by the user to be a particular size. The STFT spectrogram, which

plots the energy density contributions from each frequency, is time-dependent. However, the

frequencies must be constant within each window [5][43].

Note that the choice of window size is important and determines what frequencies

will be resolved from the data. For instance, a short window will be able to show the

time-dependence of frequencies very locally in time; however, it will only capture the high

frequency components and will not resolve the lower-frequency, longer-period oscillations.

If a longer window is used, the lower frequency components can be resolved; however, the

possibility of capturing nonstationary features from the signal increases [5][43].
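The windowing idea can be sketched as follows. This is a minimal illustration, not the method used in later chapters: the Hann window, the window length of 64 samples, the non-overlapping hop, and the test signal whose frequency jumps halfway through the record are all assumptions chosen for the example.

```python
import cmath
import math

def stft(samples, window_len, hop):
    """Short-time Fourier transform: windowed DFT of successive sections.

    Returns a list of energy spectra, one per window position."""
    # Periodic Hann window (an assumed choice; the text leaves the window open).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_len) for n in range(window_len)]
    frames = []
    for start in range(0, len(samples) - window_len + 1, hop):
        section = [samples[start + n] * window[n] for n in range(window_len)]
        spectrum = [
            sum(x * cmath.exp(-2j * math.pi * k * n / window_len) for n, x in enumerate(section))
            for k in range(window_len // 2)
        ]
        frames.append([abs(c) ** 2 for c in spectrum])
    return frames

# Nonstationary test signal: the frequency jumps halfway through the record.
N = 256
signal = [
    math.sin(2 * math.pi * (0.0625 if n < N // 2 else 0.1875) * n) for n in range(N)
]

frames = stft(signal, window_len=64, hop=64)
# Dominant frequency (cycles per sample) in each window.
dominant = [max(range(len(f)), key=lambda k: f[k]) / 64 for f in frames]
```

With a 64-sample window the lowest nonzero frequency bin is 1/64 cycles per sample, so oscillations with periods longer than the window cannot be resolved; lengthening the window lowers that limit at the cost of time localization.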

Wavelet Analysis

Wavelet analysis is another type of frequency analysis tool. The wavelet transform of

the signal f(t) is shown in equation 2.11 where ψ is the wavelet, a is the scale factor and b is

the time shift.

$$W(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{2.11}$$

The transform basically represents the similarity between a signal and the pre-

determined wavelet at scale a at time b. Wavelets of different size (frequency) scales are used

to generate many wavelet transform coefficients. Energies can be calculated just as in

equation 2.9 to produce time-frequency-energy spectrograms without the need for a fixed

window [8][9][43]. This was an improvement over STFT because, as the result of the flexible

basis functions, both the high-frequency and the low-frequency structures could be analyzed.

Large scale wavelets are used to extract low frequency, large scale features while high

frequency oscillations are extracted with smaller scale wavelets [8][9][43].
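The scale-matching behaviour can be illustrated with a discrete form of equation 2.11. The sketch below assumes a complex Morlet wavelet with centre frequency 6, unit sample spacing, a complex-conjugated wavelet inside the sum, and a small hand-picked set of scales; none of these choices come from the original text.

```python
import cmath
import math

OMEGA0 = 6.0  # centre frequency of the Morlet wavelet (a common, assumed choice)

def morlet(t):
    """Complex Morlet wavelet (normalisation constant omitted)."""
    return cmath.exp(1j * OMEGA0 * t) * math.exp(-0.5 * t * t)

def wavelet_coefficient(samples, scale, shift):
    """Discrete form of equation 2.11 with unit sample spacing."""
    total = sum(
        x * morlet((n - shift) / scale).conjugate() for n, x in enumerate(samples)
    )
    return total / math.sqrt(scale)

# Test signal: one sinusoid with a period of 16 samples.
signal = [math.sin(2 * math.pi * n / 16) for n in range(256)]

scales = [4, 8, 16, 32]
# Wavelet energy at the centre of the record for each scale.
energy = {a: abs(wavelet_coefficient(signal, a, shift=128)) ** 2 for a in scales}
best_scale = max(energy, key=energy.get)
```

The energy is concentrated at the scale whose stretched wavelet oscillates at nearly the same rate as the signal, which is how larger scales pick out lower frequencies.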

Wavelet analysis works well to seek out particular structures at different size

(frequency) scales within data. For example, Morlet wavelets can isolate and analyze

rainfall-runoff events. However, a drawback of wavelet analysis is that the wavelet basis functions,

and therefore the structures being sifted out from the original signal, are chosen a priori. It is

possible that the utilized wavelets may or may not reflect the processes in the analyzed

signal. If an inappropriate set of wavelets is used to correlate with a signal, the calculated

wavelet coefficients and variance of the signal may give misleading and nonphysical results

[8][9][43].

Generalized Time-Frequency Distributions

There are many spectral analysis tools which can be described by an overall

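generalized time-frequency distribution. Equation 2.12 shows this distribution where x(t) is

the signal and $\phi(\theta, \tau)$ is the kernel which determines the properties of the distribution

[9][33].

$$C(t, \omega) = \frac{1}{4\pi^2} \iiint x\!\left(u + \frac{\tau}{2}\right) x^{*}\!\left(u - \frac{\tau}{2}\right) \phi(\theta, \tau)\, e^{-i\theta t - i\tau\omega + i\theta u}\, du\, d\tau\, d\theta \tag{2.12}$$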

The distribution is called the Wigner-Ville distribution when $\phi(\theta, \tau) = 1$. Overall,

the Wigner-Ville distribution gives better time and frequency resolution than STFT and does

not have to sacrifice one resolution for the benefit of the other. Drawbacks of the

Wigner-Ville distribution include the possibility of calculating nonphysical harmonics and even

negative amplitudes. Therefore, the real frequency contributions have to be picked out

amidst nonphysical harmonics. Other kernels can be used, including a bi-Gaussian kernel, which

produces a pseudo-Wigner distribution (PWD), or an exponential kernel, which produces a

Choi-Williams distribution (CWD) [9][33].

CHAPTER III HILBERT-HUANG TRANSFORM (HHT)

An alternative data analysis tool, called the Hilbert-Huang Transform (HHT), has been

proposed by Norden E. Huang [26]. The HHT technique for analyzing data consists of

two components: a decomposition algorithm called empirical mode decomposition (EMD)

and a spectral analysis tool called Hilbert spectral analysis. Both tools will be introduced and

described hereafter. It will be shown that HHT can provide a local description of the

oscillating components of a signal, whether nonstationary or nonlinear. This provides a new

approach for analyzing the variability of signals and can be compared with current tools such

as any of the methods mentioned previously.

Hilbert Spectral Analysis

The purpose of HHT is to provide an alternative to existing spectral

analysis tools for obtaining the time-frequency-energy description of time series data. Also,

the method attempts to describe nonstationary data locally. Rather than a Fourier or wavelet

based transform, the Hilbert transform was used, in order to compute instantaneous

frequencies and amplitudes and describe the signal more locally. Equation 3.1 displays the

Hilbert transform, y(t), which can be written for any function x(t) of $L^p$ class [6]. The PV

denotes Cauchy’s principal value integral.

$$H[x(t)] = y(t) = \frac{1}{\pi}\, PV\!\int_{-\infty}^{\infty} \frac{x(\tau)}{t-\tau}\, d\tau \qquad (3.1)$$

[6][21] determined that an analytic function can be formed with the Hilbert

transform pair as shown in equation 3.2.

$$z(t) = x(t) + i\,y(t) = a(t)\,e^{i\theta(t)} \qquad (3.2)$$

where

$$a(t) = \sqrt{x^2(t) + y^2(t)}, \qquad \theta(t) = \arctan\!\left(\frac{y(t)}{x(t)}\right), \qquad i = \sqrt{-1} \qquad (3.3)$$

a(t) and θ(t) are the instantaneous amplitude and phase functions, respectively

[21]. The instantaneous frequency can then be written as the time derivative of the phase, as

shown in equation 3.4.

$$\omega(t) = \frac{d\theta(t)}{dt} \qquad (3.4)$$

Note that the analytic function z(t) is the mathematical approximation to the original

signal x(t).
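Equations 3.2 through 3.4 can be illustrated with a minimal numerical sketch. This is not the implementation used in this thesis. It assumes a uniformly sampled signal and builds the analytic signal with an FFT, a common discrete approximation of the Hilbert transform. The function names `analytic_signal` and `instantaneous_amplitude_frequency` are illustrative, and the test signal is a synthetic 11-year cosine, not sunspot data.

```python
import numpy as np

def analytic_signal(x):
    """Return z = x + i*y, where y approximates the Hilbert transform of x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep the zero frequency, double the positive frequencies,
    # and discard the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_amplitude_frequency(x, dt):
    """Amplitude a(t) and frequency (cycles per unit time) from equations 3.3 and 3.4."""
    z = analytic_signal(x)
    amplitude = np.abs(z)                    # a(t)
    phase = np.unwrap(np.angle(z))           # theta(t), made continuous
    omega = np.gradient(phase, dt)           # d(theta)/dt, radians per unit time
    return amplitude, omega / (2.0 * np.pi)  # convert to cycles per unit time

# Synthetic example (assumed values): monthly samples of an 11-year cosine.
dt = 1.0 / 12.0                  # sampling interval in years
t = np.arange(1320) * dt         # 110 years, i.e. exactly 10 cycles
x = 50.0 * np.cos(2.0 * np.pi * t / 11.0)
a, f = instantaneous_amplitude_frequency(x, dt)
print(a.mean(), f.mean())
```

Because the synthetic cosine completes a whole number of cycles, the recovered amplitude and frequency are constant. For real IMFs, end effects and noise make both quantities fluctuate.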

Because the amplitude and frequency functions are expressed as functions of

time, the Hilbert spectrum, which displays the relative amplitude or energy (square of

amplitude) contributions for a certain frequency at a specific time, can be constructed as

H(ω,t). Then, a marginal spectrum can be calculated as in equation 3.5, where the spectrum is

integrated over the time domain from 0 to T.

$$h(\omega) = \int_{0}^{T} H(\omega,t)\, dt \qquad (3.5)$$

The marginal spectrum represents the sum of all amplitudes (energies) over the

entire data span. This can be directly compared to the Fourier spectrum shown in

equation 2.9.
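One possible way to build a discrete Hilbert spectrum and its marginal spectrum is sketched below. It assumes the instantaneous amplitudes and frequencies are already available as arrays, and it bins amplitude onto a frequency-time grid. The function names, the bin edges, and the constant test component are all illustrative assumptions.

```python
import numpy as np

def hilbert_spectrum(amplitude, frequency, freq_edges):
    """Bin amplitude onto a (frequency bin, time) grid.

    amplitude, frequency: arrays of shape (n_components, n_time) or (n_time,).
    H[k, i] accumulates a(t_i) of every component whose frequency at t_i
    falls inside frequency bin k.
    """
    amplitude = np.atleast_2d(amplitude)
    frequency = np.atleast_2d(frequency)
    n_bins = len(freq_edges) - 1
    n_time = amplitude.shape[1]
    H = np.zeros((n_bins, n_time))
    for a_j, f_j in zip(amplitude, frequency):
        k = np.digitize(f_j, freq_edges) - 1      # bin index for each time step
        valid = (k >= 0) & (k < n_bins)           # drop out-of-range frequencies
        cols = np.nonzero(valid)[0]
        np.add.at(H, (k[valid], cols), a_j[valid])
    return H

def marginal_spectrum(H, dt):
    """Discrete form of the time integral of H(omega, t) from 0 to T."""
    return H.sum(axis=1) * dt

# Synthetic example: one component, amplitude 2, frequency 0.095 cycles/year.
n_time, dt = 100, 1.0
amp = np.full(n_time, 2.0)
freq = np.full(n_time, 0.095)
edges = np.linspace(0.0, 0.2, 21)                 # 20 bins of width 0.01
H = hilbert_spectrum(amp, freq, edges)
h = marginal_spectrum(H, dt)
print(np.argmax(h), h.max())
```

Squaring the amplitudes before binning would give the energy form of the spectrum instead of the amplitude form.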

[26] and [27] showed that not all functions give “good” Hilbert transforms,

meaning those which produce physical instantaneous frequencies. For example, functions

with non-zero means will give negative frequency contributions using the Hilbert transform

[26][27]. Therefore, the signals which can be analyzed using the Hilbert transform must be

restricted so that their calculated instantaneous frequency functions have physical meaning.

Next, the empirical mode decomposition will be described. It is essentially an algorithm

which decomposes nearly any signal into a finite set of functions which have “good” Hilbert

transforms that produce physically meaningful instantaneous frequencies.

Empirical Mode Decomposition

The EMD algorithm is the other component to the HHT method. The

algorithm attempts to decompose nearly any signal into a finite set of functions, whose

Hilbert transforms give physical instantaneous frequency values. These functions are called

intrinsic mode functions (IMFs). The algorithm utilizes an iterative sifting process which

successively subtracts the local mean from a signal. The sifting process is as follows:

1. Determine the local extrema (maxima, minima) of the signal.

2. Connect the maxima with an interpolation function, creating an upper

envelope about the signal.

3. Connect the minima with an interpolation function, creating a lower

envelope about the signal.

4. Calculate the local mean as the average of the upper and lower envelopes (half

their sum).

5. Subtract the local mean from the signal.

6. Iterate on the residual.

The sifting process is repeated until the signal meets the definition of an IMF, which

will be explained shortly. Then, the IMF is subtracted from the original signal, and the sifting

process is repeated on the remainder. This is repeated until the final residue is a monotonic

function. The last extracted IMF is the lowest frequency component of the signal, better

known as the trend.

Previously, the sifting process was said to stop when the signal met the

criteria of an IMF. Therefore, it is important to understand how an IMF is defined.

Remember the definition of an IMF was formed to ensure that the IMF signals give physical

frequency values when using the Hilbert transform. The definition of an IMF, therefore, is a

signal whose local mean, defined by its upper and lower envelopes, is zero, and whose number of extrema and zero-crossings differ by at

most one [26][27]. IMFs are considered monocomponent functions which do not contain

riding waves [26][27].
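The sifting process described above can be sketched as follows. This is a simplified illustration, not the algorithm as implemented for this thesis. It assumes linear interpolation for the envelopes (chosen only to keep the sketch dependency-free), a fixed number of sifts per IMF in place of a formal IMF stopping test, and a crude boundary treatment that pins the envelopes to the end points. All function names and parameter values are hypothetical.

```python
import numpy as np

def local_extrema(x):
    """Indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.nonzero((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.nonzero((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def envelope(x, idx):
    """Interpolate linearly through the given extrema and the two end points."""
    n = x.size
    knots = np.concatenate(([0], idx, [n - 1]))
    return np.interp(np.arange(n), knots, x[knots])

def sift_once(x):
    """Steps 1-5: subtract the mean of the upper and lower envelopes."""
    maxima, minima = local_extrema(x)
    local_mean = 0.5 * (envelope(x, maxima) + envelope(x, minima))
    return x - local_mean

def emd(signal, n_sifts=10, max_imfs=8):
    """Return (imfs, residue); imfs has shape (n_imfs, n_samples)."""
    residue = np.asarray(signal, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if maxima.size + minima.size < 3:    # residue is (nearly) monotonic
            break
        candidate = residue.copy()
        for _ in range(n_sifts):             # assumption: fixed sift count
            candidate = sift_once(candidate)
        imfs.append(candidate)
        residue = residue - candidate        # continue on the remainder
    stacked = np.vstack(imfs) if imfs else np.empty((0, residue.size))
    return stacked, residue

# Synthetic example: fast oscillation + slow oscillation + linear trend.
t = np.linspace(0.0, 10.0, 1000)
signal = np.sin(2 * np.pi * 5.0 * t) + 0.5 * np.sin(2 * np.pi * 0.5 * t) + 0.1 * t
imfs, residue = emd(signal)
print(imfs.shape, np.max(np.abs(imfs.sum(axis=0) + residue - signal)))
```

By construction the extracted components and the residue sum back to the input, which is a useful sanity check for any variant of the algorithm.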

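Once a signal has been fully decomposed, the signal D(t) can be written as

the finite sum of the IMFs and a final residue as shown in equation 3.6, where $c_j(t)$ denotes

the j-th IMF and $r_n(t)$ the final residue.

$$D(t) = r_n(t) + \sum_{j=1}^{n} c_j(t) \qquad (3.6)$$

Using equations 3.2 and 3.3, the analytic function can be formed as shown in

equation 3.7.

$$D(t) = \mathrm{Re}\left[\sum_{j=1}^{n} a_j(t)\, \exp\!\left(i\int \omega_j(t)\, dt\right)\right] \qquad (3.7)$$

Also, for reference, equation 3.8 shows the Fourier decomposition of a signal, x(t).

$$x(t) = \mathrm{Re}\left[\sum_{j=1}^{\infty} a_j\, e^{i\omega_j t}\right] \qquad (3.8)$$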

Notice that the EMD decomposition can be considered a generalized Fourier

decomposition, because it describes a signal in terms of amplitude and basis functions whose

amplitudes and frequencies may fluctuate with time [26][27]. The HHT will now be used on

a number of different data sets to analyze its applicability.

CHAPTER IV ANALYSIS OF SUNSPOT VARIABILITY USING

THE HILBERT-HUANG TRANSFORM

Introduction

Sunspot number variation has been well studied and represents a crucial component

in the analysis of solar activity [53]. Understanding the intrinsic cycles of sunspot number

fluctuations helps to better characterize and understand the solar processes responsible for

them. Also, it aids in the prediction of future solar activity.

Because sunspot number data are nonstationary and the result of nonlinear

processes, it is necessary to choose a data analysis tool which will accurately describe their

cyclic components locally and adaptively [26][27]. Sunspot cycles are known to be of varying

lengths and amplitudes. While Fourier analysis is the most common data analysis technique

used to extract periodicities from periodic signals, it requires constant amplitudes and phases

and is not well-suited to the problem [5][9]. Therefore, it is justified to explore a new data

analysis technique which may be more suitable to extract the cyclic components from the

sunspot number data set.

The Hilbert-Huang Transform (HHT) is a relatively new data analysis tool which was

specifically developed for analyzing nonstationary and nonlinear signals

[26][27]. Here we present an HHT analysis of monthly sunspot numbers from 1749-2010 and

compare the extracted cyclic components with those found using Fourier analysis as well as

generalized time-frequency distributions.

Ensemble Empirical Mode Decomposition (EEMD)

EMD is a dyadic filter bank in the frequency domain [15]. This means that the sifting

method can only extract IMFs which differ in frequency by more than a factor of 2. An

improved EMD algorithm called Ensemble EMD (EEMD) has been developed by [58]

which utilizes this characteristic to extract robust and statistically significant IMFs. EEMD is

summarized here:

1. Add finite amplitude noise to the original signal.

2. Decompose signal into a finite set of IMFs using the EMD sifting method

described previously.

3. Repeat steps 1 and 2 with different noise data sets.

4. Average the ensemble of extracted IMFs to average out the noise and obtain

mean IMFs.

A complete description of EEMD can be found in [58]. EEMD was used to analyze

monthly sunspot data, and the results are shown below.
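The four EEMD steps summarized above can be sketched as an ensemble loop around any decomposition routine. This is an illustration under stated assumptions, not the procedure of [58]: the `decompose` argument is a hypothetical callable assumed to return the same number of modes on every call, the noise standard deviation is assumed to be expressed as a fraction of the standard deviation of the data, and `two_band_split` is a placeholder that is NOT EMD and is included only so the sketch runs on its own.

```python
import numpy as np

def eemd(signal, decompose, n_trials=80, noise_std=0.2, seed=0):
    """Average the modes returned by `decompose` over noisy copies of the signal."""
    signal = np.asarray(signal, dtype=float)
    rng = np.random.default_rng(seed)
    scale = noise_std * signal.std()     # assumption: noise relative to data std
    total = None
    for _ in range(n_trials):
        noisy = signal + rng.normal(0.0, scale, size=signal.size)  # step 1
        modes = np.asarray(decompose(noisy))                       # step 2
        total = modes if total is None else total + modes          # step 3
    return total / n_trials                                        # step 4

def two_band_split(x, width=25):
    """Placeholder decomposer (not EMD): a moving-average band and the remainder."""
    kernel = np.ones(width) / width
    slow = np.convolve(x, kernel, mode="same")
    return np.vstack([x - slow, slow])

# Synthetic example with two oscillations of different time scales.
t = np.linspace(0.0, 10.0, 1000)
clean = np.sin(2 * np.pi * 2.0 * t) + 0.5 * np.sin(2 * np.pi * 0.2 * t)
mean_modes = eemd(clean, two_band_split)
reconstruction_error = np.max(np.abs(mean_modes.sum(axis=0) - clean))
print(mean_modes.shape, reconstruction_error)
```

With a real EMD routine the number of extracted IMFs can differ between trials, so an implementation has to fix or align the number of modes before averaging.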

Results

Monthly sunspot data from January 1749 to April 2010 were decomposed

into different frequency components using EEMD. Eighty different sets of noise with a

standard deviation of 0.2 were added to the original data and decomposed using EMD. The

ensemble of decomposed IMFs was then averaged to obtain mean IMFs. The data along

with the mean extracted IMFs are shown in Fig. 4.1.

Clearly the extracted IMFs have time-dependent amplitudes and phases and differ

from pure sinusoidal functions. They are the intrinsic fluctuations extracted directly from the

signal using the sifting process and are not pre-determined functions. The 11-year cycle is

shown as the second extracted IMF in Fig. 4.1. Notice that the IMF captures the oscillation

of the signal even though the signal is nonstationary. It is well known that each 11-year cycle

does not oscillate as a perfect sinusoid. In fact, it is possible the cycle is made up of two or

more cyclic components. While the EMD method is unable to separate components

whose periods differ by less than a factor of 2, it is able to display the resulting time-varying

11-year cycle and demonstrate the changes in frequency due to its nonlinear behavior. This

investigation focuses mainly on periodicities equal to or greater than the 11-year Schwabe cycle.

There were originally four higher frequency components which were combined into one

IMF and labeled the high-frequency IMF, as seen in the top plot of Fig. 4.1. The high

frequency oscillations in IMF 1 were determined to be statistically indistinguishable from noise

using a statistical test suggested by [58], who decomposed a large number of

noise data sets using EMD to create statistical significance confidence limits. The 5 extracted

IMFs are shown in Fig. 4.2 along with the 1%, 50%, and 99% confidence limits derived

from [58].

The star in the upper left corner, with mean energy below the zero mark, is the data

set itself and can be ignored. All IMFs are above the 99-percentile confidence limit except

for the highest frequency fluctuations found in the first IMF. Therefore, only the first IMF is

not statistically distinguishable from noise [58].

One application of the EMD or EEMD method is to remove the high frequency

fluctuations from the sunspot data by subtracting IMF 1 from the original signal as shown in

Fig. 4.3. The advantage of this technique is that the highest frequency components were

removed locally through the EMD sifting process. Therefore, meaningful structures were

not smoothed over, as often occurs when using low-pass filtering [58]. For comparison, a

Butterworth low-pass filter was used to remove the high frequency oscillations of the

sunspot number data. The 3 dB cutoff frequency was set to remove periods shorter than

approximately 5.5 years. The correlation coefficient between the Butterworth filtered data

and the original sunspot data was 0.8498, whereas the EMD method filter gave a value of

0.9375. Therefore, EMD provided a more accurate representation of the original signal while

removing the high frequency fluctuations. It is conceivable that an alternative cutoff frequency

or a different low-pass filter could give a better fit; however, this

post-processing is subjective and prone to bias. The EMD method, in contrast, does not require

prior knowledge of the system in order to locally and adaptively extract and remove the

highest frequency content from the signal [26][27].

The other extracted IMFs in Fig. 4.1 represent the longer cycles of the signal.

The 20-50-year cycle is shown as the third IMF in Fig. 4.1. Because EMD is a dyadic filter

bank in the frequency domain, it is possible this IMF is the sum of two or more cycles

whose periods differ by less than a factor of two. The 22-year (Hale) cycle dominates

between approximately 1825 and 1940. However, before and after this time period a

slightly longer cycle of approximately 40-50 years exists. IMF 4 exhibits an approximately 100-year

cyclic oscillation which is known as the Gleissberg cycle [53]. The Gleissberg cycle period is

typically between 60 and 120 years [53]. Finally, the trend is displayed, which shows an upward

tendency in sunspot number over the past 250 years.

Using the Hilbert transform, Hilbert spectra, H(ω,t), were calculated for

each IMF. These can be compared and contrasted with alternative spectral analysis methods

such as STFT and time-frequency distributions.

Fig. 4.4 shows the Hilbert spectra for the extracted IMFs. The frequency is displayed

in cycles/year. Figs. 4.5a and 4.5b show STFT spectrograms of the overall dataset. Fig. 4.5a

uses a window size of 100 years and Fig. 4.5b uses a window size of 26 years. Notice that

the longer window gives better frequency resolution but poorer time resolution, while the

shorter window gives the reverse. [28] also analyzed sunspot data with STFT. They used a pre-emphasis filter, which

amplifies certain portions of the spectrogram, in order to more easily distinguish the cycles

within sunspot data. Figs. 4.5a and 4.5b did not use a pre-emphasis filter; therefore, the

cycles are slightly less resolved than in [28]. See Fig. 2 in [28] for their STFT spectrograms.

[33] also analyzed solar sunspot number using the pseudo-Wigner distribution (PWD). Refer to

Fig. 6 in [33] for the PWD spectrogram of solar sunspot data.
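As a rough guide to the two window sizes used in Figs. 4.5a and 4.5b, the frequency resolution of an STFT is on the order of the reciprocal of the window length $T_w$. The exact factor depends on the window shape, which is not specified here, so the numbers below are order-of-magnitude estimates only.

```latex
\Delta f \approx \frac{1}{T_w}: \qquad
T_w = 100\ \text{years} \;\Rightarrow\; \Delta f \approx 0.010\ \text{cycles/year}, \qquad
T_w = 26\ \text{years} \;\Rightarrow\; \Delta f \approx 0.038\ \text{cycles/year}
```

For comparison, cycles with periods of 11 and 22 years sit at about 0.091 and 0.045 cycles/year, a separation of roughly 0.045 cycles/year, which is only slightly larger than the estimated resolution of the 26-year window.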

The Hilbert spectrum in Fig. 4.4a shows how the 11-year solar cycle is not constant

but actually changes with time. This is because the 11-year solar cycle is different from a

constant frequency sinusoid. In Fig. 4.4a, it oscillates about a mean of 0.0909 cycles/year

which corresponds to a period of approximately 11 years. Fig. 4.4a also shows that the amplitude

fluctuations of the 11-year Schwabe cycle, as can be seen by the color variations from gray to

black, are oscillatory. These fluctuations correspond to the oscillations of IMF 4, as shown in

Fig. 4.1. This is not surprising as the Gleissberg cycle (IMF 4) is the amplitude modulation of

the Schwabe cycle [53].

The STFT spectra in Figs. 4.5a and 4.5b also exhibit a peak near 0.09 cycles/year which

corresponds to the 11-year cycle. However, there are large contributions from other

frequencies. Because Fourier analysis attempts to construct the original signal with a sum of

sine and cosine functions with constant amplitudes and phases, it requires an infinite number

of contributions from different frequencies [5]. Also, the STFT spectrogram does not

capture the oscillation in frequency of the 11-year solar cycle, which is due to the

nonlinearity of the signal. [33] used PWD to better resolve the 11-year solar cycle. The

distributions did not resolve the oscillation in instantaneous frequency due to nonlinearity;

however, they significantly increased the resolution in both frequency and time as compared

to STFT [33].

The high frequency noise as represented by IMF 1 does not show any coherent

energy contributions from a particular frequency so its Hilbert spectrum was not shown.

The Hilbert spectrum for IMF 3, the 20-50-year (quasi-Hale) cycle, is shown in Fig.

4.4b. Notice the frequency increases between approximately 1830 and 1940. The amplitude,

as shown by the color, decreases from 1830 to 1940 but is larger before and after this time

period. It is interesting to note that the STFT does not capture the Hale cycle and the PWD

from [33] shows a very faint Hale cycle in their Fig. 6.

The Gleissberg cycle, as mentioned previously, represents the periodic amplitude

modulation of the 11-year Schwabe cycle [53]. The Hilbert spectrum of the extracted

Gleissberg cycle, IMF 4, is shown in Fig. 4.4c. It exhibits a mostly constant frequency. This

is in contrast to both the STFT and PWD spectra which show a steady decrease in frequency

corresponding to a steadily lengthening cycle. Also, [33] are able to display shorter period

cycles in the PWD spectrograms. For this investigation, only cycles of approximately 11

years and greater are shown.

Fig. 4.6 displays the Wigner-Ville distribution for this data set. While the Wigner-

Ville distribution is able to capture the 11-year cycle, there are nonphysical harmonics which

dominate the spectrum. Therefore, the Hilbert, STFT and PWD spectra are more

informative when used for interpreting the sunspot data.

Fig. 4.7 displays the instantaneous frequency of the 11-year cycle IMF and

the IMF itself. The 11-year IMF has been divided by 1000 and a constant of 0.1 has been

added for visibility with the instantaneous frequency. Notice that the IMF cycles tend to

increase more quickly when rising in number and decay more slowly when falling, which is

the cause for the change in instantaneous frequency. The instantaneous frequency is higher

during the rising in sunspot number and is lower during the prolonged “tail” when the

sunspot number decreases more slowly. This nonlinear behavior is similar to rainfall-runoff

data when a short duration rain event occurs followed by a longer runoff period, causing the

instantaneous frequency of the process to fluctuate with time. This nonlinearity is not

captured using alternative spectral analysis tools.

Discussion

The Hilbert-Huang transform (HHT) has been used to analyze monthly sunspot

numbers and their variability from 1749 to 2010. HHT decomposed the data set into a

number of cyclic components using the ensemble empirical mode decomposition (EEMD).

The IMFs could be viewed in the time domain and compared with the original data. They

were extracted locally and adaptively from the data set and did not require a priori

knowledge about the system or the selection of prescribed basis functions. However, the

method acts as a dyadic filter bank in the frequency domain, meaning that it cannot separate

cycles which differ in period by less than a factor of 2. The Hilbert transform was then used

to calculate spectra and compare with the short-time Fourier transform (STFT) and the

pseudo-Wigner distribution (PWD). The Hilbert method displayed energy contributions

from only a few cyclic components. They were found to be representative of the Schwabe,

Hale, and Gleissberg sunspot cycles. Also, the periodicity of the 11-year solar Schwabe cycle

was shown to be time-dependent. Overall, this analysis demonstrates the utility of HHT

when analyzing nonstationary data which may be due to nonlinear processes. Also, it has

extracted the various cycles from sunspot number data, which can be compared and

contrasted with previous and future sunspot research.

Future Research

The HHT has been shown to be useful for decomposing sunspot number into its

intrinsic frequency components. From this, the study of the signal’s variability on different

time scales is possible. Also, one of the main strengths of the HHT method is its ability to compare

the frequency components of two or more signals to determine relationships between them.

Further research will be pursued to utilize this technique to analyze mean global temperature,

CO2 measurements and total solar irradiance proxy data. Then the different frequency

oscillations (IMFs) for each signal can be compared directly and checked for correlations.

Research should focus on developing techniques to compare different frequency

components and determine whether the two IMFs may be related.

Figure 4.1 Monthly sunspot data decomposed into its intrinsic mode

functions (IMFs) using EEMD

Figure 4.2 Statistical significance test for the extracted IMFs. Notice the first

extracted IMF is below the 1% confidence limit and is therefore considered statistically indistinguishable from noise

Figure 4.3 The monthly sunspot data denoised by removing the first IMF extracted

using EEMD

Figure 4.4 Hilbert spectra of IMFs representing the (a) 11-year cycle, (b) 20-50-year

cycle, and the (c) quasi-100-year cycle

Figure 4.5 Short-time Fourier spectrogram of the monthly sunspot data with window

sizes of (a) 100 years and (b) 26 years

Figure 4.6 Wigner-Ville distribution of sunspot data

Figure 4.7 Extracted IMF representing the 11-year solar cycle plotted along with its

instantaneous frequency as calculated using equation 3.4

CHAPTER V EMD APPLIED TO SOLAR IRRADIANCE, GLOBAL

TEMPERATURE, AND CO2 CONCENTRATION DATA

Introduction

In this investigation, the EMD method is used to isolate and analyze the various

cycles within total solar irradiance, global mean temperature, and CO2 concentration

measurements. The different cyclic components will be compared with one another in the

time domain. For instance, the solar forcing from total solar irradiance will be compared

with the global mean temperature fluctuations at different frequency scales in the time-

domain. This makes it easy to tell when two cyclic components are in phase and when they

are not. The EMD method can locally and adaptively analyze the inherent cyclic components

of nonlinear and nonstationary data. Therefore, it is beneficial to analyze the strengths and

weaknesses of this new tool in the context of climate data sets.

Data Used

The primary goal of this investigation is to utilize a relatively new data analysis tool

to identify, as well as compare and contrast, the intrinsic cycles of a number of possibly

inter-related variables. For demonstration, the following data sets were chosen: sunspot

number, total solar irradiance, global mean temperature, and CO2 concentration.

Sunspot number has been extensively recorded and studied; see, for instance, [53].

Also, previously in this thesis, we decomposed monthly sunspot numbers into the 11-year

(Schwabe), quasi-22-year (Hale), and quasi-100-year (Gleissberg) cycles using Empirical

Mode Decomposition. We compared the HHT results with those from time–frequency

distributions, including short-time Fourier and pseudo-Wigner distributions. This

investigation will build upon these results by comparing the extracted IMFs with IMFs from

other variables in the time-domain. The monthly sunspot number records utilized were

obtained from the solar physics group at NASA's Marshall Space Flight Center. The monthly

sunspot number data set can be located at [39]. The monthly data from 1749 to 2009 were

decomposed into IMFs, then averaged to obtain annual resolution.

Total Solar Irradiance (TSI) is a direct measure of the solar output. Because TSI

measurements have only been available since the mid-1970s via satellites, this investigation

chose to utilize a reconstructed TSI data set from 1749 to 2009. The proxy data used were

annual data from 1749 to 2009, obtained from the LASP Interactive Solar Irradiance Data

Center. The data can be downloaded at [34]. It is recognized that this data set was

reconstructed with the use of sunspot number data. Therefore, correlations between sunspot

number and the TSI data are to be expected.

Global mean temperature is one of the most controversial and important data sets

that exist today. This investigation used monthly temperature data from 1880 to 2009 taken

from NASA's Land-Ocean Temperature Index, LOTI. The data can be found at [38]. The

data were decomposed into IMFs, then averaged to achieve annual resolution.

Yearly CO2 concentration data measured at Mauna Loa observatory were also used

for this investigation. Data were obtained from 1958 to 2010 from the NOAA Earth System

Research Laboratory c/o Dr. Pieter Tans at [40].

Results

The EMD method was utilized to extract the intrinsic cycles of the previously

mentioned datasets. Their decomposed intrinsic mode functions (IMFs) are displayed in

Figs. 5.1 through 5.3. The global mean temperature data consisted of monthly measurements

while the TSI data and CO2 concentration data were recorded annually. Therefore, each

data set was decomposed into IMFs, which were then averaged to provide annual temporal resolution.

Cycles in Data

The CO2 concentration data set is perhaps the most straightforward data set

decomposed by the EMD method. The decomposition, shown in Fig. 5.4, yielded three

IMFs, including high-frequency noise, an annual oscillation, and a steadily increasing trend.

[60] have previously decomposed CO2 concentration proxy data from 1880 to 2002

using EMD. They utilized annual data so they were unable to extract the yearly cycle as

shown in IMF 2. Instead, they extracted noise IMFs and a century long increasing trend. Fig.

5.5 is an enlarged plot of IMF 2, the annual oscillation. Notice how the signal is not a pure

sinusoid, for instance, at approximately the 1984 cycle. EMD is able to represent the actual

fluctuations of the signal, without forcing assumptions of linearity or stationarity [26][27].

Therefore, non-sinusoidal oscillations such as in IMF 2 can be extracted and analyzed in the

time-domain.

Apart from CO2 concentration, the EMD method was also applied to sunspot

number, total solar irradiance, and global mean temperature data sets as shown in Figs. 5.1

through 5.3. Instantaneous Frequencies (IF) were calculated for each IMF from the three

data sets using equation 3.4. These frequencies fluctuate over the entire data duration period.

The mean and standard deviation of the instantaneous frequencies were calculated and are

shown in Table 5.1.

The periods could also be approximated by equation 5.1, where T is the approximate period, Dur is the length of

the signal, and ZC is the number of zero-crossings.

$$T \approx \frac{2\,Dur}{ZC} \qquad (5.1)$$

This calculation of the periods does not take into account the nonstationarity, or

frequency changes in the cycles. However, it does give an idea for the general cycle time

scales for each IMF. Both methods yielded nearly identical periods for the lower IMFs, as

shown in Table 5.2. The higher IMFs, which represent the longer periodic oscillations of the

signal, have greater differences because there were fewer cycles to average over fluctuations

in instantaneous frequencies.
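The zero-crossing estimate can be illustrated with a short sketch. It assumes the period is taken as twice the signal duration divided by the number of zero-crossings, since each full cycle contains two crossings. The function name and the synthetic 11-year test signal are illustrative and are not the data of Table 5.2.

```python
import numpy as np

def period_from_zero_crossings(imf, duration):
    """Approximate mean period as 2 * duration / (number of zero-crossings)."""
    signs = np.sign(np.asarray(imf, dtype=float))
    signs = signs[signs != 0]                    # ignore samples that are exactly zero
    zero_crossings = np.count_nonzero(np.diff(signs) != 0)
    if zero_crossings == 0:
        raise ValueError("signal has no zero-crossings")
    return 2.0 * duration / zero_crossings

# Synthetic example: monthly samples of an 11-year sinusoid over 110 years.
dt = 1.0 / 12.0
t = np.arange(1320) * dt
imf = np.sin(2.0 * np.pi * t / 11.0 + 0.3)
period = period_from_zero_crossings(imf, duration=t.size * dt)
print(period)
```

For a stationary sinusoid this estimate matches the reciprocal of the mean instantaneous frequency. For an IMF with only a few cycles, the count of zero-crossings is coarse and the two estimates can differ.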

The cycles of sunspot number have been studied extensively in the scientific

community. For an overview, see [53]. The most prominent cycle in sunspot data is the

11-year Schwabe cycle shown as IMF 2 in Fig. 5.1. The third IMF shows a less uniform cycle

with a period between 13 and 16 years. IMF 4, which is the quasi-Hale IMF, exhibits a 22-

year cycle from approximately 1840 to 1940 but has longer cycles before and after this time

period. IMF 5 is approximately a 100-year cycle, which is known as the Gleissberg cycle.

Finally, IMF 6 is the trend or the lowest frequency component of the signal.

Total solar irradiance proxy data was also decomposed into 5 IMFs and a trend, as

shown in Fig. 5.2. IMF 2 has an approximately 11-year oscillation. This is not surprising

since the proxy data was partially reconstructed using solar sunspot number data. However,

it is interesting to note the time-dependent amplitude in IMF 2. The IMFs extracted using

EMD are not required to be constant amplitude or frequency [26][27]. The remaining IMFs

are longer oscillations, which will be shown to correspond to sunspot number IMFs.

The extracted IMFs from global mean temperature data are shown in Fig. 5.3. IMF 2

is approximately a 5-year oscillation and IMF 3 represents a quasi-11-year cycle. This will be

compared with the 11-year IMFs of sunspot number and TSI. IMFs 4 and 5 represent the





longer cyclic oscillations with mean periods of 17–24 and 58–65 years, respectively. [53] also

describes an 11-year climate oscillation, as well as large climate oscillations with periods of

approximately 20 and 60 years, using Fourier spectral analysis. These were determined

from peaks in the frequency-domain spectra. The EMD method has expanded this research

by providing the ability to view the cyclic components in the time-domain.

IMF Comparisons

The variables have been decomposed into their IMFs, which represent oscillations at

various time scales. The IMFs can now be compared and contrasted.

CO2 concentration data was decomposed into an annual cycle and a trend. The

annual cycles were not resolved in the other variables because annually averaged data were

used. The trends, by inspection, are positively correlated.

Because the total solar irradiance proxy data was reconstructed partially using

sunspot data, it is not surprising that its IMFs are correlated with those of sunspot data. For

instance, Fig. 5.6 plots the two variables' IMFs together for comparison. Notice that the 11-

year cycle IMF matches very closely as well as the 100-year cycles. The middle IMFs do

correlate for most of the time period; however, there is some disconnect from 1850 to 1900.

The lowest frequency components for each variable can be compared directly from Figs. 5.1

and 5.2. These are clearly well correlated. Correlation values for the IMFs are given in Table

5.3. Again, the correlation between solar irradiance and sunspot number was expected

because the TSI data were reconstructed using sunspot data.

The possibility of correlation between TSI data and global mean temperature has

been a widely debated topic. It is important to understand how the fluctuations in solar

irradiance affect the earth's climate and global mean temperature. Also, it is beneficial to




quantify how much effect fluctuations in solar irradiance have compared to other forcings

within the global climate system. In order to approach this topic, we will first compare

visually the different IMFs of TSI data and global mean temperature. This is comparing the

different periodic cycles inherent within each data set. Fig. 5.7 shows the comparison of

various IMFs from TSI data with those from global mean temperature data.

Fig. 5.7 shows that the data fluctuate between being correlated and not. The first

regime can be seen from 1880 to approximately 1945 where the fluctuations in solar

irradiance appear to not be correlated well with the global temperature fluctuations. In the

third plot down in Fig. 5.7, the two are out of phase by 180 degrees. The two variables in the

second plot appear to lock in phase at approximately 1940 and continue until 2009.

However, the first and third plots show phase locking between the two variables from

1970 to approximately 1995 and lose the correlation for subsequent time periods. Also, the

trends of TSI and temperature can be compared by inspection of Fig. 5.2 and Fig. 5.3.

The difference between these time periods can be seen graphically as well as analyzed

in terms of correlation coefficients. Tables 5.4 and 5.5 display the correlation coefficients

between the IMFs of TSI and globally averaged temperature. These correspond to two

different time periods, namely, 1880–1945 and 1945–2009. It is quite clear that from 1880 to

1945, there is small or negative correlation between the two signals for all time scales,

neglecting the trend. However, the time period 1945–2009 shows a dramatic increase in

correlation coefficient values. Tables 5.4 and 5.5 also show that the trends are well correlated

throughout the entire data duration.
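A windowed correlation of this kind can be sketched as follows. The series below are synthetic stand-ins constructed to be out of phase before 1945 and in phase afterwards. They are not the TSI or temperature IMFs, and the function name and the half-open window convention are assumptions made for the illustration.

```python
import numpy as np

def windowed_correlation(years, a, b, start, end):
    """Pearson correlation of a and b over the years in [start, end)."""
    mask = (years >= start) & (years < end)
    return np.corrcoef(a[mask], b[mask])[0, 1]

# Synthetic stand-ins for two annually resolved IMFs.
years = np.arange(1880, 2010)
solar_like = np.sin(2.0 * np.pi * years / 11.0)
# Out of phase by 180 degrees before 1945, in phase from 1945 onward.
temp_like = np.where(years < 1945, -solar_like, solar_like)

r_early = windowed_correlation(years, solar_like, temp_like, 1880, 1945)
r_late = windowed_correlation(years, solar_like, temp_like, 1945, 2010)
print(r_early, r_late)
```

Computed over the whole record, the correlation of such a pair would be close to zero, which is why splitting the record into separate periods reveals structure that a single coefficient hides.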

Consider two oscillating processes that oscillate with different frequencies. It is

inevitable that they will reinforce one another during certain times and will be out of phase

during others. The EMD method demonstrates that while from 1880 to 1945 the variations




in solar output were not correlated with global temperature, between 1945 and 2009 they

were positively correlated. These correlations were found at a variety of time scales,

including approximately the 11-year, the 22-year, and the 100-year cycles.

Finally we compare the IMFs of global mean temperature with those from sunspot

number data. The results are plotted in Fig. 5.8. Also, all correlations are given in Tables 5.6

and 5.7 where the combination of IMFs of interest are highlighted in bold.

The comparison gives results very similar to those obtained when comparing global mean temperature

IMFs with those of total solar irradiance. Tables 5.6 and 5.7 show shifts in correlation

numerically. Notice that the correlations are mostly negative between 1880 and 1945, as

shown in Table 5.6. Between 1945 and 2009, however, the correlation values dramatically

increase, as shown in Table 5.7. For the first and third plots in Fig. 5.8, there is no correlation

from 1880 until approximately 1970, after which there is correlation until approximately

1995. For years after 1995, the correlation is reduced. For the second plot in Figure 5.8,

there is no correlation until approximately 1940. After 1940 there appears to be correlation

up until 2009. The third plot shows sunspot number and temperature out of phase from

1880 to 1945, as well as after 2000. However, between 1950 and approximately 1980, the

two signals are in phase.

Discussion

The Hilbert–Huang transform is a data analysis tool that is able to analyze

nonstationary data, which may be the result of nonlinear processes. Therefore, it is justified

to analyze various data sets to study the periodic cycles inherent in the data and to compare

different variables at different time scales.




By decomposing CO2 concentration into its IMFs, the different periodic

components could be analyzed in the time-domain. The CO2 concentration measurements

exhibited an annual cycle which, remarkably, has not changed much since 1958. For the last

50 years, the change in CO2 from annual minimum to maximum was 5.7±0.56 ppm as

calculated from the cycles in IMF 2. Superimposed upon this cycle is the long term trend,

which has increased approximately 75 ppm since 1958.

One of the most interesting results of this investigation is the identification of a

quasi-11-year cycle and quasi-22-year cycles within globally averaged temperature data. Also,

the EMD method showed that during particular time periods the quasi-11-year temperature

cycle was locked in phase with the cycles from total solar irradiance and sunspot number. It

seems intuitive that the dominant cycle in solar irradiance output, the 11-year cycle, would

directly affect the temperature at the earth. In fact, a number of studies have shown changes

within the troposphere, which are associated with solar fluctuations [10]. TSI and

temperature oscillations at longer time scales of 22 years and 65 years were also shown to be

correlated during these time periods. There have also been suggested correlations between

arctic-wide surface air temperature records and solar irradiance on decadal and multi-decadal

scales using wavelet analysis (Soon et al., 2009).

The magnitudes of the 11-year cycle fluctuations can be estimated empirically from

the IMFs. From Fig. 5.7, during the last five 11-year cycles, the average change in TSI from

solar minimum to solar maximum was 0.775±0.055 W/m2. For these same five 11-year

cycles, the average change in global mean temperature from minimum to maximum, as

calculated from Figure 5.7 was 0.101±0.012 °C.

It should be noted that decadal variations between 9 and 15 years in the temperature

records could be due to a variety of occurrences in addition to solar forcing. For instance,


volcanic eruptions and the Pacific Decadal Oscillation (PDO) both exhibit cycles at these

periodic scales and may be partially responsible for the resulting cycles present in the global

mean temperature data set.

Over the entire data duration, the 11-year Schwabe cycles remain relatively constant.

That is, over longer periods, they tend to cancel themselves out with relatively symmetric

fluctuations. To determine the net radiative forcing over longer time periods, the trends

extracted using the EMD method can be analyzed. From Fig. 5.3 the trend of globally

averaged temperature has increased approximately 0.44 °C since 1959. Looking at the trend

of TSI data in Fig. 5.2, the change in TSI from 1959 to 2010 was approximately 0.3 W/m2.

Note that this is a maximum estimate, because not all of the energy from the Sun will be

absorbed by the Earth. This can be compared with the forcing associated with an increase in

CO2 concentration over the same time period, which can be calculated using equation 5.2

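$$ \Delta F = 5.35 \, \ln\!\left(\frac{C}{C_0}\right) \ \mathrm{W/m^2} \qquad (5.2) $$

where C is the present CO2 concentration and C_0 is its initial value.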

Equation 5.2 was formulated using radiative transfer models to calculate the radiative

forcing due to a change in CO2 from some initial value to its present value (Myhre et al.,

1998). Solving equation 5.2 for the change in CO2 from 1958 to 2010, based upon the trend

in Figure 5.4 gives a radiative forcing of 1.13 W/m2. Therefore, while the short term and

long term fluctuations of total solar irradiance do produce radiative forcing upon the Earth,

the long term net radiative forcing is much smaller than the net forcing from increasing CO2

concentrations.
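The comparison above can be checked with a few lines of arithmetic. The sketch below assumes the simplified logarithmic expression commonly attributed to Myhre et al. (1998), with a coefficient of 5.35 W/m2. The two concentrations are illustrative round numbers chosen to be about 75 ppm apart, not the trend values read from Figure 5.4, and the function name is hypothetical.

```python
import math


def co2_forcing(c_ppm: float, c0_ppm: float, alpha: float = 5.35) -> float:
    """Radiative forcing in W/m^2 for a CO2 change from c0_ppm to c_ppm.

    Uses the simplified logarithmic form alpha * ln(C / C0).
    """
    return alpha * math.log(c_ppm / c0_ppm)


# Assumed, illustrative concentrations (ppm); not taken from the thesis data.
c_initial = 315.0   # roughly the late-1950s level
c_present = 390.0   # about 75 ppm higher

forcing = co2_forcing(c_present, c_initial)
solar_trend = 0.3   # W/m^2, the upper-bound TSI change quoted in the text

print(f"CO2 forcing: {forcing:.2f} W/m^2")
print(f"Ratio to TSI trend change: {forcing / solar_trend:.1f}")
```

With these assumed inputs the CO2 forcing comes out a little above 1.1 W/m2, the same order as the value quoted in the text and several times the upper-bound change in TSI.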

These estimates of forcing are not necessarily directly connected to absolute changes

in temperature. Multiple feedback mechanisms may exist, which complicate the processes by

which the Earth absorbs and retains energy.




The Hilbert–Huang Transform has been introduced as a relatively new spectral

analysis tool capable of analyzing the cyclic components of nonlinear and nonstationary data.

The Empirical Mode Decomposition method was used to decompose oscillatory signals of

total solar irradiance, sunspot number, global mean temperature, and CO2 concentration into

their Intrinsic Mode Functions. These IMFs exhibited time-dependent amplitudes and

frequencies. The IMFs were then analyzed and compared in the time-domain. Also,

empirical evaluations of radiative forcing from different periodic components of CO2

concentration and total solar irradiance were estimated. The net radiative forcing from

increasing solar irradiance was shown to be much smaller than the forcing due to increases

in CO2 during the last 50 years.


Figure 5.1 Sunspot number data set and its decomposed IMFs.


Figure 5.2 Total Solar Irradiance (TSI) measurements and their decomposed IMFs.


Figure 5.3 Global mean temperature and its decomposed IMFs.


Figure 5.4 CO2 concentration as measured from the Mauna Loa Observatory.


Figure 5.5 Subsection of IMF 2, the yearly cycle extracted from the CO2 data using

EEMD.

        SSN Mean(IF)  SSN Stdev(IF)  TSI Mean(IF)  TSI Stdev(IF)  T Mean(IF)  T Stdev(IF)
IMF 1   0.28          0.14           0.24          0.13           0.28        0.14
IMF 2   0.09          0.02           0.09          0.04           0.20        0.08
IMF 3   0.08          0.09           0.05          0.02           0.10        0.05
IMF 4   0.03          0.01           0.04          0.09           0.06        0.09
IMF 5   0.01          0.005          0.009         0.006          0.02        0.006

Table 5.1 Mean and standard deviation of instantaneous frequencies (1/yr) calculated using the Hilbert Transform.


        SSN Hilb  SSN ZC  TSI Hilb  TSI ZC  T Hilb  T ZC
IMF 1   3.6       3.4     4.2       4.1     3.6     3.0
IMF 2   11        11      11        11      5.0     5.5
IMF 3   13        16      20        19      10      10
IMF 4   37        37      28        52      17      24
IMF 5   93        104     113       104     58      65

Table 5.2 Periods (in years) calculated using Hilbert analysis and the zero-crossing method.


Figure 5.6 Comparison between TSI and sunspot number IMFs. The TSI data were multiplied by a factor of 100 in order to improve visibility.


Sunspot IMF1 IMF2 IMF3 IMF4 IMF5 IMF6

TSI 0.85 0.50 0.82 0.27 0.28 0.33 0.20

IMF1 0.21 0.54 0.26 0.04 -0.08 0.02 -0.02

IMF2 0.61 0.54 0.95 0.10 -0.01 -0.001 -0.03

IMF3 0.36 0.10 0.31 0.74 0.10 0.03 0.06

IMF4 0.26 -0.02 0.01 0.19 0.79 -0.05 0.03

IMF5 0.48 -0.01 0.03 0.01 0.31 0.84 0.25

IMF6 0.59 -0.03 -0.05 -0.01 -0.001 0.59 0.96

Table 5.3 Correlation coefficients (r) between total solar irradiance and sunspot number from 1749 to 2009.


Figure 5.7 Comparison of IMFs for global mean temperature and total solar irradiance.


TSI IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6

T 0.28 0.02 -0.03 -0.34 0.13 0.28 0.41

IMF 1 -0.13 -0.03 -0.08 -0.09 -0.04 -0.04 -0.10

IMF 2 -0.18 0.004 -0.02 -0.34 -0.07 -0.04 -0.14

IMF 3 0.09 0.02 -0.02 -0.40 0.30 0.10 0.13

IMF 4 0.11 0.07 0.08 0.04 -0.06 -0.26 0.26

IMF 5 0.77 0.03 0.02 0.02 0.20 0.66 0.86

IMF 6 0.67 0.01 -0.04 -0.04 0.27 0.37 0.87

Table 5.4 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1880 to 1945.

TSI IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6

T 0.13 0.01 0.09 0.39 0.27 -0.06 0.06

IMF 1 0.02 0.004 0.13 0.17 0.01 0.005 -0.04

IMF 2 -0.02 0.02 0.06 0.43 0.08 -0.10 -0.11

IMF 3 0.10 -0.003 0.03 0.08 0.67 -0.05 0.04

IMF 4 0.27 0.02 0.01 -0.05 0.13 0.59 0.08

IMF 5 -0.85 0.004 0.06 -0.006 -0.28 -0.81 -0.77

IMF 6 0.83 -0.04 -0.04 0.007 0.33 0.26 0.99

Table 5.5 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1945 to 2009.


Figure 5.8 Comparison of global mean temperature and sunspot number data IMFs. The temperature was multiplied by a factor of 500 to allow for visibility.


T IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6

Sunspot # 0.04 0.02 -0.03 -0.34 0.06 0.17 0.07

IMF 1 -0.14 -0.09 -0.11 -0.02 -0.04 0.01 -0.13

IMF 2 -0.22 -0.005 -0.01 -0.31 -0.15 -0.06 -0.17

IMF 3 -0.03 0.08 -0.02 -0.37 0.20 0.08 -0.07

IMF 4 -0.16 0.06 0.01 0.01 0.36 -0.42 -0.22

IMF 5 0.79 0.04 0.03 0.05 0.28 0.80 0.76

IMF 6 0.68 0.01 -0.04 -0.03 0.27 0.40 0.88

Table 5.6 Correlation coefficients (r) between sunspot number and global mean temperature from 1880 to 1945.

T IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6

Sunspot # -0.11 0.01 0.08 0.34 0.13 -0.14 -0.20

IMF 1 -0.13 -0.09 0.05 -0.05 -0.07 -0.11 -0.09

IMF 2 0.04 0.02 0.07 0.42 0.08 -0.03 -0.05

IMF 3 0.06 -0.005 0.02 0.09 0.68 -0.20 0.05

IMF 4 -0.02 0.03 0.02 -0.10 0.03 0.30 -0.17

IMF 5 -0.90 0.01 0.05 -0.005 -0.31 -0.64 -0.90

IMF 6 0.83 -0.04 -0.04 0.007 0.33 0.26 0.99

Table 5.7 Correlation coefficients (r) between sunspot number and global mean temperature from 1945 to 2009.


CHAPTER VI CHARACTERIZING SAMPLING ERRORS ASSOCIATED WITH

THE NEAR-SURFACE ENERGY BUDGET CLOSURE PROBLEM

The Energy Balance Problem

Conservation of energy at the earth’s surface, as defined by the balance of net

radiation and ground heat flux with the sum of turbulent sensible and latent heat fluxes, has

consistently not been satisfied experimentally [14][16][18][20]. Many studies have found the

net radiation and ground heat fluxes are consistently approximately 20% greater than the

turbulent fluxes [20]. This residual is often calculated as in equation 6.1,

$$ R = (Q_{net} - G) - (H + E) \qquad (6.1) $$

where R is the residual, Qnet is the net radiation, G is the soil heat flux, and H and E

are the sensible and latent heat fluxes, respectively, which shows the amount of energy

needed to balance the budget. Any lack of closure pertains not only to heat and moisture

measurements, but also those for trace gases such as carbon dioxide; the energy budget is

not closed for most of the FLUXNET sites, which measure flux of carbon dioxide

[3][13][55]. Accuracy of these measurements is pivotal for understanding the surface

exchange of greenhouse gases and quantifying carbon, as well as heat and moisture, cycling

over specific ecosystems.
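As a rough illustration of how the residual is evaluated, the sketch below assumes the residual is the available energy (net radiation minus soil heat flux) minus the sum of the turbulent fluxes. The flux values are hypothetical, chosen so that the available energy is 20% greater than the turbulent fluxes. The function and variable names are not from the thesis.

```python
def closure_residual(q_net: float, g: float, h: float, e: float) -> float:
    """Energy-budget residual in W/m^2: (Q_net - G) - (H + E)."""
    return (q_net - g) - (h + e)


# Hypothetical midday fluxes in W/m^2 (illustrative only).
q_net = 500.0   # net radiation
g = 50.0        # soil heat flux
h = 125.0       # sensible heat flux
e = 250.0       # latent heat flux

residual = closure_residual(q_net, g, h, e)
available_to_turbulent = (q_net - g) / (h + e)

print(f"Residual: {residual:.1f} W/m^2")
print(f"Available energy / turbulent fluxes: {available_to_turbulent:.2f}")
```

A residual of zero would indicate a closed budget. A positive residual, as in this made-up case, means that the measured turbulent fluxes do not account for all of the available energy.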

A workshop was held in Grenoble, France in 1994 to address the problems resulting

in the lack of closure [19]. The workshop formed the basis for the EBEX-2000 (Energy

Budget Experiment in 2000) which was conducted 50 miles south of Fresno, California.

However, the problems of closure were not able to be completely solved from the

experiment. More recently, in October 2009 in Thurnau, Germany, there was a panel

discussion about the energy budget closure problem which cited the current state of


knowledge, as well as areas where future research was needed [20]. They concluded the

following:

(1) Currently, the energy budget cannot be closed with experimental measurements.

(2) While previous studies have blamed the lack of closure on the high-frequency

response of meteorological instruments, they have been shown to not have a

remarkable effect with the advent of newer and faster sampling systems [20].

(3) The primary issues resulting in the lack of closure are attributed to the Eddy

Covariance technique and the resulting miscalculation of the sensible and latent

heat fluxes, not the net radiation or soil heat flux [20].

(4) One of the main contributions to the lack of closure is the energy transport of

large, low-frequency contributions to the vertical component of the eddy

transport, which are not fully measured using traditional eddy covariance

methods [13][14][19][52]. These are generally due to heterogeneity of the land

surface near the measurement system. These low-frequency transport

mechanisms can be due to slowly moving convection cells, or by the passage of

clouds above the sensing instruments [20].

(5) For some tall tower measurements, these mechanisms can be fully measured by

increasing the averaging time or using wavelet analysis, resulting in energy

balance [14][45][52]. However, when measurements are made nearer to the

surface, or near the surface of the roughness layer (i.e. above forest canopies),

the low-frequency oscillations are not fully measured. Therefore, no significant

amount of flux is measured from these low-frequency oscillations, at least for

averaging intervals between 30 and 240 minutes [19][20].


This section will specifically address the low-frequency contributions to the turbulent

fluxes using empirical mode decomposition and the Eddy Covariance method. We intend to

demonstrate that any finite measurement duration cannot fully capture all the low-frequency

oscillations within a realistic atmosphere. We will demonstrate that the errors within

turbulent fluxes, as calculated using the Eddy Covariance method, are partially due to

including undersampled low-frequency processes.

Other studies have come to similar conclusions; namely, using Ogive functions, they

suppose that low-frequency circulations must be responsible for missing flux [19][52]. We

will propose an alternative method to Ogive functions by determining the largest structure

that can be sufficiently sampled with a particular sampling duration. The EMD method,

then, provides a new method to view the frequency contributions to the total flux. The

contributions also demonstrate whether the processes have been sufficiently sampled.

To present this investigation, we will first introduce a relatively new spectral analysis

tool, Hilbert-Huang Transform, which is specifically designed to handle nonstationary data.

Much like wavelet analysis, it is a spectral analysis tool which extracts the frequency

contributions from an oscillatory signal. However, it does not require the use of pre-

determined basis functions, as in wavelet analysis. This relatively new method utilizes a

decomposition algorithm, called Empirical Mode Decomposition (EMD), which will be

introduced and used to decompose atmospheric wind components, temperature, and

humidity variables into their frequency components. Then we will quantify contributions to

the near-surface turbulent fluxes from each of these decomposed components. From this,

we will demonstrate and quantify errors due to undersampling low-frequency, nonstationary

oscillations when calculating the turbulent fluxes in the near-surface energy budget.


EMD as a Dyadic Filter

The EMD algorithm is able to sift out the intrinsic periodic components from

complicated oscillatory data. These components, called intrinsic mode functions (IMFs), are

time-domain functions which represent the local variability of the original signal at a

particular size (frequency) scale. There are limitations regarding the periodic components

EMD is able to extract from oscillatory data. For instance, EMD can only sift out periodic

components which differ in period by more than factors of two [15][57]. If a signal has two

or more superimposed periodic components which have periods closer than factors of two,

the extracted IMF will be the superposition of all the components within that dyadic range.

When dealing with turbulent atmospheric data, which can be thought of as a

collection of eddies existing at all size scales, the EMD algorithm acts as a dyadic filter. To

demonstrate this point, multiple data sets of 5 minute, 20 Hz temperature data were

decomposed using the EMD algorithm. Fig. 6.1 shows the calculated mean periods of the

IMFs as plotted on a log2 graph against IMF number. The mean periods were estimated

by counting the number of zero-crossings of each IMF and dividing twice the total duration

of the IMF by that count.
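The zero-crossing estimate can be sketched as follows. The code assumes that a full oscillation crosses zero twice, so that the mean period is approximately twice the record duration divided by the number of zero-crossings. A pure sine wave stands in for an IMF here; the function name and test signal are illustrative assumptions, not the code used for Fig. 6.1.

```python
import numpy as np


def mean_period_zero_crossings(imf, dt):
    """Estimate the mean period of an oscillatory component.

    imf : 1-D array of samples
    dt  : sampling interval, in the same units as the returned period
    """
    signs = np.sign(np.asarray(imf, dtype=float))
    signs = signs[signs != 0]                      # ignore exact zeros
    n_crossings = np.count_nonzero(np.diff(signs) != 0)
    if n_crossings == 0:
        return float("inf")                        # no oscillation resolved
    duration = len(imf) * dt
    return 2.0 * duration / n_crossings


# Stand-in for an IMF: 5 minutes of 20 Hz data with a 10 s period.
sampling_rate = 20.0
t = np.arange(int(300 * sampling_rate)) / sampling_rate
test_signal = np.sin(2.0 * np.pi * t / 10.0 + 0.3)

period_estimate = mean_period_zero_crossings(test_signal, 1.0 / sampling_rate)
print(f"Estimated mean period: {period_estimate:.2f} s")
```

The estimate is only as good as the number of crossings counted, so it becomes coarse for the highest-numbered IMFs, which contain few cycles.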

Fig. 6.1 demonstrates that the average period of each IMF is approximately twice the

preceding IMF. Since all time-domain components are additive, each IMF can be interpreted

as containing the sum of all oscillatory components within its particular dyadic range.
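One way to quantify the dyadic behavior is to fit a straight line to log2 of the mean period against IMF number; a slope near one means each IMF has roughly twice the period of the preceding one. The sketch below uses made-up mean periods rather than the values plotted in Fig. 6.1, and the helper name is hypothetical.

```python
import numpy as np


def dyadic_slope(mean_periods):
    """Least-squares slope of log2(mean period) versus IMF number."""
    imf_number = np.arange(1, len(mean_periods) + 1)
    slope, _intercept = np.polyfit(imf_number, np.log2(mean_periods), 1)
    return float(slope)


# Hypothetical mean periods in seconds for IMFs 1 to 6 (illustrative only).
periods = [0.2, 0.41, 0.79, 1.62, 3.1, 6.5]

slope = dyadic_slope(periods)
print(f"Slope: {slope:.2f} (a value near 1 indicates period doubling)")
```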

This also demonstrates that the problem of “mode mixing” is not influential in these

data. The problem called “mode mixing” is where a decomposed IMF contains a mixture of

different, sometimes drastically different, periodic scales [26][27]. Because the IMFs

displayed in Fig. 6.1 are clearly dyadic, each IMF is the sum of all frequency (periodic) scales


within the dyadic range of that IMF; in other words, there is no mixture of drastically

different modes. If there were “mode mixing”, Fig. 6.1 would not be linear, and each IMF

would not have the mean period which is twice the previous.

So, the EMD method acts as a dyadic filter when used with turbulent atmospheric

data. We can use the EMD method to decompose our atmospheric oscillatory data into a set

of IMFs whose different contributions to the turbulent flux can be calculated using a form

of the Eddy Covariance method. First, we will briefly give an overview of the Eddy

Covariance (EC) method.

Eddy Covariance Methods

Traditional Eddy Covariance Method

The eddy covariance (EC) technique is most commonly used for calculating the heat,

moisture, and CO2 fluxes near the earth’s surface [3][19]. The first two of these are the

sensible and latent turbulent heat fluxes which exist in the surface energy budget, and which

are consistently underestimated by approximately 20%. The EC method applies Reynolds

averaging, the method of separating a signal into its mean and fluctuating components, to

the near-surface mass balance, as shown in equations 6.2 through 6.4.

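$$ \frac{\partial k}{\partial t} = -\nabla \cdot (\mathbf{u} k) \qquad (6.2) $$

$$ k = \bar{k} + k' ; \quad \mathbf{u} = \bar{\mathbf{u}} + \mathbf{u}' \qquad (6.3) $$

$$ \frac{\partial (\bar{k} + k')}{\partial t} = -\nabla \cdot [(\bar{\mathbf{u}} + \mathbf{u}')(\bar{k} + k')] = -\nabla \cdot (\bar{\mathbf{u}}\bar{k} + \bar{\mathbf{u}}k' + \mathbf{u}'\bar{k} + \mathbf{u}'k') . \qquad (6.4) $$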


Here k is some scalar, u is the wind velocity vector, overbar denotes the mean, and

prime denotes the fluctuation from this mean. Then, typically the mean of both sides of

equation 6.4 is calculated in order to eliminate the terms that are linear in the fluctuations,

because by definition the average of a fluctuating component is zero.

Under particular assumptions of horizontal homogeneity, the vertical near-surface flux of

some scalar k is shown to be equal to the covariance of the fluctuating component of the

vertical wind velocity w and the fluctuating component of the scalar k, as in equation 6.5

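$$ \mathrm{cov}(w,k) = \overline{w'k'} = \frac{1}{N}\sum_{n=1}^{N}(w_n - \bar{w})(k_n - \bar{k}) \qquad (6.5) $$

where N is the number of samples in the averaging period.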

Overall, the EC method says that the vertical turbulent transport of sensible and

latent heat can be calculated by their covariances with w; just replace k in equation 6.5 with

temperature T and specific humidity, q, for sensible and latent heat, respectively. However,

these calculations are used within near-surface energy budget calculations and are thought to

be approximately 20% underestimated.
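A minimal sketch of the covariance calculation is given below. It removes the mean of each series, multiplies the fluctuations, and averages the product. The synthetic series are random numbers constructed to be positively correlated and stand in for measured vertical wind and temperature; no unit conversion to W/m2 is attempted, and the names are illustrative assumptions.

```python
import numpy as np


def eddy_covariance_flux(w, k):
    """Kinematic flux of scalar k: the mean product of the fluctuations."""
    w = np.asarray(w, dtype=float)
    k = np.asarray(k, dtype=float)
    w_prime = w - w.mean()      # fluctuation of vertical wind about its mean
    k_prime = k - k.mean()      # fluctuation of the scalar about its mean
    return float(np.mean(w_prime * k_prime))


# Synthetic stand-in data: 30 minutes sampled at 20 Hz.
rng = np.random.default_rng(42)
n_samples = 20 * 60 * 30
w = 0.3 * rng.standard_normal(n_samples)                              # m/s
temperature = 295.0 + 0.5 * w + 0.2 * rng.standard_normal(n_samples)  # K

kinematic_heat_flux = eddy_covariance_flux(w, temperature)
print(f"w'T' covariance: {kinematic_heat_flux:.4f} K m/s")
```

Replacing the temperature series with specific humidity gives the corresponding kinematic moisture flux in the same way.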

The EC method relies on a number of assumptions associated with instrumental

setup and data collection. For a full explanation of proper measurement techniques, see

[16][18][54]. For this investigation, we assume that all systematic errors have been minimized

prior to our data analysis.

EMD Eddy Covariance Method

In addition to calculating the total vertical transport of sensible and latent heat fluxes

near the earth’s surface with the EC method, many studies calculate the relative

contributions to the flux from various size (frequency) scales of eddies [31][45][52]. To do

this, they separate the signals into frequency components using spectral analysis tools such as


Fourier analysis or wavelet analysis. Likewise, the EMD method can be used to separate

signals into their periodic components, and analyze their relative contributions to the total

flux.

Remember that any signal can be decomposed with the EMD method into a finite

number of fluctuating IMFs and a residue, as shown in equation 6.6

$$ w(t) = \sum_{i=1}^{n} w_i(t) + r_w $$

$$ k(t) = \sum_{j=1}^{n} k_j(t) + r_k . \qquad (6.6) $$

It is possible to calculate the contribution to the flux from particular IMFs.

This is equivalent to calculating contributions to the flux from particular frequency (size)

scales of eddies. Equation 6.7 shows the total flux,

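$$ \overline{w'k'} = \mathrm{cov}\Big(\sum_{i} w_i, \sum_{j} k_j\Big) = \sum_{i}\sum_{j}\mathrm{cov}(w_i, k_j) \qquad (6.7) $$

which is equal to the sum of all covariances from each of the IMF pairs, where the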

IMF numbers are indexed by i and j. The total flux can therefore be written as the total sum

of an “IMF Covariance Matrix” as shown in equation 6.8

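$$ \mathrm{cov}(w,k) = \sum \begin{pmatrix} \mathrm{cov}(w_1,k_1) & \mathrm{cov}(w_1,k_2) & \cdots & \mathrm{cov}(w_1,k_n) \\ \mathrm{cov}(w_2,k_1) & \mathrm{cov}(w_2,k_2) & \cdots & \mathrm{cov}(w_2,k_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(w_n,k_1) & \mathrm{cov}(w_n,k_2) & \cdots & \mathrm{cov}(w_n,k_n) \end{pmatrix} \qquad (6.8) $$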

where the total covariance cov(w,k) is equal to the sum of all the covariance

contributions from each of the IMF pairs.
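The additive property behind the IMF Covariance Matrix can be checked numerically. In the sketch below, each signal is built as the sum of three synthetic components, which stand in for IMFs (no EMD is performed, and no residue is included). The matrix of pairwise covariances is then summed and compared with the covariance of the full signals. All names and signal parameters are illustrative assumptions.

```python
import numpy as np


def component_covariance_matrix(w_components, k_components):
    """Matrix whose (i, j) entry is cov(w_i, k_j), normalized by N."""
    w_c = np.asarray(w_components, dtype=float)
    k_c = np.asarray(k_components, dtype=float)
    w_c = w_c - w_c.mean(axis=1, keepdims=True)   # remove each component mean
    k_c = k_c - k_c.mean(axis=1, keepdims=True)
    return w_c @ k_c.T / w_c.shape[1]


rng = np.random.default_rng(0)
t = np.arange(6000) / 20.0                        # 5 minutes at 20 Hz
component_periods = (1.0, 2.0, 4.0)               # seconds, dyadic spacing

# Synthetic additive components standing in for the IMFs of w and k.
w_parts = np.array([np.sin(2 * np.pi * t / p) + 0.1 * rng.standard_normal(t.size)
                    for p in component_periods])
k_parts = np.array([0.5 * np.sin(2 * np.pi * t / p + 0.4)
                    + 0.1 * rng.standard_normal(t.size)
                    for p in component_periods])

matrix = component_covariance_matrix(w_parts, k_parts)

# Covariance of the full signals, computed directly.
w_total = w_parts.sum(axis=0)
k_total = k_parts.sum(axis=0)
total_covariance = np.mean((w_total - w_total.mean()) * (k_total - k_total.mean()))

print(np.round(matrix, 4))
print(f"Sum of matrix: {matrix.sum():.6f}, direct covariance: {total_covariance:.6f}")
```

The agreement follows from the linearity of covariance in each argument, so it holds for any additive decomposition and not only for EMD.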


The sum of the IMF Covariance Matrix

gives results identical to calculating the total covariance via Fourier analysis or by simply

calculating the covariance of w and k as in equation 6.5.

Orthogonality and Sampling Durations

So far, we have introduced the EMD method and shown how the sum of its IMF

Covariance Matrix is equivalent to the total near-surface vertical turbulent flux of either

sensible or latent heat. Next, we will discuss how the IMF Covariance Matrix can be related

to sampling durations and show how errors due to undersampling of fluxes can be

calculated.

The total covariance of w and k can be separated into contributions from orthogonal

IMF (i = j) and from nonorthogonal IMF (i ≠ j) components as shown in equation (6.9).

$$ \mathrm{cov}(w,k) = \sum_{i=j}\mathrm{cov}(w_i,k_j) + 2\sum_{i<j}\mathrm{cov}(w_i,k_j) \qquad (6.9) $$

The orthogonal terms are the diagonal terms in the IMF Covariance Matrix (first

sum on right hand side of equation 6.9) while the nonorthogonal terms are the off-diagonal

terms (second sum on right hand side of equation 6.9). The factor of two is related to the

fact that it is a symmetric matrix.
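The split into orthogonal and nonorthogonal parts can be sketched for any square covariance matrix. The example below sums all off-diagonal entries directly rather than doubling one triangle, so it does not rely on the matrix being symmetric. The 3 x 3 matrix is made up for illustration and the function name is hypothetical.

```python
import numpy as np


def orthogonal_split(matrix):
    """Return (orthogonal, nonorthogonal) fractions of the total covariance.

    The orthogonal part is the sum of the diagonal entries; the
    nonorthogonal part is the sum of every off-diagonal entry.
    """
    m = np.asarray(matrix, dtype=float)
    total = m.sum()
    diagonal = np.trace(m)
    off_diagonal = total - diagonal
    return diagonal / total, off_diagonal / total


# Hypothetical covariance matrix (illustrative values only).
example = np.array([[0.020,  0.003,  0.000],
                    [0.002,  0.050, -0.004],
                    [0.001, -0.006,  0.030]])

orthogonal_fraction, nonorthogonal_fraction = orthogonal_split(example)
print(f"Orthogonal fraction:    {orthogonal_fraction:.3f}")
print(f"Nonorthogonal fraction: {nonorthogonal_fraction:.3f}")
```

Because off-diagonal entries can be negative, the orthogonal fraction can exceed one, as it does for this made-up matrix.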

For comparison, consider typical Fourier decompositions. Fourier analysis describes

an oscillatory signal as the sum of an infinite set of basis functions, which are weighted sine

and cosine functions with different constant frequencies from zero to the Nyquist frequency

[5][43][51]. The sine and cosine basis functions are, by definition, orthogonal when summed


(or integrated) over all space or time. Therefore, the Fourier analogue to the IMF Covariance

Matrix is an infinite matrix whose off-diagonal terms are all equal to zero.

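$$ \mathrm{cov}(w,k) = \sum \begin{pmatrix} \mathrm{cov}(w_{f_1},k_{f_1}) & 0 & \cdots \\ 0 & \mathrm{cov}(w_{f_2},k_{f_2}) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \qquad (6.10) $$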

The Fourier Covariance Matrix, shown in equation 6.10, says that the only

contributions to the covariance come from diagonal (orthogonal) terms between identical

basis functions with frequencies from zero to the Nyquist frequency. However, in order for

the basis functions to be completely orthogonal, they must be infinite in extent [6][51].

Therefore, when dealing with finite length data, which all measured data are, Fourier analysis

assumes that the finite data length repeats infinitely [5][43][51]. This is why many scientists

use bell-tapering or other tapering methods at the beginning and end of the data set before

processing with Fourier analysis in order to avoid any discontinuous jumps between the end

and the beginning of the data.
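The dependence of orthogonality on record length can be illustrated with two sinusoids of different periods. In the sketch below, the covariance of the two waves is computed over a short window that contains a non-integer number of cycles of each, and over a long window that contains a whole number of cycles of both. The periods, window lengths, and sampling interval are arbitrary choices made for illustration.

```python
import numpy as np


def window_covariance(period_a, period_b, duration, dt=0.01):
    """Covariance of two unit sinusoids sampled over [0, duration)."""
    n = int(round(duration / dt))
    t = np.arange(n) * dt
    a = np.sin(2.0 * np.pi * t / period_a)
    b = np.sin(2.0 * np.pi * t / period_b)
    return float(np.mean((a - a.mean()) * (b - b.mean())))


# Periods and durations in minutes (illustrative values).
short_cov = window_covariance(4.0, 7.0, duration=5.0)     # partial cycles
long_cov = window_covariance(4.0, 7.0, duration=280.0)    # 70 and 40 full cycles

print(f"5 minute window:   {short_cov:+.4f}")
print(f"280 minute window: {long_cov:+.2e}")
```

Over the short window the two waves appear correlated even though they are unrelated, which is the kind of spurious off-diagonal contribution discussed in this section.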

We have shown that the EMD method allows for nonorthogonal (off-diagonal)

contributions in the IMF Covariance Matrix. Since the IMF Covariance Matrix and the

Fourier analogue matrix give identical results when summed over all components, it is

natural to wonder what causes the differences in the matrices themselves. To explore the

different contributions from orthogonal and nonorthogonal components via the EMD

method, data sets of w and T were decomposed into their IMFs for two different sampling

durations: 5 minutes and 60 minutes.

First, the variance contributions of w and T were calculated for all the IMFs. Fig. 6.2

shows the variance contributions from IMF 5 of w and T for the 60 minute duration data


set. This is essentially looking at one particular row (in this case, row 5) of the IMF

Covariance Matrix between w,w and T,T.

The orthogonal contribution comes from the same IMF number (i = j), and is the

largest contribution in the row. None of the other IMFs contribute significantly to the

variance. Therefore, the orthogonal term dominates. This row, then, is considered

orthogonal.

The covariance was also calculated for w and T. Fig. 6.3 shows w IMF 5 and its

covariance contributions with all the T IMFs. This is equivalent to looking at row 5 in the

IMF Covariance Matrix between w,T.

Again, the major contribution comes from the orthogonal contribution, IMF 5

of w and T. However, there is energy spreading to the adjacent IMFs, mainly IMFs 3, 4, and

6. This is common when calculating covariances between different variables, such as

the momentum flux $\overline{u'w'}$, the vertical sensible heat flux $\overline{w'T'}$, and the vertical latent heat flux $\overline{w'q'}$. These

contributions from adjacent IMFs, energy spreading, are not nonorthogonal contributions

but are still considered orthogonal, pseudo-diagonal contributions. Therefore, Fig. 6.3 still

represents an orthogonal row in the Covariance Matrix.

Now for a nonorthogonal case. The 5 minute duration data sets were used and

decomposed into their respective set of IMFs. Fig. 6.4 shows the w IMF 10 and its

covariance contributions with all T IMFs.

IMF 10 shows contributions at a number of different IMFs. Notice there are even

significant negative contributions. This demonstrates a case where the row (row 10) has

significant nonorthogonal contributions.

Nonorthogonal contributions are due to the finite duration of the data, and the

undersampling of the lowest frequencies. When the short data sets of w and T were used (5


minutes) there were nonorthogonal contributions when looking at the higher (longer

periodicities) IMFs, specifically IMF 10. When the longer data sets of w and T were used (60

minutes) there were only orthogonal contributions, noting the occurrence of energy

spreading, when looking at the lower (shorter periodicities) IMFs, particularly IMF 5.

To investigate this further, we analyzed 10 data sets (consecutive days) of 20 Hz

meteorological data from 1000 to 1230 local time, from the SMEX 2002 experiment in Iowa.

These data were chosen because they represent “ideal” turbulent conditions over corn fields.

Each data set (w, q, T) was broken into 11 sampling durations, each starting at 1000

local time, including lengths of 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, and 150 minutes. Each

data set was decomposed into their IMFs using the EMD method. Then, IMF Covariance

Matrices were calculated for each sampling duration.

Fig. 6.5 shows the absolute value of the nonorthogonal contributions (fraction of

total covariance) from the sensible (H) and latent (E) heat IMF Covariance Matrices as

summed for different sampling durations. Data are shown from sites 161, as shown in the

top row, and 152, as shown in the bottom row, from SMEX 2002. Notice that the

nonorthogonal contributions decrease as the sampling duration is increased, and that the

deviations between the different days decrease.

The physical reason for these nonorthogonal contributions at the short sampling

durations is that the longer periodic IMFs, which represent the periodic eddies within the

system of particular frequencies, are not sampled sufficiently. As the measurement duration

is increased, more and more cycles are sampled which causes the nonorthogonal terms to

decrease. Absolute values were used because the nonorthogonal contributions can

sometimes be negative, as will be shown shortly. Also, ten data sets were used to show that

this is not an uncharacteristic result which occurs within one particular data set; the result is


not dependent upon the random fluctuations of the EMD algorithm. Notice, though, that

even a 150 minute duration is unable to reduce the nonorthogonal contributions to zero.

Fig. 6.5 shows that the nonorthogonal terms in the IMF Covariance Matrices should

decrease for the IMFs which are becoming more sufficiently sampled. However, if the

sampling duration is increased enough, more IMFs (rows and columns in the matrix) will be

created because using a longer sampling duration can capture larger cycles; these cycles will

not be sufficiently sampled, which results in nonzero nonorthogonal terms. In an idealized

case, where the measuring duration is infinite, the Covariance Matrix will equal the analogue

Fourier Covariance Matrix (as shown in equation 6.10), where it is an infinite matrix and

where all the nonorthogonal contributions are zero.

Next we can use the idea of orthogonality to determine when a signal is

sufficiently sampled, and how this affects the measurement and calculations of turbulent

fluxes.

How Long is Long Enough?

A long-standing question within the atmospheric community is how to find the

appropriate duration to sufficiently sample an oscillatory process, or more pointedly, “How

Long is Long Enough?” (Lenschow et al. 1994). The question about appropriate length is

one which depends not only on the sampling duration, but also the period of the oscillatory

process being sampled.

The EMD has proven to be a unique frequency decomposition tool which can

provide insights into this question. From the EMD perspective, a process has been

sufficiently sampled when its nonorthogonal contributions have decreased to zero.

Therefore, a signal has been measured for enough cycles and its components are completely


distinguished from the other components embedded within the signal. We ask then, how

many cycles of a process need to be sampled in order for it to be sufficiently distinguished

from the other processes embedded in the signal?

Fig. 6.6 shows the orthogonal (blue) and nonorthogonal (red) fractions of the total

covariance as a function of sampling duration divided by period of IMF. The x-axis can be

explained as the number of cycles of a periodic process which is captured with a particular

sampling duration. For short sampling durations, or long periods, there is great scatter.

However, as the sampling duration divided by the period is increased, the nonorthogonal

fraction asymptotes to zero. Likewise, the orthogonal fraction asymptotes to one. The

region where these contributions asymptote gives a quantitative estimate for the number of

cycles required to sufficiently sample the process. From the 10 days of data used for this test,

the number of cycles required was approximately 7. Physically, this means that it takes 7

cycles to sufficiently distinguish one periodic process from another when the two are

embedded within the same signal.

This can be used to determine the longest periodic oscillations sufficiently sampled

for a given sampling duration. If a 30 minute sampling duration is used, the longest periodic

process that will be sufficiently sampled will have a period of approximately 4 minutes,

assuming that it takes approximately 7 cycles to be sufficiently sampled. This explains why

typical Ogive functions do not display significant changes in flux estimates when increasing

sampling durations in small increments. In order to sufficiently sample a 30 minute process,

a sampling duration of at least 210 minutes is necessary.
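The arithmetic in this paragraph can be written as two small helper functions. The sketch assumes the empirical figure of approximately 7 cycles reported above; the function names are illustrative.

```python
CYCLES_REQUIRED = 7  # approximate number of cycles found from the 10 days of data


def longest_resolved_period(duration_min, cycles=CYCLES_REQUIRED):
    """Longest period (minutes) sufficiently sampled by a given duration."""
    return duration_min / cycles


def required_duration(period_min, cycles=CYCLES_REQUIRED):
    """Sampling duration (minutes) needed to sufficiently sample a period."""
    return period_min * cycles


print(f"30 min record resolves periods up to {longest_resolved_period(30):.1f} min")
print(f"A 30 min process needs a {required_duration(30):.0f} min record")
```

For a 30 minute record this gives a longest sufficiently sampled period of a little over 4 minutes, and for a 30 minute process a required duration of 210 minutes, matching the figures in the text.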

The EMD method has created a tool which can be used to determine whether a

signal is sufficiently sampled. We now show that the errors in the turbulent fluxes are

partially due to the undersampling of the lowest frequency components. Fig. 6.8 shows the

nonorthogonal, orthogonal, and total contributions to the covariance of wT for site 161.

Each subplot is a different sampling duration in the following order: 5, 10, 15, 20, 30, 45, 60,

80, 100, 120, 150 minutes. Notice that the nonorthogonal contributions are typically found


in the higher IMFs, which represent lower frequency oscillations. As the sampling duration

is increased, as you can see by looking through the subplots, the high-frequency components

become more sufficiently sampled; however, the low-frequency components still have large

fluctuations. These fluctuations are random errors associated with undersampling the lowest

frequencies.

We have used the EMD method to develop a tool that identifies the largest periodic

structure which can be sampled sufficiently with a particular sampling duration. The EMD

method can therefore determine which periodic contributions to the total covariance

contain random errors due to undersampling, and which do not. However, turbulent

fluxes are generally reported to be systematically underestimated [16][18][20]. While the

EMD method shows that random errors can occur due to undersampling, it does not explain

why the turbulent fluxes are consistently underestimated. Instead, any undersampled IMF

may contribute either positively or negatively to the flux, causing errors of either sign.

Conclusions

This work has introduced a relatively new spectral analysis tool called the Hilbert-Huang

Transform and has utilized its empirical mode decomposition algorithm to decompose

meteorological data into their intrinsic periodic oscillations. By using EMD as a dyadic filter

for meteorological data, we have calculated the contributions to near-surface fluxes from

different frequency components and constructed the idea of an IMF Covariance Matrix for

calculating near-surface turbulent fluxes.
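A minimal sketch of the IMF Covariance Matrix idea follows. It assumes that entry (i, j) is the covariance between the i-th IMF of one variable and the j-th IMF of the other, that same-index pairs are the orthogonal contributions, and that all other pairs are the nonorthogonal contributions. The function names and the toy components are hypothetical and are not SMEX 2002 data. The one property that holds regardless of these assumptions is that the entries sum to the covariance of the summed signals, because covariance is bilinear.

```python
def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def imf_covariance_matrix(w_imfs, t_imfs):
    """Entry [i][j] is cov(w IMF i, T IMF j)."""
    return [[covariance(wi, tj) for tj in t_imfs] for wi in w_imfs]


def split_contributions(matrix):
    """Return (orthogonal, nonorthogonal, total) covariance sums.

    Assumption: 'orthogonal' means the same-index (diagonal) pairs.
    """
    total = sum(sum(row) for row in matrix)
    size = min(len(matrix), len(matrix[0]))
    orthogonal = sum(matrix[i][i] for i in range(size))
    return orthogonal, total - orthogonal, total


# Toy components standing in for IMFs (hypothetical values).
w_imfs = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5]]
t_imfs = [[2.0, -2.0, 2.0, -2.0], [1.0, 0.0, -1.0, -2.0]]
orth, nonorth, total = split_contributions(imf_covariance_matrix(w_imfs, t_imfs))
print(orth, nonorth, total)
```

The components passed in should sum to the full signal (that is, include the residue) if the total is meant to equal the directly calculated covariance.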

This investigation also determined an approximate estimate of the number of

cycles needed to sufficiently distinguish a process from other processes embedded within a

signal. By recognizing that nonorthogonal contributions are evidence of undersampled

processes, the EMD method can be used to determine which of the frequency components

contributing to the calculated turbulent flux have been sampled sufficiently.

While the method determines which periodic processes are undersampled, it does

not show that the errors due to undersampling are always negative. Rather, they occur

randomly, adding both positive and negative contributions to the total flux.

Further research should be performed to compare the nonorthogonal components

with direct calculations for many different data sets, including studies with small or large

energy budget residuals. Also, the nonorthogonal contributions could be compared to other

meteorological values such as u* or stability parameters in order to determine how and if

they are related.

Figure 6.1 Dyadic nature of EMD when applied to turbulence

Figure 6.2 Variance contributions from IMF pairs for 60 minute data sets of vertical wind velocity and temperature

Figure 6.3 Covariance contributions from IMF pairs of vertical wind velocity and temperature

Figure 6.4 Covariance contributions from w IMF 10 and all T IMFs

Figure 6.5 Absolute value of the nonorthogonal fraction of the total covariance from the IMF Covariance Matrices as calculated for 10 days. The top two plots show SMEX 2002 data from Site 161. The bottom two plots show SMEX 2002 data from Site 152.

Figure 6.6 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wT plotted against the approximate number of cycles sampled, defined as the sampling duration divided by the period of the process (in this case, an IMF).

Figure 6.7 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wq plotted against the approximate number of cycles sampled, defined as the sampling duration divided by the period of the process (in this case, an IMF).

Figure 6.8 Orthogonal (blue), nonorthogonal (red), and total (black) contributions from each IMF to the sensible heat flux, as calculated for Site 161 from SMEX 2002. Each subplot is a different sampling duration, going from left to right and top to bottom in the following order: 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, 150 minutes.

CHAPTER VII AN IMPROVED EEMD ALGORITHM

Motivation

There are a number of Empirical Mode Decomposition (EMD) algorithms available

today. These include commercial software called the Hilbert-Huang Transform Data

Processing System (HHT-DPS), which was developed by Norden Huang at NASA and is

available through NASA’s website. There are also publicly available MATLAB codes by Patrick

Flandrin [22] and R code by Kim and Oh [32], which extract IMFs from a given input data series.

However, to our knowledge no investigation has used these algorithms to process

discontinuous data. For example, the EEMD algorithm introduced by Wu and Huang [58] will not run when

there are gaps, or NaNs, in the input data.

Instruments often fail in the field, leaving gaps in the data. These

gaps prevent the EMD algorithm from properly sifting through the data. Typically, scientists

fill data gaps with interpolated values, such as the mean of the surrounding data points.

This may be useful for small data gaps; however, it assumes that the data remain constant

during the gap, and so it is insufficient for larger gaps in fluctuating data. In the spirit

of a local and adaptive decomposition tool, it is important to manipulate the data as little as

possible, and merely describe the data that does exist. This investigation suggests an

improvement to the Ensemble Empirical Mode Decomposition (EEMD) algorithm which

allows for gaps in the input data. The implications of applying EEMD to data with gaps of

varying size will be discussed. Also, we will suggest an error reduction technique

which extracts IMFs that are more locally accurate.

Ensemble Empirical Mode Decomposition

This investigation utilizes the Ensemble Empirical Mode Decomposition (EEMD)

algorithm which was pioneered by [58]. The algorithm utilizes the original EMD sifting

method which is described fully in [26][27]. An overview of the EEMD algorithm follows:

1. Add finite amplitude noise to the original input signal.

2. Decompose the signal into a finite set of intrinsic mode functions (IMFs) using the

original EMD sifting method.

3. Repeat steps 1 and 2 with different noise realizations of the same standard deviation.

4. Average the ensemble of extracted IMFs to average out the noise and obtain their

mean IMFs.

A complete description of EEMD can be found in [58]. The standard deviation of

random noise which is added to the original data before decomposition can be specified by

the user. For this investigation, we used a standard deviation of 0.2.
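The four steps above can be sketched in plain Python. This is a simplified illustration, not the MATLAB code used in this investigation: the envelopes are piecewise linear rather than cubic splines, the number of sifting passes is fixed, and the noise standard deviation of 0.2 is assumed to be relative to the standard deviation of the signal. All function and parameter names are hypothetical.

```python
import math
import random


def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    inner = range(1, len(x) - 1)
    maxima = [i for i in inner if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in inner if x[i - 1] > x[i] < x[i + 1]]
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knots, held flat outside.

    Simplification: the original sifting method fits cubic splines.
    """
    env = [x[knots[0]]] * len(x)
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            env[i] = x[a] + (x[b] - x[a]) * (i - a) / (b - a)
    for i in range(knots[-1], len(x)):
        env[i] = x[knots[-1]]
    return env


def emd(x, max_imfs=4, n_sift=10):
    """Sift x into at most max_imfs IMFs plus a residue."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left to build envelopes
        h = list(residue)
        for _ in range(n_sift):
            maxima, minima = local_extrema(h)
            if len(maxima) < 2 or len(minima) < 2:
                break
            upper, lower = envelope(h, maxima), envelope(h, minima)
            h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
        imfs.append(h)
        residue = [r - v for r, v in zip(residue, h)]
    return imfs, residue


def eemd(x, noise_std=0.2, n_ensemble=50, max_imfs=4, seed=0):
    """Average EMD results over noise-perturbed copies of x."""
    rng = random.Random(seed)
    n = len(x)
    mean = sum(x) / n
    signal_std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    sums = [[0.0] * n for _ in range(max_imfs + 1)]  # IMF slots + residue
    for _ in range(n_ensemble):  # step 3: repeat with new noise
        noisy = [v + rng.gauss(0.0, noise_std * signal_std) for v in x]  # step 1
        imfs, residue = emd(noisy, max_imfs)  # step 2
        # Pad missing IMF slots with zeros so every trial has the same shape.
        modes = imfs + [[0.0] * n] * (max_imfs - len(imfs)) + [residue]
        for total, mode in zip(sums, modes):
            for i, v in enumerate(mode):
                total[i] += v
    return [[v / n_ensemble for v in total] for total in sums]  # step 4


# Two-tone test signal (hypothetical example data).
t = [i / 200 for i in range(400)]
signal = [math.sin(2 * math.pi * 5 * s) + 0.5 * math.sin(2 * math.pi * 40 * s)
          for s in t]
modes = eemd(signal)
print(len(modes))  # max_imfs IMF slots plus one residue row
```

Because each trial's IMFs and residue sum exactly to the noisy input, the averaged modes sum to the original signal plus the ensemble-mean noise, which shrinks as the ensemble grows.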

To accommodate discontinuous data, the MATLAB version of the EEMD algorithm by

Zhaohua Wu [58] was modified. The sifting process, which fits spline functions to the

local maxima and minima of the signal, must be performed separately on each

continuous data segment. When the algorithm encounters a data gap, the splines must be

halted, and a new set of piecewise spline functions begins on the next segment of

continuous data.
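The bookkeeping this requires can be sketched as follows: locate each continuous segment so that envelope fitting can be restarted on every one of them. This is an illustrative Python sketch with hypothetical names, not the modified MATLAB routine, and it assumes gaps are marked with NaN.

```python
import math


def continuous_segments(x):
    """Return (start, stop) index pairs of gap-free runs; stop is exclusive."""
    segments, start = [], None
    for i, v in enumerate(x):
        if math.isnan(v):
            if start is not None:
                segments.append((start, i))  # a gap closes the current run
                start = None
        elif start is None:
            start = i  # first valid point after a gap (or at the beginning)
    if start is not None:
        segments.append((start, len(x)))
    return segments


nan = float("nan")
data = [0.1, 0.4, nan, nan, 0.3, 0.2, 0.5, nan, 0.6]
print(continuous_segments(data))  # [(0, 2), (4, 7), (8, 9)]
```

Each returned slice would then receive its own upper and lower envelopes, so no spline is ever fit across missing data.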

Fig. 7.1 shows the original data and its decomposed IMFs, as well as the IMFs

decomposed from the same data with artificial data gaps created.

The improvement is simple but powerful. The majority of the discontinuous IMFs

seem to replicate the original IMFs away from the gap section. For instance, look at

approximately Arbitrary Time 90 for IMF2. The local oscillations are still captured even with

the gap present. The two decompositions are not identical, however. This investigation will

now assess the errors associated with decomposing discontinuous data in a quantifiable

manner. Also, suggestions will be made for how to reduce such errors.

Errors Due to Data Gaps

In order to assess the abilities of EEMD applied to discontinuous data, we

present an error analysis comparing an original IMF with a discontinuous IMF. Equation 1

shows the root mean square error equation.

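$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{IMF}^{\mathrm{orig}}_{i} - \mathrm{IMF}^{\mathrm{gap}}_{i}\right)^{2}} \qquad (1)$$

where $\mathrm{IMF}^{\mathrm{orig}}_{i}$ and $\mathrm{IMF}^{\mathrm{gap}}_{i}$ are the values of the original and discontinuous IMFs at point $i$, and $N$ is the number of points compared.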

The total error for a single decomposition, then, is the sum of the rms errors from all

IMFs.
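The rms comparison and the summed total error can be sketched as below. The sketch assumes that points inside a gap (NaN in either IMF) are excluded from the comparison; the source does not state how gap points are treated, and the function names and sample values are hypothetical.

```python
import math


def rms_error(original, discontinuous):
    """RMS difference over points where both IMFs have data."""
    diffs = [(a - b) ** 2 for a, b in zip(original, discontinuous)
             if not (math.isnan(a) or math.isnan(b))]
    return math.sqrt(sum(diffs) / len(diffs))


def total_error(original_imfs, discontinuous_imfs):
    """Sum of the per-IMF rms errors for one decomposition."""
    return sum(rms_error(a, b)
               for a, b in zip(original_imfs, discontinuous_imfs))


nan = float("nan")
orig = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
disc = [[1.0, nan, 3.0, 2.0], [0.0, nan, 0.0, 0.0]]
print(total_error(orig, disc))  # sqrt(4/3) from the first IMF, 0 from the second
```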

It is instructive to compare the rms error with the size of the gaps in the data. Fig.

7.2 shows the error calculated for a number of data gap sizes. For each case, a data gap was

artificially created in the exact middle of the data, with a length equal to some percentage of

the original data length. Then, the gapped data were decomposed and the IMFs compared with the IMFs

from the original data.
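Creating such a test gap is straightforward. The sketch below blanks a centered stretch of the series with NaN; the function name and the 400-point series are hypothetical, chosen only so that a 20% gap contains 80 points.

```python
def insert_center_gap(x, gap_fraction):
    """Return a copy of x with a centered NaN gap of the given fractional length."""
    n = len(x)
    gap_len = round(gap_fraction * n)
    start = (n - gap_len) // 2
    gapped = list(x)
    gapped[start:start + gap_len] = [float("nan")] * gap_len
    return gapped


series = [float(i) for i in range(400)]
gapped = insert_center_gap(series, 0.20)
print(sum(1 for v in gapped if v != v))  # 80 missing points (NaN != NaN)
```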

The errors, as calculated by equation (1), are shown in Fig. 7.2 for the first 6 IMFs

extracted using the new discontinuous EEMD algorithm. Notice that the errors increase

with increased gap size, as expected. Also, the highest-frequency IMFs (the low IMF

numbers) differ least from the original decomposition. As the IMF number increases,

which corresponds to lower-frequency IMFs, the errors increase more quickly with increasing

data gap size. This shows that the algorithm works better for high-frequency IMFs.

Another question is where the errors are largest. Fig. 7.4 shows the errors

plotted against time for all IMFs with a gap size of 80 points, which is approximately 20% of

the total data.

Notice that the high-frequency IMFs have the largest errors near the endpoints of

the gaps. The low-frequency IMFs also have errors, but they are not specifically located near

the gap endpoints.

A common problem with the original EMD sifting algorithm is the

so-called end-effect error [26][44][58]. These errors, which have been well studied,

occur at the endpoints of the data because the first or

second derivatives required for spline fitting are unavailable there. The endpoints, then,

have large fluctuations which do not represent the real signal. For our particular algorithm

dealing with discontinuous data, these end-effect errors occur not only at the beginning and

end of the input data, but at the start and end of every data gap. While the differences in the

low-frequency IMFs are primarily due to the size of the data gap, the high-frequency IMFs

have differences primarily due to gap end-effect errors. Therefore, in order to reduce the

errors from the high-frequency IMFs, we can use traditional end-effect mitigation tools as

described in the subsequent section.

Error Reduction Methods

There are a number of investigations which have dealt with end-effect errors

[26][44][58]. Therefore, it may be possible to utilize these end-effect mitigation tools in order

to decrease the errors in the high-frequency IMFs due to data gaps.

Qingjie et al. [44] suggest using a mirror extension technique to lower the errors

due to end effects. This would restrict the spline from varying excessively at the ends of

the gaps.

The mirroring technique used in this investigation is now reviewed. When a data gap

is encountered, it is split into two sections. The first section is filled by the mirror image of the

data directly before the gap. The second section is filled by the mirror image of the data

directly after the gap. The amount of mirroring needed is dependent on the gap size. The

result is a continuous data set. The traditional EEMD algorithm is then used to decompose

the data into its IMFs. Once decomposed, the data gaps are recreated by removing the

mirrored data.
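The mirroring step can be sketched as follows. The sketch assumes gaps are marked with NaN, that each gap is interior, and that enough valid data exists on both sides to fill its half of the gap; it reflects about the gap edge so that the point nearest the gap is repeated, which is one of several possible mirroring conventions. The function names are hypothetical, and the decomposition between filling and restoring is left to whichever EEMD routine is in use.

```python
import math


def find_gaps(x):
    """Return (start, stop) index pairs of NaN runs; stop is exclusive."""
    gaps, start = [], None
    for i, v in enumerate(x):
        if math.isnan(v) and start is None:
            start = i
        elif not math.isnan(v) and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(x)))
    return gaps


def mirror_fill(x):
    """Fill each gap with mirrored neighbouring data; return (filled, mask)."""
    filled = list(x)
    mask = [math.isnan(v) for v in x]  # True where data were missing
    for start, stop in find_gaps(x):
        first = (stop - start + 1) // 2   # points mirrored from before the gap
        second = (stop - start) - first   # points mirrored from after the gap
        if start - first < 0 or stop + second > len(x):
            raise ValueError("gap too close to the ends of the data to mirror")
        for k in range(first):
            filled[start + k] = x[start - 1 - k]
        for k in range(second):
            filled[stop - 1 - k] = x[stop + k]
    if any(math.isnan(v) for v in filled):
        raise ValueError("not enough valid data next to a gap to mirror")
    return filled, mask


def restore_gaps(imf, mask):
    """Re-create the gaps in a decomposed IMF by blanking the mirrored points."""
    return [float("nan") if m else v for v, m in zip(imf, mask)]


nan = float("nan")
data = [1.0, 2.0, 3.0, 4.0, nan, nan, nan, nan, 5.0, 6.0, 7.0, 8.0]
filled, mask = mirror_fill(data)
print(filled)  # [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 6.0, 5.0, 5.0, 6.0, 7.0, 8.0]
```

After decomposing `filled`, applying `restore_gaps` to each IMF with the saved `mask` removes the mirrored points again.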

To test the effectiveness of this technique with discontinuous data, two gaps were

created in the continuous data. Three different algorithm iterations were performed. The

first one was a traditional EEMD algorithm decomposing the original data without gaps.

The second was the discontinuous EEMD algorithm as applied to the data set without

mirroring. The third was the discontinuous EEMD algorithm as applied to the data which

had undergone the mirroring technique.

Fig. 7.5 compares the three decompositions. This verifies, at least

visually, the effectiveness of the mirroring technique in reducing end-effect errors near the

endpoints of the gaps. Also, the low-frequency IMFs more closely match the original

decomposition.

Fig. 7.6 shows the relative error from each IMF as compared between the

discontinuous EEMD algorithms with and without mirroring applied.

The first component shown is the actual data rather than an extracted IMF and can be ignored. For all the other IMFs, the

mirroring technique greatly reduces the error in the decompositions. Therefore, the

processing of discontinuous data is greatly improved by using the mirroring technique.

Discussion

Overall, this investigation presents a new version of the Ensemble Empirical Mode

Decomposition (EEMD) algorithm which is now applicable to discontinuous data. For short

gap durations, the errors are small and the decomposition is locally representative. A

mirroring technique was utilized to improve the discontinuous decomposition. This makes

for a more local and adaptive decomposition of data which may contain one or more gaps.

Further research should be pursued which utilizes neural networks or prediction models to

fill gaps and improve on the reduction of errors.

Figure 7.1 Original data decomposed into its IMFs as well as the IMFs decomposed from discontinuous data

Figure 7.2 Error as defined by the summed differences between the discontinuous and continuous extracted IMFs, plotted against data gap size.

Figure 7.3 Errors plotted as a function of frequency of IMFs. The errors are primarily in the low-frequency IMFs

Figure 7.4 Errors plotted as a function of time. For the high-frequency IMFs, the errors occur largely near the gap endpoints.

Figure 7.5 Comparison of three different decompositions of data. The original signals (black) contain no gaps. The red signals are the IMFs decomposed by the discontinuous EEMD algorithm, and the blue signals are the IMFs decomposed by the discontinuous EEMD algorithm after the mirroring technique was applied.

Figure 7.6 Comparison of relative error associated with including or not including the mirror technique when using the discontinuous EEMD decomposition.

CHAPTER VIII SUMMARY

This dissertation focused on the theory, application, and development of a relatively

new spectral analysis tool called the Hilbert-Huang Transform. While it is an empirical tool, its

power lies in its versatility: it may be applied to virtually any oscillating data signal,

whether nonlinear or nonstationary.

While some of the data sets analyzed in this thesis are well-known and have been

studied extensively using other data analysis tools, the contribution of this thesis is the

development of the tools and techniques related to the Hilbert-Huang Transform and how it

can be applied to different types of data.

First, using sunspot data, the periodic components were extracted using HHT and

compared to well-known processes. The results were shown to give more local descriptions

of the frequency components than traditional spectral analysis tools.

Next, the periodic components were shown to be useful when compared to one another in

the time domain. This provided new methods for analyzing frequency components of

different processes, embedded within a signal, in the time domain. The HHT was used to

calculate the similarity between the relative amplitudes and phases of two nonstationary

processes. This created new techniques which can be used on other nonstationary data.

The pivotal characteristic of the EMD method is that it acts as a dyadic filter bank in

the frequency domain. That is, it is unable to separate fluctuations whose periodicities

differ by less than a factor of 2. This limits the cycles EMD is able to sift out from

fluctuating signals. The dyadic nature has been demonstrated on turbulent wind velocity,

temperature, and humidity data.

HHT was also used to approach the problem of energy budget closure near the

earth’s surface from a completely new viewpoint. The orthogonality of the extracted

components using the EMD method was shown to be related to whether or not the

internal oscillations were sufficiently sampled. This provides researchers with a tool to justify

that the covariance contributions from various frequency components have been sufficiently

sampled.

Finally, a modification to the EMD algorithm has been presented which allows for

data gaps within the input signal. As most real data does contain gaps during some duration,

this expands the potential applications of the EMD method greatly. Problems with the

algorithm have been discussed, including end-effect errors due to under-defined

interpolation functions. A mirroring technique has been shown to reduce end-effect

errors. Therefore, this new algorithm can accurately and adaptively extract

the periodic components embedded within discontinuous data. This improvement is

essential to allowing HHT to work with discontinuous data, thereby making the tool much

more adaptive to all types of data.

Overall, this dissertation has provided an in-depth analysis of a new tool, and

has strengthened the tool itself. Future studies will further broaden its applicability to new

problems, and will attempt to strengthen the theoretical foundation on which it stands. Real

data is inherently messy; it is noisy, nonstationary, and intermittent. This thesis has taken a

step towards describing this messy reality more adaptively and efficiently.

REFERENCES

[1] Attoh-Okine, N. O. 2005: Perspectives on the Theory and Practices of the Hilbert- Huang Transform. In The Hilbert-Huang Transform in Engineering, edited by N.E. Huang and N.O. Attoh-Okine, 281-305. Taylor & Francis.

[2] Aubinet et al. 2000: Estimates of the annual net carbon and water exchange of forest: The EUROFLUX methodology. Adv Ecol Res. 30, 113–175.

[3] Baldocchi et al. 2001: FLUXNET: A new tool to study the temporal and spatial variability of ecosystem-scale carbon dioxide, water vapor, and energy flux densities. Bull Amer Meteorol Soc. 82, 2415-2434.

[4] Balocchi, R. 2004: Deriving the respiratory sinus arrhythmia from the heartbeat time series using empirical mode decomposition, Chaos, solitons, and fractals, 20, 171-172.

[5] Brigham, E. 1988: The Fast Fourier Transform and its Applications, Prentice Hall, Englewood Cliffs, NJ.

[6] Byron F.W., and Fuller R.W. 1992: Mathematics of Classical and Quantum Physics, Dover Publications, Inc., New York.

[7] Chen, Q., N.E. Huang, S. Riemenschneider, and Y. Xu. 2006: A B-spline approach for empirical mode decompositions. Advances in computational mathematics, 24, 1-4, 171.

[8] Chui, C.K. 1992: An Introduction to Wavelets. Academic Press, Boston, MA.

[9] Cohen, L. 1995: Time-Frequency Analysis, Prentice-Hall, Englewood Cliffs, NJ.

[10] Coughlin, K.T. & K.K. Tung. 2004: 11-year solar cycle in the stratosphere extracted by the empirical mode decomposition method. Advances in space research. 34, 2, 323.

[11] Duffy, D.G. 2004: The application of Hilbert-Huang transforms to meteorological datasets. Journal of Atmospheric and Oceanic Technology. 21, 4, 599.

[12] Echeverría, J.C., J.A. Crowe, M.S. Woolfson, and B.R. Hayes-Gill. 2001: Application of empirical mode decomposition to heart rate variability analysis. Medical biological engineering computing. 39, 4, 471.

[13] Feigenwinter C., Bernhofer C., and R. Vogt. 2004: The influence of advection on the short term CO2-budget in and above a forest canopy. Bound.-Layer Meteor. 113, 201–224.

[14] Finnigan JJ, Clement R, Malhi Y, Leuning R, and H.A. Cleugh. 2003: A re-evaluation of long-term flux measurement techniques. Part I: Averaging and coordinate rotation. Bound.-Layer Meteor. 107, 1–48.

[15] Flandrin, P., Rilling, G., and Goncalves, P. 2004: Empirical mode decomposition as a filter bank. IEEE Signal Processing Letters. 11, 2, 112.

[16] Foken T., S.P. Oncley. 1995: Results of the workshop “Instrumental and methodical problems of land surface flux measurements”. Bull Am Meteorol Soc. 76, 1191–1193.

[17] Foken T., Wichura B., Klemm O., Gerchau J., Winterhalter M., and T. Weidinger. 2001: Micrometeorogical measurements during the total solar eclipse of August 11, 1999. Meteorologische Zeitschrift. 10,171-178.

[18] Foken T., Gockede M., Mauder M., Mahrt L., Amiro B.D., and J.W. Munger. 2004: Post-field data quality control. In Handbook of micrometeorology: A guide for surface flux measurement and analysis. Lee X, Massman WJ, Law B. Kluwer, Dordrecht. 181–208.

[19] Foken T., Wimmer F., Mauder M., Thomas C., and C. Liebethal, 2006: Some aspects of the energy balance closure problem. Atmos Chem Phys Discuss. 6, 3381.

[20] Foken T., Aubinet M., Finnigan J.J., Leclerc M.Y., Mauder M., and U. Kyaw Tha Paw. 2011: Results Of A Panel Discussion About The Energy Balance Closure Correction For Trace Gases. Bull Amer Meteor Soc. 92, ES13–ES18.

[21] Gabor, D. 1946: Theory of Communication. Proc. IEEE Part III, 93, 26, 429-457.

[22] Gabriel Rilling. Empirical Mode Decomposition. http://perso.ens-lyon.fr/patrick.flandrin/emd.html (accessed Nov 2, 2011).

[23] Griffiths, D.J. 2005, Introduction to Quantum Mechanics, 2nd Edition. Pearson Ed. Intl. Prentice Hall, Upper Saddle River, NJ.

[24] Haar, A. 1910: Zur theorie der orthogonalen funktionensysteme. Mathematische Annalen. 69, 3, 331.

[25] Holder H.E., A.M. Bolch, and R. Avissar. 2009: Using the Empirical Mode Decomposition (EMD) method to process turbulence data collected on board aircraft. Submitted to J. Atmos. Ocean. Tech. http://hdl.handle.net/10161/1074

[26] Huang, N.E., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., Yen, N., Tung, C. and Liu, H. 1998: The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc.. 454, 1971, 903.

[27] Huang, N.E., Z. Shen, S.R. Long. 1999: A new view of nonlinear water waves: The Hilbert spectrum. Annual Review of Fluid Mechanics. 31, 1, 417.

[28] Huang Y., F.G. Schmitt, Z. Lu, Y. Liu. 2007: Empirical mode decomposition analysis of experimental homogeneous turbulence time series. Colloque GRETSI, 11-14 September, Troyes, http://documents.irevues.inist.fr/handle/2042/17539

[29] Huang, N.E. and Z. Wu. 2008: A review on Hilbert-Huang transform: Method and its applications to geophysical studies. Reviews of geophysics. 46, 2.

[30] Islam, M.K., M.S. Rahman, S. Akimasa, P. Banik. 2006: Empirical mode decomposition analysis of climate changes with special reference to rainfall data. Discrete Dynamics in Nature and Society. 2006.

[31] Katul et al. 2001: Multiscale analysis of vegetation surface fluxes: from seconds to years. Adv in Water Resources. 24, 1119-1132.

[32] Kim D. and H. Oh. 2009: EMD: A Package for Empirical Mode Decomposition and Hilbert Spectrum. The R Journal. 1, May 2009.

[33] Kolláth, Z. and K. Oláh. 2009: Multiple and changing cycles of active stars – I. Methods of analysis and application to the solar cycles. Astronomy & Astrophysics. 501 2, 695.

[34] LASP Interactive Solar Irradiance Datacenter. Historical Total Solar Irradiance. http://lasp.colorado.edu/lisird/tsi/historical_tsi.html (accessed Nov 2, 2011).

[35] Lenschow D.H., Mann J., and L. Kristensen. 1994: How long is long enough when measuring fluxes and other turbulence statistics? J of Atmos. and Oceanic Technology. 11, 661-673.

[36] Liu, Z., N. Zhang, R. Wang, and J. Zhu. 2007: Doppler wind lidar data acquisition system and data analysis by empirical mode decomposition method. Opt. Eng., 46, 26001.

[37] Malinowski, S.P., Haman, K.E., Kopec, M.K., Kumala, W., Gerber, H.E., and Krueger, S.K. 2008: Small-scale variability of temperature and LWC at Stratocumulus top. 13th AMS Conference on Cloud Physics, 2.21.

[38] NASA GISS. GLOBAL Land-Ocean Temperature Index. http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts+dSST.txt (accessed Nov 2, 2011).

[39] NASA Marshall Space Flight Center. Solar Physics. http://solarscience.msfc.nasa.gov/greenwch/spot_num.txt (accessed Nov 2, 2011).

[40] NOAA Earth System Research Laboratory. Trends in Atmospheric Carbon Dioxide. www.esrl.noaa.gov/gmd/ccgg/trends/. (accessed Nov 2, 2011).

[41] Peel MC, G.G.S. Pegram, and T.A. McMahon. 2007: Empirical Mode Decomposition: Improvement and application. In International Congress on Modeling and Simulation, edited by Oxley, L. and D. Kulasiri. Modelling and Simulation Society of Australia and New Zealand, December 2007, 2996-3002.

[42] Pegram, G. G. S., Peel, M. C. and T.A. McMahon. 2008: Empirical mode decomposition using rational splines: an application to rainfall time series. Proc. R. Soc. A. 464, 1483–1501.

[43] Qian, S. 2002: Introduction to Time-Frequency and Wavelet Transforms, Prentice-Hall Inc. Upper Saddle River, NJ.

[44] Qingjie, Z., Huayong, Z., and S. Lincheng. 2010: A new method for mitigation of end effect in empirical mode decomposition. Informatics in Control, Automation, and Robotics (CAR), 2010 2nd International Asia Conf. March 2010.

[45] Sakai R., Fitzjarrald D., and K.E. Moore. 2001: Importance of low-frequency contributions to eddy fluxes observed over rough surfaces. J Appl Meteor. 40, 2178–2192.

[46] Sarabandi, K. and I. Koh. 2002: Effect of canopy-air interface roughness on HF-VHF wave propagation in forest. IEEE Transactions on Antennas and Propagation. 50, 2, 111.

[47] Sneddon, I. 1951: Fourier Transforms. McGraw-Hill Book Company, Inc. New York, NY.

[48] Sonett, C. P. 1983: J. Geophys. Res., vol. 88, no. A4, p. 3225-3228.

[49] Stephens, G. L. 1986: Radiative transfer in spatially heterogeneous, two-dimensional anisotropically scattering media. J. Quant. Spectrosc. Radiat. Transfer, 36, 51-67.

[50] Stephens, G.L., and C.M.R. Platt. 1987: Aircraft observations of the radiative and microphysical properties of stratocumulus and cumulus cloud fields. J. Clim. Appl. Meteorol., 26, 1243-1269.

[51] Stull, R. 1988, An Introduction to Boundary Layer Meteorology, Kluwer Academic Publishers. Boston, MA.

[52] Sun X., Zhu Z., Wen X., Yuan G., and G. Yu. 2006: The impact of averaging period on eddy fluxes observed at ChinaFLUX sites. Agricultural and Forest Meteorology. 137, 188-193.

[53] Usoskin, I.G. and K. Mursula. 2003: Long-term solar cycle evolution: Review of recent developments. Solar Phys. 218, 319-343.

[54] Vickers D., and L. Mahrt. 1997: Quality control and flux sampling problems for tower and aircraft data. J of Atmos and Oceanic Technology. 14, 512-526.

[55] Wilson et al. 2002: Energy balance closure at FLUXNET sites. Agric Forest Meteorol. 113, 223–234.

[56] Wu, S., Liu, Z., and B. Liu. 2006: Enhancement of lidar backscatters signal-to-noise ratio using empirical mode decomposition. Optics Comm. 267, 1, 137.

[57] Wu, Z., and N.E. Huang. 2004: A study of the characteristics of white noise using the empirical mode decomposition. Proc. R. Soc. Lond. A, 460, 2046, 1597-1611.

[58] Wu, Z., and N.E. Huang. 2009: Ensemble Empirical Mode Decomposition: a noise-assisted data analysis method. Adv. in Adaptive Data Analysis. 1, 1, 1-41.

[59] Zhao, Jin-ping and D. Huang. 2001: Mirror extending and circular spline function for empirical mode decomposition method. Journal of Zhejiang University, 2, 3, 247-252.

[60] Zhen-Shan, L. 2007: Multi-scale analysis of global temperature changes and trend of a drop in temperature in the next 20 years. Meteorology and Atmospheric Physics. 95, 1-2, 115.
