nonlinear dynamics of the great salt lake: dimension estimation



WATER RESOURCES RESEARCH, VOL. 32, NO. 1, PAGES 149-159, JANUARY 1996

Nonlinear dynamics of the Great Salt Lake: Dimension estimation

Taiye B. Sangoyomi Hydrosphere Resource Consultants, Boulder, Colorado

Upmanu Lall Utah Water Research Laboratory, College of Civil and Environmental Engineering Utah State University, Logan

Henry D. I. Abarbanel Department of Physics and Marine Physical Laboratory, Scripps Institution of Oceanography University of California at San Diego, La Jolla

Abstract. We study the possibility that variations in the volume of the Great Salt Lake (GSL), a large, closed basin lake, may be described as a low-dimensional nonlinear dynamical system. There is growing evidence for structure in the recurrence patterns of climatic fluctuations that drive western United States hydrology. Moreover, the time behavior of such lakes is generally more regular than that of the climatic forcing. This suggests the possibility that an analysis of the 144-year, biweekly time series of the GSL volume may shed some light on the underlying dynamics of lake variations. Three methods (correlation dimension, nearest neighbor dimension, and false neighbor dimension) of estimating attractor dimension are applied and compared. The analysis suggests that the GSL dynamics may be described by a dimension of about four. Implications of such analyses relative to low-frequency variations and colored noise and limitations of such analyses are discussed.

1. Introduction

The Great Salt Lake (GSL) is the fourth largest, perennial, closed basin lake in the world. The lake is shallow (average depth 3-5 m), with a large surface area (greater than 6400 km²), and its salinity ranges from 5% to 28%. Covering portions of northern Utah, southern Idaho, and eastern Wyoming, the GSL drainage has an area of 59,570 km², or 23,000 square miles. The 1847-1992 GSL volume time series [Sangoyomi, 1993] reveals (Figure 1) significant interannual and interdecadal fluctuations (Figure 2). Such persistent wet or dry conditions are related to regional climatic variability and are important for understanding or forecasting drought and the long-term availability of water. Evidence of structure in these fluctuations comes from the analyses of Lall and Mann [1995], who use spectral analysis of the GSL and local precipitation and temperature time series to identify a few interannual and interdecadal frequency bands that have significant power. They speculate that the observed spectral signatures may correspond to unstable or anharmonic oscillations in these frequency bands, rather than to strictly periodic behavior. Mann et al. [1995] establish connections between decadal and longer-term hemispheric variability in sea level pressure, surface temperature fields, and GSL volume fluctuations. Analysis of such structured fluctuations of the GSL is thus useful for an improved understanding of long-term, large-scale climatic fluctuations and their interaction with surface hydrology.

A large number of factors determine the climatic state and hence the GSL volume. The integrating effect of the GSL may lead to its fluctuations being determined largely by a few, unknown dynamical variables that may be complex, nonlinear functions of the physical variables. In this paper we test the hypothesis that the GSL dynamics can be described by a small set of variables, that is, it is low-dimensional. Such evidence may be useful for developing low-order models that explain at least a part of GSL variability and perhaps can be ultimately related to low-frequency climatic variability.

Copyright 1996 by the American Geophysical Union.

Paper number 95WR02572. 0043-1397/96/95WR-02572$05.00

The recent interest in nonlinear dynamics, fractals, and chaotic systems has provided new insights into the working of many physical systems as well as a rapidly growing set of tools for nonlinear time series analysis. An assumption in such analyses is that a dynamical system in the form of a set of differential equations or discrete time maps in terms of a set of physical variables describes the observed time series. This is consistent with the mathematical form of climatic models that consider conservation of mass, energy, and momentum. These models are typically infinite-dimensional since a space-time distributed set of parameters/variables is considered. By contrast, the history of only a single observable from the system may be available, with the interest in inferring properties of the system's dynamics from it. Methods [Takens, 1981; Ruelle, 1994; Abarbanel et al., 1993; Casdagli et al., 1991; Gibson et al., 1992] for recovering the dynamics of the system in terms of this single observable (and pseudovariables obtained by simple operations, e.g., delays, of the time series) have been advanced. The recovered dynamics may be used to estimate invariants of the dynamics (e.g., the asymptotic joint probability distribution of the pseudovariable set and functions related to it) as well as to make short-term forecasts. Lall et al. [1996] demonstrated success in 2- to 4-years ahead forecasts of the GSL under this framework. Three methods, correlation dimension, nearest neighbor dimension, and the false neighbor dimension, are used here to characterize the GSL dynamics.

Figure 1. The 1847-1992 biweekly time series of the Great Salt Lake volume (1 acre-foot equals 1233.48 m³). [Plot not reproduced; horizontal axis: year, 1847 to 1992.]

The correlation dimension [Grassberger and Procaccia, 1983] has recently been studied as a measure of complexity of a dynamical system and for distinguishing between low-order, deterministic chaos and a random process. A number of analyses of weather and climate data [e.g., Nicolis and Nicolis, 1984; Fraedrich, 1986; Essex et al., 1987; Kurths and Herzel, 1987; Tsonis and Elsner, 1988; Maasch, 1989; Keppenne and Nicolis, 1989; Rodriguez-Iturbe et al., 1989, 1990; Sharifi et al., 1990] have claimed that these processes reflect low-dimensional (three to eight) dynamics, on the basis of the estimated correlation dimension. The plausibility of such claims has been questioned [e.g., Grassberger, 1986; Procaccia, 1988; Ghilardi and Rosso, 1990; Ruelle, 1990; Lorenz, 1991] on the grounds of numerical identifiability and sample size limitations as well as of conceptual considerations. Such questions have led to improved estimation methods and better insights into the implications of the estimates.

A measure related to the correlation dimension can also be obtained using the nearest neighbor (NN) method. First introduced by Pettis et al. [1979], it has been reproduced in various forms by Termonia and Alexandrowicz [1983], Guckenheimer and Buzyna [1983], Somorjai [1986], Badii and Broggi [1988], Van De Water and Schram [1988], and Havstad and Ehlers [1989].

Finally, a geometrical method for dimension estimation has been devised by Abarbanel and Kennel [1992]. This method uses nearest neighbors of each observation to check whether the trajectories of the underlying dynamics have been properly unfolded at a given embedding dimension.

These three methods are applied to the GSL time series. The indicated number of dimensions is about four from all methods. The reader is referred to work by Abarbanel et al. [1993] for an expository treatment of the material on nonlinear dynamics used here. We begin with a brief overview of how dynamics may be reconstructed from a time series. This is used to develop the notion of dimension as a topological or geometrical measure of "bulk" of the states visited by the dynamical system. The algorithms used for dimension estimation are then presented with applications to the GSL data.

2. State Space Reconstruction From a Scalar Time Series

Consider that the system of interest is characterized by a d-dimensional state space z (e.g., d interacting variables such as pressure, humidity, temperature, rainfall, and runoff). For an autonomous system the associated dynamics may be represented as

$$\frac{dz(t)}{dt} = F(z(t)) \tag{1}$$

Let us say that we have a univariate time series $x_1, x_2, \ldots, x_t, \ldots$ for one of the d state variables, x (e.g., GSL volume), generated by such a system, with sampling rate $\Delta t$. The system in (1) can be written as a higher-order differential equation in terms of a single state variable x:

$$x^{(d)} = f(x, x', \ldots, x^{(d-1)}) \tag{2}$$

Packard et al. [1980] and Takens [1981] introduced the notion of state space reconstruction from an observed scalar time series. A pseudophase space is defined using delay coordinates, that is, by defining a delay vector $\mathbf{x}_t = \{x(t), x(t - \tau), \ldots, x[t - (m - 1)\tau]\}$, where $\tau$ is an appropriately chosen delay time, which is an integer multiple of $\Delta t$, and m is an integer embedding dimension. If the solution to the equations lies on an attractor (i.e., a set of points, manifold or object in phase space that trajectories converge to after transients die out) of dimension $d_A < d$, then choosing the integer $m > 2d_A$ is a sufficient condition for unfolding the attractor from the scalar time series x(t), t = 1, ..., n. Then subject to generic assumptions on F and $\Delta t$, the underlying dynamics for any lag $\tau$ and forecast period T could be represented by a smooth (i.e., differentiable) map:

$$x(t + T) = f_T(x(t), x(t - \tau), x(t - 2\tau), \ldots, x(t - (m - 1)\tau)) = f_T(\mathbf{x}_t), \qquad f_T: \mathbb{R}^m \to \mathbb{R} \tag{3}$$

Equation (3) provides a basis for reconstructing a state space of the underlying dynamics given a scalar time series, as well as for forecasting that scalar component, provided the map $f_T$ can be described for appropriate values of T and m.
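As a minimal sketch of the delay-coordinate construction described above (not the authors' code), the following Python function builds the vectors {x(t), x(t - τ), ..., x(t - (m - 1)τ)} from a list of samples. The name `delay_embed` and the toy series are illustrative assumptions, and the delay is expressed in units of the sampling interval.

```python
def delay_embed(x, m, tau):
    """Return the delay vectors [x[t], x[t - tau], ..., x[t - (m - 1) * tau]].

    x   : sequence of scalar observations
    m   : embedding dimension (positive integer)
    tau : delay in samples (positive integer)
    """
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    start = (m - 1) * tau  # first index that has a full delay history
    return [[x[t - j * tau] for j in range(m)] for t in range(start, len(x))]


# Toy series (hypothetical data, used only to show the indexing).
series = list(range(10))
vectors = delay_embed(series, m=3, tau=2)
print(vectors[0])    # [4, 2, 0]
print(len(vectors))  # 6
```

Each vector is one point of the reconstructed state space; a series of length n yields n - (m - 1)τ such points.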

Figure 2. Normalized multitaper spectra of GSL monthly volume change using six 4π tapers. [Plot not reproduced; horizontal axis: frequency (cycles/year), 0.0 to 0.5.]

A variety of prescriptions for choosing an appropriate $\tau$ have been presented in the literature. We have noted the problem with choosing $\tau$ too small. If $\tau$ is chosen too large and the dynamics are chaotic, all relevant information for phase space reconstruction is lost since neighboring trajectories diverge, and averaging in time and/or space is no longer useful. There seems to be agreement [Abarbanel et al., 1993; Tsonis et al., 1993] that there is no optimal method for choosing $\tau$. The goal is to end up with a coordinate set that is independent such that each coordinate added to the reconstruction set provides new information. Holzfuss and Mayer-Kress [1986] suggest using a value of delay time at which the autocorrelation function first crosses the zero line. Tsonis and Elsner [1988] used a delay time greater than the decorrelation time, which they defined as the time at which the correlation drops to 1/e. Graf and Elbert [1990] write that the delay time can be set equal to the smallest lag for which the autocorrelation function is zero or to the first local minimum if that is earlier than the zero point.
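The autocorrelation-based prescriptions above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the helper names `autocorrelation` and `first_lag_below` are hypothetical, the biased sample autocorrelation is used, and the test signal is a synthetic sine wave rather than the GSL series.

```python
import math


def autocorrelation(x, max_lag):
    """Biased sample autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


def first_lag_below(acf, level):
    """Smallest lag at which the autocorrelation is at or below `level`."""
    for lag, value in enumerate(acf):
        if value <= level:
            return lag
    return None  # the level is never reached within the lags supplied


# Synthetic example: a sine wave with a period of 40 samples.
x = [math.sin(2 * math.pi * t / 40) for t in range(400)]
acf = autocorrelation(x, 30)
zero_lag = first_lag_below(acf, 0.0)                  # zero-crossing rule
decorrelation_lag = first_lag_below(acf, 1 / math.e)  # 1/e rule
print(zero_lag, decorrelation_lag)
```

For a pure sine wave the zero crossing falls near a quarter period, which is the delay that spreads the reconstructed trajectory most evenly in two dimensions.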

Another choice for the delay time is the value that produces the first local minimum in the mutual information function [Fraser and Swinney, 1986]. This method for choosing the delay time has the advantage that it considers all kinds of relations, not only the linear ones as in the autocorrelation function. Mutual information attempts to measure how dependent the values of $x(t + \tau)$ are on the values of $x(t)$ as a function of the time lag $\tau$. The average mutual information in bits is defined as

$$MI(\tau) = \sum_{\eta, \beta} P_{x(t),x(t+\tau)}(\eta, \beta) \log_2 \left[ \frac{P_{x(t),x(t+\tau)}(\eta, \beta)}{P_{x(t)}(\eta)\, P_{x(t+\tau)}(\beta)} \right] \tag{4}$$

where $P_{x(t)}(\eta)$ is the probability of choosing $\eta$ from the set $x(t)$, $P_{x(t+\tau)}(\beta)$ is the probability of choosing $\beta$ from the set $x(t + \tau)$, and $P_{x(t),x(t+\tau)}(\eta, \beta)$ is the probability of getting $\eta$ as the first component and $\beta$ as the second component of $\{x(t), x(t + \tau)\}$. The Fraser and Swinney algorithm uses a histogram-based estimator for computing these probabilities. Abarbanel et al. [1993] suggest the prescription that in case there is no clear minimum with $\tau$ in $MI(\tau)$, $\tau$ be chosen as the first value such that $MI(\tau)/MI(0) \le 0.2$.
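A histogram estimate of the average mutual information can be sketched as below. This is a simplified illustration, not the Fraser and Swinney algorithm: it assumes equal-width bins (their method partitions the plane adaptively), and the names `mutual_information` and `first_local_minimum` and the default of 16 bins are hypothetical choices.

```python
import math
from collections import Counter


def mutual_information(x, lag, bins=16):
    """Equal-width histogram estimate, in bits, of MI between x(t) and x(t + lag)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_of(v):
        return min(int((v - lo) / width), bins - 1)

    pairs = [(bin_of(x[t]), bin_of(x[t + lag])) for t in range(len(x) - lag)]
    n = len(pairs)
    joint = Counter(pairs)                   # counts for P(eta, beta)
    first = Counter(a for a, _ in pairs)     # counts for P(eta)
    second = Counter(b for _, b in pairs)    # counts for P(beta)
    return sum(
        (c / n) * math.log2(c * n / (first[a] * second[b]))
        for (a, b), c in joint.items()
    )


def first_local_minimum(values):
    """Index of the first interior local minimum, or None if there is none."""
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] <= values[i + 1]:
            return i
    return None


# Usage on a synthetic series: MI as a function of lag, then its first minimum.
x = [math.sin(0.1 * t) + 0.5 * math.sin(0.037 * t) for t in range(2048)]
mi = [mutual_information(x, lag) for lag in range(0, 60)]
print(first_local_minimum(mi))
```

At lag zero the estimate reduces to the entropy of the binned series, which gives a quick sanity check on the implementation.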

Our experiments suggest that it is desirable to examine plots of the phase space with various values of $\tau$ and also to examine each of these criteria. The average mutual information function was calculated for five different segments of length 2048 of the GSL volume time series (total length equals 3463 biweekly values) using a code provided by Fraser. The first minimum in the mutual information for these calculations occurred between 9 and 13 lags. A plot from one of these calculations is shown in Figure 3. The autocorrelation function at different time lags for each of the samples was also computed. The first minimum in the autocorrelation function occurred at a lag of 13 for all 5 samples considered. Figure 4 shows a plot of the autocorrelation function.

Figure 3. Mutual information function of the Great Salt Lake volume time series. The sampling time Δt = 15 days. The first minimum occurs at a delay time lag of 9 (135 days) for this sample (2048 points) of the time series. [Plot not reproduced; horizontal axis: delay time lag, 0 to 100.]

Figure 4. Autocorrelation function for a sample of length 2048 from the GSL volume time series. The first minimum is at a delay time of 13 biweekly periods, or 195 days. [Plot not reproduced; horizontal axis: delay time lag, 0 to 70; vertical axis: about 0.84 to 1.00.]

3. Dimension Analysis: Concepts

Our interest lies in identifying the dimension of the set that the dynamics (as represented by the trajectories of the single observable in the embedded space) is contained in. Recovery of this dimension (1) allows a classification of the complexity of the dynamics and (2) provides the number of terms in the delay vector needed for forecasting that variable. As regards (1), it turns out that for trajectories of a chaotic system, a variety of measures of dimension [Schroeder, 1991] may be needed to classify the process. Geometrical methods [Abarbanel and Kennel, 1992] provide a direct estimation of dimension for (2). In this section these concepts are defined. Estimation algorithms and results follow in the next section.

3.1. Dimensions Through Scaling Ideas

The simplest notion of dimension follows from the observation that the volume V of an object that is d dimensional can be expressed as $r^d$, where r is some characteristic length (e.g., for a line d = 1, V = r; for a cube, V = $r^3$). This suggests that dimension d may be estimated as the ratio log (V)/log (r).

More generally, one can define measures of "bulk" and develop estimates of dimension in terms of such a measure and its scaling with r. This idea is explained by Abarbanel et al. [1993] as follows. Let us suppose that evolution of the process has been recorded for a long enough period, such that the underlying process has been sufficiently described (i.e., a representative sampling of the phase space is available) and that there is no residual dependence on initial conditions. This allows the definition of an "invariant" probability density p(y) in the phase space, where $y = \{x(t), x(t + \tau), \ldots, x(t + (m - 1)\tau)\}$ represents the embedding. Recall that this probability distribution can be thought of as the distribution of mass in the m-dimensional space. The notion of "bulk" can now be developed in terms of the moments of this distribution. Suppose we measure bulk as $\{E[g^b]\}^{1/b}$, for some number b, where the expectation E[ ] is taken over the density p(y), and the function g is taken to be p(y) itself. This measure of bulk is more sensitive to nonuniformity in the distribution of mass over the domain as b increases. If the mass is uniformly distributed, for example, over a sphere or cube, then p(y) is constant over the object, and the measured bulk (and hence dimension) is independent of b.

Generalized dimensions are then defined as

$$D = \lim_{r \to 0} \frac{\log(\text{bulk})}{\log(r)} = \lim_{r \to 0} \frac{\log\left(\{E[p(y)^b]\}^{1/b}\right)}{\log(r)} = \lim_{r \to 0} \frac{1}{b} \frac{\log \int p^b(y)\, p(y)\, dy}{\log(r)} \tag{5}$$

or

$$D_q = \lim_{r \to 0} \frac{1}{q - 1} \frac{\log \sum_i p_i^q}{\log(r)} \tag{6}$$

where $p_i$ is $p(y_i)$, q = (b + 1), and the summation corresponds to a discrete approximation of the integral as the sample size n approaches infinity.

Figure 5. Illustration of false neighbors. The data is sampled from x(t) = sin(t) at a sampling rate Δt = 0.005, with τ = 50 and m = 2. The four points shown are projected down to one dimension at the bottom of the figure. Note how points 1 and 3 are false neighbors once m = 2. [Plot not reproduced; panel label: "Points on map, m = 2, are projected on this line, m = 1".]

This definition becomes clearer upon considering the special cases of q = 0, 1, and 2, and by approximating $p_i = n_i(r)/n$, where $n_i(r)$ is the number of observations out of n that falls in the ith partition of the data formed with a characteristic length r (e.g., a partition is a hypercube in m dimensions of side r). There are $n_r$ nonempty partitions of the data for a given r.

Box counting dimension

$$D_0 = \lim_{r \to 0} \frac{\log(n_r)}{\log(1/r)} \tag{7}$$

Information dimension

$$D_1 = \lim_{r \to 0} \frac{\sum_i p_i \log(p_i)}{\log(r)} \tag{8}$$

Correlation dimension

$$D_2 = \lim_{r \to 0} \frac{\log \sum_i p_i^2}{\log(r)} \tag{9}$$

Abarbanel et al. [1993] observe that $D_0$ measures bulk in terms of the number of partitions that are not empty, irrespective of the actual number of elements in the partition. $D_1$ measures bulk by weighting each partition's contribution in terms of the fraction of the observations that fall in it. Consequently, they consider $D_0$ analogous to volume, and $D_1$, to mass. $D_2$ measures the probability of finding a pair of random points in a partition and hence measures variation in mass.
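The box-based definitions of $D_0$, $D_1$, and $D_2$ can be illustrated numerically. The sketch below is a hedged example, not a procedure from the paper: it replaces the limit r → 0 by a finite-difference slope between two box sizes, and the function names and the test set (points along the diagonal of the unit square, for which all three dimensions should be close to one) are assumptions made for illustration.

```python
import math
from collections import Counter


def box_probabilities(points, r):
    """Fraction p_i = n_i(r)/n of the points in each nonempty hypercube of side r."""
    counts = Counter(tuple(math.floor(c / r) for c in p) for p in points)
    n = len(points)
    return [c / n for c in counts.values()]


def log_moment(points, r, q):
    """Quantity whose slope against log r approximates D_q."""
    p = box_probabilities(points, r)
    if q == 1:
        return sum(p_i * math.log(p_i) for p_i in p)          # information dimension
    return math.log(sum(p_i ** q for p_i in p)) / (q - 1)     # general q (q = 0, 2, ...)


def dimension(points, q, r_small, r_large):
    """Finite-difference slope between two box sizes, standing in for r -> 0."""
    rise = log_moment(points, r_large, q) - log_moment(points, r_small, q)
    return rise / (math.log(r_large) - math.log(r_small))


# Evenly spaced points on the diagonal of the unit square (a one-dimensional set).
line = [(i / 4096, i / 4096) for i in range(4096)]
for q in (0, 1, 2):
    print(q, dimension(line, q, 1 / 64, 1 / 8))
```

Because the mass is uniform along the line, the three estimates coincide; for a nonuniform set they would generally satisfy $D_0 \ge D_1 \ge D_2$.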

The situation here is similar to estimating moments of a random variable in traditional stochastic analysis, and one can explore dimension for other values of q as well. Connections between the generalized dimensions $D_q$ and multifractal spectra are described by Schroeder [1991]. The availability of an algorithm [Grassberger and Procaccia, 1983] for estimating $D_2$ from a time series, and its relative computational efficiency, have led to $D_2$ being the most widely applied. This algorithm is described and implemented in section 4. Another scaling-based algorithm that uses distances to k nearest neighbors rather than fixed partitions to estimate the probability $p(y_i)$ is also described and implemented there.

3.2. Dimension Through Geometry and Nearest Neighbors

Abarbanel and Kennel [1992] used a simple geometrical idea, rather than a scaling approach, to provide a direct estimate of the embedding dimension at which the trajectories have been unfolded. Their idea is illustrated with reference to the example in Figure 5. Pick four points (labeled 1, 2, 3, and 4 in Figure 5) on a map formed with m = 2. If we were to work in m = 1, the coordinates of these points would be 0, 1, 0, and -1, respectively. Points 1 and 3 would be nearest neighbors. Moving to m = 2, we see that point 1 (0, -1) is not really a nearest neighbor of point 3 (0, 1). The points 2 (1, 0) and 4 (-1, 0) are closer to 1 than 3 is. Indeed, the distance from point 1 to point 3 is now two units, that is, the maximum possible in the circle map, rather than zero units as was the case with m = 1. Points 1 and 3 and all similar points are then considered false neighbors. Going from m = 2 to m = 3, we end up with no false neighbors. Consequently, at m = 2 the dynamics have been unfolded. For noisy data, one checks whether the fraction of false neighbors is acceptable. It is important to note that this test estimates an integer dimension $d_E$ that is the minimum needed to unfold the dynamics. This dimension will lie between the smallest integer greater than or equal to $d_A$ and $(2d_A + 1)$. This method has been used successfully by, among others, Tsonis et al. [1994].
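The false neighbor test can be sketched in a few lines. The version below is a simplified illustration and not the Abarbanel and Kennel implementation: it applies a single distance-ratio criterion with a tolerance of 10 (an assumed value that the text does not specify), uses a brute-force neighbor search, and takes a synthetic sine wave as input. The function name `false_neighbor_fraction` is hypothetical.

```python
import math


def false_neighbor_fraction(x, m, tau, ratio_tol=10.0):
    """Fraction of nearest neighbors in dimension m that separate in dimension m + 1."""
    start = m * tau  # x[t - m * tau] must exist for the added coordinate
    idx = range(start, len(x))
    vec = {t: [x[t - j * tau] for j in range(m)] for t in idx}
    false = total = 0
    for t in idx:
        # Brute-force nearest neighbor of vec[t] in m dimensions (Euclidean distance).
        s = min((u for u in idx if u != t), key=lambda u: math.dist(vec[t], vec[u]))
        d_m = math.dist(vec[t], vec[s])
        if d_m == 0.0:
            continue  # identical vectors carry no information for the ratio test
        # Separation contributed by the coordinate added when going to m + 1.
        extra = abs(x[t - m * tau] - x[s - m * tau])
        total += 1
        if extra / d_m > ratio_tol:
            false += 1
    return false / total


# Sine wave sampled so that the period is not a whole number of samples;
# tau = 16 samples is close to a quarter period.
x = [math.sin(0.1 * t) for t in range(400)]
fnn_m1 = false_neighbor_fraction(x, m=1, tau=16)
fnn_m2 = false_neighbor_fraction(x, m=2, tau=16)
print(fnn_m1, fnn_m2)
```

On this signal the fraction should fall sharply between m = 1 and m = 2, mirroring the circle example: points that coincide on the line are separated once the second delay coordinate is added.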

4. Methodology and GSL Results

Algorithms for correlation dimension (the version of Grassberger and Procaccia [1983]), nearest neighbor dimension [Pettis et al., 1979], and false neighbor dimension estimation [Abarbanel and Kennel, 1992] and results for the GSL time series are presented in this section. The algorithms were first tested with data of the same length (3463) for the x variable from the Lorenz equations [Lorenz, 1976]. The estimated dimensions were 2.05, 2.09, and 3, respectively, for this data set compared with the known [Wolf, 1986] dimension of 2.07. Recall that the false neighbor method gives the smallest usable integer embedding dimension. Details of these and other tests of the numerical algorithms with synthetic data (Rössler, uniform, Gaussian, gamma, AR processes) are available from the authors.


4.1. Correlation Dimension

Grassberger and Procaccia [1983] proposed the estimation of the numerator in (9) through the so-called correlation integral, C(r), obtained by constructing a sphere or cube of radius r at each point $y_i$ in phase space and counting the number of points in it, that is,

$$C(r) = \frac{1}{N(N - 1)} \sum_{i=1}^{N} \sum_{j=1,\, j \ne i}^{N} H(r - |y_i - y_j|) \tag{10}$$

where $H(\sigma)$ is the Heaviside function, with $H(\sigma) = 0$ if $\sigma < 0$ and $H(\sigma) = 1$ if $\sigma > 0$. The maximum norm is used for distance so that the vectors are counted within an m-dimensional cube of side r centered on the reference vector.

The dimension D2 is estimated from the slope of a plot of log C(r) versus log r. In this paper natural logarithms are used throughout. The slope is estimated by a least squares fit of a straight line over a certain range of r, called the scaling region.
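A minimal sketch of this estimate follows, assuming a brute-force pair count with the maximum norm and an ordinary least squares fit over a hand-picked set of radii. It is not the authors' code: the function names, the radii, and the test set (600 points on the unit circle, for which the slope should be close to one) are illustrative assumptions.

```python
import math


def correlation_integral(vectors, r):
    """Fraction of ordered pairs (i, j), i != j, with maximum-norm distance below r."""
    n = len(vectors)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) < r:
                count += 2  # counts both (i, j) and (j, i)
    return count / (n * (n - 1))


def least_squares_slope(xs, ys):
    """Slope of the least squares line through the points (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return num / sum((a - mx) ** 2 for a in xs)


# Points evenly spaced on the unit circle: a one-dimensional set embedded in two dimensions.
n = 600
circle = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)) for i in range(n)]

radii = [0.05, 0.1, 0.2, 0.4]  # assumed scaling region for this toy example
log_r = [math.log(r) for r in radii]
log_c = [math.log(correlation_integral(circle, r)) for r in radii]
d2 = least_squares_slope(log_r, log_c)
print(d2)
```

With real data the choice of radii is the delicate step, since the fit is only meaningful over the range where the slope is roughly constant.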

The correlation dimension in (9) was defined with the limit taken as r approaches zero. Unfortunately, if the data is at all noisy, C(r) becomes poorly behaved as r approaches zero. The number of pairs of points available can be too few below some radius r. Similarly, for very large r, all the neighbors that get counted are counted with reference to a few vectors that are near the edges of the domain (i.e., may even be outliers). The edge effect grows as m increases. Consequently, the usable r values (those for which the slope of log [C(r)] versus log (r) is constant) are in an intermediate range called the scaling region. This region is usually large (2 to 3 orders of magnitude) for noiseless data from a low-dimensional (two or three) system. It may not be easy to find such a region with a real data set.

The strategy is to fix $\tau$ appropriately and then vary m = 1, ..., M. For each value of m, identify a scaling region and estimate the slope $D_{2,m}$. Now plot $D_{2,m}$ versus m. If $D_{2,m}$ is constant for all $m > m^*$, then one accepts it as the correlation dimension $D_2$. The estimate of $D_2$ can be biased downward due to correlation between the reference vectors and their close neighbors. The effect of this can be minimized by excluding a number of nearest neighbor vectors about each reference point when calculating the correlation integral in (10) [Theiler, 1986]. To decide on the number of nearest neighbors to exclude about each reference vector, the plots of log C(r) versus log (r) are computed when one, two, three, ..., nearest neighbors are removed. Then the one which gives the minimum distortion of the log C(r) versus log (r) curves or the largest scaling region is selected. We found excluding two nearest neighbors to give the largest scaling region across m and r for the GSL data.

The application to the GSL time series data is now presented. The delay $\tau$ was taken to be 9, and m was varied from 1 to 8 in steps of one. Figure 6a shows the plot of log C(r) versus log (r) for the reconstructed attractor of the GSL time series for embedding dimensions 2, 4, 6, and 8. The scaling region to use for estimating the slope of log C(r) versus log (r) is not obvious. To make it easier to select, local slopes of the curves in Figure 6a are calculated as

$$\Delta_i = \frac{\log [C(r)]_{i+1} - \log [C(r)]_{i-1}}{\log (r)_{i+1} - \log (r)_{i-1}} \tag{11}$$

These local slopes are plotted in Figure 6b. From Figure 7 the correlation dimension of the GSL volume time series is estimated to be 3.4. It is unclear whether the estimates of $D_2$ have converged with m as large as 8.
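The local slope calculation is a centered difference over neighboring radii and can be written compactly. The following Python sketch uses a hypothetical function name and made-up values of log r and log C(r) that lie on a straight line of slope 2; it is meant only to show the indexing, not to reproduce any GSL result.

```python
def local_slopes(log_r, log_c):
    """Centered-difference slopes of log C(r) versus log r at the interior radii."""
    return [
        (log_c[i + 1] - log_c[i - 1]) / (log_r[i + 1] - log_r[i - 1])
        for i in range(1, len(log_r) - 1)
    ]


# Synthetic values (not GSL data): log C(r) = 2 log r - 30.
log_r = [10.0, 11.0, 12.0, 13.0]
log_c = [2.0 * v - 30.0 for v in log_r]
print(local_slopes(log_r, log_c))  # [2.0, 2.0]
```

A scaling region then shows up as a plateau in these slopes when they are plotted against log r.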

We find it disappointing that the user has to indulge in a number of subjective and visual judgements in order to come up with an estimate. For the GSL data we regard these estimates as tentative at best. Similar problems were encountered upon tests with synthetic, noise-contaminated data from known dynamical systems. Other modifications to the correlation integral method have also been made in order to reduce the distortion of the curves of log C(r) versus log (r). A list of these is given by Havstad and Ehlers [1989]. These include (1) using log C(r) as the independent variable and log r as the dependent variable (this is done by specifying the number of nearest neighbors about each reference vector and obtaining the distances corresponding to these, rather than specifying a radius and obtaining the number of vectors in the box as indicated in (7)) and (2) taking the average of each log C(r), rather than estimating C(r), and taking the average before applying the log. These modifications make the correlation integral method similar to the nearest neighbor method.

4.2. Nearest Neighbor Dimension

The nearest neighbor (NN) method estimates the dimension by looking at the properties of the average distance to a sequence of near neighbors. One can think of this as switching the independent and dependent variables in the correlation dimension procedure. We shall present the original proposal by Pettis et al. [1979], which predates the activity in the recent literature.

Suppose that we have an m-dimensional embedding $(x_i, x_{i-\tau}, \ldots, x_{i-(m-1)\tau})$ of the time series that is represented as a state vector $y_i$. The probability density p(y) at a point y in m-dimensional space may be estimated using the k-nearest neighbor multivariate density (k-NN) estimator [Silverman, 1986] as

$$p(y) = \frac{k/N}{V} \tag{12}$$

where N is the total number of data points $y_i$; k is the number of near neighbors to y within a hypersphere of radius $r_k$ about y; $V = V_d r_k^d$ is the volume of the hypersphere; $V_d = \pi^{d/2}/\Gamma(d/2 + 1)$ is the volume of the unit d-dimensional hypersphere; and $\Gamma(\cdot)$ is the gamma function.

The k-NN density estimate is a nonparametric estimator similar in spirit to a histogram. Histograms are usually formed by placing "bins" of fixed volume (width in one dimension) over the domain of the variable(s) of interest and estimating the frequency of observations in each bin. The k-NN estimator places the "bins" centered at each point at which an estimate of the probability density is needed, specifies the number of neighbors (k), and uses a bin volume that is prescribed by the distance to the kth nearest neighbor of the point of estimate. This strategy improves on the histogram by freeing the estimate from the choice of origin and allowing bin sizes to adapt to variations in the probability density, being larger where data is sparse and smaller in areas of higher data density.
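The k-NN density estimate can be sketched directly from its definition, p(y) = (k/N)/V with V the volume of the hypersphere reaching the kth nearest neighbor. The Python below is an illustrative sketch under stated assumptions: the function names are hypothetical, the query point is assumed not to be one of the data points, and the check uses evenly spaced points on the unit interval, where the true density is one.

```python
import math


def unit_ball_volume(d):
    """Volume of the unit hypersphere in d dimensions: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def knn_density(y, data, k):
    """k-nearest-neighbor density estimate at the query point y.

    y    : query point (tuple of d coordinates), assumed not to be in `data`
    data : list of N points of the same dimension
    k    : number of neighbors that fixes the radius r_k
    """
    d = len(y)
    r_k = sorted(math.dist(y, p) for p in data)[k - 1]  # distance to kth neighbor
    volume = unit_ball_volume(d) * r_k ** d
    return (k / len(data)) / volume


# One-dimensional check: 1000 evenly spaced points on [0, 1).
data = [(i / 1000,) for i in range(1000)]
density = knn_density((0.5005,), data, k=10)
print(density)
```

Increasing k smooths the estimate at the cost of a wider neighborhood, which is the same tradeoff as choosing a bin width for a histogram.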

Figure 6. (a) Log C(r) versus log r for Great Salt Lake volume time series when two nearest neighbors are excluded in calculating the correlation integral (curves for m = 2, 4, 6, and 8). (b) Local slope of the curves in Figure 6a versus log r, with the scaling region marked.

Assuming that p(y) is constant and nonzero in some small neighborhood (i.e., in the limit as $r_k \to 0$) of the point of estimate y, and considering a binomial process to describe the probability that a certain number of neighbors can fall within a radius r of y, Pettis et al. [1979] derive the density function for the distance $r_{k,y}$ to the kth nearest neighbor from y as a gamma distribution. This assumption of local homogeneity and independence of points in state space may not be valid for a dynamical system under study. In the subsequent analysis it may be useful to examine the estimates over a range of k values. The expected value of the average distance to the kth nearest neighbor is then

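$$E[\bar r_k] = \frac{1}{N} \sum_{i=1}^{N} E[r_{k,y_i}] = \frac{C_N k^{1/d}}{G_{k,d}} \qquad (13)$$

where

$$G_{k,d} = \frac{k^{1/d}\,\Gamma(k)}{\Gamma(k + 1/d)}, \qquad C_N = \frac{1}{N} \sum_{i=1}^{N} \left[N p(y_i) V_d\right]^{-1/d}$$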

The term $C_N$ is independent of k and is a constant if m > d, the intrinsic dimension of the data set, while $G_{k,d}$ depends on k. However, $\log(G_{k,d})$ is in the range $0 < \log(G_{k,d}) < 0.12$ for all k and d, the maximum occurring for k = 1 and d = 2.17. The average distance from each observation to its kth nearest neighbor, $\bar r_k$, is used as an estimate of $E(\bar r_k)$. An estimator $\hat d$ for d is then defined through
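The quoted bound on the correction term can be checked numerically. The sketch below is an illustration only: it assumes natural logarithms, uses the hypothetical helper name `log_G`, and scans an arbitrarily chosen grid restricted to d > 1 (for d < 1 and k = 1 the term becomes negative, so the positivity statement is taken here to refer to d > 1).

```python
import math


def log_G(k, d):
    """Natural log of G_{k,d} = k^(1/d) * Gamma(k) / Gamma(k + 1/d)."""
    return math.log(k) / d + math.lgamma(k) - math.lgamma(k + 1.0 / d)


# Assumed search grid: k = 1..100 and d = 1.01..10.00 in steps of 0.01.
grid = [(log_G(k, d / 100.0), k, d / 100.0)
        for k in range(1, 101)
        for d in range(101, 1001)]

max_val, max_k, max_d = max(grid)  # largest correction and where it occurs
min_val = min(grid)[0]             # smallest correction on the grid

print(f"max log G = {max_val:.4f} at k = {max_k}, d = {max_d:.2f}")
```

On this grid the maximum is found at k = 1 near d = 2.17, in agreement with the values quoted above, which is why setting the term to zero is a reasonable starting point for the iteration.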

$$\log(G_{k,\hat d}) + \log(\bar r_k) = \frac{1}{\hat d} \log(k) + \log(C_N) \qquad (14)$$

The plot of $\log(\bar r_k)$ as a function of log (k) will have a slope of $1/\hat d$, and $\hat d$ can be obtained. The term $C_N$ affects only the intercept of this plot. The dimension estimate is solved for in an iterative manner. An initial estimate $\hat d_0$ is obtained by setting $\log(G_{k,\hat d})$ to zero and fitting a least squares regression line to $\log(\bar r_k)$ versus log (k) for $k = k_{\min}, \ldots, k_{\max}$. The value of $\log(G_{k,\hat d_0})$ is then used to fit another regression line to obtain $\hat d_1$. This is continued until $|\hat d_{i+1} - \hat d_i| < \epsilon$ for some tolerance $\epsilon$ and iteration index i. The estimate at iteration i is given as


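$$\frac{1}{\hat d_i} = \frac{K \sum_{k} \log(k) \left[\log(\bar r_k) + \log(G_{k,\hat d_{i-1}})\right] - \sum_{k} \log(k) \sum_{k} \left[\log(\bar r_k) + \log(G_{k,\hat d_{i-1}})\right]}{K \sum_{k} [\log(k)]^2 - \left[\sum_{k} \log(k)\right]^2} \qquad (15)$$

where each sum runs over the K values of k in the scaling region, $K = k_{\max} - k_{\min} + 1$, and $k_{\max}$ and $k_{\min}$ are the maximum and minimum number of neighbors considered, respectively.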
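The iteration defined by (14) and (15) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the GSL analysis: the function names are hypothetical, neighbors are found by brute force, natural logarithms are used, no edge censoring is applied, and the tolerance and iteration cap are arbitrary choices.

```python
import math


def mean_knn_distances(points, k_max):
    """Return [r_bar_1, ..., r_bar_kmax], the mean distance to the kth neighbor."""
    n = len(points)
    sums = [0.0] * k_max
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for k in range(k_max):
            sums[k] += dists[k]
    return [s / n for s in sums]


def log_G(k, d):
    """Natural log of G_{k,d} = k^(1/d) * Gamma(k) / Gamma(k + 1/d)."""
    return math.log(k) / d + math.lgamma(k) - math.lgamma(k + 1.0 / d)


def ls_slope(x, y):
    """Least squares slope of y on x, written in the form of (15)."""
    K = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return (K * sxy - sx * sy) / (K * sxx - sx ** 2)


def nn_dimension(points, k_min, k_max, tol=1e-4, max_iter=50):
    """Iterative nearest neighbor dimension estimate over k_min..k_max."""
    r_bar = mean_knn_distances(points, k_max)
    ks = list(range(k_min, k_max + 1))
    log_k = [math.log(k) for k in ks]
    log_r = [math.log(r_bar[k - 1]) for k in ks]
    d_hat = 1.0 / ls_slope(log_k, log_r)  # initial estimate: log G set to zero
    for _ in range(max_iter):
        y = [lr + log_G(k, d_hat) for k, lr in zip(ks, log_r)]
        d_new = 1.0 / ls_slope(log_k, y)
        if abs(d_new - d_hat) < tol:
            return d_new
        d_hat = d_new
    return d_hat
```

Applied to points scattered uniformly on a circle in two dimensions, one of the test cases mentioned for the original algorithm, the estimate should come out near 1 even though the embedding space is two dimensional.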

Pettis et al. [1979] performed Monte Carlo tests of their algorithm with a variety of low-dimensional data sets covering situations where the data fill the space in which they are embedded as well as where the data are lower dimensional than the embedding space (e.g., uniform points on a circle in two dimensions). They found that their algorithm was insensitive to the choice of k, and worked well even for k = 2 and for samples as small as 100. However, there is clearly distortion with small k, due to noise and due to clustering of data, and for large k, due to edge effects. The variance of $\hat d$ will be high for small k (it emphasizes local information), while for large k (it emphasizes global information) the assumption of a locally constant p(y) is violated. A "scaling" region ($k_{\min}$, $k_{\max}$) consequently needs to be chosen in terms of k. Somorjai [1986] suggests using $k = \sqrt{N}$ for the dimension estimate. Our tests suggested that for most data sets, a fairly wide range of values of k that bracket $\sqrt{N}$ gives stable results.

Figure 7. Estimated dimension versus the embedding dimension. The estimated dimension is the slope over the scaling region of the log C(r) versus log r plot of Figure 6.

Figure 8. (a) Plot of log k versus $\log(G_{k,d}) + \log(\bar r_k)$ for the nearest neighbor method, $\tau = 9$. (b) Plot of log k versus $\log(G_{k,d}) + \log(\bar r_k)$ over the scaling region for the Great Salt Lake volume time series.

Pettis et al. [1979] found from empirical studies that including samples on the edge of the data set substantially increased the sample variance and distorted the estimate of $\hat d$. They recommend retaining only the $r_{k,y_i}$ such that $r_{k,y_i} \le \bar r_k + s_k$, where $s_k$ is the standard deviation of the kth nearest neighbor distances. The $\bar r_k$ is then recomputed using only the distances that meet this criterion. This sort of censoring to avoid edge effects is often recommended [Cressie, 1991] when analyzing spatial point process data using nearest neighbor methods. In our context, some bias in estimating d may be introduced by this censoring, since extreme points of the set are downweighted. We experimented by retaining distances up to 2 and 3 times $s_k$ from $\bar r_k$ for our data sets and found that the difference in the final estimate was minor. Where no censoring was employed, the estimated dimension was adversely affected.
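The censoring rule can be written compactly. The sketch below is an illustration with a hypothetical function name; the use of the sample (n - 1) standard deviation is an assumption, and the multiplier `n_sd` stands for the 1, 2, or 3 standard deviation cutoffs discussed above.

```python
import math


def censored_mean(distances, n_sd=1.0):
    """Mean kth-neighbor distance after dropping values above mean + n_sd * s_k.

    distances : kth nearest neighbor distance of every reference vector
    n_sd      : cutoff in standard deviations (1 in the original rule)
    Assumption: s_k is the sample standard deviation (n - 1 denominator).
    """
    n = len(distances)
    mean = sum(distances) / n
    s_k = math.sqrt(sum((r - mean) ** 2 for r in distances) / (n - 1))
    kept = [r for r in distances if r <= mean + n_sd * s_k]
    return sum(kept) / len(kept)
```

The censored mean would replace the plain average of the kth neighbor distances before the regression of (14) is fitted.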

As with the correlation dimension algorithm, one computes $\hat d$ repeatedly for reconstructed phase space with increasing embedding dimension to see if $\hat d$ stabilizes with m.

A plot of log (k) versus $[\log(G_{k,d}) + \log(\bar r_k)]$ from the nearest neighbor dimension method is shown in Figure 8a for nearest neighbors from 2 to 700 (log (k) = 0.69 to 6.55), using embedding dimensions of two, four, six, and eight. It is seen that the kinks in similar plots observed for the correlation integral method are absent in this case. There is some curvature in the relationships for low and high k values.

In this context a scaling region can be described in terms of a range of k values. The range of nearest neighbors used for the scaling region is 20 to 70 (i.e., log (k) = 2.99 to 4.25). Actually, the relationships are stable for log (k) = 2.5 to 6 or k = 6 to 400, a much larger range than with the correlation integral method. The plot of log (k) versus $[\log(G_{k,d}) + \log(\bar r_k)]$ over the scaling region is shown in Figure 8b for embedding dimensions of 1, 3, 5, 7, 9, and 11. A plot of the estimated nearest neighbor dimension versus embedding dimension is shown in Figure 9. The estimated nearest neighbor dimension of the GSL volume time series is 3.74. There is more clear-cut evidence of saturation of dimension in this case, starting at an embedding dimension of six, than with the correlation dimension procedure. The estimate of dimension for m = 4 is now 3.2 rather than 2 (the correlation dimension estimate). The reduced "bias" is confirmed in parallel analyses of synthetic data.

Figure 9. Nearest neighbor dimension versus embedding dimension for Great Salt Lake volume time series.

4.3. False Neighbor Dimension

A key step in the Abarbanel and Kennel [1992] algorithm is deciding, upon increasing the embedding dimension, whether a nearest neighbor is false. Two criteria are used. These are the following:

1. If $R_{m+1}(i) \ge 2R_A$, the ith vector has a false nearest neighbor (where $R_{m+1}(i)$ is the distance to the nearest neighbor of the ith vector in an embedding of dimension (m + 1), and $R_A$ is the standard deviation of the x(t), t = 1, ..., n).

2. If $[R_{m+1}(i) - R_m(i)] > \epsilon R_m(i)$, the ith vector has a false nearest neighbor (where $\epsilon$ is a threshold factor (10 to 50 apparently works well), and the distance $R_{m+1}(i)$ is computed to the same neighbor that was identified with embedding m, but with the (m + 1)th coordinate (i.e., $x(t - m\tau)$) appended to the ith vector and to its nearest neighbor with embedding m).

The first criterion is needed because with a finite data set, under repeated embedding, one may stretch out the points such that they are far apart and yet cannot move any farther apart upon increasing the dimension. The second criterion checks whether the nearest neighbors have moved far apart on increasing the dimension. The threshold $\epsilon$ to use has been established by experimentation.
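The two criteria can be combined into a short Python sketch. It is an illustration only and not the implementation used for the results that follow: the function name is hypothetical, the neighbor search is brute force, the threshold of 15 is an assumed value inside the 10 to 50 range quoted above, and the population form of the standard deviation is an assumption.

```python
import math


def false_neighbor_fraction(x, m, tau, eps=15.0):
    """Fraction of delay vectors in dimension m whose nearest neighbor is false.

    x   : scalar time series (list of floats)
    m   : current embedding dimension
    tau : delay, in sampling intervals
    eps : threshold factor of the second criterion (assumed value)
    """
    n = len(x)
    mean = sum(x) / n
    r_a = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # std. dev. of x(t)
    times = list(range(m * tau, n))  # x[t - m*tau] must exist for every t used
    vecs = [tuple(x[t - j * tau] for j in range(m)) for t in times]
    n_false = 0
    for i, vi in enumerate(vecs):
        # Brute-force search for the nearest neighbor in dimension m.
        r_m, best = math.inf, None
        for j, vj in enumerate(vecs):
            if j != i:
                r = math.dist(vi, vj)
                if r < r_m:
                    r_m, best = r, j
        # Append the (m + 1)th coordinate x(t - m*tau) to both vectors.
        extra = abs(x[times[i] - m * tau] - x[times[best] - m * tau])
        r_m1 = math.hypot(r_m, extra)  # distance to the same neighbor in m + 1
        if r_m1 >= 2.0 * r_a or (r_m1 - r_m) > eps * r_m:
            n_false += 1
    return n_false / len(vecs)
```

For a sampled sine wave with a delay near a quarter period, a large share of the neighbors is false at m = 1, because rising and falling branches overlap, and essentially none is false at m = 2, where the delay vectors trace out a closed curve.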

The results from an application of this method are shown in Figure 10. We see that the fraction of false neighbors has dropped to essentially zero by m = 4, consistent with the correlation and nearest neighbor dimension estimates.

Figure 10. Percentage of false nearest neighbors for the GSL data as a function of embedding dimension, $\tau = 9$.

5. Some Factors That Influence Dimension Calculations

We indicated earlier that the reliability of the algorithms used for estimating dimension remains an area of inquiry. A key question is how many data are needed to identify the dimension by the nonparametric methods advanced here. A second important question is whether these dimension estimates reflect determinism or low-dimensional behavior, or if they can correspond to a long memory stochastic process. Of course, a host of methodological or algorithmic questions also arise. We shall only briefly indicate the current state of investigation into such issues. The reader is referred to work by Abarbanel et al. [1993], Eckmann and Ruelle [1992], and Ramsey and Yuan [1990] for additional information.

We noted earlier that a histogram style approach is implicit in the estimation of the probability densities used for estimating the dimensions $D_0$, $D_1$, and $D_2$. The number of data points required for a reliable dimension estimate increases exponentially with the dimensionality of the data set. Complexity of the underlying geometry is also a factor. Perhaps the most serious factor is whether the underlying dynamics has been sampled in a representative manner; one could collect $10^6$ points in the first 1 s of a multiyear experiment and learn virtually nothing from them about the overall dynamics. On the other hand, a few thousand points collected over the entire experiment may be meaningful. Some optimistic sample sizes reported purely on numerical grounds are given by Eckmann and Ruelle [1992], who establish that the Grassberger-Procaccia algorithm will not produce dimensions larger than $2 \log_{10}(N)$, or 7 for the GSL data set, and suggest that if the estimated dimension is not significantly smaller than this number, it should be rejected. Abarbanel et al. [1993] suggest that this implies a minimum sample size of $10^{d/2}$, that is, only 100 for the GSL data. However, all this presumes that the correlation dimension algorithm can be successfully applied with a clear-cut scaling region identified (something that was not obvious in our application and in others we have examined).

The effect of small data sets on dimension estimation has been investigated by several authors. Ramsey and Yuan [1990] assessed dimension calculations with small sample sizes with the correlation integral method. They conclude that dimension can be estimated with upward bias for attractors and with downward bias for random noise as the embedding dimension is increased. Havstad and Ehlers [1989] use a variant of the nearest neighbor dimension algorithm with which they underestimate the dimension of the Mackey-Glass equation [Mackey and Glass, 1977] (dimension is 7.5) by about 11% using a sequential series of as few as 200 points. Somorjai [1986] also reports estimating dimension with the nearest neighbor algorithm to within an error of 10% for a very high dimensional system of 20 using only 2000 data points. The nearest neighbor algorithm should be expected to give better results, since it uses adaptive partitions or bins; is assured of spanning the space, unlike fixed partition size methods, which are inefficient; and is known to have a higher convergence rate [Scott, 1992] for probability density estimates as m increases compared to that for histogram type of estimates. It seems to be much more robust in terms of defining a usable (and usually much broader) "scaling" region. Since every reference vector has by construction an adequate number of points in its neighborhood, the nearest neighbor method allows one to successfully probe a much broader range of scales.
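The two sample-size figures quoted in this section, the ceiling of $2 \log_{10}(N)$ on the correlation dimension and the minimum sample size of $10^{d/2}$, follow from simple arithmetic. The sketch below reproduces them; the function names are hypothetical, and N = 3500 is only an assumed round number for a biweekly record of about 144 years, not a value taken from the text.

```python
import math


def correlation_dimension_ceiling(n_points):
    """Largest dimension the correlation algorithm can return: 2 * log10(N)."""
    return 2.0 * math.log10(n_points)


def min_sample_size(d):
    """Minimum sample size implied by that ceiling for dimension d: 10^(d/2)."""
    return 10.0 ** (d / 2.0)


n_assumed = 3500  # assumption: roughly 144 years of biweekly values
print(round(correlation_dimension_ceiling(n_assumed), 2))  # about 7.09
print(min_sample_size(4))                                  # 100.0
```

A dimension estimate of about four sits well below the ceiling of roughly seven, which is the sense in which the sample-size requirement is described as optimistic.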

Once again, we stress that the sample size needed is really determined by how much of the phenomenon or attractor has been explored. It is useful when working with data from natural systems to talk of the dimension of the observed trajectory rather than the system. In our context a large-scale climatic shift (e.g., the ice age) may very well expose a rather different dynamics of the GSL. In a sense this is an argument for stationarity. More strongly it is an argument for representativeness. A pencil may balance quite well on its flat edge for the observational period. A slight draft that eventually knocks it over exposes that state as an unstable, rather than a stable, point of the dynamics.

An apparently serious objection to correlation dimension estimates was raised by Osborne and Provenzale [1989], who showed that a finite correlation dimension could be estimated for stochastic systems with power law spectra. If the power spectrum P(f) of the process scales with frequency f, that is, $P(f) = 1/f^{\alpha}$, then the estimated correlation dimension will be $2/(\alpha - 1)$. For the GSL data, $\alpha$ was estimated as 1.25, suggesting d = 8, obviously quite different from our estimates. Moreover, while the plot of log [P(f)] versus log (f) looks quite like a straight line over a broad scaling region in f, the power law fit was inappropriate, given the high power at a number of frequency bands. We wonder how often such power law fits are accepted as descriptors for the overall behavior, when they may describe the base behavior reasonably well but completely miss the underlying dynamics that may be represented by a few scattered spectral peaks or elevated bands. The autocorrelation function does not show evidence of power scaling. Theiler [1991] has addressed the Osborne and Provenzale [1989] finding by pointing out that it results from including pairs in calculating C(r) that are highly correlated. His suggestion is to consider only pairs that are sufficiently (a decorrelation time) removed from each other in time. We incorporated this suggestion in investigating the deletion of a number of nearest neighbors of the point while forming C(r) in the correlation integral method and by choosing a minimum k = 20 in the nearest neighbor method. Further discussion of related topics may be found in work by Triantafyllou et al. [1994].
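Both points in this paragraph lend themselves to a small illustration. The Python sketch below evaluates the power law relation $2/(\alpha - 1)$ and computes a correlation sum in which pairs closer than a chosen number of time steps are skipped. The function names and the `window` parameter are hypothetical, the pair count is brute force, and the time-window form shown here is one reading of the suggestion to use only pairs separated by a decorrelation time; it is not the neighbor-deletion variant applied to the GSL data.

```python
import math


def power_law_dimension(alpha):
    """Correlation dimension expected for a 1/f^alpha process: 2 / (alpha - 1)."""
    return 2.0 / (alpha - 1.0)


def correlation_sum(vectors, r, window=0):
    """Fraction of pairs (i, j) with j - i > window that lie closer than r.

    vectors : delay vectors in time order (list of coordinate tuples)
    r       : radius
    window  : pairs separated by at most this many time steps are skipped
              (assumed to be of the order of a decorrelation time)
    """
    n = len(vectors)
    pairs = 0
    close = 0
    for i in range(n):
        for j in range(i + window + 1, n):
            pairs += 1
            if math.dist(vectors[i], vectors[j]) < r:
                close += 1
    return close / pairs


print(power_law_dimension(1.25))  # 8.0
```

With `window=0` every distinct pair is counted, which is the usual correlation integral; increasing the window removes the temporally adjacent pairs whose closeness reflects serial correlation rather than geometry.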

Correlation dimension estimates can have a downward bias if the lag $\tau$ is chosen such that successive delay vectors are highly correlated. The correlation at $\tau = 9$ is about 0.95. The average mutual information for $\tau = 9$ is, however, not much greater than for subsequent local minima. Dimension estimates were investigated for the false neighbor method for $6 < \tau < 24$, and for the scaling methods for $8 < \tau < 20$. Results were similar to those obtained from $\tau = 9$. For forecasting the GSL we found [Lall et al., 1996] that $\tau = 9$ gave the best results in terms of cross validated squared error of prediction. This lag may hence provide a useful embedding.
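The correlation between a series and its lagged copy, used above to judge a candidate lag, can be computed as in the following sketch. The function name is hypothetical and the normalization by the full-series variance is an assumed convention; the average mutual information mentioned in the text is not shown.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


# Synthetic check: a sine wave with a period of 40 samples.
wave = [math.sin(2.0 * math.pi * t / 40.0) for t in range(400)]
for lag in (0, 10, 20):
    print(lag, round(autocorrelation(wave, lag), 3))
```

For the sine wave the value is near zero at a quarter period and strongly negative at a half period, which illustrates why a lag at which the correlation is still about 0.95 yields delay coordinates that are far from independent.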

The false neighbor test was applied to time series v(t) formed by differencing the volume time series at various intervals, for example, v(t) = x(t) - x(t - 1), v(t) = x(t) - x(t - 9), v(t) = x(t) - x(t - 24). Each of the series v(t) formed in this manner was then analyzed as if it were the observed series. Such differencing is aimed at removing or ameliorating the effects of colored noise. As a result, the percentages of false neighbors changed for m = 1, 2, and 3, but for $m \ge 4$, they all stay close to zero, confirming the suggestion of determinism and an embedding dimension of four.
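The differenced series are straightforward to form. A minimal sketch, with a hypothetical function name and a made-up toy series in place of the lake volumes:

```python
def difference(x, s):
    """Return v(t) = x(t) - x(t - s) for t = s, ..., len(x) - 1."""
    return [x[t] - x[t - s] for t in range(s, len(x))]


toy = [1.0, 4.0, 9.0, 16.0, 25.0]  # illustrative values only
print(difference(toy, 1))  # [3.0, 5.0, 7.0, 9.0]
print(difference(toy, 2))  # [8.0, 12.0, 16.0]
```

Each differenced series is shorter than the original by s values and would be passed to the false neighbor test in place of x(t).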

The relative simplicity, robustness, and directness of the false neighbor method make it, in our view, the method to use. If one is disposed to investigate scaling-based dimensions, the nearest neighbor method is superior in our investigations.

6. Discussion

The analyses here suggest that the dimension of the GSL dynamics is of the order of 4. The correlation and nearest neighbor dimensions measure how densely the trajectories populate phase space, while the false neighbor dimension provides the number of coordinates needed to unfold the attractor without loss of information. Given the vagaries of climate, can we really believe that four hidden variables are all it takes to generate all the complexity we observe? Is the dynamical system analyzed the low-frequency climate system, or perhaps something more closely related to just the GSL? How does this analysis help one conceptualize hydroclimatic systems? Alas, rigorous answers to such questions are not readily forthcoming.

We do find that the evidence is suggestive of determinism. The false neighbor test and success in forecasting reported by Lall et al. [1996] form the core of our belief, with the evidence from the nearest neighbor test for support. The fact that the correlation dimension estimate is similar is helpful. The estimated dimension of four suggests that a nonlinear forecasting model with four coordinates should be successful. In this sense the dimension estimate is similar to choosing the order of an AR model using, say, the Akaike information criterion. No insight into what these four variables should be in terms of physical variables is available from time series analysis. For such insights one would have to also analyze the behavior of an appropriate hydroclimate model.

We draw on the discussion by Lorenz [1991] to direct thought on the issues raised above. He points out that the atmosphere is a weakly coupled system; local convection in London is only weakly related to local atmospheric flow over Washington, D.C. The coupling is provided by a general weather pattern, which may in turn be loosely coupled to a hemispherical flow. Depending on which variable of the system is analyzed, one may estimate remarkably different dimensions. He views a suitable variable for estimating the system's dimension as one that is strongly coupled to most of the variables that determine the long-term evolution of the system. If a variable is selected that is strongly coupled to only a few variables, its estimated dimension will be considerably smaller. We believe that the GSL volume is such a variable. Now, if we are interested primarily in the GSL (or other similar hydrological objects), the confirmation through dimension analysis that its dynamics are indeed low dimensional is encouraging. In this context the effect of the myriads of other variables is subsumed as dynamical (rather than observational) noise with amplitude small relative to the "signal." Given the low-frequency character of the lake's fluctuations, and its obvious relation to regional drought, one cautiously hopes for explanation, understanding, and improved predictability in the appropriate framework.

It is important to note that the GSL data represent "natural" spatial averaging of hydroclimate over a large watershed. Long-time, local measurements of pressure in Central Europe [Fraedrich, 1986], as well as other meteorological fields, do not show such clear-cut low-dimensional behavior. These data sets reflect the complex meteorological processes at small time and space scales. The GSL data average out such dynamical aspects and present us with a baseline of relatively predictable variations in a climate-driven system at timescales that are operationally useful. Use of the GSL time series to establish a framework for "natural" climate variations seems very promising.

It will be interesting to formally investigate the attributes of various hydrologic processes and classify them by dimension, spatial scales, and frequency attributes in an attempt to fathom where there is hope for deterministic modeling, and which processes are high dimensional and best treated as stochastic. Finally, we emphasize that dimensions are static or asymptotic measures of the behavior of a dynamical system. Predictability in time is best measured through Lyapunov exponents or rate of growth of forecast error. Dimensions as well as these attributes are likely to be state dependent; the complexity of the system's behavior may be quite different depending on whether we examine high- or low-flow regimes. Investigations and reports into these issues are forthcoming.

Acknowledgments. The work reported here was supported in part by the NSF under grant EAR-9205727 and by the USGS under grant 1434-92-G-226. The comments of two anonymous reviewers were responsible for a significant improvement in the manuscript. The help of Alin-Andrei Carsteanu to investigate scaling properties of the data is appreciated.

References

Abarbanel, H. D. I., and M. B. Kennel, Local false nearest neighbors and dynamical dimensions from observed chaotic data, Phys. Rev., E47, 3057-3068, 1992.

Abarbanel, H. D. I., R. Brown, J. J. Sidorowich, and L. S. Tsimring, The analysis of observed chaotic data in physical systems, Rev. Mod. Phys., 65(N4), 1331-1392, 1993.

Badii, R., and G. Broggi, Measurement of the dimension spectrum f(α): Fixed-mass approach, Phys. Lett. A, 131, 339-343, 1988.

Casdagli, M., S. Eubank, J. D. Farmer, and J. Gibson, State space reconstruction in the presence of noise, Physica D, 51, 52-98, 1991.

Cressie, N., Statistics for spatial data, in Wiley Series in Probability and Mathematical Statistics, 900 pp., John Wiley, New York, 1991.

Eckmann, J.-P., and D. Ruelle, Fundamental limitations for estimating dimensions and Lyapunov exponents in dynamical systems, Physica D, 56, 185-187, 1992.

Essex, C., T. Lookman, and M. A. H. Nerenberg, The climate attractor over short timescales, Nature, 326, 64-66, 1987.

Fraedrich, K., Estimating the dimensions of weather and climate attractors, J. Atmos. Sci., 43(5), 419-432, 1986.

Fraser, A. M., and H. L. Swinney, Independent coordinates for strange attractors from mutual information, Phys. Rev. A, 33(2), 1134-1140, 1986.

Ghilardi, P., and R. Rosso, Comment on "Chaos in rainfall" by I. Rodriguez-Iturbe et al., Water Resour. Res., 26(8), 1837-1839, 1990.

Gibson, J. F., J. D. Farmer, M. Casdagli, and S. Eubank, An analytic approach to practical state space reconstruction, Rep. 92-04-021, Santa Fe Inst., Santa Fe, N.M., 1992.

Graf, K. E., and T. Elbert, Dimensional analysis of the waking EEG, in Chaos in Brain Function, edited by E. Basar, pp. 135-152, Springer-Verlag, New York, 1990.

Grassberger, P., Do climatic attractors exist?, Nature, 323, 609-612, 1986.

Grassberger, P., and I. Procaccia, Measuring the strangeness of strange attractors, Physica, 9D, 189-208, 1983.

Guckenheimer, J., and G. Buzyna, Dimension measurements for geostrophic turbulence, Phys. Rev. Lett., 51, 1438-1441, 1983.

Havstad, J. W., and C. L. Ehlers, Attractor dimension of nonstationary dynamical systems from small data sets, Phys. Rev., 39(2), 845-853, 1989.

Holzfuss, J., and G. Mayer-Kress, An approach to error-estimation in the application of dimension algorithms, in Dimensions and Entropies in Chaotic Systems, edited by G. Mayer-Kress, pp. 114-147, Springer-Verlag, New York, 1986.

Keppenne, C. L., and C. Nicolis, Global properties and local structure of the weather attractor over western Europe, J. Atmos. Sci., 46, 2356-2370, 1989.

Kurths, J., and H. Herzel, An attractor in a solar time series, Physica, 25D, 165-172, 1987.

Lall, U., and M. Mann, The Great Salt Lake: A barometer of low-frequency climatic variability, Water Resour. Res., 31(10), 2503-2515, 1995.

Lall, U., T. Sangoyomi, and H. D. I. Abarbanel, Nonlinear dynamics of the Great Salt Lake: Nonparametric short term forecasting, Water Resour. Res., in press, 1996.

Lorenz, E. N., Nondeterministic theories of climatic change, Quart. Res., 6, 495-507, 1976.

Lorenz, E. N., Dimension of weather and climate attractors, Nature, 353, 241-244, 1991.

Maasch, K. A., Calculating climate attractor dimension from δ18O records by the Grassberger-Procaccia algorithm, Clim. Dyn., 4, 45-55, 1989.

Mann, M., U. Lall, and B. Saltzman, Decadal-to-centennial-scale climate variability: Insights into the rise and fall of the Great Salt Lake, Geophys. Res. Lett., 22(8), 937-940, 1995.

Nicolis, C., and G. Nicolis, Is there a climatic attractor?, Nature, 311, 529-532, 1984.

Osborne, A. R., and A. Provenzale, Finite correlation dimension for stochastic systems with power-law spectra, Physica D, 35, 357-381, 1989.

Packard, N. H., J. P. Crutchfield, J. D. Farmer, and R. S. Shaw, Geometry from a time series, Phys. Rev. Lett., 45(9), 712-716, 1980.

Pettis, K. W., T. A. Bailey, A. K. Jain, and R. C. Dubes, An intrinsic dimensionality estimator from near-neighbor information, IEEE Trans. Pattern Anal. Mach. Intell., 1(1), 25-37, 1979.

Procaccia, I., Complex or just complicated, Nature, 333, 498-499, 1988.

Ramsey, J. B., and H.-J. Yuan, The statistical properties of dimension calculations using small data sets, Nonlinearity, 3, 155-176, 1990.

Rodriguez-Iturbe, I., B. Febres de Power, and M. B. Sharifi, Chaos in rainfall, Water Resour. Res., 25(7), 1667-1684, 1989.

Rodriguez-Iturbe, I., B. Febres de Power, M. B. Sharifi, and K. P. Georgakakos, Reply, Water Resour. Res., 26(8), 1841-1842, 1990.

Ruelle, D., Deterministic chaos: The science and the fiction, Proc. R. Soc. London A, 427, 241-248, 1990.

Ruelle, D., Where can one hope to profitably apply the ideas of chaos?, Phys. Today, 47, 24-30, 1994.

Sangoyomi, T. B., Climatic variability and dynamics of Great Salt Lake hydrology, Ph.D. dissertation, Utah State Univ., Logan, 1993.

Schroeder, M., Fractals, in Chaos and Power Laws: Minutes From an Infinite Paradise, 429 pp., W. H. Freeman, New York, 1991.

Scott, D. W., Multivariate density estimation: Theory, practice and visualization, in Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics Section, 317 pp., John Wiley, New York, 1992.

Sharifi, M. B., K. P. Georgakakos, and I. Rodriguez-Iturbe, Evidence of deterministic chaos in the pulse of storm rainfall, J. Atmos. Sci., 47(7), 888-893, 1990.

Silverman, B. W., Density estimation for statistics and data analysis, in Monographs On Statistics And Applied Probability, Chapman and Hall, New York, 1986.

Somorjai, R. L., Methods for estimating the intrinsic dimensionality of high-dimensional point sets, in Dimensions and Entropies in Chaotic Systems, edited by G. Mayer-Kress, pp. 137-147, Springer-Verlag, New York, 1986.

Takens, F., Detecting strange attractors in turbulence, in Dynamical Systems and Turbulence, edited by D. A. Rand and L.-S. Young, pp. 366-381, Springer-Verlag, New York, 1981.

Termonia, Y., and Z. Alexandrowicz, Fractal dimension of strange attractors from radius versus, Phys. Rev. Lett., 51, 1265-1268, 1983.

Theiler, J., Spurious dimension from correlation algorithms applied to limited time series data, Phys. Rev. A., 34, 2427-2432, 1986.

Theiler, J., Some comments on the correlation dimension of $1/f^{\alpha}$ noise, Phys. Lett. A, 155, 480, 1991.

Triantafyllou, G. N., R. Picard, and A. A. Tsonis, Exploiting geometric signatures to accurately determine properties of attractors, Appl. Math. Lett., 7(N6), 19-24, 1994.

Tsonis, A. A., and J. B. Elsner, The weather attractor over very short time scales, Nature, 333, 545-547, 1988.

Tsonis, A. A., G. N. Triantafyllou, J. B. Elsner, J. J. Holdzkom, and A. D. Kirwan, An investigation of the ability of nonlinear methods to infer dynamics from observables, Bull. Am. Meteorol. Soc., 75(9), 1623-1633, 1994.


Van De Water, W., and P. Schram, Generalized dimensions from near-neighbor information, Phys. Rev. A, 37, 3118-3125, 1988.

Wolf, A., Quantifying chaos with Lyapunov exponents, in Nonlinear Science: Theory and Applications, edited by A. Holden, Manchester Univ. Press, Manchester, Eng., 1986.

U. Lall, Utah Water Research Laboratory, College of Civil and Environmental Engineering, Utah State University, Logan, UT 84322-8200. (e-mail: [email protected])

T. B. Sangoyomi, Hydrosphere Resource Consultants, Inc., 1002 Walnut, Suite 200, Boulder, CO 80302. (e-mail: [email protected])

H. D. I. Abarbanel, Department of Physics and Marine Physical Laboratory, Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA 92093-0402.

(Received December 1, 1994; revised September 15, 1995; accepted September 18, 1995.)