

Error in a USGS 30-meter digital elevation model and its impact on terrain modeling

K.W. Holmes (a,*), O.A. Chadwick (a), P.C. Kyriakidis (b,1)

(a) Department of Geography, University of California, Santa Barbara, CA, 93106, USA
(b) Department of Geological and Environmental Sciences, Stanford University, Stanford, CA, 94305-2115, USA

Received 15 October 1999; received in revised form 21 February 2000; accepted 2 March 2000

Abstract

Calculations based on US Geological Survey (USGS) digital elevation models (DEMs) inherit any errors associated with that particular representation of topography. We investigated the potential impact of error in a USGS 30 m DEM on terrain analysis over 27 km². The difference in elevation between 2652 differential Global Positioning Systems measurements and USGS 30-m DEM derived elevations provided the comparative error dataset. Analysis of this comparative error data suggested that although the global (average) error is small, local error values can be large, and also spatially correlated. Stochastic conditional simulation was used to generate multiple realizations of the DEM error surface that reproduce the error measurements at their original locations and sample statistics such as the histogram and semivariogram model. The differences between these alternative error surfaces provide a model of uncertainty for the unknown DEM error spatial distribution. These DEM errors had a significant impact on terrain attributes which compound elevation values of many grid cells (e.g. slope, wetness index, etc.). A case study using terrain modeling demonstrates that the result of error propagation is most dramatic in valley bottoms and along streamlines. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Digital terrain models; Uncertainty; Spatial distribution; Digital simulation; Geostatistics; Global positioning systems

Journal of Hydrology 233 (2000) 154–173
www.elsevier.com/locate/jhydrol
0022-1694/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0022-1694(00)00229-8

* Corresponding author. Tel.: +1-805-893-8525; fax: +1-805-893-3146. E-mail address: [email protected] (K.W. Holmes).
1 Present address: Earth Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

1. Introduction

Topography controls fluxes of energy, nutrient distribution, mass movement, and water dispersion in many landscapes. As a result, topographic maps and their digital analogues are often useful for studying spatially distributed landscape processes. Digital elevation models (DEMs) have been used for mapping and environmental spatial analysis in landslide prediction and characterization (e.g. Dikau et al., 1996), climate/meteorological applications (e.g. Thornton et al., 1997), route optimization (e.g. Ehlschlaeger and Shortridge, 1996), integrated studies of hillslope processes (e.g. McDermid and Franklin, 1995), landform analysis (e.g. Weibel and Heller, 1990), image registration (e.g. Giles and Franklin, 1998), hydrologic modeling (e.g. Band, 1993), sediment flux modeling (e.g. Mitasova et al., 1996), land use planning (e.g. Mellerowicz et al., 1992) and soil-landscape modeling (e.g. McKenzie and Austin, 1993). Summaries of DEM applications can be found in Moore et al. (1990), Weibel and Heller (1990), and Milne and Sear (1997). For terrain analyses, researchers either produce their own DEMs or use publicly available ones. In the first case, they know exactly what techniques were used in collecting and processing the data, and can calculate the accuracy both globally and spatially. However, DEM production can be very difficult and time consuming, and is typically not the goal of most researchers who plan to do terrain analysis, although it may prove necessary if high-resolution data are required (Dietrich et al., 1993). If pre-processed DEMs are used, there is usually little information available to data users about data collection, processing, or error distribution.

In the United States, the most commonly used DEMs are those produced and distributed by the US Geological Survey. These digital datasets range from 3-arcsecond to 30-m cell resolution (US Geological Survey, 1998). The only error information provided consists of global estimates of root mean square error (RMSE), which give the user no information about data accuracy at specific locations within the DEM. In addition, terrain attributes such as slope or aspect are derived from elevation grids, and tend to compound systematic errors caused by the resolution of the data and the methods by which the DEM was produced (Bolstad and Stowe, 1994; McKenzie et al., 2000). Environmental model predictions based on a combination of DEM-derived surfaces contain an uncertainty component as a result of the unknown accuracy of the original elevation data. It is important to quantify this uncertainty inherited from the DEM, and to investigate its impact on the interpretation and utilization of model predictions.

The obvious approach to assess DEM error is to use higher accuracy field-surveyed data to evaluate the DEM elevation values (Adkins and Merry, 1994; Bolstad and Stowe, 1994). In the absence of field-collected data (the usual case), researchers have devised other ways of characterizing uncertainty, including using limited elevation values derived from higher resolution DEMs as "ground-truth" data (Shortridge, 1997; Kyriakidis et al., 1999), exploring fractal dimensions of DEMs to reveal production artifacts or anomalies (Polidori et al., 1991), or assigning distributions of error for each grid cell based on the reported global error measurements (Fisher, 1993; Ehlschlaeger and Shortridge, 1996). The error measured at discrete points is used to supply estimates of error, and of error uncertainty, in locations where error was not measured directly. The process responsible for inducing errors in a DEM, like many geographic or environmental processes, is not sufficiently well understood to permit a deterministic analysis of error. A geostatistical approach to error characterization, based on a probabilistic model that recognizes these inevitable uncertainties, is more appropriate.

Geostatistics allows the use of effective estimation procedures, gauges the accuracy of the estimates, and assigns confidence intervals to estimates by treating the variable of interest as random. This does not imply the variable itself or the deterministic process is random, but rather reflects our uncertainty about the process which generates values at unsampled locations (Journel, 1996). Taken a step further, geostatistics can be used to model the uncertainty of unknown values at a set of locations by generating alternative images, or realizations, which reproduce the original data at their measurement locations and spatial patterns or statistics considered important for the specific case. This process is called stochastic simulation (Goovaerts, 1997). These alternative realizations, which span the range of possible attribute values given the model of uncertainty at any location, can be used as alternative inputs for environmental modeling or building scenarios, providing a distribution of possible results (Heuvelink, 1998).

While kriging alone provides the best, in the least squares sense, interpolated (smooth) surface from a set of measurements, simulation allows the generation of a series of equiprobable realistic surfaces or volumes, each having the correct spatial structure. A simple Monte Carlo method would simulate values for each grid cell independent of the rest of the surface. Conditional simulation combines the actual data values with the spatial correlation information from the semivariogram to generate simulated outcomes at each grid cell. The average value in each grid cell over a very large number of realizations is virtually identical to the simple kriging estimate (Burrough and McDonnell, 1998). Because it provides alternative plausible representations of an attribute's possible spatial distribution, simulation is commonly used in risk analysis, medical imaging, transportation analysis, mineral exploration, hydrogeological modeling, and environmental sciences.

In this paper, we evaluate the magnitude and spatial distribution of error in USGS 30-m elevation values in relation to high accuracy Global Positioning Systems (GPS) data in order to assess the potential effect of DEM error on environmental modeling applications. The selected method is a simplified version of the geostatistical approach developed by Kyriakidis et al. (1999), who used a sparse, high quality elevation dataset (from a USGS 30-m DEM) to evaluate the quality of a much more extensive, continuous elevation dataset of unknown quality (USGS 3-arcsecond DEM) using conditional Gaussian simulation.

The DEM accuracy assessment is carried out as follows: (1) collection and assembly of the DEM error dataset; (2) exploratory analysis of the data; (3) declustering, normalization, and structural analysis (variography); and (4) generation of 50 equiprobable elevation surfaces using conditional sequential Gaussian simulation (discussed below). The effects of DEM error, as revealed by the differences among the 50 realizations, on terrain attribute calculation and on a simple map algebra model of the likelihood of hillslope failure were explored as examples of the impact of USGS 30-m DEM error on digital terrain modeling.

2. USGS DEM accuracy assessment

2.1. Field area description

Sedgwick Natural Reserve, located in the San Rafael Mountains at the southernmost end of the California Coastal Range, includes landforms ranging from extensive floodplains along Figueroa Creek, to low relief foothills, and high relief, long mountain slopes to the northeast (Fig. 1). Sedgwick's diverse terrain provides a test of the ability of the USGS DEMs to accurately represent topographic characteristics. The field area is approximately 27 km² (2700 hectares, 6670 acres), spans about 1000 vertical meters, and includes several different geologic formations, which support different types of landforms such as low rolling hills, and steep mountain slopes.

Fig. 1. Map of the study area: University of California Sedgwick Natural Reserve, Santa Barbara County, California.

2.2. Elevation data

USGS 30-m DEM: The study area is located in the Los Olivos USGS 7.5-min Quadrangle. The corresponding DEM is a level 2 data set, indicating the data were acquired by contour digitizing, either photogrammetrically or from existing maps. These data sets have been processed or smoothed for consistency and identifiable errors have been removed. A root mean square error (RMSE) of up to one half of a contour level is the maximum error allowed. At least 28 test points (20 within the quadrangle, 8 along the edges) from the source data located on contour lines, bench marks, or spot elevations were compared with the DEM data by the USGS to calculate an RMSE (US Geological Survey, 1987). For the Los Olivos DEM, RMSE is reported as 6 m for horizontal coordinates, and 2 m for height, relative to the file datum, which has a measured error relative to the absolute datum of 3 m (x, y, z). In the worst case scenario, this means the data could have on average as much as ±(9, 9, 5) m of error. The spatial distribution of this uncertainty is not provided to users.
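The worst-case figure quoted above follows from adding the two error components axis by axis. The short sketch below only reproduces that arithmetic; the dictionary layout and function name are illustrative, and linear addition is the assumption implied by the phrase "worst case scenario".

```python
# RMSE values quoted in the text for the Los Olivos DEM (metres).
rmse_relative = {"x": 6.0, "y": 6.0, "z": 2.0}  # relative to the file datum
datum_error = {"x": 3.0, "y": 3.0, "z": 3.0}    # file datum vs. absolute datum


def worst_case_error(relative, datum):
    """Add the two components linearly, i.e. assume both act in the same direction."""
    return {axis: relative[axis] + datum[axis] for axis in relative}


worst = worst_case_error(rmse_relative, datum_error)
```

The result is a single value per axis for the whole quadrangle, which is why it says nothing about where within the DEM the error is large or small.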

GPS elevation data: We collected GPS positions with a vertical accuracy better than 25 cm over 0.4% of the study area by taking the coordinates of the grid cell centers of the 30 m data, and choosing 200 sites using a random number generator. In order to compare GPS horizontal (x,y) measurements to the coordinates reported by the USGS, we chose 60 additional sample sites (road intersections, rock outcrops, etc.) identifiable within a 2 m radius on the USGS 1-m digital orthophotoquad (DOQ). The GPS equipment used was a Trimble 4400 total station with an additional data collector to allow both the rover and base receivers to log raw GPS data. Trimble GPSurvey software was used to post-process the data and calculate the associated accuracy. ARC/INFO (ESRI, 1996) was used for producing maps and overlays, and S-PLUS software (Statistical Sciences, 1995) was used for statistical analyses of the GPS data.

Fig. 2. (a) GPS reference network. (b) Radial survey design. The background image is the USGS 30-m DEM, shown in grey scale. Black is low elevation, white is high elevation.

We established a horizontal and vertical static reference network in Sedgwick Reserve using three GPS receivers over five controls: three National Geodetic Survey (NGS) monuments, 1st order horizontal control, 4th order vertical control, one with 1st order differentially leveled vertical control; and two Santa Barbara County monuments, 1st order horizontal control, 1st order differentially leveled vertical control (Office of the County Surveyor, 1993; National Geodetic Survey, 1996). Each baseline was occupied for more than one hour, some as many as four times. The reference network (Fig. 2a) was post-processed with high accuracy results, and the remaining points were then processed within the same network (Table 1).

We adopted a radial survey design (Fig. 2b) to record the locations and elevations of the 252 randomly selected sites. This approach provided optimal use of the available horizontal and vertical controls, equipment, and time resources. The base station was set up over an NGS 1st order control approximately 8 km from the field area. The base logged raw GPS data continuously, while the second receiver was carried to each sample location for 8–20 min, depending on the number and geometric configuration of satellites visible during data collection. Numerous established survey monuments, points from the reference network, and benchmarks were included in the survey to adjust the GPS survey network during post processing. The independent monuments also provided checks on the accuracy of the GPS results.

In addition to the 252 distributed GPS points, ten areas located in different types of topography were intensively surveyed, yielding an additional 2400 GPS measurements for the DEM accuracy assessment (Fig. 3). For these surveys, a real-time kinematic survey method was used, which provides accuracy of about 1-cm horizontal, 2-cm vertical, relative to the base station coordinates (Trimble Navigation Ltd, 1996). The previously surveyed GPS points (Table 1) were used for base station locations.

2.3. Exploratory data analysis

Several interpolation techniques were tested for extracting USGS elevation values at the GPS measurement locations (Holmes, 1999a). All of these datasets had a correlation with the GPS dataset of greater than 0.9991. For the purposes of this project, the "nearest neighbor" dataset, simply using the nearest gridded point value, was chosen for use in uncertainty modeling because it was the simplest method, and there was no evidence that more complex interpolation methods improved the data quality.
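Nearest neighbor extraction, as described above, amounts to rounding a point's coordinates to the closest cell center. The following is a minimal sketch under stated assumptions: the grid is a list of rows, (x0, y0) is the center of cell grid[0][0], and rows advance with increasing y. The function name and this layout are illustrative and are not taken from the paper or from any GIS package.

```python
def nearest_cell_value(grid, x0, y0, cell_size, x, y):
    """Return the value of the grid cell whose centre is nearest to (x, y).

    grid      -- list of rows; grid[r][c] holds the elevation of one cell
    x0, y0    -- coordinates of the centre of cell grid[0][0] (assumed layout)
    cell_size -- grid spacing in the same units as x and y
    """
    col = round((x - x0) / cell_size)
    row = round((y - y0) / cell_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("point lies outside the grid")
    return grid[row][col]


# Hypothetical 2 x 2 DEM with 30 m cells, for illustration only.
dem = [[100, 110],
       [120, 130]]
z_a = nearest_cell_value(dem, 0, 0, 30, 28, 4)    # closest to the centre at (30, 0)
z_b = nearest_cell_value(dem, 0, 0, 30, 10, 20)   # closest to the centre at (0, 30)
```

No averaging of neighboring cells takes place, so the extracted value is always one of the original DEM values.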

Fig. 4 shows the USGS DEM and GPS data sets

with their histograms. The USGS dataset histogram is

skewed to the left, indicating that a high percentage of

cells have low elevations. The random GPS dataset

(252 points) shows the same histogram characteristics

as the USGS DEM, though it is noisier. The histogram

of the large GPS dataset (2652 points) shows strong

bimodality, which is a result of the high degree of

clustering of elevation measurements in the 10 select

locations.

Table 1
Sedgwick Natural Reserve GPS survey results

Survey (includes all points)   No. of points   Maximum error (m)   Minimum error (m)   Mean error (m)
Field area                     252
  Horizontal (xy)                              0.073               0.003               0.027
  Vertical (z)                                 0.206               0.018               0.057
  3-D (xyz)                                    0.218               0.018               0.063
Reference network              10
  Horizontal (xy)                              0.007               0.002               0.004
  Vertical (z)                                 0.028               0.018               0.021
  3-D (xyz)                                    0.028               0.018               0.021

Fig. 3. GPS measurement locations: 252 randomly located points (upper right) and 2400 highly clustered points (white ×'s).

Fig. 4. Maps and histograms of (a,b) the USGS 30-m DEM; (c,d) the randomly located 252 point GPS dataset; and (e,f) the clustered 2652 point GPS dataset.

Fig. 5. (a) Histogram of error data (GPS elevation - USGS DEM elevation); (b) location map of error data; (c) histogram of declustered error data; (d) map of the declustering weights used to produce the histogram in (c); (e) histogram of error data after a normal scores transform. Color images are available from Holmes, 1999b (web page).

The DEM error was calculated by subtracting the nearest neighbor USGS elevations from the GPS measured elevations at each GPS point location. The spatial distribution of these error values and their histogram are plotted in Fig. 5a and b. The histogram shows a roughly normal distribution, with a mean of -0.10 m, median of -0.46 m, and a standard deviation of 4.11 m, indicating that on average over the study area the USGS DEM overestimates the elevation by only 10 cm. However, the maximum (18.15 m) and minimum (-12.72 m) error values show that there are significant differences in some areas.
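The error definition used here (GPS elevation minus DEM elevation) and the summary statistics reported for it can be sketched in a few lines. The sample elevations below are invented purely to exercise the function; they are not data from the study.

```python
import math
import statistics


def error_summary(gps_z, dem_z):
    """Summarise DEM error, defined as GPS elevation minus DEM elevation."""
    errors = [g - d for g, d in zip(gps_z, dem_z)]
    return {
        "mean": statistics.mean(errors),      # negative mean: DEM too high on average
        "median": statistics.median(errors),
        "stdev": statistics.stdev(errors),    # sample standard deviation
        "min": min(errors),
        "max": max(errors),
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
    }


# Hypothetical elevations in metres, for illustration only.
summary = error_summary([100.0, 205.0, 298.0, 400.0],
                        [101.0, 200.0, 300.0, 400.0])
```

A mean near zero together with a wide min/max range is the pattern the text describes: small global bias, large local departures.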

Fig. 6. Scatterplots of USGS DEM error (absolute value) vs. GPS measured elevation, roughness or relief, slope, and aspect.

As shown in Fig. 3, the data points were highly clustered. Because simple kriging is used within the simulation routine to estimate the conditional distribution of the error at any location given the surrounding error data, dense clusters of points can greatly affect estimations in areas with sparse data due to a misspecification of the regional mean. A declustering algorithm was used to weight the data according to how much influence each data point should have on the regional histogram (Goovaerts, 1997; Deutsch and Journel, 1998). Because there was no evident preferential clustering of data in high- or low-valued error areas, the cell size used in the declustering algorithm was chosen such that a single isolated datum was located within each cell (on average). In this way, isolated samples, which are more representative of the area around them, received more weight in the calculation of the (declustered) error histogram than those samples located in clusters. Fig. 5a–c show the histograms of the raw error data, the location map of error, and the histogram of declustered error data. Fig. 5d is a map of the declustering weights, which shows that low weights are assigned to clustered data, and higher weights assigned to isolated data.
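The idea behind cell declustering can be illustrated with a small sketch: overlay a grid of cells, count the points in each occupied cell, and give every point a weight inversely proportional to that count. The function name, the normalization (weights sum to one), and the sample coordinates are assumptions made for illustration; the paper's own calculations were run in GSLIB, whose options are not reproduced here.

```python
import math
from collections import Counter


def cell_declustering_weights(points, cell_size):
    """Weight each point by 1 / (points in its cell * number of occupied cells)."""
    cells = [(math.floor(x / cell_size), math.floor(y / cell_size))
             for x, y in points]
    counts = Counter(cells)
    n_occupied = len(counts)
    return [1.0 / (counts[cell] * n_occupied) for cell in cells]


# Hypothetical layout: three clustered points and one isolated point.
pts = [(5, 5), (6, 6), (7, 5), (150, 150)]
vals = [2.0, 2.0, 2.0, 10.0]
w = cell_declustering_weights(pts, cell_size=100)

naive_mean = sum(vals) / len(vals)
declustered_mean = sum(wi * vi for wi, vi in zip(w, vals))
```

In this toy case the isolated point carries half of the total weight, so the declustered mean moves away from the value that dominates the cluster.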

Local topographic roughness was calculated as the

standard deviation of elevation measured within a

moving window (a 5-cell or 150 m radius circle) on

the USGS DEM. A high standard deviation indicates

high variability in the local terrain surface. Fig. 6

shows scatter plots of the absolute value of error vs.

elevation, roughness, slope, and aspect. The strongest

correlation is between slope and absolute error, with a

value of 0.325. GPS elevation, roughness, and aspect

show correlations with error of 0.288, 0.300, and

0.125, respectively.
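The roughness measure described above (standard deviation of elevation inside a circular moving window) can be sketched as follows. The choice of the population standard deviation and the handling of cells at the grid edge are assumptions, since the text does not specify them; the small grid is hypothetical.

```python
import statistics


def local_roughness(grid, row, col, radius_cells):
    """Standard deviation of the cells within a circular window around (row, col)."""
    values = []
    for r in range(row - radius_cells, row + radius_cells + 1):
        for c in range(col - radius_cells, col + radius_cells + 1):
            inside_grid = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            inside_circle = (r - row) ** 2 + (c - col) ** 2 <= radius_cells ** 2
            if inside_grid and inside_circle:
                values.append(grid[r][c])
    return statistics.pstdev(values)


# Hypothetical 3 x 3 elevation grid; a radius of 1 cell excludes the corners.
demo = [[9, 1, 9],
        [3, 5, 7],
        [9, 9, 9]]
rough = local_roughness(demo, 1, 1, radius_cells=1)
flat = local_roughness([[4, 4], [4, 4]], 0, 0, radius_cells=1)
```

For the 30-m DEM in the text, a 150 m radius would correspond to radius_cells=5.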

The semivariograms of the USGS dataset (Fig. 7a) show a steady increase in elevation variance with increasing distance. There is a slight difference in semivariograms calculated in north–south and east–west directions, but they have the same behavior at distances shorter than 1200 m. These semivariograms are typical of topographic data because there is no reason for elevation values to reach a sill, unless the shear strength of the geologic material upholding the topography is exceeded, limiting the maximum elevation in an area. Topography has the potential to become more and more diverse over larger distances (i.e. the semivariogram does not reach a sill) due to differing lithologies and surficial processes. The semivariogram of the GPS dataset (Fig. 7b) is much more irregular because of the high degree of clustering of the data points. Some of the dips and spikes in the semivariogram represent a large number of measurements at a fixed distance apart that have either similar elevations (low variance) although there may be variable topography between the two sites, or very different elevations (high variance) because of the high number of measurements at two very different elevations within a certain lag. The semivariogram of error (Fig. 7c) shows spatial autocorrelation, and there is a slight trend toward increasing variance with distance. The north–south and east–west semivariograms show no clear trend with direction. The omnidirectional semivariogram of the normal score transformed error data was modeled with an isotropic semivariogram function for use in the simulation routine.
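An omnidirectional experimental semivariogram of the kind discussed above is half the mean squared difference between all pairs of values whose separation falls in a given lag bin. The sketch below is a brute-force version with equal-width bins; the binning rule and names are illustrative assumptions, and it needs Python 3.8 or later for math.dist.

```python
import math


def experimental_semivariogram(points, values, lag_width, n_lags):
    """Return semivariance per lag bin; bin k covers [k*lag_width, (k+1)*lag_width)."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            k = int(math.dist(points[i], points[j]) // lag_width)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Bins without any pair are reported as None rather than zero.
    return [s / (2 * n) if n else None for s, n in zip(sums, counts)]


# Hypothetical transect with a steady trend, for illustration only.
pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
vals = [0.0, 1.0, 2.0, 3.0]
gamma = experimental_semivariogram(pts, vals, lag_width=1.0, n_lags=3)
```

On this trending toy transect the semivariance keeps growing with lag, which mirrors the unbounded behavior described for the elevation data.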

Fig. 7. Semivariograms of (a) the USGS elevation data, (b) the clustered GPS measured elevations, and (c) the error data, calculated by subtracting the USGS values from the GPS measured values. Note differences in y-scales.

2.4. Data preparation

The weak correlations between calculated DEM error and terrain ruggedness or other basic terrain attributes (Fig. 6) eliminated the need to use both elevation data sets in the simulation routine as was done by Kyriakidis et al. (1999). They incorporated both original datasets in the simulation routine because they found that high DEM error was associated with areas of rough terrain. In this case, the association of error with patterns of elevation or terrain attributes was negligible, allowing us to directly simulate the error surface using only the calculated error data set to produce alternative error realizations. The public domain software package GSLIB (Deutsch and Journel, 1998) was used for all calculations related to simulation.

Simulations model the uncertainty in the attribute spatial distribution based on the data sources available near each point of interest. Simple kriging is used within the simulation routine to establish uncertainty models of error values at every location. Likely data values are simulated at each location by drawing randomly from the possible distribution of error provided by the local conditional cumulative distribution function (ccdf). The easiest way to derive local ccdfs is to assume a model for the entire multivariate distribution of the original random function representing, in this case, the spatial distribution of elevation errors. This model needs to be flexible enough that all of the local ccdfs have the same analytical expression and can be fully specified through a few parameters. By using this method, the problem of determining the local ccdf is reduced to the estimation of these few parameters (such as a local mean and variance). The multivariate Gaussian random function model is the most commonly used parametric approach because the inherent structure of the model makes determining local ccdfs fairly straightforward (Goovaerts, 1997). A normal scores transform, a non-linear, rank preserving transform that remaps any distribution to a normal distribution (see Journel and Huijbregts (1978; p. 476) or Goovaerts (1997; p. 266)) was applied to the declustered dataset (Fig. 5e) to meet the format requirements of Gaussian simulation, which are that the univariate distribution of the error data be standard normal.
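A normal scores transform of weighted data can be sketched as: sort the values, accumulate the (declustering) weights into cumulative probabilities, and map each probability through the inverse standard normal CDF. The mid-step probability rule and the lack of tie handling below are simplifying assumptions, and the function name is illustrative. It relies on statistics.NormalDist from Python 3.8 or later.

```python
from statistics import NormalDist


def normal_scores(values, weights=None):
    """Rank-preserving map of values to standard normal scores."""
    n = len(values)
    if weights is None:
        weights = [1.0] * n           # equal weights if no declustering is applied
    total = sum(weights)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    cumulative = 0.0
    for i in order:
        w = weights[i] / total
        p = cumulative + 0.5 * w      # mid-point of this datum's probability step
        scores[i] = NormalDist().inv_cdf(p)
        cumulative += w
    return scores


# Hypothetical error values in metres, for illustration only.
scores = normal_scores([3.0, -1.0, 10.0])
```

Because only ranks and weights enter the calculation, the back-transform after simulation is a lookup in the opposite direction, from normal score to the original error value.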

Once the data were declustered and transformed to a standard normal distribution, the spatial variability of the data was modeled for inclusion in the simulation routine. Semivariance values are calculated at defined lags to build an experimental semivariogram, but kriging and simulation algorithms require a continuous model of spatial variance, which is supplied in the form of a semivariogram model. The number of lags and lag distance for the semivariogram were chosen experimentally to smooth the data enough to see a definite semivariogram pattern but retain some of the detail (Fig. 8e). The final isotropic normal scores semivariogram model input to the sequential Gaussian simulation is:

$$\gamma(h) = 0.3\,\mathrm{Exp}\!\left(\frac{|h|}{125}\right) + 0.7\,\mathrm{Exp}\!\left(\frac{|h|}{3000}\right)$$

where Exp(·) denotes an exponential semivariogram model, and h denotes the distance between any two locations (lag distance).
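The nested model above can be evaluated numerically once a functional form for Exp(·) is fixed. The sketch below assumes the common "practical range" convention, 1 - exp(-3|h|/a), in which each structure reaches about 95% of its sill at distance a; the text does not state which convention was used, so this form is an assumption.

```python
import math


def exponential_structure(h, a):
    """Unit-sill exponential semivariogram with practical range a (assumed form)."""
    return 1.0 - math.exp(-3.0 * abs(h) / a)


def error_semivariogram(h):
    """Nested model: a short-range (125) and a long-range (3000) structure."""
    return (0.3 * exponential_structure(h, 125.0)
            + 0.7 * exponential_structure(h, 3000.0))


gamma_short = error_semivariogram(125.0)
gamma_long = error_semivariogram(3000.0)
```

The two sill contributions (0.3 and 0.7) sum to one, matching the unit variance of normal score data.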

2.5. Stochastic simulation

The basic steps of sequential Gaussian simulation are as follows (Goovaerts, 1997; Deutsch and Journel, 1998): (1) A random path is defined for visiting each grid node once; (2) At each node, simple kriging is used to determine the parameters of the Gaussian local ccdf (mean and variance), based on the normal score semivariogram model; both the normal scores original data values and previously simulated values within a local neighborhood are considered for simple kriging; (3) A value is drawn from that local ccdf using a Monte Carlo type simulation, and is added to the sample data set; (4) Steps 2 and 3 are repeated at each node along the random path until each node in the grid has been visited, and a corresponding error value has been generated. Once the simulations of the normal score values have been produced, each realization must be back-transformed to the original error distribution. This essentially consists of taking the inverse of the normal scores transform to graphically remap the normal scores distribution to the original error histogram.
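The four steps above can be sketched on a one-dimensional transect in pure Python. This is an illustration only, not the GSLIB routine used in the study: it assumes a single exponential covariance with unit sill and practical range 3000, a zero-mean simple kriging system, a fixed neighbor count, and conditioning data that sit exactly on grid nodes. The back-transform to error units is omitted.

```python
import math
import random


def covariance(h, a=3000.0):
    """Unit-sill exponential covariance (assumed model, practical range a)."""
    return math.exp(-3.0 * abs(h) / a)


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def simple_kriging(x0, known, max_neighbors=8):
    """Step 2: mean and variance of the local Gaussian ccdf at x0."""
    near = sorted(known, key=lambda p: abs(p[0] - x0))[:max_neighbors]
    lhs = [[covariance(xi - xj) for xj, _ in near] for xi, _ in near]
    rhs = [covariance(xi - x0) for xi, _ in near]
    w = solve(lhs, rhs)
    mean = sum(wi * zi for wi, (_, zi) in zip(w, near))
    var = max(0.0, 1.0 - sum(wi * ci for wi, ci in zip(w, rhs)))
    return mean, var


def sgs_1d(nodes, data, seed=0):
    """Return one realization of normal scores at the given node coordinates."""
    rng = random.Random(seed)
    known = list(data)                 # (coordinate, normal score) pairs
    simulated = dict(data)             # data are reproduced at their own nodes
    path = [x for x in nodes if x not in simulated]
    rng.shuffle(path)                  # step 1: random path
    for x in path:                     # step 4: visit every remaining node
        mean, var = simple_kriging(x, known)
        z = rng.gauss(mean, math.sqrt(var))   # step 3: draw from the local ccdf
        simulated[x] = z
        known.append((x, z))           # simulated value conditions later nodes
    return [simulated[x] for x in nodes]


# Hypothetical transect: ten nodes 30 m apart, two conditioning data.
nodes = list(range(0, 300, 30))
data = [(0, -1.0), (270, 1.5)]
realization = sgs_1d(nodes, data, seed=1)
```

Calling sgs_1d with different seeds yields different realizations that all pass through the same conditioning values, which is the property the text relies on.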

Fig. 8. (a,b) Maps, (c,d) histograms, and (e,f, solid lines) semivariograms of two randomly selected simulated error surfaces. The points on semivariograms (e–h) are the semivariogram of the original data. (g) shows the semivariograms of the 50 realizations superimposed on the normal score semivariogram of the raw GPS data.

Fifty error realizations were generated using sequential Gaussian simulation. Figs. 5a and b, and 8a–h show the maps, histograms, and semivariograms of the original data and of two randomly selected realizations. All of the simulated error surfaces reproduce the measured error values at their locations. Each of these 50 realizations has approximately the same histogram as the original data values, as well as the semivariogram model quantifying their spatial correlation. Within individual realizations, deviations from the sample statistics are expected. In Fig. 8, note the similarity between the distributions, summary and spatial statistics of the realizations themselves. The largest differences in the spatial patterns in the two realizations occur in areas far away from where GPS measurements were taken.
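Histogram reproduction follows from the normal scores transform and its inverse. The sketch below shows one simple rank-based way to do both; the function names `normal_scores` and `back_transform` are hypothetical, ties are not handled, and quantile interpolation stands in for the graphical remapping, so it should be read as an approximation of the idea rather than the routine used in the study.

```python
import numpy as np
from statistics import NormalDist

def normal_scores(values):
    """Map data to standard normal quantiles by rank (no tie handling)."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    ranks = np.argsort(np.argsort(values))        # rank 0..n-1 of each value
    probs = (ranks + 0.5) / n                     # plotting positions in (0, 1)
    return np.array([NormalDist().inv_cdf(p) for p in probs])

def back_transform(scores, original_values):
    """Map simulated normal scores back onto the original data histogram."""
    probs = np.array([NormalDist().cdf(s) for s in scores])
    return np.quantile(np.asarray(original_values, dtype=float), probs)
```

Any simulated score is first converted to a cumulative probability and then read off the empirical distribution of the measured errors, so back-transformed values stay within the observed error range.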

The final step in the simulation procedure was to add the simulated error realizations to the original USGS 30 m DEM. This created 50 different versions of the DEM, each of which is theoretically a more accurate representation of topography than the USGS DEM because the 2652 GPS measured elevations and the modeled spatial correlation were accounted for in each new realization. Fig. 9 shows the final results for the two randomly selected realizations, along with the original DEM and elevation histogram. The scatterplots of USGS DEM vs. realizations #12 and #41 show the high degree of correlation among the datasets. These 50 simulated DEMs are used as a Monte Carlo framework for testing the effect of DEM error on terrain attribute calculations and DEM-based terrain modeling below.
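Building the ensemble of alternative DEMs is a cell-by-cell addition, as in this minimal sketch. It assumes the DEM and the error realizations are already NumPy arrays on the same grid; the function name `simulate_dems` and the array layout (realization, row, column) are illustrative choices.

```python
import numpy as np

def simulate_dems(usgs_dem, error_realizations):
    """Add each error surface (GPS minus DEM) to the DEM; returns shape (n, rows, cols)."""
    usgs_dem = np.asarray(usgs_dem, dtype=float)
    errors = np.asarray(error_realizations, dtype=float)
    if errors.shape[1:] != usgs_dem.shape:
        raise ValueError("each error surface must match the DEM grid")
    return usgs_dem[None, :, :] + errors
```

Because error is defined as GPS elevation minus DEM elevation, adding a realization moves the DEM toward the measured elevations.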

3. Error propagation

To explore the effect of DEM error on different types of DEM products, simple Boolean operations and an empirical regression model were run on the original DEM and the 50 simulated realizations discussed above. These alternative numerical models provide a sample of the joint probability distribution of the higher accuracy elevation for each grid cell of the DEM of the study area. The range and spatial patterns of elevation in the simulated surfaces reveal the magnitude and spatial patterns of uncertainty in the original DEM, which are propagated through DEM-derived products. In the following sections, we present examples of the effect of uncertainty in elevation data on the calculation of terrain attributes, and a simple case study of error propagation in hillslope failure analysis.

3.1. Terrain attribute calculations

Terrain attributes (Table 2) were calculated on the 50 simulated DEMs in ARC/INFO, and statistics were calculated for the distribution of values in each pixel. Selected statistics are displayed in Fig. 10. The spatial patterns found in the map of average simulated error values at each pixel (Fig. 10a) are similar to, but more systematic than, those found in the two examples of simulated error surfaces shown in Fig. 8a and b. Although there were no significant correlations between DEM error and terrain attributes, there is a recognizable pattern of spatially distributed error. Error was calculated as the GPS elevations minus the USGS DEM elevations, so negative errors signify that the DEM overestimated elevation, and positive errors mean it underestimated elevation.

The range in attribute values that resulted from adding simulated error surfaces to the DEM (Fig. 10c–f) was calculated by subtracting the minimum of the 50 simulated values from the maximum simulated value in each pixel across the study area. The range in elevation values (Fig. 10c) shows two major patterns, which are evident on most of the terrain attribute grids. First, the range of elevations increases considerably outside the boundary of the study area, where there were no measured error values. The second obvious pattern is the white patches indicating a very low range in elevation values. Ten of these are the areas where GPS data are clustered. The simulation algorithm reproduces each GPS data value at its original position, thus preserving as much of the "true" data as possible in each simulation. The small patches of very low standard deviation are typically where there are numerous GPS data. These higher accuracy data were preserved in the simulation process, resulting in all of the simulations having similar elevation values in those regions.
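The per-pixel statistics can be computed along the realization axis of a stacked array. The following is a minimal sketch assuming the 50 attribute grids are stacked as a NumPy array of shape (realization, row, column); the function name `per_pixel_stats` and the use of the sample standard deviation are illustrative assumptions.

```python
import numpy as np

def per_pixel_stats(stack):
    """Mean, range (max minus min) and sample standard deviation at each pixel."""
    stack = np.asarray(stack, dtype=float)
    return {
        "mean": stack.mean(axis=0),
        "range": stack.max(axis=0) - stack.min(axis=0),
        "std": stack.std(axis=0, ddof=1),
    }
```

A pixel where every realization reproduces the same GPS-conditioned value has a range and standard deviation of zero, which corresponds to the white patches described above.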

Fig. 9. Maps and histograms of the original USGS 30-m DEM, and two randomly selected alternative DEMs (a–f). (g) and (h) show scatterplots of elevation of each realization vs. the original DEM.

The same white patches indicating a low range of values are obvious in the map of slope range (Fig. 10d); however, the rest of the pattern is much more complex. The main valley bottoms and a few extremely steep slopes show low range values, indicating that they were consistently defined on most of the simulations. The majority of the area has such high local variability in slope range that it appears random, a result of the sensitivity of the slope calculation to elevations within a 3 × 3 cell window. The map of range in flow accumulation values is very different (Fig. 10e). The distribution of this attribute on just one DEM is highly skewed toward low values, because it represents the total number of upslope cells that would channel water through a particular grid cell. The majority of the study area consists of short slopes and ridgelines, so high flow accumulation values occur downstream as each digital tributary logarithmically increases the values in the main river valleys. Small changes in the representation of topography can have a large effect on the calculation of flow accumulation downstream. The highest range values occur in the main river valleys. Cells close to the main channel sometimes have low flow routed through them, but slight changes in flow direction (highly sensitive to changes in aspect, curvature and slope) could redirect all of the flow from the adjacent hillslopes through different cells, causing a radical range of possible values in the valley bottoms. The wetness index, or compound topographic index (CTI), is calculated using both slope and upslope contributing area, which is similar to flow accumulation. The map of the range in CTI values (Fig. 10f) on the 50 simulated grids shows the effect of DEM error is highest in the valleys. The range of CTI values is much smaller than that of flow accumulation, which makes it easier to map and see details away from the main river valleys.
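The sensitivity of slope and CTI to neighboring elevations is easier to see in code. The sketch below uses a common 3 × 3 finite-difference slope estimator (often attributed to Horn), assumed here to be comparable to the eight-neighbor method listed in Table 2, and the CTI formula ln(a / tan β). The function names, the edge padding, and the `min_slope_deg` floor that avoids division by zero on flat cells are all illustrative assumptions.

```python
import numpy as np

def slope_degrees(dem, cell=30.0):
    """Slope from the 8 neighbors of each cell; edges use replicated values."""
    z = np.pad(np.asarray(dem, dtype=float), 1, mode="edge")
    a, b, c = z[:-2, :-2], z[:-2, 1:-1], z[:-2, 2:]
    d, f = z[1:-1, :-2], z[1:-1, 2:]
    g, h, i = z[2:, :-2], z[2:, 1:-1], z[2:, 2:]
    dzdx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell)
    dzdy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell)
    return np.degrees(np.arctan(np.hypot(dzdx, dzdy)))

def cti(specific_area, slope_deg, min_slope_deg=0.1):
    """Compound topographic index ln(a / tan(beta)), with a floor on slope."""
    beta = np.radians(np.maximum(slope_deg, min_slope_deg))
    return np.log(np.asarray(specific_area, dtype=float) / np.tan(beta))
```

Every slope value depends on eight surrounding elevations, so independent perturbations of a few meters in neighboring cells change the gradient directly, and CTI inherits that sensitivity through both the slope term and the contributing area.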

We calculated a series of 'deviance' grids by subtracting the USGS DEM-derived value for each pixel from the mean simulated value (Fig. 11). Positive values indicate the average simulated value was higher than the USGS DEM and negative values indicate the DEM was higher than the average simulated value. The elevation deviance grid (Fig. 11a) is identical to the error dataset (Fig. 10a). The deviance of slope (Fig. 11b) shows the DEM consistently underestimates the slope in the river valleys, and exhibits high variation in areas with a large number of GPS measurements. The roughness deviance map (Fig. 11c) indicates that the USGS DEM gently smooths topography by consistently underestimating roughness in the river valleys, although the pattern becomes more complex in areas of higher relief. The deviance maps of CTI and flow accumulation (Fig. 11d and e) show that the DEM overestimates values along flow lines, underestimates on the surrounding floodplain, and overestimates on hillslopes (flow accumulation) or in smaller drainages (CTI). The range in values for flow accumulation is very large, and the mean may not be the most representative statistic to use for this comparison. The plan curvature deviance map (Fig. 11f) shows no obvious patterns, except for deviance at the clustered GPS data sites. Profile curvature (not shown) is very similar to plan curvature. Aspect (not shown) has extreme differences in flat areas, where aspect is difficult to measure at a 30 m resolution.
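A deviance grid is the per-pixel mean of the simulated attribute minus the attribute derived from the USGS DEM. This minimal sketch assumes the simulated grids are stacked as (realization, row, column); the function name `deviance` is illustrative.

```python
import numpy as np

def deviance(sim_attr_stack, usgs_attr):
    """Mean simulated value minus USGS DEM-derived value at each pixel."""
    sims = np.asarray(sim_attr_stack, dtype=float)
    return sims.mean(axis=0) - np.asarray(usgs_attr, dtype=float)
```

With this sign convention, positive deviance marks cells where the USGS DEM-derived value is lower than the simulated average.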

Table 2
Terrain attribute descriptions

| Terrain attribute | Description |
| --- | --- |
| Elevation (Z) | Data provided on the USGS 30-m DEM. |
| Slope | Calculated from the 8 neighboring cells using the average maximum technique (Burrough and McDonnell, 1998). |
| Aspect | Azimuth of the direction of maximum slope. |
| Plan curvature | Curvature in map view (contour). |
| Profile curvature | Curvature of the slope profile, perpendicular to the contours. |
| Flow accumulation | Total cells that would contribute water to a given cell based on the accumulated weights for all cells that flow into each down slope cell. |
| Upslope flow path length | Number of cells through which water would theoretically flow to reach each cell from the nearest ridge line. |
| Roughness | Calculated at each cell as the standard deviation of elevation within a 5 cell radius circle. |
| Compound topographic index (CTI) | ln[(upslope contributing area per unit width orthogonal to the flow direction)/tan(slope angle)]; an indicator of catenary position (McKenzie et al., 2000). |
| Rock type | Digitized from a 1:24,000 USGS geology map, Los Olivos quadrangle. |

Fig. 10. Terrain attribute statistics calculated by pixel for all 50 simulated grids. Color images and animations are available from Holmes (1999b) (web-page).

Fig. 11. Differences between USGS 30-m DEM-derived terrain attribute values and the mean of the 50 simulated values, calculated at each pixel across the study area. Negative values indicate the USGS DEM value is higher than the average simulated value, and positive values indicate it is lower than the average simulated value.

3.2. Slope failure prediction

A slightly more complex example of terrain analysis illustrates the impact of elevation uncertainty on DEM-dependent products when higher derivatives of the surface are combined in a simple map algebra calculation. Cannon et al. (1998) used digital terrain analysis to describe, monitor, and predict the effects of shoreline erosion and slope failure in the San Francisco Bay area during the high precipitation 1997–1998 El Niño event. The authors took a very generalized approach to predicting sources of future debris flows, most of which in southern California occur as shallow landslides on slopes steeper than 25° in wet colluvial soils. Cannon et al. mapped actual debris flow source areas onto a USGS 30 m DEM from air photos after early storms, then predicted other areas prone to sliding based on similar slope and concavity to the documented failed slopes. They note that coarseness of 30 m DEM resolution, lack of data on watershed area, and the fact that debris flows travel well beyond their sources prevent these maps from being accurate predictors of hazards at specific locations. However, the binary predictive maps they produced were highly effective in sharpening public awareness of the links between surface processes and land use planning in the Bay area, and were used for emergency preparation in high risk zones. These predictive debris flow maps were constructed from slope and plan curvature grids, calculated from USGS 30 m DEMs. The authors did not take into account the accuracy of the elevation data as an additional source of error.

Fig. 12. Likelihood of slopes failing, based on selection of areas with greater than 25° slopes and concave plan curvature. The USGS DEM predicts a smaller area will fail than is predicted when 50 simulated elevation surfaces are used (a). The result of analysis using the USGS DEM is a binary map of slopes that will or will not fail (b). However, simulated grids allow one to map percent likelihood of slope failure (c) and the standard deviation of slope failure (d), which are much more useful products for land use planning or hazard assessment.

As an example of how this error would affect the results of a simple map overlay problem, we constructed a model of potential debris flow sources for the study area, based on the criteria discussed in Cannon et al. (1998). All areas with slope greater than 25 degrees and positive plan curvature (concave) were selected in the grid module of ARC/INFO, both on the original DEM and for each of the 50 simulated grids. The results are shown in Fig. 12. The histogram of the percent of the area likely to fail shows that the USGS DEM predicts a 25% smaller area (1% of the total field area) will fail than do the simulated grids. In fact, the USGS DEM prediction is lower than any of the 50 simulation predictions. During 1997–1998 many hillslopes at Sedgwick did fail, and are apparent in air photos taken in the Spring of 1998. Old debris flows or landslides can also be seen in the DOQ. These slope failures generally align with those areas predicted to fail in this simple map overlay analysis. However, the USGS DEM result failed to predict a large area of debris flows, which blocked the access road for close to a week. The simulated probability shows the area had up to a 50% chance of failure. Such a probability map would have been much more useful for raising public awareness and emergency planning in San Francisco than were the binary maps produced by using only the USGS DEM. Of course, effects from roads and other anthropogenic factors which influence landslide susceptibility were not taken into account.
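The overlay and its Monte Carlo summary can be sketched as follows. The sketch assumes slope and plan curvature grids have already been derived for every simulated DEM and stacked as (realization, row, column), and that positive plan curvature means concave, as stated in the text. The function names are hypothetical, and the standard deviation is taken over the binary fail/no-fail indicator, which is one plausible reading of Fig. 12d.

```python
import numpy as np

def failure_mask(slope_deg, plan_curv, slope_thresh=25.0):
    """Boolean overlay: slope above the threshold and concave plan curvature."""
    return (np.asarray(slope_deg) > slope_thresh) & (np.asarray(plan_curv) > 0.0)

def failure_likelihood(slope_stack, curv_stack):
    """Percent of realizations predicting failure, and indicator std, per pixel."""
    masks = failure_mask(slope_stack, curv_stack).astype(float)
    percent = 100.0 * masks.mean(axis=0)
    std = masks.std(axis=0)
    return percent, std
```

Applying `failure_mask` to the single USGS DEM gives the binary map, while `failure_likelihood` over the realizations gives a graded likelihood for cells whose slope or curvature lies close to the selection thresholds.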

4. Conclusions

Digital elevation models, like all maps, are models which deviate from reality. New applications for terrain analysis are proving highly popular and effective, increasing the importance of measuring and understanding the impacts of digital elevation data quality on modeling efforts. This investigation of the accuracy of the Los Olivos USGS 30 m DEM has shown that the USGS product is generally quite accurate, given the reported global error estimates of (9,9,5) m. However, there are spatial patterns of error in this DEM, which cannot be explained by the distribution of the terrain attributes included in this study. Terrain attribute calculations which compound values from a large number of cells on the DEM are affected by elevation errors most dramatically. Even a small amount of elevation error can greatly affect derivative products such as flow accumulation, wetness index, or slope, and subsequently all interpretations dependent on these calculations. The case study of the effect of DEM error on hillslope failure prediction illustrates the uncertainty DEM error contributes to terrain modeling results. We used high accuracy GPS data with sequential Gaussian simulation to provide a Monte Carlo framework for quantifying the effect of measured USGS 30 m DEM error on digital terrain modeling. The spatial resolution, precision and accuracy of DEM products should be tested and specified prior to any type of terrain modeling. Methods such as those applied here improve the quantitative aspect of terrain analysis by providing a means to explicitly characterize the undocumented error in digital terrain models.

Acknowledgements

Special thanks go to Ashton Shortridge, Brenda

Franklin, Keith Clarke, Joel Michaelson, and two

anonymous reviewers for their assistance, comments

and suggestions. The authors gratefully acknowledge

support provided by a grant from NASA Earth

Science Enterprise.

References

Adkins, K.F., Merry, C.J., 1994. Accuracy assessment of elevation data sets using the Global Positioning System. Photogr. Engng Remote Sensing 60 (2), 195–202.

Band, L.E., 1993. Extraction of channel networks and topographic parameters from digital elevation data. In: Beven, K., Kirkby, M.J. (Eds.), Channel Network Hydrology, Wiley, New York, pp. 13–42.

Bolstad, P.V., Stowe, T., 1994. An evaluation of DEM accuracy: elevation, slope, and aspect. Photogr. Engng Remote Sensing 60 (11), 1327–1332.

Burrough, P.A., McDonnell, R.A., 1998. Principles of Geographical Information Systems. Spatial Information Systems, Oxford University Press, New York (333pp).

Cannon, S.H., Ellen, S.D., Graham, S.E., Graymer, R.W., Hampton, M.A., Hillhouse, J.W., Howell, D.G., Jayko, A.S., LaHusen, R.L., Lajoie, K.R., Pike, R.J., Ramsey, D.W., Reid, M.E., Richmond, B.M., Savage, W.Z., Wentworth, C.M., Wilson, R.C., 1998. Slope failure and shoreline retreat during northern California's latest El Niño. GSA Today 8 (8), 1–6.

Deutsch, C.V., Journel, A.G., 1998. GSLIB: Geostatistical Software Library and User's Guide, Oxford University Press, New York.

Dietrich, W.E., Wilson, C.J., Montgomery, D.R., McKean, J., 1993. Analysis of erosion thresholds, channel networks, and landscape morphology using a digital terrain model. J. Geol. 101, 259–278.

Dikau, R., Brunsden, D., Schrott, L., Ibsen, M.-L. (Eds.), 1996. Landslide Recognition: Identification, Movement and Causes, Wiley, New York, pp. 1–12.

Ehlschlaeger, C.R., Shortridge, A.M., 1996. Modeling elevation uncertainty in geographical analyses. In: Kraak, M.J., Molenaar, M. (Eds.), Spatial Data Handling '96, Delft, pp. 9B.15–9B.25.

ESRI, 1996. ARC/Info. ESRI, Inc, Redlands, CA.

Fisher, P.F., 1993. Algorithm and implementation uncertainty in viewshed analysis. Int. J. Geog. Info. Sys. 7 (4), 331–347.

Giles, P.T., Franklin, S.E., 1998. An automated approach to the classification of the slope units using digital data. Geomorphology 21, 251–264.

Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation, Oxford University Press, New York (483pp).

Heuvelink, G.B.M., 1998. Error propagation in environmental modelling with GIS. Research Monographs in Geographic Information Systems, Taylor & Francis, Bristol, PA (127pp).

Holmes, K.W., 1999a. Calculation of error in a USGS 30-meter digital elevation model and its effects on terrain attributes and environmental modeling. Masters thesis, University of California, Santa Barbara CA, 91pp.

Holmes, K.W., 1999b. Color figures and animations: available on-line at http://www.geog.ucsb.edu/~karen.

Journel, A.G., 1996. Modelling uncertainty and spatial dependence: Stochastic imaging. Int. J. Geog. Info. Sys. 10 (5), 517–522.

Journel, A.G., Huijbregts, C.J., 1978. Mining Geostatistics, Academic Press, New York (600pp).

Kyriakidis, P.C., Shortridge, A.M., Goodchild, M.F., 1999. Geostatistics for conflation and accuracy assessment of digital elevation models. Int. J. Geog. Info. Sci. 13 (7), 677–707.

McDermid, G.J., Franklin, S.E., 1995. Remote sensing and geomorphometric discrimination of slope processes. Z. Geomorph. N.F., Suppl.-Bd. 101, 165–185.

McKenzie, N.J., Austin, M.P., 1993. A quantitative Australian approach to medium and small scale surveys based on soil stratigraphy and environmental correlation. Geoderma 57, 329–355.

McKenzie, N.J., Gessler, P.E., Ryan, P.J., O'Connell, D., 2000. The role of terrain analysis in soil mapping (chap. 10). In: Wilson, J.P., Gallant, J.C. (Eds.), Terrain Analysis: Principles and Applications, Wiley, New York (in press).

Mellerowicz, K.T., Rees, H.W., Chow, T.L., Ghanem, I., 1992. Soil conservation planning at the watershed level using the Universal Soil Loss Equation with GIS and microcomputer technologies: a case study. J. Soil Water Conserv. 49 (2), 194–200.

Milne, J.A., Sear, D.A., 1997. Modelling river channel topography using GIS. Int. J. Geog. Info. Sci. 11 (5), 499–519.

Mitasova, H., Hofierka, J., Zloca, M., Iverson, L.R., 1996. Modelling topographic potential for erosion and deposition using GIS. Int. J. Geog. Info. Sys. 10 (5), 629–641.

Moore, I.D., Grayson, R.B., Ladson, A.R., 1990. Digital terrain modelling: a review of hydrological, geomorphological, and biological applications. Hydrol. Proces. 5, 3–30.

National Geodetic Survey, 1996. NGS datasheets, from DXE.EXE, Silver Spring, MD.

Office of the County Surveyor, 1993. Station Recovery Sheets. 1017, 1018, Santa Barbara County Public Works Department, Santa Barbara, CA.

Polidori, L., Chorowicz, J., Guillande, R., 1991. Description of terrain as a fractal surface, and application to digital elevation model quality assessment. Photogr. Engng Remote Sensing 57 (10), 1329–1332.

Shortridge, A.M., 1997. Characterizing the relationship between 7.5′ and 1 degree digital elevation models. Masters thesis, University of California, Santa Barbara, 70pp.

Statistical Sciences, 1995. S-PLUS. StatSci, a division of MathSoft Inc., Seattle.

Thornton, P.E., Running, S.W., White, M.A., 1997. Generating surfaces of daily meteorological variables over large regions of complex terrain. J. Hydrol. 190 (3-4), 214–250.

Trimble Navigation Ltd, 1996. TRIMMAP User's Manual, Version 6.0, Sunnyvale, CA.

US Geological Survey, 1987. Digital Elevation Models: Data Users Guide 5. Department of the Interior, USGS, Reston, Virginia, 38pp.

US Geological Survey, 1998. USGS Geospatial Data Clearinghouse, National Mapping and Remotely Sensed Data: Digital Elevation Models (DEMs): available on-line at http://edcwww.cr.usgs.gov/nsdi/gendem.htm.

Weibel, R., Heller, M., 1990. A framework for digital terrain modelling. In: Fourth International Symposium on Spatial Data Handling, Zurich, Switzerland, pp. 219–229.