Error in a USGS 30-meter digital elevation model and its impact on terrain modeling
K.W. Holmes (a,*), O.A. Chadwick (a), P.C. Kyriakidis (b,1)
(a) Department of Geography, University of California, Santa Barbara, CA, 93106, USA
(b) Department of Geological and Environmental Sciences, Stanford University, Stanford, CA, 94305-2115, USA
Received 15 October 1999; received in revised form 21 February 2000; accepted 2 March 2000
Abstract
Calculations based on US Geological Survey (USGS) digital elevation models (DEMs) inherit any errors associated with that
particular representation of topography. We investigated the potential impact of error in a USGS 30 m DEM on terrain analysis
over 27 km². The difference in elevation between 2652 differential Global Positioning Systems measurements and USGS 30-m
DEM derived elevations provided the comparative error dataset. Analysis of this comparative error data suggested that although
the global (average) error is small, local error values can be large, and also spatially correlated. Stochastic conditional
simulation was used to generate multiple realizations of the DEM error surface that reproduce the error measurements at
their original locations and sample statistics such as the histogram and semivariogram model. The differences between these
alternative error surfaces provide a model of uncertainty for the unknown DEM error spatial distribution. These DEM errors had
a significant impact on terrain attributes which compound elevation values of many grid cells (e.g. slope, wetness index, etc.). A
case study using terrain modeling demonstrates that the result of error propagation is most dramatic in valley bottoms and along
streamlines. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: Digital terrain models; Uncertainty; Spatial distribution; Digital simulation; Geostatistics; Global positioning systems
1. Introduction
Topography controls fluxes of energy, nutrient
distribution, mass movement, and water dispersion
in many landscapes. As a result, topographic maps
and their digital analogues are often useful for study-
ing spatially distributed landscape processes. Digital
elevation models (DEMs) have been used for mapping
and environmental spatial analysis in landslide
prediction and characterization (e.g. Dikau et al.,
1996), climate/meteorological applications (e.g.
Thornton et al., 1997), route optimization (e.g.
Ehlschlaeger and Shortridge, 1996), integrated studies
of hillslope processes (e.g. McDermid and Franklin,
1995), landform analysis (e.g. Weibel and Heller,
1990), image registration (e.g. Giles and Franklin,
1998), hydrologic modeling (e.g. Band, 1993), sediment
flux modeling (e.g. Mitasova et al., 1996), land
use planning (e.g. Mellerowicz et al., 1992) and soil-
landscape modeling (e.g. McKenzie and Austin,
1993). Summaries of DEM applications can be
found in Moore et al. (1990), Weibel and Heller
(1990), and Milne and Sear (1997). For terrain
analyses, researchers either produce their own
Journal of Hydrology 233 (2000) 154–173
www.elsevier.com/locate/jhydrol
0022-1694/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0022-1694(00)00229-8
* Corresponding author. Tel.: +1-805-893-8525; fax: +1-805-893-3146.
E-mail address: [email protected] (K.W. Holmes).
1 Present address: Earth Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
DEMs or use publicly available ones. In the first case,
they know exactly what techniques were used in
collecting and processing the data, and can calculate
the accuracy both globally and spatially. However,
DEM production can be very difficult and time
consuming, and is typically not the goal of most
researchers who plan to do terrain analysis, although
it may prove necessary if high-resolution data are
required (Dietrich et al., 1993). If pre-processed
DEMs are used, there is usually little information
available to data users about data collection, proces-
sing, or error distribution.
In the United States, the most commonly used
DEMs are those produced and distributed by the US
Geological Survey. These digital datasets range from
3-arcsecond to 30-m cell resolution (US Geological
Survey, 1998). The only error information provided
are global estimates of root mean square error
(RMSE), which give the user no information about
data accuracy at specific locations within the DEM.
In addition, terrain attributes such as slope or aspect
are derived from elevation grids, and tend to
compound systematic errors caused by the resolution
of the data and the methods by which the DEM was
produced (Bolstad and Stowe, 1994; McKenzie et al.,
2000). Environmental model predictions based on a
combination of DEM-derived surfaces contain an
uncertainty component as a result of the unknown
accuracy of the original elevation data. It is important
to quantify this uncertainty inherited from the DEM,
and to investigate its impact on the interpretation and
utilization of model predictions.
The obvious approach to assess DEM error is to use
higher accuracy field-surveyed data to evaluate the
DEM elevation values (Adkins and Merry, 1994;
Bolstad and Stowe, 1994). In the absence of field-collected
data (the usual case), researchers have
devised other ways of characterizing uncertainty,
including using limited elevation values derived
from higher resolution DEMs as "ground-truth" data
(Shortridge, 1997; Kyriakidis et al., 1999), exploring
fractal dimensions of DEMs to reveal production arti-
facts or anomalies (Polidori et al., 1991), or assigning
distributions of error for each grid cell based on the
reported global error measurements (Fisher, 1993;
Ehlschlaeger and Shortridge, 1996). The error
measured at discrete points is used to supply estimates
of error, and of error uncertainty, in locations where
error was not measured directly. The process respon-
sible for inducing errors in a DEM, like many
geographic or environmental processes, is not sufficiently
well understood to permit a deterministic
analysis of error. A geostatistical approach to error
characterization, based on a probabilistic model that
recognizes these inevitable uncertainties, is more
appropriate.
Geostatistics allows the use of effective estimation
procedures, gauges the accuracy of the estimates, and
assigns confidence intervals to estimates by treating
the variable of interest as random. This does not imply
the variable itself or the deterministic process is
random, but rather reflects our uncertainty about the
process which generates values at unsampled loca-
tions (Journel, 1996). Taken a step further, geostatis-
tics can be used to model the uncertainty of unknown
values at a set of locations by generating alternative
images, or realizations, which reproduce the original
data at their measurement locations and spatial
patterns or statistics considered important for the
specific case. This process is called stochastic simulation
(Goovaerts, 1997). These alternative realizations,
which span the range of possible attribute values
given the model of uncertainty at any location, can
be used as alternative inputs for environmental model-
ing or building scenarios, providing a distribution of
possible results (Heuvelink, 1998).
While kriging alone provides the best, in the least
squares sense, interpolated (smooth) surface from a
set of measurements, simulation allows the generation
of a series of equiprobable realistic surfaces or
volumes, each having the correct spatial structure. A
simple Monte Carlo method would simulate values
for each grid cell independent of the rest of the
surface. Conditional simulation combines the actual
data values with the spatial correlation information
from the semivariogram to generate simulated
outcomes at each grid cell. The average value in
each grid cell over a very large number of realizations
is virtually identical to the simple kriging estimate
(Burrough and McDonnell, 1998). Because it
provides alternative plausible representations of an
attribute's possible spatial distribution, simulation is
commonly used in risk analysis, medical imaging,
transportation analysis, mineral exploration, hydro-
geological modeling, and environmental sciences.
In this paper, we evaluate the magnitude and spatial
distribution of error in USGS 30-m elevation values in
relation to high accuracy Global Positioning Systems
(GPS) data in order to assess the potential effect of
DEM error on environmental modeling applications.
The selected method is a simplified version of the
geostatistical approach developed by Kyriakidis et
al. (1999), who used a sparse, high quality elevation
dataset (from a USGS 30-m DEM) to evaluate the
quality of a much more extensive, continuous eleva-
tion dataset of unknown quality (USGS 3-arcsecond
DEM) using conditional Gaussian simulation.
The DEM accuracy assessment is carried out as
follows: (1) collection and assembly of the DEM
error dataset; (2) exploratory analysis of the data;
(3) declustering, normalization, and structural analy-
sis (variography); and (4) generation of 50 equiprob-
able elevation surfaces using conditional sequential
Gaussian simulation (discussed below). The effects of
DEM error, as revealed by the differences among the
50 realizations, on terrain attribute calculation and on
a simple map algebra model of the likelihood of hill-
slope failure were explored as examples of the impact
of USGS 30-m DEM error on digital terrain modeling.
2. USGS DEM accuracy assessment
2.1. Field area description
Sedgwick Natural Reserve, located in the San
Rafael Mountains at the southernmost end of the
California Coastal Range, includes landforms ranging
from extensive floodplains along Figueroa Creek, to
low relief foothills, and high relief, long mountain
slopes to the northeast (Fig. 1). Sedgwick's diverse
terrain provides a test of the ability of the USGS
Fig. 1. Map of the study area: University of California Sedgwick Natural Reserve, Santa Barbara County, California.
DEMs to accurately represent topographic characteristics.
The field area is approximately 27 km²
(2700 hectares, 6670 acres), spans about 1000 vertical
meters, and includes several different geologic forma-
tions, which support different types of landforms such
as low rolling hills, and steep mountain slopes.
2.2. Elevation data
USGS 30-m DEM: The study area is located in the
Los Olivos USGS 7.5-min Quadrangle. The corre-
sponding DEM is a level 2 data set, indicating the
data were acquired by contour digitizing, either photo-
grammetrically or from existing maps. These data sets
have been processed or smoothed for consistency and
identifiable errors have been removed. A root mean
square error (RMSE) of up to one half of a contour
level is the maximum error allowed. At least 28 test
points (20 within the quadrangle, 8 along the edges)
from the source data located on contour lines, bench
marks, or spot elevations were compared with the
DEM data by the USGS to calculate a RMSE (US
Geological Survey, 1987). For the Los Olivos DEM,
RMSE is reported as 6 m for horizontal coordinates,
and 2 m for height, relative to the file datum, which
has a measured error relative to the absolute datum of
3 m (x, y, z). In the worst-case scenario, this means the
data could have on average as much as ±(9, 9, 5) m of
error. The spatial distribution of this uncertainty is not
provided to users.
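
The worst-case figure appears to follow from adding the reported RMSE to the datum error component by component. This reading is an assumption, since the text does not spell out the arithmetic:

```latex
\underbrace{(6,\ 6,\ 2)\ \mathrm{m}}_{\text{RMSE relative to file datum}}
+ \underbrace{(3,\ 3,\ 3)\ \mathrm{m}}_{\text{file datum vs. absolute datum}}
= (9,\ 9,\ 5)\ \mathrm{m}
```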
GPS elevation data: We collected GPS positions
with a vertical accuracy better than 25 cm over
0.4% of the study area by taking the coordinates of
the grid cell centers of the 30 m data, and choosing
200 sites using a random number generator. In order
to compare GPS horizontal (x,y) measurements to the
coordinates reported by the USGS, we chose 60 addi-
tional sample sites (road intersections, rock outcrops,
etc.) identifiable within a 2 m radius on the USGS 1-m
digital orthophotoquad (DOQ). The GPS equipment
used was a Trimble 4400 total station with an addi-
tional data collector to allow both the rover and base
receivers to log raw GPS data. Trimble GPSurvey
software was used to post-process the data and calcu-
late the associated accuracy. ARC/INFO (ESRI,
1996) was used for producing maps and overlays,
and S-PLUS software (Statistical Sciences, 1995)
was used for statistical analyses of the GPS data.
Fig. 2. (a) GPS reference network. (b) Radial survey design. The background image is the USGS 30-m DEM, shown in grey scale. Black is low
elevation, white is high elevation.
We established a horizontal and vertical static
reference network in Sedgwick Reserve using three
GPS receivers over five controls: three National
Geodetic Survey (NGS) monuments, 1st order hori-
zontal control, 4th order vertical control, one with 1st
order differentially leveled vertical control; and two
Santa Barbara County monuments, 1st order horizon-
tal control, 1st order differentially leveled vertical
control (Office of the County Surveyor, 1993;
National Geodetic Survey, 1996). Each baseline was
occupied for more than one hour, some as many as
four times. The reference network (Fig. 2a) was post-
processed with high accuracy results, and the remain-
ing points were then processed within the same
network (Table 1).
We adopted a radial survey design (Fig. 2b) to
record the locations and elevations of the 252
randomly selected sites. This approach provided opti-
mal use of the available horizontal and vertical
controls, equipment, and time resources. The base
station was set up over an NGS 1st order control
approximately 8 km from the field area. The base
logged raw GPS data continuously, while the second
receiver was carried to each sample location for 8–20 min,
depending on the number and geometric
configuration of satellites visible during data collection.
Numerous established survey monuments, points
from the reference network, and benchmarks were
included in the survey to adjust the GPS survey
network during post processing. The independent
monuments also provided checks on the accuracy of
the GPS results.
In addition to the 252 distributed GPS points, ten
areas located in different types of topography were
intensively surveyed, yielding an additional 2400
GPS measurements for the DEM accuracy assessment
(Fig. 3). For these surveys, a real-time kinematic
survey method was used, which provides accuracy
of about 1-cm horizontal, 2-cm vertical, relative to
the base station coordinates (Trimble Navigation
Ltd, 1996). The previously surveyed GPS points
(Table 1) were used for base station locations.
2.3. Exploratory data analysis
Several interpolation techniques were tested for
extracting USGS elevation values at the GPS
measurement locations (Holmes, 1999a). All of
these datasets had a correlation with the GPS dataset
of greater than 0.9991. For the purposes of this
project, the "nearest neighbor" dataset, simply using
the nearest gridded point value, was chosen for use in
uncertainty modeling because it was the simplest
method, and there was no evidence that more complex
interpolation methods improved the data quality.
Fig. 4 shows the USGS DEM and GPS data sets
with their histograms. The USGS dataset histogram is
weighted toward the left (positively skewed), indicating that a high percentage of
cells have low elevations. The random GPS dataset
(252 points) shows the same histogram characteristics
as the USGS DEM, though it is noisier. The histogram
of the large GPS dataset (2652 points) shows strong
bimodality, which is a result of the high degree of
clustering of elevation measurements in the 10 select
locations.
The DEM error was calculated by subtracting the
nearest neighbor USGS elevations from the GPS
measured elevations at each GPS point location. The
spatial distribution of these error values and their
histogram are plotted in Fig. 5a and b. The histogram
shows a roughly normal distribution, with a mean of
−0.10 m, median of −0.46 m, and a standard
Table 1
Sedgwick Natural Reserve GPS survey results

| Survey (includes all points) | No. of points | Maximum error (m) | Minimum error (m) | Mean error (m) |
| --- | --- | --- | --- | --- |
| Field area | 252 | | | |
| Horizontal (xy) | | 0.073 | 0.003 | 0.027 |
| Vertical (z) | | 0.206 | 0.018 | 0.057 |
| 3-D (xyz) | | 0.218 | 0.018 | 0.063 |
| Reference network | 10 | | | |
| Horizontal (xy) | | 0.007 | 0.002 | 0.004 |
| Vertical (z) | | 0.028 | 0.018 | 0.021 |
| 3-D (xyz) | | 0.028 | 0.018 | 0.021 |
Fig. 3. GPS measurement locations: 252 randomly located points (upper right) and 2400 highly clustered points (white ×'s).
Fig. 4. Maps and histograms of (a,b) the USGS 30-m DEM; (c,d) the randomly located 252 point GPS dataset; and (e,f) the clustered 2652 point
GPS dataset.
Fig. 5. (a) Histogram of error data (GPS elevation − USGS DEM elevation); (b) location map of error data; (c) histogram of declustered error
data; (d) map of the declustering weights used to produce the histogram in (c); (e) histogram of error data after a normal scores transform. Color
images are available from Holmes, 1999b (web page).
deviation of 4.11 m, indicating that on average over
the study area the USGS DEM overestimates the
elevation by only 10 cm. However, the maximum
(18.15 m) and minimum (−12.72 m) error values
show that there are significant differences in some
areas.
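
The error calculation described above can be sketched in a few lines. This is an illustration only: the grid layout (origin at the centre of the first cell, rows increasing with y), the function names, and all numbers are assumptions made for the example, not the study data or the software used by the authors.

```python
import statistics

def nearest_cell_value(grid, x0, y0, cell, x, y):
    """Return the value of the grid cell whose centre is nearest to (x, y).

    grid[row][col] holds elevations; (x0, y0) is the centre of grid[0][0];
    columns increase with x and rows increase with y (an assumed layout).
    """
    col = round((x - x0) / cell)
    row = round((y - y0) / cell)
    # Clamp so points just outside the grid map to the nearest edge cell.
    row = min(max(row, 0), len(grid) - 1)
    col = min(max(col, 0), len(grid[0]) - 1)
    return grid[row][col]

def dem_errors(grid, x0, y0, cell, gps_points):
    """Error = GPS elevation minus nearest-neighbour DEM elevation."""
    return [z - nearest_cell_value(grid, x0, y0, cell, x, y)
            for x, y, z in gps_points]

# Hypothetical 3 x 3 DEM with 30 m cells and three made-up GPS points.
dem = [[100.0, 102.0, 104.0],
       [101.0, 103.0, 105.0],
       [102.0, 104.0, 106.0]]
gps = [(2.0, 1.0, 99.5), (31.0, 29.0, 104.0), (58.0, 61.0, 105.0)]

errors = dem_errors(dem, 0.0, 0.0, 30.0, gps)
summary = {
    "mean": statistics.mean(errors),
    "median": statistics.median(errors),
    "stdev": statistics.stdev(errors),
    "min": min(errors),
    "max": max(errors),
}
```

With this sign convention a negative error means the DEM lies above the GPS elevation, so a small mean can coexist with large local extremes, as in the summary statistics reported above.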
As shown in Fig. 3, the data points were highly
clustered. Because simple kriging is used within the
simulation routine to estimate the conditional distri-
bution of the error at any location given the surround-
ing error data, dense clusters of points can greatly
affect estimations in areas with sparse data due to a
misspecification of the regional mean. A declustering
algorithm was used to weight the data according to
how much influence each data point should have on
the regional histogram (Goovaerts, 1997; Deutsch and
Journel, 1998). Because there was no evident prefer-
ential clustering of data in high- or low-valued error
areas, the cell size used in the declustering algorithm
was chosen such that a single isolated datum was
located within each cell (on average). In this way,
isolated samples, which are more representative of
the area around them, received more weight in the
calculation of the (declustered) error histogram than
those samples located in clusters. Fig. 5a–c show the
histograms of the raw error data, the location map of
Fig. 6. Scatterplots of USGS DEM error (absolute value) vs. GPS measured elevation, roughness or relief, slope, and aspect.
error, and the histogram of declustered error data. Fig.
5d is a map of the declustering weights, which shows
that low weights are assigned to clustered data, and
higher weights assigned to isolated data.
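
A minimal sketch of cell declustering follows, assuming a single fixed grid origin and made-up coordinates. Production declustering routines typically also average over several grid origins and compare candidate cell sizes; that refinement is omitted here, and the function names are hypothetical.

```python
from collections import Counter

def cell_decluster_weights(points, cell_size, origin=(0.0, 0.0)):
    """Cell-declustering weights that sum to 1.

    Each occupied cell gets equal total weight, shared equally among the
    data falling inside it, so clustered data are down-weighted.
    """
    ox, oy = origin
    keys = [(int((x - ox) // cell_size), int((y - oy) // cell_size))
            for x, y in points]
    counts = Counter(keys)
    n_occupied = len(counts)
    return [1.0 / (n_occupied * counts[k]) for k in keys]

def weighted_mean(values, weights):
    """Mean of `values` with each value weighted by its declustering weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Made-up data: four clustered points in one cell, two isolated points.
pts = [(10, 10), (12, 11), (11, 14), (13, 12), (150, 40), (260, 220)]
err = [4.0, 4.2, 3.8, 4.0, -1.0, -2.0]

w = cell_decluster_weights(pts, cell_size=100.0)
naive_mean = sum(err) / len(err)
declustered_mean = weighted_mean(err, w)
```

In this toy case the four clustered values dominate the naive mean, while the declustered mean gives each occupied cell the same say, which is the behaviour the weight map in Fig. 5d displays.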
Local topographic roughness was calculated as the
standard deviation of elevation measured within a
moving window (a 5-cell or 150 m radius circle) on
the USGS DEM. A high standard deviation indicates
high variability in the local terrain surface. Fig. 6
shows scatter plots of the absolute value of error vs.
elevation, roughness, slope, and aspect. The strongest
correlation is between slope and absolute error, with a
value of 0.325. GPS elevation, roughness, and aspect
show correlations with error of 0.288, 0.300, and
0.125, respectively.
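
The roughness measure can be illustrated as below. The sketch assumes the population standard deviation and simply truncates the window at the grid edge; the text specifies neither choice. The tiny DEM and the 1-cell radius are made up to keep the example readable (the study used a 5-cell radius).

```python
import statistics

def roughness(grid, radius_cells):
    """Population standard deviation of elevation within a circular
    moving window of the given radius (in cells) around every cell."""
    nrows, ncols = len(grid), len(grid[0])
    r = radius_cells
    # Precompute the (drow, dcol) offsets that fall inside the circle.
    offsets = [(dr, dc) for dr in range(-r, r + 1) for dc in range(-r, r + 1)
               if dr * dr + dc * dc <= r * r]
    out = [[0.0] * ncols for _ in range(nrows)]
    for i in range(nrows):
        for j in range(ncols):
            # Keep only window cells that lie inside the grid.
            window = [grid[i + dr][j + dc] for dr, dc in offsets
                      if 0 <= i + dr < nrows and 0 <= j + dc < ncols]
            out[i][j] = statistics.pstdev(window)
    return out

# Made-up DEM: a flat plain on the left, a steep ramp on the right.
dem = [[100.0, 100.0, 100.0, 110.0, 130.0, 150.0] for _ in range(6)]
rough = roughness(dem, radius_cells=1)
```

Cells on the flat plain get a roughness of zero, while cells on the ramp get a large value, matching the interpretation that a high standard deviation indicates high local variability.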
The semivariograms of the USGS dataset (Fig. 7a)
show a steady increase in elevation variance with
increasing distance. There is a slight difference in
semivariograms calculated in north–south and east–west
directions, but they have the same behavior at
distances shorter than 1200 m. These semivario-
grams are typical of topographic data because
there is no reason for elevation values to reach a
sill, unless the shear strength of the geologic
material upholding the topography is exceeded,
limiting the maximum elevation in an area. Topo-
graphy has the potential to become more and more
diverse over larger distances (i.e. the semivariogram
does not reach a sill) due to differing lithologies and
surficial processes. The semivariogram of the GPS
dataset (Fig. 7b) is much more irregular because of
the high degree of clustering of the data points.
Some of the dips and spikes in the semivariogram
represent a large number of measurements at a fixed
distance apart that have either similar elevations
(low variance) although there may be variable topo-
graphy between the two sites, or very different
elevations (high variance) because of the high
number of measurements at two very different
elevations within a certain lag. The semivariogram
of error (Fig. 7c) shows spatial autocorrelation, and
there is a slight trend toward increasing variance
with distance. The north–south and east–west semivariograms
show no clear trend with direction. The
omnidirectional semivariogram of the normal score
transformed error data was modeled with an isotro-
pic semivariogram function for use in the simulation
routine.
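
For readers unfamiliar with the calculation, an omnidirectional experimental semivariogram can be sketched as follows. The lag binning, function name, and transect data are illustrative assumptions and do not reproduce the GSLIB settings used in the study.

```python
import math

def experimental_semivariogram(points, values, lag, n_lags):
    """Omnidirectional experimental semivariogram.

    gamma(h) = 1 / (2 N(h)) * sum of squared differences over the N(h)
    pairs whose separation falls in the bin [k * lag, (k + 1) * lag).
    Returns a list of (mean pair distance, semivariance, pair count).
    """
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            k = int(d // lag)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                dists[k] += d
                counts[k] += 1
    # Empty bins are skipped rather than reported as zero.
    return [(dists[k] / counts[k], sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Made-up transect: values drift smoothly, so nearby pairs are more alike.
pts = [(float(x), 0.0) for x in range(0, 300, 30)]
vals = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
sv = experimental_semivariogram(pts, vals, lag=25.0, n_lags=4)
```

Because the toy values follow a steady trend, the semivariance keeps growing with lag and never levels off, which is the same no-sill behaviour the text describes for elevation.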
Fig. 7. Semivariograms of (a) the USGS elevation data, (b) the
clustered GPS measured elevations, and (c) the error data, calcu-
lated by subtracting the USGS values from the GPS measured
values. Note differences in y-scales.
2.4. Data preparation
The weak correlations between calculated DEM
error and terrain ruggedness or other basic terrain
attributes (Fig. 6) eliminated the need to use both
elevation data sets in the simulation routine as was
done by Kyriakidis et al. (1999). They incorporated
both original datasets in the simulation routine
because they found that high DEM error was asso-
ciated with areas of rough terrain. In this case, the
association of error with patterns of elevation or
terrain attributes was negligible, allowing us to
directly simulate the error surface using only the
calculated error data set to produce alternative error
realizations. The public domain software package
GSLIB (Deutsch and Journel, 1998) was used for all
calculations related to simulation.
Simulations model the uncertainty in the attribute
spatial distribution based on the data sources available
near each point of interest. Simple kriging is used
within the simulation routine to establish uncertainty
models of error values at every location. Likely data
values are simulated at each location by drawing
randomly from the possible distribution of error
provided by the local conditional cumulative distribu-
tion function (ccdf). The easiest way to derive local
ccdfs is to assume a model for the entire multivariate
distribution of the original random function represent-
ing, in this case, the spatial distribution of elevation
errors. This model needs to be flexible enough that all
of the local ccdfs have the same analytical expression
and can be fully specified through a few parameters.
By using this method, the problem of determining the
local ccdf is reduced to the estimation of these few
parameters (such as a local mean and variance). The
multivariate Gaussian random function model is the
most commonly used parametric approach because
the inherent structure of the model makes determining
local ccdfs fairly straightforward (Goovaerts, 1997).
A normal scores transform, a non-linear, rank preserv-
ing transform that remaps any distribution to a normal
distribution (see Journel and Huijbregts (1978; p. 476)
or Goovaerts (1997; p. 266)) was applied to the
declustered dataset (Fig. 5e) to meet the format
requirements of Gaussian simulation, which are that
the univariate distribution of the error data be standard
normal.
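
A normal scores transform with optional declustering weights might look like the sketch below. It is a simplified illustration: ties get no special treatment, the back-transform uses plain linear interpolation with no tail extrapolation, and the function names and error values are made up.

```python
from statistics import NormalDist

def normal_scores(values, weights=None):
    """Rank-preserving transform of `values` to standard normal scores.

    `weights` are optional declustering weights; equal weights are used
    when none are given.
    """
    n = len(values)
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    cum = 0.0
    std_normal = NormalDist()
    for i in order:
        # Cumulative probability at the midpoint of this datum's weight,
        # which keeps p strictly inside (0, 1).
        p = (cum + 0.5 * weights[i]) / total
        scores[i] = std_normal.inv_cdf(p)
        cum += weights[i]
    return scores

def back_transform(score, values, scores):
    """Map a normal score back to data units by linear interpolation
    between the (score, value) pairs of the original data."""
    pairs = sorted(zip(scores, values))
    if score <= pairs[0][0]:
        return pairs[0][1]
    if score >= pairs[-1][0]:
        return pairs[-1][1]
    for (s0, v0), (s1, v1) in zip(pairs, pairs[1:]):
        if s0 <= score <= s1:
            return v0 + (v1 - v0) * (score - s0) / (s1 - s0)

errors = [-12.7, -3.0, -0.5, 0.1, 2.4, 18.2]   # made-up error values (m)
ns = normal_scores(errors)
```

The transform changes only the spacing of the values, not their order, so the inverse mapping can return any simulated normal score to the original error units.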
Once the data were declustered and transformed to
a standard normal distribution, the spatial variability
of the data was modeled for inclusion in the simula-
tion routine. Semivariance values are calculated at
defined lags to build an experimental semivariogram,
but kriging and simulation algorithms require a
continuous model of spatial variance, which is
supplied in the form of a semivariogram model. The
number of lags and lag distance for the semivariogram
were chosen experimentally to smooth the data
enough to see a definite semivariogram pattern but
retain some of the detail (Fig. 8e). The final isotropic
normal scores semivariogram model input to the
sequential Gaussian simulation is:
$$\gamma(h) = 0.3 \times \mathrm{Exp}\left(\frac{|h|}{125}\right) + 0.7 \times \mathrm{Exp}\left(\frac{|h|}{3000}\right)$$
where Exp(·) denotes an exponential semivariogram
model, and h denotes the distance between any two
locations (lag distance).
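
The nested model can be evaluated numerically as sketched here. Two points are assumptions rather than statements from the text: the exponential structure is written in the practical-range form 1 − exp(−3|h|/a), the convention used in GSLIB, and the ranges 125 and 3000 are taken to be in metres.

```python
import math

def exp_structure(h, practical_range):
    """Unit-sill exponential structure in the practical-range form
    (assumed convention): about 95% of the sill is reached at the range."""
    return 1.0 - math.exp(-3.0 * abs(h) / practical_range)

def gamma_model(h):
    """Nested model from the text: 0.3 Exp(|h|/125) + 0.7 Exp(|h|/3000)."""
    return 0.3 * exp_structure(h, 125.0) + 0.7 * exp_structure(h, 3000.0)

def covariance(h):
    """Covariance implied by a unit-sill model: C(h) = 1 - gamma(h)."""
    return 1.0 - gamma_model(h)

# Semivariance at a few illustrative lag distances.
table = [(h, gamma_model(h)) for h in (0.0, 30.0, 125.0, 1000.0, 3000.0)]
```

The two structures sum to a sill of 1, as expected for normal score data with unit variance: the short-range structure accounts for 30% of the variance and the long-range structure for the remaining 70%.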
2.5. Stochastic simulation
The basic steps of sequential Gaussian simulation
are as follows (Goovaerts, 1997; Deutsch and Journel,
1998): (1) A random path is defined for visiting each
grid node once; (2) At each node, simple kriging is
used to determine the parameters of the Gaussian local
ccdf (mean and variance), based on the normal score
semivariogram model; both the normal scores original
data values and previously simulated values
within a local neighborhood are considered for simple
kriging; (3) A value is drawn from that local ccdf
using a Monte Carlo type simulation, and is added
to the sample data set; (4) Steps 2 and 3 are repeated
at each node along the random path until each node in
the grid has been visited, and a corresponding error
value has been generated. Once the simulations of the
normal score values have been produced, each reali-
zation must be back-transformed to the original error
distribution. This essentially consists of taking the
inverse of the normal scores transform to graphically
remap the normal scores distribution to the original
error histogram.
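
The four steps above can be sketched on a one-dimensional grid. This is a toy illustration, not the GSLIB implementation: the covariance model, its practical range, the neighbourhood size, and the conditioning data are all assumptions, the values are taken to be normal scores already (zero mean, unit variance), and the back-transform is left out.

```python
import math
import random

def cov(h, a=10.0):
    """Unit-sill exponential covariance; `a` is an assumed practical range."""
    return math.exp(-3.0 * abs(h) / a)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def sgs_1d(nodes, data, n_near=4, rng=random):
    """One sequential Gaussian realization on 1-D `nodes`.

    `data` maps location -> normal-score value (zero mean, unit variance).
    """
    known = dict(data)                      # conditioning data
    path = [x for x in nodes if x not in known]
    rng.shuffle(path)                       # step 1: random path
    for x in path:
        near = sorted(known, key=lambda u: abs(u - x))[:n_near]
        # Step 2: simple kriging mean and variance from nearby values,
        # including values simulated earlier along the path.
        A = [[cov(u - v) for v in near] for u in near]
        b = [cov(u - x) for u in near]
        w = solve(A, b)
        mean = sum(wi * known[u] for wi, u in zip(w, near))
        var = max(1.0 - sum(wi * bi for wi, bi in zip(w, b)), 0.0)
        # Step 3: draw from the local Gaussian ccdf and keep the value.
        known[x] = rng.gauss(mean, math.sqrt(var))
    return [known[x] for x in nodes]        # step 4: all nodes visited

nodes = list(range(0, 50))
data = {5: 1.2, 20: -0.8, 40: 0.3}          # made-up normal-score data
realizations = [sgs_1d(nodes, data, rng=random.Random(seed))
                for seed in range(5)]
```

Every realization returns the conditioning values exactly at nodes 5, 20, and 40 and differs elsewhere, with the differences growing away from the data.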
Fifty error realizations were generated using
sequential Gaussian simulation. Figs. 5a and b, and
8a–h show the maps, histograms, and semivariograms
of the original data and of two randomly selected
realizations. All of the simulated error surfaces
Fig. 8. (a,b) Maps, (c,d) histograms, and (e,f, solid lines) semivariograms of two randomly selected simulated error surfaces. The points on
semivariograms (e–h) are the semivariogram of the original data. (g) shows the semivariograms of the 50 realizations superimposed on the
normal score semivariogram of the raw GPS data.
reproduce the measured error values at their locations.
Each of these 50 realizations would have approxi-
mately the same histogram as the original data values,
as well as the semivariogram model quantifying their
spatial correlation. Within individual realizations,
deviations from the sample statistics are expected.
In Fig. 8, note the similarity between the distributions,
summary and spatial statistics of the realizations
themselves. The largest differences in the spatial
patterns in the two realizations occur in areas far
away from where GPS measurements were taken.
The final step in the simulation procedure was to
add the simulated error realizations to the original
USGS 30 m DEM. This created 50 different versions
of the DEM, each of which is theoretically a more
accurate representation of topography than the
USGS DEM because the 2652 GPS measured eleva-
tions and the modeled spatial correlation were
accounted for in each new realization. Fig. 9 shows
the final results for the two randomly selected
realizations, along with the original DEM and
elevation histogram. The scatterplots of USGS DEM
vs. realizations #12 and #41 show the high degree of
correlation among the datasets. These 50 simulated
DEMs are used as a Monte Carlo framework for
testing the effect of DEM error on terrain attribute
calculations and DEM-based terrain modeling
below.
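
The Monte Carlo use of the simulated DEMs can be sketched as follows. The example is deliberately simplified and rests on stated assumptions: the error surfaces are independent Gaussian noise rather than conditional, spatially correlated realizations, slope is computed by central differences as a stand-in for a GIS slope tool, and the DEM is a made-up inclined plane.

```python
import math
import random

def slope_deg(grid, cell):
    """Slope (degrees) at interior cells from central differences;
    edge cells are left as None."""
    n, m = len(grid), len(grid[0])
    out = [[None] * m for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            dzdx = (grid[i][j + 1] - grid[i][j - 1]) / (2 * cell)
            dzdy = (grid[i + 1][j] - grid[i - 1][j]) / (2 * cell)
            out[i][j] = math.degrees(math.atan(math.hypot(dzdx, dzdy)))
    return out

def add_grids(a, b):
    """Cell-by-cell sum of two grids of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

rng = random.Random(0)
cell = 30.0
# Made-up DEM: a plane rising 3 m per cell to the east.
dem = [[100.0 + 3.0 * j for j in range(5)] for _ in range(5)]
# Stand-in error surfaces: independent noise, NOT spatially correlated
# conditional realizations as in the text.
error_surfaces = [[[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(5)]
                  for _ in range(50)]

simulated_dems = [add_grids(dem, e) for e in error_surfaces]
slopes = [slope_deg(d, cell) for d in simulated_dems]

# Per-cell spread of slope across the 50 realizations (interior cells only).
slope_range = [[None] * 5 for _ in range(5)]
for i in range(1, 4):
    for j in range(1, 4):
        vals = [s[i][j] for s in slopes]
        slope_range[i][j] = max(vals) - min(vals)
```

Each terrain attribute is computed once per simulated DEM, and the per-cell spread of the results is read as the uncertainty that elevation error passes on to that attribute.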
3. Error propagation
In order to explore the effect of DEM error on
different types of DEM products, simple Boolean
operations and an empirical regression model were
run on the original DEM and the 50 simulated realiza-
tions discussed above. These alternative numerical
models provide a sample of the joint probability distri-
bution of the higher accuracy elevation for each grid
cell of the DEM of the study area. The range and
spatial patterns of elevation in the simulated surfaces
reveal the magnitude and spatial patterns of uncer-
tainty in the original DEM, which are propagated
through DEM-derived products. In the following
sections, we present examples of the effect of uncer-
tainty in elevation data on the calculation of terrain
attributes, and a simple case study of error propaga-
tion in hillslope failure analysis.
3.1. Terrain attribute calculations
Terrain attributes (Table 2) were calculated on the
50 simulated DEMs in ARC/INFO, and statistics
calculated for the distribution of values in each
pixel. Selected statistics are displayed in Fig. 10.
The spatial patterns found in the map of average simu-
lated error values at each pixel (Fig. 10a) are similar
to, but more systematic than, those found in the two
examples of simulated error surfaces shown in Fig. 8a
and b. Although there were no significant correlations
between DEM error and terrain attributes, there is a
recognizable pattern of spatially distributed error.
Error was calculated as the GPS elevations minus
the USGS DEM elevations, so negative errors signify
the DEM overestimated elevation, and positive errors
mean it underestimated elevation.
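The sign convention is easy to invert by accident, so a small sketch may help. The function names and the three sample elevations below are hypothetical and are not taken from the study data.

```python
import numpy as np

def dem_error(gps_elevation, dem_elevation):
    """Return error as GPS elevation minus DEM elevation (same units)."""
    gps = np.asarray(gps_elevation, dtype=float)
    dem = np.asarray(dem_elevation, dtype=float)
    return gps - dem

def describe_error(error):
    """Label each error value using the sign convention in the text."""
    error = np.asarray(error, dtype=float)
    return np.where(error < 0, "DEM overestimates",
                    np.where(error > 0, "DEM underestimates", "no error"))

# Hypothetical elevations in meters at three survey points.
gps = [412.3, 398.0, 455.5]
dem = [415.0, 398.0, 450.0]
error = dem_error(gps, dem)
labels = describe_error(error)
```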
The range in attribute values that resulted from
adding simulated error surfaces to the DEM (Fig.
10c–f) was calculated by subtracting the minimum
of the 50 simulated values from the maximum simu-
lated value in each pixel across the study area. The
range in elevation values (Fig. 10c) shows two major
patterns, which are evident on most of the terrain
attribute grids. First, the range of elevations increases
considerably outside the boundary of the study area,
where there were no measured error values. The
second obvious pattern is the white patches indicating
a very low range in elevation values. Ten of these are
the areas where GPS data are clustered. The simulation
algorithm reproduces each GPS data value at
its original position, thus preserving as much of
the "true" data as possible in each simulation. The
small patches of very low standard deviation are typi-
cally where there are numerous GPS data. These
higher accuracy data were preserved in the simulation
process, resulting in all of the simulations having
similar elevation values in those regions.
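The per-pixel range described above can be sketched as a reduction across the realization axis. The function name and the tiny three-layer stack are illustrative assumptions; the cell whose value is identical in every layer mimics a location conditioned by a GPS measurement.

```python
import numpy as np

def per_pixel_range(stack):
    """Maximum minus minimum across realizations (axis 0) at each pixel."""
    stack = np.asarray(stack, dtype=float)
    return stack.max(axis=0) - stack.min(axis=0)

# Three hypothetical realizations of a 2 x 2 elevation grid (m).
stack = np.array([
    [[100.0, 200.0], [300.0, 400.0]],
    [[101.0, 195.0], [300.0, 410.0]],
    [[99.0, 205.0], [300.0, 390.0]],
])
range_grid = per_pixel_range(stack)
```

The same function applies unchanged to stacks of slope, flow accumulation, or CTI grids.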
The same white patches indicating low range of
values are obvious in the map of slope range (Fig.
10d); however, the rest of the pattern is much more
complex. The main valley bottoms and a few extre-
mely steep slopes show low range values indicating
that they were consistently defined on most of the
simulations. The majority of the area has such high
local variability in slope range that it appears random,
a result of the sensitivity of the slope calculation to
elevations within a 3 × 3 cell window. The map of
K.W. Holmes et al. / Journal of Hydrology 233 (2000) 154–173
Fig. 9. Maps and histograms of the original USGS 30-m DEM, and two randomly selected alternative DEMs (a–f). (g) and (h) show scatterplots
of elevation of each realization vs. the original DEM.
range in flow accumulation values is very different
(Fig. 10e). The distribution of this attribute on just
one DEM is highly skewed toward low values,
because it represents the total number of upslope
cells that would channel water through a particular
grid cell. The majority of the study area consists of
short slopes and ridgelines, so high flow accumulation
values occur downstream as each digital tributary
logarithmically increases the values in the main
river valleys. Small changes in the representation of
topography can have a large effect on the calculation
of flow accumulation downstream. The highest range
values occur in the main river valleys. Cells close to
the main channel sometimes have low flow routed
through them, but slight changes in flow direction
(highly sensitive to changes in aspect, curvature and
slope) could redirect all of the flow from the adjacent
hillslopes through different cells, causing a radical
range of possible values in the valley bottoms. The
wetness index, or compound topographic index (CTI),
is calculated using both slope and upslope contributing
area, which is similar to flow accumulation. The
map of the range in CTI values (Fig. 10f) on the 50
simulated grids shows the effect of DEM error is highest
in the valleys. The range of CTI values is much
smaller than that of flow accumulation, which makes
it easier to map and see details away from the main
river valleys.
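The logarithm in the CTI definition is one reason its range is so much narrower than that of flow accumulation. The sketch below assumes a common approximation that the text does not specify: contributing area per unit width is taken as (upslope cells + 1) × cell size, and slope is floored at a small hypothetical value to avoid division by zero in flat cells.

```python
import numpy as np

def compound_topographic_index(flow_accumulation, slope_degrees,
                               cell_size=30.0, min_slope_degrees=0.1):
    """CTI = ln(specific contributing area / tan(slope)).

    flow_accumulation: number of upslope cells draining to each cell.
    slope_degrees: slope angle in degrees.
    cell_size, min_slope_degrees: assumed values, not from the paper.
    """
    flow_accumulation = np.asarray(flow_accumulation, dtype=float)
    slope_degrees = np.asarray(slope_degrees, dtype=float)
    # Floor the slope so tan(slope) is never zero.
    slope = np.radians(np.maximum(slope_degrees, min_slope_degrees))
    # Assumed approximation: (upslope cells + the cell itself) * cell area / cell width.
    specific_area = (flow_accumulation + 1.0) * cell_size
    return np.log(specific_area / np.tan(slope))

# Hypothetical ridge cell (no upslope cells, 20 degrees) and
# valley cell (5000 upslope cells, 1 degree).
cti = compound_topographic_index([0, 5000], [20.0, 1.0])
```

In this toy case flow accumulation differs by a factor of thousands between the two cells, while CTI differs by only a few units.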
We calculated a series of 'deviance' grids by
subtracting the USGS DEM-derived value for each
pixel from the mean simulated value (Fig. 11). Posi-
tive values indicate the average simulated value was
higher than the USGS DEM and negative values indi-
cate the DEM was higher than the average simulated
value. The elevation deviance grid (Fig. 11a) is iden-
tical to the error dataset (Fig. 10a). The deviance of
slope (Fig. 11b) shows the DEM consistently under-
estimates the slope in the river valleys, and exhibits
high variation in areas with a large number of GPS
measurements. The roughness deviance map (Fig.
11c) indicates that the USGS DEM gently smooths
topography by consistently underestimating rough-
ness in the river valleys, although the pattern becomes
more complex in areas of higher relief. The deviance
maps of CTI and flow accumulation (Fig. 11d and e)
show that the DEM overestimates values along flow
lines, underestimates on the surrounding floodplain,
and overestimates on hillslopes (flow accumulation)
or in smaller drainages (CTI). The range in values for
flow accumulation is very large, and the mean may not
be the most representative statistic to use for this
comparison. The plan curvature deviance map (Fig.
11f) shows no obvious patterns, except for deviance at
the clustered GPS data sites. Profile curvature (not
shown) is very similar to plan curvature. Aspect (not
shown) has extreme differences in flat areas, where
aspect is difficult to measure at a 30 m resolution.
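A deviance grid as defined in this section is the per-pixel mean of the simulated values minus the USGS DEM-derived value. The following is a minimal sketch with a hypothetical function name and made-up slope values.

```python
import numpy as np

def deviance_grid(simulated_stack, usgs_grid):
    """Mean simulated value minus the USGS DEM-derived value, per pixel.

    Positive values mean the USGS DEM-derived attribute is lower than the
    simulation average; negative values mean it is higher.
    """
    simulated_stack = np.asarray(simulated_stack, dtype=float)
    usgs_grid = np.asarray(usgs_grid, dtype=float)
    return simulated_stack.mean(axis=0) - usgs_grid

# Hypothetical slope values (degrees) on a 1 x 2 grid.
usgs_slope = [[5.0, 10.0]]
simulated_slopes = [
    [[7.0, 9.0]],
    [[9.0, 11.0]],
    [[8.0, 10.0]],
]
deviance = deviance_grid(simulated_slopes, usgs_slope)
```

For a strongly skewed attribute such as flow accumulation, swapping the mean for `np.median` along the same axis would be one way to act on the caveat raised above; that substitution is a suggestion, not something the study reports doing.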
3.2. Slope failure prediction
A slightly more complex example of terrain
analysis illustrates the impact of elevation uncertainty
on DEM dependent products when higher derivatives
of the surface are combined in a simple map algebra
calculation. Cannon et al. (1998) used digital terrain
analysis to describe, monitor, and predict the effects of
shoreline erosion and slope failure in the San
Francisco Bay area during the high precipitation
Table 2
Terrain attribute descriptions

| Terrain attribute | Description |
| --- | --- |
| Elevation (Z) | Data provided on the USGS 30-m DEM. |
| Slope | Calculated from the 8 neighboring cells using the average maximum technique (Burrough and McDonnell, 1998). |
| Aspect | Azimuth of the direction of maximum slope. |
| Plan curvature | Curvature in map view (contour). |
| Profile curvature | Curvature of the slope profile, perpendicular to the contours. |
| Flow accumulation | Total cells that would contribute water to a given cell based on the accumulated weights for all cells that flow into each down slope cell. |
| Upslope flow path length | Number of cells through which water would theoretically flow to reach each cell from the nearest ridge line. |
| Roughness | Calculated at each cell as the standard deviation of elevation within a 5 cell radius circle. |
| Compound topographic index (CTI) | ln[(upslope contributing area per unit width orthogonal to the flow direction)/tan(slope angle)]; an indicator of catenary position (McKenzie et al., 2000). |
| Rock type | Digitized from a 1:24,000 USGS geology map, Los Olivos quadrangle. |
Fig. 10. Terrain attribute statistics calculated by pixel for all 50 simulated grids. Color images and animations are available from Holmes
(1999b) (web-page).
Fig. 11. Differences between USGS 30-m DEM-derived terrain attribute values and the mean of the 50 simulated values, calculated at each pixel
across the study area. Negative values indicate the USGS DEM value is higher than the average simulation value, and positive values indicate it
is lower than the average simulated value.
1997–1998 El Niño event. The authors took a very
generalized approach to predicting sources of future
debris flows, most of which in southern California
occur as shallow landslides on slopes steeper than
25° in wet colluvial soils. Cannon et al. mapped actual
debris flow source areas onto a USGS 30 m DEM
from air photos after early storms, then predicted
other areas prone to sliding based on similar slope
and concavity to the documented failed slopes. They
note that coarseness of 30 m DEM resolution, lack of
Fig. 12. Likelihood of slopes failing, based on selection of areas with greater than 25° slopes and concave plan curvature. The USGS DEM
predicts a smaller area will fail than is predicted when 50 simulated elevation surfaces are used (a). The result of analysis using the USGS DEM
is a binary map of slopes that will or will not fail (b). However, simulated grids allow one to map percent likelihood of slope failure (c) and the
standard deviation of slope failure (d), which are much more useful products for land use planning or hazard assessment.
data on watershed area, and the fact that debris flows
travel well beyond their sources prevent these maps
from being accurate predictors of hazards at specific
locations. However, the binary predictive maps they
produced were highly effective in sharpening public
awareness of the links between surface processes and
land use planning in the Bay area, and were used for
emergency preparation in high risk zones. These
predictive debris flow maps were constructed from
slope and plan curvature grids, calculated from
USGS 30 m DEMs. The authors did not take into
account the accuracy of the elevation data as an addi-
tional source of error.
As an example of how this error would affect the
results of a simple map overlay problem, we
constructed a model of potential debris flow sources
for the study area, based on the criteria discussed in
Cannon et al. (1998). All areas with slope greater than
25 degrees and positive plan curvature (concave) were
selected in the grid module of ARC/INFO, both on the
original DEM, and for each of the 50 simulated grids.
The results are shown in Fig. 12. The histogram of the
percent of the area likely to fail shows that the USGS
DEM predicts a 25% smaller area (1% of the total
field area) will fail than do the simulated grids. In
fact, the USGS DEM prediction is lower than any of
the 50 simulation predictions. During 1997–1998
many hillslopes at Sedgwick did fail, and are apparent
in air photos taken in the spring of 1998. Old debris
flows or landslides can also be seen in the DOQ. These
slope failures generally align with those areas
predicted to fail in this simple map overlay analysis.
However, the USGS DEM result failed to predict a
large area of debris flows, which blocked the access
road for close to a week. The simulated probability
shows the area had up to a 50% chance of failure.
Such a probability map would have been much
more useful for raising public awareness and emergency
planning in San Francisco than were the
binary maps produced by using only the USGS
DEM. Of course, effects from roads and other
anthropogenic factors which influence landslide
susceptibility were not taken into account.
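The overlay described in this section can be sketched as a Boolean test applied to every realization, then averaged per pixel to give a percent likelihood. The analysis itself was run in the grid module of ARC/INFO; the NumPy functions, names, and toy values below are assumptions for illustration only, and the sign convention (positive plan curvature = concave) follows the text.

```python
import numpy as np

def failure_mask(slope_degrees, plan_curvature, slope_threshold=25.0):
    """True where slope exceeds the threshold and plan curvature is positive."""
    slope_degrees = np.asarray(slope_degrees, dtype=float)
    plan_curvature = np.asarray(plan_curvature, dtype=float)
    return (slope_degrees > slope_threshold) & (plan_curvature > 0.0)

def failure_likelihood(slope_stack, curvature_stack):
    """Percent of realizations in which each pixel meets the failure criteria."""
    masks = failure_mask(slope_stack, curvature_stack)
    return 100.0 * masks.mean(axis=0)

# Four hypothetical realizations of a 1 x 3 grid.
slope_stack = np.array([
    [[30.0, 20.0, 26.0]],
    [[28.0, 22.0, 24.0]],
    [[27.0, 21.0, 26.0]],
    [[31.0, 19.0, 23.0]],
])
curvature_stack = np.full(slope_stack.shape, 0.1)  # concave everywhere

likelihood = failure_likelihood(slope_stack, curvature_stack)

# A single DEM yields only a yes/no answer for the same cells.
binary = failure_mask([[29.0, 20.0, 24.0]], [[0.1, 0.1, 0.1]])
```

In this toy case the third cell falls just under the threshold on the single grid, so the binary map marks it as stable, while the ensemble assigns it a 50% likelihood of meeting the criteria.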
4. Conclusions
Digital elevation models, like all maps, are models
which deviate from reality. New applications for
terrain analysis are proving highly popular and effec-
tive, increasing the importance of measuring and
understanding the impacts of digital elevation data
quality on modeling efforts. This investigation of the
accuracy of the Los Olivos USGS 30 m DEM has
shown that the USGS product is generally quite accu-
rate, given the reported global error estimates of
(9,9,5) m. However, there are spatial patterns of
error in this DEM, which cannot be explained by
the distribution of the terrain attributes included in
this study. Terrain attribute calculations which
compound values from a large number of cells on
the DEM are affected by elevation errors most drama-
tically. Even a small amount of elevation error can
greatly affect derivative products such as flow accumulation,
wetness index, or slope, and subsequently
all interpretations dependent on these calculations.
The case study of the effect of DEM error on hillslope
failure prediction illustrates the uncertainty DEM
error contributes to terrain modeling results. We
used high accuracy GPS data with sequential Gaus-
sian simulation to provide a Monte Carlo framework
for quantifying the effect of measured USGS 30 m
DEM error on digital terrain modeling. The spatial
resolution, precision and accuracy of DEM products
should be tested and specified prior to any type of
terrain modeling. Methods such as those applied
here improve the quantitative aspect of terrain analy-
sis by providing a means to explicitly characterize the
undocumented error in digital terrain models.
Acknowledgements
Special thanks go to Ashton Shortridge, Brenda
Franklin, Keith Clarke, Joel Michaelson, and two
anonymous reviewers for their assistance, comments
and suggestions. The authors gratefully acknowledge
support provided by a grant from NASA Earth
Science Enterprise.
References
Adkins, K.F., Merry, C.J., 1994. Accuracy assessment of elevation
data sets using the Global Positioning System. Photogr. Engng
Remote Sensing 60 (2), 195–202.
Band, L.E., 1993. Extraction of channel networks and topographic
parameters from digital elevation data. In: Beven, K., Kirkby,
M.J. (Eds.), Channel Network Hydrology, Wiley, New York,
pp. 13–42.
Bolstad, P.V., Stowe, T., 1994. An evaluation of DEM accuracy:
elevation, slope, and aspect. Photogr. Engng Remote Sensing 60
(11), 1327–1332.
Burrough, P.A., McDonnell, R.A., 1998. Principles of Geographical
Information Systems. Spatial Information Systems, Oxford
University Press, New York (333pp).
Cannon, S.H., Ellen, S.D., Graham, S.E., Graymer, R.W., Hampton,
M.A., Hillhouse, J.W., Howell, D.G., Jayko, A.S., LaHusen,
R.L., Lajoie, K.R., Pike, R.J., Ramsey, D.W., Reid, M.E.,
Richmond, B.M., Savage, W.Z., Wentworth, C.M., Wilson, R.C.,
1998. Slope failure and shoreline retreat during northern
California's latest El Niño. GSA Today 8 (8), 1–6.
Deutsch, C.V., Journel, A.G., 1998. GSLIB: Geostatistical Software
Library and User's Guide, Oxford University Press, New York.
Dietrich, W.E., Wilson, C.J., Montgomery, D.R., McKean, J., 1993.
Analysis of erosion thresholds, channel networks, and landscape
morphology using a digital terrain model. J. Geol. 101,
259–278.
Dikau, R., Brunsden, D., Schrott, L., Ibsen, M.-L. (Eds.), 1996.
Landslide Recognition: Identification, Movement and Causes,
Wiley, New York, pp. 1–12.
Ehlschlaeger, C.R., Shortridge, A.M., 1996. Modeling elevation
uncertainty in geographical analyses. In: Kraak, M.J., Molenaar,
M. (Eds.), Spatial Data Handling '96, Delft, pp. 9B.15–9B.25.
ESRI, 1996. ARC/Info. ESRI, Inc, Redlands, CA.
Fisher, P.F., 1993. Algorithm and implementation uncertainty in
viewshed analysis. Int. J. Geog. Info. Sys. 7 (4), 331–347.
Giles, P.T., Franklin, S.E., 1998. An automated approach to the
classification of the slope units using digital data.
Geomorphology 21, 251–264.
Goovaerts, P., 1997. Geostatistics for Natural Resources
Evaluation, Oxford University Press, New York (483pp).
Heuvelink, G.B.M., 1998. Error propagation in environmental
modelling with GIS. Research Monographs in Geographic
Information Systems, Taylor & Francis, Bristol, PA
(127pp).
Holmes, K.W., 1999a. Calculation of error in a USGS 30-meter
digital elevation model and its effects on terrain attributes and
environmental modeling. Masters thesis, University of
California, Santa Barbara CA, 91pp.
Holmes, K.W., 1999b. Color figures and animations: available
on-line at http://www.geog.ucsb.edu/~karen.
Journel, A.G., 1996. Modelling uncertainty and spatial dependence:
Stochastic imaging. Int. J. Geog. Info. Sys. 10 (5), 517–522.
Journel, A.G., Huijbregts, C.J., 1978. Mining Geostatistics,
Academic Press, New York (600pp).
Kyriakidis, P.C., Shortridge, A.M., Goodchild, M.F., 1999.
Geostatistics for conflation and accuracy assessment of digital
elevation models. Int. J. Geog. Info. Sci. 13 (7), 677–707.
McDermid, G.J., Franklin, S.E., 1995. Remote sensing and
geomorphometric discrimination of slope processes. Z. Geomorph.
N.F., Suppl.-Bd. 101, 165–185.
McKenzie, N.J., Austin, M.P., 1993. A quantitative Australian
approach to medium and small scale surveys based on soil
stratigraphy and environmental correlation. Geoderma 57,
329–355.
McKenzie, N.J., Gessler, P.E., Ryan, P.J., O'Connell, D., 2000. The
role of terrain analysis in soil mapping (chap. 10). In: Wilson,
J.P., Gallant, J.C. (Eds.), Terrain Analysis: Principles and
Applications, Wiley, New York (in press).
Mellerowicz, K.T., Rees, H.W., Chow, T.L., Ghanem, I., 1992. Soil
conservation planning at the watershed level using the Universal
Soil Loss Equation with GIS and microcomputer technologies: a
case study. J. Soil Water Conserv. 49 (2), 194–200.
Milne, J.A., Sear, D.A., 1997. Modelling river channel topography
using GIS. Int. J. Geog. Info. Sci. 11 (5), 499–519.
Mitasova, H., Hofierka, J., Zloca, M., Iverson, L.R., 1996.
Modelling topographic potential for erosion and deposition using GIS.
Int. J. Geog. Info. Sys. 10 (5), 629–641.
Moore, I.D., Grayson, R.B., Ladson, A.R., 1990. Digital terrain
modelling: a review of hydrological, geomorphological, and
biological applications. Hydrol. Proces. 5, 3–30.
National Geodetic Survey, 1996. NGS datasheets, from DXE.EXE,
Silver Spring, MD.
Office of the County Surveyor, 1993. Station Recovery Sheets.
1017, 1018, Santa Barbara County Public Works Department,
Santa Barbara, CA.
Polidori, L., Chorowicz, J., Guillande, R., 1991. Description of
terrain as a fractal surface, and application to digital elevation
model quality assessment. Photogr. Engng Remote Sensing 57
(10), 1329–1332.
Shortridge, A.M., 1997. Characterizing the relationship between
7.5′ and 1 degree digital elevation models. Masters thesis,
University of California, Santa Barbara, 70pp.
Statistical Sciences, 1995. S-PLUS. StatSci, a division of MathSoft
Inc., Seattle.
Thornton, P.E., Running, S.W., White, M.A., 1997. Generating
surfaces of daily meteorological variables over large regions
of complex terrain. J. Hydrol. 190 (3-4), 214–250.
Trimble Navigation Ltd, 1996. TRIMMAP User's Manual, Version
6.0, Sunnyvale, CA.
US Geological Survey, 1987. Digital Elevation Models: Data Users
Guide 5. Department of the Interior, USGS, Reston, Virginia,
38pp.
US Geological Survey, 1998. USGS Geospatial Data Clearinghouse,
National Mapping and Remotely Sensed Data: Digital
Elevation Models (DEMs): available on-line at
http://edcwww.cr.usgs.gov/nsdi/gendem.htm.
Weibel, R., Heller, M., 1990. A framework for digital terrain
modelling. In: Fourth International Symposium on Spatial Data
Handling, Zurich, Switzerland, pp. 219–229.