Lecture 16: Spatial Sampling and Interpolation

By Austin Troy

Using GIS: Introduction to GIS

©2005 Austin Troy

How are raster surfaces made?

•Raster surfaces are generally made in one of two ways:

•From remote sensing (covered later), which collects reflectance values at every pixel within the geographic extent; these can be classified later on, or

•From sample points whose Z values are interpolated across space to fill in all the blank areas.


What is interpolation?

•Process of creating a surface based on values at isolated sample points.

•Sample points are locations where we collect data on some phenomenon and record the spatial coordinates

•We use mathematical estimation to “guess at” what the values are “in between” those points

•We can create either a raster or vector interpolated surface

•Interpolation is used because field data are expensive to collect, and can’t be collected everywhere


How does it look?

•Let's say we have our groundwater pollution samples

•Interpolating those samples gives us a continuous surface

How does it work?

•This can be displayed as a 3D trend surface in 3D Analyst

How does it work?

•We can also use interpolation methods to create contours

Sample points

•Also known as “control points.”

•These are points where you or someone else has collected data (attributes) for a spatial coordinate (point)

•Any number of attributes can be collected at that point

•E.g. 1: weather stations collect data on temperature, rainfall, wind, humidity, etc.

•E.g. 2: soil invertebrate samples would record the abundance of numerous species at each location

What isn’t interpolation?

•Interpolation only works where values are spatially dependent, or spatially autocorrelated, that is, where nearby locations tend to have similar Z values.

•Examples of spatially autocorrelated features: elevation, property value, crime levels, precipitation

•Non-autocorrelated examples: number of drum sets per city block; cheeseburgers consumed per household.

•Where values across a landscape are geographically independent, interpolation does not work because the value at (x,y) cannot be used to predict the value at (x+1, y+1).
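One way to check whether values are spatially autocorrelated before interpolating is to compute a statistic such as Moran's I. The lecture does not name this statistic; it is brought in here only as an illustration. The sketch below assumes a simple one-dimensional transect where each location neighbors the ones next to it, and the function names `morans_i` and `chain_weights` are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square neighbor-weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations for every pair of neighbors
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def chain_weights(n):
    """Assumed neighborhood: locations along a transect, adjacent ones are neighbors."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
smooth = [1, 2, 3, 4, 5, 6]        # nearby values are similar (elevation-like)
alternating = [1, 6, 1, 6, 1, 6]   # neighbors are as different as possible

i_smooth = morans_i(smooth, w)      # 0.6: positive autocorrelation
i_alt = morans_i(alternating, w)    # -1.0: negative autocorrelation
print(i_smooth, i_alt)
```

Values near zero would suggest geographically independent data, where interpolation is unlikely to help.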


Interpolation examples

•Elevation:

Source: Lubos Mitas and Helena Mitasova, University of Illinois

Interpolation examples

•Elevation:

•Elevation values tend to be highly spatially autocorrelated because elevation at location (x,y) is generally a function of the surrounding locations

•Except in areas where terrain is very abrupt and precipitous, such as Patagonia or Yosemite

•In this case, elevation would not be autocorrelated at the local (large) scale, but may still be autocorrelated at the regional (small) scale

Interpolation examples

•Imagine this elevation cross section: if each dashed line represented a sample point (in 1-D), this spacing would miss major local sources of variation, like the gorge

Interpolation examples

•Our interpolated surface (represented in 1-D by the blue line) would look like this

Interpolation examples

•If we increased the sampling rate, we would pick up that local variation

Interpolation examples

•Here our interpolated surface is much closer to reality at the local level, but we pay for this in the form of higher data-gathering cost
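The gorge example can be sketched in a few lines of code. This is a minimal illustration, assuming a made-up 1-D elevation profile (a 100 m plateau with a gorge between x = 4 and x = 6) and plain linear interpolation between neighboring samples; the profile, spacings, and function names are all hypothetical.

```python
def terrain(x):
    """Hypothetical profile: plateau at 100 m with a gorge dropping to 20 m."""
    return 20.0 if 4.0 < x < 6.0 else 100.0


def take_samples(spacing, length=12.0):
    """Sample the profile at regular intervals along the transect."""
    n = int(length / spacing)
    xs = [i * spacing for i in range(n + 1)]
    return xs, [terrain(x) for x in xs]


def linear_interp(xs, zs, x):
    """Estimate z at x from the two sample points that bracket it."""
    for (x0, z0), (x1, z1) in zip(zip(xs, zs), zip(xs[1:], zs[1:])):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return z0 + t * (z1 - z0)
    raise ValueError("x is outside the sampled range")


sparse_xs, sparse_zs = take_samples(4.0)  # samples at 0, 4, 8, 12
dense_xs, dense_zs = take_samples(1.0)    # samples at 0, 1, ..., 12

# Estimate the elevation in the middle of the gorge (true value: 20 m)
sparse_estimate = linear_interp(sparse_xs, sparse_zs, 5.0)
dense_estimate = linear_interp(dense_xs, dense_zs, 5.0)
print(sparse_estimate, dense_estimate)
```

With the coarse spacing every sample lands on the plateau, so the gorge never appears in the interpolated profile; the finer spacing recovers it at four times the number of samples.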

Interpolation examples

•Weather

•Weather tends to be modeled on a regional level (e.g. your local weather report) because, in most places, weather systems and trends happen over a very large area. Hence the need for sample point density is not so great

•In other places, local climate variability is very great, such as in the SF Bay Area where temperatures can vary 50 degrees within 10 miles due to ocean effects.

Interpolation examples

•Weather

•Weather is also extremely variable over time, so samples must be continually taken. This is why weather stations are usually permanent

Source: Lubos Mitas and Helena Mitasova, University of Illinois

Example: precipitation varying over a season

Interpolation examples

•Groundwater contamination:

•The needed density of points will depend on the geology and the type of terrain

•Areas where geology allows for free groundwater flows across large areas will have less local variation and need less dense points, while areas with geologic features that inhibit or redirect flow (e.g. karst topography) will need denser points

Where interpolation does not work

•We cannot use interpolation where values are not spatially autocorrelated

•Say we are looking at household income: in an income-segregated city, you could take a small sample of household incomes and probably interpolate

•However, in a highly income-integrated city, where a given block has both rich and poor households, this would not work

Sampling

•As you can see, the density and spacing of samples depend on many things

•A key component of any study with spatially referenced field data is the sampling strategy

•If the values in your interpolation surface (layer A) depend on some factor in layer B, then we can design our sample of A based on layer B

•We can do this by conducting a stratified random sample

Sampling

•Example: let’s say we want to make an average precipitation layer and we find that in our study zone precipitation is highly spatially variable within 10 miles of the ocean

•We’d use a coastline layer to help us sample.

•We’d have a high density of sampling points within 10 miles of the ocean and a much lower density in the inland zones

Sampling

•Say we were looking at an inland area, far from any ocean, and we decided that precipitation varied with elevation. How would we set up our sampling design?

•In this case, flat areas would need fewer sample points, while areas of rough topography would need more

•In our sampling design we would set up zones, or strata, corresponding to different elevation zones and we would make sure that we get a certain minimum number of samples within each of those zones

•This ensures we get a representative sample across, in this case, elevation
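A stratified random sample along these lines might be drawn as follows. This is only a sketch: the candidate sites, the three elevation zones, and the ten-per-zone minimum are invented for illustration, and `stratified_sample` and `elevation_zone` are hypothetical names.

```python
import random


def stratified_sample(candidates, stratum_of, per_stratum, seed=0):
    """Randomly pick up to per_stratum candidates from each stratum."""
    rng = random.Random(seed)
    groups = {}
    for c in candidates:
        groups.setdefault(stratum_of(c), []).append(c)
    return {name: rng.sample(members, min(per_stratum, len(members)))
            for name, members in groups.items()}


def elevation_zone(site):
    """Assumed strata: three broad elevation bands, in meters."""
    elevation = site[2]
    if elevation < 1000:
        return "low"
    if elevation < 2000:
        return "mid"
    return "high"


# Hypothetical candidate sites as (x, y, elevation_m)
site_rng = random.Random(42)
sites = [(site_rng.uniform(0, 10), site_rng.uniform(0, 10),
          site_rng.uniform(0, 3000)) for _ in range(500)]

chosen = stratified_sample(sites, elevation_zone, per_stratum=10)
for zone, picks in chosen.items():
    print(zone, len(picks))
```

Narrower elevation bands would give finer control over representativeness, at the cost of more strata to fill.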

Sampling

•The number of zones we use will determine how representative our sample is; if zones are big and broad, we do not ensure that all elevation ranges are represented

Sampling

•The number of samples we want within each zone depends on the statistical certainty with which we want to generate our surface

•Do we want to be 95% certain that a given pixel is classified correctly, or 90%, or 80%?

•Our desired confidence level will determine the number of samples we need per stratum

•This is a tradeoff between cost and statistical certainty

•Think of other examples where you could stratify….
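The cost versus certainty tradeoff can be made concrete with a textbook sample-size calculation. The lecture gives no formula; the sketch below assumes the standard rule for estimating a mean within one stratum, n = (z * sigma / E)^2, with roughly normal values and simple random sampling inside the stratum. The standard deviation (10) and margin of error (5) are invented numbers.

```python
import math
from statistics import NormalDist


def samples_needed(confidence, std_dev, margin):
    """Samples needed to estimate a stratum mean to within +/- margin."""
    # Two-sided critical value for the requested confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * std_dev / margin) ** 2)


n95 = samples_needed(0.95, std_dev=10.0, margin=5.0)  # 16
n90 = samples_needed(0.90, std_dev=10.0, margin=5.0)  # 11
n80 = samples_needed(0.80, std_dev=10.0, margin=5.0)  # 7
print(n95, n90, n80)
```

Under these assumptions, moving from 80% to 95% confidence more than doubles the field effort in that stratum.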

Sampling

•A common problem with sampling points for interpolation is what is not being sampled

•Very frequently people leave out sample points that are hard to get to or hard to collect data at

•This creates sampling biases and regions whose interpolated values are essentially meaningless

•So the spacing of sample points for interpolation should be based on some meaningful factor: if they are dense in one region and sparse in another, it should be because the values are variable in the first area and homogeneous in the other

Sampling and scale dependency

•Sampling strategy for interpolation depends on the scale at which you are working and the scale dependency of the phenomenon you are studying

•In many cases interpolation will work to pick up regional trends but lose the local variation in the process

•The density of sample points must be chosen to reflect the scale of the phenomenon you are measuring.

Scale dependency

•If you have a high density of sample points, you will capture local variation, which is appropriate for large-scale (small-area) studies

•If you have a low density of sample points, you will lose sensitivity to local variation and capture only the regional variation; this is more appropriate for small-scale (large-area) studies

How does interpolation work?

•In ArcGIS, to interpolate:

•Create or add a point shapefile with some attribute that will be used as a Z value

•Click Spatial Analyst>>Interpolate to Raster and then choose the method

Inverse Distance Weighting

•IDW weights the value of each sample point by the inverse of its distance to the cell being analyzed and then averages the weighted values.

•IDW assumes that an unknown value is influenced more by nearby points than by faraway points, but we can control how rapid that decay is. “Influence diminishes with distance.”

•“To predict a value for any unmeasured location, IDW will use the measured values surrounding the prediction location. Those measured values closest to the prediction location will have more influence on the predicted value than those farther away. It weights the points closer to the prediction location greater than those farther away, hence the name inverse distance weighted.” From ArcGIS help
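The weighting described above can be sketched directly. This is a minimal version that uses every sample point (no search radius or neighbor limit) and weights each one by 1 / distance^power; the function name `idw` and the two sample points are assumptions for illustration, not the ArcGIS implementation.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (x, y, z) samples."""
    num = 0.0
    den = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return sz  # exactly on a sample point: use its value
        w = 1.0 / d ** power
        num += w * sz
        den += w
    return num / den


samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]

midpoint = idw(samples, 5.0, 0.0)            # equal weights, about 15
near_p1 = idw(samples, 1.0, 0.0, power=1.0)  # about 11
near_p2 = idw(samples, 1.0, 0.0, power=2.0)  # closer to 10 than with power 1
print(midpoint, near_p1, near_p2)
```

Raising the power makes influence decay faster with distance, so the estimate near a sample point is pulled more strongly toward that point's value.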

Spline Method

•Another option for interpolation method

•This fits a curve through the sample data and assigns values to other locations based on their location on the curve

•Thin plate splines create a surface that passes through the sample points with the least possible change in slope at all points, that is, a minimum-curvature surface

•SPLINE has two types: regularized and tension

•Tension results in a rougher surface that more closely adheres to abrupt changes in sample points

•Regularized results in a smoother surface that smooths out abruptly changing values somewhat
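To make the idea of a surface that passes exactly through the sample points more concrete, here is a sketch of a basic thin plate spline. It is not the regularized or tension variant described above; it is the plain textbook form, f(x, y) = a0 + a1*x + a2*y + sum of w_i * r^2 * ln(r), solved with a small hand-written linear solver. All function names and sample points are hypothetical, and at least three non-collinear points are assumed.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def tps_kernel(r):
    """Thin plate spline radial basis: r^2 * ln(r), defined as 0 at r = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r)


def fit_tps(points):
    """Return a function f(x, y) that passes through every (x, y, z) point."""
    n = len(points)
    size = n + 3
    a = [[0.0] * size for _ in range(size)]
    b = [0.0] * size
    for i, (xi, yi, zi) in enumerate(points):
        for j, (xj, yj, _) in enumerate(points):
            a[i][j] = tps_kernel(math.hypot(xi - xj, yi - yj))
        a[i][n:] = [1.0, xi, yi]                          # planar trend terms
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi   # side conditions
        b[i] = zi
    coef = solve(a, b)

    def surface(x, y):
        w, (a0, a1, a2) = coef[:n], coef[n:]
        bend = sum(wi * tps_kernel(math.hypot(x - px, y - py))
                   for wi, (px, py, _) in zip(w, points))
        return a0 + a1 * x + a2 * y + bend

    return surface


points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0),
          (1.0, 1.0, 5.0), (0.5, 0.5, 2.0)]
surface = fit_tps(points)
print([round(surface(x, y), 6) for x, y, _ in points])
```

Unlike IDW, the fitted surface can rise above or dip below the range of the sample values between points.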

Kriging Method

•Semivariograms measure the strength of statistical correlation as a function of distance; they quantify spatial autocorrelation

•Because Kriging is based on the semivariogram, it is probabilistic, while IDW and Spline are deterministic

•Kriging associates some probability with each prediction, hence it provides not just a surface, but some measure of the accuracy of that surface

•Kriging equations are determined by fitting a line through the points so as to minimize the weighted sum of squares between the points and the line

•These equations are weighted based on spatial autocorrelation, which is determined from the semivariograms
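The semivariogram that kriging relies on can be estimated from the sample points themselves. The sketch below computes an empirical semivariogram, assuming the usual estimator (half the average squared difference between pairs of points, grouped by separation distance); it stops short of fitting a model or solving the kriging equations, and the function name and sample values are hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(samples, lag_width):
    """Semivariance per distance bin: sum of squared differences / (2 * pairs)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            h = math.hypot(xi - xj, yi - yj)
            lag_bin = int(h // lag_width)  # bin k covers [k*lag_width, (k+1)*lag_width)
            sums[lag_bin] += (zi - zj) ** 2
            counts[lag_bin] += 1
    return {k: sums[k] / (2 * counts[k]) for k in sorted(counts)}


# Hypothetical transect where z rises steadily with x
samples = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0), (3.0, 0.0, 3.0)]
gamma = empirical_semivariogram(samples, lag_width=1.0)
print(gamma)  # {1: 0.5, 2: 2.0, 3: 4.5}
```

Semivariance that grows with distance, as here, is the signature of spatial autocorrelation: nearby points are more alike than distant ones.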

Example

•Here are some sample elevation points from which surfaces were derived using the three methods

Example: Spline

•Note how smooth the curves of the terrain are; this is because Spline is fitting a simple polynomial equation through the points

Example: IDW

•Done with P = 2. Notice how it is not as smooth as Spline. This is because of the weighting function introduced through P, the power applied to distance

Example: Kriging

•This one is somewhere in between, because it fits an equation through the points but weights it based on probabilities

Other methods of interpolation

•Thiessen polygons

•This method builds polygons, rather than a raster surface, from control points

•It “grows” polygons around sample points; each polygon is supposed to represent an area of homogeneity

Source: Jens-Ulrich Nomme http://www.tu-harburg.de/sb3/pssd/GIS-Methods/thiessen.html
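A Thiessen polygon contains every location that is closer to its sample point than to any other, so the same partition can be approximated on a grid by giving each cell the value of its nearest sample. The sketch below does that rather than building polygon geometry; the sample points, grid size, and function name are hypothetical.

```python
import math


def thiessen_value(samples, x, y):
    """Value of the sample point nearest to (x, y)."""
    nearest = min(samples, key=lambda s: math.hypot(x - s[0], y - s[1]))
    return nearest[2]


samples = [(2.0, 2.0, 10.0), (8.0, 2.0, 20.0), (5.0, 8.0, 30.0)]

# Label the center of each cell in a 10 x 10 grid with its nearest sample's value
grid = [[thiessen_value(samples, col + 0.5, row + 0.5) for col in range(10)]
        for row in range(10)]

for row in reversed(grid):  # print with y increasing upward
    print(" ".join(f"{int(v):2d}" for v in row))
```

The result is a surface of flat patches with abrupt edges, in contrast to the gradual surfaces produced by IDW, Spline, and Kriging.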

Density Functions

•We can also use sample points to map out density raster surfaces. This need not require a z value at each point; it can simply be based on the abundance and distribution of points.

Density Functions

•These settings would give us a raster density surface, based just on the abundance of points within a “kernel” or data frame. In this case, a z value for each point is not necessary.
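A density value for one raster cell can be sketched as a count of points within a search radius divided by the area searched. This is the simplest case, in which every point inside the circle counts equally; a true kernel density would weight points by distance from the cell center. The radius, point locations, and function name are assumptions for illustration.

```python
import math


def point_density(points, x, y, radius):
    """Points per unit area within a circle of the given radius around (x, y)."""
    inside = sum(1 for px, py in points
                 if math.hypot(px - x, py - y) <= radius)
    return inside / (math.pi * radius ** 2)


# Hypothetical point locations: a cluster of three plus one isolated point
points = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (8.0, 8.0)]

dense_cell = point_density(points, 1.0, 1.0, radius=2.0)   # 3 points in range
sparse_cell = point_density(points, 8.0, 8.0, radius=2.0)  # 1 point in range
empty_cell = point_density(points, 5.0, 5.0, radius=2.0)   # no points in range
print(dense_cell, sparse_cell, empty_cell)
```

Evaluating this function at every cell center yields the density raster; no z value is involved at any step.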
