# Applied Spatial Statistics in R, Section 5: Geostatistics


• Applied Spatial Statistics in R, Section 5: Geostatistics

Yuri M. Zhukov

IQSS, Harvard University

January 16, 2010

Yuri M. Zhukov (IQSS, Harvard University) Applied Spatial Statistics in R, Section 5 January 16, 2010 1 / 1

• Geostatistics

Outline

1. Introduction
   - Why use spatial methods?
   - The spatial autoregressive data generating process
2. Spatial Data and Basic Visualization in R
   - Points
   - Polygons
   - Grids
3. Spatial Autocorrelation
4. Spatial Weights
5. Point Processes
6. Geostatistics
7. Spatial Regression
   - Models for continuous dependent variables
   - Models for categorical dependent variables
   - Spatiotemporal models


• Geostatistics

Geostatistics

What if the pattern of point locations is not of primary interest? You may wish to...

determine where new data should be collected,

identify which observations are spatial outliers,

perform spatial prediction,

interpolate missing data from nearby observed locations,

estimate local averages of spatially autocorrelated variables.

These problems are the domain of a subfield called geostatistics.


• Geostatistics

Geostatistics

Let's take a look at some data on U.S. air strikes in Laos during the Vietnam War. The variable we are interested in is LOAD LBS, the payload of each bomb dropped.

[Figure: map of air strike locations in Laos; longitude axis 95E to 110E, latitude axis 16N to 21N.]


• Geostatistics

Geostatistics: IDW Interpolation

Spatial interpolation is the prediction of values of attributes at unsampled locations $x_0$ from existing measurements at $x_i$.

This procedure converts a sample of point observations into an alternative representation, such as a contour map or grid.

One approach to interpolation is to use a locally weighted average of nearby values.

Inverse-distance weighted (IDW) interpolation computes one such weighted average:

$$\hat{Z}(x_0) = \sum_{i=1}^{n} w_{i0} Z(x_i)$$

where the weights $w_{ij}$ are determined by the distance $d_{ij}$ between points $x_i$ and $x_j$, scaled by the power parameter $k$:

$$w_{ij} = \frac{1}{d_{ij}^{k}}$$
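The slides leave the code to a separate R tutorial script, so the following is only a minimal sketch of the IDW formula in plain Python. The function name `idw_predict` and the toy data are hypothetical. The weights are also normalized to sum to one, which is an assumption here: the formula above leaves normalization implicit, but it is needed for the prediction to be a weighted average.

```python
import math

def idw_predict(points, values, x0, k=2.0):
    """Inverse-distance weighted prediction at location x0.

    points: list of (x, y) tuples; values: observed Z(x_i); k: distance power.
    Weights 1 / d^k are normalized to sum to one (an assumption, see lead-in).
    """
    weights = []
    for (px, py) in points:
        d = math.hypot(px - x0[0], py - x0[1])
        if d == 0.0:
            # x0 coincides with an observation: return that observation
            return values[len(weights)]
        weights.append(1.0 / d ** k)
    total = sum(weights)
    return sum(w * z for w, z in zip(weights, values)) / total

# Hypothetical toy data (not the Laos data): four corners of a unit square
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]

# The center is equidistant from all four points, so the prediction is
# the plain average of the four values for any k.
center = idw_predict(points, values, (0.5, 0.5), k=1.0)
```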


• Geostatistics

Geostatistics: IDW Interpolation

IDW predictions for the Laos bombing data are shown below for three values of k.

Values of k > 1 reduce the relative impact of distant points and produce a peaky map.

Values of k < 1 increase the impact of distant points and produce a smooth map.

[Figure: IDW predictions in three panels. k = 1/5: legend range 7.1 to 7.7. k = 1: legend range 6 to 10. k = 5: legend range 6 to 12.]
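The effect of k on smoothness can be checked numerically. The sketch below uses made-up observations (not the Laos data), a hypothetical helper `idw` with normalized weights, and a prediction grid offset so that no grid cell coincides with an observation. It compares the spread of predictions (maximum minus minimum) for the three values of k used in the panels.

```python
import math

def idw(points, values, x0, k):
    # Normalized inverse-distance weights (normalization is an assumption)
    num = den = 0.0
    for (px, py), z in zip(points, values):
        d = math.hypot(px - x0[0], py - x0[1])
        if d == 0.0:
            return z
        w = d ** -k
        num += w * z
        den += w
    return num / den

# Hypothetical observations inside the unit square
points = [(0.1, 0.2), (0.8, 0.3), (0.4, 0.9), (0.6, 0.5), (0.2, 0.7)]
values = [6.0, 12.0, 8.0, 7.0, 10.0]

# 20 x 20 grid of cell centers (offset by half a cell from the observations)
grid = [((i + 0.5) / 20, (j + 0.5) / 20) for i in range(20) for j in range(20)]

def spread(k):
    """Range of predicted values over the grid for a given power k."""
    preds = [idw(points, values, g, k) for g in grid]
    return max(preds) - min(preds)

spreads = {k: spread(k) for k in (0.2, 1.0, 5.0)}
```

For this toy data the spread grows with k: a small k pulls every prediction toward the overall mean, while a large k lets each prediction track its nearest observation, which matches the narrower legend range in the k = 1/5 panel.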


• Geostatistics

Geostatistics: Variogram

In geostatistics, spatial autocorrelation has traditionally been modelled by a variogram, which describes the degree to which nearby locations have similar values.

A variogram cloud is a scatterplot of data pairs, in which the semivariance is plotted against interpoint distance.

The semivariance $\gamma(d)$ is formally defined as half the average squared difference in value between locations separated by distance $d$:

$$\gamma(d) = \frac{1}{2n(d)} \sum_{d_{ij} = d} \left( Z(x_i) - Z(x_j) \right)^2$$

where $n(d)$ is the number of point pairs separated by distance $d$, and $Z(x_i)$ is the value of a variable at location $x_i$.
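As a rough illustration of these definitions, the sketch below computes a variogram cloud and then averages it within distance bins to obtain a sample variogram. The function names, the bin width, and the toy data are all hypothetical. Binning is a practical assumption: with real coordinates almost no two pairs share exactly the same distance, so pairs are grouped into distance intervals.

```python
import math
from itertools import combinations

def variogram_cloud(points, values):
    """Return (distance, semivariance) for every pair of observations."""
    cloud = []
    for i, j in combinations(range(len(points)), 2):
        d = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        cloud.append((d, 0.5 * (values[i] - values[j]) ** 2))
    return cloud

def sample_variogram(cloud, width, cutoff):
    """Average the cloud within bins; returns (mean distance, gamma, n pairs)."""
    nbins = int(math.ceil(cutoff / width))
    d_sum = [0.0] * nbins
    g_sum = [0.0] * nbins
    count = [0] * nbins
    for d, g in cloud:
        b = int(d / width)
        if d < cutoff and b < nbins:
            d_sum[b] += d
            g_sum[b] += g
            count[b] += 1
    # Averaging 0.5 * (difference)^2 over n(d) pairs equals the formula above
    return [(d_sum[b] / count[b], g_sum[b] / count[b], count[b])
            for b in range(nbins) if count[b]]

# Hypothetical toy data: four points on a line with increasing values
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]

cloud = variogram_cloud(points, values)          # 6 point pairs
vg = sample_variogram(cloud, width=1.0, cutoff=4.0)
```

In this toy example the semivariance rises with distance (three pairs at distance 1, two at distance 2, one at distance 3), which is the pattern expected under positive spatial autocorrelation.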


• Geostatistics

Geostatistics: Variogram

Below is the variogram cloud for Laos bomb load (natural log).

Upper left corner: point pairs are close together, but have very different values.

Lower left corner: close together, similar values.

Upper right corner: far apart, different values.

Lower right corner: far apart, similar values.


• Geostatistics

Geostatistics: Variogram

A variogram can be used to identify outliers...

[Figure: map of selected point pairs; x-axis: LONG (102.5 to 105.0), y-axis: LAT (18.0 to 19.0).]


• Geostatistics

Geostatistics: Variogram

To test the null hypothesis that an increase in semivariance with distance is due to chance, we can use simulation to generate 100 spatially random datasets and check whether the sample variogram falls within the range of the random variograms. As shown below, the complete spatial randomness (CSR) hypothesis seems unlikely for the Laos bombing data.

[Figure: sample variogram plotted against the variograms of the 100 random datasets; x-axis: distance (0.0 to 1.0), y-axis: semivariance (2 to 6).]
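One way to build such a test is sketched below, under the assumption that "spatially random datasets" are created by randomly reassigning the observed values to the observed locations (the slides do not spell out the procedure). The data are simulated with a deliberate west-to-east trend, and the helper `binned_variogram`, the bin width, and the seed are hypothetical choices.

```python
import math
import random
from itertools import combinations

def binned_variogram(points, values, width, nbins):
    """Mean semivariance per distance bin (bins are assumed non-empty)."""
    sums = [0.0] * nbins
    counts = [0] * nbins
    for i, j in combinations(range(len(points)), 2):
        d = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        b = int(d / width)
        if b < nbins:
            sums[b] += 0.5 * (values[i] - values[j]) ** 2
            counts[b] += 1
    return [s / c for s, c in zip(sums, counts)]

rng = random.Random(1)

# Hypothetical data: 60 random locations, values with a smooth trend plus noise
points = [(rng.random(), rng.random()) for _ in range(60)]
values = [5.0 * x + rng.gauss(0.0, 0.3) for x, _ in points]

observed = binned_variogram(points, values, width=0.1, nbins=5)

# 100 random datasets: same values, randomly reassigned to locations
simulated = []
for _ in range(100):
    shuffled = values[:]
    rng.shuffle(shuffled)
    simulated.append(binned_variogram(points, shuffled, 0.1, 5))

# Envelope of the random variograms in each distance bin
lower = [min(sim[b] for sim in simulated) for b in range(5)]
upper = [max(sim[b] for sim in simulated) for b in range(5)]
outside = [not (lo <= obs <= hi) for obs, lo, hi in zip(observed, lower, upper)]
```

If the observed semivariance at short distances falls below the envelope, as it does for this strongly trended toy data, nearby values are more alike than random reassignment would produce.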


• Geostatistics

Geostatistics: Variogram

The variogram can be used for spatial prediction.

This can be done by fitting a parametric model to the variogram.

In the Laos example below, an exponential model was used.

The shape of the curve indicates that at small separation distances, the variance in z is small. Beyond a certain separation (about 0.5 degrees), differences in z values no longer depend on distance, and the model flattens out to a value corresponding to the average variance.

[Figure: sample variogram with fitted exponential model; x-axis: distance (0.2 to 1.0), y-axis: semivariance (1 to 5).]
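A common parameterization of the exponential model is $\gamma(d) = c_0 + c_1 (1 - e^{-d/a})$, with nugget $c_0$, partial sill $c_1$, and range parameter $a$. The slides do not state which parameterization or fitting method was used, so the sketch below is only one possible approach: a grid search over $a$ with an ordinary least-squares fit of $c_0$ and $c_1$ at each candidate. The "sample variogram" is generated from made-up parameters, not from the Laos data.

```python
import math

def exp_model(d, nugget, psill, a):
    """Exponential variogram model: nugget + psill * (1 - exp(-d / a))."""
    return nugget + psill * (1.0 - math.exp(-d / a))

def fit_exponential(dists, gammas, a_grid):
    """Least-squares fit: grid search over a, closed-form nugget and psill."""
    n = len(dists)
    gbar = sum(gammas) / n
    best = None
    for a in a_grid:
        # For fixed a the model is linear in f = 1 - exp(-d / a)
        f = [1.0 - math.exp(-d / a) for d in dists]
        fbar = sum(f) / n
        sff = sum((fi - fbar) ** 2 for fi in f)
        sfg = sum((fi - fbar) * (gi - gbar) for fi, gi in zip(f, gammas))
        psill = sfg / sff
        nugget = gbar - psill * fbar
        sse = sum((gi - nugget - psill * fi) ** 2 for fi, gi in zip(f, gammas))
        if best is None or sse < best[0]:
            best = (sse, nugget, psill, a)
    return best

# Hypothetical sample variogram generated from known parameters
dists = [0.05 * i for i in range(1, 21)]            # 0.05, 0.10, ..., 1.00
gammas = [exp_model(d, 0.5, 3.5, 0.15) for d in dists]

a_grid = [i / 100 for i in range(1, 101)]           # candidates 0.01 ... 1.00
sse, nugget, psill, a = fit_exponential(dists, gammas, a_grid)
```

With this parameterization the curve reaches about 95% of its sill at a distance of roughly $3a$, so $a = 0.15$ flattens out near 0.45, comparable to the behavior described above.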


• Geostatistics

Geostatistics: Ordinary Kriging

Kriging is used to interpolate a value $\hat{Z}(x_0)$ of a random field $Z(x)$ at an unobserved location $x_0$, using data from observed locations $x_i$.

Kriging allows the variance to be non-constant, dependent on the distance between points as modeled by the variogram $\gamma(d)$.

The kriging estimator is given by

$$\hat{Z}(x_0) = \sum_{i=1}^{n} w_i(x_0) Z(x_i)$$

where $w_i(x_0)$, $i = 1, \ldots, n$, is a spatial weight.

Kriging is very similar to IDW interpolation, except that the weights used in kriging are based on the model variogram, rather than an arbitrary function of distance.


• Geostatistics

Geostatistics: Ordinary Kriging

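To interpolate at a point $x_0$ based on points $x_1, \ldots, x_n$, the weights $w_1, \ldots, w_n$ must be found. This can be done by solving the system of linear equations:

$$
\begin{pmatrix}
\gamma(d_{11}) & \gamma(d_{12}) & \cdots & \gamma(d_{1n}) & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\gamma(d_{n1}) & \gamma(d_{n2}) & \cdots & \gamma(d_{nn}) & 1 \\
1 & 1 & \cdots & 1 & 0
\end{pmatrix}
\begin{pmatrix} w_1 \\ \vdots \\ w_n \\ \lambda \end{pmatrix}
=
\begin{pmatrix} \gamma(d_{10}) \\ \vdots \\ \gamma(d_{n0}) \\ 1 \end{pmatrix}
$$

where $\gamma(d_{ij})$ is the semivariance for the distance between points $x_i$ and $x_j$, and $\lambda$ is a Lagrange multiplier that enforces the constraint that the weights sum to one. This constraint is what allows the trend to remain unknown.

Ordinary kriging assumes an unknown constant trend: $\mu(x) = \mu$.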


• Geostatistics

Geostatistics: Ordinary Kriging

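Once the weights are estimated, interpolation by ordinary kriging is given by:

$$
\hat{Z}(x_0) =
\begin{pmatrix} w_1 & \cdots & w_n \end{pmatrix}
\begin{pmatrix} Z(x_1) \\ \vdots \\ Z(x_n) \end{pmatrix}
$$

The ordinary kriging error variance is given by:

$$
\operatorname{var}\left( \hat{Z}(x_0) - Z(x_0) \right) =
\begin{pmatrix} w_1 & \cdots & w_n & \lambda \end{pmatrix}
\begin{pmatrix} \gamma(d_{10}) \\ \vdots \\ \gamma(d_{n0}) \\ 1 \end{pmatrix}
$$

where $\lambda$ is the Lagrange multiplier obtained together with the weights when solving the kriging system.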
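The kriging system, prediction, and error variance can be put together in a short sketch. Everything named here is hypothetical: the exponential variogram `gamma_exp` with made-up parameters and no nugget, the small Gaussian-elimination helper `solve`, and the four-point toy dataset. A real analysis would use a variogram fitted to the data and a library solver.

```python
import math

def gamma_exp(d, psill=1.0, a=0.5):
    """Exponential variogram with hypothetical parameters; gamma(0) = 0."""
    return psill * (1.0 - math.exp(-d / a))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ordinary_kriging(points, values, x0, gamma):
    """Return (prediction, error variance, weights) at location x0."""
    n = len(points)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    # Left-hand side: semivariances between observations, bordered by ones
    A = [[gamma(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    # Right-hand side: semivariances to the target, then the constraint value
    b = [gamma(dist(p, x0)) for p in points] + [1.0]
    sol = solve(A, b)
    w, lam = sol[:n], sol[n]
    pred = sum(wi * zi for wi, zi in zip(w, values))
    var = sum(wi * gi for wi, gi in zip(w, b[:n])) + lam
    return pred, var, w

# Hypothetical toy data: four corners of a unit square
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]

pred_c, var_c, w_c = ordinary_kriging(points, values, (0.5, 0.5), gamma_exp)
pred_0, var_0, w_0 = ordinary_kriging(points, values, points[0], gamma_exp)
```

Two checks follow from the equations: the weights sum to one at any target, and with no nugget the prediction at an observed location reproduces the observation with zero error variance. At the center of the square, symmetry gives each point a weight of one quarter.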


• Geostatistics

Geostatistics: Ordinary Kriging

Predicted values and variance for ordinary kriging are shown below for the Laos bombing data.

[Figure: Ordinary Kriging, Bomb Load (log); two panels: Predictions and Variance; legend range 0 to 12.]


• Geostatistics

Examples in R

Switch to R tutorial script. Section 5.
