www.spatialanalysisonline.com Chapter 5 Part A: Spatial data exploration


Upload: christian-kent

Post on 26-Mar-2015

Page 1: Www.spatialanalysisonline.com Chapter 5 Part A: Spatial data exploration

www.spatialanalysisonline.com

Chapter 5

Part A: Spatial data exploration

3rd edition www.spatialanalysisonline.com 2

Spatial data exploration

Spatial analysis and data models (Anselin, 2002)

                      Object                    Field
GIS                   vector                    raster
Spatial Data          points, lines, polygons   surfaces
Location              discrete                  continuous
Observations          process realisation       sample
Spatial Arrangement   spatial weights           distance function
Statistical Analysis  lattice                   geostatistics
Prediction            extrapolation             interpolation
Models                lag and error             error
Asymptotics           expanding domain          infill

Sampling frameworks
• Pure random sampling
• Stratified random – by class/strata (proportionate, disproportionate)
• Randomised within defined grids
• Uniform
• Uniform with randomised offsets
• Sampling and declustering
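As an illustration of the point-sampling frameworks listed above, the following is a minimal sketch over a rectangular study area. The function names, the rectangular extent and the `max_offset` parameter (a fraction of the cell size) are assumptions made for this example and do not come from the slides.

```python
import random

def pure_random(n, xmin, ymin, xmax, ymax, rng):
    """Pure random sampling: n points drawn uniformly over the rectangle."""
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def uniform_grid(nx, ny, xmin, ymin, xmax, ymax):
    """Uniform sampling: one point at the centre of each cell of an nx-by-ny grid."""
    dx, dy = (xmax - xmin) / nx, (ymax - ymin) / ny
    return [(xmin + (i + 0.5) * dx, ymin + (j + 0.5) * dy)
            for i in range(nx) for j in range(ny)]

def randomised_within_grid(nx, ny, xmin, ymin, xmax, ymax, rng):
    """Randomised within defined grids: one uniformly random point per cell."""
    dx, dy = (xmax - xmin) / nx, (ymax - ymin) / ny
    return [(xmin + (i + rng.random()) * dx, ymin + (j + rng.random()) * dy)
            for i in range(nx) for j in range(ny)]

def uniform_with_offsets(nx, ny, xmin, ymin, xmax, ymax, max_offset, rng):
    """Uniform with randomised offsets: cell centres, each shifted by at most
    max_offset (a fraction of the cell size, hypothetical parameter) in x and y."""
    dx, dy = (xmax - xmin) / nx, (ymax - ymin) / ny
    return [(x + rng.uniform(-max_offset, max_offset) * dx,
             y + rng.uniform(-max_offset, max_offset) * dy)
            for x, y in uniform_grid(nx, ny, xmin, ymin, xmax, ymax)]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = randomised_within_grid(4, 3, 0.0, 0.0, 8.0, 6.0, rng)
```

Stratified sampling and declustering are not covered by this sketch, since both depend on attribute data rather than on geometry alone.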

Sampling frameworks – point sampling

Sampling frameworks – within zones
• Selection of 5 random points per zone
• Grid generation - square grid within field boundaries
• Grid generation (hexagonal) - selection of 1 point per cell, random offset from centre

A. 10% random sample from existing point set
800 radio-activity monitoring sites in Germany. Random sample of 80 (red/large dots)

B. Stratified random selection, 30% of each stratum
200 radio-activity monitoring sites in Germany. Random sample of 30 (red/large dots) with <100 units of radiation and 30 (crosses) with >=100 units of radiation
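The two selections described above (a simple random fraction of an existing point set, and a fixed fraction drawn from each stratum) could be sketched as follows. The record layout `(x, y, value)`, the synthetic data and the threshold-based stratum function are hypothetical and stand in for the monitoring-site data, which is not reproduced here.

```python
import random
from collections import defaultdict

def simple_random_sample(records, fraction, rng):
    """Draw round(fraction * N) records without replacement."""
    k = round(len(records) * fraction)
    return rng.sample(records, k)

def stratified_sample(records, stratum_of, fraction, rng):
    """Draw the same fraction, without replacement, from every stratum.

    stratum_of maps a record to its stratum label.
    """
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)
    return {label: rng.sample(members, round(len(members) * fraction))
            for label, members in strata.items()}

rng = random.Random(1)
# Synthetic sites: (x, y, value); values are made up for illustration only.
sites = [(rng.random(), rng.random(), rng.uniform(0, 200)) for _ in range(200)]
by_level = stratified_sample(
    sites, lambda rec: "low" if rec[2] < 100 else "high", 0.30, rng)
```

Because the fraction is applied per stratum, this is proportionate stratified sampling; a disproportionate design would pass a different fraction for each stratum.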

Random points on a network

EDA, ESDA and ESTDA
EDA – basic aims (after NIST)
• maximize insight into a data set
• uncover underlying structure
• extract important variables
• detect outliers and anomalies
• test underlying assumptions
• develop parsimonious models
• determine optimal factor settings

ESDA (see GeoDa and STARS)
Extending EDA ideas to the spatial domain (lattice/zone models)
• Brushing
• Linking
• Mapped histograms
• Outlier mapping
• Box plots
• Conditional choropleth plots
• Rate mapping

ESDA: Brushing & linking

ESDA: Histogram linkage

ESDA: Parallel coordinate plot & star plot

ESDA: Mapped box plots

ESDA: Conditional choropleth mapping

ESDA: Mapped point data
A. Variable point size
B. Variable colour
C. Semivariogram pairs
D. Voronoi analysis

ESDA: Trend analysis (continuous spatial data)

ESDA: Cluster hunting – GAM/K (steps)
1. Read data for the population at risk
2. Identify the MBR containing the data, identify starting circle radius, and degree of overlap
3. Generate a grid covering the MBR
4. For each grid-intersection generate a circle of radius r
5. Retrieve two counts for the population at risk and the variable of interest
6. Apply some “significance” test procedure
7. Keep the result if significant
8. Repeat Steps 5 to 7 until all circles have been processed
9. Increase circle radius by dr and return to Step 3 else go to Step 10
10. Create a smoothed density surface of excess incidence for the significant circles
11. Map this surface and inspect the results
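Steps 1 to 9 above can be sketched roughly as follows. This is a simplified illustration and not the GAM/K implementation itself: the population at risk and the cases are assumed to be lists of individual point locations, the "significance" test is assumed to be a one-sided Poisson test against the overall rate, and the grid spacing is assumed to be `2 * r * (1 - overlap)`. Steps 10 and 11 (smoothing and mapping) are left out.

```python
import math

def poisson_upper_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)          # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= mu / i            # P(X = i)
        cdf += term
    return max(0.0, 1.0 - cdf)

def count_within(points, cx, cy, r):
    """Number of points inside the circle of radius r centred at (cx, cy)."""
    return sum(1 for x, y in points if (x - cx) ** 2 + (y - cy) ** 2 <= r * r)

def gam_scan(population, cases, radii, overlap=0.5, alpha=0.01):
    """Return (cx, cy, r, observed, expected) for each significant circle.

    overlap must be less than 1 so that the grid spacing is positive.
    """
    xs = [x for x, _ in population]
    ys = [y for _, y in population]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)   # Step 2: MBR
    rate = len(cases) / len(population)                           # overall incidence
    significant = []
    for r in radii:                                               # Step 9: next radius
        step = 2 * r * (1 - overlap)                              # Step 3: grid spacing
        nx = int((xmax - xmin) / step) + 1
        ny = int((ymax - ymin) / step) + 1
        for i in range(nx):
            for j in range(ny):                                   # Step 4: one circle per node
                cx, cy = xmin + i * step, ymin + j * step
                at_risk = count_within(population, cx, cy, r)     # Step 5: two counts
                observed = count_within(cases, cx, cy, r)
                expected = rate * at_risk
                # Steps 6-7: keep circles with a significant excess of cases
                if observed > expected and poisson_upper_tail(observed, expected) < alpha:
                    significant.append((cx, cy, r, observed, expected))
    return significant
```

The brute-force counting is quadratic in the number of points and circles; a spatial index would normally replace `count_within` for data sets of realistic size.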

Grid-based statistics
• Univariate analysis of attribute data (non-spatial metrics)
• Cross-classification and cross-tab analyses
• Spatial pattern analysis for grid data (including Landscape metrics)
  Patch metrics; Class-level metrics; Landscape-level metrics
• Quadrat analysis
• Multi-grid regression analysis

Grid-based statistics
Landscape metrics
Non-spatial
• Proportional abundance; Richness; Evenness; Diversity
Spatial
• Patch size distribution and density; Patch shape complexity; Core Area; Isolation/Proximity; Contrast; Dispersion; Contagion and Interspersion; Subdivision; Connectivity
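The non-spatial metrics named above can be computed directly from the class counts of a categorical grid. The sketch below assumes Shannon's index for diversity and Shannon evenness (diversity divided by the log of richness); the slides do not say which definitions are intended, so these are one common choice rather than the definitive one. The function name and the example grid are hypothetical.

```python
import math
from collections import Counter

def nonspatial_metrics(grid):
    """Return (abundance, richness, diversity, evenness) for a categorical grid.

    grid is a list of rows, each a list of class labels.
    """
    counts = Counter(cell for row in grid for cell in row)
    total = sum(counts.values())
    # Proportional abundance of each class
    abundance = {label: k / total for label, k in counts.items()}
    # Richness: number of distinct classes present
    richness = len(counts)
    # Diversity: Shannon index H = -sum(p * ln p)  (assumed definition)
    diversity = -sum(p * math.log(p) for p in abundance.values())
    # Evenness: H / ln(richness); undefined for a single class, so None
    evenness = diversity / math.log(richness) if richness > 1 else None
    return abundance, richness, diversity, evenness

landcover = [["forest", "forest"],
             ["water", "urban"]]
abundance, richness, diversity, evenness = nonspatial_metrics(landcover)
```

None of these values depend on where the cells are, which is why they are classed as non-spatial: shuffling the cells of the grid leaves all four unchanged.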

Point (event) based statistics
• Typically analysis of point-pair distances
• Points vs events
• Distance metrics: Euclidean, spherical, Lp or network
• Weighted or unweighted events
• Events, NOT computed points (e.g. centroids)
• Classical statistical models vs Monte Carlo and other computational methods

Point (event) based statistics
Basic Nearest neighbour (NN) model
• Input coordinates of all points
• Compute (symmetric) distances matrix D
• Sort the distances to identify the 1st, 2nd,...kth nearest values
• Compute the mean of the observed 1st, 2nd, ...kth nearest values
• Compare this mean with the expected mean under Complete Spatial Randomness (CSR or Poisson) model
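The first four steps above might be sketched as follows, assuming Euclidean distance and a small point set held in memory. The function name is illustrative; the comparison with the CSR expectation is not included in this block.

```python
import math

def mean_kth_nn_distances(points, k_max):
    """Mean observed distance to the 1st, 2nd, ..., k_max-th nearest neighbour."""
    n = len(points)
    # Symmetric distance matrix D (Euclidean metric assumed)
    D = [[math.hypot(px - qx, py - qy) for qx, qy in points] for px, py in points]
    sums = [0.0] * k_max
    for i in range(n):
        # Sort each row, excluding the zero distance from the point to itself
        ranked = sorted(D[i][j] for j in range(n) if j != i)
        for k in range(k_max):
            sums[k] += ranked[k]
    return [s / n for s in sums]

# Four points at the corners of a unit square
corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
means = mean_kth_nn_distances(corners, 3)
```

For the unit square the 1st and 2nd nearest neighbours are both at distance 1 and the 3rd is the diagonal, so the three means are 1, 1 and the square root of 2.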

Point (event) based statistics – NN model

[Figure: circle of radius r, Area = $\pi r^2$; surrounding annulus from r to r+dr, Width = dr, Area = $2\pi r\,dr$]
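The circle-and-annulus construction is the usual starting point for the distribution of nearest neighbour distances under CSR. As a sketch, assuming a homogeneous Poisson process with density m (events per unit area) and ignoring edge effects, the derivation runs as follows: the nearest neighbour lies between r and r+dr when the circle is empty and the annulus contains an event.

```latex
\begin{align}
P(\text{no event within } r) &= e^{-m\pi r^{2}} \\
P(\text{an event in the annulus}) &\approx m \cdot 2\pi r\,dr \\
f(r)\,dr &= 2\pi m r\, e^{-m\pi r^{2}}\,dr \\
\bar{r}_e = \int_{0}^{\infty} r\, f(r)\,dr &= \frac{1}{2\sqrt{m}} \\
\sigma^{2} = \int_{0}^{\infty} r^{2} f(r)\,dr - \bar{r}_e^{2}
  &= \frac{1}{\pi m} - \frac{1}{4m} = \frac{4-\pi}{4\pi m}
\end{align}
```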

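Point (event) based statistics – NN model
• Mean NN distance: $\bar{r}_e = \dfrac{1}{2\sqrt{m}}$
• Variance: $\sigma^2 = \dfrac{4-\pi}{4\pi m}$
• NN Index (Ratio): $R = \bar{r}_o / \bar{r}_e$
• Z-transform: $z = (\bar{r}_o - \bar{r}_e)/\sigma_e$, where $z \sim N(0,1)$ and $\sigma_e = \sigma/\sqrt{n} = 0.26136/\sqrt{mn}$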
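A small numerical sketch of these quantities follows. It assumes the density is estimated as m = n / A for a study area A supplied by the caller, uses Euclidean distance, and applies no edge correction; the function name is hypothetical.

```python
import math

def nn_index(points, area):
    """Return (r_o, r_e, R, z) for the first-order nearest neighbour model."""
    n = len(points)
    m = n / area                                   # estimated density (assumption: m = n / A)
    # Observed distance from each point to its nearest neighbour
    nn = [min(math.hypot(px - qx, py - qy)
              for j, (qx, qy) in enumerate(points) if j != i)
          for i, (px, py) in enumerate(points)]
    r_o = sum(nn) / n                              # observed mean NN distance
    r_e = 1.0 / (2.0 * math.sqrt(m))               # expected mean under CSR
    # Standard error of the mean: sqrt((4 - pi) / (4 pi m n)) = 0.26136 / sqrt(m n)
    sigma_e = math.sqrt((4.0 - math.pi) / (4.0 * math.pi * m * n))
    R = r_o / r_e                                  # NN index (ratio)
    z = (r_o - r_e) / sigma_e                      # z-transform
    return r_o, r_e, R, z

# Four regularly spaced points in a study area of 4 square units
r_o, r_e, R, z = nn_index([(0, 0), (1, 0), (0, 1), (1, 1)], area=4.0)
```

Values of R below 1 suggest clustering and values above 1 suggest a more regular (dispersed) arrangement than CSR; the regular layout in the example gives R = 2. With only four points the z value is purely illustrative.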

Point (event) based statistics
Issues
• Are observations n discrete points?
• Sample size (esp. for kth order NN, k>1)
• Model requires density estimation, m
• Boundary definition problems (density and edge effects) – affects all methods
• NN reflexivity of point sets
• Limited use of frequency distribution
• Validity of Poisson model vs alternative models

Frequency distribution of nearest neighbour distances, i.e. the frequency of NN distances in distance bands, say 0-1km, 1-2kms, etc.
The cumulative frequency distribution is usually denoted
• $G(r) = \#(d_i < r)/n$, where $d_i$ are the NN distances and n is the number of measurements, or
• $F(r) = \#(d_i < r)/m$, where $d_i$ are the distances from randomly placed points to their nearest event and m is the number of random points used in sampling

Computing G(r) [computing F(r) is similar]
• Find all the NN distances
• Rank them and form the cumulative frequency distribution
• Compare to expected cumulative frequency distribution: $G(r) = 1 - e^{-m\pi r^2}$
• Similar in concept to K-S test with quadrat model, but compute the critical values by simulation rather than table lookup
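The comparison between the observed and expected cumulative distributions could be sketched as below. The function names are illustrative, m is taken to be the event density (events per unit area), and the simulation of critical values is not included.

```python
import math

def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    return [min(math.hypot(px - qx, py - qy)
                for j, (qx, qy) in enumerate(points) if j != i)
            for i, (px, py) in enumerate(points)]

def empirical_G(nn, r):
    """Observed G(r): proportion of NN distances strictly less than r."""
    return sum(1 for d in nn if d < r) / len(nn)

def csr_G(r, m):
    """Expected G(r) under CSR with density m: 1 - exp(-m * pi * r^2)."""
    return 1.0 - math.exp(-m * math.pi * r * r)

events = [(0, 0), (1, 0), (0, 1), (1, 1)]
nn = nn_distances(events)
density = len(events) / 4.0            # assumes a study area of 4 square units
table = [(r, empirical_G(nn, r), csr_G(r, density)) for r in (0.5, 1.0, 1.5)]
```

An observed curve that rises faster than the CSR curve at short distances points towards clustering, while one that rises more slowly points towards regularity.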

Point (event) based statistics – clustering (ESDA)
• Is the observed clustering due to natural background variation in the population from which the events arise?
• Over what spatial scales does clustering occur?
• Are clusters a reflection of regional variations in underlying variables?
• Are clusters associated with some feature of interest, such as a refinery, waste disposal site or nuclear plant?
• Are clusters simply spatial or are they spatio-temporal?

Point (event) based statistics – clustering
• kth order NN analysis
• Cumulative distance frequency distribution, G(r)
• Ripley K (or L) function – single or dual pattern
• PCP
• Hot spot and cluster analysis methods

Point (event) based statistics – Ripley K or L
• Construct a circle, radius d, around each point (event), i
• Count the number of other events, labelled j, that fall inside this circle
• Repeat these first two stages for all points i, and then sum the results
• Increment d by a small fixed amount
• Repeat the computation, giving values of K(d) for a set of distances, d
• Adjust to provide ‘normalised measure’ L: $L(d) = \sqrt{K(d)/\pi} - d$
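The counting procedure above can be sketched as follows. The scaling of the summed counts by A / n² (with A the study area) is an assumption, since the slides do not state the normalisation; some texts use n(n-1) instead. No edge correction is applied, and L is computed as the square root of K(d)/π minus d, so that values near zero are expected under CSR.

```python
import math

def ripley_k(points, area, distances):
    """Naive Ripley K estimate at each distance d, without edge correction."""
    n = len(points)
    ks = []
    for d in distances:
        count = 0
        for i, (xi, yi) in enumerate(points):
            for j, (xj, yj) in enumerate(points):
                # Count the other events j inside the circle of radius d around i
                if i != j and math.hypot(xi - xj, yi - yj) <= d:
                    count += 1
        ks.append(area * count / (n * n))   # assumed scaling: A / n^2
    return ks

def ripley_l(ks, distances):
    """Normalised measure L(d) = sqrt(K(d) / pi) - d."""
    return [math.sqrt(k / math.pi) - d for k, d in zip(ks, distances)]

events = [(0, 0), (1, 0), (0, 1), (1, 1)]
ds = [0.5, 1.0, 1.5]
ks = ripley_k(events, area=4.0, distances=ds)
ls = ripley_l(ks, ds)
```

The double loop is quadratic in the number of events for every distance, which is acceptable for a few hundred events but would need a spatial index or a sorted distance list for larger patterns.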

Point (event) based statistics – Ripley K

[Figure: Ripley K - Lung Cancer dataset. L(d) observed, L(d) min and L(d) max plotted against Distance]
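Minimum and maximum curves of this kind are commonly obtained by Monte Carlo simulation. The sketch below assumes that is how the envelope was produced, which the slide does not state: it simulates CSR patterns with the same number of events in a rectangular study area and records the lowest and highest L(d) at each distance. The rectangle, the number of simulations and the function names are all hypothetical choices for illustration.

```python
import math
import random

def l_function(points, area, distances):
    """L(d) = sqrt(K(d) / pi) - d, with K scaled by A / n^2 and no edge correction."""
    n = len(points)
    values = []
    for d in distances:
        count = sum(1
                    for i, (xi, yi) in enumerate(points)
                    for j, (xj, yj) in enumerate(points)
                    if i != j and math.hypot(xi - xj, yi - yj) <= d)
        values.append(math.sqrt(area * count / (n * n * math.pi)) - d)
    return values

def csr_envelope(n, width, height, distances, n_sims, rng):
    """Pointwise min and max of L(d) over n_sims simulated CSR patterns."""
    area = width * height
    lo = [math.inf] * len(distances)
    hi = [-math.inf] * len(distances)
    for _ in range(n_sims):
        sim = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        ls = l_function(sim, area, distances)
        lo = [min(a, b) for a, b in zip(lo, ls)]
        hi = [max(a, b) for a, b in zip(hi, ls)]
    return lo, hi

rng = random.Random(1)
ds = [1.0, 2.0, 3.0]
lo, hi = csr_envelope(30, 10.0, 10.0, ds, n_sims=19, rng=rng)
# A tightly clustered pattern for comparison (synthetic data)
clustered = [(5 + 0.01 * i, 5.0) for i in range(30)]
observed = l_function(clustered, 100.0, ds)
```

An observed L(d) above the simulated maximum at some distance indicates more clustering at that scale than the CSR simulations produced; below the minimum indicates more regularity.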

Point (event) based statistics – comments
• CSR vs PCP vs other models
• Data: location, time, attributes, error, duplicates
  Duplicates: deliberate rounding, data resolution, genuine duplicate locations, agreed surrogate locations, deliberate data modification
• Multi-approach analysis is beneficial
• Methods: choice of methods and parameters
• Other factors: borders, areas, metrics, background variation, temporal variation, non-spatial factors
• Rare events and small samples
• Process-pattern vs cause-effect
• ESDA in most instances

Hot spot and cluster analysis – questions
• Where are the main (most intensive) clusters located?
• Are clusters distinct or do they merge into one another?
• Are clusters associated with some known background variable?
• Is there a common size to clusters or are they variable in size?
• Do clusters themselves cluster into higher order groupings?
• If comparable data are mapped over time, do the clusters remain stable or do they move and/or disappear?

Hot spot (and cool-spot) analysis
• Visual inspection of mapped patterns
• Scale issues
• Proximal and duplicate points
• Point representation (size)
• Background variation/controls (risk adjustment)
• Weighted or unweighted
• Hierarchical or non-hierarchical
• Kernel & K-means methods

Hot spot analysis – Hierarchical NN
Cancer incidence data: 1st and 2nd order clusters