Spatial Autocorrelation using GIS

Post on 24-Feb-2016









Spatial Autocorrelation using GIS
Jennie Murack
murack@mit.edu

Objectives
Understand the concept of spatial autocorrelation.
Learn which tools to use in GeoDa and ArcMap to test for autocorrelation.
Interpret output from spatial autocorrelation tests.

What is spatial autocorrelation?
Based on Tobler's first law of geography: "Everything is related to everything else, but near things are more related than distant things."
It is the correlation of a variable with itself through space.
Patterns may indicate that data are not independent of one another, violating the assumption of independence for some statistical tests.

Tests for spatial autocorrelation allow you to answer the following questions about your data:
How are the features distributed?
What is the pattern created by the features?
Where are the clusters?
How do patterns and clusters of different variables compare to one another?

Patterns
Useful to:
Better understand geographic phenomena (e.g., habitats)
Monitor conditions (e.g., level of clustering)
Compare different sets of features (e.g., patterns of different types of crimes)
Track change

Patterns

You can measure the pattern formed by the locations of features, or the pattern of attribute values associated with features (e.g., median home value, percent female).
[Maps: New AIDS cases in 1994; New AIDS cases in 2003]

Types of data often analyzed
Locations of crimes, animals, retail, industry, etc.
Land cover
Land use
Census/social data

Software
ArcGIS
Complete GIS software with hundreds of tools
Can work with several datasets (layers) at once

GeoDa
Open source
Solely for spatial statistics
Uses one dataset (layer) at a time
Simple, easy-to-use interface
Available with registration at:

Conceptual Models

Spatial Neighborhoods and Weights
Neighborhood = the area in which the GIS compares the target values to neighboring values.
Neighborhoods are most often defined based on adjacency or distance, but can also be defined based on travel time, travel cost, etc.
You can also define a cutoff distance, the amount of adjacency (borders vs. corners), or the amount of influence at different distances.
A table of spatial weights is used to incorporate these definitions into statistical analysis.

Distance Models
Inverse distance: all features influence all other features, but the closer something is, the more influence it has.
Distance band: features outside a specified distance do not influence the features within the area.
Zone of indifference: combines inverse distance and distance band.
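The three distance models can be sketched as weight functions applied to pairwise distances. This is a minimal illustration in plain Python, not the ArcGIS or GeoDa implementation. The function names, the sample points, and in particular the decay used beyond the cutoff in the zone of indifference (cutoff / d) are assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def manhattan(a, b):
    """City-block distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def inverse_distance(d):
    """Every feature has some influence; closer features have more.
    Assumes d > 0 (no two features share a location)."""
    return 1.0 / d


def distance_band(d, cutoff):
    """Features beyond the cutoff have no influence at all."""
    return 1.0 if d <= cutoff else 0.0


def zone_of_indifference(d, cutoff):
    """Full weight inside the cutoff, decaying weight beyond it.
    The decay cutoff / d is an assumed form, chosen so the weight
    is continuous at the cutoff."""
    return 1.0 if d <= cutoff else cutoff / d


def weights_matrix(points, weight_fn, distance_fn=euclidean):
    """Build an n x n spatial weights table; a feature is not its own neighbor."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = weight_fn(distance_fn(points[i], points[j]))
    return w


# Hypothetical sample points: pairwise Euclidean distances are 5, 10 and 5.
points = [(0, 0), (3, 4), (6, 8)]
w_inv = weights_matrix(points, inverse_distance)
w_band = weights_matrix(points, lambda d: distance_band(d, 5.0))
w_zone = weights_matrix(points, lambda d: zone_of_indifference(d, 5.0))
```

With a cutoff of 5, the first and third points (distance 10) get weight 0.1 under inverse distance, 0 under the distance band, and 0.5 under this zone-of-indifference sketch.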

Distance can be Euclidean or Manhattan.
You choose the model for certain calculations based on what you know about the data.

Adjacency Models
K nearest neighbors: a specified number of neighboring features are included in calculations.
Polygon contiguity: polygons that share an edge or node influence each other.
Spatial weights specified by the user (e.g., travel times or distances).

Types of Contiguity
Rook = share edges
Bishop = share corners
Queen = share edges or corners
Second-order contiguity = neighbor of a neighbor
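The contiguity types are easiest to see on a regular grid, where each cell stands in for a polygon. The sketch below makes that simplifying assumption (real polygon contiguity requires geometry tests on shared edges and nodes), and all function names are hypothetical.

```python
import math

ROOK = [(-1, 0), (1, 0), (0, -1), (0, 1)]        # shared edges
BISHOP = [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # shared corners only
OFFSETS = {"rook": ROOK, "bishop": BISHOP, "queen": ROOK + BISHOP}


def grid_neighbors(rows, cols, r, c, kind="queen"):
    """First-order neighbors of cell (r, c) on a rows x cols grid."""
    return sorted(
        (r + dr, c + dc)
        for dr, dc in OFFSETS[kind]
        if 0 <= r + dr < rows and 0 <= c + dc < cols
    )


def second_order(rows, cols, r, c, kind="queen"):
    """Neighbors of neighbors, excluding the cell itself and its first-order neighbors."""
    first = set(grid_neighbors(rows, cols, r, c, kind))
    second = set()
    for nr, nc in first:
        second.update(grid_neighbors(rows, cols, nr, nc, kind))
    return sorted(second - first - {(r, c)})


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i] (Euclidean distance)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


center_rook = grid_neighbors(3, 3, 1, 1, "rook")
center_bishop = grid_neighbors(3, 3, 1, 1, "bishop")
center_queen = grid_neighbors(3, 3, 1, 1, "queen")
corner_second = second_order(3, 3, 0, 0, "rook")
nearest_two = k_nearest([(0, 0), (1, 0), (5, 0), (2, 0)], 0, 2)
```

On a 3 x 3 grid the center cell has four rook neighbors, four bishop neighbors, and eight queen neighbors. A corner cell has only two rook neighbors, which is one way edge effects show up in adjacency models.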

Image from:

Data Locations

Average Nearest Neighbor
Measures how similar the actual mean distance between features is to the expected mean distance for a random distribution.
Measures clustering vs. dispersion of feature locations.
Can be used to compare distributions to one another.
Concerns: for line features, only one point on each line is used in the analysis; the extent of the study area can affect results (many features near the edge of the study area bias the results).
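The comparison of observed and expected mean distance can be sketched as follows. This is a minimal version that assumes point features and a known study-area size, with no edge correction. The expected distance 0.5 / sqrt(n / A) and the standard error 0.26136 / sqrt(n^2 / A) are the usual nearest-neighbor formulas; the function name and sample data are hypothetical.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (ratio, z_score) for a set of (x, y) points in a study area of size `area`.

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    """
    n = len(points)
    # Distance from each point to its single nearest neighbor.
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)           # mean distance under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error


# Hypothetical data in a 100 x 100 study area.
clustered = [(0, 0), (0, 1), (1, 0), (1, 1)]
dispersed = [(25, 25), (25, 75), (75, 25), (75, 75)]
ratio_clustered, z_clustered = average_nearest_neighbor(clustered, 100 * 100)
ratio_dispersed, z_dispersed = average_nearest_neighbor(dispersed, 100 * 100)
```

Because the expected distance depends on the area, the same points produce a different ratio if the study-area extent changes, which is the concern noted above.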

Notes: A z-score is used to determine significance. Features at an edge bias the study because they do not have as many neighbors.

Ripley's K-function
The GIS counts the number of neighboring features within a given distance of each feature.
Like Nearest Neighbor, the K-function measures clustering/dispersion of feature locations, but it includes all neighbors occurring within a certain distance.
It is often used with individual points.
The test compares the observed K value at each distance to the expected K value for a random distribution at that distance.
Concerns: points at the edge of the study area may have few neighbors.
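A bare-bones version of the neighbor count can be written as below. It is a sketch of the basic K estimator without the edge correction or the L(d) transformation that GIS tools typically apply, and the function names and sample points are hypothetical. Under a random distribution, the expected K at distance d is pi * d^2.

```python
import math


def ripley_k(points, area, distance):
    """Naive Ripley's K: scaled count of ordered point pairs within `distance`."""
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= distance
    )
    return area * pairs / (n * (n - 1))


def expected_k(distance):
    """Expected K under complete spatial randomness."""
    return math.pi * distance ** 2


# Hypothetical tight cluster of four points inside a 100 x 100 study area.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
observed = ripley_k(points, 100 * 100, 1.5)
expected = expected_k(1.5)
is_clustered = observed > expected
```

Evaluating `ripley_k` over a range of distances and comparing each value with `expected_k` gives the clustered-then-dispersed kind of reading that a K-function plot shows.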

Ripley's K-function example: Assaults are clustered until about 13,000 ft. and then dispersed beyond 15,000 ft.
Note: We won't be using these tools in the exercises, but it is useful to know they exist.

Measuring Data Values

Global vs. Local Statistics
Global statistics identify and measure the pattern of the entire study area.
They do not indicate where specific patterns occur.
Local statistics identify variation across the study area, focusing on individual features and their relationships to nearby features (i.e., specific areas of clustering).

Spatial Autocorrelation (Moran's I)
Global statistic
Measures whether the pattern of feature values is clustered, dispersed, or random.
Compares the difference between the mean of the target feature and the mean for all features to the difference between the mean for each neighbor and the mean for all features.
For more information on the equation, see the ESRI online help.
[Diagram labels: mean of target feature, mean of all features, mean of each neighbor]
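For reference, the comparison described above corresponds to the standard form of global Moran's I. The notation is not from the slides and is assumed here: x_i is the value at feature i, x-bar is the mean over all n features, and w_ij is the spatial weight between features i and j.

```latex
I = \frac{n}{S_0}\,
    \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}
         {\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}
```

Each term in the numerator multiplies the target feature's deviation from the overall mean by a neighbor's deviation from that mean. Neighbors that deviate in the same direction push I upward, and neighbors that deviate in opposite directions push it downward. Under spatial randomness the expected value is -1/(n-1), which approaches 0 as n grows.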

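Spatial Autocorrelation (Moran's I)

Calculates I values:
I = 0: random distribution
I > 0: clustered (similar values near one another)
I < 0: dispersed (dissimilar values near one another)

[...]

The value (i) is included in the numerator and denominator.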

Permutations in GeoDa
Permutation inference means shuffling the values around and re-computing the statistic each time, with a different set of random numbers, to construct a reference distribution.
Permutations are used to determine how likely it would be to observe the Moran's I value of the actual distribution under conditions of spatial randomness.
P-values depend on the number of permutations, so they are pseudo p-values.
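The shuffling procedure can be sketched in a few lines of plain Python. This is an illustration and not GeoDa's code. It assumes a one-sided test for positive autocorrelation and the pseudo p-value (M + 1) / (R + 1), where M is the number of permuted statistics at least as large as the observed one and R is the number of permutations. The function names, the six-cell example, and the seed are hypothetical.

```python
import random


def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weights table."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    numerator = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    denominator = sum(zi * zi for zi in z)
    return (n / s0) * numerator / denominator


def permutation_test(values, weights, permutations=999, seed=0):
    """Return (observed I, pseudo p-value, reference distribution)."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    reference = []
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign values to locations at random
        reference.append(morans_i(shuffled, weights))
    extreme = sum(1 for r in reference if r >= observed)
    pseudo_p = (extreme + 1) / (permutations + 1)
    return observed, pseudo_p, reference


# Hypothetical example: six cells in a row, each adjacent to the cell on either side.
values = [1, 2, 3, 4, 5, 6]
n = len(values)
weights = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
observed, pseudo_p, reference = permutation_test(values, weights)
```

With 999 permutations the smallest pseudo p-value this procedure can report is 0.001, which is why the result depends on the number of permutations chosen.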

Reference Distribution
GeoDa generates a histogram of the Moran's I values from the permutations, with the observed Moran's I shown for comparison.

Bivariate Moran's I
An option in GeoDa.
It tests two variables: the correlation between a given variable (x) at a location and a different variable (y) at surrounding locations.
Results are difficult to interpret.
It is useful in examining the range of interaction, provided x and y are correlated at the same location.

Resources
ESRI Spatial Statistics Website:
Workbook:
Spatial Statistics Tool help:

ArcMap Help Links
Spatial Autocorrelation (Moran's I)
Local Moran's I
Getis-Ord General G
Hot Spot Analysis (Getis-Ord Gi*)

