공간적자기상관 개념과측정 -...

43
공간적 자기 상관 개념과 측정 Global Measures of Spatial Autocorrelation 1 China

Upload: others

Post on 28-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

공간적자 상개념과측정Global Measures of

Spatial Autocorrelation

1

China

Page 2: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

공간적자 상(spatial autocorrelation)

• 공간패 을 석하 특정 측치 가진지역들이서집해있는 이런 량의크 가유사한지역이서 집해있는현상을공간적자 상 이라고함

• 공간적자 상 이나타나는경 에는특정현상이그지역의다 현상과연 하여나타나 보다는해당지역에서나타나고있는현상자체가다 지역과연 어있다는추 이가능

• Tobler의정의에 공간적자 상 성은 든것들은다든것들과 어있지만가 게있는것들일수어져있는것들보다 그연 성이크다는것으 정의

• 정적인자 상 은이 하는지역들과 유사한특성이나타난다는것이 , 적인자 상 은가 이 보다어진지역과유사성이나타난다는의미

• 공간적으 인접할수 유사한특성을지니게 고상 성이높아짐

Page 3: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

공간적자 상 성의측정

• 공간적인접성을정의하고측정할필

• 공간가중행 (spatial weighted matrix)의 축: 인접성(spatial contiguity)을 준으 축, 또는공간거 (spatial distance) 준으 축

• Rook형: 역이한 을공유하는경 , 동서남 으 인접한형태

• Bishop형: 역이한정점을공유한경 , 각선향으 인접한상태

• Queen형: 역이한 을공유하거나한정점을공유하는경 , 동서남 이나 각선의어느향으 든인접한경

Page 4: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13
Page 5: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

자료에서공간적자 상 측정

Page 6: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13
Page 7: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

등간척 비율척 의경

Page 8: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13
Page 9: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

공간적자 상 의유의성검정• 척 의경 공간패 검정

• 자유표본추출: 표본이추출 후다시중복이가능, pq는 성을가짐

Page 10: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

등간비율척 의경 공간패 검정

Page 11: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Global Measures and Local Measures

Briggs Henan University 2010 11

An equivalent local measure can be calculated for most global measures

China

• Global Measures

– A single value which applies to the entire data set

• The same pattern or process occurs over the entire geographic area

• An average for the entire area

• Local Measures

– A value calculated for each observation unit

• Different patterns or processes may occur in different parts of the region

• A unique number for each location

Page 12: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

12

Join (or Joins or Joint) Count Statistic• Polygons only

• binary (1,0) data only– Polygon has or does not have a

characteristic

– For example, a candidate won or lost an election

• Based on examining polygons which share a border– Do they have the same characteristic or not?

• Border same

on each side

• Border not the same

on each side

• Requires a contiguity matrix for polygons

Page 13: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Different numbers of BW, BB and WW joins

13

Join (or Joint or Joins) Count Statistic• Uses binary (1,0) data

– Shown here as B/W (black/white)

• Measures the number of borders (“joins”) of each type (1,1), (0,0), (1,0 or 0,1) relative to total number of borders

• For 6 x 6 matrix, border totals are:– 60 for Rook Case

– 110 for Queen Case

Small number of BW joins (6 only for rook)Large proportion of BB and WW joins

Large number of BW joins Small number of BB and WW joins

Page 14: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

14

Join Count: Test Statistic

Test Statistic given by: Z= Observed - Expected

SD of Expected

Expected given by: Standard Deviation of Expected (standard error) given by:

Where: k is the total number of joins (neighbors)pB is the expected proportion Black, if randompW is the expected proportion Whitem is calculated from k according to:

Note: the formulae given here are for free (normality) sampling. Those for non-free (randomization) sampling are substantially more complex. See Wong and Lee 1st ed. p. 151 compared to p. 155. Se next slide for explanation.

Expected = random pattern generated by tossing a coin in each cell.

Page 15: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

A Note on Sampling Assumptions:applies to most tests for spatial autocorrelation

• Test results depend on the assumption made regarding the type of sampling:– Free (or normality) sampling

• Analogous to sampling with replacement

• After a polygon is selected for a sample, it is returned to the population set

• The same polygon can occur more than one time in a sample

– Non-free (or randomization) sampling• Analogous to sampling without replacement

• After a polygon is selected for a sample, it is not returned to the population set

• The same polygon can occur only one time in a sample

• The formulae used to calculate the test statistic (particularly the standard error) are different for each– Generally, the formulae are substantially more complex for free

sampling—unfortunately, it is also the more common situation!

– Assuming free sampling requires knowledge about larger trends from outside the region or access to additional information within the region in order to estimate parameters.

15

Page 16: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

16

total number of joins = 109 = sum of neighbors/2 in the sparse contiguity matrix= number of 1s/2 in the full contiguity matrix for US States

(see slides from SA Concepts lecture)

Actual

Jbb 60

Jgg 21

Jbg 28

Total 109

Gore/Bush Presidential Election 2000

Is there evidence of clustering by State?

Use Join Count to answer this question!

Many BB joins

Page 17: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8

Alabama 1 4 28 13 12 47

Arizona 4 5 35 8 49 6 32

Arkansas 5 6 22 28 48 47 40 29

California 6 3 4 32 41

Colorado 8 7 35 4 20 40 31 49 56

Connecticut 9 3 44 36 25

Delaware 10 3 24 42 34

District of Columbia 11 2 51 24

Florida 12 2 13 1

Georgia 13 5 12 45 37 1 47

Idaho 16 6 32 41 56 49 30 53

Illinois 17 5 29 21 18 55 19

Indiana 18 4 26 21 17 39

Iowa 19 6 29 31 17 55 27 46

Kansas 20 4 40 29 31 8

Kentucky 21 7 47 29 18 39 54 51 17

Louisiana 22 3 28 48 5

Maine 23 1 33

Maryland 24 5 51 10 54 42 11

Massachusetts 25 5 44 9 36 50 33

Michigan 26 3 18 39 55

Minnesota 27 4 19 55 46 38

Mississippi 28 4 22 5 1 47

Missouri 29 8 5 40 17 21 47 20 19 31

Montana 30 4 16 56 38 46

Nebraska 31 6 29 20 8 19 56 46

Nevada 32 5 6 4 49 16 41

New Hampshire 33 3 25 23 50

New Jersey 34 3 10 36 42

New Mexico 35 5 48 40 8 4 49

New York 36 5 34 9 42 50 25

North Carolina 37 4 45 13 47 51

North Dakota 38 3 46 27 30

Ohio 39 5 26 21 54 42 18

Oklahoma 40 6 5 35 48 29 20 8

Oregon 41 4 6 32 16 53

Pennsylvania 42 6 24 54 10 39 36 34

Rhode Island 44 2 25 9

South Carolina 45 2 13 37

South Dakota 46 6 56 27 19 31 38 30

Tennessee 47 8 5 28 1 37 13 51 21 29

Texas 48 4 22 5 35 40

Utah 49 6 4 8 35 56 32 16

Vermont 50 3 36 25 33

Virginia 51 6 47 37 24 54 11 21

Washington 53 2 41 16

West Virginia 54 5 51 21 24 39 42

Wisconsin 55 4 26 17 19 27

Wyoming 56 6 49 16 31 8 46 30

Sparse Contiguity Matrix for US States -- obtained from Anselin's web site (see powerpoint for link)

Queens Case Sparse Contiguity Matrix for US States•Ncount is the number of neighbors for each state•Equals number of 1s in a row of full contiguity matrix •Sum of Ncount is 218•Number of common borders (joins) =

ncount / 2 = 109

•N1, N2… FIPS codes for neighbors

å

17

Page 18: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

18

Join Count Statistic for Gore/Bush 2000 by StateActual Expected Stan Dev Z-score

Jbb 60 27.125 8.667 3.7930

Jgg 21 27.375 8.704 -0.7325

Jbg 28 54.500 5.220 -5.0763

Total 109 109.000

• The expected number of joins is calculated based on the proportion of votes each received in the election (for Bush = 109*.499*.499=27.125)

• K = 109= total number of joins

• There are far more Bush/Bush joins (actual = 60) than would be expected (27)

– Since test score (3.79) is greater than the critical value (2.54 at 1%) result is statistically significant at the 99% confidence level (p <= 0.01)

– Strong evidence of spatial autocorrelation—clustering

• There are far fewer Bush/Gore joins (actual = 28) than would be expected (54)

– Since test score (-5.07) is greater than the critical value (2.54 at 1%) result is statistically significant at 99% confidence level (p <= 0.01)

– Again, strong evidence of spatial autocorrelation—clustering

– Actual calculations available in spatstat.xls spreadsheet (JC-%vote tab)

% of Votes

in election

Bush % (Pb) 0.49885

Gore % (Pg) 0.50115

Page 19: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Moran’s I• The most common measure of Spatial Autocorrelation• Use for points or polygons

– Join Count statistic only for polygons

• Use for a continuous variable (any value)– Join Count statistic only for binary variable (1,0)

• Varies on a scale between –1 through 0* to + 1

19

-1 0 +1

high negative spatial autocorrelation

no spatial autocorrelation*

high positive spatial autocorrelation

Can also use it as an index for dispersion/random/cluster patterns.

Dispersed Pattern Random Pattern Clustered Pattern

CLUSTERED

UNIFORM/

DISPERSED

*technically it is:–1/(n-1)

Page 20: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Moran’s I and Correlation Coefficient rDifferences and Similarities

Correlation Coefficient r• Relationship between two variables

20

Moran’s I– Involves one variable only– Correlation between variable, X, and the “spatial lag” of X formed

by averaging all the values of X for the neighboring polygons

Education

Inco

me r = -0.71

Price

Qu

an

tity

orr = 0.71

Crime Rate

Crime in nearby area

r = 0.71 r = -0.71

Grocery Store Density

Grocery Store Density Nearby

Page 21: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Formula for Moran’s I

• Where:N is the number of observations (points or polygons)

is the mean of the variableXi is the variable value at a particular locationXj is the variable value at another locationWij is a weight indexing location of i relative to j

ååå

åå

== =

= =

-

--

=n

1i

2i

n

1i

n

1jij

n

1i

n

1jjiij

)x(x)w(

)x)(xx(xwN

I

21

x

Page 22: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

22

n

)x(x

n

)y(y

)/nx)(xy1(y

n

1i

2i

n

1i

2i

n

1iii

åå

å

==

=

--

--

n

)x(x

n

)x(x

w/)x)(xx(xw

n

1i

2i

n

1i

2i

n

1i

n

1i

n

1jij

n

1jjiij

åå

å ååå

==

= = ==

--

--

Spatial auto-correlation

CorrelationCoefficient

ååå

åå

== =

= =

-

--

n

1i

2i

n

1i

n

1jij

n

1i

n

1jjiij

)x(x)w(

)x)(xx(xwN

=

Note the similarity of the numerator (top) to the measures of spatial association discussed earlier if we view Yi as being the Xi for the neighboring polygon

(see next slide)

Page 23: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

23

n

)x(x

n

)y(y

)/nx)(xy1(y

n

1i

2i

n

1i

2i

n

1iii

åå

å

==

=

--

--

n

)x(x

n

)x(x

w/)x)(xx(xw

n

1i

2i

n

1i

2i

n

1i

n

1i

n

1jij

n

1jjiij

åå

å ååå

==

= = ==

--

--

Moran’s I

CorrelationCoefficient

Yi is the Xi for the neighboring polygon

Spatial weights

ååå

åå

== =

= =

-

--

n

1i

2i

n

1i

n

1jij

n

1i

n

1jjiij

)x(x)w(

)x)(xx(xwN

=

Page 24: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

24

Adjustment for Short or Zero Distances• If an inverse distance measure is used,

and distances are very short, then wij

becomes very large and distorts I.

• An adjustment for short distances can be used, usually scaling the distance to one mile.

• The units in the adjustment formula are the number of data measurement units in a mile

• In the example, the data is assumed to be in feet.

• With this adjustment, the weights will never exceed 1

• If a contiguity matrix is used (1or 0 only), this adjustment is unnecessary

Page 25: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

25

Statistical Significance Tests for Moran’s I• Based on the normal frequency distribution with

E(I) = -1/(n-1)

• Again, there are two different formula for calculating the standard error – The free sampling or normality method

– The nonfree sampling or randomization method

• These formulae are complicated!– They are in Lee and Wong 1st Ed. p. 82 and 160-1

• In either case, the statistical test is carried out in the same way

Where: I is the calculated value for Moran’s I from the sample

E(I) is the expected value if random

S is the standard error

)(

)(

IerrorS

IEIZ

-=

Page 26: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Test Statistic for Normal Frequency Distribution

26

0-1.96

2.5%

1.96

2.5% 1%

2.54

*technically –1/(n-1)

–1/(n-1)

Reject null at 5%Reject null

Reject null at 1%Null Hypothesis: no spatial autocorrelation*Moran’s I = 0

Alternative Hypothesis: spatial autocorrelation exists*Moran’s I > 0

Reject Null Hypothesis if Z test statistic > 1.96 (or < -1.96)---less than a 5% chance that, in the population, there is no

spatial autocorrelation---95% confident that spatial auto correlation exits

Page 27: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Null Hypothesis: no spatial autocorrelation*Moran’s I = 0

Alternative Hypothesis: spatial autocorrelation exists*Moran’s I > 0

Reject Null Hypothesis if Z test statistic > 1.96 (or < -1.96)---less than a 5% chance that, in the population, there is no

spatial autocorrelation---95% confident that spatial auto correlation exits

27

Page 28: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

28

Moran Scatter PlotsMoran’s I can be interpreted as the correlation between variable, X,

and the “spatial lag” of X formed by averaging all the values of X for the neighboring polygons

We can then draw a scatter diagram between these two variables (in standardized form): X and lag-X (or W_X)

Least squares “best fit” line to the points.The slope of this regression line is Moran’s I(will discuss Regression later)

Xi

Lag Xi

is average of these

Page 29: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Moran Scatterplot: example

• Scatterplot of X vs. Lag-X

• The slope of the regression line is Moran’s I

GISC 7361 Spatial Statistics 29

Moran’s I = 0.49

High surrounded by highLow

surrounded by low

Population density in Puerto Rico

X

La

g-X

Page 30: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

30

Moran’s I for rate-based data• Moran’s I is often calculated for rates, such as crime

rates (e.g. number of crimes per 1,000 population) or infant mortality rates (e.g. number of deaths per 1,000 births)

• An adjustment should be made, especially if the denominator in the rate (population or number of births) varies greatly (as it usually does)

• Adjustment is know as the EB adjustment:– see Assuncao-Reis Empirical Bayes Standardization

Statistics in Medicine, 1999

• GeoDA software includes an option for this adjustment

Page 31: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

31

Geary’s C (Contiguity) Ratio• Calculation is similar to Moran’s I,

– For Moran, the cross-product is based on the deviations from the mean for the two location values

– For Geary, the cross-product uses the actual values themselves at each location

• Interpretation is very different, essentially the opposite!

Geary’s C varies on a scale from 0 to 2– 0 indicates perfect positive autocorrelation/clustered

– 1 indicates no autocorrelation/random

– 2 indicates perfect negative autocorrelation/dispersed

• Can convert to a -/+1 scale by: calculating C* = 1 - C

• Moran’s I usually used!

ååå

åå

== =

= =

-

--

=n

1i

2i

n

1i

n

1jij

n

1i

n

1jjiij

)x(x)w(

)x)(xx(xwN

I

ååå

åå

== =

= =

-

-

=n

1i

2i

n

1i

n

1jij

n

1i

n

1j

2jiij

)x(x)w(2

)x(xwN

C

Page 32: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

32

Statistical Significance Tests for Geary’s C• Similar to Moran

• Again, based on the normal frequency distribution with

however, E(C) = 1

• Again, there are two different formulations for the standard error calculation– The randomization or nonfree sampling method

– The normality or free sampling method

• The actual formulae for calculation are in Lee and Wong, 1st Ed. p. 81 and p. 162

)(

)(

IerrorS

CECZ

-= Where: C is the calculated value for Geary’s C

from the sample E(C) is the expected value if no

autocorrelation

S is the standard error

Page 33: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Hot Spots and Cold Spots• What is a hot spot?

– A place where high values

cluster together

• What is a cold spot?

– A place where low values

cluster together

33

• Moran’s I and Geary’s C cannot distinguish them

• They only indicate clustering

• Cannot tell if these are hot spots, cold spots, or both

e.g. high crime area

e.g. low crime area

Page 34: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

34

Getis-Ord General/Global G-Statistic• The G statistic distinguishes between hot spots and cold spots. It

identifies spatial concentrations.

– G is relatively large if high values cluster together

– G is relatively low if low values cluster together

• The General G statistic is interpreted relative to its expected value– The value for which there is no spatial association

– G > (larger than) expected value è potential “hot spots”

– G < (smaller than) expected value è potential “cold spots”

• A Z test statistic is used to test if the difference is statistically significant

• Calculation of G based on a neighborhood distance within which cluster is expected to occur

Getis, A. and Ord, J.K. (1992) The analysis of spatial association by use of distance statistics Geographical Analysis, 24(3) 189-206

Page 35: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Briggs Henan University 2010 35

Calculating General G

• Begins by identifying a distance band, d, within which clustering occurs

• Actual Value for G is given by:

• the terms in the numerator (top) are calculated “within a distance ring (d),” and are then divided by totals for the entire region to create a proportion

– if nearby x values are both large (indicating “hot” spot), the numerator (top) will be large

– If they are both small (indicating “cold” spot), the numerator (top) will be small

• Expected value for G (if no concentration) is given by:

Where:d is neighborhood distanceWij weights matrix has only 1 or 0

1 if j is within d distance of i0 if its beyond that distance

Thus any point beyond distance d has a value of zero and therefore is excluded

)1()(

-=

nn

WGE where

Number of points within distance band d

Total number of points in study region

d

Page 36: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Comments on General G• General G will not show negative spatial autocorrelation

• Should only be calculated for ratio scale data

– data with a “natural” zero such as crime rates, birth rates

• Although it was defined using a contiguity (0,1) weights matrix, any type of spatial weights matrix can be used

– ArcGIS gives multiple options

• There are two global versions: G and G*

– G does not include the value of Xi itself, only “neighborhood” values

– G* includes Xi as well as “neighborhood” values

36

Page 37: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

37

Testing General G• The test statistic for G is normally distributed and is given by:

• The next slide shows the results for running General G on Anselin’s Columbus crime data – This data is not good, but is very common since Anselin uses it in his

original LISA article and in the examples in the GeoDA documentation

– The geographic coordinates are completely arbitrary

)(

)(

GerrorS

GEGZ

-=

Calculation of the standard error is complex. See Lee and Wong 1st pp 164-167 or Getis and Ord 1992 for formulae.

)1()(

-=

nn

WGEwith

Page 38: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

38

Variable to analyze

Different options available for specifying cluster neighborhood--simple distance band selected, as described in lecture

Options for measuring distance--straight line (Euclidean)--city block

General/Global G in ArcGIS

Shapefile containing polygon or point data

Size of neighborhood distance band

Page 39: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

39

Observed G = .777Expected G = .637

Observed > Expected >> “Hot spots”Z score: 5.067 > 1.96 >> significant

But where are the hot spots?

For this we use Local Statistics

General/Global G in ArcGIS: results

Page 40: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

What have we learned today?• Difference between global and local measures of spatial

autocorrelation

• How to calculate and interpret some global measures

– Join Count Statistic • Used for binary (0,1) data only

– Moran’s I • The most common global measure of spatial autocorrelation

– Geary’s C• interpretation almost opposite of Moran’s I, but not used very often

– Getis-Ord G statistic• Identifies hot spots or cold spots

Next Time:

local measures of spatial autocorrelation40

Page 41: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

Challenge for You

• Calculate Moran’s I and/or General G for some appropriate variables in the China provinces data set

• Use ArcGIS or GeoDA software

41

Page 42: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

42

References• O’Sullivan and Unwin Geographic Information Analysis New

York: John Wiley, 1st ed. 2003, 2nd ed. 2010

• Jay Lee and David Wong Statistical Analysis with ArcView GISNew York: Wiley, 1st ed. 2001 (all page references are to this book), 2nd ed. 2005– Unfortunately, these books are based on old software (Avenue scripts

used with ArcView 3.x) and no longer work in the current version of ArcGIS 9 or 10.

• Ned Levine and Associates CrimeStat III Washington: National Institutes of Justice, 2010

– Available as pdf

– download from: http://www.icpsr.umich.edu/NACJD/crimestat.html

Page 43: 공간적자기상관 개념과측정 - KOCWelearning.kocw.net/contents4/document/lec/2013/Konkuk/... · 2014-01-24 · Name Fips Ncount N1 N2 N3 N4 N5 N6 N7 N8 Alabama 1 4 28 13

43